Thursday, July 15, 2010

Django from Scratch

Today I had the pleasure of creating a brand-new Django project from scratch. Here are some things I realized or remembered during my first couple of hours of work after maintaining a large Django project for over a year.

Create a local_settings.py File
Put it in the project folder and import it from settings.py with exception handling so it's okay if it doesn't exist. Make sure your version control ignores this file. Use this to override your database settings, add plugins (like Django Debug Toolbar), and to set DEBUG = True. Always leave DEBUG set to False in the real settings file. That's just one thing you don't want the slightest possibility of moving into production. Also, don't reverse the process and import settings.py in local_settings.py. It causes problems when you have to remember to set or import the proper DJANGO_SETTINGS_MODULE for any given environment. It's so much easier to just let Django use the default settings file hard-coded into manage.py.
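Here is a minimal sketch of that import, assuming the override file sits next to settings.py and is importable as local_settings (on a package-style layout you may need a relative import instead). It goes at the very bottom of settings.py so the local values win.

```python
# settings.py (fragment)

# Safe default: the committed settings file never enables debugging.
DEBUG = False

# ... the rest of the shared settings go here ...

# Pull in per-machine overrides if they exist. The file is ignored by
# version control, so a missing file is normal and not an error.
try:
    from local_settings import *  # noqa
except ImportError:
    pass
```

On a development machine, local_settings.py would then contain just the overrides, such as DEBUG = True.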

Create Initial Fixtures
During development you'll likely create some data. This is useful for obvious reasons. If you create a fixture file with the name initial_data in a path specified in your FIXTURE_DIRS setting, it will be automatically loaded each time you flush the database, or after a syncdb. At the very least it will save you from re-entering the admin super-user info each time you refresh the database.
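In practice you would usually produce the file with the dumpdata management command, but it helps to know what is inside. The sketch below writes a fixture by hand using only the standard library. The myapp.category model and its name field are hypothetical, chosen only to show the layout of a JSON fixture: a list of objects, each with "model", "pk" and "fields".

```python
import json
import os
import tempfile


def write_fixture(directory, records, name="initial_data.json"):
    """Write records in Django's JSON fixture layout and return the file path."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        json.dump(records, handle, indent=2)
    return path


# Hypothetical model and field names, for illustration only.
records = [
    {"model": "myapp.category", "pk": 1, "fields": {"name": "General"}},
    {"model": "myapp.category", "pk": 2, "fields": {"name": "Urgent"}},
]

# A temporary directory stands in for one of your fixture directories.
fixture_path = write_fixture(tempfile.mkdtemp(), records)
```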

Use South
This has become a no-brainer for me. South is a database migration tool for Django which is very mature and has become the de facto standard in the community. As tweaks are made to the models, South makes modifying the database completely painless. It makes things a little easier if you create a South migration right from the start, rather than waiting till you make a change. The documentation is very good and explains all. Tip: When you get ready to deploy for the first time, you can always wipe out your South migrations and make a new "initial" migration. Your migrations folder will be cleaner and it'll look like you designed the models right on the first try.

Use setUp in Your TestCases
I could have also titled this one "test one thing at a time." I spent a lot more time today than I should have trying to figure out how to repair a dirty database connection after one test that (correctly) raised an IntegrityError. In the end, I broke the test that followed it out into its own function. In hindsight, there was no point in trying to do them both together, except that I was loading a couple of model instances that I wanted to use for both tests. Creating a setUp function in my test class solved the problem nicely, and is much cleaner. Because setUp runs before every test, each test gets fresh instances. If the instances were shared instead, a test that modified one of them could introduce a subtle bug into my test suite.
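To illustrate the pattern, here is a minimal sketch that uses only the standard library's unittest and sqlite3 modules rather than Django's TestCase and ORM. The person table and the email values are made up for the example. The point is only the shape: shared data is built in setUp, and the test that expects an IntegrityError stands alone.

```python
import sqlite3
import unittest


class PersonTableTests(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so each test starts from the
        # same clean state: a new in-memory database with one row in it.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE person (email TEXT UNIQUE)")
        self.conn.execute("INSERT INTO person VALUES ('a@example.com')")

    def tearDown(self):
        self.conn.close()

    def test_duplicate_email_is_rejected(self):
        # This test does one thing: check that the constraint fires.
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute("INSERT INTO person VALUES ('a@example.com')")

    def test_new_email_is_accepted(self):
        # Unaffected by the failed insert above, because setUp rebuilt
        # the database for this test.
        self.conn.execute("INSERT INTO person VALUES ('b@example.com')")
        count = self.conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
        self.assertEqual(count, 2)
```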

Create Test Fixtures
Have some test data that your tests load. This may seem obvious, but when I first started writing tests I did it the hard way, creating model instances at the beginning of a test because they were dependencies of the model I actually wanted to test.

Create a Test Directory
A fresh Django app will leave a file named tests.py in the new directory. This is to remind you that testing is important. But a single file can become a mess. I much prefer the technique I learned from Eric Holscher's blog. Delete that tests file and create a tests directory. In that directory, create an __init__.py file. Create a bunch of test files, each serving a different purpose, and import them in __init__.py. That way, they'll be nicely organized and still be run automatically when you run ./manage.py test.

Use the related_name Keyword
In models where you use a ForeignKey, always specify a related_name. You're going to use it anyway eventually, so you may as well be consistent. It will also make your code (and your models) clearer to the maintenance programmer, who will probably be you. Finally, adding one later on could break existing code that was using the default syntax.

Don't Use SQLite
This one upsets me, because I love SQLite. However, it doesn't support certain constraints, which means tests will fail that would pass on the production database. Worse, some constraints do work, but only when a table is created, not when they're added later by a migration. Also, it doesn't support all the functionality required to use South, like the ability to delete columns from a table.

May your next project be a Django project!

--ShawnMilo

Sunday, July 11, 2010

A Simple Celery with Django How-To

I've been wanting to give Celery a try for a while now, but every time I tried I ran into a major issue. Either I'm too stupid or the documentation completely fails to demonstrate a simple, end-to-end working example if you're not using RabbitMQ.

So, I finally cobbled something together from countless bits of documentation from the official site and some Google searches. Here it is for your amusement.

Details: I'm using my Django project's database as the message queue (via ghettoq) and MongoDB to store the task results.

Preparation:
  • Use pip to install celery and ghettoq.
  • Add ghettoq to your INSTALLED_APPS.
  • Run syncdb.

Then you'll need a file named celeryconfig.py which exists on your PYTHONPATH. Here's mine:

#!/usr/bin/env python

from django.conf import settings

CARROT_BACKEND = "ghettoq.taproot.Database"

BROKER_HOST = settings.MONGO_HOST
BROKER_PORT = settings.MONGO_PORT

CELERY_RESULT_BACKEND = "mongodb"
CELERY_MONGODB_BACKEND_SETTINGS = {
    "host": settings.MONGO_HOST,
    "port": settings.MONGO_PORT,
    "database": "celery",
    "taskmeta_collection": "celery_collection",
}

CELERY_IMPORTS = ("myproject.myapp.tasks", )

Notes on the above:
  • I have my MongoDB settings in my settings.py file.
  • I've created a file named tasks.py and put it in the folder of one of my Django apps.

The tasks file contains two functions. One will be run whenever I call it from a Django view. The other will run every five minutes, cron-style.

That tasks file looks like this:

from celery.decorators import task
from celery.task.schedules import crontab
from celery.decorators import periodic_task

from myproject.myapp.models import Appointment

from datetime import datetime, timedelta
import time

@task
def sample_task():
    print "gonna do something now and later!"
    time.sleep(10)
    print "It's later!"

@periodic_task(run_every=crontab(minute="*/5"))
def send_appointment_reminders():

    # appointments one day away
    start_date = datetime.now() + timedelta(days=1)
    end_date = start_date + timedelta(minutes=5)

    appointments = Appointment.objects.filter(
        scheduled__gte=start_date,
        scheduled__lt=end_date,
    )

    print "%d appointments found." % (appointments.count())

    # send appointment reminder for this appointment now
    pass
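
A note on the date range in the periodic task: the window is five minutes wide to match the five-minute schedule, and it is half-open (gte on the start, lt on the end), so back-to-back runs should not pick up the same appointment twice. The sketch below shows that with plain datetime arithmetic. The reminder_window helper is my own name for illustration, not part of Celery or Django, and it assumes each run starts exactly on schedule. In real life datetime.now() will drift by a few seconds, so you may want to round it down to the minute first.

```python
from datetime import datetime, timedelta

# Assumed to match the crontab(minute="*/5") schedule.
INTERVAL = timedelta(minutes=5)


def reminder_window(now, lead=timedelta(days=1), interval=INTERVAL):
    """Return the (start, end) range of appointment times to remind about.

    The range is meant to be used half-open: start <= scheduled < end.
    """
    start = now + lead
    return start, start + interval


# Two consecutive runs, exactly five minutes apart.
first = reminder_window(datetime(2010, 7, 11, 9, 0))
second = reminder_window(datetime(2010, 7, 11, 9, 5))

# The end of one window is the start of the next, so the windows tile
# the timeline with no gap and no overlap.
windows_line_up = first[1] == second[0]
```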

Obviously the functions here are incomplete and trivial, but they cover my two basic needs:

  • Run something asynchronously from a view.
  • Run something on a schedule without decoupling it from my Django project, which would make it harder to manage and maintain configurations between development and production. (A.K.A. "avoid cron.")

Last, but not least, add some code to one of your Django views:

from myproject.myapp import tasks

result = tasks.sample_task.delay()
print "result was %s" % (result,)

Then just run celeryd -B and you're all set. (The -B tells the daemon to also run the celerybeat daemon, which runs the Celery 'cron' functions. Otherwise you have to run celerybeat separately or the cron jobs are ignored.)

That's it! What's missing here is an explanation of how to obtain the results of the tasks later on. I haven't gotten that far yet, but the "result" object in the sample above carries the task's UUID. You can save that UUID now and use it later to query the status of your asynchronous function call.

I hope this helps someone.