Sunday, July 11, 2010

A Simple Celery with Django How-To

I've been wanting to give Celery a try for a while now, but every time I tried I ran into a major issue. Either I'm too stupid or the documentation completely fails to demonstrate a simple, end-to-end working example if you're not using RabbitMQ.

So, I finally cobbled something together from countless bits of documentation from the official site and some Google searches. Here it is for your amusement.

Details: I'm using my Django project's database as the message queue (via ghettoq) and MongoDB as the result store.

Preparation:
  • Use pip to install celery and ghettoq.
  • Add ghettoq to your INSTALLED_APPS.
  • Run syncdb.
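For reference, the INSTALLED_APPS change might look roughly like this. It's only a sketch: the other entries are placeholders (myproject.myapp is the example app name used further down), and the ghettoq line is the one that matters.

```python
# settings.py (fragment); every entry except "ghettoq" is a placeholder
INSTALLED_APPS = (
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "myproject.myapp",  # hypothetical app, matches the examples in this post
    "ghettoq",          # its models give syncdb the queue tables to create
)
```

Running syncdb afterwards is what actually creates the tables ghettoq uses.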

Then you'll need a file named celeryconfig.py which exists on your PYTHONPATH. Here's mine:

#!/usr/bin/env python

from django.conf import settings

# Use the Django database as the message queue, via ghettoq.
CARROT_BACKEND = "ghettoq.taproot.Database"

BROKER_HOST = settings.MONGO_HOST
BROKER_PORT = settings.MONGO_PORT

# Store task results in MongoDB.
CELERY_RESULT_BACKEND = "mongodb"
CELERY_MONGODB_BACKEND_SETTINGS = {
    "host": settings.MONGO_HOST,
    "port": settings.MONGO_PORT,
    "database": "celery",
    "taskmeta_collection": "celery_collection",
}

# Modules the worker imports at startup so it can find the tasks.
CELERY_IMPORTS = ("myproject.myapp.tasks", )

Notes on the above:
  • I have my MongoDB settings in my settings.py file.
  • I've created a file named tasks.py and put it in the folder of one of my Django apps.
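A minimal sketch of those MongoDB settings, assuming a local mongod on its default port. MONGO_HOST and MONGO_PORT are simply the names celeryconfig.py reads; the values below are assumptions for a development machine.

```python
# settings.py (fragment); values are assumed, adjust for your environment
MONGO_HOST = "localhost"
MONGO_PORT = 27017  # MongoDB's default port
```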

The tasks file contains two functions. One will be run whenever I call it from a Django view. The other will run every five minutes, cron-style.

That tasks file looks like this:

from datetime import datetime, timedelta
import time

from celery.decorators import task, periodic_task
from celery.task.schedules import crontab

from myproject.myapp.models import Appointment


@task
def sample_task():
    # Runs in the worker, so the ten-second sleep never blocks the view.
    print("gonna do something now and later!")
    time.sleep(10)
    print("It's later!")


@periodic_task(run_every=crontab(minute="*/5"))
def send_appointment_reminders():
    # Appointments one day away, within a five-minute window.
    start_date = datetime.now() + timedelta(days=1)
    end_date = start_date + timedelta(minutes=5)

    appointments = Appointment.objects.filter(
        scheduled__gte=start_date,
        scheduled__lt=end_date,
    )

    print("%d appointments found." % appointments.count())

    # TODO: send a reminder for each appointment found.
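The window arithmetic in send_appointment_reminders can be checked on its own, without Django or Celery. This is just a sketch; reminder_window and in_window are hypothetical helper names, not part of either library. It shows why a five-minute window pairs with a five-minute schedule: consecutive windows meet end to start, so an appointment falls into exactly one of them.

```python
from datetime import datetime, timedelta


def reminder_window(now, lead=timedelta(days=1), width=timedelta(minutes=5)):
    """Return the half-open interval [start, end) of appointment times."""
    start = now + lead
    return start, start + width


def in_window(scheduled, window):
    """Mirror the scheduled__gte / scheduled__lt filter from the task."""
    start, end = window
    return start <= scheduled < end


# Two runs exactly five minutes apart produce adjacent windows.
now = datetime(2010, 7, 11, 12, 0)
first = reminder_window(now)
second = reminder_window(now + timedelta(minutes=5))
```

In practice the task calls datetime.now() whenever it actually starts, so real runs are unlikely to be exactly five minutes apart. An appointment right at a window boundary could then be missed or picked up twice, which may be worth guarding against once reminders are really being sent.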

Obviously the functions here are incomplete and trivial, but they cover my two basic needs:

  • Run something asynchronously from a view.
  • Run something on a schedule without decoupling it from my Django project. Decoupling would make it harder to manage and maintain configurations between development and production. (A.K.A. "avoid cron.")

Last, but not least, add some code to one of your Django views:

from myproject.myapp import tasks

result = tasks.sample_task.delay()
print("result was %s" % (result,))

Then just run celeryd -B and you're all set. (The -B tells the daemon to also run the celerybeat daemon, which runs the Celery 'cron' functions. Otherwise you have to run celerybeat separately or the cron jobs are ignored.)

That's it! What's missing here is an explanation of how to obtain the results of the tasks later on. I haven't gotten that far yet, but "result" in the sample above is an AsyncResult object whose task_id is a UUID. You can save that UUID now and use it later to query the status of your asynchronous function call.
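As a starting point for that missing piece, here is a hedged sketch of checking on a task later. It is untested against this exact setup. summarize is a hypothetical helper; it only relies on the ready() and successful() methods and the result attribute that Celery's AsyncResult provides, so it works with any object shaped that way.

```python
def summarize(result):
    """Return a short status string for an AsyncResult-like object."""
    if not result.ready():
        # The worker hasn't finished (or hasn't started) the task yet.
        return "pending"
    if result.successful():
        return "done: %r" % (result.result,)
    # On failure, result.result holds the exception the task raised.
    return "failed: %r" % (result.result,)


# Assumed usage with a task id saved earlier, e.g. in a later view:
#
#     from celery.result import AsyncResult
#     print(summarize(AsyncResult(saved_task_id)))
```

The value saved from the first view would be result.task_id; saved_task_id above is a placeholder for wherever you keep it (session, database row, and so on).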

I hope this helps someone.

2 comments:

  1. Nice!

    Btw, with django-celery (http://pypi.python.org/pypi/django-celery) you can drop those celeryconfig.py lines directly into your project's settings.py.

    Just install django-celery, and add djcelery to INSTALLED_APPS.

  2. You're certainly correct about django-celery. I started out wanting the simplest possible setup that would work, without even necessarily tying it to Django. That's not how it ended up, obviously. :o/
