Delayed Job and the Leaky Abstraction

Delayed Job has some failings as a background queue (it uses an ORM, can serialize lots of objects to the datastore, mixes scheduling and queuing, requires a Rails environment per process), but it is easy to set up and works well for a lot of sites. You don’t need to set up new infrastructure, which is highly attractive to me.

However, one feature of Delayed Job that I really dislike is its delay abstraction:

Class.delay.method_call

Or, worse:

instance.delay.method_call

This gives you the semblance of transparent background work, but in reality it serializes the object to YAML and reconstitutes it in the worker process, and that round trip is not foolproof.

Recently, I was debugging a problem in an app that used Delayed Job to send notifications in the background. The code looked a bit like this:

notification = Notification.new(some_params)
notification.user = current_user
notification.delay.deliver

This worked in the tests but not in production, yet the background job was not reporting any failure.

I opened the console and deserialized the job in question:

Delayed::Job.find(id).payload_object

However, I wasn’t able to trace the bug all the way down the stack, because of a strange problem deserializing the object on the console. It blew up with the message uninitialized constant Syck::Syck (NameError). There was no error when actually running the code, but I was fairly certain the problem was with how the user object was deserialized in the job.

In deliver, the code looked something like this:

user.account.owners.each do |owner|
  send_notification(owner, some_params)
end

The job ran without error, but no notifications were sent. Why? Because the owners collection was empty. When the user object was serialized, it had the account association loaded but not the owners. I think the deserialized object ended up in a state where it believed owners was already loaded, so it never queried the database, even though the collection it held was empty.
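To make the suspected mechanism concrete, here is a minimal plain-Ruby sketch. It does not use Rails or Delayed Job: CachedAccount, OWNERS_TABLE and mark_owners_loaded_as_empty! are hypothetical stand-ins for a model with a memoized association. It only shows that cached state survives a YAML round trip, not how the real app got into that state.

```ruby
require "yaml"

# Hypothetical stand-in for a model with a lazily loaded, memoized
# association. This is NOT ActiveRecord code, just an illustration.
class CachedAccount
  OWNERS_TABLE = { 1 => ["alice", "bob"] }.freeze # pretend database

  def initialize(id)
    @id = id
  end

  # Once @owners is set (even to an empty array, which is truthy in Ruby),
  # the "database" is never consulted again.
  def owners
    @owners ||= OWNERS_TABLE.fetch(@id, [])
  end

  # Simulates an association flagged as loaded while still empty.
  def mark_owners_loaded_as_empty!
    @owners = []
  end
end

# Dump to YAML and load it back, roughly what happens between enqueue and work.
# Newer Psych versions need unsafe_load to rebuild arbitrary classes.
def yaml_round_trip(object)
  yaml = YAML.dump(object)
  YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)
end

stale = CachedAccount.new(1)
stale.mark_owners_loaded_as_empty!

restored = yaml_round_trip(stale) # carries the cached empty collection along
fresh    = CachedAccount.new(1)   # looked up again, as a job taking an ID would

puts restored.owners.inspect # => []
puts fresh.owners.inspect    # => ["alice", "bob"]
```

The restored object never goes back to the "database", because its instance variables came along for the ride. An object built fresh from its ID does.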

In this case, a “cool” feature got in the way of getting my job done. I prefer the explicit approach.

Instead of using the delay sugar, it is much better to create a job class to handle this:

class SendNotificationJob < Struct.new(:user_id, :some_params)
  def perform
    user = User.find(user_id)
    user.account.owners.each do |owner|
      send_notification(owner, some_params)
    end
  end
end

And use it like this:

Delayed::Job.enqueue(SendNotificationJob.new(user.id, some_params))

As you can see, instead of passing the user object, I pass just its ID. This has a number of benefits:

Smaller objects to serialize

Google App Engine’s deferred library (which was inspired by Delayed Job) calls this out explicitly. Google advises developers to use small functions and not to pass entities to the defer function.
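As a rough illustration of the size difference, the sketch below dumps two job payloads with Ruby's standard YAML library. BigUser, EagerJob and LeanJob are hypothetical names made up for this comparison, and the exact byte counts will depend on your objects and YAML version. A real ActiveRecord instance typically carries far more state than this struct.

```ruby
require "yaml"

# Hypothetical stand-ins, not part of Delayed Job or the app above.
BigUser  = Struct.new(:id, :name, :email, :preferences)
EagerJob = Struct.new(:user, :some_params)    # carries the whole object
LeanJob  = Struct.new(:user_id, :some_params) # carries only the ID

user   = BigUser.new(42, "Ada", "ada@example.com",
                     { "digest" => "weekly", "locale" => "en" })
params = { "subject" => "Welcome" }

eager_payload = YAML.dump(EagerJob.new(user, params))
lean_payload  = YAML.dump(LeanJob.new(user.id, params))

puts "whole object: #{eager_payload.bytesize} bytes"
puts "id only:      #{lean_payload.bytesize} bytes"
```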

No Magic

Anyone reading a Delayed::Job.enqueue call can see what it is doing.

Easier to read the delayed_jobs table for debugging

Compared to serializing an instance (as opposed to a class method call), you get a nicely readable database row per job with not much in it. In addition, if you use New Relic, you’ll see all your jobs by name.

Easier to test

You can write tests for each job that behave just as if the job had been run by a Delayed Job worker. I recommend running your tests with Delayed::Worker.delay_jobs = true, and testing the jobs separately. With delay_jobs set to false, jobs run inline at enqueue time, which is not how they behave in production.
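Testing a job separately can be as simple as building it and calling perform, which is the method a worker calls on the deserialized payload. The sketch below is self-contained and runs without Rails: StubUser, StubAccount, StubOwner and the SENT log are hypothetical stand-ins for the app's models and mailer, and a real test would use your own models and test framework.

```ruby
# Hypothetical stand-ins for the app's models.
StubOwner   = Struct.new(:email)
StubAccount = Struct.new(:owners)

class StubUser
  RECORDS = {} # pretend table, keyed by ID

  attr_reader :id, :account

  def initialize(id, account)
    @id = id
    @account = account
    RECORDS[id] = self
  end

  def self.find(id)
    RECORDS.fetch(id)
  end
end

# Same shape as the job class above, with the lookup pointed at the stub.
class SendNotificationJob < Struct.new(:user_id, :some_params)
  SENT = [] # records deliveries so the test can inspect them

  def perform
    user = StubUser.find(user_id)
    user.account.owners.each do |owner|
      send_notification(owner, some_params)
    end
  end

  private

  # Stand-in for real delivery: just log who would be notified.
  def send_notification(owner, params)
    SENT << [owner.email, params]
  end
end

account = StubAccount.new([StubOwner.new("a@example.com"),
                           StubOwner.new("b@example.com")])
StubUser.new(7, account)

# Exercise the job directly, without a queue or a worker process.
SendNotificationJob.new(7, { subject: "hi" }).perform

puts SendNotificationJob::SENT.map(&:first).inspect
# => ["a@example.com", "b@example.com"]
```

Because the job only holds an ID and some params, the test controls exactly what the job sees when it looks the user up.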