Jobs that loop over work should create sub-jobs for each iteration, lest one fail

This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the architecture category.

Last Updated: 2024-11-21

I have a background job in my code that emails customers every zip file contained in their order. Roughly, the code is as follows:

class EmailFilesJob < ApplicationJob
  def perform(order)
    order.zip_files_available_for_download.each do |zip_file|
      # Load the attachment from Paperclip storage
      file = Paperclip.io_adapters.for(zip_file.zip)
      # Send it synchronously, inside this same job
      GenericEmail.send_file(file, order.user.email).deliver_now
    end
  end
end

One customer placed an obscenely large order with ~60 zip files, one of which was too big to email. When this email files job executed, it got through the first 15 files, then on the 16th it hit the huge file and the job aborted.

Nothing too bad so far. Except that my job runner retried this job two more times, delivering the first 15 files three times apiece to the user's inbox. An unprofessional mess.

The lesson here is that background jobs that perform many smaller actions (e.g. send 60 emails) should somehow maintain progress (e.g. with state, or by queuing up sub-background jobs) to avoid breaking idempotency.
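
Maintaining progress with state might look like the following: record each delivery as it happens, then skip anything already sent when the job is retried. This is only a sketch, and it assumes a hypothetical emailed_at timestamp column on the zip files table (not something my schema actually had):

class EmailFilesJob < ApplicationJob
  def perform(order)
    order.zip_files_available_for_download.each do |zip_file|
      # Skip files already delivered by an earlier, partially-completed run
      next if zip_file.emailed_at.present?

      file = Paperclip.io_adapters.for(zip_file.zip)
      GenericEmail.send_file(file, order.user.email).deliver_now

      # Record progress so a retry resumes where the last run died
      zip_file.update!(emailed_at: Time.current)
    end
  end
end

With that in place, a retry picks up at file 16 instead of re-sending the first 15.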

In my case, though, the sub-job approach was even simpler, requiring only a one-word change:

class EmailFilesJob < ApplicationJob
  def perform(order)
    order.zip_files_available_for_download.each do |zip_file|
      file = Paperclip.io_adapters.for(zip_file.zip)
      # deliver_later queues up another job to send this one file
      GenericEmail.send_file(file, order.user.email).deliver_later
    end
  end
end

deliver_later queues up another job to send that one zip file, meaning that the overall EmailFilesJob executes perfectly every time, since all it does is queue up other jobs. Only the individual email that was too large would fail, and that single job can be retried safely without re-delivering the other 59 files.
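
The same pattern generalizes beyond mailers: have the outer job do nothing but enqueue one sub-job per unit of work. Here is a rough sketch of what that might look like, using a hypothetical EmailSingleFileJob and a ZipFile model name I am inventing for illustration:

class EmailFilesJob < ApplicationJob
  def perform(order)
    # The parent job only fans out, so it cannot half-complete a delivery
    order.zip_files_available_for_download.each do |zip_file|
      EmailSingleFileJob.perform_later(zip_file.id, order.user.email)
    end
  end
end

class EmailSingleFileJob < ApplicationJob
  def perform(zip_file_id, recipient_email)
    zip_file = ZipFile.find(zip_file_id)
    file = Paperclip.io_adapters.for(zip_file.zip)
    # A failure here retries only this one file, not all sixty
    GenericEmail.send_file(file, recipient_email).deliver_now
  end
end

Passing the record id rather than the file itself keeps the queued payload small and lets each sub-job fetch fresh data when it finally runs.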