This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the dumb-mistakes-and-gotchas category.
Last Updated: 2024-11-21
I accidentally paid some of my sellers twice once month at Oxbridge Notes causing an accounting nightware.
Basically the accounting code ran every 10 minutes. Had it finished within 10 minutes, it would have set some state in the DB preventing the possibility of double payment. But -- and I did not anticipate this -- one month the job took over 10 minutes to run, meaning a 2nd invocation of it started while the previous one was still in action.
This lead to race conditions since processA did not know about processB (and processA has already taken its picture of the DB's state at the start of its runtime and didn't re-check for any potential effects of processB during its execution, so therefore had stale state).
One way to deal with this would be to do a fresh DB query before the critical action (in my case: sending money):
# I use the reload method to re-read from the DB
seller_invoice.pay unless seller_invoice.reload.paid?
This still isn't bullet-proof: for that we would need a lock (db mutex) to stop other proceses from accessing that row:
seller_invoice.with_lock do
# reload is not necessary this time, since the code in the 2nd process only
# executes on this record when the first one is done, thanks to the lock.
seller_invoice.pay unless seller_invoice.paid?
end
With the lock, the first process (/"job") executes unaffected, whereas the second
job will halt at the line with_lock
and not execute the code within the block
until the first process has done its thing. That 2nd process won't even grab the data!
This locking behavior can be demonstrated with a simple demo. Paste these into side-by-side tabs in your REPL:
u = User.find(21)
u.with_lock do
puts "start"
puts u.store
sleep 5
puts "awake"
u.update(store: "fr")
end
u = User.find(21)
u.with_lock do
puts "start"
puts u.store
sleep 5
puts "awake"
u.update(store: "au")
end
You will notice that the second entry won't print start until the first one is done.
Always use DB locks to prevent race conditions in queues etc. This is especially important in code that deals with money.