This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the concurrency category.
Last Updated: 2024-11-23
I had some code that downloaded 1000 markdown files from a GitHub repo at boot and cached them in RAM. I noticed, however, that this was happening once in each of my worker processes, not only wasting resources, but causing conflicting behavior.
The solution was to branch on whether the process was a server or a worker. In Rails this entailed the following:
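    config.after_initialize do
      # ::Rails::Server is only defined when the app boots via `rails server`,
      # so worker processes skip the download.
      CodeDiary.load_articles_into_memory if defined?(::Rails::Server)
    end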
In Django with gunicorn, I needed a custom solution: worker 1 touched a file, and every worker then branched on whether that file had already been touched (such that workers 2-12 did not activate the download).
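The marker-file idea is not specific to Django, so here is a minimal sketch of it in Ruby, to match the Rails snippet above. The helper name `claim_boot_task` and the marker path are hypothetical, not what the original project used. It assumes all workers share one filesystem, and it creates the file with `File::EXCL` so that checking and touching happen in a single atomic step instead of two.

```ruby
# Hypothetical helper (not from the original project): returns true only for
# the first process that manages to create the marker file.
# File::CREAT | File::EXCL makes the open fail with Errno::EEXIST if the file
# already exists, so two workers cannot both win the race.
def claim_boot_task(marker_path)
  File.open(marker_path, File::WRONLY | File::CREAT | File::EXCL) do |f|
    f.write(Process.pid.to_s) # record the winner, handy when debugging
  end
  true
rescue Errno::EEXIST
  false
end
```

A worker would then run the expensive step only if `claim_boot_task("/tmp/articles.marker")` returns true (the path is an assumption). A marker left over from a previous boot would block every worker, so it probably needs to be deleted on deploy or when the master process starts.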
With multiple dynos on Heroku, I also set an ENV var per dyno and used this to prevent excess work.
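As a rough sketch of the ENV-var branch: the variable name `CODE_DIARY_BOOT_ROLE` and the value `"primary"` are made up for illustration, since the entry does not say which variable was set or how. The check takes the environment as an argument so it can be exercised without touching the real `ENV`.

```ruby
# Hypothetical variable name and value; the original does not say which were used.
BOOT_ROLE_VAR = "CODE_DIARY_BOOT_ROLE"

# True only when the process was started with the variable set to "primary".
# Accepts any Hash-like object so the check can be tested in isolation.
def should_load_articles?(env = ENV)
  env[BOOT_ROLE_VAR] == "primary"
end
```

The boot hook could then guard the download with `should_load_articles?`, assuming the variable is set differently for each process, for example in the command that starts it.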