This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the exceptions category.
Last Updated: 2024-11-21
I ran out of GPU memory in a Jupyter notebook on a server after hitting a few exceptions. The garbage collector could not reclaim that memory, so I had to rerun the whole notebook, which took ages, it being ML.
It turns out that Python was keeping the last exception around, and that exception held a reference to a massive PyTorch dataset (most likely through the local variables of the stack frames in its traceback). Instead of restarting, I could have gotten rid of that reference by raising a new exception, one with a negligible footprint:
1/0
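An exception can keep huge amounts of memory (RAM or GPU) alive, because the interpreter holds on to the last one raised along with everything it references. The easiest way to kill that reference is to raise a new, cheap exception that replaces it.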
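To make the mechanism concrete, here is a minimal sketch that reproduces the effect without PyTorch. It assumes the memory is held through the traceback's frame locals. `BigObject`, `failing_step`, `probes`, `saved` and `release_last_traceback` are hypothetical names invented for the illustration, and `saved` only mimics an interactive shell remembering the last exception.

```python
import gc
import sys
import traceback
import weakref


class BigObject:
    """Hypothetical stand-in for a large dataset or tensor."""

    def __init__(self):
        self.payload = bytearray(1_000_000)


def failing_step(probes):
    big = BigObject()                # local variable living in this frame
    probes.append(weakref.ref(big))  # weak reference: lets us observe its lifetime
    raise ValueError("simulated failure")


probes = []
try:
    failing_step(probes)
except ValueError as exc:
    # Mimic an interactive shell that remembers the last exception.
    saved = exc

# The exception references its traceback, the traceback references the
# frame of failing_step, and that frame still references `big`.
alive_before = probes[0]() is not None

# Drop the frame locals without raising anything new.
traceback.clear_frames(saved.__traceback__)
gc.collect()
alive_after = probes[0]() is not None

print(alive_before, alive_after)


def release_last_traceback():
    """Hypothetical helper: clear frame locals held by the interpreter's
    last unhandled exception, if it recorded one in sys.last_traceback."""
    tb = getattr(sys, "last_traceback", None)
    if tb is not None:
        traceback.clear_frames(tb)
```

In a notebook, calling something like `release_last_traceback()` followed by `gc.collect()` may be a gentler alternative to `1/0`, assuming the shell stores the last traceback in `sys.last_traceback`. Raising a new small exception stays the simplest option, because it just replaces the stored one.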