This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the unix category.
Last Updated: 2025-01-18
I had this command to generate a random string:
tr -dc "[:alnum:]" </dev/urandom | head -c 32
tr: Illegal byte sequence
The error happened because /dev/urandom produced bytes that do not form valid UTF-8, which tr expected because of the locale environment variables set in my shell:
LC_ALL=en_US.UTF-8
LC_CTYPE=UTF-8
What we want to do, just for the duration of the command, is tell tr not to decode the input as characters in a multibyte encoding, i.e. to treat it as a plain sequence of bytes. Normally this is done by setting LC_CTYPE=C. However, there is another relevant variable: LC_ALL. It takes precedence over LC_CTYPE, and because my terminal profile sets LC_ALL (to a UTF-8 variant), I had to set LC_ALL instead for it to work.
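The precedence can be checked with the locale utility, which prints the values a program will actually see. This is a minimal sketch, assuming a POSIX locale command is installed; effective_ctype is a hypothetical helper name used only for illustration.

```shell
# effective_ctype is a hypothetical helper (not from the original command).
# It prints the LC_CTYPE value that programs will actually use.
effective_ctype() {
  # 2>/dev/null hides warnings when a named locale is not installed
  locale 2>/dev/null | sed -n 's/^LC_CTYPE=//p' | tr -d '"'
}

# LC_CTYPE=C alone loses to an LC_ALL that is already set ...
(export LC_ALL=en_US.UTF-8 LC_CTYPE=C; effective_ctype)

# ... whereas overriding LC_ALL itself takes effect.
(export LC_ALL=C LC_CTYPE=en_US.UTF-8; effective_ctype)
```

The parentheses run each check in a subshell, so the exported variables do not leak into the rest of the session.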
The working command:
LC_ALL=C tr -dc "[:alnum:]" </dev/urandom | head -c 32
fQLKSWK4kPUmEAzPnHMI0JdUm0sXsR6w
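Briefly, it:
1. Reads random bytes from /dev/urandom.
2. Tells tr to delete alphanumeric characters (-d "[:alnum:]") but then takes the complement of that set (-c). The overall effect is to delete everything that IS NOT alphanumeric.
3. Keeps only the first 32 of the remaining bytes (head -c 32).
For example, with and without -c: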
LC_ALL=C tr -dc "[:alnum:]" </dev/urandom | head -c 32
fQLKSWK4kPUmEAzPnHMI0JdUm0sXsR6w
LC_ALL=C tr -d "[:alnum:]" </dev/urandom | head -c 32
�"���%�����
�������Ŕ�˒�%
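If the command is needed in more than one place, it could be wrapped in a small function. This is only a sketch: random_alnum and its optional length argument are hypothetical names introduced here, and the body is the working command from above with the length made configurable.

```shell
# random_alnum is a hypothetical helper name, not part of any standard tool.
# Usage: random_alnum [LENGTH]   (LENGTH defaults to 32)
random_alnum() {
  # LC_ALL=C applies only to tr, so it treats its input as raw bytes
  LC_ALL=C tr -dc '[:alnum:]' </dev/urandom | head -c "${1:-32}"
  # head -c emits no trailing newline, so add one for terminal use
  echo
}

random_alnum      # one line of 32 random alphanumeric characters
random_alnum 16   # one line of 16 random alphanumeric characters
```

Because the variable assignment prefixes a single command, the locale of the calling shell is left unchanged.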