Uniq only makes consecutive items unique

Last Updated: 2024-12-03

I wanted to get the counts of different IP addresses in some logs. I ran:

$ cat 1-minute-of-production.logs | awk ' { split($15, a, "="); print a[2]} '

# Output:

10.113.192.240
10.113.192.240
10.31.87.172
10.31.87.172
10.29.126.3
10.47.238.26
10.113.192.240
10.113.192.240
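
For readers unfamiliar with the awk step: `split($15, a, "=")` breaks the 15th whitespace-separated field at `=`, and `a[2]` is the part after it. Here is a minimal sketch using a made-up log line. The placeholder fields and the `fwd` key are assumptions for illustration, not the real log format.

```shell
#!/bin/sh
# Hypothetical log line: 14 placeholder fields, then a key=value pair in field 15.
line='f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 fwd=10.113.192.240 f16'

# Split field 15 on "=" and print the value half.
ip=$(printf '%s\n' "$line" | awk '{ split($15, a, "="); print a[2] }')

printf '%s\n' "$ip"
```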

Then, when I piped it into uniq -c, I got:

... | uniq -c
 2 10.113.192.240
 2 10.31.87.172
 1 10.29.126.3
 1 10.47.238.26
 2 10.113.192.240

This is not what I wanted: 10.113.192.240 is listed twice, once at the top and once at the bottom, each time with a count of 2, instead of once with a count of 4.

The issue is that uniq only collapses consecutive duplicate lines. Therefore I needed to sort first, so that identical lines end up next to each other:

 ... | sort | uniq -c
   4 10.113.192.240
   1 10.29.126.3
   2 10.31.87.172
   1 10.47.238.26
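
To reproduce the difference without the production logs, here is a small sketch that feeds the same eight addresses through both pipelines. The sample input is hard-coded as a stand-in for the awk step, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Stand-in for the awk output above: the same eight addresses, in the same order.
ips='10.113.192.240
10.113.192.240
10.31.87.172
10.31.87.172
10.29.126.3
10.47.238.26
10.113.192.240
10.113.192.240'

# Without sort: uniq only collapses adjacent duplicates,
# so 10.113.192.240 shows up in two separate groups.
unsorted_counts=$(printf '%s\n' "$ips" | uniq -c)

# With sort: identical lines become adjacent, so each address gets one count.
sorted_counts=$(printf '%s\n' "$ips" | sort | uniq -c)

printf 'uniq -c alone:\n%s\n\n' "$unsorted_counts"
printf 'sort | uniq -c:\n%s\n' "$sorted_counts"
```

The exact order of the sorted lines can depend on your locale, but the counts per address should not.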

If you don't need counts, sort -u is usually faster than sort | uniq, since it avoids starting a second process and the IPC of piping every line between the two.
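
As a quick check that the two forms agree when no counts are needed, here is a sketch on the same sample addresses. The input is hard-coded and the variable names are hypothetical.

```shell
#!/bin/sh
# Same eight sample addresses as in the log excerpt above.
ips='10.113.192.240
10.113.192.240
10.31.87.172
10.31.87.172
10.29.126.3
10.47.238.26
10.113.192.240
10.113.192.240'

# One process: sort drops duplicates itself.
with_flag=$(printf '%s\n' "$ips" | sort -u)

# Two processes joined by a pipe: sort, then uniq removes adjacent duplicates.
with_pipe=$(printf '%s\n' "$ips" | sort | uniq)

if [ "$with_flag" = "$with_pipe" ]; then
  echo "sort -u and sort | uniq produced the same lines"
fi
printf '%s\n' "$with_flag"
```

Note that sort -u has no equivalent of uniq -c, so the two-step pipeline is still needed whenever you want the counts.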