This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the unix category.
Last Updated: 2025-01-18
I wanted to get the counts of different IP addresses in some logs. I ran:
$ cat 1-minute-of-production.logs | awk ' { split($15, a, "="); print a[2]} '
# Output:
10.113.192.240
10.113.192.240
10.31.87.172
10.31.87.172
10.29.126.3
10.47.238.26
10.113.192.240
10.113.192.240
Then I piped it into uniq -c and got:
... | uniq -c
2 10.113.192.240
2 10.31.87.172
1 10.29.126.3
1 10.47.238.26
2 10.113.192.240
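This is wrong: the first and last entries are the same IP address (10.113.192.240), yet it shows up as two separate groups of 2 instead of one group of 4.
The issue is that uniq only collapses adjacent duplicate lines; it never compares lines that are further apart in the input. Therefore I needed to sort first, so that identical addresses end up next to each other: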
... | sort | uniq -c
4 10.113.192.240
1 10.29.126.3
2 10.31.87.172
1 10.47.238.26
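To reproduce the difference without the production logs, here is a minimal sketch. The ips variable is a hypothetical stand-in that holds the same eight addresses as the output above; it is not part of the original pipeline.

```shell
# Hypothetical sample data standing in for the extracted IP column.
ips='10.113.192.240
10.113.192.240
10.31.87.172
10.31.87.172
10.29.126.3
10.47.238.26
10.113.192.240
10.113.192.240'

# Unsorted: uniq -c only merges adjacent duplicates, so the two separate
# runs of 10.113.192.240 are reported as two groups.
printf '%s\n' "$ips" | uniq -c

# Sorted first: identical lines become adjacent, so each address gets
# a single total.
printf '%s\n' "$ips" | sort | uniq -c
```

The first pipeline should print five groups and the second four, one per distinct address.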
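If you don't need counts, sort -u is usually faster than sort | uniq, since it deduplicates within a single process and avoids the inter-process communication (IPC) of piping the data to a second one.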
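As a quick check that the two forms give the same list when counts are not needed, here is a small sketch. The sample variable is a hypothetical stand-in for the extracted addresses.

```shell
# Hypothetical sample data: three distinct addresses, with duplicates.
sample='10.31.87.172
10.113.192.240
10.31.87.172
10.29.126.3
10.113.192.240'

# One process: sort and drop duplicate lines in a single step.
printf '%s\n' "$sample" | sort -u

# Two processes: sort, then let uniq drop the adjacent duplicates.
printf '%s\n' "$sample" | sort | uniq
```

Both pipelines should print the same three addresses in the same order.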