
Log analysis with Unix

February 3, 2008

Sometimes I have to dig deep into a site's server logs to find the information I need for my analysis.
I have a variety of log analysers which between them can process most logs I come across, but sometimes I get something in a custom format which normal tools baulk at.

The answer? Use the Unix tools on my Mac to process the server logs and extract the information I need.

Say I have a logfile from a client in a custom/weird/old format. I usually want to analyse the path to purchase. If I know that the 5 steps in the path to purchase are called shop1.php, shop2.php, up to shop5.php, I can use grep to search for lines in this logfile where someone called one of the shop*.php pages:

grep -i 'shop[1-5]\.php' serverlog.log
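A minimal sketch of what that match looks like, using two made-up log lines (the field layout is an assumption; a real custom log will differ):

```shell
# grep keeps only the lines that request a purchase-path page; the
# backslash makes the dot match a literal "." rather than any character
printf '%s\n' \
  '10.0.0.1 - - [03/Feb/2008:10:00:01 +0000] "GET /shop1.php HTTP/1.1" 200 512' \
  '10.0.0.2 - - [03/Feb/2008:10:00:05 +0000] "GET /index.php HTTP/1.1" 200 1024' |
  grep -i 'shop[1-5]\.php'
```

Only the first line survives; the request for /index.php is filtered out.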

This pulls out whole lines of text, but they contain information I don't need if I'm just interested in how often a page in the purchase path is called. In this case I want the first 20 characters of the seventh whitespace-separated column (AWK numbers fields from 1, so that's $7, which in my logs holds the requested URL):

awk '{print substr($7,1,20)}'
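To see the field splitting in action, here is the same extraction run on one sample line (the common-log-style layout is an assumption):

```shell
# awk splits the line on whitespace: $4-$5 are the timestamp, $6 is
# "GET, and $7 is the request path; substr() trims it to 20 characters
echo '10.0.0.1 - - [03/Feb/2008:10:00:01 +0000] "GET /shop1.php HTTP/1.1" 200 512' |
  awk '{print substr($7, 1, 20)}'
# prints: /shop1.php
```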

Then I want to sort the extracted paths,

sort

count how many occurrences of each line there are

uniq -c
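The sort matters here: uniq -c only collapses adjacent duplicate lines, which a quick sketch on made-up paths shows:

```shell
# Without the sort, the two /shop1.php lines would not be adjacent and
# uniq -c would count them separately
printf '/shop1.php\n/shop2.php\n/shop1.php\n' | sort | uniq -c
# prints:
#    2 /shop1.php
#    1 /shop2.php
```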

then sort the lines in descending order of count (the -n flag sorts the counts numerically rather than as text)

sort -rn

I can pull all of this together and use the pipe "|" to join up all the commands:

grep -i 'shop[1-5]\.php' serverlog.log | awk '{print substr($7,1,20)}' | sort | uniq -c | sort -rn
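Putting the whole pipeline to work on a tiny invented log (the lines and layout are assumptions, just to show the shape of the output):

```shell
# Build a four-line sample log, then count hits per purchase-path page,
# busiest page first
cat > serverlog.log <<'EOF'
10.0.0.1 - - [03/Feb/2008:10:00:01 +0000] "GET /shop1.php HTTP/1.1" 200 512
10.0.0.2 - - [03/Feb/2008:10:00:02 +0000] "GET /shop1.php HTTP/1.1" 200 512
10.0.0.3 - - [03/Feb/2008:10:00:03 +0000] "GET /shop2.php HTTP/1.1" 200 256
10.0.0.2 - - [03/Feb/2008:10:00:04 +0000] "GET /index.php HTTP/1.1" 200 1024
EOF

grep -i 'shop[1-5]\.php' serverlog.log |
  awk '{print substr($7,1,20)}' |
  sort | uniq -c | sort -rn
# prints:
#    2 /shop1.php
#    1 /shop2.php
```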

Using Unix I can pretty much get any info I need. Cool, eh?

If you really want to get into this kind of stuff (and you may, given you are still reading this), you should check out the Unix text processing bible:
"Unix Text Processing, by Dale Dougherty and Tim O'Reilly".
