Treating a log4j log like a database

I’m finally posting a Python module I wrote months ago that I have found to be very handy.

Our log4j log files get big. Really big. It doesn’t help that we log just about everything under the sun, which makes it hard to dig through them and find the messages we are looking for. I got to thinking, “Since every entry has the same set of fields, the log file is really just a database. Why can’t I query it like one?” I know that Chainsaw does something like this, but I wanted the same ability in my own scripts. Python scripts, that is.

Thus loginfo.py was born. The module contains one class, LogInfo, that parses the log and then lets you run simple queries on it. You can grab all INFO messages, all messages containing a certain word, or all messages from the past X hours or minutes. You can also chain these queries, so you can get all INFO messages that occurred in the past 4 hours.
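To give a feel for how chained queries like this can work, here is a minimal sketch. It is not the actual loginfo.py code: the class name LogQuery, its method names, the regular expression, and the assumed log layout (timestamp, severity, class in square brackets, message) are all assumptions made for illustration, and a real log4j pattern may well differ.

```python
import re
from datetime import datetime, timedelta

# Assumed layout: "2009-03-14 10:22:01,123 ERROR [com.example.Foo] message text"
# (this depends on your log4j ConversionPattern; adjust the regex to match).
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+"
    r"(?P<sev>[A-Z]+)\s+\[(?P<cls>[^\]]+)\]\s+(?P<message>.*)$"
)


class LogQuery(object):
    """Hypothetical stand-in for LogInfo: parsed entries plus chainable filters."""

    def __init__(self, entries=None):
        self.entries = list(entries or [])

    def add_entry(self, line):
        # Lines that do not match (e.g. stack-trace continuations) are skipped.
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            return
        self.entries.append({
            "date": datetime.strptime(match.group("date"), "%Y-%m-%d %H:%M:%S"),
            "sev": match.group("sev"),
            "class": match.group("cls"),
            "message": match.group("message"),
        })

    def _filter(self, predicate):
        # Every query returns a new LogQuery, which is what makes chaining work.
        return LogQuery(e for e in self.entries if predicate(e))

    def severity(self, level):
        return self._filter(lambda e: e["sev"] == level)

    def error(self):
        return self.severity("ERROR")

    def contains(self, word):
        return self._filter(lambda e: word in e["message"])

    def hours_ago(self, hours, now=None):
        # "now" can be passed in so the cutoff is reproducible.
        cutoff = (now or datetime.now()) - timedelta(hours=hours)
        return self._filter(lambda e: e["date"] >= cutoff)

    def __iter__(self):
        return iter(self.entries)

    def __len__(self):
        return len(self.entries)
```

Because each filter hands back another LogQuery rather than a plain list, a call such as q.error().hours_ago(4) simply narrows the previous result, much like stacking WHERE clauses.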

Need an example? Here is the example in the module:

import sys

log = LogInfo()

# feed every line of the log file named on the command line to the parser
with open(sys.argv[1]) as lfile:
    for x in lfile:
        log.addEntry(x)

## prints out all error messages from the past eight hours
for msg in log.error().hoursAgo(8):
    print("%s %s %s %s" % (msg['date'], msg['sev'], msg['class'], msg['message']))

Simple, huh? Grab the module here: loginfo.py
