Reading The Logs


For production, there is nothing that can replace well thought-out logs. They are the #1 tool for understanding what is going on in a system. Often, they are the only real way to understand what is going on even while you are developing. This is especially true in multi-threaded programs, where stepping through them in a debugger is not very effective.

The problem with logs is that there are usually too many of them. Following the trail of logs can be a daunting task when your system produces 50 log messages per minute on idle, and hundreds or thousands of messages per minute while working. Just playing with the levels is not enough to make sense of things; you often need to correlate several messages, often at various logging levels, from different loggers.

All of this points me to the not so surprising conclusion that logs are data, and that I should be able to handle them like I handle all other data sources (selecting, grouping, filtering, etc). Luckily for me, it is very easy to make log4net write to a database. I can't count how many times the ability to slice and dice the data has saved me. For instance, being able to find that between 10:05 and 10:15 the number of errors from the web service passed 50, which tripped a trigger in another part of the system, which caused all deliveries to arrive an hour late. There is simply no way I could look at the messages for the whole day and find that out without this.
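To make the "logs are data" point concrete, here is a minimal sketch in Python of that kind of query. It uses an in-memory SQLite database rather than the database log4net was actually writing to, and the `log` table, its column names, the timestamps, and the sample rows are all made up for illustration.

```python
import sqlite3


def errors_in_window(conn, start, end, threshold):
    """Return (logger, count) pairs whose ERROR count in [start, end) exceeds threshold."""
    query = """
        SELECT logger, COUNT(*) AS error_count
        FROM log
        WHERE level = 'ERROR' AND logged_at >= ? AND logged_at < ?
        GROUP BY logger
        HAVING COUNT(*) > ?
        ORDER BY error_count DESC
    """
    return conn.execute(query, (start, end, threshold)).fetchall()


def build_sample_db():
    """Create a hypothetical log table and fill it with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE log (logged_at TEXT, level TEXT, logger TEXT, message TEXT)"
    )
    # 60 web service errors spread over 10:05 to 10:14.
    rows = [
        (f"2024-01-01 10:{5 + i % 10:02d}:{i:02d}", "ERROR", "WebService", "timeout")
        for i in range(60)
    ]
    # Noise: an error outside the window, an INFO message, a lone error elsewhere.
    rows.append(("2024-01-01 09:30:00", "ERROR", "WebService", "timeout"))
    rows.append(("2024-01-01 10:07:00", "INFO", "Scheduler", "tick"))
    rows.append(("2024-01-01 10:08:00", "ERROR", "Scheduler", "late delivery"))
    conn.executemany("INSERT INTO log VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_sample_db()
    print(errors_in_window(conn, "2024-01-01 10:05:00", "2024-01-01 10:15:00", 50))
```

The timestamps are stored as ISO-style text so that plain string comparison sorts them correctly. The same `WHERE` / `GROUP BY` / `HAVING` shape should carry over to whatever table your own logging setup writes to.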

So, after extolling the value of databases, why do I bother to post this?

Well, there are two problems that I run into when logging to the database. The first is that if the database is down, you don't get any logging (and in log4net 1.2.9, if the database goes down, you won't be logging to the database afterward, even after it comes back up). The second is that if the application is at a client site and I get a call about an issue, there isn't much I can do about it without the logs.
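The first problem can be illustrated outside of log4net. The following is only a sketch of the general idea, written against Python's standard `logging` module and SQLite: a handler that tries the database first, hands the record to a fallback handler when the write fails, and retries the connection on the next message. The `DatabaseHandler` class, the `connect` callable, and the `log` table are hypothetical names, and this is not how log4net behaves.

```python
import logging


class DatabaseHandler(logging.Handler):
    """Write records to a database, falling back to another handler on failure."""

    def __init__(self, connect, fallback):
        super().__init__()
        self.connect = connect    # hypothetical callable returning a DB-API connection
        self.fallback = fallback  # any logging.Handler, e.g. a FileHandler
        self.conn = None

    def emit(self, record):
        try:
            if self.conn is None:
                # (Re)connect lazily, so a database that comes back up is picked up.
                self.conn = self.connect()
            self.conn.execute(
                "INSERT INTO log (level, logger, message) VALUES (?, ?, ?)",
                (record.levelname, record.name, record.getMessage()),
            )
            self.conn.commit()
        except Exception:
            # Drop the connection so the next record retries, and keep the message.
            self.conn = None
            self.fallback.handle(record)
```

Used with `logging.FileHandler("fallback.log")` as the fallback, messages written during an outage end up in the file instead of disappearing, and database logging resumes on the first message after the database is reachable again.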

The guy in charge of letting me know about problems can't read the logs by himself. Ideally, he could just email me the logs, and I would go over them and understand what the problem was. The problem is that getting an export from the DB requires a DBA, who is not always available. Writing to a file and getting that is possible, but I already explained how hard it is to get to the point from a file. Getting log4net to produce a file that can be imported into SQL is possible, but it probably isn't trivial.
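As a rough illustration of the "file that can be imported" idea, here is a sketch using Python's standard `logging`, `csv`, and `sqlite3` modules instead of log4net. The formatter writes each record as one CSV row, and a small helper loads such rows into a table. `CsvFormatter`, `import_csv_log`, and the `log` table layout are assumptions made for this example.

```python
import csv
import io
import logging
import sqlite3


class CsvFormatter(logging.Formatter):
    """Format each record as one CSV row: time, level, logger, message."""

    def format(self, record):
        buffer = io.StringIO()
        csv.writer(buffer).writerow(
            [self.formatTime(record), record.levelname, record.name, record.getMessage()]
        )
        # The csv writer appends a line terminator; the log handler adds its own.
        return buffer.getvalue().rstrip("\r\n")


def import_csv_log(conn, lines):
    """Load CSV log rows (an iterable of lines, or an open file) into a log table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log "
        "(logged_at TEXT, level TEXT, logger TEXT, message TEXT)"
    )
    conn.executemany("INSERT INTO log VALUES (?, ?, ?, ?)", csv.reader(lines))
    conn.commit()
```

Letting the `csv` module do the quoting means commas and quotes inside a message survive the round trip. Messages that contain line breaks need the file to be opened with `newline=""` on the way back in, which is one of the reasons this kind of export is less trivial than it first looks.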

Messy, messy situation. I became familiar with a lot of large file viewers, and with keeping a lot of things in my head at the same time while trying to figure out a problem. I also learned to recognize when this was not a good idea, and to drive there and take a look at the database instead.

I have a solution for this, which I will post shortly, but in the meantime, I am interested in what you think about this issue...