Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,546
|
Comments: 51,163
Privacy Policy · Terms
filter by tags archive
time to read 6 min | 1193 words

I have explained before what the constraints I have when working with logging. I got several good comments about it, and several has hit the nail directly on the head.

Just to recap, here is the list of my requirements:

  • The output from the logs should be deliverable by email by a layman. This basically means a file.
  • The ouptut should allow me to slice / dice the data. The best way is if I could utilize existing SQL skills to do this.
  • The system may run unattended for weeks at a time, so the log file can't reach huge proportions.
  • I'm not really interested in historical data - something that happened long time ago is not important.
  • Plug into my current logging infrastructure (log4net).
  • Nice to have - Convention over configuration

After giving this much thought, I decided to implement a RollingSqliteAppender. This is based of the existing ADO.Net appender, but is focused on utilizing Sqlite only. Sqlite needs only a single file for its database, so it answers the first requirement, it also has a runtime (for .Net) of about less than 500Kb, so size / installation is not an issue. Despite writing to a file, it is a fully fledged database, meaning that I get to do all sorts of interesting things to it.

The third / forth requirements are what prompted the creation of this appender, rather than using the existing ADO.Net one. I took a look at RollingFileAppender and implemented similar functionality for my appender as well. Although I am checking rows, rather size/date, since this is a bit easier  to check. What is basically does is save all the logs to a sqlite file, until the amount of rows in the database is filled, then it rename the file and create a new one. Another reason for the need to create my own appender is that when creating a new file, the database structure is created for you automatically.From the name, you can figure out that this is a log4net appender, so integrating it is not an issue.

The last part is about minimum configuration. I tend to use the pretty standard logging table fairly often. That means [date, thread, level, logger, message, exception], so the appender is setup to support this (90% scenario, I would guess) out of the box. Here is how you configure the loggerin:

<log4net>

       <appender name="rolling-sqlite-appender"

                       type="Rhino.Commons.Logging.RollingSqlliteAppender, Rhino.Commons"/>

       <root>

              <appender-ref ref="rolling-sqlite-appender"/>

       </root>

</log4net>

This is it.

Now, it does provide a set of options that you can override (because I need that functionality fairly often).

Here is a list of the ones unique for the RollingSqliteAppender:

  • CreateScript - the script to generate the table(s) when creating a new database file. Default is to create the standard table (date, thread, level, logger, message, exception) named "logs".
  • TableName - must match to the name of the table in CreateScript. Default: "logs"
  • FileNameFormat - file format for the file. Default is {0}.log4net, where {0} is the current process name (which result is mostly correct results in most cases).
  • Directory - where to save the files. Default: Current Directory
  • MaxNumberOfRows - number of rows in the database that triggers rolling the file. Note that this is not a hard number, it depends on your BufferSize settings and flush options. It is close enough for my needs. Default: 50,000
  • MaxNumberOfBackUps - number of backup files to save. Backup files are given a numeric prefix where lower is newer. Default: 3
Note: The default buffer size is set to 512, you may want to change that. This is probably the most common reason why people thinks that log4net is not working.

In my tests, each file was about 5Mb - 10Mb in size, and they compressed nicely to a more emailable size.

By the way, I really like the log4net architecture.

The code is here

Reading The Logs

time to read 3 min | 564 words

For production, there is nothing that can replace well thought-of logs. They are the #1 tool for understanding what is going on in a sytem. Often, they are the only real way to understand what is going on even when you are developing. This is often true in multi threaded programs, where just debugging them is not very effective.

The problem with logging is that there are usually too much of them. Following the trace of logs can be a daunting task when your system produce 50 logs per minutes on idle, and can produce hundreds and thousands of messages per minute while working. Just playing with the levels is not enough to make sense of things, you often need to correlate between several message, often at various logging levels, by different loggers.

All of this points me to the not so surprising conclusion that logs are data, and that I should be able to handle them like I handle all other data sources (selecting, grouping, filtering, etc). Luckily for me, it is very easy to make log4net write to a database. I can't think of how many times the ability to slice and dice the data have saved me. For instnace, being able to find that between 10:05 - 10:15 the number of errors from the web service is passing 50, which cause a trigger in another part of the system, which caused all deliveries to arrive an hour late. There is simply no way I could look at the messages for a the whole day and find that out without this.

So, after extolling the value of databases, why do I bother to post this?

Well, there are two problems that I run into when logging to the database. The first is that if the database is down, you don't do any logging (and in log4net 1.2.9, if the database is down, you won't be doing any logging afterward to the database, even if it came back up). The second is that if the application is at a client, and I get a call about an issue, there isn't much I can do about it without the logs.

The guy in charge for letting me know about problems can't read the issues by himself. Idealy, he could just email me the logs, and I would go over them and understand what the problem was. The problem is that to get an export from the DB requires a DBA to handle it, who is not always avialable. Writing to file and getting that is possible, but I already explained how hard it is to get to the point from a file. Getting log4net to produce a file that can be imported to SQL is possible, but it probably isn't trivail.

Messy, messy situation. I became familiar with a lot of large file viewers and remembering a lot of things at the same time when trying to figure out a problem. I also learned to recognize when this is not a good idea and driving there to take a look at the database.

I have a solution for this, which I will post shortly, but in the meantime, I am interested in what you think about this issue...

time to read 10 min | 1984 words


One of the biggest problems with abstractions is that they may allow you to do stupid things without them being obvious. In OR/M-land, that usually means SELECT N+1 issues.
The problem is that you often develop a certain functionality first, and only then realize that while you tested, all was fine and dandy on the five items that you had, but on the real system, you have 5,000, and the DBA is on its way to ER...

Anyway, I am currently working with Web Applications, and I wanted to get a good indication about what pages are troublesome.
Being who I am, I immediately began to design a framework that would correlate page requests to trace data from SQL Server, and then another system that would analyze it and spit out a report saying: "Wow, that ScaryReport.aspx page is making 30% of the calls in the application, take a look at that".

Not wishing to spent the next two years on this project, I decided to do something a bit more modest, and utilize the already existing infrastructure.
In this case ,the infrastructure is NHibernate and lot4net.

The secret tidbit is that NHibernate logs all queries to a log4net logger named "NHibernate.SQL". From there, it was a simple matter of adding a logging helpers to Rhino Commons that would output the current page and the current request id (the request hash code, basically).
Then, It was a matter of defining the following table:

CREATE TABLE [dbo].[NHibernatePerPageQueries](
   
[Id] [int] IDENTITY(1,1) PRIMARY KEY NOT NULL,
   
[RequestId] [int] NOT NULL,
   
[Date] [datetime] NOT NULL,
   
[Message] [nvarchar](max)  NOT NULL,
   
[PageURL] [nvarchar](max)  NOT NULL
)


Then, to define the appender:

<appender name="NHibernatePerPageAppender"
            
type="log4net.Appender.AdoNetAppender">
    
<
bufferSize value="10" />
    
<
connectionType value="System.Data.SqlClient.SqlConnection,
System.Data, Version=2.0.0.0, Culture=neutral,
PublicKeyToken=b77a5c561934e089
" />
    
<
connectionString value="Data Source=localhost;Initial
Catalog=Logs;User ID=logger;Password=logger;
" />
    
<
commandText value="INSERT INTO dbo.[NHibernatePerPageQueries]
([Date],[Message],[PageUrl],[RequestId]) VALUES (@log_date, @message,
@pageUrl,@currentRequestId)
" />
    
<
parameter>
           <
parameterName value="@log_date" />
          
<
dbType value="DateTime" />
          
<
layout type="log4net.Layout.RawTimeStampLayout" />
    
</
parameter>
     <
parameter>
           <
parameterName value="@message" />
          
<
dbType value="String" />
          
<
size value="4000" />
          
<
layout type="log4net.Layout.PatternLayout">
               
<
conversionPattern value="%message" />
          
</
layout>
     </
parameter>
     <
parameter>
           <
parameterName value="@pageUrl" />
          
<
dbType value="String" />
          
<
size value="2000" />
          
<
layout type="log4net.Layout.PatternLayout">
               
<
conversionPattern value="%property{nhibernate_page_url}" />
          
</
layout>
     </
parameter>
     <
parameter>
           <
parameterName value="@currentRequestId" />
          
<
dbType value="String" />
          
<
size value="2000" />
          
<
layout type="log4net.Layout.PatternLayout">
               
<
conversionPattern value="%property{current_request_id}" />
          
</
layout>
     </
parameter>
</
appender>


And defining the logger:

<logger name="NHibernate.SQL">
    
<
level value="DEBUG" />
    
<
appender-ref ref="NHibernatePerPageAppender" />
</
logger>


We are still not done, though, we need the following in the Application_Start():

GlobalContext.Properties["nhibernate_page_url"] = WebLoggingHelper.CurrentPage;
GlobalContext
.Properties["current_request_id"] = WebLoggingHelper.CurrentRequestId;


This is it, now I can correlate the number of queries per page hits, and act accordingly.
Normally, I think that the following queries should be enough:

-- Get pages ordered by number of total queries made from them

SELECT COUNT(*) [Number Of Queries In Page], PageUrl
FROM
NHibernatePerPageQueries
WHERE
substring(Message,1,1) != '@' -- remove parameters logs
GROUP
BY PageUrl ORDER BY COUNT(*)

 

-- Get pages ordered by number of queries per page

SELECT AVG(countOfQueries) [Number Of Queries In Page Per Request], PageUrl FROM (
    
SELECT COUNT(*) countOfQueries, PageUrl
     FROM NHibernatePerPageQueries
     WHERE substring(Message,1,1) != '@' -- remove parameters logs
    
GROUP BY PageUrl, RequestId ) innerQuery
GROUP
BY PageUrl ORDER BY AVG(countOfQueries)

 

Enjoy,
time to read 10 min | 1834 words


I just added a small http module to Rhino Commons. It is a very simple module that times how long it takes to process a page.
It only times the server-side processing, of course, but it is a great way to tell you where you need to pay attention.
It is using log4net to log the data, so you can redirect the output to a database, and from there, you can get all the data you want.

Configurating the module is very simple. Create the following table:

CREATE TABLE [dbo].[PagePerformance](
    
[Id] [int] PRIMARY KEY IDENTITY(1,1) NOT NULL,
    
[Date] [datetime] NOT NULL,
    
[Message] [nvarchar](max) NOT NULL,
    
[PageURL] [nvarchar](max) NOT NULL,
    
[Duration] [nvarchar](50) NOT NULL,
    
[StartRequest] [datetime] NOT NULL
)

Now, add the following to your <httpModules> tag in web.config:

<httpModules>
     <
add name="PagePerformanceModule"
          
type="Rhino.Commons.HttpModules.PagePerformanceModule"/>
</
httpModules>


Finally, we define in log4net.config the appender configuration:

<appender name="PagePerfAppender"
            
type="log4net.Appender.AdoNetAppender">
    
<
bufferSize value="10" />
    
<
connectionType value="System.Data.SqlClient.SqlConnection,
System.Data, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
" />
    
<
connectionString value="Data Source=localhost;Initial
Catalog=Logs;User ID=logger;Password=logger;
" />
    
<
commandText value="INSERT INTO dbo.PagePerformance
([Date],[Message],[PageUrl],[Duration],[StartRequest]) VALUES (@log_date, @message, @pageUrl, @duration, @startRequest)
" />
    
<
parameter>
           <
parameterName value="@log_date" />
          
<
dbType value="DateTime" />
          
<
layout type="log4net.Layout.RawTimeStampLayout" />
    
</
parameter>
     <
parameter>
           <
parameterName value="@message" />
          
<
dbType value="String" />
          
<
size value="4000" />
          
<
layout type="log4net.Layout.PatternLayout">
               
<
conversionPattern value="%message" />
          
</
layout>
     </
parameter>
     <
parameter>
           <
parameterName value="@pageUrl" />
          
<
dbType value="String" />
          
<
size value="2000" />
          
<
layout type="log4net.Layout.PatternLayout">
               
<
conversionPattern value="%property{page_url}" />
          
</
layout>
    
</
parameter>
     <
parameter>
           <
parameterName value="@duration" />
          
<
dbType value="String" />
          
<
size value="50" />
          
<
layout type="log4net.Layout.PatternLayout">
               
<
conversionPattern value="%property{page_duration}" />
          
</
layout>
     </
parameter>
     <
parameter>
           <
parameterName value="@StartRequest" />
          
<
dbType value="DateTime" />
           
<
layout type="log4net.Layout.PatternLayout">
               
<
conversionPattern value="%property{start_request}" />
          
</
layout>
     </
parameter>
</
appender>

The page_duration, start_request and page_url properties are added to log4net configuration by the http module.
Then, all that is left is defining the logger itself:

<logger name="Rhino.Commons.HttpModules.PagePerformanceModule">
    
<
level value="DEBUG" />
    
<
appender-ref ref="PagePerfAppender" />
</
logger>

If everything went well, you have now got yourself a table in the DB that measures
how much it takes to serve each page.

Note: There is a small buffer (10 page requests) defined, to reduce the cost of the logging, so you would need to hit refresh a couple of time before you see the results in the table.

FUTURE POSTS

  1. Partial writes, IO_Uring and safety - 15 hours from now
  2. Configuration values & Escape hatches - 4 days from now
  3. What happens when a sparse file allocation fails? - 6 days from now
  4. NTFS has an emergency stash of disk space - 8 days from now
  5. Challenge: Giving file system developer ulcer - 11 days from now

And 4 more posts are pending...

There are posts all the way to Feb 17, 2025

RECENT SERIES

  1. Challenge (77):
    20 Jan 2025 - What does this code do?
  2. Answer (13):
    22 Jan 2025 - What does this code do?
  3. Production post-mortem (2):
    17 Jan 2025 - Inspecting ourselves to death
  4. Performance discovery (2):
    10 Jan 2025 - IOPS vs. IOPS
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}