Things we learned from production, part IV–is your paperwork in order?

Sep 28 2012

One of the major points that we worked on in the 1.2 release was making the ops team's work easier. That included additional logging, as we have previously discussed, making RavenDB play nicer with other parts of the system, adding performance counters, etc.

But those are the obvious things, and this series isn’t about the obvious things. One of the problems that we ran into is that we already had a moderately good porthole into how RavenDB works.

The problem was that this porthole gave you access to the state of a single database, which was great…

Except that in order to get a database's statistics, you had to actually load that database. Imagine a system under load, where the admin needs to check what is causing the load. The act of checking a database's statistics will actually force that database to load, generating even more load. This is especially dangerous when we are talking about automated health monitoring tools. The fact that we monitor the health of our software shouldn’t cause it to do additional work.

In RavenDB 1.2 we have taken steps to make sure that we can report on all the active databases without having to guess which ones are active and which aren’t. We have also taken additional steps to make sure that we give the admin even more information about what is going on.
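To make the idea concrete, here is a minimal sketch in Python of reporting only on databases that are already loaded. It is not RavenDB's actual code: `DatabaseRegistry`, `Database`, `stats_for_loaded` and `load_count` are hypothetical names invented for this illustration, and the sketch ignores thread safety.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Database:
    """Hypothetical stand-in for a database that is expensive to load."""
    name: str
    doc_count: int = 0

    def stats(self) -> dict:
        return {"name": self.name, "doc_count": self.doc_count}


class DatabaseRegistry:
    """Loads databases lazily and remembers which ones are in memory."""

    def __init__(self, loader: Callable[[str], Database]) -> None:
        self._loader = loader                  # the expensive load operation
        self._loaded: Dict[str, Database] = {}
        self.load_count = 0                    # how many loads we have triggered

    def get(self, name: str) -> Database:
        """Request path: loads the database on first access."""
        db = self._loaded.get(name)
        if db is None:
            db = self._loader(name)
            self._loaded[name] = db
            self.load_count += 1
        return db

    def stats_for_loaded(self) -> List[dict]:
        """Monitoring path: reports on loaded databases only, never loads one."""
        return [db.stats() for db in self._loaded.values()]
```

The design choice is that the monitoring path reads only the registry's in-memory map, so a health check can poll `stats_for_loaded()` as often as it likes without ever calling the loader.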

You can see this pattern pretty much everywhere, in indexes, in operations, in database and server stats. There are a lot more places where we explicitly built the hooks to make it possible for the admin to figure out what is going on.

The lesson from that is that you have to provide a lot of information for the administrators, so they can figure out what is going on (and that administrator may very well be you, at 2 AM, trying to diagnose a problem). At the same time, you have to be sure to provide those hooks in a way that has minimal impact on the system. Having admin hooks in place that will put undue burden on the application is seriously not a cool thing to do.

Comments

28 Sep 2012
10:29 AM

Sergey Shumov

Ayende, when will you update RavenDB repository at github?

28 Sep 2012
10:31 AM

Ayende Rahien

Sergey, It was last updated about 2 hours ago. Check the 1.2 branch.

28 Sep 2012
11:36 AM

tobi

The SQL Server guys use the approach to expose even lots of implementation details through DMVs (wait types, latches, ...) so that users can take a peek and diagnose stuff. I think that is the right approach.

28 Sep 2012
12:58 PM

Daan Le Duc

Ayende there are only 3 branches none of them is called 1.2

https://github.com/ravendb/ravendb/branches

Am i looking at the wrong place? Would love to see how you implemented your latest blog articles.

Thanks!

28 Sep 2012
13:04 PM

Daan Le Duc

Found it! seems to be on your private github account https://github.com/ayende/ravendb/tree/1.2

28 Sep 2012
13:15 PM

Jesper

@Daan Try https://github.com/ayende/ravendb/tree/1.2

28 Sep 2012
15:46 PM

Ayende Rahien

Daan, Yes, you are looking at the stable branch, you need to look here: https://github.com/ayende/ravendb/tree/1.2

28 Sep 2012
15:47 PM

Ayende Rahien

Jesus, They are already per database

28 Sep 2012
17:34 PM

dotnetchris

@Ayende i think your response to @Jesus was meant to be on the previous post.

30 Sep 2012
19:18 PM

Karep

Would love to hear about some details. After reading "Release It!" I'm thinking about making my app better for ops, but I'm afraid my application will be full of logging statements, making it hard to find the business logic in the code.

Oren Eini

CEO of RavenDB