You are judged on what is above the waterline

time to read 6 min | 1182 words

image

This is written a day before the Jewish New Year, so I suppose that I’m a bit introspective. We recently had a discussion in the office about priorities and what things are important for RavenDB.  This is a good time to do so, just after the RC release, we can look back and see what we did right and what we did wrong.

RavenDB is an amazing database, even if I say so myself, but one of the things that I find extremely frustrating is that so much work is going into things that have nothing to do with the actual database. For example, the studio. At any given point, we have a minimum of 6 people working on the RavenDB Studio.

Here is a confession, as far as I’m concerned, that studio is a black box. I go and talk to the studio lead, and things happen. About the only time that I actually go into the studio code is when I did something and that broke the studio build. But I’ll admit that I usually go to one of the studio guys and have them fix it for me Smile.

Yep, at this stage, I can chuck that “full stack” title right off the edge. I’m a backend guy now, almost completely. To be fair, when I started writing HTML (it wasn’t called web apps, and the fancy new stuff was called DHTML) the hot new thing was the notion that you shouldn’t use <blink/> all the time and we just got <table> for layout.  I’m rambling a bit, but I’ll get there, I think.

I find the way web applications are built today to be strange, horribly complex and utterly foreign (I find just the size of the node_modules folder is scary). But this isn’t a post about an old timer telling you how good were the days when you had 16 colors to play with and users knew their place. This post is about how a product is perceived.

I mentioned that RavenDB is amazing, right? It is certainly the most complex project that I have ever worked on and it is choke full of really good stuff. And none of that matters unless it is in the studio. In fact, that team of people working on the studio? That includes only the people actually working on the studio itself. It doesn’t include all the work done to support the studio on the server side. So it would be pretty fair to say that over half of the effort we put into RavenDB is actually spent on the studio at this point.

And it just sounds utterly crazy, right? We literally spent more time on building the animation for the cluster state changes so you can see them as they happen then we have spent writing the cluster behavior. Given my own inclinations, that can be quite annoying.

It is easy to point at all of this hassle and say: “This is nonsense, surely we can do better”. And indeed, a lot of our contemporaries get away with this for their user interface:

I’ll admit that this was tempting. It would free up so much time and effort that it is very tempting.

It would also be quite wrong. For several reasons. I’ll start from the good of the project.

Here is one example from the studio (rough cost, a week for the backend work, another week for various enhancement / fixes for that, about two weeks for the UI work, some issues still pending for this) showing the breakdown of the I/O operations made on a live RavenDB instance.

image

There is also some runtime cost to actually tracking this information, and technically speaking we would be able to get the same information with strace or similar.  So why would we want to spend all this time and effort on something that a user would already likely have?

Well, this particular graph is responsible for us being able to track down and resolve several extremely hard to pin down performance issues with how we write to disk. Here is how this look like at a slightly higher level, the green is writes to the journal, blue are flushes to the data file and orange are fsyncs.

image

At this point, I have a guy in the office that stared at theses graphs so much that he can probably tell me the spin rate of the underlying disk just by taking a glance. Let us be conservative and call it a mere 35% improvement in the speed of writing to disk.

Similarly, the cluster behavior graph that I complained about? Just doing QA on the graph part allowed us to find several issues that we didn’t notice because they were suddenly visible and there.

That wouldn’t justify the amount of investment that we put into them, though. We could have built diagnostics tools much more cheaply then that, after all. But they aren’t meant for us, all of these features are there for actual customer use in production, and if they are useful for us during development, they are going to be invaluable for users trying to diagnose things in production. So while I may consider the art of graph drawing black magic of the highest caliber I can most certainly see the value in such features.

And then there is the product aspect. Giving a user a complete package, were they can hit the ground running and feel that they have a good working environment is a really good way to keep said user. There is also the fact that as a database, things like the studio are not meant primarily for developers. In fact, the heaviest users of the studio are the admin stuff managing RavenDB, and they have a very different perspective on things. Making the studio useful for such users was an interesting challenge, luckily handled by people who actually know what they are doing in this regard much better than me.

And last, but not least, and tying back to the title of this post. Very few people actually take the time to do a full study of any particular topic, we use a lot of shortcuts to make decisions. And seeing the level of investment put into the user interface is often a good indication of overall production quality. And yes, I’m aware of the slap some lipstick on this pig and ship it mentality, there is a reason is works, even if it is not such a good idea as a long term strategy. Having a solid foundation and having a good user interface it a lot more challenging, but far better end result.

And yet, I’m a little bit sad (and mostly relieved) that I have areas in the project that I can neithe work nor understand.