Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

time to read 9 min | 1622 words

Frans has a long post about how important documentation is for the maintainability of a project. I disagree.

Update: I have another post on this subject here.

Before we go on, I want to make sure that we have a clear understanding of what we are talking about. I am not thinking about end user documentation, or API documentation (in the case of a reusable library), but implementation documentation about the project itself. I think that some documentation (high level architecture, coding approach, etc.) is important, but the level to which Frans is taking it seems excessive to me.

The thing is though: a team of good software engineers which works like a well oiled machine will very likely create proper code which is easy to understand, regardless of the methodology used.

So, good people, good team, good interactions. That sounds like the ideal scenario to me. I can think of at least five different ways in which a methodology can break apart and poison such a team (assigning blame, stiff hierarchy, overtime, lack of recognition, isolating responsibilities and creating bottlenecks). Not really a good scenario.

The why is of utmost importance. The reason is that when you have to make a change to a piece of code, you might be tempted to refactor the code a bit into a form which was rejected earlier because, for example, of bad side effects on other parts.

And then I would run the tests and they would show that this causes failure in another part, or it would be caught in QA, or I would do the responsible thing and actually get a sense of the system before I start messing with it.

If you don't know the why of a given routine, class or structure, you will sooner or later make the mistake of refactoring the code into a form which was already found not to be the best option, and you'll find that out the hard way, losing precious time you could have saved.

This really has nothing to do with the subject at hand. I can do the same with brand new code; going off on a tangent somewhere is something that you have to deal with in any profession.

That's why the why documentation is so important: the documented design decisions: "what were the alternatives? why were these rejected?" This is essential information for maintainability as a maintainer needs that info to properly refactor the code to a form which doesn't fall into a form which was rejected.

Documentation is important, yes, but I like it for the high level overview. "We use MVC with the following characteristics, points of interest in the architecture include X,Y,Z, points of interest in the code include D,B,C, etc." But I stop there, and rarely update the documents beyond that. We have the overall spec, we have the architectural overview and the code tourist guide, but not much more. Beyond that, you are supposed to go and read the code. The build script usually gets special treatment, by the way.

This also assumes that the original builders of the system were omniscient. Why shouldn't I follow a form that was rejected? Just because the original author of an application thought that Xyz was the end-all-be-all of software doesn't mean that Brg isn't a valid approach that should be considered. It should not surprise you that I reject the idea out of hand.

Code isn't documentation, it's code. Code is the purest form of the executable functionality you have to provide as it is the form of the functionality that actually gets executed, however it's not the best form to illustrate why the functionality is constructed in the way it is constructed.

Code can be cumbersome for expressing the true intent; that is why I am investing a lot of time in coming up with intent revealing names and pushing all the infrastructure concerns down. The best way to illustrate why certain functionality exists in such a way is to cover it with tests; that way you can see the intended usage and can follow the train of thought of the previous guy. I routinely head off to the unit tests of various projects to get an insight into how such things work.

I've seen technical documents which did make a lot of sense and were essential to understanding what was going on at such a level that making changes was easy.

Frans, at what level were they? Earlier you were talking about the routine level, but I want to know what you think is the appropriate documentation coverage for a system.

If your project consists of, say, 400,000 lines of code, it's not a walk in the park to even get the slightest overview of where what is located without reading all of those lines, if there's no documentation which is of any value.

The problem here is that you make no assumption about the state of the code. I would take an undocumented 400,000 LOC code base that has (passing) unit tests over one that has extensive documentation but little to no tests any time. The reasoning is simple: if it is testable, it is maintainable, period. Yes, it would take time to wrap my head around a system this size, but I can most certainly do it, and unit tests allow me to do a lot of things safely. Assume that you have extensive documentation; what happens when the code diverges from the documentation?

You see, documentation isn't a separate entity of the code written: it describes in a certain DSL (i.e. human readable and understandable language) what the functionality is all about; the code will do so in another DSL (e.g. C#). That's the essential part: you have to provide functionality in an executable form. Code is such a form, but it's arcane to read and understand for a human (or is your code always 100% bugfree when you've written a routine? I seriously doubt it, no-one is that good), however proper documentation which describes what the code realizes is another.

Documentation is not a DSL, and it is most certainly not understandable in many cases. Documentation can be ambiguous in the most insidious ways. The code is not another DSL; this assumes that the code and the documentation are somehow related, but the code is what actually runs, so it is the authoritative source on any system. Documentation can help understanding, but it doesn't replace code, and I seriously doubt that you can call it a DSL. The part that bothers me here is that the documentation is viewed as an executable form; unless you are talking about something like FIT, that is not the case. I can't do documentation.VerifyMatch(code);

When I need to make a change and need to know why a routine is the way it is, I look up the design document element for that part and check why it is the way it is and which alternatives were rejected and why. After 5 years, your own code also becomes legacy code. Do you still maintain code you've written 2-3 years ago? If so, do you still know why you designed it the way it is designed, and will you always avoid reconsidering alternatives you rejected back then because they wouldn't lead to the right solution?

I still maintain code that I wrote two years ago (Rhino Mocks comes to mind), and I kept very little documentation about why I did some things. But I have near 100% test coverage, and the ability to verify that I still have working software. Speaking of the paid side of the fence, a system that I started writing two years ago has gone through two major revisions in the meantime, and is currently being maintained by another team. I am confident in my ability to go there, sit with the code, and understand what is going on. And of course, "I have no idea why I did this bit" moments are fairly common, but checking the flow of the code, it is usually clear that a) I was an idiot, or b) I had this and that good reason. Sometimes it is both at the same time.

It needs pointing out again, what was true 5 years ago is something that you really need to reconsider today.

What's missing is that a unit test isn't documenting anything

And here I flat out disagree. Unit tests are a great way to document a system in a form that keeps it current. Reading the tests for a class can give you a lot of insight about how it is supposed to be used, and what the original author thought about when he built it.
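To make that concrete, here is a minimal sketch of what I mean (a hypothetical OverdraftPolicy class and NUnit-style tests, purely for illustration, not code from any real project):

```csharp
using NUnit.Framework;

// Hypothetical class under test, for illustration only.
public class OverdraftPolicy
{
    public decimal Limit { get; private set; }

    public OverdraftPolicy(decimal limit) { Limit = limit; }

    // A withdrawal is allowed as long as it doesn't take the
    // balance below the negative overdraft limit.
    public bool CanWithdraw(decimal balance, decimal amount)
    {
        return balance - amount >= -Limit;
    }
}

[TestFixture]
public class OverdraftPolicyTests
{
    // The test names document the business rule better than
    // a design document that may have drifted from the code.
    [Test]
    public void Withdrawal_That_Exceeds_Overdraft_Limit_Is_Rejected()
    {
        var policy = new OverdraftPolicy(100m);
        Assert.IsFalse(policy.CanWithdraw(balance: 50m, amount: 200m));
    }

    [Test]
    public void Withdrawal_Within_Overdraft_Limit_Is_Allowed()
    {
        var policy = new OverdraftPolicy(100m);
        Assert.IsTrue(policy.CanWithdraw(balance: 50m, amount: 120m));
    }
}
```

A maintainer reading these tests learns the rule, the boundary and the intended API in seconds, and unlike a document, the tests start failing the moment the code diverges from them.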

It describes the same functionality but in such a different DSL that a human isn't helped by wading through thousands and thousands of unit tests to understand what the api does and why.

Not really any different than wading through thousands of pages of documentation, which you can't even be sure to be valid.

Using unit tests for learning purposes or documentation is similar to learning how databases work, what relational theory is, what set theory is etc. by looking at a lot of SQL queries.

The only comment that I have for this: That is how I did it.

Wouldn't you agree that learning how databases work is better done by reading a book about the theory behind databases, relational theory, set theory and why SQL is a set-oriented language?

Maybe, but I believe that the best way to learn something is to use it in anger and there is absolutely nothing that can beat having something to play with and explore.

time to read 4 min | 609 words

Frans' contribution to the conversation about maintainable code deserves its own post, but I would like to mention this part in particular:

you will not [understand the code]. Not now, not ever. And not only you, but everyone out there who writes code, thus that includes me as well, will not be able to read code and understand it immediately.

You know what? That is not limited to bad code. I had a hard time grokking good code bases, simply because of their size and complexity (NHibernate and Windsor come to mind). Other code bases are as large, but they are more approachable, probably because they deal with a less complex domain and tend to be wide rather than deep (MonoRail comes to mind).

A while ago I was involved with an effort to migrate an ancient system to SQL Server 2005. The system comprised over 100,000 lines of code, spread over some thousands of files scattered randomly in a case sensitive file system (you can guess why this is significant). The code base was about 85% SQL and 15% bash shell scripts. The database in question was a core system and contained slightly over 4,000 tables. One of the core tables was called tmp1_PlcyDma and was used to do business critical processing. That code base took data driven code generation to a level I have never seen before. I gave up trying to track down 7(!) levels of code -> generating code -> executing code -> generating code -> rinse -> repeat.

To say that the code base was bad is quite an understatement. To mention that the only place where I could run the code was via a telnet console into a test environment that was not identical to production is only the start. I could mention no debugging, runtime of ~5 hours, test time of ~3 hours, etc. The code grew organically over a ten year period, and you could track the developer's progress from merely annoying to criminally insane (he invented his own group-by construct, using triple nested cursors and syntax so obscure that even the DBA who had worked with the system for the last 5 years had no idea what was going on).

Perhaps the thing that I remember most from this project is that we had a bug that kept two people hunting after it for three weeks. The issue was a missing ';'. Oh, and the criterion for success in this project was a successful migration, with bug-for-bug compatibility, and no one really knew what the system did, including the authors(!).

But, you know what, after a month or so of looking at the code, it got to the point where I could look at something like pc_cl_mn.sql and know that it would contain the monthly policy calculation, and that this piece of code was doing joins manually via cursors again, that plcy_tr_tmp.sql was the "indexes priming" script, etc.

The code was still a horror, but once you understood that the authors of this code had a... "special" way of looking at databases, you got to the point where you could grasp the intent of a piece of code in an hour instead of a day, and then move it to a saner approach.

So, what does this horrifying story have to do with Frans' point above?

The premise that you can read and understand code immediately is highly dependent on what you are familiar with. I know of no one who can just sit in front of an unfamiliar code base and start producing value within the first ten minutes. But on a good code base, you should be able to start producing value very quickly.

People over Code

time to read 5 min | 849 words

While there is value in the item on the right, I value the item on the left more.

This is in response to a comment by Jdn. I started to comment in reply, and then I reconsidered; this is much more important. A bit of background: Karthik commented that "Unfortunately too often many software managers fall into the trap of thinking that developers are "plug and play" in a project and assume they can be added/removed as needed," and proceeded with some discussion on why this is and how it can be avoided.

I responded to that by saying that I wouldn't really wish to work with or for such a place. To be precise, here is what I said:

I would assert that any place that treats their employee in such a fashion is not a place that I would like to work for or with.
When I was in the army, the _ultimate_ place for plug & play mentality, there was a significant emphasis on making soldiers happy, and a true understanding of what it was to have a good soldier serving with you. Those are rare, and people fight over them.
To suggest that you can replace one person with another, even given that they have the same training, is ludicrous.

From personal experience: when I was the Executive Officer of the prison, the prison Commander shamelessly stole my best man while I was away at a course, causing quite a problem for me (unfortunately not something that you can just plug & play). That hurt, and it took about six months to get someone to do the job right, and even then, the guy wasn't on the same level. (And yes, this had nothing to do with computers, programming, or the like.)

Now, to Jdn's comment:

In a perverse way, I can see, from the perspective of a business, why having good/great developers, who bring in advanced programming techniques, can be a business risk.
[...snip...] you have to view all employees as being replaceable, because the good/great ones will always have better opportunities (even if they are not actively looking), and turnover for whatever reason is the norm not the exception.
Suppose you are a business with an established software 'inventory', and suppose it isn't the greatest in the world. But it gets the job done, more or less. Suppose an Ayende-level developer comes in and wants to change things.  We already know he is a risk because he says things like:
"not a place that I would like to work for or with."

If you view me as replaceable, I will certainly have an incentive to move somewhere where I wouldn't be just another code monkey. Bad code bothers me; I try to fix it, but that is rarely a reason to change a workplace. I like challenges. And there are few things more interesting than a colleague's face after a masterfully done shift+delete combination.
What I meant by that is that I wouldn't want to work for a place that thought of me and my co-workers as cogs in a machine, to be purchased by the dozen and treated as expendable.

You know what the most effective way to get good people is? Treating them well, appreciating their work and making them happy. If people like what they are doing, and they like where they are doing it, there would need to be a serious incentive for them to move away. A good manager will ensure that they are getting good people, and ensure that they keep them. That is their job.

Mediocre code that can be maintained by a wider pool of developers is in a certain respect more valuable to a business than having great code that can only be maintained by a significantly smaller subset of developers.

At a greater cost over the lifetime of the project. If you want to speak in numbers the MBAs will understand: you are going to have a far higher TCO because you refuse to make the initial investment.

To quote Mark Miller, you can get more done, faster, if you have good initial architecture and overall better approach to software.

Jdn concludes with a good approach:

I'm offering services for clients.  I can't disrupt their business because I don't think their code is pretty enough.
What I can do better, going forward, is learn to make the incremental changes that gets them on their way to prettier code.  My attitude is *not* "well, I can't do anything so I won't even try."
But at the end of the day, I have to do what is best for the *client*.  If that means typed datasets (picking on them, but include anything you personally cringe over), then I can partial class and override to make them better, but typed datasets it will be.

I would probably be more radical about the way that I would go about it, but the general approach is very similar, especially when you have an existing code base or architecture in place.

time to read 1 min | 163 words

Time and time again, Working Effectively with Legacy Code comes up in conversations that I have with like minded fellows. It is a very good guide to working with code, not necessarily legacy code. I read it a few years ago, and was vastly impressed. To quote myself:

Working Effectively with Legacy Code is a book that should be a mandatory reading for anyone who is interested in coding for a living.

I consider this book the #1 reason for the existence of Rhino Mocks, and I can't really recommend it heartily enough.

If you haven't read it yet, go and get it.

That and Evans' DDD are on my list of books to re-read, but I am saving that for when I need a serious productivity boost. That is one hell of a book to set me off writing good code.

time to read 3 min | 548 words

Jdn is making an excellent point in this post:

Okay, so, TDD-like design, ORM solution, using MVP.  Oh, and talk to the users, preferably before you begin coding.

One problem (well, it's really more than one).  I know for a fact that I am going to be handing this application off to other people.  I will not be maintaining it.  I know the people who I will be handing it off to, so I know their skill sets, I know generally how they like to code.

None of them have ever used ORM.

None of them do unit testing.  One knows what they are and for whatever reason hates them.  The others just don't know.

None of them have ever used MVP/MVC, and I doubt any but one has even heard of it.

All of them are intelligent, so could grasp all the concepts readily, and become proficient with them over time.  If they are given time by their bosses, or do the work overtime, or whatever.

There is a 'standard' architecture in place that they have worked with for quite some time.  I personally think it blows, and frankly, so do most of them, but it is familiar, and applications can be passed between developers as they use a common style.

There are several things going on in this situation. The two most important ones are that the currently used practice of bad code is (luckily) widely recognized as such, and that the people who work there are open minded and intelligent.

Before I get to the main point, I want to relate something about my current project. If you wish to maintain it, you need to have a good understanding of OR/M, IoC and MVC. Without those, you can't really do much with the application. That said, good use of IoC means that it is mostly transparent, abusing the language gives you natural syntax like FindAll( Where.User.Name == "Ayende") for the (simple) OR/M, and MVC isn't hard to learn.
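For the curious, that kind of syntax can be built with operator overloading. This is a minimal sketch with hypothetical types (not the actual implementation behind FindAll): overloading == on a property accessor object makes the compiler turn a comparison into a criteria object instead of a bool.

```csharp
// Hypothetical sketch of a fluent criteria DSL, for illustration only.
public class Criterion
{
    public string PropertyPath { get; private set; }
    public object Value { get; private set; }

    public Criterion(string path, object value)
    {
        PropertyPath = path;
        Value = value;
    }

    public override string ToString()
    {
        // A real OR/M would translate this into a parameterized query.
        return PropertyPath + " = '" + Value + "'";
    }
}

public class PropertyQuery
{
    private readonly string path;

    public PropertyQuery(string path) { this.path = path; }

    // The overloaded == returns a Criterion, not a bool, so
    // Where.User.Name == "Ayende" builds a query fragment.
    public static Criterion operator ==(PropertyQuery p, object value)
    {
        return new Criterion(p.path, value);
    }

    public static Criterion operator !=(PropertyQuery p, object value)
    {
        return new Criterion(p.path + " <>", value);
    }

    // Required by the compiler when == / != are overloaded.
    public override bool Equals(object o) { return ReferenceEquals(this, o); }
    public override int GetHashCode() { return path.GetHashCode(); }
}

public static class Where
{
    public static class User
    {
        public static readonly PropertyQuery Name = new PropertyQuery("User.Name");
    }
}
```

With that in place, Where.User.Name == "Ayende" evaluates to a Criterion object that a query engine can translate into SQL, which is what makes the FindAll call above read so naturally.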

Back to Jdn's post; let us consider his point for a moment. Building the application using TDD, IoC, OR/M, etc. would create a maintainable application, but it wouldn't be maintainable by someone who doesn't know all that. Building an application using proven bad practices ensures that anyone can hack at it, but also that it has a much higher cost to maintain and extend.

I am okay with that. Because my view is that having the developers learn a better way to build software is much less costly than continuing to produce software that is hard to maintain. In simple terms, if you need to invest a week in your developers, you will get your investment back several times over when they produce better code, easier to maintain and extend, with fewer bugs.

Doing it the old way seems defeatist to me (although, in Jdn's case, he seems to be leaving his current employer, which is something that I am ignoring in this analysis). It is the old "we have always done it this way" approach. Sure, you can use a mule to plow a field; it works. But a tractor would do a much better job, even though it requires knowing how to drive first.

time to read 2 min | 351 words

The "Tools For Mort" post from Nick Malik had me check outside to verify that the skies are still blue.

Nick seems to define a Mort as:

Mort works in a small to medium sized company, as a guy who uses the tools at hand to solve problems.  If the business needs some data managed, he whips up an Access database with a few reports and hands it to the three users who need the data.  He can write Excel macros and he's probably found in the late afternoons on Friday updating the company's one web site using Frontpage.

Mort is a practical guy.  He doesn't believe in really heavy processes.  He gets a request, and he does the work. 

So far, he is following the same well known path of describing Mort. The problem is that he then seems to decide that Mort is a super agile guy. Take a look at Sam Gentile's comment:

MSFT is making tools for Morts (the priority) at the expense of every other user (especially Enterprise Developers and Architects). They have nothing for TDD. And I would further contend that making these tools "dumbed down" has significantly contributed to why Morts are Morts in the first place and why they are kept there.

And Nick's response:

Wow, Sam.  I didn't know you had so much animosity for the Agile community!  Are you sure that's what you intended to say? 

Do you really mean that Microsoft should make a priority of serving top-down project managers who believe in BDUF by including big modeling tools in Visual Studio, because the MDD people are more uber-geeky than most of us will ever be?  I hate to point this out, Sam, but Alpha Geeks are not the ones using TDD.  It's the Morts of the programming world.  Alpha geeks are using Domain Specific Languages.

I really have no idea how to respond to such a claim. It certainly doesn't match my experience.

time to read 2 min | 395 words

After some discussion about whether a tree is the correct UI to show the user for my permissions issue, I decided to see if there is another way to handle it.

This is not how the UI looks, obviously, but it should give you an indication of how it works. A gray check mark means that something below this level is checked; a check mark means that the node itself has permission. Note that I can have permission on a node, and permission on sub nodes, and these have different meanings. If I have permission on a node I have cascading permission to all its children, but a child may be associated with multiple parents (and not always at the same level of the tree, sigh).

In this case, we have Baron, who has permission to schedule work for the Tel Aviv help desk staff. He also has permission to schedule work for John & Barbara, no matter in what capacity.

In other words, even though Baron can assign work to Jet, he can only do so when Jet works for the Tel Aviv Help Desk; he cannot assign Jet to work in the Pesky Programmers role. He can do that for John & Barbara (assuming he has the right to assign work in the Pesky Programmers department, of course, which is another tree).

The idea is that you can assign detailed permissions to any parts of the tree that you are interested in. There is another screen that allows you to find the hierarchy of objects if you are really interested (not shown here).

Naturally, permissions are many to many, the tree is many to many, and I have a headache just trying to figure it out. Just to point out: this is done in a web application, and the complexity is that the real tree has about two thousand entries at the lowest level (and ~7 at the top most level), so you need to get data lazily from the server. But you also need to display the grayed check box, so the user will know that a child node is marked; that was the main difficulty, actually.
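To make the check box states concrete, here is a minimal sketch of the tri-state calculation (hypothetical types, ignoring the lazy loading and the multiple-parents wrinkle):

```csharp
using System.Collections.Generic;
using System.Linq;

public enum CheckState { Unchecked, Checked, GrayChecked }

public class PermissionNode
{
    // A full check mark: permission granted here, cascading to all children.
    public bool HasDirectPermission;
    public List<PermissionNode> Children = new List<PermissionNode>();

    public CheckState GetState()
    {
        if (HasDirectPermission)
            return CheckState.Checked;
        // A gray check mark: no permission on this node, but some
        // descendant is checked, so the UI should hint that expanding
        // the node is worthwhile.
        if (Children.Any(child => child.GetState() != CheckState.Unchecked))
            return CheckState.GrayChecked;
        return CheckState.Unchecked;
    }
}
```

In the real application the server would have to precompute the "a descendant is checked" flag per node, since with ~2,000 leaf entries loaded lazily the client never holds the whole subtree at once.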

So, I am open for ideas about how to design this better.

time to read 2 min | 235 words

Partial is a MonoRail term for a piece of UI that you extract out, so you can call it again, often with Ajax. This is something that is harder to do in WebForms. Yesterday I found an elegant solution to the problem.

ASPX code:

<div id="UserDetailsDiv">
   <ayende:UserDetails runat="server" ID="TheUserDetails"/>
</div>

User Control:

<div>
Name: <asp:label ID="Name" runat="server"/> <br/>
Email: <asp:label ID="Email" runat="server"/> 
</div>

Client Side Code:

function changeUser(newUserId, div)
{
	// Call the script service proxy; the div id is passed along as
	// user context so the callback knows which element to update.
	var srv = new MyApp.Services.UserDetails();
	srv.GetUserDetailsView(newUserId, onSuccessGetUserDetailsView, null, div);
}

function onSuccessGetUserDetailsView(response, userContext)
{
	// The service returns the rendered HTML of the user control.
	var div = $(userContext);
	div.innerHTML = response;
	new Effect.Highlight(div);
}

Web Service Code:

[WebMethod(EnableSession = true)]
public string GetUserDetailsView(int userId)
{
	User user = Controller.GetUser(userId);
	//there may be a better way to do this, I haven't bothered looking
	UserDetails userDetails = (UserDetails)new Page().LoadControl("~/Users/UserControls/UserDetails.ascx");
	userDetails.User = user;
	userDetails.DataBind();
	using(StringWriter sw = new StringWriter())
	using(HtmlTextWriter ht = new HtmlTextWriter(sw))
	{
		userDetails.RenderControl(ht);
		return sw.GetStringBuilder().ToString();
	}
}
time to read 2 min | 202 words

Jeff Brown has a good post about information in software:

There is no technical reason preventing software applications from adopting common standards in the representation of their information.

[...snip...]

Would software interoperability improve if we could just agree on common meta-classes for data structures?

[...snip...]

In any case, it bothers me profoundly that software is so vertical. There is too little common ground. Each application contains a wealth of information but remains steadfastly inaccessible.

It is worth pointing out that most organizations can't agree on what something as fundamental as the Customer is within the organization. This is because different parts of the organization are responsible for different aspects of the customer, and they have radically different needs.

As Jeff points out, software that is open & extensible usually carries a price tag of six figures, as well as a hefty customization fee. That is just the nature of the beast, because being a generalist costs: the business doesn't care if you can handle fifty different ideas of a customer; they want you to fit their idea of a customer, do it well, and fit the different views of a customer within the organization. That doesn't come easily.
