Reviewing Lightning memory-mapped database library: A thoughtful hiatus
I thought that I would take a break from focusing on what the LMDB code is doing, in favor of some observations about the code itself. Going into this codebase is like getting hit in the face with a shovel. Now, this might be my personal experience, as someone who has done a lot of managed code work in the past. But I used to be a pretty horrible C/C++ guy (the fact that I say C/C++ should tell you exactly what my level was).
But I don’t think that it was just that. Even beyond the fact that the code is C, and not C++ (which I am much more used to), there is a problem that only became clear to me well after I read the code for the millionth time. It grew. Looking at the way the code is structured, it looks like it started out about as nice a C codebase as you can get (don’t get excited, that isn’t saying much). But over time, features were added, and the general structure of the codebase wasn’t adjusted to account for that.
I am talking about things like this:
There are actually 22 (!) ‘if(IS_LEAF(mp))’ references in the codebase.
Or what about this?
It looks like certain features (duplicate keys support, for example) were added that had a lot of implications for the code, but it wasn’t refactored accordingly. That makes it very hard to go through.
More posts in "Reviewing Lightning memory-mapped database library" series:
- (08 Aug 2013) MVCC
- (07 Aug 2013) What about free pages?
- (06 Aug 2013) Transactions & commits
- (05 Aug 2013) A thoughtful hiatus
- (02 Aug 2013) On page splits and other painful things
- (30 Jul 2013) On disk data
- (25 Jul 2013) Stepping through make everything easier
- (24 Jul 2013) going deeper
- (15 Jul 2013) Because, damn it!
- (12 Jul 2013) tries++
- (09 Jul 2013) Partial
Comments
Can this code repetition be a case of unwrapping what otherwise would have been function calls, making those things inline for performance reasons?
Anton, No, I don't believe it. There are other ways to do that, see: http://gcc.gnu.org/onlinedocs/gcc/Inline.html
Eh. Yes, cursor_next is mostly a mirror image of cursor_prev. Likewise cursor_first / cursor_last. But I made a conscious choice to keep them separate. I could easily have unified them but there would be additional branching and special casing going on.
As for the LEAF2 cases - it's cheaper to have the two lines of repeated code than to go thru the overhead of a function call.
Howard, You can use an inline function, which has the same cost as repeating the code but without the duplication. As it stands, if you need to change something, you have to search for all the places where this is happening.
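To make the inline function suggestion concrete, here is a minimal sketch in C. The `page` struct, the `PAGE_LEAF` flag, and both helper functions are hypothetical names invented for illustration; they are not LMDB's actual structures or macros. The idea is only that a repeated check can live in one `static inline` function that the compiler is free to expand at each call site.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page layout, for illustration only; not LMDB's real structures. */
#define PAGE_LEAF 0x02u

typedef struct page {
    unsigned flags;    /* page type bits */
    size_t   num_keys; /* number of keys stored on the page */
} page;

/* The check is written once; the compiler may inline it at every call site. */
static inline int page_is_leaf(const page *p)
{
    return (p->flags & PAGE_LEAF) != 0;
}

/* A hypothetical caller that reuses the shared check instead of repeating it. */
static inline size_t leaf_key_count(const page *p)
{
    return page_is_leaf(p) ? p->num_keys : 0;
}
```

If the rule for what counts as a leaf ever changes, only `page_is_leaf` needs editing. Whether a given compiler actually inlines the call depends on the compiler and optimization settings, so any performance claim would need measuring.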