NH Prof: Persistence Format
As someone who so firmly believes that persistence is a solved problem, I keep tripping over it. The issue is quite simple: each scenario has radically different requirements, and usually requires a different solution.
In NH Prof's case, just using an RDBMS is not really a good solution, but before arriving at that conclusion, I need to explain what the requirements are. For NH Prof, there are several reasons to want to be able to persist things:
- Create an offline dump of the profiling session, to be analyzed later.
- This is actually a critical feature from my perspective, since it allows me to troubleshoot user issues quite easily. The user sends me the dump file, I load it in NH Prof, and I can see exactly what their problem is.
- Saving a profiling session to be analyzed at a later date (File > Save / Load).
- In addition to the first two, a persistence format basically means the format of a stream, and we can also use a stream as a communication mechanism.
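To make the last point concrete, here is a minimal sketch of a stream format that does not care whether it ends up in a file or on the wire. It is an illustration only, not NH Prof's actual format: the 4-byte length prefix and the names `write_frame` and `read_frames` are assumptions made for this example.

```python
import io
import struct


def write_frame(stream, payload: bytes) -> None:
    """Write one message as a 4-byte big-endian length followed by the payload."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_frames(stream):
    """Yield payloads until the stream ends cleanly on a frame boundary."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream
        if len(header) < 4:
            raise EOFError("truncated frame header")
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            raise EOFError("truncated frame payload")
        yield payload


def demo():
    # BytesIO stands in for a file here; the same two functions work on
    # any buffered binary stream that supports write() and read(n).
    buffer = io.BytesIO()
    for message in (b"session opened", b"select * from Customers", b"session closed"):
        write_frame(buffer, message)
    buffer.seek(0)
    return list(read_frames(buffer))
```

Because the reader and writer only ever see a stream, the dump file, File > Save / Load, and the communication channel could all share one code path.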
Right now, NH Prof actually has three different ways of handling these tasks (XML log, binary serialization and remoting). Obviously I would like to avoid this, if only because three pieces of code that do the same thing for different purposes tend to create triple the amount of work. There is also the problem that each of those methods gives a different set of data to the application, which makes my life quite a bit harder.
There is also another consideration, which will make sense to you when we release NH Prof v1.0, but I don’t want to talk about that reason just yet.
So, what is the solution? Can we make it work?
The answer actually lies in the architecture that NH Prof utilizes. At its core, NH Prof is a sophisticated analysis engine with a fancy UI on top, and what it analyzes is the event stream from NHibernate. That can actually cause some interesting problems. When we save to a file, what should we save? The event stream? The result of the analysis? There are arguments for both approaches.
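The trade-off can be sketched in a few lines. This is a toy model, not NH Prof's engine: the `Event` and `Analysis` types, their fields, and the "count statements per session" rule are all hypothetical, chosen only to show what saving the event stream buys you.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    session_id: str
    kind: str  # hypothetical event kinds: "statement" or "session_closed"
    text: str = ""


@dataclass
class Analysis:
    statements_per_session: dict = field(default_factory=dict)

    def apply(self, event: Event) -> None:
        # A stand-in for the real analysis: count statements per session.
        if event.kind == "statement":
            count = self.statements_per_session.get(event.session_id, 0)
            self.statements_per_session[event.session_id] = count + 1


def analyze(events) -> Analysis:
    """Run the (toy) analysis over an event stream, live or replayed."""
    analysis = Analysis()
    for event in events:
        analysis.apply(event)
    return analysis


events = [
    Event("s1", "statement", "select ..."),
    Event("s2", "statement", "select ..."),
    Event("s1", "statement", "update ..."),
    Event("s1", "session_closed"),
]
result = analyze(events)
```

If the saved file holds the events, loading it is just another call to `analyze`, so a newer version of the analysis can be applied to an old dump. If the file holds only the result, loading is cheaper, but the file is tied to whatever the analysis looked like when it was saved.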
My decision was based on several factors; simplicity and “how much pain do I have to deal with” were chief among them. The end result is that I decided to make use of Protocol Buffers, which is a serialization format that Google put out. It has some interesting properties, such as being fast to serialize and deserialize, lightweight and cross platform. After some time struggling with the various options, I settled on Jon Skeet’s C# implementation, and so far it looks very good. Maybe I should join the fan club? :-)
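To give a feel for why the format is small, here is a hand-rolled sketch of how Protocol Buffers lays out two kinds of fields on the wire. It follows the published wire format (varints, and a tag made of the field number and a wire type), but it is not the API of Jon Skeet's library or of any other implementation; the function names are made up for this example.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, low bits first."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)  # last byte, continuation bit clear
            return bytes(out)


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Tag with wire type 0 (varint), then the value itself."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)


def encode_bytes_field(field_number: int, data: bytes) -> bytes:
    """Tag with wire type 2 (length-delimited), then length, then the bytes."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(data)) + data


# Field 1 holding the integer 150 takes three bytes in total.
small_int = encode_varint_field(1, 150)
# Field 2 holding a 7-byte string takes two bytes of overhead.
short_text = encode_bytes_field(2, b"testing")
```

Field names never appear in the output, only field numbers, which is also why a reader that does not know a field can skip it by its wire type.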
Anyway, it means that all three separate persistence options are going to move over to a Protocol Buffers implementation. There are still some issues that I have to deal with, mostly around the reliability of the network connection and retry attempts, but I feel certain that I can make this happen. The end result is a pretty significant simplification in the way that I am working with the codebase, and it resolves a few other problems as well (mostly related to my misuse of remoting).
All in all, I think NH Prof is rapidly moving toward a functional release status.
Comments
Ayende, maybe you should consider Protocol Buffers as message serialization format for Rhino Service Bus?
How is the protobuf protocol better than binary serialization, JSON, etc.?
Rafal,
No, that would mean a lot more pain for the developers, needing to deal with the proto files.
configurator,
cross platform, stable for versioning, small, fast to deserialize
proto files can be omitted - I use protobuf-net. The only thing you need to do to make a class protobuf-serializable is decorate it with the appropriate attributes :)
Marciej,
The problem is that then you are not compatible with other protobuf impl (unless you can generate the proto from the classes).
More important, you are also requiring something that we don't want, attributes.
There is also the issue of human readability
Actually there is an API for generating .proto files for such classes, so there are no problems when integrating with other protobuf libs (well, I actually tested only one, for Java, and assumed that the rest will work ;). I've chosen attributes over .proto files because, well, it's more convenient for me, but of course each project has its own requirements. Another thing I like about protobuf-net is that collections (the ones that are declared as 'repeated') can be mapped to dictionaries and lists (even generic ones) without any problem, which is so much nicer than using arrays. Anyway, I consider PB great stuff: it's a lot faster than normal binary serialization (one can however cry about not preserving CLR type information) and it greatly improves interoperability.
Just my plug for protobuf-net: http://code.google.com/p/protobuf-net/
Re attributes - note that the current work I'm doing will allow POCO support within protobuf-net, which is (IMO) the more .NET-idiomatic approach.
The data on the wire is fully compatible with other protocol buffers implementations, plus it supports some things that the others don't, while still retaining wire compatibility. Inheritance being the most obvious.