Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,546
|
Comments: 51,161
Privacy Policy · Terms
filter by tags archive
time to read 1 min | 84 words

I want you to take a look at the dates of this conversation:

200901310515.jpg

The third message in the list is setting up a reminder in three weeks. I needn't have bothered.

I got a reply instantly, with the exact response that I could have wished for. Awesome!

ORCS Web in general is a very good provider, in their service, in their responsiveness and the overall level of "how can we help you" that I get from them.

Real pleasure to deal with.

time to read 14 min | 2722 words

While testing Rhino Service Bus, I run into several pretty annoying issues. The most consistent one is that the actual work done by the bus is done on another thread, so we have to have some synchronization mechanisms build into the bus just so we would be able to get consistent tests.

In some tests, this is not really needed, because I can utilize the existing synchronization primitives in the platform. Here is a good example of that:

   1: [Fact]
   2: public void when_start_load_balancer_that_has_secondary_will_start_sending_heartbeats_to_secondary()
   3: {
   4:     using (var loadBalancer = container.Resolve<MsmqLoadBalancer>())
   5:     {
   6:         loadBalancer.Start();
   7:  
   8:         Message peek = testQueue2.Peek();
   9:         object[] msgs = container.Resolve<IMessageSerializer>().Deserialize(peek.BodyStream);
  10:  
  11:         Assert.IsType<HeartBeat>(msgs[0]);
  12:         var beat = (HeartBeat)msgs[0];
  13:         Assert.Equal(loadBalancer.Endpoint.Uri, beat.From);
  14:     }
  15: }

Here, the synchronization is happening in line 8, Peek() will wait until a message arrive in the queue, so we don’t need to manage that ourselves.

This is not always possible, however, and this actually breaks down for more complex cases. For example, let us inspect this test:

   1: [Fact]
   2: public void Can_ReRoute_messages()
   3: {
   4:     using (var bus = container.Resolve<IStartableServiceBus>())
   5:     {
   6:         bus.Start();
   7:         var endpointRouter = container.Resolve<IEndpointRouter>();
   8:         var original = new Uri("msmq://foo/original");
   9:  
  10:         var routedEndpoint = endpointRouter.GetRoutedEndpoint(original);
  11:         Assert.Equal(original, routedEndpoint.Uri);
  12:  
  13:         var wait = new ManualResetEvent(false);
  14:         bus.ReroutedEndpoint += x => wait.Set();
  15:  
  16:         var newEndPoint = new Uri("msmq://new/endpoint");
  17:         bus.Send(bus.Endpoint,
  18:                  new Reroute
  19:                  {
  20:                      OriginalEndPoint = original,
  21:                      NewEndPoint = newEndPoint
  22:                  });
  23:  
  24:         wait.WaitOne();
  25:         routedEndpoint = endpointRouter.GetRoutedEndpoint(original);
  26:         Assert.Equal(newEndPoint, routedEndpoint.Uri);
  27:     }
  28: }

Notice that we are making explicit synchronization in the tests, line 14 and line 24. ReroutedEndpoint is an event that we added for the express purpose of allowing us to write this test.

I remember several years ago the big debates on whatever it is okay to change your code to make it more testable. I haven’t heard this issue raised in a while, I guess that the argument was decided.

As a side note, in order to get rerouting to work, we had to change the way that Rhino Service Bus viewed endpoints. That was a very invasive change, and we did it in less than two hours, but simply making the change and fixing the tests where they broke.

time to read 1 min | 186 words

For NH Prof, I need to have some licensing solution so people would be reminded that after 30 days of using the trail, they should pay. Initially, I bought a licensing component. That didn’t work out, and I now find myself in the position of having to writing the licensing infrastructure for NH Prof.

Any second that I put into the licensing infrastructure is a second that I can’t put into actually making the product itself useful. More than that, in order to produce a good licensing story, you need to invest a lot of time writing some tricky code so hackers would have harder time breaking this.

I got some advice in the matter from friends, which I am very grateful for, if not for the fact that this is so depressing.

Now, just to make things more complicated. Licensing is actually a big topic. I got requests from users regarding the licensing. Those range from being able to use a license on several machines, support floating licenses and removing a license from a machine.

Argh, what a waste of time!

How to fix a bug

time to read 1 min | 170 words

Yesterday I added a bug to Rhino Service Bus. It was a nasty one, and a slippery one. It relates to threading nastiness with MSMQ, and the details aren’t really interesting.

What is interesting is how I fixed it. You can see the commit log here:

image 

At some point, it became quite clear that trying to fix the bug isn’t going to work. I reset the repository back to before I introduced that bug (gotta love source control!) and started reintroducing my change in a very controlled manner.

I am back with the same functionality that I had before I reset the code base, and with no bug. Total time to do this was about 40 minutes or so. I spent quite a bit longer than that just trying to fix up that.

Lesson learned, remember to cut your losses early :-)

time to read 2 min | 263 words

I explicitly don’t want to go over the exact scenario that this is relating to. I want to talk about a general sentiment that I got from several people from Microsoft a few times, which I find annoying.

It can be summed up pretty easily by this quote:

You all know that we work on the Agile process here, right? We get something out (perhaps a little early) and then improve it. Codeplex is for open source and continuous improvement with community feedback.

The context is a response to a critique about unacceptable level of quality in something Microsoft put out. Again, I do not want to discuss the specifics. I want to discuss the sentiment, I got answers in a similar spirit from several Microsoft people recently, and I find it annoying in the extreme.

Agile doesn’t mean that you start with crap, call it organic fertilizer and try to tell me that it will improve in the future. Quality is supposed to be built in, it is the scope that you grow incrementally, not the product quality.

I actually find the open source comment to be even more annoying. Open source does not mean that you get someone else to do your dirty work. And if you take something and call it open source, it doesn’t mean that you are not going to get called on the carpet for the quality of whatever you released.

Calling it open source does not mean that the community is accountable for its quality.

time to read 2 min | 316 words

Those are not actually new features, if you want to be strict about it. There is a whole bunch of things in NH Prof that already exists, but are only now starting to have an exposed UI.

I believe that NH Prof’s ability to analyze and detect problems in your NHibernate’s usage is one of the most valuable features that it have. Heavens know that I spent enough time on that thing to make it understand everything that I learned about NHibernate in 5 years of usage (how did it get to be that long)?

The problem is that NH Prof is not self aware yet, and assumptions about what is good practice or not cannot be made in vacuum, they must be made in context, which NH Prof lacks. As such, it is possible that you’ll find yourself inundated with alerts that aren’t valid for your scenario.

A typical example would be that for your project, which uses MySql, you cannot use NHibernate’s batching (which isn’t supported on MySql). Therefore, the batching alert is not only invalid, it is actually annoying. You can globally disable an alert from the settings dialog:

image

But that is like whacking flies with a rifle. It will kill all the alerts of that type. What about when you want to ignore a specific alert in a specific circumstance?

NH Prof supports this as well: 

image

The profiler is smart enough to ignore the same alert from the same source in the future.

Of course, we can also remove ignored alerts from the settings dialog as well.

time to read 3 min | 564 words

Recently I found myself facing several pretty tough problems. Solving the problem or the generic case would have been hard if not impossible. In all of those cases, I was able to redefine the problem to make the solution trivially simple.

Problem #1 – SQL display in NH Prof

We had an issue with NH Prof regarding scrolling a big list. The problem was that the performance for big lists isn’t that great. WPF has a solution for that, virtual lists, which means that we only get to bind to the visible portion of the list, which significantly improve the system performance. The problem is that when you do this, you lose smooth scrolling, and then you get into a bit of a situation when you have large SQL statements. The UI doesn’t work in a nice way.

I wanted to have both. We figured out a couple of ways to do that, but I kept having this nagging feeling that I am being stupid. Eventually I realized that I had a problem in the problem specification. We do not need to display the large SQL statement in the list. It makes no sense from a UI perspective anyway. I was just coasting along on inertia without thinking, and I run into this issue.

Before:

image

After:

image

There is no value in the long statement. We are stripping a lot of information away from the statements anyway, to make it easy to understand what is going on there at a glance. The previous version just put additional burden on the user to try to understand what is going on in the mess. If they want the detailed view, we have that, and it is formatted, nice and easy to read.

Problem #2 – Heterogeneous load balancing

When building Rhino Service Bus load balancing support (what NServiceBus calls distributer / grid), I had run into a major issue of non elegance. I initially had thought that each node in the grid would tell the balancer what messages the node can handle, and on arrival, the balancer will inspect the message and dispatch to the an available end point.

The problem was that I didn’t like this, it required too many moving parts (each node keep telling the balancer which messages it could handle, updating several dispatch lists on each message, etc). It was a complex solution, and I didn’t like where I was heading.

Again, I had a problem with the solution complexity because the problem was stated in a problematic fashion. I didn’t need to support Heterogeneous system, I don’t have one at the moment. I can specify that a load balancer is going to always front a homogenous set of nodes, and reduce the problem to a dequeue & send.

Rethinking about the problem can often tell you that you are trying to solve more than what you should. By reducing the scope of the problem by a degree that is often meaningless to the desired  business requirement, we can drastically simplify the solution and the implementation.

FUTURE POSTS

  1. Partial writes, IO_Uring and safety - about one day from now
  2. Configuration values & Escape hatches - 5 days from now
  3. What happens when a sparse file allocation fails? - 7 days from now
  4. NTFS has an emergency stash of disk space - 9 days from now
  5. Challenge: Giving file system developer ulcer - 12 days from now

And 4 more posts are pending...

There are posts all the way to Feb 17, 2025

RECENT SERIES

  1. Challenge (77):
    20 Jan 2025 - What does this code do?
  2. Answer (13):
    22 Jan 2025 - What does this code do?
  3. Production post-mortem (2):
    17 Jan 2025 - Inspecting ourselves to death
  4. Performance discovery (2):
    10 Jan 2025 - IOPS vs. IOPS
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}