Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database


Node.cs

time to read 3 min | 457 words

No, the title is not a typo. There is so much noise around Node.js that I thought it would be fun to make a sample of how it would work in C# using the TPL. Here is what the hello world sample would look like:

public class HelloHandler : AbstractAsyncHandler
{
    protected override Task ProcessRequestAsync(HttpContext context)
    {
        context.Response.ContentType = "text/plain";
        return context.Response.Output.WriteAsync("Hello World!");
    }
}

And the code to make this happen:

public abstract class AbstractAsyncHandler : IHttpAsyncHandler
{
    protected abstract Task ProcessRequestAsync(HttpContext context);

    private Task ProcessRequestAsync(HttpContext context, AsyncCallback cb)
    {
        return ProcessRequestAsync(context)
            .ContinueWith(task => cb(task));
    }

    public void ProcessRequest(HttpContext context)
    {
        ProcessRequestAsync(context).Wait();
    }

    public bool IsReusable
    {
        get { return true; }
    }

    public IAsyncResult BeginProcessRequest(HttpContext context, AsyncCallback cb, object extraData)
    {
        return ProcessRequestAsync(context, cb);
    }

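    public void EndProcessRequest(IAsyncResult result)
    {
        if (result == null)
            return;
        var task = (Task)result;
        try
        {
            // Wait() rethrows any exception from the handler (wrapped in
            // an AggregateException) so that failures are not swallowed.
            task.Wait();
        }
        finally
        {
            task.Dispose();
        }
    }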
}

And you are pretty much done. I combined this with an HttpHandlerFactory which does the routing, and you get fully async and quite beautiful code.

time to read 2 min | 367 words

According to the way the blog posts are currently scheduled, I just spent about a month doing nothing but talking about stuff that has very little to do with the implementation, code or even just rough architecture. I bet you thought that you were going to see some code, diagrams and something real that you can sink your teeth into…

Well, not so fast. This is supposed to be a DDD sample, and as such, the first and foremost topic to discuss is the actual domain. I think that this is likely to be the last of the pure domain posts, and I’ll get started with the actual design stuff shortly. But before we do that, we need to learn about one last aspect of the domain, the Inmate’s Record.

So far, we have dealt mostly with the Dossier, the legal stuff that means that we can keep an Inmate in lawful incarceration, but in addition to that, we also have the Inmate’s Record. The Record is basically all the interesting things that the prison staff needs to know about the Inmate. Those things range from the cliff notes version that you need to look at before you interact with an Inmate to a detailed record of his stay in prison.

The cliff notes version is usually used in briefing about the guy, “Look, we have to take him to his court date, you need to remember, the guy is on suicide watch, so never leave him alone…”. In the cliff notes version, we highlight the important aspects of the Record: Suicidal, Flighty (tried / planning to escape), Avoid Putting With Inmate X, etc.

The Full Record is used for things like intelligence reviews, interviews, parole hearings, and in any case where there is a need to learn a lot about the Inmate.

What goes into the Record?

  • Guards’ reports
  • Intelligence gathered
  • Disciplinary actions

And probably a whole lot more that I am forgetting. It is important to note the difference between the Dossier, which is usually handled by Legal, and the Record, which is usually handled by Staff. They both refer to the same Inmate, but they are usually handled, maintained and used completely separately.

time to read 2 min | 298 words

Counting is more than just a regular event in prison, it is more like the heartbeat of the entire operation. Indeed, one of the more disruptive events in a prison is when Inmates refuse to be Counted. That is ranked up there with a full-scale riot.

Macto is meant to be mostly about the legal aspects of an Inmate’s incarceration, but it can’t ignore the Counting. Indeed, we need to explicitly support it. Just to make our lives complicated, when Inmates are Counted, they don’t actually have to be Counted (except at Opening Count and Closing Count, of course), they just have to be Accounted For.

For example, an Inmate may be at the Courthouse during Noon Counting, and that is just fine, as long as we know that he is there. Or there might be an Inmate that is present in another cell during the day, which is also pretty common.

Any exceptions at Closing Count are usually pretty extraordinary: someone being hospitalized, just arriving from a very long day at Court, etc.

In Macto, we need to record not only that the counting has been made, but also:

  • How many Inmates were present?
  • How many Inmates were supposed to be there?
  • If there are any discrepancies, are they accounted for?

Oh, and you can’t just refuse to accept invalid data, because if an Inmate has Escaped, you still need to be able to Count all of the rest (in fact, you want to be able to Count them very quickly; nothing makes sure that you’ll Count like a suspected or real Escape attempt).
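A minimal sketch of how such a Count could be recorded. The type and member names here (CountRecord, Discrepancy, IsValid) are assumptions made up for illustration, not Macto's actual design. The point is that a Count with an unexplained discrepancy is still recorded; it is flagged rather than rejected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types, for illustration only.
public class Discrepancy
{
    public string InmateId { get; set; }

    // Null or empty when nobody knows where the Inmate is.
    public string KnownLocation { get; set; }

    public bool IsAccountedFor
    {
        get { return !string.IsNullOrEmpty(KnownLocation); }
    }
}

public class CountRecord
{
    private readonly List<Discrepancy> discrepancies = new List<Discrepancy>();

    public DateTime TakenAt { get; set; }
    public int Present { get; set; }   // how many Inmates were actually there
    public int Expected { get; set; }  // how many were supposed to be there

    public List<Discrepancy> Discrepancies
    {
        get { return discrepancies; }
    }

    // Absent Inmates whose whereabouts are known (Courthouse, hospital, ...).
    public int AccountedFor
    {
        get { return discrepancies.Count(d => d.IsAccountedFor); }
    }

    // Everyone is either present or accounted for.
    // A record where this is false is still stored, just flagged.
    public bool IsValid
    {
        get { return Present + AccountedFor == Expected; }
    }
}
```

With this shape, a Noon Count of 8 present out of 10 expected, with one Inmate at the Courthouse and one unexplained, is saved with IsValid == false instead of being refused.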

Speaking of which, there is another aspect of Inmate management that we haven’t spoken about yet, the actual tracking of the inmate, but I’ll discuss that in my next post.

time to read 2 min | 341 words

So the officer shows up in the morning, logs into Macto, and… what does he see? What are the day to day operations that are required?

This is usually much harder to figure out, because there isn’t any particular action that initiates things; it is usually the routine stuff that trips you up.

Since we are mostly interested in the Inmates’ legal statuses, every day, we need to start with an Action Plan:

  • Which Inmates go home today?
  • Which Inmates’ incarceration should be extended?
  • Which of the Inmates need to go to court?
  • Notify interested parties about Inmates who are scheduled to be released soon.
  • Are there any Inmates who should have been freed but are still hanging around?

Again, note how limited our scope is. We don’t deal with things like cell searches, scheduled drills, etc. Those are happening in any reasonable prison, and they probably need to be tracked, reported on, and scheduled. But those things are pretty much Routine Military Activity (same as the requirement that every soldier re-qualify on firearms once in some period), and there is probably software out there that already does it. We are focusing on the Dossiers, and that is a complex enough world on its own.

Did you notice the actual difference between the first two items of the action plan? What is the difference between them?

Inmates who get to go home are usually those that were sentenced and served their time. Inmates whose incarceration should be extended are Inmates whose authority for incarceration is time limited, and who would otherwise have to be released. However, there is usually a reason why they are incarcerated, and that usually means that instead of letting them go, we have to take them in front of a judge to extend the incarceration period until they are finally sentenced.

This part of the system is basically reports and alerts. It gives the user the information about what sort of actions should be taken to ensure that we don’t run into habeas corpus scenarios without good answers.
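To make the difference between the first two Action Plan items concrete, here is a rough sketch of those reports as queries. Everything in it (Dossier, AuthorityKind, ExpiresOn, the lookAhead parameter) is a hypothetical model invented for this example; the real Dossier is far richer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, heavily simplified model, for illustration only.
public enum AuthorityKind
{
    Sentence,             // sentenced, serving time
    TimeLimitedDetention  // held under an authority that expires
}

public class Dossier
{
    public string InmateId { get; set; }
    public AuthorityKind Authority { get; set; }
    public DateTime ExpiresOn { get; set; }
}

public static class ActionPlan
{
    // Sentenced Inmates whose time is up today: they go home.
    public static IEnumerable<Dossier> GoingHome(IEnumerable<Dossier> dossiers, DateTime today)
    {
        return dossiers.Where(d => d.Authority == AuthorityKind.Sentence
                                && d.ExpiresOn.Date == today.Date);
    }

    // Time limited authority about to run out: take them in front of a judge.
    public static IEnumerable<Dossier> NeedExtension(IEnumerable<Dossier> dossiers, DateTime today, TimeSpan lookAhead)
    {
        var until = today.Date + lookAhead;
        return dossiers.Where(d => d.Authority == AuthorityKind.TimeLimitedDetention
                                && d.ExpiresOn.Date >= today.Date
                                && d.ExpiresOn.Date <= until);
    }

    // Should have been freed already but are still hanging around.
    public static IEnumerable<Dossier> Overdue(IEnumerable<Dossier> dossiers, DateTime today)
    {
        return dossiers.Where(d => d.ExpiresOn.Date < today.Date);
    }
}
```

The two queries look at the same expiry date; what differs is the kind of authority behind it, and therefore the action the officer has to take.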

time to read 3 min | 510 words

Update: See below for details about the server load.

I just ran into this tweet:

image

That isn’t the first time that we have heard this, and it is actually surprising. Not only was no attention given to performance throughout the lifecycle of the project so far, we actually discovered some issues that theoretically should hurt performance.

The secret is that RavenDB is really good at optimizing itself based on usage patterns. That came out of the realization that we had to drop people into the Pit of Success as much as possible. Raccoon Blog shows that we were able to do just that.

Please note that I am testing this with a RavenDB server over the internet, to emphasize the actual costs involved.

For example, let us look at the current state of this blog, using RavenDB MVC Profiler:

image

Just to be clear, in order to actually show the problem, I am running this locally, while the RavenDB server is the production RavenDB server for this blog. In other words, the vast majority of the time is network traffic, making queries to RavenDB over the Internet.

Remember that I said that we didn’t pay attention to performance? Notice how many remote queries we are making. We have 5 sessions and 7 queries. Why is that?

The reason for that is that we are using the Session per Action approach for scoping the session, and we make heavy use of Child Actions, each of which is going to get its own session. Give me a moment to fix that…

Well, that took three minutes, mostly because I wanted to do it right. There are advantages to Infrastructure Ignorance, and one of them is the ease with which we can make such changes.

image

The change isn’t drastic: we went from having 5 sessions and 7 requests to 1 session and 6 requests. In the post details view, however, it saved us two requests.

The actual cost of opening a session is essentially zero, so it is more the requests that we are making than anything else. We can improve performance even further by applying additional tactics, such as aggressive caching or batched calls, but we will save those for another post.

Update: I was asked about the load, and I hopped off to Google Analytics and got a few numbers:

image

That isn’t super high, but it is respectable.

time to read 2 min | 335 words

Working on RaccoonBlog is like therapy: it is simple, it is fun and there isn’t a lot of thinking involved. We recently did a bunch of new features that I just pushed. None of the features is earth shattering, but each improves the blog somewhat in fun ways.

RSS Feeds per categories:

image

Post and comments count:

image

The number shocked me, actually. And yes, there will be a post about this.

Per post comments feed

image

If you like keeping track of the comments on a particular post, you can now do so very easily.

Global comments feed

image

If you care to track all comments on all the posts, you can do that as well. And yes, I’ll post on that as well.

Social Login

image

One of the things that I hate is the need to login all the time. We remember those details by default, but it is still annoying. I hope that this would be useful to you too.

How long do you have to wait?

image

time to read 2 min | 392 words

Accepting a new Inmate into prison is usually composed of the bureaucracy in the beginning, and ends with the Inmate arriving at his bunk. The last part is actually pretty complex.

Deciding where the Inmate would go is a decision that is composed of many factors:

  • What type of an Inmate is he? (Just arrested, sentenced, sentenced for a long period, etc)
  • What is he in prison for?
  • What kind is he? (You want to avoid Inmate infighting, it creates paperwork, so you avoid putting them in conflict if possible)
  • Where is there room available?

I am skipping over other stuff, but I think that you get the picture.

The Inmate’s location is another thing that seems simple on the surface but gets complicated when you drill down. The Inmate’s location is actually composed of several different aspects. First, and the most obvious one, is the actual physical location of the Inmate. For example:

  • Cell Block B, Section D, Cell 349
  • Redbrick Hospital, ER
  • Vacation
  • East Misphat Courthouse
  • “Loaned” to another prison

All of those are pretty obvious (loaning an Inmate is rarely done, but can happen if he has to do something like show up at a court near the other prison that is far from his current prison).

The next aspect of his location is who has signed for this prisoner. That is a problematic concept if you don’t understand how prisons work. You can think about it as a chain of responsibility. Since the Inmate is in Lawful Custody, if something happens to him, then someone is going to answer some questions. We used to have a joke, “you break, you replace”. Whoever signed for this Inmate is basically the one who has the legal responsibility for this Inmate. This can be:

  • Cpt. Yom Kashe, Commander Cell Block D
  • Lt. Halach Alley, Escorting to Courthouse
  • Sarge. Yashnoni, Guarding at Hospital

I think that you get the picture. And finally, we have the question of who has overall responsibility for this Inmate. Put simply, while the Inmate is hospitalized for a week, it may be Sergeant Yashnoni who is actually standing over the bed, but it is Captain Kashe who has the responsibility for the guy. He is the one who has to report him as “not present, location is known”.
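A small sketch of how those three aspects might sit side by side. The InmateLocation type and its SignOver method are assumptions for illustration only, not the actual Macto model; the idea shown is that signing an Inmate over changes where he is and who is signed for him, but not who has overall responsibility.

```csharp
using System;

// Hypothetical model, for illustration only.
public class InmateLocation
{
    public InmateLocation(string physical, string commander)
    {
        Physical = physical;
        SignedFor = commander;
        OverallResponsibility = commander;
    }

    // Where the Inmate physically is right now.
    public string Physical { get; private set; }

    // Who currently holds legal responsibility for the Inmate.
    public string SignedFor { get; private set; }

    // Who reports on the Inmate at Count, wherever he happens to be.
    public string OverallResponsibility { get; private set; }

    // Hand the Inmate to an escort or guard at a new physical location.
    // Overall responsibility deliberately stays where it was.
    public void SignOver(string newCustodian, string newPhysicalLocation)
    {
        if (string.IsNullOrEmpty(newCustodian))
            throw new ArgumentException("Someone must always be signed for the Inmate", "newCustodian");

        SignedFor = newCustodian;
        Physical = newPhysicalLocation;
    }
}
```

After a SignOver to the guard at the hospital, the cell block commander is still the one who has to report the Inmate as “not present, location is known”.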

time to read 3 min | 508 words

Yes, I know that this is basically saying that select is broken, but I am seeing some very strange stuff here. The code in question is this:

for (int i = 0; i < 15; i++)
{
    var discoveryClient = new DiscoveryClient(new UdpDiscoveryEndpoint());
    var findCriteria = new FindCriteria(typeof(IDiscoverableService))
    {
        Duration = TimeSpan.FromSeconds(1)
    };
    discoveryClient.Find(findCriteria);
    discoveryClient.Close();
}

The full repro can be found here.

The problem is, put simply, that before this code runs, the working set is 57,640 KB, and after it has executed, it is 90,288 KB.

I went over the code with a fine-tooth comb, but I don’t really see where all of this memory has gone.

The actual memory reported by GC.GetTotalMemory does go down after the cleanup, so I guess it could be that .NET isn’t releasing the memory back to the OS, but it still worries me somewhat.
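The repro calls a PrintMemory helper that is not shown in this post. As a rough sketch of what such a helper might look like (the MemoryReport class name and the output format are my assumptions, not the code from the repro), it can report the managed heap size next to the process working set, which is exactly the comparison being made here:

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; the actual PrintMemory in the repro may differ.
public static class MemoryReport
{
    public static string Format(string prefix)
    {
        // Managed heap size after forcing a full collection.
        long managedBytes = GC.GetTotalMemory(true);

        // Working set as the OS sees it, which includes memory the
        // runtime has reserved but not handed back.
        long workingSetBytes;
        using (var process = Process.GetCurrentProcess())
        {
            workingSetBytes = process.WorkingSet64;
        }

        return string.Format("{0}Working Set: {1:N0} KB, Managed: {2:N0} KB",
            prefix, workingSetBytes / 1024, managedBytes / 1024);
    }

    public static void PrintMemory(string prefix)
    {
        Console.WriteLine(Format(prefix));
    }
}
```

If the managed number drops after cleanup while the working set stays high, the memory is most likely being held by the runtime or by native allocations rather than by live managed objects.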

I tried it with higher numbers for the loop, and it seems like eventually it settles down on some number and doesn’t grow from there. My main problem is what happens when you start doing async stuff. Let us take a look here:

int count = 150;
var countdown = new CountdownEvent(count);

for (int i = 0; i < count; i++)
{
    var discoveryClient = new DiscoveryClient(new UdpDiscoveryEndpoint());
    // No Duration is set here, so the default applies.
    var findCriteria = new FindCriteria(typeof(IDiscoverableService));
    discoveryClient.FindProgressChanged += (sender, eventArgs) => {  }; // do nothing
    discoveryClient.FindCompleted += (sender, eventArgs) =>
    {
        discoveryClient.Close();
        PrintMemory("Complete: ");
        // Signal() counts down; Wait() returns once every find has completed.
        countdown.Signal();
    };
    discoveryClient.FindAsync(findCriteria);
}

countdown.Wait();

At peak, I am seeing 600 MB (!) of memory used. Note that we are now using the default duration of 20 seconds, and it seems that DiscoveryClient is very heavy on memory.

If you know how, can you take a look at the code and tell me that I am crazy?

time to read 3 min | 436 words

Why did you write your own blogging software? I can hear a lot of people asking. The answer, to tell you the truth, is actually quite simple.

I needed it. My blogging habits put me quite outside any curve that you care to name, and the usual blogging software just doesn’t take some stuff into account.

Probably the most important aspect of that is the notion of scheduling. When I got the muse, I got the muse, and I post a lot. Just to give you some idea, the entire Microsoft N Layer App Sample was posted during one morning at the office. This would be my 6th post of the day, etc. That means that the process of sending a new post out has to be as automated as possible, including when to schedule it.

Raccoon Blog does this for me, finding out the latest blog post and scheduling the new post on the next work day. But it gets better, because blogging is basically writing over time, it is important to actually see it that way. Here is the main admin screen for Raccoon Blog:

image

As you can see, all of the posts are outlined nicely on a calendar, giving a real sense of when they are going to be posted. It is also showing a problem. When I do a series, I usually write all of them at once, and then dump it to the queue. But it is not necessarily the best case for reading. If you don’t care for the topic of the series, you might be bored for a long time. So it is good to mix up content a bit.

This is magic, but Raccoon Blog lets me drag & drop posts to new dates, meaning that in a few minutes, I can mix up the content so the blog wouldn’t be a one track topic for weeks on end. And the reason it is magic is that this is so much better than the alternatives that I have tried that it feels like heaven.

Here is what it looks like after the re-ordering:

image

Now that is more like it, mixed topics, a lot of stuff that would interest you (I hope) and no pain at all during the entire process.
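The automatic scheduling rule mentioned earlier (find the latest scheduled post, then put the new one on the next work day) can be sketched roughly like this. This is not Raccoon Blog's actual code; the PostScheduler name is made up, and treating Saturday and Sunday as the non-working days is an assumption.

```csharp
using System;

// Hypothetical sketch, not Raccoon Blog's implementation.
public static class PostScheduler
{
    // Returns the first work day after the latest scheduled post.
    // Assumes a Monday to Friday work week.
    public static DateTime NextWorkDay(DateTime latestScheduledPost)
    {
        var candidate = latestScheduledPost.Date.AddDays(1);
        while (candidate.DayOfWeek == DayOfWeek.Saturday ||
               candidate.DayOfWeek == DayOfWeek.Sunday)
        {
            candidate = candidate.AddDays(1);
        }
        return candidate;
    }
}
```

A post written when the queue already ends on a Friday lands on the following Monday, and drag & drop on the calendar then only has to swap dates between posts that were already scheduled this way.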
