Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database


Memorable code

time to read 6 min | 1027 words

public class Program
{
    static List<Thread> list = new List<Thread>();
    private static void Main(string[] args)
    {
        var lines = File.ReadAllLines(args[0]);

        foreach (var line in lines)
        {
            var t = new Thread(Upsert)
            {
                Priority = ThreadPriority.Highest,
                IsBackground = true
            };
            list.Add(t);
            t.Start(line);
        }

        foreach (var thread in list)
        {
            thread.Join();
        }

    }

    private static void Upsert(object o)
    {
        var args = o.ToString().Split(',');
        try
        {
            using(var con = new SqlConnection(Environment.CommandLine.Split(' ')[1]))
            {
                var cmd = new SqlCommand
                {
                    Connection = con, 
                    CommandText = "INSERT INTO Accounts VALUES(@p1, @p2, @p3, @p4,@p5)"
                };

                for (var index = 0; index < args.Length; index++)
                {
                    cmd.Parameters.AddWithValue(@"@p" + (index + 1), args[index]);
                }

                try
                {
                    cmd.ExecuteNonQuery();
                }
                catch (SqlException e)
                {
                    if(e.Number == 2627 )
                    {
                        cmd.CommandText = "UPDATE Accounts SET Name = @p2, Email = @p3, Active = @p4, Birthday = @p5 WHERE ID = @p1";
                        cmd.ExecuteNonQuery();
                    }
                }
            }
        }
        catch (SqlException e)
        {
            if(e.Number == 1205)
            {
                var t = new Thread(Upsert)
                {
                    Priority = ThreadPriority.Highest,
                    IsBackground = true
                };
                list.Add(t);
                t.Start(o);
            }
        }
    }
}
time to read 4 min | 721 words

When Entity Framework came out, there was a lot of excitement, and a lot of people picked it up. I was fairly confused about that, because I didn’t really understand why. One of the major reasons people kept giving was that “it might have problems now, but we are doing this to get an early start for what comes down the road”.

Indeed, at the time, Microsoft had some really interesting plans for Entity Framework and EDM:

Long-term we are working to build EDM awareness into a variety of other Microsoft products so that if you have an Entity Data Model, you should be able to automatically create REST-oriented web services over that model (ADO.Net Data Services aka Astoria), write reports against that model (Reporting Services), synchronize data between a server and an offline client store where the data is moved atomically as entities even if those entities draw from multiple database tables on the server, create workflows from entity-aware building blocks, etc. etc.

I emphasized some parts of that, because I think it is really interesting to look back at those statements in hindsight. We are about 3 years after the fact, and we can see that most of those promised projects actually came about. None of them actually uses the EDM, however. At the time, though, there was a lot of talk and planning about That One Model and how you would define it once and use it across all Microsoft products. I even recall one DotNetRocks show about how SQL Server was probably going to move from the relational model to the EDM Model, as part of a company-wide effort to go to a Model Based architecture, etc.

This is important specifically because of the ending statement of the blog post.

So the differentiator is not that the EF supports more flexible mapping than nHibernate or something like that, it's that the EF is not just an ORM--it's the first step in a much larger vision of an entity-aware data platform.

What actually happened is that the landscape of database tooling in 2011 is drastically different from what it was in 2008. The needs, requirements and usage scenarios are changing with respect to the Cloud, NoSQL, Sharding and more. One of the oft-repeated phrases about Entity Framework at the time was that it is not an OR/M, it is so much more than that. Go and read the recent posts on the EF Design blog. You will see a lot of stuff about Entity Framework as an OR/M. You’ll see none at all about the “much larger vision of an entity aware data platform”.

That isn’t actually surprising; many of us in the community called out the impracticalities of such a vision at the time.

The point of this post isn’t to pick on Entity Framework (in hindsight, a lot of the furor about Entity Framework seems overblown, actually), but to talk about something that is quite important.

It is very easy to talk about what you are going to do in the future. There is no actual commitment there, and you can plan however you like, but the further out in time you go, the less likely those plans are to happen. Not because whoever made those plans lied, but simply because circumstances change. And when that happens, plans are either going to change or become irrelevant.

The other aspect of this is that you should very rarely base your own decisions on what someone else says they are planning to do that far down the road. Especially if it means that you are going to take a currently inferior product just so you will be familiar with it when it becomes great (part 23.13.B, section 12A in the Grand Plan). You should base your decisions on the current and upcoming stuff, not stuff that is so far in the future that the entire industry is going to change twice before the due date.

Sure, you probably want to keep an eye on what is going on and what the future plans are, but it isn’t really a good idea to base your decisions on that. I mean, if you were listening to the 2008 PDC, you would have bet the farm on Oslo…

time to read 3 min | 434 words

I really like the notion of reducing the number of remote calls, so why did I stick the Lazy Requests feature of RavenDB in the session.Advanced section of the API and not put it in the center, directly off the session?

The answer is that I expect Lazy Requests to be a very powerful feature, but at the same time, it isn’t important enough a feature for us to justify increasing the surface area of the session. One of the main goals with RavenDB is simplicity and power. The simple stuff should be simple. We actually consider it a bug if you can’t pick up RavenDB and start using it in ten minutes or less.

That does not mean that we don’t add powerful features, but we are careful in ensuring that those features won’t contaminate the Getting Started scenario.

Another consideration is that as powerful as Lazy Requests are, the common best practice for RavenDB is already reducing the number of requests drastically, so we mostly need them for occasional use, not common usage. Once we figured out that in many cases using Lazy Requests is a rare thing, the decision about where to put it became much simpler. In other words, it doesn’t really matter if you are making two queries vs. one. It matters a lot more if you are doing 30.

One of the more interesting aspects of designing RavenDB is actually in the exposed API. We are working hard to make sure that this API is as simple and predictable as possible. I am more than willing to give users options to solve specific problems, but it is important to consider that at its core, RavenDB is a database, and as such, what people mostly care about is CRUD. And that is why the session interface is the way it is, because you get to do CRUD right off the bat, and if you want more knobs to turn and handles to crank, you go behind the session.Advanced door and get all of the features that you could imagine.

Another aspect of that is the API suggestions from users for all sorts of stuff, from SaveChangesAndWaitForIndexing to DeleteAll, etc.

Those things are useful, sure. But they can be implemented as extension methods, and they wouldn’t be useful for the general case. The thing that I am trying to avoid is what happened to Rhino Commons Repository<T>, which got so many features to handle one-off use cases that it was really quite hard to use for the common case.
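As a rough sketch of the extension method approach, here is what a DeleteAll helper might look like. The ISimpleSession interface and the InMemorySession class are hypothetical stand-ins invented for this example so that it runs on its own; they are not RavenDB's actual session API, and a real helper would be written against whatever session type you use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, minimal session surface used only for this sketch.
// It is NOT RavenDB's API; it just keeps the example self-contained.
public interface ISimpleSession
{
    IEnumerable<T> Query<T>();
    void Delete<T>(T entity);
    void SaveChanges();
}

// The one-off feature lives outside the core interface, as an extension method.
public static class SessionExtensions
{
    // Deletes every stored entity of type T and returns how many were removed.
    public static int DeleteAll<T>(this ISimpleSession session)
    {
        // Materialize first so we do not modify the store while enumerating it.
        var items = session.Query<T>().ToList();
        foreach (var item in items)
        {
            session.Delete(item);
        }
        session.SaveChanges();
        return items.Count;
    }
}

// Trivial in-memory implementation so the sketch can be exercised.
public class InMemorySession : ISimpleSession
{
    private readonly List<object> store = new List<object>();

    public int Count { get { return store.Count; } }

    public void Store(object entity) { store.Add(entity); }

    public IEnumerable<T> Query<T>() { return store.OfType<T>(); }

    public void Delete<T>(T entity) { store.Remove(entity); }

    public void SaveChanges() { /* nothing to flush in memory */ }
}
```

The core interface stays at three members, and anyone who needs DeleteAll opts in by referencing the extension class.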

time to read 2 min | 229 words

In the following code, what do you believe the output should be?

class Program
{
    static void Main(string[] args)
    {
        dynamic stupid = new Stupid{Age = 3};

        Console.WriteLine(stupid.Age);
    }
}

public class Stupid : DynamicObject
{
    public int Age { get; set; }

    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        result = 1;
        return true;
    }
}

Argh!

The default is to use the actual type members, and then fall back to the dynamic behavior. Whereas I would expect it to first check the dynamic behavior and then fall back to the actual members if it can’t find dynamic stuff.

Quite annoying.
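A minimal sketch of the two orderings, for comparison. StaticFirst mirrors the class above. DynamicFirst is a hypothetical alternative that declares no public Age member at all and keeps its values in a dictionary, so every property read has to go through TryGetMember and the dynamic behavior can be checked first. The class names and the Set method are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Same shape as the Stupid class above: Age is a real, public member.
public class StaticFirst : DynamicObject
{
    public int Age { get; set; }

    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        // Only reached for names the binder cannot resolve against real members.
        result = 1;
        return true;
    }
}

// Hypothetical alternative: there is no public member named Age,
// so reading a property always ends up in TryGetMember.
public class DynamicFirst : DynamicObject
{
    private readonly Dictionary<string, object> values =
        new Dictionary<string, object>();

    public void Set(string name, object value)
    {
        values[name] = value;
    }

    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        // Dynamic behavior first...
        if (binder.Name == "Age")
        {
            result = 1;
            return true;
        }
        // ...then fall back to the stored values.
        return values.TryGetValue(binder.Name, out result);
    }
}
```

Reading Age through a dynamic reference gives 3 for a StaticFirst created with Age = 3, and 1 for a DynamicFirst even after Set("Age", 3) has been called on it. Any other name stored in DynamicFirst comes back from the dictionary.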

time to read 2 min | 289 words

I was pointed to this codebase, as a good candidate for review. As usual, I have no contact with the project owners, and I am merely using the code as a good way to discuss architecture and patterns.

It starts with this class:

image_thumb

Okay, this codebase is going to have the following problems:

  • Complex and complicated
  • Hard to maintain
  • Hard to test
  • Probably contains a lot of code “best practices” that are going to cause a lot of pain

And I said that this is the case without looking at any part of the code except for this constructor. How am I so certain of that?

Put simply, with 9 dependencies, and especially with those kinds of dependencies, I can pretty much guarantee that this class already violates the Single Responsibility Principle. It is doing too much, and it is too complex and too fragile.

I shudder to think what is involved in testing something like that. Now, to be fair, I looked at the rest of the codebase, and it seems like I caught it in a state of flux, with a lot of stuff still not implemented.

Nevertheless… this is a recipe for disaster, and I should know, I have gone ctor happy more than once, and I learned from it.

And here is the obligatory self reference:

image

And yes, this does give me a headache, too.

time to read 1 min | 146 words

I was pointed to this codebase, as a good candidate for review. As usual, I have no contact with the project owners, and I am merely using the code as a good way to discuss architecture and patterns.

It starts with this class:

image

Stop right here!

Okay, this codebase is going to have the following problems:

  • Complex and complicated
  • Hard to maintain
  • Hard to test
  • Probably contains a lot of code “best practices” that are going to cause a lot of pain

Tomorrow, I’ll discuss why I had that reaction in detail, then dive into the actual codebase and see if I am right, or just have to wipe a lot of egg off my face.

time to read 1 min | 88 words

I don’t usually do posts about current events, but this one is huge.

To celebrate the event, you can use the following coupon code: SLT-45K2D4692G to get a 19.41% discount (Gilad was captive for 1,941 days) for all our profilers:

Hell, even our commercial support for NHibernate is participating.

Please note that any political comment to this post that I don’t agree with will be deleted.

time to read 3 min | 485 words

This is a question that comes up relatively often in the RavenDB mailing list. How do I handle multiple users with RavenDB? Does it support multiple users? Does it support the Membership Provider?

Those questions usually confuse a very key concept regarding users. Whose users are they?

In particular, we need to make a distinction between System Users and Application Users. Despite using the same term for both, there is actually very little connection between the two.

Here is an example of a System User:

<connectionStrings>
    <add name="RavenDB" connectionString="Url=http://scotty.ravendb.net;user=beam;password=up"/>
</connectionStrings>

As you can probably surmise, this is a connection string, and the user is ‘beam’. This user is a System User: if you call the Ops Team and ask them why the password expired, they can help you there.

This is a system user. It controls access to external resources, and usually you have very few of those. Usually they control things like what parts of the disk you can write to, what databases you can connect to, etc. For the most part, they aren’t in your control; you don’t manage them and neither does your application.

In contrast to that, here is a great example of an Application User:

image

An Application User is unique to its application. It is usually manifested as a document (or a database row) and doesn’t have any existence beyond that. If you called the Twitter Ops Team and told them that the RavenDB account password needs resetting, they would be pissed that you are wasting their time.

This distinction is important, because it implies a lot about how we use those two different types of users.

System Users are used… well, for the system. Application Users are the actual users using the system. Very rarely are they one and the same. Usually our applications use service accounts, and any security checks for what an Application User can do are implemented as part of the business logic, not by setting ACLs.

Don’t confuse the two, despite the common name.
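To make the distinction concrete, here is a small sketch of an Application User as plain application data, with the permission check written as ordinary business logic rather than an ACL. All of the names here (ApplicationUser, BlogPost, PostPolicy, the "Admin" role) are hypothetical and exist only for illustration; this is not the Authorization Bundle or the Membership API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Application User: just a document the application stores.
// It has no meaning outside the application itself.
public class ApplicationUser
{
    public string Id { get; set; }
    public string Name { get; set; }
    public List<string> Roles { get; set; }

    public ApplicationUser()
    {
        Roles = new List<string>();
    }
}

// Hypothetical piece of application data that users act on.
public class BlogPost
{
    public string Id { get; set; }
    public string AuthorId { get; set; }
}

// The security check is business logic: it compares application data,
// and never touches the System User in the connection string.
public static class PostPolicy
{
    public static bool CanEdit(ApplicationUser user, BlogPost post)
    {
        if (user == null || post == null)
        {
            return false;
        }
        return user.Id == post.AuthorId || user.Roles.Contains("Admin");
    }
}
```

The System User from the connection string never appears in this code. It only decides whether the application as a whole can reach the database.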

And coming back all the way to the original question. RavenDB comes with the notion of System Users via Windows Auth and OAuth, and it helps with Application Users using the Authorization Bundle. But you really don’t want to use the membership API, regardless of the underlying storage.
