Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,546
|
Comments: 51,161
Privacy Policy · Terms
filter by tags archive
time to read 1 min | 82 words

Designer

Today I had a meeting to go over the 2024 budget, and we ran into one of the most important line times. Our coffee budget.

You know the old adage about: Coders are turning Coffee into Code, right?

Certainly true in our case, in 2023 we spent a large 5-figure sum on coffee alone. And 2024 is shaping up to be even more expensive.

Happy new year!

time to read 2 min | 235 words

1

If you are reading this blog, I assume that you are a like-minded person. My idea of relaxation is to sit and write code. Hopefully on something that I’m not familiar with. I have many such blog post series covering topics I care about. It’s my idea of meditation.

For the end of 2023, I thought that we could do something similar but on a broader scale. A while ago Alex Klaus wrote a walkthrough on how to build a complete application from scratch using modern best practices (and RavenDB). We refreshed the code and made it widely available, offering you something fun , educational, and productive to engage with.

The system is a bug tracker (allowing us to focus on the architecture rather than domain concerns), and you can play with a deployed version live. The code is available under the MIT license, and we’ll be very happy to receive any suggested improvements.

Topics that are covered:

  1. Building an enterprise application with the .NET and RavenDB

  2. Non-Relational Data Modeling Through Domain Driven Design Prism

  3. Hidden side of document IDs in RavenDB

  4. Dynamic Fields for Indexing

  5. Entity Relationships in non-relational database (one-to-many, many-to-many)

  6. Multi-tenant database in NoSQL

  7. Database Integration Testing – The Secret Recipe

As usual, I would love any feedback you have to offer.

time to read 6 min | 1070 words

With the release of RavenDB 6.0, we are now starting to focus on smaller features. The first one out of the gate, part of RavenDB 6.0.1 release, is actually a set of enhancements around making backups faster, smaller and cheaper.

I just checked, and the core backup behavior of RavenDB hasn't changed much since 2010(!). In other words, decisions that were made almost 14 years ago are still in effect. There have been a… number of changes in both RavenDB, its operating environment and the size of the database that we deal with.

In the past year, we ran into a number of cases where people working with datasets in the high hundreds of GB to low TB range had issues with backups. In particular, with the duration of backups. After the 6.0 release, we had the capacity to do a lot more about this, so we took a look.

On first impression, you would expect that backing up a database whose size exceeds 750GB will take… a while. And indeed, it does. The question is, why? It’s a lot of data, sure. But where does the time go?

The format of RavenDB backups is really simple. It is just a GZipped JSON file. The contents are treated as a JSON stream that contains all the data in the database. This has a number of advantages, the file size is small, the format itself lends itself well to extension, it is streamable, etc. In fact, it is a testament to the early design decision that we haven’t really had to touch that in so long.

Given that the format is stable, and that we have a lot of experience with producing JSON, we approach the task of optimizing the backups with a good idea where we should go. The problem is likely with I/O (we need to go through the entire database, after all). There were some (pretty wild) ideas flying around on how to address this, but the first thing to do, of course, was to run it under the profiler.

The results, as you can imagine, were not what we expected. It turns out that we spend quite a lot of the time inside of GZip, compressing the data. It turns out that when we set up the backup format all those years ago, we chose GZip and Optimal compression mode. In other words, we wanted the file size to be as small as possible. That… makes sense, of course. But it turns out that the vast majority of the time is actually spent compressing the data?

Time to start looking deeper into that. GZip is an old format (it came out in 1992!). And recently there have been a number of new compression algorithms (Zstd, Brotli, etc). We decided to look into those in detail. GZip also has several modes that affect compression ratio vs. compression time.

After a bit of experimentation, we have the following details when backing up a 35GB database.

Algorithm & Mode    

Size

Time

GZip - Optimal

5.9 GB

6 min, 40 sec

GZip - Fastest

6.6 GB

4 min, 7 sec

ZStd - Fastest

4.1 GB

3 min, 1 sec

The data in this case is mostly textual (JSON), and it turns out that we can reduce the backup time by more than half while saving 30% in the space we take. Those are some nice numbers.

You’ll note that ZStd also has a mode that controls compression ratio vs compression time. We tried checking this as well on a different dataset (a snapshot of the actual database) with a size of 25.5GB and we got:

Algorithm & Mode   

Size

Time

ZStd - Fastest

2.18 GB

56 sec

ZStd - Optimal

1.98 GB

1 min, 41 sec

GZip - Optimal

2.99 GB

3 min, 50 sec

As you can see, GZip isn’t going to get a participation trophy at this point, coming dead last for both size and time.

In short, RavenDB 6.0.1 will use the new ZStd compression algorithm for backups (and export files),  and you can expect to have greatly reduced backup times as well as smaller backups overall.

This is now the default mode for RavenDB 6.0.1 or higher, but you can control that in the backup settings if you so wish.

image

Restoring from old backups is no issue, of course, but restoring a ZStd backup on an older version of RavenDB is not supported. You can configure RavenDB to use the GZip algorithm if that is required.

Another feature that is going to improve backup performance is the notion of backup mode, which you can see in the image above. RavenDB backups support multiple destinations, so you can back up to Amazon S3 as well as Azure Blob Storage as a single unit. 

At the time of designing the backup system, that was a nice feature to have, since we assume that you’ll usually have a backup to a local disk (for quick restore) as well as an offsite backup for longer-term storage. In practice, almost all backup configurations in RavenDB have a single destination. However, because we have support for multiple backup destinations, the backup process will first write the backup file to the local disk and then upload it.

The new Direct Upload mode only supports a single destination, and it streams the data to the destination directly, without touching the disk. As a result of this change, we are using far less I/O during backup procedures as well as reducing the total time it takes to run the backup.

This is especially useful if your backup destination is nearby and the network is good. This is frequently the case in the cloud, where you are backing up to S3 in the same region. In our tests, it reduced the backup time by 30% in some cases.

From a coding perspective, those are not huge changes, but together they mean that backups in RavenDB are now cheaper, faster, and far smaller. That translates to a better operating environment for your system. It also means that the costs of storing backups are going to go down by a significant amount.

You can read all the technical details about the few features in the feature announcements:

time to read 4 min | 622 words

A customer contacted us to complain about a highly unstable cluster in their production system. The metrics didn’t support the situation, however. There was no excess load on the cluster in terms of CPU and memory, but there were a lot of network issues. The cluster got to the point where it would just flat-out be unable to connect from one node to another.

It was obviously some sort of a network issue, but our ping and network tests worked just fine. Something else was going on. Somehow, the server would get to a point where it would be simply inaccessible for a period of time, then accessible, then not, etc. What was weird was that the usual metrics didn’t give us anything. The logs were fine, as were memory and CPU. The network was stable throughout.

If the first level of metrics isn’t telling a story, we need to dig deeper. So we did, and we found something really interesting. Here is the total number of TCP connections on the server over time.

So there are a lot of connections on the system, which is choking it? But the CPU is fine, so what is going on? Are we being attacked? We looked at the connections, but they all came from authorized machines, and the firewall was locked down tight.

unnamed

If you look closely at the graph, you can see that it hits 32K connections at its peak. That is a really interesting number, because 32K is also the number of ephemeral port range values for Linux. In other words, we basically hit the OS limit for how many connections could be sustained between a client and a server.

The question is what could be generating all of those connections? Remember, they are coming from a trusted source and are valid operations.  Indeed, digging deeper we could see that there are a lot of connections in the TIME_WAIT state.

We asked to look at the client code to figure out what was going on. Here is what we found:

There is… not much here, as you can see. And certainly nothing that should cause us to generate a stupendous amount of connections to the server. In fact, this is a very short process. It is going to run, read a single line from the input, write a document to RavenDB, and then exit.

To understand what is actually going on, we need to zoom out and understand the system at a higher level. Let’s assume that the script above is called using the following manner:

What will happen now? All of this code is pretty innocent, I’m sure you can tell. But together, we are going to get the following interesting behavior:

For each line in the input, we’ll invoke the script, which will spawn a separate process to connect to RavenDB, write a single document to the server, and exit. Immediately afterward, we'll have another such process, etc.

Each of those processes is going to have a separate connection, identified by a quartet of (src ip, src port, dst ip, dst port). And there are only so many such ports available on the OS. Once you close a connection, it is moved to a TIME_WAIT mode, and any packets that arrive for the specified connection quartet are going to be assumed to be from the old connection and drop. Generate enough new connections fast enough, and you literally lock yourself out of the network.

The solution to this problem is to avoid using a separate process for each interaction. Aside from alleviating the connection issue (which also requires non trivial cost on the server) it allows RavenDB to far better optimize network and traffic patterns.

FUTURE POSTS

  1. Partial writes, IO_Uring and safety - about one day from now
  2. Configuration values & Escape hatches - 5 days from now
  3. What happens when a sparse file allocation fails? - 7 days from now
  4. NTFS has an emergency stash of disk space - 9 days from now
  5. Challenge: Giving file system developer ulcer - 12 days from now

And 4 more posts are pending...

There are posts all the way to Feb 17, 2025

RECENT SERIES

  1. Challenge (77):
    20 Jan 2025 - What does this code do?
  2. Answer (13):
    22 Jan 2025 - What does this code do?
  3. Production post-mortem (2):
    17 Jan 2025 - Inspecting ourselves to death
  4. Performance discovery (2):
    10 Jan 2025 - IOPS vs. IOPS
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}