Protocol design implications: REST vs. TCP
I was going over design documents today, and I noticed some common themes in the changes that we have between RavenDB 3.5 and RavenDB 4.0.
With RavenDB 3.5 (and all previous versions), we always had the communication layer as HTTP REST calls between nodes. When I designed RavenDB, REST was the thing to do, and it is reflected in the design of RavenDB itself. However, 8 years later, we sat down and considered whether this is really appropriate for everything. The answer was a resounding no. In fact, while over 95% of RavenDB is still pure REST calls, we have moved certain key functions to using TCP directly.
Note that this goes in direct contrast to this post of mine from 2012: Why TCP is evil and HTTP is king.
The concerns in that post are still valid, but we have found that there are a few major reasons why we want to switch to TCP for certain things. In particular, the basic approach is that a client will communicate with the server using HTTP calls, but servers communicate with one another using TCP. The great thing about TCP is that it is a stream-oriented protocol, so I don’t need to carry state with me on every call.
With HTTP, each call is stateless, and I can’t assume anything about the other side. That means that I need to send the state, manage the state on the other side, and have to deal with potential issues such as concurrency in the same conversation, restarts of one side that the other side can’t easily detect, repeated validation on each call, etc.
With TCP, on the other hand, I can make a lot of assumptions about the conversation. I have state that I can carry between calls to the other side, and as long as the TCP connection is open, I can assume that it is valid. For example, if I need to know the last item I sent to the remote end, I can query that at the beginning of the TCP connection, as part of the handshake, and then I can just assume that what I sent to the other side has arrived (since otherwise I’ll eventually get an error, requiring me to create a new TCP connection and do another handshake). On the receiving side, I can verify the integrity of a connection once, instead of repeatedly verifying our mutual state on each and every message being passed.
This has drastically simplified a lot of code on both the sending and receiving ends, and reduced the number of network roundtrips by a significant amount.
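To make the handshake-then-stream idea concrete, here is a minimal sketch in Python. It is not RavenDB's actual protocol: the wire format (big-endian 64-bit integers), the function names, and the use of plain integers as "items" are all assumptions made for illustration. The receiver states the last item it has seen once, at the start of the connection, and the sender then streams everything after that point without attaching state to each message.

```python
import socket
import struct
import threading

# Hypothetical wire format: every value is a big-endian unsigned 64-bit integer.
U64 = struct.Struct(">Q")


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closed the connection."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def serve_once(listener, received):
    """Receiver: report the last item seen once, then just consume the stream."""
    conn, _ = listener.accept()
    with conn:
        last_seen = received[-1] if received else 0
        conn.sendall(U64.pack(last_seen))  # handshake: "I have everything up to here"
        while True:
            try:
                (item,) = U64.unpack(recv_exact(conn, U64.size))
            except ConnectionError:
                return  # sender hung up; a new connection means a new handshake
            received.append(item)


def send_items(address, items):
    """Sender: learn the resume point once, then stream without per-item state."""
    with socket.create_connection(address) as sock:
        (last_seen,) = U64.unpack(recv_exact(sock, U64.size))
        pending = [i for i in items if i > last_seen]
        for item in pending:
            sock.sendall(U64.pack(item))
        return pending


def demo():
    received = [1, 2]  # the receiver already holds items 1 and 2
    with socket.create_server(("127.0.0.1", 0)) as listener:
        worker = threading.Thread(target=serve_once, args=(listener, received))
        worker.start()
        sent = send_items(listener.getsockname(), [1, 2, 3, 4, 5])
        worker.join()
    return sent, received
```

Calling demo() sends only items 3, 4 and 5, because the handshake told the sender that 1 and 2 had already arrived. If the connection breaks, the sender reconnects and the handshake runs again, which is the only place where the two sides have to agree on state.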
Comments
Ayende,
Did you have a look at CoAP? It was mainly made for IoT, but it might help you here. Not sure if it handles sessions the way you want.
Remi, I don't want a REST model, I want a stateful model
You could use a session with HTTP. A session would just be a Dictionary<int, SomeState>. It seems unnecessary to use a "physical" protocol such as TCP to implement a logical concept such as conversations. It seems an unnecessary coupling and inefficient. SQL Server does this, too. It requires one TCP connection for each session, which causes problems. I do support a custom protocol for high performance scenarios, though.
tobi, That would require a LOT of code to run / manage, etc. It also doesn't work without extensive locking, timers and complications, all of which we get for free from TCP.
Did you consider WebSockets? Lots of the advantages of HTTP, but stateful.
I like it. I've never been a fan of trying to do everything over HTTP. It has its place, but when you control both ends, there are more efficient approaches, despite the challenges involved. Thanks for the update.
Have you considered using gRPC instead of HTTP REST calls? It seems to have wide language support.
David, We did try that, and we really wanted to make it work, but there are a few problems there. a) There is a significant cost compared to TCP that is very noticeable when you start passing large amounts of data. b) The quality of the implementation is... less than desirable in several cases. In particular, .NET Core + Linux doesn't really work well for our purposes.
armon, That gives me very little, actually. I don't want RPC. I want a stream of communication, which is why TCP is perfect for this
TCP is great, but it requires you to implement things like message framing, authentication and other things that HTTP gives you for free. So you will probably need to use another protocol above it, or create your own.
I'm with Toby. HTTP/REST, because it's way simpler than rolling and maintaining your own TCP communication stuff. You can create a session cache like Toby mentioned for enhanced perf, which eliminates a lot of your objections. Way less complicated and error prone.
Anyway just my 2ct, We all know what you are going to do :-)
Edward, Give it a try, and see how much simpler it is than using the TCP socket directly.
I am a big fan of using TCP directly instead of other application level protocols. As André mentioned, it may require a little bit of protocol design. The only challenge I've had is maintaining sockets that are in different states: authentication, handshake, etc.
How about HTTP/2? https://en.wikipedia.org/wiki/HTTP/2
Edward, It isn't supported by Kestrel or the client side API, so that was out.