Looking into Odin and Zig: My notes
I was pointed to the Odin language after my post about the Zig language. On the surface, Odin and Zig are very similar, but they have some fundamental differences in behavior and mindset. I’m basing most of what I’m writing here on an admittedly cursory reading of the Odin language docs and this blog post.
Odin has a great point on conditional compilation. In Zig, if statements that are evaluated at compile time are hard to distinguish from runtime ones. I like Odin’s when clauses better, though Zig’s comptime if makes it easier. The actual problem I have with this model in Zig is that it is easy to get into a situation where you write (new) code that doesn’t get called; Zig will detect that it is unused and not bother compiling it. When you actually try to use it, you’ll hit a lot of compilation errors that you need to fix. This is in contrast to the way I usually work, which is to almost always have the code in a compilable state, leaning hard on the compiler to double-check my work.
Beyond that, I have grave disagreements with Ginger, the author of the blog post and the Odin language. I want to pull just a couple of what I think are the most important points from that post:
I have never had a program cause a system to run out of memory in real software (other than artificial stress tests). If you are working in a low-memory environment, you should be extremely aware of its limitations and plan accordingly. If you are on a desktop machine and run out of memory, don’t try to recover from the panic, quit the program or even shut down the computer. As for other machinery, plan accordingly!
This is in relation to automatic heap allocations (which can fail, and which will usually kill the process because there is no good way to report it). My reaction to that is “640KB is enough for everything”, right?
To start with, I write databases for a living. I run my code in containers with 128MB while the user uses a database that is 100s of GB in size. Even when running on proper server machines, I almost always have to deal with datasets that are bigger than memory. Running out of memory happens to us pretty much every single time we start the program, and handling this scenario robustly is important to building system software. In this case, planning accordingly, in my view, means not using a language that can put me in a hole. This is not theoretical; that is a real scenario that we have to deal with.
The biggest turnoff for me, however, was this statement on errors:
…my issue with exception-based/exception-like errors is not the syntax but how they encourage error propagation. This encouragement promotes a culture of pass the error up the stack for “someone else” to handle the error. I hate this culture and I do not want to encourage it at the language level. Handle errors there and then and don’t pass them up the stack. You make your mess; you clean it.
I didn’t really know how to answer that at first. There are so many cases where that doesn’t even make sense that it isn’t even funny. Consider a scenario where I need to call a service that would compute some value for me. I’m doing that as gRPC over TCP + SSL. Let me count the number of errors that can happen here, shall we?
- Bad response from the service (it ran out of memory, for example)
- Argument passed is not a valid one
- Invalid SSL certificate
- Authentication issues
- TCP firewall issue
- DNS issue
- Wrong URL / port
My code, which is calling the service, needs to be able to handle any / all of those, and probably quite a few more that I didn’t account for. Trying to build something like that is onerous, fragile, and doesn’t really work. For that matter, if I passed the wrong URL for the service, what is the code that is doing the gRPC call supposed to do but bubble the error up? If the DNS is returning an error, or there is a certificate issue, how do you clean that up? The only reasonable thing to do is to give as much context as possible and raise the error to the caller.
When building robust software, bubbling an error up so the caller can decide what to do isn’t about passing the buck; it is a best practice. You only need to look at Erlang and how applications with the highest requirements for reliability are structured. They are meant to fail; error handling and recovery happen in dedicated locations (supervisors), because those places have the right context to make an actual determination.
The killer impact of this, however, is that Zig has an explicit notion of errors, while Odin relies on the multiple return values system. We have seen how well that works with Go. In fact, one of the most common complaints about Go is how much manual work it takes to do proper error handling.
But I think that the key issue here is that errors as a first-class aspect of the language give us a very powerful ability: errdefer. This single language feature is the reason I think that Zig is an amazing language. The concept of first-class errors, combined with errdefer, makes building complex structures so much easier.
Consider the following code:
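(The original snippet is not reproduced here; this is a sketch of the pattern described below. `validHash` and `expected_size` are hypothetical names, and Zig's mmap APIs have moved around between standard library versions.)

```zig
const std = @import("std");

fn loadData(path: []const u8, expected_size: u64) ![]u8 {
    const file = try std.fs.cwd().openFile(path, .{});
    // The file handle is closed on every exit path, success or error.
    defer file.close();

    const stat = try file.stat();
    if (stat.size != expected_size)
        return error.InvalidSize;

    const mapped = try std.posix.mmap(
        null,
        stat.size,
        std.posix.PROT.READ,
        .{ .TYPE = .SHARED },
        file.handle,
        0,
    );
    // The mapping is released only if we leave via an error;
    // on success, the caller owns it.
    errdefer std.posix.munmap(mapped);

    if (!validHash(mapped))
        return error.InvalidHash;

    return mapped;
}
```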
Note that I’m opening a file, mapping it to memory, validating its size, and then checking that it has the right hash. I’m using defer to ensure that I clean up the file handle. But what about the returned memory? In this case, I want to clean it up if there is an error, but not otherwise.
Consider how you would write this code without errdefer. I would have to add the “close the map” portion to both places where I return an error. And what happens if I’m using more than a couple of resources? I may need to do something that requires a file, a network socket, memory, etc. Any of those operations can fail, but I want to clean them up only on failure; otherwise, I need to return them to my caller. Using errdefer (which relies on the explicit distinction between regular returns and errors) ensures that I don’t have a problem. Everything works, and the amount of state that I have to keep in my head is greatly reduced.
Consider how you’d do that in Odin or Go, on the other hand, and you can see how error handling becomes a big beast. Having explicit language support to assist with that is really nice.
Comments
Just to be clear, Erlang in general relies heavily on bubbling up errors via the {ok, Value} or {error, Error} return value conventions. Where it shines is in having isolated lightweight processes that can be supervised and their crashes handled as needed.
So in general Erlang code is a lot more like Odin and in general process supervisors don't have the proper context to handle errors.
Conrad,
If you aren't matching on the error, you'll automatically cause another error and that will be raised further. In other words, you can ignore the errors you don't need to handle and everything is taken care of for you. For example, see:
https://github.com/processone/ejabberd/blob/d741f6f5f25db4819b17031c1b28183c862caee0/src/misc.erl#L258-L262
In this case, we have error handling left for the caller, no need to mess about with it here. The code relies on the match failure on error, which is then handled elsewhere.
Hi Oren,
Thanks for your thoughtful response (that sounds like I'm a bot, sorry ;-)
I think I understand better what you mean by "Zig has explicit notion of errors", just like Erlang has an explicit concept of a match failure in your example, which will cause the process to crash.
Btw, the more I read about RavenDB, the better it looks. Hats off!
Having written software for almost two decades now, those two statements by the author of the blog post shock me.
I have written applications that have run the system (or process memory space) out of memory on a number of occasions.
The statement about error handling reminds me of the mindset of developers at my first job writing VB6. Got an error? On Error Resume Next, problem solved! Bubbling errors up is a best practice because not all code can make the decision about how to handle an error.
I don't think Odin's error handling is all that bad; it certainly has its use cases. It is just one design that can be considered.
In .NET, we actually have many similar implementations, e.g. `IActionResult` from ASP.NET and `HttpResponseMessage` from `HttpClient`, where the response of a method can and should be explicitly handled without `try...catch`. When someone asked the ASP.NET team about this approach, they said it gives great performance, without exceptions choking up the stack trace.

His comment on the issue of `try...catch` also makes sense for business apps. In an API that requires multiple entities from different sources, an outer `try...catch` makes it hard to know what's going on. Individual results with error details can determine the outcome of the API.

For example, consider modifying an entity with such an approach, where other scenarios involve getting items from different APIs or tables. A `try...catch` around the whole thing makes it hard to know whether you hit the issue at the get or at the modify. Of course, there are ways around that, such as named specific exceptions, but `IActionResult` encourages developers to verify errors. I know plenty of junior or even senior developers who don't even check the HTTP response for errors, not handling anything. Of course, that's one kind of extreme.

Anyway, it's just a tool, another pattern or way to solve the problem. I don't think it should be used everywhere, but it has its benefits.
Jason, let's take `HttpClient` as a great example. When you execute a request, you may get an error response, but the concept here is that you got a response. It is now on you to handle that. The old `WebRequest` code used to throw for non-successful requests, and that made handling things like 404 / 302, etc., much harder. In those cases, you are looking at expected errors, and the handling of them should be in the code itself. On the other hand, `HttpClient` will throw if the request itself failed. In other words, there are expected values (a request may return 200, 404, etc.) and unexpected ones (failed to connect to the server). And you absolutely want to make sure that you cover all of them for robust software.
On errdefer, I still really wish/hope that C# would expose the CLR `fault` block, which is `finally`, but only if leaving the `try` via an exception.

@Oren, you are absolutely right. At the language level, I completely agree that going all in on the Odin kind of approach is overkill. The exception kind of approach should be the basic style, with the style Odin uses as an addition. As you have said, `HttpResponseMessage` only covers the HTTP response; it does not cover cases such as network issues or SSL handshake issues. It only covers the case where an HTTP response was successfully received (ignoring the HTTP status).

I hosted a conversation between the two creators on the fairly well-known Compiler Podcast. It has closed captioning and a chapters menu -- might be worth a listen to expand your thoughts on the languages!
P.S. We'll be hosting another podcast this year.
Thank you for taking the time to try out both Odin and Zig.
I want to clarify my position as the comments you are commenting on are lacking a little nuance.
For the first one, regarding OOM: Odin's allocators all support error values signalling things such as `Out_Of_Memory`, `Invalid_Pointer`, `Invalid_Argument`, and `Mode_Not_Implemented`. So if you want to handle those error states, there is nothing preventing you from doing so in Odin. However, there are two aspects to my original argument: you should know the constraints of the platform you are on and plan accordingly, as that is part of your problem; and what happens in the case when you have completely run out of memory (globally/system-wide)?

I've worked on systems with 16 KiB, and that's just how much you had to deal with. Meaning we had to plan accordingly for what we needed so that we never ran out of memory. The program was heavily tested (both analytically and empirically) so that we made sure it ran within the constraint of 16 KiB. On such platforms, `malloc` was never called, and was in fact banned.

In this comment, I say "program" and not "allocator" here. I have many allocators that run out of memory _on purpose_. On desktop applications that I have worked on (I know this does not apply to all), if we did run out of memory, the better option was to just crash rather than try to recover. But of course some applications cannot do this, especially when dealing with third-party programs, such as databases (as you state). In those cases, the best you can do is empirically test how much is required and gracefully handle those cases when you do OOM (if possible). That's part of your problem; it's not an external thing. If you can control things, try to.

Minor note: Zig's allocators only return one possible error value, `error.OutOfMemory`, whilst Odin has 4 possible error values, meaning there is finer granularity in Odin's error value system for its allocators.

Onto the second comment you bring up, regarding error value handling.
Yes. And you should handle them accordingly, and not just in a general "catch all", especially since these are wildly different kinds of failure states.
Regarding error value propagation, I need to clarify my position a little more: in that article I was referring to an aggregate set of comments from a GitHub post, and not making a truly nuanced argument. Odin has the `or_return` operator, which allows the user to easily propagate values in code. It is similar to Rust's `?` or Zig's `try`, but it complements Odin's multiple return values AND does not rely on having a concept of an "error value type".

Demo of `or_return`: https://github.com/odin-lang/Odin/blob/fd256002b3190076bb91ec3e02ae17c858222eb5/examples/demo/demo.odin#L2030 And check out Odin's `core:math/big`, which makes extensive use of the concept.

Even though I added this concept to Odin, I do believe that my general hypotheses are still correct regarding exception-like error value handling. The main points being:
The most important one is the degenerate state issue, where all values can degenerate to a single type. It appears that you, and many others, pretty much only want to know if there was an error value or not and then pass that up the stack, writing your code as if it were purely the happy path and then handling any error value. Contrasting with Zig: Zig has a built-in concept of an error value type, and all error values are either inferred (`!`), have a specific error set, or degenerate to `anyerror`. In practice, from what I have seen of many Zig programmers, most people just don't handle error values and pass them up the stack to a very high place, and then pretty much handle any error state as if it were all the same degenerate value: "error or not". This problem occurs in Go code too, because everyone degenerates to the `error` interface in practice; therefore it's now equivalent to a fancy boolean.
interface in practice, therefore it's now equivalent to a fancy boolean.For error values in Odin, you can use any data type you require for your problem, and even compose many together. It's very common to just use an
enum
or even a boolean, but sometimes you need something more complex or aggregate too. A good example of this is when you need to aggregate different possible error values from multiple different data types. The two common approaches are to make a mapping procedure to convert them to the specificenum
OR to have a (discriminated)union
of theenum
s. This means Odin does not suffer from the degenerate state issue that Zig, Go, and many other languages do suffer from. This is because Odin doesn't have a built-in degenerate type which would be typically used for error values.One issue with Zig's approach is due to its error value inference system, it's very difficult to know what possible errors values a procedure returns just from reading the code (assuming
!T
oranyerror!T
is used), you have to compile the code in order to determine that. This may be okay for a single developer, but if you are on a larger team you are now either relying on external documentation (which Zig can provide) or the compiler telling you this. With a specific (error) set this is not a problem because it's explicit to the reader.Most of the time you are reading code, not writing code.
Odin's type system and feature set are rich enough that you can achieve everything you want _extremely cleanly_. You can draw many parallels between Odin and Zig: `errdefer` would be akin to `defer if err != nil {` in Odin, `try` would be akin to `or_return`, etc.

n.b. Someone recently wrote an article which is closer to my own position on error value handling (with some of the author's own flair and preferences) here: http://tuukkapensala.com/files/software_does_not_contain_errors.txt
Thank you again for writing this article. And as I always say, I highly recommend people try out as many tools (which includes programming languages) as they can, to find the best tools for their problems!
Regards,
Bill
Thank you for this article. I've been looking with interest at Zig. Your comment about `errdefer` got me thinking about how I would do this in Go. It's actually simple with named return parameters; it's a little more verbose than Zig, but not onerous.

Best, C
Craig,
Is there a cost here to doing the function invocation vs. a standard call?
That depends on many things, but in general, that is my responsibility. I cannot handle that at the place where I'm trying to allocate, in 99% of the cases. I need to send it up, and I want to do that in a structured manner.
Take RavenDB, for example. We have an indexing batch that needs to run and may allocate too much. We handle running out of memory by cancelling the indexing batch, freeing all the memory, and trying again with a lower maximum number of items in the batch.
Note that the allocation failure may happen anywhere in the indexing batch code (which is pretty big). So we need a way to return that information up, and we want that to be something better than an error code. Errors are important, and we need to deal with them as a first-class concept, in my eyes.
When writing C code, you'll spend a lot of time managing errors, getting confused about whether a return value of `0` means an error, or whether it is `-1`, etc. Having this clear is a major plus.

My users will run RavenDB (the same software) on 128MB Docker instances and massive 256 GB machines. In both cases, I do not want them to have to configure the system extensively. I want things to just work. That means that I am going to be optimistic about things, try to allocate, and if I fail, I want to handle that. An example of one such scenario is above.
My software can't usually make assumptions or plan. I have no idea where users are going to run it, nor what kind of data they will put in or what kind of workload is required from the indexes. But having the ability to get and respond to errors in a structured and consistent manner helps a lot to deal with that.
Errors are going to happen, and my task is to deal with them appropriately, not to try to "fix" that.
That actually isn't really relevant. The issue isn't what error code you return. The issue is what kind of system you have for dealing with errors.
In the case of Zig, you have to deal with the error scenario. You can use `catch` or `if` or something else, but it is explicit and in your face. In Odin's case, you have to write the `if err ...` manually, and if you forget to do that, everything will still work. `or_return` is syntactic sugar, but it doesn't help with the core issue: the ability to silently ignore the error in the code. That means that it is possible, and easy, to lose the error entirely. That isn't a minor issue. I'll point you to this paper: https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf

And this super critical finding:
That is when we are talking about well-tested production systems, and catastrophic failures that can be seen as failures to handle errors properly.
I'm not harping on error handling by accident. I'm seeing that all the time.
About `or_return`: I don't like the ergonomics of the keyword, but it has other issues. Again, it is easy to ignore an error code. I would argue that you want to have an explicit error type, since errors are important and you want a clear distinction in how they are used.

Given this code:
And I can call it like:
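(Again a hedged sketch, assuming a `compute` procedure that returns a value and an error; `do_something_with` is a hypothetical helper:)

```odin
value, err := compute(42)
// `err` is assigned but never inspected -- this compiles
// silently unless the compiler is run with -vet
do_something_with(value)
```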
Note that this is quite easy to get wrong. In Go, I'll get an unused variable error, at least. In Odin, you'll need `-vet` to catch it. In Rust or Zig, you'll simply not be able to use this value without somehow dealing with the erroneous case.
You are correct here. In almost all cases, what I want to know is just whether or not there was an error. Sometimes there is something that I can do locally to handle the error. For example, if I'm getting a disk full error when trying to extend a file, I may be able to use a less eager allocation scheme; but in the vast majority of cases, I don't care why the function failed, only that it did. Note that, at the same time, the information about the error is _critical_. But it is usually not there for higher-level code to act upon; it provides more context for logs, user-visible errors, etc.
In that case, by the way, Zig's approach of an error type that is just an int is lacking, and Rust's richer error type gives you the ability to have more context. But even there, that tends to be a rare thing to want to do.
With exceptions, you have exception hierarchies, and again, they are generally very rarely used properly. The difference with exceptions is that you'll usually get good stack traces, and while Zig has some support for that, it isn't integrated with errors nicely.
Note that with regard to the `union` for errors and the like, we see something similar in Rust, with its error union model, and it is typically quite painful there, with many crates dealing with the issue. That comes back to what exactly is the error that you care about. At the high-level code, I don't care if the DNS has no entry or if there is a firewall in the middle. I just know that my REST call failed and I need to try an alternative or retry later.

About Zig's `anyerror!T`, I agree that this makes it hard to read the code, but that only matters if you actually care about the error. I would assume that you can write an AST transformer that would provide the details of the errors explicitly. In some cases, I would argue that this is required, for example, when we are talking about interface changes (and adding an error is such a scenario). But it is a big strength when you are writing code, because you don't need to maintain that state in your head, which is very rarely required. And when reading code, you usually don't care; you care about "an error / not an error", that is it.
Also, `defer if err != nil {` is quite cool and much better than the Go one, even if I like `errdefer` better.

And thanks for the interesting discussion.