Remote-Url: https://niedzejkob.p4.team/bootstrap/throw-catch/
Retrieved-at: 2021-11-09 07:15:32.466305+00:00
September 20, 2021

Considering Forth's low-level nature, some might find it surprising how well-suited it is to handling exceptions. But indeed, ANS Forth does specify a simple exception handling mechanism. As Forth doesn't have a type system capable of supporting a mechanism like Rust's Result, exceptions are the preferred error handling strategy. Let's take a closer look at how they're used, and how they're implemented.

A user's perspective

The exception mechanism consists of two user-facing words: catch and throw. Unlike other control flow words, which act as additional syntax, catch merely wants an execution token at the top of the stack, which usually means that ['] will be used to obtain one just before the call to catch (though outside of a definition, ' is used instead).

If executing the execution token passed to catch doesn't throw anything, catch will push a 0 to indicate success:

    42 ' dup catch .s ( <3> 42 42 0 ok )

On the other hand, if throw is executed, throw's argument is left on the stack to indicate the exception's type:

    : welp 7 throw ;
    1 2 ' welp catch .s ( <3> 1 2 7 ok )

The stack elements below this exception code are not just what was there when throw was run, though. If there's more than one possible throw location, the layout of the stack would become unpredictable. That is why catch remembers the stack depth, such that throw may restore it. As a result, if our welp pushes additional elements onto the stack, they'll get discarded:

    : welp 3 4 5 7 throw ;
    1 2 ' welp catch .s ( <3> 1 2 7 ok )

and if it consumes some stack items, their place will be filled by uninitialized slots when the stack pointer is moved:

    : welp 2drop 2drop 7 throw ;
    1 2 3 4 ' welp catch .s ( <5> 140620924927952 7 140620924967784 56 7 ok )

The way to think about it is to consider the stack effect of ' foo catch as a whole.
For example, if foo has a stack effect of ( a b c -- d ), then ' foo catch has ( a b c -- d 0 | x1 x2 x3 exn ), where the x? are the slots taken up by the arguments a b c, which could've been replaced with pretty much any value, and thus can only be dropped.

What's key here is that the number of items on the stack becomes known, and now we can safely discard what could've been touched by foo, to access anything we might've been storing below.

Let's take a look at a fuller example of how this can all be used. Suppose we have a / word that implements division, but crashes if you attempt to divide by zero. Let's wrap it with a quick check that throws an exception instead.

First, we'll need to choose an integer that'll signify our exception's type. There aren't any conventions as to how this should be done, except for some reserved values:

- 0 is used by catch to signify "no exception"
- values in the range {-255...-1} are reserved for errors defined by the standard
- values in the range {-4095...-256} are reserved for errors defined by the Forth implementation

Since the standard assigns an identifier for "division by zero", we might as well use it.

    -10 constant exn:div0

I couldn't actually find any guidance on how these are typically picked for application-specific exceptions. If I had to guess, one would start with small positive integers, rather than at -4096 going down. For what it's worth, Miniforth's extended exception mechanism sidesteps this by using memory addresses as identifiers.

Anyway, throwing an exception looks exactly like what you'd expect:

    : / ( a b -- a/b )
      dup 0= if
        exn:div0 throw
      then
      / ( previous, unguarded definition of /; not recursion )
    ;

You could then use it like so:

    : /. ( a b -- )
      over . ." divided by " dup . ." is "
      ['] / catch if
        2drop ( / takes 2 arguments, so we need to drop 2 slots )
        ." infinity" ( sad math pedant noises )
      else
        .
      then ;

This works just as you'd expect it to:

    7 4 /. 7 divided by 4 is 1 ok
    7 0 /. 7 divided by 0 is infinity ok

Of course, checking for a zero divisor explicitly would probably make more sense in this case, but a more realistic example would obscure the details of exception handling too much...

0 throw and its uses

Before we look into the implementation of throw and catch themselves, I'd like to highlight one more special case. Specifically, throw checks whether the exception number is zero, and if it is, doesn't actually throw it: 0 throw is always a no-op.

There are a few aspects as to why this is the case. Firstly, actually throwing a zero would be confusing, as catch uses zero to signify that no exception was caught. But hold on, it's not exactly in character for Forth to check for this. There are plenty of other ways to fuck up already. They could've said "it'll eat a sock if you try to do that" and celebrated the performance win.

And even if you do check, why make it a no-op? Shouldn't you throw a "tried to throw a zero" exception instead, to make sure the mistake is noticed?

The answer, simply enough, is that it's not necessarily a mistake. There are some useful idioms that center around 0 throw being a no-op.

One concerns a more succinct way of checking a condition:

    : / ( a b -- a/b )
      dup 0= exn:div0 and throw / ;

This works since, in Forth, the canonical value for true has all the bits set (unlike C, which only sets the lowest bit), so true exn:div0 and evaluates to exn:div0. Of course, when using this idiom, one must be careful to use a canonically-encoded flag, and not something that may return arbitrary values that happen to be truthy.

The other idiom allows exposing an error-code-based interface that can be conveniently used as an exception-based one. For example, allocate (which allocates memory dynamically like C's malloc) has the stack effect size -- ptr err. If the caller wants to handle the allocation failure here and now, it can do

    ... allocate if ( it failed :/ ) exit then ( happy path )

But throwing an exception when an error is returned only requires allocate throw; if no error occurred, the 0 will get dropped.

The internals

Now, how is this sausage made? jonesforth, a very popular literate programming implementation of Forth, suggests implementing exceptions by, essentially, having throw scan the return stack for a specific address within the implementation of catch. This feels like something one would come up with after studying the complex unwinding mechanisms in languages like C++ or Rust. They too unwind the stack, using some very complex support machinery spanning the entire toolchain. However, the reason they need to do that is to run the destructors of objects in the stack frames that are about to get discarded.

Forth, as you're probably aware, does not have destructors. This allows for a much simpler solution: instead of scanning the return stack for the position where catch was most recently executed, we can just have catch save this position in a variable.

    variable catch-rp ( return [stack] pointer at CATCH time )

Apart from the simplicity, this approach also has performance and robustness advantages. Remember that >r and do-loops can also push things onto the return stack. It would be a great shame if such a value happened to equal the special marker address that's being scanned for...

To support nested invocations of catch, we'll need to save the previous value of catch-rp on the return stack. While we're at it, this is also a good place to save the parameter stack pointer. This effectively creates a linked list of "exception handling frames", allocated on the return stack:

[Diagram: the linked list of exception handling frames on the return stack]

Note that the "return to catch" entry is above the data pushed by catch.
This is because the former only gets pushed once catch calls a non-assembly word. In this case, that is the execute that ultimately consumes the execution token.

Some assembly required

Since the stack pointers themselves aren't exposed as part of the Forth programming model, we'll need to write some words in assembly to manipulate them. The words for the return stack pointer are straightforward:

    :code rp@ ( -- rp )
      bx push, di bx movw-rr, next,
    :code rp! ( rp -- )
      bx di movw-rr, bx pop, next,

(this syntax, as well as the implementation of the assembler, was explained in a previous article)

Manipulating the data stack pointer is a bit harder to keep track of, as the value of the stack pointer itself goes through the data stack. I ended up choosing the following rule: sp@ pushes the pointer to the element that was at the top before sp@ was executed. In particular, this means sp@ @ does the same thing as dup:

[Diagram: effect of sp@ on the data stack]

This diagram bends the reality a little, as the top of the stack is kept in bx, and not in memory, as an optimization. Thus, we first need to store bx into memory:

    :code sp@
      bx push, sp bx movw-rr, next,

sp! works similarly, with the guiding principle that sp@ sp! should be a no-op:

    :code sp!
      bx sp movw-rr, bx pop, next,

Note that there aren't actually any implementation differences between sp@/sp! and their return stack counterparts (apart from using the sp register instead of di). You just need to think more about one than about the other...

The last :code word we'll need is execute, which takes an execution token and jumps to it.

    :code execute
      bx ax movw-rr, bx pop, ax jmp-r,

Interestingly, execute doesn't actually need to be implemented in assembly. We could just as well do it in Forth with some assumptions on how the code gets compiled: write the execution token into the compiled representation of execute itself, just before we reach the point when it gets read:

    : execute [ here 3 cells + ] literal !
      ( any word can go here, so... ) drop ; ( chosen by a fair dice roll... )

This kind of trickery is unnecessarily clever in my opinion, though. It doesn't actually have any portability advantages, since it assumes so much about the Forth implementation it's running on, and on top of that, it's probably larger and slower. Still, it's interesting enough to mention, even if we don't actually use it in the end.

Putting it all together

Let's take another look at how the return stack should look:

[Diagram: return stack layout during catch]

Let's construct that, then:

    : catch ( i*x xt -- j*x 0 | i*x n )
      sp@ >r catch-rp @ >r
      rp@ catch-rp !

Then, it's time to execute. It will only actually return if no exception is thrown, so next we handle the happy path by pushing a 0:

      execute 0

Finally, we pop what we pushed onto the return stack. The previous value of catch-rp does need to get restored, but the data stack pointer needs to get dropped, as we aren't supposed to restore the stack depth in this case.

      r> catch-rp ! rdrop ;

throw begins by making sure that the exception code is non-zero, and then rolls back the return stack to the saved location.

    : throw dup if
      catch-rp @ rp!

Restoring catch-rp happens just as you'd expect:

      r> catch-rp !

The saved SP is somewhat harder. Firstly, we don't want to lose the exception code, so we'll need to save it on the return stack before restoring SP:

      r> swap >r sp!

Secondly, when sp@ was run, the execution token was still on the stack, so we need to remove that stack slot before pushing the exception code in its place:

      drop r>
    else ( the 0 throw case ) drop then ;

But wait, there's more!

We've seen how the standard exception mechanism works in Forth. The facilities of throw-and-catch are provided, but in quite a rudimentary form. In my next post, I explain how Miniforth builds upon this mechanism to attach context to the exceptions, resulting in actionable error messages when the exception bubbles up to the top-level. See you there!
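
As a recap of the nested catch frames and the 0 throw idiom discussed above, here is a minimal sketch of an inner handler that records a failure and re-throws it to an outer one. It assumes a standard system that already provides catch and throw (Gforth, for instance). The words risky, guarded and attempt and the cleanups counter are made up for this illustration and are not part of Miniforth.

```forth
\ Hypothetical demo words; only CATCH and THROW are standard here.
variable cleanups   0 cleanups !   \ counts how often the failure path ran

: risky ( n -- n )
  dup 0= if -10 throw then ;       \ -10: the standard's "division by zero" code

: guarded ( n -- n )
  ['] risky catch                  ( n 0 | x exn )
  dup if 1 cleanups +! then        \ inner frame: note the failure...
  throw ;                          \ ...and re-throw it; 0 throw is a no-op

: attempt ( n -- n' flag )
  ['] guarded catch                ( n 0 | x exn )
  if drop 0 0 else -1 then ;       \ outer frame: replace the junk slot with 0

5 attempt .s   ( <2> 5 -1 )
2drop
0 attempt .s   ( <2> 0 0 )
2drop
cleanups @ .   ( 1 )
```

Ending guarded with an unconditional throw leans on 0 throw being a no-op: the success path simply drops the 0, while the failure path propagates the original code to the next frame up.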