XlogicX Blog

I_GOTO

I wrote code with a GOTO in it. It has been multiple decades since the last time I did this, back when I was stuck using line-numbered BASIC in the early to mid '90s.

And you know what, I don't care. I could have written this code without a GOTO if I had refactored it a little, but I don't care. I think even the anti-GOTO code hipsters would agree that the argument isn't about performance. Those who are deluded enough to think it is about performance do not understand what is going on under the hood. To be clear, program flow control at the machine-code level looks eerily similar to line-numbered BASIC. Instead of GOTO, the assembly mnemonic is JMP. Instead of GOing TO a line number, we JMP to an ADDRESS. Yes, there is a CALL / RET structure, but a CALL is just a fancy JMP that first pushes its return address (the 'line number' of the next instruction) onto the stack, and a RET is just a fancy JMP that pops whatever address is on top of the stack and jumps to it. It should go without saying that your mighty high-level languages end up doing exactly this on the processor, regardless of your subroutines, objects, and fancy recursion.
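To make the CALL / RET point concrete, here is a minimal sketch of a toy interpreter in Python. The instruction names mirror the x86 mnemonics, but the tuple format, the `run` function, and the sample `PROGRAM` are all made up for illustration; a real processor works on addresses and registers, not Python lists.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a toy machine."""
    pc = 0          # program counter: the current 'line number'
    stack = []      # return addresses pushed by CALL
    output = []
    while pc < len(program):
        op, arg = program[pc]
        if op == "PRINT":
            output.append(arg)
            pc += 1
        elif op == "JMP":
            pc = arg                 # a plain GOTO
        elif op == "CALL":
            stack.append(pc + 1)     # remember where to come back to
            pc = arg                 # ...and then it is just a JMP
        elif op == "RET":
            pc = stack.pop()         # JMP to whatever is on top of the stack
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return output


# Hypothetical program; the comments give each instruction's 'line number'.
PROGRAM = [
    ("CALL", 4),                   # 0
    ("PRINT", "back in main"),     # 1
    ("JMP", 6),                    # 2
    ("PRINT", "never printed"),    # 3
    ("PRINT", "in subroutine"),    # 4
    ("RET", None),                 # 5
    ("HALT", None),                # 6
]

print(run(PROGRAM))
```

The only difference between JMP and CALL in this sketch is the one `stack.append` line, which is the whole point.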

So what is this hatred of GOTOs about? It's about code readability; reading 'spaghetti code' is a challenge, and I concede this. Even though I write a lot of Perl, I still make an attempt to write readable code: comments, proper indentation, meaningful variable names, and just the right amount of factoring into subroutines. I don't always live up to this, but I try.

Why did I use a GOTO this time? It was the perfect quick hack, and the hack wouldn't hurt performance. What about readability? Consider the task I was trying to solve. I am writing some code that benchmarks regular expressions (in preparation for a Defcon talk in a few months). It does a lot of different things. Without going into unnecessary detail, the piece of code that has this GOTO does one thing: it takes a list of regular expressions and, for each one, returns the last non-optional lexeme. So [abc]?x{1,4}(de|fe)+ would return (de|fe)+. If the lexemes were reversed as (de|fe)+x{1,4}[abc]?, x{1,4} would be returned instead (because the trailing ? makes [abc] optional). And if x{1,4} were instead x{0,4}, that lexeme would also be treated as optional, since it can match zero times. It's easy enough to do this analysis manually, but scripting it for any arbitrary expression has its challenges.
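For readers who want to play with the idea, here is a rough Python sketch of the "last non-optional lexeme" task described above. It is not the code from the screenshot, and it contains no GOTO. The function names are made up for this example, and it deliberately ignores the hard cases (top-level alternation, anchors, parentheses inside character classes), so treat it as an illustration of the problem rather than a solution for arbitrary expressions.

```python
import re

# A quantifier: ?, *, +, {m}, {m,} or {m,n}, plus an optional lazy/possessive suffix.
QUANT = re.compile(r"(?:[?*+]|\{(\d+)(?:,\d*)?\})[?+]?")


def atom_end(pattern, i):
    """Return the index just past the atom that starts at pattern[i]."""
    ch = pattern[i]
    if ch == "\\":                       # escaped character such as \d or \.
        return i + 2
    if ch == "[":                        # character class: scan to the closing ]
        i += 1
        while i < len(pattern) and pattern[i] != "]":
            i += 2 if pattern[i] == "\\" else 1
        return i + 1
    if ch == "(":                        # group: scan to the matching )
        depth = 0
        while i < len(pattern):
            if pattern[i] == "\\":
                i += 2
                continue
            if pattern[i] == "(":
                depth += 1
            elif pattern[i] == ")":
                depth -= 1
                if depth == 0:
                    return i + 1
            i += 1
        return i
    return i + 1                         # single literal character


def split_lexemes(pattern):
    """Split a pattern into (atom, quantifier) pairs at the top level."""
    lexemes = []
    i = 0
    while i < len(pattern):
        end = atom_end(pattern, i)
        m = QUANT.match(pattern, end)
        quant = m.group(0) if m else ""
        lexemes.append((pattern[i:end], quant))
        i = end + len(quant)
    return lexemes


def is_optional(quant):
    """True if the quantifier allows zero repetitions: ?, * or {0,...}."""
    if quant[:1] in ("?", "*"):
        return True
    m = QUANT.fullmatch(quant)
    return bool(m and m.group(1) is not None and int(m.group(1)) == 0)


def last_required_lexeme(pattern):
    """Return the last lexeme that must match at least once, or None."""
    for atom, quant in reversed(split_lexemes(pattern)):
        if not is_optional(quant):
            return atom + quant
    return None


for regex in ("[abc]?x{1,4}(de|fe)+", "(de|fe)+x{1,4}[abc]?", "(de|fe)+x{0,4}[abc]?"):
    print(regex, "->", last_required_lexeme(regex))
```

Run against the three examples from the paragraph above, this sketch picks (de|fe)+, x{1,4}, and (de|fe)+ respectively.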

I have a screenshot of the code below. Note that I modified it to fit better on the screen, so to be fair, sane indentation and comments aren't in this version. But who in their right mind looks at this code and thinks: "This code is confusing, because I got lost in that GOTO!"