Now that the BF compiler is together, it’s time to add some obvious optimizations.
Series:
Part 1 (Parsing)
Part 2 (IL Generation)
Part 3 (Compiler)
Part 4 (Optimization)
I’m going to approach this post a bit differently than most before it. Typically a post’s code stands on its own, but this post is largely a refactor of Part 3. I’ll focus on the diffs necessary to implement the desired optimizations. Along the way I’ll add some handy debugging code. Before I get started, this post assumes knowledge from the previous posts (especially post 3), so feel free to go back and peruse earlier posts in this series. Getting started, I want optimizations to be optional.
1 | let optimizationsEnabled = true |
First up, the internal BF language representation as F#’s discriminated unions. I add addtional the intermediate representation (IR) ops to support my optimizations and debugging support. There are a multitude of optimizations that can be done. For this post I’m only going to focus on folding contiguous pointer shifts, contiguous data increment/decrements, setting a cell to zero. In addition to these optimizations, I add a debug print ? that works like ., but prints the int value instead of the standard char. It is mildly controversial to extend BF, but I find this is a minimally helpful tool for debugging.
I’ve decided, for demonstrative purposes, to extend the IR instead of replace existing operators. Where you see operators such as Gtx or Plusx, I could have just as easily rewired the existing Gt and Plus, respectively. In a more formal setting, I would’ve most likely refactored the existing operators. It makes for a cleaner codebase, but this method allows me to show a couple more F#isms.
Extended Examples:
>>>> : Compresses to Gtx 4
<<<> : Compresses to Ltx 3
++++ : Compresses to Plusx 4
++- - : Compresses to Nop (Since they cancel out, there isn’t anything to do.)
[-] : Compresses to Set0 (This is a common way to set a cell to 0. I can do a direct set without looping.)
Once I create the extended operators, I add them to AllOps DU. I also add a “No Op” operator (Nop) for cases where my folds zero out.
1 | type ExtendedOps = |
Here is one of the places where F# shines, code correctness. Now that I’ve added these types, my code doesn’t compile anymore. The error message is this: Incomplete pattern matches on this expression. For example, the value ‘Extended (_)’ may indicate a case not covered by the pattern(s). Looking at the code below you can see I have a match expression against the AllOps type, but I clearly don’t handle all cases, namely the ones I just added. I knew this was the case, but I’m happy when the compiler can figure out these cases. In a larger codebase that might not even be mine, this is a powerful dynamic.
1 | and processAst (ilGenerator:ILGenerator) (data:MemArray) (dataPointer:DpPointer) (ast:AllOps) = |
Here is the new version of the match, problem solved. This obviously causes it’s own cascading effect, now I need a emitExtendedOp function.
1 | and processAst (ilGenerator:ILGenerator) (data:MemArray) (dataPointer:DpPointer) (ast:AllOps) = |
I handle the extended operators in a similar fashion to normal operators. This also a prime time for a minor refactor. Pointer shifts and data increment/decrements for the extended versions are identical to the orignal ones except for the amount. As a result I will make the emit[Gt|Lt|Plus|Minus] to be more generic, and take a modifer amount.
1 | // Generate IL for an operator |
Continuing the cascading changes, I refactor the operator-specific emit functions. These all have the same refactor to include the parameter and use that instead of the fixed value of 1:
Add (i:int) as a parameter
Replace ilGenerator.Emit(OpCodes.Ldc_I4_1) with ilGenerator.Emit(OpCodes.Ldc_I4, i)
1 | // Increment dataPointer |
Implementing emitSetx takes a value and does an explict value set to data[dataPointer].
1 | // Set value at data[dataPointer] |
The debug print provides two pieces of information, the current dataPointer value and the integer value at data[dataPointer]. String.Format
is a bit more involved than some of the other function calls. I need to load the format string, and box the corresponding integer values. Then I can do a Console.WriteLine
of the formatted string with the variable values injected.
1 | // Print decimal value at data[dataPointer] & index value to stdout |
Now that I have addressed the compilation issues and all the new emitting code, it’s time to look at the parsing code. I would typically separate parsing from optimization, making it a secondary step in the compilation. But BF is so simple, plus I can tie this back to the original goal, leverage FParsec.
The parser changes leverage existing parsers and then implement additional logic on top. pGreaterLesser reads any number of consecutive > and <, then it counts them and determines the difference. This results in the optimized operator Gtx, Ltx, or Nop. pPlusMinus works the same way. pSet matches “[-]” (loop at current position until data[dataPointer] = 0), it a straightforward match, but you may notice this is a pstring, versus pchar match. pDebugPrint is a character match. After adding the new parsers, it’s a matter of just wiring them into their respective parser groups. This is also where I check to see if optimizations are enabled or not.
1 | let commentChar c = |
With refactoring complete, I look at the Hello World program from the last post. First, debug prints.
1 | let code = "++++++++++++?[>++++++>++++++++>+++<<<-]>?>?>?<<.>+++++.+++++++..+++.>----.<<+++++++++++++++.>.+++.------.--------.>+." |
Second, a look at the optimizations. I perform two compilations; one normal, one optimized. The results are what I’d expect to see. This sample is so short there isn’t really a peformance gain, the code is is just cleaner. On a more intense application, it’s easy to see how this code compression could add up to at least a minimal performance boost.
1 | let code = "++++++++++++[>++++++>++++++++>+++<<<-]>.>+++++.+++++++..+++.>----.<<+++++++++++++++.>.+++.------.--------.>+." |
Below shows the compilation differences of the normal versus optimized compilation.
There is it. A simple refactor to add debugging and minor optimizations. I didn’t expect this to be much work, but it is nice how F# can guide me along the refactor, once I have my intentions defined. I hope you found this post to be at least a mildly useful example of the power of F# and FParsec. Until next time.