In this post the brainf*** interpreter is further optimized by translating the source code to an internal instruction set.

In Part 3 a second "jump table" was introduced to make the interpreter a little faster.

The result was an interpreter that could generate the mandelbrot example in 43 seconds.

While useful, it wasn't very creative compared to the earlier optimization. The following optimization is a lot more impactful.

Preprocessing

The idea is that a lot of time is spent reading through the code itself and that code is in memory as a string.

We have to go through that code a lot during loops, reading characters from the string. Additionally, we have to maintain those jump tables from the previous optimizations. This all adds up.

What if we run through the code once and record its meaning as a list of instructions? Then we can loop over instructions that contain all the information we need.

If we do that we won't need to figure out every time how much to add or subtract somewhere; we'll just know from looking at a single Instruction.

Instruction Set

Instead of working with "brainf***" directly, we'll make our own instruction set and build up our instructions from the original code.

The plan is to have a list of Instruction objects that contain all the necessary information to perform a certain instruction. This Instruction will need a "Type" and an optional numerical parameter.

For example: Add 5 to current memory location is an instruction of the type "Add" with a "5" as parameter.
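To make that concrete, here's a minimal sketch using the Instruction class and InstructionType enum introduced below:

```csharp
// The source fragment "+++++" collapses into a single instruction:
var add = new Instruction(InstructionType.Add, 5);

// Executing it adds 5 to the current cell in one step,
// instead of performing five separate increments.
```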

Each instruction will have a type, indicating what needs to be done:

private enum InstructionType
{
    None = 0,
    Add, Subtract, ShiftLeft, ShiftRight, Print, Read, BeginLoop, EndLoop
}

Of these, Add, Subtract, ShiftLeft and ShiftRight have parameters. The others don't.

The Instruction is a C# class:

private class Instruction
{
    public InstructionType Type { get; private set; }

    public int Parameter { get; set; }

    public Instruction(InstructionType type) : this(type, 0)
    { }

    public Instruction(InstructionType type, int parameter)
    {
        Type = type;
        Parameter = parameter;
    }
}

After looping once through the code we'll have a list of Instruction objects.

Running the Code

Running the code is now a three step process:

  • Strip all irrelevant characters from the source
  • Build the Instruction set
  • Execute the Instructions

public void Run(FileStream source)
{
    var strippedSource = Strip(source);

    var instructions = BuildInstructions(strippedSource);

    Execute(instructions);
}

The step to strip all irrelevant characters is so that we don't have to worry about skipping those characters when reading through the code.
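The Strip method isn't shown in the snippets above; a possible implementation (a sketch, assuming `using System.IO;`, `using System.Text;` and `using System.Collections.Generic;`) keeps only the eight meaningful bf characters:

```csharp
// Hypothetical Strip implementation: discard everything that isn't
// one of the eight bf commands, so later passes never have to skip.
private string Strip(FileStream source)
{
    var valid = new HashSet<char> { '+', '-', '<', '>', '.', ',', '[', ']' };
    var sb = new StringBuilder();
    using (var reader = new StreamReader(source))
    {
        int c;
        while ((c = reader.Read()) != -1)  // Read() returns -1 at end of stream
        {
            if (valid.Contains((char)c))
            {
                sb.Append((char)c);
            }
        }
    }
    return sb.ToString();
}
```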

Building Instructions

Some bf instructions can be directly converted to an Instruction. For example Print and Read:

case '.':
    instructions.Add(new Instruction(InstructionType.Print));
    break;
case ',':
    instructions.Add(new Instruction(InstructionType.Read));
    break;

For the instructions Add (+), Subtract (-), ShiftLeft (<) and ShiftRight (>) we also need a parameter to know how much to add, subtract or shift.

When we encounter such a character, we count how many times the same character follows, and we progress further in the code by that many steps.

For Add that looks like this:

case '+':
    instructions.Add(new Instruction(InstructionType.Add, CountSeries(reader)));
    break;

The CountSeries method advances the reader until it encounters a different character and returns how many times the current character was repeated.

private int CountSeries(CharReader cr)
{
    var count = 0;
    var currentChar = cr.GetChar();
    while (cr.HasCharacters() && cr.GetChar() == currentChar)
    {
        count++;
        cr.Forward();
    }
    cr.Back();
    return count;
}
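The CharReader class used here isn't shown in this post; a minimal version could look like this (a sketch over the stripped source string, with assumed method names matching the snippet above):

```csharp
// Hypothetical minimal CharReader: a cursor over the stripped source.
private class CharReader
{
    private readonly string _source;
    private int _position;

    public CharReader(string source)
    {
        _source = source;
    }

    public bool HasCharacters() => _position < _source.Length;
    public char GetChar() => _source[_position];
    public void Forward() => _position++;
    public void Back() => _position--;
}
```

Note that CountSeries calls Back() at the end, so the reader is left on the last character of the series and the outer loop's normal advance moves it past.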

Reading the Loop Instructions

For the loops ([ and ]) we're still using the jump table from before, but only while building the instruction list.

For the Instruction itself, the Parameter for BeginLoop indicates the location in the code to jump to when the loop can be skipped. For EndLoop it's the location where to jump to if we need to continue the loop.

The tricky part is that the "location to jump to" now refers to one of the instructions in the instruction list, no longer to a location in the original code. Luckily we can use the Stack from Part 2 again:

case '[':
    instructions.Add(new Instruction(InstructionType.BeginLoop));
    jumpTable.Push(instructions.Count - 1);
    break;
case ']':
    var beginPosition = jumpTable.Pop();
    var beginInstruction = instructions[beginPosition];
    instructions.Add(new Instruction(InstructionType.EndLoop, beginPosition));
    beginInstruction.Parameter = instructions.Count - 1;
    break;
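To see the bookkeeping in action, consider the tiny loop [-] (clear the current cell). It compiles to three instructions, with both jump targets filled in once the ] is reached:

```csharp
// Instruction list produced for the source "[-]":
//   index 0: BeginLoop, Parameter = 2  (index of the matching EndLoop)
//   index 1: Subtract,  Parameter = 1
//   index 2: EndLoop,   Parameter = 0  (index of the matching BeginLoop)
```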

Executing the Instructions

With an array of Instruction objects, actually executing the code becomes relatively straightforward. This is good because some of the Instructions will be executed many times; things that happen a lot should be really simple.

For example, the Add instruction is now:

 case InstructionType.Add:
    memory[pointer] += (byte)instruction.Parameter;
    break;

Even the most complicated part, the loops, isn't too bad:

Depending on the condition, we either jump to the specified location or continue with the next instruction.

 case InstructionType.BeginLoop:
    if (memory[pointer] == 0)
    {
        instructionPointer = instruction.Parameter;
    }
    break;
 case InstructionType.EndLoop:
    if (memory[pointer] != 0)
    {
        instructionPointer = instruction.Parameter;
    }
    break;
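Putting the cases together, the surrounding Execute loop could look roughly like this (a sketch; the variable names and the 30,000-cell memory size are assumptions based on the snippets above and the earlier parts of the series):

```csharp
// Hypothetical Execute loop: one switch per instruction, no string
// parsing and no jump-table lookups at run time.
private void Execute(List<Instruction> instructions)
{
    var memory = new byte[30000];
    var pointer = 0;

    for (var instructionPointer = 0; instructionPointer < instructions.Count; instructionPointer++)
    {
        var instruction = instructions[instructionPointer];
        switch (instruction.Type)
        {
            case InstructionType.Add:
                memory[pointer] += (byte)instruction.Parameter;
                break;
            case InstructionType.Subtract:
                memory[pointer] -= (byte)instruction.Parameter;
                break;
            case InstructionType.ShiftRight:
                pointer += instruction.Parameter;
                break;
            case InstructionType.ShiftLeft:
                pointer -= instruction.Parameter;
                break;
            case InstructionType.Print:
                Console.Write((char)memory[pointer]);
                break;
            case InstructionType.Read:
                memory[pointer] = (byte)Console.Read();
                break;
            case InstructionType.BeginLoop:
                if (memory[pointer] == 0)
                {
                    instructionPointer = instruction.Parameter;
                }
                break;
            case InstructionType.EndLoop:
                if (memory[pointer] != 0)
                {
                    instructionPointer = instruction.Parameter;
                }
                break;
        }
    }
}
```

Note that jumping to the matching bracket's index works out because the for loop then increments instructionPointer, moving execution to the instruction just past the jump target.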

Results

The result: the mandelbrot example can now be generated in 10 seconds. Quite an improvement, considering the previous optimization left us at 43 seconds.

Drawbacks

There is a drawback to this approach: it requires looping through the whole program before running it, and keeping everything in memory as Instruction objects. So we put a little more strain on the memory, but in this case that's not too bad.

If the source code is really large, this could become a bigger issue.

This optimization required significant changes to the existing code.

Conclusion

There's still more room for optimization, but going from 43 seconds to 10 seconds is quite something.

It shows that good preparation can really make a lot of difference. In this case the preparation was building a list of useful instructions. This extra work paid off in the end.

Please find the full code here on Github: Optimization05.cs. Pretty sweet.