This tutorial will teach you how to make a Forth interpreter in assembly with initial prototype simulators made in Python.
The aim is twofold. First, we want to provide a description that is more flexible to allow other implementations (possibly drastically different ones from the description here). Stopping to read after the first few section should still give some idea of how to implement Forth. Second, we want each intermediate step to be runnable so that feedback can be used to verify the reader's understanding.
We will skip most optimization that only improves speed or memory usage by a constant factor except for one. These optimizations may be discussed at the end of the final part.
The assembly interpreter will be in x86-64 assembly for Linux but since most of the program is written in "Forth style" using only
ret. Of course, system calls have to be translated.
Forth consists of five parts:
Each entry of the stacks and array are fixed width. We'll use the usual trick of storing a location of (i.e., a pointer to) anything more complicated.
memory array is intended to be segmented where each segment corresponds to the body of a single function. A segment for f stores a list of function names, which are the calls to make (one after the other) when f is called.
data stack is intended to store the arguments/parameters for function calls, effectively function calls need no explicit argument specification, only the occasional data stack shuffling before the call. A function will pop the first few elements of the data stack and push its output(s), if any, onto the data stack. This is the defining property of Stack-oriented programming languages like Forth.
call stack is intended to store locations in memory where we should continue execution.
names map is a table from function names to their first instruction in
memory (if not a primitive function) or to a predefined function.
primitive functions should be defined in the host language in which our Forth interpreter will be written. Usually, in a Forth interpreter, there are enough primitive functions so that interpreted instructions are able to read and write to
Function names can be anything that does not contain a whitespace.
The main loop of the Forth interpreter repeats the following indefinitely.
There's no input handling yet (that is done by an outer loop).
We can already witness this loop working with a hard coded initial call stack, data stack and set of primitive functions. See forth1.py for an example with only calls to primitive functions. See forth2.py for an example with function definition and call.
Feel free to skip examples like forth1.py and instead implement your own interpreter from these instructions.
For debugging, we can piggyback on Python's debugger to use as our own debugger. (Optionally) import
from pdb import stack_trace as bp
"bp": bp to names. Now writing
memory anywhere will insert a break point. Use
pdb's shell to explore the program's state at that point.
By adding some more primitive functions, we could make the language interpreted Turing-complete. We won't do that here and will instead add primitives as we need them but feel free to add them to your own interpreter at this step.
We'll now add input/output to our interpreter. We've already seen an example of output using the primitive function
The user inputs a list of whitespace separated function names that the interpreter calls in order. We call these function names as input tokens from now on.
We add the
exit primitive to terminate execution no matter how deeply nested the current call is.
Try forth3.py and type
push1 print at the input prompt.
Press Control-C or Control-D to stop the program. This gives an error and there's current no other way to exit the input loop.
Implementation detail: In this implementation, we chose to use memory index 0 to contain an command (one at a time) and memory index 1 to contain
exit effectively ending the current command.
We'll now add primitives to write to memory. This will also give us function definition (both anonymous and named).
[ will call a primitive function
write_loop that is a loop which reads the input until it finds a
]. Every function name in between will be appended to memory. At the end of the loop,
return is appended.
What is between
] will effectively contain the (anonymous) function's body. Note that this is our first primitive function which consumes input tokens.
To name our anonymous function, we add a
bind: primitive which reads the next function name and binds it to the index at the top of the stack. We make
[ return the first index of memory written so the two work in conjection to define functions. To get the function call example of
forth2.py, we can now write
[ push1 print ] bind: my_func my_func
at the input prompt. The second
my_func calls the function we just defined.
Implementation detail: We change how input is handled since input needs to be stored so both
write_loop can access it. We also add a check for the end of file so
input_loop can be exited cleanly by pressing Control-D.
write_loop is the name of the Python function bound to the token
Note that the language implemented in this tutorial isn't exactly Forth but the same steps here can be used to implement the original Forth and the computation model is the same for both languages.
We now have the three central parts of the interpreter written:
We will add more functions (primitive and non-primitive) to our interpreter with the goal of writing as much of
write_loop themselves as possible in Forth. Our plan is to eventually write "Forth-style" assembly with mostly just
We don't want a new function for each constant we want to push onto the stack (such as
push3, ...). Instead, just like
bind:, we can have a generic
push: function which pushes the next input token onto the stack.
Implementation detail: For our Python simulator, we will have an extra function
pushe: which evaluates the result using Python's
ast.literal_eval before pushing it onto the data stack so now we can push things other than strings.
In general, we'll name functions that consume the next input token like
push: with a semi-colon ":" at the end.
Try forth5.py with input
pushe: 1 print [ pushe: 1 print ] bind: my_func my_func
if takes two arguments, a boolean and memory index. If the boolean is true, what's at the memory index is interpreted. Otherwise, it is skipped.
Now is a good time to add some arithmetic and comparison operations to our interpreter.
+ - * / mod and or not
== != < > <= >=
They are all primitive and we put them in
operators.py to be imported.
Try forth6.py. Try inputs
pushe: 3 pushe: 2 == [ pushe: 1 print ] if pushe: 2 pushe: 2 == [ pushe: 1 print ] if [ pushe: 1 print ] bind: print1 pushe: 3 pushe: 3 == push: print1 names.get if
Operators are defined in the operators.py file.
repeat repeats the last three instructions (before
repeat's position in memory) indefinitely.
call calls the index at the top of the stack.
In tandem, they can be used to execute loops. Try forth7.py with input
[ pushe: 10 print ] bind: foo push: foo call [ pushe: 11 print ] bind: foo [ push: foo call repeat ] bind: bar bar
The problem is that repeat only works when inside a function. This is OK for now but we will eventually change how
write_loop works so that
repeat can be called on the input (maybe not in this tutorial).
We'll need a number of primitives that simply rearrange the top of the data stack. In this implementation, they will all be named
#### is a sequence of stack indices.
Here are some examples of primitives and their effect.
s21: remove the top max(2, 1) = 2 elements and put the 2nd and 1st elements at the top. Effect: swaps top two elements.
s321: remove the top max(3, 2, 1) = 3 elements and put the 3rd, 2nd and 1st element at the top. Effect: reverse order of top three elements.
s11: remove the top max(1, 1) = 1 element and put 1st and (another) 1st element at the top. Effect: put a copy of the top element at the top.
s2: remove the top max(2) = 2 elements and put the 2nd element at the top. Effect: removes the top element/pops the stack.
s#### removes the top n stack elements from the stack where n is the maximum in
####, and the elements at indices
#### (before the removal) are put at the top of the stack.
Just like for constants, we could instead implement only one function
s: that treats the next input token as indices (e.g.,
s: 11 instead of
s11). However, we actually need very few stack shufflers so in this implementation, we'll use a separate function for each.
push: a push: b s21 print print push: a push: b s2 print
names, we add getters, setters, push, pop, etc as primitive functions as needed.
We're finally ready to write Forth in Forth itself. This implementation follows the Python implementation of the same functions.
[ call_stack.len pushe: 0 == [ pushe: 3 call_stack.pop_many ] if call_stack.pop s11 memory.get names.get s21 pushe: 1 + call_stack.push s11 is-primitive s11 not [ s21 call_stack.push ] if push: call-primitive names.get if ] call repeat ] bind: main-loop [ next-input pushe: 2 memory.set_at pushe: 2 call_stack.push main-loop ] call repeat ] bind: input-loop [ memory.len [ next-input s11 pushe: ']' == [ s2 push: return memory.append pushe: 3 call_stack.pop_many ] if memory.append ] call repeat ] bind: write-loop
Here, we're using nested square brackets which isn't defined in our interpreter. So for this version, we must first move the anonymous functions defined by inner square bracket and name these functions. forth.f contains the version that can be read by our interpreter.
main-loop after our function definitions unexpectedly stops at the (first)
call_stack.pop. Indeed popping the call stack should return from the current function but we actually meant to pop the stack of the inner Forth that's now running in our Forth interpreter! To fix this, we will create a second call stack
call_stack2 for the inner Forth and make all
call_stack.* functions and a few other functions refer to
call_stack2, except for
call_stack.pop_many where we actually want to return from inner functions.
We also need an ugly hack so the write loop in forth.f is run as a primitive with respect to
call_stack2 (this means that its run on
call_stack instead of
call_stack2). This change is made in the
call-primitive functions. Alternatively, we could put 2's everywhere inside the body of this write loop but then we couldn't test it independently of
names can be shared between both inner and outer Forth interpreters without major issue. We'll use memory index 2 and 3 for storing the current input token of the inner Forth (and 0 and 1 for the outer Forth as usual).
Remark: We use
call_stack.pop_many in place of
break. For the moment, we just count the number of elements to pop from the call stack. To avoid keeping track of these numbers, we should implement continuations, which is not covered in this tutorial.
Try forth9.py. Enable some debugging messages to see that the last line is actually running inside the inner Forth.
location: Integer index, usually in the
memory array (but possibly elsewhere like the
strings array). Sometimes called pointers.
function name: A string naming a function. This string can contain any character that's not a whitespace. They are call words in Forth.