Next: Three-Address Code
Up: Compiler Theory: Intermediate Code Generation
Previous: Intermediate Languages
One popular form of intermediate representation is code
for an abstract stack machine.
ABSTRACT STACK MACHINE.
This machine has three memory zones.
- The instruction memory
- which holds the instruction of the program
being executed.
- The data memory
- which store values. Each value can be accessed
- by its address
- or by the identifier of a variable associated with this address. if any.
- A stack
- which is used to perform arithmetic operations.
The instruction are quite limited and fall into three classes:
- arithmetic operations
- stack manipulation
- control flow.
|
Instruction |
|
|
Stack |
Data value |
and address |
1 |
push 5 |
|
|
16 |
0 |
1 |
2 |
rvalue 2 |
|
top |
7 |
11 |
2 |
3 |
+ |
|
|
|
7 |
3 |
4 |
rvalue 3 |
|
|
|
... |
4 |
5 |
* |
pc |
|
|
|
|
6 |
... |
|
|
|
|
|
ARITHMETIC OPERATIONS.
The abstract stack machine code for an arithmetic expression E
simulates the evaluation of a postfix representation of E
using a stack.
- The evaluation proceeds by processing the postfix representation from left to right.
- Each operand is pushed onto the stack as it is encountered.
- When a k-ary operator is reached,
- its leftmost argument is k - 1 positions below the top of the stack and
- its rightmost argument is at the top of the stack.
- The evaluation applies the operator to its k arguments, pop them and push
the result onto the stack.
In our intermediate language,
- all values are integers
- 0 corresponds to false and
- all nonzero integers corresponds to true.
L-VALUES AND R-VALUES.
In assignments like
x := a
x := x + 1
- the right side specifies an integer value,
- the left side specifies where the value should be stored
The terms L-values and R-values refer to values
that are appropriate on the left and rigth sides of an assignment, respectively.
STACK MANIPULATION.
We have the following instructions.
- push v
- which pushes v onto the stack.
- rvalue
- which pushes the content of data location .
- lvalue
- which pushes the address of data location .
- pop
- which throws away the value on top of the stack.
- :=
- which places the R-value on top in the L-value below it, and both are popped.
- copy
- which pushes a copy of the top value on the stack.
Example 2
The translation of the PASCAL-like statement
x := (a + b) * (c mod 5)
in our abstract stack machine is
lvalue x
rvalue a
rvalue b
+
rvalue c
push 5
mod
*
:=
CONTROL FLOW.
The stack machine executes instructions in numerical order
unless told to do otherwise by a conditional or unconditional jump
statement.
We assume that our machine supports symolic labels,
that is we can decorate any instruction with a name, its label.
Then the control-flow instructions for our abstract stack machine are
- label
- the current statement receives the label ,
- goto
- the next instruction is taken from statement with label ,
- gofalse
- pop the top value, say v, and jump to label if v = 0,
- gotrue
- pop the top value, say v, and jump to label if v 0.
- label
- stop the execution.
Example 3
The stack machine code for an alternative
if expr1 then action1 else action2
looks like
code for expr1 |
gofalse else |
code for action1 |
goto out |
label else |
code for action2 |
label out |
Example 4
A boolean disjonction
expr1 or expr2
can be viewed as an alternative:
if expr1 then true else expr2
Hence, the stack machine code for it looks like
code for expr1 |
copy |
gotrue out |
pop |
code for expr2 |
label out |
- Recall that gotrue and gofalse pop the value on top of the stack
in order to simplify code generation for conditional and while statements.
- By copying the value of expr1, we ensure that the value on top
of the stack is true if the gotrue instruction leads to a jump.
A boolean conjonction
expr1 and expr2
would be treated similarly as the alternative:
if expr1 then expr2 else false
Next: Three-Address Code
Up: Compiler Theory: Intermediate Code Generation
Previous: Intermediate Languages
Marc Moreno Maza
2004-12-02