The CMAC Machine

Copyright (c) 1976, 1977, 1978
by
Alan Snyder
M.I.T. Laboratory for Computer Science
Cambridge, Ma. 02139

CMAC is a set of macros which form an assembly language for a
very simple machine. It has been designed to be used to
bootstrap the C compiler to new host machines. The code produced
for the CMAC machine is not efficient. However, efficiency is
not the relevant consideration. The primary goal is to minimize
the effort needed to implement the macros.

The CMAC machine has two data types, integers and pointers.
Characters are mapped into integers; character strings are
sequences of integers. The C compiler does not use
floating-point; thus, floating-point is not included in the CMAC
machine. Both integers and pointers are stored in single machine
words.

On word-addressed machines, pointers will be the same as
integers. On byte-addressed machines, they will most likely be
byte addresses, and thus always multiples of the word size in
bytes (two, four, and so on). Pointers are distinguished from
integers in the CMAC machine because, on host machines which are
not word-addressed, operations on pointers (such as increment and
decrement) will be different from the corresponding integer
operations.

The CMAC machine has two registers, called A and B. Each
register is capable of holding either an integer or a pointer.
All operations are performed on values in registers. Explicit
CMAC instructions are used to get values into and out of the
registers. This division of labor allows the problems of the
different storage classes of memory references to be isolated in
the load and store macro definitions, thus simplifying the
definitions of the operational macros.

CMAC programs are designed to be translated into the machine
language of the host machine using the standard macro assembler
of the host machine, augmented by a set of macro definitions for
the CMAC macros. In order to run the C compiler, the compiler
programs should be so translated and then linked together using
the standard linker of the host machine. In addition, a few
support routines (for I/O) must be hand-coded (these routines are
described in a separate document). It is conceivable that the
4 April 1978 - 2 - The CMAC Machine
CMAC programs might first have to be edited in order to conform
to the format expected by the host machine's macro assembler.

Alternatively, the CMAC programs can be translated (again, by
some macro processor) into a compact representation which can
then be interpreted by an interpreter running on the host
machine. This requires both the writing of an interpreter and
the writing of the macro definitions which construct the
interpreted representation. However, the use of this technique
may be necessary on small machines where the direct translation
from CMAC macros to machine code results in excessively large
programs.

The form of a CMAC program is that of an ASCII text file
consisting of text lines terminated by the newline (LF)
character. Each line contains one macro call, consisting of a
TAB, followed by the macro name, optionally followed by a TAB and
one or more character string arguments separated by commas. For
example,

        FOO     A,B

is a call of a macro named "FOO" with arguments "A" and "B".
Macro names and arguments consist of upper-case letters and
digits. Numeric arguments may be prefixed by a minus sign. In C
identifiers, the underscore character will be represented by 'J';
all external identifiers in the C compiler are unique in their
first five characters. All numeric arguments are in decimal.

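The line format described above can be illustrated with a small
parser. This is only a sketch in Python, not part of CMAC itself;
the function names parse_cmac_line and cmac_identifier are
hypothetical, and the upper-casing of C identifiers is an
assumption drawn from the statement that names consist of
upper-case letters and digits.

```python
def parse_cmac_line(line):
    """Split one CMAC source line into (macro_name, [arguments])."""
    if not line.startswith("\t"):
        raise ValueError("a CMAC line must begin with a TAB")
    body = line.rstrip("\n")[1:]           # drop the LF and the leading TAB
    name, sep, rest = body.partition("\t")  # the second TAB is optional
    args = rest.split(",") if sep else []
    return name, args


def cmac_identifier(c_name):
    """Map a C identifier to a CMAC spelling: '_' becomes 'J'.

    Upper-casing is an assumption, not something the text states outright.
    """
    return c_name.replace("_", "J").upper()


if __name__ == "__main__":
    print(parse_cmac_line("\tFOO\tA,B\n"))
    print(cmac_identifier("get_tok"))
```

Arguments are returned as strings; a numeric argument such as
"-5" would be converted by whichever macro expects a number.
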
1. Move Macros

The move family of macros is used to move values between memory
and the two registers. There are three basic classes of move
macros: load, store, and load-address. Within each class, there
are macros corresponding to the relevant C storage classes. The
first argument of each of these macros specifies a register
(either A or B). The second argument is dependent on the
particular C storage class; it gives the information needed to
specify an exact storage location. The form of this information
is described in Table I.

        Table I. Forms of Storage Class-Particular Information

        class       information form

        auto        word offset of variable in current stack frame
        extern      the actual C identifier
        static      a static-variable unique number
        literal     an integer value
        parm        parameter number (starting with 0)
        indirect    the register containing the pointer
        register    the register containing the value
        string      a string-literal unique number

1.1 Load Macros

Each load macro has two arguments, a register (either A or B)
and a storage class-dependent argument. The function of a load
macro is to move the value in the location specified by the
second argument into the register specified by the first
argument. The load macros are as follows:

        LAUTO   R,OFFSET    load register from auto variable
        LEXTRN  R,NAME      load register from external variable
        LSTAT   R,#         load register from static variable
        LLIT    R,#         load register with literal value
        LPARM   R,#         load register from parameter
        LVPTR   R,R         load register via pointer in register
        LREG    R,R         move from one register to the other

1.2 Store Macros

Each store macro has two arguments, a register (either A or B)
and a storage class-dependent argument. The function of a store
macro is to move the value in the register specified by the
first argument to the location specified by the second argument.
The store macros are as follows:

        STAUTO  R,OFFSET    store register into auto variable
        STEXTN  R,NAME      store register into external variable
        STSTAT  R,#         store register into static variable
        STPARM  R,#         store register into parameter
        STVPTR  R,R         store register via pointer in register

1.3 Load-address Macros

Each load-address macro has two arguments, a register (either A
or B) and a storage class-dependent argument. The function of a
load-address macro is to construct a pointer to the location
specified by the second argument, and put that pointer value in
the designated register. The load-address macros are as follows:

        LAAUTO  R,OFFSET    load address of auto variable
        LAEXTN  R,NAME      load address of external variable
        LASTAT  R,#         load address of static variable
        LAPARM  R,#         load address of parameter
        LASTRG  R,#         load address of string literal

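The relationship between the three classes of move macros can be
sketched with a toy model in Python. This is an illustration
under stated assumptions, not a CMAC implementation: the class
name MoveModel, the word-addressed list used as memory, and the
frame base address are all hypothetical, and only a few of the
macros are modeled.

```python
class MoveModel:
    """Toy model of the A and B registers over a word-addressed memory."""

    def __init__(self, size=64, frame=16):
        self.mem = [0] * size          # assumption: one list slot per word
        self.reg = {"A": 0, "B": 0}
        self.frame = frame             # hypothetical base of the current stack frame

    # Load macros: first argument is the destination register.
    def LLIT(self, r, value):
        self.reg[r] = value

    def LAUTO(self, r, offset):
        self.reg[r] = self.mem[self.frame + offset]

    def LVPTR(self, r, p):
        self.reg[r] = self.mem[self.reg[p]]

    def LREG(self, r, src):
        self.reg[r] = self.reg[src]

    # Store macros: first argument is the register holding the value.
    def STAUTO(self, r, offset):
        self.mem[self.frame + offset] = self.reg[r]

    def STVPTR(self, r, p):
        self.mem[self.reg[p]] = self.reg[r]

    # Load-address macros: build a pointer instead of fetching a value.
    def LAAUTO(self, r, offset):
        self.reg[r] = self.frame + offset


if __name__ == "__main__":
    m = MoveModel()
    m.LLIT("A", 42)
    m.STAUTO("A", 8)      # store 42 into the auto variable at offset 8
    m.LAAUTO("B", 8)      # B now points at that variable
    m.LLIT("A", 0)
    m.LVPTR("A", "B")     # fetch through the pointer
    print(m.reg)
```

On a byte-addressed host the load-address macros would produce
byte addresses instead; the model ignores that distinction.
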
2. Operate Macros

The operate family of macros performs operations upon values in
the two registers. There are two classes of operate macros:
unary and binary.

2.1 Unary Operators

The unary operators each take one argument, which specifies a
register. This register is used for both the source and
destination of the operation. The unary operate macros are as
follows:

        CMINUS  R           arithmetic minus
        CNOT    R           bitwise negation

Since there are only two possible argument values, it is possible
to implement these operations as subroutine calls to one of a
pair of subroutines, one for each register.

2.2 Binary Operators

The binary operators all take their first (or "left") operand
from the A register and their second (or "right") operand from
the B register, and place their result in the A register, leaving
the B register unchanged. The semantics of the operators are the
same as the corresponding C operators. Since the operand
locations are fixed, no arguments are provided to these macros.
The binary operate macros are as follows:

        CADD                integer addition
        CSUB                integer subtraction
        CMUL                integer multiplication
        CDIV                integer division
        CMOD                integer remainder
        CLS                 bitwise left shift
        CRS                 bitwise right shift
        CAND                bitwise AND
        COR                 bitwise OR
        CXOR                bitwise XOR
        PINC                pointer increment
        PDEC                pointer decrement
        PSUB                pointer subtraction

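The fixed operand convention for the integer binary operators
(A := A op B, with B unchanged) can be sketched as a dispatch
table in Python. The names BINARY_OPS, run_binary, c_div, and
c_mod are hypothetical. Two points are assumptions rather than
statements from this document: division is taken to truncate
toward zero, and the right shift is Python's arithmetic shift.
Word-size overflow is not modeled.

```python
import operator


def c_div(a, b):
    # Assumption: the quotient truncates toward zero.
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def c_mod(a, b):
    # Defined so that c_div(a, b) * b + c_mod(a, b) == a.
    return a - c_div(a, b) * b


BINARY_OPS = {
    "CADD": operator.add, "CSUB": operator.sub, "CMUL": operator.mul,
    "CDIV": c_div, "CMOD": c_mod,
    "CLS": operator.lshift, "CRS": operator.rshift,
    "CAND": operator.and_, "COR": operator.or_, "CXOR": operator.xor,
}


def run_binary(name, reg):
    """Apply one binary operate macro: A := A op B; B is left unchanged."""
    reg["A"] = BINARY_OPS[name](reg["A"], reg["B"])
    return reg


if __name__ == "__main__":
    print(run_binary("CSUB", {"A": 7, "B": 2}))
```

Because no macro takes an argument, each table entry corresponds
naturally to one subroutine on the host machine.
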
Note that the pointer operations are not necessarily the same as
the integer operations. Pointer increment and decrement take a
pointer as their first operand and an integer as their second
operand and produce a pointer result. The integer represents an
offset in words; it may have to be scaled before it is actually
added to or subtracted from the pointer. Pointer subtraction
takes two pointer operands and produces an integer result. The
integer should represent the number of words that the first
pointer is offset from the second; it may be necessary to scale
the result of the subtraction in order to get the integer in the
proper units.

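The scaling described above can be made concrete with a short
Python sketch. It assumes a byte-addressed host with four-byte
words; the constant BYTES_PER_WORD and the function names are
hypothetical. On a word-addressed host the scale factor would be
one and the operations would reduce to integer add and subtract.

```python
BYTES_PER_WORD = 4  # assumption: byte-addressed host, four bytes per word


def pinc(ptr, words):
    """PINC: advance a pointer by an offset given in words."""
    return ptr + words * BYTES_PER_WORD


def pdec(ptr, words):
    """PDEC: move a pointer back by an offset given in words."""
    return ptr - words * BYTES_PER_WORD


def psub(p, q):
    """PSUB: number of words by which p is offset from q."""
    return (p - q) // BYTES_PER_WORD


if __name__ == "__main__":
    p = pinc(1000, 3)
    print(p, psub(p, 1000))
```

The integer division in psub is exact as long as both pointers
are word-aligned, which the text says pointers always are.
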
Since the binary operator macros take no arguments, they are
easily implemented as subroutine calls.

3. Conditional-jump Macros

The conditional-jump family of macros all take a single argument,
an internal-label unique number. This number designates a label
to which control should jump depending upon the result of a test
specified by the conditional-jump macro. This test is based on
the values in the registers. There are two types of
conditional-jump operators, unary and binary. The unary
operators test the value in the A register. The binary operators
perform a comparison between the values in the A register and
the B register. The unary conditional-jump macros are as
follows:

        JNULL   #           jump if null pointer
        JNNULL  #           jump if non-null pointer

The binary conditional-jump macros are as follows:

        JEQ     #           jump if A == B
        JNE     #           jump if A != B
        JLT     #           jump if A < B
        JGT     #           jump if A > B
        JLE     #           jump if A <= B
        JGE     #           jump if A >= B

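The tests performed by the conditional-jump macros can be
summarized in a short Python sketch. The names JUMP_TESTS and
jump_taken are hypothetical, and representing the null pointer
as zero is an assumption; the document does not say how a null
pointer is encoded on a given host.

```python
import operator

NULL = 0  # assumption: the null pointer is represented as zero

JUMP_TESTS = {
    "JEQ": operator.eq, "JNE": operator.ne,
    "JLT": operator.lt, "JGT": operator.gt,
    "JLE": operator.le, "JGE": operator.ge,
}


def jump_taken(name, reg):
    """Return True if the named conditional jump would be taken."""
    if name == "JNULL":               # unary tests look only at A
        return reg["A"] == NULL
    if name == "JNNULL":
        return reg["A"] != NULL
    return JUMP_TESTS[name](reg["A"], reg["B"])  # binary tests compare A with B


if __name__ == "__main__":
    print(jump_taken("JLT", {"A": 1, "B": 2}))
```

An interpreter would follow a true result by transferring to the
internal label whose number is the macro's argument.
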
4. Keyword Macros

The remaining family of macros is the set of keyword macros.
These correspond closely to the keyword macros used in the
abstract machine of the C compiler. The CMAC keyword macros are
described below. Each description is headed by the name of a
macro and its argument names; following is a description of the
arguments and the intended function of the macro call.

4.1 Program Definition Macros

HEAD

The HEAD macro marks the beginning of a CMAC program. It may
produce any needed header statements.

CEND

The CEND macro marks the end of a CMAC program. It may produce
an END statement, if needed.

CENTRY NAME

NAME is a C identifier. The expansion of the CENTRY macro should
declare the specified variable to be an entry point, that is, one
which is defined in the current program but accessible to other
programs.

CEXTRN NAME

The CEXTRN macro is similar to the CENTRY macro except that it
defines the variable to be an external reference, that is, one
which is used in the current program but assumed to be defined in
another program.

PURE

The PURE macro indicates that the following program text
represents pure code or data. This macro may be used, along with
the IMPURE macro, to segregate pure and impure storage. The use
of this macro is optional.

IMPURE

The IMPURE macro indicates that the following program text
represents impure storage.

4.2 Symbol Defining Macros

CEQU NAME

NAME is a C identifier; it is to be defined as having a value
equal to the current value of the location counter.

LABDEF N

The LABDEF macro defines the location of internal label number N
to be the current value of the location counter.

STATIC N

The STATIC macro defines the location of the static variable
whose internal static variable number is N to be the current
value of the location counter. Typically, this macro will define
an assembly language symbol by which the static variable can be
referenced.

STRDEF N

The STRDEF macro defines the address of the string constant whose
internal number is N to be the current value of the location
counter. It is immediately followed by one or more INTCON
macros; the last one will define a zero word.

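The layout of a string constant (a STRDEF followed by one INTCON
per character and a final zero word) can be sketched as a small
Python generator of CMAC lines. The function name emit_string is
hypothetical, and the use of ASCII codes for the character
values is an assumption; the document says only that characters
are mapped into integers.

```python
def emit_string(n, text):
    """Return the CMAC lines defining string constant number n."""
    lines = ["\tSTRDEF\t%d" % n]
    for ch in text:
        # Assumption: each character is represented by its ASCII code.
        lines.append("\tINTCON\t%d" % ord(ch))
    lines.append("\tINTCON\t0")    # terminating zero word
    return lines


if __name__ == "__main__":
    print("\n".join(emit_string(3, "HI")))
```

An empty string still yields one INTCON, the zero word, which
matches the requirement of one or more INTCON macros.
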
LINNUM N

The LINNUM macro associates the line in the source program whose
line number is specified by the integer N with the current value
of the location counter. It need not produce any code; it is
provided merely to aid in the reading of CMAC programs.

4.3 Storage Defining Macros

ADCON NAME

NAME is a C identifier. The ADCON macro should define a word of
storage initialized with a pointer to the specified external
variable. This macro is used in the initialization of static and
external pointers and arrays of pointers.

SADCON N

N is an integer. The SADCON macro should define a word of
storage initialized with a pointer to the static variable
numbered N. This macro is used in the initialization of static
and external pointers and arrays of pointers.

INTCON I

The INTCON macro should define a word of storage whose initial
value is that specified by the integer I. It is used in the
initialization of static and external variables and arrays, in
the definition of string constants, and in the construction of
tables for the LSWITCH macro.

LABCON N

The LABCON macro should define a word of storage whose initial
value is the address corresponding to internal label number N.
The LABCON macro is used to construct the tables for the LSWITCH
and TSWITCH macros.

STRCON N

The STRCON macro should define a word of storage whose initial
value is a pointer to the string constant whose internal string
number is N. The STRCON macro is used in the initialization of
static and external variables.

CZERO N

The CZERO macro specifies the definition of a block of storage
initialized to zero; the size in words of this storage area is
specified by the integer N.

4.4 Control Macros

PROLOG FUNCNO,FUNCNAME

The PROLOG macro produces the prolog code for a C function.
FUNCNAME is the name of the C function. FUNCNO is an integer
which specifies the internal function number of the function; it
may be used in conjunction with the EPILOG macro to access the
size of the function's stack frame. The PROLOG macro should
define the entry point name and produce the code necessary to
save the environment of the calling function and to set up the
environment of the called function using the information provided
in the function call. These actions may be performed by a
subroutine. The first eight words of every stack frame are
reserved for use by the PROLOG macro; that is, the first
automatic variable in a function is given an offset of eight
words. The PROLOG macro call appears in a CMAC program
immediately before the first instruction of the corresponding
function.

EPILOG FUNCNO,FRAMESIZE

The EPILOG macro produces the epilog code for a C function. The
epilog code should restore the environment of the calling
function and return to that function. These actions may be
performed by a subroutine. FUNCNO and FRAMESIZE are integers
which specify the internal function number of the function and
the size in words of its stack frame, respectively. These
integers can be used to define an assembly-language symbol whose
value is the size of the stack frame; this symbol can then be
used by the code produced by the PROLOG macro which allocates the
stack frame.

CCALL NARGS,ARGP,NAME

The CCALL macro generates a function call. NARGS is an integer
specifying the number of arguments to the function call; ARGP is
an integer specifying the word offset in the caller's stack frame
of the arguments which have been so placed by previous
instructions. NAME is the name of the function being called.

CALREG NARGS,ARGP,REG

The CALREG macro is like the CCALL macro except that the function
being called has been computed dynamically. The address of this
function is located in the register specified by REG.

CRETRN

The CRETRN macro produces the statements needed to return from a
function to the calling function, i.e., transfer to the EPILOG
code. The returned value of the function will have been placed
in the A register by previous CMAC instructions.

CGOTO N

The CGOTO macro produces an unconditional jump to the location
defined by internal label number N.

LSWITCH N,DEFLT

The LSWITCH macro should generate code which jumps according to
the value of the integer in register A. This macro is
immediately followed by N (N>0) INTCON macros (the cases), which
are immediately followed by N LABCON macros (the corresponding
labels), followed by an ELSWIT macro. A search should be made
through the case list; if a match is found, a jump should be made
to the label defined by the corresponding LABCON macro. If the
integer matches none of the list entries, then a jump should be
made to the internal label whose internal label number is given
by the integer DEFLT.

ELSWIT N,DEFLT

This macro completes an LSWITCH.

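The search performed by LSWITCH can be sketched in Python as
follows. The function name lswitch and the representation of the
INTCON and LABCON tables as two parallel lists are hypothetical;
the function returns the internal label number that would be
jumped to, instead of actually transferring control.

```python
def lswitch(a, cases, labels, deflt):
    """Return the internal label number selected by an LSWITCH.

    cases  -- the N values given by the INTCON macros
    labels -- the N label numbers given by the LABCON macros
    deflt  -- the DEFLT label number
    """
    if len(cases) != len(labels) or not cases:
        raise ValueError("LSWITCH needs N > 0 cases and N labels")
    for case, label in zip(cases, labels):
        if a == case:          # linear search through the case list
            return label
    return deflt               # no case matched


if __name__ == "__main__":
    print(lswitch(5, [1, 5, 9], [10, 11, 12], 99))
```

A linear search is used here for simplicity; the text requires
only that some search be made through the case list.
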
TSWITCH LO,HI,DEFLT

The TSWITCH macro produces an indexed jump based on the value of
the integer in register A. This macro is immediately followed by
a sequence of HI-LO+1 LABCON macros defining the target labels
corresponding to integer values from LO to HI. Values outside
this range should result in transfers to the internal label whose
internal label number is given by the integer DEFLT.

ETSWIT LO,HI,DEFLT

This macro completes a TSWITCH.
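
The indexed jump performed by TSWITCH can be sketched in Python
in the same spirit. The function name tswitch and the list used
for the LABCON table are hypothetical; the function returns the
internal label number that would be the jump target.

```python
def tswitch(a, lo, hi, labels, deflt):
    """Return the internal label number selected by a TSWITCH.

    labels -- the HI-LO+1 label numbers given by the LABCON macros,
              in order of the integer values LO through HI
    """
    if len(labels) != hi - lo + 1:
        raise ValueError("TSWITCH needs exactly HI-LO+1 labels")
    if lo <= a <= hi:
        return labels[a - lo]   # direct index into the table
    return deflt                # value outside the range LO..HI


if __name__ == "__main__":
    print(tswitch(4, 3, 5, [20, 21, 22], 99))
```

Unlike LSWITCH, no search is needed: the range check is followed
by a single table lookup.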