The CMAC Machine
Copyright (c) 1976, 1977, 1978
by
Alan Snyder
M.I.T. Laboratory for Computer Science
Cambridge, Ma. 02139
CMAC is a set of macros which form an assembly language for a
very simple machine. It has been designed to be used to
bootstrap the C compiler to new host machines. The code produced
for the CMAC machine is not efficient. However, efficiency is
not the relevant consideration. The primary goal is to minimize
the effort needed to implement the macros.
The CMAC machine has two data types, integers and pointers.
Characters are mapped into integers; character strings are
sequences of integers. The C compiler does not use
floating-point; thus, floating-point is not included in the CMAC
machine. Both integers and pointers are stored in single machine
words.
On word-addressed machines, pointers will be the same as
integers. On byte-addressed machines, they will most likely be
byte addresses, and thus always multiples of two, four, or
whatever. Pointers are distinguished from integers in the CMAC
machine because, on host machines which are not word-addressed,
operations on pointers (such as increment and decrement) will be
different from the corresponding integer operations.
The CMAC machine has two registers, called A and B. Each
register is capable of holding either an integer or a pointer.
All operations are performed on values in registers. Explicit
CMAC instructions are used to get values into and out of the
registers. This division of labor allows the problems of the
different storage classes of memory references to be isolated in
the load and store macro definitions, thus simplifying the
definitions of the operational macros.
CMAC programs are designed to be translated into the machine
language of the host machine using the standard macro assembler
of the host machine, augmented by a set of macro definitions for
the CMAC macros. In order to run the C compiler, the compiler
programs should be so translated and then linked together using
the standard linker of the host machine. In addition, a few
support routines (for I/O) must be hand-coded (these routines are
described in a separate document). It is conceivable that the
4 April 1978 - 2 - The CMAC Machine
CMAC programs might have to first be edited in order to conform
to the format expected by the host machine's macro assembler.
Alternatively, the CMAC programs can be translated (again, by
some macro processor) into a compact representation which can
then be interpreted by an interpreter running on the host
machine. This requires both the writing of an interpreter and
the writing of the macro definitions which construct the
interpreted representation. However, the use of this technique
may be necessary on small machines where the direct translation
from CMAC macros to machine code results in excessively large
programs.
The form of a CMAC program is that of an ASCII text file
consisting of text lines terminated by the newline (LF)
character. Each line contains one macro call, consisting of a
TAB, followed by the macro name, optionally followed by a TAB and
one or more character string arguments separated by commas. For
example,
	FOO	A,B
is a call of a macro named "FOO" with arguments "A" and "B".
Macro names and arguments consist of upper-case letters and
digits. Numeric arguments may be prefixed by a minus sign. In C
identifiers, the underscore character will be represented by 'J';
all external identifiers in the C compiler are unique in their
first five characters. All numeric arguments are in decimal.
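
The line format described above can be checked mechanically. What
follows is a minimal sketch in Python, not part of the CMAC
distribution; the function name parse_cmac_line and the decision to
reject malformed lines with ValueError are assumptions made for
illustration.

```python
import re

# Macro names and arguments consist of upper-case letters and digits;
# a numeric argument may carry a leading minus sign.
_NAME = re.compile(r"[A-Z0-9]+")
_ARG = re.compile(r"-?[0-9]+|[A-Z0-9]+")


def parse_cmac_line(line):
    """Split one CMAC text line into (macro_name, [arguments]).

    Hypothetical helper: TAB, macro name, then optionally a TAB and
    comma-separated arguments, as the text describes.
    """
    line = line.rstrip("\n")
    if not line.startswith("\t"):
        raise ValueError("a CMAC line begins with a TAB")
    fields = line[1:].split("\t")
    if len(fields) > 2:
        raise ValueError("too many TAB-separated fields")
    name = fields[0]
    args = fields[1].split(",") if len(fields) == 2 else []
    if not _NAME.fullmatch(name):
        raise ValueError("bad macro name: %r" % name)
    for arg in args:
        if not _ARG.fullmatch(arg):
            raise ValueError("bad argument: %r" % arg)
    return name, args
```

Applied to the example line above, the sketch yields the name "FOO"
and the argument list ["A", "B"].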
1. Move Macros
The move family of macros is used to move values between memory
and the two registers. There are three basic classes of move
macros: load, store, and load-address. Within each class, there
are macros corresponding to the relevant C storage classes. The
first argument of each of these macros specifies a register
(either A or B). The second argument is dependent on the
particular C storage class; it gives the information needed to
specify an exact storage location. The form of this information
is described in Table I.
1.1 Load Macros
Each macro in the load class has two arguments, a register
(either A or B) and a storage class-dependent argument. The
function of a load macro is to move the value in the location
specified by the second argument into the register specified by
the first argument. The load macros are as follows:
Table I.  Forms of Storage Class-Particular Information

    class       information form

    auto        word offset of variable in current stack frame
    extern      the actual C identifier
    static      a static-variable unique number
    literal     an integer value
    parm        parameter number (starting with 0)
    indirect    the register containing the pointer
    register    the register containing the value
    string      a string-literal unique number
_________________________________________________________________
    LAUTO   R,OFFSET    load register from auto variable
    LEXTRN  R,NAME      load register from external variable
    LSTAT   R,#         load register from static variable
    LLIT    R,#         load register with literal value
    LPARM   R,#         load register from parameter
    LVPTR   R,R         load register via pointer in register
    LREG    R,R         move from one register to the other
1.2 Store Macros
Each macro in the store class has two arguments, a register
(either A or B) and a storage class-dependent argument. The
function of a store macro is to move the value in the register
specified by the first argument to the location specified by the
second argument. The store macros are as follows:
    STAUTO  R,OFFSET    store register into auto variable
    STEXTN  R,NAME      store register into external variable
    STSTAT  R,#         store register into static variable
    STPARM  R,#         store register into parameter
    STVPTR  R,R         store register via pointer in register
1.3 Load-address Macros
Each macro in the load-address class has two arguments, a
register (either A or B) and a storage class-dependent argument.
The function of a load-address macro is to construct a pointer to
the location specified by the second argument, and put that
pointer value in the designated register. The load-address
macros are as follows:
    LAAUTO  R,OFFSET    load address of auto variable
    LAEXTN  R,NAME      load address of external variable
    LASTAT  R,#         load address of static variable
    LAPARM  R,#         load address of parameter
    LASTRG  R,#         load address of string literal
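
To make the three classes of move macros concrete, the following
Python sketch models a few of them on a toy word-addressed host,
where a pointer is simply an index into a list of words. The class
name Machine, the memory size, and the frame_base value are
assumptions for illustration only; a real implementation would be a
set of macro definitions for the host assembler.

```python
class Machine:
    """Toy word-addressed host: a pointer is an index into self.mem."""

    def __init__(self, size=64, frame_base=16):
        self.mem = [0] * size
        self.reg = {"A": 0, "B": 0}
        # Hypothetical address of the current stack frame.
        self.frame_base = frame_base

    # Load class: first argument is the destination register.
    def LLIT(self, r, value):
        self.reg[r] = value

    def LAUTO(self, r, offset):
        self.reg[r] = self.mem[self.frame_base + offset]

    def LVPTR(self, r, ptr_reg):
        self.reg[r] = self.mem[self.reg[ptr_reg]]

    def LREG(self, r, src):
        self.reg[r] = self.reg[src]

    # Store class: first argument is the source register.
    def STAUTO(self, r, offset):
        self.mem[self.frame_base + offset] = self.reg[r]

    def STVPTR(self, r, ptr_reg):
        self.mem[self.reg[ptr_reg]] = self.reg[r]

    # Load-address class: construct a pointer, do not touch memory.
    def LAAUTO(self, r, offset):
        self.reg[r] = self.frame_base + offset


m = Machine()
m.LLIT("A", 42)      # A := 42
m.STAUTO("A", 8)     # auto variable at offset 8 := A
m.LAAUTO("B", 8)     # B := pointer to that auto variable
m.LVPTR("A", "B")    # A := word that B points to
```

Only the load and store methods touch memory, which is the division
of labor the introduction describes: the operate macros never need
to know about storage classes.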
2. Operate Macros
The operate family of macros performs operations upon values in
the two registers. There are two classes of operate macros:
unary and binary.
2.1 Unary Operators
The unary operators each take one argument, which specifies a
register. This register is used for both the source and
destination of the operation. The unary operate macros are as
follows:
    CMINUS  R   arithmetic minus
    CNOT    R   bitwise negation
Since there are only two possible argument values, it is possible
to implement these operations as subroutine calls to one of a
pair of subroutines, one for each register.
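
The subroutine-pair idea can be sketched as follows. This is an
illustrative Python model, not host assembly: the names regs,
SUBROUTINES, and expand are hypothetical, and Python's unbounded
integers stand in for machine words (so CNOT of 0 gives -1, as on a
two's-complement host).

```python
regs = {"A": 0, "B": 0}


def make_unary(op, r):
    """Build the subroutine that applies op to register r in place."""
    def subroutine():
        regs[r] = op(regs[r])
    return subroutine


# One subroutine per (macro, register) pair: four in all.
SUBROUTINES = {
    (name, r): make_unary(op, r)
    for name, op in (("CMINUS", lambda x: -x), ("CNOT", lambda x: ~x))
    for r in ("A", "B")
}


def expand(name, r):
    """'Expand' a unary macro call into a call of its subroutine."""
    SUBROUTINES[(name, r)]()
```

For example, with regs["A"] set to 5, expand("CMINUS", "A") leaves -5
in A and does not touch B.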
2.2 Binary Operators
The binary operators all take their first (or "left") operand
from the A register and their second (or "right") operand from
the B register, and place their result in the A register, leaving
the B register unchanged. The semantics of the operators are the
same as the corresponding C operators. Since the operand
locations are fixed, no arguments are provided to these macros.
The binary operate macros are as follows:
    CADD    integer addition
    CSUB    integer subtraction
    CMUL    integer multiplication
    CDIV    integer division
    CMOD    integer remainder
    CLS     bitwise left shift
    CRS     bitwise right shift
    CAND    bitwise AND
    COR     bitwise OR
    CXOR    bitwise XOR
    PINC    pointer increment
    PDEC    pointer decrement
    PSUB    pointer subtraction
Note that the pointer operations are not necessarily the same as
the integer operations. Pointer increment and decrement take a
pointer as their first operand and an integer as their second
operand and produce a pointer result. The integer represents an
offset in words; it may have to be scaled before it is actually
added to or subtracted from the pointer. Pointer subtraction takes
two pointer operands and produces an integer result. The integer
should represent the number of words that the first pointer is
offset from the second; it may be necessary to scale the result
of the subtraction in order to get the integer in the proper
units.
Since the binary operator macros take no arguments, they are
easily implemented as subroutine calls.
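
The scaling of the pointer operations can be illustrated with a
short Python sketch. It assumes a byte-addressed host with four-byte
words; the constant WORD_BYTES and the helper run_binary are
hypothetical names chosen for this example. On a word-addressed host
the scale factor would be one and the pointer operations would
reduce to the integer ones.

```python
WORD_BYTES = 4   # assumption: byte-addressed host, 4 bytes per word


def CADD(a, b):
    """Integer addition: no scaling involved."""
    return a + b


def PINC(a, b):
    """a: pointer (byte address), b: offset in words -> pointer."""
    return a + b * WORD_BYTES


def PDEC(a, b):
    """a: pointer (byte address), b: offset in words -> pointer."""
    return a - b * WORD_BYTES


def PSUB(a, b):
    """a, b: word-aligned pointers -> offset of a from b, in words."""
    return (a - b) // WORD_BYTES


def run_binary(op, regs):
    """Left operand from A, right from B; result in A, B unchanged."""
    regs["A"] = op(regs["A"], regs["B"])


regs = {"A": 1000, "B": 3}
run_binary(PINC, regs)   # A becomes 1000 + 3 * 4
```

Because each operation is a function of the two fixed registers, the
same run_binary wrapper serves every binary macro, which mirrors the
remark that these macros are easily implemented as subroutine calls.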
3. Conditional-jump Macros
The conditional-jump family of macros all take a single argument,
an internal-label unique number. This number designates a label
to which control should jump depending upon the result of a test
specified by the conditional-jump macro. This test is based on
the values in the registers. There are two types of
conditional-jump operators, unary and binary. The unary
operators test the value in the A register. The binary operators
perform a comparison between the values in the A register and
the B register. The unary conditional-jump macros are as
follows:
    JNULL   #   jump if null pointer
    JNNULL  #   jump if non-null pointer
The binary conditional-jump macros are as follows:
    JEQ     #   jump if A == B
    JNE     #   jump if A != B
    JLT     #   jump if A < B
    JGT     #   jump if A > B
    JLE     #   jump if A <= B
    JGE     #   jump if A >= B
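
The tests performed by the conditional-jump macros can be written
down compactly, as in the Python sketch below. It is a model of the
semantics only. The representation of the null pointer as 0, the
names jump_taken and next_pc, and the table mapping label numbers to
instruction indices are all assumptions made for the example.

```python
import operator

NULL = 0   # assumption: the host represents the null pointer as 0

BINARY_JUMPS = {
    "JEQ": operator.eq, "JNE": operator.ne,
    "JLT": operator.lt, "JGT": operator.gt,
    "JLE": operator.le, "JGE": operator.ge,
}


def jump_taken(macro, regs):
    """Return True if the named conditional jump would be taken."""
    if macro == "JNULL":
        return regs["A"] == NULL
    if macro == "JNNULL":
        return regs["A"] != NULL
    return BINARY_JUMPS[macro](regs["A"], regs["B"])


def next_pc(macro, label, regs, labels, pc):
    """Index of the next instruction: the label's location if the
    test succeeds, otherwise the instruction after the jump."""
    return labels[label] if jump_taken(macro, regs) else pc + 1
```

Here labels maps internal label numbers to locations, in the way a
LABDEF macro would define them.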
4. Keyword Macros
The remaining family of macros is the set of keyword macros.
These correspond closely to the keyword macros used in the
abstract machine of the C compiler. The CMAC keyword macros are
described below. Each description is headed by the name of a
macro and its argument names; following is a description of the
arguments and the intended function of the macro call.
4.1 Program Definition Macros
HEAD
The HEAD macro marks the beginning of a CMAC program. It may
produce any needed header statements.
CEND
The CEND macro marks the end of a CMAC program. It may produce
an END statement, if needed.
CENTRY NAME
NAME is a C identifier. The expansion of the CENTRY macro should
declare the specified variable to be an entry point, that is, one
which is defined in the current program but accessible to other
programs.
CEXTRN NAME
The CEXTRN macro is similar to the CENTRY macro except that it
defines the variable to be an external reference, that is, one
which is used in the current program but assumed to be defined in
another program.
PURE
The PURE macro indicates that the following program text
represents pure code or data. This macro may be used, along with
the IMPURE macro, to segregate pure and impure storage. The use
of this macro is optional.
IMPURE
The IMPURE macro indicates that the following program text
represents impure storage.
4.2 Symbol Defining Macros
CEQU NAME
NAME is a C identifier; it is to be defined as having a value
equal to the current value of the location counter.
LABDEF N
The LABDEF macro defines the location of internal label number N
to be the current value of the location counter.
STATIC N
The STATIC macro defines the location of the static variable
whose internal static variable number is N to be the current
value of the location counter. Typically, this macro will define
an assembly language symbol by which the static variable can be
referenced.
STRDEF N
The STRDEF macro defines the address of the string constant whose
internal number is N to be the current value of the location
counter. It is immediately followed by one or more INTCON
macros; the last one will define a zero word.
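
As an illustration of the STRDEF layout, the Python sketch below
produces the CMAC lines for one string constant. The function name
emit_string is hypothetical, and the use of ASCII codes for the
character values is an assumption; the text says only that
characters are mapped into integers.

```python
def emit_string(n, text):
    """Return the CMAC lines defining string constant number n."""
    lines = ["\tSTRDEF\t%d" % n]
    for ch in text:
        # One INTCON word per character (ASCII code assumed).
        lines.append("\tINTCON\t%d" % ord(ch))
    # The last INTCON defines the terminating zero word.
    lines.append("\tINTCON\t0")
    return lines


for line in emit_string(3, "HI"):
    print(line)
```

The empty string still produces one INTCON, the zero word, so every
STRDEF is followed by at least one INTCON as the description
requires.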
LINNUM N
The LINNUM macro associates the line in the source program whose
line number is specified by the integer N with the current value
of the location counter. It need not produce any code; it is
provided merely to aid in the reading of CMAC programs.
4.3 Storage Defining Macros
ADCON NAME
NAME is a C identifier. The ADCON macro should define a word of
storage initialized with a pointer to the specified external
variable. This macro is used in the initialization of static and
external pointers and arrays of pointers.
SADCON N
N is an integer. The SADCON macro should define a word of
storage initialized with a pointer to the static variable
numbered N. This macro is used in the initialization of static
and external pointers and arrays of pointers.
INTCON I
The INTCON macro should define a word of storage whose initial
value is that specified by the integer I. It is used in the
initialization of static and external variables and arrays, in
the definition of string constants, and in the construction of
tables for the LSWITCH macro.
LABCON N
The LABCON macro should define a word of storage whose initial
value is the address corresponding to internal label number N.
The LABCON macro is used to construct the tables for the LSWITCH
and TSWITCH macros.
STRCON N
The STRCON macro should define a word of storage whose initial
value is a pointer to the string constant whose internal string
number is N. The STRCON macro is used in the initialization of
static and external variables.
CZERO N
The CZERO macro specifies the definition of a block of storage
initialized to zero; the size in words of this storage area is
specified by the integer N.
4.4 Control Macros
PROLOG FUNCNO,FUNCNAME
The PROLOG macro produces the prolog code for a C function.
FUNCNAME is the name of the C function. FUNCNO is an integer
which specifies the internal function number of the function; it
may be used in conjunction with the EPILOG macro to access the
size of the function's stack frame. The PROLOG macro should
define the entry point name and produce the code necessary to
save the environment of the calling function and to set up the
environment of the called function using the information provided
in the function call. These actions may be performed by a
subroutine. The first eight words of every stack frame are
reserved for use by the PROLOG macro; that is, the first
automatic variable in a function is given an offset of eight
words. The PROLOG macro call appears in a CMAC program
immediately before the first instruction of the corresponding
function.
EPILOG FUNCNO,FRAMESIZE
The EPILOG macro produces the epilog code for a C function. The
epilog code should restore the environment of the calling
function and return to that function. These actions may be
performed by a subroutine. FUNCNO and FRAMESIZE are integers
which specify the internal function number of the function and
the size in words of its stack frame, respectively. These
integers can be used to define an assembly-language symbol whose
value is the size of the stack frame; this symbol can then be
used by the code produced by the PROLOG macro which allocates the
stack frame.
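
One way the PROLOG and EPILOG expansions might cooperate through a
frame-size symbol is sketched below in Python. Everything about the
generated text is hypothetical: the symbol naming scheme FSn, the
CALL and EQU pseudo-instructions, and the ENTER and LEAVE support
routines are invented for illustration and do not describe any
particular host assembler.

```python
RESERVED_WORDS = 8   # from the text: first eight words are reserved


def frame_symbol(funcno):
    """Hypothetical symbol that will hold the frame size."""
    return "FS%d" % funcno


def expand_prolog(funcno, funcname):
    """Define the entry point and call a (hypothetical) ENTER
    routine, referring to the frame size by symbol only."""
    return ["%s:" % funcname,
            "\tCALL\tENTER,%s" % frame_symbol(funcno)]


def expand_epilog(funcno, framesize):
    """Call a (hypothetical) LEAVE routine, then define the symbol
    that the earlier PROLOG expansion referred to."""
    return ["\tCALL\tLEAVE",
            "%s\tEQU\t%d" % (frame_symbol(funcno), framesize)]


def first_auto_offset():
    """Offset in words of the first automatic variable."""
    return RESERVED_WORDS
```

The point of the sketch is the forward reference: the frame size is
not known when PROLOG is expanded, so PROLOG names a symbol whose
value EPILOG supplies later in the same assembly.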
CCALL NARGS,ARGP,NAME
The CCALL macro generates a function call. NARGS is an integer
specifying the number of arguments to the function call; ARGP is
an integer specifying the word offset in the caller's stack frame
of the arguments which have been so placed by previous
instructions. NAME is the name of the function being called.
CALREG NARGS,ARGP,REG
The CALREG macro is like the CCALL macro except that the function
being called has been computed dynamically. The address of this
function is located in the register specified by REG.
CRETRN
The CRETRN macro produces the statements needed to return from a
function to the calling function, i.e., transfer to the EPILOG
code. The returned value of the function will have been placed
in the A register by previous CMAC instructions.
CGOTO N
The CGOTO macro produces an unconditional jump to the location
defined by internal label number N.
LSWITCH N,DEFLT
The LSWITCH macro should generate code which jumps according to
the value of the integer in register A. This macro is
immediately followed by N (N>0) INTCON macros (the cases), which
are immediately followed by N LABCON macros (the corresponding
labels), followed by an ELSWIT macro. A search should be made
through the case list; if a match is found, a jump should be made
to the label defined by the corresponding LABCON macro. If the
integer matches none of the list entries, then a jump should be
made to the internal label whose internal label number is given
by the integer DEFLT.
ELSWIT N,DEFLT
This macro completes an LSWITCH.
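
The search that LSWITCH performs can be modelled as follows. This
Python sketch takes the values of the N INTCON macros and the N
LABCON macros as two lists; the function name lswitch_target is
hypothetical, and a linear search is assumed since the text does not
prescribe a search order.

```python
def lswitch_target(a, cases, labels, deflt):
    """Return the internal label number to jump to.

    a      -- integer in register A
    cases  -- values of the N INTCON macros
    labels -- label numbers of the N LABCON macros, in the same order
    deflt  -- label number used when no case matches
    """
    assert len(cases) == len(labels) > 0   # N > 0, one label per case
    for case, label in zip(cases, labels):
        if a == case:
            return label
    return deflt
```

For instance, with cases [1, 5, 9] and labels [10, 11, 12], the
value 5 selects label 11 and the value 2 selects the default.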
TSWITCH LO,HI,DEFLT
The TSWITCH macro produces an indexed jump based on the value of
the integer in register A. This macro is immediately followed by
a sequence of HI-LO+1 LABCON macros defining the target labels
corresponding to integer values from LO to HI. Values outside
this range should result in transfers to the internal label whose
internal label number is given by the integer DEFLT.
ETSWIT LO,HI,DEFLT
This macro completes a TSWITCH.
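
By contrast with LSWITCH, TSWITCH needs no search: the value in
register A indexes the label table directly. A minimal Python model
follows; the function name tswitch_target is hypothetical.

```python
def tswitch_target(a, lo, hi, labels, deflt):
    """Return the internal label number to jump to.

    a      -- integer in register A
    labels -- label numbers of the HI-LO+1 LABCON macros, for the
              values LO, LO+1, ..., HI in that order
    deflt  -- label number used for values outside LO..HI
    """
    assert len(labels) == hi - lo + 1
    if lo <= a <= hi:
        return labels[a - lo]
    return deflt
```

With LO = 2, HI = 4 and labels [20, 21, 22], the value 3 selects
label 21, while 5 falls outside the range and selects the default.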