mirror of
https://github.com/mikpe/pdp10-tools.git
synced 2026-01-11 23:53:19 +00:00
doc/pdp10-abi.txt
Copyright (C) 2015-2017 Mikael Pettersson

This file is part of pdp10-tools.

pdp10-tools is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

pdp10-tools is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with pdp10-tools. If not, see <http://www.gnu.org/licenses/>.


PDP10 ELF Application Binary Interface Supplement

3 LOW-LEVEL SYSTEM INFORMATION

Machine Interface

Processor Architecture

The following documents define the PDP10 architecture:

* DECsystem-10/DECSYSTEM-20 Processor Reference Manual, AD-H391A-T1, Digital Equipment Corporation, 1982.
* KC10 Functional Description, Digital Equipment Corporation, 1983.
* Extended Addressing, Digital Equipment Corporation, 1983.
* TOAD-1 System Architecture Reference Manual, XKL Systems Corporation, 1996.

Linux for PDP10 requires the following features introduced with the
KL10B processor:

* extended addressing
* "G format" floating point

From now on, an unqualified reference to PDP10 or "the architecture"
in this document means a KL10B or KL10B-compatible processor.

Programs intended to execute directly on the processor use the PDP10
instruction set, and the instruction encodings and semantics of the architecture.

An application program can assume that all instructions defined by the
architecture that are neither privileged nor model-specific exist and
work as documented.

Data Representation

Byte Ordering

The architecture defines a 9-bit byte, an 18-bit halfword, a 36-bit word, and
a 72-bit doubleword.

Note - Although the architecture supports other sub-word values via its so-called
"byte pointers", those are not part of this specification.

Byte ordering defines how the bytes that make up halfwords, words, and doublewords
are ordered in memory. Most significant byte (MSB) ordering, or "Big-Endian"
as it is sometimes called, means that the most significant byte is located in the
lowest addressed byte position in a storage unit (byte 0).

Figures 3-1 through 3-3 illustrate the conventions for bit and byte numbering
within storage units of various widths. These conventions apply to both integer data
and floating-point data, where the most significant byte of a floating-point value
holds the sign and at least the start of the exponent.
The figures show big-endian byte numbers in the upper left corners, and bit numbers in the
lower corners.

Note - In the PDP10 documentation, the bits in a word are numbered from left to
right (MSB to LSB).

Figure 3-1. Bit and Byte Numbering in Halfwords

+---------+---------+
|0        |1        |
|   msb   |   lsb   |
|0       8|9      17|
+---------+---------+

Figure 3-2. Bit and Byte Numbering in Words

+---------+---------+---------+---------+
|0        |1        |2        |3        |
|   msb   |         |         |   lsb   |
|0       8|9      17|18     26|27     35|
+---------+---------+---------+---------+

Figure 3-3. Bit and Byte Numbering in Doublewords

+---------+---------+---------+---------+
|0        |1        |2        |3        |
|   msb   |         |         |         |
|0       8|9      17|18     26|27     35|
+---------+---------+---------+---------+
|4        |5        |6        |7        |
|         |         |         |   lsb   |
|36     44|45     53|54     62|63     71|
+---------+---------+---------+---------+
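
Note - The following C sketch is illustrative only and is not part of this
specification. It models a 36-bit word in the low bits of a host uint64_t and
shows the big-endian byte numbering of Figure 3-2, where byte 0 holds bits 0
through 8. The names PDP10_WORD_MASK, pdp10_word_byte, and pdp10_word_from_bytes
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PDP10_WORD_MASK UINT64_C(0777777777777)   /* 36 one-bits */

/* Return 9-bit byte number `index` of a 36-bit word; byte 0 is the
 * most significant byte (bits 0-8 in PDP10 numbering). */
static unsigned pdp10_word_byte(uint64_t word, unsigned index)
{
    assert(index < 4);
    assert((word & ~PDP10_WORD_MASK) == 0);
    return (unsigned)((word >> (9 * (3 - index))) & 0777);
}

/* Assemble a 36-bit word from four 9-bit bytes, byte 0 first. */
static uint64_t pdp10_word_from_bytes(const unsigned bytes[4])
{
    uint64_t word = 0;
    for (unsigned i = 0; i < 4; i++) {
        assert(bytes[i] <= 0777);
        word = (word << 9) | bytes[i];
    }
    return word;
}
```

For example, byte 0 of the octal word 0111222333444 is 0111 and byte 3 is 0444.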

Fundamental Types

Figure 3-4 shows how ISO C scalar types correspond to those of the PDP10 processor.
For all types, a NULL pointer has the value zero.

Type             C                           sizeof  Alignment  PDP10
-------------------------------------------------------------------------------------------------------
Boolean        | _Bool                       1       1          unsigned byte
---------------|----------------------------------------------------------------------------------------
Character      | char                        1       1          unsigned byte
               | unsigned char               1       1          Note: LDB and ILDB zero-extend
               |----------------------------------------------------------------------------------------
               | signed char                 1       1          signed byte
---------------|----------------------------------------------------------------------------------------
Short          | short                       2       2          signed halfword
               | signed short                2       2
               |----------------------------------------------------------------------------------------
               | unsigned short              2       2          unsigned halfword
---------------|----------------------------------------------------------------------------------------
Integral       | int                         4       4          signed word
               | signed int                  4       4
               | long int                    4       4
               | signed long                 4       4
               | enum                        4       4
               |----------------------------------------------------------------------------------------
               | unsigned int                4       4          unsigned word
               | unsigned long               4       4
---------------|----------------------------------------------------------------------------------------
Long Long      | long long                   8       4          signed doubleword
               | signed long long            8       4
               |----------------------------------------------------------------------------------------
               | unsigned long long          8       4          unsigned doubleword
---------------|----------------------------------------------------------------------------------------
Pointer        | any-type *                  4       4          unsigned word
               | any-type (*) ()             4       4
---------------|----------------------------------------------------------------------------------------
Floating-point | float                       4       4          single precision
               |----------------------------------------------------------------------------------------
               | double                      8       4          "G format" double precision
               | long double                 8       4          "G format" double precision
---------------|----------------------------------------------------------------------------------------

Note - long long is implemented in software, as the architecture's
double precision fixed point instructions have unsuitable semantics.

Aggregates and Unions

Aggregates (structures and arrays) and unions assume the alignment of
their most strictly aligned component, that is, the component with the
largest alignment. The size of any object, including aggregates and unions,
is always a multiple of the alignment of the object. An array uses the
same alignment as its elements. Structure and union objects may require
padding to meet size and alignment constraints. The contents of any padding
are undefined.

* An entire structure or union object is aligned on the same boundary as
  its most strictly aligned member.
* Each member is assigned to the lowest available offset with the appropriate
  alignment. This may require internal padding, depending on the previous
  member.
* If necessary, a structure's size is increased to make it a multiple of
  the structure's alignment. This may require tail padding, depending on
  the last member.
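
Note - The three rules above can be sketched as a small layout calculator. The
C code below is illustrative only and not part of this specification; sizes and
alignments are counted in 9-bit bytes, and the names member, layout, round_up,
and layout_struct are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct member { size_t size; size_t align; };   /* both in 9-bit bytes */
struct layout { size_t size; size_t align; };

/* Round `value` up to the next multiple of `align`. */
static size_t round_up(size_t value, size_t align)
{
    return (value + align - 1) / align * align;
}

/* Lay out the members of a structure in declaration order.
 * Stores each member's byte offset in offsets[i] and returns the
 * structure's total size and alignment. */
static struct layout layout_struct(const struct member *m, size_t n,
                                   size_t *offsets)
{
    struct layout out = { 0, 1 };
    for (size_t i = 0; i < n; i++) {
        assert(m[i].align > 0);
        out.size = round_up(out.size, m[i].align);   /* internal padding */
        offsets[i] = out.size;
        out.size += m[i].size;
        if (m[i].align > out.align)
            out.align = m[i].align;                  /* strictest member wins */
    }
    out.size = round_up(out.size, out.align);        /* tail padding */
    return out;
}
```

Applied to a char, a double, and a short in that order, the sketch yields
offsets 0, 4, and 12, a size of 16, and word alignment.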

In the following examples, members' byte offsets appear in the upper left corners.

Figure 3-5: Structure Smaller Than a Word

                        Byte aligned, sizeof is 1
struct {                +---------+
    char c;             |0        |
};                      |    c    |
                        +---------+

Figure 3-6: No Padding

                        Word aligned, sizeof is 8
struct {                +---------+---------+------------------+
    char c;             |0        |1        |2                 |
    char d;             |    c    |    d    |        s         |
    short s;            +---------+---------+------------------+
    long n;             |4                                     |
};                      |                  n                   |
                        +--------------------------------------+

Figure 3-7: Internal Padding

                        Halfword aligned, sizeof is 4
struct {                +---------+---------+
    char c;             |0        |1        |
    short s;            |    c    |   pad   |
};                      +---------+---------+
                        |2                  |
                        |         s         |
                        +-------------------+

Figure 3-8: Internal and Tail Padding

                        Word aligned, sizeof is 16
struct {                +---------+---------------------------+
    char c;             |0        |1                          |
    double d;           |    c    |            pad            |
    short s;            +---------+---------------------------+
};                      |4                                    |
                        |                  d                  |
                        +-------------------------------------+
                        |8                                    |
                        |                  d                  |
                        +------------------+------------------+
                        |12                |14                |
                        |        s         |       pad        |
                        +------------------+------------------+

Figure 3-9: union Allocation

                        Word aligned, sizeof is 4
union {                 +---------+---------------------------+
    char c;             |0        |1                          |
    short s;            |    c    |            pad            |
    int j;              +---------+---------+-----------------+
};                      |0                  |2                |
                        |         s         |       pad       |
                        +-------------------+-----------------+
                        |0                                    |
                        |                  j                  |
                        +-------------------------------------+

Bit-fields

C struct and union definitions may have "bit-fields", defining integral objects
with a specified number of bits.

Figure 3-10: Bit-Field Ranges

Bit-field Type       Width W    Range
----------------------------------------------------
signed char        |          | -2^(W-1) to 2^(W-1)-1
char               | 1 to 9   | 0 to 2^W-1
unsigned char      |          | 0 to 2^W-1
----------------------------------------------------
signed short       |          | -2^(W-1) to 2^(W-1)-1
short              | 1 to 18  | 0 to 2^W-1
unsigned short     |          | 0 to 2^W-1
----------------------------------------------------
signed int         |          | -2^(W-1) to 2^(W-1)-1
int                |          | 0 to 2^W-1
enum               |          | 0 to 2^W-1
unsigned int       | 1 to 36  | 0 to 2^W-1
signed long        |          | -2^(W-1) to 2^(W-1)-1
long               |          | 0 to 2^W-1
unsigned long      |          | 0 to 2^W-1
----------------------------------------------------
signed long long   |          | -2^(W-1) to 2^(W-1)-1
long long          | 1 to 72  | 0 to 2^W-1
unsigned long long |          | 0 to 2^W-1
----------------------------------------------------
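
Note - The ranges in Figure 3-10 can be computed directly from the width W. The
C sketch below is illustrative only and not part of this specification. For
simplicity it is restricted to widths of at most 36 bits, and the function
names bitfield_min and bitfield_max are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest value of a W-bit bit-field: -2^(W-1) if signed, else 0. */
static int64_t bitfield_min(unsigned width, int is_signed)
{
    assert(width >= 1 && width <= 36);
    return is_signed ? -(INT64_C(1) << (width - 1)) : 0;
}

/* Largest value of a W-bit bit-field: 2^(W-1)-1 if signed, else 2^W-1.
 * "Plain" bit-fields use the unsigned range. */
static int64_t bitfield_max(unsigned width, int is_signed)
{
    assert(width >= 1 && width <= 36);
    return is_signed ? (INT64_C(1) << (width - 1)) - 1
                     : (INT64_C(1) << width) - 1;
}
```

A 9-bit signed char bit-field therefore ranges from -256 to 255, while a 9-bit
plain char bit-field ranges from 0 to 511.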

"Plain" bit-fields (that is, those neither signed nor unsigned) always have
non-negative values. Although they may have type short,
int, long, or long long (which can have negative values), bit-fields of
these types have the same range as bit-fields of the same size with the
corresponding unsigned type. Bit-fields obey the same size and alignment
rules as other structure and union members, with the following additions:

* Bit-fields are allocated from left to right (most to least significant).
* A bit-field must entirely reside in a storage unit appropriate for its
  declared type. Thus, a bit-field never crosses its unit boundary.
* Bit-fields must share a storage unit with other structure and union members
  (either bit-field or non-bit-field) if and only if there is sufficient
  space within the storage unit.
* Unnamed bit-fields' types do not affect the alignment of a structure or
  union, although an individual bit-field's member offsets obey the alignment
  constraints. An unnamed, zero-width bit-field shall prevent any further
  member, bit-field or other, from residing in the storage unit corresponding
  to the type of the zero-width bit-field.
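
Note - The allocation rules above can be sketched as a placement function that
works on bit offsets from the start of the structure. The C code below is
illustrative only and not part of this specification. It does not handle
zero-width bit-fields, the storage unit is given in bits (9, 18, or 36), and
the names place_bitfield and place_member are hypothetical.

```c
#include <assert.h>

/* Return the bit offset at which a bit-field of `width` bits is placed,
 * given the first free bit offset `pos` and the size in bits of the
 * storage unit for its declared type. The field starts at `pos` unless
 * that would make it cross a unit boundary. */
static unsigned place_bitfield(unsigned pos, unsigned width, unsigned unit)
{
    assert(unit > 0 && width >= 1 && width <= unit);
    unsigned unit_start = pos / unit * unit;
    if (pos + width > unit_start + unit)
        pos = unit_start + unit;      /* skip to the next storage unit */
    return pos;
}

/* Return the bit offset of an ordinary (non-bit-field) member whose
 * size and alignment are both `unit` bits. */
static unsigned place_member(unsigned pos, unsigned unit)
{
    assert(unit > 0);
    return (pos + unit - 1) / unit * unit;
}
```

For a structure declaring short s:10, int j:10, and char c in that order, the
sketch places s at bit 0, j at bit 10, and c at bit 27.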

The following examples (Figures 3-11 through 3-16) show struct and union
members' byte offsets in the upper left corners. Bit numbers appear in the
lower corners.

Figure 3-11: Bit Numbering

                        +---------+---------+---------+---------+
                        |0        |1        |2        |3        |
0111222333444           |  0111   |  0222   |  0333   |  0444   |
                        |0       8|9      17|18     26|27     35|
                        +---------+---------+---------+---------+

Figure 3-12: Left-to-Right Allocation

                        Word aligned, sizeof is 4
struct {                +-----+------+--------+-----------------+
    int j:5;            |0    |      |        |                 |
    int k:6;            |  j  |  k   |   m    |       pad       |
    int m:8;            |0    |5     |11      |19               |
};                      +-----+------+--------+-----------------+

Figure 3-13: Boundary Alignment

                        Word aligned, sizeof is 12
struct {                +----------+----------+-------+---------+
    short s:10;         |0         |          |       |         |
    int j:10;           |    s     |    j     |  pad  |    c    |
    char c;             |0         |10        |20     |27       |
    short t:10;         +----------+--------+-+-------++--------+
    short u:10;         |4         |        |          |        |
    char d;             |    t     |  pad   |    u     |  pad   |
};                      |0         |10      |18        |28      |
                        +---------++--------+----------+--------+
                        |8        |                             |
                        |    d    |             pad             |
                        |0        |9                            |
                        +---------+-----------------------------+

Figure 3-14: Storage Unit Sharing

                        Halfword aligned, sizeof is 2
struct {                +---------+---------+
    char c;             |0        |1        |
    short s:9;          |    c    |    s    |
};                      |0        |9        |
                        +---------+---------+

Figure 3-15: union Allocation

                        Halfword aligned, sizeof is 2
union {                 +---------+---------+
    char c;             |0        |1        |
    short s:9;          |    c    |   pad   |
};                      |0        |9        |
                        +---------+---------+
                        |0        |1        |
                        |    s    |   pad   |
                        |0        |9        |
                        +---------+---------+

Figure 3-16: Unnamed Bit-Fields

                        Byte aligned, sizeof is 9
struct {                +---------+---------------------------+
    char c;             |0        |1                          |
    int :0;             |    c    |            :0             |
    char d;             |0        |9                          |
    short :10;          +---------+---------+----------+------+
    char e;             |4        |5        |6         |      |
    char :0;            |    d    |   pad   |   :10    | pad  |
};                      |0        |9        |18        |28    |
                        +---------+---------+----------+------+
                        |8        |
                        |    e    |
                        |0        |
                        +---------+

Note - In Figure 3-16, the presence of the unnamed int and short fields does not
affect the alignment of the structure. They align the named members relative
to the beginning of the structure, but the named members may not be aligned in
memory on suitable boundaries. For example, the d members in an array of this
structure will not all be on an int (4-byte) boundary.

As the examples show, int bit-fields (including signed and unsigned) pack more
densely than smaller base types. You can use char and short bit-fields to force
particular alignments, but int is generally more efficient.

Function Calling Sequence

This section discusses the standard function calling sequence, including stack
frame layout, register usage, and parameter passing.

The standard calling sequence requirements apply only to global functions. Local
functions that are not reachable from other compilation units may use different
conventions. Nevertheless, it is recommended that all functions use the standard
calling sequence when possible.

Registers

The PDP10 architecture provides 16 general purpose registers, each 36 bits wide,
and a 13-bit "Program Flags" special purpose register. By convention, the general
purpose registers are referred to via the octal numbers 0 through 017.
All of these registers are global to all procedures active for a given thread.
Brief register descriptions appear in Figure 3-17, followed by more detailed
information about the registers.

Figure 3-17: Processor Registers

Register        Call Effect     Usage
--------------------------------------------------
0               Volatile        Temporary
1-4             Volatile        Temporaries, argument passing and return values
5-7             Volatile        Temporaries
010-013         Preserved       Local variables
014             Reserved        Thread pointer [TODO: "Preserved, local variable or static chain" in TOPS-20]
015             Preserved       Local variable or frame pointer
016             Preserved       Local variable, GOT pointer for PIC code
017             Preserved       Stack pointer
Program Flags   Volatile        Contains special purpose flags

Registers 010 to 013 and 015 to 017 are nonvolatile; that is, they "belong to" the calling
function. A called function shall save these registers' values before it changes
them, restoring their values before it returns. Registers 0 through 7, and the
program flags, are volatile; that is, they are not preserved across function calls.

[TODO: may function linkage clobber any register?]

Register 014 is reserved by the system as a thread pointer, and must not be changed
by application code.

Register 015 may be used as a frame pointer holding the base address for the current
stack frame. Consequently, a function has registers pointing to both ends of its frame.
Incoming arguments reside in the previous frame, referenced as negative offsets from
register 015, while local variables reside in the current frame, referenced as positive
offsets from register 015. A function must preserve this register for its caller.

Register 016 is used as the global offset table base register for position-independent code.
For absolute code, this register is available as a local variable and has no specified
role in the function calling sequence. In either case, a function must preserve this
register for its caller.

Register 017 contains the current thread's stack pointer, as used by the PUSH,
PUSHJ, POP, POPJ, and ADJSP instructions. It shall always point to the topmost
valid word of the current stack frame; the stack grows towards high addresses.

Signals can interrupt processes. Functions called during signal handling have no
unusual restrictions on their use of registers. Moreover, if a signal handling
function returns, the process resumes its original execution path with all registers
restored to their original values. Thus, programs and compilers may freely use all
registers, except those reserved for system use, without the danger of signal handlers
inadvertently changing their values.

The Stack Frame

A function will be passed a frame on the runtime stack by the function which called it,
and may allocate a new stack frame for itself if needed. This stack grows upwards from
low to high addresses. Figure 3-18 shows the stack frame organization. SP in the figure
denotes the stack pointer (general purpose register 017) of the called function after it
has executed code establishing its stack frame. Old SP denotes the stack pointer before
the stack frame is established; it points at the word containing the return address, one
word above the caller's stack frame. Since the stack pointer is a word pointer, offsets
are in number of words, not bytes.

Base    Offset  Contents          Frame
-------------+-----------------+---------------
SP           |    argument     | High addresses
             |   build area    |
             +-----------------+
             | local variables | Current
             |  and register   |
        +1W  |    save area    |
             +-----------------+
Old SP  +0W  | return address  |
-------------+-----------------+---------------
        -1W  | argument word 0 |
             |       ...       |
             | argument word n | Previous
             +-----------------+
             |       ...       |
             |                 | Low addresses
-------------+-----------------+---------------

[TODO: show layout when a frame pointer is used?]

Parameter Passing

All integer-valued arguments are passed as 36-bit words. A byte or halfword is
zero- or sign-extended to a word, depending on the signedness of its type.
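
Note - The extension rule can be sketched as follows. The C code is illustrative
only and not part of this specification. It models a 36-bit word in the low bits
of a host uint64_t, and the names WORD_MASK and extend_to_word are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_MASK UINT64_C(0777777777777)   /* 36 one-bits */

/* Extend a `bits`-wide value (9 for a byte, 18 for a halfword) to a
 * 36-bit word, replicating the sign bit when the type is signed. */
static uint64_t extend_to_word(uint64_t value, unsigned bits, int is_signed)
{
    assert(bits >= 1 && bits <= 36);
    uint64_t field_mask = (UINT64_C(1) << bits) - 1;
    value &= field_mask;
    if (is_signed && ((value >> (bits - 1)) & 1))
        value |= WORD_MASK & ~field_mask;   /* fill the upper bits with ones */
    return value;
}
```

A signed char holding 0777 (that is, -1) is passed as the word 0777777777777,
whereas an unsigned char holding 0777 is passed as 0777.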

Structures and unions are passed as sequences of words, left-to-right. The contents
of any tail padding are unspecified.

Integers and floats larger than one word are passed as structures.

If the called function returns a structure or union, the caller passes the address
of an area large enough to hold the result as the first argument to the function,
shifting remaining arguments one step to the right. The called function copies the
returned structure or union into this area before it returns.

The sequence of words that make up the argument list is passed in registers and the stack as follows:

* the first (left-most) four words are passed in registers 1 to 4, in that order; if the argument list
  is shorter than four words, then only as many argument registers as there are argument words are
  defined; the remaining ones have undefined contents

* remaining argument words after the first four, if any, are passed on the top of the current
  stack frame, in left-to-right order, at offsets -1, -2, -3, etc. from the stack pointer as
  seen by the called function
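
Note - The mapping from argument words to registers and stack slots can be
sketched as follows. The C code is illustrative only and not part of this
specification. Argument words are numbered from 0, and the names arg_place,
arg_location, and locate_arg_word are hypothetical.

```c
#include <assert.h>

enum arg_place { ARG_IN_REGISTER, ARG_ON_STACK };

struct arg_location {
    enum arg_place place;
    int where;   /* register number, or word offset from the callee's SP */
};

/* Locate argument word `index` (0 = left-most) of the argument list. */
static struct arg_location locate_arg_word(unsigned index)
{
    struct arg_location loc;
    assert(index < 1000000);   /* arbitrary sanity bound for this sketch */
    if (index < 4) {
        loc.place = ARG_IN_REGISTER;
        loc.where = (int)index + 1;        /* registers 1 to 4 */
    } else {
        loc.place = ARG_ON_STACK;
        loc.where = -((int)index - 3);     /* offsets -1, -2, -3, ... */
    }
    return loc;
}
```

Under this sketch the fifth argument word (index 4) is found at offset -1 from
the stack pointer as seen by the called function.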

[TODO: prevent any individual argument from being split between registers and the stack?]
[TODO: reserve a 4-word argument home area at the top of the frame, between the return address
and the stacked arguments -- that would simplify variadic functions]

The return address is passed at the top of the stack, at offset 0 from the stack pointer as
seen by the called function.

Variable Argument Lists

Portable C programs must use the header file <stdarg.h> to handle variable argument lists
on the PDP10.

Function Return Values

An integral or pointer return value is returned in register 1. A long long integer or
double-precision float is returned in registers 1 and 2. A byte or halfword is
zero- or sign-extended to a word, depending on the signedness of its type.

The caller of a function that returns a structure or union passes the address of an
area large enough to hold the result in the first argument register. Before the called
function returns to its caller, it copies the return value to this area; the called
function also returns this address in register 1.

[TODO: allow small structures to be returned in registers, then rephrase long long and
double returns to be as if they are structures]

Operating System Interface

Virtual Address Space

Processes execute in an 18-, 23-, or 30-bit virtual address space, partitioned into sections
of 256 kilowords (2^18 words) each. Memory management translates virtual addresses to
physical addresses, hiding physical addressing and letting a process run anywhere in the
system's real memory. Processes typically begin with three logical segments, commonly
called "text", "data", and "stack". Dynamic linking creates more segments during execution,
and a process can create additional segments for itself with system services.

Page Size

Memory is organized into pages, which are the system's smallest units of memory allocation.
The hardware page size for the PDP10 architecture is 512 words.
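
Note - With 2^18 words per section and 512 words per page, a 30-bit virtual word
address splits into a 12-bit section number, a 9-bit page number within the
section, and a 9-bit offset within the page. The C sketch below is illustrative
only and not part of this specification; the names pdp10_addr and split_address
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct pdp10_addr {
    unsigned section;   /* 0 to 07777 */
    unsigned page;      /* page within the section, 0 to 0777 */
    unsigned offset;    /* word within the page, 0 to 0777 */
};

/* Split a 30-bit virtual word address into section, page, and offset. */
static struct pdp10_addr split_address(uint32_t addr)
{
    struct pdp10_addr a;
    assert(addr < (UINT32_C(1) << 30));
    a.section = addr >> 18;
    a.page = (addr >> 9) & 0777;
    a.offset = addr & 0777;
    return a;
}
```

For example, offset 01000 in section 2 is the first word of page 1 of that
section.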

Shared Libraries

Shared libraries depend on position-independent code (PIC). However, the PDP10 supports only
absolute addressing of both code and data, via 18-bit local addresses (offsets from the current
section), or via 30- or 36-bit global addresses.

This ABI specifies that each shared library is aligned on a section boundary, as that allows most
local code references in the shared library to use local absolute addresses. To retrieve its
global offset table (GOT) pointer, a function in the shared library executes the following code:

        JSP 0, _GET_GOT_

where _GET_GOT_ is a linker-generated function containing:

_GET_GOT_:
        HLLZ 016, 0             # copy section number
        HRRI 016, L             # insert local section offset for L
L:      ADD 016, @[_GLOBAL_OFFSET_TABLE_-.]
        JRST 0

This places a return address in AC 0 and jumps via a local absolute address to the linker-generated
function _GET_GOT_, which computes the GOT address into AC 016 from the section number in AC 0 and
the link-time constant offset from _GET_GOT_ to the GOT, and then returns via AC 0. Each section
in a shared library must contain its own specific _GET_GOT_ function.
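
Note - The arithmetic performed by _GET_GOT_ can be modelled in C as shown below.
The sketch is illustrative only and not part of this specification. It assumes
that the word left in AC 0 by JSP carries only the section number in its left
half, and it models the ADD instruction as the addition of the link-time
constant. The names get_got, l_offset, and got_minus_l are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RIGHT_HALF UINT64_C(0777777)        /* low 18 bits of a word */
#define WORD_MASK  UINT64_C(0777777777777)  /* 36 one-bits */

/* Model of _GET_GOT_: ac0 is the return address placed in AC 0,
 * l_offset is the section-local offset of label L, and got_minus_l is
 * the link-time distance from L to the GOT. Returns the value of AC 016. */
static uint64_t get_got(uint64_t ac0, uint64_t l_offset, uint64_t got_minus_l)
{
    assert((ac0 & ~WORD_MASK) == 0);
    assert(l_offset <= RIGHT_HALF);
    uint64_t ac16 = ac0 & (WORD_MASK & ~RIGHT_HALF);   /* HLLZ 016, 0 */
    ac16 |= l_offset;                                  /* HRRI 016, L */
    ac16 = (ac16 + got_minus_l) & WORD_MASK;           /* ADD 016, ... */
    return ac16;
}
```

With a caller in section 5, L at local offset 01000, and the GOT 0200 words
after L, the sketch yields offset 01200 in section 5.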

Note -- It is possible to lift the restriction that shared libraries are section-aligned.
In this case, to compute its GOT a function could execute:

        MOVSI 0, 025400         # assemble "JRST 016" in AC 0
        HRR 0, 016
        JSP 016, 0              # place PC in AC 016, jump to AC 0
        ADD 016, @[_GLOBAL_OFFSET_TABLE_-.]

In this case, even local code references within the shared library would have to be
indirect via the GOT. This ABI does not support this model for shared libraries.

Virtual Address Assignments

Conceptually, processes have the full 30-bit (4096 section) address space available. In practice,
however, several factors limit the size of a process.

* The processor may support only 32 sections or a single section. The KC10, XKL-1, and SC-40 support
  4096 sections, the KL10B supports 32 sections, and the KA10, KI10, early KL10, and KS10 support only
  a single section.
* Section 0 may not contain code in programs compiled for 32 or 4096 sections, due to the
  differing semantics of executing in section zero versus a non-zero section.
* Section 0 may not contain the stack for programs running in non-zero sections, since the stack
  pointer would be interpreted as a local stack pointer and not function as intended.
* Locations 0 to 017 in sections 0 and 1 alias the general purpose registers, and are therefore
  unavailable for data or code allocation.
* Shared libraries, as defined by this ABI, must be section-aligned.

The Large Code Model

The large code model provides processes with access to a 30-bit address space with
sections 1 to 4095.

                +------------------+
07777_777777    |       ...        |
                |                  |
04000_000000    | Dynamic segments |
                +------------------+
03777_777777    |       ...        |
                |                  |
                |   BSS segment    |
                +------------------+
                |       ...        |
                |                  |
                |   Data segment   |
                +------------------+
                |       ...        |
                |                  |
00002_001000    |   Text segment   |
                +------------------+
00001_777777    |       ...        |
00001_777000    |    Guard page    |
                +------------------+
00001_776777    |       ...        |
                |                  |
00001_001000    |  Stack segment   |
                +------------------+
00001_000777    |                  |
00001_000000    |    Guard page    |
                +------------------+

The main program code and data are loaded starting in section 2 at offset 01000 (page 1),
and the main stack is allocated in section 1 at offset 01000 (page 1). Pages 0
and 0777 of section 1 are reserved and unmapped. Section 0 is reserved and unmapped.

The upper half of the address space is reserved for dynamic segments, allowing
for up to 2048 shared libraries to be mapped into the process. Unused space there
is available for dynamic memory allocation.
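
Note - The large code model layout can be expressed as a classification of
30-bit word addresses. The C sketch below is illustrative only and not part of
this specification. The region names are hypothetical, and since the text does
not say what page 0 of section 2 holds, the sketch simply counts it as part of
the program region.

```c
#include <assert.h>
#include <stdint.h>

enum region {
    REGION_RESERVED,   /* unmapped: section 0 and the guard pages */
    REGION_STACK,      /* section 1, pages 1 to 0776 */
    REGION_PROGRAM,    /* text, data, and BSS: sections 2 to 03777 */
    REGION_DYNAMIC     /* dynamic segments: sections 04000 to 07777 */
};

/* Classify a 30-bit virtual word address under the large code model. */
static enum region large_model_region(uint32_t addr)
{
    assert(addr < (UINT32_C(1) << 30));
    unsigned section = addr >> 18;
    unsigned page = (addr >> 9) & 0777;
    if (section == 0)
        return REGION_RESERVED;
    if (section == 1)
        return (page == 0 || page == 0777) ? REGION_RESERVED : REGION_STACK;
    if (section < 04000)
        return REGION_PROGRAM;
    return REGION_DYNAMIC;
}
```

The dynamic region spans sections 04000 through 07777, which is 2048 sections
and hence room for up to 2048 section-aligned shared libraries.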

Programs compiled for the large code model will only run on processors implementing full
extended addressing.

The Small Code Model

The small code model provides processes with access to a 23-bit address space with
sections 1 to 31.

                +------------------+
00037_777777    |       ...        |
                |                  |
00020_000000    | Dynamic segments |
                +------------------+
00017_777777    |       ...        |
                |                  |
                |   BSS segment    |
                +------------------+
                |       ...        |
                |                  |
                |   Data segment   |
                +------------------+
                |       ...        |
                |                  |
00002_001000    |   Text segment   |
                +------------------+
00001_777777    |       ...        |
00001_777000    |    Guard page    |
                +------------------+
00001_776777    |       ...        |
                |                  |
00001_001000    |  Stack segment   |
                +------------------+
00001_000777    |                  |
00001_000000    |    Guard page    |
                +------------------+

The small code model is identical to the large code model, except for the base
of the dynamic segments, and that the number of shared libraries is limited to
at most 16.

Programs compiled for the small code model will only run on processors implementing extended addressing.

The Tiny Code Model

The tiny code model provides processes with access to an 18-bit address space in section 0.

                +------------------+
00000_777777    |  Dynamic memory  |
                |    allocation    |
                |       ...        |
                |   BSS segment    |
                +------------------+
                |       ...        |
                |   Data segment   |
                +------------------+
                |       ...        |
00000_400000    |   Text segment   |
                +------------------+
00000_377777    |  Dynamic memory  |
                |    allocation    |
                |       ...        |
                +------------------+
                |    Guard page    |
                +------------------+
                |       ...        |
00000_001000    |  Stack segment   |
                +------------------+
00000_000777    |                  |
00000_000000    |    Guard page    |
                +------------------+

The main program code and data are loaded starting at offset 0400000 (page 0400),
and the main stack is allocated at offset 01000 (page 1). Page 0 is reserved
and unmapped.

Dynamic memory allocation via mmap() proceeds from offset 0777777 and downwards
towards the BSS segment. If that area is exhausted, dynamic memory allocation
then proceeds from offset 0377777 and downwards towards the main stack segment.

An unmapped guard page separates the stack segment from the following segment.
The location of this page is adjusted as needed if the dynamic memory allocation
segment grows or shrinks.

Programs compiled for the tiny code model will run on all PDP10 processors.
However, the tiny code model does not support shared libraries, and code executing
in section 0 has different behaviour than code executing in non-zero sections.

Process Initialization

This section describes the machine state that exec() creates for "infant" processes,
including argument passing, register usage, and stack frame layout. Programming
language systems use this initial program state to establish a standard environment for
their application programs. As an example, a C program begins executing at a function
named main, conventionally declared in the following way.

Figure 3-??: Declaration for main

        extern int main(int argc, char *argv[], char *envp[]);

Briefly, argc is a non-negative argument count; argv is an array of argument strings, with
argv[argc] == 0; and envp is an array of environment strings, also terminated by a NULL
pointer.
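
Note - Because both vectors end with a NULL pointer, startup code can measure
them without any separate length. The C sketch below is illustrative only and
not part of this specification; the name count_strings is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Count the entries of a NULL-terminated vector of strings, such as
 * argv or envp. The terminating NULL pointer is not counted. */
static size_t count_strings(char *const *vector)
{
    size_t n = 0;
    assert(vector != NULL);
    while (vector[n] != NULL)
        n++;
    return n;
}
```

For the argument vector the result equals argc, since argv[argc] is the
terminating NULL pointer.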

Although this section does not describe C program initialization, it gives the information
necessary to implement the call to main or to the entry point for a program in any other
language.

Process Stack and Registers

When a process receives control, its stack holds the arguments, environment, and
auxiliary vector from exec(). Argument strings, environment strings, and the auxiliary
information appear in no specific order within the information block; the system makes
no guarantee about their relative arrangement. The system may also leave an unspecified
amount of memory between the null auxiliary vector entry and the end of the information block.
A sample initial stack is shown in Figure 3-??.

Figure 3-??: Initial Process Stack

        +----------------------------------------+
SP:     | Argument count word                    | High Address
        +----------------------------------------+
        | zero word                              |
        +----------------------------------------+
        |                                        |
        | Argument pointers (1 word each)        |
        |                                        |
        +----------------------------------------+
        | zero word                              |
        +----------------------------------------+
        |                                        |
        | Environment pointers (1 word each)     |
        |                                        |
        +----------------------------------------+
        | zero word                              |
        +----------------------------------------+
        | AT_NULL auxiliary vector entry         |
        +----------------------------------------+
        |                                        |
        | Auxiliary vector (2 word entries)      |
        |                                        |
        +----------------------------------------+
        | AT_NULL auxiliary vector entry         |
        +----------------------------------------+
        | Unspecified                            |
        +----------------------------------------+
        | Information block, including argument  |
        | and environment strings, and auxiliary |
        | information (size varies)              | Low Address
        +----------------------------------------+

[TODO: The above stack layout is subject to change, the below register contents is correct.]

When a process is first entered (from an exec() system call), the contents of registers
other than those listed below are unspecified. Consequently, a program that requires
registers to have specific values must set them explicitly during process initialization.
It should not rely on the operating system to set all registers to 0. Following are the
registers whose contents are specified:

Figure 3-??

001     Argument count (argc).
002     Pointer to NULL-terminated array of pointers to argument strings (argv).
003     Pointer to NULL-terminated array of pointers to environment strings (envp).
017     The initial stack pointer.

[TODO: a register e.g. 1 with a function pointer for atexit()?
Not all archs use that. Only if Linux really wants it.]

[TODO: Following PPC64, change to pass all args in regs and leave stack contents unspecified?
They are the odd-ball. Do whatever is most convenient on Linux.]

Every process has a stack, but the system defines no fixed stack address. Furthermore,
a program's stack address can change from one system to another -- even from one process
invocation to another. Thus the process initialization code must use the stack address
in general purpose register 017. Data in the stack segment at addresses above the stack
pointer contain undefined values.

Whereas the argument and environment vectors transmit information from one application
program to another, the auxiliary vector conveys information from the operating system
to the program. This vector is an array of structures, which are defined in Figure 3-??.

[TODO: Auxiliary Vector definitions and figures]

Coding Examples

Conventions

Position-Independent Function Prologue

Variable Argument Lists

DWARF Definition

Object Files

ELF Header

Special Sections

Symbol Table

Symbol Values

Relocation

Program Loading and Dynamic Linking

Program Loading

Program Header

Dynamic Linking

Program Interpreter

There is one valid program interpreter for programs conforming to the PDP10 ABI: "/lib/ld-linux.so.1".
[TODO: /usr/lib/ld.so.1 instead?]