			KCC USER DOCUMENTATION
<1 About KCC>

	KCC is a compiler for the C language on the PDP-10.  It was
originally begun by Kok Chen of Stanford University around 1981 (hence
the name "KCC"), and has had many improvements made to it since then
by a number of people at Stanford, Columbia, and SRI.  It implements C
as described by the following references:
	H&S: Harbison and Steele, "C: A Reference Manual",
	 HS1: (1st edition) Prentice-Hall, 1984, ISBN 0-13-110008-4
	 HS2: (2nd edition) Prentice-Hall, 1987, ISBN 0-13-109802-0
	K&R: Kernighan and Ritchie, "The C Programming Language",
		Prentice-Hall, 1978, ISBN 0-13-110163-3

	Currently KCC is only supported for TOPS-20, although there is
no reason it cannot be used for other PDP-10 systems or processors, if
the need arises.  The remaining discussion assumes you are on a
TOPS-20 system.

<1 Using KCC>

	C source files should have the extension ".C", such as PROG.C
and SUBS.C.  To build a C program, whether from one or more source
files ("modules"), there are three things that must be done.  First,
all modules have to be compiled with KCC to produce .REL files (e.g.
PROG.REL and SUBS.REL); second, the LINK loader must be invoked to
load all of the necessary modules into an executable core image; and
third, this image must be saved on disk as an .EXE file.

	Every complete C program must contain one and only one module
that defines the function "main".  This function is where control begins
when the program is executed, and unless otherwise specified the .EXE
file will be named after the module that "main" appears in.

	You can make a C program either by using the EXEC commands
COMPILE, LOAD, and SAVE, or by invoking KCC directly.  For example,
suppose "main" is defined in PROG.C, and the file SUBS.C contains
auxiliary subroutines.  Then,

To make:		EXEC command		Direct KCC invocation
-------			------------		---------------------
PROG.EXE from .C files:	@LOAD PROG,SUBS		@CC -q PROG SUBS
			@SAVE PROG

Just the .REL files:	@COMPILE PROG,SUBS	@CC -q -c PROG SUBS

PROG.EXE from .RELs:	Same as 1st		@CC PROG.REL SUBS.REL

	One advantage of using the EXEC commands is that they will
only compile those files which appear to require it, i.e. modules for
which the .C file is more recent than the .REL file.  The EXEC can also
translate TOPS-20 directory names into a format that the DEC loader will
understand, so that commands like @COMPILE <FOO>PROG are possible.
	However, KCC will do a similar form of conditional compilation
if the -q switch is set, for those modules specified without a .C
extension. (This may become the default someday.)  More commonly, the
EXEC at your site may not have been modified to know about KCC, or you
may wish to specify certain options to the compilation, or you may
just come from a UNIX background and feel more used to the direct
invocation method.
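
	The -q decision described above can be sketched roughly as
follows.  This is only an illustration and not KCC's actual code: the
names should_compile and is_rel_ext, the enum, and the use of plain
integers for file times (with 0 standing for "file does not exist")
are all assumptions made for the example, and the file argument is
assumed to have no directory part.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical outcomes for one file argument under -q. */
enum action { COMPILE, SKIP_COMPILE };

/* Compare an extension against ".REL", ignoring case
 * (case is not significant in filenames). */
static int is_rel_ext(const char *ext)
{
    const char *rel = ".REL";

    while (*ext != '\0' && *rel != '\0') {
        if (toupper((unsigned char)*ext) != *rel)
            return 0;
        ext++;
        rel++;
    }
    return *ext == '\0' && *rel == '\0';
}

/* name:     file argument as typed, e.g. "foo", "bar.c", "arf.rel"
 * c_time:   modification time of the .C file (0 if missing)
 * rel_time: modification time of the .REL file (0 if missing) */
enum action should_compile(const char *name, long c_time, long rel_time)
{
    const char *ext = strrchr(name, '.');

    if (ext != NULL) {
        if (is_rel_ext(ext))
            return SKIP_COMPILE;  /* .REL files go straight to the loader */
        return COMPILE;           /* explicit extension: always compile */
    }
    /* No extension: compile only if the source is newer than the .REL. */
    return (c_time > rel_time) ? COMPILE : SKIP_COMPILE;
}
```

With these rules, "cc -q foo bar.c arf.rel" compiles FOO.C only when it
is newer than FOO.REL, always compiles BAR.C, and never compiles ARF.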

<1 Direct Invocation - Compiler switches>

	The KCC compiler switches are intended to resemble those of the
UN*X "cc" command as closely as possible.  If you are familiar with these,
you can probably use KCC instinctively.  The command line is broken up into
argument strings each separated by a space (NOT by a comma).  If an argument
string starts with a "-", it is a switch, otherwise it is a filename.
Case is significant in switches!
	Normally, if the filename as given exists, it is used
regardless of its form.  The exception is files with a ".REL"
extension, which are never compiled but are passed on to the linking
loader.  If a filename does not exist and appears to have no
extension, ".C" is added.  This feature is primarily useful with the
-q switch as it requests conditional compilation.  Case is not
significant in filenames.

	If none of -c, -E, or -S are given as switches, KCC will invoke
LINK after compilation and an executable file (*.EXE) will be produced.

	The ordering of switches and filenames, in general, does not
matter; all switches are processed before compiling starts.  However,
note that filenames and libraries will be compiled and/or loaded in
the order given, and -I paths will also be scanned in the order given.

	It is possible to specify KCC switches while giving a
COMPILE-class command to the EXEC, if your EXEC recognizes the switch
/LANGUAGE-SWITCHES.  The argument to this EXEC switch should be a
double-quoted string which starts with a space.  For example:
	@compile foo /laNGUAGE-SWITCHES:" -m -d=sym"

------------------------------------------------------------------------
The following are the available compiler switches, in alphabetical order.
They are the same as those used by UN*X "cc", except where marked with
a "*" -- these are mainly of interest to KCC implementors.

* -A<file> Specify a file name for the assembler header file (included
	at the start of all assembler output).
  -c	Compile and assemble, but don't link (produce *.REL).
  -C	Retain comments in preprocessor (only useful with -E).
* -d	Debugging output.  Same as -d=all.  Generates many debug files.
* -d=<fs>	Debugging fine-tuning.
	<fs> are flag names of particular kinds of debug output files.
	The names can be abbreviated.  Prefixing the name with a
	'+' turns it on; '-' turns it off.  All flags are initially
	assumed off.  Current flags are:
		parse	Parse tree output (*.DEB)
		pho	Peep-Hole Optimizer output (*.PHO) - HUGE!!!
		sym	Symbol table output (*.CYM)
		all	All of the above
	E.g. "-d=parse+sym" == "-d=all-pho"
  -D<ident> Define following ident to "1" or string after '='.
	E.g. "-DMAXSIZE=25".  Several of these may be specified.
  -E	Run source only through preprocessor, to standard output.
* -H<path> Specify a non-standard location for <>-enclosed #include files.
  -i	Loader: load code for multi-section (extended addressing) operation.
  -I<path> Supply a search path for doublequoted #include files.
	Several of these may be specified, and will be searched in
	that order.
* -L<path> Loader: Specify a non-standard location for library files.
* -L=<str> Loader: Specify an arbitrary string argument to the loader.
	Note that the syntax does not permit spaces to be included.
	Several of these may be given.
  -lnam	Loader: Specify library filename for loader.  The "nam"
	argument is used to construct the filename LIBnam.REL in the
	library directory path and this is searched when encountered
	in the specifications.
* -m	Use MACRO rather than FAIL.  Semi-obsolete, same as -x=macro.
  -O	Optimize (no-op, defaults on).  Same as -O=all.
* -O=<fs>	Optimization fine-tuning.  Mainly for debugging.
	<fs> are flag names of particular kinds of optimizations.
	The names can be abbreviated.  Prefixing the name with a '+' turns
	it on; '-' turns it off.  All flags are initially assumed off,
	so to ask for no optimization use -O= (same as -O=-all).
	Current flags are:
		parse	Parse tree optimization
		gen	Code generator optimizations
		object	Object code (peephole) optimizations
		all	All of the above
	E.g. "-O=parse+gen" == "-O=all-object"
  -o=<file>	Specify output filename for the executable image.
	For UN*X-compatibility kicks, "-o <file>" also works.
* -P=<fs> Portability level specifications.  Several switches may be given in
	a format similar to that for -d and -O.  The <fs> flags
	specify the C implementation level that the compiler should use:
		base	Base level C -- most portable and restricted
		carm	H&S CARM level -- full implementation
		ansi	ANSI C draft level (only partly effective)
	Only one of the previous 3 is allowed, plus an optional:
		kcc	Permit KCC-specific extensions to the selected level.
	The default is "ansi+kcc" if -P is not given.  -P alone is
	interpreted as "base".
* -q	Conditional compilation.  All file specs without an extension will
	only be compiled if the .C file is more recent than the .REL file.
	For example, "cc -q foo bar.c arf.rel"
		compiles FOO.C if it is more recent than FOO.REL,
		always compiles BAR.C, and never compiles ARF.
  -S	Don't assemble (produce *.FAI or *.MAC, plus *.PRE)
  -U<ident> Undefine following identifier.  All -U switches are processed
	before any -D switches.  Only __FILE__ and __LINE__ are predefined.
* -v	Verbose - same as "-v=all".
* -v=<fs> Verbosity switches, similar to -d and -O.
		fundef	- print function names as they are defined (not yet).
		stats	- show statistics for run
		load	- show command string given to loader (if any)
  -w	Don't type out warnings.
* -x=<fs>	Cross-compile switches.  Several switches may be given in
	a format similar to that for -d and -O.  The <fs> flags
	specify an aspect of the "target machine" that the
	code should be compiled for (case is significant!):
		Target System:	tops20, tops10, waits, tenex, its
		Target CPU:	ka, ki, ks, kl0, klx
		Target Assembler: fail, macro, midas
		Target char size: ch7	(to compile with 7-bit chars)
	e.g. "-x=ka+tenex".  See "Cross-compiling".
------------------------------------------------------------------------
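
	The '+' and '-' flag syntax shared by -d, -O, -P, -v, and -x can
be pictured with the following sketch, which handles only the -d flag
names.  It is not taken from the KCC sources: the bit values, the table,
and the function names are invented for the example, and an ambiguous
abbreviation is simply resolved to the first table entry that matches,
which may not be what KCC does.

```c
#include <string.h>

/* Bit values for the -d flags; the numeric values are arbitrary. */
#define DBG_PARSE 01u
#define DBG_PHO   02u
#define DBG_SYM   04u
#define DBG_ALL   (DBG_PARSE | DBG_PHO | DBG_SYM)

static const struct {
    const char *name;
    unsigned bits;
} dbg_flags[] = {
    { "parse", DBG_PARSE },
    { "pho",   DBG_PHO },
    { "sym",   DBG_SYM },
    { "all",   DBG_ALL },
};

/* Look up a possibly abbreviated flag name of length len.
 * Returns 0 if nothing matches; the first matching entry wins. */
static unsigned lookup_flag(const char *s, size_t len)
{
    size_t i;

    if (len == 0)
        return 0;
    for (i = 0; i < sizeof dbg_flags / sizeof dbg_flags[0]; i++) {
        if (len <= strlen(dbg_flags[i].name)
            && strncmp(s, dbg_flags[i].name, len) == 0)
            return dbg_flags[i].bits;
    }
    return 0;
}

/* Parse the text after "-d=", e.g. "parse+sym" or "all-pho". */
unsigned parse_debug_flags(const char *s)
{
    unsigned result = 0;   /* all flags are initially off */
    int turn_on = 1;       /* a first name with no prefix is turned on */

    while (*s != '\0') {
        size_t len;

        if (*s == '+') {
            turn_on = 1;
            s++;
        } else if (*s == '-') {
            turn_on = 0;
            s++;
        }
        len = strcspn(s, "+-");          /* length of the flag name */
        if (turn_on)
            result |= lookup_flag(s, len);
        else
            result &= ~lookup_flag(s, len);
        s += len;
    }
    return result;
}
```

Under these assumptions "parse+sym" and "all-pho" yield the same set of
flags, and an empty string (as in "-O=") leaves every flag off.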

NOTE: <path> syntax
	The -I, -H, and -L switches all take a "path" as argument.
This is interpreted as specifying both a prefix and a postfix string
which are used to sandwich a partial filename from some other source
(#include "xxx", #include <xxx>, and -lxxx respectively).  The two
strings are separated by the character '+' (this is site dependent
however).  Thus, for example:
	Specification		Prefix	Postfix		Sample with "xxx"
	-I+[SYS,NEW]		""	"[SYS,NEW]"	xxx[SYS,NEW]
	-HNEWC:			"NEWC:"	""		NEWC:xxx
	-LPS:<C>LIB+.REL    "PS:<C>LIB"	".REL"		PS:<C>LIBxxx.REL
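
	The prefix/postfix sandwich can be sketched as below.  The
function name expand_path and its buffer-based interface are assumptions
made for this example; only the '+' separator and the three sample
specifications come from the table above.

```c
#include <stdio.h>
#include <string.h>

/* Build a filename from a -I/-H/-L path specification and a partial name.
 * spec: the text after the switch letter, e.g. "PS:<C>LIB+.REL"
 * name: the partial filename, e.g. "xxx"
 * Returns 0 on success, -1 if the result does not fit in out. */
int expand_path(const char *spec, const char *name, char *out, size_t outsz)
{
    const char *sep = strchr(spec, '+');        /* prefix/postfix separator */
    size_t prelen = sep ? (size_t)(sep - spec) : strlen(spec);
    const char *post = sep ? sep + 1 : "";      /* no '+': empty postfix */
    int n = snprintf(out, outsz, "%.*s%s%s", (int)prelen, spec, name, post);

    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

For instance, expanding "xxx" against "NEWC:" gives "NEWC:xxx", and
against "+[SYS,NEW]" gives "xxx[SYS,NEW]".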

NOTE: Obsolete features

	The following switches and interpretations are obsolete.  They will
likely be flushed altogether, but are documented here for historical reasons:

	* -n	same as -O= (no optimization)
	* -s	same as -d=sym (output *.CYM symbol table dump)

	It used to be a feature that "simple" switches, which did not
take any arguments, could be lumped together into a single switch
string.  For example, "cc -mS test" is the same as the more standard
"cc -m -S test".  However, use of this feature is discouraged; the
potential confusion and inconsistency don't seem to be worth it.

NOTE: Switch Portability

	The following lists the switches implemented by other systems
but not by KCC.  This information seems useful and this is a convenient
place to put it.  Other-system switches that KCC implements are not included.
Switches which mean one thing to KCC but another thing to other systems
are included.  Currently only 4.2BSD switches are listed.
	-g   Output additional symtab info for dbx(1), pass -lg to ld(1)
	-go  Ditto for sdb(1).
	-p   Output profiling code for prof(1).
	-pg  Ditto but for gprof(1).
	-R   Passed on to as(1) to make initialized vars shared and read-only.
	-Bpath Use substitute compiler pass programs specified by <path>.
	-t[p012] Use only the pass programs from -B designated by -t.
ld(1) switches:
	A, D, d, e, l, M, N, n, o, r, S, s, T, t, u, X, x, y, z

<1 User Program - Command line interpretation>

	The C runtime startup interprets the command line to a C program
in a consistent fashion, and supports (1) argument string passing,
(2) I/O redirection, (3) pipes, and (4) background processing.  There
is also provision for (5) suppressing this default command line
interpretation.

(1) Command line arguments:
	Command line arguments can be passed to the main() function
from the EXEC or monitor in the UN*X fashion.  That is, main() is
given two arguments, the first of which is an argument count and
the second a pointer to an array of char pointers, each of which
constitutes an argument.  Thus it is conventional to declare the
parameters to main() in this way:
		main(argc, argv)
		int argc;
		char **argv;
For example, if you have a C program saved as PROG.EXE, then invoking
PROG with the command:
		@PROG one two
will set argc to 3, and the strings that argv points to will
be "PROG", "one", and "two".  Note that arguments are separated by
blanks and not by commas!
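
	As a small illustration of the argument convention, the helper
below prints each argument string.  The name list_args is made up for
this example, and the code is written with a modern prototype rather
than the old-style parameter declarations shown above.

```c
#include <stdio.h>

/* Print every argument on its own line.  Returns the number of
 * arguments that follow the program name. */
int list_args(int argc, char **argv, FILE *out)
{
    int i;

    for (i = 0; i < argc; i++)
        fprintf(out, "argv[%d] = %s\n", i, argv[i]);
    return (argc > 0) ? argc - 1 : 0;
}
```

A main() that simply calls list_args(argc, argv, stdout) would, for the
command "@PROG one two", list the three strings "PROG", "one", and "two".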

(2) I/O redirection:
	I/O redirection of stdin and stdout is also supported.
Thus:
  1.  @PROG <foo	; will take all stdin input from the file "foo".
  2.  @PROG >bar	; will send all stdout output to a new file "bar".
  3.  @PROG >>log	; will append all stdout output to the old file "log".
These can be combined:
      @PROG <foo >bar	; does both 1 and 2. (from "foo", to "bar")
However,
      @PROG <foo>bar	; interprets "<foo>bar" as a single argument string,
			; because it looks like a <directory>filename.
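
	One way to picture how these cases are told apart is the
classifier below.  It is a guess at the rule, not the runtime's real
code: the names arg_kind and classify_arg are invented, and the test
for "looks like a <directory>filename" is assumed to be nothing more
than "starts with '<' and contains a later '>'".

```c
#include <string.h>

enum arg_kind { ARG_PLAIN, ARG_STDIN, ARG_STDOUT, ARG_APPEND };

/* Classify one blank-separated command line argument. */
enum arg_kind classify_arg(const char *arg)
{
    if (arg[0] == '<') {
        /* "<foo>bar" looks like <directory>filename: leave it alone. */
        if (strchr(arg + 1, '>') != NULL)
            return ARG_PLAIN;
        return ARG_STDIN;                 /* "<foo" */
    }
    if (arg[0] == '>')
        return (arg[1] == '>') ? ARG_APPEND   /* ">>log" */
                               : ARG_STDOUT;  /* ">bar"  */
    return ARG_PLAIN;
}
```

This is why a blank is needed between "<foo" and ">bar" when both
redirections are wanted.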

(3) Pipes:
	On TOPS-20 systems which implement the PIP: device (developed at
Stanford), pipes can also be supported, so that a command such as:
	@PROG | BAZ
causes the stdout of PROG to be redirected to the stdin of BAZ.

(4) Background processing:
	Again, provided the EXEC has been suitably modified, a
command line ending in an ampersand ('&') will cause the program
to be run in the background, while the user goes on to do other
things:
	@PROG one two&

(5) Suppressing the command line interpretation:
	In certain unusual circumstances it may be necessary to suppress
the default command line interpretation, so that the user program itself
can handle it in a different way.  For information on how to do this,
see the include file <urtsud.h>.

<1 C as implemented by KCC>

	KCC is intended to conform to the description of C as
specified by Harbison & Steele's "C: A Reference Manual".  It is
strongly recommended that all C programmers use this book in preference
to Kernighan & Ritchie.  As the ANSI C standard becomes more concrete,
KCC will likewise evolve to conform to this standard; some of the
proposed ANSI features are already implemented.

	The -P (portability) switch controls the exact level at which
KCC attempts to compile a C program.  There are three possible levels,
and only one of these may be in effect:
	ANSI - permits all currently implemented ANSI constructs to be
		recognized and compiled.  This is basically CARM level
		plus some new things; KCC does not yet fully
		implement the ANSI draft standard, as it keeps changing.
		Users should be cautious about using ANSI features.
	CARM - Disables all ANSI-added features which are not in Harbison
		and Steele's CARM book.  KCC fully implements this level.
	BASE - The most restrictive level.  This is basically the same as
		CARM, but will make KCC complain about some constructs
		or usages that are likely to be unimplemented by some
		other compilers.
	In addition, there is a "KCC extensions" flag which is independent
of the level; when enabled, this permits a number of KCC-specific extensions
to be recognized regardless of whatever level is in effect.
	Normally KCC uses the ANSI level with KCC extensions enabled;
this corresponds to "-P=ansi+kcc".

	The next several pages document KCC's implementation of C by
following the general ordering of H&S and pointing out aspects where
KCC differs or describing which of several optional behaviors KCC
implements.  Any ANSI features which are implemented are also described.

<2 KCC Lexical Elements>		[H&S 2, "Lexical Elements"]

	KCC uses the US ASCII character set.  There is provision for
using a separate target character set, different from the source set,
but currently the only such is a target set for WAITS ASCII.

	KCC has no maximum line length.  Error messages will quote
only the most recent part of an offending line if it is longer than 80
characters.

	KCC is standard in that nested comments are not supported.  If
the sequence "/*" is seen within a comment, a warning message will be
printed just in case the user neglected to terminate the previous
comment.

<2 Identifier names>

	KCC adheres to the standard definition of C identifier syntax,
allowing the character "_", the letters A-Z and a-z, and the digits
0-9 as valid identifier characters.  Identifiers may have any length,
but only the first 31 characters (case sensitive) are unique during
compilation, which conforms to the ANSI minimum.  This applies to all
of the following name spaces (as per H&S 4.2.4):
	Macro names
	Statement labels
	Structure, union, and enum tags
	Component (member) names
	Ordinary names:
		Enum constants and typedef names.
		Variables (see discussion of storage classes).

	However, the situation is different for symbols which must be
exported to the PDP-10 linker.  Such names are truncated to 6
characters and case is no longer significant.  The character '_'
(underscore) is transformed into '.' (period); the PDP-10 software
allows the additional symbol characters '$' and '%', but there is no
way to generate these with C unless special provision is made; see
#asm and '`' under "KCC Extensions".  See also the discussion of
exported symbols.
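
	The transformation applied to exported names can be sketched as
follows.  The function name export_name is hypothetical, and folding to
upper case is an assumption; the text above only says that case stops
being significant.

```c
#include <ctype.h>

/* Convert a C identifier to the form seen by the linker: at most
 * 6 characters, case folded, and '_' replaced by '.'.
 * out must have room for 6 characters plus the terminating NUL. */
void export_name(const char *ident, char out[7])
{
    int i;

    for (i = 0; i < 6 && ident[i] != '\0'; i++) {
        unsigned char c = (unsigned char)ident[i];

        out[i] = (c == '_') ? '.' : (char)toupper(c);
    }
    out[i] = '\0';
}
```

Note the consequence: identifiers such as "buffer_one" and "Buffer_Two"
are distinct to the compiler but both become "BUFFER" for the linker.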

<2 Reserved Words>
	KCC has a number of additional reserved words depending on
the portability level setting.  When KCC extensions are allowed, as
is normally the case, the following keywords exist:
		"asm"	- used for assembly code inclusion.
		"entry"	- only in certain special circumstances.
			See the discussion of libraries and entry points.
	 When ANSI level is in effect (again, the normal case), there
are three additional reserved words.  All can be considered type
modifiers:
		"signed"	Indicates integer type is signed.  Implemented.
		"const"		Constant object (recognized but unimplemented)
		"volatile"	Volatile object (recognized but unimplemented)

<2 Constants>

	The types "int" and "long" are the same -- one PDP-10 word of
36 bits, with the high bit a sign bit.  Thus, the largest positive integer
constant is 0377777777777, or 34,359,738,367.
	The type "double" is represented by a PDP-10 hardware format
standard range double precision number (two words).  On KA processors
the format is slightly different.  The decimal range is from 1.5e-39
to 1.7e38, with eighteen digits of precision.
	Character constants have type "int".  Multicharacter constants
are non-standard and not supported.  Because characters are 9-bit bytes,
numeric escape code values can range from '\0' to '\777'.  Hexadecimal
character constants are not permitted.
	String constants are stored as 9-bit byte strings, and do not
share storage.  That is, two instances of the constant string "foo"
will be stored in two distinct places.  On TOPS-20, string constants
are put in the "pure" segment of a program, but this does not actually
enforce any read-only restrictions.
	If the portability level is ANSI then adjacent string constants
are concatenated into a single string.  Thus, "foo" "bar" is the same
as "foobar".
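
	The limits quoted in this section follow directly from the bit
widths, as the arithmetic below shows.  The helper names are invented
for the example, and a host with 64-bit integers stands in for the
36-bit PDP-10 word.

```c
#include <stdint.h>

/* Largest value of an unsigned field that is `bits` wide (bits < 64). */
uint64_t max_unsigned(int bits)
{
    return (UINT64_C(1) << bits) - 1;
}

/* Largest positive value of a two's-complement integer that is
 * `bits` wide: one bit is the sign, the rest are all ones. */
int64_t max_signed(int bits)
{
    return (int64_t)((UINT64_C(1) << (bits - 1)) - 1);
}
```

For a 36-bit int, max_signed(36) is 2**35 - 1, i.e. 0377777777777; for
a 9-bit char, max_unsigned(9) is 0777, the largest numeric escape.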

<2 Preprocessor directives>	[H&S 3, "The C Preprocessor"]

All standard C preprocessor directives are supported as described in
Harbison and Steele, including #elif and the "defined" operator.  This
page specifies how KCC behaves for situations which are implementation
dependent.

Lexical Conventions: [H&S 3.2]
	Preprocessor commands must have '#' as the first character on
the line; whitespace cannot precede it.  KCC allows whitespace between
the '#' and the command name (this is non-portable).  Formal parameter
names ARE recognized within character and string constants in macro
body definitions.  Comments are treated as whitespace and not passed
on to anything else; however, KCC will print a "Nested comment"
warning if it encounters a comment which contains "/*".  This serves both
to catch slightly non-portable usage (see H&S 2.2) and to detect
places where the user may have accidentally omitted a "*/".

Defining Macros: [H&S 3.3]
	When defining a macro, formal parameter names are recognized
within string and character constants, and therefore no check is made
for lexical correctness of such constants; this will change when the
ANSI standard firms up.  Any comments and whitespace in the macro body
are replaced by a single space.  KCC permits an argument token list
(arguments to a macro call) to extend over multiple lines.  Arguments
to a call are converted in a fashion similar to that for macro bodies
-- comments and whitespace are replaced by a single space.  Newlines
within an argument list are also considered whitespace.  However,
string and character constants in arguments are treated as tokens, and
their contents are not scanned for macro names.

Predefined Macros: [H&S 3.3.4]
	__LINE__ expands into the current decimal line number.	(BSD, ANSI)
	__FILE__ expands into the current source filename.	(BSD, ANSI)
	__DATE__ expands into the date of compilation.
	__TIME__ expands into the time of compilation.
		The date/time of compilation is cleared at the start of
		compilation for each source file, and is set by the first
		occurrence of __DATE__ or __TIME__ within that source file.
	__STDC__ expands into the ANSI standard level # (not implemented yet).

The first two macros are furnished for compatibility with 4.2BSD; the
next two were added from ANSI.  __STDC__ will only be added when -P=ansi
is a full implementation.  There are no other predefined macros; use the
file <c-env.h> for standard KCC environment definitions.

Undefining and Redefining Macros: [H&S 3.3.5]
	It is not an error to redefine an already defined macro, but a
warning message will be output unless the new macro definition is the
same as the old definition; i.e. redundant definitions are allowed.
There is no macro definition stack, i.e. definitions are not
pushed/popped by #define/#undef.  Attempting to define a macro named
"defined" will cause an error, since otherwise it would conflict with
the "defined" operator.

Converting Tokens to Strings: [HS2 3.3.8]
	KCC does recognize formal parameter names within string and
character constants.  This will change as the ANSI standard shapes up.

File Inclusion: [H&S 3.4]
	Included files may be nested to 10 levels.  Macro expansion
is done on the line if the filename does not start with '<' or '"'.
Filenames may contain '>' or '"' characters.
	#include <filename> looks only in the standard directory.
	#include "filename" looks first in DSK:,
		then in the -I paths in order of specification (left to right),
		then in the standard directory.
The standard directory for include files is C: on TOPS-20, <KC> on
TENEX, and [SYS,KCC] on WAITS, but this is site dependent in any case.

Conditional Compilation: [H&S 3.5] #if,#else,#endif,#elif,#ifdef,#ifndef
	The "defined" operator is recognized only within #if and #elif
expressions.  Note that neither #elif nor "defined" is in K&R, and
H&S is used as the reference here; neither will be recognized unless
the portability level is at least "carm".  Within the body of a failing
conditional, only other conditional commands are recognized; all others,
even illegal commands, are ignored.
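
	A short example of #elif and "defined" in use is given below.
The macro names TARGET_TOPS20, TARGET_TENEX, TARGET_WAITS, and
STD_INCLUDE_DIR are purely hypothetical (KCC predefines nothing of the
sort); the directory strings are the standard include locations
mentioned under "File Inclusion".

```c
/* Pretend this was supplied on the command line as -DTARGET_TOPS20. */
#define TARGET_TOPS20 1

#if defined(TARGET_WAITS)
#define STD_INCLUDE_DIR "[SYS,KCC]"
#elif defined(TARGET_TENEX)
#define STD_INCLUDE_DIR "<KC>"
#elif defined(TARGET_TOPS20)
#define STD_INCLUDE_DIR "C:"
#else
#define STD_INCLUDE_DIR ""      /* unknown target: no default */
#endif

/* Return the directory chosen by the conditionals above. */
const char *std_include_dir(void)
{
    return STD_INCLUDE_DIR;
}
```

Only the branch whose condition holds is compiled; the bodies of the
failing branches are skipped apart from their conditional commands.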

Explicit Line Numbering: [H&S 3.6] #line
	The information from #line will be used in KCC error messages.
Macro expansion is performed on the line.  Like all other
preprocessor commands, #line is eliminated and not passed on when
using the -E switch.  With regard to "#" alone at the start of a line,
remember that whitespace is allowed between the "#" and the command
name, thus KCC will not recognize a "#" alone as a synonym for "#line".
If there is no command name, the line is simply ignored without error.

KCC-specific Commands:
	#asm, #endasm
	These two commands cause the text delimited by them to be
macro-expanded (as for -E) and converted into an "asm()" expression
for direct inclusion in the output assembly language file.  This
currently only works inside functions.  This feature is very likely to
change, and should only be used where absolutely necessary.  Keep the
code simple, as someday KCC may want to parse it.
See "KCC Extensions" for additional details.

<2 Storage classes>		[H&S 4.3  "Storage Class Specifiers"]

KCC implements the standard storage classes of auto, extern, register,
static, and typedef (H&S sec 4.3), with the following notes:

REGISTER declarations are currently equivalent to AUTO.  KCC does not
assign variables to registers, and optimizations are performed without
using the "hint" given by REGISTER.  AUTO variables are almost always
more efficient, and in any case they are easier to implement.

KCC uses the "omitted-EXTERN" solution to deal with the question of
top-level definitions versus references (H&S sec 4.8).  That is,
omitting "extern" from a top-level declaration has the effect of
indicating that this is a defining declaration rather than a referencing
declaration.

Duplicate Declarations:
	As per H&S 4.2.5, KCC permits any number of external
referencing declarations, if the types are the same.  However, because
KCC treats omitted-extern declarations as defining declarations, these
references must all have an explicit "extern".  Likewise, an external
reference may be later followed by a defining declaration.
	KCC has additional special handling for declarations of
functions, because it can always be determined whether a function
declaration is a reference or a definition.  Any number of "static"
referencing declarations are allowed.  Conflicts are resolved as
follows: If an implicit external reference is followed by a static
reference or definition, KCC will assume the function is static.  It
is an error if the first reference has an explicit "extern".  It is
also an error if a static reference is followed by an external
reference or definition.  In either case compilation proceeds as if
the function were static.
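
	The permitted combinations can be illustrated with the fragment
below.  The identifiers are made up for the example, and it uses
"(void)" parameter lists so that it compiles cleanly with a current
compiler; with an older compiler the parentheses may need to be left
empty.

```c
/* Any number of referencing declarations, each with an explicit
 * "extern", may precede the definition. */
extern int call_count;
extern int call_count;

/* Omitted "extern": this is the defining declaration. */
int call_count = 0;

/* Static referencing declarations may be repeated as well. */
static int bump(void);
static int bump(void);

/* External function, callable from other modules. */
int count_call(void)
{
    return bump();
}

/* The static definition matching the references above. */
static int bump(void)
{
    return ++call_count;
}
```

Reversing the order for bump(), that is, giving an explicit "extern"
reference first and a static definition later, is the error case
described above.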

<2 Initializers>				[H&S 4.6 "Initializers"]

	KCC adheres to H&S in all required respects.  The following
notes cover points which H&S describes as implementation dependent:

Optional braces are allowed for all non-aggregate initializers.  It is
permitted to drop braces from initializer lists under the rules
described in H&S 4.6.8 (HS1 4.6.9), but KCC attempts to perform
extremely stringent checking on the "shape" of initializers, and will
complain about too many or too few braces.

FLOATING-POINT initializers may be of any arithmetic type.  KCC performs
compile-time floating-point arithmetic, so initializers for static and
external variables may use any constant arithmetic expression.

POINTER initializers, as described in H&S, must evaluate to an integer or
to an address plus (or minus) an integer constant.

ARRAY initializers are currently not allowed for automatic arrays.
This will change as ANSI permits it.

ENUMERATION initializers may use any integer (as well as enum) expression.

STRUCTURE initializers can initialize bit fields with any integer expression.
As for arrays, automatic and register structures cannot be initialized.
This will change as ANSI permits it.

UNIONS currently cannot be initialized.  This will change as ANSI
permits it.

<2 Exported symbols>			[H&S 4.8 "External Names"]

Symbols which are exported to the assembler file have special restrictions
imposed by current PDP-10 software, which only recognizes 6-character
symbols from the set A-Z, 0-9, '.', '$', and '%'.  In particular, case
is not significant.

Also, there is a distinction between symbols exported only to the assembler
and those exported both to the assembler and the linker.  While there is
technically no reason that any symbol has to be given to the assembler if
it is not also meant for the linker, in practice it is convenient for
debugging to have some "local" symbol definitions available so that DDT
can access them.

Here is a breakdown of export status by storage class:

typedef	 = Exports nothing.  (Not a real storage class)
auto	 = Exports nothing.  (Local stack variables use an internal offset)
register = Exports nothing.
static	 = If not global scope (i.e. is within a block) then nothing exported;
		an internally-generated label is used.
	If global (top-level, within no block) then exported to assembler only.
		A label is made, but no INTERN or ENTRY statement.
extern	 = Always exported to both assembler and linker.
	 Omitted-extern:  a DEFINITION.  A label, INTERN, and ENTRY are output.
	 Explicit-extern: a REFERENCE.  An EXTERN statement is output, but only
		if the symbol is actually referenced by the code.

Omitted-Extern:
	External declarations with no "extern" storage class
explicitly given are assumed to be external DEFINITIONS.  A defined
extern symbol will have its own label, plus an INTERN statement
telling the assembler that this is an externally visible symbol, plus
an ENTRY statement which allows library routine search to find this
symbol.  ENTRY statements will be put into the .PRE output file rather
than the main output file, since the assembler will need to scan them
prior to anything else.

Explicit-Extern:
	If an "extern" is explicitly given, the compiler assumes that
it is simply a REFERENCE.  Nothing will be done unless the symbol is
actually referenced by the code, in which case an EXTERN line will be
generated in the assembler output for that file.  The reason for the
reference count check is that each assembler EXTERN constitutes a
library search request which must be satisfied by a module with the
corresponding symbol declared as an ENTRY.  Unless this is only done
for actual references, the many superfluous declarations found in *.h
files will tend to cause many unneeded library modules to be loaded.

Static symbols:
	Note that global static symbols are passed on to the assembler
even though this is not necessary; an internally-generated label could
be used just as well.  The main reason this is done is to facilitate
debugging with DDT; otherwise it could be difficult to identify static
functions when looking at the machine instructions.  This may cause
problems if identifiers which are otherwise distinct become identical
as a result of the conversion to a 6-char PDP-10 symbol.

However, a symbol declared static within a given source file will
never be visible from another file that you may link later with it.  For
example, a function declared as

	static char *function()
	{
	   ...
	}

will only be visible from other functions within the same source file.
This allows several modules to have functions with the same name
modulo the six character limit, as long as no two of the functions are
both extern.  It is STRONGLY recommended for multi-module programs
that you declare as many functions as possible to be "static".

<2 Libraries and Entries>

REL files to be converted by MAKLIB into object libraries must have
any external symbols declared with ENTRY rather than merely INTERNing
them, and this declaration must be at the start of the REL file.  In
order to do this, KCC generates a *.PRE "prefix" output file in
addition to the *.FAI or *.MAC output file, and invokes the assembler
in such a way that the PRE file is assembled before the main file.
This file contains ENTRY statements and any other predeclarations that
are needed before the assembler sees the actual code.  Normally the
user will never see this file, but if the -S switch is used then it
will be left around as well as the FAI/MAC file.  Note that if running
the assembler manually on the FAI/MAC file, you must invoke it with
a command line like this:
	[@]FAIL				[@]MACRO
	[*]FOO=FOO.PRE,FOO.FAI		[*]FOO=FOO.PRE,FOO.MAC


COMPATIBILITY INFO:
	For compatibility, KCC will continue to recognize an "entry"
keyword for some time to come.  The following describes the obsolete
syntax:

To declare an entry, use the "entry" keyword at the start of the source,
before any other declarations:

    "entry" ident ["," ident ...] ";"

i.e., the keyword "entry", followed by a list of identifiers separated
by commas, followed by a semicolon.  This is passed on essentially
verbatim to the assembler, and has no other effect on compilation.  It
should be used at the start of any runtimes or other file intended for
a library, on all variables and functions that should be visible as
entries in the library.

Note that it should still be safe to use "entry" as a non-keyword; if
used other than at the start of the file it will be treated like any
other normal identifier.

To repeat: the "entry" statement is no longer necessary.  It should not
be used in new code, and should be removed from old code.

<2 Types>				[H&S 5 "Types"]

STORAGE UNITS:
	A KCC storage unit (what "sizeof" returns) is a 9-bit byte, and
there are 4 of these in each 36-bit PDP-10 word, ordered left to right
from most significant to least significant.
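
	The byte ordering can be modelled on any modern machine.  The
following is only a sketch: it simulates a 36-bit word in a 64-bit
host integer, and the names kcc_pack9 and kcc_byte9 are hypothetical
helpers invented for illustration, not part of KCC or its library.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four 9-bit bytes into one simulated 36-bit word.
 * b0 lands in the leftmost (most significant) 9 bits. */
uint64_t kcc_pack9(unsigned b0, unsigned b1, unsigned b2, unsigned b3)
{
	assert(b0 <= 0777 && b1 <= 0777 && b2 <= 0777 && b3 <= 0777);
	return ((uint64_t)b0 << 27) | ((uint64_t)b1 << 18)
	     | ((uint64_t)b2 << 9) | (uint64_t)b3;
}

/* Fetch byte number i (0 = leftmost, 3 = rightmost) from the word. */
unsigned kcc_byte9(uint64_t word, int i)
{
	assert(i >= 0 && i < 4);
	return (unsigned)((word >> (9 * (3 - i))) & 0777);
}
```

	Since each 9-bit byte is exactly three octal digits, packing
the bytes 0101, 0102, 0103, 0 in this model gives the word
101102103000 (octal).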

INTEGERS:
	KCC's integer types have the following sizes:
		Type	Bits	"sizeof" value
		char	9	1
		short	18	2	(PDP-10 halfword)
		int	36	4	(PDP-10 word)
		long	36	4	(PDP-10 word)

All of these types may be explicitly declared as "signed" if ANSI
level is in effect.  Single variables declared as "char" or "short"
are stored right-justified into a full word; only when packed into an
array or structure are they stored as 9-bit (or 18-bit) bytes, left to
right within each word.

UNSIGNED INTEGERS:
	Unsigned integers are fully implemented; any integer object
may be either "signed" or "unsigned", and both forms use exactly the
same amount of storage, with the high order bit considered the sign
bit (if the object is signed).  However, because the PDP-10 has
no instructions specifically for unsigned data, some operations are
slower for unsigned ints.
	Addition (+) and subtraction (-) are the same.
	== and != are the same.
	Left shift (<<) always uses the LSH instruction (logical shift).
	Right shift (>>) uses LSH for unsigned, ASH for signed operands.
		ASH is an arithmetic shift which propagates the sign bit.
	<,<=,>,>= are slightly slower for unsigned operands.
	Casts to floating-point are slower.
	Multiply (*) is also slightly slower.
	Divide (/) and remainder (%) are much slower.

CHARACTER:
	The plain "char" type is "unsigned char".  Sign extension is
done only if chars are explicitly declared as "signed char".  Normally
a char is 9 bits, although it is possible to compile code using a
7-bit assumption (see the section on char pointer hints).
	Old versions of KCC used to store the chars of a string
constant in 7-bit form, packed 5 to a word (ASCIZ format); this is no
longer the case and string constants are normally now full 9-bit char
strings.
	An extension to KCC provides five additional types of "char"
objects, specified as "_KCCtype_charN", where N is the number of bits
in the char and may be one of 6, 7, 8, 9, or 18.  All may be signed
or unsigned; their "plain" form is unsigned.  See the "KCC Extensions"
section for additional details.


FLOATING-POINT:
	The "float" type is represented by one word in the PDP-10
single precision floating point format; there is one bit of sign, 8
bits of exponent, and 27 bits of mantissa.
	The "double" type uses two words in the PDP-10 double
precision format.  (Note that for the KA-10 this is a software format
rather than the more usual hardware format.)  The exponent range is
approximately 1.5e-39 to 1.7e38 in both formats; single precision has
about 8 significant digits and double precision has 18.  See a PDP-10
hardware reference manual for details.
	KCC also supports the new ANSI "long double" type when ANSI
level is in effect.  Currently this is the same as "double" but this
will probably change on KL-10s to use "G" format floating point, which
has an exponent range of 2.8e-309 to 9.0e307 but only 17 significant
digits.
	The (double) type can represent all values of (long).  That
is, conversion of a (long) to a (double) and back to (long) results in
exactly the original value.
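
	The digit counts quoted above follow from the mantissa widths.
As a rough check, the sketch below computes floor(bits * log10(2)).
Only the 27-bit single precision width comes from this document; the
62-bit (double) and 59-bit ("G" format) widths used in the note that
follows are assumptions taken from general PDP-10 hardware
descriptions, and the function name is hypothetical.

```c
#include <assert.h>

/* Whole decimal digits carried by a binary mantissa of the given
 * width: floor(bits * log10(2)), with log10(2) taken as 0.30103. */
int mantissa_decimal_digits(int bits)
{
	assert(bits > 0 && bits < 1000);
	return (int)((long)bits * 30103L / 100000L);
}
```

	Under those assumptions, 27 bits gives 8 digits, 62 bits gives
18, and 59 bits gives 17.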

POINTERS:
	Pointers are always a single word, but can have two different
internal formats.  Pointers to chars, shorts, or bit fields are PDP-10 byte
pointers (local or one-word global); pointers to all other objects are
PDP-10 global word addresses.  Byte pointers point to the byte itself
rather than to the preceding byte, thus LDB instead of ILDB is done
to fetch the byte.
	It is very important to ensure that functions which return
values of (char *) be properly declared; likewise, any function
arguments which are expected to be (char *) must be cast to this if
necessary.  Operations which expect a char pointer will not work
properly when given a word pointer, and vice versa.  See the section
on "pointer hints" near the end of this file for additional information.
	The "NULL" pointer is represented internally as a zero word,
i.e. the same representation as the integer value 0, regardless of
the type of the pointer.  The PDP-10 address 0 (AC 0) is zeroed and
never used by KCC, in order to help catch any use of NULL pointers.

ARRAYS:
	The only special thing about arrays is that arrays of chars
consist of 9-bit bytes packed 4 to a word, and arrays of shorts have
18-bit halfwords packed 2 to a word; all other objects occupy at least
one word.

ENUMERATIONS:
	KCC treats enumeration types simply as integers.  In the words
of H&S 5.5 (HS1 5.6.1), KCC uses the "integer model" of enumerations,
which is what ANSI has adopted.

STRUCTURES and UNIONS:
	Structures and unions are always word-aligned and occupy a
whole number of words.  Unlike the case for other declarations of type
"char" or "short", adjacent "char" and "short" members in a structure
are packed together as for arrays.  Structures and unions may be
assigned, passed as function parameters, and returned as function
values.
	Bit fields are implemented; the maximum size of a bit field is
36 bits.  They may be declared as "int", "signed int", or "unsigned
int"; plain "int" bitfields are unsigned.  Fields are packed left to
right, conforming to the PDP-10 byte ordering convention.  It's too
bad that C does not allow pointers to bit fields, because the PDP-10
byte pointer instructions are perfectly suited to this application!
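
	Left-to-right packing can be pictured with a small simulation.
This is a sketch only, using a 64-bit host integer to hold a 36-bit
word; field_load36 and field_store36 are hypothetical names, and the
code does not claim to reproduce what KCC actually generates.

```c
#include <assert.h>
#include <stdint.h>

/* Extract a field of `size` bits that starts `offset` bits from the
 * LEFT edge of a simulated 36-bit word (offset 0 = the sign bit). */
uint64_t field_load36(uint64_t word, unsigned offset, unsigned size)
{
	unsigned p;
	uint64_t mask;

	assert(size >= 1 && size <= 36 && offset + size <= 36);
	p = 36 - offset - size;		/* bits to the right of the field */
	mask = (1ULL << size) - 1;
	return (word >> p) & mask;
}

/* Return `word` with the same field replaced by `value`. */
uint64_t field_store36(uint64_t word, unsigned offset, unsigned size,
		       uint64_t value)
{
	unsigned p;
	uint64_t mask;

	assert(size >= 1 && size <= 36 && offset + size <= 36);
	p = 36 - offset - size;
	mask = (1ULL << size) - 1;
	return (word & ~(mask << p)) | ((value & mask) << p);
}
```

	For a structure such as "struct { unsigned a:3; unsigned b:6; }",
this model puts "a" at offset 0 and "b" at offset 3, so storing a=5
and b=077 yields the word 577000000000 (octal).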

FUNCTIONS:
	As per H&S.  A pointer to a function is simply a word address.
For the gory details of function calls and stack usage, see the
"Internals" section.

TYPEDEFS:
	As per H&S.  With regard to 5.10.2 (HS1 5.11.1), KCC has no
problems with redefining typedef names in inner blocks.

<2 Type Conversions>		[H&S 6 "Conversions and Representations"]

Integer conversions:

	There are no representation changes when converting any
integer type to any other integer type of the same size.  Sign
extension and truncation are performed when necessary to convert from
one size to another.  Conversions from pointers are done as per H&S
6.2.3 (V1 6.3.4); a pointer is treated as an unsigned int and then
converted to the destination type using the integral conversion rules.

Floating-point conversions:

	Casting (float) to (double) or (long double) retains the
same value.  However, (double) to (long double) may lose one digit
of precision, depending on the implementation chosen for (long double).
	A cast to (float) of an int may lose some precision,
although a char or short can always be fully transformed.  (double)
can retain the exact value of an int or long int, which can be
restored to its original value by converting back to int.
	Casting an unsigned integer to a floating-point value always
results in a positive number.

Pointer conversions:

	There is a great variety of pointer conversions possible; however,
you can make sense of them if you simply note the following Three Laws of
Pointers:
	(1) Nihil ex nihilis -- a NULL pointer always remains NULL.
	(2) Smaller is finer -- a pointer to any object can always
		be converted into a pointer to a SMALLER (or equal-sized)
		object, without losing any information.  Converting it back
		to the original type restores the original value.
	(3) Bigger is blunter -- converting a pointer to any object to
		a pointer to a LARGER object will force the pointer to
		have an alignment suitable for that of the larger type;
		any fine details of positioning within the new type are lost,
		and the original pointer cannot be recovered (unless it
		was already properly aligned to begin with).  The new
		object pointed to will completely enclose the smaller
		object.

Specifically:
	Chars are aligned on 9-bit byte boundaries, shorts on halfword
boundaries, and all other data types on word boundaries (with the
exception of bitfields and the _KCCtype_charN types).  Converting any
pointer to a (char *) and back is always possible, as a char is the
smallest possible object.  If the original object was larger than a
char, the char pointer will point to the first byte of the object; this
is the leftmost 9-bit byte in a word (if word-aligned) or in the halfword
(if a short).

	A cast to (int *) of a char pointer produces an address that
points to the word that the char pointer indicates, regardless of
which byte in the word was being pointed at.
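
	The second and third laws can be sketched with a toy model in
which a char pointer is a word address plus a byte position.  The
structure and function names below are hypothetical and say nothing
about the real byte pointer encoding; they only show which
information survives each cast.

```c
#include <assert.h>

/* A simulated char pointer: a word address plus a byte position
 * (0 = leftmost 9-bit byte, 3 = rightmost). */
struct sim_charptr {
	unsigned long word;
	unsigned byte;
};

struct sim_charptr sim_make_charptr(unsigned long word, unsigned byte)
{
	struct sim_charptr p;

	assert(byte < 4);
	p.word = word;
	p.byte = byte;
	return p;
}

/* (int *) cast of a char pointer: keep the word, forget the byte. */
unsigned long sim_to_wordptr(struct sim_charptr p)
{
	assert(p.byte < 4);
	return p.word;
}

/* (char *) cast of a word pointer: point at the leftmost byte. */
struct sim_charptr sim_to_charptr(unsigned long word)
{
	return sim_make_charptr(word, 0);
}
```

	Going from a word pointer to a char pointer and back returns
the same word ("smaller is finer"), whereas a char pointer to byte 2
of a word comes back from an (int *) round trip pointing at byte 0
("bigger is blunter").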

	Pointer casts are not always trivial, but they are reasonably
fast (from 1 to 4 instructions depending on the alignment requirements).

	The only exception to the 3 rules is the case of pointers to
objects of _KCCtype_charN types (see the KCC extensions section).
Casting any pointer to or from those types is performed by first
converting the original pointer into a word pointer (thus forcing
alignment to a word boundary) and then applying the desired
conversion.

Assignment conversions:

	KCC permits any casting conversion during an assignment, but
will complain about an implied cast if the conversion is not one of
the legal assignment conversions.

Unary conversions:

	The "Usual Unary Conversions" are different for CARM (H&S) and ANSI:
	Original operand type		    Converted type
					CARM		ANSI (default)
	float				double		float
	signed char/short/bitfield	int		int
	unsigned char/short		unsigned int	int
	unsigned bitfield		unsigned int	*int or @unsigned int
			* = if bitfield has fewer bits than an int.
			@ = if bitfield has more (or same #) bits than an int.

	The first difference is (float) to (double).  What H&S
describes as an "optional compilation mode" to suppress the unary
conversion of (float) to (double) is always in effect for ANSI level,
as ANSI is allowing this feature as part of the standard conversions,
and the resulting PDP-10 code is much more efficient.  If ANSI level
is not selected, then all (float) values will be implicitly converted
into (double) as per the old C standard.  Note that all portability
levels require that (float) values always be promoted to (double) in
function arguments, so this particular implicit conversion is always
in effect.
	The second difference is in the integer promotions.  CARM uses
what ANSI calls "unsigned preserving" rules; ANSI uses "value preserving"
rules, meaning that a conversion to a wider type should always result in
a signed integer type regardless of whether the shorter type was unsigned
or not, as long as the new type can represent all values of the old type.
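
	The practical effect of the two integer promotion rules shows
up when a small unsigned value is subtracted from a smaller one.  The
sketch below imitates the CARM rule with explicit casts so that it
can be compiled by any ANSI compiler; the function names are
hypothetical and the results are returned in a wider type only so
that they can be compared.

```c
#include <assert.h>

/* ANSI "value preserving" rule: both operands promote to (int),
 * so the subtraction can produce a negative value. */
long long ansi_style_difference(unsigned char a, unsigned char b)
{
	return (long long)(a - b);
}

/* CARM "unsigned preserving" rule, imitated with explicit casts:
 * both operands become (unsigned int), so the result wraps around
 * to a large positive value instead of going negative. */
long long carm_style_difference(unsigned char a, unsigned char b)
{
	/* The return type must be able to hold any unsigned int. */
	assert(sizeof(long long) > sizeof(unsigned int));
	return (long long)((unsigned int)a - (unsigned int)b);
}
```

	For a=1 and b=2 the first function gives -1 while the second
gives a large positive number; when a >= b the two agree.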

Binary conversions:

	As already noted, (float) values are not always implicitly
converted to (double) before being operated on, if ANSI level is in
effect.  There is one other difference between ANSI and CARM
with respect to the usual binary conversions:
	If one operand is "long" and the other is "unsigned int",
		CARM: makes both "unsigned long".
		ANSI: makes both "long".

<2 Expressions>				[H&S 7 "Expressions"]

As per H&S, with the following notes:

[7.2.2] (V1 7.2.3) Overflow and underflow are neither noticed nor
handled.  The result is whatever the PDP-10 hardware gives in those
cases.

[7.3.3] KCC correctly does not use parentheses to force the usual
unary conversions.

[7.4.2] (V1 7.3.5) KCC permits component selection for structures
returned from functions, except when the component is an array.  That
is, "f().a" will work and will select component "a" of the returned
structure, but it is not legal to do "f().array[i]".  This point may
be clarified in the future by the ANSI draft standard.

[7.4.3] (V1 7.3.6) KCC correctly does not allow formal parameters of
type "function", so the issue of converting this type does not arise.
	KCC does not currently do any checking to see if the types of
the arguments match the types of the parameters for the called
function.  When ANSI function prototypes are implemented, this will
change.  KCC does not issue any warnings about discarded function
return values.

[7.5.1] (V1 7.4.1) Casts - KCC correctly implements "narrowing" casts
for floating point and for integers.

[7.5.2] (V1 7.4.2) "sizeof" - the result of "sizeof" currently has type (int).
This is far more than adequate for any possible size value.  The
result of sizeof is ALWAYS in terms of 9-bit bytes, regardless of the
setting of -x=ch7, with two exceptions: the size of a char is always
1, and the size of a char array is the # of elements (chars) in the
array.  This is true no matter how many bits are in a char.

[7.5.6] (V1 7.4.6) '&' - Attempting to apply '&' to a "register" variable
simply causes KCC to issue a warning message and force the variable to
class "auto".  KCC does not permit '&' to be applied to array or
function names; this will change as ANSI permits it.

[7.5.7] (V1 7.4.7) '*' - Applying the indirection operator to a null
pointer (0) simply retrieves (or sets) the contents of AC 0, which
should always be zero if nothing accidentally sets it.  Treating the
null pointer as a char pointer will always retrieve zeroes and set
nothing.

[7.6.1] (V1 7.5.1) '*','/','%' -
	Division by zero is a no-op; the value will be that of the dividend.
Truncation is always toward zero whether the operands are negative or
not:
		5/2 == (-5)/(-2) == 2
		(-5)/2 == 5/(-2) == -2
	For the remainder operator, (x)%0 gives unpredictable garbage.
The sign of the remainder will be the same as that of the dividend:
		5%2 == 5%(-2) == 1
		(-5)%2 == (-5)%(-2) == -1
	These operations are slower for unsigned than for signed operands.
Division in particular is slow.
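
	The truncation and sign rules above are the same ones that
later C standards (C99 onward) guarantee for all implementations,
whereas older compilers were free to choose.  A minimal sketch, with
a hypothetical helper name, that can be used to check another
compiler against the values listed here:

```c
#include <assert.h>

/* Quotient and remainder exactly as the '/' and '%' operators
 * of the host compiler produce them. */
struct divmod {
	int quot;
	int rem;
};

struct divmod int_divmod(int a, int b)
{
	struct divmod r;

	assert(b != 0);	/* the zero-divisor cases are not modelled here */
	r.quot = a / b;
	r.rem = a % b;
	return r;
}
```

	In every case the identity (a/b)*b + a%b == a holds.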

[7.6.2] (V1 7.5.2) '-' - The type of the difference between two
pointers is (int).

[7.6.3] (V1 7.5.3) '<<','>>' - Left shift (<<) always uses logical
shifting; bits can be shifted into the sign bit.  Right shift uses
logical shifting for unsigned integer types (the sign bit is shifted
out, and 0-bits shifted in), but uses ARITHMETIC shifting for signed
integer types (the sign bit is propagated).
	Using a negative value for the right operand reverses the
direction of the shift.  Using a large number (36 or greater) simply
shifts everything to oblivion as expected.  Note that it is possible
to use left-shift arithmetic shifting (the ASH instruction) by giving
a negative shift distance to >>; of course this is very non-portable.
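
	The difference between the two kinds of shift can be sketched
by simulating a 36-bit word in a 64-bit host integer.  The function
names are hypothetical, and the treatment of counts of 36 or more
simply follows the description above rather than any hardware manual.

```c
#include <assert.h>
#include <stdint.h>

#define MASK36 0777777777777ULL	/* all 36 bits */
#define SIGN36 0400000000000ULL	/* the sign bit */

/* Logical shift: n > 0 shifts left, n < 0 shifts right;
 * zero bits enter at the vacated end. */
uint64_t sim_lsh36(uint64_t w, int n)
{
	assert((w & ~MASK36) == 0);
	if (n >= 36 || n <= -36)
		return 0;
	return n >= 0 ? (w << n) & MASK36 : w >> -n;
}

/* Arithmetic right shift by n >= 0 places: copies of the sign
 * bit enter from the left. */
uint64_t sim_ash36_right(uint64_t w, int n)
{
	uint64_t fill;

	assert((w & ~MASK36) == 0 && n >= 0);
	if ((w & SIGN36) == 0)
		return sim_lsh36(w, -n);
	if (n >= 36)
		return MASK36;
	fill = MASK36 & ~(MASK36 >> n);	/* the n leftmost bits */
	return (w >> n) | fill;
}
```

	In this model, shifting the word 400000000000 (octal) right by
35 gives 1 logically but 777777777777 arithmetically, i.e. -1.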

[7.8] (V1 7.7) '?' - KCC correctly permits the result of a conditional
expression to have structure, union, enumeration, or void types.

[7.9.1] (V1 7.8.1) Structure and union assignment is (of course) permitted.

[7.9.2] (V1 7.8.2) 'op=' Compound assignment -
	KCC does not support the obsolete "=+" compound assignment forms.

[7.11] (V1 7.10) Constant expressions -
	KCC can and does evaluate constant floating-point expressions at
compile time.  Almost all casts are also allowed, except certain
pointer-pointer conversions where the result would depend on whether
the program was running multi-section.
	KCC is currently somewhat too liberal about the constant
expressions in preprocessor #if statements; it allows the use of any
integral constant expression, including enum constants and sizeof
operators.  This is possible because the preprocessor is integrated
with the compiler.  The eventual fix for this will probably issue a
warning but permit the usage.

[7.12] (V1 7.11) KCC correctly does not interleave expression
computations.

[7.13] (V1 7.12) KCC tries to issue warnings about discarded values.
This may change with time.

[7.14] (V1 7.13) KCC does some optimization of memory accesses, but
not much.  This may change with the coming of ANSI's "volatile" type
modifier.

<2 Statements>				[H&S 8 "Statements"]

As per H&S, with the following notes:

[8.7] switch statement - KCC permits the control expression of a switch
statement to be of any integral or enumeration type.

<2 Functions>				[H&S 9 "Functions"]

[9.4] Adjustments to Parameter Types
	Parameters which are declared as "char" or "short" are really
handled as type "int", and "float" is really "double"; however, KCC
does not implement narrowing as per 9.4, because the description of
this is too unclear -- what happens if such a parameter is used as
an lvalue?
	The situation will improve with ANSI function prototypes.

	KCC follows the language strictly and does not permit formal
parameters of type "function returning...".

<1 The C Libraries>		[H&S Part II (V1 11: "The Run-time Library")]

	ALL of the facilities described in H&S part II are
implemented as described.  In addition, various UN*X system call
emulations and standard library routines are also supported.
	The file LIBC.DOC furnishes a complete summary of the
implemented library routines; there is also USYS.DOC, which
summarizes the system-call simulations.  In general, users are advised
to read H&S or a UPM (Unix Programmer's Manual) for complete
descriptions of library functions, as these files are primarily
intended to document KCC-specific differences rather than to provide a
user guide.

<2 [H&S 13] Standard Language Additions>
<2 [H&S 14] (V1 11.1) Character Processing>
<2 [H&S 15] (V1 11.2) String Processing>
<2 [H&S 16] Memory Functions>
<2 [H&S 17] (V1 11.5) Input/Output Facilities>	(V1: "Standard I/O")
<2 [H&S 18] (V1 11.4) Storage Allocation>
<2 [H&S 19] (V1 11.3) Mathematical Functions>
<2 [H&S 20] Time and Date Functions>
<2 [H&S 21] Control Functions>
<2 [H&S 22] Miscellaneous Functions>
<2 C Library - Other Library Functions>
	A few other miscellaneous facilities exist which are not
	listed in CARM, such as jsys() and the TERMCAP library.  They
	are described in LIBC.DOC.

<1 C Library - UN*X System Calls>

	The KCC runtime environment is intended to resemble that of UN*X
to a limited extent.  For example, main() is invoked with "argc, argv"
arguments parsed from the command line, and many system calls are
emulated.  This emulation is not intended to be complete, and the calls
exist primarily to help transport software to and from UN*X systems.
Whenever possible, the standard portable routines as described in H&S
should be used instead of these "system calls".
	The file USYS.DOC summarizes the calls which KCC supports, and
describes how they differ from the UN*X versions.  A UPM (Unix
Programmer's Manual) should be consulted for descriptions of how these
calls should behave on UN*X itself.

<1 KCC Language Extensions>

	KCC implements a number of extensions to the C language which
are intended to allow for better integration with other PDP-10 software.
It is possible to disable these extensions by means of the -P switch.
These extensions are:
	[1] The "entry" keyword (obsolete).
	[2] The '`' identifier quoting mechanism.
	[3] The #asm and asm() assembly language mechanism.
	[4] The "_KCCtype_charN" data types.


<2 Extension [1] - The "entry" keyword>

	The use of this statement has been described earlier in the
discussion of library entry points.  However, it is an obsolete feature
and should no longer be needed for any purpose.  Future versions of KCC
will flush it if no one objects.


<2 Extension [2] - Identifier Quoting>

	The current PDP-10 software allows symbols to have 6 characters
from the set A-Z, 0-9, ., %, $.  KCC maps 0-9 to 0-9, a-z and A-Z to A-Z,
and '_' to '.'.
	KCC supports a non-standard extension to C whereby any characters
enclosed within accent-grave ('`') marks are treated as a valid C identifier.
This allows the user to specify identifiers containing the characters '$'
and '%', as well as any arbitrary character, although KCC will print a
warning if a character not in the PDP-10 set is seen.
		Examples: `$FOO`, `OPENF%`, `$$BP`, `switch`

	This mechanism should be used ONLY where necessary.  It is not
portable and should be conditionalized if used in portable code.
Identifiers defined in this way should be CONSISTENTLY quoted in this
way, because they are stored internally with '`' as their first
character to distinguish them from normal unquoted identifiers and
keywords.  This avoids potential confusion and allows one to specify
an identifier which is otherwise a reserved keyword, such as `if`.
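
	The character mapping given at the start of this section can
be sketched as follows.  This is an illustration only: the function
name is hypothetical, quoted identifiers are not handled, and cutting
the name off after six characters is an assumption based on the
6-character symbol limit.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Map an unquoted C identifier to a PDP-10 style symbol name:
 * letters become upper case, '_' becomes '.', digits are kept,
 * and at most six characters are used.  The result lives in a
 * static buffer that is overwritten by the next call. */
const char *sim_symbol_name(const char *ident)
{
	static char out[7];
	size_t i, len;

	assert(ident != NULL);
	len = strlen(ident);
	if (len > 6)
		len = 6;
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)ident[i];
		out[i] = (c == '_') ? '.' : (char)toupper(c);
	}
	out[len] = '\0';
	return out;
}
```

	Under these assumptions "get_name" and "get_number" both map
to GET.NA and GET.NU respectively, which shows how quickly long
names can approach a collision.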

<2 Extension [3] - #asm and asm()>

	Many C compilers have an escape mechanism which allows the
programmer to specify a series of assembly language instructions within
a C program.  KCC's means of doing this is with the "asm()" expression,
which looks exactly like a function call.
	Currently only one argument is allowed to asm() and this must
be a string literal.  The text of the string is simply passed directly to
the assembler output file at that point in the compilation.
	There is also a preprocessor command called #asm, which converts
everything up to an #endasm into an asm() expression.  This is convenient
for very long stretches of assembler code, or where the enclosed text
must be macro-expanded.

	Invoke %%CODE or %%DATA to switch between assembling pure and
impure (variable) code/data.  #asm inclusions will always begin in the
code segment, and must always end in the code segment.  Never use
%%CODE when already in the code segment, or %%DATA when already in the
data segment.

	Because asm() is syntactically an expression, it can only
appear where an expression is legal.  However, any attempt to use it
anywhere but as the sole contents of a function body is highly fraught
with peril.  If it is necessary to specify some assembler directives
separate from any function, an acceptable way of doing this is by
means of a static dummy function, such as:
	static void
	dummyfunct(){
		asm("%%DATA\n STUFF: ASCIZ/foo/ \n %%CODE\n");
	}

	It cannot be repeated too often that use of asm() is strongly
discouraged.  It is possible that someday its functionality will be
extended to the point that KCC can parse and understand the contents
(thus, for example, references to C auto variables would be allowed);
however, this would primarily be for the purpose of allowing KCC to
generate .REL files directly rather than to encourage wider use of asm().

	At the start of the assembler file, a PURGE is done of all the
assembler IF pseudos.  Thus, assembler code cannot use any IF pseudo
tests, nor macros which use them.  Incidentally, attempting to use a
SEARCH MONSYM will cause FAIL to barf several times with a "FAIL BUG
IN SEARCH" message, due to the lack of the IF pseudos; this is
annoying but harmless.  MACRO does not have this problem.

<2 Extension [4] - "_KCCtype_charN" data types>

	Normally the "char" data type is 9 bits.  In the PDP-10 world
much existing software depends on 7-bit characters, and to make it
easier to write the necessary system-dependent code a 7-bit char data
type was introduced and generalized.  The 5 possible char sizes (6, 7,
8, 9, and 18) were chosen because it is only for those sizes that
OWGBPs exist (one-word global byte pointers), and thus only those sizes
can be guaranteed to work when using extended addressing.

	Any of the char types can be signed or unsigned; if the plain
form is used, unsigned is assumed.  Narrowing and widening are done
properly whatever the size.  Note that the 18-bit size corresponds
to "short"; it is included mainly for completeness rather than in the
expectation that someone would actually use it.  The 9-bit size
is the same as regular "char", unless the -x=ch7 option is in effect,
in which case "char" is the same as the 7-bit size.

	These types can normally be used just as for "char".  However,
there are some special effects associated with certain operations:
	(1) "sizeof" of an N-bit char array returns the number of N-bit
		chars (elements) in the array.  Usually this is what you
		want.  Giving this number to malloc will cause problems
		only for chars of 18 bits.
	(2) A cast (explicit or implicit) of a string literal to an
		N-bit char pointer will cause the string literal to be
		stored as N-bit bytes.  This is NOT strict C, which would
		merely convert the char pointer; however, this is the
		most useful interpretation.  This permits the somewhat
		bizarre construct of using a string literal to make
		an array of 18-bit bytes (this is the only aspect where
		"_KCCtype_char18" differs from "short").
	(3) 6-bit string literals are stored as SIXBIT rather than using
		the low 6 bits of the ASCII char values.  Note that while
		such strings are null-terminated, null is a valid
		SIXBIT character (meaning space).  The value of invalid
		SIXBIT characters is undefined.
	(4) Function parameters cannot be declared to have a type of
		char size 7 or 8.  The reason is complicated; see
		the last part of this section.

Some examples:
	_KCCtype_char6 tmp[] = "tmp";	/* A 4-element array of SIXBIT chars */
	_KCCtype_char7 wd[5] = "word";	/* A 5-element array of 7-bit chars */
	_KCCtype_char8 packet[40];	/* A 40-element array of 8-bit chars */
	_KCCtype_char18 useless;	/* Same as "unsigned short useless;" */
	_KCCtype_char7 *arg = "text";	/* A pointer to an ASCIZ string */
	_KCCtype_char6 *pt6;		/* A pointer to a 6-bit char string */

	arg = "othertext";	/* Implicit conversion to ASCIZ */
	pt6 = "dskdmp";		/* Implicit conversion to SIXBIT */
	pkg_call((_KCCtype_char7 *)"argtext");	/* Explicit cast to ASCIZ */
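
	The SIXBIT storage mentioned in point (3) can be modelled with
the conventional rule that a SIXBIT code is the ASCII code minus 040.
The sketch below assumes an ASCII host; the function names are
hypothetical, and folding lower case to upper case is an assumption,
not something this document states about KCC.

```c
#include <assert.h>
#include <stdint.h>

/* Convert one ASCII character to its SIXBIT code (0 to 077).
 * Lower case letters are folded to upper case (an assumption);
 * characters with no SIXBIT equivalent yield -1 in this model. */
int sim_sixbit(int c)
{
	if (c >= 'a' && c <= 'z')
		c -= 'a' - 'A';
	if (c < 040 || c > 0137)
		return -1;
	return c - 040;
}

/* Pack up to six characters into one simulated 36-bit word, left
 * to right.  Short strings are padded with 0, i.e. SIXBIT space. */
uint64_t sim_sixbit_word(const char *s)
{
	uint64_t word = 0;
	int i, code;

	assert(s != 0);
	for (i = 0; i < 6; i++) {
		code = (*s != '\0') ? sim_sixbit((unsigned char)*s++) : 0;
		assert(code >= 0);	/* caller must pass valid chars */
		word = (word << 6) | (uint64_t)code;
	}
	return word;
}
```

	In this model the string "DSK" packs into the word
446353000000 (octal), and a space and the terminating null are
indistinguishable, as noted in point (3).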

Portability issues:

	The long names for these types were deliberately chosen so as to
minimize the chances of possible conflict with identifiers in software
imported from elsewhere, and to discourage the indiscriminate (non-portable)
use of the types.  Note that users who must make heavy use of them (for
good reasons, we hope) can simply use typedefs or #defines at the start
of their code in order to equate them with simpler names; e.g.

		#define char7 _KCCtype_char7	/* Use shorter typename */

	This method also has the advantage of localizing non-portable
constructs in a way that gives others a fighting chance to port the
software elsewhere by changing the initial definitions.

Storage:

	There are a few aspects of the way N-bit char objects are stored
which may be surprising at first.  Char arrays are always packed starting
with the leftmost byte in a word; however, single-char objects (such as
"char c;") have their value stored in the rightmost ALIGNED BYTE.
	This is a necessary consequence of the fact that the '&'
operator applied to a char object must result in a valid char pointer,
and the very strong desire that all C code work with extended addressing.
There are only a few possible kinds of OWGBPs and they all require this
alignment.  For 6, 9, and 18 bits this causes no difficulty since bytes
of those sizes completely fill a word, and there are no unused low-order
bits; thus char values may be stored completely right-justified, and in
some cases full-word operations can be performed on them.
	However, for 7 and 8 bit bytes the rightmost aligned byte leaves
1 and 4 unused low-order bits, respectively, to its right, and KCC
stores the value of such an object in that byte rather than
right-justified in the word.  People examining a program with
IDDT may be surprised that "_KCCtype_char8 foo = 1;" results in a
word labelled FOO with its value 020 instead of 1.
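
	The word image of such an object is easy to compute: the value
is shifted left past the 36 mod N low-order bits that no N-bit byte
covers.  A minimal sketch, with a hypothetical function name and a
64-bit host integer standing in for the 36-bit word:

```c
#include <assert.h>
#include <stdint.h>

/* Word image of a single N-bit char object holding `value`.
 * The value sits in the rightmost aligned byte, so it is shifted
 * left by the number of leftover low-order bits (36 % nbits). */
uint64_t sim_single_char_word(unsigned long value, unsigned nbits)
{
	assert(nbits == 6 || nbits == 7 || nbits == 8 ||
	       nbits == 9 || nbits == 18);
	assert(value < (1UL << nbits));
	return (uint64_t)value << (36 % nbits);
}
```

	For an 8-bit char holding 1 this gives 020 (octal), for a
7-bit char it gives 2, and for 6, 9, and 18 bits the value is
unchanged.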
	This alignment restriction causes no real problems except for
the obscure case of function parameter declarations.  In the absence
of ANSI function prototypes, the default "function argument
promotions" are performed when a call is made; all integers shorter
than (int) are converted to (int) and passed as such.  But this means
that the integer value is right-justified; if the function parameter
was declared to match the promoted type (int) then all is well, but
attempts to declare it as a 7 or 8 bit char will just result in a
confused function (attempts to read the parameter value or take its
address will fail since the value is not properly aligned).  This
could be fixed by having KCC do an implicit conversion upon function
entry, but it is far simpler and much, much more efficient to simply declare
such parameters as (int) in the first place.
	If the code will never be run on a KL then, of course, this and
many other things could be simplified.

<1 KCC Internals>
<2 KCC Internals - Memory organization>

	A C program compiled by KCC has four distinct memory regions:
data, text (code), stack, and free.
	DATA - This contains all user-declared data variables, both
		initialized (set to user's specification) and
		un-initialized (set to zero).
		The first address following this region is stored in "_edata".
	TEXT - This is the UNIX terminology for program code.
		The first address following this region is stored in "_etext".
	STACK - The program stack.  This grows upwards in memory.
	FREE - The region of memory that malloc() can dynamically allocate.
		This starts at the address stored in "_end" and can allocate
		memory up to (but not including) the address stored in
		"_ealloc".

In addition, there may be small unused areas of memory.

The normal layout on TOPS-20 for a single-section program:

	Start addr	End addr	Region Name
	LOW		_edata-1	DATA
	_edata		<??>		STACK
	<??>		HIGH-1		 - (unused)
	HIGH		_etext-1	TEXT
	_etext		_ealloc-1	FREE
	_ealloc		777777		 - (unused, reserved)

Normally LOW == 0 and HIGH == 400000.  These correspond to the normal
addresses for low and high segments.  Also, normally _ealloc is set to
770000, so that pages 770-777 can be reserved for mapping DDT (some people
seem to prefer that to IDDT).
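
	The relation between the page numbers and the _ealloc value is
simple arithmetic, given 512-word (1000 octal) pages.  A sketch with
a hypothetical helper name:

```c
#include <assert.h>

#define SIM_PAGE_WORDS 01000UL	/* 512 words per page */

/* First word address of a page within one 256K-word section. */
unsigned long sim_page_start(unsigned page)
{
	assert(page <= 0777);
	return page * SIM_PAGE_WORDS;
}
```

	Page 770 therefore starts at address 770000, which is the
normal value of _ealloc.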

The normal layout on TOPS-20 for a MULTI-section program:
		Start	End		Region Name
	Section 0			 - (unused)
	Section 1
		1,,LOW	_edata-1	DATA
		_edata	1,,HIGH-1	 - (unused)
		1,,HIGH	_etext-1	TEXT
		_etext	1,,777777	 - (unused)
	Section 2
		2,,0	<??>		STACK
		<??>	2,,777777	 - (unused)
	Sections 3-37
		3,,0	_ealloc-1	FREE (all sections up to 37)
		_ealloc	37,,777777	 - (unused, reserved)

Normally _ealloc is set to 37,,700000 so that pages 700-777 of section 37
are reserved for mapping XDDT (again, for those people who don't know about
IDDT).

<2 KCC Internals - Stack structure>

The organization of the portion of the stack seen by a C routine is
shown in the following diagram (the top of the stack is toward the top
of the diagram, with the stack pointer at the very top):

SP-->________________________________________________________________
    |    Spilled registers                                           |
    |    generated when we need more intermediate values than        |
    |    there are available PDP-10 registers                        |
    |________________________________________________________________|
    |             |                                                  |
    | (as many    |    Arguments being stacked for the next call     |
    | repetitions |    These are generated in the reverse of         |
    | of these    |    lexical order; thus the first argument        |
    | two areas   |    appears at the top of the stack.  This is     |
    | as levels   |    so that functions like printf which take a    |
    | of nesting  |    variable number of arguments can work.        |
    | in function |__________________________________________________|
    | calls)      |                                                  |
    |             |    Values to be saved over the call              |
    |             |    e.g. if we do foo()+bar() then one function   |
    |             |    has to be called first, and we save its       |
    |             |    value here so we can add it to the other      |
    |             |    result once the second call returns           |
    |_____________|__________________________________________________|
    |                                                                |
    |    Local variables                                             |
    |    stored in lexical order, i.e. the first declared            |
    |    variable is lowest on the stack                             |
    |________________________________________________________________|
    |                                                                |
    |    Return address for calling function                         |
    |________________________________________________________________|
    |    Pointer for return value                                    |
    |    this only exists if the function returns a struct           |
    |    that takes more than two words; otherwise the result        |
    |    is returned in registers 1 and (if two words) 2             |
    |________________________________________________________________|
    |                                                                |
    |    Arguments to this call                                      |
    |    in reverse lexical order as described above                 |
    |________________________________________________________________|

Of course, not all of these areas are likely to appear at once.
There is no frame pointer, only a stack pointer; generated code always
knows the location of the stack pointer in relation to changes in the
above structure (as arguments get pushed and popped, registers get
spilled and despilled, etc.).  Thus code to access an argument or
local variable will use a different offset from the stack pointer
depending on where it is generated.
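
	As an illustration only, the offset bookkeeping described above
can be modelled in C.  This is a sketch, not KCC's actual code: the
structure and function names are hypothetical, every argument and
local is assumed to occupy one word, no struct-return pointer is
present, and the stack pointer is assumed to address the topmost word
in use.

```c
#include <assert.h>

/* Hypothetical model of what the compiler knows about one function's
 * frame at a given point in the generated code. */
struct frame_model {
    int nlocals;  /* words of local variables */
    int pushed;   /* words currently above the locals (spills, saved
                     values, arguments being stacked for a call) */
};

/* Offset from the stack pointer to local word `index`
 * (0 = first declared = lowest on the stack). */
int local_offset(const struct frame_model *f, int index)
{
    assert(index >= 0 && index < f->nlocals);
    return -(f->pushed + (f->nlocals - 1 - index));
}

/* Offset from the stack pointer to argument `index` (0 = first
 * argument).  Below the locals lies the return address, then the
 * arguments, with the first argument nearest the return address. */
int arg_offset(const struct frame_model *f, int index)
{
    assert(index >= 0);
    return -(f->pushed + f->nlocals + 1 + index);
}
```

The same local or argument yields a different offset each time
"pushed" changes, which is the effect the text describes.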

<2 KCC Internals - Calling conventions and register use>

	Arguments to KCC C functions are passed on the stack, and
results are returned in the registers.  Functions are not expected to
save any registers upon entry, and in fact are assumed to clobber all
of ACs 1-16 inclusive.

Caller conventions - argument passing:

	Since all function calls are assumed to clobber the registers,
it is up to the caller to save on the stack any register values which
it wishes to preserve over the function call.
	As described in the section on stack structure, function
arguments are then pushed in reverse order onto the stack; the last
argument is pushed first, and the first argument is pushed last.
Passing a structure as argument consists of copying it whole onto the
stack.  If the function is expected to return a structure or union
longer than two words, a "zeroth arg" must also be pushed, which is
the address of a location that the function should copy the returned
structure into.  The function is then called with a PUSHJ 17,
instruction, which pushes the return address onto the stack.
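
	The reverse-order push can be simulated with a small C model.
The sketch below is only an illustration under stated assumptions:
the simulated stack, its size, and the names sim_call and sim_arg are
all hypothetical, each argument is one word, and no "zeroth arg" is
pushed.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 64

struct sim_stack {
    long   words[STACK_WORDS];
    size_t top;                 /* index of the next free word */
};

static void push(struct sim_stack *s, long w)
{
    assert(s->top < STACK_WORDS);
    s->words[s->top++] = w;
}

/* Caller side: push the last argument first and the first argument
 * last, then a stand-in for the return address that PUSHJ would add. */
void sim_call(struct sim_stack *s, const long *args, size_t nargs,
              long ret_addr)
{
    size_t i;

    for (i = nargs; i > 0; i--)
        push(s, args[i - 1]);
    push(s, ret_addr);
}

/* Callee side: fetch argument n (0 = first).  The first argument is
 * always just below the return address, however many were passed. */
long sim_arg(const struct sim_stack *s, size_t n)
{
    assert(s->top >= n + 2);
    return s->words[s->top - 2 - n];
}
```

Because sim_arg(s, 0) finds the first argument without knowing the
argument count, a function such as printf can locate its format
string and work outward from there.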

Callee conventions - result returning:
	All accumulators (except AC17) are at the callee's disposal.
However, AC0 is never used by generated code, as some old programs
assume NULL always points to zero, and as the hardware imposes several
restrictions on its use.  AC15 and AC16 are also reserved for minor
KCC runtime functions.
	Single word function return values are left in AC1; double
word returns go in AC1 and AC2.  Return values larger than that are
copied into the location specified by the struct-return pointer, which
is provided by the caller as the "zeroth" argument.
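
	In C terms, returning a large structure behaves roughly as if
the caller had passed a destination pointer explicitly.  The sketch
below shows the two forms side by side; the type and function names
are hypothetical, and the second function is only an approximation of
what the convention amounts to, not code that KCC emits.

```c
#include <assert.h>
#include <stddef.h>

/* Three ints, hence three words on the PDP-10: too large to come
 * back in AC1 and AC2. */
struct triple { int a, b, c; };

/* What the programmer writes. */
struct triple make_triple(int a, int b, int c)
{
    struct triple t;

    t.a = a;
    t.b = b;
    t.c = c;
    return t;
}

/* Approximately what happens: the caller supplies the address of
 * the result as a hidden "zeroth" argument, and the function copies
 * the structure there. */
void make_triple_hidden(struct triple *result, int a, int b, int c)
{
    assert(result != NULL);
    result->a = a;
    result->b = b;
    result->c = c;
}
```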

<2 KCC Internals - Extended addressing>

	A C program can be run in an extended section by specifying
this in either of two ways at load time, depending on whether you are
using KCC or the EXEC to do the loading.

	(a) KCC: Use the "-i" switch.
		e.g.	@cc -i prog.c
	(b) LOAD (or LINK): The first module should be C:LIBCKX.
		e.g.	@load c:libckx,prog

No special switches need be given to KCC for the generated code to be
suitable for extended addressing - the same code will always run
either extended or non-extended.

	In extended sections, code and permanently allocated data
(i.e. global variables) live in section N, the stack lives in section
N+1, and allocated memory begins in section N+2, expanding to fill all
higher sections.  Normally N==1; this can be changed if really
necessary.  All byte pointers not intended for immediate use (e.g.
literal arguments to a LDB or DPB instruction) are constructed as
OWGBPs (One-Word Global Byte Pointers).
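
	The section assignment above can be written out as a small C
sketch.  The names here are hypothetical, and the address packing
assumes the usual PDP-10 extended form of a section number above an
18-bit in-section offset; it is an illustration, not part of KCC or
its runtime.

```c
#include <assert.h>

#define SECTION_SHIFT 18          /* in-section addresses are 18 bits */
#define OFFSET_MASK   0777777UL

struct section_layout {
    unsigned code_data;   /* code and global variables */
    unsigned stack;       /* the stack */
    unsigned heap_start;  /* first section of allocated memory */
};

/* Sections used when the program is loaded into section n
 * (normally n == 1). */
struct section_layout kcc_layout(unsigned n)
{
    struct section_layout l;

    l.code_data  = n;
    l.stack      = n + 1;
    l.heap_start = n + 2;
    return l;
}

/* Combine a section number and an in-section offset into one
 * global address. */
unsigned long global_address(unsigned section, unsigned long offset)
{
    assert(offset <= OFFSET_MASK);
    return ((unsigned long)section << SECTION_SHIFT) | offset;
}
```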

<1 Cross-compiling>

The -x, -L, -H, and -A switches allow some degree of cross-compilation.
The effects of the various -x specifications are listed below:

CPU: ka, ki, ks, kl0, klx
	KCC can compile code to run on any CPU type; this is done both
by means of different code generation sequences and by assembler
macros which KCC also generates as needed. "ka" specifies a KA-10
using software format floating point doubles (all other types use
hardware format).  "ki" specifies a KI-10, and "ks" both a KS-10 and a
KL-10A without extended addressing.  "kl0" specifies a KL-10B capable
of extended addressing, but restricts the code to section 0; "klx"
specifies a KL-10B non-zero section environment.

	It is possible to specify more than one CPU type; the intent
is to allow for producing code that will run on all specified
machines.  As distributed, KCC code is compiled for "ks+kl0+klx".
However, the results of other combinations are somewhat unpredictable
and should be avoided at the moment.

SYSTEM: tops20, tenex, tops10, waits, its

	Currently there are only two things affected by this setting:
character and string constant values, and ERJMP.
	[1] If compiling for WAITS (or for anything else if on WAITS),
	character values are mapped between WAITS ASCII and standard US
	ASCII.
	[2] If compiling for TOPS20 or TENEX, the proper value of
	ERJMP and an auxiliary definition called ERJMPA are generated.
There may be more distinctions in the future.


ASSEMBLER: fail, macro, midas

	The assembler selection is independent of the system or CPU.
Currently either FAIL or MACRO can be selected, and both will work.
Selecting MIDAS does not yet work completely.


CHARSIZE: ch7

	It is possible to request that KCC generate code which assumes
that chars are 7 bits, and char pointers are 7-bit byte pointers.
Thus, arrays of chars will have 5 chars per word, instead of 4.  This
feature, invoked by the "-x=ch7" switch, is mainly of use to people
who must integrate C code with old software that cannot deal with
anything but 7-bit bytes.  It is not really guaranteed to work in all
conceivable cases.  In particular, you should be aware that many of
the normally-compiled library routines (such as malloc) will continue
to return 9-bit char pointers, although the str- and mem- functions
should work with either 9-bit or 7-bit strings.
	The values returned by "sizeof" will not change.  As explained
in the discussion of the sizeof operator, sizes are always in terms of
9-bit bytes, except that the size of a char array is always the number of
elements (chars) in the array.  sizeof(char) is always 1.
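
	The packing arithmetic behind "5 chars per word, instead of 4"
can be checked with a short C sketch.  The function names are
hypothetical; the only facts assumed are the 36-bit PDP-10 word and
the 7-bit and 9-bit char sizes given above.

```c
#include <assert.h>

#define WORD_BITS 36

/* Whole chars of the given width that fit in one word:
 * 36/9 = 4, and 36/7 = 5 with one bit left over. */
unsigned long chars_per_word(unsigned char_bits)
{
    assert(char_bits > 0 && char_bits <= WORD_BITS);
    return WORD_BITS / char_bits;
}

/* Words of storage needed for an array of nchars chars. */
unsigned long words_for_chars(unsigned long nchars, unsigned char_bits)
{
    unsigned long per = chars_per_word(char_bits);

    return (nchars + per - 1) / per;   /* round up */
}
```

For example, a 20-char array needs 5 words with 9-bit chars but only
4 words under -x=ch7, even though sizeof reports 20 in both cases.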

General comments:
	Ideally KCC (on any system) should be able to generate code
for any other PDP-10 system.  To actually do this requires some
understanding of how the various parts of a program come together.  It
is not enough just to specify some -x switches; you must take care of
the following:

	1. #include files.  You may need to use an alternate standard
	include-file directory to satisfy <>-type includes.  -H can be
	used to specify an alternate location.

	2. Switches.  You should use -D to predefine any parameters
	from <c-env.h> which are not properly defaulted.
	Alternatively you can put a different version of c-env.h in
	a non-standard location pointed to by -H (as above).

	3. Library.  The C runtime library loaded with the program must
	be the correct one (already cross-compiled for the target).  KCC
	always generates a default "-lc" request for the C runtime library;
	the location searched for this can be specified by the -L switch.

For details on porting the C library and KCC itself, see the file PORT.DOC
in the KCC source directory.

<1 Char Pointer Hints>

	The code generated for handling char pointers always uses
byte-pointer instructions, and so will work for any byte size (at
least on machines implementing the ADJBP instruction).  This can
sometimes be useful when dealing with PDP-10 based data structures.
However, such pointers have to be constructed "by hand" since all char
pointers that KCC generates are either 9-bit or 7-bit.  See also the
-x=ch7 option in "Cross-compiling".

	In general, when char pointers are involved, constructs like
*++ptr are faster than *ptr++. This is because *++ptr can usually be
folded by the optimizer into an ILDB (or IDPB) instruction.  There is
no equivalent on the PDP-10 to a *ptr++ construct; this must always
be done as at least two instructions.
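
	As a sketch of the pre-increment style, a string copy can be
arranged so that every step after the first uses *++ptr on both
pointers.  The function name is hypothetical, and whether the
optimizer actually folds each access into a single instruction
depends on the surrounding code.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the NUL-terminated string src to dst using only
 * pre-increment accesses inside the loop.  The first char is copied
 * separately so that neither pointer ever has to start out pointing
 * before its array. */
char *copy_preinc(char *dst, const char *src)
{
    char *d = dst;

    assert(dst != NULL && src != NULL);
    if ((*d = *src) != '\0') {
        while ((*++d = *++src) != '\0')
            ;   /* all the work is done in the loop condition */
    }
    return dst;
}
```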

	Whenever possible, try to avoid using two char pointers in
subtraction, as in (ptr1-ptr2).  Many instructions have to be executed
to find the difference between two char pointers, due to the strange
internal format.  For the same reason, try to avoid less-than (<, <=)
or greater-than (>, >=) comparison of char pointers.  Tests for
equality (== and !=) are fine, however.  Finally, on machines which do
not implement the ADJBP instruction (KA, KI), it is also helpful to
avoid addition or subtraction of integers to char pointers.

	None of this applies to other types of pointers, such as (int *),
which are simple addresses and can be manipulated very efficiently.
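
	The char pointer advice above can be followed with loops such
as the ones sketched here.  The function names are hypothetical; the
point is only the shape of the loops, which end on an equality test
and keep a separate integer count instead of subtracting two char
pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Count occurrences of c in buf[0..len-1].  The end pointer is
 * computed once, and the loop uses != rather than <. */
size_t count_char(const char *buf, size_t len, char c)
{
    const char *p = buf;
    const char *end = buf + len;
    size_t n = 0;

    assert(buf != NULL);
    while (p != end) {
        if (*p == c)
            n++;
        p++;
    }
    return n;
}

/* Return the index of the first c in the string s, or -1 if there
 * is none.  The index is counted alongside the pointer instead of
 * being computed afterwards as a pointer difference. */
long find_index(const char *s, char c)
{
    long i = 0;

    assert(s != NULL);
    while (*s != '\0') {
        if (*s == c)
            return i;
        s++;
        i++;
    }
    return -1;
}
```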

<1 Portable Math Library>
* Menu:
* PML: (KCC-PML)                Portable Math Library
<1 Local library additions>
* Menu:
* LIBLCL: (KCC-LIBLCL)          Local library additions
* LIBT20: (KCC-LIBT20)		Frank Wancho's TOPS-20 library