mirror of https://github.com/PDP-10/its.git synced 2026-01-11 23:53:12 +00:00

CLU reference manual.

Written in R.
This commit is contained in:
Lars Brinkhoff 2021-08-26 12:13:26 +02:00
parent b467dcc16a
commit 76e5b7cb8b
23 changed files with 10865 additions and 1 deletions


@ -51,7 +51,8 @@ DOC = info _info_ sysdoc sysnet syshst kshack _teco_ emacs emacs1 c kcc \
kldcp libdoc lisp _mail_ midas quux scheme manual wp chess ms macdoc \
aplogo _temp_ pdp11 chsncp cbf rug bawden llogo eak clib teach pcnet \
combat pdl minits mits_s chaos hal -pics- imlac maint cent ksc klh \
digest prs decus bsg madman hur lmdoc rrs danny netwrk klotz hello
digest prs decus bsg madman hur lmdoc rrs danny netwrk klotz hello \
clu
BIN = sys sys1 sys2 emacs _teco_ lisp liblsp alan inquir sail comlap \
c decsys graphs draw datdrw fonts fonts1 fonts2 games macsym \
maint _www_ gt40 llogo bawden sysbin -pics- lmman r shrdlu imlac \


@ -155,7 +155,28 @@ clib/nm.insert 197904090030.29
clib/-read-.-this- 198002261810.43
clib/tv.128 197908312338.58
clu/clu.order 197711161922.32
clu/action.refman 197806022022.04
clu/clusym.r 197806271243.01
clu/exampl.refman 197805301747.35
clu/except.refman 197806061946.59
clu/exprs.refman 197806022027.52
clu/e}d.refman 197806022030.03
clu/gram.refman 197804052219.26
clu/io.refman 197805301748.31
clu/killed.refman 197806062048.44
clu/lex.refman 197806022030.41
clu/module.refman 197806062035.55
clu/newio.refman 197803301852.29
clu/opdefs.refman 197806062047.11
clu/part}a.refman 197806022036.29
clu/refman.insert 197806062042.58
clu/refman.r 197805301540.46
clu/refman.save 197804052227.08
clu/refman.sect 197804051701.28
clu/stmts.refman 197806022038.19
clu/syntax.refman 197806062039.18
clu/ts.clusys 197801112003.24
clu/types.refman 197806022040.13
clucmp/cludmp.3_77 197801311537.30
clusys/alpha.10 197808301728.33
clusys/common.8 197801302220.27

199
doc/clu/action.refman Normal file

@ -0,0 +1,199 @@
.sr sect_except Section`12
.sr sect_for_stmt Section`11.5.2
.sr sect_decl_w_init Section`8.2.2
.sr sect_update_sugars Section`11.2
.
.chapter "Assignment and Invocation"
.para
The two fundamental actions of CLU are assignment
of computed objects to variables, and invocation of
procedures (and iterators) to compute objects.
All other actions are merely compositions of these two.
Since the correctness of assignments and invocations depends on a
type-checking rule, we describe that rule first,
then assignment, and finally invocation.
.section "Type Inclusion"
.para
CLU is designed to allow compile-time type-checking.
The type of each variable is known by the compiler.
Furthermore, the type of objects that could result from
the evaluation of an expression (invocation) is known at compile-time.
Hence, every assignment can be checked at compile-time to make
sure that the variable is only assigned objects of its declared type.
The rule is that an assignment 2v*`:=`2E* is legal only if
the set of objects defined by the type of 2E*
(loosely,
the set of all objects that could possibly result from evaluating the expression)
is included in the set of all objects that could be denoted by 2v*.
.para
Instead of speaking of the set of objects defined by a type,
we generally speak of the type and say that
the type of the expression must be 2included in* the type of the variable.
If it were not for the type any, the inclusion rule
would be an equality rule.
This leads to a simple interpretation of the type inclusion rule:
.show 2
.fi
.ir +500m
The type of a variable being assigned an expression
must be either the type of the expression, or any.
.ir -500m
.nf
.eshow
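As an informal illustration, the rule can be written as a predicate.
The sketch below is in Python rather than CLU, and it models types as plain strings; both choices are assumptions made only for this example, and the name assignment_is_legal is hypothetical.

```python
ANY = "any"  # stands in for the CLU type any (assumption: types are modelled as strings)


def assignment_is_legal(variable_type: str, expression_type: str) -> bool:
    """Return True if the expression's type is included in the variable's type."""
    # Inclusion is equality, except that any includes every type.
    return variable_type == ANY or variable_type == expression_type


if __name__ == "__main__":
    print(assignment_is_legal("int", "int"))     # same type
    print(assignment_is_legal("any", "string"))  # variable declared as any
    print(assignment_is_legal("int", "string"))  # different types
```

Note that CLU applies this check at compile time; the sketch only mirrors the rule itself.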
.section "Assignment"
.para
Assignment is the means of causing a variable to denote an object.
Some assignments are implicit, i.e., performed as part of the execution
of various mechanisms of the language
(most notably procedure invocation, iterator invocation,
exception handling, and the tagcase statement).
All assignments, whether implicit or explicit,
are subject to the type inclusion rule.
The remainder of this section discusses explicit assignments.
.para
The assignment symbol ":=" is used in
two other syntactic forms that are not true assignments,
but rather abbreviations for certain invocations.
These forms are used for updating structures such as records and arrays
(see sect_update_sugars).
.subsection "Simple Assignment"
.para
The simplest form of assignment is:
.show
idn := expression
.eshow
In this case the expression is evaluated, and the
resulting object is assigned to the variable.
The expression must return a single object
(whose type must be included in that of the variable).
Examples of simple assignments are:
.show 5
x := 1 s(1)% x's type must include int, i.e., it must be int or any
y := string$substr (s, 5, n)t(1)% y's type must include string
a := array [int]$new ()t(1)% a's type must include array [int]
p := array [int]$create(3)t(1)% p's type must include array [int]
z := (foo = bar) t(1)% z's type must include bool
.eshow
It is also possible to declare a variable and assign to it in a single
statement; this is called a declaration with initialization,
and was discussed in sect_decl_w_init.
.subsection "Multiple Assignment"
.para
There are two forms of assignment that assign to more than one variable at once:
.show
idn, etc := expression, etc
.eshow
and
.show
idn, etc := invocation
.eshow
.para
The first form of multiple assignment is a generalization of the simple assignment.
The first variable is assigned the first expression,
the second variable the second expression, and so on.
The expressions are all evaluated (from left to right)
before any assignments are performed.
The number of variables in the list must equal the number of expressions,
no variable may occur more than once,
and the type of each variable must include the type of the corresponding expression.
.para
This form of multiple assignment allows easy permutation
of the objects denoted by several variables:
.show 2
x, y := y, x
i, j, k := j, k, i
.eshow
and similar simultaneous assignments of variables that would
otherwise require temporary variables:
.show 2
a, b := (a + b), (a - b)
quotient, remainder := (u / v), (u // v)
.eshow
Notice that there is no form of this statement with declarations.
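The "evaluate everything, then assign" order can be tried out in Python, whose tuple assignment behaves the same way for these cases.
The sketch below is an illustration only: Python does not enforce the CLU rule that no variable may occur more than once, and Python writes integer division and remainder as // and %, where the CLU examples above use / and //.

```python
def simultaneous_assignment_demo():
    """Replay the multiple-assignment examples with concrete integers."""
    x, y = 1, 2
    x, y = y, x                      # swap without a temporary

    i, j, k = 1, 2, 3
    i, j, k = j, k, i                # rotate three variables

    a, b = 10, 3
    a, b = (a + b), (a - b)          # both right-hand sides use the old a and b

    u, v = 17, 5
    quotient, remainder = (u // v), (u % v)
    return (x, y), (i, j, k), (a, b), (quotient, remainder)


def with_temporaries(a, b):
    """The same effect as  a, b := (a + b), (a - b)  spelled out with temporaries."""
    t1 = a + b                       # evaluate all expressions first...
    t2 = a - b
    a = t1                           # ...then perform the assignments
    b = t2
    return a, b
```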
.para
The second form of multiple assignment allows
us to retain the objects resulting from an invocation
returning two or more objects.
The first variable is assigned the first object,
the second variable the second object, and so on.
The order of the objects is the same as in the return statement
of the invoked routine.
The number of variables must equal the number of objects returned,
no variable may occur more than once,
and the type of each variable must include the type of the corresponding result object.
.para
Here are two examples of this statement, one without declarations, and one with:
.show 2
first, last, balance := acct$query (acct_no)
x, y, z: real := vector$components (v)
.eshow
.section "Invocation"
.para
Invocation is the other fundamental action of CLU.
In this section we discuss procedure invocation;
iterator invocation is discussed in sect_for_stmt.
However, up to and including passing of arguments,
the two are the same.
.para
Invocations take the form:
.show
primary ( lbkt expression, etc rbkt )
.eshow
A primary is a slightly restricted form of expression,
which includes variables and routine names,
among other things.
(See next section.)
.para
The sequence of activities in performing an invocation is as follows:
.nlist
The primary is evaluated.
It must evaluate to a procedure or iterator.
.nnext
The expressions are evaluated, from left to right.
.nnext
New variables are introduced corresponding to the formal arguments
of the routine being invoked (i.e., a new environment is created
for the invoked routine to execute in).
.nnext
The objects resulting from evaluating the expressions
(the actual arguments) are assigned to the corresponding new variables
(the formal arguments).
The first formal is assigned the first actual,
the second formal the second actual, and so on.
The type of each object must be included in the type of the
corresponding formal argument.
.nnext
Control is transferred to the routine at the start of its body.
.end_list
An invocation is considered legal in exactly
those situations where all the (implicit) assignments
involved in its execution are legal.
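The five steps above can be sketched as a small interpreter-style function.
This is a Python model under stated assumptions, not a description of any CLU implementation: the names Routine and invoke are hypothetical, expressions are modelled as zero-argument functions, the Python type object stands in for any, and the type inclusion check is done at run time here although CLU performs it at compile time.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Routine:
    formals: List[Tuple[str, type]]        # (name, declared type) of each formal argument
    body: Callable[[Dict[str, Any]], Any]  # the body runs in its own environment


def invoke(primary: Callable[[], Any], arguments: List[Callable[[], Any]]) -> Any:
    routine = primary()                          # 1. evaluate the primary
    if not isinstance(routine, Routine):
        raise TypeError("primary must evaluate to a routine")
    actuals = [arg() for arg in arguments]       # 2. evaluate the expressions, left to right
    if len(actuals) != len(routine.formals):
        raise TypeError("wrong number of arguments")
    env: Dict[str, Any] = {}                     # 3. new variables for the formals
    for (name, declared), actual in zip(routine.formals, actuals):
        # 4. implicit assignment, subject to type inclusion (object plays the role of any)
        if declared is not object and type(actual) is not declared:
            raise TypeError("type of actual not included in type of formal " + name)
        env[name] = actual
    return routine.body(env)                     # 5. transfer control to the body
```

Because the body receives its own environment, an assignment to a formal inside the body has no effect on the caller's variables.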
.para
commentary
It is permissible for a routine to assign to a formal argument variable;
the effect is just as if it assigned to any of its other variables.
From the point of view of the invoked routine, the only difference
between its formal argument variables and its other local variables
is that the formals are initialized by its caller.
.para
When an invoked procedure terminates, it returns zero or more result objects,
agreeing in number, order, and type with the information in the procedure header.
However, procedures can terminate in two ways:
they can terminate 2normally*,
returning zero or more objects,
or they can terminate 2exceptionally*,
signalling an exceptional condition.
When a procedure terminates normally, the result objects
become available to the caller,
and will (sometimes) be assigned to variables.
When a procedure terminates exceptionally,
the flow of control will not go to the point of return
of the invocation, but rather will go elsewhere as described
in sect_except.
.para
Here are some examples of invocations:
.show 2
p () s(1)% invoking a procedure taking no arguments
array [int]$create (-1)t(1)% invoking an operation of a type
routine_table [index] (input)t(1)% invoking a procedure fetched from an array
.eshow

33
doc/clu/clusym.r Normal file

@ -0,0 +1,33 @@
meta-symbols (large boldface characters)
string register what it contains
lbkt [
rbkt ]
lcurly {
rcurly }
lparen (
rparen )
vbar |
etc ...
def ::=
.
.
.fs A
.nr meta_offset fheight
.fs 0
.nr meta_offset (fheight-meta_offset)/2
.sr lbkt (meta_offset!m)A[*A^*
.sr rbkt (meta_offset!m)A]*A^*
.sr lcurly (meta_offset!m)A{*A^*
.sr rcurly (meta_offset!m)A}*A^*
.sr lparen (meta_offset!m)A(*A^*
.sr rparen (meta_offset!m)A)*A^*
.sr vbar (meta_offset!m)A|*A^*
.fs 5
.nr meta_offset fheight
.fs 0
.nr meta_offset (fheight-meta_offset)/2
.sr orbar (meta_offset!m)5|*5^*
.sr etc A...*
.sr def A=*

875
doc/clu/exampl.refman Normal file

@ -0,0 +1,875 @@
.de foo
.fo 0 fonts1;30vrx rwskst
.fo 1 31vgb
.fo 2 fonts1;30vrix ebmkst
.fo 3 37vrb
.fo 4 30vrb
.fo 5 40vgl
.fo 6 s30grk
.fo 7 20vg
.fo 8 20vr
.fo 9 22fg
.fo A fonts1;52meta rwskst
.fo B 31vg
.fo C 25fg
.fo D 25vg
.fo E 25vgb
.em
.so rws;escape chars
.appendix "Examples"
.achapter "Priority Queue Cluster"
.sr log log2
.para
This cluster is an implementation of priority queues.
It inserts elements in O(log`n) time, and removes the "best"
element in O(log`n) time, where n is the number of items in
the queue, and "best" is determined by a total ordering predicate
which the queue is created with.
.para
The queue is implemented with a binary tree which is balanced such that
every element is "better" than its descendants, and the minimum depth
of the tree differs from the maximum depth by at most one. The tree is implemented
by keeping the elements in an array, with the left son of a[i] in a[i*2],
and the right son in a[i*2+1]. The root of the tree, a[1], is the "best"
element.
.para
Each insertion or deletion must rebalance the tree. Since the
tree is of depth strictly less than log`n,
the number of comparisons is less than log`n for
insertion and less than 2`log`n for removal of an element. Consequently,
a sort using this technique takes less than 3`n`log`n comparisons.
.para
This cluster illustrates the use of a type parameter, and the
use of a procedure as an object.
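For readers who want to experiment with the algorithm outside CLU, here is a rough Python transcription of the same array-based scheme.
It is a sketch under assumptions: the class name PQueue is hypothetical, slot 0 of the Python list is left unused so that the root sits at index 1 as in the description above, and IndexError stands in for the empty signal.

```python
class PQueue:
    """Priority queue ordered by a caller-supplied "better than" predicate."""

    def __init__(self, better):
        self._a = [None]   # index 0 is unused so that the sons of a[i] are a[2*i], a[2*i+1]
        self._p = better   # the predicate is stored as an ordinary object

    def size(self):
        return len(self._a) - 1

    def empty(self):
        return self.size() == 0

    def top(self):
        if self.empty():
            raise IndexError("empty")
        return self._a[1]

    def insert(self, v):
        a, p = self._a, self._p
        a.append(v)                      # make room for the new item
        son = len(a) - 1
        dad = son // 2
        while dad > 0 and p(v, a[dad]):  # while the father loses
            a[son] = a[dad]              # move the father down
            son, dad = dad, dad // 2
        a[son] = v

    def remove(self):
        a, p = self._a, self._p
        if self.empty():
            raise IndexError("empty")
        r = a[1]                         # save the best element
        v = a.pop()                      # remove the last element
        max_son = len(a) - 1
        if max_son == 0:
            return r
        dad = 1
        while dad <= max_son // 2:       # while the node has a son
            son = dad * 2
            if son < max_son and p(a[son + 1], a[son]):
                son += 1                 # pick the better of the two sons
            if p(v, a[son]):
                break
            a[dad] = a[son]              # move the son up
            dad = son
        a[dad] = v
        return r
```

Inserting every element of a list and then removing until the queue is empty yields the elements in "best first" order, which is the sort mentioned above.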
.bp
.ls 1.1
.nf
.fs D
.ta 8 12 16 20 24 32 40
.de def_nkeys
. for n 0 nargs-1
. sr _\\n E\\n*
. end for
. em
.def_nkeys any array
.def_nkeys begin bool break
.def_nkeys cand char cluster continue cor cvt
.def_nkeys do down
.def_nkeys else elseif end except exit
.def_nkeys false for force
.def_nkeys has
.def_nkeys if in int is iter itertype
.def_nkeys nil null
.def_nkeys oneof others
.def_nkeys proc proctype
.def_nkeys real record rep return returns
.def_nkeys signal signals string
.def_nkeys tag tagcase then true type
.def_nkeys up
.def_nkeys when where while
.def_nkeys yield yields
.keep
p_queue = _cluster [t: _type] _is
create,(24)% Create a p_queue with a particular sorting predicate
top,(24)% Return the best element
size,(24)% Return the number of elements
empty,(24)% Return true if there are no elements
insert,(24)% Insert an element of type t
remove;(24)% Remove the best element and return it
.end_keep
.keep
pt = _proctype (t, t) _returns (_bool);
at = _array[t];
_rep = _record [a: at, p: pt];
.end_keep
.keep
create =(8)_proc (p: pt) _returns (_cvt);
_return (_rep${a: at$create(1), p: p});(48)% Low index of array must be 1 !
_end create;
.end_keep
.keep
top =(8)_proc (x: _cvt) _returns (t) _signals (empty);
_return (at$bottom(x.a));
_except _when bounds: _signal empty; _end;
_end top;
.end_keep
.keep
size =(8)_proc (x: _cvt) _returns (_int);
_return (at$size(x.a));
_end size;
.end_keep
.keep
empty =(8)_proc (x: _cvt) _returns (_bool);
_return (at$size(x.a) = 0);
_end empty;
.end_keep
.keep
insert =(8)_proc (x: _cvt, v: t);
a: at := x.a;
p: pt := x.p;
at$addh(a, v);(48)% Make room for new item
son: _int := at$high(a);(48)% Node to place v if father wins
dad: _int := son/2;(48)% Get index of father
_while dad > 0 _cand p(v, a[dad]) _do(48)% While father loses
a[son] := a[dad];(48)% Move father down
son, dad := dad, dad/2;(48)% Get new son, father
_end;
a[son] := v;(48)% Insert the element into place
_end insert;
.end_keep
.keep
remove = _proc (x: _cvt) _returns (t) _signals (empty);
a: at := x.a;
p: pt := x.p;
r: t := at$bottom(a);(48)% Save best for later return
_except _when bounds: _signal empty; _end;
v: t := at$remh(a);(48)% Remove last element
max_son: _int := at$size(a);(48)% Get new size
_if max_son = 0 _then _return (r); _end;(48)% If now empty, we're done
max_dad: _int := max_son/2;(48)% Last node with a son
dad: _int := 1;(48)% Node to place v if it beats sons
_while dad <= max_dad _do(48)% While node has a son
son: _int := dad*2;(48)% Get the first son
s: t := a[son];
_if son < max_son(48)% If there is a second son
_then(48)% Find the best son
ns: t := a[son + 1];
_if p(ns, s) _then son, s := son + 1, ns; _end;
_end;
_if p(v, s) _then _break; _end;(48)% If v beats son, we're done
a[dad] := s;(48)% Move son up
dad := son;(48)% Move v down
_end;
a[dad] := v;(48)% Insert the element into place
_return (r);(48)% Return the previous best element
_end remove;
_end p_queue;
.end_keep
.pf 0
.fs 0
.bp
.nr page 30
.de nshow
.be nshow_block
.nv ols ls
.nv ls 100
.sp (ols-100)/100
.nv indent indent
.hx indent indent 10
.nf
.nv pfont 12
.nv font 12
.em
3Part D - Text Formatter
.sp 2
.para
The following program is a simple text formatter.
The input consists of a sequence of unformatted text lines
mixed with commands lines.
Each line is terminated by a newline character,
and command lines begin with a period to distinguish them from text lines.
For example:
.nshow
Justification only occurs in "fill" mode.
In "nofill" mode, each input text line is output without modification.
The .br command causes a line-break.
.br
Just like this.
.eshow
The program produces justified, indented, and paginated text.
For example:
.nshow
Justification only occurs in "fill" mode. In "nofill" mode,
each input text line is output without modification. The .br
command causes a line-break.
Just like this.
.eshow
.para
The output text is indented 10 spaces from the left margin,
and is divided into pages of 50 text lines each.
A header,
giving the page number,
is output at the beginning of each page.
.para
An input text line consists of a sequence of words and word-break characters.
The word-break characters are space, tab, and newline;
all other characters are constituents of words.
Tab stops are considered to be every eight spaces.
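With tab stops every eight columns, the width of a tab depends only on the current length of the line, which is how add_tab in the line cluster computes it.
A minimal sketch in Python (the function name tab_width is hypothetical; % is the Python remainder operator):

```python
TAB_STOP = 8  # tab stops are every eight character positions


def tab_width(line_length: int) -> int:
    """Return the number of positions a tab occupies on a line of the given length."""
    return TAB_STOP - (line_length % TAB_STOP)
```

A tab at the very start of a line, or exactly on a tab stop, therefore advances a full eight positions.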
.para
The formatter has two basic modes of operation.
In "nofill" mode,
each input text line is output without modification.
In "fill" mode,
input is accepted until no more words can fit on the current output line.
(An output line has 60 characters.)`
Newline characters are treated essentially as spaces.
Extra spaces are then added between words until the last word has its last character
in the rightmost position of the line.
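The way extra spaces are shared out can be sketched in a few lines of Python.
This is a simplified model, not the program given below: it assumes the line is just a list of words separated by single spaces (no tabs or leading spaces), and the function name justify is hypothetical.
Each gap receives the same share of the missing width, and the leftmost gaps take one more each until the remainder is used up, as enlarge_spaces does.

```python
def justify(words, width):
    """Pad the gaps between words so that the line is exactly width characters long."""
    plain = " ".join(words)
    gaps = len(words) - 1
    diff = width - len(plain)
    if gaps < 1 or diff <= 0:
        return plain                      # nothing to adjust, or line already too long
    each, extra = divmod(diff, gaps)      # share per gap, and leftover to hand out
    parts = []
    for index, word in enumerate(words[:-1]):
        parts.append(word)
        # one original space, plus the equal share, plus one more for the first gaps
        parts.append(" " * (1 + each + (1 if index < extra else 0)))
    parts.append(words[-1])
    return "".join(parts)
```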
.para
In fill mode, any input line that starts with a word-break character
causes a line-break:
the current output line is neither filled nor adjusted,
but is output as is.
An "empty" input line (one starting with a newline character)
causes a line-break and then causes a blank line to be output.
.para
The formatter accepts three different commands:
.sr list_left_margin 3
.sr list_indent 8
.ilist
.br causes a line-break
.next
.nf causes a line-break, and changes the mode to "nofill"
.next
.fi causes a line-break, and changes the mode to "fill"
.end_list
.para
The program performs input and output on streams,
which are connections (channels) to text files.
The following operations on streams are used:
.sr list_indent 12
.ilist
empty tests if the end of the file has been reached
.next
getc removes and returns the next character from the stream
.next
peekc like 2getc, but the character is not removed
.next
getl removes and returns (the remainder of) the input line
and removes but does not return the terminating newline character
.next
putc outputs a character, with newline indicating end of line
.next
puts outputs the characters of a string using 2putc
.next
close closes the stream and associated output file, if any
.end_list
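To make the behaviour of these operations concrete, here is a small in-memory stand-in written in Python.
It is only a sketch: the class names Stream and EndOfFile are hypothetical, input comes from a string instead of a file, output is collected in a list, and raising EndOfFile from getl at end of input is an assumption (the program below relies on end_of_file only for peekc).

```python
class EndOfFile(Exception):
    """Stands in for the end_of_file exception."""


class Stream:
    def __init__(self, text=""):
        self._text = text      # pending input
        self._pos = 0
        self.output = []       # characters written so far
        self.closed = False

    def empty(self):
        return self._pos >= len(self._text)

    def peekc(self):
        if self.empty():
            raise EndOfFile()
        return self._text[self._pos]       # the character is not removed

    def getc(self):
        c = self.peekc()
        self._pos += 1                     # the character is removed
        return c

    def getl(self):
        if self.empty():
            raise EndOfFile()              # assumption, see the lead-in
        end = self._text.find("\n", self._pos)
        if end == -1:
            end = len(self._text)
        line = self._text[self._pos:end]
        self._pos = min(end + 1, len(self._text))  # drop the newline, do not return it
        return line

    def putc(self, c):
        self.output.append(c)

    def puts(self, s):
        for c in s:                        # puts is defined in terms of putc
            self.putc(c)

    def close(self):
        self.closed = True
```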
.bp
.nr page 32
.so r;box rmac
.nf
.nr boxfont 3
1Module Dependency Diagram
.sp 2
box(reader)
.sp 4
box(do_line)
.sp 4
box(do_text_line) box(do_command)
.sp 4
dbox(page) box(read_name)
.sp 5
dbox(line)
.sp 4
dbox(word) box(put_spaces)
.sp 7
dbox(stream)
.sp 4
Note: boxes with a double line at the top indicate clusters.
.fi
.bp
.nr page 33
.pf D
.fs D
.nf
.ls 1.1
.keep
reader = _proc (instream, outstream, errstream: stream)
% Read the instream, processing it and placing the output on
% outstream and writing error messages on errstream.
p: page := page$create (outstream)
_while ~stream$empty (instream) _do
do_line (instream, p, errstream)
_end
page$terminate (p)
_end reader
.end_keep
.keep
do_line = _proc (instream: stream, p: page, errstream: stream)
% Process an input line. This procedure reads one line from
% instream. It is then processed either as a text line or as
% a command line, depending upon whether or not the first
% character of the line is a period.
c: _char := stream$peekc (instream)
_if c = '.' _then
do_command (instream, p, errstream)
_else
do_text_line (instream, p)
_end
_end do_line
.end_keep
.keep
do_text_line = _proc (instream: stream, p: page)
% Process a text line. This procedure reads one line from
% instream and processes it as a text line. If the first
% character is a word-break character, then a line-break is
% caused. If the line is empty, then a blank line is output.
% Otherwise, the words and word-break characters in the line
% are processed in turn.
c: _char := stream$getc (instream)
_if c = '\n' _then(40)% empty input line
page$skip_line (p)
_return
_end
_if c = ' ' _cor c = '\t' _then
page$break_line (p)
_end
_while c ~= '\n' _do
_if c = ' ' _then
page$add_space (p)
_elseif c = '\t' _then
page$add_tab (p)
_else
w: word := word$scan (c, instream)
page$add_word (p, w)
_end
c := stream$getc (instream)
_end
page$add_newline (p)
_end do_text_line
.end_keep
.keep
do_command = _proc (instream: stream, p: page, errstream: stream)
% Process a command line. This procedure reads one line from
% instream and processes it as a command.
stream$getc (instream)(40)% skip the period
n: _string := read_name (instream)
_if n = "br" _then
page$break_line (p)
_elseif n = "fi" _then
page$break_line (p)
page$set_fill (p)
_elseif n = "nf" _then
page$break_line (p)
page$set_nofill (p)
_else
stream$puts ("'", errstream)
stream$puts (n, errstream)
stream$puts ("' not a command.\n", errstream)
_end
stream$getl (instream)(32)% read remainder of input line
_end do_command
.end_keep
.keep
read_name = _proc (instream: stream) _returns (_string)
% This procedure reads a command name from instream. The
% command name is terminated by a space or a newline. The
% command name is removed from instream; the terminating space
% or newline is not.
s: _string := ""
_while _true _do
c: _char := stream$peekc (instream)
_except _when end_of_file: _return (s) _end
_if c = ' ' _cor c = '\n' _then
_return (s)
_end
s := _string$append (s, stream$getc (instream))
_end
_end read_name
.end_keep
.bp
.keep
page = _cluster _is create, add_word, add_space, add_tab, add_newline,
break_line, skip_line, set_fill, set_nofill, terminate
% The page cluster does the basic formatting. It supports the
% basic actions: BREAK_LINE, SKIP_LINE, SET_FILL, SET_NOFILL,
% TERMINATE. It performs the appropriate actions for the
% basic components of the input: WORDs, SPACEs, TABs, and
% NEWLINEs. It maintains a current output line for the
% purposes of performing justification. It performs
% pagination and the production of headings. For this purpose
% it maintains the current line number and the current page
% number.
_rep = _record [
line: line,(40)% The current line.
fill: _bool,(40)% True <==> in fill mode.
lineno: _int,(40)% The number of lines output
% so far on this page (not
% including any header lines).
pageno: _int,(40)% The number of the current
% output page.
outstream: stream(40)% The output stream.
]
.end_keep
.keep
create = _proc (outstream: stream) _returns (_cvt)
% Create a page object. The first page is number 1, there are
% no lines yet output on it. Fill mode is in effect.
_return ( _rep${(24)line: line$create (),
fill: _true,
lineno: 0,
pageno: 1,
outstream: outstream})
_end create
.end_keep
.keep
add_word = _proc (p: _cvt, w: word)
% Process a word. This procedure adds the word w to the
% output document. If in nofill mode, then the word is simply
% added to the end of the current line (there is no
% line-length checking in nofill mode). If in fill mode, then
% we first check to see if there is room for the word on the
% current line. If the word will not fit on the current line,
% we first justify and output the line and then start a new
% one. However, if the line is empty and the word won't fit
% on it, then we just add the word to the end of the line; if
% the word won't fit on an empty line, then it won't fit on
% any line, so we have no choice but to put it on the current
% line, even if it doesn't fit.
_if p.fill _cand ~line$empty (p.line) _then
h: _int := word$width (w)
_if line$length (p.line) + h > 60 _then
line$justify (p.line, 60)
output_line (p)
_end
_end
line$add_word (p.line, w)
_end add_word
.end_keep
.keep
add_space = _proc (p: _cvt)
% Process a space -- just add it to the current line.
line$add_space (p.line)
_end add_space
.end_keep
.keep
add_tab = _proc (p: _cvt)
% Process a tab -- just add it to the current line.
line$add_tab (p.line)
_end add_tab
.end_keep
.keep
add_newline = _proc (p: _cvt)
% Process a newline. If in nofill mode, then the current line
% is output as is. Otherwise, a newline is treated just like
% a space.
_if ~p.fill
_then output_line (p)
_else line$add_space (p.line)
_end
_end add_newline
.end_keep
.keep
break_line = _proc (p: _cvt)
% Cause a line break. If the line is not empty, then it is
% output as is. Line breaks have no effect on empty lines --
% multiple line breaks are the same as one.
_if ~line$empty (p.line) _then output_line (p) _end
_end break_line
.end_keep
.keep
skip_line = _proc (p: _cvt)
% Cause a line break and output a blank line.
break_line (_up (p))
output_line (p)(32)% line is empty
_end skip_line
.end_keep
.keep
set_fill = _proc (p: _cvt)
% Enter fill mode.
p.fill := _true
_end set_fill
.end_keep
.keep
set_nofill = _proc (p: _cvt)
% Enter nofill mode.
p.fill := _false
_end set_nofill
.end_keep
.keep
terminate = _proc (p: _cvt)
% Terminate the output document.
break_line (_up (p))
_if p.lineno > 0 _then
stream$putc ('\p', p.outstream)
_end
stream$close (p.outstream)
_end terminate
.end_keep
.keep
%(8)Internal procedure.
output_line = _proc (p: _rep)
% Output line is used to keep track of the line number and the
% page number and to put out the header at the top of each
% page.
_if p.lineno = 0 _then(40)% print header
stream$puts ("\n\n", p.outstream)
put_spaces (10, p.outstream)
stream$puts ("Page ", p.outstream)
stream$puts (int2string (p.pageno), p.outstream)
stream$puts ("\n\n\n", p.outstream)
_end
p.lineno := p.lineno + 1
line$output (p.line, p.outstream)
_if p.lineno = 50 _then
stream$putc ('\p', p.outstream)
p.lineno := 0
p.pageno := p.pageno + 1
_end
_end output_line
_end page
.end_keep
.keep
put_spaces = _proc (n: _int, outstream: stream)
% This procedure outputs N spaces to outstream.
_for i: _int _in _int$from_to_by (1, n, 1) _do
stream$putc (' ', outstream)
_end
_end put_spaces
.end_keep
.bp
.keep
line = _cluster _is create, add_word, add_space, add_tab, length,
empty, justify, output
% A line is a mutable sequence of words, spaces, and tabs.
% The length of a line is the number of character positions
% that would be used if the line were output. One may output
% a line onto a stream, in which case the line is made empty
% after printing. One may also justify a line to a given
% length, which means that some spaces in the line will be
% enlarged to make the length of the line equal to the desired
% length. Only spaces to the right of all tabs are subject to
% justification. Furthermore, spaces preceding the first word
% in the output line or preceding the first word following a
% tab are not subject to justification. If there are no
% spaces subject to justification or if the line is too long,
% then no justification is performed and no error message is
% produced.
token = _oneof [
space: _int,(32)% the int is the width of the space
tab: _int,(32)% the int is the width of the tab
word: word
]
at = _array [token]
_rep = _record [
length: _int,(32)% the current length of the line
stuff: at(32)% the contents of the line
]
.end_keep
.keep
create = _proc () _returns (_cvt)
% Create an empty line.
_return (_rep${
length: 0,
stuff: at$new ()
})
_end create
.end_keep
.keep
add_word = _proc (l: _cvt, w: word)
% Add a word at the end of the line.
at$addh (l.stuff, token$make_word (w))
l.length := l.length + word$width (w)
_end add_word
.end_keep
.keep
add_space = _proc (l: _cvt)
% Add a space at the end of the line.
at$addh (l.stuff, token$make_space (1))
l.length := l.length + 1
_end add_space
.end_keep
.keep
add_tab = _proc (l: _cvt)
% Add a tab at the end of the line.
width: _int := 8 - (l.length//8)
l.length := l.length + width
at$addh (l.stuff, token$make_tab (width))
_end add_tab
.end_keep
.keep
length = _proc (l: _cvt) _returns (_int)
% Return the current length of the line.
_return (l.length)
_end length
.end_keep
.keep
empty = _proc (l: _cvt) _returns (_bool)
% Return true if the line is empty.
_return (at$size(l.stuff) = 0)
_end empty
.end_keep
.keep
justify = _proc (l: _cvt, len: _int)
% Justify the line, if possible, so that its length is equal
% to LEN. Before justification, any trailing spaces are
% removed. If the line length at that point is greater than or
% equal to the desired length, then no action is taken.
% Otherwise, the set of justifiable spaces is found, as
% described above. If there are no justifiable spaces, then
% no further action is taken. Otherwise, the justifiable
% spaces are enlarged to make the line length the desired
% length. Failure is signalled if justification is attempted
% but the resulting line length is incorrect. This condition
% indicates a bug in justify; it should never be signalled,
% regardless of the arguments to justify.
remove_trailing_spaces (l)
_if l.length >= len _then _return _end
diff: _int := len - l.length
first: _int := find_first_justifiable_space (l)
_except _when none: _return _end
enlarge_spaces (l, first, diff)
_if l.length ~= len _then _signal failure ("justification failed") _end
_end justify
.end_keep
.keep
output = _proc (l: _cvt, outstream: stream)
% Output the line and reset it.
_if ~empty (_up (l)) _then
put_spaces (10, outstream)
_for t: token _in at$elements (l.stuff) _do
_tagcase t
_tag word (w: word):
word$output (w, outstream)
_tag space, tab (width: _int):
put_spaces (width, outstream)
_end
_end
_end
stream$putc ('\n', outstream)
l.length := 0
at$trim (l.stuff, 1, 0)
_end output
.end_keep
.keep
%(8)Internal procedures.
remove_trailing_spaces = _proc (l: _rep)
% Remove all trailing spaces from the line.
_while at$size (l.stuff) > 0 _do
_tagcase at$top (l.stuff)
_tag word, tab:
_break
_tag space (width: _int):
at$remh (l.stuff)
l.length := l.length - width
_end
_end
_end remove_trailing_spaces
.end_keep
.keep
find_first_justifiable_space = _proc (l: _rep) _returns (_int) _signals (none)
% Find the first justifiable space. This space is the first
% space after the first word after the last tab in the line.
% Return the index of the space in the array. Signal NONE if
% there are no justifiable spaces.
a: at := l.stuff
_if at$size (a) = 0 _then _signal none _end
lo: _int := at$low (a)
hi: _int := at$high (a)
i: _int := hi
% find the last tab in the line (if any)
_while i>lo _cand ~token$is_tab (a[i]) _do
i := i - 1
_end
% find the first word after it (or the first word in the line)
_while i<=hi _cand ~token$is_word (a[i]) _do
i := i + 1
_end
% find the first space after that
_while i<=hi _cand ~token$is_space (a[i]) _do
i := i + 1
_end
_if i>hi _then _signal none _end
_return (i)
_end find_first_justifiable_space
.end_keep
.keep
enlarge_spaces = _proc (l: _rep, first, diff: _int)
% Enlarge the spaces in the array whose indexes are at least
% FIRST. Add a total of DIFF extra character widths of space.
nspaces: _int := count_spaces (l, first)
_if nspaces = 0 _then _return _end
neach: _int := diff/nspaces
nextra: _int := diff//nspaces
_for i: _int _in _int$from_to_by (first, at$high (l.stuff), 1) _do
_tagcase l.stuff[i]
_tag space (width: _int):
width := width + neach
l.length := l.length + neach
_if nextra > 0 _then
width := width + 1
l.length := l.length + 1
nextra := nextra - 1
_end
l.stuff[i] := token$make_space (width)
_others:
_end
_end
_end enlarge_spaces
.end_keep
.keep
count_spaces = _proc (l: _rep, i: _int) _returns (_int)
% Return a count of the number of spaces in the line whose
% indexes in the array are at least I.
count: _int := 0
_while i <= at$high (l.stuff) _do
_tagcase l.stuff[i]
_tag space:
count := count + 1
_others:
_end
i := i + 1
_end
_return (count)
_end count_spaces
_end line
.end_keep
.bp
.keep
word = _cluster _is scan, width, output
% A word is an item of text. It may be output to a stream.
% It has a width, which is the number of character positions
% that are taken up when the word is printed.
_rep = _string
.end_keep
.keep
scan = _proc (c: _char, instream: stream) _returns (_cvt)
% Construct a word whose first character is C and whose
% remaining characters are to be removed from the instream.
s: _string := _string$c2s (c)
_while _true _do
c := stream$peekc (instream)
_except _when end_of_file: _break _end
_if c = ' ' _cor c = '\t' _cor c = '\n' _then
_break
_end
s := _string$append (s, stream$getc (instream))
_end
_return (s)
_end scan
.end_keep
.keep
width = _proc (w: _cvt) _returns (_int)
% Return the width of the word.
_return (_string$size (w))
_end width
.end_keep
.keep
output = _proc (w: _cvt, outstream: stream)
% Output the word.
stream$puts (w, outstream)
_end output
_end word
.end_keep
.pf 0
.fs 0
.fi
.ls 1.5

354
doc/clu/except.refman Normal file

@ -0,0 +1,354 @@
.sr sect_handle Section`12.2
.sr app_io Appendix`III
.
.chapter "Exception Handling and Exits"
.para
A routine is designed to perform a certain task.
However, in certain cases that task may be impossible to perform.
In such a case, instead of returning normally (which would
imply successful performance of the intended task), the
routine should notify its caller by signalling an 2exception*,
consisting of a descriptive name and zero or more result objects.
.para
For example,
the procedure string$fetch takes a string and an integer index and returns
the character of the string with the given index.
However,
if the integer is not a legal index into the string,
the exception bounds is signalled instead.
The type specification of a routine contains a description of the exceptions
it may signal;
for example,
string$fetch is of type
.show
proctype (string, int) returns (char) signals (bounds)
.eshow
.para
The exception handling mechanism consists of two parts,
the signalling of exceptions and the handling of exceptions.
Signalling is the way a routine notifies its caller of an exceptional condition;
handling is the way the caller responds to such notification.
A signalled exception always goes to the immediate caller,
and the exception must be handled in that caller.
When a routine signals an exception,
the current activation of that routine terminates and the corresponding invocation
(in the caller) is said to 2raise* the exception.
When an invocation raises an exception E,
control immediately transfers to the closest handler for E that is attached to
a statement containing the invocation.
When execution of the handler completes,
control passes to the statement following the one to which the handler is attached.
.section "Signal Statement"
.para
An exception is signalled with a signal statement, which has the form:
.show
signal name lbkt ( expression, etc ) rbkt
.eshow
A signal statement may appear anywhere in the body of a routine.
The execution of a signal statement begins with evaluation of the expressions (if any),
from left to right,
to produce a list of 2exception* 2results*.
The activation of the routine is then terminated.
Execution continues as described in sect_handle below.
.para
The exception name must be either one of the exception names listed in
the routine heading,
or 2failure*.
(The 2failure* exception is implicitly part of all routine headings and types;
it may not be declared explicitly.)
If the corresponding exception specification in the heading has the form
.show
name (T1, etc, Tn)
.eshow
then there must be exactly 2n* expressions in the signal statement,
and the type of the 2ith* expression must be included in Ti.
Otherwise,
if the name is 2failure*,
then there must be exactly one expression present,
whose type is string.
.para
The following useless procedure contains a number of examples
of signal statements:
.show 7
signaller = s(1)proc (i: int) returns (int) 
signals (zero, negative(int))
t(1)if i < 0 then signal negative(-i)
t(1) elseif i > 0 then return(i)
t(1) elseif i = 0 then signal zero
t(1) else signal failure("unreachable statement executed!")
t(1) end
t(1)end signaller
.eshow
.section "Except Statement"
.para
When a routine activation terminates by signalling an exception,
the corresponding invocation (the text of the call)
is said to 2raise* that exception.
When an exception is raised,
the caller specifies what action should be taken by the use of handlers
attached to statements.
.para
A statement with handlers attached is called an except statement,
and has the form:
.show 3
statement except s(1)lcurly when_handler rcurly
t(1)lbkt others_handler rbkt
t(1)end
.eshow
where
.show 4
.long_def others_handler
.def1 when_handler "when name , etc lbkt ( decl , etc ) rbkt : body"
.or "when name , etc ( * ) : body"
.def1 others_handler "others lbkt ( idn : type_spec ) rbkt : body"
.eshow
At least one handler must be present.
Let S be the statement to which the handlers are attached,
and let X be the entire except statement.
Each when_handler specifies one or more exception names and a body.
The body is executed if an exception with one of those names is raised
by an invocation in S.
All of the names listed in the when_handlers must be distinct.
The optional others_handler is used to handle all exceptions
not explicitly named in the when_handlers.
The statement S can be any form of statement,
and can even contain other except statements.
.para
If, during the execution of S,
some invocation in S raises an exception E,
control immediately transfers to the closest handler for E that is
attached to a statement containing the invocation.
When execution of the handler completes
(assuming no exception causes a transfer out of the handler),
control passes to the statement following the one to which the handler is
attached.
Thus if the closest handler is attached to S,
the statement following X is executed next.
If execution of S terminates normally
(i.e., control never transfers to a handler attached to S),
then the statement following X is executed next.
.para
An exception raised inside a handler is treated the same as any other exception:
control passes to the closest handler for that exception.
Note that an exception raised in some handler attached to S cannot be handled
by any handler attached to S;
either the exception is handled within the handler,
or it is handled by some handler attached to a statement containing X.
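.para
As a sketch of this rule,
consider the following fragment
(p, q, and r are hypothetical procedures, each of which may signal e1):
.show
begin
    p ()
       except when e1: q () end
    end
   except when e1: r () end
.eshow
If p signals e1, the inner handler invokes q.
If q in turn signals e1,
the inner handler is not considered again;
control passes to the handler attached to the enclosing block,
which invokes r.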
.para
We now consider the forms of handlers in more detail.
The form
.show
when name , etc lbkt ( decl , etc ) rbkt : body
.eshow
is used to handle exceptions with the given names
when the exception results are of interest.
The optional declared variables,
which are local to the handler,
are assigned the exception results before the body is executed.
Every exception potentially handled by this form must have the same number
of results as there are declared variables,
and the types of the results must match exactly the types of the variables.
The form
.show
when name , etc ( * ) : body
.eshow
handles all exceptions with the given names,
regardless of whether or not there are exception results;
any actual results are discarded.
Hence exceptions with differing numbers and types of results can be handled together.
.para
The form
.show
others lbkt ( idn : type_spec ) rbkt : body
.eshow
is optional,
and must appear last in a handler list.
This form handles any exception not handled by other handlers in the list.
If a variable is declared,
it must be of type string.
The variable,
which is local to the handler,
is assigned a lower case string representing the actual exception name;
any results are discarded.
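.para
The three forms can be combined in one handler list.
The following sketch assumes a hypothetical procedure lookup of type
proctype (string) returns (int) signals (not_found(string), bad_key(char), locked),
and assumes that key is a string variable:
.show
n: int
n := lookup (key)
   except when not_found (k: string): n := 0
          when bad_key, locked (*): n := -1
          others (name: string): signal failure ("lookup: " || name)
          end
.eshow
Here not_found delivers its string result to k,
bad_key and locked are handled together although they have different numbers of results,
and any other exception raised by the invocation is converted to a
failure exception carrying the original exception name.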
.para
Note that exception results are ignored when matching exceptions to handlers;
only the names of exceptions are used.
Thus the following is illegal,
in that int$div signals zero_divide without any results,
but the closest handler has a declared variable:
.show 6
begin
y: int := 0
x: int := 3 / y
except when zero_divide (z: int): return end
end
except when zero_divide: return end
.eshow
.para
An invocation need not be surrounded by except statements
that handle all potential exceptions.
This policy was adopted because in many cases the programmer can prove that
a particular exception will not arise.
For example, the invocation int$div(x,`7)
will never signal zero_divide.
However, this policy does lead to the possibility that some invocation may raise
an exception for which there is no handler.
To avoid this situation,
every routine body is contained implicitly in an except statement of the form
.show 4
begin s(1)2routine body* end
except s(1)when failure (s: string): s(2)signal failure(s);
t(1)others (s: string):t(2)signal failure("unhandled exception: " || s);
t(1)end;
.eshow
2Failure* exceptions are propagated unchanged;
an exception named 2name* becomes failure("unhandled exception: 2name*").
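.para
For instance, in the following sketch
(first is a hypothetical procedure written for this illustration)
the invocation of array[int]$fetch signals bounds when the array is empty,
and first provides no handler for it:
.show
first = proc (a: array[int]) returns (int)
    return (a[array[int]$low (a)])
    end first
.eshow
Applied to an empty array,
an invocation of first therefore raises failure("unhandled exception: bounds"),
which a caller may handle like any other exception:
.show
n: int
n := first (a)
   except when failure (why: string): n := 0 end
.eshow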
.section "An Example"
.para
We now present an example demonstrating the use of exception handlers.
We will write a procedure, sum_stream,
which reads a sequence of signed decimal integers from a character
stream and returns the sum of those integers.
The stream is viewed as containing a sequence of fields separated by spaces;
each field must consist of a non-empty sequence of
digits, optionally preceded by a single minus sign.
Sum_stream has the form
.show 5
sum_stream = s(1)proc (s: stream) returns (int) signals(s(2)overflow,
t(2)unrepresentable_integer(string),
t(2)bad_format (string));
t(1)etc
t(1)end sum_stream;
.eshow
Sum_stream signals overflow if the sum of the
numbers or an intermediate sum is outside the implemented range of integers.
Unrepresentable_integer is signalled if the stream contains
an individual number that is outside the implemented range of integers.
Bad_format is signalled if the stream contains a
field that is not an integer.
.para
We will use the 2getc* operation of the 2stream* data type
(see app_io), whose type is
.show
proctype (stream) returns (char) signals (end_of_file, not_possible(string));
.eshow
This operation returns the next character from the stream, unless the
stream is empty, in which case end_of_file is signalled.
Not_possible is signalled if the operation cannot be performed on the
given stream (e.g., it is an output stream, or does not allow character
operations, etc.)
We will assume that we are given a stream for which 2getc* is always possible.
.para
The following procedure is used to convert character strings to integers:
.show 5
s2i = s(1)proc (s: string) returns (int) 
signals (s(2)invalid_character(char),
t(2)bad_format,
t(2)unrepresentable_integer);
t(1)etc
t(1)end s2i;
.eshow
2S2i* signals invalid_character if its string argument
contains a character other than a digit or a minus sign.
Bad_format is signalled if the string contains
a minus sign following a digit, more than one minus
sign, or no digits.
Unrepresentable_integer is signalled if the string
represents an integer that is outside the implemented range of integers.
.para
An implementation of sum_stream is presented in Figure current_figure.
.begin_figure "The sum_stream procedure."
.show
.ta 14 18 22
sum_stream = proc (s: stream) returns (int) signals (s(1)overflow,
t(1)unrepresentable_integer(string),
t(1)bad_format(string));
sum: int := 0;
num: string := "";
while true do
% skip over spaces between values; sum is valid, num is meaningless
c: char := stream$getc(s);
while c = ' ' do
c := stream$getc(s);
end;
% read a value; num accumulates new number, sum becomes previous sum
num := "";
while c ~= ' ' do
num := string$append(num, c);
c := stream$getc(s);
end;
except when end_of_file: end;
% restore sum to validity
sum := sum + s2i(num);
end;
except
when end_of_file: return(sum);
when unrepresentable_integer: signal unrepresentable_integer (num);
when bad_format,invalid_character(*): signal bad_format (num);
when overflow: signal overflow;
end;
end sum_stream;
.rtabs
.eshow
.finish_figure
There are two loops within an infinite loop:
one to skip spaces, and one to accumulate digits for conversion to a number.
Notice the placement of the inner end_of_file handler.
If end_of_file is raised in the second inner loop,
then the sum is computed correctly, and the first invocation
of stream$getc will again raise end_of_file.
This time, however, the infinite loop is terminated and
execution transfers to the other end_of_file handler,
which then returns the accumulated sum.
.para
We have placed the remaining exception handlers outside of the infinite loop
to avoid cluttering up the main part of the algorithm.
Each of these exception handlers could also
have been placed after the particular statement containing
the invocation that signalled the corresponding exception.
The (*) form is used in the handler for the bad_format and invalid_character
exceptions since the exception results are not used.
Note that the overflow handler catches exceptions
signalled by the int$add procedure, which is invoked
using the infix + notation.
Note also that in this example all of the exceptions
raised by sum_stream originate as exceptions signalled by lower-level modules.
Sum_stream simply reflects these exceptions
upwards in terms that are meaningful to its callers.
Although some of the names may be unchanged, the meanings
of the exceptions (and even the number of results) are
different in the two levels.
.para
As mentioned above, we have assumed stream$getc never signals
not_possible; if it does, then sum_stream will terminate, raising
the exception failure("unhandled exception: not_possible").
.section "Exits and the Placement of Handlers"
.para
A 2local* transfer of control can be effected by using an exit statement,
which has the form:
.show
exit name lbkt ( expression, etc ) rbkt
.eshow
An exit statement is similar to a signal statement except
that where the signal statement 2signals* an exception
to the 2calling* routine, the exit statement 2raises* the
exception directly in the 2current* routine.
An exception raised by an exit statement
must be handled (explicitly) by a containing except statement
with a handler of the form
.show
when name , etc lbkt ( decl , etc ) rbkt : body
.eshow
The types of the expressions in the exit statement must match exactly
the types of the variables declared in the handler.
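.para
As a sketch of a typical use,
the following hypothetical procedure leaves a loop with an exit statement
(find, found, and not_found are names chosen for this illustration):
.show
find = proc (a: array[int], x: int) returns (int) signals (not_found)
    for i: int in int$from_to_by (array[int]$low (a), array[int]$high (a), 1) do
        if a[i] = x then exit found (i) end
        end
       except when found (pos: int): return (pos) end
    signal not_found
    end find
.eshow
The exit raises found in find itself;
the handler attached to the for statement receives the index and returns it.
If the loop terminates normally, not_found is signalled to the caller.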
.para
The exit statement and the signal statement mesh nicely to form a uniform mechanism.
The signal statement can be viewed simply as terminating a routine activation;
an exit is then performed at the point of invocation in the caller.
(Because this exit is implicit,
it is not subject to the restrictions on exits listed above.)

548
doc/clu/exprs.refman Normal file

@ -0,0 +1,548 @@
.sr syntype Section`?.?
.sr store Section`11.2.1
.sr set Section`11.2.2
.sr typeapndx Appendix`II
.sr literals Sections 7.1 to 7.6
.sr constants Section`8.3
.sr procstmt Section`11.1
.sr forstmt Section`11.5.2
.sr updown Section`13.4
.sr signalling Section`12
.sr invoke Section`9.3
.
.sr s1 1
.sr s2 2
.
.am table_of_contents
. bp
. em
.chapter Expressions
.sr self Section`chapter
.para
An expression evaluates to an object in the CLU universe.
This object is said to be the 2result* or 2value* of the expression.
Expressions are used to name the object to which they evaluate.
The simplest forms of expressions are literals, variables, and routine names.
These forms directly name their result object.
More complex expressions are generally built up out of nested procedure invocations.
The result of such an expression is the value returned by the outermost invocation.
.para
Like many other languages,
CLU has prefix and infix operators for the common arithmetic and comparison operations,
and uses the familiar syntax for array indexing and record component selection
(e.g., 2a*[2i*] and 2r*.2s*).
However,
in CLU these notations are considered to be abbreviations for procedure calls.
This allows built-in types and user-defined types to be treated as uniformly as possible,
and also allows the programmer to use familiar notation when appropriate.
.para
In addition to invocation,
four other forms are used to build complex expressions out of simpler ones.
These are the conditional operators cand and cor (see self.8),
and the type conversion operations up and down (see self.10).
.para
There is a syntactically restricted form of expression called a 2primary*.
A primary is any expression that does not have a prefix or infix operator,
or parentheses,
at the top level.
In certain places,
the syntax requires a primary rather than a general expression.
This has been done to increase the readability of the resulting programs.
.para
As a general rule,
procedures with side effects should not be used in expressions,
and programs should not depend on the order in which expressions are evaluated.
However,
to avoid surprises,
the subexpressions of any expression are evaluated in strict left-to-right order.
.para
The various forms of expressions are explained below.
.
.section Literals
.
.para
Integer, real, character, string, boolean and null literals are expressions.
The syntax for literals is given in literals.
The type of a literal expression is the type of the object named by the literal.
For example,
true is of type bool,
"abc" is of type string,
etc.
.
.section Variables
.
.para
Variables are identifiers that name objects of a given type.
The syntactic type of a variable is the type given in the declaration of that variable,
and determines which objects may be named by the variable.
.
.section "Procedure and Iterator Names"
.
.para
Procedures and iterators may be defined as separate modules,
or in a cluster as operations of the type.
Those defined as separate modules are named by expressions of the form:
.show
idn lbkt [constant, etc ] rbkt
.eshow
The optional constants are the parameters of the procedure or iterator abstraction.
(Constants were discussed in constants.)
.para
When a procedure or iterator is defined as an operation of a type,
that type must be part of the name of the routine.
The form for naming an operation of a type is:
.show
type_spec $ name lbkt [constant, etc ] rbkt
.eshow
.para
The type of a procedure or iterator name is just the type of the named routine.
Some examples of procedure and iterator names are:
.show 4
primes
sort[int]
int$add
array [bool]$elements
.eshow
.
.section "Procedure Invocations"
.
.para
Procedure invocations have the form
.show
primary ( lbkt expression, etc rbkt )
.eshow
The primary is evaluated to obtain a procedure object,
and then the expressions are evaluated left-to-right to obtain the argument objects.
The procedure is invoked with these arguments,
and the object returned is the result of the entire expression.
For more discussion see invoke.
.para
The following expressions are invocations:
.show 3
p (x)
int$add (a, b)
within [3.2] (7.1, .003e7)
.eshow
.para
Any procedure invocation P(Es1, ... En) must satisfy two constraints:`
the type of P must be of the form
.show
proctype (Ts1, ... Tn) returns (R) signals (...)
.eshow
and the type of each expression Ei must be included in the corresponding type Ti.
The syntactic type of the entire invocation expression is given by R.
.para
Procedures can also be invoked as statements (see procstmt).
Iterators can only be invoked by a for statement (see forstmt).
.
.section "Selection Operations"
.
.para
Arrays and records are collections of objects.
Selection operations provide access to the individual elements or
components of the collection.
Simple notations are provided for invoking
the 2fetch* and 2store* operations of array types,
and the 2get* and 2set* operations of record types.
In addition,
these "syntactic sugarings" for selection operations may be used for
user-defined types with the appropriate properties.
.
.subsection "Element Selection"
.
.para
An element selection expression has the form:
.show
primary [ expression ]
.eshow
This form is just syntactic sugar for an invocation of a 2fetch* operation,
and is completely equivalent to:
.show
T$fetch (primary, expression)
.eshow
where T is the type of 2primary*.
.para
For example,
if a is an array of integers,
then
.show
a[27]
.eshow
is completely equivalent to the invocation
.show
array[int]$fetch (a, 27)
.eshow
.para
The element selection expression is not restricted to arrays.
The expression is legal whenever the corresponding invocation is legal.
In other words,
T (the type of 2primary*) must provide a procedure operation named 2fetch*,
which takes two arguments whose types include
the types of 2primary* and 2expression*,
and which returns a single result.
When 2primary* is an array[S] for some type S,
2expression* must be an int,
and the result has type S.
.para
The use of 2fetch* for user-defined types should be
restricted to types with array-like behavior.
Objects of such types will contain (along with other information)
a collection of objects,
where the collection can be indexed in some way.
For example,
it might make sense for an associative_memory type to provide
a 2fetch* operation to access the value associated with a key.
2Fetch* operations are for use in expressions;
thus they should never have side-effects.
.para
Array-like types may also provide a 2store* operation;
see store.
.
.subsection "Component Selection"
.
.para
The component selection expression has the form:
.show
primary 1.* name
.eshow
This form is just syntactic sugar for an invocation of a 2get_name*
operation, and is completely equivalent to:
.show
T$get_2name* (primary)
.eshow
where T is the type of 2primary*.
.para
For example, if x has type record [first: int, second: real], then
.show
x1.*first
.eshow
is completely equivalent to
.show
record[first: int, second: real] $ get_first (x)
.eshow
.para
The component selection expression is not restricted to records.
The expression is legal whenever the corresponding invocation is legal.
In other words,
T (the type of 2primary*) must provide a procedure operation named 2get_name*,
which takes one argument whose type includes the type of 2primary*,
and which returns a single result.
When T is a record type,
then T must have a selector called 2name*,
and the type of the result will be the type of the component named by that selector.
.para
The use of 2get* operations for user-defined types should be restricted to
types with record-like behavior.
Objects of such types will contain (along with other information)
one or more named objects.
For example,
it might make sense for a file type to provide a get_author operation,
which returns the name of a file's creator.
2Get* operations are intended for use in expressions;
thus they should never have side-effects.
.para
Types with named components may also provide 2set* operations;
see set.
.
.section "Array and Record Constructors"
.
.para
Constructors are expressions that enable users to create and initialize
arrays and records.
Constructors are not provided for user-defined types.
.
.subsection "Array Constructors"
.
.para
An array constructor has the form:
.show
type_spec $ [ lbkt expression: rbkt lbkt expression, etc rbkt ]
.eshow
The type specification must name an array type: array[T].
This is the type of the constructed array.
The expression preceding the ":" must evaluate to an integer,
and becomes the low bound of the constructed array.
If this expression is omitted,
the low bound is 1.
The expressions following the ":" are evaluated to obtain the elements of the array.
They correspond (left-to-right) to the indices
2low_bound*, 2low_bound*+1, 2low_bound*+2, ...
.para
For example,
the expression
.show
array[bool] $ [79: true, false]
.eshow
constructs a new boolean array with two elements:
true (at index 79),
and false (at index 80).
The expression
.show
array[ai] $ [ai$[], ai$[]]
.eshow
(where 2ai* is equated to array[int])
creates two distinct integer arrays,
both empty,
and creates a third array to hold them.
The low bound of each array is 1.
.para
For any array constructor with type_spec array[T],
the type of each element expression in the constructor must be included in T.
An array constructor is computationally equivalent to an array create operation,
followed by a number of addh operations.
However,
such a sequence of operations cannot be written as an expression.
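.para
As a sketch of this equivalence
(assuming that create takes the low bound as its only argument),
the declaration
.show
a: array[int] := array[int] $ [5: 10, 20, 30]
.eshow
has the same effect as the statement sequence
.show 4
a: array[int] := array[int]$create (5)
array[int]$addh (a, 10)
array[int]$addh (a, 20)
array[int]$addh (a, 30)
.eshow
after which a[5] is 10, a[6] is 20, and a[7] is 30.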
.
.subsection "Record Constructors"
.
.para
A record constructor has the form:
.show
type_spec $ { field, etc }
.eshow
where
.show
.def field "name, etc : expression"
.eshow
Whenever a field has more than one name, it is equivalent to a sequence of
fields, one for each name. Thus, the following two constructors are
equivalent:
.show 3
R = record [ a: int, b: int, c: int ]
R $ { a, b: 7, c: 9 }
R $ { a: 7, b: 7, c: 9 }
.eshow
.para
In a record constructor, the type specification
must name a record type: record`[Ss1:Ts1,`...,`Sn:Tn].
This will be the type of the constructed record. The component names
in the field list must be exactly the names Ss1,`...,`Sn,
although these names may appear in any order.
The expressions are evaluated left-to-right, and there is one evaluation per
component name even if component names are "factored." The results of these
evaluations form the components of a newly constructed record. This record is
the value of the entire constructor expression.
.para
As an example, consider the following record constructor:
.show 3
AS = array [string]
RT = record [list1, list2: AS, item: int]
RT $ {item: 2, list1, list2: AS$["Susan", "George", "Jan"]}
.eshow
This produces a record that contains an integer and two distinct arrays.
The arrays are distinct because the array constructor expression is evaluated
twice, once for 2list1* and once for 2list2*.
.para
The type of the expression for component Si must be included in Ti.
This constructor is computationally equivalent to a record create operation
(see typeapndx),
but that operation is not available to the user.
.
.section "Prefix and Infix Operators"
.
.para
CLU allows infix and prefix notation to be used as a shorthand for
the following operations.
The table shows the shorthand form and the equivalent expanded form for each operation.
For each operation,
the type T is the type of the first operand.
.show 23
.sp .5
.ta 1i 2.3i
Shorthand form Expansion
.sp .5
exprs1 ** exprs2 T$power (exprs1, exprs2)
exprs1 // exprs2 T$mod (exprs1, exprs2)
exprs1 / exprs2 T$div (exprs1, exprs2)
exprs1 * exprs2 T$mul (exprs1, exprs2)
exprs1 || exprs2 T$concat (exprs1, exprs2)
exprs1 + exprs2 T$add (exprs1, exprs2)
exprs1 - exprs2 T$sub (exprs1, exprs2)
exprs1 < exprs2 T$lt (exprs1, exprs2)
exprs1 <= exprs2 T$le (exprs1, exprs2)
exprs1 = exprs2 T$equal (exprs1, exprs2)
exprs1 >= exprs2 T$ge (exprs1, exprs2)
exprs1 > exprs2 T$gt (exprs1, exprs2)
exprs1 ~< exprs2 ~ (exprs1 < exprs2)
exprs1 ~<= exprs2 ~ (exprs1 <= exprs2)
exprs1 ~= exprs2 ~ (exprs1 = exprs2)
exprs1 ~>= exprs2 ~ (exprs1 >= exprs2)
exprs1 ~> exprs2 ~ (exprs1 > exprs2)
exprs1 & exprs2 T$and (exprs1, exprs2)
exprs1 | exprs2 T$or (exprs1, exprs2)
-` expr T$minus (expr)
~` expr T$not (expr)
.rtabs
.eshow
.para
Operator notation is used most heavily for the built-in types,
but may be used for user-defined types as well.
When these operations are provided for user-defined types,
they should always be side-effect-free,
and they should mean roughly the same thing as they do for the built-in types.
For example,
the comparison operations should only be used for types that have
a natural partial order.
Usually,
the comparison operations (lt, le, equal, ge, gt) will be of type
.show
proctype (T, T) returns (bool)
.eshow
the other binary operations (e.g., add, sub) will be of type
.show
proctype (T, T) returns (T) signals (...)
.eshow
and the unary operations will be of type
.show
proctype (T) returns (T) signals (...)
.eshow
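.para
As a sketch of these expansions,
assume x, y, and z are variables of type int and t, u are variables of type string
(the names are chosen for illustration).
Then the expressions
.show 2
(x + y) * z
t ~= u
.eshow
are shorthand for
.show 2
int$mul (int$add (x, y), z)
bool$not (string$equal (t, u))
.eshow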
.
.section "Cand and Cor"
.
.para
Two additional binary operators are provided.
These are the conditional and operator,
cand,
and the conditional or operator,
cor.
.show
expressions1 cand expressions2
.eshow
is the boolean 2and* of expressions1 and expressions2.
However,
if expressions1 is false,
expressions2 is never evaluated.
.show
expressions1 cor expressions2
.eshow
is the boolean 2or* of expressions1 and expressions2,
but expressions2 is not evaluated unless expressions1 is false.
.para
Conditional expressions can be used to avoid run-time errors.
For example,
the following boolean expressions can be used without fear of "bounds"
or "zero_divide" errors:
.show 2
(low_bound <= i) cand (i <= high_bound) cand (A[i] ~= 0)
(n = 0) cor (1000//n = 0)
.eshow
.para
For both cand and cor,
expressions1 and expressions2 must have syntactic type bool.
Uses of cand and cor are not equivalent to any procedure invocation.
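.para
The effect of cand can nevertheless be sketched with an if statement.
Assuming a is an array[int] and i, hi are int variables
(names chosen for illustration),
the declaration
.show
ok: bool := (i <= hi) cand (a[i] ~= 0)
.eshow
behaves like
.show 2
ok: bool
if i <= hi then ok := a[i] ~= 0 else ok := false end
.eshow
so that a[i] is evaluated only when i <= hi holds.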
.
.section Precedence
.
.para
When an expression is not fully parenthesized,
the proper nesting of subexpressions might be ambiguous.
The following precedence rules are used to resolve such ambiguity.
The precedence of each infix operator is given in the table below.
Higher precedence operations are performed first.
Prefix operators always have precedence over binary operators.
.ne 8
.para
The precedence for infix operators is as follows:
.show 12
.ta 1.5i 2i 2.5i 3i 3.5i 4i
.sp .5
Precedence Operators
.sp .5
5 **
.sp .5
4 * / //
.sp .5
3 + - ||
.sp .5
2 < <= = >= >
~< ~<= ~= ~>= ~>
.sp .5
1 & cand
.sp .5
0 | cor
.rtabs
.eshow
.para
The order of evaluation for operators of the same precedence is left-to-right,
except for **,
which is right-to-left.
.para
The following examples illustrate the precedence rules.
.show 9
.sp .5
.ta 3i
Expression Equivalent Form
.sp .5
a + b // c a + (b // c)
.sp .5
a + b - c (a + b) - c
.sp .5
a + b ** c ** d a + (b ** (c ** d))
.sp .5
a = b | c = d (a = b) | (c = d)
.sp .5
- a * b (-a) * b
.rtabs
.eshow
.
.section "Up and Down"
.
.para
There are no implicit type conversions in CLU.
Two forms of expression exist for explicit conversions.
These are:
.show 2
up (expression)
down (expression)
.eshow
.para
1Up* and 1down* may be used only within the body of a cluster operation.
1Up* changes the type of the expression from the representation type
of the cluster to the abstract type.
1Down* converts the type of the expression from the abstract type
to the representation type.
These conversions will be explained further in updown.
.section Force
.para
CLU has a single built-in procedure generator called force.
1Force* takes one type parameter,
and is written
.show
force [ type_spec ]
.eshow
The procedure force[T] has type
.show
proctype (any) returns (T) signals (wrong_type)
.eshow
If force[T] is applied to an object that is included in type T,
then it returns that object.
If force[T] is applied to an object that is not in type T,
then it signals "wrong_type" (see signalling).
.para
1Force* is a necessary companion to the type any.
The type any allows programs to pass around objects of arbitrary type.
However, to do anything substantive with an object,
one must use the basic operations of that object's type.
This raises a conflict with compile-time type-checking,
since an operation can only be applied when the arguments are known to be
of the correct types.
This conflict is resolved by using force.
1Force*[T] allows a program to check,
at run-time,
that a particular object is actually of type T.
If this check succeeds,
then the object can be used in all the ways appropriate for objects of type T.
.para
For example,
the procedure force[T] allows us to legally write the following code:
.show 2
x: any := 3
y: int := force [int] (x)
.eshow
while the following is illegal:
.show 2
x: any := 3
y: int := x
.eshow
because the type of 2y* (int) does not include the type of the expression 2x*
(any).
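.para
Since force[T] may signal wrong_type,
an invocation can be accompanied by a handler.
A minimal sketch (the fallback value 0 is arbitrary):
.show 4
x: any := "abc"
n: int
n := force [int] (x)
   except when wrong_type: n := 0 end
.eshow
Here x denotes a string,
so force[int] signals wrong_type and n becomes 0.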

315
doc/clu/e}d.refman Normal file

@ -0,0 +1,315 @@
.sr sect_type_set Section`13.5
.sr sect_library Section`4
.sr sect_semantics Section`3
.sr sect_expr Section`11
.sr sect_assign Section`9.2
.sr sect_except Section`12
.sr sect_module Section`13
.sr sect_mult_assn Section`9.2.2
.
.chapter "Scopes, Declarations, and Equates"
.para
We now describe how to introduce and use constants and variables,
and the scope of constant and variable names.
Scoping units are described first,
followed by a discussion of variables, and finally constants.
.section "Scoping Units"
.para
Scoping units follow the nesting structure of statements.
Generally,
a scoping unit is a body and an associated "header".
The scoping units are:
.nlist
From the start of a module to its end.
.nnext
From a cluster, proc, or iter to the matching end.
.nnext
From a then or else to the end of the corresponding body.
.nnext
From a for, do, or begin to the matching end.
.nnext
From a tag or others in a tagcase statement
to the end of the corresponding body.
.nnext
From a when or others in an except statement
to the end of the corresponding body.
.nnext
From the start of a type_set to its end.
.end_list
The last case above, the scope in a type_set, is a special case that
will be discussed in sect_type_set.
Whatever we say about scopes in the remainder of this section refers only to
cases 1 through 6.
.para
The structure of scoping units is such that if one scoping unit
overlaps another scoping unit (textually), then one is fully
contained in the other.
The contained scope is called a 2nested* scope, and the containing
scope is called a 2surrounding* scope.
.para
New constant and variable names may be introduced in a scoping unit.
Names for constants are introduced by equates, which are syntactically
restricted to appear grouped together at or near the beginning of scoping units.
For example, equates may appear at the beginning of a body,
but not after any statements in the body.
.para
In contrast, declarations, which introduce new variables,
are allowed wherever statements are allowed, and hence
may appear throughout a scoping unit.
Equates and declarations are discussed in more detail in the following two sections.
.para
In the syntax there are two distinct nonterminals
for identifiers: 2idn* and 2name*.
Any identifier introduced by an equate or declaration is an 2idn*,
as is the name of the module being defined, and any operations it has.
An 2idn* names a specific type or object.
The other kind of identifier is a 2name*.
A 2name* is used to refer to a subpiece of something,
and is always used in context; for example, 2names* are used
as record selectors.
The scope rules apply only to 2idns*.
.para
The scope rules are very simple:
.nlist
An idn may not be redefined in its scope.
.nnext
Any idn that is used as an external reference in a _module
may not be used for any other purpose in that module.
.end_list
Unlike other "block-structured" languages,
CLU prohibits the redefinition of an identifier in a nested scope.
An identifier used as an external reference names a module or constant;
the reference is resolved using the compilation environment (see sect_library).
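.para
As an illustrative sketch of the first rule (the procedure count_pos and its variables are hypothetical names, chosen only for this example), the inner declaration below is illegal, because n is already defined in a surrounding scope:
.show
count_pos = proc (a: array[int]) returns (int)
    n: int := 0
    for x: int in array[int]$elements (a) do
        n: int := x         % illegal: n may not be redefined in a nested scope
        end
    return (n)
    end count_pos
.eshow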
.section "Variables"
.para
Objects are the fundamental "things" in the CLU universe;
variables are a mechanism for denoting (i.e., naming) objects.
This underlying model is discussed in detail in sect_semantics.
A variable has two properties: its type,
and the object that it currently denotes (if any).
A variable is said to be 2uninitialized* if it does not denote any object.
.para
There are only three things that can be done with variables:
.nlist
New variables can be introduced.
Declarations perform this function, and are described below.
.nnext
An object may be assigned to a variable.
After an assignment the variable denotes the object assigned.
Assignment is discussed in sect_assign.
.nnext
A variable may be used as an expression.
The value of such an expression (i.e., the result of evaluating it)
is the object that the variable denotes at the time the expression is evaluated.
Expressions and their evaluation are described in sect_expr.
.end_list
.subsection "Declarations"
.para
Declarations introduce new variables.
The scope of a variable is from its declaration to the
end of the smallest scoping unit containing its declaration;
hence, variables must be declared before use.
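.para
For instance, in the following sketch (x is assumed to be an int variable declared earlier; the other names are hypothetical), y may be used only within the then body:
.show
if x > 0 then
    y: int := x         % the scope of y begins here
    z: int := y + 1     % legal: y has been declared
    end                 % the scope of y and z ends here
w: int := y             % illegal: y is not in scope
.eshow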
.para
There are two sorts of declarations: those with initialization,
and those without.
Simple declarations (those without initialization) take this form:
.show
.def decl "idn, etc : type_spec"
.eshow
A simple declaration introduces a list of variables,
all having the type given by the type_spec.
This type determines the types of objects that can be assigned to the variable.
Here are some examples of simple declarations:
.keep
.show 4
i: int s(1)% declare i to be an integer variable
i, j, k: char t(1)% declare i, j, and k to be character variables
x, y: complex t(1)% declare x and y to be complex number variables
z: any t(1)% declare z to be of type any; thus, z may denote any object
.eshow
.end_keep
The variables introduced in a simple declaration initially denote no objects,
i.e., they are uninitialized.
Attempts to use uninitialized variables (if not detected at compile-time)
cause the run-time exception
.show
failure("uninitialized variable")
.eshow
(Exceptions are discussed in sect_except.)
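.para
A minimal sketch of such a use (the variable names are hypothetical):
.show
i: int              % i is uninitialized
j: int := i + 1     % error: i denotes no object; if not detected at compile-time,
                    % this causes failure("uninitialized variable")
.eshow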
.subsection "Declarations with Initialization"
.para
A declaration with initialization combines
declarations and assignments into a single statement.
A declaration with initialization is entirely equivalent to
one or more simple declarations followed by an assignment statement.
The two forms of declaration with initialization are:
.show
idn : type_spec := expression
.eshow
and
.show
decl1, etc, decln := invocation
.eshow
These are equivalent to (respectively):
.show 2
idn : type_spec
idn := expression
.eshow
and
.show 2
decl1 etc decln % declaring idn1 etc idnm
idn1, etc, idnm := invocation
.eshow
In the second form, the order of the idns in the
assignment statement is the same as in the original declaration
with initialization.
(The invocation must return 2m* objects, one for each idn declared;
see sect_mult_assn.)
.para
Here are some examples of declarations with initialization:
.show 6
astr: array[string] := array[string]$create (1)
s(1)% declare astr to be an array variable and initialize it to an empty array
first, last: string, balance: int := acct$query (acct_no)
t(1)% declare first and last to be string variables, balance an integer variable,
t(1)% and initialize them to the results of a bank account query
.eshow
The above two statements are equivalent to the following sequences of statements:
.show 6
astr: array[string]
astr := array[string]$create (1)
first, last: string
balance: int
first, last, balance := acct$query (acct_no)
.eshow
.section "Equates and Constants"
.para
An equate allows a single identifier to be used as
an abbreviation for a constant that
may have a lengthy textual representation.
We use the term constant in a very narrow sense here:
constants,
in addition to being immutable,
must be computable at compile-time.
Constants are either
types (built-in or user defined),
or objects that are the results of evaluating constant expressions.
(Constant expressions are defined below.)
.end_list
.para
The syntax of equates is:
.show 5
.long_def constant
.def1 equate "idn = constant"
.or "idn = type_set"
.def1 constant type_spec
.or "expression % the expression must be a constant expression"
.eshow
This section describes only the first form of equate;
discussion of type_sets is deferred to sect_type_set.
.para
An equated identifier may be used as an expression.
The value of such an expression is the constant to which
the identifier is equated.
An equated identifier may not be used as the target of an assignment.
.para
The scope of an equated identifier is the smallest scoping unit
surrounding the equate defining it;
here we mean the entire scoping unit, not just the portion after the equate.
All the equates in a scoping unit must appear near the beginning of the scoping unit.
The exact placement of equates depends on the containing syntactic construct;
usually equates appear at the beginnings of bodies.
.para
Equates may be in any order within the group.
Thus, forward references among equates in the same scoping unit are
allowed, but cyclic dependencies are illegal.
For example,
.show 3
x = y
y = z
z = 3
.eshow
is a legal sequence of equates, but
.show 3
x = y
y = z
z = x
.eshow
is not.
Since equates introduce idns, the scoping restrictions on idns
apply (i.e., the idns may not be defined more than once).
.subsection "Abbreviations for Types"
.para
Identifiers may be equated to type
specifications, thus giving abbreviations for type names.
For example:
.show 7
at = array [int]
ot = oneof [there: rt, null: null]
rt = record [a: foo, b: bar]
pt = proctype (int, int) returns (int) signals (overflow)
it = itertype (int, int, int) yields (int) signals (bounds)
seq = sequence
mt = mark_table
.eshow
.para
Notice that since equates may not have cyclic dependencies,
directly recursive type specifications cannot be written.
However, this does not prevent the definition of recursive types:
clusters allow them to be written (see sect_module).
.subsection "Constant Expressions"
.para
Here we define the subset of objects that equated identifiers may denote,
by stating which expressions are constant expressions.
(Expressions are discussed in detail in sect_expr.)
A 2constant expression* is an expression that can be evaluated
at compile-time to produce an immutable object of a built-in type.
Specifically this includes:
.nlist
Literals.
.nnext
Identifiers equated to constants.
.nnext
Procedure and iterator names, including force [t] for any type t.
.nnext
Invocations of procedure operations of the built-in constant types
(except string$s2ac),
provided that all operands are constant expressions.
.nnext
Formal parameters.
.end_list
For completeness, here is a list of the built-in constant types:
null, int, real, bool, char, string,
oneof types, procedure types, and iterator types.
We explicitly forbid applying any operations to formal parameters,
since the values of formal parameters are not known at compile-time.
.para
Here are some examples of equates involving expressions:
.show 13
hash_modulus = 29
pi = 3.14159265
win = true
control_c = '\003'
prompt_string = "Input: "
nl = string$c2s ('\n')
prompt = nl || prompt_string
prompt_len = string$size (prompt)
quarter = pi / 2.0
ftb = int$from_to_by
ot = oneof [pair: cell, null: null]
cell = record [first, second: int]
nilptr = ot$make_null (nil)
.eshow
Note that the following equate is illegal because it uses a record constructor,
which is not a constant expression:
.show
cell_1_2 = ot$make_pair (cell${first: 1, second: 2})
.eshow
.para
Any invocation in a constant expression must terminate normally;
a program is illegal if evaluation of any constant expression would signal an exception.
(Exceptions are discussed in sect_except.)
Illegal programs will not be executed.
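.para
As a hedged illustration of this rule (the identifier bad_ratio is hypothetical), the following equate makes a program illegal, since evaluating int$div with a zero divisor would signal zero_divide:
.show
bad_ratio = 10 / 0
.eshow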

38
doc/clu/gram.refman Normal file

@ -0,0 +1,38 @@
.am table_of_contents
. rs
. bp
. sp
3Detailed Description*
. sp
. em
.ec e epsilon
.sr syntax_pdx I
.chapter "Notation"
.para 1
We use an extended BNF grammar to define the syntax.
The general form of a production is:
.show
.def nonterminal alternative
.or alternative
.or ```...
.or alternative
.eshow
.para 1
The following extensions are used:
.nr fff fheight/4
.show
a , etc stab(1)stands for stab(2)lparen!a vbar a , a vbar a , a , a vbar (fff!m)...rparen
lcurly!arcurlytab(1)stands fortab(2)lparen\e vbar a vbar a a vbar a a a vbar (fff!m)...rparen
lbkt!arbkttab(1)stands fortab(2)lparen\e vbar arparen
.eshow
.para 1
All semicolons are optional in CLU,
but for simplicity they appear in the syntax without enclosing meta-brackets.
Nonterminal symbols appear in normal face.
Reserved words appear in bold face.
All other terminal symbols are non-alphabetic,
and appear in normal face.
.para
Full productions are not always shown in the body of this document;
often alternatives are presented and explained individually.
Appendix`syntax_pdx contains the complete syntax.

1195
doc/clu/io.refman Normal file

File diff suppressed because it is too large

586
doc/clu/killed.refman Normal file

@ -0,0 +1,586 @@
.chapter "Scope, Declarations, and Equates"
.para
As was discussed in sect_progs, the structure of a CLU
program is not deeply nested as is customary in block-structured
languages, but rather consists of a group of modules all at
the same level.
The names of the modules, and in the case of clusters, the names
of the operations, are globally known throughout this level.
However, since modules are not nested within other modules,
identifiers used within modules to name, for example, variables,
are purely local.
Since we expect modules to be rather small (in the absence of nesting),
we felt it was reasonable to insist that local identifiers not
be redefined within a module.
Therefore, although there is block structure within a module, it is
not possible to redefine in an inner scope an identifier declared
in an outer scope.
.para
Each full_module defines a scoping unit.
In addition, all compound statements define new scoping units in the
obvious places.
For example, in the if statement, both the then clause
and the else clause are new scoping units.
.para
Variables may be declared anywhere within a scoping unit;
declarations are not constrained to appear at the beginning of a unit.
The actual scope of the variable begins after its declaration, and continues to the
end of the smallest enclosing scoping unit.
.para
A variable declaration gives a name for the variable and the
type of the variable.
In addition, an initial value for the variable may be provided.
The use of an uninitialized variable will raise an exception
if the error is not caught at compile time.
(CLU arrays and records are defined in such a way that there
are no uninitialized elements or components to worry about,
so we can guarantee that all expressions are well-defined
by checking variable usage.)
.para
2Equates* are used to establish abbreviations for types and constants.
Each expression in an equate must be compile-time
computable, and must produce an object belonging to one of the
built-in, immutable types (see sect_semantics.1).
.para
Equates must all appear in a group at the beginning of a scoping unit;
the order of the equates is unimportant, but they may not be recursive.
The actual scope of the equates is the entire scoping unit containing them.
.chapter "Expressions and Statements"
.para
CLU is somewhat unusual in that almost all expressions are considered
to be just a syntactic means of invoking procedures.
This view permits user-defined and built-in types to be treated
uniformly, e.g., x + y invokes T$add
whether the type of x, T, is built-in or user-defined.
This view also fits our model that exceptions arise from invocations
(see sect_except).
Note that this view does not preclude in-line code for expressions
(for both built-in and user-defined types);
we simply view the production of in-line code as analogous
to in-line substitution for an invocation, followed possibly by
some optimization.
.para
One exception to the view of expressions as invocations
is the use of the cand (conditional and) and cor
(conditional or) operators.
These operators are defined to shortcut evaluation of their operands;
for example, the second operand of cand will be evaluated only
if the first operand evaluates to true. Thus, cand
and cor cannot be explained in terms of invocation.
These operators are not available for overloading, and they
do not raise any exceptions.
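.para
For example, in the following sketch (a is assumed to be an array[int] and i an int variable, both hypothetical), cand keeps the element fetch from being evaluated when i is above the high bound of a:
.show
if i <= array[int]$high (a) cand a[i] = 0
   then i := i + 1;
   end;
.eshow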
.para
CLU statements are, for the most part, fairly conventional.
The most basic statements are the assignment statement and the
invocation statement;
the semantics of these statements is discussed in the next section.
.para
There are a number of compound statements: block,
conditional, iterative, tagcase and except statements.
Blocks are used to group statements, and to introduce new scoping units.
The conditional statement is the usual if statement, with an
additional elseif form which may be used when there are a number of
clauses all at the same level.
.para
There are two iterative statements: one is the usual while statement;
the other, the for statement, is used in conjunction with an
iterator which controls the looping.
.para
The tagcase statement is used to discriminate on the
tag of a oneof object;
it provides 2arms* for possible values of the tag, plus a special
others arm to handle tag values not mentioned explicitly.
.para
The except statement is used to handle exceptions
arising from invocations, plus locally generated exits.
Its form is similar to that of the tagcase statement, with arms to handle
explicitly named exceptions and exits, and an optional
others arm to handle any exception not explicitly mentioned.
.para
Finally, there are a number of termination statements.
The return statement terminates a procedure or iterator in the
normal condition, while the signal statement is used to terminate
in an exceptional condition.
The yield statement is used within an iterator to
produce the next item in the sequence.
.para
1Return*, signal, and yield are inter-module control mechanisms.
The remaining termination statements are all intra-module.
The exit statement raises an exit condition that must be
handled by a local except statement.
The break statement terminates the smallest enclosing loop, and
the continue statement terminates just
the current cycle of the smallest enclosing loop.
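.para
A short sketch of the two loop-control statements (a is assumed to be an array[int] and sum an int variable, both hypothetical):
.show
for x: int in array[int]$elements (a) do
    if x < 0 then continue; end;    % skip negative elements
    if x = 0 then break; end;       % stop at the first zero
    sum := sum + x;
    end;
.eshow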
.en
.sr sect_iter Section 14.5
.sr sect_proctypes Section 9.11
.sr app_types Appendix II
.sr sect_handle Section 14.3
.sr app_io Appendix III
\k
.chapter "Exception Handling and Exits"
.para 1
A procedure is designed to perform a certain task, taking
some number and types of arguments and returning some
number and types of result objects.
However, in certain
cases (e.g., for particular values of arguments), that task
may be impossible to perform.
In such a case, instead of returning normally (which would
imply successful performance of the intended task), the
procedure should notify its caller of its failure
by signalling an i(exception).
.para
For example, consider integer division.
The int$div procedure takes two integer arguments and returns their quotient.
However, if the second argument to int$div is zero, then
there is no quotient.
In this case, instead of returning,
int$div signals the exception zero_divide.
We include in the type specification of a procedure
a description of the exceptions it may signal, for example,
int$div is of type
.show
proctype (int, int) returns (int) signals (zero_divide)
.eshow
.para
In this section, we will concentrate on exceptions signalled by procedures.
However, exceptions may also be signalled by iterators, and all we say about
procedures applies to iterators as well, except
as described in sect_iter below.
.section "The Exception Handling Mechanism"
.para 1
The exception handling mechanism consists of two parts,
the signalling of exceptions and the handling of exceptions.
Signalling is the way a procedure notifies its caller
that it has discovered an exceptional condition.
Handling is the way that the caller of the procedure
specifies what is to be done if the procedure signals
an exception.
.para
Signalling an exception is an alternative form of returning.
When a procedure signals an exception, the current
activation of that procedure terminates and control is transferred
to a handler in the caller.
The signaller may
return objects to the exception handler, to help explain
the exceptional condition.
.para
An exception is identified by a name.
A procedure may
signal zero or more exceptions, whose names must be distinct.
Since signalling is like returning, each exception has an associated
list of types specifying what objects may be returned to the caller.
An exception name and its
associated list of types is called an i(exception@specification).
The specifications of the exceptions
signalled by a procedure are part of
the type of the procedure (see sect_proctypes).
In addition,
any procedure can signal the exception i(failure), which
always has a single accompanying object of type string.
The failure exception is implicitly part of all
procedure types; it may not be declared explicitly.
(The use of i(failure) is intended to indicate errors
from which it is unlikely or impossible to recover,
such as hardware malfunctions.)
.section "Signalling Exceptions"
.para 1
An exception is signalled by the signal statement, which has the form:
.show
signal name lbkt ( expression, etc ) rbkt
.eshow
where i(name) is the name of the exception to be signalled.
.para
A signal statement may appear anywhere in the body of a procedure.
The execution of a signal statement begins with the
evaluation of the expressions (if any), from left to right,
to produce a list of i(signal@argument) objects.
The activation of the executing procedure is then terminated.
Execution continues as described in sect_handle below.
.para
The named exception must be either i(failure) or one
of the exceptions listed in the procedure header.
If the exception is i(failure), then there must be
exactly one signal argument expression, whose
type is string.
Otherwise, if the corresponding
exception specification in the procedure header has the form
.show
name (T1, etc, Tn)
.eshow
then there must be exactly i(n) signal argument expressions,
and for each i(i), the type of expression i(i) must be
included in Ti.
.para
The following useless procedure contains a number of examples
of signal statements:
.show
.ta 20
signaller = proc (i: int) 
signals (foo, bar (int), bletch (string, bool));
if i < 0 then signal foo; end;
if i > 0 then signal bar (i - 1); end;
if i = 0 then signal bletch ("zero", true); end;
signal failure ("unreachable statement executed");
end signaller;
.eshow
.section "Handling Exceptions"
.para 1
When a procedure activation terminates by signalling an exception, we say that
the corresponding procedure invocation (the text of the call)
i(raises) that exception.
The caller specifies what action should be taken when an exception
is raised by the use of i(handlers), which are
written using the except statement.
.para
The except statement has the form:
.show
statement except s(1)lcurly when_handler rcurly
t(1)lbkt others_handler rbkt
t(1)end
where
.long_def others_handler
.def1 when_handler
when name , etc lbkt ( decl , etc ) rbkt : body
.or
when name , etc ( * ) : body
.def1 others_handler
others lbkt ( idn : type_spec ) rbkt : body
.eshow
We will call the statement to which the handlers are attached S.
The handlers handle exceptions that are raised by invocations in the statement S.
Each when_handler specifies one or more exception names
and a body to be executed if one of those exceptions is raised.
The optional others_handler is used to handle all exceptions
not explicitly named in the when_handlers.
The statement S can be a compound
statement, and can even contain other except statements.
Whenever two
except statements are nested in this fashion, and both have handlers
for the same exception, the innermost handler will take precedence
(see below).
.para
An except statement is executed as follows.
First, the statement S is executed.
If it terminates normally, then the except statement terminates normally also.
If some exception E is raised in S,
and the exception E is not handled by a handler within S,
then the execution of S is terminated and the attached handlers
are examined to see if any one of them will handle the exception E.
If so, then the body of the corresponding
handler is executed;
when the body terminates, the entire except statement terminates.
If there is no handler for the exception E in this except statement,
then the except statement itself terminates raising the exception E.
This will presumably be handled by some enclosing except statement.
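.para
As a minimal sketch (x, y, and q are assumed to be int variables declared earlier), the following except statement handles the zero_divide exception that int$div may signal:
.show
q := int$div (x, y);
    except when zero_divide: q := 0; end;
.eshow
If y is zero, the assignment is abandoned, the handler body sets q to 0,
and the except statement then terminates normally.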
.para
Thus, when an exception E is raised,
control is passed to the innermost exception handler
that handles the exception E.
Exceptions that are raised inside of handlers
are treated no differently from other
exceptions: control is passed to the innermost exception handler
for that exception (in a surrounding except statement).
Whenever a handler terminates, the except statement of
which it is a part terminates as well.
The set of invocations for which
a handler is effective is called the i(range) of that handler.
The range of a handler for
an exception E is that set of invocations within the attached
statement that are not inside the range of a nested handler
for the exception E.
.para
Recall that the infix and prefix operators are merely
syntactic sugar for procedure invocations.
Thus, the execution of such operators can signal exceptions and these
exceptions can be handled by the procedure containing
the use of the operator.
app_types describes the operations of the built-in types
and type generators, and the exceptions that those operations may signal.
.para
An invocation need not be surrounded by except statements
to handle all exceptions potentially raised by that invocation.
This policy was adopted because in many cases the programmer can prove that
a particular exception will not arise.
For example, the invocation int$div(x,7)
will never signal zero_divide.
However, this policy does lead to the possibility that some invocation may raise
an exception E and not be within the range of any handler for E.
Thus, we make the following rule.
If an invocation raises an exception E,
and that invocation is not within the range of any handler for E, then the
procedure containing that invocation is terminated and signals
the exception i(failure).
The exception name E is made into a string (all in lower case), and this string
is the argument of the failure signal.
As a special case, if the original exception E was itself i(failure), then
the original string argument is passed along with the new signal,
instead of "failure".
(This avoids losing the original exception name when a i(failure)
propagates up several levels.)
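.para
To illustrate the rule (first_half is a hypothetical procedure), the fetch a[1] below signals bounds when 1 is not a legal index of a.
Since first_half contains no handler for bounds, it terminates and signals failure("bounds"):
.show
first_half = proc (a: array[int]) returns (int);
    return (a[1] / 2);      % bounds is not handled here
    end first_half;
.eshow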
.para
Now let us consider the form of the handlers in more detail.
The when forms handle particular sets of exceptions.
The first form, without declarations, simply specifies a set of exception names.
This form is used to handle exceptions with no associated signal arguments.
The same form i(with) declarations is used to handle exceptions with signal arguments.
Each exception must have the same number of arguments as specified in the formal
argument list (i.e., the declarations), and their types must match exactly.
Within the body of the handler, the
declared formal arguments may be used to access the actual signal arguments.
These arguments are variables (initialized to the signal arguments),
local to the handler body.
The second form (with *) can be used to handle any exceptions of the
given names, regardless of whether or not there are associated signal arguments.
Any actual signal arguments will be thrown away.
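.para
The following sketch applies these forms to the signaller procedure shown earlier (n and msg are assumed to be variables of type int and string, declared earlier):
.show
signaller (n);
    except
        when foo: msg := "negative";
        when bar (k: int): n := k;          % k denotes the signal argument
        when bletch (*): msg := "zero";     % both signal arguments are thrown away
        end;
.eshow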
.para
All of the exception names appearing in the when_handlers of
an except statement must be distinct.
Each exception must be potentially raised by some invocation within the
range of the handler.
For any exception handled using the when form with arguments,
all invocations within the range of the handler that potentially
raise an exception with that name
must provide the exact number and types of signal arguments
as specified in the formal argument list.
(The programmer must place handlers for an exception sufficiently close
to the invocations that raise that exception so that this restriction
is satisfied.)
.para
The others form is optional.
At most one may be used in an except statement, and it must appear last.
An others_handler handles any exception not handled by another
handler in the except statement.
If a formal argument is declared, it must be of type string.
If the actual exception is not i(failure), then the formal argument
will denote a string object which is the
name of the actual exception, in lower case; any actual signal
arguments will be thrown away.
However, if the actual exception is i(failure), then the formal
argument will denote the actual (string) signal argument.
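.para
As a sketch (again using signaller; n and msg are assumed to be variables of type int and string), an others handler with a formal argument receives the name of the exception:
.show
signaller (n);
    except
        when foo: msg := "negative";
        others (name: string): msg := name;     % "bar" or "bletch", or the failure string
        end;
.eshow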
.section "An Example"
.para 1
We now present an example demonstrating the use of exception handlers.
We will write a procedure, sum_stream,
which reads a sequence of signed decimal integers from a character
stream and returns the sum of those integers.
The stream is viewed as containing a sequence of fields separated by spaces;
each field must consist of a non-empty sequence of
digits, optionally preceded by a single minus sign.
Sum_stream has the form
.show
.ta 20
sum_stream = proc (s: stream) returns (int) signals (s(1)overflow,
t(1)unrepresentable_integer (string),
t(1)bad_format (string));
etc
end sum_stream;
.eshow
Sum_stream signals overflow if the sum of the
numbers or an intermediate sum is outside the implemented range of integers.
Unrepresentable_integer is signalled if the stream contains
an individual number that is outside the implemented range of integers.
Bad_format is signalled if the stream contains a
field that is not an integer.
.para
We will use the i(getc) operation of the i(stream) data type
(see app_io), whose type is
.show
proctype (stream) returns (char) signals (end_of_file, not_possible(string));
.eshow
This operation returns the next character from the stream, unless the
stream is empty, in which case end_of_file is signalled.
Not_possible is signalled if the operation cannot be performed on the
given stream (e.g., it is an output stream, or does not allow character
operations).
We will assume that we are given a stream for which getc is possible.
.para
The following procedure is used to convert character strings to integers:
.show
.ta 20
s2i = proc (s: string) returns (int) signals (s(1)invalid_character (char),
t(1)bad_format,
t(1)unrepresentable_integer);
etc
end s2i;
.eshow
S2i signals invalid_character if its string argument
contains a character other than a digit or a minus sign.
Bad_format is signalled if the string contains
a minus sign following a digit, more than one minus
sign, or no digits.
Unrepresentable_integer is signalled if the string
represents an integer that is outside the implemented range of integers.
.para
An implementation of sum_stream is presented in Figure current_figure.
.begin_figure "The sum_stream procedure."
.show
.ta 20 28 36 42
sum_stream = proc (s: stream) returns (int) signals (s(1)overflow,
t(1)unrepresentable_integer (string),
t(1)bad_format (string));
sum: int := 0;
num: string := "";
while true do
% skip over spaces between values; sum is valid, num is meaningless
c: char := stream$getc(s);
while c = ' ' do
c := stream$getc(s);
end;
% read a value; num accumulates new number, sum becomes previous sum
num := "";
while c ~= ' ' do
num := string$append(num, c);
c := stream$getc(s);
end;
except when end_of_file: end;
% restore sum to validity
sum := sum + s2i(num);
end;
except
when end_of_file: return(sum);
when unrepresentable_integer: signal unrepresentable_integer (num);
when bad_format,invalid_character(*): signal bad_format (num);
when overflow: signal overflow;
end;
end sum_stream;
.eshow
.finish_figure
There are two loops within an infinite loop:
one to skip spaces, and one to accumulate digits for conversion to a number.
Notice the placement of the inner end_of_file handler.
If end_of_file is raised in the second inner loop,
then the sum is computed correctly, and the first invocation
of stream$getc will again raise end_of_file.
This time, however, the infinite loop is terminated and
execution transfers to the other end_of_file handler,
which then returns the accumulated sum.
.para
We have placed the remaining exception handlers outside of the infinite loop
to avoid cluttering up the main part of the algorithm.
Each of these exception handlers could also
have been placed after the particular statement containing
the invocation that signalled the corresponding exception.
The (*) form is used in the handler for the bad_format and invalid_character
exceptions since the signal arguments are not used.
Note that the overflow handler catches exceptions
signalled by the int$add procedure, which is invoked
using the infix + notation.
Note also that in this example all of the exceptions
raised by sum_stream originate as exceptions signalled by lower-level modules.
Sum_stream simply reflects these exceptions
upwards in terms that are meaningful to its callers.
Although some of the names may be unchanged, the meanings
of the exceptions (and even the number of arguments) are
different in the two levels.
.para
As mentioned above, we have assumed that stream$getc will not signal
not_possible; if it does, then sum_stream will terminate, raising
the exception failure("not_possible").
.section "Summary"
.para 1
Any activation of a procedure may terminate in one of two ways:
it may terminate normally, returning zero or more result
objects, or it may signal an exception, along with zero or
more signal arguments.
In the latter case, we say that
the invocation of the procedure may i(raise) the given exception.
The set of possible exceptions that may be raised by
a procedure invocation is determined from the type of the procedure.
This set always includes the i(failure) exception.
.para
If a procedure invocation is a
component of an expression, and the invocation terminates
by raising an exception E (with associated signal
arguments), then the entire expression immediately terminates,
raising the exception E (with the associated signal arguments).
The set of possible exceptions that may be raised by an expression
is the set of all exceptions that may
be raised by procedure invocations within that expression.
.para
Expressions are embedded in statements.
If, during the execution of a statement, an embedded expression
terminates by raising an exception E (with associated signal
arguments), then the statement itself immediately
terminates raising the exception E (with the associated
signal arguments).
.para
Statements may be composed from smaller statements.
In general, if a component statement terminates by raising an exception E,
then the containing statement also immediately terminates, raising the exception E.
However, if the statement is an except statement, and
the except statement contains a handler that handles
the exception E, then the handler body is executed,
as described in the preceding section.
If an iterator invocation terminates raising an exception E,
then the entire for statement which invoked the iterator
immediately terminates raising the exception E.
.para
The set of possible exceptions that may be raised by
a non-except statement is the union of the sets of possible
exceptions that may be raised by any component expression or statement.
The set of possible exceptions that may be raised
by an except statement consists of the
set of exceptions that may be raised by the
component statement, minus the set of exceptions handled
by the handlers, plus the set of exceptions that may be
raised by the handler bodies.
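.para
As a small worked illustration of this rule,
consider the following sketch,
in which q, x, and y are assumed to be int variables.
The infix / notation invokes int$div,
which we assume here may signal zero_divide and overflow:
.show 2
q := x / y
    except when zero_divide: q := 0 end
.eshow
The component statement may raise zero_divide, overflow, or failure.
The handler removes zero_divide from this set,
and the handler body (a simple assignment of a literal) adds nothing new,
so the except statement as a whole may raise only overflow or failure.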
.para
Thus, any expression or statement may terminate either normally
or by raising some exception.
The set of possible exceptions always includes the i(failure) exception.
.para
Finally, all procedure (and iterator) bodies are implicitly surrounded by
an exception handler of the form:
.show
begin
etc i(body) etc
end except
when others (s: string) : signal failure(s)
end
.eshow
.section "Exits and the Placement of Handlers"
.para 1
A i(local) transfer of control can be effected
by using the exit statement, which has the form:
.show
exit name lbkt ( expression, etc ) rbkt
.eshow
The exit statement is similar to the signal statement except
that where the signal statement i(signals) an exception
to the i(calling) procedure, the exit statement i(raises) the
exception directly in the i(current) procedure.
An exception raised by an exit statement
i(must) be handled by a handler in the procedure
containing the exit statement.
The handler must explicitly name the particular exception
(i.e., the others form cannot be used)
and may not throw away any signal arguments
(i.e., the (*) form cannot be used).
.para
The exit statement and the signal statement
mesh nicely to form a uniform mechanism.
The signal statement can be viewed
as simply terminating a procedure activation; an exit is
then performed at the point of invocation.
(Because this exit is implicit, it is not subject to the restrictions listed above.)
.para
In some cases, however, other requirements may prohibit placing
exception handlers to take advantage of the implicit exit.
For example, assume that you wish to handle a particular exception
signalled by a particular set of invocations.
To avoid catching unwanted exceptions,
the handler must be placed sufficiently close to the
set of invocations so that no other invocation raising an exception
of that name is in the range of the handler.
Because the handlers must be close to the invocations,
while the statement you wish to terminate when the exception is raised
may be rather large, you may need to put explicit exit statements
in the handlers to force termination of the larger statement.
The point is that exits are a necessary feature in maintaining
the overall effectiveness of the signal mechanism.
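.para
The following sketch illustrates this use of exit.
The procedures get_number, process, and report are hypothetical,
as are the exceptions they are assumed to signal,
and s is assumed to be a stream variable;
the point is only the placement of the handlers.
The bad_format handler is attached directly to the one invocation of interest,
and uses an explicit exit to terminate the whole loop:
.show 11
begin
    while true do
        n: int
        n := get_number(s)
            except when bad_format (*): exit give_up end
        process(n)
        end
    end except
        when give_up: report("bad input")
        when end_of_file: report("done")
        end
.eshow
Had the bad_format handler instead been attached to the outer begin statement,
it would also catch any bad_format exception raised by process.
Note that the handler for give_up names the exception explicitly,
as required for an exception raised by an exit statement.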

159
doc/clu/lex.refman Normal file
View File

@ -0,0 +1,159 @@
.am table_of_contents
. rs
. bp
. sp
3Detailed Description*
. sp
. em
.ec e epsilon
.sr syntax_pdx Appendix`I
.chapter "Notation"
.para
We use an extended BNF grammar to define the syntax.
The general form of a production is:
.show 4
.def nonterminal alternative
.or alternative
.or ```...
.or alternative
.eshow
.para
The following extensions are used:
.nr fff fheight/4
.if
. width "a , etc "
. sv list_left_margin 800m
. sv list_right_margin
. sv list_indent width!m
. sv list_space
. sv list_spacing 1.5
. ilist
a , etc stands for a list of one or more 2a*'s separated by commas;
i.e., "a" or "a, a" or "a, a, a" etc.
. next
lcurly!arcurly stands for a sequence of zero or more 2a*'s;
i.e., " " or "a" or "a a" etc.
. next
lbkt!arbkt stands for an optional 2a*;
i.e., " " or "a".
. end_list
. en
.para
All semicolons are optional in CLU,
but for simplicity they appear in the syntax as ";" rather than "lbkt!;rbkt".
Nonterminal symbols appear in normal face.
Reserved words appear in bold face.
All other terminal symbols are non-alphabetic,
and appear in normal face.
.para
Full productions are not always shown in the body of this document;
often alternatives are presented and explained individually.
syntax_pdx contains the complete syntax.
.
.
.so rws;column rmac
.sr scope_sec 10.1
.sr lit_sec 9
.
.chapter "Lexical Considerations"
.para
A module is written as a sequence of tokens and separators.
A 2token* is a sequence of "printing" ASCII characters
(octal 40 through octal 176) representing
a reserved word,
an identifier,
a literal,
an operator, or
a punctuation symbol.
A 2separator* is
a "blank" character
(space, vertical tab, horizontal tab, carriage return, newline, form feed)
or a comment.
In general,
any number of separators may appear between tokens.
Tokens and separators are described in more detail in the sections below.
.section "Reserved Words"
.para
The following character sequences are reserved words:
.elements any array
.elements begin bool break
.elements cand char cluster continue cor cvt
.elements do down
.elements else elseif end except exit
.elements false for force
.elements has
.elements if in int is iter itertype
.elements nil null
.elements oneof others
.elements proc proctype
.elements real record rep return returns
.elements signal signals string
.elements tag tagcase then true type
.elements up
.elements when where while
.elements yield yields
.show
.columns 6 3 0 0 1
.nr trace 0
.eshow
Upper and lower case letters are not distinguished in reserved words.
For example, 'end', 'END', and 'eNd' are all the same reserved word.
Reserved words appear in bold face in this document.
.section "Identifiers"
.para
An 2identifier* is a sequence of
letters,
digits,
and underscores that begins with a letter or underscore,
and that is not a reserved word.
As in reserved words,
upper and lower case letters are not distinguished in identifiers.
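.para
For example, assuming a variable named total has been declared,
the following three statements are equivalent,
because total, Total, and TOTAL are the same identifier:
.show 3
total := total + 1
Total := TOTAL + 1
TOTAL := total + 1
.eshow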
.para
In the syntax there are two different nonterminals for identifiers.
The nonterminal 2idn* is used when the identifier has scope
(see Section`scope_sec);
idns are used for variables, parameters, module names,
and as abbreviations for constants.
The nonterminal 2name* is used when the identifier has no scope;
names are used for record selectors,
oneof tags,
operation names,
and exceptional condition names.
.section "Literals"
.para
There are literals for naming all objects of the basic types
(null, bool, int, real, char, string).
Their forms are discussed in Section`lit_sec.
.section "Operators and Punctuation Symbols"
.para
The following character sequences are used as operators and punctuation symbols:
.elements ( ) { } [ ]
.elements : ; , . $ :=
.elements < <= = >= > ""
.elements ~< ~<= ~= ~>= ~> ""
.elements + - * / "" ""
.elements || ** // & | ~
.show
.columns 6 3 0 0 1
.eshow
.section "Comments and Other Separators"
.para
A 2comment* is a sequence of characters that
begins with a percent sign,
ends with a newline character,
and contains only printing ASCII characters and horizontal tabs in between.
For example:
.show 2
z := a[i] + s(1)% a comment in an expression
t(1)b[i];
.eshow
.para
A 2separator* is a blank character or a comment.
Zero or more separators may appear between any two tokens,
except that at least one separator is required between any two adjacent
non-self-terminating tokens:
reserved words,
identifiers,
integer literals,
and real literals.
This rule is necessary to avoid lexical ambiguities.
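.para
The following lines sketch the rule
(count and done are assumed to be declared variables).
The first two lines are lexically equivalent,
because every pair of adjacent tokens in them includes an operator or punctuation symbol.
In the third line the blanks are required;
without them the line would be read as the single identifier ifdonethenreturnend.
.show 3
count:=count+1;
count := count + 1;
if done then return end
.eshow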

1058
doc/clu/module.refman Normal file

File diff suppressed because it is too large Load Diff

1122
doc/clu/newio.refman Normal file

File diff suppressed because it is too large Load Diff

1297
doc/clu/opdefs.refman Normal file

File diff suppressed because it is too large Load Diff

815
doc/clu/part}a.refman Normal file
View File

@ -0,0 +1,815 @@
.begin_table_of_contents 4
.width "1199"
.nr width ll-rindent-width
.am table_of_contents
. ta 8 width!m
3````s(toc1)*
3Overview*
4```````````s(toc2)`````````````````````s(toc3)*
. em
.
.sr sect_library Section`4
.sr sect_compare Section`2.3
.sr app_io Appendix`III
.sr sect_semantics Section`3
.sr sect_progs Section`1.5
.sr sect_except Section`12
.sr sect_parms Section`1.4
.
.sr E 2E*
.sr m 2m*
.sr p 2p*
.sr q 2q*
.sr x 2x*
.sr y 2y*
.sr z 2z*
.sr a 2a*
.sr b 2b*
.sr s 2s*
.sr v 2v*
.sr insert 2insert*
.sr alpha 9a*
.sr beta 9b*
.sr gamma 9g*
.
.de circle
. if
. nv h hpos
(+1.5)\0(-1.5)
. hp h!m
\1
. hs (1.2*fheight)m
\2
. en
. em
.
.
.de setup_centers
. if
. nv sum center_spacing*(nargs-1)
. fr i 0 nargs-1
. width "\\i"
. nr sum sum+width
. en
. width \0
. nr center_stop0 master_center-sum/2
. fr i 0 nargs-1
. width "\\i"
. nv ip i+1
. nr center_stop\ip center_stop\i+width+center_spacing
. nr center_stop\i center_stop\i+width/2
. en
. en
. em
.
.de center_all
. fr i 0 nargs-1
. width "\\i"
. hp (center_stop\i-width/2)m
\\i
. en
. br
. em
.
.chapter "Modules"
.para
A CLU program consists of a group of modules.
Three kinds of modules are provided,
one for each kind of abstraction that we have found to be useful in program construction.
Procedures support procedural abstraction,
iterators support control abstraction,
and clusters support data abstraction.
.section "Procedures"
.para
A CLU 2procedure* performs an action on zero or more 2argument* objects,
and terminates returning zero or more 2result* objects.
All communication between a procedure and its invoker takes place through
these arguments and results,
i.e.,
a procedure has no global variables.
.para
A procedure may terminate in one of a number of 2conditions*.
One of these is the normal condition; the others are exceptional conditions.
Differing numbers and types of results may be returned in different conditions.
All information about the names of conditions and the number and types of
arguments and results is described in the 2procedure heading*.
For example,
.show
square_root = proc (x: real) returns (real) signals (no_real_result)
.eshow
is the heading of a square_root procedure,
which takes a single real argument.
Square_root terminates either in the normal condition (returning the square root of x)
or in the no_real_result condition (returning no results).
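.para
As a sketch of how the two conditions appear to an invoker
(x is assumed to be a variable of type real),
an invocation of square_root might be written:
.show 3
r: real
r := square_root(x)
    except when no_real_result: r := 0.0 end
.eshow
If square_root terminates in the normal condition, its result is assigned to r;
if it terminates in the no_real_result condition, the handler body is executed instead.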
.section "Iterators"
.para
An 2iterator* computes a sequence of 2items* based on its input arguments.
These items are provided to its invoker one at a time.
Each item consists of zero or more objects.
.para
An iterator is invoked by a for statement.
The iterator provides each item by 2yielding* it.
The objects in the item are assigned to the loop variables of the for statement,
and the body of the for statement is executed.
Then control is returned to the iterator so it can yield the next item in the sequence.
The for loop is terminated when the iterator terminates,
or the for loop body may explicitly terminate itself and the iterator.
.para
Just like a procedure,
an iterator has no global variables,
and may terminate in one of a number of conditions.
In the normal condition,
no results can be returned,
but different numbers and types of results can be returned in the exceptional conditions.
All information about the names of conditions,
and the number and types of arguments and results is described in
the 2iterator heading*.
For example,
.show
leaves = iter (t: tree) yields (node)
.eshow
is the heading for an iterator that produces all leaf nodes of a tree object.
This iterator might be used in a for statement as follows:
.show 3
for s(1)leaf: node in leaves(x) do
t(1)... examine(leaf) ...
t(1)end
.eshow
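.para
The body of an iterator might be sketched as follows.
The iterator countdown is hypothetical;
it yields the integers n, n-1, ..., 1 in turn:
.show 6
countdown = iter (n: int) yields (int)
    while n > 0 do
        yield (n)
        n := n - 1
        end
    end countdown
.eshow
Each time the yield statement is executed,
the body of the invoking for statement runs once with the loop variable denoting the yielded integer;
when the while loop finishes, the iterator terminates and so does the for loop.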
.section "Clusters"
.para
A 2cluster* defines a data abstraction,
which is a set of objects and a set of 2primitive operations*
to create and manipulate those objects.
The operations can be either procedural or control abstractions.
The 2cluster heading* states what operations are available, e.g.,
.show
int_set = cluster is create, insert, elements
.eshow
states that the operations of int_set are
2create*, 2insert*, and 2elements*.
.para
A cluster is used to define a distinct 2data type*,
different from all others.
Users of this type are constrained to treat objects of the type abstractly.
That is,
the objects may be manipulated only via the primitive operations.
This means that information about how the objects are represented in storage
may not be used.
Instead,
the primitive operations must be used to manipulate and
query the information in the objects.
.para
Inside the cluster,
a 2concrete representation* (in terms of some other type) is chosen for the objects,
and the operations are implemented in terms of this representation.
Each operation is implemented by a 2routine* (a procedure or iterator);
these routines are exactly like those not contained in clusters,
except that they can treat the objects being defined by the
cluster both abstractly and in terms of the concrete representation.
(Treating the objects abstractly is useful when defining recursive structures,
where the concrete representation makes use of the new type.)
A cluster may contain additional procedures and iterators,
which are purely for local use;
these routines do not define operations of the type.
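.para
A minimal sketch of how the int_set cluster whose heading is shown above might be written follows.
It is an illustration only, not a definitive implementation:
the choice of array[int] as the concrete representation is ours,
and the array operations create, low, high, and addh are assumed to be available under those names.
.show 21
int_set = cluster is create, insert, elements
    rep = array[int]
    create = proc () returns (cvt)
        return (rep$create(1))
        end create
    insert = proc (s: cvt, v: int)
        i: int := rep$low(s)
        while i <= rep$high(s) do
            if s[i] = v then return end
            i := i + 1
            end
        rep$addh(s, v)
        end insert
    elements = iter (s: cvt) yields (int)
        i: int := rep$low(s)
        while i <= rep$high(s) do
            yield (s[i])
            i := i + 1
            end
        end elements
    end int_set
.eshow
Outside the cluster an int_set object can be manipulated only through create, insert, and elements;
the fact that it is represented by an array is not visible to users of the type.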
.section "Parameterized Modules"
.para
Procedures, iterators, and clusters can all be 2parameterized*.
Parameterization provides the ability to define a class of related abstractions
by means of a single module.
Parameters are limited to the following types:
int, real, bool, char, string, null, and type.
The most interesting and useful of these are the type parameters.
.para
When a module is parameterized by a type parameter,
this implies that the module was written without knowledge of what
the actual parameter type would be.
Nevertheless,
if the module is to do anything with objects of the parameter type,
certain operations must be provided by any actual type.
Information about required operations is described in a where clause,
which is part of the heading of a parameterized module.
For example,
.show 2
set = cluster [t: type] s(1)is create, insert, elements
t(1)where t has equal: proctype (t, t) returns (bool)
.eshow
is the header of a parameterized cluster defining a generalized set abstraction.
Sets of many different element types can be obtained from this cluster,
but the where clause states that the element type is constrained to
provide an 2equal* operation.
.para
To use a parameterized module,
actual values for the parameters must be provided,
using the general form
.show
module_name [ parameter_values ]
.eshow
Parameter values are limited to compile-time computable quantities.
Providing actual parameters selects one abstraction out of
the class of related abstractions defined by the parameterized module;
since the values are compile-time known,
the compiler can do the selection and can check that
the where clause restrictions are satisfied.
The result of the selection,
in the case of a parameterized cluster,
is a type,
which can then be used in declarations;
in the case of parameterized procedures or iterators,
a procedure or iterator is obtained,
which is then available for invocation.
For example,
set[int] is a use of the set abstraction shown above,
and is legal because int does have an 2equal* operation.
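.para
For instance, a use of set[int] might look like the following sketch,
which assumes that the create operation of the set cluster takes no arguments
and that insert takes a set and an element:
.show 3
s: set[int] := set[int]$create()
set[int]$insert (s, 3)
set[int]$insert (s, 5)
.eshow
Here set[int] is used both as a type in the declaration of s
and as the prefix that names the operations create and insert.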
.para
A parameterized cluster, procedure, or iterator is said to implement a
2type generator*, 2procedure generator*, or 2iterator generator*,
respectively.
.section "Program Structure"
.para
As was mentioned before,
a program consists of a group of modules.
Each module defines either a single abstraction or,
if parameterized,
a class of related abstractions.
Modules are never embedded in other modules.
Rather,
the program is a single level structure,
with all modules potentially usable by all other modules in the program.
Type-checking of inter-module references is carried out using information
in the module headings,
augmented,
in the case of clusters,
by the headings of the procedures and iterators that implement the operations.
.para
Each module is a separate textual unit,
and is compiled independently of other modules.
Compilation and program construction are discussed in sect_library.
.chapter "Data Types"
.para
One of the primary goals of CLU was to provide,
through clusters,
a type extension mechanism that permits user-defined types
to be treated as similarly as possible to built-in types.
This goal has been achieved to a large extent.
Both built-in and user-defined types are viewed as providing
sets of primitive operations,
with access to the real representation information limited to just these operations.
The ways in which built-in types differ from user-defined types
will be discussed in sect_compare below.
.section "Built-in Types"
.para
CLU provides a rich set of built-in types and type generators.
The built-in types are
int, real, bool, char, string, null, and any.
1Int* and real provide the usual arithmetic and relational operations,
and bool provides the standard boolean operations.
1Char* is the full ASCII character set;
the usual relational operators are provided,
along with conversion to and from integers.
1Strings* are (possibly empty) sequences of characters;
the usual string operations, such as selecting the 2ith* character
and concatenation, are provided.
However,
strings are somewhat unusual in that string objects cannot be modified.
For example,
it is not possible to change a character in a string;
instead,
a new string,
differing from the original in that position,
may be created.
.para
1Null* is a type containing one object, nil.
1Null* is used primarily in conjunction with the tagged union type
discussed below.
.para
1Any* is provided to permit an escape from compile-time type checking.
The type any introduces no new objects,
but instead may be used as the type of a variable when the programmer wishes
to assign objects of different types to that variable,
or does not know what kind of object will be assigned to the variable.
CLU provides a built-in procedure generator,
force,
which permits a run-time examination of the type of object
named by a variable of type any.
.para
The built-in type generators are:
array, record, oneof, proctype, and itertype.
Arrays are one-dimensional.
The type of element contained in the array is specified by a type parameter,
e.g., array[int] and array[array[int]].
(The latter example shows how a two-dimensional array might be handled.)
CLU arrays are unusual in that they can grow dynamically.
An array is usually empty when first created.
Array operations can grow and shrink the array at either end,
query the current size and low and high bounds of the array,
and access and update elements within the current bounds.
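.para
A brief sketch of these array operations follows.
The operation names used here (create, addh, addl, and size)
are assumptions for the purpose of illustration:
.show 6
a: array[int] := array[int]$create(1)     % empty, low bound 1
array[int]$addh(a, 10)                    % grow at the high end
array[int]$addh(a, 20)
array[int]$addl(a, 5)                     % grow at the low end
n: int := array[int]$size(a)              % the array now has 3 elements
a[1] := a[1] + 1                          % update within the current bounds
.eshow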
.para
CLU records are heterogeneous collections of component objects;
each component is accessed by a selector name.
Records must be explicitly constructed by means of a special 2record constructor*.
The constructor requires that an object be provided for each component of the record;
this requirement ensures that no component of the record is undefined
in the sense of naming no object.
Record operations permit selection of component objects and
updating the components with new objects.
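.para
For example, the following sketch declares a record type with two integer components
(the names point, x, and y are illustrative),
constructs a record, and then selects and updates a component:
.show 3
point = record[x, y: int]
p: point := point${x: 3, y: 4}
p.x := p.x + 1
.eshow
A constructor that omitted either component would not be legal.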
.para
A oneof type is a tagged,
discriminated union.
The objects of a oneof type each consist of a 2tag* (an identifier)
and an object of some other type.
Operations are provided for creating oneof objects.
Oneof objects are usually decomposed through the tagcase statement.
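.para
As a sketch, consider a type whose objects hold either an integer or nothing.
The type name int_or_none and its tags are illustrative,
and we assume that the creation operation for each tag is named make_ followed by the tag:
.show 6
int_or_none = oneof[some: int, none: null]
v: int_or_none := int_or_none$make_some(3)
tagcase v
    tag some (n: int): ... use n ...
    tag none: ... no integer is present ...
    end
.eshow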
.para
Procedure and iterator types provide procedures and iterators as objects.
These types are parameterized by all the information appearing in
a procedure or iterator heading, with the exception of the formal argument names.
Since all communication with a procedure or iterator is through
the arguments and results,
no problems concerning evaluation environments arise in the use of
procedures or iterators as first-class objects.
.para
In addition to all the built-in types and type generators mentioned above,
CLU programs may also make use of the type type.
The use of type values is limited to parameters of parameterized modules;
there are no arguments or variables of type type.
Furthermore,
the values of type type (comprising all built-in and user-defined types)
are all compile-time-known.
.para
Finally,
CLU provides a number of types and procedures to support I/O.
These types are not considered to be built-in types of CLU,
but they must be available in the library.
These types are described in app_io.
.section "User-Defined Types"
.para
Users may define new types by providing clusters that implement them.
The cluster may implement a single type,
or,
in the case of a parameterized cluster,
a group of related types.
The type or types defined by a cluster are distinct from all built-in types
and from all types defined by other clusters.
.section "Comparison of User-Defined and Built-In Types"
.para
Little distinction is made between user-defined types and built-in types.
Either can be used freely to declare the arguments,
variables,
and results of routines.
In addition,
in either case there is a set of primitive operations associated with the type,
and the same syntax is used to invoke these operations.
The ordinary syntax to name an operation is
.show
type $ opname
.eshow
Since different types will often have operations of the same name (e.g., 2create*),
this compound form is used to avoid ambiguity.
.para
For many operations there is also a customary abbreviated form of invocation,
which can be used for user-defined types as well as for built-in types.
There is a standard translation from each abbreviated form to
the ordinary form of invocation.
For example,
an addition operation is usually invoked using the infix notation "x + y";
this is translated into "T$add(x, y)",
where T is the type of x.
Extending notation to user-defined types in this way is sometimes called
2operator overloading*.
We permit almost all special syntax to be overloaded;
there are always constraints on the overloading definition
(e.g., 2add* must have two input arguments and one result),
but they are quite minimal.
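.para
For example, if x, y, and z are int variables,
the following two statements are equivalent:
.show 2
z := x + y
z := int$add(x, y)
.eshow
Similarly, given a hypothetical user-defined cluster complex
that provides an add operation taking two complex objects and returning one,
the expression "c1 + c2" on complex objects c1 and c2 is translated into "complex$add(c1, c2)".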
.para
Nevertheless,
there are three main distinctions between built-in types and user-defined types:
.nlist
Built-in type and type generator names cannot be redefined.
(This is why we always show them in boldface in this document.)
.nnext
Some built-in types,
e.g., int, real, etc.,
have literals.
There is no mechanism for defining literals for user-defined types.
.nnext
Some built-in types are related to certain other constructs of CLU.
For example,
the tagcase statement is a control construct especially provided to
permit discrimination on oneof objects.
In addition,
in places where compile-time constants are required,
e.g., as actual parameters to parameterized modules,
the expressions that may appear are limited to a subset of the built-in types and
their operations.
One reason for this limitation is that the permitted types are known to contain only
2immutable* objects (see sect_semantics.1).
.end_list
.chapter "Semantics"
.para
All languages present their users with some model of computation.
This section describes those aspects of CLU semantics that differ
from the common ALGOL-like model.
In particular,
we discuss the notions of objects and variables,
and the definitions of assignment and argument passing that follow from these notions.
We also discuss type correctness.
.section "Objects and Variables"
.para
The basic elements of CLU semantics are 2objects* and 2variables*.
Objects are the data entities that are created and manipulated by programs.
Variables are just the names used in a program to refer to objects.
.para
Each object has a 2type*,
which characterizes its behavior.
A type defines a set of primitive operations
to create and manipulate objects of that type.
An object may be created and manipulated only via the operations of its type.
.para
An object may 2refer* to objects.
For example,
a record object refers to the objects that are the components of the record.
This notion is one of logical,
not physical,
containment.
In particular,
it is possible for two distinct record objects to refer to (or 2share*)
the same component object.
In the case of a cyclic structure,
it is even possible for an object to "contain" itself.
Thus,
it is possible to have recursive data structure definitions and shared data objects
without explicit reference types.
.para
Objects exist independently of procedure and iterator activations.
Space for objects is allocated from a dynamic storage area as the result of invoking
constructor operations of certain primitive CLU types,
such as records and arrays.
In theory,
all objects continue to exist forever.
In practice,
the space used by an object may be reclaimed (via garbage collection)
when that object is no longer accessible.
(An object is accessible if it is denoted by a variable of an active routine or is
a component of an accessible object.)
.para
Objects may be divided into two categories.
Some objects exhibit time-varying behavior.
Such an object,
called a 2mutable* object,
has a state that may be modified by certain operations without changing
the identity of the object.
Records and arrays are examples of mutable objects.
For example,
updating the 2ith* element of any array a causes the state of a to change
(to contain a different object as the 2ith* element).
.para
If a mutable object m is shared by two other objects x and y,
then a modification to m made via x will be visible when m is examined via y.
Communication through shared mutable objects is most beneficial in the context of
procedure invocation,
described below.
.para
Objects that do not exhibit time-varying behavior are called 2immutable* objects,
or 2constants*.
Examples of constants are integers, booleans, characters, and strings.
The value of a constant object cannot be modified.
.para
Variables are names used in programs to 2denote* particular objects
at execution time.
Unlike variables in many common programming languages,
which are containers for values,
CLU variables are simply names that the programmer uses to refer to objects.
As such,
it is possible for two variables to denote (or 2share*) the same object.
CLU variables are much like those in LISP,
and are similar to pointer variables in other languages.
However,
CLU variables are 2not* objects;
they cannot be denoted by other variables or referred to by objects.
Thus,
variables are completely private to the procedure or iterator in which they are declared,
and cannot be accessed or modified by any other routine.
.section "Assignment and Invocation"
.para
The basic actions in CLU are 2assignment* and 2invocation*.
The assignment primitive x := E,
where x is a variable and E is an expression,
causes x to denote the object resulting from the evaluation of E.
For example,
if E is a simple variable y,
then the assignment x`:=`y causes x to denote the object denoted by y.
The object is 2not* copied;
after the assignment is performed,
the object will be 2shared* by x and y.
Assignment does not affect the state of any object.
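.para
Sharing is observable when the shared object is mutable.
In the following sketch (the array operations create, addh, and size are assumed for illustration),
no array is copied by the assignment:
.show 4
a: array[int] := array[int]$create(1)
b: array[int] := a                % a and b now denote the same array object
array[int]$addh(b, 7)             % changes the state of that one object
n: int := array[int]$size(a)      % 1, since the change is visible through a
.eshow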
.para
.sr assign_fig Figure`current_figure
assign_fig illustrates these notions of object, variable, and assignment.
Here we show variables in a stack,
and objects in a heap (free storage area),
an obvious way to implement CLU.
assign_fig!a contains three objects:
alpha, beta, and gamma.
alpha is an integer (in fact, 3) and is denoted by variable x,
while beta and gamma are of type
set[int] and are denoted by variables y and z,
respectively.
assign_fig!b shows the result of executing
.show
y := z
.eshow
Now y and z both refer to,
or share,
the same object,
gamma;
beta is no longer accessible,
and so can be garbage collected.
.begin_figure "figure_font!Assignment*"
.sp 2
x circle(alpha 3 int)
.sp 2
y circle(beta {9} set[int])
.sp 2
z circle(gamma {} set[int])
.sp 2
1assign_fig!a.*
.sp 2
x circle(alpha 3 int)
.sp 2
y circle(beta {9} set[int])
.sp 2
z circle(gamma {} set[int])
.sp 2
1assign_fig!b.*
.finish_figure
.para
Invocation involves passing argument objects from the caller to the called routine and
returning result objects from the routine to the caller.
The objects returned by the procedure,
or yielded by an iterator,
may be assigned to variables in the caller.
Argument passing is defined in terms of assignment;
the formal arguments of a routine are considered to be local variables of the routine
and are initialized,
by assignment,
to the objects resulting from the evaluation of the argument expressions.
We call the argument passing technique 2call by sharing*,
because the argument objects are shared between the caller and the called routine.
The technique does not correspond to most traditional argument passing techniques
(it is similar to argument passing in LISP).
In particular, it is not call by value because mutations of arguments
performed by the called routine will be visible to the caller.
And it is not call by reference because access is not given to the
variables of the caller,
but merely to certain objects.
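.para
The difference from both techniques can be seen in the following sketch.
The procedure name grow_and_rebind is hypothetical,
and the array operations create and addh are assumed for illustration:
.show 4
grow_and_rebind = proc (s: array[int])
    array[int]$addh(s, 0)            % mutates the shared object: visible to the caller
    s := array[int]$create(1)        % rebinds only the formal s: invisible to the caller
    end grow_and_rebind
.eshow
After an invocation grow_and_rebind(a),
the caller's variable a still denotes the same array as before,
but that array has one more element.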
.sr invoc_fig Figure`current_figure
.para
invoc_fig illustrates invocation and object mutation.
invoc_fig!a continues from the situation shown in assign_fig!b,
and illustrates the situation immediately after invocation of
.show
set[int]$insert (y, x).
.eshow
(but before executing the body of 2insert*),
where 2insert* has two formal arguments.
The first,
s,
denotes the set,
and the second,
v,
denotes the integer to be inserted into s.
Note that the variables of the caller (x, y, and z)
are not accessible to 2insert*.
invoc_fig!b illustrates the situation after 2insert* returns.
Note that object gamma has been modified and now refers to alpha
(the set gamma now contains 3),
and since gamma is shared by both y and z,
the modification of gamma is visible through both these variables.
.begin_figure "figure_font!Invocation and object mutation*"
.sp 2
x circle(alpha 3 int)
.sp .5
y
.sp .5
z
.sp 2
s circle(gamma {} set[int])
.sp .5
v
.sp 2
1invoc_fig!a.*
.sp 2
x circle(alpha 3 int)
.sp .5
y
.sp .5
z
.sp 2
circle(gamma {`} set[int])
.sp 2
1invoc_fig!b.*
.finish_figure
.para
Procedure invocations may be used directly as statements;
those that return exactly one object may also be used as expressions.
Iterators may be invoked only through the for statement.
Arbitrary recursion among procedures and iterators is permitted.
.section "Type Correctness"
.para
The declaration of a variable specifies the type of the objects
which the variable may denote.
In an assignment,
the object denoted by the right-hand side must have the same type as
the variable on the left-hand side:
there are no implicit type conversions.
(The type of object denoted by an expression is
the return type of the outermost procedure invoked in that expression,
or,
if the expression is a variable or literal,
the type of that variable or literal.)
There is one special case;
a variable declared to be of type any may be assigned the value of any expression.
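.para
For example, in the following sketch only the second assignment is illegal:
.show 4
n: int := 3
r: real := n          % illegal: the type of n is int, not real
v: any := n           % legal: v is declared to be of type any
v := "three"          % legal for the same reason
.eshow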
.para
Argument passing is defined in terms of assignment;
for an invocation to be legal,
it must be possible to assign the actual arguments (the objects) to the formal arguments
(the variables) listed in the heading of the routine to be invoked.
Furthermore,
a return (or yield) statement is legal only if the result objects could be
legally assigned to variables having the types stated in the routine heading.
.para
CLU is a 2type-safe* language,
in that it is not possible to treat an object of type T as if it were an object of
some other type S;
in particular,
one cannot assign an object of type T to a variable of type S (unless S is any).
The type any provides an escape from compile-time type determination,
and a built-in procedure generator force
can be used to query the type of an object at run-time.
However,
any and force are defined in such a way that
the type-safety of the language is not undermined.
The type-safety of CLU,
plus the restriction that only the code in a cluster may convert between
the abstract type and the concrete representation,
ensures that the behavior of an object is indeed characterized completely
by the operations of its type.
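.para
As a minimal sketch of these rules
(the variable names are illustrative only,
and the fragment is assumed to appear inside some routine body),
consider:
.show
x: int := 3
a: any := x                 % legal: a variable of type any may be assigned any object
y: int := force[int] (a)    % run-time check that a denotes an int
% r: real := x              % illegal: there is no implicit conversion from int to real
.eshow
The last declaration is shown as a comment because it would be rejected as a type error.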
.chapter "The CLU Library"
.para
As was mentioned earlier,
it is intended that the modules making up a CLU program all be
separate compilation units.
A fundamental requirement of any CLU implementation is that it support
separate compilation,
with type checking of inter-module references.
This checking can be done either at compile time or at load time
(when a group of separately compiled modules is combined to form a program).
A second fundamental requirement is that the implementation support top-down programming.
The definition of CLU does not specify how an implementation should meet
these requirements.
However,
in this section we describe the current CLU implementation,
which may serve as a model for others.
.para
Our implementation makes use of the CLU library,
which plays a central role in supporting inter-module references.
The library contains information about all abstractions.
It supports incremental program development,
one abstraction at a time,
and,
in addition,
makes abstractions that are defined during the construction of one program available
as a basis for subsequent program development.
The information in the library permits the separate compilation of single modules,
with complete type checking at compile time of all external references
(such as procedure names).
.para
The library provides a hierarchical name space for retrieving information about
abstractions.
.sr lib_fig Figure`current_figure
The leaf nodes of the library are 2description units* (DUs),
one for each abstraction.
lib_fig illustrates the structure of the library.
.begin_figure "figure_font!A sketch of the library structure showing a DU with pathname B.Y*"
.sp 3
A B
.sp 2
X Y
.sp 1.5
DU
.finish_figure
.para
A DU contains all system-maintained information about its abstraction.
.sr struc_fig Figure`current_figure
A sketch of the structure of a DU is shown in struc_fig.
For purposes of program development and module compilation,
two pieces of information must be included in the DU:
implementation information,
describing zero or more modules that implement the abstraction,
and the interface specification.
.begin_figure "0A sketch showing the structure of a DU*"
.sp 2
.nr master_center (ll-indent-rindent)/2
.hx center_spacing 5
.setup_centers DU
.center_all DU
.sp 2
.setup_centers specification abstractions implementations information
.center_all interface abstractions implementations other
.center_all specification "used in" ` information
.center_all ` interface
.sp 2
.nr master_center center_stop2
.hx center_spacing 1i
.setup_centers 1 "1. . .* n
.center_all 1 "1. . .* n
.sp 2
.nr master_center center_stop0
.hx center_spacing 3
.setup_centers source object implementation information
.center_all source object abstractions other
.center_all code code "used in" information
.center_all ` ` implementation
.sp 1
.finish_figure
The 2interface specification* is that information needed to type-check
uses of the abstraction.
For procedural and control abstractions,
this information consists of the number and types of parameters, arguments, and results,
the names of exceptional conditions and the number and types of results returned
in each case,
plus any constraints on type parameters
(i.e., the where clause, as described in sect_parms).
For data abstractions,
it includes the number and types of parameters,
constraints on type parameters,
and the name and interface specification of each operation.
.para
An abstraction is entered in the library by submitting the interface specification;
no implementations are required.
In fact,
a module can be compiled before any implementations have been provided for
the abstractions that it uses;
it is necessary only that interface specifications have been given for
those abstractions.
Ultimately,
there can be many implementations of an abstraction;
each implementation is required to satisfy the interface specification of
the abstraction.
Because all uses and implementations of an abstraction are checked against
the interface specification,
the actual selection of an implementation can be delayed until just before
(or perhaps during) execution.
We imagine a process of binding together modules into programs,
prior to execution,
at which time this selection would be made.
.para
An important detail is the method by which modules refer to abstractions.
To avoid the problems of name conflicts that can arise in large systems,
the names used by a module to refer to abstractions can be chosen to suit
the programmer's convenience.
When a module is submitted for compilation,
its external references must be bound to DUs so that type-checking can be performed.
The binding is accomplished by constructing a 2compilation environment* (CE),
mapping names to DUs and constants,
which is passed to the compiler along with the source code when compiling the module.
A copy of the CE is stored by the compiler in the library as part of the module.
A similar process is involved in entering interface specifications of abstractions,
as these will include references to other (data) abstractions.
.para
When the compiler type checks a module,
it uses the compilation environment to map the external names in the module
to description units,
and then uses the interface specifications in those description units
to check that the abstractions are used correctly.
The type-correctness of the module thus depends upon the binding of names to DUs
and the interface specifications in those DUs,
and could be invalidated if changes to the binding or the interface specifications were
subsequently made.
For this reason,
the process of compilation permanently binds a module to the abstractions it uses,
and the interface description of an abstraction,
once defined,
is not allowed to change.
Of course,
a new description unit can be created to describe a modified abstraction.
Furthermore,
during design (before any implementing modules have been entered into the system)
it is reasonable to permit abstraction interfaces to change.
.para
The library and DU structure described above can be used for purposes other than
compiling and loading programs.
In each case,
additional information can be stored in the DU;
the "other" fields shown in struc_fig are intended
to illustrate such additional information.
For example,
the library provides a good basis for program verification.
Here the "other" information in the DU would contain a formal specification of
the abstraction,
and possibly some theorems that had been proved about the abstraction,
while for each implementation that had been verified, an outline of the correctness proof
might be retained.
Additional uses of the library include retention of
debugging and optimization information.

252
doc/clu/refman.insert Normal file

@ -0,0 +1,252 @@
.ec k \
.nd _refman_insert_loaded 0
.if _refman_insert_loaded
. tm Ignoring reload of clu;refman insert
. nx
. en
.nr _refman_insert_loaded 1
.dv xgp
.fo 0 fonts1;30vrx rwskst
.fo 1 fonts1;nonmb1 rwskst
.fo 2 fonts1;30vrix ebmkst
.fo 3 37vrb
\k this is used in an appendix
.fo 4 30vrb
\k this is used for some meta syntax characters
.fo 5 40vgl
\k this is used for a not-equal sign
.fo 6 plunk
\k this is used for small super- and sub-scripts
.fo 7 20vg
\k this is used for some special characters
.fo 8 fonts1;31sym jcskst
\k this is used for some meta syntax characters
.fo 9 s30grk
\k Barbara uses greek letters (why? I don't know!)
.fo A fonts1;52meta rwskst
.fo D 25vg
.fo E 25vgb
.ls 1.5
.tr `
.nr verbose 1
.nr tty_table_of_contents 1
.nr print_page1_headings 1
.nr chapter_starts_page 1
.sr left_heading 4DRAFT*
.sr right_heading date
.nr both_sides 1
.nr figure_font 0
.nr chapter_toc_font 4
.nr section_toc_font 4
.sr chapter_toc_form 3\section_number.t(toc1)\section_title * pn(\page)
.sr section_toc_form ```\section_numbert(toc2)\section_title pn(\page)
.sr appendix_toc_form 3\section_number_title.t(toc3)\section_title * pn(\page)
.sr achapter_toc_form ```\section_number.t(toc2)\section_title pn(\page)
\k
\k Note: the first module, which adds "part a" to the toc
\k sets the tabs as well.
\k
.so r;r macros
.so r;div rmac
.so clu;clukey r
.so clu;clusym r
.
\k to output a page number in the toc
.
.de pn
. \0
.br
.em
.
.am chapter \k Ignore this
. if font
. tm "Current font not 0 at chapter end!"
. en
. em
.
.am section \k Ignore this
. if font
. tm "Current font not 0 at section end!"
. en
. em
.
\k para macro - do a .para before every paragraph
.
.de para
. ne 3
. ti 4
. em
.
\k show macro - to start an indented example in nofill, single-spaced
\k takes the number of lines to keep together as an optional argument
.
.de show
. be show_block
. nv ols ls
. nv ls 100
. sp (ols-100)/100
. hv indent indent 8
. nf
. if nargs>0
. ne \0l
. en
. em
.
\k eshow macro - to end a .show
.
.de eshow
. fi
. en
. sp (ls-100)/100
. em
.
\k s macro - s(n) sets "tab" n at the current horizontal position
.
.de s
. nr tab_pos\0 hpos
. em
.
\k t macro - t(n) resets the horizontal position to "tab" n
.
.de t
. if hpos>tab_pos\0
. tm Tabbed backwards with tab \0 at \lineno
. en
. hs (tab_pos\0-hpos)m
. em
.
\k b macro - b(n) does a line break and then goes to "tab" n
.
.de b
. br
. hs (tab_pos\0-hpos)m
. em
.
\k long_def - .long_def <non-term> for defining the longest non-terminal
\k after a .long_def, use only .def1's, don't do a .def
.
.de long_def
. width "\0 "
. nr d_stop0 width+indent
. width "def"
. nr d_stop1 d_stop0+(width/2)
. width "def "
. nr d_stop2 d_stop0+width
. em
.
\k def macro - .def foo <opt-alt> for showing full BNF syntax of "foo"
\k first alternative is <opt-alt>, if present, else text following .def line
.
.de def
\0 
. nr d_stop0 hpos
. width "def"
. nr d_stop1 hpos+(width/2)
def 
. nr d_stop2 hpos
. if nargs>1
\1
. en
. em
.
\k def1 macro - .def1 foo <opt-alt> for showing full BNF syntax of "foo"
\k to be used only after a .def or .long_def has been done, to line up the ::='s
\k first alternative is <opt-alt>, if present, else text following .def1 line
.
.de def1
\0
. hp d_stop0!m
def
. hp d_stop2!m
. if nargs>1
\1
. en
. em
.
\k or macro - .or <opt-alt> for starting next alternative in BNF
\k next alternative is <opt-alt>, if present, else text following .or line
.
.de or
. br
. hs (d_stop1-hpos)m
orbar
. hs (d_stop2-hpos)m
. if nargs
\0
. en
. em
.
\k nlist macro - for numbered lists
.
.sr list_indent 3
.sr list_space .5
.
.de nlist
. ilist
. nnum
. em
.
\k nnext macro - for starting the next element in a numbered list
.
.de nnext
. next
. nnum
. em
.
\k end_list macro - for ending a numbered list
.
.de end_list
. br
. en
. rtabs
. sp (ls-100)/100
. em
.
\k internal macro only
.
.de nnum
\list_count. 
. em
.
\k internal macro only
.
.de ibegin_list
. br
. nv ls
. nv indent indent
. nv rindent rindent
. nv tab_stop
. sv list_indent \list_indent
. sv list_space \list_space
. nv list_count 0
. if nargs>1
. sr list_indent \1
. en
. if nargs>2
. sr list_space \2
. en
. in +\list_left_margin
. if \0
. in +\list_indent
. sr list_indent -\list_indent
. en
. ta indent!m
. ir +\list_right_margin
. nr tab_stop indent
. ls \list_spacing
. ns
. next
. rs
. em
.
.de width
. ignore
. nv indent
. nv ll 30000
. nv start hpos
\0
. nr width hpos-start
. nr ha habove
. nr hb hbelow
. end_ignore
. em

21
doc/clu/refman.r Normal file

@ -0,0 +1,21 @@
.so clu/refman.insert
.so r/sect.rmac
.de segment
. def_seg \0 clu/\0.refman
. em
.sr save_file_name clu/refman.save
.
.segment part_a
.segment lex
.segment types
.segment e_d
.segment action
.segment exprs
.segment stmts
.segment except
.segment module
.segment syntax
.segment opdefs
.segment io
.
.mult_run

495
doc/clu/refman.save Normal file

@ -0,0 +1,495 @@
.de seg1_save_macro end_save_macro
.nr page 24
.nr chapter 6
.nr appendix 0
.nr current_table 1
.nr current_figure 5
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents1
.de table_of_contents1
3@@@@s(toc1)*
3Overview*
4@@@@@@@@@@@s(toc2)@@@@@@@@@@@@@@@@@@@@@s(toc3)*
.sp
.ne 5l
.fs 4
31.t(toc1)Modules *. 5
.sp
.ns
.fs 4
@@@1.1t(toc2)Procedures . 5
.fs 4
@@@1.2t(toc2)Iterators . 5
.fs 4
@@@1.3t(toc2)Clusters . 6
.fs 4
@@@1.4t(toc2)Parameterized Modules . 6
.fs 4
@@@1.5t(toc2)Program Structure . 7
.sp
.ne 5l
.fs 4
32.t(toc1)Data Types *. 9
.sp
.ns
.fs 4
@@@2.1t(toc2)Built-in Types . 9
.fs 4
@@@2.2t(toc2)User-Defined Types . 11
.fs 4
@@@2.3t(toc2)Comparison of User-Defined and Built-In Types . 11
.sp
.ne 5l
.fs 4
33.t(toc1)Scope, Declarations, and Equates *. 13
.sp
.ns
.sp
.ne 5l
.fs 4
34.t(toc1)Expressions and Statements *. 14
.sp
.ns
.sp
.ne 5l
.fs 4
35.t(toc1)Semantics *. 16
.sp
.ns
.fs 4
@@@5.1t(toc2)Objects and Variables . 16
.fs 4
@@@5.2t(toc2)Assignment and Invocation . 17
.fs 4
@@@5.3t(toc2)Type Correctness . 19
.sp
.ne 5l
.fs 4
36.t(toc1)The CLU Library *. 21
.sp
.ns
.em
.rm table_of_figures1
.de table_of_figures1
.em
.end_save_macro
.de seg2_save_macro end_save_macro
.nr page 25
.nr chapter 7
.nr appendix 0
.nr current_table 1
.nr current_figure 5
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents2
.de table_of_contents2
.bp
.sp
3Detailed Description*
.sp
.sp
.ne 5l
.fs 4
37.t(toc1)Notation *. 24
.sp
.ns
.em
.rm table_of_figures2
.de table_of_figures2
.em
.end_save_macro
.de seg3_save_macro end_save_macro
.nr page 27
.nr chapter 8
.nr appendix 0
.nr current_table 1
.nr current_figure 5
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents3
.de table_of_contents3
.sp
.ne 5l
.fs 4
38.t(toc1)Lexical Considerations *. 25
.sp
.ns
.fs 4
@@@8.1t(toc2)Reserved Words . 25
.fs 4
@@@8.2t(toc2)Identifiers . 25
.fs 4
@@@8.3t(toc2)Literals . 25
.fs 4
@@@8.4t(toc2)Operators and Punctuation Symbols . 26
.fs 4
@@@8.5t(toc2)Comments and Other Separators . 26
.em
.rm table_of_figures3
.de table_of_figures3
.em
.end_save_macro
.de seg4_save_macro end_save_macro
.nr page 36
.nr chapter 9
.nr appendix 0
.nr current_table 1
.nr current_figure 5
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents4
.de table_of_contents4
.sp
.ne 5l
.fs 4
39.t(toc1)Types, Type Generators, and Type Specifications *. 27
.sp
.ns
.fs 4
@@@9.1t(toc2)Null . 27
.fs 4
@@@9.2t(toc2)Bool . 28
.fs 4
@@@9.3t(toc2)Int . 28
.fs 4
@@@9.4t(toc2)Real . 28
.fs 4
@@@9.5t(toc2)Char . 29
.fs 4
@@@9.6t(toc2)String . 29
.fs 4
@@@9.7t(toc2)Any . 30
.fs 4
@@@9.8t(toc2)Array Types . 31
.fs 4
@@@9.9t(toc2)Record Types . 32
.fs 4
@@@9.10t(toc2)Oneof Types . 33
.fs 4
@@@9.11t(toc2)Procedure and Iterator Types . 34
.fs 4
@@@9.12t(toc2)Other Type Specifications . 35
.em
.rm table_of_figures4
.de table_of_figures4
.em
.end_save_macro
.de seg5_save_macro end_save_macro
.nr page 43
.nr chapter 10
.nr appendix 0
.nr current_table 1
.nr current_figure 5
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents5
.de table_of_contents5
.sp
.ne 5l
.fs 4
310.t(toc1)Scopes, Declarations, and Equates *. 36
.sp
.ns
.fs 4
@@@10.1t(toc2)Scoping Units . 36
.fs 4
@@@10.2t(toc2)Variables . 37
.fs 4
@@@10.3t(toc2)Equates and Constants . 39
.em
.rm table_of_figures5
.de table_of_figures5
.em
.end_save_macro
.de seg6_save_macro end_save_macro
.nr page 47
.nr chapter 11
.nr appendix 0
.nr current_table 1
.nr current_figure 5
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents6
.de table_of_contents6
.sp
.ne 5l
.fs 4
311.t(toc1)Assignment and Invocation *. 43
.sp
.ns
.fs 4
@@@11.1t(toc2)Type Inclusion . 43
.fs 4
@@@11.2t(toc2)Assignment . 43
.fs 4
@@@11.3t(toc2)Invocation . 45
.em
.rm table_of_figures6
.de table_of_figures6
.em
.end_save_macro
.de seg7_save_macro end_save_macro
.nr page 57
.nr chapter 12
.nr appendix 0
.nr current_table 1
.nr current_figure 5
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents7
.de table_of_contents7
.bp
.sp
.ne 5l
.fs 4
312.t(toc1)Expressions *. 47
.sp
.ns
.fs 4
@@@12.1t(toc2)Literals . 48
.fs 4
@@@12.2t(toc2)Variables . 48
.fs 4
@@@12.3t(toc2)Procedure and Iterator Names . 48
.fs 4
@@@12.4t(toc2)Procedure Invocations . 48
.fs 4
@@@12.5t(toc2)Selection Operations . 49
.fs 4
@@@12.6t(toc2)Constructing Arrays and Records . 51
.fs 4
@@@12.7t(toc2)Prefix and Infix Operators . 52
.fs 4
@@@12.8t(toc2)CAND and COR . 54
.fs 4
@@@12.9t(toc2)Precedence . 54
.fs 4
@@@12.10t(toc2)UP and DOWN . 55
.fs 4
@@@12.11t(toc2)FORCE . 55
.em
.rm table_of_figures7
.de table_of_figures7
.em
.end_save_macro
.de seg8_save_macro end_save_macro
.nr page 65
.nr chapter 13
.nr appendix 0
.nr current_table 1
.nr current_figure 5
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents8
.de table_of_contents8
.sp
.ne 5l
.fs 4
313.t(toc1)Statements *. 57
.sp
.ns
.fs 4
@@@13.1t(toc2)Invocation . 57
.fs 4
@@@13.2t(toc2)Update Statements . 58
.fs 4
@@@13.3t(toc2)Block Statement . 59
.fs 4
@@@13.4t(toc2)Conditional Statement . 60
.fs 4
@@@13.5t(toc2)Loop Statements . 60
.fs 4
@@@13.6t(toc2)Tagcase Statement . 62
.fs 4
@@@13.7t(toc2)Termination Statements . 63
.em
.rm table_of_figures8
.de table_of_figures8
.em
.end_save_macro
.de seg9_save_macro end_save_macro
.nr page 74
.nr chapter 14
.nr appendix 0
.nr current_table 1
.nr current_figure 6
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents9
.de table_of_contents9
.sp
.ne 5l
.fs 4
314.t(toc1)Exception Handling and Exits *. 65
.sp
.ns
.fs 4
@@@14.1t(toc2)The Exception Handling Mechanism . 65
.fs 4
@@@14.2t(toc2)Signalling Exceptions . 66
.fs 4
@@@14.3t(toc2)Handling Exceptions . 66
.fs 4
@@@14.4t(toc2)An Example . 69
.fs 4
@@@14.5t(toc2)Summary . 71
.fs 4
@@@14.6t(toc2)Exits and the Placement of Handlers . 72
.em
.rm table_of_figures9
.de table_of_figures9
.em
.end_save_macro
.de seg10_save_macro end_save_macro
.nr page 97
.nr chapter 15
.nr appendix 0
.nr current_table 1
.nr current_figure 8
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents10
.de table_of_contents10
.sp
.ne 5l
.fs 4
315.t(toc1)Modules *. 74
.sp
.ns
.fs 4
@@@15.1t(toc2)CLU Programs . 74
.fs 4
@@@15.2t(toc2)Procedures . 75
.fs 4
@@@15.3t(toc2)Iterators . 76
.fs 4
@@@15.4t(toc2)Clusters . 78
.fs 4
@@@15.5t(toc2)Parameterized Modules . 88
.em
.rm table_of_figures10
.de table_of_figures10
.em
.end_save_macro
.de seg11_save_macro end_save_macro
.nr page 104
.nr chapter 0
.nr appendix 1
.nr current_table 1
.nr current_figure 8
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents11
.de table_of_contents11
.bp
.sp
.ne 5l
.fs 4
3Appendix I.t(toc3)CLU Syntax *. 97
.sp
.ns
.em
.rm table_of_figures11
.de table_of_figures11
.em
.end_save_macro
.de seg12_save_macro end_save_macro
.nr page 120
.nr chapter 12
.nr appendix 2
.nr current_table 1
.nr current_figure 8
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents12
.de table_of_contents12
.sp
.ne 5l
.fs 4
3Appendix II.t(toc3)Built-in Types and Type Generators *. 104
.sp
.ns
.fs 4
@@@1.t(toc2)Introduction . 104
.fs 4
@@@2.t(toc2)Null . 105
.fs 4
@@@3.t(toc2)Bool . 105
.fs 4
@@@4.t(toc2)Int . 106
.fs 4
@@@5.t(toc2)Real . 107
.fs 4
@@@6.t(toc2)Char . 109
.fs 4
@@@7.t(toc2)String . 110
.fs 4
@@@8.t(toc2)Array Types . 112
.fs 4
@@@9.t(toc2)Record Types . 116
.fs 4
@@@10.t(toc2)Oneof Types . 117
.fs 4
@@@11.t(toc2)Procedure and Iterator Types . 118
.fs 4
@@@12.t(toc2)Any . 119
.em
.rm table_of_figures12
.de table_of_figures12
.em
.end_save_macro
.de seg13_save_macro end_save_macro
.nr page 135
.nr chapter 10
.nr appendix 3
.nr current_table 1
.nr current_figure 8
.nr toc_page 1
.nr toc_size 4
.nr any_figures 0
.rm table_of_contents13
.de table_of_contents13
.sp
.ne 5l
.fs 4
3Appendix III.t(toc3)Input/Output *. 120
.sp
.ns
.fs 4
@@@1.t(toc2)Files . 120
.fs 4
@@@2.t(toc2)File Names . 121
.fs 4
@@@3.t(toc2)A File Type? . 123
.fs 4
@@@4.t(toc2)File System Procedures . 123
.fs 4
@@@5.t(toc2)Streams . 125
.fs 4
@@@6.t(toc2)String I/O . 128
.fs 4
@@@7.t(toc2)Istreams . 129
.fs 4
@@@8.t(toc2)Terminal I/O . 131
.fs 4
@@@9.t(toc2)Miscellaneous Procedures . 133
.fs 4
@@@10.t(toc2)Dates . 133
.em
.rm table_of_figures13
.de table_of_figures13
.em
.end_save_macro

38
doc/clu/refman.sect Normal file

@ -0,0 +1,38 @@
.so clu/refman.insert
.so r/sect.rmac
.de segment
. def_seg \0 clu/\0.refman
. em
.sr save_file_name clu/refman.save
\k
\k the stuff for each segment
\k
.segment part_a
.segment gram
.segment lex
.segment types
.segment e_d
.segment action
.segment exprs
.segment stmts
.segment except
.segment module
.segment syntax
.segment opdefs
.segment io
chap 1 part-a
chap 7 gram
chap 8 lex
chap 9 types
chap 10 e&d
chap 11 action
chap 12 exprs
chap 13 stmts
chap 14 except
chap 15 module
pndx 1 syntax
pndx 2 opdefs
pndx 3 io
.mult_run

404
doc/clu/stmts.refman Normal file

@ -0,0 +1,404 @@
.chapter "Statements"
.
.sr self Section`chapter
.sr exceptions Section`12
.sr assign Section`9
.sr scope Section`8.1
.sr fetch Section`10.5.1
.sr getops Section`10.5.2
.sr exstmt Section`12.4
.sr itermod Section`13.3
.sr oneofs Section`7.10
.sr procmod Section`13.2
.
.sr s1 1
.sr s2 2
.
.para
In this section,
we describe most of the statements of CLU.
We omit discussion of the
signal, exit, and except statements,
which are used for signalling and handling exceptions,
as described in exceptions.
.para
CLU is a statement-oriented language,
i.e.,
statements are executed for their side-effects and do not return any values.
Most statements are 2control* statements that permit the
programmer to define how control flows through the program.
The real work is done by the 2simple* statements:
assignment and invocation.
Assignment has already been discussed in assign;
the invocation statement is discussed in self.1 below.
Two special statements that look like assignments but are really invocations
are discussed in self.2.
.para
The syntax of CLU is defined to permit a control statement to control a group of
declarations, equates, and statements rather than just a single statement.
Such a group is called a 2body*.
Scope rules for bodies were discussed in scope.
Occasionally,
it is necessary to explicitly indicate that a group of statements should be treated like
a single statement;
this is done by the block statement,
discussed in self.3.
.para
The conditional statement is discussed in self.4.
Loop statements are discussed in self.5,
as are some special statements that control termination of a single iteration
or a single loop.
The tagcase statement is discussed in self.6.
Finally,
the return statement is discussed in self.7,
and the yield statement in self.8.
.section "Invocation"
.para
An invocation statement invokes a procedure.
Its form is the same as an invocation expression:
.show
primary ( lbkt expression, etc rbkt )
.eshow
The 2primary* must evaluate to a procedure object,
and the types of the 2expressions* must be included in
the types of the formal arguments for that procedure.
The procedure may or may not return some results;
if it does return results,
they are discarded.
.para
For example,
the statement
.show
array[int]$remh (a)
.eshow
will remove the top element of 2a*
(assuming 2a* is an array[int]).
Remh also returns the top element,
but it is discarded in this case.
.section "Update Statements"
.para
Two special statements are provided for updating components of records and arrays.
In addition, they may be used with user-defined types that have the appropriate properties.
These statements resemble assignments syntactically,
but are really invocations.
.subsection "Element Update"
.para
The element update statement has the form
.show
primary [ expressions1 ] := expressions2
.eshow
This form is merely syntactic sugar for an invocation of a store operation,
and is completely equivalent to the invocation statement
.show
T$store (primary, expressions1, expressions2)
.eshow
where T is the type of 2primary*.
For example,
if 2a* is an array of integers,
.show
a[27] := 3
.eshow
is completely equivalent to the invocation statement
.show
array[int]$store (a, 27, 3)
.eshow
.para
The element update statement is not restricted to arrays.
The statement is legal if the corresponding invocation statement is legal.
In other words,
T (the type of 2primary*) must provide a procedure operation named 2store*,
which takes three arguments whose types include those of
2primary*, 2expressions1*, and 2expressions2*, respectively.
In case 2primary* is an array[S] for some type S,
2expressions1* must be an integer,
and 2expressions2* must be included in S.
.para
We recommend that the use of 2store* for user-defined types be
restricted to types with array-like behavior,
i.e.,
types whose objects contain collections of indexable elements.
For example,
it might make sense for an associative_memory type to provide
a 2store* operation for changing the value associated with a key.
Such types may also provide a 2fetch* operation;
see fetch.
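.para
As an illustration only,
suppose a hypothetical associative_memory cluster provides a create operation
and a store operation taking the memory, a string key, and an integer value
(these names and signatures are assumptions made for this sketch,
not definitions given in this manual).
Then the element update statement can be used as follows:
.show
% associative_memory, create, and store are assumed for illustration
m: associative_memory := associative_memory$create ()
m["apple"] := 3     % same as associative_memory$store (m, "apple", 3)
.eshow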
.subsection "Component Update"
.para
The component update statement has the form
.show
primary 1.* name := expression
.eshow
This form is merely syntactic sugar for an invocation of a set_name operation,
and is completely equivalent to the invocation statement
.show
T$set_2name* (primary, expression)
.eshow
where T is the type of 2primary*.
.para
For example,
if 2x* has type record[first:`int,`second:`real],
then
.show
x.first := 6
.eshow
is completely equivalent to
.show
record[first: int, second: real] $ set_first (x, 6)
.eshow
.para
The component update statement is not restricted to records.
The statement is legal if the corresponding invocation statement is legal.
In other words,
T (the type of 2primary*) must provide a procedure operation called set_2name*,
which takes two arguments whose types include the types of
2primary* and 2expression*, respectively.
When T is a record type,
then T must have a selector called 2name*,
and the type of 2expression* must be included in the type of
the component named by that selector.
.para
We recommend that 2set* operations be provided for user-defined types only if
record-like behavior is desired,
i.e.,
it is meaningful to permit some parts of the abstract object to be modified
by selector name.
In general,
2set* operations should not perform any substantial computation,
except possibly checking that the arguments satisfy certain constraints.
For example,
in a bank_account type,
there might be a set_min_balance operation to set what the minimum balance
in the account must be.
However,
2deposit* and 2withdraw* operations make more sense than a set_balance operation,
even though the set_balance operation could compute the amount deposited or
withdrawn and enforce semantic constraints.
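.para
As a sketch,
assume that the bank_account type above provides set_min_balance
as a procedure taking an account and an integer amount
(the argument types and the variable acct are assumptions made for this example).
The component update statement would then read:
.show
% acct is assumed to be a previously created bank_account
acct.min_balance := 100     % same as bank_account$set_min_balance (acct, 100)
.eshow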
.para
In our experience,
types with 2set* operations occur much less frequently than types with
2get* operations (see getops).
.section "Block Statement"
.para
The block statement permits a sequence of statements to be grouped into
a single statement.
Its form is
.show
begin body end
.eshow
Since the syntax already permits bodies inside control statements,
the main use of the block statement is to group statements together for
use with the except statement;
see exstmt.
.section "Conditional Statement"
.para
The form of the conditional statement is
.show 4
if expression then body
lcurly elseif expression then body rcurly
lbkt else body rbkt
end
.eshow
The expressions must be of type bool.
They are evaluated successively until one is found to be true.
The body corresponding to the first true expression is executed,
and the execution of the if statement then terminates.
If none of the expressions is true,
then the body in the else clause is executed (if the else clause exists).
The elseif form provides a convenient way to write a multi-way branch.
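.para
For example,
a three-way branch might be written as shown below,
assuming that 2x* and 2sign* are previously declared integer variables
(both names are illustrative only):
.show 4
if x < 0 then sign := -1         % tested first
   elseif x = 0 then sign := 0   % tested only if x < 0 is false
   else sign := 1                % executed if neither expression is true
   end
.eshow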
.section "Loop Statements"
.para
There are two forms of loop statements:
the while statement and the for statement.
Also provided are a continue statement,
to terminate the current cycle of a loop,
and a break statement,
to terminate the innermost loop.
These are discussed below.
.subsection "While Statement"
.para
The while statement has the form:
.show
while expression do body end
.eshow
Its effect is to repeatedly execute the body as long as the expression remains true.
The expression must be of type bool.
If the value of the expression is true,
the body is executed,
and then the entire while statement is executed again.
When the expression evaluates to false,
execution of the while statement terminates.
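.para
As a small illustration
(the variable names are chosen only for this sketch),
the following fragment sums the integers from 1 to 10,
leaving 2sum* equal to 55:
.show 6
i: int := 1
sum: int := 0
while i <= 10 do        % the expression is re-evaluated before each cycle
   sum := sum + i
   i := i + 1
   end
.eshow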
.subsection "For Statement"
.para
The for statement invokes an iterator (see itermod).
The iterator produces a sequence of 2items* (groups of zero or more objects)
one item at a time;
the body of the for statement is executed for each item in the sequence.
.para
The for statement has the form:
.show
for lbkt idn, etc rbkt in invocation do body end
.eshow
or
.show
for lbkt decl, etc rbkt in invocation do body end
.eshow
The invocation must be an iterator invocation.
The 2idn* form uses previously declared variables to serve as the loop variables,
while the 2decl* form introduces new variables,
local to the for statement,
for this purpose.
In either case,
the variables must agree in number, order, and type with
the number, order, and types of the objects (the item) yielded each time by the iterator.
.para
Execution of the for statement proceeds as follows.
First the iterator is invoked,
and it either yields an item or terminates.
If the iterator yields an item,
its execution is temporarily suspended,
the objects in the item are assigned to the loop variables,
the body of the for statement is executed,
and then execution of the iterator is resumed (from the point of suspension).
Whenever the iterator terminates,
the entire for statement terminates.
.para
An example of a for statement is
.show 6
a: array [int];
etc
sum: int := 0;
for s(1)x: int in array[int]$elements (a) do
t(1)sum := sum + x;
t(1)end;
.eshow
which will compute the sum of all the integers in an array of integers.
This example makes use of the 2elements* iterator on arrays,
which yields the elements of the array one by one.
.subsection "Continue Statement"
.para
The continue statement has the form
.show
continue
.eshow
Its effect is to terminate execution of the body of the smallest loop statement
in which it appears,
and to start the next cycle of that loop (if any).
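.para
For example,
assuming 2a* is an array[int] as in the for statement example above,
the following sketch sums only the non-negative elements of 2a*:
.show 5
sum: int := 0
for x: int in array[int]$elements (a) do
   if x < 0 then continue end    % skip negative elements; resume the iterator
   sum := sum + x
   end
.eshow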
.subsection "Break Statement"
.para
The break statement has the form
.show
break
.eshow
Its effect is to terminate execution of the smallest loop statement in which it appears.
Execution continues with the statement following that loop.
.para
For example,
.show 7
sum: int := 0;
for s(1)x: int in array[int]$elements (a) do
t(1)sum := sum + x;
t(1)if s(2)sum >= 100
t(2)then sum := 100; break;
t(2)end;
t(1)end;
.eshow
computes the minimum of 100 and the sum of the integers in 2a*.
Note that execution of the break statement will terminate both the iterator
and the for loop,
continuing with the statement following the for loop.
.section "Tagcase Statement"
.para
The tagcase statement is a special statement provided for decomposing oneof objects.
Recall that a oneof type is a discriminated union,
and each oneof object is a pair consisting of a 2tag* and some other object,
called the 2value* (see oneofs).
The tagcase statement permits the selection of a body to perform based on
the tag of the oneof object.
.para
The form of the tagcase statement is
.show 4
tagcase expression
tag_arm lcurly tag_arm rcurly
lbkt others : body rbkt
end
.eshow
where
.show
.def tag_arm "tag name, etc lbkt (idn: type_spec) rbkt : body"
.eshow
The expression must evaluate to a oneof object.
The tag of this object is then matched against the names on the tag_arms.
When a match is found,
if a declaration (2idn*: 2type_spec*) exists,
the value part of the oneof object is assigned to the local variable 2idn*.
The body of the matching tag_arm is then executed;
2idn* is defined only in that body.
If no match is found,
the body in the others arm is executed.
.para
In a syntactically correct tagcase statement,
the following constraints are satisfied.
The type of the expression must be some oneof type, T.
The tags named in the tag_arms must be a subset of the tags of T,
and no tag may occur more than once.
If all tags of T are present,
there is no others arm;
otherwise an others arm must be present.
Finally,
on any tag_arm containing a declaration (2idn*:`2type_spec*),
2type_spec* must equal the type specified as corresponding in T to
the tag or tags named in the tag_arm.
.para
An example of a tagcase statement is
.show 10
x: oneof [cell: cell, null: null]
cell = record [car: int, cdr: int_list]
etc
tagcase x
tag null: return (false)
tag cell (r: cell): if s(1)r.car = y
t(1)then return (true)
t(1)else x := r.cdr
t(1)end
end
.eshow
The tagcase statement shown might be used in a list (of integers) operation
that determines whether some given integer (2y*) is on the list.
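.para
The list search suggested above can be modeled in Python as follows.
This is a sketch under stated assumptions:
a oneof object is represented as a (tag, value) pair,
a record as a dictionary,
and the enclosing loop (which the fragment above does not show) is supplied here.
The names NULL, cell, and member are hypothetical.

```python
NULL = ("null", None)  # the oneof object with tag "null"


def cell(car, cdr):
    """Build a oneof object with tag "cell" whose value is a record."""
    return ("cell", {"car": car, "cdr": cdr})


def member(x, y):
    """Return True if the integer y occurs in the list x."""
    while True:
        tag, value = x            # decompose the oneof object
        if tag == "null":         # tag null: return (false)
            return False
        elif tag == "cell":       # tag cell (r: cell): ...
            r = value
            if r["car"] == y:
                return True
            x = r["cdr"]          # continue with the rest of the list
        else:
            # Every tag of the type is handled above, so no others arm is needed.
            raise ValueError("unknown tag: " + tag)
```

For example, with lst = cell(1, cell(2, cell(3, NULL))), member(lst, 2) is true and member(lst, 5) is false.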
.section "Return Statement"
.para
The form of the return statement is:
.show
return lbkt (expression, etc) rbkt
.eshow
The return statement terminates execution of the containing procedure or iterator.
If the return statement is in a procedure,
the number, types, and order of the 2expressions* must agree with
the requirements stated in the returns clause of the procedure header.
The expressions (if any) are evaluated from left to right,
and the objects obtained become the results of the procedure.
If the return statement occurs in an iterator, no results can be returned.
.para
For example,
inside a procedure 2p* with type
.show
proctype (etc) returns (int, char)
.eshow
the statement
.show
return (3, 'a')
.eshow
is legal and returns the two result objects 3 and 'a'.
.section "Yield Statement"
.para
1Yield* statements may occur only in the body of an iterator.
The form of a yield statement is:
.show
yield lbkt (expression, etc) rbkt
.eshow
It has the effect of suspending operation of the iterator,
and returning control to the invoking for statement.
If the expression list is present,
the values obtained by evaluating the expressions (from left to right) are passed
to the for statement to be assigned to the corresponding list of identifiers.
The number, types, and order of the expressions must agree with
the requirements stated in the iterator heading.
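.para
The interleaving of an iterator and the invoking for statement can be traced with a small Python model.
The iterator indexed_chars, which yields two objects per item, and the helper run
are hypothetical names introduced only for this sketch.

```python
def indexed_chars(s, trace):
    """Yield (index, character) pairs, recording each suspension and resumption."""
    i = 1
    for c in s:
        trace.append(f"yield {i}")
        yield i, c                 # two objects are passed to the loop variables
        trace.append(f"resume {i}")
        i += 1


def run(s):
    """Drive the iterator from a for loop and return the recorded trace."""
    trace = []
    for i, c in indexed_chars(s, trace):
        trace.append(f"body {i}{c}")   # loop body runs while the iterator is suspended
    return trace
```

Calling run("ab") returns the list
yield 1, body 1a, resume 1, yield 2, body 2b, resume 2,
which shows each yield handing control to the loop body and the iterator resuming afterwards.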

doc/clu/syntax.refman Normal file

@ -0,0 +1,509 @@
.ec e epsilon
.am table_of_contents
. bp
. em
.appendix "CLU Syntax"
.para
We use an extended BNF grammar to define the syntax.
The general form of a production is:
.show 4
.def nonterminal alternative
.or alternative
.or ```...
.or alternative
.eshow
.para
The following extensions are used:
.nr fff fheight/4
.if
. width "a , etc "
. sv list_left_margin 800m
. sv list_right_margin
. sv list_indent width!m
. sv list_space
. sv list_spacing 1.5
. ilist
a , etc stands for a list of one or more 2a*'s separated by commas;
i.e., "a" or "a, a" or "a, a, a" etc.
. next
lcurly!arcurly stands for a sequence of zero or more 2a*'s;
i.e., " " or "a" or "a a" etc.
. next
lbkt!arbkt stands for an optional 2a*;
i.e., " " or "a".
. end_list
. en
.para
All semicolons are optional in CLU,
but for simplicity they appear in the syntax as ";" rather than "lbkt!;rbkt".
Nonterminal symbols appear in normal face.
Reserved words appear in bold face.
All other terminal symbols are non-alphabetic,
and appear in normal face.
.eq temp_or or
.eq temp_def def
.de bdef
.be def_block
.nv ls 110
.br
.nv fill 0
.keep ndef
\0
.hp d_stop0!m
def
.hp d_stop2!m
.em
.
.
.de or
.br
.hp d_stop1!m
orbar
.hp d_stop2!m
.em
.
.
.de ndef
.end_keep
.sp .5
.en
.sr fff em look, ma! a space
.if nargs>0
.sr fff
.en
.\fff!bdef \0
.em
.
.
.de comment
.sp .5
.hp d_stop1!m
% 
.em
.de comment1
.br
.hp d_stop1!m
% 
.em
.
.
.de prec
.hp percent_stop!m
%
.hp prec_stop!m
\curprec
.em
.
.
.de nprec
.nr curprec curprec-1
.nr prec_stop prec_stop+prec_width
.prec
.em
.
.
.width "others_handler "
.nr d_stop0 width
.width "def"
.nr d_stop1 d_stop0+(width/2)
.width "cluster_body def "
.nr d_stop2 width
.width "expression // expression "
.nr percent_stop d_stop2+width
.width "% "
.nr prec_width width
.
.
.bdef module
lcurly equate rcurly procedure
.or
lcurly equate rcurly iterator
.or
lcurly equate rcurly cluster
.
.
.ndef procedure
idn = s(1)proc lbkt parms rbkt args lbkt returns rbkt lbkt signals rbkt lbkt where rbkt ;
t(1)body
t(1)end idn ;
.
.
.ndef iterator
idn = s(1)iter lbkt parms rbkt args lbkt yields rbkt lbkt signals rbkt lbkt where rbkt ;
t(1)body
t(1)end idn ;
.
.
.ndef cluster
idn = s(1)cluster lbkt parms rbkt is idn , etc lbkt where rbkt ;
t(1)cluster_body
t(1)end idn ;
.
.
.ndef parms
[ parm , etc ]
.
.
.ndef parm
idn , etc : type
.or
idn , etc : type_spec
.
.
.ndef args
( lbkt decl , etc rbkt )
.
.
.ndef decl
idn , etc : type_spec
.
.
.ndef returns
returns ( type_spec , etc )
.
.
.ndef yields
yields ( type_spec , etc )
.
.
.ndef signals
signals ( exception , etc )
.
.
.ndef exception
name lbkt ( type_spec , etc ) rbkt
.
.
.ndef where
where restriction , etc
.
.
.ndef restriction
idn has oper_decl , etc
.or
idn in type_set
.
.
.ndef type_set
{ idn | idn has oper_decl , etc ; lcurly equate rcurly }
.or
idn
.
.
.ndef oper_decl
op_name , etc : type_spec
.
.
.ndef op_name
name lbkt [ constant , etc ] rbkt
.
.
.ndef constant
expression
.or
type_spec
.
.
.ndef body
lcurly equate rcurly lcurly statement rcurly
.
.
.ndef cluster_body
s(1)lcurly equate rcurly rep = type_spec ; lcurly equate rcurly
b(1)routine lcurly routine rcurly
.
.
.ndef routine
procedure
.or
iterator
.
.
.ndef equate
idn = constant ;
.or
idn = type_set ;
.
.
.ndef type_spec
null
.or
bool
.or
int
.or
real
.or
char
.or
string
.or
any
.or
rep
.or
cvt
.or
array [ type_spec ]
.or
record [ field_spec , etc ]
.or
oneof [ field_spec , etc ]
.or
proctype ( lbkt type_spec , etc rbkt ) lbkt returns rbkt lbkt signals rbkt
.or
itertype ( lbkt type_spec , etc rbkt ) lbkt yields rbkt lbkt signals rbkt
.or
idn [ constant , etc ]
.or
idn
.
.
.ndef field_spec
name , etc : type_spec
.
.
.ndef statement
decl ;
.or
idn : type_spec := expression ;
.or
decl , etc := invocation ;
.or
idn , etc := invocation ;
.or
idn , etc := expression , etc ;
.or
primary . name := expression ;
.or
primary [ expression ] := expression ;
.or
invocation ;
.or
while expression do body end ;
.or
for lbkt decl , etc rbkt in invocation do body end ;
.or
for lbkt idn , etc rbkt in invocation do body end ;
.or
if s(1)expression then body
t(1)lcurly elseif expression then body rcurly
t(1)lbkt else body rbkt
t(1)end ;
.or
tagcase expression
t(1)tag_arm lcurly tag_arm rcurly
t(1)lbkt others : body rbkt
t(1)end ;
.or
return lbkt ( expression , etc ) rbkt ;
.or
yield lbkt ( expression , etc ) rbkt ;
.or
signal name lbkt ( expression , etc ) rbkt ;
.or
exit name lbkt ( expression , etc ) rbkt ;
.or
break ;
.or
continue ;
.or
begin body end ;
.or
statement except s(1)lcurly when_handler rcurly
t(1)lbkt others_handler rbkt
t(1)end ;
.
.
.ndef tag_arm
tag name , etc lbkt ( idn : type_spec ) rbkt : body
.
.
.ndef when_handler
when name , etc lbkt ( decl , etc ) rbkt : body
.or
when name , etc ( * ) : body
.
.
.ndef others_handler
others lbkt ( idn : type_spec ) rbkt : body
.
.
.ndef expression
.nr prec_stop percent_stop+prec_width
.nr curprec 6
primary
.or
( expression )
.or
~ expressionprec() (precedence)
.or
- expressionprec()
.or
expression ** expressionnprec()
.or
expression // expressionnprec()
.or
expression / expressionprec()
.or
expression * expressionprec()
.or
expression || expressionnprec()
.or
expression + expressionprec()
.or
expression - expressionprec()
.or
expression < expressionnprec()
.or
expression <= expressionprec()
.or
expression = expressionprec()
.or
expression >= expressionprec()
.or
expression > expressionprec()
.or
expression ~< expressionprec()
.or
expression ~<= expressionprec()
.or
expression ~= expressionprec()
.or
expression ~>= expressionprec()
.or
expression ~> expressionprec()
.or
expression & expressionnprec()
.or
expression cand expressionprec()
.or
expression | expressionnprec()
.or
expression cor expressionprec()
.
.
.ndef primary
nil
.or
true
.or
false
.or
int_literal
.or
real_literal
.or
char_literal
.or
string_literal
.or
idn
.or
idn [ constant , etc ]
.or
primary . name
.or
primary [ expression ]
.or
invocation
.or
type_spec $ { field , etc }
.or
type_spec $ [ lbkt expression : rbkt lbkt expression , etc rbkt ]
.or
type_spec $ name lbkt [ constant , etc ] rbkt
.or
force [ type_spec ]
.or
up ( expression )
.or
down ( expression )
.
.
.ndef invocation
primary ( lbkt expression , etc rbkt )
.
.
.ndef field
name , etc : expression
.
.
.ndef
.para
2Reserved word*:
one of the identifiers appearing in bold face in the syntax.
Upper and lower case letters are not distinguished in reserved words.
.para
2Name*, 2idn*:
a sequence of
letters,
digits,
and underscores that begins with a letter or underscore,
and that is not a reserved word.
Upper and lower case letters are not distinguished in names and idns.
.para
2Int_literal*:
a sequence of one or more decimal digits.
.para
2Real_literal*:
a mantissa with an (optional) exponent.
A mantissa is either a sequence of one or more decimal digits,
or two sequences (one of which may be empty) joined by a period.
The mantissa must contain at least one digit.
An exponent is 'E' or 'e',
optionally followed by '+' or '-',
followed by one or more decimal digits.
An exponent is required if the mantissa does not contain a period.
.para
2Char_literal*:
either a printing ASCII character
(octal 40 thru octal 176),
other than single quote or backslash,
enclosed in single quotes,
or one of the following escape characters enclosed in single quotes:
.show
escape sequence s(1)character
.sp .5
\'t(1)' s(2)(single quote)
\"t(1)"t(2)(double quote)
\\t(1)\t(2)(backslash)
\nt(1)NLt(2)(newline)
\tt(1)HTt(2)(horizontal tab)
\pt(1)FFt(2)(form feed, newpage)
\bt(1)BSt(2)(backspace)
\rt(1)CRt(2)(carriage return)
\vt(1)VTt(2)(vertical tab)
\***t(1)specified by octal value (* is an octal digit)
.eshow
The escape sequences may be written using upper case letters.
.para
2String_literal*:
a sequence of zero or more character representations,
enclosed in double quotes.
A character representation is either
a printing ASCII character other than double quote or backslash,
or one of the escape sequences listed above.
.para
2Comment*:
a sequence of characters that begins with a percent sign,
ends with a newline character,
and contains only printing ASCII characters and horizontal tabs in between.
.para
2Separator*:
a blank character
(space, vertical tab, horizontal tab, carriage return, newline, form feed)
or a comment.
Zero or more separators may appear between any two tokens,
except that at least one separator is required between any two adjacent
non-self-terminating tokens:
reserved words,
identifiers,
integer literals,
and real literals.
.eq or temp_or
.eq def temp_def

doc/clu/types.refman Normal file

@ -0,0 +1,529 @@
.sr opdefs_pdx Appendix`II
.sr int_pdx Appendix`II.4
.sr real_pdx Appendix`II.5
.sr type_def_sec Section`3.1
.sr cluster_sec Section`13
.sr self Section`7
.sr builtin_secs Sections self.1 to self.7
.sr gen_secs Sections self.8 to self.11
.sr cand_sec Section`10.8
.sr fetch_sec Section`10.5.1
.sr fetch_store_secs Sections 10.5.1 and 11.2.1
.sr force_sec Section`10.11
.sr array_cons_sec Section`10.6
.sr equate_sec Section`8.3
.sr rec_cons_sec Section`10.6
.sr get_set_secs Sections 10.5.2 and 11.2.2
.sr tagcase_sec Section`11.6
.sr system_sec Section`3.1
.sr invoke_sec Section`9.3
.sr const_sec Section`8.3
.sr bind_sec Section`4
.sr failure_sec Section`12.1
.sr cvt_parm_secs Sections 13.4 and 13.5
.chapter "Types, Type Generators, and Type Specifications"
.para
A 2type* consists of a set of objects together with
a set of operations to manipulate the objects.
As discussed in type_def_sec,
types can be classified according to whether their objects are mutable or immutable.
An immutable object (e.g., an integer) has a value that never varies,
while the value (state) of a mutable object can vary over time.
.para
A 2type generator* is a 2parameterized* type definition,
representing a (usually infinite) set of related types.
A particular type is obtained from a type generator by writing
the generator name along with specific values for the parameters;
for every distinct set of legal values,
a distinct type is obtained.
For example,
the array type generator has a single parameter that determines the element type;
array[int], array[real], and array[array[int]] are three distinct
types defined by the array type generator.
Types obtained from type generators are called 2parameterized* types;
others are called 2simple* types.
.para
Within a program,
a type is specified by a syntactic construct called a 2type_spec*.
The type specification for a simple type
is just the identifier (or reserved word) naming the type.
For parameterized types,
the type specification consists of the identifier (or reserved word) naming
the type generator, together with the parameter values.
.para
This section gives an informal introduction to
the built-in types and type generators provided by CLU;
many details (such as error conditions) are not discussed.
Complete and precise definitions are given in opdefs_pdx.
builtin_secs describe the objects, literals,
and some of the operations for each of the built-in types,
while gen_secs describe the objects, type specifications,
and interesting operations of types obtained from the built-in type generators.
A number of operations can be invoked using infix and prefix operators;
as the various operation names are introduced, the corresponding operator,
if any, will follow in parentheses.
.para
In addition, we describe type specifications for user-defined types,
and other special type specifications in Section self.12.
The mechanism by which new types and type generators are implemented
is presented in cluster_sec.
.
.section "Null"
.para
The type null has exactly one immutable object,
represented by the literal nil.
The type null is generally used as a kind of "place holder" in a oneof type
(see self.10).
.
.section "Bool"
.para
The two immutable objects of type bool, with literals true and false,
represent logical truth values.
The binary operations 2equal* (=), 2and* (&), and 2or* (|),
are provided, as well as unary 2not* (~).
.
.section "Int"
.para
The type int models (a range of) the mathematical integers.
The exact range is not part of the language definition,
and can vary somewhat from implementation to implementation (see int_pdx).
Integers are immutable objects,
and are written as a sequence of one or more decimal digits.
The binary operations
2add* (+), 2sub* (-), 2mul* (*), 2div* (/), 2mod* (//),
and 2power* (**) are provided, as well as unary 2minus* (-).
There are binary comparison operations
2lt* (<), 2le* (<=), 2equal* (=), 2ge* (>=), and 2gt* (>).
In addition, there are two operations, 2from_to* and 2from_to_by*,
for iterating over a sequence of integers.
For example, one can iterate over the odd numbers between one and 100 with
.show
int$from_to_by(1, 100, 2)
.eshow
.
.section "Real"
.para
The type real models (a subset of) the mathematical real numbers.
The exact subset is not part of the language definition,
although certain constraints are imposed (see real_pdx).
Reals are immutable objects,
and are written as a 2mantissa* with an (optional) 2exponent*.
A mantissa is either a sequence of one or more decimal digits,
or two sequences (one of which may be empty) joined by a period.
The mantissa must contain at least one digit.
An exponent is 'E' or 'e',
optionally followed by '+' or '-',
followed by one or more decimal digits.
An exponent is required if the mantissa does not contain a period.
As is usual, 2m*E2x* denotes 2m* multiplied by 10 raised to the power 2x*.
Examples of real literals are:
.show
3.14 3.14E0 314e-2 .0314E+2 3. .14
.eshow
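.para
The rules for real literals given above can be checked mechanically.
The following Python sketch encodes them as a regular expression;
the names REAL_LITERAL and is_real_literal are assumptions of this example,
and the pattern is one reading of the prose rather than an official lexer definition.

```python
import re

REAL_LITERAL = re.compile(
    r"""
    (?:
        (?:\d+\.\d*|\.\d+)     # mantissa with a period and at least one digit
        (?:[Ee][+-]?\d+)?      # exponent is optional when a period is present
      |
        \d+[Ee][+-]?\d+        # no period in the mantissa: exponent is required
    )
    """,
    re.VERBOSE,
)


def is_real_literal(text):
    """Return True if text is a well-formed real literal under the rules above."""
    return REAL_LITERAL.fullmatch(text) is not None
```

All six example literals shown above are accepted,
while 314 (no period and no exponent) and a lone period (no digit) are rejected.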
.para
As with integers,
the operations
2add* (+), 2sub* (-), 2mul* (*), 2div* (/), 2mod* (//),
2power* (**), 2minus* (-),
2lt* (<), 2le* (<=), 2equal* (=), 2ge* (>=), and 2gt* (>),
are provided.
It is important to note that there is no form of
2implicit* conversion between types.
So, for example,
the various binary operators cannot have one integer and one real argument.
The 2i2r* operation converts an integer to a real,
2r2i* rounds a real to an integer,
and 2trunc* truncates a real to an integer.
.
.section "Char"
.para
The type char provides the alphabet for text manipulation.
Characters are immutable, and form an ordered set.
Every implementation must provide at least 128, but no more than 512, characters;
the first 128 characters are the ASCII characters in their standard order.
.para
Printing ASCII characters (octal 40 thru octal 176),
other than single quote or backslash,
can be written as that character enclosed in single quotes.
Any character can be written by enclosing one of the
following escape sequences in single quotes:
.show 12
escape sequence s(1)character
.sp .5
\'t(1)' s(2)(single quote)
\"t(1)"t(2)(double quote)
\\t(1)\t(2)(backslash)
\nt(1)NLt(2)(newline)
\tt(1)HTt(2)(horizontal tab)
\pt(1)FFt(2)(form feed, newpage)
\bt(1)BSt(2)(backspace)
\rt(1)CRt(2)(carriage return)
\vt(1)VTt(2)(vertical tab)
\***t(1)specified by octal value (* is an octal digit)
.eshow
The escape sequences may be written using upper case letters.
Examples of character literals are:
.show
'7' 'a' '"' '\"' '\'' '\B' '\177'
.eshow
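.para
As an illustration of the escape rules, the following Python sketch decodes a character literal,
quotes included, into the character it denotes.
The names ESCAPES and decode_char_literal are hypothetical;
the table mirrors the escape sequences listed above.

```python
ESCAPES = {
    "'": "'", '"': '"', "\\": "\\",
    "n": "\n", "t": "\t", "p": "\f",   # \p is form feed (newpage)
    "b": "\b", "r": "\r", "v": "\v",
}


def decode_char_literal(lit):
    """Decode a character literal such as 'a' or '\\177' into one character."""
    if len(lit) < 3 or lit[0] != "'" or lit[-1] != "'":
        raise ValueError("not enclosed in single quotes")
    body = lit[1:-1]
    if len(body) == 1:
        # A printing ASCII character other than single quote or backslash.
        if body in ("'", "\\") or not (0o40 <= ord(body) <= 0o176):
            raise ValueError("character must be written as an escape sequence")
        return body
    if body[0] != "\\":
        raise ValueError("malformed literal")
    rest = body[1:]
    if len(rest) == 1 and rest.lower() in ESCAPES:
        return ESCAPES[rest.lower()]     # upper case letters are permitted
    if len(rest) == 3 and all(d in "01234567" for d in rest):
        return chr(int(rest, 8))         # three octal digits
    raise ValueError("unknown escape sequence")
```

Under this model the literals '"' and '\"' decode to the same character,
and '\B' decodes to backspace.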
.para
The usual binary comparison operations exist for characters:
2lt* (<), 2le* (<=), 2equal* (=), 2ge* (>=), and 2gt* (>).
There are also two operations, 2i2c* and 2c2i*,
for converting between integers and characters:
the smallest character corresponds to zero,
and the characters are numbered sequentially.
.
.section "String"
.para
The type string is used for representing text.
A string is an immutable sequence of zero or more characters.
Strings are lexicographically ordered, based on the ordering for characters.
A string is written as a sequence of zero or more character representations,
enclosed in double quotes.
Within a string literal,
a printing ASCII character other than double quote or backslash
is represented by itself.
Any character can be represented by using the escape sequences listed above.
Examples of string literals are:
.show
"Item\tCost" "altmode (\033) = \\033" "" " "
.eshow
.para
The characters of a string are indexed sequentially starting from one,
and there are a number of operations that deal with these indexes:
2fetch*, 2substr*, 2rest*, 2indexc*, and 2indexs*.
The 2fetch* operation is used to obtain a character by index.
Invocations of 2fetch* can be written using a special syntax
(fully described in fetch_sec):
.show
s[i] % get the character at index i of s
.eshow
2Substr* returns a string given a string, a starting index, and a length:
.show
string$substr("abcde", 2, 3) = "bcd"
.eshow
2Rest*, given a string and a starting index, returns the rest of the string:
.show
string$rest("abcde", 3) = "cde"
.eshow
2Indexc* computes the least index at which a character occurs in a string,
and 2indexs* does the same for a string;
the result is zero if the character or string does not occur:
.show 3
string$indexc('d', "abcde") = 4
string$indexs("cd", "abcde") = 3
string$indexs("abcde", "cd") = 0
.eshow
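.para
Because CLU strings are indexed from one, the index arithmetic is easy to get wrong when comparing with zero-based languages.
The Python functions below model the four operations just described.
They are a sketch only: error conditions (such as an out-of-range index) are deliberately not modeled,
and the function names simply echo the operation names.

```python
def substr(s, start, length):
    """Model of string$substr: length characters beginning at 1-based index start."""
    return s[start - 1 : start - 1 + length]


def rest(s, start):
    """Model of string$rest: everything from 1-based index start onward."""
    return s[start - 1 :]


def indexc(c, s):
    """Model of string$indexc: least 1-based index of character c in s, or 0."""
    return s.find(c) + 1      # find gives -1 when absent, which becomes 0


def indexs(pattern, s):
    """Model of string$indexs: least 1-based index at which pattern occurs in s, or 0."""
    return s.find(pattern) + 1
```

These reproduce the examples above: substr("abcde", 2, 3) gives "bcd" and indexs("abcde", "cd") gives 0.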
.para
Two strings can be concatenated together with 2concat* (||),
and a single character can be appended to the end of a string with 2append*.
Note that string$concat("abc",`"de") and string$append("abcd",`'e')
produce the 2same* string as writing "abcde".
2C2s* converts a character to a single-character string.
The size of a string can be determined with 2size*.
2Chars* iterates over the characters of a string,
from the first to the last character.
There are also the usual comparison operations:
2lt* (<), 2le* (<=), 2equal* (=), 2ge* (>=), and 2gt* (>).
.
.section "Any"
.para
A type specification is used to restrict
the class of objects that a variable can denote,
a procedure or iterator can take as arguments, a procedure can return, etc.
There are times when no restrictions are desired,
when any object is acceptable.
At such times, the type specification any is used.
For example,
one might wish to implement a table mapping strings to arbitrary objects,
with the intention that different strings could map to objects of different types.
The lookup operation, used to get the object corresponding to a string,
would have its result declared to be of type any.
.para
The type any is the 2union* of all possible types,
and it is the 2only* true union type in CLU;
all other types are 2base* types.
Every object is of type any, as well as being of some base type.
The type any has no operations;
however,
the base type of an object can be tested at run-time (see force_sec).
.
.section "Array Types"
.para
Arrays are one-dimensional, and are mutable.
Arrays are unconventional because the number of elements in an array
can vary dynamically.
Furthermore, there is no notion of an "uninitialized" element.
.para
The 2state* of an array consists of an integer called the 2low bound*,
and a sequence of objects called the 2elements*.
The elements of an array are indexed sequentially,
starting from the low bound.
All of the elements must be of the same type;
this type is specified in the array type specification,
which has the form
.show
array [ type_spec ]
.eshow
Examples of array type specifications are
.show 2
array[int]
array[array[string]]
.eshow
.para
There are a number of ways to create a new array,
of which only two are mentioned here.
The 2create* operation takes an argument specifying the low bound,
and creates a new array with that low bound and no elements.
An array 2constructor* can be used to create an array
with an arbitrary number of elements.
For example,
.show
array[int] $ [5: 1, 2, 3, 4]
.eshow
creates an integer array with low bound five, and four elements, while
.show
array[bool] $ [true, false]
.eshow
creates a boolean array with low bound one (the default), and two elements.
Array constructors are discussed fully in array_cons_sec.
.para
An array type specification states nothing about the bounds of an array.
This is because arrays can grow and shrink dynamically.
2Addh* adds a new element to the end of the array,
with index one greater than the previous top element.
2Addl* adds a new element to the beginning of the array,
and decrements the low bound by one,
so that the new element has an index one less than the previous bottom element.
2Remh* removes the top element;
2reml* removes the bottom element and increments the low bound.
Note that all of these operations preserve the indexes of the other elements.
Also note that these operations do not create holes;
they merely add to or remove from the ends of the array.
.para
As an example, if a 2remh* were performed on the integer array
.show
array[int] $ [5: 1, 2, 3, 4]
.eshow
the element 4 would disappear, and the new top element would be 3,
still with index 7.
If a 0 were added using 2addl*, it would become the new bottom element,
with index 4.
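.para
The bookkeeping of the low bound can be followed in a small Python model.
The class CluArray is hypothetical and covers only the operations discussed here;
having remh and reml return the removed element is an assumption of this sketch.

```python
class CluArray:
    """Model of a CLU array: a low bound plus a sequence of elements."""

    def __init__(self, low=1, elements=()):
        self.low = low
        self.elems = list(elements)

    def high(self):
        """Index of the top element."""
        return self.low + len(self.elems) - 1

    def addh(self, x):
        """Add x at the high end; the indexes of existing elements are unchanged."""
        self.elems.append(x)

    def addl(self, x):
        """Add x at the low end and decrement the low bound."""
        self.elems.insert(0, x)
        self.low -= 1

    def remh(self):
        """Remove the top element (returning it is an assumption)."""
        return self.elems.pop()

    def reml(self):
        """Remove the bottom element and increment the low bound."""
        x = self.elems.pop(0)
        self.low += 1
        return x

    def fetch(self, i):
        """Return the element at index i; an index with no element is illegal."""
        if not (self.low <= i <= self.high()):
            raise IndexError("bounds")
        return self.elems[i - self.low]
```

Starting from CluArray(5, [1, 2, 3, 4]), a remh leaves 3 as the top element at index 7,
and a following addl(0) makes the low bound 4 while the element 1 keeps index 5.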
.para
The 2fetch* operation extracts an element by index,
and the 2store* operation replaces an element by index.
There is no notion of an "uninitialized" element;
an index is illegal if no element with that index exists.
Invocations of these operations can be written using special forms
(covered fully in fetch_store_secs):
.show 2
a[i] % fetch the element at index i of a
a[i] := 3; % store 3 at index i of a
.eshow
.para
The 2top* and 2bottom* operations return
the element with the highest and lowest index, respectively.
The 2high* and 2low* operations return the highest and lowest indexes,
respectively.
The 2elements* iterator yields the elements from bottom to top,
and the 2indexes* iterator yields the indexes from low to high.
There is also a 2size* operation that returns the number of elements.
.para
Every newly created array has an identity that is distinct from all other arrays;
two arrays can have the same elements without being the same array object.
The identity of arrays can be distinguished with the 2equal* (=) operation.
The 2similar1* operation tests if two arrays have the same state,
using the 2equal* operation of the element type.
2Similar* tests if two arrays have similar states,
using the 2similar* operation of the element type.
For example, writing
.show
ai$[3: 1, 2, 3]
.eshow
(where "ai" is equated to array[int])
in different places produces arrays that are similar1 and similar (but not equal),
while the following produces arrays that are similar, but not similar1 (or equal):
.show
array[ai] $ [1: ai$create(1)]
.eshow
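.para
The difference between the three comparisons can be made concrete with a Python model.
This is a sketch under assumptions: the class Arr stands in for a CLU array,
object identity (Python's is) models array equal,
and for integers both equal and similar are taken to be ordinary numeric equality.

```python
class Arr:
    """Minimal stand-in for a CLU array: a low bound and a list of elements."""

    def __init__(self, low, elems):
        self.low = low
        self.elems = list(elems)


def equal(x, y):
    """Arrays are equal only if they are the same object; integers compare by value."""
    if isinstance(x, Arr) and isinstance(y, Arr):
        return x is y
    return x == y


def similar1(x, y):
    """Same state, with elements compared using the element type's equal."""
    return (x.low == y.low and len(x.elems) == len(y.elems)
            and all(equal(a, b) for a, b in zip(x.elems, y.elems)))


def similar(x, y):
    """Similar states, with elements compared using the element type's similar."""
    if isinstance(x, Arr) and isinstance(y, Arr):
        return (x.low == y.low and len(x.elems) == len(y.elems)
                and all(similar(a, b) for a, b in zip(x.elems, y.elems)))
    return x == y
```

Two separately built Arr(3, [1, 2, 3]) objects are similar1 and similar but not equal.
Two separately built Arr(1, [Arr(1, [])]) objects are similar but not similar1,
because their inner arrays are distinct objects.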
.
.section "Record Types"
.para
A record is a mutable collection of one or more named objects.
The names are called 2selectors*, and the objects are called 2components*.
Different components may have different types.
A record type specification has the form
.show
record [ field_spec , etc ]
.eshow
where
.show
.def field_spec "name , etc : type_spec"
.eshow
Selectors must be unique within a specification,
but the ordering and grouping of selectors is unimportant.
For example, all of the following name the same type:
.show 3
record [last, first, middle: string, age: int]
record [first, middle, last: string, age: int]
record [last: string, age: int, first, middle: string]
.eshow
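.para
One way to see why these specifications name the same type is to reduce each to a mapping from selector to type.
The Python sketch below does this;
the function normalize and the representation of a field_spec as a (names, type) pair are assumptions made for illustration.

```python
def normalize(field_specs):
    """Flatten grouped field specs into a selector-to-type mapping.

    field_specs is a list of (names, type_spec) pairs, one per field_spec.
    """
    fields = {}
    for names, type_spec in field_specs:
        for name in names:
            if name in fields:
                # Selectors must be unique within a specification.
                raise ValueError("duplicate selector: " + name)
            fields[name] = type_spec
    return fields


# The three specifications shown above, in order.
spec1 = [(["last", "first", "middle"], "string"), (["age"], "int")]
spec2 = [(["first", "middle", "last"], "string"), (["age"], "int")]
spec3 = [(["last"], "string"), (["age"], "int"), (["first", "middle"], "string")]
```

Python dictionaries compare equal regardless of insertion order,
so normalize(spec1), normalize(spec2), and normalize(spec3) are all equal.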
.para
A record is created using a record 2constructor*.
For example:
.show
info $ {last: "Jones", first: "John", age: 32, middle: "J."}
.eshow
(assuming that "info" has been equated to one of the above
type specifications; see equate_sec.)
An expression must be given for each selector,
but the order and grouping of selectors need not resemble
the corresponding type specification.
Record constructors are discussed fully in rec_cons_sec.
.para
For each selector "sel",
there is an operation 2get_*sel to extract the named component,
and an operation 2set_*sel to replace the named component with
some other object.
For example,
there are 2get_middle* and 2set_middle* operations for the type specified above.
Invocations of these operations can be written in a special form
(discussed fully in get_set_secs):
.show 2
r.middle % get the 'middle' component of r
r.age := 33; % set the 'age' component of r to 33
.eshow
.para
As with arrays,
every newly created record has an identity that is distinct from all other records;
two records can have the same components without being the same record object.
The identity of records can be distinguished with the 2equal* (=) operation.
The 2similar1* operation tests if two records have the same components,
using the 2equal* operations of the component types.
2Similar* tests if two records have similar components,
using the 2similar* operations of the component types.
.
.section "Oneof Types"
.para
A oneof type is a 2tagged discriminated union*.
A oneof is an immutable labeled object,
to be thought of as "one of" a set of alternatives.
The label is called the 2tag*,
and the object is called the 2value*.
A oneof type specification has the form
.show
oneof [ field_spec , etc ]
.eshow
where (as for records)
.show
.def field_spec "name , etc : type_spec"
.eshow
Tags must be unique within a specification,
but the ordering and grouping of tags is unimportant.
.para
As an example of a oneof type,
the representation type for a linked list of integers,
int_list,
might be written
.show 2
oneof [s(1)empty: s(2)null,
t(1)cell:t(2)record [car: int, cdr: int_list]]
.eshow
As another example, the contents of a "number container" might be specified by
.show 4
oneof [s(1)empty: s(2)null,
t(1)integer:t(2)int,
t(1)real_num:t(2)real,
t(1)complex_num:t(2)complex]
.eshow
.para
For each tag "t" of a oneof type,
there is a 2make_*t operation which takes an object
of the type associated with the tag,
and returns the object (as a oneof) labeled with tag "t".
For example,
.show
number$make_real_num (1.37)
.eshow
creates a oneof object with tag "real_num"
(assuming "number" has been equated to the "number container" type specification above;
see equate_sec).
.para
The 2equal* operation tests if two oneofs have the same tag,
and if so, tests if the two objects are the same,
using the 2equal* operation of the value type.
2Similar* tests if two oneofs have the same tag,
and if so, tests if the two objects are similar,
using the 2similar* operation of the value type.
.para
To determine the tag and value parts of a oneof object,
one normally uses the tagcase statement, discussed in tagcase_sec.
.
.section "Procedure and Iterator Types"
.para
Procedures and iterators are immutable objects, created by the CLU system
(see system_sec).
The type specification for a procedure or iterator contains most of the information
stated in a procedure or iterator heading; a procedure type specification has the form
.show
proctype ( lbkt type_spec , etc rbkt ) lbkt returns rbkt lbkt signals rbkt
.eshow
and an iterator type specification has the form
.show
itertype ( lbkt type_spec , etc rbkt ) lbkt yields rbkt lbkt signals rbkt
.eshow
where
.show 4
.long_def exception
.def1 returns "returns ( type_spec , etc )"
.def1 yields "yields ( type_spec , etc )"
.def1 signals "signals ( exception , etc )"
.def1 exception "name lbkt ( type_spec , etc ) rbkt"
.eshow
The first list of type specifications describes
the number, types, and order of arguments.
The returns or yields clause gives the number, types, and order of the objects
to be returned or yielded.
The signals clause lists the exceptions raised by the procedure or iterator;
for each exception name,
the number, types, and order of the objects to be returned are also given.
All names used in a signals clause must be unique, and cannot be "failure",
which has a standard meaning in CLU (see failure_sec).
The ordering of exceptions is not important.
For example, both of the following type specifications name the procedure type
for string$substr:
.show 2
proctype (string, int, int) returns (string) signals (bounds, negative_size)
proctype (string, int, int) returns (string) signals (negative_size, bounds)
.eshow
1String*$chars has the following iterator type:
.show
itertype (string) yields (char)
.eshow
.para
Procedure and iterator types have an 2equal* (=) operation.
Invocation is 2not* an operation,
but a primitive action of CLU semantics (see invoke_sec).
.
.section "Other Type Specifications"
.para
The type specification for a user-defined type has the form
.show
idn lbkt [ constant , etc ] rbkt
.eshow
where each 2constant* must be compile-time computable (see const_sec).
The identifier must be bound to a data abstraction (see bind_sec).
If the referenced abstraction is parameterized,
constants of the appropriate types and number must be supplied.
The order of parameters always matters in user-defined types.
.para
There are three special type specifications that are used
when implementing new abstractions: rep, cvt, and type.
These forms are discussed in cvt_parm_secs.
Within an implementation of an abstraction,
formal parameters declared with type can be used as type specifications.
.para
In addition, identifiers which have been equated to type specifications
can also be used as type specifications.
Equates are discussed in equate_sec.