.chapter "Scope, Declarations, and Equates"
.para
As was discussed in sect_progs, the structure of a CLU
program is not deeply nested as is customary in block-structured
languages, but rather consists of a group of modules all at
the same level.
The names of the modules, and in the case of clusters, the names
of the operations, are globally known throughout this level.
However, since modules are not nested within other modules,
identifiers used within modules to name, for example, variables,
are purely local.
Since we expect modules to be rather small (in the absence of nesting),
we felt it was reasonable to insist that local identifiers not
be redefined within a module.
Therefore, although there is block structure within a module, it is
not possible to redefine in an inner scope an identifier declared
in an outer scope.
.para
Each full_module defines a scoping unit.
In addition, all compound statements define new scoping units in the
obvious places.
For example, in the if statement, both the then clause
and the else clause are new scoping units.
.para
Variables may be declared anywhere within a scoping unit;
declarations are not constrained to appear at the beginning of a unit.
The actual scope of the variable begins after its declaration, and continues to the
end of the smallest enclosing scoping unit.
.para
A variable declaration gives a name for the variable and the
type of the variable.
In addition, an initial value for the variable may be provided.
The use of an uninitialized variable will raise an exception
if the error is not caught at compile time.
(CLU arrays and records are defined in such a way that there
are no uninitialized elements or components to worry about,
so we can guarantee that all expressions are well-defined
by checking variable usage.)
.para
2Equates* are used to establish abbreviations for types and constants.
Each expression in an equate must be compile-time
computable, and must produce an object belonging to one of the
built-in, immutable types (see sect_semantics.1).
.para
Equates must all appear in a group at the beginning of a scoping unit;
the order of the equates is unimportant, but they may not be recursive.
The actual scope of the equates is the entire scoping unit containing them.
.chapter "Expressions and Statements"
.para
CLU is somewhat unusual in that almost all expressions are considered
to be just a syntactic means of invoking procedures.
This view permits user-defined and built-in types to be treated
uniformly, e.g, x + y invokes T$add
whether the type of x, T, is built-in or user-defined.
This view also fits our model that exceptions arise from invocations
(see sect_except).
Note that this view does not preclude in-line code for expressions
(for both built-in and user-defined types);
we simply view the production of in-line code as analogous
to in-line substitution for an invocation, followed possibly by
some optimization.
.para
One exception to the view of expressions as invocations
is the use of the cand (conditional and) and cor
(conditional or) operators.
These operators are defined to shortcut evaluation of their operands;
for example, the second operand of cand will be evaluated only
if the first operand evaluates to true.  Thus, cand
and cor cannot be explained in terms of invocation.
These operators are not available for overloading, and they
do not raise any exceptions.
.para
CLU statements are, for the most part, fairly conventional.
The most basic statements are the assignment statement and the
invocation statement;
the semantics of these statements is discussed in the next section.
.para
There are a number of compound statements:  block,
conditional, iterative, tagcase and except statements.
Blocks are used to group statements, and to introduce new scoping units.
The conditional statement is the usual if statement, with an
additional elseif form which may be used when there are a number of
clauses all at the same level.
.para
There are two iterative statements: one is the usual while statement;
the other, the for statement, is used in conjunction with an
iterator which controls the looping.
.para
The tagcase statement is used to discriminate on the
tag of a oneof object;
it provides 2arms* for possible values of the tag, plus a special
others arm to handle tag values not mentioned explicitly.
.para
The except statement is used to handle exceptions
arising from invocations, plus locally generated exits.
Its form is similar to that of the tagcase statement, with arms to handle
explicitly named exceptions and exits, and an optional
others arm to handle any exception not explicitly mentioned.
.para
Finally, there are a number of termination statements.
The return statement terminates a procedure or iterator in the
normal condition, while the signal statement is used to terminate
in an exceptional condition.
The yield statement is used within an iterator to
produce the next item in the sequence.
.para
1Return*, signal, and yield are inter-module control mechanisms.
The remaining termination statements are all intra-module.
The exit statement raises an exit condition that must be
handled by a local except statement.
The break statement terminates the smallest enclosing loop, and
the continue statement terminates just
the current cycle of the smallest enclosing loop.
.en
.sr sect_iter Section 14.5
.sr sect_proctypes Section 9.11
.sr app_types Appendix II
.sr sect_handle Section 14.3
.sr app_io Appendix III
\k
.chapter "Exception Handling and Exits"
.para 1
A procedure is designed to perform a certain task, taking
some number and types of arguments and returning some
number and types of result objects.
However, in certain
cases (e.g., for particular values of arguments), that task
may be impossible to perform.
In such a case, instead of returning normally (which would
imply successful performance of the intended task), the
procedure should notify its caller of its failure
by signalling an i(exception).
.para
For example, consider integer division.
The int$div procedure takes two integer arguments and returns their quotient.
However, if the second argument to int$div is zero, then
there is no quotient.
In this case, instead of returning,
int$div signals the exception zero_divide.
We include in the type specification of a procedure
a description of the exceptions it may signal, for example,
int$div is of type
.show
proctype (int, int) returns (int) signals (zero_divide)
.eshow
.para
In this section, we will concentrate on exceptions signalled by procedures.
However, exceptions may also be signalled by iterators, and all we say about
procedures applies to iterators as well, except
as described in sect_iter below.
.section "The Exception Handling Mechanism"
.para 1
The exception handling mechanism consists of two parts,
the signalling of exceptions and the handling of exceptions.
Signalling is the way a procedure notifies its caller
that it has discovered an exceptional condition.
Handling is the way that the caller of the procedure
specifies what is to be done if the procedure signals
an exception.
.para
Signalling an exception is an alternative form of returning.
When a procedure signals an exception, the current
activation of that procedure terminates and control is transferred
to a handler in the caller.
The signaller may
return objects to the exception handler, to help explain
the exceptional condition.
.para
An exception is identified by a name.
A procedure may
signal zero or more exceptions, whose names must be distinct.
Since signalling is like returning, each exception has an associated
list of types specifying what objects may be returned to the caller.
An exception name and its
associated list of types is called an i(exception@specification).
The specifications of the exceptions
signalled by a procedure are part of
the type of the procedure (see sect_proctypes).
In addition,
any procedure can signal the exception i(failure), which
always has a single accompanying object of type string.
The failure exception is implicitly part of all
procedure types; it may not be declared explicitly.
(The use of i(failure) is intended to indicate errors
from which it is unlikely or impossible to recover,
such as hardware malfunctions.)
.section "Signalling Exceptions"
.para 1
An exception is signalled by the signal statement, which has the form:
.show
signal name lbkt ( expression, etc ) rbkt
.eshow
where i(name) is the name of the exception to be signalled.
.para
A signal statement may appear anywhere in the body of a procedure.
The execution of a signal statement begins with the
evaluation of the expressions (if any), from left to right,
to produce a list of i(signal@argument) objects.
The activation of the executing procedure is then terminated.
Execution continues as described in section sect_handle below.
.para
The named exception must be either i(failure) or one
of the exceptions listed in the procedure header.
If the exception is i(failure), then there must be
exactly one signal argument expression, whose
type is string.
Otherwise, if the corresponding
exception specification in the procedure header has the form
.show
name (T1, etc, Tn)
.eshow
then there must be exactly i(n) signal argument expressions
and the type of the expression i(i) must be
included in the Ti.
.para
The following useless procedure contains a number of examples
of signal statements:
.show
.ta 20
signaller = 	proc (i: int) 
signals (foo, bar (int), bletch (string, bool));
	if i < 0 then signal foo; end;
	if i > 0 then signal bar (i - 1); end;
	if i = 0 then signal bletch ("zero", true); end;
	signal failure ("unreachable statement executed");
	end signaller;
.eshow
.section "Handling Exceptions"
.para 1
When a procedure activation terminates by signalling an exception, we say that
the corresponding procedure invocation (the text of the call)
i(raises) that exception.
The caller specifies what action should be taken when an exception
is raised by the use of i(handlers), which are
written using the except statement.
.para
The except statement has the form:
.show
	statement except s(1)lcurly when_handler rcurly
	t(1)lbkt others_handler rbkt
	t(1)end
where
.long_def others_handler
.def1 when_handler
when name , etc lbkt ( decl , etc ) rbkt : body
.or
when name , etc ( * ) : body

.def1 others_handler
others lbkt ( idn : type_spec ) rbkt : body
.eshow
We will call the statement to which the handlers are attached S.
The handlers handle exceptions that are raised by invocations in the statement S.
Each when_handler specifies one or more exception names
and a body to be executed if one of those exceptions is raised.
The optional others_handler is used to handle all exceptions
not explicitly named in the when_handlers.
The statement S can be a compound
statement, and can even contain other except statements.
Whenever two
except statements are nested in this fashion, and both have handlers
for the same exception, the innermost handler will take precedence
(see below).
.para
An except statement is executed as follows.
First, the statement S is executed.
If it terminates normally, then the except statement terminates normally also.
If some exception E is raised in S,
and the exception E is not handled by a handler within S,
then the execution of S is terminated and the attached handlers
are examined to see if any one of them will handle the exception E.
If so, then the body of the corresponding
handler is executed;
when the body terminates, the entire except statement terminates.
If there is no handler for the exception E in this except statement,
then the except statement itself terminates raising the exception E.
This will presumably be handled by some enclosing except statement.
.para
Thus, when an exception E is raised,
control is passed to the innermost exception handler
that handles the exception E.
 Exceptions that are raised inside of handlers
are treated no differently from other
exceptions: control is passed to the innermost exception handler
for that exception (in a surrounding except statement).
Whenever a handler terminates, the except statement of
which it is a part terminates as well.
The set of invocations for which
a handler is effective is called the i(range) of that handler.
The range of a handler for
an exception E is that set of invocations within the attached
statement that are not inside the range of a nested handler
for the exception E.
.para
Recall that the infix and prefix operators are merely
syntactic sugar for procedure invocations.
Thus, the execution of such operators can signal exceptions and these
exceptions can be handled by the procedure containing
the use of the operator.
app_types describes the operations of the built-in types
and type generators, and the exceptions that those operations may signal.
.para
An invocation need not be surrounded by except statements
to handle all exceptions potentially raised by that invocation.
This policy was adopted because in many cases the programmer can prove that
a particular exception will not arise.
For example, the invocation int$div(x,7)
will never signal zero_divide.
However, this policy does lead to the possibility that some invocation may raise
an exception E and not be within the range of any handler for E.
Thus, we make the following rule.
If an invocation raises an exception E,
and that invocation is not within the range of any handler for E, then the
procedure containing that invocation is terminated and signals
the exception i(failure).
The exception name E is made into a string (all in lower case), and this string
is the argument of the failure signal.
As a special case, if the original exception E was itself i(failure), then
the original string argument is passed along with the new signal,
instead of "failure".
(This avoids losing the original exception name when a i(failure)
propagates up several levels.)
.para
Now let us consider the form of the handlers in more detail.
The when forms handle particular sets of exceptions.
The first form, without declarations, simply specifies a set of exception names.
This form is used to handle exceptions with no associated signal arguments.
The same form i(with) declarations is used to handle exceptions with signal arguments.
Each exception must have the same number of arguments as specified in the formal
argument list (i.e., the declarations), and their types must match exactly.
Within the body of the handler, the
declared formal arguments may be used to access the actual signal arguments.
These arguments are variables (initialized to the signal arguments),
local to the handler body.
The second form (with *) can be used to handle any exceptions of the
given names, regardless of whether or not there are associated signal arguments.
Any actual signal arguments will be thrown away.
.para
All of the exception names appearing in the when_handlers of
an except statement must be distinct.
Each exception must be potentially raised by some invocation within the
range of the handler.
For any exception handled using the when form with arguments,
all invocations within the range of the handler that potentially
raise an exception with that name
must provide the exact number and types of signal arguments
as specified in the formal argument list.
(The programmer must place handlers for an exception sufficiently close
to the invocations that raise that exception so that this restriction
is satisfied.)
.para
The others form is optional.
At most one may be used in an except statement, and it must appear last.
An others_handler handles any exception not handled by another
handler in the except statement.
If a formal argument is declared, it must be of type string.
If the actual exception is not i(failure), then the formal argument
will denote a string object which is the
name of the actual exception, in lower case; any actual signal
arguments will be thrown away.
However, if the actual exception is i(failure), then the formal
argument will denote the actual (string) signal argument.
.section "An Example"
.para 1
We now present an example demonstrating the use of exception handlers.
We will write a procedure, sum_stream,
which reads a sequence of signed decimal integers from a character
stream and returns the sum of those integers.
The stream is viewed as containing a sequence of fields separated by spaces;
each field must consist of a non-empty sequence of
digits, optionally preceded by a single minus sign.
Sum_stream has the form
.show
.ta 20
sum_stream = 	proc (s: stream) returns (int) signals (s(1)overflow,
	t(1)unrepresentable_integer (string),
	t(1)bad_format (string));
	etc
	end sum_stream;
.eshow
Sum_stream signals overflow if the sum of the
numbers or an intermediate sum is outside the implemented range of integers.
Unrepresentable_integer is signalled if the stream contains
an individual number that is outside the implemented range of integers.
Bad_format us signalled if the stream contains a
field that is not an integer.
.para
We will use the i(getc) operation of the i(stream) data type
(see app_io), whose type is
.show
proctype (stream) returns (char) signals (end_of_file, not_possible(string));
.eshow
This operation returns the next character from the stream, unless the
stream is empty, in which case end_of_file is signalled.
Not_possible is signalled if the operation cannot be performed on the
given stream (e.g., it is an output stream, or does not allow character
operations, etc.)
We will assume that we are given a stream for which getc is possible.
.para
The following procedure is used to convert character strings to integers:
.show
.ta 20
s2i = 	proc (s: string) returns (int) signals (s(1)invalid_character (char),
	t(1)bad_format,
	t(1)unrepresentable_integer);
	etc
	end s2i;
.eshow
S2i signals invalid_character if its string argument
contains a character other than a digit or a minus sign.
Bad_format is signalled if the string contains
a minus sign following a digit, more than one minus
sign, or no digits.
Unrepresentable_integer is signalled if the string
represents an integer that is outside the implemented range of integers.
.para
An implementation of sum_stream is presented in Figure current_figure.
.begin_figure "The sum_stream procedure."
.show
.ta 20 28 36 42
sum_stream = 	proc (s: stream) returns (int) signals (s(1)overflow,
	t(1)unrepresentable_integer (string),
	t(1)bad_format (string));
	sum: int := 0;
	num: string := "";

	while true do

		% skip over spaces between values; sum is valid, num is meaningless
		c: char := stream$getc(s);
		while c = ' ' do 
			c := stream$getc(s);
			end;

		% read a value; num accumulates new number, sum becomes previous sum
		while c ~= ' ' do
			num := string$append(num, c);
			c := stream$getc(s);
			end;
		except when end_of_file: end;

		% restore sum to validity
		sum := sum + s2i(num);
		end;
	except
	     when end_of_file: return(sum);
	     when unrepresentable_integer: signal unrepresentable_integer (num);
	     when bad_format,invalid_character(*): signal bad_format (num);
	     when overflow: signal overflow;
	     end;
	end sum_stream;
.eshow
.finish_figure
There are two loops within an infinite loop:
one to skip spaces, and one to accumulate digits for conversion to a number.
Notice the placement of the inner end_of_file handler.
If end_of_file is raised in the second inner loop,
then the sum is computed correctly, and the first invocation
of stream$getc will again raise end_of_file.
This time, however, the infinite loop is terminated and
execution transfers to the other end_of_file handler,
which then returns the accumulated sum.
.para
We have placed the remaining exception handlers outside of the infinite loop
to avoid cluttering up the main part of the algorithm.
Each of these exception handlers could also
have been placed after the particular statement containing
the invocation that signalled the corresponding exception.
The (*) form is used in the handler for the bad_format and invalid_character
exceptions since the signal arguments are not used.
Note that the overflow handler catches exceptions
signalled by the int$add procedure, which is invoked
using the infix + notation.
Note also that in this example all of the exceptions
raised by sum_stream originate as exceptions signalled by lower-level modules.
Sum_stream simply reflects these exceptions
upwards in terms that are meaningful to its callers.
Although some of the names may be unchanged, the meanings
of the exceptions (and even the number of arguments) are
different in the two levels.
.para
As mentioned above, we have assumed the stream$getc will not signal
not_possible; if it does, then sum_stream will terminate, raising
the exception failure("unhandled exception: not_possible").
.section "Summary"
.para 1
Any activation of a procedure may terminate in one of two ways:
it may terminate normally, returning zero or more result
objects, or it may signal an exception, along with zero or
more signal arguments.
In the latter case, we say that
the invocation of the procedure may i(raise) the given exception.
The set of possible exceptions that may be raised by
a procedure invocation is determined from the type of the procedure.
This set always includes the i(failure) exception.
.para
If a procedure invocation is a
component of an expression, and the invocation terminates
by raising an exception E (with associated signal
arguments), then the entire expression immediately terminates,
raising the exception E (with the associated signal arguments).
The set of possible exceptions that may be raised by an expression
is the set of all exceptions that may
be raised by procedure invocations within that expression.
.para
Expressions are embedded in statements.
If, during the execution of a statement, an embedded expression
terminates by raising an exception E (with associated signal
arguments), then the statement itself immediately
terminates raising the exception E (with the associated
signal arguments).
.para
Statements may be composed from smaller statements.
In general, if a component statement terminates by raising an exception E,
then the containing statement also immediately terminates, raising the exception E.
However, if the statement is an except statement, and
the except statement contains a handler that handles
the exception E, then the handler body is executed,
as described in the preceding section.
If an iterator invocation terminates raising an exception E,
then the entire for statement which invoked the iterator
immediately terminates raising the exception E.
.para
The set of possible exceptions that may be raised by
a non-except statement is the union of the sets of possible
exceptions that may be raised by any component expression or statement.
The set of possible exceptions that may be raised
by an except statement consists of the
set of exceptions that may be raised by the
component statement, minus the set of exceptions handled
by the handlers, plus the set of exceptions that may be
raised by the handler bodies.
.para
Thus, any expression or statement may terminate either normally
or by raising some exception.
The set of possible exceptions always includes the i(failure) exception.
.para
Finally, all procedure (and iterator) bodies are implicitly surrounded by
an exception handler of the form:
.show
begin
	etc i(body) etc
	end except
		when others (s: string) : signal failure(s)
		end
.eshow
.section "Exits and the Placement of Handlers"
.para 1
A i(local) transfer of control can be effected by
by using the exit statement, which has the form:
.show
exit name lbkt ( expression, etc ) rbkt
.eshow
The exit statement is similar to the signal statement except
that where the signal statement i(signals) an exception
to the i(calling) procedure, the exit statement i(raises) the
exception directly in the i(current) procedure.
 An exception raised by an exit statement
i(must) be handled by a handler in the procedure
containing the exit statement.
 The handler must explicitly name the particular exception
(i.e., the others form cannot be used)
and may not throw away any signal arguments
(i.e., the (*) form cannot be used).
.para
The exit statement and the signal statement
mesh nicely to form a uniform mechanism.
The signal statement can be viewed
as simply terminating a procedure activation; an exit is
then performed at the point of invocation.
(Because this exit is implicit, it is not subject to the restrictions listed above.)
.para
In some cases, however, other requirements may prohibit placing
exception handlers to take advantage of the implicit exit.
For example, assume that you wish to handle a particular exception
signalled by a particular set of invocations.
To avoid catching unwanted exceptions,
the handler must be placed sufficiently close to the
set of invocations so that no other invocation raising an exception
of that name is in the range of the handler.
The facts that the handlers must be close to the invocations,
and that the statement you wish to terminate when the exception is raised
may be rather large can require you to put explicit exit statements
in the handlers to force termination of the larger statement.
The point is that exits are a necessary feature in maintaining
the overall effectiveness of the signal mechanism.