.sr all 4*
.sr eps 4*
.sr app_exam Appendix`IV
.sr sect_parms Section`13.5
.sr sect_invoke Section`9.3
.sr sect_library Section`4
.sr sect_instant Section`10.3
.sr sect_opname Section`10.3
.sr app_io Appendix`III
.sr sect_for_stmt Section`11.5.2
.chapter "Modules"
.para
A CLU program consists of a group of modules.
Three kinds of modules are provided,
one for each kind of abstraction we have found to be useful in program construction.
Procedures support procedural abstraction,
iterators support control abstraction,
and clusters support data abstraction.
In this section we describe each type of module.
We explain how to define modules, and also discuss how to use each module type
in building programs.
.section "Programs"
.para
If written out in full,
a CLU program would be a sequence of 2modules*:
.show
lcurly module rcurly
.eshow
where
.show 3
.def module "lcurly equate rcurly procedure"
.or "lcurly equate rcurly iterator"
.or "lcurly equate rcurly cluster"
.para
A module defines a new scope.
The identifiers introduced in the equates (if any)
and the identifier naming the abstraction (the 2module name*)
are local to that scope (and therefore may not be redefined in an inner scope).
However,
the module name is also globally available throughout the program and is used
in other modules to refer to the module it names.
Every module name used in the program must be defined by some module of the program.
.para
Module names are the only global identifiers within a program,
but it is permissible to redefine these identifiers in other modules.
Note that this is in contrast to the usual CLU rule that
identifiers may not be redefined in an inner scope.
.para
The programmer does not write a program as shown above.
Rather,
the programmer writes modules separately and enters them into
the CLU system as convenient.
Each module may refer to abstractions implemented by
other modules by using non-local identifiers.
The system will provide some means of determining what abstractions are meant by
these non-local identifiers;
one such mechanism is defined in sect_library.
Some example programs appear in app_exam.
.section "Procedures"
.para
A procedure performs an action on zero or more
2arguments*, and terminates returning zero or more 2results*.
A procedure supports a 2procedural abstraction*:  a mapping from
a set of input objects to a set of result objects, with possible
modification of some of the input objects.
A procedure may terminate in one of a number of 2conditions*;
one of these is the 2normal condition*, while others are
2exceptional conditions*.
Differing numbers and types of results may be returned in the
different conditions.
.para
The form of a procedure is
.show 3
idn = s(1)proc lbkt parms rbkt args lbkt returns rbkt lbkt signals rbkt lbkt where rbkt ;
t(1)body
t(1)end idn ;
.eshow
where
.show 4
.long_def exception
.def1 args "( lbkt decl , etc rbkt )"
.def1 returns "returns ( type_spec , etc )"
.def1 signals "signals ( exception, etc )"
.def1 exception "name lbkt ( type_spec, etc ) rbkt"
.eshow
.para
In this section we discuss non-parameterized procedures.
For a non-parameterized procedure, the 2parms* and 2where* clauses are missing.
Parameterized modules are discussed in sect_parms.
.para
The heading of a procedure describes the way in which the
procedure communicates with its caller.
The 2args* clause describes the number, order, and types of
arguments required to invoke the procedure, while the 2returns*
clause describes the number, order, and types of results returned
when the procedure terminates normally
(by executing a return statement or reaching the end of its body).
A missing returns clause indicates that no results are returned.
.para
The 2signals* clause names the exceptional conditions
in which the procedure can terminate, and specifies the
number, order, and types of result objects returned in each condition.
In addition to the conditions explicitly named in the signals clause, any
procedure can terminate in the 2failure* condition.
The 2failure* condition returns with one result, a string object.
All names of exceptions in the signals clause must be distinct,
and none can be "failure."
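.para
As an illustration of the args, returns, and signals clauses,
consider the following sketch.
The procedure 2safe_div* is hypothetical; it is not part of the CLU library.
.show 5
safe_div = proc (num, den: int) returns (int) signals (zero_divide)
	% exceptional termination: no result objects are returned
	if den = 0 then signal zero_divide end
	return (num / den)
	end safe_div
.eshow
The heading states that 2safe_div* takes two integers, returns one integer
when it terminates normally, and may instead terminate in the
2zero_divide* condition with no results.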
.para
A procedure is an object of some procedure type.
For a non-parameterized procedure, this type
is derived from the procedure heading by removing the
procedure name, rewriting the formal argument declarations with
no "factoring" and then deleting the names of formal
arguments, and finally replacing proc by proctype.
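.para
For example, assuming a hypothetical procedure with the heading
.show
scale = proc (x, y: real, n: int) returns (real) signals (overflow)
.eshow
the argument declarations are first rewritten without factoring as
(x: real, y: real, n: int), and the resulting type is
.show
proctype (real, real, int) returns (real) signals (overflow)
.eshow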
.para
As was discussed in sect_invoke,
the invocation of a procedure causes the introduction of the formal
variables, and the actual arguments are assigned to these variables.
Then the procedure body is executed.
Execution terminates when a return statement or a signal
statement is executed, or when the textual end of the body is reached.
If a procedure that should return results reaches the textual end
of the body, the procedure terminates in the
2failure*("no return values") condition.
At termination the result objects, if any, are passed back
to the invoker of the procedure.
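.para
The following sketch illustrates the three ways in which execution of a
body can end.
The procedure 2sign_of* is hypothetical, and the omission of the case
i = 0 is deliberate.
.show 7
sign_of = proc (i: int) returns (int) signals (too_big)
	if i > 1000 then signal too_big end      % signal statement
	if i > 0 then return (1) end             % return statement
	if i < 0 then return (-1) end
	% for i = 0 the textual end of the body is reached, so the
	% procedure terminates in the failure("no return values") condition
	end sign_of
.eshow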
.para
The 2idn* following the end of the procedure
must be the same as the 2idn* naming the procedure.
.para
An example of a procedure is given in app_exam.
.section "Iterators"
.para
An iterator computes a sequence of 2items*,
one item at a time, where an item is a group of zero or more objects.
In the generation of such a sequence, the computation of
each item of the sequence is usually controlled by information about
what previous items have been produced.
Such information and the way it controls the production of items
is local to the iterator.
The user of the iterator is not concerned with how the items
are produced, but simply uses them (through the for statement)
as they are produced.
Thus the iterator abstracts from the details of how the production
of the items is controlled;
for this reason, we consider an iterator to implement a control abstraction.
Iterators are particularly useful as operations of data abstractions
that are collections of objects (e.g., sets), since they may produce the
objects in a collection without revealing how the collection
is represented.
.para
An iterator has the form
.show 3
idn = s(1)iter lbkt parms rbkt args lbkt yields rbkt lbkt signals rbkt lbkt where rbkt ;
t(1)body
t(1)end idn ;
.eshow
where 
.show
.def yields "yields ( type_spec , etc )"
.eshow
In this section we discuss non-parameterized iterators,
in which the 2parms* and 2where* clauses are missing.
Parameterized modules are discussed in sect_parms.
.para
The form of an iterator is very similar to the form of a procedure.
There are only two differences:
.nlist
An iterator has a yields clause in its heading in
place of the returns clause of a procedure.
The yields clause lists the number, order, and types
of objects yielded each time the iterator produces the next item in the sequence.
If zero objects are yielded, then the yields clause is missing.
.nnext
Within the iterator body, the yield statement is used to
produce the next item in the sequence.
An iterator terminates in the same manner as a procedure
(note that it may not return any results).
.end_list
.para
An iterator is an object of some iterator type.
Its type can be derived from its header by removing the
iterator name, rewriting the formal argument declarations without
any "factoring" and then deleting the formal argument names, and replacing
iter by itertype.
.para
The execution of iterators is described in sect_for_stmt.
.para
An example of an iterator is
.show 6
from_to_by = iter (from, to, by: int) yields (int)
	while (from <= to & by >= 0) | (from >= to & by < 0) do
		yield (from)
		from := from + by
		end
	end from_to_by
.eshow
Additional examples of iterators are given in the next section.
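.para
As a sketch of how such an iterator is used, the following
fragment sums the items produced by 2from_to_by* as defined above;
the variable names are chosen only for illustration.
.show 4
sum: int := 0
for i: int in from_to_by (1, 10, 3) do
	sum := sum + i
	end
.eshow
The iterator yields 1, 4, 7, and 10, so 2sum* is 22 when the loop terminates.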
.sp
1Remarks*
.para
Iterators provide a useful mechanism for abstracting from the details of control.
Furthermore, they permit for statements to iterate over
the objects of interest, rather than requiring a mapping from
the integers to those objects.
.para
It is important to realize that the
argument objects passed to the iterator are also accessible in the
body of the for loop controlled by the iterator.
If some argument object is mutable, and the
iterator modifies it, the change can affect the behavior of the for
loop body, and vice-versa.
Such changes can be the cause of program errors.
.para
As a general principle, an iterator should not modify its argument objects.
There are some examples, however, where modification is appropriate.
For example, an iterator that produces the characters from an input stream
would advance the stream "window" (the currently accessible character)
on each iteration.
.para
Also as a general principle, the for loop body should not modify
the iterator's argument objects.
Again, occasional examples exist where modification is desirable.
In programming such examples, the programmer must ensure that the
iterator will still behave correctly in spite of the modifications.
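.para
The following fragment sketches the kind of interaction that requires this care.
It assumes the 2int_set* cluster presented in the section on clusters and a
variable 2s* of type int_set.
.show 3
for i: int in int_set$elements (s) do
	int_set$delete (s, i)       % the loop body modifies the iterator's argument
	end
.eshow
Whether every element is visited here depends on how 2elements* and
2delete* manipulate the representation, which a user of the abstraction
cannot see; such a loop should not be written unless the abstraction
explicitly documents that it is safe.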
.section "Clusters"
.para
A cluster is used to implement a new data type, distinct
from any other built-in or user-defined data type.
A data type (or data abstraction) consists of a set of objects
and a set of primitive operations.
The primitive operations provide the most basic ways of manipulating the objects;
ultimately every action that can be performed on the objects must be
expressed in terms of the primitive operations.
Thus the primitive operations define the lowest level of observable object behavior.
.para
The form of a cluster is
.show 3
idn = s(1)cluster lbkt parms rbkt is idn , etc lbkt where rbkt ;
t(1)cluster_body
t(1)end idn ;
.eshow
where
.show 5
.def cluster_body "s(1)lcurly equate rcurly rep = type_spec ; lcurly equate rcurly"
t(1)routine lcurly routine rcurly

.def1 routine procedure
.or iterator              
.eshow
In this section we discuss non-parameterized clusters,
in which the 2parms* and 2where* clauses are missing.
Parameterized modules are discussed in sect_parms.
.para
The primitive operations are named by
the list of 2idns* following the reserved word is.
All of the idns in this list must be distinct.
.para
To define a new data type, it is necessary to choose a
2concrete representation* for the objects of the type.
The special equate
.show
rep = type_spec
.eshow
within the cluster body identifies type_spec as the concrete representation.
Within the cluster, rep may be used as an abbreviation for type_spec.
.para
The identifier naming the cluster is available for use in the cluster body.
Use of this identifier within the cluster body permits the
definition of recursive types (an example is given below).
.para
In addition to specifying the representation of objects,
the cluster must implement the primitive operations of the type.
The operations may be either procedural or control abstractions;
they are implemented by procedures and iterators, respectively.
Most of the routines in the cluster body define the primitive operations
(those whose names are listed in the cluster heading).
Any additional routines are 2hidden*:
they are private to the cluster and may not
be invoked by users of the abstract type.
All the routines must have distinct names.
.para
Outside the cluster, the type's objects may only be treated
abstractly (i.e., manipulated by using the primitive operations).
To implement the operations, however, it is usually necessary
to manipulate the objects in terms of their concrete
representation.
It is also convenient sometimes to manipulate the objects abstractly.
Therefore, inside the cluster it is possible to view the type's
objects either abstractly or in terms of their representation.
The syntax is defined to specify unambiguously, for each variable that
refers to one of the type's objects, which view is being taken.
Thus, inside a cluster named T, a declaration
.show
v: T
.eshow
indicates that the object referred to by 2v* is to be
treated abstractly, while a declaration
.show
w: rep
.eshow
indicates that the object referred to by 2w* is to be treated concretely.
Two primitives, up and down, are available for converting
between these two points of view.
The use of up permits a type rep object to be viewed abstractly,
while down permits an abstract object to be viewed concretely.
For example, the following two assignments are legal
.show 3
v := up (w)

w := down (v)
.eshow
Only routines inside a cluster may use up and down.
Note that up and down are used merely to inform the compiler that the
object is going to be viewed abstractly or concretely, respectively.
.para
A common place where the view of an object changes is at the
interface to one of the type's operations:  the user, of course, views
the object abstractly, while inside the operation, the object is viewed concretely.
To facilitate this usage, a special type specification, cvt, is provided.
The use of cvt is restricted to the args, returns, yields and
signals clauses of routines inside a cluster, and may be used at the
top level only (e.g., array`[cvt] is illegal).
When used inside the args clause, it means that the view of the
argument object changes from abstract to concrete when it is assigned
to the formal argument variable.
When cvt is used in the returns, yields, or signals
clause, it means the view of the result object changes from abstract
to concrete as it is returned (or yielded) to the caller.
Thus cvt means abstract outside, concrete inside.
.para
The cvt form does not introduce any new ability
over what is provided by up and down.
It is merely a shorthand for a common case.
In its absence, the heading of each routine would have to be written
using the abstract type in place of cvt.
Then inside the routine, additional variables of type rep would be
declared, the argument objects assigned to these variables using
down, and each return, yield, or signal statement would use
up explicitly.
The use of cvt simply causes the appropriate up or down
to be performed automatically, and avoids the declaration of additional variables.
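.para
To make the correspondence concrete, here is a sketch of two operations of the
2int_set* cluster (rep = array [int]) written first with cvt and then with
explicit conversions.
The second version of each routine only illustrates the expansion;
the variable name 2r* is hypothetical.
The two versions are alternatives and would not appear in the same cluster,
since all routines must have distinct names.
.show 18
% using cvt
size = proc (s: cvt) returns (int)
	return (rep$size (s))
	end size

empty = proc () returns (cvt)
	return (rep$new ())
	end empty

% the same routines using up and down explicitly
size = proc (s: int_set) returns (int)
	r: rep := down (s)
	return (rep$size (r))
	end size

empty = proc () returns (int_set)
	return (up (rep$new ()))
	end empty
.eshow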
.para
The type of each routine is derived from its heading in the usual manner,
except that each occurrence of cvt is replaced by the abstract type.
.para
Inside the cluster, it is not necessary to use the compound form
(2type_spec*$2op_name*) for naming locally defined routines.
Furthermore, the compound form cannot be used for invoking hidden routines.
.para
The identifier following the end must match the identifier naming the cluster.
.para
Some examples of clusters are shown below.
The first example implements (part of) a complex number data type.
This data type may be implemented using either x and y
coordinates, or rho and theta coordinates; the cluster shown uses
x and y coordinates.
Note that the 2create*, 2get_x*, and 2get_y* operations
might signal an exception if rho/theta coordinates were used;
therefore these exceptions are listed in the headers,
even though in this implementation the exceptions will not be signalled.
The coordinates of a complex number can be queried using the 2get*
operations explicitly, or by using the special syntax, e.g.,
.show
x.theta
.eshow
No 2set* operations are provided, since we believe complex numbers
should be immutable like other numbers (integers, reals, etc.).
Other operations on complex numbers are the usual arithmetic ones
(only 2add* is shown), and 2equal*, 2similar*, and 2copy*
(these are discussed in the remarks section below).
(Note:
we have assumed that 2square_root* and 2arctangent2* exist in the library.)
.para
The second example cluster implements lists of integers.
These lists are immutable, like pure lists in LISP.
The implementation is recursive: the representation type
refers to the abstract type.
Notice the 2elements* operation, which produces all integers
in the list in order; it is an example of a recursive iterator.
.para
The final example is sets of integers.
The sets are mutable: operations 2insert* and 2delete* modify sets.
Again note the 2elements* iterator, which produces all elements
of a set in some unspecified order.
Also note the use of 2is_in* in insert;
since 2is_in* requires an abstract object as its argument,
up is used to provide one.
.begin_page_figure "Example Clusters"

.ne 4
complex = cluster is create, add, get_x, get_y, get_rho, get_theta, 
equal, similar, copy

    rep = record [x, y: real]

.ne 3
    create = proc (x, y: real) returns (cvt) signals (overflow, underflow)
        return (rep${x: x, y: y})
        end create

.ne 7
    add = proc (a, b: cvt) returns (cvt) signals (overflow, underflow)
        return (rep${x: a.x + b.x, y: a.y + b.y})
            except
                when overflow: signal overflow
                when underflow: signal underflow
                end
        end add

.ne 3
    get_x = proc (c: cvt) returns (real) signals (overflow, underflow)
        return (c.x)
        end get_x

.ne 3
    get_y = proc (c: cvt) returns (real) signals (overflow, underflow)
        return (c.y)
        end get_y

.ne 7
    get_rho = proc (c: cvt) returns (real) signals (overflow, underflow)
        return (square_root (c.x * c.x + c.y * c.y))
            except
                when overflow: signal overflow
                when underflow: signal underflow
                end
        end get_rho

.ne 7
    get_theta = proc (c: cvt) returns (real) signals (overflow, underflow)
        return (arctangent2 (c.x, c.y))
            except
                when overflow: signal overflow
                when underflow: signal underflow
                end
        end get_theta

.ne 3
    equal = proc (c1, c2: cvt) returns (bool)
        return (rep$similar1 (c1, c2))
        end equal

.ne 3
    similar = proc (c1, c2: cvt) returns (bool)
        return (rep$similar1 (c1, c2))
        end similar

.ne 5
    copy = proc (c: complex) returns (complex)
        return (c)
        end copy

    end complex
.if vpos>4000
.  bp
.else
.  sp 5l
.  end if
.ne 5
int_list = cluster is create, cons, car, cdr, is_empty, elements, 
equal, similar, copy

    rep = oneof [cell: cell, empty: null]
    cell = record [car: int, cdr: int_list]

.ne 3
    create = proc () returns (cvt)
        return (rep$make_empty (nil))
        end create

.ne 3
    cons = proc (i: int, l: int_list) returns (cvt)
        return (rep$make_cell (cell${car: i, cdr: l}))
        end cons

.ne 6
    car = proc (l: cvt) returns (int) signals (empty)
        tagcase l
            tag cell (c: cell): return (c.car)
            tag empty: signal empty
            end
        end car

.ne 6
    cdr = proc (l: cvt) returns (int_list) signals (empty)
        tagcase l
            tag cell (c: cell): return (c.cdr)
            tag empty: signal empty
            end
        end cdr

.ne 3
    is_empty = proc (l: cvt) returns (bool)
        return (rep$is_empty (l))
        end is_empty

.ne 10
    elements = iter (l: cvt) yields (int)
        tagcase l
            tag cell (c: cell):
                yield (c.car)
                for i: int in elements (c.cdr) do
                    yield (i)
                    end
            tag empty:
            end
        end elements

.ne 3
    equal = proc (l1, l2: cvt) returns (bool)
        return (rep$similar (l1, l2))
        end equal

.ne 3
    similar = proc (l1, l2: cvt) returns (bool)
        return (rep$similar (l1, l2))
        end similar

.ne 5
    copy = proc (l: int_list) returns (int_list)
        return (l)
        end copy

    end int_list
.if vpos>4000
.  bp
.else
.  sp 5l
.  end if
.ne 4
int_set = cluster is empty, insert, delete, is_in, size, elements, 
equal, similar, copy

    rep = array [int]

.ne 3
    empty = proc () returns (cvt)
        return (rep$new())
        end empty

.ne 3
    insert = proc (s: cvt, i: int)
        if ~ is_in (up (s), i) then rep$addh (s, i) end
        end insert

.ne 10
    delete = proc (s: cvt, i: int)
        for j: int in rep$indexes (s) do
            if i = s [j]
                then
                    s [j] := rep$top (s)
                    rep$remh (s)
                    return
                end
            end
        end delete

.ne 6
    is_in = proc (s: cvt, i: int) returns (bool)
        for j: int in rep$elements (s) do
            if i = j then return (true) end
            end
        return (false)
        end is_in

.ne 3
    size = proc (s: cvt) returns (int)
        return (rep$size (s))
        end size

.ne 5
    elements = iter (s: cvt) yields (int)
        for i: int in rep$elements (s) do
            yield (i)
            end
        end elements

.ne 3
    equal = proc (s1, s2: cvt) returns (bool)
        return (s1 = s2)
        end equal

.ne 7
    similar = proc (s1, s2: int_set) returns (bool)
        if size (s1) ~= size (s2) then return (false) end
        for i: int in elements (s1) do
            if ~ is_in (s2, i) then return (false) end
            end
        return (true)
        end similar

.ne 5
    copy = proc (s: cvt) returns (cvt)
        return (rep$copy (s))
        end copy

    end int_set
.finish_figure
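.para
The following fragment is a sketch of how a user might invoke operations of
the three clusters; the variable names are chosen only for illustration.
.show 10
c: complex := complex$create (3.0, 4.0)
r: real := c.rho                % short form of complex$get_rho (c)

l: int_list := int_list$cons (1, int_list$cons (2, int_list$create ()))

s: int_set := int_set$empty ()
int_set$insert (s, 7)
for i: int in int_list$elements (l) do
	int_set$insert (s, i)
	end
.eshow
After the loop, 2s* contains 7, 1, and 2, so int_set$size`(s) returns 3.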
.ne 6
.sp
1Remarks*
.para
The main reason CLU was developed was to support the use of data abstractions.
Use of data abstractions leads to an object-oriented style of
programming, in which concerns about data are primary and serve to
organize program structure.
It requires some effort to learn to program in this style, but the
effort is worthwhile because the resulting programs are
more modular, and easier to modify and maintain.
.para
A cluster permits all knowledge about how a data abstraction is being
implemented to be kept local to the cluster.
This localization permits the correctness of an implementation to be
established by examining the cluster alone.
Part of such a correctness proof involves showing that only
legal representations are generated by the cluster.
For example, in the 2int_set* cluster above, not all arrays
are legal 2int_set* representations; only those without duplicate
elements are legal.
Information about what constitutes a legal representation is
described during program verification by stating the
2concrete invariant*.
Each operation must preserve this invariant for each object
of the abstract type that it manipulates.
This requirement applies to all return and signal statements
in operations, and also to yield statements in iterator operations.
.para
When defining a new data type, it is important to provide
a set of primitive operations sufficient to permit all interesting
manipulations of the objects.
There is no reason to attempt to define a minimal set, however;
frequently used operations can be made operations of the cluster
even if they could be implemented in terms of other operations.
.para
Operations that will frequently be required are 2copy*,
2equal*, and 2similar*.
These operations are needed if the type being defined is intended for
general use, since without these operations,
the use of the type within another type's concrete representation is somewhat limited.
For example,
array[T]$copy uses T$copy.
In addition, most types should provide I/O operations as discussed in app_io.
.para
In many earlier sections, we have discussed the use of special syntactic
forms for invoking operations, and have described how operations
must be named and defined in order to make use of these syntactic forms.
The use of such forms is quite unconstrained:  the special form is
translated to an invocation, and is legal if the invocation is legal.
.para
Our reason for not imposing more syntactic constraints on operator
overloading is that such constraints only capture a small part of what
it means to use operator overloading correctly.
For example, to overload "=" correctly, the 2equal* operation should
be an equivalence relation satisfying the substitution property;
i.e., if two objects are 2equal*, then one can be substituted for the
other without any detectable difference in behavior.
In the sections where special syntactic forms are described,
we have discussed in each case what constitutes proper usage.
.para
Overloading operator symbols is not the only place where care must be
taken to ensure that the new definition agrees with common usage;
the same care must be taken when redefining common operation names.
For example, the 2copy* operation should provide a "copy"
of its input object, such that subsequent changes made to either
the old or the new object do not affect the other.
In the case of an immutable type, like 2complex*
above, in which sharing between two objects will never be visible
to the using program, 2copy* can simply return its input object.
Ordinarily, however, 2copy* should copy its input objects,
including each component (using the 2copy* operation of the
component's type), as is done in the implementation of 2int_set*.
.para
The 2equal* operation should return true if its two input
objects are the same abstract object; this is necessary to satisfy
the substitution property mentioned above.
Thus, implementing 2equal* properly requires a thorough understanding
of the abstraction being implemented.
Usually two mutable objects are equal only if they are the
exact same object in the CLU universe; e.g., see int_set$equal above.
For immutable objects, the contents of the objects matters;
e.g., see int_list$equal and complex$equal above.
.para
The 2similar* operation should return true only if its two input
objects (both of the same type) have "equivalent state".
This means that any query made about information in two similar
objects immediately after they were determined to be similar
would provide the same answer for either of the two objects
(i.e., the answers would be 2similar*).
Note that 2similar* is a weaker condition than 2equal*:
two objects are 2equal* if they are the 2same* abstract objects,
and so of course they are 2similar* for all time.
2Equal* and 2similar* return different results only for
mutable types, because only such types have objects whose state can change.
2Copy* and 2similar* should be related as follows
for any type T:
.show
all x eps T lbkt T$similar (x, T$copy(x)) rbkt
.eshow
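.para
For a mutable type such as 2int_set*, the relationship can be sketched as
follows; the variable names are illustrative.
.show 7
s: int_set := int_set$empty ()
int_set$insert (s, 1)
t: int_set := int_set$copy (s)
% here int_set$similar (s, t) is true, but int_set$equal (s, t) is false,
% because s and t are distinct objects
int_set$insert (t, 2)
% now int_set$similar (s, t) is false, and s is unchanged
.eshow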
.para
With the exception of 2set* and 2store* operations,
procedures that define operator symbols, 2copy*, 2similar*,
and the I/O operations should never modify their input objects in a way that
the user of the object can detect.
This rule does not prohibit "benevolent" side-effects, i.e.,
modifications that speed up future operations without
affecting behavior in any other way.
.section "Parameterized Modules"
.para
Clusters, procedures and iterators may all be parameterized.
Parameterization permits a set of related abstractions to be defined
by a single module.
Recall that in each module heading, there is an optional
2parms* clause and an optional where clause.
The presence of the 2parms* clause indicates that the module
is parameterized;
the where clause states certain constraints on permissible
actual values for the parameters.
.para
The form of the parms clause is
.show
[ parm , etc ]
.eshow
where
.show 2
.def parm "idn , etc : type_spec"
.or "idn , etc : type"
.eshow
Each parameter is declared like an argument.
However, only the following type specifications are legal for parameters:
int, real, bool, char, string, null, and type.
Parameters are limited to these types because the actual values for parameters
are required to be constants that can be computed at compile time.
This requirement ensures that all types are compile-time known, and
permits complete compile-time type checking.
.para
In a parameterized module, the scope rules permit the parameters
to be used in the remainder of the module heading, to define
the types of arguments and results, e.g.,
.show
p = proc [t: type] (x: t) returns (t)
.eshow
.para
To use a parameterized module, it is first necessary to
provide actual, constant values for the parameters.
(The exact forms were discussed in sect_instant.)
The result of providing values is a procedure, iterator, or type
(where the parameterized module was a procedure, iterator, or
cluster, respectively)
that may be used just like a non-parameterized module of the same type.
.para
The type of the instantiated module can be derived from the
module heading and the actual parameters as follows:
The 2parms* and where clauses are deleted,
and the remainder of the heading is rewritten replacing
each occurrence of a formal parameter name with the actual parameter value.
For example, an instantiation of procedure 2p* shown above is
.show
p [int]
.eshow
which names a procedure object of type
.show
proctype (int) returns (int)
.eshow
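.para
An instantiation can be used wherever a procedure of that type is expected.
A minimal sketch, in which the variable names are hypothetical:
.show 2
q: proctype (int) returns (int) := p [int]
y: int := q (3)         % same effect as p [int] (3)
.eshow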
.para
The meaning of an instantiated module is most easily understood in terms of rewriting.
Throughout the module body, the formal parameters are replaced
with the actual values.
Now the module (with the modified header as discussed above)
is a regular (non-parameterized) module.
In the case of a cluster some of the operations may have additional parameters;
further rewriting will be performed when these operations are used.
.para
In the case of a type parameter, constraints on
permissible actual types can be given in the where clause.
The where clause lists a set of operations that the actual
type is required to have.
The where clause also constrains the parameterized module:
the only primitive operations of the type parameter that can be used
are those listed in the where clause.
.para
The form of the where clause is
.show
.long_def restriction
.def1 where "where restriction , etc"
.eshow
where
.show 6
.def1 restriction "idn has oper_decl , etc"
.or "idn in type_set"
.def1 oper_decl "op_name , etc : type_spec"
.def1 op_name "name lbkt [ constant , etc ] rbkt"
.def1 type_set "{ idn | idn has oper_decl , etc ; lcurly equate rcurly }"
.or idn
.eshow
.para
There are two forms of restrictions.
The has form lists the set of required operations directly.
Note that if some of the type's operations are parameterized,
particular instantiations of those operations must be given.
The in form requires that the actual type be a member of a 2type_set*,
a set of types having the required operations.
The two identifiers in the type_set must match, and the notation
is read like set notation; e.g.,
.show
{t | t has f: ... }
.eshow
means "the set of all types 2t* such that 2t* has 2f* ...".
The scope of the identifier is the type_set.
.para
The in form is useful because an abbreviation can be given
for a type_set via an equate.
If it is helpful to introduce some abbreviations in defining the
type_set, these are given in the optional equates within the type_set.
The scope of these equates is the type_set.
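.para
For example, a type_set can be named by an equate and then used in an in
restriction.
In this sketch the procedure 2alike* is hypothetical, and the name 2sim_type*
is simply an identifier chosen for the type_set.
.show 6
sim_type = {t | t has similar: proctype (t, t) returns (bool)}

alike = proc [t: type] (a, b: t) returns (bool)
		where t in sim_type
	return (t$similar (a, b))
	end alike
.eshow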
.para
A routine in a parameterized cluster may have a where clause in
its heading, and can place further constraints on the cluster parameters.
For example, any type is permissible for the array element type,
but the array 2similar* operation requires that the element type
have a 2similar* operation.
This means that array`[T] exists for any type T,
but that array`[T]$similar exists only when T$similar exists.
.sp 1
.para
Two examples of parameterized clusters are shown below.
The first defines the 2set* type generator.
This cluster is similar to 2int_set*, presented in the previous section.
The main difference is that everywhere that integer elements were assumed,
now the parameter 2t* is used.
The 2set* type generator has a where clause that requires the element type
to provide an 2equal* operation; in addition, the 2similar* operation
imposes an additional constraint on the element type by requiring
a 2similar* operation.
Thus set`[X] is legal if X has an 2equal* operation;
but set`[X]$similar can be used only if X has a 2similar* operation.
Note the procedure 2is_in_sim*; it is a hidden routine of this implementation.
Also note the use of the type_set 2sim_type*.
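.para
As a sketch of how a client might use this type generator,
the hypothetical procedure 2count_distinct* below instantiates
2set* with int, which has the required 2equal* operation.
It uses only operations listed in the heading of the cluster.
.show
count_distinct = proc (a: array [int]) returns (int)
    si = set [int]
    s: si := si$empty ()
    % insert adds v only if no equal element is already present.
    for v: int in array [int]$elements (a) do
        si$insert (s, v)
        end
    return (si$size (s))
    end count_distinct
.eshow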
.para
The 2state* of a 2set* object is the set of abstract objects currently in the set.
What matters is the identity of the objects, not their state.
This should help in understanding why 2equal*, 2similar*, and 2copy*
were written the way they were.
Notice that we have two new operations, 2similar1* and 2copy1*.
2Similar1* returns true when two objects have equal state (in the
abstract sense), whereas 2similar* returns true when they have
similar state.
2Copy1* is to 2similar1* what 2copy* is to 2similar*,
i.e., T$similar1`(T$copy1`(x),`x) should always be true.
In general, mutable type generators that behave like collections
should provide 2similar1* and 2copy1* to ensure that instantiations
of the type generator can be used as part of the concrete representation
of other types.
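.para
The difference between the two pairs of operations can be seen
when the elements are themselves mutable.
The sketch below assumes the 2set* cluster as given in the figure;
the procedure 2sim_demo* is hypothetical.
.show
sim_demo = proc () returns (bool)
    ai = array [int]
    sa = set [ai]
    x: ai := ai$[1, 2]
    s1: sa := sa$empty ()
    sa$insert (s1, x)
    % s2 is a new set holding the same array object x.
    s2: sa := sa$copy1 (s1)
    % s3 is a new set holding a copy of x.
    s3: sa := sa$copy (s1)
    return (sa$similar1 (s1, s2)
            & (~ sa$similar1 (s1, s3))
            & sa$similar (s1, s3))
    end sim_demo
.eshow
Here 2s1* and 2s2* contain the identical object, so they are 2similar1*.
The element of 2s3* is a distinct array with the same contents,
so 2s1* and 2s3* are 2similar* but not 2similar1*.
None of the three sets is 2equal* to another, since each is a distinct object.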
.para
The second example is a 2list* type generator, which is similar to 2int_list*
in the previous section.
2List* does not place any constraints in its type parameter.
Therefore any element type is permissible for lists, including
type any.
Note that some of the types generated by the 2list* type generator
contain no mutable objects (e.g., list`[int]), but others do
(e.g., list`[array`[int]]).
Whether list`[array`[int]] is itself mutable is debatable:
it depends on one's point of view.
If the state of a list is considered to be the ordered set of objects
in the list, where only the identity of the objects matters,
then we could say lists are immutable even if the objects in the lists are mutable,
because the state of a list never changes.
.para
Confusion can arise unless the designer and implementor of a data type
have in mind a clear idea of exactly what constitutes the state of
the objects of the type they are defining;
it must be resolved in which cases it is only the identity of the
components that matters, and in which cases their state matters as well.
.para
The position taken in the 2list* type generator below is that the state
of a list consists only of the identity of the objects in the list,
and does not depend on their state.
Hence, these lists are immutable.
This explains why 2list* has no 2similar1*
or 2copy1* operations, and why 2equal*, 2similar*,
and 2copy* are implemented as they are.
.begin_page_figure
.ne 6
set = cluster [t: type] s(1)is empty, insert, delete, is_in, size,
elements, equal, similar, similar1, copy, copy1
 t(1)where t has equal: proctype (t, t) returns (bool)

    rep = array [t]
    sim_type = { s | s has similar: proctype (t, t) returns (bool)}

.ne 3
    empty = proc () returns (cvt)
        return (rep$new())
        end empty

.ne 3
    insert = proc (s: cvt, v: t)
        if ~ is_in (up (s), v) then rep$addh (s, v) end
        end insert

.ne 10
    delete = proc (s: cvt, v: t)
        for j: int in rep$indexes (s) do
            if v = s [j]
                then
                    s [j] := rep$top (s)
                    rep$remh (s)
                    return
                end
            end
        end delete

.ne 6
    is_in = proc (s: cvt, v: t) returns (bool)
        for u: t in rep$elements (s) do
            if u = v then return (true) end
            end
        return (false)
        end is_in

.ne 7
    is_in_sim = proc (s: cvt, v: t) returns (bool) where t in sim_type
        for u: t in rep$elements (s) do
            if t$similar (u, v) then return (true) end
            end
        return (false)
        end is_in_sim

.ne 3
    size = proc (s: cvt) returns (int)
        return (rep$size (s))
        end size

.ne 5
    elements = iter (s: cvt) yields (t)
        for v: t in rep$elements (s) do
            yield (v)
            end
        end elements

.ne 3
    equal = proc (s1, s2: cvt) returns (bool)
        return (s1 = s2)
        end equal

.ne 8
    similar = proc (s1, s2: set [t]) returns (bool) where t in sim_type
        if size (s1) ~= size (s2) then return (false) end
        for u: t in elements (s1) do
            if ~ is_in_sim (s2, u) then return (false) end
            end
        return (true)
        end similar

.ne 7
    similar1 = proc (s1, s2: set [t]) returns (bool)
        if size (s1) ~= size (s2) then return (false) end
        for u: t in elements (s1) do
            if ~ is_in (s2, u) then return (false) end
            end
        return (true)
        end similar1

.ne 4
    copy = proc (s: cvt) returns (cvt)
where t has copy: proctype (t) returns (t)
        return (rep$copy (s))
        end copy

.ne 5
    copy1 = proc (s: cvt) returns (cvt)
        return (rep$copy1 (s))
        end copy1

    end set
.if vpos>4000
.  bp
.else
.  sp 5l
.  end if
.ne 5
list = cluster [t: type] is empty, cons, car, cdr, is_empty,
elements, equal, similar, copy

    rep = oneof [cell: cell, empty: null]
    cell = record [car: t, cdr: list [t]]

.ne 3
    empty = proc () returns (cvt)
        return (rep$make_empty (nil))
        end empty

.ne 3
    cons = proc (v: t, l: list [t]) returns (cvt)
        return (rep$make_cell (cell${car: v, cdr: l}))
        end cons

.ne 6
    car = proc (l: cvt) returns (t) signals (empty)
        tagcase l
            tag cell (c: cell): return (c.car)
            tag empty: signal empty
            end
        end car

.ne 6
    cdr = proc (l: cvt) returns (list [t]) signals (empty)
        tagcase l
            tag cell (c: cell): return (c.cdr)
            tag empty: signal empty
            end
        end cdr

.ne 3
    is_empty = proc (l: cvt) returns (bool)
        return (rep$is_empty (l))
        end is_empty

.ne 10
    elements = iter (l: cvt) yields (t)
        tagcase l
            tag cell (c: cell):
                yield (c.car)
                for v: t in elements (c.cdr) do
                    yield (v)
                    end
            tag empty:
            end
        end elements

.ne 12
    equal = proc (l1, l2: cvt) returns (bool)
where t has equal: proctype (t, t) returns (bool)
        tagcase l1
            tag cell (c1: cell):
                tagcase l2
                    tag cell (c2: cell):
                        return ((c1.car = c2.car) & (c1.cdr = c2.cdr))
                    tag empty: return (false)
                    end
            tag empty: return (rep$is_empty (l2))
            end
        end equal

.ne 4
    similar = proc (l1, l2: cvt) returns (bool)
where t has similar: proctype (t, t) returns (bool)
        return (rep$similar (l1, l2))
        end similar

.ne 6
    copy = proc (l: cvt) returns (cvt)
where t has copy: proctype (t) returns (t)
        return (rep$copy (l))
        end copy

    end list
.finish_figure
