.sr opdefs_pdx Appendix`II .sr int_pdx Appendix`II.4 .sr real_pdx Appendix`II.5 .sr type_def_sec Section`3.1 .sr cluster_sec Section`13 .sr self Section`7 .sr builtin_secs Sections self.1 to self.7 .sr gen_secs Sections self.8 to self.11 .sr cand_sec Section`10.8 .sr fetch_sec Section`10.5.1 .sr fetch_store_secs Sections 10.5.1 and 11.2.1 .sr force_sec Section`10.11 .sr array_cons_sec Section`10.6 .sr equate_sec Section`8.3 .sr rec_cons_sec Section`10.6 .sr get_set_secs Sections 10.5.2 and 11.2.2 .sr tagcase_sec Section`11.6 .sr system_sec Section`3.1 .sr invoke_sec Section`9.3 .sr const_sec Section`8.3 .sr bind_sec Section`4 .sr failure_sec Section`12.1 .sr cvt_parm_secs Sections 13.4 and 13.5 .chapter "Types, Type Generators, and Type Specifications" .para A 2type* consists of a set of objects together with a set of operations to manipulate the objects. As discussed in type_def_sec, types can be classified according to whether their objects are mutable or immutable. An immutable object (e.g., an integer) has a value that never varies, while the value (state) of a mutable object can vary over time. .para A 2type generator* is a 2parameterized* type definition, representing a (usually infinite) set of related types. A particular type is obtained from a type generator by writing the generator name along with specific values for the parameters; for every distinct set of legal values, a distinct type is obtained. For example, the array type generator has a single parameter that determines the element type; array[int], array[real], and array[array[int]] are three distinct types defined by the array type generator. Types obtained from type generators are called 2parameterized* types; others are called 2simple* types. .para Within a program, a type is specified by a syntactic construct called a 2type_spec*. The type specification for a simple type is just the identifier (or reserved word) naming the type. 
For parameterized types, the type specification consists of the identifier (or reserved word) naming the type generator, together with the parameter values. .para This section gives an informal introduction to the built-in types and type generators provided by CLU; many details (such as error conditions) are not discussed. Complete and precise definitions are given in opdefs_pdx. builtin_secs describe the objects, literals, and some of the operations for each of the built-in types, while gen_secs describe the objects, type specifications, and interesting operations of types obtained from the built-in type generators. A number of operations can be invoked using infix and prefix operators; as the various operation names are introduced, the corresponding operator, if any, will follow in parentheses. .para In addition, we describe type specifications for user-defined types and other special type specifications in Section self.12. The mechanism by which new types and type generators are implemented is presented in cluster_sec. . .section "Null" .para The type null has exactly one immutable object, represented by the literal nil. The type null is generally used as a kind of "place holder" in a oneof type (see self.10). . .section "Bool" .para The two immutable objects of type bool, with literals true and false, represent logical truth values. The binary operations 2equal* (=), 2and* (&), and 2or* (|) are provided, as well as unary 2not* (~). . .section "Int" .para The type int models (a range of) the mathematical integers. The exact range is not part of the language definition, and can vary somewhat from implementation to implementation (see int_pdx). Integers are immutable objects, and are written as a sequence of one or more decimal digits. The binary operations 2add* (+), 2sub* (-), 2mul* (*), 2div* (/), 2mod* (//), and 2power* (**) are provided, as well as unary 2minus* (-). 
There are binary comparison operations 2lt* (<), 2le* (<=), 2equal* (=), 2ge* (>=), and 2gt* (>). In addition, there are two operations, 2from_to* and 2from_to_by*, for iterating over a sequence of integers. For example, one can iterate over the odd numbers between one and 100 with .show int$from_to_by(1, 100, 2) .eshow . .section "Real" .para The type real models (a subset of) the mathematical real numbers. The exact subset is not part of the language definition, although certain constraints are imposed (see real_pdx). Reals are immutable objects, and are written as a 2mantissa* with an (optional) 2exponent*. A mantissa is either a sequence of one or more decimal digits, or two sequences (one of which may be empty) joined by a period. The mantissa must contain at least one digit. An exponent is 'E' or 'e', optionally followed by '+' or '-', followed by one or more decimal digits. An exponent is required if the mantissa does not contain a period. As is usual, 2m*E2x* = 2m**102x*. Examples of real literals are: .show 3.14 3.14E0 314e-2 .0314E+2 3. .14 .eshow .para As with integers, the operations 2add* (+), 2sub* (-), 2mul* (*), 2div* (/), 2mod* (//), 2power* (**), 2minus* (-), 2lt* (<), 2le* (<=), 2equal* (=), 2ge* (>=), and 2gt* (>), are provided. It is important to note that there is no form of 2implicit* conversion between types. So, for example, the various binary operators cannot have one integer and one real argument. The 2i2r* operation converts an integer to a real, 2r2i* rounds a real to an integer, and 2trunc* truncates a real to an integer. . .section "Char" .para The type char provides the alphabet for text manipulation. Characters are immutable, and form an ordered set. Every implementation must provide at least 128, but no more than 512, characters; the first 128 characters are the ASCII characters in their standard order. 
.para Printing ASCII characters (octal 40 thru octal 176), other than single quote or backslash, can be written as that character enclosed in single quotes. Any character can be written by enclosing one of the following escape sequences in single quotes: .show 12 escape sequence s(1)character .sp .5 \'t(1)' s(2)(single quote) \"t(1)"t(2)(double quote) \\t(1)\t(2)(backslash) \nt(1)NLt(2)(newline) \tt(1)HTt(2)(horizontal tab) \pt(1)FFt(2)(form feed, newpage) \bt(1)BSt(2)(backspace) \rt(1)CRt(2)(carriage return) \vt(1)VTt(2)(vertical tab) \***t(1)specified by octal value (* is an octal digit) .eshow The escape sequences may be written using upper case letters. Examples of character literals are: .show '7' 'a' '"' '\"' '\'' '\B' '\177' .eshow .para The usual binary comparison operations exist for characters: 2lt* (<), 2le* (<=), 2equal* (=), 2ge* (>=), and 2gt* (>). There are also two operations, 2i2c* and 2c2i*, for converting between integers and characters: the smallest character corresponds to zero, and the characters are numbered sequentially. . .section "String" .para The type string is used for representing text. A string is an immutable sequence of zero or more characters. Strings are lexicographically ordered, based on the ordering for characters. A string is written as a sequence of zero or more character representations, enclosed in double quotes. Within a string literal, a printing ASCII character other than double quote or backslash is represented by itself. Any character can be represented by using the escape sequences listed above. Examples of string literals are: .show "Item\tCost" "altmode (\033) = \\033" "" " " .eshow .para The characters of a string are indexed sequentially starting from one, and there are a number of operations that deal with these indexes: 2fetch*, 2substr*, 2rest*, 2indexc*, and 2indexs*. The 2fetch* operation is used to obtain a character by index. 
Invocations of 2fetch* can be written using a special syntax (fully described in fetch_sec): .show s[i] % get the character at index i of s .eshow 2Substr* returns a string given a string, a starting index, and a length: .show string$substr("abcde", 2, 3) = "bcd" .eshow 2Rest*, given a string and a starting index, returns the rest of the string: .show string$rest("abcde", 3) = "cde" .eshow 2Indexc* computes the least index at which a character occurs in a string, and 2indexs* does the same for a string; the result is zero if the character or string does not occur: .show 3 string$indexc('d', "abcde") = 4 string$indexs("cd", "abcde") = 3 string$indexs("abcde", "cd") = 0 .eshow .para Two strings can be concatenated together with 2concat* (||), and a single character can be appended to the end of a string with 2append*. Note that string$concat("abc",`"de") and string$append("abcd",`'e') produce the 2same* string as writing "abcde". 2C2s* converts a character to a single-character string. The size of a string can be determined with 2size*. 2Chars* iterates over the characters of a string, from the first to the last character. There are also the usual comparison operations: 2lt* (<), 2le* (<=), 2equal* (=), 2ge* (>=), and 2gt* (>). . .section "Any" .para A type specification is used to restrict the class of objects that a variable can denote, a procedure or iterator can take as arguments, a procedure can return, etc. There are times when no restrictions are desired, when any object is acceptable. At such times, the type specification any is used. For example, one might wish to implement a table mapping strings to arbitrary objects, with the intention that different strings could map to objects of different types. The lookup operation, used to get the object corresponding to a string, would have its result declared to be of type any. .para The type any is the 2union* of all possible types, and it is the 2only* true union type in CLU; all other types are 2base* types. 
Every object is of type any, as well as being of some base type. The type any has no operations; however, the base type of an object can be tested at run-time (see force_sec). . .section "Array Types" .para Arrays are one-dimensional, and are mutable. Arrays are unconventional because the number of elements in an array can vary dynamically. Furthermore, there is no notion of an "uninitialized" element. .para The 2state* of an array consists of an integer called the 2low bound*, and a sequence of objects called the 2elements*. The elements of an array are indexed sequentially, starting from the low bound. All of the elements must be of the same type; this type is specified in the array type specification, which has the form .show array [ type_spec ] .eshow Examples of array type specifications are .show 2 array[int] array[array[string]] .eshow .para There are a number of ways to create a new array, of which only two are mentioned here. The 2create* operation takes an argument specifying the low bound, and creates a new array with that low bound and no elements. An array 2constructor* can be used to create an array with an arbitrary number of elements. For example, .show array[int] $ [5: 1, 2, 3, 4] .eshow creates an integer array with low bound five, and four elements, while .show array[bool] $ [true, false] .eshow creates a boolean array with low bound one (the default), and two elements. Array constructors are discussed fully in array_cons_sec. .para An array type specification states nothing about the bounds of an array. This is because arrays can grow and shrink dynamically. 2Addh* adds a new element to the end of the array, with index one greater than the previous top element. 2Addl* adds a new element to the beginning of the array, and decrements the low bound by one, so that the new element has an index one less than the previous bottom element. 2Remh* removes the top element; 2reml* removes the bottom element and increments the low bound. 
Note that all of these operations preserve the indexes of the other elements. Also note that these operations do not create holes; they merely add to or remove from the ends of the array. .para As an example, if a 2remh* were performed on the integer array .show array[int] $ [5: 1, 2, 3, 4] .eshow the element 4 would disappear, and the new top element would be 3, still with index 7. If a 0 were added using 2addl*, it would become the new bottom element, with index 4. .para The 2fetch* operation extracts an element by index, and the 2store* operation replaces an element by index. There is no notion of an "uninitialized" element; an index is illegal if no element with that index exists. Invocations of these operations can be written using special forms (covered fully in fetch_store_secs): .show 2 a[i] % fetch the element at index i of a a[i] := 3; % store 3 at index i of a .eshow .para The 2top* and 2bottom* operations return the element with the highest and lowest index, respectively. The 2high* and 2low* operations return the highest and lowest indexes, respectively. The 2elements* iterator yields the elements from bottom to top, and the 2indexes* iterator yields the indexes from low to high. There is also a 2size* operation that returns the number of elements. .para Every newly created array has an identity that is distinct from all other arrays; two arrays can have the same elements without being the same array object. The identity of arrays can be distinguished with the 2equal* (=) operation. The 2similar1* operation tests if two arrays have the same state, using the 2equal* operation of the element type. 2Similar* tests if two arrays have similar states, using the 2similar* operation of the element type. 
For example, writing .show ai$[3: 1, 2, 3] .eshow (where "ai" is equated to array[int]) in different places produces arrays that are similar1 and similar (but not equal), while the following produces arrays that are similar, but not similar1 (or equal): .show array[ai] $ [1: ai$create(1)] .eshow . .section "Record Types" .para A record is a mutable collection of one or more named objects. The names are called 2selectors*, and the objects are called 2components*. Different components may have different types. A record type specification has the form .show record [ field_spec , etc ] .eshow where .show .def field_spec "name , etc : type_spec" .eshow Selectors must be unique within a specification, but the ordering and grouping of selectors is unimportant. For example, all of the following name the same type: .show 2 record [last, first, middle: string, age: int] record [first, middle, last: string, age: int] record [last: string, age: int, first, middle: string] .eshow .para A record is created using a record 2constructor*. For example: .show info $ {last: "Jones", first: "John", age: 32, middle: "J."} .eshow (assuming that "info" has been equated to one of the above type specifications; see equate_sec.) An expression must be given for each selector, but the order and grouping of selectors need not resemble the corresponding type specification. Record constructors are discussed fully in rec_cons_sec. .para For each selector "sel", there is an operation 2get_*sel to extract the named component, and an operation 2set_*sel to replace the named component with some other object. For example, there are 2get_middle* and 2set_middle* operations for the type specified above. 
Invocations of these operations can be written in a special form (discussed fully in get_set_secs): .show 2 r.middle % get the 'middle' component of r r.age := 33; % set the 'age' component of r to 33 .eshow .para As with arrays, every newly created record has an identity that is distinct from all other records; two records can have the same components without being the same record object. The identity of records can be distinguished with the 2equal* (=) operation. The 2similar1* operation tests if two records have the same components, using the 2equal* operations of the component types. 2Similar* tests if two records have similar components, using the 2similar* operations of the component types. . .section "Oneof Types" .para A oneof type is a 2tagged discriminated union*. A oneof is an immutable labeled object, to be thought of as "one of" a set of alternatives. The label is called the 2tag*, and the object is called the 2value*. A oneof type specification has the form .show oneof [ field_spec , etc ] .eshow where (as for records) .show .def field_spec "name , etc : type_spec" .eshow Tags must be unique within a specification, but the ordering and grouping of tags is unimportant. .para As an example of a oneof type, the representation type for a linked list of integers, int_list, might be written .show 2 oneof [s(1)empty: s(2)null, t(1)cell:t(2)record [car: int, cdr: int_list]] .eshow As another example, the contents of a "number container" might be specified by .show 4 oneof [s(1)empty: s(2)null, t(1)integer:t(2)int, t(1)real_num:t(2)real, t(1)complex_num:t(2)complex]; .eshow .para For each tag "t" of a oneof type, there is a 2make_*t operation which takes an object of the type associated with the tag, and returns the object (as a oneof) labeled with tag "t". For example, .show number$make_real_num (1.37) .eshow creates a oneof object with tag "real_num" (assuming "number" has been equated to the "number container" type specification above; see equate_sec). 
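.para As an illustrative sketch only (the variable names n1, n2, and n3 are hypothetical, and "number" is assumed to be equated to the "number container" type specification above), one oneof object might be created for each of three tags as follows:
.show 3
n1: number := number$make_empty(nil)
n2: number := number$make_integer(3)
n3: number := number$make_real_num(1.37)
.eshow
All three objects have the same type, number, even though the values they carry are of types null, int, and real, respectively.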
.para The 2equal* operation tests if two oneofs have the same tag, and if so, tests if the two objects are the same, using the 2equal* operation of the value type. 2Similar* tests if two oneofs have the same tag, and if so, tests if the two objects are similar, using the 2similar* operation of the value type. .para To determine the tag and value parts of a oneof object, one normally uses the tagcase statement, discussed in tagcase_sec. . .section "Procedure and Iterator Types" .para Procedures and iterators are immutable objects, created by the CLU system (see system_sec). The type specification for a procedure or iterator contains most of the information stated in a procedure or iterator heading; a procedure type specification has the form .show proctype ( lbkt type_spec , etc rbkt ) lbkt returns rbkt lbkt signals rbkt .eshow and an iterator type specification has the form .show itertype ( lbkt type_spec , etc rbkt ) lbkt yields rbkt lbkt signals rbkt .eshow where .show 4 .long_def exception .def1 returns "returns ( type_spec , etc )" .def1 yields "yields ( type_spec , etc )" .def1 signals "signals ( exception , etc )" .def1 exception "name lbkt ( type_spec , etc ) rbkt" .eshow The first list of type specifications describes the number, types, and order of arguments. The returns or yields clause gives the number, types, and order of the objects to be returned or yielded. The signals clause lists the exceptions raised by the procedure or iterator; for each exception name, the number, types, and order of the objects to be returned are also given. All names used in a signals clause must be unique, and cannot be "failure", which has a standard meaning in CLU (see failure_sec). The ordering of exceptions is not important. 
For example, both of the following type specifications name the procedure type for string$substr: .show 2 proctype (string, int, int) returns (string) signals (bounds, negative_size) proctype (string, int, int) returns (string) signals (negative_size, bounds) .eshow 1String*$chars has the following iterator type: .show itertype (string) yields (char) .eshow .para Procedure and iterator types have an 2equal* (=) operation. Invocation is 2not* an operation, but a primitive action of CLU semantics (see invoke_sec). . .section "Other Type Specifications" .para The type specification for a user-defined type has the form .show idn lbkt [ constant , etc ] rbkt .eshow where each 2constant* must be compile-time computable (see const_sec). The identifier must be bound to a data abstraction (see bind_sec). If the referenced abstraction is parameterized, constants of the appropriate types and number must be supplied. The order of parameters always matters in user-defined types. .para There are three special type specifications that are used when implementing new abstractions: rep, cvt, and type. These forms are discussed in cvt_parm_secs. Within an implementation of an abstraction, formal parameters declared with type can be used as type specifications. .para In addition, identifiers which have been equated to type specifications can also be used as type specifications. Equates are discussed in equate_sec.
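.para As an illustrative sketch only (the variable names a and r are hypothetical; "ai" and "info" are equated as in the array and record examples earlier in this section), identifiers equated to type specifications might be used in declarations and constructors as follows:
.show 4
ai = array[int]
info = record [first, middle, last: string, age: int]
a: ai := ai$[5: 1, 2, 3, 4]
r: info := info${last: "Jones", first: "John", age: 32, middle: "J."}
.eshow
Here "ai" and "info" stand for exactly the type specifications to which they are equated.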