C Reference Manual

2 March 1977

Dennis M. Ritchie
Bell Telephone Laboratories
Murray Hill, New Jersey 07974

Alan Snyder
Laboratory for Computer Science
Massachusetts Institute of Technology

1. Introduction

C is a computer language based on the earlier language B [1], itself a descendant of BCPL [3]. C differs from B and BCPL primarily by the introduction of types, along with the appropriate extra syntax and semantics.

Most of the software for the UNIX time-sharing system [4] is written in C, as is the operating system itself. In addition to the UNIX C compiler, there exist C compilers for the HIS 6000 and the IBM System/370 [2].

This manual describes the C programming language as implemented by the portable C compiler [6]. It is a revision by the second author of the original C Reference Manual (contained in [5]), which describes the UNIX C compiler. Differences with respect to the UNIX C compiler and undesirable limitations of the current portable C compiler are described in footnotes to this document.

The report ``The C Programming Language'' [5] contains a tutorial introduction to C and a description of a set of portable routines concerned primarily with I/O.

2. Lexical conventions

There are six kinds of tokens: identifiers, keywords, constants, strings, expression operators, and other separators. In general blanks, tabs, newlines, and comments as described below are ignored except as they serve to separate tokens. At least one of these characters is required to separate otherwise adjacent identifiers, constants, and certain operator-pairs.

If the input stream has been parsed into tokens up to a given character, the next token is taken to include the longest string of characters which could possibly constitute a token.

2.1 Comments

The characters /* introduce a comment, which terminates with the characters */. Comments thus may not be nested.
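The longest-token rule of section 2 can be illustrated with a minimal sketch. It assumes a present-day C compiler rather than the 1977 compilers this manual covers, and the function name add_then_bump is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only: shows how "x+++b" is split into tokens.
   The name add_then_bump is illustrative, not from the manual. */
int add_then_bump(int *a, int b)
{
    int x, r;

    assert(a != NULL);
    x = *a;
    /* The scanner takes the longest possible token at each step, so
       the characters are read as  x  ++  +  b,  that is (x++) + b,
       and never as  x + (++b). */
    r = x+++b;
    *a = x;             /* x was incremented once by the postfix ++ */
    return r;
}
```

Called with a pointer to an object holding 1 and with b equal to 2, the function returns 3 and leaves 2 in the object, which is what the (x++) + b reading predicts.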
2.2 Identifiers (Names)

An identifier is a sequence of letters and digits; the first character must be alphabetic. The underscore ``_'' counts as alphabetic. Upper and lower case letters are not distinguished. There is no limit placed on the length of identifiers; all characters of internal identifiers are significant. However, the number of significant characters in external identifiers (i.e., function names and names of external variables) may be limited by the operating system to as few as the first five characters.[fn 1] This limitation on external identifiers can be circumvented, to some extent, by using the token replacement facility (described in section 12.1).

_________________________
1 The UNIX C compiler distinguishes upper and lower case in all identifiers and accepts keywords only in lower case. In addition, the UNIX C compiler treats only the first eight characters of internal identifiers and the first seven characters of external identifiers as significant.

2.3 Keywords

The following identifiers are reserved for use as keywords, and may not be used otherwise:

     int          break
     char         continue
     float        if
     double       else
     long         goto
     short        return
     unsigned     entry
     struct       for
     auto         do
     extern       while
     register     switch
     static       case
     sizeof       default
     typedef

The entry keyword is not currently implemented by any compiler but is reserved for future use.

2.3 Constants

There are several kinds of constants, as follows:

2.3.1 Integer constants

An integer constant is a sequence of digits. An integer is taken to be octal if it begins with 0, hexadecimal if it begins with 0x (or 0X), and decimal otherwise. The digits 8 and 9 have octal value 10 and 11 respectively. An integer constant immediately followed by l or L is a long integer constant.[fn 2]

2.3.2 Character constants

A character constant consists of a single ASCII character[fn 3] enclosed in single quotes `` ' ''. Within a character constant a single quote must be preceded by a back-slash ``\''.
Certain non-graphic characters, and ``\'' itself, may be escaped according to the following table:[fn 4]

     BS      \b
     NL      \n
     CR      \r
     HT      \t
     VT      \v
     FF      \p
     ddd     \ddd
     \       \\

The escape ``\ddd'' consists of the backslash followed by 1, 2, or 3 octal digits which are taken to specify the value of the desired character. A special case of this construction is ``\0'' (not followed by a digit) which indicates a null character.

Character constants behave exactly like integers whose value is the corresponding ASCII code.[fn 5] They do not behave like objects of character type.

2.3.3 Floating constants

A floating constant consists of an integer part, a decimal point, a fraction part, an e (or E), and an optionally signed integer exponent. The integer and fraction parts both consist of a sequence of digits. Either the integer part or the fraction part (not both) may be missing; either the decimal point or the e and the exponent (not both) may be missing. Every floating constant is taken to be double-precision.

_________________________
2 A long integer constant is equivalent to an integer constant in the portable C compiler.
3 The UNIX C compiler allows 2 characters in character constants. Other compilers may allow as many characters as can be packed into a machine word. The order of packed characters in a machine word is machine-dependent.
4 The UNIX C compiler does not recognize \v or \p.
5 On UNIX, character constants range in value from -128 to 127.

2.4 Strings

A string is a sequence of characters surrounded by double quotes `` " ''. A string has the type array-of-characters (see below) and refers to an area of storage initialized with the given characters. The compiler places a null byte ( \0 ) at the end of each string so that programs which scan the string can find its end. In a string, the character `` " '' must be preceded by a ``\''; in addition, the same escapes as described for character constants may be used.
String constants are constant, i.e., they may not be modified.

3. Syntax notation

In the syntax notation used in this manual, syntactic categories are indicated by italic type, and literal words and characters in gothic. Alternatives are listed on separate lines. An optional terminal or non-terminal symbol is indicated by the subscript ``opt,'' so that

     { expression opt }

would indicate an optional expression in braces.

4. What's in a Name?

C bases the interpretation of an identifier upon two attributes of the identifier: its storage class and its type. The storage class determines the location and lifetime of the storage associated with an identifier; the type determines the meaning of the values found in the identifier's storage.

There are four declarable storage classes: automatic, static, external, and register. Automatic variables are created upon each invocation of the function in which they are defined, and are discarded on return. Static variables are local to a function or to a group of functions defined in one source file, but retain their values independently of function invocations. External variables are independent of any function and accessible by separately-compiled functions. Register variables are stored (if possible) in the fast registers of the machine; like automatic variables they are local to each function and disappear on return.

C supports four fundamental types of objects: characters, integers, and single- and double-precision floating-point numbers.

Characters (declared, and hereinafter called, char) are chosen from the ASCII set; they occupy the right-most seven bits in a machine-dependent unit of storage called a byte. Integers (int) are represented in 2's complement notation in a machine-dependent unit of storage called a word. Integers should be at least 16 bits long.[fn 1]
The precision and range of single-precision floating-point (float) quantities and double-precision floating-point (double, or long float) quantities are machine-dependent.

Besides the four fundamental types there is a conceptually infinite class of derived types constructed from the fundamental types in the following ways:

     arrays of objects of most types;
     functions which return objects of a given type;
     pointers to objects of a given type;
     structures containing objects of various types.

In general these methods of constructing objects can be applied recursively.

5. Objects and lvalues

An object is a manipulatable region of storage; an lvalue is an expression referring to an object. An obvious example of an lvalue expression is an identifier. There are operators which yield lvalues: for example, if E is an expression of pointer type, then *E is an lvalue expression referring to the object to which E points. The name ``lvalue'' comes from the assignment expression ``E1 = E2'' in which the left operand E1 must be an lvalue expression. The discussion of each operator below indicates whether it expects lvalue operands and whether it yields an lvalue.

6. Conversions

A number of operators may, depending on their operands, cause conversion of the value of an operand from one type to another. This section explains the result to be expected from such conversions.

6.1 Characters and integers

A char object may be used anywhere an int may be. In all cases the char is converted to an int by extending the character value with high-order zero bits.[fn 2]

_________________________
1 The UNIX C compiler implements a longer variety of integer (declared as long or long int) and unsigned integers (declared as unsigned or unsigned int), for which most int operations are applicable. The portable C compiler treats long int, short int, and unsigned int as synonymous with int.
2 On the PDP-11, a character is converted to an integer by propagating its sign through the upper 8 bits of the resultant integer. Thus, it is possible to have (non-ASCII) characters with negative values.

6.2 Float and double

All floating arithmetic in C is carried out in double-precision. Whenever a float appears in an expression, it is lengthened to double by zero-padding its fraction. When a double must be converted to float, for example by an assignment, the double is rounded before truncation to float length.

6.3 Float and double; integer and character

Ints and chars may be converted to float or double; truncation may occur for some values. Conversion of float or double to int or char takes place with rounding.[fn 1] Again, erroneous results are possible for some values.

6.4 Pointers and integers

Integers may be added to pointers; in such cases the int is converted as specified in the discussion of the addition operator.

Two pointers to objects of the same type may be subtracted; in this case the result is converted to an integer as specified in the discussion of the subtraction operator.

7. Expressions

The precedence of expression operators is the same as the order of the major subsections of this section (highest precedence first). Thus the expressions referred to as the operands of + (section 7.4) are those expressions defined in sections 7.1-7.3. Within each subsection, the operators have the same precedence. Left- or right-associativity is specified in each subsection for the operators discussed therein. The precedence and associativity of all the expression operators is summarized in an appendix.

Unless otherwise noted, the order of evaluation of expressions is undefined. In particular the compiler considers itself free to compute subexpressions in the order it believes most efficient, even if the subexpressions involve side effects.

7.1 Primary expressions

Primary expressions involving .
, ->, subscripting, and function calls group left to right.

7.1.1 identifier

An identifier is a primary expression, provided it has been suitably declared as discussed below. Its type is specified by its declaration. However, if the type of the identifier is ``array of . . .'', then the value of the identifier-expression is a pointer to the first object in the array, and the type of the expression is ``pointer to . . .''. Moreover, an array identifier is not an lvalue expression. Likewise, an identifier which is declared ``function returning . . .'', when used anywhere except in the function-name position of a call, is converted to ``pointer to function returning . . .''.

_________________________
1 On UNIX, the conversion of float or double to int or char (section 6.3) involves truncation towards 0.

7.1.2 constant

A decimal, octal, character, or floating constant is a primary expression. Its type is int in the first three cases, double in the last.

7.1.3 string

A string is a primary expression. Its type is originally ``array of char''; but following the same rule as in section 7.1.1 for identifiers, this is modified to ``pointer to char'' and the result is a pointer to the first character in the string.

7.1.4 ( expression )

A parenthesized expression is a primary expression whose type and value are identical to those of the unadorned expression. The presence of parentheses does not affect whether the expression is an lvalue.

7.1.5 primary-expression [ expression ]

A primary expression followed by an expression in square brackets is a primary expression. The intuitive meaning is that of a subscript. Usually, the primary expression has type ``pointer to . . .'', the subscript expression is int, and the type of the result is `` . . . ''. The expression ``E1[E2]'' is identical (by definition) to ``* (( E1 ) + ( E2 )) ''.
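The identity between subscripting and indirection stated in section 7.1.5 can be checked with a short sketch. It is written for a present-day C compiler, and the helper name same_element is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only: E1[E2] is defined as *((E1)+(E2)).
   The name same_element is illustrative, not from the manual. */
int same_element(const int *base, int i)
{
    assert(base != NULL);
    /* Three spellings of the same object: the subscript form, the
       explicit pointer arithmetic it stands for, and the operands
       swapped (the sum is the same either way round). */
    return base[i] == *(base + i) && base[i] == i[base];
}
```

Because an array identifier is converted to a pointer to its first element (section 7.1.1), an array name can be passed directly as the base argument.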
All the clues needed to understand this notation are contained in this section together with the discussions in sections 7.1.1, 7.2.1, and 7.4.1 on identifiers, *, and + respectively; section 14.3 below summarizes the implications.

7.1.6 primary-expression ( expression-list opt )

A function call is a primary expression followed by parentheses containing a possibly empty, comma-separated list of expressions which constitute the actual arguments to the function. The primary expression must be of type ``function returning . . .'', and the result of the function call is of type `` . . . ''. As indicated below, a hitherto unseen identifier followed immediately by a left parenthesis is contextually declared to represent a function returning an integer; thus in the most common case, integer-valued functions need not be declared.

Any actual arguments of type float are converted to double before the call; any of type char are converted to int.

In preparing for the call to a function, a copy is made of each actual parameter; thus, all argument-passing in C is strictly by value. A function may change the values of its formal parameters, but these changes cannot possibly affect the values of the actual parameters. On the other hand, it is perfectly possible to pass a pointer on the understanding that the function may change the value of the object to which the pointer points. Note that the order of evaluation of function arguments is not defined.

Recursive calls to any function are permissible.

7.1.7 primary-lvalue . member-of-structure

An lvalue expression followed by a dot followed by the name of a member of a structure is a primary expression. The object referred to by the lvalue must be of a structure type, and the member-of-structure must be a member of that structure.[fn 1] The result of the expression is an lvalue appropriately offset from the origin of the given lvalue whose type is that of the named structure member.
Structures are discussed in section 8.5.

7.1.8 primary-expression -> member-of-structure

The primary-expression must be a pointer to a structure and the member-of-structure must be a member of that structure type.[fn 2] The result is an lvalue appropriately offset from the origin of the pointed-to structure whose type is that of the named structure member. The expression ``E1->MOS'' is exactly equivalent to ``(*E1).MOS''.

7.2 Unary operators

Expressions with unary operators group right-to-left.

7.2.1 * expression

The unary * operator means indirection: the expression must be a pointer, and the result is an lvalue referring to the object to which the expression points. If the type of the expression is ``pointer to . . .'', the type of the result is `` . . . ''.

7.2.2 & lvalue-expression

The result of the unary & operator is a pointer to the object referred to by the lvalue-expression. If the type of the lvalue-expression is `` . . . '', the type of the result is ``pointer to . . .''.

7.2.3 - expression

The result is the negative of the expression. The type of the expression must be char, int, float, or double. The type of the result is int or double.[fn 3]

_________________________
1 The UNIX C compiler allows any primary-lvalue and assumes it to have the same form as the structure containing the named structure member.
2 The UNIX C compiler allows any primary-lvalue and assumes it to be a pointer which points to an object of the same form as the structure of which the member-of-structure is a part.
3 The UNIX C compiler defines the type of the result to be the same as the type of the operand.

7.2.4 ! expression

The result of the logical negation operator ! is 1 if the value of the expression is zero, 0 if the value of the expression is non-zero. The type of the result is int. The allowable expressions are those allowed by the if statement (section 9.3).[fn 1]

7.2.5 ~ expression

The ~ operator yields the one's complement of its operand.
The type of the expression must be int or char, and the result is int.

7.2.6 ++ lvalue-expression

The object referred to by the lvalue expression is incremented. The value is the new value of the lvalue expression and the type is the type of the lvalue. If the expression is of a fundamental type,[fn 2] it is incremented by 1; if it is a pointer to an object, it is incremented by the length of the object.

7.2.7 -- lvalue-expression

The object referred to by the lvalue expression is decremented analogously to the ++ operator.

7.2.8 lvalue-expression ++

The result is the value of the object referred to by the lvalue expression. After the result is noted, the object referred to by the lvalue is incremented in the same manner as for the prefix ++ operator:[fn 3] by 1 for an object of fundamental type, by the length of the pointed-to object for a pointer. The type of the result is the same as the type of the lvalue-expression.

7.2.9 lvalue-expression --

The result of the expression is the value of the object referred to by the lvalue expression. After the result is noted, the object referred to by the lvalue expression is decremented in a way analogous to the postfix ++ operator.

_________________________
1 The UNIX C compiler does not allow float or double operands.
2 The portable C compiler does not allow float or double operands.
3 The portable C compiler does not allow float or double operands.

7.2.10 sizeof expression

The sizeof operator yields the size, in bytes, of its operand. When applied to an array, the result is the total number of bytes in the array. The size is determined from the declarations of the objects in the expression. The major use of sizeof is in communication with routines like storage allocators and I/O systems.[fn 1]

7.3 Multiplicative operators

The multiplicative operators *, /, and % group left-to-right.

7.3.1 expression * expression

The binary * operator indicates multiplication.
If both operands are int or char, the result is int; if one is int or char and one float or double, the former is converted to double, and the result is double; if both are float or double, the result is double. No other combinations are allowed.

7.3.2 expression / expression

The binary / operator indicates division. The same type considerations as for multiplication apply.

7.3.3 expression % expression

The binary % operator yields the remainder from the division of the first expression by the second. Both operands must be int or char, and the result is int. The use of this operation is not recommended for negative operands.

7.4 Additive operators

The additive operators + and - group left-to-right.

7.4.1 expression + expression

The result is the sum of the expressions. If both operands are int or char, the result is int. If both are float or double, the result is double. If one is char or int and one is float or double, the former is converted to double and the result is double. If an int or char is added to a pointer, the former is converted by multiplying it by the length of the object to which the pointer points and the result is a pointer of the same type as the original pointer. Thus if P is a pointer to an object, the expression ``P+1'' is a pointer to another object of the same type as the first and immediately following it in storage.

No other type combinations are allowed.

7.4.2 expression - expression

The result is the difference of the operands. If both operands are int, char, float, or double, the same type considerations as for + apply. If an int or char is subtracted from a pointer, the former is converted in the same way as explained under + above.

If two pointers to objects of the same type are subtracted, the result is converted (by division by the length of the object) to an int representing the number of objects separating the pointed-to objects.
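The scaling applied when integers are added to pointers, and the division applied when pointers are subtracted, can be sketched as follows. The sketch assumes a present-day C compiler and pointers into the same array; both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only: pointer subtraction counts objects, not bytes.
   Both function names are illustrative, not from the manual. */
long elements_between(const double *first, const double *last)
{
    assert(first != NULL && last != NULL);
    /* The byte distance is divided by the length of a double. */
    return (long)(last - first);
}

long bytes_between(const double *first, const double *last)
{
    assert(first != NULL && last != NULL);
    /* Viewed as pointers to char, the same two addresses differ by
       the raw number of bytes. */
    return (long)((const char *)last - (const char *)first);
}
```

For a pointer P into an array of double, elements_between(P, P+3) is 3, while bytes_between(P, P+3) is 3 times sizeof(double).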
This conversion will in general give unexpected results unless the pointers point to objects in the same array, since pointers, even to objects of the same type, do not necessarily differ by a multiple of the object-length.

_________________________
1 The UNIX C compiler allows a sizeof expression (section 7.2.10) anywhere that a constant is required.

7.5 Shift operators

The shift operators << and >> group left-to-right.

7.5.1 expression << expression
7.5.2 expression >> expression

Both operands must be int or char, and the result is int. The second operand should be non-negative. The value of ``E1<<E2'' is E1 (interpreted as a bit pattern) left-shifted E2 bit positions. The value of ``E1>>E2'' is E1 (interpreted as a bit pattern) logically right-shifted E2 bit positions.[fn 1] Vacated bits are filled by 0 bits.

7.6 Relational operators

The relational operators group left-to-right, but this fact is not very useful; ``a<b<c'' does not mean what it seems to.

7.6.1 expression < expression
7.6.2 expression > expression
7.6.3 expression <= expression
7.6.4 expression >= expression

The operators < (less than), > (greater than), <= (less than or equal to) and >= (greater than or equal to) all yield 0 if the specified relation is false and 1 if it is true. For non-pointer operands, operand conversion is exactly the same as for the + operator. In addition, pointers of any kind may be compared. The result in this case depends on the relative locations in storage of the pointed-to objects.[fn 2]

7.7 Equality operators

7.7.1 expression == expression
7.7.2 expression != expression

The == (equal to) and the != (not equal to) operators are exactly analogous to the relational operators except for their lower precedence. (Thus ``a<b == c<d'' is 1 whenever a<b and c<d have the same truth-value.)

[...]

7.13.7 lvalue >>= expression
       lvalue =>> expression
7.13.8 lvalue <<= expression
       lvalue =<< expression
7.13.9 lvalue &= expression
       lvalue =& expression
7.13.10 lvalue ^= expression
        lvalue =^ expression
7.13.11 lvalue |= expression
        lvalue =| expression

The behavior of an expression of the form ``E1 op= E2'' or ``E1 =op E2'' may be inferred by taking it as equivalent to ``E1 = E1 op E2''; however, E1 is evaluated only once.
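That E1 is evaluated only once matters when E1 contains a side effect. The following is a minimal sketch for a present-day C compiler; next_index and bump_once are hypothetical names, and only the ``op='' spelling is used because current compilers no longer accept the ``=op'' form.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only: in "E1 op= E2" the operand E1 is evaluated once.
   next_index() and bump_once() are illustrative names. */
static int calls = 0;

static int next_index(void)
{
    return calls++;          /* side effect: counts its own calls */
}

int bump_once(int *v)
{
    assert(v != NULL);
    calls = 0;
    v[next_index()] += 10;   /* next_index() runs exactly once here */
    return calls;            /* how many times it actually ran */
}
```

Writing the statement out as v[next_index()] = v[next_index()] + 10 would call the helper twice and address two different elements, which is why the two spellings are not interchangeable when E1 has side effects.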
Moreover, expressions like ``i += p'' in which a pointer is added to an integer, are forbidden. The "op=" form is preferred over the "=op" form, because it eliminates ambiguities possible in expressions such as ``x=-1''.

7.14 expression , expression

A pair of expressions separated by a comma is evaluated left-to-right and the value of the left expression is discarded. The type and value of the result are the type and value of the right operand. This operator groups left-to-right. It should be avoided in situations where comma is given a special meaning, for example in actual arguments to function calls (section 7.1.6) and lists of initializers (section 10.2).

_________________________
1 On UNIX, no conversion is necessary among different pointer types and integers. Thus, the value of i in this example would be preserved.

8. Declarations

Declarations are used within function definitions to specify the interpretation which C gives to each identifier; they do not necessarily reserve storage associated with the identifier. Declarations have the forms

     declaration:
          decl-specifiers init-declarator-list ;
          type-specifier ;

The declarators in the init-declarator-list contain the identifiers being declared. The decl-specifiers consist of at most one type-specifier and at most one storage class specifier.

     decl-specifiers:
          type-specifier
          sc-specifier
          type-specifier sc-specifier
          sc-specifier type-specifier

The second form of declaration is used to define structures (section 8.5).

8.1 Storage class specifiers

The sc-specifiers are:

     sc-specifier:
          auto
          static
          extern
          register

The auto, static, and register declarations also serve as definitions in that they cause an appropriate amount of storage to be reserved. In the extern case there must be an external definition (see below) for the given identifiers somewhere outside the function in which they are declared.
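The difference between the auto and static classes can be seen in a small sketch. It assumes a present-day C compiler (including the initializer syntax, which postdates this manual), and both function names are hypothetical.

```c
#include <assert.h>

/* Sketch only: static storage keeps its value between calls, while
   automatic storage is created afresh on each invocation.
   Both function names are illustrative, not from the manual. */
int running_total(int step)
{
    static int total = 0;   /* one object for the whole program run */

    assert(step >= 0);
    total += step;
    return total;
}

int fresh_total(int step)
{
    auto int total = 0;     /* a new object on every call */

    assert(step >= 0);
    total += step;
    return total;
}
```

Calling running_total(2) and then running_total(3) in a fresh program yields 2 and then 5, whereas fresh_total returns just its argument each time.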
Identifiers declared to be of class register may not be used as the operand of the address-of operator &. In addition, each implementation will have its own restrictions on the number and types of register identifiers which can be supported in any function. When these restrictions are violated, the offending identifiers are treated as auto.[fn 1]

If the sc-specifier is missing from a declaration, it is generally taken to be auto.

_________________________
1 The portable C compiler treats register as synonymous with auto.

8.2 Type specifiers

The type-specifiers are

     type-specifier:
          int
          char
          float
          double
          long
          long int
          short
          short int
          unsigned
          unsigned int
          long float
          struct { type-decl-list }
          struct identifier { type-decl-list }
          struct identifier

The struct specifier is discussed in section 8.5. If the type-specifier is missing from a declaration, it is generally taken to be int.[fn 1]

8.3 Declarators

The init-declarator-list appearing in a declaration is a comma-separated sequence of declarators, each of which may be followed by an initializer for the declarator (initialization is discussed in section 10.3).

     init-declarator-list:
          init-declarator
          init-declarator , init-declarator-list

     init-declarator:
          declarator initializer opt

The specifiers in the declaration indicate the type and storage class of the objects to which the declarators refer. Declarators have the syntax:

     declarator:
          identifier
          * declarator
          declarator ( )
          declarator [ constant-expression opt ]
          ( declarator )

The grouping in this definition is the same as in expressions.

_________________________
1 The UNIX C compiler implements a facility whereby identifiers can be equated to types. Such identifiers can be used as type-specifiers. The portable C compiler only partially implements this facility.
8.4 Meaning of declarators

Each declarator is taken to be an assertion that when a construction of the same form as the declarator appears in an expression, it yields an object of the indicated type and storage class. Each declarator contains exactly one identifier; it is this identifier that is declared.

If an unadorned identifier appears as a declarator, then it has the type indicated by the specifier heading the declaration.

If a declarator has the form

     * D

for D a declarator, then the contained identifier has the type ``. . . pointer to X'', where `` . . . X'' is the type which the identifier would have had if the declarator had been simply D.

If a declarator has the form

     D ( )

then the contained identifier has the type ``. . . function returning X'', where `` . . . X'' is the type which the identifier would have had if the declarator had been simply D.

A declarator may have the form

     D[constant-expression]

or

     D[ ]

In the first case the constant expression is an expression whose value is determinable at compile time, and whose type is int. In the second the constant 1 is used. (Constant expressions are defined precisely in section 15.) Such a declarator makes the contained identifier have type ``. . . array of X'', where `` . . . X'' is the type which the identifier would have had if the declarator had been simply D. The constant specifies the number of elements in the array.

An array may be constructed from one of the basic types, from a pointer, from a structure, or from another array (to generate a multi-dimensional array).

Finally, parentheses in declarators do not alter the type of the contained identifier except insofar as they alter the binding of the components of the declarator.

Not all the possibilities allowed by the syntax above are actually permitted.
The restrictions are as follows: functions may not return arrays, structures or functions, although they may return pointers to such things; there are no arrays of functions, although there may be arrays of pointers to functions. Likewise a structure may not contain a function, but it may contain a pointer to a function.

As an example, the declaration

     int i, *ip, f(), *fip(), (*pfi)();

declares an integer i, a pointer ip to an integer, a function f returning an integer, a function fip returning a pointer to an integer, and a pointer pfi to a function which returns an integer. Also

     float fa[17], *afp[17];

declares an array of float numbers and an array of pointers to float numbers. Finally,

     static int x3d[3][5][7];

declares a static three-dimensional array of integers, with rank 3x5x7. In complete detail, x3d is an array of three items: each item is an array of five arrays; each of the latter arrays is an array of seven integers. Any of the expressions ``x3d'', ``x3d [ i ]'', ``x3d [ i ] [ j ]'', ``x3d [ i ] [ j ] [ k ]'' may reasonably appear in an expression. The first three have type ``array'', the last has type int.

8.5 Structure declarations

Recall that one of the forms for a structure specifier is

     struct { type-decl-list }

The type-decl-list is a sequence of type declarations for the members of the structure:

     type-decl-list:
          type-declaration
          type-declaration type-decl-list

A type declaration is just a declaration which does not mention a storage class (the storage class ``member of structure'' here being understood by context) or include an initializer.

     type-declaration:
          type-specifier declarator-list ;

Within the structure, the objects declared have addresses which increase as their declarations are read left-to-right. Each component of a structure begins on an addressing boundary appropriate to its type. Therefore, there may be unnamed holes in a structure.[fn 1]
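The unnamed holes mentioned above can be measured on a present-day implementation. The following sketch relies on the offsetof macro from <stddef.h>, which is an assumption of modern C and is not described in this manual; struct sample and hole_after_tag are hypothetical names, and the size of the hole is implementation-dependent.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only: members sit at increasing addresses, and alignment
   may leave unnamed holes between them.
   struct sample and hole_after_tag are illustrative names. */
struct sample {
    char tag;       /* occupies one byte */
    double value;   /* usually needs a stricter boundary than char */
};

/* Number of unused bytes between tag and value on this machine. */
size_t hole_after_tag(void)
{
    size_t hole = offsetof(struct sample, value) - sizeof(char);

    assert(offsetof(struct sample, tag) == 0);
    return hole;
}
```

On many machines the result is nonzero, and sizeof(struct sample) is then larger than the sum of the sizes of its members.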
Another form of structure specifier is

     struct identifier { type-decl-list }

This form is the same as the one just discussed, except that the identifier is remembered as the structure tag of the structure specified by the list. A declaration may then be given using the structure tag but without the list, as in the third form of structure specifier:

     struct identifier

_________________________
1 The UNIX C compiler forces all structures to have an even length in bytes and be aligned on word boundaries.

Structure tags allow definition of self-referential and mutually-recursive structures (forward references to structure type names must be within the same group of definitions and be a pointed-to or returned type); they also permit the long part of the declaration to be given once and used several times. It is however absurd to declare a structure which contains an instance of itself, as distinct from a pointer to an instance of itself.

A simple example of a structure declaration, taken from section 16.2 where its use is illustrated more fully, is

     struct tnode {
          char tword[20];
          int count;
          struct tnode *left;
          struct tnode *right;
     };

which contains an array of 20 characters, an integer, and two pointers to similar structures. Once this declaration has been given, the following declaration makes sense:

     struct tnode s, *sp;

which declares s to be a structure of the given sort and sp to be a pointer to a structure of the given sort.

The names of structure members and structure tags may be the same as ordinary variables, since a distinction can be made by context. All of the members of a structure must have unique names. However, a single member name may be used in many structure definitions.[fn 1]

9. Statements

Except as indicated, statements are executed in sequence.

_________________________
1 The UNIX C compiler requires that the names of tags and members be distinct.
In addition, the same member name is allowed to appear in different structures only if the two members are of the same type and if their origin with respect to their structure is the same. Thus, separate structures can share a common initial segment. C Reference Manual - 19 9.1 Expression statement Most statements are expression statements, which have the form expression ; Usually expression statements are assignments or function calls. 9.2 Compound statement So that several statements can be used where one is expected, or local variables defined, the compound statement is provided: compound-statement: { declaration-list statement-list } opt declaration-list: declaration declaration declaration-list statement-list: statement statement statement-list 9.3 Conditional statement The two forms of the conditional statement are if ( expression ) statement if ( expression ) statement else statement In both cases the expression is evaluated and if it is non-zero, the first substatement is executed. In the second case the second substatement is executed if the expression is zero. As usual the ``else'' ambiguity is resolved by connecting an else with the last encountered elseless if. The expression may be of any fundamental type or a pointer. The comparison with zero is done in a manner appropriate for the type of the expression. 9.4 While statement The while statement has the form while ( expression ) statement The substatement is executed repeatedly so long as the value of the expression remains non-zero. The test takes place before each execution of the statement, and is the same as that performed by the if statement. 9.5 Do statement The do statement has the form do statement while ( expression ) ; The substatement is executed repeatedly until the value of the expression becomes zero. The test takes place after each execution of the statement, and is the same as that performed by C Reference Manual - 20 the if statement. 
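The two looping statements differ only in when the test is made. As a minimal sketch, the hypothetical functions count_while and count_do below each return the number of times the loop body is executed; they are written in present-day C.

```c
/* Body executions of a while loop: the test precedes each execution,
   so the body may not run at all. */
int count_while(int n)
{
    int runs = 0;
    while (n > 0) {
        runs++;
        n--;
    }
    return runs;
}

/* The same loop written with do: the test follows each execution,
   so the body always runs at least once. */
int count_do(int n)
{
    int runs = 0;
    do {
        runs++;
        n--;
    } while (n > 0);
    return runs;
}
```

For a positive argument the two functions agree; for an argument of zero the while form executes its body no times and the do form executes it once.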
9.6 For statement

The for statement has the form

     for ( expression-1_opt ; expression-2_opt ; expression-3_opt ) statement

This statement is equivalent to

     expression-1;
     while ( expression-2 ) {
          statement
          expression-3 ;
     }

Thus the first expression specifies initialization for the loop; the second specifies a test, made before each iteration, such that the loop is exited when the expression becomes zero; the third expression typically specifies an incrementation which is performed after each iteration.

Any or all of the expressions may be dropped. A missing expression-2 makes the implied while clause equivalent to ``while ( 1 )''; other missing expressions are simply dropped from the expansion above.

9.7 Switch statement

The switch statement causes control to be transferred to one of several statements depending on the value of an expression. It has the form

     switch ( expression ) statement

The expression must be int or char. The statement is typically compound. Each statement within the statement may be labelled with case prefixes as follows:

     case constant-expression :

where the constant expression must be int or char. No two of the case constants in a switch may have the same value. Constant expressions are precisely defined in section 15.

There may also be at most one statement prefix of the form

     default :

When the switch statement is executed, its expression is evaluated and compared with each case constant in an undefined order. If one of the case constants is equal to the value of the expression, control is passed to the statement following the matched case prefix. If no case constant matches the expression, and if there is a default prefix, control passes to the prefixed statement. In the absence of a default prefix none of the statements in the switch is executed.

Case or default prefixes in themselves do not alter the flow of control.
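Because the prefixes do not themselves alter the flow of control, execution continues from the matched prefix into the statements that follow it unless a break intervenes. The sketch below illustrates this in present-day C; the function name steps_from is hypothetical.

```c
/* Number of increments performed when the switch is entered at n.
   Each case flows into the cases written after it. */
int steps_from(int n)
{
    int steps = 0;
    switch (n) {
    case 3:
        steps++;         /* falls through */
    case 2:
        steps++;         /* falls through */
    case 1:
        steps++;
        break;           /* leaves the switch */
    default:
        steps = -1;      /* no case constant matched */
    }
    return steps;
}
```

Entering at case 3 performs all three increments, entering at case 1 performs one, and any other value reaches the default prefix.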
9.8 Break statement

The statement

     break ;

causes termination of the smallest enclosing while, do, for, or switch statement; control passes to the statement following the terminated statement.

9.9 Continue statement

The statement

     continue ;

causes control to pass to the loop-continuation portion of the smallest enclosing while, do, or for statement; that is to the end of the loop. More precisely, in each of the statements

     while ( ... ) {       do {                  for ( ... ) {
          . . .                 . . .                 . . .
          contin: ;             contin: ;             contin: ;
     }                     } while ( ... );      }

a continue is equivalent to ``goto contin''.

9.10 Return statement

A function returns to its caller by means of the return statement, which has one of the forms

     return ;
     return ( expression ) ;

In the first case no value is returned. In the second case, the value of the expression is returned to the caller of the function. If required, the expression is converted, as if by assignment, to the type of the function in which it appears. Flowing off the end of a function is equivalent to a return with no returned value.

9.11 Goto statement

Control may be transferred unconditionally by means of the statement

     goto expression ;

The expression should be a label (sections 9.12, 14.4) or an expression of type ``pointer to int'' which evaluates to a label. It is illegal to transfer to a label not located in the current function unless some extra-language provision has been made to adjust the stack correctly.

9.12 Labelled statement

Any statement may be preceded by label prefixes of the form

     identifier :

which serve to declare the identifier as a label. More details on the semantics of labels are given in section 14.4 below.

9.13 Null statement

The null statement has the form

     ;

A null statement is useful to carry a label just before the ``}'' of a compound statement or to supply a null body to a looping statement such as while.

10. Function definitions and global declarations

A C program consists of a sequence of function definitions and global declarations. Global declarations may be given for simple variables and for arrays. They are used to declare and/or reserve storage for objects.

10.1 Function definitions

Function definitions have the form

     function-definition:
          type-specifier_opt function-declarator function-body

A function declarator is similar to a declarator for a ``function returning ...'' except that it lists the formal parameters of the function in the parentheses which must follow the function name. Some examples of function-declarators are:

     f(a, b)          returns int
     *f(a)            returns pointer to int
     (*f(a))()        returns pointer to function returning int

The function-body has the form

     function-body:
          type-decl-list_opt function-statement

The purpose of the type-decl-list is to give the types of the formal parameters. No other identifiers should be declared in this list, and formal parameters should be declared only here. Formal parameters may be declared as being of class register.

The function-statement is just a compound statement.

     function-statement:
          compound-statement

A simple example of a complete function definition is

     int max (a, b, c)
     int a, b, c;
     {int m;
      m = (a > b) ? a : b;
      return (m > c ? m : c);
     }

Here ``int'' is the type-specifier; ``max(a, b, c)'' is the function-declarator; ``int a, b, c;'' is the type-decl-list for the formal parameters; ``{ . . . }'' is the function-statement.

C converts all float actual parameters to double, so formal parameters declared float have their declaration adjusted to read double. Correspondingly, char parameters are adjusted to read int. Also, since a reference to an array in any context (in particular as an actual parameter) is taken to mean a pointer to the first element of the array, declarations of formal parameters declared ``array of ...'' are adjusted to read ``pointer to ...''.
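The adjustment of array parameters has a visible consequence: the formal parameter may be incremented like any other pointer, and stores made through it reach the caller's array. The following is a minimal sketch with a hypothetical function sum_and_clear. It is written with the parameter types inside the parentheses, a later form of function definition than the one described in this manual, so that it compiles with current compilers.

```c
/* v is declared as an array, yet it may be incremented: its
   declaration has been adjusted to ``pointer to int''. */
int sum_and_clear(int v[], int n)
{
    int sum = 0;
    while (n-- > 0) {
        sum += *v;
        *v++ = 0;        /* writes through to the caller's array */
    }
    return sum;
}
```

After a call such as sum_and_clear(data, 3), the first three members of the caller's array data are zero, since only a pointer to them was passed.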
Finally, because neither structures nor functions can be passed to a function, it is useless to declare a formal parameter to be a structure or function (pointers to structures or functions are of course permitted).

A free return statement is supplied at the end of each function definition, so running off the end causes control, but no value, to be returned to the caller.

10.2 Global declarations

A global declaration has the same form as a declaration within a function (section 8), except that the sc-specifiers auto and register may not be used. Global declarations with sc-specifiers extern or static are like similar declarations within functions, except that the identifiers so declared are accessible throughout the remainder of the source file. A global static declaration reserves storage which is retained throughout the execution of a program. A global extern declaration declares that the associated identifiers have been externally defined, but is not itself such a definition.

A global declaration without an sc-specifier is an external definition. It reserves storage for the identifiers and allows them to be accessed by separately-compiled functions which contain appropriate extern declarations for the identifiers. It is an error to have more than one external definition of an identifier in a C program (see footnote 1). Functions appearing in an external data definition are declared as extern.

_________________________
1 The UNIX C compiler treats external data definitions and global extern declarations as equivalent. More than one external definition of an identifier is allowed, so long as at most one includes initialization.

10.3 Initialization

Explicit initialization is permitted in declarations which reserve storage, namely register, auto, and static declarations, and external definitions. Automatic structures and arrays may not be initialized (see footnote 1). The initial value of static and extern identifiers not explicitly initialized is zero. The initial value of register and auto identifiers not explicitly initialized is undefined.

_________________________
1 The portable C compiler does not support initialization of register or auto identifiers.

An initializer represents the initial value for the corresponding object being defined (and declared).

     initializer:
          constant
          { constant-expression-list }

     constant-expression-list:
          constant-expression
          constant-expression , constant-expression-list

Thus an initializer consists of a constant-valued expression, or comma-separated list of expressions, inside braces. The braces may be dropped when the expression is just a plain constant. The exact meaning of a constant expression is discussed in section 15. The expression list is used to initialize arrays and structures; see below.

The type of the identifier being defined should be compatible with the type of the initializer: a double constant may initialize a float or double identifier; a non-floating-point expression may initialize an int, char, or pointer.

An initializer for an array may contain a comma-separated list of compile-time expressions. The length of the array is taken to be the maximum of the number of expressions in the list and the square-bracketed constant in the array's declarator. This constant may be missing, in which case 1 is used. The expressions initialize successive members of the array starting at the origin (subscript 0) of the array. The acceptable expressions for an array of type ``array of ...'' are the same as those for type ``...'' (see footnote 2).

_________________________
2 The UNIX C compiler also allows, as a special case, a single string to be given as the initializer for an array of chars; in this case, the characters in the string are taken as the initializing values.

Structures can be initialized, but this operation is incompletely implemented and machine-dependent. Basically the structure is regarded as a sequence of words and the initializers are placed into those words. Structure initialization, using a comma-separated list in braces, is safe if all the members of the structure are integers or pointers but is otherwise ill-advised (see footnote 1).

11. Scope rules

A complete C program need not all be compiled at the same time: the source text of the program may be kept in several files, and precompiled routines may be loaded from libraries. Communication among the functions of a program may be carried out both through explicit calls and through manipulation of external data.

Therefore, there are two kinds of scope to consider: first, what may be called the lexical scope of an identifier, which is essentially the region of a program during which it may be used without drawing ``undefined identifier'' diagnostics; and second, the scope associated with external identifiers, which is characterized by the rule that references to the same external identifier are references to the same object.

11.1 Lexical scope

C supports block-structure only within function definitions (i.e., function definitions may not be nested, but any compound statement can define variables local to that statement).

The lexical scope of names declared in external definitions extends from their definition through the end of the file in which they appear. The same is true for implicit or explicit external declarations inside of function definitions. The lexical scope of formal parameters is the body of the function. The lexical scope of non-external names declared at the head of compound statements extends from their definition through the end of the compound statement. The only allowed forward reference to a label is as the expression in a goto statement.

It is an error to redeclare an identifier already declared in the current context, except for a consistent set consisting of any number of external declarations plus at most one external definition for an identifier.
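The extent of these scopes can be seen when one name is declared both externally and at the head of a compound statement. The following is a minimal sketch in present-day C, in which the inner declaration hides the outer one until the compound statement ends; the identifiers level, outer_level and inner_level are hypothetical.

```c
int level = 1;               /* external definition: in scope to end of file */

/* No local declaration is in scope here, so the external level is used. */
int outer_level(void)
{
    return level;
}

/* A name declared at the head of a compound statement is in scope
   only until the end of that compound statement. */
int inner_level(void)
{
    int seen;
    {
        int level = 2;       /* local to this compound statement */
        seen = level;
    }
    return seen * 10 + level;    /* the external level is in scope again */
}
```

Here inner_level combines the two values it observed, 2 inside the compound statement and 1 after it, while outer_level sees only the external definition.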
11.2 Scope of externals

If a function declares an identifier to be extern, then somewhere among the files or libraries constituting the complete program there must be an external definition for the identifier. All functions in a given program which refer to the same external identifier refer to the same object, so care must be taken that the type and extent specified in the definition are compatible with those specified by each function which references the data.

In a multi-file program, an external definition for an external identifier must appear in exactly one of the files. Any other files which wish to use the identifier must contain a corresponding extern declaration of the identifier. The identifier can be initialized only in the file where storage is allocated.

_________________________
1 (To section 10.3.) The UNIX C compiler implements initialization of arbitrary structures, and allows nested bracketed sequences of initializers for aggregates.

12. Compiler control lines

When a line of a C program begins with the character #, it is interpreted as a special directive to the compiler. Such compiler control lines may appear anywhere in the source file, except within comments and constants (see footnote 1). The names of compiler control lines are not reserved; they are recognized by context.

_________________________
1 In order to use compiler control lines with the UNIX C compiler, it is required that the first line of the source file begin with #.

12.1 Token replacement

A compiler-control line of the form

     # define identifier token-string

(note: no trailing semicolon) causes the compiler to replace subsequent instances of the identifier with the given string of tokens. When processing the # define line, token replacement is performed on the token string, but not on the identifier. When token replacement occurs, the inserted token string is not subject to further token replacement (see footnote 2). The names of compiler control lines are not subject to token replacement, nor are compiler control line arguments specified as identifiers.

_________________________
2 Unfortunately, the UNIX C compiler uses a different method of token replacement with different semantics. Token replacement is not performed on the token-string when processing a # define line. However, when the token-string is inserted, it is subject to token replacement.

This facility is most valuable for definition of ``manifest constants'', as in

     # define tabsize 100
     . . .
     int table[tabsize];

Macros may be defined by immediately following the identifier with a parenthesized list of formal parameters (see also section 12.3).

12.2 File inclusion

In multi-file C programs, it is necessary to have extern declarations for any external identifier used in files other than the one in which it is defined. Rather than repeat tedious and error-prone declarations for each external identifier in each file, one can create a separate file containing these declarations and cause it to be dynamically inserted into each source file. A compiler control line of the form

     # include "filename"

results in the replacement of that line by the entire contents of the file filename (see footnote 1). Included files may include other files. This technique is also useful for manifest constants and structure definitions.

12.3 Macros (see footnote 2)

The C macro facility allows token replacement strings to be parameterized. A macro is defined by lines of the form

     # macro identifier ( parameter-list_opt )
     token-string
     # end

The parameter list is a comma-separated list of identifiers, which are the formal parameters of the macro. The token-string, which may be given on zero or more input lines, may contain occurrences of the formal parameter names. When substitution is performed, these occurrences will be replaced by the corresponding actual parameters, which are strings of tokens. The format of a macro ``invocation'' is the same as for function calls.
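The # macro and # end lines are particular to the portable C compiler described here. As a sketch of the same substitution in the parameterized # define form accepted by later compilers (an assumption about the reader's compiler, not something this manual specifies), the following shows that actual parameters are inserted as strings of tokens rather than as values. The names SQUARE_BARE, SQUARE, bare and guarded are hypothetical.

```c
/* Formal parameter used bare: the tokens of the actual parameter are
   substituted as written, so operator precedence can interfere. */
#define SQUARE_BARE(x) x * x

/* Formal parameter enclosed in parentheses at each occurrence. */
#define SQUARE(x) ((x) * (x))

int bare(int n)
{
    return SQUARE_BARE(n + 1);   /* expands to n + 1 * n + 1 */
}

int guarded(int n)
{
    return SQUARE(n + 1);        /* expands to ((n + 1) * (n + 1)) */
}
```

With n equal to 3 the first function yields 7 and the second yields 16.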
Thus, the macro facility can be used to write small ``functions'' (without local variables) which will produce in-line code. However, one must be careful in that macro parameters are essentially call by name, whereas function parameters are call by value. In addition, it is a good idea to enclose within parentheses all occurrences of formal parameters in macro definitions, in order to avoid precedence problems after substitution of actual parameters.

_________________________
1 The UNIX C compiler also allows the filename to be enclosed in angle brackets instead of quotation marks. Such a filename is interpreted relative to a system standard include-file directory.
2 This facility is not supported by the UNIX C compiler.

12.4 Compile-time conditionals

Conditional compilation of source text is provided by the forms

     # ifdef identifier          # ifndef identifier
     ...                         ...
     # endif                     # endif

These forms cause the text enclosed by the compiler control lines to be included in the compilation only if the given identifier has ( ifdef ) or does not have ( ifndef ) a lexical definition. An identifier is given a lexical definition by # define and # macro. Compile-time conditionals may be nested.

12.5 Undefine (see footnote 1)

The undefine compiler control line has the form

     # undefine identifier

It removes any lexical definition of the identifier established by a previous # define or # macro. The identifier will henceforth not be subject to any form of token replacement. When used with a keyword, # undefine causes the reserved identifier to lose its built-in meaning and become an ordinary identifier.

12.6 Renamed identifiers (see footnote 2)

In writing some system support software, it is often desirable to use names for functions and external data which are not subject to accidental conflict with user-chosen names. This ability is provided by the rename compiler control line, which has the form

     # rename identifier string

The specified identifier will be replaced by the given character string when it appears in the output of the compiler.

_________________________
1 This facility is not supported by the UNIX C compiler.
2 This facility is not supported by the UNIX C compiler.

13. Implicit declarations

It is not always necessary to specify both the storage class and the type of identifiers in a declaration. Sometimes the storage class is supplied by the context: in external definitions, and in declarations of formal parameters and structure members. In a declaration inside a function, if a storage class but no type is given, the identifier is assumed to be int; if a type but no storage class is indicated, the identifier is assumed to be auto. An exception to the latter rule is made for functions, since auto functions are meaningless (C being incapable of compiling code into the stack). If the type of an identifier is ``function returning ...'', it is implicitly declared to be extern.

In an expression, an identifier followed by ( and not otherwise declared is contextually declared to be ``function returning int''. As an initializer, an otherwise undeclared identifier is contextually declared to be ``function returning int'' (see footnote 1).

For some purposes it is best to consider formal parameters as belonging to their own storage class. In practice, C treats parameters as if they were automatic (except that, as mentioned above, formal parameter arrays, chars, and floats are treated specially).

14. Types revisited

This section summarizes the operations which can be performed on objects of certain types.

14.1 Structures

There are only two things that can be done with a structure: pick out one of its members (by means of the ``.'' or ``->'' operators); or take its address (by unary ``&'').
Other operations, such as assigning from or to it or passing it as a parameter, draw an error message. In the future, it is expected that these operations, but not necessarily others, will be allowed.

14.2 Functions

There are only two things that can be done with a function: call it, or take its address. If the name of a function appears in an expression not in the function-name position of a call, a pointer to the function is generated. Thus, to pass one function to another, one might say

     int f();
     ...
     g (f);

Then the definition of g might read

     g (funcp)
     int (*funcp)();
     {. . .
      (*funcp)();
      . . .
     }

Notice that f was declared explicitly in the calling routine since its first appearance was not followed by ``(''.

_________________________
1 (To section 13.) The UNIX C compiler contextually declares identifiers in initializers to be of type int.

14.3 Arrays, pointers, and subscripting

Every time an identifier of array type appears in an expression, it is converted into a pointer to the first member of the array. Because of this conversion, arrays are not lvalues. By definition, the subscript operator [ ] is interpreted in such a way that ``E1[E2]'' is identical to ``*((E1) + (E2))''. Because of the conversion rules which apply to +, if E1 is an array and E2 an integer, then E1[E2] refers to the E2-th member of E1. Therefore, despite its asymmetric appearance, subscripting is a commutative operation.

A consistent rule is followed in the case of multi-dimensional arrays. If E is an n-dimensional array of rank i x j x . . . x k, then E appearing in an expression is converted to a pointer to an (n-1)-dimensional array with rank j x . . . x k. If the * operator, either explicitly or implicitly as a result of subscripting, is applied to this pointer, the result is the pointed-to (n-1)-dimensional array, which itself is immediately converted into a pointer.

For example, consider

     int x[3][5];

Here x is a 3x5 array of integers. When x appears in an expression, it is converted to a pointer to (the first of three) 5-membered arrays of integers. In the expression ``x[i]'', which is equivalent to ``*(x+i)'', x is first converted to a pointer as described; then i is converted to the type of x, which involves multiplying i by the length of the object to which the pointer points, namely 5 integer objects. The results are added and indirection applied to yield an array (of 5 integers) which in turn is converted to a pointer to the first of the integers. If there is another subscript the same argument applies again; this time the result is an integer.

It follows from all this that arrays in C are stored row-wise (last subscript varies fastest) and that the first subscript in the declaration helps determine the amount of storage consumed by an array but plays no other part in subscript calculations.

14.4 Labels

Labels do not have a type of their own; they are treated as having type ``array of int''. Label variables should be declared ``pointer to int''; before execution of a goto referring to the variable, a label (or an expression deriving from a label) should be assigned to the variable.

Label variables are a bad idea in general; the switch statement makes them almost always unnecessary.

15. Constant expressions

In several places C requires expressions which evaluate to a constant: after case, as array bounds, and in initializers. In the first two cases, the expression can involve only integer and character constants, possibly connected by the binary operators

     + - * / % & | ^ << >> < > <= >= == != && || ? :

or by the unary operators

     - ~ !

Parentheses can be used for grouping, but not for function calls (see footnote 1).

A bit more latitude is permitted for initializers. Besides constant expressions as discussed above, one can have double and string constants, and one can apply the unary & operator to external scalars.
The unary & can also be applied implicitly by appearance of functions or unsubscripted external arrays. An undefined identifier appearing in an initializer is implicitly declared to be a function returning int (see footnote 2).

_________________________
1 The UNIX C compiler allows sizeof, but not the relational operators, &&, ||, !, or conditional expressions.
2 The UNIX C compiler also allows initializers which evaluate to the address of an external or global static variable plus or minus a constant, such as ``&a[3]'', where a is an external or global static array.

16. Examples

These examples are intended to illustrate some typical C constructions as well as a serviceable style of writing C programs.

16.1 Inner product

This function returns the inner product of its array arguments.

     double inner (v1, v2, n)
     double v1[], v2[];
     {double sum;
      int i;
      sum = 0.0;
      for (i = 0; i < n; i++)
           sum += v1[i] * v2[i];
      return (sum);
     }

The following version is somewhat more efficient, but perhaps a little less clear. It uses the facts that parameter arrays are really pointers, and that all parameters are passed by value.

     double inner (v1, v2, n)
     double *v1, *v2;
     {double sum;
      sum = 0.0;
      while (n--)
           sum += *v1++ * *v2++;
      return (sum);
     }

The declarations for the parameters are really exactly the same as in the last example. In the first case array declarations ``[ ]'' were given to emphasize that the parameters would be referred to as arrays; in the second, pointer declarations were given because the indirection operator and ++ were used.

16.2 Tree and character processing

Here is a complete C program (courtesy of R. Haight) which reads a document and produces an alphabetized list of words found therein together with the number of occurrences of each word. The method keeps a binary tree of words such that the left descendant tree for each word has all the words lexicographically smaller than the given word, and the right descendant has all the larger words.
Both the insertion and the printing routine are recursive.

The program calls the library routines getchar to pick up characters and cexit to terminate execution. Cprint is called to print the results according to a format string.

Because all the external definitions for data are given at the top, no extern declarations are necessary within the functions. To stay within the rules, a type declaration is given for each non-integer function when the function is used before it is defined. However, since all such functions return pointers which are simply assigned to other pointers, no actual harm would result from leaving out the declarations; the supposedly int function values would be assigned without error or complaint.

     # define nwords 1500   /* number of different words */
     # define wsize 20      /* max chars per word */
     # define tnode struct _tnode   /* make tnode look like a type */

     struct _tnode          /* the basic structure */
     {char tword[wsize];
      int count;
      tnode *left, *right;
     };

     tnode space[nwords];   /* the words themselves */
     int nnodes nwords;     /* number of remaining slots */
     tnode *nextp space;    /* next available slot */
     tnode *freep;          /* free list */

     /*
      * The main routine reads words until end-of-file,
      * i.e., '\0' returned from "getchar".
      * "tree" is called to sort each word into the tree.
      */

     main (argc, argv)
     int argc;
     char *argv[];
     {tnode *top, *tree();
      char c, word[wsize];
      int i;

      i = top = 0;
      while (c = getchar ())
           if (('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z'))
                {if (i < wsize - 1) word[i++] = c;
                }
           else if (i)
                {word[i++] = '\0';
                 top = tree (top, word);
                 i = 0;
                }
      tprint (top);
     }

     /*
      * The central routine. If the subtree pointer is null, allocate
      * a new node for it. If the new word and the node's word are the
      * same, increase the node's count. Otherwise, recursively sort
      * the word into the left or right subtree depending on whether
      * the argument word is less or greater than the node's word.
      */

     tnode *tree (p, word)
     tnode *p;
     char word[];
     {tnode *alloc ();
      int cond;

      /* Is pointer null? */
      if (p == 0)
           {p = alloc ();
            copy (word, p->tword);
            p->count = 1;
            p->right = p->left = 0;
            return (p);
           }
      /* Is word repeated? */
      if ((cond = compar (word, p->tword)) == 0)
           {p->count++;
            return (p);
           }
      /* Sort into left or right */
      if (cond < 0) p->left = tree (p->left, word);
      else p->right = tree (p->right, word);
      return (p);
     }

     /*
      * Print the tree by printing the left subtree, the given node,
      * and then the right subtree.
      */

     tprint (p)
     tnode *p;
     {while (p)
           {tprint (p->left);
            cprint ("%4d: %s\n", p->count, p->tword);
            p = p->right;
           }
     }

     /*
      * String comparison: return number ( >, =, < ) 0
      * according as s1 ( >, =, < ) s2.
      */

     compar (s1, s2)
     char *s1, *s2;
     {int c1, c2;
      while ((c1 = *s1++) == (c2 = *s2++))
           if (c1 == '\0') return (0);
      return (c1 - c2);
     }

     /*
      * String copy: copy s1 into s2 until the null
      * character appears.
      */

     copy (s1, s2)
     char *s1, *s2;
     {while (*s2++ = *s1++);
     }

     /*
      * Node allocation: return pointer to a free node.
      * Bomb out when all are gone. Just for fun, there
      * is a mechanism for using nodes that have been
      * freed, even though no one here calls "free."
      */

     tnode *alloc ()
     {tnode *t;
      if (freep)
           {t = freep;
            freep = freep->left;
            return (t);
           }
      if (--nnodes < 0)
           {cprint ("Out of space\n");
            cexit ();
           }
      return (nextp++);
     }

     /*
      * The uncalled routine which puts a node on the free list.
      */

     free (p)
     tnode *p;
     {p->left = freep;
      freep = p;
     }

To illustrate a slightly different technique of handling the same problem, we will repeat fragments of this example with the tree nodes treated explicitly as members of an array. The fundamental change is to deal with the subscript of the array member under discussion, instead of a pointer to it.
The struct declaration becomes struct _tnode {char tword[wsize]; int count; int left, right; }; and alloc becomes alloc () {int t; t = --nnodes; if (t <= 0) {cprint ("Out of space\n"); cexit (); } return (t); } The free stuff has disappeared because if we deal with exclusively with subscripts some sort of map has to be kept, which is too much trouble. Now the tree routine returns a subscript also, and it becomes: int tree (p, word) char word[]; {int cond; if (p == 0) {p = alloc (); copy (word, space[p].tword); space[p].count = 1; space[p].right = space[p].left = 0; return (p); } if ((cond = compar (space[p].tword, word)) == 0) {space[p].count++; return (p); C Reference Manual - 36 } if (cond < 0) space[p].left = tree (space[p].left, word); else space[p].right = tree (space[p].right, word); return (p); } The other routines are changed similarly. It must be pointed out that this version is noticeably less efficient than the first because of the multiplications which must be done to compute an offset in space corresponding to the subscripts. The observation that subscripts ( like ``a [ i ] '' ) are less efficient than pointer indirection ( like ``*ap'' ) holds true independently of whether or not structures are involved. There are of course many situations where subscripts are indispensable, and others where the loss in efficiency is worth a gain in clarity. C Reference Manual - 37 References 1. Johnson, S. C., and Kernighan, B. W. The programming language B. Computing Science Technical Report No. 8, Bell Laboratories, Murray Hill, N. J., 1972. 2. Peterson, T. G., and Lesk, M. E. A user's guide to the C language on the IBM 370. Internal Memorandum, Bell Laboratories, 1974. 3. Richards, M. BCPL: a tool for compiler writing and system programming. Proc. SJCC 1969, 557-566. 4. Ritchie, D. M., and Thompson, K. L. The UNIX time-sharing system. Comm. ACM 7, 17 (July 1974), 365-375. 5. Ritchie, D. M., Kernighan, B. W., and Lesk, M. E. The C programming language. 
    Computing Science Technical Report No. 31, Bell Laboratories,
    Murray Hill, N. J., 1975.

6.  Snyder, A. A portable compiler for the language C. Rep. TR-149,
    Project MAC, M.I.T., Cambridge, Ma., 1975.

APPENDIX

Syntax Summary

1. Expressions.

expression:
        primary
        * expression
        & expression
        - expression
        ! expression
        ~ expression
        ++ lvalue
        -- lvalue
        lvalue ++
        lvalue --
        sizeof expression
        expression binop expression
        expression ? expression : expression
        lvalue asgnop expression
        expression , expression

primary:
        identifier
        constant
        string
        ( expression )
        primary ( expression-list_opt )
        primary [ expression ]
        lvalue . identifier
        primary -> identifier

lvalue:
        identifier
        primary [ expression ]
        lvalue . identifier
        primary -> identifier
        * expression
        ( lvalue )

The primary-expression operators

        ( )   [ ]   .   ->

have highest priority and group left-to-right. The unary operators

        *   &   -   !   ~   ++   --   sizeof

have priority below the primary operators but higher than any binary operator, and group right-to-left. Binary operators and the conditional operator all group left-to-right, and have priority decreasing as indicated:

binop:
        *    /    %
        +    -
        >>   <<
        <    >    <=   >=
        ==   !=
        &
        |
        ^
        &&
        ||
        ?    :

Assignment operators all have the same priority, and all group right-to-left.

asgnop:
        =    +=   -=   *=   /=   %=   >>=  <<=  &=   ^=   |=
             =+   =-   =*   =/   =%   =>>  =<<  =&   =^   =|

The comma operator has the lowest priority, and groups left-to-right.

2. Declarations.

declaration:
        decl-specifiers init-declarator-list ;
        type-specifier ;

decl-specifiers:
        type-specifier
        sc-specifier
        type-specifier sc-specifier
        sc-specifier type-specifier

sc-specifier:
        auto
        static
        extern
        register

type-specifier:
        int
        char
        float
        double
        long
        long int
        short
        short int
        unsigned
        unsigned int
        long float
        struct { type-decl-list }
        struct identifier { type-decl-list }
        struct identifier

init-declarator-list:
        init-declarator
        init-declarator , init-declarator-list

init-declarator:
        declarator initializer_opt

declarator:
        identifier
        * declarator
        declarator ( )
        declarator [ constant-expression_opt ]
        ( declarator )

type-decl-list:
        type-declaration
        type-declaration type-decl-list

type-declaration:
        type-specifier declarator-list ;

declarator-list:
        declarator
        declarator , declarator-list

initializer:
        constant
        { constant-expression-list }

constant-expression-list:
        constant-expression
        constant-expression , constant-expression-list

constant-expression:
        expression

3. Statements.

compound-statement:
        { declaration-list_opt statement-list }

statement:
        expression ;
        compound-statement
        if ( expression ) statement
        if ( expression ) statement else statement
        while ( expression ) statement
        for ( expression_opt ; expression_opt ; expression_opt ) statement
        switch ( expression ) statement
        case constant-expression : statement
        default : statement
        break ;
        continue ;
        return ;
        return ( expression ) ;
        goto expression ;
        identifier : statement
        ;

statement-list:
        statement
        statement statement-list

4. External definitions.

program:
        external-definition
        external-definition program

external-definition:
        function-definition
        declaration

function-definition:
        type-specifier_opt function-declarator function-body

function-declarator:
        identifier ( parameter-list_opt )
        * function-declarator
        function-declarator ( )
        function-declarator [ constant-expression_opt ]
        ( function-declarator )

parameter-list:
        identifier
        identifier , parameter-list

function-body:
        type-decl-list_opt function-statement

function-statement:
        compound-statement

5. Compiler control lines

        # define identifier token-string
        # define identifier( parameter-list ) token-string
        # include string
        # macro identifier ( parameter-list_opt )
        # end
        # ifdef identifier
        # ifndef identifier
        # endif
        # undefine identifier
        # rename identifier string
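
As a hedged illustration of the control lines above, the following sketch in present-day C syntax uses only the forms that later C compilers also accept: both kinds of # define, # include, # ifdef, # ifndef and # endif. The # macro, # end, # undefine and # rename lines are specific to this compiler and are not shown. The names WSIZE, NNODES, WORDBUF, MAX, larger_limit and wordbuf_size, and the values given to them, are assumptions made for the example.

```c
#include <assert.h>     /* # include: standard header, used for the check below */

#define WSIZE 20                           /* # define identifier token-string */
#define MAX(a, b) ((a) > (b) ? (a) : (b))  /* # define identifier( parameter-list ) token-string */

#ifndef NNODES          /* taken only when NNODES is not yet defined */
#define NNODES 100
#endif

#ifdef WSIZE            /* taken because WSIZE is defined above */
#define WORDBUF (WSIZE + 1)
#endif

/* The larger of the two limits, computed through the parameterized macro. */
int larger_limit(void)
{
    return MAX(WSIZE, NNODES);
}

/* Size of a buffer holding a WSIZE-character word plus its terminating null. */
int wordbuf_size(void)
{
    assert(WORDBUF > WSIZE);
    return WORDBUF;
}
```

In the parameterized form the left parenthesis follows the identifier with no intervening space, as the summary line shows; with a space, the parenthesized text would be taken as the start of the token-string.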