C Compiler Differences Copyright (c) 1976, 1977, 1978 by Alan Snyder This note lists known language differences between the portable C compiler (hereafter referred to as PC) and the UNIX C compiler (February 1977 version, hereafter known as UC). Obvious machine dependencies, such as precision, are not listed. Serious Incompatibilities 1. Structure types: In PC, the definitions of structure members are meaningful only in the context of the structure type. Thus, the same identifier can be used as the name of members of different structure types. As a result, when using a member name (after . or ->), the structure-designating expression must be of the proper structure type. In UC, members of different structures must have different names, unless the types and offsets of the members are the same. UC thus can allow any type of expression to designate a structure. 2. #define: There are significant differences between the semantics of #define in PC and UC. In PC, replacement is in terms of tokens; in UC, replacement is in terms of strings. The UC semantics is both more powerful and more dangerous, as the replacing text may merge (lexically) with surrounding text. Similarly, UC substitutes for occurrences of formal parameters even inside of string constants. Another difference is that PC performs token substitution on the body of a #define when the #define is processed. UC rescans the body (for further substitutions) after substitution; thus, infinite substitution is possible (a special hack allows "#define unix unix" to work). Intentional PC Restrictions 1. Relational operators: In PC, pointers may be compared with pointers, and pointers may be tested for equality or inequality with zero; no other pointer comparisons are allowed. (These restrictions can be avoided using the assignment operator.) 2. Block structure: UC allows a declaration to occlude a more global declaration of the same identifier; PC does not allow occlusion. 3. External data definitions: In PC, the type-specifier is not optional. In addition, the must be exactly one external definition of each external identifier in all of the files making up a program; of course, multiple 4 April 1978 - 2 - C Compiler Differences external declarations are allowed. 4. Implicit ampersand: PC always supplies implicit ampersands for functions and string constants; thus, explicit ampersands are not allowed. UC allows explicit ampersands on functions and string constants. 5. Character constants: In PC, character constants may contain only one character, whereas UC allows character constants to contain two characters. Minor Incompatibilities 1. Storage classes in declarations: In PC, storage classes must be accompanied by types in declarations. This restriction is necessary in order to parse declarations using user-defined type names. 2. Character case: In PC, upper and lower case are not distinguished in identifiers, reserved words, or compiler control commands. UC always distinguishes case, and recognizes reserved words and compiler control commands only in lower case. 3. External function definitions: PC converts char parameters to type int. 4. Initializers: PC allows an external or static (only) identifier to be used as an initializer, but no offset is allowed. Such an identifier, if not previously declared, is implicitly declared to be a function returning int. UC implicitly declares such an identifier to be of type int. UC allows more general initializers and is stickier about the use of ampersand. Also, in PC, character strings may not be used to initialize arrays of characters. 5. Conversion: In PC, conversion from double or float to int or char rounds, rather than truncates. 6. Constant expressions: PC constant expressions may not use sizeof. 7. Unary minus: In PC, the type of the result is either int or double. 8. Conditional statement: In PC, the comparison with zero is done in a manner appropriate for the type of the expression; the expression may be any fundamental type or a pointer. C Compiler Differences - 3 - 4 April 1978 9. Structure alignment: In PC, structures are aligned according to the strictest of the alignments of the members; in particular, a structure containing only characters is not necessarily word-aligned. Additional PC Features 1. Significance of identifiers: in PC, all characters of identifiers local to a file are significant. The significance of identifiers global to a file is operating-system dependent (warning: it may be as little as the first five characters). In UC, the significance of identifiers local to a file is eight characters. 2. Compiler control lines: In PC, an initial # is not required in order to use compiler control lines. Also, some additional compiler control lines are supported by PC: #rename (rename the assembly language version of an identifier to an arbitrary string), #macro-#end (an alternate form of macro definition), #undefine (remove any lexical definition of an identifier -- even a reserved word). 3. Escape sequences: PC recognizes two additional escape sequences in character and string constants, \p for FF (new page) and \v for VT (vertical tab). 4. Constant expressions: PC constant expressions may use the following operators not allowed in UC: <, >, <=, >=, ==, !=, &&, ||, ?, !. 5. Logical NOT: PC also allows float and double operands. Unsupported UC Features 1. Aggregate initialization: PC does not support aggregate initialization using nested sequences of initializers enclosed in braces. PC does support the "hack" initialization of structures containing only ints and pointers. 2. Register variables: PC implements all register variables as auto. 3. Bit fields: PC recognizes bit field definitions, but treats them as int. 4. Varieties of integers: PC recognizes long, short, and unsigned ints, but treats them all as int. 4 April 1978 - 4 - C Compiler Differences 5. Long constants: PC recognizes long constants and treats them as int constants. 6. Initialization: PC does not support the initialization of auto or register variables. 7. Standard include directory: PC does not support this feature. 8. Static functions: PC does not support static functions. 9. Type definitions: In PC, a user-defined type name may not be followed in a declaration by a left parenthesis unless a storage class is also given.