Extended Pascal

.. a new standard in computer languages

An Introduction to Extended Pascal

by Tony Hetherington

1. Background

The programming language Pascal was originally designed by Professor Niklaus Wirth, and named after a French mathematician and philosopher widely admired for the clear and direct nature of his ideas. Wirth set out two principal aims for the design of Pascal: that it should be systematic and coherent, so far as possible avoiding arbitrary restrictions, and that it should be suitable for efficient implementation on the machines then available. A "User Manual and Report" defining Pascal was produced by Jensen and Wirth during the 1970s; the most recent (third) edition corresponds exactly to the first standard mentioned below.

Closely-related but not identical standards were published in 1983 by ISO (originally a BSI standard) and ANSI. While the definition they contained was very precise, and in this respect was something of a landmark, it did not attempt to add much to Wirth's original specification. One consequence was that while many implementations provided the standard language, they also devised their own extensions to provide users with features which had been found to be in demand.

After publication of the first standard, the ANSI/IEEE joint project committee continued to develop extensions, and from 1984 worked closely with the ISO group to produce the definition of an Extended Pascal language which would be upward-compatible with the earlier versions, equally rigorously defined but covering areas which experience had proved to be desirable. Extended Pascal standards which are identical in technical content were published in 1990 and 1991. Since finalising the standard, the technical committees have continued to develop new features in the areas of object-oriented programming and exception handling.

The Extended Pascal language is somewhat larger than the earlier standard (which will generally be referred to here as "classic" Pascal), and an introduction of this kind can give only a general survey. However, the first complete implementation has been produced for microcomputers running DOS, so it is fair to conclude that Wirth's original aims have still been kept in mind.

2. General Enhancements

Before considering major new features, it is useful to start a survey of the Extended Pascal language with some enhancements which make the facilities inherited from classic Pascal easier to use (and which apply also in the major extensions described later). A number of these enhancements may be familiar as local extensions to users of current Pascal implementations.

Constant expressions can be used where classic Pascal requires constants, for example in declarations. These constant expressions can employ most of the predefined functions, as well as operators.

 CONST linefeed = chr(10);
 TYPE buffer = ARRAY[0..bufflen-1] OF char;

Relaxed ordering. The declarations and definitions (CONST, TYPE, VAR etc.) can be repeated, and can appear in any order, provided that there are no forward references.

Functions can return results of any assignable type, including arrays and records, and the processes of dereference (^), indexing and field selection can be applied when appropriate to the type. The result variable may be given a name (different from the function name) that can be referenced in contexts that require a variable without causing a recursive call of the function.
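
As a minimal sketch of both points, the following program declares a function that returns a record and gives its result variable a name. The identifiers point, midpoint and m are illustrative only and are not taken from the standard.

```pascal
PROGRAM resultdemo(output);

TYPE point = RECORD x, y: real END;

VAR p, q: point;

{ The result variable is named m, so the body can treat the result
  as an ordinary variable without calling midpoint recursively. }
FUNCTION midpoint(a, b: point) = m: point;
  BEGIN
    m.x := (a.x + b.x) / 2;
    m.y := (a.y + b.y) / 2
  END {midpoint};

BEGIN
  p.x := 0; p.y := 0;
  q.x := 4; q.y := 2;
  { Field selection applied directly to the function result. }
  writeln(midpoint(p, q).x:6:2, midpoint(p, q).y:6:2)
END.
```

Naming the result is most useful when the result is structured, since its components can then be built up one at a time.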

Subrange bounds may be constant expressions, in line with the wider relaxation described above, and such subranges define conventional static types. The bounds may also be general expressions, involving run-time variables, and introducing nonstatic types; this topic is discussed separately later.
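
A small sketch of a subrange whose upper bound is a run-time value is given below; the procedure squares and its local array buf are hypothetical names chosen for the illustration.

```pascal
PROGRAM dyndemo(output);

PROCEDURE squares(n: integer);
  VAR i: integer;
      { The upper bound n is a run-time value, so this array type
        is nonstatic and is sized afresh at each activation. }
      buf: ARRAY [1..n] OF integer;
  BEGIN
    FOR i := 1 TO n DO buf[i] := i * i;
    FOR i := 1 TO n DO write(buf[i]:4);
    writeln
  END {squares};

BEGIN
  squares(3);
  squares(6)
END.
```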

CASE enhancements. In both variant record declarations and CASE statements, ranges are permitted in constant-lists, and an OTHERWISE clause may be introduced to complete the list.

 CASE ch OF
   '0'..'9': digit;
   'A','E','I','O','U': vowel;
   OTHERWISE other;
 END {case};

Short-circuit operators. Variants of the Boolean operators AND and OR are provided which guarantee that an expression is evaluated no further than is necessary to determine the result. The new variants are called AND_THEN and OR_ELSE, and they can be used as in the following example to simplify conditional code.

 IF (p <> NIL) AND_THEN (p^.f2 = 2)
 THEN .....
 ELSE .....

If p is nil, control passes immediately to the else part, and the illegal dereference p^.f2 (which would result in a protection violation if the program were running in a protected memory environment) is not executed.
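
The same guarantee is convenient in loops. The sketch below, which uses a hypothetical linked-list type, searches for a key that is not present; the dereference in the loop condition is never attempted once p has become nil.

```pascal
PROGRAM finddemo(output);

TYPE link = ^node;
     node = RECORD
              key: integer;
              next: link
            END;

VAR head, p: link;
    i: integer;

BEGIN
  { Build a short list holding the keys 3, 2, 1. }
  head := NIL;
  FOR i := 1 TO 3 DO
    BEGIN
      new(p);
      p^.key := i;
      p^.next := head;
      head := p
    END;

  { The right-hand test is skipped as soon as p becomes NIL. }
  p := head;
  WHILE (p <> NIL) AND_THEN (p^.key <> 7) DO p := p^.next;

  IF p = NIL THEN writeln('7 not found')
  ELSE writeln('found ', p^.key:1)
END.
```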

Nondecimal numbers may be introduced into source programs using the notation base#number, for instance 16#FF or 8#377 (hexadecimal FF = octal 377 = decimal 255).

Characteristics. Just as maxint gives implementation-specific information about the integer type, there is a predefined constant maxchar (the character with largest ordinal value), and constants maxreal, minreal and epsreal which give the characteristics of the real type.
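
The notation for nondecimal numbers and the predefined characteristics can be tried out with a short program such as the following sketch. The constant names mask, same and bufflen are illustrative, and the values printed for the characteristics depend on the implementation.

```pascal
PROGRAM limits(output);

CONST mask = 16#FF;            { hexadecimal FF = 255 }
      same = 8#377;            { octal 377 = 255 }
      bufflen = 2 * mask + 2;  { constant expression = 512 }

BEGIN
  IF mask = same THEN writeln('16#FF and 8#377 agree: ', mask:1);
  writeln('bufflen = ', bufflen:1);
  writeln('maxint = ', maxint:1);
  writeln('ord(maxchar) = ', ord(maxchar):1);
  writeln('maxreal = ', maxreal);
  writeln('minreal = ', minreal);
  writeln('epsreal = ', epsreal)
END.
```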

Numeric input from textfiles accepts the representation laid down in the data-interchange standard (ISO 6093), which permits a decimal point in leading or trailing position.

Zero fieldwidth. The optional fieldwidth parameter in textfile output may take the value zero.

Inverse ord. An enhancement to the succ and pred functions allows them to take an optional second parameter of integer type. This is more versatile than a simple inverse of ord, but among other things gives succ that capability; for example, given a conventional day-of-week enumeration, succ(Monday,3) yields Thursday.
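
A brief sketch of the two-parameter forms, using the day-of-week enumeration mentioned above:

```pascal
PROGRAM daydemo(output);

TYPE day = (Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday);

VAR d: day;

BEGIN
  d := succ(Monday, 3);
  IF d = Thursday THEN writeln('succ(Monday,3) is Thursday');

  d := pred(d, 2);
  IF d = Tuesday THEN writeln('pred(Thursday,2) is Tuesday');

  { Starting from the first value, succ acts as an inverse of ord. }
  d := succ(Monday, ord(Saturday));
  IF d = Saturday THEN writeln('succ(Monday,ord(Saturday)) is Saturday')
END.
```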

Underscores are permitted as significant characters within identifiers, though not in leading or trailing positions.

3. Initial states

In classic Pascal, all variables are created in an undefined state (that is, containing an undefined value). In Extended Pascal, an initial state can be defined which is automatically given to a variable when it is created. At the outer level an implementation may preset the initial state, but the facility applies also to local variables of procedures and to variables created on the heap. Furthermore, a type definition may carry an initial state, which is given to every variable of the type unless the variable declaration itself overrides it.

 TYPE intz = integer VALUE 0;
      trec = RECORD
               a,b: intz;
               c: char VALUE '*';
             END;
 VAR recp: ^ trec;

With these declarations, a call of new(recp) causes a record to be created on the heap with fields a and b initialised to zero and c initialised to asterisk.

4. Array and record constructors

The set constructors of classic Pascal form the basis of constructors to define other structured values. When all the ingredients are constant, a constructed value can be named as a constant or used to define initial states. Within statements, the constructor may include run-time values.

An array constructor may list specific indexes, with the values those elements are to be given, and/or may provide a default to be given to any elements not individually listed. A record constructor names record fields together with the values to be given; the whole record must be defined. (For some purposes it may be more convenient to specify values of individual fields in the record declaration.)

 [3..6,9:5.5; 10:10.5; OTHERWISE 0] {array}
 [f1,f4:10; f2:'$'; f3:'Message'] {record}

The processes of indexing and field selection can be applied to structured constants (including string literals) just as to variables; furthermore, within statements an index may be a run-time value, so that different elements of the constant are accessed on different occasions.
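
The constructors shown above are fragments. In a complete program each is assumed here to be prefixed by the name of the type being constructed, as in the following sketch; the type names vec and msg and the constant names are hypothetical.

```pascal
PROGRAM consdemo(output);

TYPE vec = ARRAY [1..10] OF real;
     msg = RECORD
             f1, f4: integer;
             f2: char;
             f3: PACKED ARRAY [1..7] OF char
           END;

CONST defaults = vec[3..6, 9: 5.5; 10: 10.5; OTHERWISE 0.0];
      hello = msg[f1, f4: 10; f2: '$'; f3: 'Message'];
      hexdigits = '0123456789ABCDEF';

VAR i: integer;

BEGIN
  { Elements 3..6 and 9 hold 5.5, element 10 holds 10.5, the rest 0.0. }
  FOR i := 1 TO 10 DO write(defaults[i]:6:1);
  writeln;
  writeln(hello.f2, hello.f3, hello.f1 + hello.f4:4);

  { Indexing a string constant with a run-time value: writes ABCDEF. }
  FOR i := 11 TO 16 DO write(hexdigits[i]);
  writeln
END.
```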

5. Modules

Classic Pascal does not include any facility for separate compilation of parts of a program. Besides limiting the scope of programs which can be produced on small machines, this has the important disadvantage that there is no standard form for the preparation of precompiled libraries. Almost every implementation of Pascal introduced an extension of some kind to overcome this limitation, and it was seen as one of the most important tasks of Extended Pascal to define a form of separate compilation which would not forfeit type security.

Besides the main program, Extended Pascal programs may include components known as modules. A module can export constants, types, variables, procedures and functions through named interfaces, and these interfaces may be imported by other modules or by the main program. By default, an interface is imported complete, with the names of all its constituents accessible, but there are several options to meet the difficulties which can arise in practice when importing from modules which were not designed in conjunction with one another. Instead of importing the whole of an interface, just selected items may be chosen; the names of constituents may be kept apart and referred to by giving the interface name; and constituents may be renamed on import.

A module has two parts: a heading and a block. The module heading contains declarations and definitions of any items which are to be exported, in particular the headings (but no more) of procedures and functions. The block includes the definitions of any exported procedures or functions, together with items which do not need to be known outside the module. There may also be initialization and finalization code.

The heading and block may be combined or separate. When separate, the possibility arises of alternative implementations of the same heading (with and without diagnostic code, for example), or of an implementation coded in some other language such as assembler.

A module heading may import from another module, and may re-export these imported items, allowing for example composite library interfaces to be constructed. A module block may independently import interfaces from other modules. The export and import of an interface set up what is known as a "supplying" relationship, in which the exporting module supplies the importing module or program. The supply network puts some constraints on the sequence in which modules can be compiled; in particular, it must not contain any loops (which would imply that a module was indirectly attempting to supply itself). Any initialization code for a module is executed before that of any component which it supplies.

Modular construction is normally appropriate to larger programs, and small examples inevitably appear trivial. However, the three related modules which follow demonstrate a number of the possibilities.

Module one exports an interface named i1, containing two constants named lower and upper. A variable dummy is declared but not exported. Module one has a minimal module block.

MODULE one;

EXPORT i1 = (lower,upper);

CONST lower = 0;
 upper = 11; {must be prime}

VAR dummy: Boolean;

END {of module-heading};
END {of module-block}.

Module two imports the constants lower and upper from one, uses them to define a type, and also re-exports them. It exports two interfaces named i2 and j2. Interface i2 contains the type subr; j2 contains the constants lower and upper. Module two also has a minimal module block.

MODULE two;

EXPORT i2 = (subr); {has just one constituent}
 j2 = (lower,upper);

IMPORT i1; {import all (both) constituents}

TYPE subr = lower..upper;

END {of module-heading};
END {of module-block}.

Module three demonstrates qualified import and renaming. It exports one interface i3 containing a function, a type, and two constants. It imports interfaces i1 and i2 qualified, so references to the constituents within the module are prefixed with the interface-names; further, the type subr is renamed lim_range on import, so it is referred to as i2.lim_range. The constants lower and upper are renamed on export as lim_lower and lim_upper. The function-heading of limited is declared in the module-heading, and the function-definition in the module-block. Note that the parameter-list and result-type of limited are not repeated in the definition; this arrangement is similar to forward-declared procedures in classic Pascal.

MODULE three;

EXPORT i3 = (limited,i2.lim_range, {function and type}
 i1.lower=>lim_lower,i1.upper=>lim_upper);

IMPORT i1 QUALIFIED; {lower, upper to be referenced 
 as i1.lower and i1.upper}
 i2 QUALIFIED ONLY (subr=>lim_range);

FUNCTION limited(x: integer): i2.lim_range;

END {of module-heading};

FUNCTION limited;
  BEGIN
    IF x < i1.lower THEN limited := i1.lower
    ELSE
      IF x > i1.upper THEN limited := i1.upper
      ELSE limited := x
  END {limited};

END {of module-block}.
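
To complete the picture, a main program that imports interface i3 might look like the following sketch. It is not self-contained: it relies on modules one, two and three above, and the program name usethree is hypothetical.

```pascal
PROGRAM usethree(output);

IMPORT i3;   { unqualified: limited, lim_range, lim_lower, lim_upper }

VAR r: lim_range;

BEGIN
  r := limited(-5);     { clamped to lim_lower, i.e. 0 }
  writeln(r:1);
  r := limited(7);      { already within range }
  writeln(r:1);
  r := limited(100);    { clamped to lim_upper, i.e. 11 }
  writeln(r:1);
  writeln(lim_lower:1, '..', lim_upper:1)
END.
```

The program sees only the names exported through i3; the interfaces i1 and i2, and the variable dummy of module one, remain hidden from it.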

6. Restricted types

Restricted types provide a means of hiding the details of a type when it is exported. The originator of a module may declare a "restricted" version of a type, and export only the restricted form. An importer can declare variables of such a type, and pass them as parameters to procedures imported from the originating module, but can only treat them as black boxes with no knowledge of their internal structure. Within the module, the restricted parameters are of the unrestricted original type.
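
The following is a minimal sketch of how such a module and an importer might be written, assuming the RESTRICTED type form of the standard. All the identifiers (counters, counting, rep, counter, reset, bump, current) are hypothetical.

```pascal
MODULE counters;

EXPORT counting = (counter, reset, bump, current);

TYPE rep = RECORD n: integer END;   { full structure, not exported }
     counter = RESTRICTED rep;      { only this opaque form is exported }

PROCEDURE reset(VAR c: counter);
PROCEDURE bump(VAR c: counter);
FUNCTION current(c: counter): integer;

END {of module-heading};

{ Inside the module the parameters are treated as being of type rep. }
PROCEDURE reset;
  BEGIN c.n := 0 END;

PROCEDURE bump;
  BEGIN c.n := c.n + 1 END;

FUNCTION current;
  BEGIN current := c.n END;

END {of module-block}.

PROGRAM usecounter(output);

IMPORT counting;

VAR c: counter;     { a black box to the importer; c.n is not accessible }

BEGIN
  reset(c);
  bump(c);
  bump(c);
  writeln(current(c):1)
END.
```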

7. Strings

In classic Pascal, the only string facilities are associated with packed arrays of char. This is another area in which a variety of local extensions have arisen. Extended Pascal includes provision for dynamic string types, and unifies them with classic Pascal strings and with characters.

String variables are declared with a maximum capacity, for instance:

 VAR s1,s2: string(20);
 fname: PACKED ARRAY [1..20] OF char;

String values have a length (number of characters). A dynamic string variable such as s1 can hold a value of any length from zero up to its capacity, and the object code keeps track of the current length. With a fixed string such as fname, as found in classic Pascal, the length of the contents is equal to the capacity; when a shorter value is assigned to fname, it is padded on the right with spaces until it fits. A variable of type char has a capacity of 1. Variables of these three kinds, together with string literals and character constants, produce general string values. In addition, individual characters or substrings of string variables can be referenced by indexing, for instance s1[i] or fname[1..8].

String values can be concatenated using the + operator, and constants can be defined by constant expressions of string type, e.g. 'ABC'+chr(13). There are predeclared functions for the commonly required string operations such as locating a substring within a longer string.

Strings can be written to or read from textfiles, and versions of the textfile read and write procedures are provided which take a string variable in place of the file, making all the conversion and editing processes available internally.
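
The sketch below brings several of these facilities together. It assumes the predeclared functions length and index and the procedures writestr and readstr of the standard; the variable names are illustrative.

```pascal
PROGRAM strdemo(output);

VAR s1, s2: string(20);
    n: integer;

BEGIN
  s1 := 'Extended';
  s2 := s1 + ' ' + 'Pascal';           { concatenation; 15 characters }
  writeln(s2, ' has length ', length(s2):1);
  writeln('Pascal starts at position ', index(s2, 'Pascal'):1);
  writeln('First word: ', s2[1..8]);   { substring by indexing }

  writestr(s1, 'n=', 42:1);            { write into a string: 'n=42' }
  readstr('123', n);                   { read from a string value }
  writeln(s1, ' and ', n + 1:1)
END.
```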

A string may be declared with capacity fixed at compile time, as in the example above, or defined by a run-time variable expression. There are also provisions for formal parameters which adjust themselves to the actual parameter at each call.

 PROCEDURE p (VAR s: string);

Dynamic strings of different capacities may be passed to this procedure with each call; the code within the procedure can discover the capacity of each actual parameter by reference to s.capacity.
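
A short sketch of such a procedure in use, with two hypothetical variables of different capacities:

```pascal
PROGRAM capdemo(output);

VAR short: string(10);
    long: string(40);

PROCEDURE p(VAR s: string);
  BEGIN
    s := 'abc';
    { The capacity belongs to the actual parameter; the length to its value. }
    writeln('capacity ', s.capacity:1, ', length ', length(s):1)
  END {p};

BEGIN
  p(short);   { capacity 10, length 3 }
  p(long)     { capacity 40, length 3 }
END.
```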

If a variable n has the value 10, a string declared as string(n) has the same type as one declared as string(10) for compatibility purposes (though the type checking cannot be performed until run-time). As will be seen in the next section, this rule and the adaptable formal parameters are both particular cases of facilities that apply to all schematic types, and arise from string being formally defined to be a predeclared "schema" with additional special properties.

8. Nonstatic types

It has been a characteristic of almost all versions of Pascal that data types are static, that is, are ultimately defined at compile time. The ISO version of classic Pascal included an optional feature called conformant array parameters, a specialised parameter form for which actual arrays of different sizes can be supplied in different calls of the same procedure. This feature has been included in a number of implementations, and provides a measure of flexibility which suits, for example, mathematical procedures which manipulate arrays. All that is required is that actual parameters "conform" to the formal in the sense of having the same number of dimensions and final element type.

In the context of classic Pascal, conformant arrays give a degree of flexibility when ready-made procedures are included in a program in source form, but the actual parameters must ultimately all be static, with sizes defined at compile time. Conformant arrays continue to be an optional feature of Extended Pascal, but there is in addition a more far-reaching variety of nonstatic types based on schemata.

A schema is a template describing a family of related types, from which individual types can be produced by substituting either compile-time or run-time values, typically to define subrange bounds or to select record variants. These "schematic" types can be used in almost all respects just like conventional static types: they can be used in the declaration of variables, record fields and formal parameters; they can be used as domain types of pointers, and returned by functions. It was observed earlier that subranges can have their bounds determined at run-time; such subranges are similar to individual schematic types without the benefit of a family connection.

 TYPE s(a,b: integer) = ARRAY [a..b] OF real;
 VAR x: s(0,n-1);

The schema s defines a family of array types. The variable x has a type produced from the schema by substituting the values 0 and n-1 for a and b. If n is a variable, the size of the array is determined at run-time. The index bounds of array x can be referenced as x.a and x.b, as in the statement

 FOR i := x.a TO x.b DO writeln(x[i]);

Formal parameters may be declared with the original schema name, and will adapt themselves to the actual parameter at each call, as described earlier for the particular case of strings. In this respect they are similar to conformant array parameters, but require that each actual is of a type produced from the same schema. A pointer may also be declared to have the schema name as its domain type, and an additional form of the procedure new is provided which includes actual values to select a type from the schema.
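
The following sketch illustrates both a formal parameter declared with the schema name and the extended form of new. The function total and the variable names are hypothetical.

```pascal
PROGRAM schemademo(output);

TYPE s(a, b: integer) = ARRAY [a..b] OF real;

VAR n, i: integer;
    p: ^s;     { the domain type is the schema itself }

{ v adapts to whichever type produced from s is passed. }
FUNCTION total(VAR v: s): real;
  VAR i: integer;
      t: real;
  BEGIN
    t := 0;
    FOR i := v.a TO v.b DO t := t + v[i];
    total := t
  END {total};

BEGIN
  n := 5;
  new(p, 0, n - 1);          { selects the type s(0,4) at run-time }
  FOR i := p^.a TO p^.b DO p^[i] := i;
  writeln(total(p^):8:2);    { 0+1+2+3+4 }
  dispose(p)
END.
```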

A schema can define a family of record types in which a variant is selected by a parameter of the schema. The selection may be decided at run-time, and unlike the form of new inherited from classic Pascal, it does not require a constant selector. As with all schematic types, such records can be local variables, parameters, fields of other records, and so on. The choice of variant produces a specific type, which cannot subsequently be changed; but as with any schema a parameter may be declared with the schema name which will accept as actual arguments any of the produced types. The use of variant records is made safer and more flexible by these arrangements.

 TYPE sub = 1..4;
      rec(m: sub; n: integer) =
        RECORD
          a,b: integer;
          CASE m OF
            1: (f1: real);
            2,3: (f2: string(n));
            4: ( );
        END;
      rec2_20 = rec(2,20);

These definitions show both the selection of a record variant by parameter m and (when m is 2 or 3) how the capacity of string f2 can also be specified by parameter n. The type rec2_20 is one type produced from the schema rec.

 PROCEDURE show_cap (r: rec);
   BEGIN
     IF r.m = 2 THEN writeln(r.f2.capacity);
   END {show_cap};

If an actual parameter of type rec2_20 is passed to procedure show_cap, the value 20 is displayed.

9. Type inquiry

This feature is of use primarily in conjunction with schema types, and allows for example a local work variable to be declared with the same type as an actual parameter, when this type is not known until run-time and may differ from call to call. For example, in procedure show_cap above, a variable v could be declared as

 VAR v: TYPE OF r;

This variable acquires the type of the actual parameter at each activation.
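
A fuller sketch, using the array schema s from the previous section and a hypothetical procedure reverse that needs a work copy of its parameter:

```pascal
PROGRAM inquirydemo(output);

TYPE s(a, b: integer) = ARRAY [a..b] OF real;

VAR x: s(1, 4);
    i: integer;

PROCEDURE reverse(VAR v: s);
  VAR tmp: TYPE OF v;     { same produced type as the actual parameter }
      i: integer;
  BEGIN
    tmp := v;
    FOR i := v.a TO v.b DO v[i] := tmp[v.a + v.b - i]
  END {reverse};

BEGIN
  FOR i := x.a TO x.b DO x[i] := i;
  reverse(x);
  { x now holds 4, 3, 2, 1 }
  FOR i := x.a TO x.b DO write(x[i]:5:1);
  writeln
END.
```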

10. File extensions

Extended Pascal provides a method of binding a variable within the program to an external entity; the most common example is the binding of a file variable to an operating-system file. There is a predeclared record type called BindingType which holds binding information; procedures bind and unbind perform the actions, and a function binding returns the current state of a variable. File binding can be carried out by a sequence of operations which is relatively independent of the environment; some other bindings (such as to a screen image, or a clock) may be available in specific implementations.
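
A sequence of this kind might be sketched as follows. The sketch assumes that the file variable must be declared BINDABLE and that BindingType has the fields name and bound; the file name data.txt is hypothetical, and how names are interpreted depends on the implementation.

```pascal
PROGRAM binddemo(output);

VAR f: BINDABLE text;
    b: BindingType;

BEGIN
  b := binding(f);            { current (unbound) state of f }
  b.name := 'data.txt';       { hypothetical file name }
  bind(f, b);

  b := binding(f);            { ask whether the binding succeeded }
  IF b.bound THEN
    BEGIN
      rewrite(f);
      writeln(f, 'written through a bound file variable');
      unbind(f)
    END
  ELSE
    writeln('could not bind f to data.txt')
END.
```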

Classic Pascal provided only sequential file processing. Extended Pascal adds the capability to extend a sequential file, and also allows file variables to be declared with an index type. Such variables can provide direct access to individual file elements, by specifying an index value. Direct-access files allow updating as well as reading and writing.

This example displays the string which is element i of the file:

 VAR f: FILE [0..9999] OF string(20);
 ...
 SeekRead(f,i); writeln(f^);

11. Mathematical extensions

A complex data type is provided. It is intentionally opaque, to permit implementations to choose the most appropriate representation; there are functions to obtain the real and imaginary parts (a Cartesian view), the magnitude and argument (a polar view), and also to construct a complex value from either pair of inputs. The mathematical operators and functions of classic Pascal can also take complex arguments and return complex results.

 z2 := cos(z1 * 5.5);
 writeln(re(z2),im(z2)); {Cartesian view}
 writeln(abs(z2),arg(z2)); {polar view}

Two exponentiation operators are included. POW raises a value to an integer power, and ** accepts a real exponent. In either case, the left-hand operand can be integer, real or complex. An integer operand of ** (as with the / operator) is cast to real before the operation.
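
The two operators and the complex constructors can be compared in a short sketch such as the one below, which assumes the predefined functions cmplx and polar for the Cartesian and polar constructions.

```pascal
PROGRAM powdemo(output);

VAR z1, z2: complex;

BEGIN
  writeln(2 POW 10:1);          { integer power of an integer: 1024 }
  writeln(2 ** 0.5:8:5);        { 2 is converted to real first }

  z1 := cmplx(3.0, 4.0);        { Cartesian construction: 3+4i }
  writeln(abs(z1):6:2);         { magnitude 5 }

  z2 := z1 POW 2;               { (3+4i) squared = -7+24i }
  writeln(re(z2):8:2, im(z2):8:2);

  z2 := polar(2.0, 0.0);        { polar construction: magnitude 2, argument 0 }
  writeln(re(z2):8:2, im(z2):8:2)
END.
```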

12. Set extensions

A new operator >< is defined, which takes the symmetric difference of two set values; there is a new predefined function card which returns the cardinality of a set (the number of members present); and the FOR statement allows a new form in which the control variable is given in turn the values defined by a set.

 FOR n IN setvalue DO ...
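
The three set extensions can be seen together in the following sketch; the type and variable names are illustrative, and no particular order of iteration is assumed.

```pascal
PROGRAM setdemo(output);

TYPE small = 0..15;

VAR a, b, d: SET OF small;
    n: small;

BEGIN
  a := [1, 2, 3, 4];
  b := [3, 4, 5];
  d := a >< b;                 { members in exactly one of a and b: 1, 2, 5 }
  writeln('card = ', card(d):1);

  { n takes each member of d in turn. }
  FOR n IN d DO write(n:3);
  writeln
END.
```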

13. Date and time

A predeclared record type TimeStamp is defined, which contains fields for year, month, day, hour, minute and second. (It is envisaged that an implementation might add further details such as millisecond or time zone which would be processed transparently by the predefined routines.) A procedure GetTimeStamp sets the current values in a TimeStamp record; functions Date and Time take such a record as a parameter and return strings in display format. This division of tasks allows the display functions to be used independently of system dates:

 VAR ts: TimeStamp;
 ...
 ts.year := 1993;
 ts.month := 1; {=January}
 ts.day := 1;
 writeln(date(ts)); {display in local format}
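
For the more usual case of displaying the current date and time, a sketch along the following lines could be used. It assumes that TimeStamp also carries Boolean fields DateValid and TimeValid indicating whether the system was able to supply each part.

```pascal
PROGRAM nowdemo(output);

VAR ts: TimeStamp;

BEGIN
  GetTimeStamp(ts);    { fill ts from the system clock }

  IF ts.DateValid THEN writeln('Date: ', date(ts))
  ELSE writeln('Date not available');

  IF ts.TimeValid THEN writeln('Time: ', time(ts))
  ELSE writeln('Time not available')
END.
```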

14. Protected variables

Protection may be given to a variable in two contexts. The first is on export of the variable from a module; such a variable can be modified by code within the module, but importers must treat it as read-only. The originator of the module thereby ensures the security of that variable. The second context for protection is in parameter lists. A parameter may be declared to be protected; the code within the procedure or function must then not contain statements which might change the parameter. A caller passing a variable to a protected VAR parameter, for example, knows there is no risk of it being modified, and does not need to make a copy first; in the case of a large structure this may represent a significant saving. Declaring a protected value parameter indicates to an implementation or to the reader of the program text that the actual parameter is "safe" and will not be modified during execution of the procedure.
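
The parameter case might be sketched as follows; the type big and the function sum are hypothetical. The array is passed by reference, so no copy is made, yet the caller's variable cannot be changed by the function.

```pascal
PROGRAM protdemo(output);

CONST size = 1000;

TYPE big = ARRAY [1..size] OF real;

VAR data: big;
    i: integer;

FUNCTION sum(PROTECTED VAR v: big): real;
  VAR i: integer;
      t: real;
  BEGIN
    t := 0;
    FOR i := 1 TO size DO t := t + v[i];
    { An assignment such as v[1] := 0 would be rejected here. }
    sum := t
  END {sum};

BEGIN
  FOR i := 1 TO size DO data[i] := 1;
  writeln(sum(data):8:1)
END.
```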

15. Conclusion

Pascal, at least in its standard form, has the reputation of being a safe but limited language. The purpose of this introduction to the features of Extended Pascal is to show that the range of the language has been greatly increased without compromising its security.

To err is human, and people (even programmers) make mistakes. In the development and maintenance of software, these are expensive if not worse, and the contribution that the programming language can make to the avoidance of mistakes is very significant. Classic Pascal has features which encourage, and indeed sometimes require, a secure programming style; it also encourages readability, which greatly benefits long-term maintenance. Extended Pascal gives much extra flexibility without sacrificing these advantages. Also, any programmer familiar with classic Pascal can adopt the new features of the extended language gradually, achieving a smooth transition as familiarity grows.

Portability across platforms is important to the serious developer, and the use of a standardised language provides an assurance of continuity both vertically across levels of machine and horizontally in time.

To fill one of the most significant gaps in classic Pascal, the language standard provides a framework in which libraries can be developed and distributed. An implementor with proprietary source code can, if he wishes, supply processed interfaces and compiled object files, his code still retaining the advantages of portability. On the other hand, the standard rightly does not set out to specify what individual libraries should contain. Areas such as graphics or numerical computing are essentially language-independent: desirable facilities are specified once and can then be "bound" to different languages, as for example in the emerging set of Language Independent Arithmetic standards. All languages can then benefit from the care and attention given by experts in each particular field.
