Pascal Programming/Files

Ever wondered how to process large amounts of data? Files are the solution in Pascal. You were already acquainted with some basics in the input and output chapter. Here we will elaborate on the details, as far as the ISO standard 7185 “Pascal” defines them. The “Extended Pascal” ISO standard 10206 defines even more features, but these will be covered in the second part of this WikiBook.

File data types
So far we have only been handling text files, i.e. files possessing the data type `text`, but there are more file types.

Concept
Mathematically speaking, a file is a bounded finite sequence. That means,
 * components are oriented along an axis (sequence),
 * component values are chosen from one domain (bounded), and
 * there is a certain number of components present (finite).
To put this in fancy math symbols, the set of all possible files over a domain $M$ is: $ M^{*} = \bigcup_{i=0}^{\infty} M^{i} $

Declaration
In Pascal we can declare file data types by specifying `file of recordType`, where `recordType` needs to be a valid record data type. A permissible record data type can be any data type, except another file data type (including `text`) or a data type containing such. That means an `array` of file data types, or a `record` having a `file` as a component is not permitted.

For example, with a variable of the data type `file of integer` we can access a file containing only one kind of data, `integer` values (the domain restriction). Note, such a variable is not a file by itself. This Pascal variable merely provides us with an abstract “handle”, something that permits us, the `program`, to get a hold of the actual file (as described in § Concept).
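The declaration rules above can be sketched as follows. All identifiers (`integerFile`, `point`, `pointFile`, `F`, `G`) are hypothetical names chosen for this illustration, and the sketch assumes a compiler that accepts ISO 7185 programs.

```pascal
program fileDeclaration(output);
type
    { hypothetical names, chosen for this sketch only }
    integerFile = file of integer;
    point = record
            x, y: real;
        end;
    { a record without file components is a permissible record data type }
    pointFile = file of point;
    { NOT permitted: file of text, file of integerFile,
      or a record that has a file as a component }
var
    F: integerFile;
    G: pointFile;
begin
    { F and G are merely handles; no file has been opened yet. }
    writeLn('declarations accepted');
end.
```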

Modes
All files have a current mode. Upon declaration of a file variable, this mode is, as usual, undefined. In Standard Pascal as defined by the ISO standard 7185 you can choose from either generation or inspection mode.

Generation mode
In order to write to a file you will need to call the standard built-in `procedure` named `rewrite`. `Rewrite` will attempt opening a file for writing from the start. The file immediately becomes empty, hence its name rewrite. Extended Pascal also has the non-destructive `extend`.

Only after successfully opening a file for writing do all write routines become legal. Attempting to write to a file that has not been opened for writing will constitute a fatal error. All parameters to `write` after the file variable have to be of the file’s record data type. There must be at least one. Only if the file is a `text` file, various built-in data types are permitted.

Note that the procedure `writeLn` (and `readLn`) can only be applied to `text` files. Other files do not “know” the notion of lines, therefore these procedures cannot be applied to them.
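As a minimal sketch of generation mode, the following program writes to a `file of integer` and to a `text` file. The variable names `F` and `T` are hypothetical, and it is assumed that the compiler supports file variables that are not bound to an external file (ISO 7185 behavior, for example the GPC).

```pascal
program generationDemo(output);
var
    { hypothetical names; neither file is listed in the program header }
    F: file of integer;
    T: text;
begin
    rewrite(F);                 { F is now empty and in generation mode }
    write(F, 1, 2, 3);          { every parameter after F must be an integer }

    rewrite(T);
    write(T, 42, ' ', 3.5, ' ', true);   { text: various built-in types }
    writeLn(T);                 { legal only because T is a text file }

    writeLn('done');
end.
```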

Inspection mode
In order to read a file you will need to call the standard built-in `procedure` named `reset`. `Reset` will attempt opening a file for reading from the start. Note that after `reset` you cannot write anything to that file anymore. Modes are exclusive: Either you are writing or reading.
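Switching from generation to inspection mode might look like the sketch below. The names `F` and `n` as well as the three values are made up for this illustration; a compiler with full ISO 7185 file support is assumed.

```pascal
program inspectionDemo(output);
var
    F: file of integer;   { hypothetical name }
    n: integer;
begin
    { generation mode: fill the file with some sample values }
    rewrite(F);
    write(F, 10, 20, 30);

    { inspection mode: from now on only reading is legal }
    reset(F);
    while not EOF(F) do
    begin
        read(F, n);
        writeLn(n);
    end;
end.
```

The loop prints the three values in the order they were written, one per line.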

Application
The main and most apparent “advantage” of a `file` might be: Unlike an `array` we do not need to specify a size in advance, in our source code. The `file` can be as large as needed. Yet an `array` can be copied with a simple `:=` assignment. Entire files cannot be copied this way.

The main “disadvantage” of a `file` might be: Access is only sequential. We have to start reading and writing a `file` from the start. If we want to have, say, the 94th record, we need to advance 93 times and also take account of the possibility that there might be fewer than 94 records available.
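To illustrate sequential access, here is a sketch that fetches the 94th record by skipping the 93 records before it. The procedure `readNth` and all other identifiers are hypothetical helpers, not standard routines; the sample data (square numbers) is made up as well.

```pascal
program nthRecord(output);
type
    integerFile = file of integer;
var
    F: integerFile;
    i, x: integer;
    found: Boolean;

{ Reads the n-th record (counting from 1) of F into x.
  found reports whether F contains at least n records. }
procedure readNth(var F: integerFile; n: integer;
    var x: integer; var found: Boolean);
var
    skipped: integer;
begin
    reset(F);
    skipped := 0;
    { advance n - 1 times, unless the file ends earlier }
    while (skipped < n - 1) and not EOF(F) do
    begin
        read(F, x);
        skipped := skipped + 1;
    end;
    found := not EOF(F);
    if found then
    begin
        read(F, x);
    end;
end;

begin
    rewrite(F);
    for i := 1 to 100 do
    begin
        write(F, i * i);
    end;

    readNth(F, 94, x, found);
    if found then
    begin
        writeLn(x);   { the 94th record, here 94 * 94 }
    end
    else
    begin
        writeLn('fewer than 94 records');
    end;
end.
```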

The words advantage and disadvantage were put between quotation marks, because a programming language cannot judge/rate what is “better” or “worse”. It is the programmer’s task to make the assessment. Files are especially suitable for I/O of unpredictable length, for instance user input.

Primitive routines
So far we have been using only `read`/`readLn` and `write`/`writeLn`. These procedures are convenient and perfect for everyday use. However, Pascal also gives you the opportunity to have comparatively “low-level” access to files: `get` and `put`.

Buffer
Every file variable is associated with a buffer. A buffer is a temporary storage space. Everything you read from and write to a `file` passes through this storage space before the actual read or write action is communicated to the OS. Buffered I/O is chosen for performance reasons.

In Pascal we can access one component of the buffer, the “current” one, by appending `^` to the variable name, just as if it was a pointer. The data type of this dereferenced value is the record data type as in our declaration. So if we have a `file of integer` variable `F`, the expression `F^` has the data type `integer`.

To put everything into relation to each other let’s take a look at a diagram. This diagram is about understanding and shows a very specific situation. Focus on the relationships:

The upper part is in the purview of the OS. The lower part is in the purview of the (our) `program`. The data of the file, here a sequence of 16 `integer` values in total, are exclusively managed by the OS. Any access of the data is done via the OS. Directly reading or writing is not possible. We ask the OS to copy the first 4 `integer` data values for us into our buffer. We do so, because copying 4 integers individually is slower than copying them all together in one go.

Sliding window
The three different storage locations – the actual data file, the internal buffer, and the buffer variable – work together in providing us a “view” of the file. If we overlay everything that contains the same information, we get the following image:

Here, the second quartet of integers was loaded into the internal buffer (green background). The file buffer points to the second component of the internal buffer. This is represented by a bluish hue over the sixth component of the entire file. Everything else is shaded, meaning we can view and manipulate only the sixth component.

Advancing the window
This sliding window can be advanced (in the rightwards direction, i.e. in the direction of EOF) with the routines `get` and `put`. Both advance the file buffer to point to the next item in the internal buffer. Once the internal buffer has been completely processed, the next batch of components is loaded or stored. Calling `get` is only legal while a file is in inspection mode; correspondingly, `put` is only legal while a file is in generation mode.

Using the window
`Get` and `put` take one non-optional parameter, a `file` (or `text`) variable. `Put` takes the current contents of the buffer variable and ensures they are written to the actual file.

Let’s see this in action. Consider a `program` that opens a file with `rewrite`, assigns three `integer` values to the buffer variable one after another, but calls `put` only after the first two assignments. Each `put` appends one component to the file and moves the sliding window one position to the right.

Now let’s print the file we just filled with some `integer` values. For a change we use `get`. Like `read`/`readLn`, `get` is only allowed if not `EOF`. Note that this prints just two `integer` values: The third `integer` value, although defined, was not written by a corresponding `put`.
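The scenario just described could be sketched as follows. This is an illustration only: the variable name `F` and the values 123, 456 and 789 are assumptions, and the program relies on full ISO 7185 support for `get`, `put` and buffer variables.

```pascal
program putAndGet(output);
var
    F: file of integer;   { hypothetical name }
begin
    rewrite(F);     { generation mode, F is empty }

    F^ := 123;      { define the buffer variable }
    put(F);         { append 123 to the file, advance the window }

    F^ := 456;
    put(F);         { append 456 to the file, advance the window }

    F^ := 789;      { defined, but no put follows: never written }

    reset(F);       { inspection mode, F^ now holds the first component }
    while not EOF(F) do
    begin
        writeLn(F^);
        get(F);     { advance the window to the next component }
    end;
end.
```

Only 123 and 456 are printed, because only these two values were followed by a `put`.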

Requirements
As mentioned above, `get` may only be called when the specified file is in inspection mode, whereas `put` may only be called when the file is in generation mode. More specifically, calling `get(F)` is only allowed when `EOF(F)` is `false`, and calling `put(F)` is only allowed when `EOF(F)` is `true`. In other words, reading past the EOF is forbidden, while writing has to occur at the EOF.

After successfully calling `rewrite` (or the EP `extend`) the value of `EOF` becomes `true`. Any subsequent `put` does not alter this value. After calling `reset` the value of `EOF` depends on whether the given file is empty. Any subsequent `get` may change this value from `false` to `true` (never in the reverse direction).
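These rules can be observed with a small sketch. The variable name `F` and the value 7 are arbitrary choices; how a `Boolean` value is printed (spelling, field width) depends on the compiler, so the comments state the truth value only.

```pascal
program eofStates(output);
var
    F: file of integer;   { hypothetical name }
begin
    rewrite(F);
    writeLn(EOF(F));   { true: generation mode always sits at the end }

    F^ := 7;
    put(F);
    writeLn(EOF(F));   { still true: put does not alter EOF }

    reset(F);
    writeLn(EOF(F));   { false: the file contains one component }

    get(F);
    writeLn(EOF(F));   { true: the window moved past the last component }
end.
```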

`text` buffer
The buffer value of a `text` file has some special behavior. A `text` file is essentially a `file of char`. Everything presented in this chapter can be applied to a `text` file just as if it was a `file of char`. However, as repeatedly emphasized, a `text` file is structured into lines, each line consisting of a (possibly empty) sequence of `char` values.

When `EOLn(F)` becomes `true` for a `text` variable `F`, the buffer variable `F^` returns a space character. Thus when using buffer variables the only way to distinguish between a space character as part of a line, and a space character terminating a line is to call the function `EOLn`.
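A short sketch shows how `EOLn` tells the two kinds of space characters apart. The variable name `T`, the sample line and the printed markers are made up for this illustration.

```pascal
program eolnDemo(output);
var
    T: text;   { hypothetical name }
begin
    rewrite(T);
    writeLn(T, 'a b');   { one line containing a genuine space }

    reset(T);
    while not EOF(T) do
    begin
        if EOLn(T) then
        begin
            { T^ is a space here, too, but it marks the end of the line }
            writeLn('<end of line>');
        end
        else if T^ = ' ' then
        begin
            writeLn('<space>');
        end
        else
        begin
            writeLn(T^);
        end;
        get(T);
    end;
end.
```

This prints `a`, `<space>`, `b` and `<end of line>`, each on a line of its own.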

Rationale: Various operating systems employ different methods of marking the end of a line. It has to be marked somehow, because this information cannot be magically deduced out of nowhere. However, there are multiple strategies out there. This is really inconvenient for the programmer who cannot take account of everything. Pascal has therefore chosen that, regardless of the specific EOL marker used, the buffer variable contains a simple space character at the end of a line. This is predictable, and predictable behavior is good.

Purpose
It is worth noting that all functionality of `read`/`readLn` and `write`/`writeLn` can at its heart be based on `get` and `put` respectively. Here are some basic relationships:

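If `F` refers to a `file of recordType` variable and `x` is a `recordType` variable, `read(F, x)` is equivalent to `begin x := F^; get(F) end`. Similarly, `write(F, x)` is equivalent to `begin F^ := x; put(F) end`.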
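As a sketch of these two relationships, the following program defines its own replacements for `read` and `write` on a `file of integer`. The names `myRead` and `myWrite` are hypothetical and exist only in this example.

```pascal
program readWriteViaBuffer(output);
type
    integerFile = file of integer;
var
    F: integerFile;
    n: integer;

{ behaves like write(F, x) }
procedure myWrite(var F: integerFile; x: integer);
begin
    F^ := x;
    put(F);
end;

{ behaves like read(F, x) }
procedure myRead(var F: integerFile; var x: integer);
begin
    x := F^;
    get(F);
end;

begin
    rewrite(F);
    myWrite(F, 5);
    myWrite(F, 6);

    reset(F);
    while not EOF(F) do
    begin
        myRead(F, n);
        writeLn(n);
    end;
end.
```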

For `text` variables the relationships are not as straightforward. The behavior depends on the various destination/source variables’ data types. Nonetheless, one simple relationship is, if `F` refers to a `text` variable, `readLn(F)` is equivalent to `begin while not EOLn(F) do get(F); get(F) end`. The latter `get` actually “consumes” the newline marker.
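The `readLn` relationship can be tried out with a sketch like the one below. The procedure name `skipLine`, the variable `T` and the two sample lines are assumptions made for this example.

```pascal
program readLnViaGet(output);
var
    T: text;   { hypothetical name }

{ behaves like readLn(T) }
procedure skipLine(var T: text);
begin
    while not EOLn(T) do
    begin
        get(T);
    end;
    get(T);   { consumes the end-of-line marker }
end;

begin
    rewrite(T);
    writeLn(T, 'first line');
    writeLn(T, 'second line');

    reset(T);
    skipLine(T);   { discard everything up to and including the first EOL }

    { copy the remaining line to output, character by character }
    while not EOLn(T) do
    begin
        write(T^);
        get(T);
    end;
    writeLn;
end.
```

Only the text of the second line is printed, since `skipLine` discards the first one.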

Support
Unfortunately, of the compilers presented in the opening chapter, Delphi and the FPC do not support all ISO 7185 functionality. Rest assured, everything works fine if you are using the GPC. The authors cannot make a statement regarding the Pascal‑P compiler since they have not tested it.
 * Delphi and the FPC require files to be explicitly associated with file names before performing any operations. It is required to back any kind of `file` by a file in background memory (e.g. on disk). How this works will be explained in the second part of this book, since ISO standard 10206 “Extended Pascal” defines some means for that, too.
 * The FPC provides the procedures `get` and `put`, and file variable buffers only in `{$mode ISO}` or `{$mode extendedPascal}`. Delphi does not support this at all.