Clipper Tutorial: a Guide to Open Source Clipper(s)/Working With Databases

Working with Databases
Let's return to the Wikipedia entry Database application.

Different kinds of database applications exist as well. If you did store your friend's phone numbers and addresses into a word processor, you would have what someone calls a Free-Form Database (however, a similar expression is an oxymoron in computer science) - myBase®, askSam®, Personal Knowbase®, MyInfo®, Info Select®, and GeneralKB® are a bunch of specialized free-form database application, which actually means PIM (Personal information manager). Now, a word processor lets us search the information, but other operations, such as sorting them in alphabetical or numerical order, cannot usually be done automatically by a word processor.

What about attempting to store it into a spreadsheet? We may use one column for the name, one for the surname, one for the telephone number, one for the city. This quick database, stored in a spreadsheet, may be searched and sorted: for example, we can sort it by city and person's name in alphabetical order. This is a flat database, http://www2.research.att.com/~gsf/man/man1/cql.html: a flat database is a sequence of newline terminated records of delimiter separated fields, and a spreadsheet shows its limits in data entry and reporting (if you did want to use the data in your table to print out addresses on envelopes a spreadsheet is not a good tool). An example is MyDatabase (http://www.pcmag.com/article2/0,2817,760833,00.asp).

Spreadsheets are much better to do accounting: how much harder a book-keeper's work would be if his data were stored in a wordprocessing program? The purpose here is to have our data structured in a certain way: all the costs in a place, all earnings in another.

Before 1970 complex databases were managed using Hierarchical Databases (very little information is needed about them - see for example http://www.extropia.com/tutorials/sql/hierarchical_databases.html and http://people.cs.pitt.edu/~chang/156/14hier.html). An example of a hierarchical database is IBM IMS (Information Management System, which was developed during the mid-1960s for applications in the aerospace industry). Their implementation is based on trees, a hierarchical data structure. Hierarchical Databases and Network Databases together form what today are referred to as Legacy Database Systems. Network databases were born as an extension to the programming language COBOL by the Conference on Data Systems Languages (CODASYL). The network data model is based on data structures called graphs.

Today's standard is the Relational Database (RDBMS), which is "a database with relationships between more than one table of records based on common fields". We will speak of them in some detail, but we will briefly mention the fourth approach: Object Oriented Databases. These databases store objects (in the same sense the word is used in the expression object-oriented programming). They're not much used, mostly because objects are more complex than the simple fields a relational database stores in its tables. More information on the topic at http://www.odbms.org/odmg-standard/.

The Wikipedia entry about DBase reads: «dBase is application development language and integrated navigational database management system which Ashton-Tate labeled as "relational" but it did not meet the criteria defined by Dr. Edgar F. Codd's relational model». Codd's criteria (the so-called 12 rules, which really are 13 because the rule numbered '0' actually exists) are so strict that in practice a true relational database system does not even exist, but the point is that dBase accessed databases in another way, so that it's considered a Navigational Database (which works in a way that simulates relational databases).

http://www.databasedev.co.uk/design_basics.html

DBF Files in Other Languages
Because of the great success of dBase and its, the DBF file format became an industry standard. Many other database programs have used them to store data, like Lotus Approach. We also have many little utilities to view and convert to other formats these files. Here is a bunch of URLs: https://dbfview.com/, http://www.alexnolan.net/software/dbf.htm, https://dbfviewer.com/en/, https://www.dbf2002.com/, http://www.whitetown.com/dbf2sql/ («DBF to SQL Converter allows you to convert your dbf files to SQL script. Personal license $29.95», but compare https://www.vlsoftware.net/exportizer/). And is so widely used that interfaces for working with it are available for various languages, for example:


 * C++
 * https://sourceforge.net/projects/xdb/, http://linux.techass.com/projects/xdb/xbasedocs/xbase_c1.html and http://linux.techass.com/projects/xdb/ Xbase (formerly known as xdb, also formerly known as xBase) is a collection of specifications, programs, utilities and a C++ class library for manipulating Xbase type datafiles and indices.
 * Python
 * https://sourceforge.net/projects/xbase-py/ A Python interface for managing dbf files
 * Java
 * https://sourceforge.net/projects/xbasej/ xBaseJ - xBase Engine for Java
 * Links for PHP interface:
 * http://www.ira.inaf.it/manuals/php/ref.dbase.html
 * http://php.net/manual/en/ref.dbase.php
 * http://www.nusphere.com/kb/phpmanual/ref.dbase.htm?
 * https://sourceforge.net/projects/sfdbase/ a SQL parser for dbase

Well, now we will see how to work with DBF files the way it was intended.

A verbose way
The code above created a DBF file, names.dbf, to be used by the following code. It will add a record to the DBF file. It is equivalent to the "First Sample Program" of my old PC GUIDE, which missed a line that is necessary in modern xBase. The CLOSE command is equivalent to the dbCloseArea function, which closes a work area: Pending updates are written, pending locks are released.

A more concise way
The short code below does the same work of the two pieces of code of the previous section (it only produces a different file name, namesdb.dbf instead of names.dbf).

This example uses the alias operator, ->. http://www.ousob.com/ng/clguide/ngcf412.php

The alias->field_name notation is used to allow access to fields of databases that are loaded but not active. The alias can be specified with the work area number (e.g. 2->std_id), with the work area alias (e.g. B->std_id), or with the database name (e.g. STUDENTS->std_id).

The result of this code is a file named namesdb.dbf. Informations about how DBF files are can be find at DBF File structure, http://www.dbf2002.com/dbf-file-format.html, where we find this list of Field type:
 * C  –   Character
 * Y  –   Currency
 * N  –   Numeric
 * F  –   Float
 * D  –   Date
 * T  –   DateTime
 * B  –   Double
 * I  –   Integer
 * L  –   Logical
 * M  – Memo
 * G  – General
 * C  –   Character (binary)
 * M  –   Memo (binary)
 * P  –   Picture
 * +  –   Autoincrement (dBase Level 7)
 * O  –   Double (dBase Level 7)
 * @  –   Timestamp (dBase Level 7)

My PC GUIDE showed how a .dbf file is made with the DataBase Utility DBU. Clones of this utility are FiveDBU (with source code) at https://code.google.com/archive/p/fivewin-contributions/downloads, DBF Viewer Plus at http://www.alexnolan.net/software/dbf.htm, CLUT at http://www.scovetta.com/archives/simtelnet/msdos/clipper. Harbour includes its own HbDBU (the source is in \hb32\addons\hbdbu ) and a component IdeDBU of HbIDE (the other two components are IdeEDITOR and IdeREPORTS).

From https://code.google.com/archive/p/fivewin-contributions/downloads we can get fivedbu_20130930.zip (New FiveDBU version with enhancements on ADO fields editing). It supports ADO, 3 RDD (DBFNTX, CBFCDX and RDDADS) and 6 languages - select "Bases de datos -> Preferencias -> Lenguaje: Inglés" to have it in English.

Let us see what is in our little file so far.

Database Design Issue: the First Normal Form (1NF)
The work done in the previous section was intended to exactly reproduce the database presented in my PC GUIDE. There are, however, drawbacks: having only one NAME field, this database cannot sort its data on the last name. Also, a careless user might insert the data of some people with the last name first, and some other data with the first name last. When designing a database precautions should be taken of these possibilities. The first normal form (http://www.1keydata.com/database-normalization/first-normal-form-1nf.php, http://www.sqa.org.uk/e-learning/SoftDevRDS02CD/page_14.htm) requires you to define fields whose information cannot be divided into smaller parts. So, instead of a NAME field, we should have a FIRST_NAME and LAST_NAME fields. Complying to the first normal form, our little database would be on the on the right track to being a normalized database.

Designing the database is an essential part of the work and it is not always obvious how it should be done. See https://www.ntu.edu.sg/home/ehchua/programming/sql/relational_database_design.html.

A graphical device used to help is database design are the Entity-Relationship Diagrams: https://www.lucidchart.com/pages/er-diagrams, https://www.guru99.com/er-diagram-tutorial-dbms.html.

Complicating our simple database
Harbour contains a file named test.dbf. Launch hbrun in its directory and type in

use test browse

At this point, we see that it is a 500 record table. Move around with the cursor keys and, when you're finished, punch the Esc key to quit this interactive table browser and editor. To get the record number of a person called Ted issue:

locate for first="Ted" ? recno

Something More Elaborate
We may now try the use of hbrun interactively to solve an exercise from an ECDL book. ECDL (European Computer Driving Licence, https://www.findcourses.co.uk/inspiration/articles/why-does-ecdl-matter-17580) is a qualification to show that someone can use a computer with different degrees of proficiency. When I got it, many (many) years ago the tasks for the database module very extremely easy. It was the 5th module the one that dealt with databases. I bought a small book and rereading it I found inspiring the 2nd exercise in that module: create a table to manage a movie library. I've done this in Harbour 3.0 under ReactOS 0.4.13, using hbrun almost interactively - which gives it an enjoyable dBase feel. However it is necessary to use some auxiliary files, as some commands are quite difficult - or impossible - to type on a single line. That is not bad, we'll see how to run a piece of code stored in a file from the prompt.

 Point 1 . Turn on the computer and open the database manager - as absurd as it seems, the first step of every exercise in my workbook is to turn on the computer.

The first command to issue will set the window to 25×80 characters, so that it looks good on screen and nothing is out of sight.

At this point we should see more or less this (probably your screen won't be in Italian):



The first line shows the last command hbrun executed, setmode(25,80), and the bottom line features the legendary Dot Prompt.

 Point 2 . Create a table to manage a movie library with the following fields:


 * movie code,
 * movie title,
 * genre,
 * year of production,
 * leading actor,
 * price in euro.

We can issue all the following commands at the Dot Prompt to create a database with this structure:

or we can store them in a file, for example creadb.prg and invoke from the command line using:

Now we can check what is in our working directory:

this command will show the name of all databases in the current directory. We should have something like:

Database Files    # Records    Last Update     Size videolib.dbf              0    09/15/21         227

and begin using out database:

the first two lines should now read:

 Point 3 . Create a form to enter data - this we will do later. We will see how it is done: the @...GET command places the input fields on the screen and allow users to enter data. A typical procedure looks like this:

 Point 4 . Fill the table with at least ten records with different data.

We will get our sample data from the CSV file "movies.txt" shown below using the command

1,"Jurassic Park","azione",1993,"Jeff Goldblum",35.99 2,"Jumanji","avventura",1995,"Robin Williams",8.49 3,"Navigator","fantascienza",1986,"Joey Cramer",11.39 4,"Mortal Kombat","azione",1995,"Christopher Lambert",10 5,"Karate Kid 4","azione",1994,"Hilary Swank",4.9 6,"Ritorno al futuro","fantascienza",1985,"Michael J. Fox",6.95 7,"2001: Odissea nello Spazio","fantascienza",1968,"Keir Dullea",7.9 8,"Il pianeta proibito","fantascienza",1956,"Leslie Nielsen",9.99 9,"Interstellar","fantascienza",2014,"Matthew McConaughey",6.95 10,"Prometheus","fantascienza",2012,"Michael Fassbender",7

We may check the data currently in our database using:

Now we can make some small changes to our sample data. For example the data of the Genre field are in Italian, but we would like to have them in English. The commands to do the substitutions are surprisingly simple: .replace genre with "science fiction" for genre="fantascienza" .replace genre with "action" for genre="azione" .replace genre with "adventure" for genre="avventura"

In the same way we can translate the Italian movie titles with their English equivalents ("Il pianeta proibito" is "The Forbidden Planet", "2001: Odissea nello spazio" is "2001: A Space Odyssey" - but that was easy to guess -, "Ritorno al futuro" is "Back to the Future").

The movieid field is a simple number, and that's ok but, to spice things up, let's try to change it with this command:

At this point we can already try and extract the list of films made before the year 2000:

to see how to enter multiple commands at the Dot Prompt, just separate them with a semicolon. Here we can see also how a dBase command is built:

list - a verb title, leadactor, year - an expression list for year < 2000 - a condition

However, those three question marks to make some space to see all the records (the first three lines of hbrun are occupied by other information) are just ugly. Next time we will use setpos.

Now that we think about it the movie code field, Movieid, in record 7 consists only of numbers and in record 8 contains a space, which is just as ugly. We can correct it interactively by entering at the dot prompt:

and replace those fields with "Odys68" and "Forb56".

Since we are making corrections, I just remembered that I gave away 2001: A Space Odyssey in change of a better movie, and so we may as well delete it:

Now we can browse again and add a new record by hand: "Miss86", "Mission", "history,drama", "1986", "Jeremy Irons", "6.95".

 Point 5 . Sort the table by film title.

Again,  will permit us to make sure things actually changed in the way the file appears.

 Point 6 . Query the database for films made before the year 2000

We’ve already done this, but this time we have the index on the field Title active:

What else we can do? We can also also compute the average age of our movies (in years):

(maybe i should watch more recent movies...), or see how much we spent on our movie collection:

 Point 7 . Create a report named "Movie List" with the records of the films in order of genre and title and containing the fields code, genre, title, price in euro.

The DOS dBase had its own Report Generator and Clipper provided an utility, RL.EXE, to create reports and labels. Those programs create a format file used to generate the report, in those ancient times when reports were printed on continuous form paper using dot matrix printers. But how can we create a report today? How to preview it onscreen before sending it to the printer? Probably the most obvious options are to create a PDF or HTML file (which will then need a stylesheet to be watchable, otherwise it'll be too insipid). Since I'm using a vanilla Harbour 3.0 to start we will follow the simplest way: generate an HTML file. We will try a tabular report (this code needs to be corrected, as the HTML it produces although can be interpreted, cannot be validated as many closing tags are missing):

This is not much to look at, but a CSS will improve its appearance considerably. Even a messy CSS (this too needs a clean up) like this:

can get us a colorful printed report that does not look dull - albeit unprofessional:



 Point 8 . Print the "Movie List" report.

 Point 9 . Close the database and the program. In hbrun these two things are done with these two commands, respectively:

What have we left within our working directory? reports the following:

iReport ntx     2048  09/15/21 movies  txt      605  09/15/21 report  prg      414  09/15/21 rMovie  htm     1290  09/15/21 styles  css      743  09/15/21 titlesor ntx    2048  09/15/21 videolib dbf    1287  09/15/21

Note 1. Using HTML for output is fun, but it is not a good idea, as often a report will be longer than one page of paper and an HTML+CSS table will not handle properly such multiple pages output. Creating a PDF of the report to print, using either libHaru (http://libharu.org/) using the bindings of contrib\hbhpdf (or extras\hbvpdf) would be a much better choice.

Note 2. This example is really basic. We could do just about everything using the Dot Prompt in less than 20 minutes. ECDL has changed a lot during the last few lustra. Now I'm giving a look at the ECDL Advanced Database Syllabus 2.0. It looks good, with some interesting exercises.

Workareas: more than one table at once
Let us for a moment think back at the old times when commonly used computers had no mouse and GUI. Nevertheless, if we wanted to use several tables for a task, we had to use some way to tell the computer which tables he should consider. Let us take for example a library. Oversimplifying, we need to use at least three tables to manage it (one for the books, one for the customers and one for the loans).

We might issue these commands:

SELECT 1 USE Books SELECT 2 USE Customers SELECT 3 USE Loans SELECT Books

We might visualize the result as the three windows in the picture below:



Here the command SELECT works like clicking on a window to activate it, and the workareas themselves look like windows having a number (the workarea number) and a name (the workarea alias).

Testdbf.prg
Here is the testdbf.prg source from \hb30\tests. It should be discussed in detail. It is a GPL piece of code poorly commented.

Input Mask
A simple data base input mask (from the Wikipedia Clipper entry):

RDDs: What Functions Are Available?
http://harbourlanguage.blogspot.de/2010/06/understanding-harbour-rdd.html

ADO RDD: Much Ado About Nothing
https://searchsqlserver.techtarget.com/definition/ActiveX-Data-Objects http://cch4clipper.blogspot.com/2009/10/using-adordd-with-harbourxharbour.html

Case Study: Checkbook Balancing
This is CHECKD.PRG from x2c-base.zip\Tutor.zip:

Deleting records
In this piece of code, the command DELETE marks the fourth record for deletion. But the file is not altered, not even by a CLOSE command. The PACK command actually removes the records marked for deletion (and also makes some additional work). The RECALL command removes the deleted flags. The function DELETED returns .T. if the current record is marked for deletion, .F. if not.

The PACK command, which does the actual deletion of data from the table, '''PACK requires that the current database be USEd EXCLUSIVEly. If this condition is not met when the PACK command is invoked, CA-Clipper generates a runtime error'''. Additional work that PACK does is to update indexes on the table it alters (if any).

The commands DELETE ALL and PACK are executed by a single command called ZAP.

An Indexed Example
Suppose a table containing these data:

FSTNAME	LSTNAME John  	Doe John  	Doe John  	Doe Jane  	Doe

We can create a little index file with this piece of code: