Structured Query Language/Create Table

One of the basic steps during database development cycles is the fixing of decisions about the table structure. To do so, there is the CREATE TABLE statement with which developers define tables together with their columns and constraints.

Because a lot of features may be activated by the command, its syntax is a little bit complex. This page shows the most important parts. The syntax is not straight forward. At some points it is possible to use alternative formulations to express the same purpose, e.g. the Primary Key may be defined within the column definition as a column constraint, at the end of the command as a table constraint or as a separate stand-alone command 'ALTER TABLE ADD CONSTRAINT ...;'.

General Description
After the introductory key words CREATE TABLE, the tablename is specified. Within a pair of parentheses, a list of column definitions follows. Each column is defined by its name, data type, an optional default value, and optional constraints for this individual column.

After the list of column definitions, developers can specify table constraints like Primary and Foreign Keys, Unique conditions, and general column conditions.

An first example was shown at the page Create a simple Table and a second one here:

The table consists of 4 columns. All of them have a data type and some a default value. The column id acts as the Primary Key. The table constraint test_check guarantees that part_name is mandatory if part_number is recorded.

Data Type
The standard defines a lot of predefined data types: character strings of fixed and variable size, character large objects (CLOB), binary strings of fixed and variable size, binary large objects (BLOB), numeric, boolean, datetime, interval, xml. Beyond, there are complex types like: ROW, REF(erence), ARRAY, MULTISET and user-definded types (UDT). The predefined data types are explained on the next page. To keep things simple, we use on this page only CHAR, VARCHAR, and DECIMAL.

Default Value
A column can have a default value. Its data type corresponds to the type of the column. It may be a constant value like the number -1 or the string 'n/a', or it is a system variable or a function call to determine dynamic values like the username or the actual timestamp.

The default clause affects those INSERT and MERGE commands, which do not specify the column. In our example database the person table has the column weight with the default value 0. If we omit this column in an INSERT command, the DBMS will store the value 0.

Identity Specification
The identity specification serves for the generation of a series of unique values that act as the Primary Key to the table's rows. The standard defines the syntax as: "GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY". Unfortunately, most DBMS vendors do not support this formulation. Instead, they offer different syntaxes and even different concepts to generate primary key values. Some use a combination of generators/sequences and triggers, others a special data type, or different keywords.

An overview about the wide spread of implementations is available in the wikibook SQL Dialects Reference: Auto-increment_column.

Column Constraint
The column constraint clause specifies conditions which all values must meet. There are different column constraint types:
 * NOT NULL
 * Primary Key
 * Unique
 * Foreign Key
 * Check values

The NOT NULL phrase defines, that it is not allowed to store the NULL value in the column.

The PRIMARY KEY phrase defines that the column acts as the Primary Key of the table. This implies that the column is not allowed to store a NULL value and that the values of all rows are distinct from each other.

The UNIQUE constraint has a similar meaning as the PRIMARY KEY phrase. But there are two slight differences.

First, the values of different rows of a UNIQUE column are not allowed to be equal, which is the same as with PK. But they are allowed to hold the NULL value, which is different from Primary Key. The existence of NULL values has an implication. As the term null = null never evaluates to true (it evaluates to unknown) there may exist multiple rows with the NULL value in a column which is defined to be UNIQUE.

Second, only one Primary Key definition per table is allowed. In contrast, there may be many UNIQUE constraints (on different columns).

The FOREIGN KEY condition defines that the column can hold only those values, which are also stored in a different column of (the same or) another table. This different column has to be UNIQUE or a Primary Key, whereas the values of the foreign key column itself may hold identical values for multiple rows. The consequence is that one cannot create a row with a certain value in this column before there is a row with exactly this certain value in the referred table. In our example database, we have a contact table whose column person_id refers to the id of persons. It makes sense that one cannot store contact values before storing the appropriate person.

Foreign Keys are the technique to realize one-to-many (1:m) relationships.

Column checks inspect the values of the column to see whether they meet the defined criterion. Within such column checks, only the actual column is visible. If a condition covers two or more columns (e.g., col_1 > col_2) a table check must be used.

Table Constraint
Table constraints define rules which are mandatory for the table as a whole. Their semantic and syntax overlaps partially with the previous shown column constraints.

Table constraints are defined after the definition of all columns. The syntax starts with the keyword CONSTRAINT, followed by an optional name. The following example includes the optional names t6_pk, t6_ik and t6_fk. It is a good practice to include names. In the case of an error exception, most DBMS will include this name as part of related error messages - if a name isn't defined, the DBMS may use its internal naming convention, which can be cryptic.

Primary Key, UNIQUE and Foreign Key
In the same manner as shown in the column constraints part Primary Key, UNIQUE and Foreign Key conditions can be expressed as table constraints. The syntax differs slightly from the column constraint syntax; the semantic is identical.

NOT NULL and Simple Column Checks
Similar to column constraints part NOT NULL conditions and simple column checks can be expressed as table expressions.

General Column Checks
If a condition affects more than one column, it must be expressed as a table constraint.

Column Constraints vs. Table Constraints
As you have seen, some constraints may be defined as part of the column definition, which is called a column constraint, or as a separate table constraint. Table constraints have two advantages. First, they are a little bit more powerful.

Second, they do have their own name! This helps to understand system messages. Furthermore, it opens the possibility to manage constraints after the table exists and contains data. The ALTER TABLE statement can deactivate, activate, or delete constraints. To do so, you have to know their name.

Exercises
Create a table 'company' with columns 'id' (numeric, primary key), 'name' (strings of variable size up to 200), 'isin' (strings of length 12, not nullable, unique values). Create a solution with column constraints only and another one with table constraints only.

Create a table 'accessory' with columns 'id' (numeric, primary key), 'name' (strings of variable size up to 200, unique), 'hobby_id' (decimal, not nullable, foreign key to column 'id' of table 'hobby'). Create a solution with column constraints only and another one with table constraints only.