One of the most important features that a DBMS must provide is a way of maintaining the integrity of data. The constraints, which are used to check the modification or insertion of data, are called integrity constraints. The task of maintaining integrity constraints can be handled by the user in application programs or by the DBMS.
The most important benefits of handling integrity constraints by the DBMS are the following:
- Increased reliability of data
- Reduced programming time
- Simple maintenance
Using the DBMS to define integrity constraints increases the reliability of data because there is no possibility that the integrity constraints can be forgotten by a programmer. (If an integrity constraint is handled by application programs, all programs concerning the constraint must include the corresponding code. If the code is omitted in one application program, the consistency of data is compromised.)
An integrity constraint not handled by the DBMS must be defined in every application program that uses the data involved in the constraint. In contrast, the same integrity constraint must be defined only once if it is to be handled by the DBMS. Additionally, application-enforced constraints are usually more complex to code than are database enforced constraints.
If an integrity constraint is handled by the DBMS, the modification of the structure of the constraint must be handled only once, in the DBMS. The modification of a structure in application programs requires the modification of every program that involves the corresponding code.
There are two groups of integrity constraints handled by a DBMS:
- Declarative integrity constraints
- Procedural integrity constraints that are handled by triggers (for the definition of triggers, see Chapter “Concurrency Controls”)
The declarative constraints are defined using the DDL statements CREATE TABLE and ALTER TABLE. They can be column-level constraints or table-level constraints. Column-level constraints, together with the data type and other column properties, are placed within the declaration of the column, while table-level constraints are always defined at the end of the CREATE TABLE or ALTER TABLE statement, after the definition of all columns.
Each declarative constraint has a name. The name of the constraint can be explicitly assigned using the CONSTRAINT option in the CREATE TABLE statement or the ALTER TABLE statement. If the CONSTRAINT option is omitted, the Database Engine assigns an implicit name for the constraint.
All declarative constraints can be categorized into several groups:
- DEFAULT clause
- UNIQUE clause
- PRIMARY KEY clause
- CHECK clause
- FOREIGN KEY clause and referential integrity
The definition of the default value using the DEFAULT clause was shown earlier in this chapter (see also Example 5.6). All other constraints are described in detail in the following sections.
The UNIQUE Clause
Sometimes more than one column or group of columns of the table have unique values and therefore can be used as the primary key. All columns or groups of columns that qualify to be primary keys are called candidate keys. Each candidate key is defined using the UNIQUE clause in the CREATE TABLE or the ALTER TABLE statement.
The UNIQUE clause has the following form:
[CONSTRAINT c_name]
UNIQUE [CLUSTERED | NONCLUSTERED] ({ col_name1} ,...)
The CONSTRAINT option in the UNIQUE clause assigns an explicit name to the candidate key. The option CLUSTERED or NONCLUSTERED relates to the fact that the Database Engine always generates an index for each candidate key of a table. The index can be clustered—that is, the physical order of rows is specified using the indexed order of the column values. If the order is not specified, the index is nonclustered (see also Chapter on Indices). The default value is NONCLUSTERED. col_name1 is a column name that builds the candidate key. (The maximum number of columns per candidate key is 16.)
Example 5.6 shows the use of the UNIQUE clause. (You have to drop the projects table, via DROP TABLE projects, before you execute the following example.)
USE sample;
CREATE TABLE projects (project_no CHAR(4) DEFAULT 'p1',
project_name CHAR(15) NOT NULL,
budget FLOAT NULL
CONSTRAINT unique_no UNIQUE (project_no));
Code language: PHP (php)
Each value of the project_no column of the projects table is unique, including the NULL value. (Just as with any other value with a UNIQUE constraint, if NULL values are allowed on a corresponding column, there can be at most one row with the NULL value for that particular column.) If an existing value should be inserted into the column project_no, the system rejects it. The explicit name of the constraint that is defined in Example 5.6 is unique_no.
The PRIMARY KEY Clause
The primary key of a table is a column or group of columns whose value is different in every row. Each primary key is defined using the PRIMARY KEY clause in the CREATE TABLE or the ALTER TABLE statement.
The PRIMARY KEY clause has the following form:
[CONSTRAINT c_name]
PRIMARY KEY [CLUSTERED | NONCLUSTERED] ({col_name1} ,...)
All options of the PRIMARY KEY clause have the same meaning as the corresponding options with the same name in the UNIQUE clause. In contrast to UNIQUE, the PRIMARY KEY column must be NOT NULL, and its default value is CLUSTERED.
Example 5.7 shows the specification of the primary key for the employee table of the sample database.
USE sample;
CREATE TABLE employee (emp_no INTEGER NOT NULL,
emp_fname CHAR(20) NOT NULL,
emp_lname CHAR(20) NOT NULL,
dept_no CHAR(4) NULL,
CONSTRAINT prim_empl PRIMARY KEY (emp_no));
Code language: PHP (php)
The employee table is re-created and its primary key is defined in Example 5.7. The primary key of the table is specified using the declarative integrity constraint named prim_empl. This integrity constraint is a table-level constraint, because it is specified after the definition of all columns of the employee table.
Example 5.8 is equivalent to Example 5.7, except for the specification of the primary key of the employee table as a column-level constraint.
USE sample;
CREATE TABLE employee
(emp_no INTEGER NOT NULL CONSTRAINT prim_empl PRIMARY KEY,
emp_fname CHAR(20) NOT NULL,
emp_lname CHAR(20) NOT NULL,
dept_no CHAR(4) NULL);
Code language: PHP (php)
In Example 5.8, the PRIMARY KEY clause belongs to the declaration of the corresponding column, together with its data type and nullability. For this reason, it is called a column-level constraint.
The CHECK Clause
The check constraint specifies conditions for the data inserted into a column. Each row inserted into a table or each value updating the value of the column must meet these conditions. The CHECK clause is used to specify check constraints. This clause can be defined in the CREATE TABLE or ALTER TABLE statement. The syntax of the CHECK clause is
[CONSTRAINT c_name]
CHECK [NOT FOR REPLICATION] expression
Code language: CSS (css)
expression must evaluate to a Boolean value (true or false) and can reference any columns in the current table (or just the current column if specified as a column level constraint), but no other tables. The CHECK clause is not enforced during a replication of the data if the option NOT FOR REPLICATION exists. (A database, or a part of it, is said to be replicated if it is stored at more than one site. Replication can be used to enhance the availability of data. Chapter “Query Optimizer” describes data replication.)
Example 5.9 shows how the CHECK clause can be used.
USE sample;
CREATE TABLE customer
(cust_no INTEGER NOT NULL,
cust_group CHAR(3) NULL,
CHECK (cust_group IN ('c1', 'c2', 'c10')));
Code language: PHP (php)
The customer table that is created in Example 5.9 contains the cust_group column with the corresponding check constraint. The database system returns an error if the cust_group column, after a modification of its existing values or after the insertion of a new row, would contain a value different from the values in the set (‘c1’, ‘c2’, ‘c10’).
The FOREIGN KEY Clause
A foreign key is a column or group of columns in one table that contains values that match the primary key values in the same or another table. Each foreign key is defined using the FOREIGN KEY clause combined with the REFERENCES clause.
The FOREIGN KEY clause has the following form:
[CONSTRAINT c_name]
[[FOREIGN KEY] ({col_name1} ,...)]
REFERENCES table_name ({col_name2},...)
[ON DELETE {NO ACTION| CASCADE | SET NULL | SET DEFAULT}]
[ON UPDATE {NO ACTION | CASCADE | SET NULL | SET DEFAULT}]
Code language: PHP (php)
The FOREIGN KEY clause defines all columns explicitly that belong to the foreign key. The REFERENCES clause specifies the table name with all columns that build the corresponding primary key. The number and the data types of the columns in the FOREIGN KEY clause must match the number and the corresponding data types of columns in the REFERENCES clause (and, of course, both of these must match the number and data types of the columns in the primary key of the referenced table).
The table that contains the foreign key is called the referencing table, and the table that contains the corresponding primary key is called the parent table or referenced table. Example 5.10 shows the specification of the foreign key in the works_on table of the sample database.
USE sample;
CREATE TABLE works_on
(emp_no INTEGER NOT NULL,
project_no CHAR(4) NOT NULL,
job CHAR (15) NULL,
enter_date DATE NULL,
CONSTRAINT prim_works PRIMARY KEY(emp_no, project_no),
CONSTRAINT foreign_works FOREIGN KEY(emp_no)
REFERENCES employee (emp_no));
Code language: PHP (php)
The works_on table in Example 5.10 is specified with two declarative integrity constraints: prim_works and foreign_works. Both constraints are table-level constraints, where the former specifies the primary key and the latter the foreign key of the works_on table. Further, the constraint foreign_works specifies the employee table as the parent table and its emp_no column as the corresponding primary key of the column with the same name in the works_on table.
The FOREIGN KEY clause can be omitted if the foreign key is defined as a column-level constraint, because the column being constrained is the implicit column “list” of the foreign key, and the keyword REFERENCES is sufficient to indicate what kind of constraint this is. The maximum number of FOREIGN KEY constraints in a table is 63.
A definition of the foreign keys in tables of a database imposes the specification of another important integrity constraint: the referential integrity, described next.