The component of Microsoft SQL Server called the Database Engine is a relational database system. The notion of relational database systems was first introduced by E. F. Codd in his article “A Relational Model of Data for Large Shared Data Banks” in 1970. In contrast to earlier database systems (network and hierarchical), relational database systems are based upon the relational data model, which has a strong mathematical background.
The central concept of the relational data model is a relation—that is, a table. Therefore, from the user’s point of view, a relational database contains tables and nothing but tables. In a table, there are one or more columns and zero or more rows. At every row and column position in a table there is always exactly one data value.
Working with the Book’s Sample Database
The sample database used in this tutorial represents a company with departments and employees. Each employee in the example belongs to exactly one department, which itself has one or more employees. Jobs of employees center on projects: each employee works at the same time on one or more projects, and each project engages one or more employees.
The data of the sample database can be represented using four tables:
- department
- employee
- project
- works_on
Tables 1-1 through 1-4 show all the tables of the sample database.
The department table represents all departments of the company. Each department has the following attributes:
department (dept_no, dept_name, location)
dept_no represents the unique number of each department. dept_name is its name, and location is the location of the corresponding department.
Table 1-1 The Department Table
dept_no | dept_name | location |
---|---|---|
d1 | Research | Dallas |
d2 | Accounting | Seattle |
d3 | Marketing | Dallas |
The employee table represents all employees working for a company. Each employee has the following attributes:
employee (emp_no, emp_fname, emp_lname, dept_no)
emp_no represents the unique number of each employee. emp_fname and emp_lname are the first and last name of each employee, respectively. Finally, dept_no is the number of the department to which the employee belongs.
Table 1-2 The Employee Table
emp_no | emp_fname | emp_lname | dept_no |
---|---|---|---|
25348 | Matthew | Smith | d3 |
10102 | Ann | Jones | d3 |
18316 | John | Barrimore | d1 |
29346 | James | James | d2 |
9031 | Elke | Hansel | d2 |
2581 | Elsa | Bertoni | d2 |
28559 | Sybill | Moser | d1 |
Each project of a company is represented in the project table. This table has the following columns:
project (project_no, project_name, budget)
project_no represents the unique number of each project. project_name and budget specify the name and the budget of each project, respectively.
Table 1-3 The Project Table
project_no | project_name | budget |
---|---|---|
p1 | Apollo | 120000 |
p2 | Gemini | 95000 |
p3 | Mercury | 186500 |
The works_on table specifies the relationship between employees and projects. It has the following columns:
works_on (emp_no, project_no, job, enter_date)
emp_no specifies the employee number and project_no specifies the number of the project on which the employee works. The combination of data values belonging to these two columns is always unique. job and enter_date specify the task and the starting date of an employee in the corresponding project, respectively.
Table 1-4 The works_on Table
emp_no | project_no | job | enter_date |
---|---|---|---|
10102 | p1 | Analyst | 2006.10.1 |
10102 | p3 | Manager | 2008.1.1 |
25348 | p2 | Clerk | 2007.2.15 |
18316 | p2 | NULL | 2007.6.1 |
29346 | p2 | NULL | 2006.12.15 |
2581 | p3 | Analyst | 2007.10.15 |
9031 | p1 | Manager | 2007.4.15 |
28559 | p1 | NULL | 2007.8.1 |
28559 | p2 | Clerk | 2008.2.1 |
9031 | p3 | Clerk | 2006.11.15 |
29346 | p1 | Clerk | 2007.1.4 |
Using the sample database, it is possible to describe some general properties of relational database systems:
- Rows in a table do not have any particular order.
- Columns in a table do not have any particular order.
- Every column must have a unique name within a table. On the other hand, columns from different tables may have the same name. (For example, the sample database has a dept_no column in the department table and a column with the same name in the employee table.)
- Every single data item in the table must be single valued. This means that in every row and column position of a table there is never a set of multiple data values.
- For every table, there is at least one column with the property that no two rows have the same combination of data values for all table columns. In the relational data model, such an identifier is called a candidate key. If there is more than one candidate key within a table, the database designer designates one of them as the primary key of the table. For example, the column dept_no is the primary key of the department table; the columns emp_no and project_no are the primary keys of the tables employee and project, respectively. Finally, the primary key for the works_on table is the combination of the columns emp_no, project_no.
- In a table, there are never two identical rows. (This property is only theoretical; the Database Engine and all other relational database systems generally allow the existence of identical rows within a table.)
SQL: A Relational Database Language
The SQL Server relational database language is called Transact-SQL. It is a dialect of the most important database language today: Structured Query Language (SQL). The origin of SQL is closely connected with the project called System R, which was designed and implemented by IBM in the early 1980s. This project showed that it is possible, using the theoretical foundations of the work of E. F. Codd, to build a relational database system.
In contrast to traditional languages like C, C++, and Java, SQL is a set-oriented language. (The former are also called record-oriented languages.) This means that SQL can query many rows from one or more tables using just one statement. This feature is one of the most important advantages of SQL, allowing the use of this language at a logically higher level than the level at which traditional languages can be used.
Another important property of SQL is its nonprocedurality. Every program written in a procedural language (C, C++, Java) describes how a task is accomplished, step by step. In contrast to this, SQL, as any other nonprocedural language, describes what it is that the user wants. Thus, the system is responsible for finding the appropriate way to solve users’ requests.
SQL contains two sublanguages: a data definition language (DDL) and a data manipulation language (DML). DDL statements are used to describe the schema of database tables. The DDL contains three generic SQL statements: CREATE object, ALTER object, and DROP object. These statements create, alter, and remove database objects, such as databases, tables, columns, and indexes.
In contrast to the DDL, the DML encompasses all operations that manipulate the data. There are always four generic operations for manipulating the database: retrieval, insertion, deletion, and modification.