Types of Data Pages / Parallel Processing of Tasks

Data pages are used to store data of a table. There are two types of data pages, each of which is used to store data in a different format:

In-row data pages
Row-overflow data pages

In-Row Data Pages

There is nothing special to say about in-row data pages: they are pages in which it is convenient to store data and index information. All data that doesn’t belong to large objects is always stored in-row. Also, VARCHAR(max), NVARCHAR(max), VARBINARY(max), and XML values can be stored in-row, if the large value types out of row option of the sp_tableoption system procedure is set to 0. In this case, all such values are stored directly in the data row, up to a limit of 8000 bytes and as long as the value can fit in the record. If the value does not fit in the record, a pointer is stored in-row and the rest is stored out of row in the storage space for large objects.

Row-Overflow Data

Values of the VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) columns can be stored outside of the actual data page. As you already know, KB is the maximum size of a row on a data page, but you can exceed this size limit if you use columns of such large data types. In this case, the system stores the values of these columns in extra pages, which are called row-overflow pages.

The storage in row-overflow pages is done only under certain circumstances. The primary factor is the length of the row: if the row needs more than 8060 bytes, some of the column’s values will be stored on overflow pages. (A value of a column cannot be split between the actual data page and a row-overflow page.)

As an example of how content of a table with large values is stored, Example 15.1 creates such a table and inserts a row into it.

Example 15.1

USE sample;
CREATE TABLE mytable
		(col1 VARCHAR(1000),
		col2 VARCHAR(3000),
		col3 VARCHAR(3000),
		col4 VARCHAR(3000));
	INSERT INTO mytable
		SELECT REPLICATE('a', 1000), REPLICATE('b', 3000),
		REPLICATE('c', 3000), REPLICATE('d', 3000);Code language: SQL (Structured Query Language) (sql)

The CREATE TABLE statement in Example 15.1 creates the mytable table. The subsequent INSERT statement inserts a new row in the table. The length of the inserted row is 10,000 bytes. For this reason, the row doesn’t fit in a page.

The query in Example 15.2 uses several catalog views to display information concerning page type description.

Example 15.2

USE sample;
SELECT rows, type_desc AS page_type, total_pages AS pages
	FROM sys.partitions p JOIN sys.allocation_units a ON
		p.partition_id = a.container_id
	WHERE object_id = object_id('mytable');Code language: SQL (Structured Query Language) (sql)

The result is

In Example 15.2, the sys.partition and sys.allocation_units catalog views are joined together to display the information in relation to the mytable table and the storage of its row(s). The sys.partition view contains one row for each partition of each table or index. (Nonpartitioned tables, such as mytable, have only one partition unit.)

A set of pages of one particular data page type is called an allocation unit. Different allocation units can be displayed using the type_desc column of the sys.allocation_units catalog view. As you can see from the result of Example 15.2, for the single row of the mytable table, two convenient pages plus two row-overflow pages are allocated (or reserved) by the system.

Note – The performance of a system can significantly degrade if your queries access many row-overflow data pages.

Parallel Processing of Tasks

The Database Engine can execute different database tasks in parallel. The following tasks can be parallelized:

Bulk load
Backup
Query execution
Indices

The Database Engine allows data to be loaded in parallel using the bcp utility. (For the description of the bcp utility, see the next section.) The table into which the data is loaded must not have any indices, and the load operation must not be logged. (Only applications using the ODBC or OLE DB–based APIs can perform parallel data loads into a single table.)

The Database Engine can back up databases or transaction logs to multiple devices (tape or disk) using parallel striped backup. In this case, database pages are read by multiple threads one extent at a time (see also Chapter “Backup, Recovery, and System Availability”).

The Database Engine provides parallel queries to enhance the query execution. With this feature, the independent parts of a SELECT statement can be executed using several native threads on a computer. Each query that is planned for the parallel execution contains an exchange operator in its query execution plan. (An exchange operator is an operator in a query execution plan that provides process management, data redistribution, and flow control.) For such a query, the database system generates a parallel query execution plan. Parallel queries significantly improve the performance of the SELECT statements that process very large amounts of data.

On computers with multiple processors, the Database Engine automatically uses more processors to perform index operations, such as creation and rebuilding of an index. The number of processors employed to execute a single index statement is determined by the configuration option max degree of parallelism as well as the current workload. If the database system detects that the system is busy, the degree of parallelism is automatically reduced before the statement is executed.

In-Row Data Pages

Row-Overflow Data

Parallel Processing of Tasks

Related Lessons: