Aggregate Functions in SELECT Statement

Aggregate functions are functions that are used to get summary values. All aggregate functions can be divided into several groups:

Convenient aggregate functions
Statistical aggregate functions
User-defined aggregate functions
Analytic aggregate functions

The first three types are described in the following sections, while analytic aggregate functions are explained in detail in a different chapter.

Convenient Aggregate Functions

The Transact-SQL language supports six aggregate functions:

MIN
MAX
SUM
AVG
COUNT
COUNT_BIG

All aggregate functions operate on a single argument, which can be either a column or an expression. (The only exception is the second form of the COUNT and COUNT_BIG functions, COUNT(*) and COUNT_BIG(*).) The result of each aggregate function is a constant value, which is displayed in a separate column of the result.

The aggregate functions appear in the SELECT list, which can include a GROUP BY clause. If there is no GROUP BY clause in the SELECT statement, and the SELECT list includes at least one aggregate function, then no simple columns can be included in the SELECT list (other than as arguments of an aggregate function). Therefore, Example 6.24 is wrong.

Example 6.24 (Example of an Illegal Statement)

USE sample;
SELECT emp_lname, MIN(emp_no)
	FROM employee;Code language: PHP (php)

The emp_lname column of the employee table must not appear in the SELECT list of Example 6.24 because it is not the argument of an aggregate function. On the other hand, all column names that are not arguments of an aggregate function may appear in the SELECT list if they are used for grouping.

The argument of an aggregate function can be preceded by one of two keywords:

ALL – Indicates that all values of a column are to be considered (ALL is the default value)
DISTINCT – Eliminates duplicate values of a column before the aggregate function is applied

MIN and MAX Aggregate Functions

The aggregate functions MIN and MAX compute the lowest and highest values in the column, respectively. If there is a WHERE clause, the MIN and MAX functions return the lowest or highest of values from selected rows. Example 6.25 shows the use of the aggregate function MIN.

Example 6.25

Get the lowest employee number:

USE sample;
SELECT MIN(emp_no) AS min_employee_no
	FROM employee;Code language: PHP (php)

The result is

min_employee_no
2581

The result of Example 6.25 is not user friendly. For instance, the name of the employee with the lowest number is not known. As already shown, the explicit the name of the employee with the lowest employee number, use a subquery, as shown in Example 6.26, where the inner query contains the SELECT statement of the previous example.

Example 6.26

Get the number and the last name of the employee with the lowest employee number:

USE sample;
SELECT emp_no, emp_lname
	FROM employee
	WHERE emp_no =
	(SELECT MIN(emp_no)
		FROM employee);Code language: PHP (php)

The result is

emp_no	emp_lname
2581	Hansel

Example 6.27 shows the use of the aggregate function MAX.

Example 6.27

Get the employee number of the manager who was entered last in the works_on table:

USE sample;
SELECT emp_no
	FROM works_on
	WHERE enter_date =
	(SELECT MAX(enter_date)
		FROM works_on
		WHERE job = 'Manager');Code language: PHP (php)

The result is

emp_no
10102

The argument of the functions MIN and MAX can also be a string value or a date. If the argument has a string value, the comparison between all values will be provided using the actual collating sequence. For all arguments of temporal data types, the earliest date specifies the lowest value in the column and the latest date specifies the highest value in the column.

The DISTINCT option cannot be used with the aggregate functions MIN and MAX. All NULL values in the column that are the argument of the aggregate function MIN or MAX are always eliminated before MIN or MAX is applied.

SUM Aggregate Function

The aggregate function SUM calculates the sum of the values in the column. The argument of the function SUM must be numeric. Example 6.28 shows the use of the SUM function.

Example 6.28

Calculate the sum of all budgets of all projects:

USE sample;
SELECT SUM(budget) sum_of_budgets
	FROM project;Code language: PHP (php)

The result is

sum_of_budgets
401500

The aggregate function in Example 6.28 groups all values of the projects’ budgets and determines their total sum. For this reason, the query in Example 6.28 (as does each analog query) implicitly contains the grouping function. The grouping function from Example 6.28 can be written explicitly in the query, as shown in Example 6.29.

Example 6.29

SELECT SUM(budget) sum_of_budgets
	FROM project
	GROUP BY();

The use of this syntax for the GROUP BY clause is recommended because it defines a grouping explicitly. The use of the DISTINCT option eliminates all duplicate values in the column before the function SUM is applied. Similarly, all NULL values are always eliminated before SUM is applied.

AVG Aggregate Function

The aggregate function AVG calculates the average of the values in the column. The argument of the function AVG must be numeric. All NULL values are eliminated before the function AVG is applied. Example 6.30 shows the use of the AVG aggregate function.

Example 6.30

Calculate the average of all budgets with an amount greater than $100,000:

USE sample;
SELECT AVG(budget) avg_budget
	FROM project
	WHERE budget > 100000;Code language: PHP (php)

The result is

avg_budget
153250

COUNT and COUNT_BIG Aggregate Functions

The aggregate function COUNT has two different forms:

COUNT([DISTINCT] col_name)
COUNT(*)Code language: CSS (css)

The first form calculates the number of values in the col_name column. When the DISTINCT keyword is used, all duplicate values are eliminated before COUNT is applied. This form of COUNT does not count NULL values for the column. Example 6.31 shows the use of the first form of the aggregate function COUNT.

Example 6.31

Count all different jobs in each project:

USE sample;
SELECT project_no, COUNT(DISTINCT job) job_count
	FROM works_on
	GROUP BY project_no;Code language: PHP (php)

The result is

project_no	job_count
p1	3
p2	1
p3	3

As can be seen from the result of Example 6.31, all NULL values are eliminated before the function COUNT(DISTINCT job) is applied. (The sum of all values in the column is 8 instead of 11.)

The second form of the function COUNT, COUNT(*), counts the number of rows in the table. Or, if there is a WHERE clause in the SELECT statement, COUNT(*) returns the number of rows for which the WHERE condition is true. In contrast to the first form of the function COUNT, the second form does not eliminate NULL values, because this function operates on rows and not on columns. Example 6.32 shows the use of COUNT(*).

Example 6.32

Get the number of each job in all projects:

USE sample;
SELECT job, COUNT(*) job_count
	FROM works_on
	GROUP BY job;Code language: PHP (php)

The result is

job	job_count
NULL	3
Analyst	2
Clerk	4
Manager	2

The COUNT_BIG function is analogous to the COUNT function. The only difference between them is in relation to their return values: COUNT_BIG always returns a value of the BIGINT data type, while the COUNT function always returns a value of the INTEGER data type.

Statistical Aggregate Functions

The following aggregate functions belong to the group of statistical aggregate functions:

VAR – Computes the variance of all the values listed in a column or expression
VARP – Computes the variance for the population of all the values listed in a column or expression
STDEV – Computes the standard deviation (which is computed as the square root of the corresponding variance) of all the values listed in a column or expression
STDEVP – Computes the standard deviation for the population of all the values listed in a column or expression

Examples showing statistical aggregate functions will be provided in Chapter “Business Intelligence and Transact-SQL”.

User-Defined Aggregate Functions

The Database Engine also supports the implementation of user-defined aggregate functions. Using these functions, you can implement and deploy aggregate functions that do not belong to aggregate functions supported by the system. These functions are a special case of user-defined functions, which will be described in detail in Chapter “Stored Procedures and User-Defined Functions”.

Convenient Aggregate Functions

MIN and MAX Aggregate Functions

SUM Aggregate Function

AVG Aggregate Function

COUNT and COUNT_BIG Aggregate Functions

Statistical Aggregate Functions

User-Defined Aggregate Functions

Related Lessons: