### Standard deviation and variance functions

SQL Anywhere supports two versions of variance and standard deviation functions: a sampling version, and a population version. Choosing between the two versions depends on the statistical context in which the function is to be used.

All of the variance and standard deviation functions are true aggregate functions in that they can compute values for a partition of rows as determined by the query's GROUP BY clause. As with other basic aggregate functions such as MAX or MIN, their computation also ignores NULL values in the input.

For improved performance, SQL Anywhere calculates the mean, and the deviation from mean, in one step. This means that only one pass over the data is required.
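The source does not say which single-pass method SQL Anywhere uses. As an illustration only, the following Python sketch uses Welford's online algorithm, one well-known way to accumulate the mean and the sum of squared deviations in a single pass. The function name `one_pass_stats` and the sample data are hypothetical.

```python
def one_pass_stats(values):
    """Single pass over the data using Welford's online algorithm.

    Returns (count, mean, m2), where m2 is the sum of squared
    deviations from the mean. This is an illustrative sketch, not
    SQL Anywhere's actual implementation.
    """
    count = 0
    mean = 0.0
    m2 = 0.0
    for x in values:
        count += 1
        delta = x - mean           # deviation from the previous mean
        mean += delta / count      # update the running mean
        m2 += delta * (x - mean)   # uses both the old and the new mean
    return count, mean, m2


# Hypothetical sample data.
count, mean, m2 = one_pass_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
var_pop = m2 / count          # population variance
var_samp = m2 / (count - 1)   # sample variance
```

Both variances come from the same accumulated values, so the data never needs to be read a second time.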

Also, regardless of the domain of the expression being analyzed, all variance and standard deviation computation is done using IEEE double-precision floating point. If the input to any variance or standard deviation function is the empty set, then each function returns NULL as its result. If VAR_SAMP is computed for a single row, then it returns NULL, while VAR_POP returns the value 0.
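The edge cases described above can be mimicked in Python, with `None` standing in for NULL. This is a sketch of the documented behavior, not of the server's code. The helper names `var_pop` and `var_samp` are hypothetical, and it assumes that "a single row" means a single non-NULL value once NULLs have been ignored.

```python
import statistics


def var_pop(values):
    """Population variance; None (NULL) inputs are ignored."""
    rows = [float(v) for v in values if v is not None]
    if not rows:
        return None  # empty set yields NULL
    return statistics.pvariance(rows)


def var_samp(values):
    """Sample variance; None (NULL) inputs are ignored."""
    rows = [float(v) for v in values if v is not None]
    if len(rows) < 2:
        return None  # empty set or a single row yields NULL
    return statistics.variance(rows)


empty_pop, empty_samp = var_pop([]), var_samp([])
single_pop, single_samp = var_pop([5]), var_samp([5])
with_null_pop, with_null_samp = var_pop([None, 1, 3]), var_samp([None, 1, 3])
```

All arithmetic here is done on Python floats, which are IEEE double-precision values.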

Following are the standard deviation and variance functions offered in SQL Anywhere:

• STDDEV function
• STDDEV_POP function
• STDDEV_SAMP function
• VARIANCE function
• VAR_POP function
• VAR_SAMP function

To review the mathematical formulas represented by these functions, see Mathematical formulas for the aggregate functions.
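For quick reference, the standard textbook definitions are sketched below. It is assumed, not confirmed here, that they match the formulas on the referenced page. In this notation, $n$ is the number of non-NULL input values and $\bar{x}$ is their mean.

$$
\mathrm{VAR\_POP} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2
\qquad
\mathrm{VAR\_SAMP} = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2
$$

$$
\mathrm{STDDEV\_POP} = \sqrt{\mathrm{VAR\_POP}}
\qquad
\mathrm{STDDEV\_SAMP} = \sqrt{\mathrm{VAR\_SAMP}}
$$

With $n = 1$ the sample divisor $n - 1$ is zero, which is consistent with VAR_SAMP returning NULL for a single row while VAR_POP returns 0.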

###### STDDEV function

This function is an alias for the STDDEV_SAMP function. See STDDEV_SAMP function [Aggregate].

###### STDDEV_POP function

This function computes the standard deviation of a population consisting of a numeric expression, as a DOUBLE.

###### Example 1

The following query returns a result set that shows the employees whose salary is more than one standard deviation above the average salary of their department. Standard deviation is a measure of how much the data varies from the mean.

```sql
SELECT *
FROM ( SELECT
         Surname AS Employee,
         DepartmentID AS Department,
         CAST( Salary AS DECIMAL( 10, 2 ) )
           AS Salary,
         CAST( AVG( Salary )
           OVER ( PARTITION BY DepartmentID ) AS DECIMAL( 10, 2 ) )
           AS Average,
         CAST( STDDEV_POP( Salary )
           OVER ( PARTITION BY DepartmentID ) AS DECIMAL( 10, 2 ) )
           AS StandardDeviation
       FROM Employees
       GROUP BY Department, Employee, Salary )
     AS DerivedTable
WHERE Salary > Average + StandardDeviation
ORDER BY Department, Salary, Employee;
```

The table that follows represents the result set from the query. Every department has at least one employee whose salary significantly deviates from the mean.

| Employee | DepartmentID | Salary | Average | StandardDeviation |
|---|---|---|---|---|
| Lull | 100 | 87900.00 | 58736.28 | 16829.60 |
| Scheffield | 100 | 87900.00 | 58736.28 | 16829.60 |
| Scott | 100 | 96300.00 | 58736.28 | 16829.60 |
| Sterling | 200 | 64900.00 | 48390.95 | 13869.60 |
| Savarino | 200 | 72300.00 | 48390.95 | 13869.60 |
| Kelly | 200 | 87500.00 | 48390.95 | 13869.60 |
| Shea | 300 | 138948.00 | 59500.00 | 30752.40 |
| Blaikie | 400 | 54900.00 | 43640.67 | 11194.02 |
| Morris | 400 | 61300.00 | 43640.67 | 11194.02 |
| Evans | 400 | 68940.00 | 43640.67 | 11194.02 |
| Martinez | 500 | 55500.00 | 33752.20 | 9084.50 |

Employee Scott earns \$96,300.00, while the departmental average is \$58,736.28. The standard deviation for that department is \$16,829.60, so any salary up to \$75,565.88 (`58736.28 + 16829.60 = 75565.88`) lies no more than one standard deviation above the mean. At \$96,300.00, employee Scott is well above that figure.
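The threshold arithmetic for department 100 can be checked with a few lines of Python. This is only a sketch of the comparison made by the query's WHERE clause. The variable names are hypothetical, and `Decimal` is used to mirror the DECIMAL( 10, 2 ) casts in the query.

```python
from decimal import Decimal

# Figures for department 100, taken from the result set.
average = Decimal("58736.28")
standard_deviation = Decimal("16829.60")

# Salaries above this value satisfy Salary > Average + StandardDeviation.
threshold = average + standard_deviation

scott_salary = Decimal("96300.00")
is_listed = scott_salary > threshold
```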

This example assumes that Surname and Salary are unique for each employee, which isn't necessarily true. To ensure uniqueness, you could add EmployeeID to the GROUP BY clause.

###### Example 2

The following statement lists the average and the population standard deviation of the number of items per order in different time periods:

```sql
SELECT YEAR( ShipDate ) AS Year,
       QUARTER( ShipDate ) AS Quarter,
       AVG( Quantity ) AS Average,
       STDDEV_POP( Quantity ) AS StandardDeviation
FROM SalesOrderItems
GROUP BY Year, Quarter
ORDER BY Year, Quarter;
```

This query returns the following result:

| Year | Quarter | Average | StandardDeviation |
|---|---|---|---|
| 2000 | 1 | 25.775148 | 14.2794... |
| 2000 | 2 | 27.050847 | 15.0270... |
| ... | ... | ... | ... |

For more information on the syntax for this function, see STDDEV_POP function [Aggregate].

###### STDDEV_SAMP function

This function computes the standard deviation of a sample consisting of a numeric expression, as a DOUBLE. For example, the following statement returns the average and the sample standard deviation of the number of items per order in different quarters:

```sql
SELECT YEAR( ShipDate ) AS Year,
       QUARTER( ShipDate ) AS Quarter,
       AVG( Quantity ) AS Average,
       STDDEV_SAMP( Quantity ) AS StandardDeviation
FROM SalesOrderItems
GROUP BY Year, Quarter
ORDER BY Year, Quarter;
```

This query returns the following result:

| Year | Quarter | Average | StandardDeviation |
|---|---|---|---|
| 2000 | 1 | 25.775148 | 14.3218... |
| 2000 | 2 | 27.050847 | 15.0696... |
| ... | ... | ... | ... |

For more information on the syntax for this function, see STDDEV_SAMP function [Aggregate].

###### VARIANCE function

This function is an alias for the VAR_SAMP function. See VAR_SAMP function [Aggregate].

###### VAR_POP function

This function computes the statistical variance of a population consisting of a numeric expression, as a DOUBLE. For example, the following statement lists the average and variance in the number of items per order in different time periods:

```sql
SELECT YEAR( ShipDate ) AS Year,
       QUARTER( ShipDate ) AS Quarter,
       AVG( Quantity ) AS Average,
       VAR_POP( Quantity ) AS Variance
FROM SalesOrderItems
GROUP BY Year, Quarter
ORDER BY Year, Quarter;
```

This query returns the following result:

| Year | Quarter | Average | Variance |
|---|---|---|---|
| 2000 | 1 | 25.775148 | 203.9021... |
| 2000 | 2 | 27.050847 | 225.8109... |
| ... | ... | ... | ... |

If VAR_POP is computed for a single row, then it returns the value 0.

For more information on the syntax for this function, see VAR_POP function [Aggregate].

###### VAR_SAMP function

This function computes the statistical variance of a sample consisting of a numeric expression, as a DOUBLE.

For example, the following statement lists the average and variance in the number of items per order in different time periods:

```sql
SELECT YEAR( ShipDate ) AS Year,
       QUARTER( ShipDate ) AS Quarter,
       AVG( Quantity ) AS Average,
       VAR_SAMP( Quantity ) AS Variance
FROM SalesOrderItems
GROUP BY Year, Quarter
ORDER BY Year, Quarter;
```

This query returns the following result:

| Year | Quarter | Average | Variance |
|---|---|---|---|
| 2000 | 1 | 25.775148 | 205.1158... |
| 2000 | 2 | 27.050847 | 227.0939... |
| ... | ... | ... | ... |

If VAR_SAMP is computed for a single row, then it returns NULL.

For more information on the syntax for this function, see VAR_SAMP function [Aggregate].
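As a cross-check on the examples in this section, each standard deviation result should be the square root of the corresponding variance result. The Python sketch below compares the figures returned by the STDDEV_POP, STDDEV_SAMP, VAR_POP, and VAR_SAMP queries above. The source shows these figures with trailing digits elided, so the comparison uses a loose tolerance. The helper name `max_gap` is hypothetical.

```python
import math

# (variance, standard deviation) pairs for quarters 1 and 2 of 2000,
# copied from the query results in this section; trailing digits are
# elided in the source, so the values are approximate.
population = [(203.9021, 14.2794), (225.8109, 15.0270)]
sample = [(205.1158, 14.3218), (227.0939, 15.0696)]


def max_gap(pairs):
    """Largest difference between sqrt(variance) and the listed standard deviation."""
    return max(abs(math.sqrt(variance) - stddev) for variance, stddev in pairs)


population_gap = max_gap(population)
sample_gap = max_gap(sample)
```

Both gaps stay well below 0.001, which is what the elided digits allow.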