The cumulative distribution function, CUME_DIST, is sometimes defined as the inverse of percentile. CUME_DIST computes the normalized position of a specific value relative to the set of values in the window. The range of the function is between 0 and 1.
As the window moves down the input rows, the cumulative distribution is calculated for the expression specified in the window's ORDER BY clause. If the ORDER BY clause includes multiple expressions, the second and subsequent expressions are used to break ties if the first expression has the same value in adjacent rows. NULL values are sorted before any other value (in ascending sequence).
The following example returns a result set that provides a cumulative distribution of the salaries of employees who live in California.
SELECT DepartmentID, Surname, Salary, CUME_DIST() OVER ( PARTITION BY DepartmentID ORDER BY Salary DESC ) "Rank" FROM Employees WHERE State IN ( 'CA' );
This query returns the following result:
The CUME_DIST function provides a simple method to determine the median of a set of values. CUME_DIST can be used to compute the median value successfully in the face of ties and whether the input contains an even or odd number of rows. Essentially, you need only determine the first row with a CUME_DIST value of greater than or equal to 0.5.
The following query returns the product information for the product with the median unit price:
SELECT FIRST * FROM ( SELECT Description, Quantity, UnitPrice, CUME_DIST() OVER ( ORDER BY UnitPrice ASC ) AS CDist FROM Products ) As DT WHERE CDist >= 0.5 ORDER BY CDist;
The query returns the following result:
For more information on the syntax for the CUME_DIST function, see CUME_DIST function [Ranking].