The FIRST_VALUE and LAST_VALUE functions return values from the first and last rows of a window. This allows a query to access values from multiple rows at once, without the need for a self-join.
These two functions are different from the other window aggregate functions because they must be used with a window. Also, unlike the other window aggregate functions, these functions allow the IGNORE NULLS clause. If IGNORE NULLS is specified, the first or last non-NULL value of the desired expression is returned. Otherwise, the first or last value is returned whether or not it is NULL.
The FIRST_VALUE function can be used to retrieve the first entry in an ordered group of values. The following query returns, for each order, the product identifier of the order's first item; that is, the ProductID of the item with the smallest LineID for each order.
Notice that the query uses the DISTINCT keyword to remove duplicates; without it, duplicate rows are returned for each item in each order.
SELECT DISTINCT ID, FIRST_VALUE( ProductID ) OVER ( PARTITION BY ID ORDER BY LineID ) FROM SalesOrderItems ORDER BY ID;
A common use of the FIRST_VALUE function is to compare a value in each row with the maximum or minimum value within the current group. The following query computes the total sales for each sales representative, and then compares that representative's total sales with the maximum total sales for the same product. The result is expressed as a percentage of the maximum total sales.
SELECT s.ProductID AS prod_id, o.SalesRepresentative AS sales_rep, SUM( s.Quantity * p.UnitPrice ) AS total_sales, 100 * total_sales / ( FIRST_VALUE( SUM( s.Quantity * p.UnitPrice ) ) OVER Sales_Window ) AS total_sales_percentage FROM SalesOrders o KEY JOIN SalesOrderItems s KEY JOIN Products p GROUP BY o.SalesRepresentative, s.ProductID WINDOW Sales_Window AS ( PARTITION BY s.ProductID ORDER BY SUM( s.Quantity * p.UnitPrice ) DESC ) ORDER BY s.ProductID;
The FIRST_VALUE and LAST_VALUE functions are useful when you have densified your data and you need to populate values instead of having NULLs. For example, suppose the sales representative with the highest total sales each day wins the distinction of Representative of the Day. The following query lists the winning sales representatives for the first week of April, 2001:
SELECT v.OrderDate, v.SalesRepresentative AS rep_of_the_day FROM ( SELECT o.SalesRepresentative, o.OrderDate, RANK() OVER ( PARTITION BY o.OrderDate ORDER BY SUM( s.Quantity * p.UnitPrice ) DESC ) AS sales_ranking FROM SalesOrders o KEY JOIN SalesOrderItems s KEY JOIN Products p GROUP BY o.SalesRepresentative, o.OrderDate ) AS v WHERE v.sales_ranking = 1 AND v.OrderDate BETWEEN '2001-04-01' AND '2001-04-07' ORDER BY v.OrderDate;
The query returns the following results:
However, note that no results are returned for days in which no sales were made. The following query densifies the data, creating records for days in which no sales were made. Additionally, it uses the LAST_VALUE function to populate the NULL values for rep_of_the_day (on non-winning days) with the ID of the last winning representative, until a new winner occurs in the results.
SELECT d.dense_order_date, LAST_VALUE( v.SalesRepresentative IGNORE NULLS ) OVER ( ORDER BY d.dense_order_date ) AS rep_of_the_day FROM ( SELECT o.SalesRepresentative, o.OrderDate, RANK() OVER ( PARTITION BY o.OrderDate ORDER BY SUM( s.Quantity * p.UnitPrice ) DESC ) AS sales_ranking FROM SalesOrders o KEY JOIN SalesOrderItems s KEY JOIN Products p GROUP BY o.SalesRepresentative, o.OrderDate ) AS v RIGHT OUTER JOIN ( SELECT DATEADD( day, row_num, '2001-04-01' ) AS dense_order_date FROM sa_rowgenerator( 0, 6 ) AS d ) ON v.OrderDate = d.dense_order_date AND sales_ranking = 1 ORDER BY d.dense_order_date;
The query returns the following results:
The derived table v from the previous query is joined to a derived table d, which contains all the dates under consideration. This yields a row for each desired day, but this outer join contains NULL in the SalesRepresentative column for dates on which no sales were made. Using the LAST_VALUE function solves this problem by defining rep_of_the_day for a given row to be the last non-NULL value of SalesRepresentative leading up to the corresponding day.