LAG | TroubleshootingSQL

-- Replace starting value with minimum starting value and increment for your sequence
-- Replace the table name with the table name that you are interested in
declare @startvalue int = 1, @increment int = 1
;with cte as
(
select a, (a-lag(a,1) OVER (ORDER BY a)) as MissingSequences
from tblsequences
)
select a, (MissingSequences/@increment)-1 as MissingSequences
from cte
where MissingSequences > @increment
union all
select TOP 1 MIN (a), CASE (MIN(a)- @startvalue)/@increment when 0 then null else (MIN(a)- @startvalue)/@increment end as MissingSequences
from tblsequences
group by a
order by a

-- Replace schema name, table name(s) and sequence name as appropriate
declare @startvalue int = 1, @interval int = 1, @seqname sysname = 'TestSeq', @schemaname sysname = 'dbo'
select @startvalue = TRY_CAST(TRY_CAST(start_value as varchar(255)) as int), @interval = TRY_CAST(TRY_CAST(increment as varchar(255)) as int)
from sys.sequences
where name = @seqname
and [schema_id] = (select [schema_id] from sys.schemas where name = @schemaname)
if (@startvalue IS NOT NULL and @interval IS NOT NULL)
begin
;with cte as
(
select OrderID, (OrderID-lag(OrderID,1) over (order by OrderID)) as MissingSequences
from (select OrderId as OrderID from tblTestSeq union all select OrderId as OrderID from tblTestSeq_2) A
)
select OrderID, (MissingSequences/@interval)-1 as MissingSequences
from cte
where MissingSequences > @interval
union all
select TOP 1 OrderID, CASE (MIN(OrderID)- @startvalue)/@interval when 0 then null else (MIN(OrderID)- @startvalue)/@interval end as MissingSequences
from tblTestSeq
group by OrderID
order by OrderID
end
else
begin
PRINT 'CAST FAILED'
end
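Both queries above rely on the same idea: subtract the previous value (fetched with LAG) from the current one, and treat any difference larger than the increment as a gap. The following is a minimal sketch of just that step, not the original post's full query. The table variable @seq and its sample values are hypothetical stand-ins for tblsequences, and the column aliases are my own.

```sql
-- Hypothetical sample data; @seq stands in for tblsequences.
DECLARE @seq TABLE (a int PRIMARY KEY);
INSERT INTO @seq (a) VALUES (1), (2), (5), (6), (9);

DECLARE @increment int = 1;

WITH cte AS
(
    -- Difference between each value and the one before it (NULL for the first row).
    SELECT a, a - LAG(a, 1) OVER (ORDER BY a) AS Gap
    FROM @seq
)
SELECT a AS NextValueAfterGap,
       (Gap / @increment) - 1 AS MissingSequences
FROM cte
WHERE Gap > @increment   -- the NULL gap on the first row is filtered out here
ORDER BY a;
-- Returns (5, 2) and (9, 2): values 3, 4 and 7, 8 are missing.
```

This sketch deliberately leaves out the check for values missing before the first row, which the queries above handle in the second half of the UNION ALL.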

SQL Server 2012 CTP 3, formerly known as SQL Server Code Name “Denali”, introduces a new set of T-SQL functions called Analytic functions. Analytic functions open up a new vista for business intelligence: you can calculate moving averages, running totals, percentages or top-N results within a group. I find them very useful when analyzing performance issues by traversing the information in a SQL Server trace file.

I was looking into a performance issue wherein an application module executing a series of T-SQL functions was taking a long time to complete its operation. When I added up the duration of the T-SQL queries executed by the application, I couldn’t account for the total duration that the application was reporting. On tracking some of the statement executions done by the SPID that the application was using to execute the queries, I found a gap between the completion time of one batch and the start time of the next batch. Now I needed to check whether the gaps between each query completion and the next query start, added together, accounted for the difference between the duration reported by the application and the sum of the durations of all the queries it executed. And BINGO… I was finally able to make the correlation. Up to SQL Server 2008 R2, I would have had to write a query involving a self-join to get the comparative analysis that I required:

;WITH cte AS
(SELECT b.name, a.starttime, a.endtime, a.transactionid, a.EventSequence, ROW_NUMBER() OVER(ORDER BY eventsequence) AS RowIDs
FROM trace a
INNER JOIN sys.trace_events b
ON a.eventclass = b.trace_event_id
WHERE spid = 83
AND b.name IN ('RPC:Starting','RPC:Completed','SQL:BatchStarting','SQL:BatchCompleted'))
SELECT TOP 1000 b.name, b.starttime, b.endtime, b.transactionid, DATEDIFF(S,a.endtime,b.starttime) as time_diff_seconds
FROM cte a
LEFT OUTER JOIN cte b
ON a.RowIDs = b.RowIDs-1

The output of the above query is shown in the screen shot below:

As you can see, there is a 4-second delay between the endtime of the statement in Row# 783 and the next execution shown in Row# 784. With the help of Analytic functions, I can simply use the LEAD function to get the above result and avoid a self-join.

SELECT  TOP 1000 a.name,b.StartTime,b.EndTime,b.TransactionID,
DATEDIFF(s,(LEAD(b.EndTime,1,0) OVER (ORDER BY EventSequence DESC)),b.StartTime) as TimeDiff
FROM sys.trace_events a
INNER JOIN dbo.trace b
on a.trace_event_id = b.EventClass
WHERE b.SPID = 83
and a.name in ('RPC:Starting','RPC:Completed','SQL:BatchStarting','SQL:BatchCompleted')

The output, as you can see, is the same as that of the previous query:

I had imported the data from the profiler trace into a SQL Server database table using the function: fn_trace_gettable. Let’s see what the query plans look like. For the first query which uses the common table expression and a self-join, the graphical query plan is as follows:

Now let’s see what the query plan looks like with the new LEAD function in action:

As you can see above, a new Window Spool operator performs the analytic operation that calculates the time difference between subsequent rows using the EventSequence number. I have eliminated the need for a self-join with a temporary table or a common table expression, and simplified my query in the process.

In the above example I am using the LEAD function to get the value that I am interested in from the following row. If you are interested in the values from a preceding row, you can use the LAG function.
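As a rough illustration, the same kind of time difference can be written with LAG by ordering on EventSequence in ascending order. This is only a sketch: the table variable @trace and its three sample rows are hypothetical stand-ins for the imported trace table, not data from the actual trace.

```sql
-- Hypothetical stand-in for the imported trace table.
DECLARE @trace TABLE
(
    EventSequence int PRIMARY KEY,
    StartTime     datetime,
    EndTime       datetime
);

INSERT INTO @trace (EventSequence, StartTime, EndTime)
VALUES (1, '2011-08-01T10:00:00', '2011-08-01T10:00:02'),
       (2, '2011-08-01T10:00:06', '2011-08-01T10:00:07'),
       (3, '2011-08-01T10:00:07', '2011-08-01T10:00:09');

-- LAG fetches the EndTime of the preceding row (by EventSequence),
-- so TimeDiff is the idle time before each event started.
SELECT EventSequence, StartTime, EndTime,
       DATEDIFF(s, LAG(EndTime, 1) OVER (ORDER BY EventSequence), StartTime) AS TimeDiff
FROM @trace
ORDER BY EventSequence;
-- TimeDiff is NULL, 4, 0 for the three rows.
```

LAG over an ascending order and LEAD over a descending order look at the same neighbouring row, so which one to use is mostly a matter of readability.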

One gotcha that you need to remember here is that if you don’t take care of the start and end values of the dataset which you are grouping, you could run into the following error due to an overflow or underflow condition.

Msg 535, Level 16, State 0, Line 1
The datediff function resulted in an overflow. The number of dateparts separating two date/time instances is too large. Try to use datediff with a less precise datepart.
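In the LEAD query above, the likely trigger for this error is the default value of 0 passed as the third argument. For a datetime column, 0 is implicitly converted to 1900-01-01, and the number of seconds between that date and a present-day timestamp does not fit in an int. Below is a sketch of two ways to guard the boundary row, using a hypothetical table variable @trace in place of the real trace table: omit the default so the boundary row returns NULL, or default to the row's own StartTime so that it returns 0.

```sql
-- Hypothetical stand-in for the imported trace table.
DECLARE @trace TABLE
(
    EventSequence int PRIMARY KEY,
    StartTime     datetime,
    EndTime       datetime
);

INSERT INTO @trace (EventSequence, StartTime, EndTime)
VALUES (1, '2011-08-01T10:00:00', '2011-08-01T10:00:02'),
       (2, '2011-08-01T10:00:06', '2011-08-01T10:00:07');

SELECT EventSequence,
       -- No default: the row with no neighbour gets NULL instead of 1900-01-01.
       DATEDIFF(s, LEAD(EndTime, 1) OVER (ORDER BY EventSequence DESC), StartTime) AS TimeDiffOrNull,
       -- Default to the row's own StartTime: the row with no neighbour gets 0.
       DATEDIFF(s, LEAD(EndTime, 1, StartTime) OVER (ORDER BY EventSequence DESC), StartTime) AS TimeDiffOrZero
FROM @trace
ORDER BY EventSequence;
-- EventSequence 1: NULL and 0; EventSequence 2: 4 and 4.
```

Returning NULL is usually the more honest choice, since a 0 is indistinguishable from a batch that really did start the moment the previous one finished.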

This is a small example of how analytic functions can help reduce T-SQL complexity when calculating averages or percentiles for grouped data. Happy coding!!

Disclaimer: This information is based on the SQL Server 2012 CTP 3 (Build 11.0.1440), formerly known as SQL Server Code Name “Denali” documentation provided on MSDN which is subject to change in later releases.

del.icio.us Tags: SQL Server 2012,Analytic Functions,LAG,T-SQL Functions,T-SQL

TroubleshootingSQL

Explaining the bits and bytes of data in an AI world

Tag Archives: LAG

Awesomesauce: Finding out missing sequences

Hello Analytic Functions
