10 | December | 2013 | TroubleshootingSQL

A lot of my customer engagements involve identifying waits experienced by SQL Server queries on a production environment. Since this month’s topic is WAITS… I see this as a good opportunity to talk about how I go about the analysis of the collected data to identify the blocking chains.

Before I start rambling off, a note of thanks to my friend Robert Davis [B|T] for hosting this month’s T-SQL Tuesday! He couldn’t have chosen a better topic! 🙂

In this post, I will be talking about how to slice and dice the diagnostic data collected by SQL Perf Stats script that is imported into a SQL Nexus database.

The tables of interest in the SQL Nexus database are:

1. tbl_Requests – This table holds the combined output of the snapshots collected by the Perf Stats script from the sys.dm_exec_requests, sys.dm_exec_sessions, sys.dm_exec_connections and sys.dm_os_tasks

2. tbl_NotableActiveQueries – This table holds the output sys.dm_exec_sql_text and sys.dm_exec_query_stats snapshots collected by the Perf Stats script.

The file that needs to be present when SQL Nexus is imported the collected diagnostic data is <server name>__SQL_2008_Perf_Stats_Startup.OUT. The plethora of information available in this file can be mind boggling at times. So pushing all this information into a database makes a lot of sense.

The first order of the day is to determine if there was actually blocking experienced during the data collection. The output of the query will give you collected snapshots where blocking was detected.

<span style="font-size: small;">select runtime,count (*) as blocked_processes
from tbl_requests
where blocking_session_id > 0 and session_id <> blocking_session_id
group by runtime
having count(*) > 1
order by runtime

The next order of the day is to find out which sessions were blocked. The SQL Nexus reports do a good job of drilling down into each blocking session observed. But if you want a collated view of all the blocking along with the list of head blockers, then the following T-SQL script could prove useful especially when the .OUT file being imported is a few hundred MBs in size! This would help you slice and dice the MBs or GBs of data that was imported into the SQL Nexus database tables and get a consolidated view of the blocking chains.

/*
This Sample Code is provided for the purpose of illustration only and is not intended to be used in a production environment. THIS SAMPLE CODE AND ANY RELATED INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE. We grant You a nonexclusive, royalty-free right to use and modify the Sample Code and to reproduce and distribute the object code form of the Sample Code, provided that You agree: (i) to not use Our name, logo, or trademarks to market Your software product in which the Sample Code is embedded; (ii) to include a valid copyright notice on Your software product in which the Sample Code is embedded; and (iii) to indemnify, hold harmless, and defend Us and Our suppliers from and against any claims or lawsuits, including attorneys’ fees, that arise or result from the use or distribution of the Sample Code.

Author: Amit Banerjee

Modified: December 10, 2013

Description:

This T-SQL script extracts information about the blocking chains from the SQL Nexus database

*/

set nocount on
print '================== Blocking Analysis ====================='
print ''
print 'Runtime Information'
select min(runtime) as [Start Time], max(runtime) as [End Time] from tbl_Requests

if ((select count(*) from tbl_requests where blocking_session_id <> 0) > 0)
begin
print 'Blocking found'
print ''
print '============= Blocking Chains (Runtimes considered where blocking_session_id processes > 5) ==================='
-- Show a summary of the # of blocked sessions for every snapshot collected by the PerfStats script
select runtime,count (*) as blocked_processes
from tbl_requests
where blocking_session_id > 0 and session_id <> blocking_session_id
group by runtime
having count(*) > 1 -- The threshold can be changed as appropriate
order by runtime

print ''
print ''
print '================= Head Blocker Information ==================='
if (object_id('tmp_runtimes') <> NULL)
begin
drop table tmp_runtimes
end

-- Create a table to store the runtimes
CREATE TABLE tmp_runtimes(
[row_num] [bigint] IDENTITY(1,1) NOT NULL,
[runtime] datetime
) ON [PRIMARY]

-- Create a list of the distinct runtimes
insert into tmp_runtimes
select distinct (runtime)
from tbl_requests
where blocking_session_id <> 0
order by runtime

-- Start a while loop to fetch the data for the head blocker and blocked sessions
declare @count int,@loop int,@runtime datetime
select @count = count(*) from tmp_runtimes
set @loop = 1
while (@loop <> @count)
begin
select @runtime = runtime from tmp_runtimes where row_num = @loop

print '~~~~~~~~~~~~~~~ Runtime: '
select @runtime
print ''
print '~~~ Head Blocker'

select a.session_id,a.ecid,a.blocking_ecid,wait_type,wait_duration_ms,command,b.dbname,b.procname,b.stmt_text
from tbl_requests a
left outer join tbl_notableactivequeries b
on a.runtime = b.runtime and a.session_id = b.session_id
where blocking_session_id = 0 and a.runtime = @runtime and a.session_id in
(select blocking_session_id
from tbl_requests where blocking_session_id <> 0 and runtime = @runtime)

print ''
print '~~~ Blocked Session Information'
select *
from tbl_requests a
left outer join tbl_notableactivequeries b
on a.runtime = b.runtime and a.session_id = b.session_id
where a.blocking_session_id <> 0
and a.runtime = @runtime
order by a.session_id,a.blocking_session_id

set @loop = @loop + 1

end

drop table tmp_runtimes

end

else
begin
print 'No Blocking found'
end

set nocount off

With the help of the above output, you can get a collated view of all the blocking chains encountered during the data collection period. CSS engineers are Microsoft use SQL Nexus extensively for analyzing diagnostic data collected for SQL Server performance issues. I hope you find this script useful in analyzing blocking chains and do keep the feedback coming in. The above script is available for download on SkyDrive as well.

Please note that the PerfStats script collects data at 10 second intervals by default! If your blocking chains are for shorter durations then you will need to change the waitfor delay interval. If your blocking duration is lesser than a second, then data collection using the DMVs would not be of much help and you would need to go down the path of Extended Events tracing or other invasive forms of data collection (like XPerf utility).