PowerView and System Health Session– System

Previous posts in this series:

PowerView and System Health Session–CPU health

PowerView and System Health Session–Scheduler Health

PowerView and System Health Session–SQL Memory Health

PowerView and System Health Session– IO Health

In the last post for this series, I had explained how to retrieve the I/O statistics from the System Health Session data. In this post, I will describe how to build a dashboard using the SYSTEM component of the sp_server_diagnostics output. This view will help DBAs track various errors which can get their blood pressure shooting to abnormal levels. The SYSTEM component tracks various errors like non-yielding conditions, latch related warnings, inconsistent pages detected and access violations for the SQL Server instance.

Armed with this information in a Power Pivot table, I created two calculated columns for DAY and HOUR on the time the event was reported. After that I created KPIs on the maximum number of non-yielding conditions, latch related warnings, inconsistent pages and access violations reported.

Now that I have my Power Pivot data, I created a new Power View sheet which tracks the created KPIs for each day and hour. The screenshot below shows the final view.

The first half is a 100% Stacked Bar graph showing the various errors that were reported each day. There is a slicer for Day available which allows to filter the data quickly.

The second half of the report is a matrix which shows the KPI status for which day with a drill-down capability for hour.

The third half of the report shows a card view with the actual number of issues reported for each event against a particular time.

As usual the Excel sheet is available on SkyDrive at: http://sdrv.ms/10O0udO

Issue Statistics

The query to fetch the data required to build this report is available below.


SET NOCOUNT ON

-- Fetch data for only SQL Server 2012 instances

IF (SUBSTRING(CAST(SERVERPROPERTY ('ProductVersion') AS varchar(50)),1,CHARINDEX('.',CAST(SERVERPROPERTY ('ProductVersion') AS varchar(50)))-1) >= 11)

BEGIN
-- Get UTC time difference for reporting event times local to server time

DECLARE @UTCDateDiff int = DATEDIFF(mi,GETUTCDATE(),GETDATE());

-- Store XML data retrieved in temp table

SELECT TOP 1 CAST(xet.target_data AS XML) AS XMLDATA

INTO #SystemHealthSessionData

FROM sys.dm_xe_session_targets xet

JOIN sys.dm_xe_sessions xe

ON (xe.address = xet.event_session_address)

WHERE xe.name = 'system_health'

AND xet.target_name = 'ring_buffer';

-- Parse XML data and provide required values in the form of a table

;WITH CTE_HealthSession (EventXML) AS

(

SELECT C.query('.') EventXML

FROM #SystemHealthSessionData a

CROSS APPLY a.XMLDATA.nodes('/RingBufferTarget/event') as T(C)

)

SELECT

DATEADD(mi,@UTCDateDiff,EventXML.value('(/event/@timestamp)[1]','datetime')) as [Event Time],

EventXML.value('(/event/data/text)[1]','varchar(255)') as Component,

EventXML.value('(/event/data/value/system/@latchWarnings)[1]','bigint') as [Latch Warnings],

EventXML.value('(/event/data/value/system/@isAccessViolationOccurred)[1]','bigint') as [Access Violations],

EventXML.value('(/event/data/value/system/@nonYieldingTasksReported)[1]','bigint') as [Non Yields Reported],

EventXML.value('(/event/data/value/system/@BadPagesDetected)[1]','bigint') as [Bad Pages Detected],

EventXML.value('(/event/data/value/system/@BadPagesFixed)[1]','bigint') as [Bad Pages Fixed]

FROM CTE_HealthSession

WHERE EventXML.value('(/event/@name)[1]', 'varchar(255)') = 'sp_server_diagnostics_component_result'

AND EventXML.value('(/event/data/text)[1]','varchar(255)') = 'SYSTEM'

ORDER BY [Event Time];

DROP TABLE #SystemHealthSessionData

END

TroubleshootingSQL Bytes–CPU usage analysis with Excel 2013

A screencast showing the CPU usage statistics of a SQL Server 2012 instance retrieved using Power Pivot. The visualization has been built using Power View in Excel 2013. The nuts and bolts of how the visualization was created is available in the following blog post: PowerView and System Health Session–CPU health

TroubleshootingSQL–CPU usage analysis with Excel 2013

PowerView and System Health Session– IO Health

Previous posts in this series:

PowerView and System Health Session–CPU health

PowerView and System Health Session–Scheduler Health

PowerView and System Health Session–SQL Memory Health

The SQL Server support team does get a lot of calls regarding slow performance which on analysis leads to a slow performing disk sub-system. The IO_SUBSYSTEM component of the sp_server_diagnostics output in SQL Server 2012 tracks I/O related latch timeouts and long duration I/Os reported along with the filename and the longest pending I/O duration. This information can be very useful when looking at the trends of slow I/O reported on the SQL Server database files on an instance.

As shown earlier in the series, I used this data captured by the sp_server_diagnostics output present in the System Health Session ring buffers to build visualizations using Power Pivot and Power View in Excel 2013. The query available at the bottom of this blog post allowed me to fetch the information from the System Health Session ring buffer into a Power Pivot table.

After that I created a two calculated fields for Hour and Day using the Event Time field in the table. Then, I created two calculated fields for tracking the maximum number of Long IOs and IO Latch Timeouts reported. Then I assigned KPIs to each of these calculated fields. After that I got down to designing the Powershell sheet which finally looked like the image in the screenshot!

The slider enables you to see the KPI status for each day on an hourly basis and the table on the right gives you insights into every snapshot captured by the sp_server_diagnostics output for the hour that you are interested in.

As usual the Excel sheet is available on SkyDrive at: http://sdrv.ms/10O0udO

IO Statistics

Query to fetch the above data is available below:


SET NOCOUNT ON
-- Fetch data for only SQL Server 2012 instances

IF (SUBSTRING(CAST(SERVERPROPERTY ('ProductVersion') AS varchar(50)),1,CHARINDEX('.',CAST(SERVERPROPERTY ('ProductVersion') AS varchar(50)))-1) >= 11)

BEGIN

-- Get UTC time difference for reporting event times local to server time

DECLARE @UTCDateDiff int = DATEDIFF(mi,GETUTCDATE(),GETDATE());

-- Store XML data retrieved in temp table

SELECT TOP 1 CAST(xet.target_data AS XML) AS XMLDATA

INTO #SystemHealthSessionData

FROM sys.dm_xe_session_targets xet

JOIN sys.dm_xe_sessions xe

ON (xe.address = xet.event_session_address)

WHERE xe.name = 'system_health'

AND xet.target_name = 'ring_buffer';

-- Parse XML data and provide required values in the form of a table

;WITH CTE_HealthSession (EventXML) AS

(

SELECT C.query('.') EventXML

FROM #SystemHealthSessionData a

CROSS APPLY a.XMLDATA.nodes('/RingBufferTarget/event') as T(C)

)

SELECT

DATEADD(mi,@UTCDateDiff,EventXML.value('(/event/@timestamp)[1]','datetime')) as [Event Time],

EventXML.value('(/event/data/text)[1]','varchar(255)') as Component,

EventXML.value('(/event/data/value/ioSubsystem/@ioLatchTimeouts)[1]','bigint') as [IO Latch Timeouts],

EventXML.value('(/event/data/value/ioSubsystem/@totalLongIos)[1]','bigint') as [Total Long IOs],

EventXML.value('(/event/data/value/ioSubsystem/longestPendingRequests/pendingRequest/@filePath)[1]','varchar(8000)') as [Longest Pending Request File],

EventXML.value('(/event/data/value/ioSubsystem/longestPendingRequests/pendingRequest/@duration)[1]','bigint') as [Longest Pending IO Duration]

FROM CTE_HealthSession

WHERE EventXML.value('(/event/@name)[1]', 'varchar(255)') = 'sp_server_diagnostics_component_result'

AND EventXML.value('(/event/data/text)[1]','varchar(255)') = 'IO_SUBSYSTEM'

ORDER BY [Event Time];

DROP TABLE #SystemHealthSessionData

END 

System Health Session and beyond

I had blogged extensively about System Health Session available in SQL Server 2008 and above. I had also demonstrated how the System Health Session can be used in conjunction with SQL Server Management Studio reports in previous series that I ran on my blog.

System Health Session Dashboard Reports for SQL Server 2008 and SQL Server 2008 R2 – A set of reports available for SQL Server 2008/SQL Server 2008 R2 which provide visualizations for the events tracked by the System Health Session. The reports can be used using the custom reports option in SQL Server Management Studio.

System Health Session Dashboard – Shows the basic information of the information that can be fetched from the System Health Session.

System Health Session Dashboard — sp_server_diagnostics and more – A look into the different issues tracked by the sp_server_diagnostics script and how to build visualizations using that information.

System Health Session Dashboard – sp_server_diagnostics – The basics of what is covered by the sp_server_diagnostics procedure and how to build visualizations on top of the data captured.

System Health Session Dashboard – Error Summary Report
– Shows information on how to write queries to fetch information about various errors tracked by the System Health Session. Shows information tracked by the enhanced System Health Session in SQL Server 2012.

System Health Session Dashboard – Health Summary Report
– Gives you information about the SQL Server instance health like CPU usage, memory etc. A screenshot of the report is shown below:

I have added UTC time formatting so that the time is shown in the same time zone based on the server on which you are viewing the report.

I will keep updating this post once I keep updating with the new reports. As usual, feedback is always welcome.

The report definitions (.rdl) can be downloaded from here.

System Health Session and Deadlocks

I had blogged about retrieving deadlock related information using the default Extended Event session which runs by default on all SQL Server 2008 instances and above. However, once you have retrieved the XML deadlock graph, it could be quite cumbersome to read if the deadlock happens to be complex or involves multiple nodes. I frequently require the need to fetching the information about past deadlocks from the System Health Session data while working on customer environments. Due to the frequent repetitive nature of the data collection, I decided to automate this task.

I again decided to use a combination of Powershell and T-SQL to extract this information. The Powershell script (TransformtoXDL.ps1), which requires Powershell 2.0 or above, uses a T-SQL script (TransformtoXDL.sql ) to extract data from the System Health Session and outputs each individual deadlock graph as a separate .XDL file into a folder of your choice with the timestamp of the occurrence of the deadlock. Note that the time reported will be in GMT timezone.

The powershell script accepts two parameters: vServername for the SQL Server instance that you want to extract the data from and the vPath for the folder into which the XDL files should be saved into.

.\TransformToXDL.ps1 -vServername "<server name>" -vPath "C:\Tempdb\"

Yes… I have a folder called Tempdb on my C: drive!! Smile

A sample output is shown in the screenshot below:

image

The Transact-SQL script called TransformtoXDL.sql does the following:

1. Extracts the System Health Session data into a temporary table
2. Based on the version of your SQL Server instance, it performs the parsing to extract the deadlock graph. This script accounts for issues mentioned in KB978629. I would like to thank Michael Zilberstein [B] for the proposed corrective action on an issue that Jonathan Kehayias [B|T] had blogged about.
3. The last action that the script takes is to perform XML modification to get the XML deadlock data in the same format which is recognized by SQL Server 2012 Management Studio when viewing XDL files.

The powershell and T-SQL script can be downloaded here.


#    Script Name: TransformToXDL
#    Author: Amit Banerjee
#    Date: September 6, 2012
#    Description:
#    The script reads the deadlock graphs from the System Health Session
#    Ring Buffer and parses them to create an individual deadlock graph
#    in a folder of your choice.
#    Usage: .\TransformToXDL.ps1 -vServername "INST1" -vPath "C:\Tempdb\"
# This Sample Code is provided for the purpose of illustration only and is not intended to be used in a production environment. THIS SAMPLE CODE AND ANY RELATED INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE. We grant You a nonexclusive, royalty-free right to use and modify the Sample Code and to reproduce and distribute the object code form of the Sample Code, provided that You agree: (i) to not use Our name, logo, or trademarks to market Your software product in which the Sample Code is embedded; (ii) to include a valid copyright notice on Your software product in which the Sample Code is embedded; and (iii) to indemnify, hold harmless, and defend Us and Our suppliers from and against any claims or lawsuits, including attorneys’ fees, that arise or result from the use or distribution of the Sample Code.

Param ([string] $vServername,
[string] $vPath)

cls

# Load the SQL Server snap-in for using sqlcmd cmdlet
$ErrorActionPreference = "SilentlyContinue"
Import-Module sqlps
$ErrorActionPreference = "Continue"

Write-Host "`nConnecting to SQL Server instance " $vServerName " to extract deadlock information"
# Extract the deadlock graphs and parse them in the system health session
Invoke-Sqlcmd -InputFile "C:\Tempdb\TransformToXDL.sql" -ServerInstance $vServerName

# Function to get the information from the table stored in tempdb
function Get-SqlData
{
param(
[string]$serverName=$(throw 'serverName is required.'),
[string]$databaseName,
[string]$query
)

$connString = "Server=$serverName;Database=$databaseName;Integrated Security=SSPI;"
$da = New-Object "System.Data.SqlClient.SqlDataAdapter" ($query,$connString)
$dt = New-Object "System.Data.DataTable"
[void]$da.fill($dt)
$dt

}

# Get the data stored in tempdb using the function defined above
$rows = get-sqldata $vServername  "tempdb"  "select row_id,event_time,deadlockgraph from tempdb.dbo.deadlock_graphs"

$vCount = 0
# Extract each row retrieved into an individual XDL file with the timestamp of the issue
foreach ($row in $rows)
{
if($row -ne $null)
{
$vCount++
$vFileName = $vPath + $vServername.Replace("\","_")+ "_" + $row.event_time.ToString().Replace(":","_").Replace("/","_") + ".xdl"
Write-Host "`nCreating file: "  $vFileName
$row.deadlockgraph | Out-File $vFileName
}
}

Write-Host "`nDeadlocks found: " $vCount.ToString()
Write-Host "`nPerforming cleanup"
Invoke-Sqlcmd -Query "IF EXISTS (SELECT TOP 1 name FROM tempdb.sys.objects where name = 'deadlock_graphs')
BEGIN
DROP TABLE tempdb.dbo.deadlock_graphs
END" -ServerInstance $vServerName

The above has been tested on SQL Server 2008, SQL Server 2008 R2 and SQL Server 2012. The resulting XDL files can be opened in SQL Server 2012 Management Studio. I am always looking for feedback. So please feel free to Tweet, Facebook or Email me regarding any issues or enhancements that you might need for the same.

You might want to remember that the T-SQL query used is a resource intensive query and it is preferable that you run this extraction exercise during non-business hours especially if your SQL Server instance is experiencing a large number of deadlocks.