I had written an introductory post on monitoring the system health using the default extended events sessions that runs on a SQL Server 2008 instance and above. Now it is time for next part for this post. In the first post, I provided a set of queries which would be used for getting all the errors that were recorded by the System Health Extended Events session. Now I can create a set of reports using Business Intelligence Development Studio which can be used by the Custom Reports feature of Management Studio.
I put together a three-level drill down report to get a summary report for all the errors reported by the T-SQL queries in my previous post. The dashboard report which will basically serve as the landing page for what I am terming as the System Health Dashboard looks something like this:
As you can see above, the report shows me the different events captured by the Extended Events session. The first level drill down provides a summary of all the different errors reported.
The last drill-down option is to go look into every occurrence of a specific error number. This report basically shows all the occurrences of a specific error number along with the query text (if captured) and specifics for the event recorded rather than the generic error description that you see in the above report.
The other category of events captured by the Health Session are wait information which fall under the following category:
- Any sessions that have waited on latches (or other interesting resources) for > 15 seconds.
- Any sessions that have waited on locks for > 30 seconds.
- Any sessions that have waited for a long time for preemptive waits. The duration varies by wait type. A preemptive wait is where SQL Server is waiting for external API calls.
The query below will give you the query which experienced the wait along with the wait times which were recorded by the System Health session.
SELECT CAST(xet.target_data AS XML) AS XMLDATA
FROM sys.dm_xe_session_targets xet
JOIN sys.dm_xe_sessions xe
ON (xe.address = xet.event_session_address)
WHERE xe.name = 'system_health'
;WITH CTE_HealthSession (EventXML) AS
SELECT C.query('.') EventXML
FROM #SystemHealthSessionData a
CROSS APPLY a.XMLDATA.nodes('/RingBufferTarget/event') as T(C)
WHERE C.query('.').value('(/event/@name)', 'varchar(255)') in ('wait_info','wait_info_external')
EventXML.value('(/event/@timestamp)', 'datetime') as EventTime,
EventXML.value('(/event/data/text)', 'varchar(50)') as WaitType,
EventXML.value('(/event/data/value)', 'int') as Duration,
EventXML.value('(/event/data/value)', 'int') as Max_Duration,
EventXML.value('(/event/data/value)', 'int') as Total_Duration,
EventXML.value('(/event/action/value)', 'varchar(10)') as Session_ID,
EventXML.value('(/event/action/value)', 'varchar(max)') as sql_text
ORDER BY EventTime DESC
DROP TABLE #SystemHealthSessionData
A sample output of the above query is shown above. The .sql file for the above query can be downloaded from here.
Note: Beware of false positives for PREEMPTIVE_OS_GETPROCADDRESS waits described here.
In the next post, I shall give a preview of the wait reports and provide another set of queries to track a category of events tracked by the System Health session.
Once I have completed this series, I shall upload all the series of reports to my SkyDrive so that they can be downloaded for your benefit.