Awesomesauce SQL Server 2012: System Health Session

There have been a spate of new features and enhancements to existing functionality in SQL Server 2012. The System Health Extended Events session is one of the benefactors. I had previously blogged about the usage of the System Health Session data using custom SSMS Reports in a four part series. Now it is time to talk about the enhancements to the System Health Session’s predecessor.

The SQL Server 2012 health session adds additional events to the default extended event health session running on the instance which is listed below to collect data for the following:

  • Out of memory errors for CLR and the database engine using the events:
    • sqlclr.clr_allocation_failure
    • sqlclr.clr_virtual_alloc_failure
    • sqlos.memory_broker_ring_buffer_recorded
    • sqlos.memory_node_oom_ring_buffer_recorded
  • Scheduler health issues using the events:
    • sqlos.scheduler_monitor_deadlock_ring_buffer_recorded
    • sqlos.scheduler_monitor_non_yielding_iocp_ring_buffer_recorded
    • sqlos.scheduler_monitor_non_yielding_ring_buffer_recorded
    • sqlos.scheduler_monitor_non_yielding_rm_ring_buffer_recorded
    • sqlos.scheduler_monitor_stalled_dispatcher_ring_buffer_recorded
    • sqlos.scheduler_monitor_system_health_ring_buffer_recorded
  • The results of the sp_server_diagnostics component
  • Security and connectivity errors encountered using the events:
    • sqlserver.connectivity_ring_buffer_recorded
    • sqlserver.security_error_ring_buffer_recorded

imageAdditional wait types are also tracked which adds to the existing list of waits captured by this session. Since SQL Server 2012 Management Studio allows you to view the configured Extended Events session on the SQL Server 2012 instance, you can get to the system_health session using the following steps from Object Explorer (see Screenshot 1):

Expand Management –> Extended Events –> system_health.

The Script Session as option now provides you with a way to view the session definition. Another enhancement made to this session is the target for this session is now also a .xel file which by default is located in your SQL Server LOG folder. The target definition for the System Health session are now:

a. Ring Buffer (as was in SQL Server 2008) 
b. XEL file – With a maximum size of 5 MB and a file retention policy of 4 files.

So now you can leverage the management studio XEvent UI to view the results. You can right-click on the ring buffer target in the UI and click on the View Target Data option which will give you an XML document. image

The coolest thing that I like here is the viewer’s capability to show you all the data from all the .xel files. Once you have the events loaded, you should see a view as shown in Screenshot 2. This gives you a list of all the events that were generated. You can sort on the event columns by clicking on them. The progress of the sorting will be visible towards the top right hand corner of SSMS as shown in Screenshot 3. Now that you have the basic view, let’s find out what more can we do with the System Health Session data displayed.

image

You can click on the Choose Columns option in the toolbar, select the columns that you want to view additionally apart from the name and timestamp by using the Available Columns list. See Screenshot 4 for an example.

image

Now that you have selected the additional columns, you can also set grouping (using the Grouping button) to view each of the event name categories, the number of events reported and aggregations on the column values (using the Aggregations button). So finally, I landed up with a view as shown below in Screenshot 5. What’s more, you can save the view settings using the Display Settings button!

image

I am currently writing a series on SSWUG for designing SSRS reports for viewing the events reported by the System Health Session. You can following the series here. If you have SQL Server 2012 RTM installed, then you need to download the T-SQL script attached to Mike Wachal’s blog post to address the wait type information captured.

Monitoring just got a whole lot easier!!

System Health Session: Part 2

I had written an introductory post on monitoring the system health using the default extended events sessions that runs on a SQL Server 2008 instance and above. Now it is time for next part for this post. In the first post, I provided a set of queries which would be used for getting all the errors that were recorded by the System Health Extended Events session. Now I can create a set of reports using Business Intelligence Development Studio which can be used by the Custom Reports feature of Management Studio.

I put together a three-level drill down report to get a summary report for all the errors reported by the T-SQL queries in my previous post. The dashboard report which will basically serve as the landing page for what I am terming as the System Health Dashboard looks something like this: image

As you can see above, the report shows me the different events captured by the Extended Events session. The first level drill down provides a summary of all the different errors reported.

image

The last drill-down option is to go look into every occurrence of a specific error number. This report basically shows all the occurrences of a specific error number along with the query text (if captured) and specifics for the event recorded rather than the generic error description that you see in the above report.

image

The other category of events captured by the Health Session are wait information which fall under the following category:

  • Any sessions that have waited on latches (or other interesting resources) for > 15 seconds.
  • Any sessions that have waited on locks for > 30 seconds.
  • Any sessions that have waited for a long time for preemptive waits. The duration varies by wait type. A preemptive wait is where SQL Server is waiting for external API calls.

The query below will give you the query which experienced the wait along with the wait times which were recorded by the System Health session.

SELECT CAST(xet.target_data AS XML) AS XMLDATA

INTO #SystemHealthSessionData

FROM sys.dm_xe_session_targets xet

JOIN sys.dm_xe_sessions xe

ON (xe.address = xet.event_session_address)

WHERE xe.name = 'system_health'

;WITH CTE_HealthSession (EventXML) AS

(

SELECT C.query('.') EventXML

FROM #SystemHealthSessionData a

CROSS APPLY a.XMLDATA.nodes('/RingBufferTarget/event') as T(C)

WHERE C.query('.').value('(/event/@name)[1]', 'varchar(255)') in ('wait_info','wait_info_external')

)

SELECT

EventXML.value('(/event/@timestamp)[1]', 'datetime') as EventTime,

EventXML.value('(/event/data/text)[1]', 'varchar(50)') as WaitType,

EventXML.value('(/event/data/value)[3]', 'int') as Duration,

EventXML.value('(/event/data/value)[4]', 'int') as Max_Duration,

EventXML.value('(/event/data/value)[5]', 'int') as Total_Duration,

EventXML.value('(/event/action/value)[2]', 'varchar(10)') as Session_ID,

EventXML.value('(/event/action/value)[3]', 'varchar(max)') as sql_text

FROM CTE_HealthSession

ORDER BY EventTime DESC

DROP TABLE #SystemHealthSessionData

A sample output of the above query is shown above. The .sql file for the above query can be downloaded from here.

image

Note: Beware of false positives for PREEMPTIVE_OS_GETPROCADDRESS waits described here.

In the next post, I shall give a preview of the wait reports and provide another set of queries to track a category of events tracked by the System Health session.

Once I have completed this series, I shall upload all the series of reports to my SkyDrive so that they can be downloaded for your benefit.

Sleeping session ID: Where art thou?

This has been an age old confusion ever since the Dynamic Management Views changed the way SQL Performance monitoring was done.

A SPID/Session ID will show up as Sleeping/Awaiting Command when it has no active work request that it needs to execute on the Database Engine. Consider the following scenario:

1. You connect to a SQL instance using Management Studio

2. You execute: SELECT @@SPID

3. You then open another SSMS query window and try and lookup the Session ID returned by the output of Step #2 using the DMV: sys.dm_exec_requests.

If you didn’t know of this behavior, then you will be in for a surprise! This is because the Session ID associated with the query in Step #2 has no active work to perform which is why the DMV didn’t report a row for that particular session. However, when you look up this session id using sys.dm_exec_connections or sys.sysprocesses. The reason sys.sysprocesses reports this SPID is because it doesn’t differentiate between a session with/without an active work request. It will report all SPIDs currently connected to the database engine.

The above behavior is expected and by-design.

Bob Dorr has mentioned about this in his post on the CSS SQL Escalation blog and also talks about how this can affect concurrency if such a session has open transactions:

How It Works: What is a Sleeping / Awaiting Command Session

Powershell script to calculate folder size

Just published a PS Script to calculate the TOP 5 folders for a UNC share. The need for this script arose when I needed to find out which folders were consuming the largest amount of disk space on a File Server. I know there are graphical utilities like DiskMon, Space Monger etc. but this script provides me the flexibility of automation like identifying the folder owners or deleting based on some criteria etc.

Click here to download the script.

[Blog Update]: August posts on SQLServerFAQ

I was working on a root cause analysis for an OOM (Out-of-Memory) issue for SQL Server 2008 R2 and I needed to schedule notifications when the available memory on the server fell below a certain value. SQL Server 2008 and above has a nifty little DMV to do just this. Read about it’s usefulness here:

The hidden gems among DMVs: sys.dm_os_sys_memory