Default Trace–Performance Issues

There are multiple events that a default trace in SQL Server 2005 and above tracks which can be significantly useful for finding out areas of improvement. The events that I will be concentrating on are:

1. Missing Column Statistics – This event class indicates that column statistics that could have been useful for the optimizer are not available due to which an incorrect cardinality estimation could occur. This can cause the optimizer to choose a less efficient query plan than expected. You will not see this event produced unless the option to auto-create statistics is turned off.

2. Missing Join Predicate – This event class indicates that a query is being executed that has no join predicate. (A join predicate is the ON search condition for a joined table in a FROM clause.) This could result in a long-running query. This event is produced only if both sides of the join return more than one row.

3. Sort Warnings – This event class indicates that sort operations do not fit into memory. This does not include sort operations involving the creation of indexes, only sort operations within a query (such as an ORDER BY clause used in a SELECT statement). The EventSubClass field in this event shows whether this was a single pass or a multiple pass. A single pass (EventSubClass = 1) is when the sort table was written to disk, only a single additional pass over the data was required to obtain sorted output. A multiple pass (EventSubClass = 2) is when the sort table was written to disk, multiple passes over the data were required to obtain sorted output. A multiple pass is an enemy of query performance.

4. Hash Warnings – This event class can be used to monitor when a hash recursion or cessation of hashing (hash bailout) has occurred during a hashing operation.  Hash recursion (EventSubClass = 0) occurs when the build input does not fit into available memory, resulting in the split of input into multiple partitions that are processed separately. Hash bailout (EventSubClass = 1) occurs when a hashing operation reaches its maximum recursion level and shifts to an alternate plan to process the remaining partitioned data. Hash bailout usually occurs because of skewed data. Another enemy of performance!

5. Server Memory Change – This event class occurs when Microsoft SQL Server memory usage has increased or decreased. You can even determine what is the current memory usage after the increase or decrease.

6. Log File Auto Grow – This event class indicates that the log file grew automatically. This event is not triggered if the log file is grown explicitly through ALTER DATABASE. Frequent log file growths are not food for performance.

7. Data File Auto Grow – This event class indicates that the data file grew automatically. This event is not triggered if the data file is grown explicitly by using the ALTER DATABASE statement.

Since this information is already available in the default trace, I decided to use my Default Trace Statistics Power View Excel sheet to track this information graphically. And this is what I got (see screenshot 1)!

DefaultTrace_PerfIssues

So what is the above Excel sheet displaying?

1. The information available in the first column chart will show the Data and Log file grow events per database.

2. The first matrix in the middle of the Excel sheet shows the number of Sort Warnings and Hash Warnings with drill-down capabilities for each database to see the EventSubClass fields.

3. The second matrix shows the Missing Column Statistics and the Missing Join Predicate events for each database. The drill-down capability gives the name of the column statistics that was missing.

4. The line graph shows the change in memory for the SQL Server database engine.

Happy monitoring!

Previous posts in this series:

Schema Changes History Report

WOOT: Schema Changes History Report on Power View

The last post in this series talked about using Power View to analyze the data stored in the SQL Server’s default trace. I decided to take this a step further by creating the Schema Changes History report with the help of the data that I retrieved from the Default Traces. The advantage of a report created in Power View is that the interactivity which is missing in the standard report is available.

The way I created this report was to filter the data in the Power Pivot table using EventClass ID 46, 47 and 164 for only looking at the create, drop and alter commands which the default trace tracks. After that I created a table with a tile on the Database Name and a 100% Stacked Bar Chart to show the activity at a database level.

I also had to create linked tables for getting the Object Type and the Event Class Name that you see in the table below.

I will provide a final version of the Excel sheet once I have completed the other dashboards and sanitized the information available in the Power Pivot table.

SchemaChangesHistory

Previous Post in the Series:

Default Trace Dashboard
https://troubleshootingsql.com/2013/09/26/woot-default-trace-and-power-view/

PowerView and System Health Session– IO Health

Previous posts in this series:

PowerView and System Health Session–CPU health

PowerView and System Health Session–Scheduler Health

PowerView and System Health Session–SQL Memory Health

The SQL Server support team does get a lot of calls regarding slow performance which on analysis leads to a slow performing disk sub-system. The IO_SUBSYSTEM component of the sp_server_diagnostics output in SQL Server 2012 tracks I/O related latch timeouts and long duration I/Os reported along with the filename and the longest pending I/O duration. This information can be very useful when looking at the trends of slow I/O reported on the SQL Server database files on an instance.

As shown earlier in the series, I used this data captured by the sp_server_diagnostics output present in the System Health Session ring buffers to build visualizations using Power Pivot and Power View in Excel 2013. The query available at the bottom of this blog post allowed me to fetch the information from the System Health Session ring buffer into a Power Pivot table.

After that I created a two calculated fields for Hour and Day using the Event Time field in the table. Then, I created two calculated fields for tracking the maximum number of Long IOs and IO Latch Timeouts reported. Then I assigned KPIs to each of these calculated fields. After that I got down to designing the Powershell sheet which finally looked like the image in the screenshot!

The slider enables you to see the KPI status for each day on an hourly basis and the table on the right gives you insights into every snapshot captured by the sp_server_diagnostics output for the hour that you are interested in.

As usual the Excel sheet is available on SkyDrive at: http://sdrv.ms/10O0udO

IO Statistics

Query to fetch the above data is available below:


SET NOCOUNT ON
-- Fetch data for only SQL Server 2012 instances

IF (SUBSTRING(CAST(SERVERPROPERTY ('ProductVersion') AS varchar(50)),1,CHARINDEX('.',CAST(SERVERPROPERTY ('ProductVersion') AS varchar(50)))-1) >= 11)

BEGIN

-- Get UTC time difference for reporting event times local to server time

DECLARE @UTCDateDiff int = DATEDIFF(mi,GETUTCDATE(),GETDATE());

-- Store XML data retrieved in temp table

SELECT TOP 1 CAST(xet.target_data AS XML) AS XMLDATA

INTO #SystemHealthSessionData

FROM sys.dm_xe_session_targets xet

JOIN sys.dm_xe_sessions xe

ON (xe.address = xet.event_session_address)

WHERE xe.name = 'system_health'

AND xet.target_name = 'ring_buffer';

-- Parse XML data and provide required values in the form of a table

;WITH CTE_HealthSession (EventXML) AS

(

SELECT C.query('.') EventXML

FROM #SystemHealthSessionData a

CROSS APPLY a.XMLDATA.nodes('/RingBufferTarget/event') as T(C)

)

SELECT

DATEADD(mi,@UTCDateDiff,EventXML.value('(/event/@timestamp)[1]','datetime')) as [Event Time],

EventXML.value('(/event/data/text)[1]','varchar(255)') as Component,

EventXML.value('(/event/data/value/ioSubsystem/@ioLatchTimeouts)[1]','bigint') as [IO Latch Timeouts],

EventXML.value('(/event/data/value/ioSubsystem/@totalLongIos)[1]','bigint') as [Total Long IOs],

EventXML.value('(/event/data/value/ioSubsystem/longestPendingRequests/pendingRequest/@filePath)[1]','varchar(8000)') as [Longest Pending Request File],

EventXML.value('(/event/data/value/ioSubsystem/longestPendingRequests/pendingRequest/@duration)[1]','bigint') as [Longest Pending IO Duration]

FROM CTE_HealthSession

WHERE EventXML.value('(/event/@name)[1]', 'varchar(255)') = 'sp_server_diagnostics_component_result'

AND EventXML.value('(/event/data/text)[1]','varchar(255)') = 'IO_SUBSYSTEM'

ORDER BY [Event Time];

DROP TABLE #SystemHealthSessionData

END 

Creating custom XEvent Templates

SQL Server 2012 added the Extended Events UI which made working with Extended Events session much easier. You no longer need to chalk up lengthy T-SQL scripts to spawn off your Extended Events monitoring session. The UI also provides you an option to view the collected data and perform groupings and aggregations for easier analysis.

The templates that you see while configuring an Extended Event session are located at “C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Templates\sql\xevent“. When you export an Extended Events session, you will be given an option to export the Session as a Template to the following folder: “C:\Users\<user name>\Documents\SQL Server Management Studio\Templates\XEventTemplates“. The template is always exported as an XML file. Now when you create a new Extended Event Session, you will see the template under the user templates section in the templates view as shown in the screen shot below:

You can now choose to create as many templates as you want for your monitoring purposes. However, the template files have to be deposited in the above mentioned location so that Management Studio can show them to you while creating an Extended Events session.

The above blog post is a manifestation of a seemingly simple question asked by Naman Vadhera yesterday while he was exploring the Extended Events UI option.

Reference: http://msdn.microsoft.com/en-us/library/ms174169.aspx

System Health Session and beyond

I had blogged extensively about System Health Session available in SQL Server 2008 and above. I had also demonstrated how the System Health Session can be used in conjunction with SQL Server Management Studio reports in previous series that I ran on my blog.

System Health Session Dashboard Reports for SQL Server 2008 and SQL Server 2008 R2 – A set of reports available for SQL Server 2008/SQL Server 2008 R2 which provide visualizations for the events tracked by the System Health Session. The reports can be used using the custom reports option in SQL Server Management Studio.

System Health Session Dashboard – Shows the basic information of the information that can be fetched from the System Health Session.

System Health Session Dashboard — sp_server_diagnostics and more – A look into the different issues tracked by the sp_server_diagnostics script and how to build visualizations using that information.

System Health Session Dashboard – sp_server_diagnostics – The basics of what is covered by the sp_server_diagnostics procedure and how to build visualizations on top of the data captured.

System Health Session Dashboard – Error Summary Report
– Shows information on how to write queries to fetch information about various errors tracked by the System Health Session. Shows information tracked by the enhanced System Health Session in SQL Server 2012.

System Health Session Dashboard – Health Summary Report
– Gives you information about the SQL Server instance health like CPU usage, memory etc. A screenshot of the report is shown below:

I have added UTC time formatting so that the time is shown in the same time zone based on the server on which you are viewing the report.

I will keep updating this post once I keep updating with the new reports. As usual, feedback is always welcome.

The report definitions (.rdl) can be downloaded from here.

Awesomesauce SQL Server 2012: System Health Session

There have been a spate of new features and enhancements to existing functionality in SQL Server 2012. The System Health Extended Events session is one of the benefactors. I had previously blogged about the usage of the System Health Session data using custom SSMS Reports in a four part series. Now it is time to talk about the enhancements to the System Health Session’s predecessor.

The SQL Server 2012 health session adds additional events to the default extended event health session running on the instance which is listed below to collect data for the following:

  • Out of memory errors for CLR and the database engine using the events:
    • sqlclr.clr_allocation_failure
    • sqlclr.clr_virtual_alloc_failure
    • sqlos.memory_broker_ring_buffer_recorded
    • sqlos.memory_node_oom_ring_buffer_recorded
  • Scheduler health issues using the events:
    • sqlos.scheduler_monitor_deadlock_ring_buffer_recorded
    • sqlos.scheduler_monitor_non_yielding_iocp_ring_buffer_recorded
    • sqlos.scheduler_monitor_non_yielding_ring_buffer_recorded
    • sqlos.scheduler_monitor_non_yielding_rm_ring_buffer_recorded
    • sqlos.scheduler_monitor_stalled_dispatcher_ring_buffer_recorded
    • sqlos.scheduler_monitor_system_health_ring_buffer_recorded
  • The results of the sp_server_diagnostics component
  • Security and connectivity errors encountered using the events:
    • sqlserver.connectivity_ring_buffer_recorded
    • sqlserver.security_error_ring_buffer_recorded

imageAdditional wait types are also tracked which adds to the existing list of waits captured by this session. Since SQL Server 2012 Management Studio allows you to view the configured Extended Events session on the SQL Server 2012 instance, you can get to the system_health session using the following steps from Object Explorer (see Screenshot 1):

Expand Management –> Extended Events –> system_health.

The Script Session as option now provides you with a way to view the session definition. Another enhancement made to this session is the target for this session is now also a .xel file which by default is located in your SQL Server LOG folder. The target definition for the System Health session are now:

a. Ring Buffer (as was in SQL Server 2008) 
b. XEL file – With a maximum size of 5 MB and a file retention policy of 4 files.

So now you can leverage the management studio XEvent UI to view the results. You can right-click on the ring buffer target in the UI and click on the View Target Data option which will give you an XML document. image

The coolest thing that I like here is the viewer’s capability to show you all the data from all the .xel files. Once you have the events loaded, you should see a view as shown in Screenshot 2. This gives you a list of all the events that were generated. You can sort on the event columns by clicking on them. The progress of the sorting will be visible towards the top right hand corner of SSMS as shown in Screenshot 3. Now that you have the basic view, let’s find out what more can we do with the System Health Session data displayed.

image

You can click on the Choose Columns option in the toolbar, select the columns that you want to view additionally apart from the name and timestamp by using the Available Columns list. See Screenshot 4 for an example.

image

Now that you have selected the additional columns, you can also set grouping (using the Grouping button) to view each of the event name categories, the number of events reported and aggregations on the column values (using the Aggregations button). So finally, I landed up with a view as shown below in Screenshot 5. What’s more, you can save the view settings using the Display Settings button!

image

I am currently writing a series on SSWUG for designing SSRS reports for viewing the events reported by the System Health Session. You can following the series here. If you have SQL Server 2012 RTM installed, then you need to download the T-SQL script attached to Mike Wachal’s blog post to address the wait type information captured.

Monitoring just got a whole lot easier!!

System Health Session: Part 2

I had written an introductory post on monitoring the system health using the default extended events sessions that runs on a SQL Server 2008 instance and above. Now it is time for next part for this post. In the first post, I provided a set of queries which would be used for getting all the errors that were recorded by the System Health Extended Events session. Now I can create a set of reports using Business Intelligence Development Studio which can be used by the Custom Reports feature of Management Studio.

I put together a three-level drill down report to get a summary report for all the errors reported by the T-SQL queries in my previous post. The dashboard report which will basically serve as the landing page for what I am terming as the System Health Dashboard looks something like this: image

As you can see above, the report shows me the different events captured by the Extended Events session. The first level drill down provides a summary of all the different errors reported.

image

The last drill-down option is to go look into every occurrence of a specific error number. This report basically shows all the occurrences of a specific error number along with the query text (if captured) and specifics for the event recorded rather than the generic error description that you see in the above report.

image

The other category of events captured by the Health Session are wait information which fall under the following category:

  • Any sessions that have waited on latches (or other interesting resources) for > 15 seconds.
  • Any sessions that have waited on locks for > 30 seconds.
  • Any sessions that have waited for a long time for preemptive waits. The duration varies by wait type. A preemptive wait is where SQL Server is waiting for external API calls.

The query below will give you the query which experienced the wait along with the wait times which were recorded by the System Health session.

SELECT CAST(xet.target_data AS XML) AS XMLDATA

INTO #SystemHealthSessionData

FROM sys.dm_xe_session_targets xet

JOIN sys.dm_xe_sessions xe

ON (xe.address = xet.event_session_address)

WHERE xe.name = 'system_health'

;WITH CTE_HealthSession (EventXML) AS

(

SELECT C.query('.') EventXML

FROM #SystemHealthSessionData a

CROSS APPLY a.XMLDATA.nodes('/RingBufferTarget/event') as T(C)

WHERE C.query('.').value('(/event/@name)[1]', 'varchar(255)') in ('wait_info','wait_info_external')

)

SELECT

EventXML.value('(/event/@timestamp)[1]', 'datetime') as EventTime,

EventXML.value('(/event/data/text)[1]', 'varchar(50)') as WaitType,

EventXML.value('(/event/data/value)[3]', 'int') as Duration,

EventXML.value('(/event/data/value)[4]', 'int') as Max_Duration,

EventXML.value('(/event/data/value)[5]', 'int') as Total_Duration,

EventXML.value('(/event/action/value)[2]', 'varchar(10)') as Session_ID,

EventXML.value('(/event/action/value)[3]', 'varchar(max)') as sql_text

FROM CTE_HealthSession

ORDER BY EventTime DESC

DROP TABLE #SystemHealthSessionData

A sample output of the above query is shown above. The .sql file for the above query can be downloaded from here.

image

Note: Beware of false positives for PREEMPTIVE_OS_GETPROCADDRESS waits described here.

In the next post, I shall give a preview of the wait reports and provide another set of queries to track a category of events tracked by the System Health session.

Once I have completed this series, I shall upload all the series of reports to my SkyDrive so that they can be downloaded for your benefit.