T-SQL Tuesday#17: It’s all about APPLYcation this time

imageIt’s time for another round of T-SQL Tuesday and this round of the revolving blog party is being hosted by Matt Velic [Blog | Twitter].

APPLY – That is the topic for this month’s T-SQL Tuesday! The APPLY operator was added to the T-SQL repertoire and which has resulted in lesser use of cursors for a large number of diagnostic scripts that CSS uses to collect data while working on SQL Performance issues. In this blog, I shall share a few examples of such queries that we use to collect data while working on SQL Performance cases.

TOP Query Plan Statistics

The following query gives you a list of the SQL batches/procedures with their CPU usage, Query/Batch duration and Physical Reads rank. This query helps identify the TOP CPU/Duration/Read consuming queries by making use of system DMVs. The output below is useful for the following reasons:

1. I get the usecount of the procedure/batch and if this batch is called multiple times and the use count of a Compiled Proc cached object is only 1, then the plan is not being re-used. This now tells me that I need to look at reasons behind inability of plan re-use.

2. I get the total and average resource usage statistics for each of the queries listed in the output.

3. A quick glance at the output gives me an idea of the most expensive queries on the instance w.r.t. reads or/and CPU and/or query duration.


SELECT

LEFT(p.cacheobjtype + ' (' + p.objtype + ')',35) AS cacheobjtype,

p.usecounts,

p.size_in_bytes/1024  AS size_in_kb,

PlanStats.total_worker_time/1000 AS tot_cpu_ms,

PlanStats.total_elapsed_time/1000 AS tot_duration_ms,

PlanStats.total_physical_reads,

PlanStats.total_logical_writes,

PlanStats.total_logical_reads,

PlanStats.CpuRank,

PlanStats.PhysicalReadsRank,

PlanStats.DurationRank,

LEFT(CASE WHEN pa.value = 32767 THEN 'ResourceDb' ELSE ISNULL(DB_NAME(CONVERT(sysname,pa.value)),CONVERT(sysname,pa.value)) END,40) AS dbname,

sql.objectid,

CONVERT(nvarchar(50), CASE WHEN sql.objectid IS NULL THEN NULL ELSE REPLACE(REPLACE(sql.[text],CHAR(13),' '),CHAR(10),' ') END) AS procname,  REPLACE(REPLACE(SUBSTRING(sql.[text],PlanStats.statement_start_offset/2+1,CASE WHEN PlanStats.statement_end_offset=-1 THEN LEN(CONVERT(nvarchar(max),sql.[text]))

ELSE PlanStats.statement_end_offset/2 - PlanStats.statement_start_offset/2+1 END),CHAR(13),' '),CHAR(10),' ') AS stmt_text

FROM

(

SELECT

stat.plan_handle,

statement_start_offset,

statement_end_offset,

stat.total_worker_time,

stat.total_elapsed_time,

stat.total_physical_reads,

stat.total_logical_writes,

stat.total_logical_reads,

ROW_NUMBER()OVER ( ORDER BY stat.total_worker_time DESC ) AS CpuRank,

ROW_NUMBER()OVER ( ORDER BY stat.total_physical_reads DESC ) AS PhysicalReadsRank,

ROW_NUMBER()OVER ( ORDER BY stat.total_elapsed_time DESC ) AS DurationRank

FROM sys.dm_exec_query_stats stat

) AS PlanStats

INNER JOIN sys.dm_exec_cached_plans p

ON p.plan_handle =  PlanStats.plan_handle

OUTER APPLY sys.dm_exec_plan_attributes ( p.plan_handle ) pa

OUTER APPLY sys.dm_exec_sql_text ( p.plan_handle ) AS sql

WHERE

(PlanStats.CpuRank<50

OR PlanStats.PhysicalReadsRank<50

OR PlanStats.DurationRank<50)

AND

pa.attribute='dbid'

ORDER BY tot_cpu_ms DESC


Top Queries with Similar Query Hash and Query Plan Hash

 

SELECT TOP 10 query_plan_hash, query_hash,

COUNT (distinct query_plan_hash) as 'distinct query_plan_hash count',

SUM(execution_count) as 'execution_count',

SUM(total_worker_time) as 'total_worker_time',

SUM(total_elapsed_time) as 'total_elapsed_time',

SUM (total_logical_reads) as 'total_logical_reads',

MAX(REPLACE (REPLACE (SUBSTRING (st.[text], qs.statement_start_offset/2 + 1,CASE WHEN qs.statement_end_offset = -1 THEN LEN (CONVERT(nvarchar(max), st.[text])) ELSE qs.statement_end_offset/2 - qs.statement_start_offset/2 + 1 END), CHAR(13), ' '), CHAR(10), ' '))  AS sample_statement_text,

MIN(CAST(query_plan as varchar(max))) AS 'ShowPlan XML'

FROM sys.dm_exec_query_stats AS qs

CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st

CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) as sp

GROUP BY query_plan_hash, query_hash

ORDER BY sum(total_worker_time) ASC;

This is a query which can help you identify queries which have the same query plan hash

SQL Server Books Online topic “Finding and Tuning Similar Queries by Using Query and Query Plan Hashes” has more information on this topic. The query hash feature was added in SQL Server 2008 which made it easier to troubleshooting performance issues caused by ad-hoc queries which differed in just literal values. RML Utilities does a similar task by creating query hash but now if you are troubleshooting on the server, you can do this using DMVs without having to capture a profiler trace.

The right operand supplied to the Apply operator is a function of one or more column values that are present in the left operand. So basically, the right operand is a table-valued expression of which is evaluated once for each row that appears in the left operand. The Cross Apply and Outer Apply are the two flavors of the Apply operator. So if I wanted to simulate an Apply Operation without the Operator itself, it would require the use of temporary tables or table variables.

I use the APPLY operator a lot while parsing XML data like Process Monitor traces or XML query plans which make life a lot easier and saves me from writing a huge bunch of T-SQL code.

Profiler events for different scenarios

A few moons back I had blogged about the high frequency events in SQL Server. In this post, I shall talk about the common scenarios where profiler traces are collected for SQL Performance issues and what kind of events need to be captured in the server side traces that are configured for the data collection.

While configuring a profiler trace, always include all data columns for that event as that doesn’t have an overhead as much as the number of events being captured during the trace capture.

In this post, I shall cover two of the most common issues that CSS collects profiler data for addressing SQL Performance issues: Attentions/Timeouts and High CPU usage/Long Running queries.

Attentions or Timeouts

Attentions or Timeouts can occur due to client side application timeout setting (mostly this is the case) or due to database engine side issue (like failure to acquire memory, problem on the server side connectivity layer etc.). During this time, you will have a lot of queries starting but not completing successfully. If you are dealing with such a scenario, then you would need to capture the Starting events using a profiler trace to find out which statements didn’t complete and encountered the Attention event.

The events that need to be captured are:

All Database Events

Log File Auto Shrink
Data File Auto Shrink
Log File Auto Grow
Data File Auto Grow

All Errors and Warnings

Attention (Important for tracking the timeouts)
ErrorLog
EventLog
Exception
Hash Warning
Execution Warnings
Sort Warnings
Missing Column Statistics
Missing Join Predicate
Exchange Spill Event
Blocked process report
User Error Message
Background Job Error
Bitmap Warning
Database Suspect Data Page
CPU threshold exceeded

Lock:Deadlock Chain and Lock:Timeout (timeout > 0)

Performance: Auto Stats

Showplan All or Showplan XML (As execution plan will not be available for queries which didn’t complete)

Showplan Statistics Profile or Showplan XML Statistics Profile (For comparing estimated and actual query plan)

Security Audit: Audit Login, Audit Login Failed and Audit Logout

Stored Procedures events:

RPC:Starting
SP:Recompile
SP:Starting
SP:Completed
SP:StmtStarting
SP:StmtCompleted
RPC:Completed

Transactions: SQLTransaction is required for tracing back the transaction if there are nested transactions). DTCTransaction is required only if DTC transactions are getting timed out.

TSQL events:

SQL:BatchCompleted
SQL:BatchStarting
SQL:StmtStarting
SQL:StmtCompleted
SQL:StmtRecompile

Cursor events: CursorRecompile and CursorImplicitConversion (Required for checking if the requested cursor is being implicitly converted or if the cursor is being frequently recompiled).

The above events should be sufficient to track down any query timeouts and the common root causes for the same.

High CPU usage/Long Running queries

Another set of issues that are commonly dealt with while troubleshooting SQL Performance issues are Long Running Queries or Queries causing high CPU usage on the server. In such a case, the above set of events would be definitely be useful. If you are certain that the profiler trace can be captured for the entire duration of the query execution, then for highly transactional servers, you can omit the capture of the following events:

SQL: BatchStarting
SQL: StmtStarting
SP: StmtStarting
SP: Starting
RPC: Starting

The above logic of configuring a profiler trace will also help you capture trace data when you are trying to track down queries which are causing excessive amount of IO on the SQL instance.

Reference for capturing blocking information on a SQL instance: SQL 2000 Blocker Script output/SQL 2005 PerfStats

Keep in mind that the profiler data captured may not be sufficient for root cause analysis if you are not capturing SQL blocking information and performance monitor data in conjunction with the profiler traces.

In case you need additional events like deadlock events or lock events, then refer to my previous post and find out if they fall in the high frequency event category and when to capture the particular event.

TIP: As much as possible, avoid adding Text filters to your profiler trace configuration as it can lead to high CPU usage for highly active OLTP environments during filter evaluation.

Hope that the above information will help you reduce the excessive amount of data collection for troubleshooting SQL performance issues.

Approach to SQL Performance issues

Performance degradation can happen due to multiple reasons. The main bottlenecks that would affect performance are:

1. Memory: Physical or Virtual – Your server could be experiencing a physical crunch which would point us to the Available Physical RAM or a crunch on the Virtual Address Space. This would involve finding out the queries which are consuming high memory during compilation or execution and narrow down on the single biggest contributor and work towards rectifying the problem with the query.

2. Disk IO: There could be an issue with SQL Server queries running slowly due to large amount of IO being done by them or a disk performance issue.

3. CPU: It could be because the queries executing on your server are consuming high amount of CPU time. In such a case, it be statistics that are out-of-date or data that has increased which introduces data skew.

4. Blocking: The queries executing on the server could be blocked and waiting for a particular resource and thus increasing the query duration.

How to perform post-mortem analysis for SQL performance problems?

If you are facing a performance degradation on the SQL instance after an upgrade, we would need to analyze the PSSDIAG collected to find out the top bottlenecks. There are two ways to do this:

1. Comparison Analysis: You identify the a query batch or stored procedure which is a problem and provide us a baseline (w.r.t. normal execution time for the query in question). We capture a “fast” and a “slow” run and we perform a comparison between the two runs and we tell you what is biggest bottleneck: Memory, Disk, CPU or Database/Environment Settings like database statistics, query SET options, difference in parameters etc. Based on what is identified as the culprit, you will be able to identify/suggest action items to help remove the culprit.

2. Bottleneck Analysis: In case you do not have a baseline, you can perform a bottleneck analysis and then stack rank them for you in case there is more than one bottleneck. Based on the analysis, you will be able to identify the biggest bottleneck and can implement action items to mitigate the single largest bottleneck.

If you need to perform live troubleshooting on the server, then you can make use of DMVs if you are on SQL Server 2005 or above or use Performance Dashboard (mentioned below).

RML Utilities Download:

http://www.microsoft.com/downloads/details.aspx?FamilyId=7EDFA95A-A32F-440F-A3A8-5160C8DBE926&displaylang=en

More Info on how to use the RML Utilities:

http://blogs.msdn.com/psssql/archive/2007/12/18/rml-utilities-for-microsoft-sql-server-released.aspx

http://sqlcat.com/technicalnotes/archive/2008/02/01/precision-performance-for-microsoft-sql-server-using-rml-utilities-9-0.aspx

If you need to analyze blocking data on the server, then you can use SQL Nexus for the same.

SQL Nexus Download Link:

http://www.codeplex.com/sqlnexus

Another option would be to use SQL Server 2005 Performance Dashboard in case you are testing your application on SQL Server 2005. This tool lets you view your server status without running PSSDIAG and provides reports to identify Long Running queries and also identify which queries are consuming the highest amount of resources (IO/CPU/Memory).

Performance Dashboard:

http://www.microsoft.com/downloads/details.aspx?FamilyId=1d3a4a0d-7e0c-4730-8204-e419218c1efc

Performance Dashboard for SQL Server 2008

http://blogs.msdn.com/b/sqlserverfaq/archive/2010/05/27/sql-server-performance-dashboard-reports-in-ssms-introduction-install-datediff-error-amp-modified-for-sql-2008.aspx

Zeollar presentation on SQL Nexus

Get Microsoft Silverlight

If you have difficulty viewing this session, click here to view it in the original website.

Here is the recording of the webcast that I presented today! Feel free to contact me through this blog or post on the discussion forum on the SQL Nexus discussion forum.

In case you are not able to view the presentation, then you can view it directly on the Zeollar site.

Upcoming Zeollar Presentation

I will be presenting a 30 minute demo on how to perform post mortem analysis for SQL Performance issues using RML Utilities and SQL Nexus on 24th June, 2010. The recorded and live session will be available on Zeollar. Zeollar is a Microsoft India DPE initiative that gets you the latest technical content on a daily basis in different channels. Think of it like a personal television that switches on every day at a specified hour allowing you to switch channels and view the channel of your interest.

You can download the calendar invite from here.