Approach to SQL Performance issues

Performance degradation can happen due to multiple reasons. The main bottlenecks that would affect performance are:

1. Memory: Physical or Virtual – Your server could be experiencing a physical crunch which would point us to the Available Physical RAM or a crunch on the Virtual Address Space. This would involve finding out the queries which are consuming high memory during compilation or execution and narrow down on the single biggest contributor and work towards rectifying the problem with the query.

2. Disk IO: There could be an issue with SQL Server queries running slowly due to large amount of IO being done by them or a disk performance issue.

3. CPU: It could be because the queries executing on your server are consuming high amount of CPU time. In such a case, it be statistics that are out-of-date or data that has increased which introduces data skew.

4. Blocking: The queries executing on the server could be blocked and waiting for a particular resource and thus increasing the query duration.

How to perform post-mortem analysis for SQL performance problems?

If you are facing a performance degradation on the SQL instance after an upgrade, we would need to analyze the PSSDIAG collected to find out the top bottlenecks. There are two ways to do this:

1. Comparison Analysis: You identify the a query batch or stored procedure which is a problem and provide us a baseline (w.r.t. normal execution time for the query in question). We capture a “fast” and a “slow” run and we perform a comparison between the two runs and we tell you what is biggest bottleneck: Memory, Disk, CPU or Database/Environment Settings like database statistics, query SET options, difference in parameters etc. Based on what is identified as the culprit, you will be able to identify/suggest action items to help remove the culprit.

2. Bottleneck Analysis: In case you do not have a baseline, you can perform a bottleneck analysis and then stack rank them for you in case there is more than one bottleneck. Based on the analysis, you will be able to identify the biggest bottleneck and can implement action items to mitigate the single largest bottleneck.

If you need to perform live troubleshooting on the server, then you can make use of DMVs if you are on SQL Server 2005 or above or use Performance Dashboard (mentioned below).

RML Utilities Download:

http://www.microsoft.com/downloads/details.aspx?FamilyId=7EDFA95A-A32F-440F-A3A8-5160C8DBE926&displaylang=en

More Info on how to use the RML Utilities:

http://blogs.msdn.com/psssql/archive/2007/12/18/rml-utilities-for-microsoft-sql-server-released.aspx

http://sqlcat.com/technicalnotes/archive/2008/02/01/precision-performance-for-microsoft-sql-server-using-rml-utilities-9-0.aspx

If you need to analyze blocking data on the server, then you can use SQL Nexus for the same.

SQL Nexus Download Link:

http://www.codeplex.com/sqlnexus

Another option would be to use SQL Server 2005 Performance Dashboard in case you are testing your application on SQL Server 2005. This tool lets you view your server status without running PSSDIAG and provides reports to identify Long Running queries and also identify which queries are consuming the highest amount of resources (IO/CPU/Memory).

Performance Dashboard:

http://www.microsoft.com/downloads/details.aspx?FamilyId=1d3a4a0d-7e0c-4730-8204-e419218c1efc

Performance Dashboard for SQL Server 2008

http://blogs.msdn.com/b/sqlserverfaq/archive/2010/05/27/sql-server-performance-dashboard-reports-in-ssms-introduction-install-datediff-error-amp-modified-for-sql-2008.aspx

Zeollar presentation on SQL Nexus

Get Microsoft Silverlight

If you have difficulty viewing this session, click here to view it in the original website.

Here is the recording of the webcast that I presented today! Feel free to contact me through this blog or post on the discussion forum on the SQL Nexus discussion forum.

In case you are not able to view the presentation, then you can view it directly on the Zeollar site.

Unable to import a trace using SQL Nexus

While trying to import a profiler trace using SQL Nexus, you might get the following error in the SQL Nexus log file:

"The system cannot find the file specified (System)"
——————————
Program Location:
   at System.Diagnostics.Process.StartWithCreateProcess(ProcessStartInfo startInfo)
   at System.Diagnostics.Process.Start()
   at System.Diagnostics.Process.Start(ProcessStartInfo startInfo)
   at ReadTrace.ReadTraceNexusImporter.DoImport() in C:\jacklidocs3\cprsrc\main\src\sqlnexus_pass\ReadTraceNexusImporter\ReadTraceNexusImporter.cs:line 364
   at sqlnexus.fmImport.DoImport() in C:\jacklidocs3\cprsrc\main\src\sqlnexus_pass\sqlnexus\fmImport.cs:line 557

The issue could be one of the two reasons:

  1. Your RML Utilities version is older than the current version – Install the latest version of RML Utilities.
  2. Your RML utilities location is not registered with SQL Nexus – This can be corrected by navigating to  C:\Program Files\Microsoft Corporation\RMLUtils from Command Prompt and execute the following: orca.exe /R

Upcoming Zeollar Presentation

I will be presenting a 30 minute demo on how to perform post mortem analysis for SQL Performance issues using RML Utilities and SQL Nexus on 24th June, 2010. The recorded and live session will be available on Zeollar. Zeollar is a Microsoft India DPE initiative that gets you the latest technical content on a daily basis in different channels. Think of it like a personal television that switches on every day at a specified hour allowing you to switch channels and view the channel of your interest.

You can download the calendar invite from here.

SQL Nexus

This is a tool that is used by the Microsoft PSS team to analyze PSSDiag data. This tool has the capability of loading the Profiler Traces collected (provided ReadTrace is installed on the box).

In addition to this, it also loads the other .OUT files and the PerfStats script output file into the Nexus database. I use this tool regularly to analyze PSSDIAG data due to the following reasons:
1. Saves me time from manually opening each file and looking through them
2. Based on the graphs provided in the reports, I can quickly drilldown to the major bottleneck

The SQL Nexus reports act as a brilliant guide when you are analyzing PSSDIAG for the following scenarios:
1. High CPU
2. Blocking
3. High IO
4. General Performance Problems

Since, the SQL Nexus Reports use Reporting Services client-side report viewer, you can create your own reports to work against the Nexus database and then deploy them to the %appdata%/SQL Nexus/Reports folder. Once, this is done, the new reports would show up along with the defult reports every time SQL Nexus is launched.

What are the advantages of this tool?
1. Shows aggregation reports for the Profiler Traces loaded into it
2. Blocking and wait statistics information based on the PerfStats script output. This helps in analyzing blocking on the server
3. Since, the data collected are loaded into tables, you can run queries against the database to draw inferences. If you use these queries frequently enough, you can create your own reports and deploy them to the above mentioned location
4. You can export the data from the SQL Nexus Reports into files (.XLS,.PDF,.JPG etc) and send them as reports
5. The reports let you narrow down to specific time frames which makes analysis easier
Download location: http://www.codeplex.com/sqlnexus

Addition: April 4, 2011

A list of known issues documented with SQL Nexus:

http://blogs.msdn.com/b/sqlserverfaq/archive/2011/02/07/sql-nexus-and-rml-utilities-a-primer-on-the-issues-that-i-have-dealt-with-last-year.aspx

Addition: April 6, 2011

A very common issue that I worked on multiple times after which I realized that it warranted a blog post:

SQLDIAG: There are no instances of version 10 on this computer