T-SQL Tuesday #18: CTEs, XML and Process Monitor

This month’s recurring SQL blog party a.k.a. T-SQL Tuesday, started by Adam Machanic (blog | @AdamMachanic), is being hosted by Bob Pusateri (blog | @SQLBob). This month’s topic is Common Table Expressions, which were introduced for the first time in SQL Server 2005. A CTE can be thought of as a derived table which can be self-referenced within a query. With the advent of CTEs, a lot of T-SQL developers made full use of this feature. Personally, I have used CTEs in various data collection and parsing scripts for my day-to-day work. I am a big fan of CTEs and have introduced their usage in one of the blocking reports for SQL Nexus. As with any coding construct, the use of CTEs has its pros and cons, but I shall not go down that path in this post, as whether a particular scenario calls for a CTE or a temporary table depends on many factors. That decision has to be made based on testing against the benchmarks or response-time SLAs that you have defined for your application.

Since my primary expertise lies with SQL Server, I examine every data analysis opportunity with the intention of importing the data into a SQL Server database table and then writing an automation routine over it, which saves me repetitive iterations when I am analyzing the same data or need to perform the same set of steps over and over again. Recently I have shifted over to PowerShell due to my affinity for scripting, but that is a story for another blog post. CSS uses Process Monitor, a vastly popular tool from Sysinternals, to troubleshoot various problems that we encounter while working on SQL Server related issues. Some of our adventures with Process Monitor are mentioned below:

PRB: SQL Backups to a UNC path fail with OS Error 1 (Incorrect Function)
INF: Permissions required for SQL Server Service account to use SSL certificate

During my tenure with CSS, I too have relied on Process Monitor to troubleshoot some fairly complex issues. Process Monitor .pml files can be quite large if collected without appropriate filters, especially when I am only interested in a particular process (or processes). By importing the .PML file into a database, I can not only query the imported data using T-SQL but also write quick automations to look for call stacks that have known issues/problems associated with them. Yes, with Process Monitor you can use locally cached symbols to get the call stack of each function call as well. Note that saving a .PML file to XML with resolved stack symbols is slow due to the time taken for symbol resolution. The .XML file that I will use as an example was saved using the “Extensible Markup Language (XML)” option with the “Include stack traces (will increase file size)” option enabled. I captured a Process Monitor trace while trying to access a directory which was not present on my server.

I already have a stored procedure to import the saved .XML file into a database table, parse that data, and insert the parsed output into the two tables that I have created to store the event and stack information.
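For completeness, a minimal sketch of what that import step could look like is below. The dbo.tbl_XMLData table and its fname/xmldata columns are the ones referenced by the parsing query further down; the file path is just a placeholder, and the actual import logic ships with the downloadable scripts.

-- Load the saved Process Monitor .XML file as a single XML value (illustrative sketch)
INSERT INTO dbo.tbl_XMLData (fname, xmldata)
SELECT 'FileNotFound.xml',
       CAST(BulkColumn AS XML)
FROM OPENROWSET(BULK N'C:\Temp\FileNotFound.xml', SINGLE_BLOB) AS src;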

How does the use of CTEs fit in here?

The .XML file has the following structure for the stack and the frames:

<ProcessIndex>2083</ProcessIndex>
<Time_of_Day>5:52:38.8471548 PM</Time_of_Day>
<Process_Name>Explorer.EXE</Process_Name>
<PID>6588</PID>
<Operation>NotifyChangeDirectory</Operation>
<Path>D:\Tools</Path>
<Result></Result>
<Detail>Filter: FILE_NOTIFY_CHANGE_FILE_NAME, FILE_NOTIFY_CHANGE_ATTRIBUTES, FILE_NOTIFY_CHANGE_LAST_WRITE</Detail>
<stack>
<frame>
<depth>0</depth>
<address>0xfffff880013ab027</address>
<path>C:\Windows\system32\drivers\fltmgr.sys</path>
<location>fltmgr.sys + 0x2027</location>
</frame>

<frame>
<depth>1</depth>
<address>0xfffff880013abbe9</address>
<path>C:\Windows\system32\drivers\fltmgr.sys</path>
<location>fltmgr.sys + 0x2be9</location>
</frame>
….
….
</stack>

Now each operation has a call stack associated with it, which I am interested in importing into my database. But with the existing schema above, I cannot get the information that I require. After exploring multiple ways of parsing the data (temporary variables, temporary tables, recursive CTEs, XML indexes and so on), I found that the quickest way to do this was:

1. Get the event list imported into a temporary table (a possible shape for that staging table is sketched right after this list)
2. Using the event list generated above, create the call stacks from the <frame> nodes (see code snippet below)
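For context, a minimal shape for that staging temp table might look like the following. The eventid identity column is my own assumption (the actual definition is in the downloadable scripts), but some per-event numbering is what lets the frame-parsing step walk the events one at a time.

-- Illustrative sketch of the staging table used by the snippets below
CREATE TABLE #tblXMLstaging
(
    eventid int IDENTITY(1,1) PRIMARY KEY,
    event xml,
    fname varchar(255)
);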

The logic here is that the ctexml CTE used below extracts all frames for a particular event, using a counter variable, and creates XML data for each frame, which is then parsed by referencing the CTE in the INSERT INTO…SELECT query.

-- Extract events from the XML file and put them into a staging table
INSERT INTO #tblXMLstaging (event, fname)
SELECT C.query('.') as event, @fname
FROM dbo.tbl_XMLData
CROSS APPLY xmldata.nodes('/procmon/eventlist/event') as T(C)
WHERE fname = @fname;

-- Fetch values from the staging table, store the parsed frames as XML and shred them into the procmon_stacks table
WITH ctexml as
(
SELECT TOP 15 eventid, C.query('.') as frames, fname
FROM #tblXMLstaging
CROSS APPLY event.nodes('/event/stack/frame') as T(C)
WHERE eventid = @counter
)
INSERT INTO dbo.tblProcMon_Stacks (StackID, FName, FrameDepth, [Address], [Path], Location)
SELECT
    eventid as StackID,
    fname,
    frames.value('(/frame/depth)[1]', 'int') as FrameDepth,
    frames.value('(/frame/address)[1]', 'varchar(50)') as [Address],
    frames.value('(/frame/path)[1]', 'varchar(255)') as [Path],
    frames.value('(/frame/location)[1]', 'varchar(255)') as [Location]
FROM ctexml;
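In the stored procedure, the block above runs inside a simple driver loop. A minimal sketch of that loop, assuming @counter simply walks the eventid values in the staging table, would be:

DECLARE @counter int, @maxeventid int;
SET @counter = 1;
SELECT @maxeventid = MAX(eventid) FROM #tblXMLstaging;

WHILE @counter <= @maxeventid
BEGIN
    -- the WITH ctexml ... INSERT INTO dbo.tblProcMon_Stacks block shown above
    -- executes here for the current @counter value
    SET @counter = @counter + 1;
END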


If you are curious about how long it takes to import the data: it took me a little over a minute (84 seconds, to be precise) to load, parse and shred a 100MB XML file! I am sure there are optimizations possible to this method. The screenshot on the left shows the load times. The specifications of the system where the data import was done are:

Operating System: Microsoft Windows Server 2008 R2 Enterprise (x64)

Machine: Hewlett-Packard HP Z800 Workstation

Processor: 2 Quad Core [Intel(R) Xeon(R) CPU E5506  @ 2.13GHz, 2128 Mhz, 4 Core(s), 4 Logical Processor(s)]

RAM: 16.0 GB

Hard Disk: Barracuda 7200.12 SATA 3Gb/s 500GB Hard Drive

SQL Server: SQL Server 2008 R2

Now that the data is available in a database table, I can query the information using T-SQL. I had attempted to open the directory C:\Foo on my server using Windows Explorer. This information and the call stack can be viewed using the following T-SQL queries:

-- Get the event information associated with the directory that I was looking up
select ProcessName, PID, Operation, Path, Result, Detail, StackID
from dbo.tblprocmon_events
where fname = 'FileNotFound.xml'
and path like '%C:\Foo%'
order by StackID

-- Get information for the call-stack using the StackID obtained above
select FrameDepth, address, Path, Location
from tblProcmon_stacks
where fname = 'FileNotFound.xml' and StackID = 662
order by FrameDepth

Below is a screenshot showing me that the directory doesn’t exist and the call-stack associated with the function call that failed.

The options are now limitless. You can take this forward by building a repository of common call stacks, in case you are troubleshooting an application developed by you and have access to the private symbols. Or you could look for common results for certain operations, like Operation = QueryDirectory and Result = NO SUCH FILE. My use of CTEs covers the following tasks:

1. Parsing XML data as shown here
2. Extracting useful information from Ring Buffers (a small example follows this list)
3. Tracking contiguous memory used and available in the SQL Server process address space
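To give a flavour of item 2, here is a small illustrative CTE over the resource monitor ring buffer. This is not part of the downloadable scripts; the XPath reflects the usual RING_BUFFER_RESOURCE_MONITOR record layout, and the memory values it exposes are in KB.

-- Shred resource monitor ring buffer records using a CTE (illustrative)
;WITH RingBufferRecords AS
(
SELECT CAST(record AS XML) AS record_xml
FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = N'RING_BUFFER_RESOURCE_MONITOR'
)
SELECT
    record_xml.value('(//Record/ResourceMonitor/Notification)[1]', 'varchar(100)') AS Notification,
    record_xml.value('(//Record/MemoryRecord/AvailablePhysicalMemory)[1]', 'bigint') AS AvailablePhysicalMemoryKB
FROM RingBufferRecords;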

The scripts containing the Stored Procedure definitions and table schemas can be downloaded from here.

A shot in the arm for sp_purge_data

I recently worked on a performance issue with the Management Data Warehouse job “mdw_purge_data”. This job calls the stored procedure sp_purge_data, which was taking a long time to complete. This stored procedure just got a shot in the arm with an update released by the SQL Server developer working on it. The T-SQL for the modified stored procedure is available on the SQL Agent MSDN blog. The actual stored procedure will be available in an upcoming update for SQL Server 2008 R2.

Addition, July 20th, 2011: SQL Server 2008 R2 Service Pack 1 has various updates for the purge process, which don’t require modifying the procedure using the script mentioned in the aforementioned blog post.

A recent discussion on Twitter reminded me that I had this as one of the items to blog about so here it is. Now for some of the interesting technical details.

The poor performance of the stored procedure has been addressed through a refactoring of the stored procedure code, but the issue doesn’t occur on all Management Data Warehouse databases.

The issue occurs when you have large amounts of data stored in the [snapshots].[query_stats] table for each unique snapshot ID, or if you have orphaned rows in the [snapshots].[notable_query_plan] and [snapshots].[notable_query_text] tables of the Management Data Warehouse. This was corrected in the builds mentioned in the KB article below:

970014    FIX: The Management Data Warehouse database grows very large after you enable the Data Collector feature in SQL Server 2008
http://support.microsoft.com/default.aspx?scid=kb;EN-US;970014

The above KB article also mentions a workaround which allows you to clean up the orphaned rows using a T-SQL script.
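If you want a quick feel for whether your MDW database exhibits this pattern, a simple check along these lines can help. This is only a sketch that assumes the standard snapshot_id column on snapshots.query_stats; the supported clean-up of orphaned rows is the script from the KB article.

-- Row volume per snapshot in the MDW query statistics table (illustrative check)
SELECT snapshot_id, COUNT(*) AS rows_per_snapshot
FROM snapshots.query_stats
GROUP BY snapshot_id
ORDER BY rows_per_snapshot DESC;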

The “Query Statistics” data collection set collects query statistics, T-SQL text, and query plans of most of the statements that affect performance, and enables analysis of poorly performing queries in relation to overall SQL Server Database Engine activity. It makes use of the sys.dm_exec_query_stats DMV to populate the entries in the snapshots.query_stats table. If this DMV returns a large number of rows each time the collection set runs, the number of rows in the actual Management Data Warehouse table “snapshots.query_text” can grow very quickly. This will then cause performance issues when the purge job executes on the MDW database.
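A rough way to gauge how much data each collection run has to deal with is to look at the DMV directly on the monitored instance:

-- How many statement entries are currently cached? A consistently large count on an
-- ad-hoc heavy workload means the Query Statistics collection set (and later the purge)
-- will have a lot of rows to process.
SELECT COUNT(*) AS cached_query_stats_rows
FROM sys.dm_exec_query_stats;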

In summary, if you are using Management Data Warehouse to monitor a highly active server where a lot of ad-hoc or prepared SQL queries are being executed, then I would recommend that you modify the sp_purge_data stored procedure using the MSDN blog link I mentioned earlier in this post. I have seen significant performance improvement due to this on the specific environment that I was working on.

SQL Nexus: What StartTimeInterval?

When I get asked the same question more than 3 times in less than 48 hours, I guess it’s time to create some documentation around the error.

The issue: when you click on the UniqueStatements link in the ReadTrace_Main report in SQL Nexus, you get the following exception:

(screenshot: the exception raised by SQL Nexus)

If you collect a profiler trace, you will find that the error is raised while executing the following command:

exec ReadTrace.spReporter_StmtTopN @StartTimeInterval, @EndTimeInterval, @TopN, @Filter1, @Filter2, @Filter3, @Filter4, @Filter1Name, @Filter2Name, @Filter3Name, @Filter4Name

The error message would be:

Error: 137, Severity: 15, State: 2
Must declare the scalar variable "@StartTimeInterval".

The above error will also be reported in the exception message raised in SQL Nexus. The issue is not with SQL Nexus but with the RML Utilities report. You can hit this issue under either of the following conditions:

1. There are no entries in the readtrace.tblStatements and readtrace.tblTimeIntervals tables
2. You are using an older version of ReadTrace (RML Utilities)

I was able to reproduce the issue on ReadTrace.exe build 9.01.0109.
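A quick way to check for condition 1 is to look at the two ReadTrace tables directly in the SQL Nexus database (an illustrative check):

-- Both counts should be non-zero for the UniqueStatements report to work
SELECT
    (SELECT COUNT(*) FROM ReadTrace.tblStatements) AS StatementRows,
    (SELECT COUNT(*) FROM ReadTrace.tblTimeIntervals) AS TimeIntervalRows;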

The issue stems from the fact that when you click on the UniqueStatements report link and no parameters are specified, the Start Time and End Time to be analyzed are picked up from tblTimeIntervals. I was able to reproduce the issue on the same SQL Nexus database using one version of the ReadTrace reports and not with the other. You can send an email to sstlbugs@microsoft.com and request the latest build of RML Utilities, attaching the ReadTrace and SQL Nexus log files. The logs can be retrieved by clicking on the hyperlinks in SQL Nexus shown in the picture on the right.


QnA: Performance Dashboard Webcast

I had an overflow of questions during the webcast today and wasn’t able to answer a few as we ran short of time. I shall attempt to address the most common questions asked during the webcast.

How do I configure SQL Server Performance Dashboards?
SQL Server Performance Dashboard was originally released for SQL Server 2005. Setup instructions are available with the download link. However, the same install can be tweaked to work for SQL Server 2008 and SQL Server 2008 R2 using the blog post by Sudarshan: Configuring Performance Dashboard for SQL Server 2008
Download link: SQL Server 2005 Performance Dashboard

The SQL Server instance being monitored must be running SQL Server 2005 SP2 or later. After completing the installation, you must:
1. Run the Setup.sql file on each instance of SQL Server 200x that you wish to monitor with the SQL Server 2005 Performance Dashboard Reports.
2. Open the performance_dashboard_main.rdl file with the Custom Reports functionality new to Management Studio in SQL Server 2005 Service Pack 2.

Does Performance Dashboard work for SQL Server 2000?
No, Performance Dashboard uses DMVs to generate the data for the reports which are available from SQL Server 2005 and higher.

Where can I get the presentation and webcast recording?
The presentation and webcast recording for the entire webcast series will be made available on the SQLServerFAQ MSDN blog. I shall post an update once both are available.

What is the performance impact of running these reports?
If your server is unresponsive or experiencing high CPU usage on all processors, then these reports will also face issues while retrieving diagnostic data from the DMVs. However, the overall impact of running these reports on a SQL Server instance is very minimal compared to other performance diagnostic tools.

Can these reports be used on SQL Azure or any other RDBMS?
No, they cannot be used to monitor SQL Azure databases or any other non-SQL Server RDBMS environment.

What permissions are required for using Performance Dashboard?
You need the VIEW SERVER STATE permission to view all the information exposed by Performance Dashboard, as most of the information is collected from DMVs.
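For example, granting the permission to a (hypothetical) monitoring login looks like this:

USE master;
GO
-- CONTOSO\PerfDashboardUser is a placeholder for your monitoring login
GRANT VIEW SERVER STATE TO [CONTOSO\PerfDashboardUser];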

Can these reports be deployed to a local shared repository?
You can use Visual Studio to create custom reports similar to the Performance Dashboard reports, or modify the existing reports as per your needs and deploy them. Be advised that this will be a customized solution and is not supported by CSS with respect to performance or configuration issues while running these reports.

Lastly, thank you for attending the session! And for your patience.

In case there are more questions, please feel free to add a comment and I shall reply back accordingly.

Cheers!

 

SQL Server Backup Simulator: Cumulative update

The SQL Server Backup Simulator was launched on Microsoft Code Gallery to aid in troubleshooting backup issues related to SQLVDI. Thank you to everyone who has evangelized the tool and provided feedback on it.

The current release is marked as v1.2.

We released two minor updates for the tool to incorporate the following changes:

  1. SQL Backup Simulator does not report FileVersion; instead it reports ProductVersion while reading the sqlvdi.dll information – there was an issue that we uncovered while validating the sqlvdi.dll version, which was corrected in v1.1 and is now part of the v1.2 release.

  2. COPY_ONLY while simulating backups – the COPY_ONLY option was added to ensure that any full backup taken for a database by the tool doesn’t break an existing backup chain for that database (a plain T-SQL illustration of the concept follows below).
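For reference, this is what a copy-only full backup looks like in plain T-SQL. A copy-only backup does not affect the existing backup chain, which is the behaviour the tool relies on; the statement below is only an illustration of the concept (the database name and path are placeholders), since the tool itself drives the backup through the virtual device interface.

-- Illustrative copy-only full backup; does not disturb the existing backup chain
BACKUP DATABASE [YourDatabase]
TO DISK = N'C:\Temp\YourDatabase_copyonly.bak'
WITH COPY_ONLY, INIT;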

The latest executable can be downloaded from: https://github.com/Microsoft/tigertoolbox/releases/latest