Easing in the trace FILTER for SQLDIAG


Now that the Diag Manager is available online and you can build custom PSSDIAG/SQLDIAG configuration files to collect data from SQL Server instances, you might want to configure your data collection packages so that the profiler traces are set up correctly with trace filters where needed.

Note: Trace filtering can dramatically reduce the trace (.TRC) file size and the I/O cost of tracing, but be aware that it can actually increase the CPU burden of tracing. To minimize the extra CPU use, filter on an integer column (dbid, duration, etc.) instead of a text column (database name, textdata, etc.) whenever possible. If a filter doesn't remove a significant portion of the trace events (say, at least 10%), it probably isn't worth it and might introduce more overhead than it prevents. While configuring PSSDIAG/SQLDIAG for SQL Server, you cannot add Profiler trace filters: even if you add one from the GUI, it is not included in the configuration file, so the filter is not honored when the server-side trace starts up.
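For example, filtering on the integer DatabaseID column is cheaper to evaluate than filtering on the DatabaseName text column. A minimal sketch, assuming trace ID 2 and a target database ID of 5 (data column 3 is DatabaseID, 35 is DatabaseName):

EXEC sp_trace_setfilter 2, 3, 0, 0, 5 -- DatabaseID (column 3) = 5; prefer this over DatabaseName (column 35)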

To set filters for profiler traces collected with PSSDIAG/SQLDIAG, you need to:

1. Initialize PSSDIAG/SQLDIAG on the server, i.e. start the PSSDIAG/SQLDIAG data collection.
2. Find the trace ID of the running profiler trace using the fn_trace_getinfo function or the sys.traces catalog view (see the example after this list).
3. Using the trace ID obtained in the step above, set the filter with the sp_trace_setfilter stored procedure. Refer to "SQL Profiler Data Columns" in SQL Server Books Online for the data column numbers, and to the "sp_trace_setfilter" topic for the values of the logical and comparison operators.
4. To verify that the filter is active, use the fn_trace_getfilterinfo function.
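As a quick illustration of step 2, either of the following returns the traces currently defined on the instance (a sketch; in fn_trace_getinfo's output, property 2 is the trace file path, which helps identify the SQLDIAG trace):

select * from fn_trace_getinfo(default) -- property 2 = trace file path
select id, status, path from sys.traces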

I shall demonstrate in this blog post how this works. As you can see in the screenshot below, trace ID 2 is the one I want to customize. Now that I have the trace ID, I stop the profiler trace using the sp_trace_setstatus stored procedure; a status value of 0 stops the trace.

Once the trace is stopped, I use the sp_trace_setfilter stored procedure with the appropriate data column ID to set a filter on SPID 52, and then start the trace again. Looking at the profiler trace, I find that before the trace was stopped it was collecting data for all SPIDs, but after the filter was set, data is collected ONLY for SPID 52 (see the highlighted sections in the screenshot below).

Once the filter is set, you can use the fn_trace_getfilterinfo function to verify that the trace filter is active.

If you are not using SQLDIAG to collect profiler traces, then this is as simple as configuring a server-side trace; the only additional thing you need to do in that case is add the column filters while configuring the trace. The exercise above is required only when you are collecting traces using SQLDIAG/PSSDIAG.

Reference:
How to: Filter Events in a Trace (SQL Server Profiler)
http://msdn.microsoft.com/en-us/library/ms175520.aspx

/*T-SQL commands used*/

select * from sys.traces -- Get the trace ID
EXEC sp_trace_setstatus 2, 0 -- Stop the trace (does not delete the trace definition from the server)
EXEC sp_trace_setfilter 2, 12, 0, 0, 52 -- Add a filter: SPID (data column 12) = 52
EXEC sp_trace_setstatus 2, 1 -- Start the trace again
select * from fn_trace_getfilterinfo(2) -- Get information about the filters set for the trace


Hope the above information helps in setting up a trace filter for a profiler trace collected using SQLDIAG.

Cache files–Why are you increasing in size?


Management Data Warehouse (MDW) provides a mechanism to collect Perfmon and DMV outputs and store them in a SQL Server database to help you troubleshoot performance issues that occurred in the past. This is quite helpful for intermittent performance issues on your SQL Server instances, as it is usually not possible to have someone monitoring the system 24x7 to troubleshoot a performance problem while it is occurring.

Management Data Warehouse provides two ways of collecting data:

1. Cached mode – stores the collection set data in a .cache file in the cache file directory, which is then uploaded to the MDW database on the schedule you defined for the collection set.
2. Non-cached mode – this method of collecting data does not depend on a local cache; the data for the collection set is collected and uploaded based on one of two configuration options: on demand or on a schedule (a query to check the current mode of each collection set follows this list).
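You can see which mode each collection set on an instance uses by querying the data collector catalog views in msdb; a quick sketch (collection_mode 0 = cached, 1 = non-cached):

use msdb;
select name, collection_mode, is_running
from dbo.syscollector_collection_sets;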

[Image: MDW_CollectionWorkFlow – run-time processing of the MDW collection and upload jobs, from Books Online]

The screenshot (left) from Books Online of the run-time processing of the MDW jobs shows how the data is uploaded to the MDW database. When you have a highly transactional system with a large number of concurrent connections to your SQL Server instance, or a lot of processes running on the Windows server, the amount of data collected by the system data collection sets can be quite large for a single iteration, especially for the Server Activity collection set.

So what can happen if the .cache files become very large? Let's say your upload schedule is set to every 15 minutes. The data collection set keeps writing the collected data to the .cache file, and every 15 minutes the scheduled upload uses this cache file to upload the data with the SSIS upload package(s).

Now what happens if you miss an upload schedule because the upload job ran longer than 15 minutes? Your .cache file now holds two sets of data to be uploaded. This cumulative build-up of un-uploaded collections across successive missed upload schedules becomes a real problem once the upload job takes more than 60 minutes: the SSIS package(s) are hard-coded to time out after 60 minutes of execution, and you are then stuck with cache data in your cache folder which cannot be uploaded.
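To find out where the .cache files are being written on an instance, so you can keep an eye on their size, you can query the data collector configuration store. A sketch; note that when CacheDirectory has not been set explicitly, the collector falls back to the %TEMP% directory of the account running the collection:

use msdb;
select parameter_name, parameter_value
from dbo.syscollector_config_store
where parameter_name in (N'CacheDirectory', N'CacheWindow');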

When the SSIS package times out after 60 minutes of execution, you will get the following error in the job history for the job associated with the upload package:

Server Activity,Error,4861,Server Activity,,Error,The thread "ExecMasterPackage" has timed out 3600 seconds after being signaled to stop.,Upload,17/05/2011 13:11:37,,17/05/2011 14:11:38,3601,<server name>,,,

So what are your options here?

Option 1
1. Change the Server Activity collection set to non-cached mode – this ensures that the data is uploaded as soon as it is collected, without an interim cache file (see the sketch after this list).
2. Increase the collection interval – the collection frequency defaults to every 15 seconds; raising it to 30 seconds or more reduces the amount of data collected between the scheduled upload intervals.
3. Modify the sp_purge_data stored procedure so that the purge job runs faster – an updated version of sp_purge_data is available here. This ensures that the purge job completes faster.
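A sketch of scripting changes 1 and 2, assuming the default 'Server Activity' system collection set; the collection item name matched below is an assumption, so check syscollector_collection_items for the exact names on your instance:

use msdb;
declare @csid int, @itemid int;

select @csid = collection_set_id
from dbo.syscollector_collection_sets
where name = N'Server Activity';

-- Stop the collection set before reconfiguring it
exec dbo.sp_syscollector_stop_collection_set @collection_set_id = @csid;

-- Change 1: switch to non-cached mode (0 = cached, 1 = non-cached)
exec dbo.sp_syscollector_update_collection_set
    @collection_set_id = @csid,
    @collection_mode = 1;

-- Change 2: raise the collection interval from the 15-second default
-- (the item name pattern is an assumption; list the items to find the real one)
select @itemid = collection_item_id
from dbo.syscollector_collection_items
where collection_set_id = @csid
  and name like N'%Performance Counters%';

exec dbo.sp_syscollector_update_collection_item
    @collection_item_id = @itemid,
    @frequency = 60;

exec dbo.sp_syscollector_start_collection_set @collection_set_id = @csid;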

Option 2
Evaluate the perfmon counters being collected by the Server Activity collection set, determine which counters you actually need, and then create a custom collection set that captures this data in non-cached mode.

Creating your own perfmon collection set is worthwhile because the default Server Activity collection set collects and uploads data for a large number of perfmon counters.

Below is a custom collection set created for Performance Monitor counters; it shows how to collect both server-level and SQL Server instance-level perfmon counters.

use msdb;

Begin Transaction
Begin Try

    Declare @collection_set_id_1 int
    Declare @collection_set_uid_2 uniqueidentifier

    -- Create the collection set: non-cached mode, uploading on a 5-minute schedule
    EXEC [dbo].[sp_syscollector_create_collection_set]
        @name=N'Perfmon Counter Collection Set',
        @collection_mode=1,
        @description=N'Collects Perfmon Counters',
        @target=N'',
        @logging_level=0,
        @days_until_expiration=7,
        @proxy_name=N'',
        @schedule_name=N'CollectorSchedule_Every_5min',
        @collection_set_id=@collection_set_id_1 OUTPUT,
        @collection_set_uid=@collection_set_uid_2 OUTPUT

    -- Return the new collection set ID and UID
    Select @collection_set_id_1, @collection_set_uid_2

    Declare @collector_type_uid_3 uniqueidentifier
    Select @collector_type_uid_3 = collector_type_uid
    From [dbo].[syscollector_collector_types]
    Where name = N'Performance Counters Collector Type';

    Declare @collection_item_id_4 int

    -- Add the perfmon counters to collect; $(INSTANCE) resolves to the
    -- SQL Server instance name for instance-specific counters
    EXEC [dbo].[sp_syscollector_create_collection_item]
        @name=N'Logical Disk Collection and SQL Server CPU',
        @parameters=N'<ns:PerformanceCountersCollector xmlns:ns="DataCollectorType">
            <PerformanceCounters Objects="LogicalDisk" Counters="Avg. Disk Bytes/Read" Instances="*" />
            <PerformanceCounters Objects="LogicalDisk" Counters="Avg. Disk Bytes/Write" Instances="*" />
            <PerformanceCounters Objects="LogicalDisk" Counters="Avg. Disk sec/Read" Instances="*" />
            <PerformanceCounters Objects="LogicalDisk" Counters="Avg. Disk sec/Write" Instances="*" />
            <PerformanceCounters Objects="LogicalDisk" Counters="Disk Read Bytes/sec" Instances="*" />
            <PerformanceCounters Objects="LogicalDisk" Counters="Disk Write Bytes/sec" Instances="*" />
            <PerformanceCounters Objects="Process" Counters="% Privileged Time" Instances="*" />
            <PerformanceCounters Objects="Process" Counters="% Processor Time" Instances="*" />
            <PerformanceCounters Objects="Process" Counters="% User Time" Instances="*" />
            <PerformanceCounters Objects="$(INSTANCE):Buffer Manager" Counters="Page life expectancy" Instances="*" />
        </ns:PerformanceCountersCollector>',
        @collection_item_id=@collection_item_id_4 OUTPUT,
        @frequency=5,
        @collection_set_id=@collection_set_id_1,
        @collector_type_uid=@collector_type_uid_3

    -- Return the new collection item ID
    select @collection_item_id_4

    Commit Transaction;
End Try
Begin Catch
    Rollback Transaction;
    DECLARE @ErrorMessage NVARCHAR(4000);
    DECLARE @ErrorSeverity INT;
    DECLARE @ErrorState INT;
    DECLARE @ErrorNumber INT;
    DECLARE @ErrorLine INT;
    DECLARE @ErrorProcedure NVARCHAR(200);

    SELECT @ErrorLine = ERROR_LINE(),
           @ErrorSeverity = ERROR_SEVERITY(),
           @ErrorState = ERROR_STATE(),
           @ErrorNumber = ERROR_NUMBER(),
           @ErrorMessage = ERROR_MESSAGE(),
           @ErrorProcedure = ISNULL(ERROR_PROCEDURE(), '-');

    -- Re-raise using the data collector's standard error message (14684)
    RAISERROR (14684, @ErrorSeverity, 1, @ErrorNumber, @ErrorSeverity, @ErrorState, @ErrorProcedure, @ErrorLine, @ErrorMessage);
End Catch;
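Once the script completes, the new collection set appears under Data Collection in Object Explorer in SSMS. As a short usage sketch (assuming the name used in the script above), it can then be started with:

use msdb;
declare @csid int;
select @csid = collection_set_id
from dbo.syscollector_collection_sets
where name = N'Perfmon Counter Collection Set';

exec dbo.sp_syscollector_start_collection_set @collection_set_id = @csid;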

Reference:
How to: Create a Custom Collection Set That Uses the Generic T-SQL Query Collector Type
Using SQL Server 2008 Management Data Warehouse for database monitoring in my application

Tools Tips and Tricks #12: XPerf, Memory usage and much more


This is the last post of the Tools Tips and Tricks series as May draws to a close. Today I shall talk about another tool we use for performance troubleshooting: XPerf. Though it is not a tool the SQL CSS team uses on a regular basis, when we do decide to use it for very specific scenarios, its usefulness cannot be put in words. I had talked about using Debug Diag for monitoring memory usage and tracking down allocations right up to the function call. There is another way to track heap allocations, which is what I shall talk about today. I shall use the same MemAllocApp that I used last week, and start off the XPerf monitoring using the following commands:

xperf -on PROC_THREAD+LOADER -BufferSize 1024 -MinBuffers 16 -MaxBuffers 16
xperf -start HeapSession -heap -Pids 9532 -BufferSize 1024 -MinBuffers 128 -MaxBuffers 128 -stackwalk HeapAlloc+HeapRealloc+HeapCreate

Once I have collected the XPerf data, the following command stops the data collection:

xperf -stop HeapSession -stop -d F:\MemAlloc.etl

Once that is done, you should have an ETL file at the location specified by the -d parameter. Since I am interested in the functions that were allocating the most memory, I use the following command to generate a summary report of the heap allocations traced by XPerf:

xperf -i "F:\MemAlloc.etl" -o "F:\MemAlloc.txt" -symbols -a heap -stacks -top 5

/* Output of MemAlloc.txt file */

Results for process MemAllocApp.exe (9532):

---------------------------------------------------------------------

GLOBAL ALLOCATIONS:
Alloc       :         100,     512000.0 KB
Realloc     :           0
Outstanding :         100,     512000.0 KB

---------------------------------------------------------------------

TOP 1:
Alloc       :         100,     512000.0 KB
Realloc     :           0
Outstanding :         100,     512000.0 KB

---------------------------------------------------------------------

MemAllocApp.exe!fn_allocatememory
MemAllocApp.exe!wmain
MemAllocApp.exe!__tmainCRTStartup
MemAllocApp.exe!wmainCRTStartup
kernel32.dll!BaseThreadInitThunk
ntdll.dll!RtlUserThreadStart

Alloc       :         100,     512000.0 KB
Realloc     :           0
Outstanding :         100,     512000.0 KB

 

As you can see from the above output, the function fn_allocatememory was responsible for 100 outstanding allocations totaling 512,000 KB, which works out to roughly 5 MB per allocation. With just a single command I was able to figure out the reason behind the outstanding allocations for my EXE. Troubleshooting SQL Server's outstanding heap allocations may not be as easy as this, but it definitely saves time compared with digging the allocations out of a memory dump.

This method is quite useful when you have a very large ETL file to analyze. You can even configure a circular buffer for capturing the data by appending the following options to your HeapSession tracing command:

-BufferSize 1024 -MaxBuffers 1024 -MaxFile 1024 -FileMode Circular
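Putting that together with the start command used earlier, the circular-buffer variant would look something like the following (a sketch; the PID is the one traced above, and the output file path F:\HeapSession.etl is an assumption):

xperf -start HeapSession -heap -Pids 9532 -stackwalk HeapAlloc+HeapRealloc+HeapCreate -f F:\HeapSession.etl -BufferSize 1024 -MaxBuffers 1024 -MaxFile 1024 -FileMode Circular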

Note: Make sure that you set your _NT_SYMBOL_PATH environment variable correctly if you want the function calls to be resolved correctly.
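For example, a typical value pointing at the Microsoft public symbol server (the local downstream cache C:\symbols is an assumption; any writable folder works):

set _NT_SYMBOL_PATH=srv*C:\symbols*http://msdl.microsoft.com/download/symbols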

Hope you enjoyed this series of Tools Tips and Tricks as much as I enjoyed posting the various methods that I use to collect diagnostic data while troubleshooting SQL performance related issues.

References:
Using Actions to process Heap Data
Enabling Data Capture using XPerf
XPerf Options