WOOT: Default Trace and Power View

I have been working on building visualizations for various kinds of analysis that I perform for my customers. One such useful visualization was the use of Power View for analyzing the data available in the SQL Server Default Trace. The query below lets you retrieve all the information in the default traces. This same query is used to populate the Power Pivot table in the Excel file.

declare @enable int

-- Check to find out if Default Server Side traces are running
select top 1 @enable = convert(int,value_in_use) from sys.configurations where name = 'default trace enabled'

if @enable = 1 --default trace is enabled

declare @d1 datetime;
declare @diff int;
declare @curr_tracefilename varchar(500);
declare @base_tracefilename varchar(500);
declare @indx int ;

select @curr_tracefilename = path from sys.traces where is_default = 1 ;

set @curr_tracefilename = reverse(@curr_tracefilename)
select @indx = PATINDEX('%\%', @curr_tracefilename)
set @curr_tracefilename = reverse(@curr_tracefilename)
set @base_tracefilename = LEFT( @curr_tracefilename,len(@curr_tracefilename) - @indx) + '\log.trc';

select EventCat.name as Category, EventID.name as EventName, Events.*
from ::fn_trace_gettable( @base_tracefilename, default ) Events
inner join sys.trace_events EventID
on Events.EventClass = EventID.trace_event_id
inner join sys.trace_categories EventCat
on EventID.category_id = EventCat.category_id


Once I have all the trace data available in my Power Pivot table, I created calculated columns for Day, Hour and Minute. Now that I have all the data readily available for me, I went about creating the main dashboard which provides a view of all the events that occurred along with a time line view. All this took me less than 5 minutes after I had finished writing the query! Pretty quick. Now I have an interactive report that I can use for performing various kinds of analysis.

The screenshot below will show that there was only one event raised for the Server event category and the actual time of occurrence is shown in the line graph. A simple mouse over on the point will give you the exact details. Now isn’t that a simple way to track down events! Smile


I will provide a final version of the Excel sheet once I have completed the other dashboards and sanitized the information available in the Power Pivot table.

MaxBCPThreads not causing parallel export

I had recently run into an issue where the MaxBCPThreads value was set to a non-zero value and the Snapshot Agent was still not creating multiple snapshot files for the table in question which was over 70GB in size.

On further inspection, I did find that the issue was with the statistics distribution of the table in question was affecting the parallel file export by the snapshot agent. It has already been documented that the concurrent snapshot generation option prevents the use of parallel BCP threads by the Distribution Agent. This is documented on the ReplTalk blog on MSDN.

imageNow to find out why the parallel threads were not being generated. So I created a repro of the situation on my side with a transaction replication publication having three tables with the same schema options. The snapshot folder after the snapshot agent’s execution is as shown in the screenshot.

The tblNewArticle6 has 60398 rows, tblNewArticle3 has 100,000 rows and the tblNewArticle1 has 2 rows.

The MaxBCPThreads configured value for the snapshot agent is 6. Surprisingly, tblNewArticle6 has 12 BCP files and the tblNewArticle3 has only one BCP file.

Looking into a profiler trace and the verbose log for the Snapshot Agent, I found out the following information:

1. There is a thread which performs an exploration on what ranges the parallel BCP threads will be fetching. The application name in the profiler trace will show up as “Auxiliary bcp load ordering hint and partitioning resolution thread # for publication <publication name>”. See Screenshot 2 for more details.

2. You will also find dynamic SQL commands executing DBCC SHOW_STATISTICS against the tables in question. image

In my case, tblNewArticle6 has a data distribution graph as shown in screenshot 3. It is clear from the screenshot that the tblNewArticle6 has multiple ranges available for the index id = 1. These were utilized by the Snapshot agent to generate 12 BCP files.



On looking into the Snapshot agent verbose log, I find find the following

Partitioning where clauses for article ‘tblnewArticle6’:

([SalesOrderNumber] is null) or ([SalesOrderNumber] <= convert(nvarchar(7), 0x53004F0035003000380034003400) collate SQL_Latin1_General_CP1_CI_AS)
([SalesOrderNumber] > convert(nvarchar(7), 0x53004F0035003000380034003400) collate SQL_Latin1_General_CP1_CI_AS) and ([SalesOrderNumber] <= convert(nvarchar(7), 0x53004F0035003300310039003500) collate SQL_Latin1_General_CP1_CI_AS)

([SalesOrderNumber] > convert(nvarchar(7), 0x53004F0037003300300037003600) collate SQL_Latin1_General_CP1_CI_AS)

The above ranges translated to 12 ranges which is the same number of files that I see in my snapshot folder.

Now looking into tblNewArticle3 which has 100,000 rows, I have a DBCC SHOW_STATISTICS output which only showed me two distinct ranges. Another point to note was that the primary key of the table has a non-clustered index defined on it and the clustered index key is a non primary key column. On increasing the number of rows in the table, I still find that the number of rows in the table, tblNewArticle3, I still found that the number of BCP files that were being generated was only 1. Then on additional investigation, I found that there is a small requirement of the clustered index to contain the primary key columns for the published article as well. This is evaluated in one of the system stored procedures for replication.

Some deep diving into the profiler traces, snapshot agent logs and debugging the stored procedures will help you arrive at the following conclusion:

Multiple BCP files will be created by the Snapshot Agent only if the following are true:

1. There are multiple distinct ranges available in the clustered index’s statistics histogram to create partitions which can be used by the BCP program to extract the data in parallel.

2. The clustered index must be defined on the primary key columns.

3. MaxBCPThreads value is set to a value other than 1.

If the above is true, then you will see multiple BCP files created by the Snapshot agent for the table of interest. Note that the above information is simplified to ensure that I comply with my NDA. After making the above changes, I see the following output in the repldata folder shown below in the screenshot:


Additional Reference:
Improving snapshot performance using MaxBCPThreads