Running SQL Nexus using Command Line Parameters


SQL Nexus is a tool that helps you identify the root cause of SQL Server performance issues. It loads and analyzes performance data collected by SQLDiag and PSSDiag. It can dramatically reduce the amount of time you spend manually analyzing data. One of the least commonly known facts is that the SQL Nexus tool also allows you to run it’s reports using command line arguments or even import diagnostic data automatically. The command line help reference is shown below:

/Sserver

Specifies a SQL Server name to connect to

/D”database”

Database to connect to

/E

Log in to SQL using Windows/integrated security

/Uuser

Specifies a SQL (non-Windows) login name

/Ppassword

Specifies the password for a SQL (non-Windows) login

“/Cconnstring”

Specifies the SqlClient ConnectionString to use (can be used instead of /S /E).

“/Ipath”

Import SQL diagnostic data from this path

“/Rreport”

Specifies a report file name to run.

“/Opath”

Specifies an export path for reports executed via /R

/X

Exit after importing (/I) or exporting (/O) the specified report

(/R)

 

/Vparam=value

Specify the value of a form parameter

 

Automating Report Execution

The first scenario that I want to outline is the need to automate a report execution especially when you need to execute the reports repeatedly after activities like performance testing or during some other iterative activity which requires analysis of various sets of diagnostic data.

This can be achieved using the command line parameters shown below:

sqlnexus.exe /S”.” /X /D”sqlnexus” /R”<full or relative path>\Summary.rdl” /O”<path where the log file and report should be exported>”

As you can see from the screenshot below, you can see that I have a standard report i.e. “Bottleneck Analysis_C.xls” that was exported and a non-standard report which I created Summary.xls in the same folder. This also gives additional opportunity to create your own performance reports and schedule them using the SQL Nexus executable.

Report Output

After you have the exported reports, you could also write additional automation to email the reports as required.

Automating Data Import

SQL Nexus also allows you to perform data imports automatically using the command line parameters. The command line shown below performs data import from the specified folder into a SQL Server database called sqlnexustest on the default SQL server instance installed on the machine.

sqlnexus.exe /S”.” /X /D”sqlnexustest” /I”C:\temp\output”

If the database does not exist on the SQL Server instance, then it will be created. The settings for SQL Nexus utility will be used during the data import which were saved when the last time the UI was used. So if you have disabled the profiler trace import from the UI and saved your settings, then the command line execution will not import the profiler traces.

Advertisements

Chasing the Ghost Cleanup in an Availability Group


Because read operations are mapped to snapshot isolation transaction level, the cleanup of ghost records on the primary replica can be blocked by transactions on one or more secondary replicas. The ghost record cleanup task will automatically clean up the ghost records for disk-based tables on the primary replica when they are no longer needed by any secondary replica. This is similar to what is done when you run transaction(s) on the primary replica. In the extreme case on the secondary database, you will need to kill a long running read-query that is blocking the ghost cleanup. Note, the ghost clean can be blocked if the secondary replica gets disconnected or when data movement is suspended on the secondary database. This state also prevents log truncation, so if this state persists, we recommend that you remove this secondary database from the availability group.

The above is a snippet from the official Microsoft documentation for Availability Group Secondary Replicas under the limitations and restrictions section.

So a transaction on a secondary replica can block an operation on a primary replica… Hmm.. Now that smells like a mystery!

Before I go further, let me explain what Ghost Cleanup does. Let me give you the official text from the Books Online.

Deletes operations from a table or update operations that cause a row to move can immediately free up space on a page by removing references to the row. However, under certain circumstances, the row can physically remain on the data page as a ghost record. Ghost records are periodically removed by a background process. This residual data is not returned by the Database Engine in response to queries.

I had some free time a while back and decided to actually track this down to show how the Ghost Cleanup actually works in an availability group replica. My availability group setup was a simple one which had two SQL Server instances sitting across two different subnets as shown in Screenshot 1.

image

So, let’s get the show on the road and let me walk you through walk you the through the ghost cleanup behavior on the secondary replica.

The DML

On my existing Availability Group setup, I inserted a single row in a table of a primary replica database with the value of 3. The logged operations show up as follows in the SQL Server transaction log.. Psst.. Don’t tell anyone that I was reading the log file Winking smile

image

As you can see from the green highlight above the Transaction ID is 11899 (0x2e7b) which inserted a row on Page ID 315 (ox13b).

What was on the page

I verified that the page on the primary replica database had the new entry that I had added into the heap using the Page ID retrieved from the transaction log. Note that we are keeping track of the oldest active transaction as well. The record shows the transaction timestamp which was responsible for the DML operation 11899 (0x2e7b). From Screenshot 2 below, you can see that the version information is maintained and the transaction timestamp shows up correctly (green highlight). The value also shows up correctly (pink highlight).

image

What happened after that?

Then I started a transaction on the secondary replica and executed a SELECT query on the same table with a HOLDLOCK hint to keep the row lock. Then I proceeded to delete both the rows with value 3 in the primary replica. I verified that the rows were not retrieved by a SELECT query on the primary and secondary replica. The transaction log dump from secondary replica shows that the changes were replayed.

Oops! I had to read the transaction log again.

From the green highlights in Screenshot 4, you can see that the GhostCleanupTask transaction ran on the secondary replica. The pink highlights shows that the transaction ID 11900 (0x2e7c) deleted two rows from the Page ID 315 (ox13b). So all is good now.

image

Curiosity killed the cat!

imageWell curiosity got the better of me and I decided to check if the same the story was being told inside the transaction log of the primary replica database. And this is where David Duchovny’s a.k.a. Agent Moulder and Gillian Anderson a.k.a. Agent Scully’s faces from the X-Files will be an apt representation of what I present next.

Screenshot 5 will show that the Ghost Cleanup Task continues to execute on the Primary Replica Database! What now executed? Did we not delete the rows and verify everything was alright…

The first observation is that the transaction log is being replayed to the letter on the secondary replica. Notice that the transaction IDs of the Ghost Cleanup Task correspond with the transaction IDs of the Ghost Cleanup Task found on the secondary replica instance database. It wasn’t a joke when the documentation said that transactions are replayed on the secondary replica!

The yellow highlights show that the rows were deleted from the table that I had performed the delete on. The pink highlights confirm that the same transaction was associated with both the deletes.

I had verified that there were no ghost records in the database when I started the repro. So, the important question was:

Why Ghost Cleanup Task was running repeatedly on the primary replica database?

image

What was really happening: THE EXPLANATION

Since I had an open transaction on the secondary replica database, I had an active version store! Screenshot 6 shows that my active version store on the secondary replica with a transaction sequence number of 11900 (0x2e7c) which matches the transaction ID of the delete operation. This active version stored was created on the delete operation because I had performed a SELECT with a HOLDLOCK earlier as mentioned in this post on the table from the secondary replica.

Then I created a table on the primary replica database and inserted a row in it. I checked if this data was available on the secondary replica and it was!

image

I found that on the primary replica, the database page shows ghost version records (Screenshot 7). The transaction timestamp matches the transaction ID which performed the delete operation i.e. transaction ID 11900 (0x2e7c).

image

Light at the end of the tunnel

Once transaction which I had stated on the secondary replica with the HOLDLOCK hint was committed, the ghost cleanup task was able to perform the cleanup on the primary replica’s page. Once this was completed successfully, the ghost records on the secondary replica were cleaned up promptly as well.

The above behavior is true for both synchronous and asynchronous modes of operation.

I hope this was a fun Friday read! Have a good weekend!

Cache files–Why are you increasing in size?


Management Data Warehouse provides a mechanism to collect Perfmon and DMV outputs and store them in a SQL Server database to help you troubleshoot performance issues that you might have faced in the past. This is quite helpful for intermittent performance issues that you might face on you SQL Server instances as it might not be possible to have someone monitoring the system 24X7 to troubleshoot the performance problem when it is occurring.

Management Data Warehouse provides two ways of collecting data:

1. Cached mode – This stores the Collection Set data in a .cache file in the cache file directory which is uploaded to the MDW database based on the schedule that you defined for your Collection Set.
2. Non-cached mode – This method of collection data using a Collection Set doesn’t depend on local cache. It collects the data pertaining to the Collection Set and uploads the data based on two different configuration options: On Demand or based on a schedule.

MDW_CollectionWorkFlow

The screen shot (left) from Books Online of the run-time processing of the MDW jobs shows how the data is uploaded to the MDW database. When you have a highly transactional system with large number of concurrent connections to your SQL Server instance or a lot of processes running on the Windows Server, the amount of data collected by the System Data Collection Sets can be quite large for a single iteration especially for the Server Activity system data collection set.

So what can happen if the .cache files become very large. Let’s say your collection frequency is set to every 15 minutes. This means that the Data Collection set will collect data every 15 minutes. The scheduled upload will then use this cache file to upload the data using the SSIS upload package(s).

Now what happens if you miss an upload schedule because the upload job ran longer than 15 minutes. This means that your .cache file now has 2 sets of data to be uploaded. This cumulative addition of successive collection sets over a period of time with missed upload schedules can cause a problem once the upload job takes more than 60 minutes. The SSIS package(s) are hard-coded to timeout after 60 minutes of execution. Then you are stuck with cache data in your cache folder which cannot be uploaded.

When the SSIS package times out after 60 minutes of execution, you will get the following error in the job history for the job associated with the upload package:

Server Activity,Error,4861,Server Activity,,Error,The thread “ExecMasterPackage” has timed out 3600 seconds after being signaled to stop.,Upload,17/05/2011 13:11:37,,17/05/2011 14:11:38,3601,<server name>,,,

So what are your options here.

Option 1
1. Change the Server Activity collection to non-cached – This change would ensure that the data is uploaded as soon as it is collected without using an interim cache file.
2. Increase the collection frequency – The collection frequency is set to default of 15 seconds. This can be increased to 30 seconds or higher to prevent very large data being collected between the scheduled upload intervals.
3. Modify the sp_purge_data stored procedure for the purge job to run faster – An updated version of sp_purge_data is available here. This will ensure that the purge job completes faster.

Option 2
Evaluate the perfmon counters being collected in the Server Activity collection set. Find out which perfmon counters are required and then create a custom collection set to capture this data with a non-cached mode for data collection.

You can create your custom perfmon collection set as the default Server Activity collection set has a large number of perfmon counters from which data is collected and uploaded.

Below is a custom collection set created for Performance Monitor counters. I have shown how to collect Server specific perfmon counters and SQL instance specific perfmon counters.

use msdb;

Begin Transaction

Begin Try

Declare @collection_set_id_1 int

Declare @collection_set_uid_2 uniqueidentifier

EXEC [dbo].[sp_syscollector_create_collection_set]

@name=N'Perfmon Counter Collection Set',

@collection_mode=1,

@description=N'Collects Perfmon Counters ',

@target=N'',

@logging_level=0,

@days_until_expiration=7,

@proxy_name=N'',

@schedule_name=N'CollectorSchedule_Every_5min',

@collection_set_id=@collection_set_id_1 OUTPUT,

@collection_set_uid=@collection_set_uid_2 OUTPUT

Select @collection_set_id_1, @collection_set_uid_2


Declare @collector_type_uid_3 uniqueidentifier

Select @collector_type_uid_3 = collector_type_uid From [dbo].[syscollector_collector_types] Where name = N'Performance Counters Collector Type';

Declare @collection_item_id_4 int


EXEC [dbo].[sp_syscollector_create_collection_item]

@name=N'Logical Disk Collection and SQL Server CPU',

@parameters=N'<ns:PerformanceCountersCollector xmlns:ns="DataCollectorType">

<PerformanceCounters Objects="LogicalDisk" Counters="Avg. Disk Bytes/Read" Instances="*" />

<PerformanceCounters Objects="LogicalDisk" Counters="Avg. Disk Bytes/Write" Instances="*" />

<PerformanceCounters Objects="LogicalDisk" Counters="Avg. Disk sec/Read" Instances="*" />

<PerformanceCounters Objects="LogicalDisk" Counters="Avg. Disk sec/Write" Instances="*" />

<PerformanceCounters Objects="LogicalDisk" Counters="Disk Read Bytes/sec" Instances="*" />

<PerformanceCounters Objects="LogicalDisk" Counters="Disk Write Bytes/sec" Instances="*" />

<PerformanceCounters Objects="Process" Counters="% Privileged Time" Instances="*" />

<PerformanceCounters Objects="Process" Counters="% Processor Time" Instances="*" />

<PerformanceCounters Objects="Process" Counters="% User Time" Instances="*" />

<PerformanceCounters Objects="$(INSTANCE):Buffer Manager" Counters="Page life expectancy" Instances="*" />

</ns:PerformanceCountersCollector>',

@collection_item_id=@collection_item_id_4 OUTPUT,

@frequency=5,

@collection_set_id=@collection_set_id_1,

@collector_type_uid=@collector_type_uid_3

select @collection_item_id_4

Commit Transaction;

End Try


Begin Catch

Rollback Transaction;

DECLARE @ErrorMessage NVARCHAR(4000);

DECLARE @ErrorSeverity INT;

DECLARE @ErrorState INT;

DECLARE @ErrorNumber INT;

DECLARE @ErrorLine INT;

DECLARE @ErrorProcedure NVARCHAR(200);

SELECT @ErrorLine = ERROR_LINE(),

@ErrorSeverity = ERROR_SEVERITY(),

@ErrorState = ERROR_STATE(),

@ErrorNumber = ERROR_NUMBER(),

@ErrorMessage = ERROR_MESSAGE(),

@ErrorProcedure = ISNULL(ERROR_PROCEDURE(), '-');


RAISERROR (14684, @ErrorSeverity, 1 , @ErrorNumber, @ErrorSeverity, @ErrorState, @ErrorProcedure, @ErrorLine, @ErrorMessage);


End Catch;

Reference:
How to: Create a Custom Collection Set That Uses the Generic T-SQL Query Collector Type
Using SQL Server 2008 Management Data Warehouse for database monitoring in my application