Chasing the Ghost Cleanup in an Availability Group

Posted on July 25, 2014 by Amit Banerjee

Because read operations are mapped to snapshot isolation transaction level, the cleanup of ghost records on the primary replica can be blocked by transactions on one or more secondary replicas. The ghost record cleanup task will automatically clean up the ghost records for disk-based tables on the primary replica when they are no longer needed by any secondary replica. This is similar to what is done when you run transaction(s) on the primary replica. In the extreme case on the secondary database, you will need to kill a long running read-query that is blocking the ghost cleanup. Note, the ghost cleanup can be blocked if the secondary replica gets disconnected or when data movement is suspended on the secondary database. This state also prevents log truncation, so if this state persists, we recommend that you remove this secondary database from the availability group.

The above is a snippet from the official Microsoft documentation for Availability Group Secondary Replicas under the limitations and restrictions section.

So a transaction on a secondary replica can block an operation on a primary replica… Hmm.. Now that smells like a mystery!

Before I go further, let me explain what Ghost Cleanup does. Let me give you the official text from the Books Online.

Delete operations from a table or update operations that cause a row to move can immediately free up space on a page by removing references to the row. However, under certain circumstances, the row can physically remain on the data page as a ghost record. Ghost records are periodically removed by a background process. This residual data is not returned by the Database Engine in response to queries.
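If you want to see whether a table is carrying ghost records at a given moment, a query along the following lines can help. This is only a sketch: dbo.tblGhostDemo is a hypothetical table name, and the DETAILED scan mode is used because the ghost columns come back as NULL in the default LIMITED mode.

```sql
-- Hypothetical table name; replace it with the table you are investigating.
SELECT index_id,
       alloc_unit_type_desc,
       ghost_record_count,          -- ghosts ready for removal by the ghost cleanup task
       version_ghost_record_count   -- ghosts retained by an outstanding snapshot isolation transaction
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.tblGhostDemo'), NULL, NULL, 'DETAILED');
```

A DETAILED scan reads every page of the object, so keep it to small tables or a test system.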

I had some free time a while back and decided to track this down to show how Ghost Cleanup actually works in an availability group replica. My availability group setup was a simple one which had two SQL Server instances sitting across two different subnets as shown in Screenshot 1.

So, let’s get the show on the road and let me walk you through the ghost cleanup behavior on the secondary replica.

The DML

On my existing Availability Group setup, I inserted a single row with the value 3 into a table in the primary replica database. The logged operations show up as follows in the SQL Server transaction log. (Psst.. Don’t tell anyone that I was reading the log file.)

As you can see from the green highlight above, the Transaction ID is 11899 (0x2e7b), which inserted a row on Page ID 315 (0x13b).
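The post does not say how the log was read. One common way, and purely an assumption here, is the undocumented and unsupported fn_dblog function, so treat the following as a sketch for a test system only. The table name in the filter is hypothetical.

```sql
-- fn_dblog is undocumented; NULL, NULL means "from the start to the end of the active log".
SELECT [Current LSN],
       [Operation],
       [Context],
       [Transaction ID],
       [Page ID],
       [Slot ID],
       [Transaction Name]
FROM fn_dblog(NULL, NULL)
WHERE [AllocUnitName] LIKE N'dbo.tblGhostDemo%'   -- hypothetical table name
   OR [Transaction Name] IS NOT NULL;             -- begin-transaction records carry the name
```

The Transaction ID and Page ID columns are reported in hexadecimal, which is why the values above are quoted in both decimal and hex.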

What was on the page

I verified that the page on the primary replica database had the new entry that I had added into the heap using the Page ID retrieved from the transaction log. Note that we are keeping track of the oldest active transaction as well. The record shows the transaction timestamp which was responsible for the DML operation 11899 (0x2e7b). From Screenshot 2 below, you can see that the version information is maintained and the transaction timestamp shows up correctly (green highlight). The value also shows up correctly (pink highlight).
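For readers who want to look at a page themselves, the usual tool is the undocumented DBCC PAGE command. The sketch below assumes a hypothetical database name and assumes that the page lives in file id 1; only the page id 315 comes from this walkthrough.

```sql
DBCC TRACEON (3604);                      -- send DBCC output to the client session
DBCC PAGE (N'MyAGDatabase', 1, 315, 3);   -- database name (hypothetical), file id (assumed), page id, print option
```

Print option 3 dumps each record on the page along with its header, which is where the ghost and version information show up.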

What happened after that?

Then I started a transaction on the secondary replica and executed a SELECT query on the same table with a HOLDLOCK hint to keep the row lock. Then I proceeded to delete both the rows with value 3 in the primary replica. I verified that the rows were not retrieved by a SELECT query on the primary and secondary replica. The transaction log dump from secondary replica shows that the changes were replayed.
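The sequence of statements can be sketched roughly as follows. The table and column names are hypothetical, since the post does not list them, and the two sessions are connected to different replicas.

```sql
-- Session 1: connected to the readable secondary replica
BEGIN TRANSACTION;
SELECT * FROM dbo.tblGhostDemo WITH (HOLDLOCK);
-- ...leave this transaction open...

-- Session 2: connected to the primary replica
DELETE FROM dbo.tblGhostDemo WHERE col1 = 3;

-- Session 1, once you are done observing: end the reader transaction
COMMIT TRANSACTION;
```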

(Oops! I had to read the transaction log again.)

From the green highlights in Screenshot 4, you can see that the GhostCleanupTask transaction ran on the secondary replica. The pink highlights show that the transaction ID 11900 (0x2e7c) deleted two rows from the Page ID 315 (0x13b). So all is good now.

Curiosity killed the cat!

Well, curiosity got the better of me and I decided to check if the same story was being told inside the transaction log of the primary replica database. And this is where the faces of David Duchovny a.k.a. Agent Mulder and Gillian Anderson a.k.a. Agent Scully from the X-Files will be an apt representation of what I present next.

Screenshot 5 will show that the Ghost Cleanup Task continues to execute on the Primary Replica Database! What was it doing now? Did we not delete the rows and verify everything was alright…

The first observation is that the transaction log is being replayed to the letter on the secondary replica. Notice that the transaction IDs of the Ghost Cleanup Task correspond with the transaction IDs of the Ghost Cleanup Task found on the secondary replica instance database. It wasn’t a joke when the documentation said that transactions are replayed on the secondary replica!

The yellow highlights show that the rows were deleted from the table that I had performed the delete on. The pink highlights confirm that the same transaction was associated with both the deletes.

I had verified that there were no ghost records in the database when I started the repro. So, the important question was:

Why was the Ghost Cleanup Task running repeatedly on the primary replica database?

What was really happening: THE EXPLANATION

Since I had an open transaction on the secondary replica database, I had an active version store! Screenshot 6 shows my active version store on the secondary replica with a transaction sequence number of 11900 (0x2e7c), which matches the transaction ID of the delete operation. This active version store was created on the delete operation because I had earlier performed a SELECT with a HOLDLOCK hint on the table from the secondary replica, as mentioned in this post.
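To find the readers that are holding row versions open, one option is the documented sys.dm_tran_active_snapshot_database_transactions DMV. The query below is a minimal sketch meant to be run on the secondary replica; it is not necessarily the query behind the screenshot.

```sql
-- Longest-running version store readers first
SELECT session_id,
       transaction_id,
       transaction_sequence_num,
       first_snapshot_sequence_num,
       elapsed_time_seconds
FROM sys.dm_tran_active_snapshot_database_transactions
ORDER BY elapsed_time_seconds DESC;
```

The session_id of a long-running row is the candidate to investigate, or to kill in the extreme case described in the documentation excerpt at the top of this post.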

Then I created a table on the primary replica database and inserted a row in it. I checked if this data was available on the secondary replica and it was!

I found that on the primary replica, the database page shows ghost version records (Screenshot 7). The transaction timestamp matches the transaction ID which performed the delete operation i.e. transaction ID 11900 (0x2e7c).

Light at the end of the tunnel

Once the transaction which I had started on the secondary replica with the HOLDLOCK hint was committed, the ghost cleanup task was able to perform the cleanup on the primary replica’s page. Once this was completed successfully, the ghost records on the secondary replica were cleaned up promptly as well.

The above behavior is true for both synchronous and asynchronous modes of operation.

I hope this was a fun Friday read! Have a good weekend!

MaxBCPThreads not causing parallel export

Posted on September 23, 2013 by Amit Banerjee

I had recently run into an issue where the MaxBCPThreads value was set to a non-zero value and the Snapshot Agent was still not creating multiple snapshot files for the table in question which was over 70GB in size.

On further inspection, I found that the statistics distribution of the table in question was affecting the parallel file export by the Snapshot Agent. It has already been documented that the concurrent snapshot generation option prevents the use of parallel BCP threads by the Distribution Agent. This is documented on the ReplTalk blog on MSDN.

Now to find out why the parallel threads were not being generated. So I created a repro of the situation on my side with a transactional replication publication having three tables with the same schema options. The snapshot folder after the Snapshot Agent’s execution is as shown in the screenshot.

tblNewArticle6 has 60,398 rows, tblNewArticle3 has 100,000 rows and tblNewArticle1 has 2 rows.

The MaxBCPThreads configured value for the snapshot agent is 6. Surprisingly, tblNewArticle6 has 12 BCP files and the tblNewArticle3 has only one BCP file.

Looking into a profiler trace and the verbose log for the Snapshot Agent, I found out the following information:

1. There is a thread which performs an exploration on what ranges the parallel BCP threads will be fetching. The application name in the profiler trace will show up as “Auxiliary bcp load ordering hint and partitioning resolution thread # for publication <publication name>”. See Screenshot 2 for more details.

2. You will also find dynamic SQL commands executing DBCC SHOW_STATISTICS against the tables in question.

In my case, tblNewArticle6 has a data distribution graph as shown in screenshot 3. It is clear from the screenshot that the tblNewArticle6 has multiple ranges available for the index id = 1. These were utilized by the Snapshot agent to generate 12 BCP files.
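You can look at the same histogram yourself with DBCC SHOW_STATISTICS. In this sketch the dbo schema and the index name PK_tblNewArticle6 are assumptions; substitute the actual name of the clustered index (index id = 1).

```sql
-- Each row of the histogram output is one step (range) of the leading key column
DBCC SHOW_STATISTICS (N'dbo.tblNewArticle6', N'PK_tblNewArticle6') WITH HISTOGRAM;
```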

On looking into the Snapshot Agent verbose log, I find the following:

Partitioning where clauses for article ‘tblnewArticle6’:

([SalesOrderNumber] is null) or ([SalesOrderNumber] <= convert(nvarchar(7), 0x53004F0035003000380034003400) collate SQL_Latin1_General_CP1_CI_AS)
([SalesOrderNumber] > convert(nvarchar(7), 0x53004F0035003000380034003400) collate SQL_Latin1_General_CP1_CI_AS) and ([SalesOrderNumber] <= convert(nvarchar(7), 0x53004F0035003300310039003500) collate SQL_Latin1_General_CP1_CI_AS)
…
…
([SalesOrderNumber] > convert(nvarchar(7), 0x53004F0037003300300037003600) collate SQL_Latin1_General_CP1_CI_AS)

The above ranges translated to 12 ranges which is the same number of files that I see in my snapshot folder.

Now looking into tblNewArticle3 which has 100,000 rows, I have a DBCC SHOW_STATISTICS output which only showed me two distinct ranges. Another point to note was that the primary key of the table has a non-clustered index defined on it and the clustered index key is a non-primary-key column. Even after increasing the number of rows in tblNewArticle3, I still found that only 1 BCP file was being generated. Then on additional investigation, I found that there is a small requirement that the clustered index contain the primary key columns for the published article as well. This is evaluated in one of the system stored procedures for replication.
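A quick way to compare the clustered index key with the primary key of an article is to list both from the catalog views. This is a sketch using documented views only; the dbo schema is an assumption.

```sql
-- Lists the key columns of the clustered index and of the primary key index
SELECT i.name AS index_name,
       i.type_desc,
       i.is_primary_key,
       c.name AS column_name,
       ic.key_ordinal
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
  ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
  ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.tblNewArticle3')
  AND (i.type = 1 OR i.is_primary_key = 1)   -- type 1 = clustered
  AND ic.is_included_column = 0
ORDER BY i.index_id, ic.key_ordinal;
```

If the clustered index and the primary key come back as two different indexes with different columns, the table is in the same situation as tblNewArticle3.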

Some deep diving into the profiler traces, snapshot agent logs and debugging the stored procedures will help you arrive at the following conclusion:

Multiple BCP files will be created by the Snapshot Agent only if the following are true:

1. There are multiple distinct ranges available in the clustered index’s statistics histogram to create partitions which can be used by the BCP program to extract the data in parallel.

2. The clustered index must be defined on the primary key columns.

3. MaxBCPThreads value is set to a value other than 1.

If all of the above are true, then you will see multiple BCP files created by the Snapshot Agent for the table of interest. Note that the above information is simplified to ensure that I comply with my NDA. After making the above changes, I see the following output in the repldata folder shown below in the screenshot:

Additional Reference:
Improving snapshot performance using MaxBCPThreads

View query plans with non admin accounts

Posted on August 5, 2013 by Amit Banerjee

A question was posted on Twitter’s #sqlhelp hashtag on how to allow non-sysadmin accounts or accounts which do not have the View Server State privilege to fetch the cached query plans. This can be a very big deal if you are a hosting company for databases. You would want to restrict access to all non-admin users but at the same time you might need to give your customers and their developers access to their query plans to assist with their query tuning efforts.

The first thing that comes to mind is to use impersonation in SQL Server. Implementing it is where the fun is!

Let’s consider this hypothetical scenario. I have a SQL Server login named Sentinel which would be granted View Server State privileges only. I have another SQL Server login named Agent which is neither a sysadmin nor a login with View Server State privileges. The Agent login needs to look up plans of stored procedures which are executing in the current database.

The first task would be to grant View Server State privileges to Sentinel and impersonate privileges to Agent.

USE [master]
GO
GRANT VIEW SERVER STATE TO [sentinel];
GRANT IMPERSONATE ON LOGIN::[sentinel] TO [agent];
GO

Next I create a stored procedure which allows the login agent to see the cached plans which are available for the current database context.

CREATE PROCEDURE [dbo].[usp_SeePlan]
AS
BEGIN
SET NOCOUNT ON
EXECUTE AS LOGIN = 'sentinel';
SELECT 'Executed as user: ' + SUSER_NAME()
SELECT * FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_query_plan (plan_handle) qp
CROSS APPLY sys.dm_exec_sql_text (plan_handle) qt
WHERE qp.dbid = db_id();
/* Add your diagnostic query here
The above query is an example */
REVERT;
SELECT 'Reverted back to user: ' + SUSER_NAME()
END

Next when I attempt to grant permissions to the login agent to execute the above stored procedure, I will get the error shown below:

Msg 15151, Level 16, State 1, Line 1
Cannot find the user ‘agent’, because it does not exist or you do not have permission.

This error is reported because there is no database user mapped to the login agent. So I create a database user for agent and grant execute permissions to the login agent for the stored procedure using the commands below:

CREATE USER [agent] FOR LOGIN [agent] WITH DEFAULT_SCHEMA=[dbo];
GRANT EXECUTE ON OBJECT::[usp_SeePlan] TO agent;

Next I will try to execute the stored procedure with the agent login. Everything should work now! But alas.. I get another error as shown below:

Msg 916, Level 14, State 1, Procedure usp_SeePlan, Line 5
The server principal “sentinel” is not able to access the database “Test” under the current security context.

This is because the login sentinel has no user in the database Test, in which the stored procedure exists. Next I will create a database user mapped to the login sentinel.

CREATE USER [sentinel] FOR LOGIN [sentinel] WITH DEFAULT_SCHEMA=[dbo]

Now when I execute the stored procedure, everything works!! Screenshot of the output is shown below.

One other caveat to note is that if the stored procedure is created with the WITH EXECUTE AS option, then executing the procedure reports the following error:

Msg 297, Level 16, State 1, Procedure usp_SeePlan, Line 8
The user does not have permission to perform this action.

So the summary is the following sequence:

1. Grant impersonate rights to Agent for Sentinel

2. Create a stored procedure which does the work which requires View Server State privilege using the EXECUTE AS LOGIN = ‘Sentinel’ statement

3. Create two database users in the database which are mapped to Agent and Sentinel

4. Grant execute privileges on the stored procedure which was created in Step #2.

Voila… You are now ready to impersonate and view the query plan!
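If you want to check the setup without logging in as Agent, you can impersonate the login from a sysadmin session. This is a sketch under the assumptions of this post (database Test, logins agent and sentinel); the HAS_PERMS_BY_NAME call is an addition of mine to show whether the login itself holds the server permission.

```sql
USE [Test];
GO
EXECUTE AS LOGIN = 'agent';
-- Reports whether agent itself holds VIEW SERVER STATE (1 = yes, 0 = no)
SELECT HAS_PERMS_BY_NAME(NULL, NULL, 'VIEW SERVER STATE') AS HasViewServerState;
-- The procedure switches to sentinel internally to read the plan cache
EXEC [dbo].[usp_SeePlan];
REVERT;
GO
```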

Awesomesauce: Finding out missing sequences

Posted on September 14, 2012 by Amit Banerjee

Another of those #sqlhelp inspired posts. This was around ways to track down missing numbers in a sequence table. This task can be done in multiple ways and, as I like to say, there are multiple ways to skin a cat! However, this blog post is about using the new T-SQL enhancements in SQL Server 2012 to figure out missing sequential numbers. I will demonstrate this using an example.

I have a table tblsequences which has two integer columns with the primary key being an int datatype. I did some random inserts into the table. The script below can be used to find out missing sequential numbers in an identity column or an integer column which is supposed to store sequential values.


-- Replace starting value with minimum starting value and increment for your sequence
-- Replace the table name with the table name that you are interested in
declare @startvalue int = 1, @increment int = 1
;with cte as
(
select a,(a-lag(a,1) OVER (ORDER BY a)) as MissingSequences
from tblsequences
)
select a, (MissingSequences/@increment)-1 as MissingSequences
from cte
where MissingSequences > @increment
union all
-- Gap between the starting value and the smallest value present in the table
select MIN(a),
CASE (MIN(a)- @startvalue)/@increment
when 0 then null
else (MIN(a)- @startvalue)/@increment
end as MissingSequences
from tblsequences
order by a

The output is as shown below in the screenshot. You will notice that the first column reports the primary key value and the MissingSequences column reports the number of missing sequential values less than the value in the first column. You will need to change the table and column names in the two scripts shown in this blog post to match the tables that you want analyzed.
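To try the script on a scratch database, you can seed a small table such as the one below. The column b and the inserted values are made up for illustration; only the table name and the column a come from the script above.

```sql
-- Hypothetical test data: values 1, 2, 5, 6, 9, 10 and 11 are deliberately left out
CREATE TABLE tblsequences (a int NOT NULL PRIMARY KEY, b int NULL);
INSERT INTO tblsequences (a, b)
VALUES (3, 0), (4, 0), (7, 0), (8, 0), (12, 0);
```

With @startvalue = 1 and @increment = 1, the script should report 2 missing values before 3, 2 before 7 and 3 before 12.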

Next, let’s talk about Sequences, which is a new feature in SQL Server 2012. I created a new sequence using a start value of 10 and an increment value of 3.

I used the Sequence which was created for generating OrderID values in two different tables. Now we are posed with an interesting problem. I have two different tables in which I want to compare missing sequence numbers. This can be done with the T-SQL code below.
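A sequence like this can be created as sketched below. The name dbo.TestSeq is taken from the script later in this post; the int data type, the table layout and the OrderDesc column are assumptions for illustration.

```sql
CREATE SEQUENCE [dbo].[TestSeq] AS int
    START WITH 10
    INCREMENT BY 3;
GO
-- Hypothetical table layout; only the table name and OrderID appear in this post
CREATE TABLE [dbo].[tblTestSeq] (OrderID int NOT NULL PRIMARY KEY, OrderDesc varchar(50) NULL);
GO
-- Each NEXT VALUE FOR call consumes one value from the sequence
INSERT INTO [dbo].[tblTestSeq] (OrderID, OrderDesc)
VALUES (NEXT VALUE FOR [dbo].[TestSeq], 'first order');
```

Because a value is consumed even when the insert is rolled back or the value is never stored, gaps are a normal side effect of using sequences.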


-- Replace schema name, table name(s) and sequence name as appropriate
declare @startvalue int = 1,@interval int = 1, @seqname sysname = 'TestSeq', @schemaname sysname = 'dbo'
select @startvalue = TRY_CAST(TRY_CAST(start_value as varchar(255)) as int),
@interval = TRY_CAST(TRY_CAST(increment as varchar(255)) as int)
from sys.sequences
where name = @seqname and [schema_id] = (select [schema_id] from sys.schemas where name = @schemaname)

if (@startvalue IS NOT NULL and @interval IS NOT NULL)
begin
;with cte as
(
select OrderID,(OrderID-lag(OrderID,1) over (order by OrderID)) as MissingSequences
from
(select OrderId as OrderID
from tblTestSeq
union all
select OrderId as OrderID
from tblTestSeq_2) A
)
select OrderID, (MissingSequences/@interval)-1 as MissingSequences
from cte
where MissingSequences > @interval
union all
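-- Gap between the sequence start value and the smallest OrderID across both tables
select MIN(OrderID),
CASE (MIN(OrderID)- @startvalue)/@interval
when 0 then null
else (MIN(OrderID)- @startvalue)/@interval
end as MissingSequences
from
(select OrderId as OrderID
from tblTestSeq
union all
select OrderId as OrderID
from tblTestSeq_2) B
order by OrderID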
end
else
begin
PRINT 'CAST FAILED'
end

The logic used is the same as the logic when we were looking for missing sequences in identity/integer columns described in the first half of the post. However, since sequences can be used for one or more tables, I have used a UNION query to bring together the values from all the columns which use the same sequence object. The second result set of the output in the screenshot below shows the tables involved along with the sequential OrderIDs. As you can see below, there are 4 sequence numbers missing: 10, 13, 16 and 28. The first result set reports exactly that by notifying that there are 3 sequence numbers missing less than 19 and 1 sequence number missing less than 31.

The output of the above T-SQL script is shown in the screenshot below.

Yet another way in which the T-SQL enhancements in SQL Server 2012 can make your life easier.

Happy T-SQL coding and have a great weekend!

Moving those large files for secondary databases

Posted on September 13, 2012 by Amit Banerjee

I remember seeing a question on the #sqlhelp hashtag on how to move the secondary database files to a new physical location. While this might seem a mundane task, it can throw up a few surprises. If you don’t want to tear down your log shipping configuration and re-establish it with the files in the new physical location, then this post will definitely interest you.

If you are running your log shipping in standby mode, you will first need to switch to norecovery mode. This is due to the fact that you cannot execute an ALTER DATABASE command on a standby database as it is not writable. You will be presented with the following error if you attempt to do so:

Msg 5004, Level 16, State 4, Line 1
To use ALTER DATABASE, the database must be in a writable state in which a checkpoint can be executed.

So to avoid the above issue, you need to switch to norecovery mode. If you are already running in this mode, then you have one task reduced from your list. Keep in mind that the switch of the operating mode takes effect only after the next log restore operation.
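The switch can be made from the log shipping properties in Management Studio or, as sketched here, with the documented sp_change_log_shipping_secondary_database procedure. LogShipDB is a hypothetical database name.

```sql
-- Run in master on the secondary server
EXEC master.dbo.sp_change_log_shipping_secondary_database
     @secondary_database = N'LogShipDB',   -- hypothetical name
     @restore_mode = 0;                    -- 0 = NORECOVERY, 1 = STANDBY
```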

Now here is the exact set of steps that you need to follow:

1. Disable the Log Shipping Restore SQL Agent job.
2. Use the ALTER DATABASE command to change the location of the secondary database files.
3. Stop the SQL Server instance hosting the secondary database files.
4. Move the secondary database files to the new location as updated in Step #2. (OS File Copy)
5. Start the SQL Server instance.
6. Enable the Log Shipping Restore SQL Agent job.
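The job and file-location steps above can be sketched in T-SQL as follows. Every name and path here (the job name, LogShipDB, the logical file names and the E:\NewLocation folder) is hypothetical; look up your own values first.

```sql
-- Disable the restore job on the secondary server
EXEC msdb.dbo.sp_update_job @job_name = N'LSRestore_PrimaryServer_LogShipDB', @enabled = 0;

-- Find the logical file names and the current physical locations
SELECT name, physical_name
FROM sys.master_files
WHERE database_id = DB_ID(N'LogShipDB');

-- Point the catalog at the new location; this takes effect at the next database startup
ALTER DATABASE [LogShipDB] MODIFY FILE (NAME = N'LogShipDB_Data', FILENAME = N'E:\NewLocation\LogShipDB.mdf');
ALTER DATABASE [LogShipDB] MODIFY FILE (NAME = N'LogShipDB_Log', FILENAME = N'E:\NewLocation\LogShipDB_log.ldf');

-- ...stop the instance, move the files at the OS level, start the instance...

-- Re-enable the restore job
EXEC msdb.dbo.sp_update_job @job_name = N'LSRestore_PrimaryServer_LogShipDB', @enabled = 1;
```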

Verify that the log shipping jobs are running without any errors. The reason you need to stop the SQL Server instance is that ALTER DATABASE…OFFLINE doesn’t work when the database is in RESTORING state. You will be blocked by the following error if you attempt to set the database offline:

Msg 5052, Level 16, State 1, Line 1
ALTER DATABASE is not permitted while a database is in the Restoring state.
Msg 5069, Level 16, State 1, Line 1
ALTER DATABASE statement failed.

Hope the above helps in moving the database files of the secondary databases which are configured for log shipping. The steps mentioned above work for both data and log files.

TroubleshootingSQL
