Chasing the Ghost Cleanup in an Availability Group


Because read operations are mapped to snapshot isolation transaction level, the cleanup of ghost records on the primary replica can be blocked by transactions on one or more secondary replicas. The ghost record cleanup task will automatically clean up the ghost records for disk-based tables on the primary replica when they are no longer needed by any secondary replica. This is similar to what is done when you run transaction(s) on the primary replica. In the extreme case on the secondary database, you will need to kill a long running read-query that is blocking the ghost cleanup. Note, the ghost clean can be blocked if the secondary replica gets disconnected or when data movement is suspended on the secondary database. This state also prevents log truncation, so if this state persists, we recommend that you remove this secondary database from the availability group.

The above is a snippet from the official Microsoft documentation for Availability Group Secondary Replicas under the limitations and restrictions section.

So a transaction on a secondary replica can block an operation on a primary replica… Hmm.. Now that smells like a mystery!

Before I go further, let me explain what Ghost Cleanup does. Let me give you the official text from the Books Online.

Deletes operations from a table or update operations that cause a row to move can immediately free up space on a page by removing references to the row. However, under certain circumstances, the row can physically remain on the data page as a ghost record. Ghost records are periodically removed by a background process. This residual data is not returned by the Database Engine in response to queries.

I had some free time a while back and decided to actually track this down to show how the Ghost Cleanup actually works in an availability group replica. My availability group setup was a simple one which had two SQL Server instances sitting across two different subnets as shown in Screenshot 1.

image

So, let’s get the show on the road and let me walk you through walk you the through the ghost cleanup behavior on the secondary replica.

The DML

On my existing Availability Group setup, I inserted a single row in a table of a primary replica database with the value of 3. The logged operations show up as follows in the SQL Server transaction log.. Psst.. Don’t tell anyone that I was reading the log file Winking smile

image

As you can see from the green highlight above the Transaction ID is 11899 (0x2e7b) which inserted a row on Page ID 315 (ox13b).

What was on the page

I verified that the page on the primary replica database had the new entry that I had added into the heap using the Page ID retrieved from the transaction log. Note that we are keeping track of the oldest active transaction as well. The record shows the transaction timestamp which was responsible for the DML operation 11899 (0x2e7b). From Screenshot 2 below, you can see that the version information is maintained and the transaction timestamp shows up correctly (green highlight). The value also shows up correctly (pink highlight).

image

What happened after that?

Then I started a transaction on the secondary replica and executed a SELECT query on the same table with a HOLDLOCK hint to keep the row lock. Then I proceeded to delete both the rows with value 3 in the primary replica. I verified that the rows were not retrieved by a SELECT query on the primary and secondary replica. The transaction log dump from secondary replica shows that the changes were replayed.

Oops! I had to read the transaction log again.

From the green highlights in Screenshot 4, you can see that the GhostCleanupTask transaction ran on the secondary replica. The pink highlights shows that the transaction ID 11900 (0x2e7c) deleted two rows from the Page ID 315 (ox13b). So all is good now.

image

Curiosity killed the cat!

imageWell curiosity got the better of me and I decided to check if the same the story was being told inside the transaction log of the primary replica database. And this is where David Duchovny’s a.k.a. Agent Moulder and Gillian Anderson a.k.a. Agent Scully’s faces from the X-Files will be an apt representation of what I present next.

Screenshot 5 will show that the Ghost Cleanup Task continues to execute on the Primary Replica Database! What now executed? Did we not delete the rows and verify everything was alright…

The first observation is that the transaction log is being replayed to the letter on the secondary replica. Notice that the transaction IDs of the Ghost Cleanup Task correspond with the transaction IDs of the Ghost Cleanup Task found on the secondary replica instance database. It wasn’t a joke when the documentation said that transactions are replayed on the secondary replica!

The yellow highlights show that the rows were deleted from the table that I had performed the delete on. The pink highlights confirm that the same transaction was associated with both the deletes.

I had verified that there were no ghost records in the database when I started the repro. So, the important question was:

Why Ghost Cleanup Task was running repeatedly on the primary replica database?

image

What was really happening: THE EXPLANATION

Since I had an open transaction on the secondary replica database, I had an active version store! Screenshot 6 shows that my active version store on the secondary replica with a transaction sequence number of 11900 (0x2e7c) which matches the transaction ID of the delete operation. This active version stored was created on the delete operation because I had performed a SELECT with a HOLDLOCK earlier as mentioned in this post on the table from the secondary replica.

Then I created a table on the primary replica database and inserted a row in it. I checked if this data was available on the secondary replica and it was!

image

I found that on the primary replica, the database page shows ghost version records (Screenshot 7). The transaction timestamp matches the transaction ID which performed the delete operation i.e. transaction ID 11900 (0x2e7c).

image

Light at the end of the tunnel

Once transaction which I had stated on the secondary replica with the HOLDLOCK hint was committed, the ghost cleanup task was able to perform the cleanup on the primary replica’s page. Once this was completed successfully, the ghost records on the secondary replica were cleaned up promptly as well.

The above behavior is true for both synchronous and asynchronous modes of operation.

I hope this was a fun Friday read! Have a good weekend!

A full house–We could not have asked for more


The turnout at today’s event had an overwhelming response today. A big thank you to everyone who attended today’s event. An overwhelming response… So thank you everyone who attended and made this UG meeting a grand success! This was our very own version of introducing SQL Server 2014 to the SQLBangaloreUG community!

We had a brilliant line-up of topics… Just to re-cap the day.. Here is a short summary..

The day started off with a keynote session from Anupam Tiwari who is Business Program Manager and closely works with the team that supports developer technologies at Microsoft. He talked about where IT as we know it today is headed tomorrow.

Sourabh Agarwal [Blog | Twitter] covered the performance enhancements in SQL Server 2014 and explained why this version is bigger, faster and stronger! Smile

Then I had the opportunity of explaining the new enhancements in backup and restore in SQL Server 2014. I talked about backing up to a URL i.e. a storage account in Azure, managed backups to Azure and encrypted backups. The slide deck from today’s presentation is available below.

This was followed by Pinal Dave’s [Blog | Twitter] session on “Performing an effective Presentation” One key takeaway from this session is that the art of story-telling can weave out a presentation which can keep the audience hooked! He demonstrated this during his session and kept the audience awake after lunch! Smile

Then we had Balmukund Lakhani [Blog | Twitter] talking about InMemory Enhancements with SQL Server 2014. You can follow his SQL Server 2014 learning series to gain in-depth insights into the various enhancements that he talked about.

His session was followed by Vinod Kumar [Blog | Twitter] who talked about the Some more lesser known enhancements with SQL Server 2014 where he talked about managed lock priority, delayed durability and more.

And last but not the least was Kane Conway – (Support Escalation Engineer at Microsoft) who talked about Power Packed – Power BI with SQL Server. If you have not played around with PowerBI, then you should start now! Presenting data with compelling visualizations just got a new definition.

image image

It was a house full event and I had to put in two pictures to showcase the turnout today! It is not possible to have a successful event without the audience participation. We personally want to apologize to folks who came in late and had to return because the hall was overcrowded and the seats were full. Please do come in early for next UGs for sure.

A special thanks to Microsoft GTSC for giving us the space and the support till date. Also a special thanks to SQL Server Product Marketing team for sponsoring this Launch Event. Finally, a special thanks to team Pluralsight for supporting the UG till date and giving us numerous gifts as giveaway at the event.

The presentation that I used at the event today is available below:

Bangalore Community gets a peek at SQL Server 2014


SQL Server 2014 which was recently released will now receive special attention at the SQL Bangalore UG Meet scheduled to happen on Saturday, May 3rd, 2014. We are  conducting a full-day event for a SQL Server 2014 Community Launch at the SQL Bangalore UG. So don’t miss the opportunity to come and learn some of the new cool features introduced with SQL Server 2014.

I will be presenting on Backup Enhancements in SQL Server 2014 and will show you some exciting new features. This will get you thinking about your backup strategy when you upgrade to SQL Server 2014.

Just to tickle your learning taste buds, the agenda for the day is mentioned below:

Backup Enhancements with SQL Server 2014
Performance Enhancements with SQL Server 2014
Sourabh Agarwal [Blog | Twitter] – An active member on SQLBangalore community!

LUNCH BREAK

Performing an effective Presentation by Pinal Dave [Blog | Twitter] – He does not require an introduction!
InMemory Enhancements with SQL Server 2014 by Balmukund Lakhani [Blog | Twitter] – One of the most helpful people in the SQL community!
Some more lesser known enhancements with SQL Server 2014 by Vinod Kumar [Blog | Twitter] – Another individual who needs no introduction!
Power Packed – Power BI with SQL Server by Kane Conway – (Support Escalation Engineer at Microsoft) – A technical expert who will explain to about what PowerBI can do for you and your organization!

Register for the event (registration link) and feel free to pass along the message. The day will have an early start… 9.30AM!! Get your learning hats on and come visit us on Saturday!

Venue:
Microsoft Corporation, Signature Building,
Embassy Golf Links Business Park,
Intermediate Ring Road,
Domlur, Bangalore – 560071