Concurrent backups: What is the big deal?

You are probably wondering why am I writing a blog post on concurrent backups! Well this is a topic that brings up a few questions during my customer visits. Apart from that the biggest reason, I want to talk about concurrent backups is because of backup softwares that you have active in your environments.

A lot of the backup softwares available in the market have the option of scheduling backups of the entire system. This involves backing up of SQL Server databases using the interfaces exposed through sqlvdi.dll. Ain’t your software really smart? However, sometimes the right hand doesn’t really talk to the left hand! It is quite possible that your server admins have scheduled backups of your systems around the same time you have scheduled a maintenance plan to take a backup of the SQL Server instance.

If you initiate two differential or full database backups concurrently, then you will experience a blocking condition as shown in the table below. SPID 53 was taking a VDI full-backup using the SQL Server Backup Simulator tool where as SPID 55 was also taking a native full-backup of the same database. In such a scenario, you will find that the backup which was running first will complete and only then will the second backup will continue. The second concurrent backup will be blocked during the entire duration of the first backup.

spid blocked waittime lastwaittype waitresource cmd dbid
53 0 6302 ASYNC_IO_COMPLETION   BACKUP DATABASE 8
53 0 0 MISCELLANEOUS   BACKUP DATABASE 8
53 0 412 BACKUPBUFFER   BACKUP DATABASE 8
53 0 412 BACKUPIO   BACKUP DATABASE 8
55 53 2386 LCK_M_U DB: 8 [BULKOP_BACKUP_DB] BACKUP DATABASE 8

This issue doesn’t manifest itself when you are you taking a volume snapshot backup of the database. The wait resource will always be DB: <dbid> [BULKOP_BACKUP_DB] with a wait type of LCK_M_U (update lock) for the second session that is running the concurrent backup operation. Concurrent log backups are allowed though starting from SQL Server 2005.

One of the situations where this can pose to be problematic is when the application that installed the database creates a maintenance plan during the application setup process. This can cause problems if you have concurrent full or differential backups scheduled from another software during the same time when your backup maintenance plan is running. Another big pitfall of concurrent backups running is that sometimes, the same job is responsible for both full and log backups of the database. If your log backup job doesn’t kick-off because your database backup is still being blocked, then you could run into a bloated transaction log issue for a database in full or bulk-logged recovery model. My colleague, Arvind, has also blogged about the same issue a few years back.

The following T-SQL script can help you determine if you have overlapping backups being taken for a SQL Server 2012 function. This script can also work for releases lower than SQL Server 2012 if you remove the LAG function as that T-SQL enhancement was introduced in SQL Server 2012.

select backup_start_date, backup_finish_date,

LAG(backup_start_date, 1) OVER (PARTITION BY database_name ORDER BY backup_start_date) as previous_start_date,

CASE datediff(mi,(LAG(backup_start_date, 1) OVER (PARTITION BY database_name ORDER BY backup_start_date)),backup_start_date) WHEN 0 THEN 'OVERLAPPING' ELSE '-' END as [Status] ,

datediff(mi,backup_start_date,backup_finish_date) as backup_time_mins, type

from msdb..backupset

where database_name = '<database name>' -- Replace with appropriate database name

and type <> 'L'

order by backup_start_date

Microsoft TechEd 2012–Multiple rows affected

WP_000363If you were at Microsoft TechEd India 2012 held at Bangalore, then you would agree with me without a shadow of doubt that it is quite difficult to write a wrap-up post of the entire event. The event for me was what I refer to as PPP. And by PPP, I am not talking about Point-to-Point Protocol. What I am referring to is a self-coined term…. People, Passion and Parallel.

People because it was great to connect a face to the twitter handle or online persona that you have interacted with in the past. What amazed me was some of the folks that I met at TechEd introduced themselves with their Twitter handle first and their actual name later! It is phenomenal to see how social media can empower collaboration between a technical community! The #TechEdIn twitter hashtag was trending on Twitter India in less than 15 minutes after the event started on Day 1 and continued to trend for the next three days!

Passion for technology that was evident from Day 1 to Day 3 where the rooms overflowed with people attending the sessions at different tracks. It was as if people were watching technical blockbusters at TechEd!

Parallel is the name of the game as everything that was happening at TechEd was in parallel… Whether it was the networking that was happening between folks or be in the technical sessions being delivered in parallel at the different tracks or the technology showcases displaying the several cool stuff from the sponsors at the venue. I heard multiple conversations in the hallways where debates were being raged (among the attendees) to decide on which session they should attend under which track! And obviously Twitter stands testament to the fact that such a dilemma existed since there were numerous tweets on this topic!

Since my daily bread and butter revolves around Microsoft SQL Server, it was quite obvious that I was lurking around the sessions which had remotely anything to do with data! On day one, I would be found mostly at “Data and Developer – Synergy with SQL Server” and on day three I was mostly at “Know Your Data – A Day in Life of a DBA”. Though there was a copious amount of information exchange in the session, there were also some really humorous moments which deserve a mention:

Vinod [B | T] and Pinal’s [B | T] session on “SQL Server Misconceptions and Resolution – A Practical Perspective” probably landed them as top candidates for a thin one and the fit one series in case they ever choose to pursue a media related career option!

Pinal Dave’s session on “Speed Up! – Parallel Processes and Unparalleled Performance” left people in splits…

Balmukund’s [B | T] session on “Keeping Your Database Available – ‘AlwaysOn’“ made learning fun and even though he had presented this in the past at other forums… This time around he came back with a totally new look and feel along with a high quality technical session on SQL Server 2012 AlwaysOn features with demos.

The VDI demo that I showed was using the “SQL Server Backup Simulator” which can be downloaded from here. The snapshot demo was shown using a modified version of the code present in the snapshot.cpp file from the Microsoft SQL Server Virtual Device Specification.

I had a lot of fun presenting my session on the “Lesser Known facts of SQL Server Backup and Restore” on March 23rd at TechEd. The slide deck used for the presentation is available below without a few situational humorous elements which wouldn’t really make sense to a person who had not attended the session. (Had to add the disclaimer before I got pulled up with comments that this was the incomplete slide deck!)

Here are some additional resources on VDI and VSS in case you want to indulge in some light reading Smile

IO Frozen messages while taking NT Backup for SQL databases
INFORMATIONAL- SHEDDING LIGHT on VSS & VDI Backups in SQL Server
Troubleshooting SQL Server Backup Issues
Incorrect Buffercount can cause Out of Memory Errors
VDI (VSS) Backup Resources
How It Works: SQL Server Backup Buffer Exchange (a VDI Focus)

The photographs from the session are available on TroubleshootingSQL’s Facebook page. Last but not the least, there needs to a be special vote of thanks to Pinal Dave who ensured that we had a seamless experience during the event.

The session recording is embedded below:

People… Passion… Parallel… @TechEdIn

TechEd India 2012 Session

TechEd

TechEd India is always a much anticipated event and I always feel that this event keeps getting better year on year. This year I have the privilege of presenting at TechEd which is being held at Bangalore from March 21-23, 2012. My presentation topic is “Lesser Known Facts of SQL Server Backup and Restore”. The topic abstract is as follows:

Backup and restore are considered the most common tasks performed by DBAs. Most of the time the knowledge does not go beyond memorizing the syntax of backup and restore. In this session we will uncover some hidden facts which will change the perspective of DBAs towards this process.

What can you expect from this session?
image1. I will be talking about the some of the unknown facts about SQL Server database backup and restore which will help you streamline your backup performance and DR strategy.
2. Demo of a backup restore using a tool that I have written to show you some interesting statistics.
3. Clearing up some common myths around backup/restore.

I will delivering this session from 3:15PM IST – 4:15PM IST on SQL Server Administration track on Day 3 (March 23rd).

There will be other speakers like Pinal Dave [B | T], Vinod Kumar [B | T], Balmukund Lakhani [B | T] and Jacob Sebastian [B | T] who will not only help you gain knowledge but also have amazing demos in their presentations which will have you running the following query: SELECT REPLICATE(‘awesome’,1000) in your head!

If you haven’t already registered for TechEd yet, then you can do so by visiting the TechEd site. The Facebook event for my session is available here.

Sysdatabases is empty–You say Whaaat?

While trying to backup a SQL Server database using SQL Writer (snapshot backups using VSS framework), then you might notice the following error in the application event logs.

Log Name: Application
Source: SQLWRITER
Event ID: 24581
Task Category: None
Level: Error
Description:
Sqllib error: System table sys.sysdatabases in SQL Server instance <SQL Server Name> is empty.

The above error is a bit misleading because it doesn’t literally mean that all your databases on the SQL Server instance reported have disappeared. If you scour the application event logs, you will find that another error is reported along with this error during the same timeframe.

Log Name: Application
Source: SQLWRITER
Event ID: 24583
Level: Error
Description:
Sqllib error: OLEDB Error encountered calling IDBInitialize::Initialize. hr = 0x80040e4d. SQLSTATE: 28000, Native Error: 18456
Error state: 1, Severity: 14
Source: Microsoft SQL Server Native Client 10.0
Error message: Login failed for user ‘NT AUTHORITY\SYSTEM’.
DBPROP_INIT_DATASOURCE: <SQL Server Name>
DBPROP_INIT_CATALOG: master
DBPROP_AUTH_INTEGRATED: SSPI

Now it is clear that the SYSTEM account is unable to log into the SQL Server instance mentioned in the error message above. If you look into the SQL Errorlog, you will find the following error message:

Error: 18456, Severity: 14, State: 11.
Login failed for user ‘NT AUTHORITY\SYSTEM’. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: <local machine>]

State 11 is basically telling you that the account doesn’t have access to the SQL Server instance. The reason for the login failure is available on SQL Server 2008 instances and above. There is a catch here the Login Failed error message is only reported in the SQL Errorlogs and Application Eventlogs only if the System account is granted access to the SQL Server instance but not granted the right permissions to get the database list from the sys.databases catalog. The hex code 0x80040e4d corresponds to an authentication failure.

So why does this happen? When you initiate a backup of a SQL Server database through SQLWriter (VSS Framework), SQLWriter will try to connect to all online instances to build a list of files associated with each database. SQLWriter/VSS needs this information to create the exclusion list. The account that the SQLWriter service uses is the NT AUTHORITY\SYSTEM account. So if you have multiple instances of SQL Server online on the same server, then the SYSTEM account needs to have SYSADMIN permissions OR the necessary permissions to run a query against the sys.databases catalog on the instance. This is a by-design requirement and documented in the KB Article mentioned below. However, if the SQL Server instance is not started, then you SQLWriter is not bothered about the instance(s) as I/O need not be frozen for database files for an instance which is shutdown.

919023    SQL Server 2005 connectivity and Volume Shadow Copy Service (VSS)
http://support.microsoft.com/default.aspx?scid=kb;EN-US;919023

Excerpt from above article:

Additionally, because of the types of operations that the writer must perform, we recommend that you do not remove the NT AUTHORITY\SYSTEM login from the sysadmin server role.

Now if you do not want to grant sysadmin permission to the System account, then you need to do the following to prevent the above error:

1. Grant db_datareader role to the SYSTEM account on the master database. This is required for other queries that the SQLWriter may need to executed during the course of the backup.
2. Grant ALTER ANY DATABASE or VIEW ANY DATABASE server-level permission, or CREATE DATABASE permission in the master database to the SYSTEM account. These are the minimum permissions required to query sys.databases table.

Normally the SYSTEM account is part of Built-in Administrators group on the box if it is not removed due to security hardening. You would need to grant the above permissions to the System account on all SQL Server instances that are installed and online on the server where you are trying to perform the VSS backup.

Another symptom of the problem manifests itself when you execute the command: vssadmin list writers. You will find that the above error message(s) are reported and the SQLWriter is not listed in the list of available writers.

One you have granted the SYSTEM account the necessary permissions, a quick way to verify that everything is working as expected is to run the vssadmin list writers command again. You might get the following output:

Writer name: ‘SqlServerWriter’
Writer Id: <GUID>
Writer Instance Id: <GUID>
State: [1] Stable
Last error: Non-retryable error

If you find that a non-retryable error is reported, then you need to restart the SQL Server VSS Writer service. Once this is done, run the command again and if there are no problems encountered, then you will get the following output for the SQLWriter.

Writer name: ‘SqlServerWriter’
Writer Id: <GUID>
Writer Instance Id: <GUID>
State: [1] Stable
Last error: No error

NOTE: If a SQL Server VSS (Snapshot) Backup is in progress for any instance on the server, then a restart of the SQL Server VSS Writer service should be deferred till the backup is completed or fails.

Another SQL VDI error that can mystify you

While performing a backup of a SQL Server using a backup software which uses SQLVDI, you might encounter a failure which reports the following failure hex code: 0x080070005. This is basically an Access Denied error message. If your backup software logs all the VDI API calls that are made, then you might even know which function call failed.

If you are using SQL Server Backup Simulator to simulate a VDI backup of the any SQL Server database with the same account that raised the error, you will find the following message in the Backup Simulator logs.

Trying to perform VDI test on a default instance
Error: VDS::Create fails: 0x80070005

The above message tells me that the IClientVirtualDeviceSet2::CreateEx function call failed. Now the important question here is why did the CreateEx function fail. If you look into the application event log, you will find an Error reported by SQLVDI (source) with the following text:

SQLVDI: Loc=CVDS::CreateEx. Desc=Create Memory. ErrorCode=(5)Access is denied.. Process=24456. Thread=13984. Client. Instance=. VD=Global\SQLBackupSim_SQLVDIMemoryName_0.

The above message tells me that I am creating a Global Shared Memory Object but I currently do not have the necessary privilege to create the object. Check the output of whoami /priv for the account that is running the Backup Simulator or the backup software and see if the following privilege is listed: SeCreateGlobalPrivilege. This privilege is granted by default to administrators, services, and the local system account. In case you are trying to use Backup Simulator on a Windows version which has UAC, then you need to run Backup Simulator using “Run as Administrator” option to avoid this issue. If the account trying to create this global memory object doesn’t have this privilege, then you can grant the privilege to this account using the following steps:

  1. On the Start menu, click Run. In the Open box, type gpedit.msc.

    The Group Policy dialog box opens.

  2. On the Group Policy console, expand Computer Configuration, and then expand Windows Settings.

  3. Expand Security Settings, and then expand Local Policies.

  4. Select the User Rights Assignment folder.

    The policies will be displayed in the details pane.

  5. In the pane, double-click Create global objects.

  6. In the Local Security Policy Setting dialog box, click Add.

  7. In the Select Users or Groups dialog box, add an account with privileges to run backup software or the Backup Simulator application.

Once this is done, you need to re-launch Backup Simulator or re-start the backup using the Backup Software as the security privilege will not modify the token that is currently being used by the backup software or Backup Simulator.

Another issue that you can run into even if you have the above mentioned security privilege is if you have multiple backups trying to create the same global shared memory object. In such scenarios, you would want to create a unique virtual device name. More information about the same is mentioned in the KB Article below:

903646    An application that uses the Virtual Device Interface feature of SQL Server 2000 to back up a SQL Server 2000 database may not back up the database
http://support.microsoft.com/default.aspx?scid=kb;EN-US;903646

del.icio.us Tags: ,,