Wiki: SQL Database Engine Startup Failures

The startup failures for SQL Server can be classified into two types:

1. Service startup failures for stand-alone instances

2. Service startup failures for clustered instances

I shall document a step-by-step road-map for troubleshooting SQL Server startup failures for both stand-alone and clustered instances. This is by no means the only way to troubleshoot startup failures, but for a novice SQL Server user (hopefully, you don’t attempt the steps below for the first time on a production instance) it is a good starting point for determining the root cause.

Service startup failures for stand-alone instances

If you are using SQL Server 2000, try to start the SQL instance from the Services Manager and check whether the service starts up. If you are using SQL Server 2005 or above, try to start the service using SQL Server Configuration Manager.

[Image: SQL Server Startup Parameters]

If the service fails to start, locate the SQL Server Errorlog and check whether any errors were recorded in it. The Errorlog location can be found by looking up the -e startup parameter value for the SQL Server instance in question. If you are using SQL Server 2005 or above, right-click the service name in Configuration Manager -> Properties -> Advanced tab (refer to the picture for more details). If you want to act smart in front of your co-worker, you can use a VBScript (or PowerShell) to fetch this information. Here is an example for obtaining the Startup Parameters using WMI and VBScript.

Verify that the files pointed to by the -d (master data file location), -e (Errorlog location) and -l (master log file location) parameters actually exist and that the file paths are correct. Starting from SQL Server 2005, the Errorlog will have the startup parameters printed in it when the SQL instance starts up.
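If you would rather not eyeball the paths, a minimal Python sketch like the one below can do the check. It assumes the startup parameters are available as a single semicolon-separated string (the way they appear in the Startup Parameters box in Configuration Manager). The function names and the sample paths in the usage note are hypothetical and are not part of SQL Server itself.

```python
import os

# Labels for the three startup parameters that carry file paths.
PATH_PARAMETERS = {
    "d": "master data file",
    "e": "errorlog",
    "l": "master log file",
}


def parse_startup_parameters(raw):
    """Split a semicolon-separated startup parameter string into {label: path}."""
    found = {}
    for token in raw.split(";"):
        token = token.strip()
        # Only -d, -e and -l are followed by a path; other switches
        # (for example -T3608 or -m) are ignored. The check is case-sensitive.
        if len(token) > 2 and token[0] == "-" and token[1] in PATH_PARAMETERS:
            found[PATH_PARAMETERS[token[1]]] = token[2:]
    return found


def missing_files(params, exists=os.path.exists):
    """Return the labels whose path does not exist according to `exists`."""
    return [label for label, path in params.items() if not exists(path)]
```

For example, `missing_files(parse_startup_parameters(r"-dC:\Data\master.mdf;-eC:\Log\ERRORLOG;-lC:\Data\mastlog.ldf"))` returns the labels of whichever of the three files cannot be found on the machine where the script runs. The script has to run on the server itself (and under an account that can see the folders) for the result to mean anything.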

Once you have located the current ERRORLOG (a file named ERRORLOG will be present in the LOG folder), check it for errors. An example of an error would be:

2010-11-20 07:50:58.250 Server Error: 17113, Severity: 16, State: 1.
2010-11-20 07:50:58.250 Server Error 2(The system cannot find the file specified.) occurred while opening file ‘C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\master.mdf’ to obtain configuration information at startup. An invalid startup option might have caused the error. Verify your startup options, and correct or remove them if necessary.

The above error message clearly indicates that the master.mdf file was not found in the expected location. If your system database files are not consistent, the SQL Server instance will not start up.
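When the Errorlog is large, a short script can pull out the error lines for you. The sketch below is only an illustration: it assumes error entries follow the "Error: N, Severity: N, State: N" pattern from the sample above and that the descriptive message sits on the next line. The function name is hypothetical.

```python
import re

# Matches the header line of an error entry, e.g.
# "Error: 17113, Severity: 16, State: 1."
ERROR_LINE = re.compile(r"Error: (\d+), Severity: (\d+), State: (\d+)")


def find_errors(lines):
    """Return (error, severity, state, detail) tuples found in Errorlog lines.

    `detail` is the line that follows the error header, or an empty
    string when the header is the last line.
    """
    lines = list(lines)
    results = []
    for index, line in enumerate(lines):
        match = ERROR_LINE.search(line)
        if match:
            detail = lines[index + 1].strip() if index + 1 < len(lines) else ""
            error, severity, state = (int(group) for group in match.groups())
            results.append((error, severity, state, detail))
    return results
```

To run it against a real file, pass the lines of the opened file, for example `find_errors(open(path, encoding="utf-16"))`. The UTF-16 encoding is an assumption based on how the Errorlog is typically written; adjust it if your file reads as garbage.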

Note: If the timestamps in the SQL Server Errorlog file don’t reflect the system time at which you attempted to start the SQL instance, then it is highly likely that the path to the SQL Errorlog is incorrect. In that case, the SQL Server instance will fail to start and a pretty unhelpful error message will be shown to you. It is always one of the two:

“The service failed to respond in a timely fashion”

OR

“The service failed to start”

The most common root causes for service startup failures are:

  1. Access denied for the SQL Server service account on the DATA folder which contains the SQL Server system databases, or some other process (like anti-virus) holding a lock on the SQL database files.
  2. Insufficient permissions granted to the SQL Server Service account.
  3. File not found errors due to either accidental deletion of files or disk failures.
  4. System database files having inconsistencies prevent SQL Server startup.
  5. Password of the service account was changed but not updated on the server that has the SQL instance installed.
  6. Startup parameters have incorrect file path locations.
  7. The SQL Server service is disabled in Service Control Manager.

Find out whether the SQL Server service account has all the permissions required to start the SQL Server service. This covers the ACLs, security privileges for the Windows account, and Windows Registry permissions. Refer to the Books Online article which documents all the permissions required by a SQL Server service account.

If that doesn’t help, check the Windows System and Application Event logs for any errors related to the SQL Server instance or service that is failing to start.

Note: Make sure that you are not falling prey to a Desktop Heap exhaustion issue.

Still stuck!! Wow!! Aren’t we in a soup. No errors in the Errorlog (a very, very rare scenario) or in the Windows Event Logs.

Then you need to try and start the service from a command prompt in console mode. There is yet another Books Online article which explains how to start the SQL Server instance using sqlservr.exe located in the BINN folder.

Example for Default instance:

sqlservr.exe -c -m -T3608

Example for Named instance:

sqlservr.exe -c -m -s <instance name> -T3608

Check if the command prompt window spews out any helpful error message which will point you to the root cause of the issue. Trace flag 3608 prevents SQL Server from starting up any database other than master. This helps you determine whether the SQL instance itself starts up correctly, in which case the problem most likely lies with one of the other system databases.
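If you end up doing this on several instances, the two command lines above can be generated by a small helper. This is just a sketch in Python with a hypothetical function name. It only builds the argument list and leaves running it from the BINN folder to you.

```python
def console_start_command(instance_name=None, trace_flags=(3608,)):
    """Build the sqlservr.exe argument list for a console-mode start.

    instance_name=None targets the default instance; a named instance
    gets the -s switch, as in the examples above.
    """
    # -c and -m are the switches used in the examples above.
    args = ["sqlservr.exe", "-c", "-m"]
    if instance_name:
        args += ["-s", instance_name]
    # Each trace flag becomes a -T<number> switch, e.g. -T3608.
    args += [f"-T{flag}" for flag in trace_flags]
    return args
```

Joining the result with spaces gives the command to type, and the list form can be handed to `subprocess.run` as-is if you want to launch it from a script on the server.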

During my course of troubleshooting, I have also encountered some pretty weird startup failures. Off the top of my head, here are a few which I could find links to:

By Balmukund Lakhani (Blog)

http://blogs.msdn.com/b/blakhani/archive/2009/11/24/sql-server-2005-express-setup-failure-with-error-an-error-occurred-during-encryption.aspx

The famous (infuriating) TDSSNICLIENT client initialization failures during startup. The SQL Protocols team has documented a bunch of them here:

http://blogs.msdn.com/b/sql_protocols/archive/2006/04/28/585835.aspx

The case of the obnoxious locks value:

https://troubleshootingsql.com/2011/04/01/the-case-of-the-obnoxious-locks-value/

In case you have any more interesting startup failures that you find blog posts about, feel free to leave a comment for this post.

Startup failures for clustered instances are covered in another blog post.

Versions that this road-map applies to:

SQL Server 2000, SQL Server 2005, SQL Server 2008, SQL Server 2008 R2

Additional references:
T-SQL Tuesday: Common SQL Server myths – Series I–Setup and Services

How to fetch all Login Failed messages from default traces

I was recently investigating a login failed issue on a SQL instance and needed to quickly find all the Login Failed events that had occurred on the server. I could have scoured the SQL Errorlogs, but I didn’t want to do this due to the bloated size of the Errorlogs. An easier way is to read the Default Traces (available from SQL Server 2005 and above) and look for all the Audit Login Failed events (event class = 20) using the fn_trace_gettable function.

The query that can be used to get this information is given below:


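-- Locate the default trace file (filter on is_default rather than assuming id = 1)
declare @path nvarchar(260)

select @path = path from sys.traces where is_default = 1

-- Read the trace file and keep only the Audit Login Failed events (event class = 20)
select trc.SPID, TextData, State, Error, StartTime, EventSubClass, LoginName, NTDomainName, NTUserName, ClientProcessID, HostName, ApplicationName
from sys.fn_trace_gettable(@path, DEFAULT) trc
where EventClass = 20
order by StartTime desc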

[Blog Update]: August posts on SQLServerFAQ

I was working on a root cause analysis for an OOM (Out-of-Memory) issue for SQL Server 2008 R2 and I needed to schedule notifications when the available memory on the server fell below a certain value. SQL Server 2008 and above has a nifty little DMV to do just this. Read about its usefulness here:

The hidden gems among DMVs: sys.dm_os_sys_memory

Zeollar presentation on SQL Nexus

Here is the recording of the webcast that I presented today! Feel free to contact me through this blog or post on the SQL Nexus discussion forum.

In case you are not able to view the presentation, you can view it directly on the Zeollar site.

SQL VDI backup fails with 0x80770007

While taking a non-native SQL Server database backup using an application that calls SQLVDI.DLL, you find that the backup fails with the following hex code: 0x80770007.

0x80770007 (VD_E_INSTANCE_NAME VD_ERROR) translates to: Failed to recognize the SQL Server instance name.
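If you are curious how such a hex code breaks down, the sketch below splits it using the standard Windows HRESULT layout (top bit for failure, 11 facility bits, 16 code bits). The function name is hypothetical, and mapping the numeric code back to a VDI error name still requires the VDI header files.

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into its failure bit, facility and code."""
    return {
        "failure": bool((value >> 31) & 1),   # severity bit: 1 means failure
        "facility": (value >> 16) & 0x7FF,    # 11-bit facility field
        "code": value & 0xFFFF,               # low 16 bits: the error code
    }
```

For 0x80770007 this gives a failure with facility 0x77 and code 7.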

Then check whether the following condition holds true:
There is no default instance of SQL Server on the machine where you are trying to take a VDI backup, and the SQL instance that you are connecting to for the backup is a named instance.

If the above condition is true, then the issue is with how the application calls the CreateEx function of the IClientVirtualDeviceSet2 interface. The CreateEx function is used to create the virtual device set and has the following syntax:

HRESULT IClientVirtualDeviceSet2::CreateEx (
    LPCWSTR   lpInstanceName,
    LPCWSTR   lpName,
    VDConfig* pCfg
);

The "lpInstanceName" parameter identifies the SQL Server instance to which the SQL command needs to be sent. If the CreateEx method has NULL as the first parameter, then it always connects to the default instance. If the server doesn’t have a default SQL instance, then the first parameter needs to be provided with the instance name:
E.g. if you have a named instance on the server called "SERVER1\SQLINST1", then the first parameter for CreateEx should be "SQLINST1". A way to work around this for simple.cpp is mentioned here.
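The naming rule itself is easy to get wrong in a backup application's configuration, so here it is as a tiny Python sketch. The function name is hypothetical and the code only illustrates which value belongs in the first parameter. It does not call the VDI API.

```python
def createex_instance_name(server_name):
    """Return the value for lpInstanceName given a SERVER or SERVER\\INSTANCE name.

    None stands for NULL, which makes CreateEx target the default instance.
    """
    _, separator, instance = server_name.partition("\\")
    # Only the part after the backslash is passed, never the machine name.
    return instance if separator and instance else None
```

With this, `createex_instance_name("SERVER1\\SQLINST1")` yields `"SQLINST1"`, while a plain `"SERVER1"` yields `None`, which is only correct when a default instance actually exists on that server.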
