Recently, I had replied to a MSDN post regarding an Access Violation reported by a SQL Server instance. In this post, I shall elucidate on troubleshooting memory dumps generated by SQL Server exceptions and what can be done without the availability of private symbols for debugging.
I had written a post earlier on when a memory dump is generated in SQL Server. If a single Errorlog has multiple stack dumps from multiple exceptions, then the first thing to do is to investigate the first access violation reported in the series of dumps reported. It is also possible that an exception (eg:access violation) could be followed by another exception/assertion or a non-yielding condition which may be a side effect of the earlier access violation.
When SQL encounters an access violation, there will be SQLDumpXXXX.mdmp, SQLDumpXXXX.txt and SQLDumpXXXX.log files created in the LOG folder associated with the exception. The SQL Errorlog will report the occurrence of the exception as well.
- The .mdmp file is the memory dump which would be required by CSS to troubleshoot the issue.
- The .txt file is the symptom dump file which contains environment information along with other information depending on the nature of the exception.
- The .log file contains an extract from the SQL Errorlog when the exception was encountered. This helps in looking at what messages were reported in the Errorlog prior to the issue. Helps during post-mortem analysis if the current Errorlogs have already been recycled.
There can be two types of exceptions: one that is fatal enough to cause a SQL service termination or one that encounters the exception which terminates the thread which was encountered the issue.
Now what can be done in such a situation without having to look into memory dump file. The first step would be to check if the issue is a recurring behavior due to a particular operation to establish some pattern or if there is not identifiable pattern at all. If you look into the SQL Errorlog, you might be able to identify the input buffer which cause the exception (unless it’s an exception which terminated the SQL service, one such example is shown below).
Server * BEGIN STACK DUMP:
Server * spid 5728
Server * ex_handle_except encountered exception C0000005 – Server terminating
If the above snippet contains the T-SQL query which raised the exception, then you would want to execute the same query from a Management Studio query window and check if the exception re-occurs. If yes, then you have a valid repro scenario.
Example snippet from an Errorlog where the Access Violation didn’t cause a service termination:
* Exception Address = 0042B866
* Exception Code = c0000005 EXCEPTION_ACCESS_VIOLATION
* Access Violation occurred writing address 67192000
* Input Buffer 38 bytes -
* exec sp_updatestats
Next check if you can spot any external components loaded into SQL Server address space. I shall revert back to my MSDN post where I had spotted an anti-virus filter DLL loaded in SQL Server process address space.
Server * bcrypt 000007FEFC900000 000007FEFC921FFF 00022000
Server * PCTLsp64 000000000B1E0000 000000000B251FFF 00072000
Server * mswsock 000007FEFC840000 000007FEFC893FFF 00054000
Based on a few Bing searches, I was able to determine that the PCTLsp64 was part of a virus monitoring program from PC Tools. The next step would be to check if this external component can be moved out of SQL Server process address space. The fact that the DLL is loaded in SQL process address space doesn’t make the DLL a culprit. However, when troubleshooting such issues, it is important to eliminate all unknown or third party elements that could prevent us from narrowing down to the culprit. Other DLLs which can be commonly found in SQL Server process address space would be provider DLLs when linked servers are being used on the SQL Server instance. Additionally execution of Extended Stored Procedures (XSPs) could also load external DLLs into SQL Server process address space depending on the way the XSP DLL was coded. An easy way to identifying a XSP dll is searching for the word “using” in the SQL Server Errorlog. SQL Server reports the first instance of a XSP dll being loaded into SQL Server process address space.
Using ‘xpstar.dll’ version ’2009.100.1600′ to execute extended stored procedure ‘xp_readerrorlog’
If you do find an external component loaded in SQL process address space, check the callstack reported in the Errorlog below the list of modules printed below to determine if the functions of the exception stack are associated with external component that you identified. For the issue reported in the MSDN post, I did find this:
Module(sqlservr+00000000017A954C)
Module(kernel32+0000000000099380)
Module(ntdll+0000000000096228)
Module(ntdll+0000000000014F48)
Module(ntdll+0000000000034F6D)
Module(ntdll+0000000000015B2C)
Module(ntdll+000000000004F638)
Module(ntdll+000000000004C8F4)
Module(ntdll+000000000004C822)
Module(PCTLsp64+00000000000040AD)
Module(PCTLsp64+0000000000004701)
Module(PCTLsp64+0000000000005434)
Now this makes my story a bit more compelling of removing the external component out of SQL Server process address space. Once that is done and if you have a repro condition which causes the issue, then it would be good to re-run your repro and check if the issue is actually resolved. Not all the exceptions are the same even though they are reported in the same manner.
If all the above hasn’t helped in narrowing down the issue, then check if you are on the latest available cumulative update for the version of SQL Server that you are using. If not, then it might be a good idea to update your instance to the latest available build to ensure that you are not hitting a known issue. If this is possible, then apply the latest cumulative update package and check if the issue re-occurs. I am cognizant of the fact that this might be an easy option due to change management restrictions on production environments. For such a dependency, my recommendation would be engage CSS to assist with a thorough investigation of the root cause of the issue.
As you can see from the steps outline above, you can look into a few things before you need to start poking into the exception memory dump using a Debugging tool.
Happy debugging!
Dear All,
I need your help.
I have one production server working on environment Windows Server 2008 (32 bit) with SQL Server 2008 R2 Service pack 1 CU4 (32 bit).
We have created one link server to Oracle 9i server.
I migrated my 2000 database to 2008. I have created link server also.
But after that I am facing a very serious problem.
Sometimes my SQL server stop automatically and when I checked in error log I found stack dump in that. And event viewer log shows me following errors.
SQL Server is terminating because of fatal exception c0000005. This error may be caused by an unhandled Win32 or C++ exception, or by an access violation encountered during exception handling. Check the SQL error log for any related stack dumps or messages. This exception forces SQL Server to shutdown. To recover from this error, restart the server (unless SQLAgent is configured to auto restart).
The OLE DB provider “MSDAORA” for linked server “IIS” reported an error 0×80004005 aborting the current transaction.
Can anyone please help me on this.
This might be an access violation. MSDAORA is not supported (tested) for your version of Oracle. Ref: http://support.microsoft.com/kb/239719
Have you tried using the Oracle Provider for setting up the linked server? As of now the information provided is not sufficient enough. Could you use the steps mentioned in this post to retrieve the offending call stack and post the information to the SQL Server MSDN Forums. Provide the link back here so that I can respond.
Hi Amit,
I have posted my same request on MSDN at below location:
http://social.msdn.microsoft.com/Forums/en-US/sqldatabaseengine/thread/db65c980-f704-415c-8b40-7aa561cd9223
Hi Amit,
I cant find the exact cause of this problem.
And a very strange things i know that this same stucture working very fine in SQL Server 2000 before migrate.
I will try to setup link server with using Oracle OLEDB and let you know.
From the information that you have provided, it is clear that it is an access violation. As I said in my previous reply, MSDAORA provider was not intended for use with higher versions of Oracle. You would want to use the Oracle provider. If it was working with SQL Server 2000, then you were lucky as this is not a tested scenario. To check what exactly is causing the Access Violation, the exception stack will have to be obtained from the .mdmp file. Without that, not much can be said other than what I have already mentioned.
It’s not often I leave replies to items such as this but in this case many thanks are in order !
I was investigating SQLServer error 17311 and the subsequent shutdowns and came across your item, which I found interesting.
Then your mention of PCTLsp64 made it even more interesting since I installed PCTools on the server yesterday afternoon ………….
The server is currently rebooting with two disc checks – once that’s done I’ll look at our dump files and see if the problem is as similar as it looks to yours !!
17311 is a generic error number which is raised when the SQL service terminates due to an exception. Gald to know that the article helped and if PCTools DLL is not on the top of the stack, you might want to check the call stack and see if the functions shown on the top of stack is from a 3rd party DLL or not.
2011-05-25 14:32:15.50 Server *
2011-05-25 14:32:15.50 Server * Edi: 1522E74C: 0A3F8B80 00000000 1522E850 000D34C8 1522E850 7C82E0CC
2011-05-25 14:32:15.50 Server * Esi: 00000000:
2011-05-25 14:32:15.50 Server * Eax: 1522E630: 000042AC 00000000 00000000 77E4BEF7 00000000 00000002
2011-05-25 14:32:15.50 Server * Ebx: 0000003F:
2011-05-25 14:32:15.50 Server * Ecx: 1522ECA0: 00000000 00010007 C0150008 00840082 0A3F8B84 00000000
2011-05-25 14:32:15.50 Server * Edx: 0000003D:
2011-05-25 14:32:15.50 Server * Eip: 77E4BEF7: 10C2C95E 01093D00 840FC000 00000095 F29DE950 9090FFFF
2011-05-25 14:32:15.50 Server * Ebp: 1522E680: 1522E6C4 021B2ECC 000042AC 00000000 00000000 00000000
2011-05-25 14:32:15.50 Server * SegCs: 7813001B: 00000000 00000000 00000000 00000000 00000000 00000000
2011-05-25 14:32:15.50 Server * EFlags: 00000246:
2011-05-25 14:32:15.50 Server * Esp: 1522E62C: 00000000 000042AC 00000000 00000000 77E4BEF7 00000000
2011-05-25 14:32:15.50 Server * SegSs: 78130023: 00000000 00000000 00000000 00000000 00000000 00000000
2011-05-25 14:32:15.50 Server * *******************************************************************************
2011-05-25 14:32:15.50 Server * ——————————————————————————-
2011-05-25 14:32:15.50 Server * Short Stack Dump
2011-05-25 14:32:15.52 Server 77E4BEF7 Module(kernel32+0000BEF7)
2011-05-25 14:32:15.52 Server 021B2ECC Module(sqlservr+011B2ECC)
2011-05-25 14:32:15.52 Server 021B6DE0 Module(sqlservr+011B6DE0)
2011-05-25 14:32:15.52 Server 021B698D Module(sqlservr+011B698D)
2011-05-25 14:32:15.52 Server 02427A29 Module(sqlservr+01427A29)
2011-05-25 14:32:15.52 Server 77E761B7 Module(kernel32+000361B7)
2011-05-25 14:32:15.54 Server 77E792A3 Module(kernel32+000392A3)
2011-05-25 14:32:15.54 Server Stack Signature for the dump is 0x1A074F5C
2011-05-25 14:32:25.53 Server External dump process return code 0×20000001.
2011-05-25 14:32:15.50 Server * mscorlib.ni 126F0000 131D5FFF 00ae6000
2011-05-25 14:32:15.50 Server * mscorsec 07530000 07542FFF 00013000
2011-05-25 14:32:15.50 Server * SOFTPUB 07560000 07564FFF 00005000
2011-05-25 14:32:15.50 Server * cryptnet 082F0000 08301FFF 00012000
2011-05-25 14:32:15.50 Server * SensApi 09E60000 09E64FFF 00005000
2011-05-25 14:32:15.50 Server * SqlAccess 09F00000 09F55FFF 00056000
2011-05-25 14:32:15.50 Server * mscorjit 09F70000 09FC2FFF 00053000
2011-05-25 14:32:15.50 Server * System.Data 131E0000 134A6FFF 002c7000
2011-05-25 14:32:15.50 Server * System.ni 13D60000 1451DFFF 007be000
2011-05-25 14:32:15.50 Server * System.Transactions 0A130000 0A172FFF 00043000
2011-05-25 14:32:15.50 Server * System.Security.ni 13980000 13A35FFF 000b6000
2011-05-25 14:32:15.50 Server * System 145E0000 148C3FFF 002e4000
2011-05-25 14:32:15.50 Server * System.Security 148D0000 14911FFF 00042000
2011-05-25 14:32:15.50 Server * System.Xml 14920000 14B13FFF 001f4000
2011-05-25 14:32:15.50 Server * scrrun 13A40000 13A64FFF 00025000
2011-05-25 14:32:15.50 Server * MFC42 14BE0000 14D00FFF 00121000
2011-05-25 14:32:15.50 Server * WININET 14D10000 14DBAFFF 000ab000
2011-05-25 14:32:15.50 Server * SXS 14DC0000 14E7CFFF 000bd000
2011-05-25 14:32:15.50 Server * oledb32 343F0000 34468FFF 00079000
2011-05-25 14:32:15.50 Server * MSDART 152E0000 152F9FFF 0001a000
2011-05-25 14:32:15.50 Server * OLEDB32R 15300000 15310FFF 00011000
2011-05-25 14:32:15.50 Server * comsvcs 76D10000 76E54FFF 00145000
2011-05-25 14:32:15.50 Server * sqloledb 77840000 778C0FFF 00081000
2011-05-25 14:32:15.50 Server * MSDATL3 344B0000 344C4FFF 00015000
2011-05-25 14:32:15.50 Server * DBNETLIB 48040000 4805BFFF 0001c000
2011-05-25 14:32:15.50 Server * DBnmpNTw 77FF0000 77FF6FFF 00007000
2011-05-25 14:32:15.50 Server * System.Configuration 76E60000 76EC1FFF 00062000
2011-05-25 14:32:15.50 Server * odsole70 15230000 1523EFFF 0000f000
2011-05-25 14:32:15.50 Server * Microsoft.VisualBasic 7DE80000 7DF27FFF 000a8000
2011-05-25 14:32:15.50 Server * CustomMarshalers.ni 15170000 151AFFFF 00040000
2011-05-25 14:32:15.50 Server * CustomMarshalers 15260000 15274FFF 00015000
2011-05-25 14:32:15.50 Server * clsHuffman 09560000 09569FFF 0000a000
2011-05-25 14:32:15.50 Server * MSVBVM60 B93F0000
2011-05-25 14:32:15.50 Server * dbghelp B9670000
I am seeing clsHuffman (probably custom compression/decompression DLL) and VB Runtime 6.0 (MSVBVM60) in SQL process address space. I believe the operation that hit the AV was a CLR operation. This would require a thorough investigation from CSS. I would suggest opening a support incident since this is a production server.
2011-05-25 14:32:15.50 Server * XOLEHLP 06B70000 06B75FFF 00006000
2011-05-25 14:32:15.50 Server * MSDTCPRX 06B80000 06BF8FFF 00079000
2011-05-25 14:32:15.50 Server * OLEAUT32 77D00000 77D8AFFF 0008b000
2011-05-25 14:32:15.50 Server * msvcp60 06C00000 06C64FFF 00065000
2011-05-25 14:32:15.50 Server * MTXCLU 06C70000 06C88FFF 00019000
2011-05-25 14:32:15.50 Server * VERSION 77B90000 77B97FFF 00008000
2011-05-25 14:32:15.50 Server * WSOCK32 06C90000 06C98FFF 00009000
2011-05-25 14:32:15.50 Server * CLUSAPI 06CA0000 06CB1FFF 00012000
2011-05-25 14:32:15.50 Server * RESUTILS 06CC0000 06CD2FFF 00013000
2011-05-25 14:32:15.50 Server * mswsock 06CF0000 06D30FFF 00041000
2011-05-25 14:32:15.50 Server * DNSAPI 76ED0000 76EF9FFF 0002a000
2011-05-25 14:32:15.50 Server * winrnr 06D80000 06D86FFF 00007000
2011-05-25 14:32:15.50 Server * rasadhlp 06DA0000 06DA4FFF 00005000
2011-05-25 14:32:15.50 Server * security 07170000 07173FFF 00004000
2011-05-25 14:32:15.50 Server * msfte 07950000 07BA9FFF 0025a000
2011-05-25 14:32:15.50 Server * dbghelp 07BC0000 07CD4FFF 00115000
2011-05-25 14:32:15.50 Server * WINTRUST 76BB0000 76BDBFFF 0002c000
2011-05-25 14:32:15.50 Server * imagehlp 76C10000 76C37FFF 00028000
2011-05-25 14:32:15.50 Server * dssenh 080E0000 08106FFF 00027000
2011-05-25 14:32:15.50 Server * hnetcfg 08320000 08379FFF 0005a000
2011-05-25 14:32:15.50 Server * wshtcpip 080D0000 080D7FFF 00008000
2011-05-25 14:32:15.50 Server * ntdsapi 08440000 08453FFF 00014000
2011-05-25 14:32:15.50 Server * xpsp2res 08460000 08724FFF 002c5000
2011-05-25 14:32:15.50 Server * CLBCatQ 777B0000 77832FFF 00083000
2011-05-25 14:32:15.50 Server * sqlncli 08730000 08956FFF 00227000
2011-05-25 14:32:15.50 Server * COMCTL32 77530000 775C6FFF 00097000
2011-05-25 14:32:15.50 Server * comdlg32 007A0000 007E8FFF 00049000
2011-05-25 14:32:15.50 Server * SHLWAPI 7D180000 7D1D1FFF 00052000
2011-05-25 14:32:15.50 Server * SHELL32 08960000 0915EFFF 007ff000
2011-05-25 14:32:15.50 Server * comctl32 77420000 77522FFF 00103000
2011-05-25 14:32:15.50 Server * SQLNCLIR 09190000 091C2FFF 00033000
2011-05-25 14:32:15.50 Server * xpsqlbot 0ABD0000 0ABD5FFF 00006000
2011-05-25 14:32:15.50 Server * xpstar90 0ABF0000 0AC3BFFF 0004c000
2011-05-25 14:32:15.50 Server * SQLSCM90 0AC50000 0AC58FFF 00009000
2011-05-25 14:32:15.50 Server * ODBC32 0AC70000 0ACADFFF 0003e000
2011-05-25 14:32:15.50 Server * BatchParser90 0ACB0000 0ACCEFFF 0001f000
2011-05-25 14:32:15.50 Server * ATL80 7C630000 7C64AFFF 0001b000
2011-05-25 14:32:15.50 Server * odbcint 0AFA0000 0AFB6FFF 00017000
2011-05-25 14:32:15.50 Server * xpstar90 0AFC0000 0AFE5FFF 00026000
2011-05-25 14:32:15.50 Server * xplog70 0AFF0000 0AFFBFFF 0000c000
2011-05-25 14:32:15.50 Server * xplog70 0B010000 0B012FFF 00003000
2011-05-25 14:32:15.50 Server * sqlevn70 0B0C0000 0B2C4FFF 00205000
2011-05-25 14:32:15.50 Server * sqlevn70 0B2D0000 0B4D2FFF 00203000
2011-05-25 14:32:15.50 Server * sqlevn70 0B4E0000 0B688FFF 001a9000
2011-05-25 14:32:15.50 Server * sqlevn70 0B690000 0B87BFFF 001ec000
2011-05-25 14:32:15.50 Server * sqlevn70 0B880000 0BA78FFF 001f9000
2011-05-25 14:32:15.50 Server * sqlevn70 0BA80000 0BC60FFF 001e1000
2011-05-25 14:32:15.50 Server * sqlevn70 0BC70000 0BE18FFF 001a9000
2011-05-25 14:32:15.50 Server * sqlevn70 0BE20000 0BFC8FFF 001a9000
2011-05-25 14:32:15.50 Server * sqlevn70 0BFD0000 0C178FFF 001a9000
2011-05-25 14:32:15.50 Server * mscorwks 0C180000 0C6E0FFF 00561000
2011-05-25 14:32:15.50 Server ***Stack Dump being sent to D:\data\data_srv\dbagroup\mssql\MSSQL.1\MSSQL\LOG\SQLDump0103.txt
2011-05-25 14:32:15.50 Server * *******************************************************************************
2011-05-25 14:32:15.50 Server *
2011-05-25 14:32:15.50 Server * BEGIN STACK DUMP:
2011-05-25 14:32:15.50 Server * 05/25/11 14:32:15 spid 0
2011-05-25 14:32:15.50 Server *
2011-05-25 14:32:15.50 Server * ex_handle_except encountered exception C0000005 – Server terminating
2011-05-25 14:32:15.50 Server *
2011-05-25 14:32:15.50 Server *
2011-05-25 14:32:15.50 Server * MODULE BASE END SIZE
2011-05-25 14:32:15.50 Server * sqlservr 01000000 02C24FFF 01c25000
2011-05-25 14:32:15.50 Server * ntdll 7C800000 7C8C2FFF 000c3000
2011-05-25 14:32:15.50 Server * kernel32 77E40000 77F41FFF 00102000
2011-05-25 14:32:15.50 Server * MSVCR80 78130000 781CAFFF 0009b000
2011-05-25 14:32:15.50 Server * msvcrt 77BA0000 77BF9FFF 0005a000
2011-05-25 14:32:15.50 Server * MSVCP80 7C420000 7C4A6FFF 00087000
2011-05-25 14:32:15.50 Server * ADVAPI32 7D1E0000 7D27BFFF 0009c000
2011-05-25 14:32:15.50 Server * RPCRT4 77C50000 77CEFFFF 000a0000
2011-05-25 14:32:15.50 Server * Secur32 76F50000 76F62FFF 00013000
2011-05-25 14:32:15.50 Server * sqlos 344D0000 344D4FFF 00005000
2011-05-25 14:32:15.50 Server * USERENV 76920000 769E1FFF 000c2000
2011-05-25 14:32:15.50 Server * USER32 77380000 77410FFF 00091000
2011-05-25 14:32:15.50 Server * GDI32 77C00000 77C48FFF 00049000
2011-05-25 14:32:15.50 Server * WINMM 76AA0000 76ACCFFF 0002d000
2011-05-25 14:32:15.50 Server * opends60 333E0000 333E6FFF 00007000
2011-05-25 14:32:15.50 Server * NETAPI32 71C40000 71C96FFF 00057000
2011-05-25 14:32:15.50 Server * psapi 76B70000 76B7AFFF 0000b000
2011-05-25 14:32:15.50 Server * instapi 48060000 48069FFF 0000a000
2011-05-25 14:32:15.50 Server * sqlevn70 4F610000 4F7B8FFF 001a9000
2011-05-25 14:32:15.50 Server * NTMARTA 77E00000 77E20FFF 00021000
2011-05-25 14:32:15.50 Server * WLDAP32 76F10000 76F3DFFF 0002e000
2011-05-25 14:32:15.50 Server * SAMLIB 7E020000 7E02EFFF 0000f000
2011-05-25 14:32:15.50 Server * ole32 77670000 777A8FFF 00139000
2011-05-25 14:32:15.50 Server * rsaenh 05B50000 05B84FFF 00035000
2011-05-25 14:32:15.50 Server * AUTHZ 76C40000 76C53FFF 00014000
2011-05-25 14:32:15.50 Server * MSCOREE 06620000 06664FFF 00045000
2011-05-25 14:32:15.50 Server * msv1_0 76C90000 76CB6FFF 00027000
2011-05-25 14:32:15.50 Server * cryptdll 06880000 0688BFFF 0000c000
2011-05-25 14:32:15.50 Server * WS2_32 71C00000 71C16FFF 00017000
2011-05-25 14:32:15.50 Server * WS2HELP 71BF0000 71BF7FFF 00008000
2011-05-25 14:32:15.50 Server * iphlpapi 76CF0000 76D09FFF 0001a000
2011-05-25 14:32:15.50 Server * kerberos 068F0000 06948FFF 00059000
2011-05-25 14:32:15.50 Server * MSASN1 06950000 06961FFF 00012000
2011-05-25 14:32:15.50 Server * schannel 06980000 069A7FFF 00028000
2011-05-25 14:32:15.50 Server * CRYPT32 069B0000 06A42FFF 00093000
2011-05-25 14:32:15.50 Server * COMRES 06AA0000 06B65FFF 000c6000
2011-05-25 14:32:13.35 Server Error: 17311, Severity: 16, State: 1.
2011-05-25 14:32:13.35 Server SQL Server is terminating because of fatal exception c0000005. This error may be caused by an unhandled Win32 or C++ exception, or by an access violation encountered during exception handling. Check the SQL error log for any related stack dumps or messages. This exception forces SQL Server to shutdown. To recover from this error, restart the server (unless SQLAgent is configured to auto restart).
2011-05-25 14:32:13.50 Server Using ‘dbghelp.dll’ version ’4.0.5′
2011-05-25 14:32:13.58 spid1s AppDomain 4 (REXME.dbo[runtime].3) is marked for unload due to memory pressure.
2011-05-25 14:32:13.58 spid1s AppDomain 4 (REXME.dbo[runtime].3) unloaded.
2011-05-25 14:32:15.50 Server **Dump thread – spid = 0, PSS = 0×00000000, EC = 0×00000000
2011-05-25 14:32:15.50 Server ***Stack Dump being sent to D:\MSSQL.1\MSSQL\LOG\SQLDump0103.txt
Thanks Amit. This information is really useful.
Amit,
This is really good info and its nice to know these when troubleshooting memory dumps that happen once in a while.
Thank you Sankar.