Replication Agent has not logged a message in 10 minutes

Posted on May 28, 2010 by Amit Banerjee

I saw multiple posts on MSDN regarding the following message:

“The replication agent has not logged a progress message in 10 minutes. This might indicate an unresponsive agent or high system activity.”

More often than not the above message can be ignored safely. The next obvious question is when can this message be ignored. If you find that your replication agent history is not reporting any error messages, then the above message is benign. Typically, this message means the agent is busy doing its work and has no resources to respond to status inquiries.

Why does this error message get generated?

This error message gets generated because of the Distribution heartbeat interval property. This property governs how long an agent can run without logging a progress message. If your replication agents are not reporting an error message and you are seeing the above message, then you could change your heartbeat interval to a higher value. One of the option could be that you changed the history logging option for your replication agent so that it doesn’t log any message.

Query:

exec sp_changedistributor_property @property = 'heartbeat_interval', @value = <value in minutes>;

Reference: sp_changedistributor_property (Transact-SQL)

I have read lots of opinions as to why this is a bug and if you subscribe to that school of thought, please create a Connect item at http://connect.microsoft.com/sqlserver.

Is my RESTORE operation complete?

Posted on March 12, 2010 by Amit Banerjee

When you perform a database restore operation using Management Studio Object Explorer, you might be baffled as to why the confirmation pop-up window doesn’t pop-up saying that the database restore operation is complete.

This is because the RESTORE operation in SQL Server can be divided into two stages if you are having a high level discussion and don’t want to get into the intricacies:

1. Copy the information from the backup media and write them into the physical files on the Windows File System

2. Perform recovery on the database to bring it online (provided in the OPTIONS tab you have selected the first radio button under the Recovery State)

The highlighted counter shows you the progress of the first stage only. The second stage doesn’t have a progress counter associated with it. If you look at the percent_complete column value in the sys.dm_exec_requests DMV.

So, if your database restore screen is showing 100% completed status and the SQL Server Errorlog is not reporting the following for the database being restored:

2010-03-13 02:56:22.650 spid61 Starting up database ‘distribution_new’.
2010-03-13 02:56:23.140 Backup Database was restored: Database: distribution_new, creation date(time): 2009/08/04(01:37:56), first LSN: 865:804:70, last LSN: 865:833:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {‘C:\Program Files\Microsoft SQL Server\MSSQL.2\MSSQL\Backup\distribution.bak’}). Informational message. No user action required.

Then you are phase 2 of the restore provided that session performing the database restore is not experiencing blocking.

Now for the most interesting part, Why is this happening i.e. the long recovery? One of the reasons is documented here.

How to find out Security Errors on SQL Server 2005 instance and above including the API call it failed on

Posted on December 30, 2009 by Amit Banerjee

More often than not, while troubleshooting SQL Server Security Issues, you end up performing post-mortem analysis. Under such circumstances, the Ring Buffer output comes in handy because it stores the Security Errors that were reported in the past for a certain period of time.

For an example, let’s say if you tried to create a SQL Login with the Password Policy enabled and provided a password that didn’t match a password policy, then you will receive an error stating that the password is not complex enough. This error will be stored in the ring buffer. When you execute the query below, you will find out the SPID that raised the error along with the API Name that is failing. In the sample output for this scenario described above, you will find that the NetValidatePwdPolicy failed.

If you convert the ErrorCode value 0x8C5 (= 2245) to decimal and do a NET HELPMSG <number> from command prompt, you will find more information on why that error occurred.

Output of NET HELPMSG 2245:

The password does not meet the password policy requirements. Check the minimum password length, password complexity and password history requirements.

The Calling API Name and the API Name are very helpful when troubleshooting Domain Account related security errors as it would call functions related to Directory Services whose documentation can be found on Tech Net which would make your troubleshooting easier.

Sample Output:

Notification_time: 2009-11-06 08:37:08.023
ErrorCode: 0x8C5             
CallingAPIName: NetValidatePwdPolicy                                       
APIName: NetValidatePwdPolicy                         
SPID: 53       
Record Id: 0           
Type: RING_BUFFER_SECURITY_ERROR

For SQL Server 2005:

SELECT CONVERT (varchar(30), GETDATE(), 121) as runtime,
DATEADD (ms, -1 * ((sys.cpu_ticks / sys.cpu_ticks_in_ms) - a.[Record Time]), GETDATE()) AS Notification_time, 
a.* , sys.ms_ticks AS [Current Time]
FROM
(SELECT
x.value('(//Record/Error/ErrorCode)[1]', 'varchar(30)') AS [ErrorCode],
x.value('(//Record/Error/CallingAPIName)[1]', 'varchar(255)') AS [CallingAPIName],
  x.value('(//Record/Error/APIName)[1]', 'varchar(255)') AS [APIName],
x.value('(//Record/Error/SPID)[1]', 'int') AS [SPID],
x.value('(//Record/@id)[1]', 'bigint') AS [Record Id],
x.value('(//Record/@type)[1]', 'varchar(30)') AS [Type],
x.value('(//Record/@time)[1]', 'bigint') AS [Record Time]
FROM (SELECT CAST (record as xml) FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = 'RING_BUFFER_SECURITY_ERROR') AS R(x)) a
CROSS JOIN sys.dm_os_sys_info sys
ORDER BY a.[Record Time] ASC

For SQL Server 2008

SELECT CONVERT (varchar(30), GETDATE(), 121) as runtime,
dateadd (ms, (a.[Record Time] - sys.ms_ticks), GETDATE()) as [Notification_Time],
a.* , sys.ms_ticks AS [Current Time]
FROM
(SELECT
x.value('(//Record/Error/ErrorCode)[1]', 'varchar(30)') AS [ErrorCode],
x.value('(//Record/Error/CallingAPIName)[1]', 'varchar(255)') AS [CallingAPIName],
  x.value('(//Record/Error/APIName)[1]', 'varchar(255)') AS [APIName],
x.value('(//Record/Error/SPID)[1]', 'int') AS [SPID],
x.value('(//Record/@id)[1]', 'bigint') AS [Record Id],
x.value('(//Record/@type)[1]', 'varchar(30)') AS [Type],
x.value('(//Record/@time)[1]', 'bigint') AS [Record Time]
FROM (SELECT CAST (record as xml) FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = 'RING_BUFFER_SECURITY_ERROR') AS R(x)) a
CROSS JOIN sys.dm_os_sys_info sys
ORDER BY a.[Record Time] ASC

Technorati Tags: SQL Server 2005,Ring Buffers,Security

Additional Reference

http://blogs.msdn.com/psssql/archive/2008/03/24/how-it-works-sql-server-2005-sp2-security-ring-buffer-ring-buffer-security-error.aspx

How to troubleshoot connectivity issues for SQL

Posted on December 30, 2009 by Amit Banerjee

The first option should be to perform a UDL test. The following blog post has information on the same:

http://glennberrysqlperformance.spaces.live.com/blog/cns!45041418ECCAA960!939.entry

If this fails, you can then use the steps mentioned in the blog post by the SQL Dev team to troubleshoot your connectivity issue:

http://blogs.msdn.com/sql_protocols/archive/2008/04/30/steps-to-troubleshoot-connectivity-issues.aspx

Check Integrity task in Maint Plan fails on SQL Server 2000

Posted on December 30, 2009 by Amit Banerjee

When a SQL Server 2000 Database Maintenance Plan is executed, you find that the following message in printed in the Maint Plan .OUT file (Found in the SQL Server LOG folder):

The backup was not performed since data verification errors were found

[Microsoft SQL-DMO (ODBC SQLState: 42000)] Error 5058: [Microsoft][ODBC SQL Server Driver][SQL Server]Option ‘SINGLE_USER’ cannot be set in database ‘MASTER’. [Microsoft][ODBC SQL Server Driver][SQL Server]sp_dboption command failed. [1] Database master: Check Data and Index Linkage… [Microsoft SQL-DMO (ODBC SQLState: 42000)] Error 7919: [Microsoft][ODBC SQL Server Driver][SQL Server]Repair statement not processed. Database needs to be in single user mode.

This happens because you had selected the option of "Attempt to repair any minor problems" (refer above pic). This option needs CHECKDB to be executed with the database in SINGLE USER mode. If you have users connected to the database during that time, the database cannot be put into SINGLE USER mode due to which the Integrity Check task will fail. This will prevent backups from occurring on the database and you would see the above messages in the log files.

DBCC CHECKDB (<database name>, repair_fast) executed by <account name> found 0 errors and repaired 0 errors. Elapsed time: 0 hours 0 minutes 50 seconds.

If you look in the SQL Server ERRORLOG, you will find that the CHECKDB is executed with REPAIR_FAST flag. This requires Single User mode. Another note to keep in mind is if you check the Integrity Check option and "Attempt to repair any minor problems", then exclude MASTER, MSDB and MODEL databases from the Plan.

Reference Article:

290622 BUG: Database maintenance plan on system databases fails on integrity check if "Attempt to repair minor problems" is selected
http://support.microsoft.com/default.aspx?scid=kb;EN-US;290622