SQL Server 2005 Upgrade

Upgrading from SQL Server 2000 to SQL Server 2005 can seem a daunting task once you contemplate everything that needs to be done before and after the upgrade. On the other hand, if you have planned everything out beforehand, it is very unlikely that you will land yourself in a soup, unless the Gods choose to be very unkind to you on that fateful day. πŸ™‚

So, first let us understand the two procedures for upgrading from SQL Server 2000 to SQL Server 2005:
1. IN PLACE UPGRADE
2. SIDE BY SIDE UPGRADE

Before I go ahead and explain these two upgrade procedures, let us dwell on the prerequisites for upgrading to SQL Server 2005.
1. Testing: Make sure all your applications are thoroughly tested and function as expected in a SQL Server 2005 environment before deciding to take the final step.
2. Fall-back plan: This is what I call "insurance". It's a hassle, but when a problem does happen, this is what helps you out of it. I believe it is better to be safe than sorry.
3. Backups: This could have been included in point 2, but it is of utmost importance, so I thought I would reiterate it. We need backups of all the user and system (YES! we need them.) databases in case we need to use our fall-back plan.
4. Upgrade Advisor: Run the Upgrade Advisor on your existing SQL Server 2000 installation and find out if there are any RED FLAGS in upgrading to SQL Server 2005 based on your current configuration.

In-place upgrade process
In an in-place upgrade, all the existing installed components of SQL Server 2000 are upgraded to SQL Server 2005 on the same server.
Side-by-side upgrade process
In a side-by-side upgrade, your existing SQL Server installation is left intact, a fresh SQL Server 2005 instance is installed on the same server or a different one, and your databases are then migrated over. Once all the components (databases, maintenance plans, DTS packages, etc.) have been migrated, the SQL Server 2000 installation can be phased out.
Both of these upgrade processes have their PROs and CONs, which are clearly documented in the TechNet article mentioned below.

Upgrade Handbook for SQL Server 2005
http://www.microsoft.com/technet/prodtechnol/sql/2005/sqlupgrd.mspx
Upgrading to SQL Server 2005
http://www.microsoft.com/sql/solutions/upgrade/default.mspx
Technet Article on Upgrading SQL Server 2000 to SQL Server 2005
http://www.microsoft.com/downloads/details.aspx?FamilyID=3d5e96d9-0074-46c4-bd4f-c3eb2abf4b66&DisplayLang=en

So, what do we need to take care of when upgrading to SQL Server 2005 from SQL Server 2000?
1. Replication: If you have replication configured (subscriber/publisher/distributor), you need to find out whether your replication topology is compatible and will continue to function smoothly after the current server is upgraded.
2. DTS Packages: These need to be migrated to SQL Server 2005 using the Migration Wizard. Related article: http://msdn2.microsoft.com/en-us/library/ms143501.aspx
3. Maintenance Plans: These too are different from their SQL Server 2000 counterparts, because SQL 2005 uses SSIS in the background.
4. Logins/Linked servers: In the case of a side-by-side upgrade, these need to be migrated as well.
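
Before a side-by-side upgrade, it helps to take an inventory of what actually needs migrating. Here is a minimal sketch against the SQL Server 2000 system tables (the queries are assumptions about where your packages and logins live; adjust for your environment):

```sql
-- DTS packages stored in msdb on the SQL Server 2000 instance
SELECT DISTINCT name FROM msdb.dbo.sysdtspackages

-- Linked servers (srvid = 0 is the local server itself, so exclude it)
SELECT srvname, datasource FROM master.dbo.sysservers WHERE srvid <> 0

-- SQL logins only (isntname = 0 filters out Windows logins/groups)
SELECT name FROM master.dbo.syslogins WHERE isntname = 0
```

Each list becomes a checklist item for the migration: packages go through the Migration Wizard, linked servers are re-created on the new instance, and logins are scripted across (e.g. with sp_help_revlogin) so their SIDs are preserved.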

These are a few of the prime areas of concern that come to mind. A whole bunch of other considerations are covered in the above-mentioned TechNet whitepaper. Please do read through it if you are planning to upgrade to SQL Server 2005. One of the biggest reasons we get support calls about failed upgrades is that some best practice was not followed, because of which the entire setup failed.

Remember that the support team for SQL has a skilled set of engineers who are trained in troubleshooting problem scenarios with SQL Server, but they do not hold magic wands. If your SQL Server setup is so botched that it cannot be recovered, you will end up reinstalling the instance from scratch after performing a manual cleanup.

Please think for a moment: Which is the better option?

a. Spending some time reading the lengthy whitepaper

b. OR skipping it and taking a 50-50 chance that the upgrade will go smoothly, with a whole set of headaches to take care of if it fails.

The need for a sound Backup Strategy

A backup strategy needs to be designed in such a manner that it causes the minimum amount of downtime if backups need to be restored during a disaster recovery scenario. The backup strategy depends on the following:
1. Size of the database – This determines how often a FULL BACKUP can be taken.
2. Recovery model – This determines whether transaction log/differential backups are possible.
3. Importance of the database – This determines how often transaction log backups need to be taken if the database is in the full recovery model.

If the database is in the SIMPLE recovery model, transaction log backups are not possible, so full (and differential) backups are our only options. If the database is in the FULL or BULK-LOGGED recovery model, then we also have the option of taking transaction log backups.
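
A quick sketch of checking and changing the recovery model (the database name "MyDB" is a placeholder; DATABASEPROPERTYEX works on both SQL Server 2000 and 2005):

```sql
-- Check the current recovery model of every database on the instance
SELECT name, DATABASEPROPERTYEX(name, 'Recovery') AS recovery_model
FROM master.dbo.sysdatabases

-- Switch a database to the FULL recovery model so that
-- transaction log backups become possible
ALTER DATABASE MyDB SET RECOVERY FULL
```

Remember that after switching from SIMPLE to FULL, a full or differential backup is needed before the first log backup can be taken.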

Now the next question is how often to take a transaction log backup. Depending on the importance of the database, transaction log backups can be taken every 10-15 minutes, or even every hour; this needs to be determined at your end. Furthermore, if transaction log backups are being taken very frequently and a full backup is taken once a week, it is advisable to take differential backups in the middle of the week. This makes the restore process less cumbersome: after the full backup, the latest differential backup can be applied, followed by only the transaction log backups taken after it.

In case of full/bulk-logged recovery models, the following backup strategy would be ideal:

1. Full backup

a. Differential backup(s)

i. Transaction log backup(s)

b. Differential backup(s)

i. Transaction log backup(s)

2. Full backup (and repeat the above cycle)
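
The cycle above can be sketched with plain BACKUP commands. The database name and disk paths here are assumptions for illustration; in practice you would schedule these as SQL Agent jobs:

```sql
-- 1. Weekly full backup (WITH INIT overwrites the backup file)
BACKUP DATABASE MyDB TO DISK = 'D:\Backups\MyDB_full.bak' WITH INIT

-- a. Mid-week differential backup (changes since the last full backup)
BACKUP DATABASE MyDB TO DISK = 'D:\Backups\MyDB_diff.bak' WITH DIFFERENTIAL, INIT

-- i. Frequent transaction log backups
--    (only possible in the FULL/BULK-LOGGED recovery models)
BACKUP LOG MyDB TO DISK = 'D:\Backups\MyDB_log.trn' WITH INIT
```

At restore time, the sequence is the full backup, then the latest differential, then the log backups taken after that differential.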

However, any backup strategy that you design needs to be compliant with your business needs.

Furthermore, for creating sound fail-over strategies for disaster recovery scenarios, the following can be considered:
1. Log shipping

2. Database Mirroring (For SQL 2005 SP1 and above)

3. Replication (Snapshot, Transactional, Merge)

4. Failover Clustering

5. Or an in-house disaster recovery mechanism envisioned by your IT & DBA team which could be a combination of one or more of the above methods.
Also, it is considered good practice to restore a backup on a test server and run CHECKDB on the restored database to make sure the backup is in good shape. This becomes of utmost importance when it is not possible to run CHECKDB before taking a backup. In SQL Server 2005, you also have the option of performing a piecemeal restore, which allows you to restore down to the page level in case you have database corruption.
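
A sketch of that verification step on a test server (the database name, logical file names, and paths are placeholders; use RESTORE FILELISTONLY to find the real logical names):

```sql
-- Restore the backup under a different name on the test server
RESTORE DATABASE MyDB_Verify
FROM DISK = 'D:\Backups\MyDB_full.bak'
WITH MOVE 'MyDB_Data' TO 'D:\Test\MyDB_Verify.mdf',
     MOVE 'MyDB_Log'  TO 'D:\Test\MyDB_Verify.ldf'

-- Check the logical and physical integrity of the restored copy
DBCC CHECKDB ('MyDB_Verify') WITH NO_INFOMSGS
```

If CHECKDB comes back clean on the restored copy, you know both the backup file and the data inside it are usable for your fall-back plan.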

The following articles could be helpful:

Overview of Restore and Recovery in SQL Server
http://msdn2.microsoft.com/en-us/library/ms191253.aspx

SQL Server 2005 provides the option of performing an online restore, which is available in SQL Server 2005 Enterprise Edition. Furthermore, you have the option of performing piecemeal restores, i.e. page-level restores, restoring a single file of a secondary filegroup while keeping the database online, etc.

Please refer the following for more information:
Performing Online Restores
http://msdn2.microsoft.com/en-us/library/ms188671.aspx
Storage Top 10 Best Practices
http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/storage-top-10.mspx
Using Recovery Models
http://msdn2.microsoft.com/en-us/library/aa173678(SQL.80).aspx

How to update statistics for a VLDB

Statistics are something that the SQL optimizer depends on to generate a plan. The more outdated your statistics are, the greater the chances that the query optimizer will end up with a sub-optimal plan.
So what do you do when you have a database that is highly transactional in nature and quite large in size? You might find yourself in a scenario where the database is online 24x7 and there is no downtime window in which you can update your statistics with a FULLSCAN.

Before we get into my opinion about updating statistics on a database, let's make sure we understand what I mean by statistics being "up-to-date". Statistics are up-to-date when the sampling rate used while updating them was 100%, i.e. they were updated with a FULLSCAN. Contrary to popular belief, statistics updated via the AUTO UPDATE STATISTICS property of a database are not updated with a FULLSCAN. The reason is simple: suppose a table has 1 GB of data and a query triggered an auto-update of a statistic on that table; if the sampling rate were 100%, the query's duration would take an eternity.

So, if you have a large database:
1. Ensure that AUTO UPDATE STATISTICS is turned on for the database. This provides some relief.
2. Identify which tables need their statistics updated with a FULLSCAN. Only the tables with the maximum number of queries running against them should qualify.
3. Next, identify a period when database usage is low. This is the window you can use to update the statistics of the tables that did not qualify for the list in #2.
4. Use a script to update statistics for the tables identified in step #2, probably based on rowmodctr values. (ROWMODCTR counts the total number of inserted, deleted, or updated rows since the last time statistics were updated for the table. In SQL Server 2005 this is not always going to be helpful [Ref: http://msdn2.microsoft.com/en-us/library/ms190283.aspx], but in SQL 2000 it can be used as a deciding factor.)

Sample Script
============
Let's say I had identified 4 tables in step 2:

CREATE TABLE UPD_STATS_TBL (tblname sysname, dt_updated datetime, rowmodctr bigint)

SELECT * FROM UPD_STATS_TBL

tblname  dt_updated               rowmodctr
TBL1     2007-10-24 21:13:46.123  20000
TBL2     2007-10-23 21:13:46.123  400000
TBL3     2007-10-22 21:13:46.123  508000
TBL4     2007-10-24 20:13:46.123  87000

**************************************************

CREATE PROC UPD_STATS_DB
AS
DECLARE @tbl sysname
DECLARE @stmt varchar(300)

-- Refresh the modification counters for the tracked tables
UPDATE upd_stats_tbl
SET rowmodctr = (SELECT MAX(rowmodctr) FROM sys.sysindexes WHERE id = OBJECT_ID(tblname))

-- Pick the least recently updated table with more than 10,000 modifications.
-- (For SQL Server 2005, you would have to use the STATS_DATE function instead
-- of rowmodctr values. Refer to my statistics update post for more details.)
SELECT TOP 1 @tbl = tblname
FROM UPD_STATS_TBL
WHERE rowmodctr > 10000
ORDER BY dt_updated

SET @stmt = 'UPDATE STATISTICS' + SPACE(1) + @tbl + SPACE(1) + 'WITH FULLSCAN'
EXEC (@stmt)

UPDATE upd_stats_tbl
SET dt_updated = GETDATE()
WHERE tblname = @tbl

**************************************************
NOTE: You could refine this down to the table statistic if you wanted to.

Statistics Update

This is an important task in terms of database maintenance. It ensures the statistics are up-to-date, which in turn ensures that the query optimizer doesn't end up with sub-optimal plans. The two ways to monitor the need for a statistics update are:
1. STATS_DATE function

2. rowmodctr value in the sysindexes (For SQL 2000) and sys.sysindexes (For SQL 2005) output

The STATS_DATE function gives you the date when the statistics were last updated. The rowmodctr value tells you how many changes (updates, inserts, and deletes) have taken place in the data of the column the index is associated with. However, this counter is not reliable on SQL Server 2005, as SQL Server 2005 and higher versions use column modification counters to track modifications on tables, based on which statistics are updated if the AUTO UPDATE STATISTICS property of the database is set.
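
On SQL Server 2005, the STATS_DATE check can be sketched like this (the "Customers" table is a placeholder for one of your own tables):

```sql
-- List each index on the table with the date its statistics
-- were last updated; old dates on busy tables are candidates
-- for a manual UPDATE STATISTICS.
SELECT name AS index_name,
       STATS_DATE(OBJECT_ID('Customers'), index_id) AS stats_last_updated
FROM sys.indexes
WHERE object_id = OBJECT_ID('Customers')
```

A NULL date here means the statistics have never been updated since creation.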

The frequency of statistics update depends on the following:
1. Do BULK operations happen on the database tables in question?
2. Is there an off-peak period where a statistics update with full scan can be done?
3. How often is the data modified in the database tables in question?

A statistics update can be a resource-intensive operation depending on the size of the table. If the data in the tables in question changes very rarely, then a statistics update with a full scan can be done during a maintenance window. A statistics update is always an ONLINE operation and doesn't require the database to be taken OFFLINE.

If BULK operations happen on a table, the statistics have a tendency to become skewed after the operation. The best option is to perform a statistics update on such a table if it is to be used by an application after the bulk operation.

If the database is updated very frequently and is very large in size, then it needs to be determined which tables are modified most frequently. Based on this, a statistics update can be performed for only those tables, while a statistics update for the entire database can be done with a lower sampling rate, like 20-30%, depending on what is suitable. This can be determined by comparing historical data and finding out what kind of sampling rate suits your needs.

Another option is to enable the AUTO UPDATE STATISTICS option for the database. But it needs to be monitored whether this is a boon or a bane. AUTO UPDATE STATISTICS is good when the statistics are relatively up-to-date, as it makes sure they don't fall too far out of place. The downside appears when the statistics become outdated too frequently: AUTO UPDATE STATISTICS then fires repeatedly, and each auto-update causes all cached plans associated with the updated statistics to be recompiled. This can become a serious bottleneck if too many auto-update-statistics events fire in a short time span.
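
Checking and toggling this option is straightforward (the database name "MyDB" is a placeholder):

```sql
-- 1 means AUTO UPDATE STATISTICS is enabled for the database
SELECT DATABASEPROPERTYEX('MyDB', 'IsAutoUpdateStatistics') AS auto_update_stats

-- Enable it if your monitoring shows it is a boon rather than a bane
ALTER DATABASE MyDB SET AUTO_UPDATE_STATISTICS ON
```

Profiler's Auto Stats event class can help you judge how often these auto-updates actually fire.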

The following methods can be used to update the statistics:
1. UPDATE STATISTICS WITH FULLSCAN (or a lower sampling rate) Please refer the following article for further details: http://msdn2.microsoft.com/en-us/library/aa260645(SQL.80).aspx

2. sp_updatestats

3. A maintenance plan or SQL Agent job to update the statistics
On SQL Server 2000, the rowmodctr value can help you determine how badly affected the statistics associated with an index are: rowmodctr counts the total number of inserted, deleted, or updated rows since the last time statistics were updated for the table. Again, this is applicable to SQL Server 2000 only; on SQL Server 2005, use STATS_DATE instead.
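
A sketch of that check against the SQL Server 2000 sysindexes table (the 10,000-row threshold is an arbitrary assumption; tune it for your data volumes):

```sql
-- Find indexes whose statistics have seen many row modifications
-- since they were last updated; indid 0 is the heap and indid 255
-- holds TEXT/IMAGE pages, so both are excluded.
SELECT OBJECT_NAME(id) AS table_name, name AS index_name, rowmodctr
FROM sysindexes
WHERE indid > 0 AND indid < 255
  AND rowmodctr > 10000
ORDER BY rowmodctr DESC
```

The tables at the top of this list are the first candidates for a manual UPDATE STATISTICS ... WITH FULLSCAN.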

A DBCC DBREINDEX would update the statistics associated with those indexes, but this would not update the AUTO CREATED STATISTICS. The DBCC SHOW_STATISTICS command can also help you determine the condition of the statistics for a particular index: if "Rows Sampled" and "Rows" are equal, the sampling is currently 100%.
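
For example (table and index names are placeholders matching the sample output further down):

```sql
-- The header row set of the output includes the "Rows" and
-- "Rows Sampled" columns used to judge the sampling rate.
DBCC SHOW_STATISTICS ('Customers', 'pk_customers')
```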

Modification: May 13, 2011:

I had earlier stated that a DBCC DBREINDEX wouldn't update the auto-created statistics, but this is incorrect. The current behavior is that auto-created statistics and manually created column statistics are not updated only when ALTER INDEX ... REBUILD ALL is used. When you use DBCC DBREINDEX, however, it updates both the manually and auto-created statistics with the default sampling rate.


Statistics for INDEX 'pk_customers'.

Updated             Rows  Rows Sampled  Steps  Density       Average key length
Jun 23 2007 5:03PM  91    91            91     1.0989011E-2  10.0


The following articles could be helpful:
UPDATE STATISTICS
http://msdn2.microsoft.com/en-us/library/aa260645(SQL.80).aspx
Statistics used by Query Optimizer in SQL Server 2005
http://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx

Database Mirroring with SQL Authentication

It's quite fascinating that a database can be mirrored across two sites and a front-end application will not have a clue about a failure, thanks to the failover partner specified in the application's connection string.

Database mirroring is a feature introduced in SQL Server 2005 that has seen a sizeable amount of implementation over the years. What people often don't notice is how the application connects to the database on two completely different servers with respect to authentication. Most organizations use Windows Authentication, and after a database failover users are able to connect seamlessly. When SQL Authentication is used to connect to the mirrored database after a failover, and the login only has rights on the mirrored database, it becomes a different ballgame altogether.

SQL Server stores the information of a login using a SID value. The SID values for Windows Authentication (domain accounts) are the same throughout the environment, so a problem never arises there when the database fails over. For SQL logins it is different. The rule of thumb is that the login used to connect to a mirrored database should exist on both instances hosting the principal and mirror databases. So what happens when you use SQL Authentication after a failover? SQL Server relies on SIDs to map a database user to a SQL login; you can find this information in the sys.database_principals and sys.syslogins catalog views. Since SQL Authentication SIDs are machine-specific, and not derived from AD as in the case of domain accounts, the SIDs are bound to mismatch after a failover, and you will start getting "login failed" or "login timeout period expired" errors for SQL Authentication.

To resolve such an issue make sure that the SID of the SQL Authenticated user is the same on both the principal and mirror server instances.
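
A sketch of that check, run on both partners (the login name "app_login" is a placeholder):

```sql
-- The server login's SID on this instance
SELECT name, sid FROM sys.syslogins WHERE name = 'app_login'

-- The database user's SID inside the mirrored database
SELECT name, sid FROM sys.database_principals WHERE name = 'app_login'
```

If the two SIDs differ between the principal and mirror instances, the database user will be orphaned after a failover.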

So, if you have run into an issue of having orphaned logins after failing over your mirrored database, then you can follow the steps below to resolve the issue:

1. Use the sp_help_revlogin script to script out the SQL Login

2. Drop the login on the mirror instance and recreate it using the script obtained above (sp_help_revlogin preserves the original SID).

3. You would have to fail over and re-map the SQL login to the database user under user mappings, because if you have a database user and drop the SQL login associated with it, the database user becomes orphaned.
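
The orphaned-user detection and re-mapping can also be sketched with the system procedure SQL Server provides for this (the database and login names are placeholders):

```sql
USE MyMirroredDB

-- Report database users whose SID has no matching server login
EXEC sp_change_users_login 'Report'

-- Re-map an orphaned database user to the server login of the same name
EXEC sp_change_users_login 'Update_One', 'app_login', 'app_login'
```

Scripting the login with sp_help_revlogin in the first place, so the SID matches on both partners, avoids needing the re-map at all.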

This is documented both in the whitepaper on database mirroring on TechNet and in Books Online, which talk about resolving the orphaned users issue after a database failover.

Related Article:

http://blogs.msdn.com/sqlserverfaq/archive/2009/04/13/orphaned-users-with-database-mirroring-and-log-shipping.aspx