DBREINDEX and ALTER INDEX differences


I had written about update statistics a long time back and had been meaning to write a clarification on the difference in behavior w.r.t. table statistics being updated when rebuilding indexes using different T-SQL commands. I am sure all of you are aware of why statistics for a table need to be updated with a fullscan to ensure that the optimizer doesn’t fall prey to sub-optimal plans. However, there is a little known fact about how statistics get affected when a DBREINDEX is issued on a table versus an ALTER INDEX…REBUILD ALL.

I created a table with 4 integer columns and inserted 100,000 records into the table using the script below:

-- Create table
create table tblUpdStats (a int primary key,b int,c int, d int)
-- Insert 100,000 rows in the table
set nocount on
declare @var int = 1
while @var<=100000
begin
insert into tblUpdStats values(@var,@var+1,@var+2,@var+3)
set @var+=1
end
-- Create a non-clustered index on column b
create index idx_b on tblUpdStats(b)
-- Create a user-defined statistics on column c
create statistics stat_c on tblUpdStats(c)
-- Execute the following to query so that the database engine creates an auto-created stats on column d
select d from tblUpdStats where d > 50000
Name                            Updated              Rows    Rows Sampled
------------------------------- -------------------- ------- ------------
idx_b                           May 13 2011  8:03PM  100000  100000
PK__tblUpdSt__3BD0198E558AAF1E  May 13 2011  8:03PM  100000  100000
stat_c                          May 13 2011  8:03PM  100000  100000
_WA_Sys_00000004_53A266AC       May 13 2011  8:03PM  100000  100000

I took a snapshot of the DBCC SHOW_STATISTICS output for the statistics associated with the non-clustered index, clustered index, user created statistics and auto-created statistics. All the rows and rows sampled are equal which means that the statistics are updated with a fullscan. I then inserted another 100,000 rows into the same table. I then executed the following command to rebuild the indexes on the table:

alter index all on tblstats rebuild

I took a snapshot of the DBCC SHOW_STATISTICS command again and notice the difference below:

Name                            Updated              Rows     Rows Sampled
--------------------------------------------------------------------------
idx_b                           May 13 2011  8:19PM  200000   200000
PK__tblUpdSt__3BD0198E558AAF1E  May 13 2011  8:19PM  200000   200000
stat_c                          May 13 2011  8:03PM  100000   100000
_WA_Sys_00000004_53A266AC       May 13 2011  8:03PM  100000   100000

You will see that the user created and auto-created statistics do not reflect the change in the number of rows. However when I execute a DBREINDEX on the table, I see a different picture altogether. (See show_statistics output below)

dbcc dbreindex ('tblUpdStats','')
Name                               Updated              Rows      Rows Sampled
 ---------------------------------- -------------------- --------- --------------
 idx_b                              May 13 2011  8:31PM  200000    200000
 PK__tblUpdSt__3BD0198E558AAF1E     May 13 2011  8:31PM  200000    200000
 stat_c                             May 13 2011  8:31PM  200000    200000
_WA_Sys_00000004_53A266AC           May 13 2011  8:31PM  200000    200000

Now if I look at the rows sampled and rows column values, I see that the updated column shows the same date for all the objects. In this case the rows sampled and rows are the same indicating that a fullscan occurred for the user created/auto created statistics on the table.

To summarize, ALTER INDEX…REBUILD ALL will not update statistics which were manually created or were auto-created by the Database Engine. It will only update statistics associated with the indexes defined on the table columns. On the other hand, a DBCC DBREINDEX on the table will update statistics for the user created statistics and the auto-created statistics using the default sampling rate. The default sampling comes into play for very large tables. If you have a maintenance plan job which does an index rebuild operation on SQL Server 2005 or above, then that uses an ALTER INDEX…REBUILD command.

Reorganize data and index pages


There are a few ways that this can be achieved:

1. Shrinking the database files (data file or transaction log file)

2. DBCC DBREINDEX (Similar to dropping and re-creating the indexes)

3. DBBC INDEXDEFRAG (De-fragmenting the indexed pages to remove logical scan fragmentation)

Note: It needs to be kept in mind that DBREINDEX is a fully-logged operation and can cause the transaction log to grow by large amounts when the database size is considerably large. So there have been situations when the transaction log has grown out of proportion and filled up the entire disk. If the T-log does grow out of proportion and you do choose to truncate/shrink the file, then please do so after verifying the points below:

1. Transactional replication is not configured for the database

2. Log shipping is not configured for the database

3. Database mirroring is not configured for the database (For SQL 2005 SP1 and above)

4. Any other operations are not being performed which need transactional log consistency in terms of LSN chains

Online re-indexing is a feature which is available in SQL Server 2005 Enterprise Edition only. Please note that rebuilding the indexes would cause the statistics associated with those indexes to also be updated. This however would not update the AUTO CREATED statistics.

The following articles could be useful:

SET OPTION considerations when running DBCC with indexes on computed columns

http://support.microsoft.com/kb/301292/

DBCC DBREINDEX

http://msdn2.microsoft.com/en-us/library/ms181671.aspx

The DBCC SHOWCONTIG output for SQL Server 2000 and 2005 will help you find out the current status of fragmentation for the database. A sample output would look like this:

DBCC SHOWCONTIG scanning ‘Customers’ table…

Table: ‘Customers’ (2073058421); index ID: 1, database ID: 6

TABLE level scan performed.

– Pages Scanned…………………………..: 3

– Extents Scanned…………………………: 2

– Extent Switches…………………………: 1

– Avg. Pages per Extent……………………: 1.5

– Scan Density [Best Count:Actual Count]…….: 50.00% [1:2]

– Logical Scan Fragmentation ………………: 0.00%

– Extent Scan Fragmentation ……………….: 50.00%

– Avg. Bytes Free per Page…………………: 246.7

– Avg. Page Density (full)…………………: 96.95%

The logical scan fragmentation should ideally be below 20% but anything above 40% would mean an INDEXDEFRAG or DBREINDEX is required. The script which is from the same MSDN article can help you run an INDEXDEFRAG or DBREINDEX based on the logical scan fragmentation count.

A sample script for the same can be found under the topic of "DBCC SHOWCONTIG" in SQL Server Books Online (2000 and 2005)

In SQL Server 2005, the following features may be helpful to you while using the ALTER INDEX command:

REBUILD [ WITH ( [ ,… n]) ]

Specifies the index will be rebuilt using the same columns, index type, uniqueness attribute, and sort order. This clause is equivalent to DBCC DBREINDEX. REBUILD enables a disabled index. Rebuilding a clustered index does not rebuild associated nonclustered indexes unless the keyword ALL is specified. If index options are not specified, the existing index option values stored in sys.indexes are applied. For any index option whose value is not stored in sys.indexes, the default indicated in the argument definition of the option applies.

The options ONLINE and IGNORE_DUP_KEY are not valid when you rebuild an XML index.

If ALL is specified and the underlying table is a heap, the rebuild operation has no affect on the table. Any nonclustered indexes associated with the table are rebuilt.

The rebuild operation can be minimally logged if the database recovery model is set to either bulk-logged or simple.

WITH ( LOB_COMPACTION = { ON OFF } )

Specifies that all pages that contain large object (LOB) data are compacted. The LOB data types are image, text, ntext, varchar(max), nvarchar(max), varbinary(max), and xml. Compacting this data can improve disk space use. The default is ON.

ON

All pages that contain large object data are compacted.Reorganizing a specified clustered index compacts all LOB columns that are contained in the clustered index. Reorganizing a nonclustered index compacts all LOB columns that are nonkey (included) columns in the index. For more information, see Creating Indexes with Included Columns. When ALL is specified, all indexes that are associated with the specified table or view are reorganized, and all LOB columns that are associated with the clustered index, underlying table, or nonclustered index with included columns are compacted. OFF Pages that contain large object data are not compacted.

OFF

has no affect on a heap.The LOB_COMPACTION clause is ignored if LOB columns are not present.

REORGANIZE

Specifies the index leaf level will be reorganized. This clause is equivalent to DBCC INDEXDEFRAG. ALTER INDEX REORGANIZE statement is always performed online. This means long-term blocking table locks are not held and queries or updates to the underlying table can continue during the ALTER INDEX REORGANIZE transaction. REORGANIZE cannot be specified for a disabled index or an index with ALLOW_PAGE_LOCKS set to OFF.

Furthermore, the dynamic management view sys.dm_db_index_physical_stats would help you get statistical information pertaining to indexes on SQL Server 2005. The columns mentioned below in this DMV output could be helpful in analyzing the need for index reorganization.

avg_fragmentation_in_percent: Logical fragmentation for indexes, or extent fragmentation for heaps in the IN_ROW_DATA allocation unit. The value is measured as a percentage and takes into account multiple files. For definitions of logical and extent fragmentation, see Remarks. 0 for LOB_DATA and ROW_OVERFLOW_DATA allocation units.NULL for heaps when mode = SAMPLED.

fragment_count: Number of fragments in the leaf level of an IN_ROW_DATA allocation unit. For more information about fragments, see Remarks.NULL for nonleaf levels of an index, and LOB_DATA or ROW_OVERFLOW_DATA allocation units. NULL for heaps when mode = SAMPLED.

avg_fragment_size_in_pages: Average number of pages in one fragment in the leaf level of an IN_ROW_DATA allocation unit. NULL for nonleaf levels of an index, and LOB_DATA or ROW_OVERFLOW_DATA allocation units. NULL for heaps when mode = SAMPLED.

The following article could be helpful:Microsoft SQL Server 2000 Index Defragmentation Best Practices

http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ss2kidbp.mspx