I had written about update statistics a long time back and had been meaning to write a clarification on the difference in behavior w.r.t. table statistics being updated when rebuilding indexes using different T-SQL commands. I am sure all of you are aware of why statistics for a table need to be updated with a fullscan to ensure that the optimizer doesn’t fall prey to sub-optimal plans. However, there is a little known fact about how statistics get affected when a DBREINDEX is issued on a table versus an ALTER INDEX…REBUILD ALL.
I created a table with 4 integer columns and inserted 100,000 records into the table using the script below:
-- Create table create table tblUpdStats (a int primary key,b int,c int, d int) -- Insert 100,000 rows in the table set nocount on declare @var int = 1 while @var<=100000 begin insert into tblUpdStats values(@var,@var+1,@var+2,@var+3) set @var+=1 end -- Create a non-clustered index on column b create index idx_b on tblUpdStats(b) -- Create a user-defined statistics on column c create statistics stat_c on tblUpdStats(c) -- Execute the following to query so that the database engine creates an auto-created stats on column d select d from tblUpdStats where d > 50000
Name Updated Rows Rows Sampled ------------------------------- -------------------- ------- ------------ idx_b May 13 2011 8:03PM 100000 100000 PK__tblUpdSt__3BD0198E558AAF1E May 13 2011 8:03PM 100000 100000 stat_c May 13 2011 8:03PM 100000 100000 _WA_Sys_00000004_53A266AC May 13 2011 8:03PM 100000 100000
I took a snapshot of the DBCC SHOW_STATISTICS output for the statistics associated with the non-clustered index, clustered index, user created statistics and auto-created statistics. All the rows and rows sampled are equal which means that the statistics are updated with a fullscan. I then inserted another 100,000 rows into the same table. I then executed the following command to rebuild the indexes on the table:
alter index all on tblstats rebuild
I took a snapshot of the DBCC SHOW_STATISTICS command again and notice the difference below:
Name Updated Rows Rows Sampled -------------------------------------------------------------------------- idx_b May 13 2011 8:19PM 200000 200000 PK__tblUpdSt__3BD0198E558AAF1E May 13 2011 8:19PM 200000 200000 stat_c May 13 2011 8:03PM 100000 100000 _WA_Sys_00000004_53A266AC May 13 2011 8:03PM 100000 100000
You will see that the user created and auto-created statistics do not reflect the change in the number of rows. However when I execute a DBREINDEX on the table, I see a different picture altogether. (See show_statistics output below)
dbcc dbreindex ('tblUpdStats','')
Name Updated Rows Rows Sampled ---------------------------------- -------------------- --------- -------------- idx_b May 13 2011 8:31PM 200000 200000 PK__tblUpdSt__3BD0198E558AAF1E May 13 2011 8:31PM 200000 200000 stat_c May 13 2011 8:31PM 200000 200000 _WA_Sys_00000004_53A266AC May 13 2011 8:31PM 200000 200000
Now if I look at the rows sampled and rows column values, I see that the updated column shows the same date for all the objects. In this case the rows sampled and rows are the same indicating that a fullscan occurred for the user created/auto created statistics on the table.
To summarize, ALTER INDEX…REBUILD ALL will not update statistics which were manually created or were auto-created by the Database Engine. It will only update statistics associated with the indexes defined on the table columns. On the other hand, a DBCC DBREINDEX on the table will update statistics for the user created statistics and the auto-created statistics using the default sampling rate. The default sampling comes into play for very large tables. If you have a maintenance plan job which does an index rebuild operation on SQL Server 2005 or above, then that uses an ALTER INDEX…REBUILD command.