I had written about update statistics a long time back and had been meaning to write a clarification on the difference in behavior w.r.t. table statistics being updated when rebuilding indexes using different T-SQL commands. I am sure all of you are aware of why statistics for a table need to be updated with a fullscan to ensure that the optimizer doesn’t fall prey to sub-optimal plans. However, there is a little known fact about how statistics get affected when a DBREINDEX is issued on a table versus an ALTER INDEX…REBUILD ALL.
I created a table with 4 integer columns and inserted 100,000 records into the table using the script below:
-- Create table create table tblUpdStats (a int primary key,b int,c int, d int) -- Insert 100,000 rows in the table set nocount on declare @var int = 1 while @var<=100000 begin insert into tblUpdStats values(@var,@var+1,@var+2,@var+3) set @var+=1 end -- Create a non-clustered index on column b create index idx_b on tblUpdStats(b) -- Create a user-defined statistics on column c create statistics stat_c on tblUpdStats(c) -- Execute the following to query so that the database engine creates an auto-created stats on column d select d from tblUpdStats where d > 50000
Name Updated Rows Rows Sampled ------------------------------- -------------------- ------- ------------ idx_b May 13 2011 8:03PM 100000 100000 PK__tblUpdSt__3BD0198E558AAF1E May 13 2011 8:03PM 100000 100000 stat_c May 13 2011 8:03PM 100000 100000 _WA_Sys_00000004_53A266AC May 13 2011 8:03PM 100000 100000
I took a snapshot of the DBCC SHOW_STATISTICS output for the statistics associated with the non-clustered index, clustered index, user created statistics and auto-created statistics. All the rows and rows sampled are equal which means that the statistics are updated with a fullscan. I then inserted another 100,000 rows into the same table. I then executed the following command to rebuild the indexes on the table:
alter index all on tblstats rebuild
I took a snapshot of the DBCC SHOW_STATISTICS command again and notice the difference below:
Name Updated Rows Rows Sampled -------------------------------------------------------------------------- idx_b May 13 2011 8:19PM 200000 200000 PK__tblUpdSt__3BD0198E558AAF1E May 13 2011 8:19PM 200000 200000 stat_c May 13 2011 8:03PM 100000 100000 _WA_Sys_00000004_53A266AC May 13 2011 8:03PM 100000 100000
You will see that the user created and auto-created statistics do not reflect the change in the number of rows. However when I execute a DBREINDEX on the table, I see a different picture altogether. (See show_statistics output below)
dbcc dbreindex ('tblUpdStats','')
Name Updated Rows Rows Sampled ---------------------------------- -------------------- --------- -------------- idx_b May 13 2011 8:31PM 200000 200000 PK__tblUpdSt__3BD0198E558AAF1E May 13 2011 8:31PM 200000 200000 stat_c May 13 2011 8:31PM 200000 200000 _WA_Sys_00000004_53A266AC May 13 2011 8:31PM 200000 200000
Now if I look at the rows sampled and rows column values, I see that the updated column shows the same date for all the objects. In this case the rows sampled and rows are the same indicating that a fullscan occurred for the user created/auto created statistics on the table.
To summarize, ALTER INDEX…REBUILD ALL will not update statistics which were manually created or were auto-created by the Database Engine. It will only update statistics associated with the indexes defined on the table columns. On the other hand, a DBCC DBREINDEX on the table will update statistics for the user created statistics and the auto-created statistics using the default sampling rate. The default sampling comes into play for very large tables. If you have a maintenance plan job which does an index rebuild operation on SQL Server 2005 or above, then that uses an ALTER INDEX…REBUILD command.
Interesting result. The final paragraph seems a little confusing as you seem to advocate DBREINDEX, but mention that the maintenance job uses ALTER INDEX. I’d think you’d have more of warning that your maint plan may not be rebuilding statistics? Or am I reading this wrong?
If the stats on indexes are rebuilt, any guidance on how the unchanged stats might hurt performance in queries? Maybe another blog post for that one?
LikeLike
Thanks for the insight. Didn’t realise that the paragraph was giving conflicting views. The last paragraph was intended to point out the difference of how auto-created and user statistics get handled using ALTER INDEX..REBUILD ALL and DBCC DBREINDEX. I am not advocating the use of DBREINDEX as the feature is slated to be removed in future versions as mentioned in BOL. When you have tables with less number of rows, you will not see this behavior as a performance bottleneck. But if the statistics are updated only with a default sampling rate which tends to be a very small percentage for very large tables which might affect queries adversely which are dependant on the user/auto created statistics for cardinality estimation.
Index rebuild commands should never be used as a substitute for Update Statistics commands. My recommendation is ALTER INDEX REBUILD or REORGANIZE based on the threshold value that you deem apporpriate based on observed behavior in your environment and then UPDATE STATISTICS WITH FULLSCAN for user and manually created statistics during atleast one database maintenance cycle or based on the number of changes received by the table. Statistics with default sampling rate can result in sub-optimal plans. I will work on post in the future on this a working sample.
LikeLike
Pingback: Statistics Auto Update may not be automatically the best choice « TroubleshootingSQL
Pingback: Defragging SQL Server Index using ALTER INDEX - A bit of this and that, on IT. - Site Home - TechNet Blogs