Why can I not produce the same plan with a copy of the production database?

This question has been asked multiple times for SQL Server 2005/2008 environments. The answer is that the Query Optimizer in SQL Server has become a bit smarter: it now also takes the hardware environment and the database session state into account. (Sometimes, the optimizer may be a bit too smart for its own good 🙂 but that discussion is not addressed here.)

To determine an optimal query plan, SQL Server 2005 and 2008 use the following information:

  1. The database metadata – The table statistics should hold the same information, i.e. the same data distribution.
  2. The hardware environment – The physical memory (RAM) and the number of processors should be identical on the two machines.
  3. The database session state – For example, the SET options in effect for the connection.

Typically, you must simulate all of this information if you want to reproduce the behavior of the query optimizer on a test system. If you are lucky, you might end up with the same plan even without items 2 and 3 being satisfied. In scenarios where you don’t, items 2 and 3 are worth simulating before running off to solve the question:

Why is Server A generating a better plan than Server B?
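Before comparing plans, it can help to capture the hardware and session inputs on both servers and diff the output. The queries below are a minimal sketch, assuming SQL Server 2005/2008 (the physical_memory_in_bytes column was renamed in later versions); which settings matter for a given plan is a judgment call, so treat the column list as a starting point.

```sql
-- Hardware as the engine sees it (column names as of SQL Server 2005/2008).
SELECT cpu_count,
       hyperthread_ratio,
       physical_memory_in_bytes / 1048576 AS physical_memory_mb
FROM sys.dm_os_sys_info;

-- Instance-level settings that commonly influence plan choice.
SELECT name, value_in_use
FROM sys.configurations
WHERE name IN ('max degree of parallelism',
               'cost threshold for parallelism',
               'min server memory (MB)',
               'max server memory (MB)');

-- Session state: SET options of the user connections currently open.
SELECT session_id,
       program_name,
       quoted_identifier,
       arithabort,
       ansi_nulls,
       ansi_padding,
       ansi_warnings,
       concat_null_yields_null,
       date_format,
       date_first,
       [language]
FROM sys.dm_exec_sessions
WHERE is_user_process = 1;
```

Run the same three queries on Server A and Server B; any difference in the output is a candidate explanation for a plan difference.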

With the advent of virtualization, simulating the physical memory and the number of CPUs is not as big a deal as it used to be. I thought I would answer this question because I have been asked many times to explain why a restored backup of the same database doesn’t produce the same plan as Server A. The first step is always to create a statistics-only copy of your database and see if you can reproduce the plan that you see on your production server. If yes, then you can safely proceed to the next step of troubleshooting the bad plan, i.e. find the most expensive part of the plan and take the necessary steps to tune it, such as adding covering indexes, creating statistics, changing the join order, or adding query hints.

Very often the SQL CSS team requires a statistics clone of your database so that they can reproduce the issue in-house. The clone does not contain any data from the tables, only the database metadata. So, if you need to troubleshoot a performance issue where you suspect the query plan to be the culprit, you can also take a statistics clone and use it on a test server to check whether you can reproduce the so-called “bad” plan. I mention a test environment because sometimes it is not possible to troubleshoot a query performance issue on a production server. Generating a statistics clone for a few tables is also much faster than a backup and restore of the entire database.
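One way to check whether the clone reproduces the plan is to ask for the estimated plan without executing the statement, since the clone holds no rows anyway. The sketch below assumes a hypothetical dbo.Orders table and query; substitute the statement you are investigating. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- SET SHOWPLAN_XML must be the only statement in its batch,
-- hence the GO separators.
SET SHOWPLAN_XML ON;
GO

-- Hypothetical query under investigation. It is compiled but not executed;
-- the result set is the estimated plan as XML.
SELECT o.OrderID, o.OrderDate
FROM dbo.Orders AS o
WHERE o.CustomerID = 42;
GO

SET SHOWPLAN_XML OFF;
GO
```

Saving the XML from both servers makes it possible to compare the two plans side by side.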

You can use the information mentioned in the article below to create a statistics clone of your database:

How to generate a script of the necessary database metadata to create a statistics-only database in SQL Server 2005 and in SQL Server 2008
http://support.microsoft.com/?kbid=914288
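After building the clone, it may be worth confirming that the statistics really match production before trusting any plan comparison. The following is a rough sketch; dbo.Orders and IX_Orders_CustomerID are hypothetical names used only for illustration.

```sql
-- When each statistics object on the table was last updated.
SELECT s.name AS stats_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.Orders');

-- Header and histogram for one statistics object; run on both servers
-- and compare the row counts and histogram steps.
DBCC SHOW_STATISTICS ('dbo.Orders', IX_Orders_CustomerID)
    WITH STAT_HEADER, HISTOGRAM;
```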

How to retrieve data from Showplan XML for Query Compile Event

The Showplan XML for Query Compile profiler event in SQL Server can be very helpful when you see a lot of SPIDs in SQL Server 2005 waiting on the RESOURCE_SEMAPHORE_QUERY_COMPILE wait type.

RESOURCE_SEMAPHORE_QUERY_COMPILE waits: 

A large number of waits of type RESOURCE_SEMAPHORE_QUERY_COMPILE indicates a large number of concurrent compiles. SQL Server 2005 limits the number of compiles that can start at any given time to prevent the following scenario: a large number of compilations are initiated and memory is partially allocated to each of them; a large percentage of the queries then end up suspended (waiting) for resources to complete compilation, eventually timing out and releasing the memory allocated to them.

-- wait_duration_ms is a column of sys.dm_os_waiting_tasks, not sys.dm_exec_requests
select * from sys.dm_os_waiting_tasks
where wait_duration_ms <> 0 order by wait_duration_ms desc
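To complement the live view above, the cumulative figures for this wait type since the last restart can be read from sys.dm_os_wait_stats. This is a small sketch; what counts as "large" depends on your workload and uptime.

```sql
-- Cumulative compile-gateway waits since the instance started
-- (or since the wait statistics were last cleared).
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'RESOURCE_SEMAPHORE_QUERY_COMPILE';
```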

If you see a large number of session IDs waiting on this wait type, then it is time to identify which queries on your server are consuming a large amount of compile memory. In this post, I shall not talk in detail about the conditions that lead to this kind of wait. However, some information can be found in this blog post by the SQL Dev team:

http://blogs.msdn.com/sqlprogrammability/archive/2007/01/21/2-0-diagnosing-plan-cache-related-performance-problems-and-suggested-solutions.aspx 

You can also use this query to identify all the queries on the server instance waiting during Compilation: 

select sp.*, st.text from sys.sysprocesses sp CROSS APPLY sys.dm_exec_sql_text(sp.sql_handle) AS st WHERE sp.lastwaittype LIKE 'RESOURCE_SEMAPHORE_QUERY_COMPILE%' ORDER BY sp.waittime DESC;

In brief, the two conditions that can lead to this issue are: 

1. Multiple large ad-hoc queries being submitted (from the application) to the engine which leads to a compilation every time. 

2. A single large query being submitted to the engine which is throttling the execution of all other queries. 
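A quick way to gauge whether the first condition applies is to look at how the plan cache is split by object type. The sketch below is only an indicator: a high share of single-use "Adhoc" plans suggests the application submits many non-parameterized statements, each of which has to be compiled.

```sql
-- Plan cache breakdown by object type, with the number of plans
-- that have been used exactly once.
SELECT objtype,
       COUNT(*) AS plan_count,
       SUM(CAST(size_in_bytes AS bigint)) / 1024 AS total_size_kb,
       SUM(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END) AS single_use_plans
FROM sys.dm_exec_cached_plans
GROUP BY objtype
ORDER BY total_size_kb DESC;
```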

You can use a server side profiler trace to collect a trace with the Showplan XML for Query Compile Event. 
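A server-side trace for this event can be defined with the sp_trace_* procedures. The script below is a minimal sketch: the output path C:\Traces\compile_trace and the 512 MB file size are assumptions to adjust for your server. Event 168 is Showplan XML For Query Compile; the columns added are TextData (1), SPID (12), ObjectName (34) and DatabaseName (35).

```sql
DECLARE @traceid int, @maxfilesize bigint, @on bit;
SET @maxfilesize = 512;   -- MB per file (assumed value)
SET @on = 1;

-- Option 2 = file rollover. The .trc extension is appended automatically.
-- The path is hypothetical; the folder must exist on the server.
EXEC sp_trace_create @traceid OUTPUT, 2, N'C:\Traces\compile_trace',
     @maxfilesize, NULL;

-- Capture only the Showplan XML For Query Compile event (168).
EXEC sp_trace_setevent @traceid, 168, 1,  @on;   -- TextData
EXEC sp_trace_setevent @traceid, 168, 12, @on;   -- SPID
EXEC sp_trace_setevent @traceid, 168, 34, @on;   -- ObjectName
EXEC sp_trace_setevent @traceid, 168, 35, @on;   -- DatabaseName

-- Start the trace and return its id so it can be stopped later.
EXEC sp_trace_setstatus @traceid, 1;
SELECT @traceid AS traceid;

-- Later, to stop the trace and remove its definition:
--   EXEC sp_trace_setstatus <traceid>, 0;
--   EXEC sp_trace_setstatus <traceid>, 2;
```

Showplan events are expensive to capture, so keep the trace window short on a busy server.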

Once you have the profiler trace with you, you need to load the profiler trace into a database table using the query below. 

select identity(int,1,1) as rownumber,* into xmltraceall from fn_trace_gettable ('<location of the trace file>',default) where eventclass = 168

The EventClass filter imports only the Showplan XML for Query Compile events into the database table. After that, you need to run the query below to get the compile memory statistics for the queries running on your instance.

select databasename,objectname,sum(convert(int,substring(textdata, charindex('compilememory', textdata)+len('CompileMemory="'), charindex ('"', textdata, charindex('compilememory', textdata)+len('CompileMemory="'))-(charindex('compilememory', textdata)+len('CompileMemory="'))))) as Total_Compiled_Memory, 

avg(convert(int,substring(textdata, charindex('compilememory', textdata)+len('CompileMemory="'),charindex ('"', textdata, charindex('compilememory', textdata)+len('CompileMemory="'))-(charindex('compilememory', textdata)+len('CompileMemory="'))))) as Avg_Compiled_Memory, 

max(convert(int,substring(textdata, charindex('compilememory', textdata)+len('CompileMemory="'),charindex ('"', textdata, charindex('compilememory', textdata)+len('CompileMemory="'))-(charindex('compilememory', textdata)+len('CompileMemory="'))))) as Max_Compiled_Memory, 

min(convert(int,substring(textdata, charindex('compilememory', textdata)+len('CompileMemory="'), charindex ('"', textdata, charindex('compilememory', textdata)+len('CompileMemory="'))-(charindex('compilememory', textdata)+len('CompileMemory="'))))) as Min_Compiled_Memory, 

count(*) as Counts 

from xmltraceall 

group by databasename,objectname 

order by 3 desc
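As an alternative to the string parsing above, the plan text can be cast to XML and the CompileMemory attribute read with an XQuery expression. This is a sketch under two assumptions: that the captured text converts cleanly to the xml type (very deeply nested plans can exceed the xml nesting limit and fail the cast), and that the table is the xmltraceall table loaded earlier.

```sql
WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS sp)
SELECT TOP 20
       t.databasename,
       t.objectname,
       -- CompileMemory of the first QueryPlan element in each captured plan
       t.plan_xml.value('(//sp:QueryPlan/@CompileMemory)[1]', 'bigint') AS compile_memory_kb
FROM (
    -- TextData is loaded as ntext, so convert it before casting to xml.
    SELECT databasename,
           objectname,
           CAST(CAST(textdata AS nvarchar(max)) AS xml) AS plan_xml
    FROM xmltraceall
) AS t
ORDER BY compile_memory_kb DESC;
```

This lists the individual compilations that used the most memory, where the aggregate query above summarizes per object.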

Alternatively, on the server, you can use the query below to get details of the queries causing havoc on your SQL Server instance: 

WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS sp) Select stmt.stmt_details.value('(./sp:QueryPlan/@CompileMemory)[1]', 'bigint') as compile_memory_kb, 

stmt.stmt_details.value('(./sp:QueryPlan/@CompileTime)[1]', 'bigint') as compile_time_ms, 

stmt.stmt_details.value('(./sp:QueryPlan/@CompileCPU)[1]', 'bigint') as compile_cpu_ms, 

stmt.stmt_details.value('(./sp:QueryPlan/@CachedPlanSize)[1]', 'bigint') as CachePlanSize, 

p.objname, replace (replace (stmt.stmt_details.value('@StatementText', 'nvarchar(max)'), CHAR(10), ' '), CHAR(13), ' ') as query_text 

from (select cast (t.query_plan as Xml) showplan,t.dbid ,t.objectid, OBJECT_NAME (t.objectid, t.dbid) 'objname' from sys.dm_exec_cached_plans p1 cross apply sys.dm_exec_text_query_plan (plan_handle,DEFAULt,DEFAULT) t) as p 

cross apply showplan.nodes('//sp:StmtSimple') as stmt (stmt_details) 

order by 3 desc, 1 desc

However, please be advised that this is quite an expensive query and can cause additional performance issues on the server. I would suggest using the server side profiler trace method.

Missing Indexes Feature in SQL Server 2005

Indexes are essential in making sure that your queries have efficient query plans and that, for SELECTs, you don’t end up scanning heaps.
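To see which user tables in the current database are heaps (tables without a clustered index), a catalog query along the following lines can be used. It is a minimal sketch that lists the tables only; whether a given heap is actually a problem depends on how it is queried.

```sql
-- User tables stored as heaps in the current database.
SELECT s.name AS schema_name,
       t.name AS table_name
FROM sys.indexes AS i
JOIN sys.tables  AS t ON t.object_id = i.object_id
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE i.type = 0          -- 0 = heap
ORDER BY s.name, t.name;
```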

One of the biggest improvements in SQL Server 2005 is that the query optimizer records, while compiling queries, the indexes that could have proved beneficial for them, and exposes that list through dynamic management objects. Of course, one index might be beneficial for one query but detrimental to another. So, it is essential that we test the impact of these indexes in a test environment before rolling out index changes to production.

The missing indexes feature is on by default. No controls are provided to turn the feature on or off, or to reset any of the tables returned when the dynamic management objects are queried. When SQL Server is restarted, all of the missing index information is dropped. This feature can only be disabled if an instance of SQL Server is started by using the -x argument (one of the startup parameters) with the sqlservr command-prompt utility. However, along with the missing indexes, a lot of other useful events for performance troubleshooting will not be captured which is why this startup option is not recommended.

So, the first thing I would do if a query or a set of queries is running slowly in SQL Server is query the DMVs related to missing indexes and find out whether any missing indexes are reported for the tables that those slow-running queries touch. This feature becomes highly useful when the following conditions are true:
1. NO CPU bottleneck
2. NO Blocking on the server
3. NO Disk bottleneck

You can view a list of all the missing indexes using Performance Dashboard which can be used with SQL Server 2005 Service Pack 2 and above.

You can get the CREATE INDEX commands for the missing indexes recommended on your instance using the query mentioned here.
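A minimal sketch of such a query, built on the three missing index DMVs, is shown below. The improvement_measure formula and the IX_missing_ naming pattern are illustrative assumptions rather than an official recommendation, and the generated statements should be reviewed and tested before any of them is run.

```sql
SELECT TOP 25
       -- Rough ranking: estimated cost saved multiplied by how often
       -- the index would have been used.
       migs.avg_total_user_cost * (migs.avg_user_impact / 100.0)
           * (migs.user_seeks + migs.user_scans) AS improvement_measure,
       'CREATE INDEX [IX_missing_'
           + CONVERT(varchar, mig.index_group_handle) + '_'
           + CONVERT(varchar, mid.index_handle) + '] ON ' + mid.statement
           + ' (' + ISNULL(mid.equality_columns, '')
           + CASE WHEN mid.equality_columns IS NOT NULL
                   AND mid.inequality_columns IS NOT NULL
                  THEN ',' ELSE '' END
           + ISNULL(mid.inequality_columns, '') + ')'
           + ISNULL(' INCLUDE (' + mid.included_columns + ')', '')
           AS create_index_statement,
       migs.user_seeks,
       migs.user_scans,
       migs.avg_user_impact
FROM sys.dm_db_missing_index_groups AS mig
JOIN sys.dm_db_missing_index_group_stats AS migs
     ON migs.group_handle = mig.index_group_handle
JOIN sys.dm_db_missing_index_details AS mid
     ON mid.index_handle = mig.index_handle
ORDER BY improvement_measure DESC;
```

Because the underlying data is dropped when SQL Server restarts, the output is only meaningful after the instance has seen a representative workload.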

If you want to test an automatic determination of the index candidates for your workload, then have a look at the post below:
http://blogs.msdn.com/b/queryoptteam/archive/2006/06/01/613516.aspx

Related Links for Missing Indexes
About Missing Indexes
http://msdn2.microsoft.com/en-us/library/ms345524.aspx
Finding Missing Indexes
http://msdn2.microsoft.com/en-us/library/ms345417.aspx
Limitations of this feature
http://msdn2.microsoft.com/en-us/library/ms345485.aspx