Performance benchmarking: Explicit vs Implicit Transactions


imageBenchmarking… The single most important word in Performance Tuning. Without an established benchmark, you really don’t know when to actually call it a day!

If you have ever worked in testing on database projects, you would have probably done testing by trying to narrow down the data set involved which is nearly representative of the actual production data set or used a sub-set of the queries which are representative of the workload which is expected or is active on the production environment.

Now this brings me to the reason of this post. It is quite often during testing that we forget that SQL Server executes all DML operations by default in autocommit mode. This means that all individual statements are committed if they complete successfully. If you want to avoid this behavior, then you either need to set IMPLICIT_TRANSACTIONS setting to ON for your database connection or use a BEGIN TRANSACTION command before you execute your query.

SQL Server follows Write Ahead Logging protocol which means no data modifications are written to disk before the associated log record is written to disk. This maintains the ACID properties for a transaction. (when you involve disk-caching, that opens up another can of worms but that discussion is outside the scope of this blog post)

My belief is that if you are comparing execution times between two different environments, then you should be doing that on exactly the same hardware configuration and if that is not feasible, you should at-least match the CPU, physical RAM and disk sub-system on both sides. I had already documented in an earlier blog post why RAM/CPU configurations are important when comparison execution statistics between two different SQL Server environments. Ideally, you should have the same exact environment as your test environment including the workload (i.e. number of users, connections etc.). However, we all know that Utopia is not a place where we live in and hence the toned down test scripts, scaled down/up environments, shared resources and I could keep lamenting!!

In the last month, I dealt with two such issues where in a T-SQL batch performing a large number of inserts on a database table was being used to compare the performance between two different environments. Since I get called in to fix a problem and no-one generally calls CSS to tell us that their server is running as expected… the problem invariably happened to be that a bigger, beefier server was taking a longer time to execute the same batch. I shall demonstrate where not knowing about the WAL protocol can cause such test mechanisms to be skewed and take you down the proverbial rabbit-hole!

The script that I shall be using for demonstrating the pitfalls of this behavior is pretty simple:

declare @cntr int = 1
while @cntr <= 1000000
begin
insert into tblInserts (SNo,RowVal) values(@cntr,’Record ID ‘ +CAST(@cntr as varchar(7)))
set @cntr += 1
end

The script inserts 1 million rows into a database table (a heap table) using a while loop. During the time of the insert, I shall capture various performance counter values during the execution along with wait statistics for the query.

Modification: September 9th, 2011: Based on Kendra’s (Twitter) feedback, changing the sub-headings. Test 1 makes use of auto-commit mode of SQL Server which is the default and Test 2 can be achieved either by using implicit transaction mode or performing an explicit transaction (BEGIN…COMMIT).

Test 1: With AutoCommit mode

For two iterations the above script on an average took 8 minutes and 30 seconds to complete. When I looked into the wait statistics captured (at 5 second intervals), I don’t see anything standing out other than a few non-zero wait durations for WRITELOGs during certain periods. The non-zero wait times (in milli-seconds) are shown below with the highest value being captured being 10 milli-seconds. The Average Disk Secs/(Read/Write) don’t show me any outstanding values during the data capture to indicate that the disk was a bottleneck, then why does the data below show so many WRITELOG waits. Keep in mind that the total amount of time waited for as indicated by the data below is also not a significantly large amount. So why is it taking over eight minutes to insert the data??image

Runtime Session_id Status Wait_type Wait_time Last_Wait_Type
12:46 AM 51 suspended WRITELOG 10 WRITELOG
12:46 AM 51 suspended WRITELOG 7 WRITELOG
12:47 AM 51 suspended WRITELOG 1 WRITELOG
12:47 AM 51 suspended WRITELOG 2 WRITELOG
12:47 AM 51 suspended WRITELOG 3 WRITELOG
12:47 AM 51 suspended WRITELOG 2 WRITELOG
12:48 AM 51 suspended WRITELOG 7 WRITELOG
12:49 AM 51 suspended WRITELOG 8 WRITELOG
12:49 AM 51 suspended WRITELOG 1 WRITELOG
12:49 AM 51 suspended WRITELOG 10 WRITELOG
12:49 AM 51 suspended WRITELOG 6 WRITELOG
12:50 AM 51 suspended WRITELOG 2 WRITELOG
12:51 AM 51 suspended WRITELOG 3 WRITELOG
12:51 AM 51 suspended WRITELOG 4 WRITELOG
12:51 AM 51 suspended WRITELOG 1 WRITELOG
12:51 AM 51 suspended WRITELOG 2 WRITELOG
12:52 AM 51 suspended WRITELOG 5 WRITELOG
12:52 AM 51 suspended WRITELOG 7 WRITELOG
12:53 AM 51 suspended WRITELOG 1 WRITELOG
12:53 AM 51 suspended WRITELOG 8 WRITELOG

To explain the query, I would need to fall back of SQL Server Performance Monitor counters (sampled at 5 second intervals). On analysis of the performance monitor counters, I found the following:
a. SQLServer:Databases: Log Bytes Flushed/sec showed that on an average 3.1 MB of log bytes were being flushed every 5 seconds for the database on which I was performing the inserts.
b. SQLServer:DatabasesLog Flushes/sec showed that about 6000 log flushes were occurring for this database every 5 seconds on an average.
c. SQLServer:Wait Statistics: Log write waits shows me that on an average there were about 9000+ waits started per second. However, the Cumulative wait time (ms) per second for the same counter shows me negligible values and the Average wait time (ms) value is nearly zero through the data collection period.

So how can I explain where the time is being spent?? Now I decided to run another test using implicit transactions.

imageTest 2: Without AutoCommit mode

I ran two iterations of the above T-SQL batch within BEGIN TRANSACTION…COMMIT block and the average duration was 14 seconds! Whattttt!??!?! Yes.. And all this can be simply explained using the same counters that I looked above. This time around the performance monitor data told me the following story:
a. SQLServer:Databases: Log Bytes Flushed/sec showed that on an average 26 MB of log bytes were being flushed every 5 seconds for the database on which I was performing the inserts.
b. SQLServer:DatabasesLog Flushes/sec showed that about 468 log flushes were occurring for this database every 5 seconds on an average.
c. SQLServer:Wait Statistics: Log write waits shows me that on an average there were about 23(approx.)+ waits started per second.

If you look at the Performance Monitor graphs for the disk statistics that I captured for a single run for Test 1 (screenshot above) and Test 2 (screenshot on the left), you will see that the %Idle Time for the disk, on which the database files resided on (in my case F: drive), shows was busy 50% of the time (see blue squiggly above) during the test and the value climbed back to ~100% after the test completed. On the contrary, the %Idle Time for the same disk has a very zig-zag pattern (see black squiggly on left) during Test 2 which indicates that the disk was sporadically busy and it was not a continuous pattern as observed in Test 1.

The Avg. Disk Sec/Write counter shows absolutely no latency which means that the there is no problem in terms of latency on the disks.

During Test 1, data was being written to the disk at the rate of 907 KB/s where as during Test 2, the write rate was 5MB/s which was determined by monitoring the Disk Write Bytes/sec counter.

The reason for this difference is that for every insert in Test 1 is followed by an autocommit. This means that you need to flush the log buffer for each insert. However in Test 2, the log buffer was being flushed much lesser but at the same time more data was being written to the disk for each log flush. Since SQL Server follows a WAL protocol, till the earlier log flush is completed, it cannot commit/move onto the next transaction.

If you are running a similar system with two different sets of hardware, then you would start having to chase down disk speeds/RPMs between the two servers. For servers which have disks performing optimally, this behavior is quite difficult to gather without the right set of data.

With SQL Server 2008, tracking down the waits for a single session is much, much easier with the help of XEvents. But that shall be a story for another day… errrr.. post!!

Moral of the story: If you are using a test query similar to the one shown above to test/benchmark performance and the query batch is not indicative of your actual workload, then you would probably not be able to gauge or establish an accurate performance benchmark for the queries executing on your system. And REMEMBER that SQL Server runs in auto-commit mode by default!

HAPPY TESTING! Smile

Advertisements

PREEMPTIVE_OS_GETPROCADDRESS: Are you confusing me?


PREEMPTIVE_OS_GETPROCADDRESS is a new wait type that was introduced in SQL Server 2008 to track time taken by GetProcAddress to load the entrypoint in the DLL when Extended Stored Procedure calls are made by the SQL Server instance. However, the way the wait time is tracked by this wait type has a little quirk which Rob Dorr had documented in a blog post.

Based on a discussion on #sqlhelp thread on Twitter, I found the need to demonstrate this unique quirk with the help of an example. I am using the following command to list you the sub-directories on my C: drive using the command:

exec master..xp_dirtree ‘C:\’

Next I monitored the wait statistics for the session executing this Extended Stored procedure using the T-SQL script below with a 1 second delay:

SET NOCOUNT ON
WHILE (1=1)
BEGIN
SELECT [session_id], [status], [command], [start_time], [cpu_time], [reads], [writes], [wait_resource], [wait_time], [last_wait_type], [wait_type]
FROM sys.dm_exec_requests
WHERE [session_id] = 53 -- Session ID that is executing the Extended Stored Procedure

WAITFOR DELAY '00:00:01'
RAISERROR ('', 10, 1, N'Waits');

END

The output of this script was as follows:

session_id status        command          start_time              cpu_time    reads    writes      wait_time   last_wait_type                        wait_type

53         runnable      EXECUTE          2011-07-20 00:59:44.330 62          0        0           712         PREEMPTIVE_OS_GETPROCADDRESS          PREEMPTIVE_OS_GETPROCADDRESS

53         running       EXECUTE          2011-07-20 00:59:44.330 234         0        0           1713        PREEMPTIVE_OS_GETPROCADDRESS          PREEMPTIVE_OS_GETPROCADDRESS

53         runnable      EXECUTE          2011-07-20 00:59:44.330 452         0        0           2714        PREEMPTIVE_OS_GETPROCADDRESS          PREEMPTIVE_OS_GETPROCADDRESS

As you can see above the wait time keeps increasing even though I can see that the session is performing work due to the following results:
a. The results window in SSMS for the session executing the extended stored procedure was churning out results
b. Using a Process Monitor, I could trace the filesystem activity on the C: drive for the SQL Server process

You could run into this issue while using any Extended Stored Procedure (system or user) for a SQL Server 2008 or SQL Server 2008 R2 instance. Since I know that the session is performing work, the only two other things that I would check if there are resource bottlenecks w.r.t. memory or I/O if some filesystem related activity is being performed by the extended stored procedure. Most of the issues where I have see this wait type causing confusion is when the extended stored procedure is performing disk related activity. At that time, Process Monitor will put your doubts to rest along with the information that I mentioned above to confirm that that the procedure is actually doing some work and not stuck or perceived as “hung”.

T-SQL Tuesday #20 Wrap-up and a few smiles


I had the privilege of hosting this month’s T-SQL Tuesday which is a recurring blog party, the brainchild of MVP Adam Machanic (blog|twitter). It goes without saying that the posts this time were a wonderful read and it’s very encouraging to see such brilliant engagement from the community. A big “thank you” to all the people who took the time to pen down a post for this month’s topic: “T-SQL Best Practices”. If you are a newcomer to the world of T-SQL or a seasoned pro, the insights shared in these posts are definitely a good read if you work with SQL Server.

Below is a round-up of all the posts from this month (in chronological order of comments appearing on my invitation post):

1. Jason Bacani [Blog | Twitter] writes about why T-SQL formatting is best as well as good practice and makes it easier for the person supporting the code that you have written. I feel motivated on working on the code only if I like looking at it.

2. Rob Farley [Blog | Twitter], a popular SQL Server MVP, talks about why understanding query plans can help in writing better and more efficient T-SQL code. Totally agree with Rob here and from experience I can say that this goes a long way in writing efficient T-SQL code.

3. Dirk Wegener [Blog] talks about why including comments to explain your code logic and reiterates why T-SQL formatting is important in his first T-SQL Tuesday post. Way to go Dirk and we look forward to your participation in the future.

4. Noel McKinney [Blog | Twitter] talks about coding best practices which are quite pertinent in today’s development world. I completely agree that everything shouldn’t be done through T-SQL. Sometimes exploring other avenues may be the right answer!

5. Pinal Dave [Blog | Twitter], a friend and now a colleague at Microsoft, talks about practices that can be avoided in environments generally and not just from a T-SQL perspective.

6. Thomas Rushton [Blog | Twitter] talks about coding best practices, some of which is generally applicable to the world outside T-SQL as well. Absolutely loved the quote at the end of your post!

7. Rich Brown [Blog | Twitter] shows with an example how a developer managed to get into trouble by using an ORDER BY construct for an INSERT statement.

8. Sebastian Meine [Blog | Twitter] shows why an interface layer is required between the application and database. This is quite difficult to achieve due to various reasons (some of it maybe just ignorance sometimes) but if you can pull it off, it really does save a lot of time during a post-deployment crisis scenario.

9. Muthukkumaran Kaliyamoorthy [Blog | Twitter] talks about a list of DOs and DON’Ts while writing T-SQL code.

10. Nick Haslam [Blog | Twitter] talks about how multiple database files can help improve throughput of your DML queries for large data warehouse with the help of some TPC-H test numbers that he shares in the same post.

11. Marcos Freccia [Blog | Twitter] talks about a recent performance problem he resolved by avoiding an implicit convert operation in the query plan with the help of a small change in his T-SQL query. An example of how following best practices can help performance.

12. Matt Velic [Blog | Twitter] talks about best practices to follow while writing T-SQL code from a readability, formatting and code clarity standpoint. Supporting and fixing a code is much faster if it’s easy to read!

13. Robert Matthew Cook [Blog | Twitter] shows how not following simple best practices led to a cascading effect where a stored procedure failed to execute. He goes the extra mile to explain why the issue happened!

14. Erin Stellato [Blog | Twitter] explains why following T-SQL coding standards and adding comments to your code can help save painful hours of searching and brain-wracking during the time of a crisis.

15. Josh Feierman [Blog | Twitter] talks about what can be done in the current scheme of things with the business angle kept in mind. And what he talks about in his post are ground realities and what can be done to counter them to make us efficient in what we do without having sleepless nights while handling a critical issue.

16. Aaron Bertrand [Blog | Twitter], a popular MVP, talks about some of the best practices that should be followed while writing T-SQL code. He provides examples on what kind of issues you could run into when such practices are not followed.

17. Jack Vamvas [Blog | Twitter] gives an example and shows how filtered indexes come to the rescue when T-SQL code which doesn’t follow best practices cannot be tuned due to support and license agreements.

18. Gabriel Villa [Blog | Twitter] talks about T-SQL coding best practices which would help in better management of the code base in the post-deployment phase. This would definitely lead to less consumption of aspirins for the team managing and supporting the code.

19. Nancy Hidy Wilson [Blog | Twitter] talks about adding proper comments to your T-SQL code in a crisp and concise post.

20. Ricardo [Blog | Twitter] re-iterates why comments in the code are necessary but they necessarily don’t need to be short stories. Additionally, he mentions about testing the T-SQL code to ensure that it works in production and not just on the Development box that you are using it write the code on. The post is in Spanish but Bing Translate helped me read the English version of the post and I did have a smile on my face while reading through it.

21. Colleen Morrow [Blog | Twitter] explains with an example why using functions in the predicate of a T-SQL query can be bad karma for your code.

22. Carlos Bossy [Blog | Twitter] explains in his post why T-SQL code made to function in a procedural manner is not probably the best approach.

23. Jes Borland [Blog | Twitter] shares her top three T-SQL best practices and how keeping it simple is the best idea!

Once again, my sincere thanks to all the SQL folks that contributed to this month’s T-SQL Tuesday! And also, my thanks to Adam for letting me host this virtual blog party for this month!

Since a major part of my daily job involves troubleshooting SQL Server performance related issues, I do come across T-SQL code which from a best practices standpoint has a lot to be desired. So, I would say that the next few lines are my 2 cents from what I have seen based on my experience but I ‘m afraid it’s going to turn out to be a $1 worth of rambling!

1. If your company has established coding standards/formats, follow them – This helps keep consistent across the board and the standards and formats were put in place majority of the time with a good intention in mind. If there aren’t any defined, then define a set for yourself and follow them!
2. Use some form of source-control – It could be as simple as adding comments in the procedure code itself. It really saves time and helps blood pressure levels stay normal when trying to find out why a small change in the code was made and when, which is the current cause of the performance bottleneck!
3. Add comments/description to your code – I am yet to come across a production application which has a single person in-charge of the production, support, maintenance and version upgrades. Your code more often than not will be supported and maintained by someone else. Help that person understand why that piece of code is there and what is it supposed to do. What is obvious to you may not be obvious to someone else. So, rather than having a DBA curse you at an unearthly hour, mention (a single line is sufficient most times) why a best practice was not followed.
4. T-SQL is not the answer for everything – There are some things that can probably be done much better through code in some other language like C#. Don’t be afraid to explore such an option. At the end of the day, SQL Server is a database platform and NOT an application server. (*ducks and runs*)
5. It doesn’t hurt to look under the hood – It’s not necessary to know how the query optimizer works or how to read a query plan or even how indexes should be designed for tables for writing T-SQL code. But if you did understand some of this if not a majority of it, you will find that you end up writing much more efficient T-SQL code than you used to when you didn’t know some of under-the-hood workings of the database engine.

Do not misunderstand the above points to mean that I write the best T-SQL code can be written. All I am trying to say is that write T-SQL code which is easy to read, decipher and support! Even then as a Developer you might not become a DBA’s best friend overnight but you would have taken a giant step in getting there and bridging that chasm that exists between the two worlds! Smile

While reading Thomas Rushton’s [Blog | Twitter] submission this month I came across the quote mentioned below. Going forward, this is going to be my punch-line when I talk about best practices. If you are one of those individuals where a visual aid helps, then the above picture should probably aid in following best practices while writing T-SQL code.

Always code as if the person who ends up maintaining your code is a violent psychopath who knows where you live.

*courtesy Jeff Atwood (blog|twitter)

In case you want to keep getting updates on what is happening on my blog, then you can do so using one of the icons below (Facebook page, Twitter or RSS Feeds).

Facebook Fan PageTwitterRSS Feeds

Disclaimer: I have provided the correct links to your blog, twitter accounts and posts to the best of my knowledge. In case there are discrepancies, please leave a comment and I shall make the necessary corrections. Such an error was purely unintentional!