This post is going to be a change of pace for the blog. I want to talk about how I approach performance tuning, specifically when talking about queries. This won’t be a tutorial, just somewhat of a journal post.
The other day, I was working on optimizing a query in a development environment with another database administrator (DBA). We were comparing execution plans for a new index and looking to see if our query wanted the index. We created several variations of the index and measured whether SQL Server would use the index when running a SELECT statement against the table. Once the query used the index, we decided to look at SQL Server’s IO Statistics . We compared several variations of the index before I noticed that we had been running a SELECT * against the table.
I think this was an example where we should have been testing with an actual workload. My thought process is that a SELECT * is usually not a good representation of the table. If a table is very wide (meaning it has many columns), SQL Server may choose to scan the Clustered Index of a table instead of using a well-designed Nonclustered Index. In the system I was working on, most queries will have a specific set of columns that they want to load. We didn’t keep our testing consistent with the actual system workload.
I’m going to try to make a demo of this scenario in a future blog post. In the mean time, here’s a list of the things that I think are important for consistent testing of performance:
- Same version of SQL Server, or at least set to similar server settings
- Table structure including data types
- Same indexes
- Same set of data used for testing
- If possible, same system memory and CPU
- If possible, similar existing Execution Plan Cache
I listed the last two as possible because sometimes it’s not possible to have the same hardware in Dev as Prod, and I don’t think it’s realistic to expect that on larger SQL Servers. In this case, simply generating an Estimated Execution Plan from the production environment can be good enough.