As part of my current job at Microsoft, I work with SQL customers worldwide, ranging from fortune 500 financial institutions to online retailers to small partner outfits. I am engaged primarily to conduct performance tuning of their SQL Server systems.
When a system has a performance problem, and time is short (as it inevitably is), it's tempting just to "kill it with iron" – add more memory, more CPU horsepower, bigger and better disk subsystems, and so on. The problem is that if your design is fundamentally flawed then there is a strong chance that your application will not scale: doubling your horsepower will not double the performance. You might end up throwing a lot of hardware (and a lot of money) at the problem in order to reach your performance target.
Instead, your first port of call should be with the design of the application, and probably the "lowest-hanging fruit" here – the option that produces maximum gains with least impact on existing systems and processes – is to examine your indexing strategy.
When you run a query, SQL Server query optimizer's job is to find the best possible execution plan (based on internal costing algorithms) to provide the best response time. If the indexes required for implementing this optimal execution plan are not present, then the query optimizer is forced to choose a sub-optimal execution plan, for which indexes do exist. This could cause SQL Server to do more I/O and result in slower query performance. The need to have the right indexes in place cannot be overemphasized.
However, the task of identifying the right indexes is not necessarily straightforward. It requires a sound knowledge of the of the sort of queries that will be run against the data, the distribution of that data, and the volume of data, as well as an understanding of what type of index will suit your needs best. While this understanding is still essential, SQL Server 2005 does offer a helping hand in the form of some new tools – principally the Database Engine Tuning Advisor – that can help you determine, tune and monitor your indexes.
In this article, I will demonstrate how to use these tools to get answers to the following questions:
Which indexes do I need for my queries?
How do I monitor index usage and their effectiveness?
How do I identify redundant indexes that could negatively impact performance of my DML queries (insert, updates and deletes)
As workload changes, how I do I identify any missing indexes that could enhance performance for my new queries?
Finding the right indexes for your workload
Determining exactly the right indexes for your system can be quite a taxing process. For example, you have to consider:
Which columns should be indexed (based on your knowledge of how the data is queried)
Whether to choose a single-column index or a multiple column index
Whether you need a clustered index or a non-clustered index
Whether or not (in SQL 2005) you could benefit from an index with included columns to avoid bookmark lookups
How to utilize indexed views (which the optimizer might access instead of the underlying tables to retrieve a subset of your data)
All of these options increase the complexity of identifying the right indexes for your system. You might be tempted to drop indexes in everywhere "just in case", but one has to remember that with the introduction of every new index for improving the performance of the SELECT workload, the query optimizer has to do more work when running DML statements (insert, update or delete), as now it has to reorganize this index to accommodate for the newly updated data in the table. There is a balance to be struck here.
Even once you've determined the prefect set of indexes, your job is not finished. Your workload will change over time (new queries are will be added, older ones removed from the application) and this might warrant revisiting existing indexes, analyzing their usage and making adjustments (modifying/dropping existing indexes or creating new ones). Maintenance of indexes is critical to ensuring optimal performance in the long run.
SQL 2005 provides the following tools and features to help you find the right indexes for your queries, and then monitor and tune them:
Database Engine Tuning Advisor (DTA)
STATISTICS XML output
Dynamic Management Views (DMVs)
I will describe each one of them in the subsequent sections.
Using Database Engine Tuning Advisor (DTA)
The DTA is a new tool in SQL 2005 which replaces the Index Tuning Wizard in earlier versions of SQL Server. DTA can analyze both OLTP and OLAP workloads. You can either tune a single query or the entire workload to which your server is subjected. Based on the options that you select, you can use the DTA to make recommendations for several Physical Design Structures (PDS), which include:
Clustered indexes
Non-clustered indexes
Indexes with included columns (to avoid bookmark lookups)
Indexed views
Partitions
Let's walk through the steps involved in using DTA effectively.
Generating the DTA workload
The first step is to collect a workload for DTA to analyze. You can do this one of two ways:
Using Management Studio – if you need to optimize the performance of a single query, you can use Management Studio to provide an input to DTA. Simply type the query in Management Studio, highlight it and then right click on it to choose Analyze in Database Engine Tuning Advisor. This will launch DTA with your query as the input.
Using Profiler – if you want to want to determine the optimum index set for the entire workload of your server, you should collect a profiler trace using the built-in TUNING template.
To unleash the true effectiveness of DTA, you should always use a representative profiler trace. Make sure that you subject your server to all the queries that will typically be run against the data, while you are collecting the trace. This could lead to a huge trace file, but that is normal. If you simply collect a profiler trace over a 5-10 minute period, you can be pretty sure it won't be truly representative of all the queries run against your database.
The TUNING template captures only minimal events, so there should not be any significant performance impact on your server.
As part of the code download for this article, I've provided a profiler trace (Workload2Analyze.trc) that was captured with the TUNING template while running numerous queries ( Queries4Workload.sql) against the AdventureWorks database. I would strongly recommend that you use this trace, or the one from your server, to get a hands-on perspective of DTA.
Consuming the workload using DTA
Having collected a truly representative profiler trace, you can use it as an input to DTA, which will then generate recommendations. You can perform one of the following two types of analysis.
SCENARIO I: Keep my existing PDS and tell me what else I am missing
This type of analysis is common and is useful if you have previously established the set of indexes that you deem to be most useful for your given workload, and are seeking further recommendations. To conduct this analysis:
Launch a new session in DTA.
Choose the profiler trace as the input to this session.
In the Select databases and tables to tune section select your target database (AdventureWorks, if you are using my script).
In the Database for workload analysis dropdown, you can either use the same database or Tempdb (or any other database). I generally use Tempdb for his purpose.
At the Tuning Options tab, select the following options:
Physical Design Structures to use in database -> Indexes and Indexed views
Physical Design Structures to keep in database -> Keep all existing PDS
Uncheck the checkbox for limit tuning time.
Hit START ANALYSIS and DTA will start consuming your workload.
Once DTA is finished consuming the workload, it will list all its recommendations, under the Recommendations tab. We will talk more about reading and implementing DTA's recommendations in the section Reading output from DTA.
SCENARIO II: Ignore my existing PDS and tell me what query optimizer needs
In scenario I, DTA makes recommendations for any missing indexes. However, this doesn't necessarily mean your existing indexes are optimal for the query optimizer. You may also consider conducting an analysis whereby DTA ignores all existing physical design structures and recommends what it deems the best possible set of PDS for the given workload. In this way, you can validate your assumptions about what indexes are required.
To conduct this analysis, follow steps 1 to 6 as above, except that at step 5b, choose "Do not keep any existing PDS".
Contrary to how this might sound, DTA will not actually drop or delete any existing PDSs. This is the biggest advantage of using DTA, as it means you can use the tool to perform what-if analysis without actually introducing any changes to the underlying schema.
A note on the DTA Tuning Log
As it consumes the workload, DTA will store in the DTA Tuning log (found under the Progress tab, in the bottom section) details of any errors it encounters.
Typically, you might see a message of the form "XX% of the consumed workload has syntax errors". The tuning log will provide further details of these errors, many of which can be safely ignored. For example:
Statement doesn't reference any table – statements such as SET or DECLARE in your workload will cause these errors, and they can generally be ignored
Statement references only small tables – DTA will not tune a query if it references a small table (10 data pages or less)
Incorrect syntax or object related errors – if these errors indicate queries statments containing keywords such as BEGIN, TRY, INSERTED then you can ignore them.
To learn more about how to analyze the tuning log, refer to About the Tuning Log and Determining Whether Events Can Be Tuned on BOL
Reading output from DTA
After consuming the workload, DTA presents, under the Recommendations tab, a set of recommendations for tuning your PDS. I tend to focus on the following sections:
Recommendation – this is the action that you need to take. Possible values include Create or Drop.
Target of Recommendation – this is the proposed name of the PDS to be created. The naming convention is typical of DTA and generally starts with _dta*. However, I recommend that you change this name based on the naming convention in your database.
Definition – this is the list of columns that this new PDS will include. If you click on the hyperlink, it will open up a new window with the T-SQL script to implement this recommendation.
Estimated Improvements – this is the estimated percentage improvement that you can expect in your workload performance, if you implement all the recommendations made by DTA.
Space used by recommendation (MB) – under the Tuning Summary section of the Reports tab, you can find out the extra space in MB that you would need, if you decide to implement these recommendations.
Inbuilt analysis reports
There are 15 inbuilt reports under the Reports tab. However, the following three reports are the most important.
NOTE:
For information on these and the other 12 reports, visit the article Choosing a Database Engine Tuning Advisor Report on MSDN.
Index Usage Report (current)
Start with this report to see how your existing indexes are being used by the queries running against your server. Each index that has been used by a query is listed here. Each referenced index has a Percent Usage value which indicates the percentage of statements in your workload that referenced this index.
If an index is not listed here, it means that it has not been used by any query in your workload.
TIP: identifying indexes that are not used
If you are certain that all the queries that run against your server have been captured by your profiler trace, then you can use this report to identify indexes that are not required and possibly delete them.
Index Usage Report (recommended)
Next, look at this report to identify how index usage will change if the recommended indexes are implemented. If you compare these two reports, you will see that the index usage of some of the current indexes has fallen while some new indexes have been included with a higher usage percentage, indicating a different execution plan for your workload and improved performance.
Statement Cost Report
This report lists individual statements in your workload and the estimated performance improvement for each one of them. Using this report, you can identify your poorly performing queries and see the sort of improvement you can expect if you implement the recommendations made by DTA.
You will find that some statements don't have any improvements (Percent improvement = 0). This is because either the statement was not tuned for some reason or it already has all the indexes that it needs to perform optimally.
Implementing DTA's recommendations
By now, we have collected a workload using Profiler, consumed it using DTA and got a set of recommendations to improve performance. You then have the choice to either:
Save recommendations – you can save the recommendations in an SQL script by navigating ACTIONS | SAVE RECOMMENDATIONS. You can then manually run the script in Management Studio to create all the recommended PDS.
Apply recommendations using DTA – if you are happy with the set of recommendations then simply navigate ACTIONS | APPLY RECOMMENDATIONS. You can also schedule a later time to apply these recommendations (during off-peak hours, for example)
More often than not I use the Save Recommendations option. It means that I can easily compare the resulting PDS set for several scenarios. The recommendation reports are also useful for future reference. As the workload on the server changes, you need to re-evaluate the index usage and make corrections (drop redundant indexes or create new ones). You can compare the index usage on a periodic basis (may be quarterly or so) and see how the usage is changing.
Performing what-if analysis using DTA
This is a very cool feature of DTA. Suppose you don't want to apply all the recommendations that DTA provided. However, since the Estimated Improvement value can only be achieved if you apply all of these recommendations together, you are not really sure what kind of an impact it will have if you only choose to apply a sub-set of these recommendations.
Deselect the recommendations that you don't want to apply. Now, go to ACTIONS | EVALUATE RECOMMENDATIONS.
This will launch another session with the same options as the earlier one. However, now when you click on START ANALYSIS, DTA will provide data on estimated performance improvements, based on just this sub-set of the recommendations.
Again, the key thing to remember is that DTA performs this "what-if" analysis without actually implementing anything in the database.
TIP: checking for redundant objects left behind by DTA
In theory, DTA should not leave any extra "hypotetical" PDS (indexes etc.) in your database, after the analysis is complete. However, you can easily check for this by running the following query:
select name from sys.indexes where is_hypothetical = 1Using STATISTICS XMLIf you just wish to determine the optimum index set for a particular query, then the STATISTICS XML option provides a quick alternative to DTA.When a query is executed with STATISTICS XML enabled, SQL Server not only returns the query result-set, but also executes T-SQL statements that retrieve the execution plan for the query, in the form of an XML file (Showplan.xml). For example, try executing the following SQL statement:SET STATISTICS XML ONGOUse AdventureworksSELECT CustomerID, PurchaseOrderNumber, TotalDueFROM Sales.SalesOrderHeaderWHERE ShipMethodID > 2AND TotalDue > 200.00AND TerritoryID = 3;GOSET STATISTICS XML OFFGOBelow the query results, you should see a link to an XML file. The most important elements to look for in the XML file are:
Tidak ada komentar:
Posting Komentar