Tuesday, June 5, 2018

How to Use SQL Plan Management By: Nigel Bayliss | Product Manager

Introduction

In December I posted an entry on the subject of SQL plan management (SPM). It was inspired by questions raised at Oracle Open World 2015 and since then I’ve been talking to more customers about it, particularly in the context of upgrades. As you might guess, I get a lot of questions on the topic of how to use SPM in practice, so it seems like a good time to collate my answers and give you some pointers on how to get the best from the feature. If you're not familiar with SPM, there’s a four-part blog series to get you up to speed.
I’m going to make the assumption that your system is a suitable candidate for implementing SPM. This implies that you have a finite set of repeatable queries (or at least a finite set of critical queries). In SPM terminology, a repeatable query is one that is executed multiple times without the query text being modified, so this will make it practical and worthwhile to capture and stabilize its execution plan. This usually means that the application will need to use bind variables and not incorporate literal values in query strings (unless cursor sharing is being used to address that). If your application does use literal values for some critical queries, consider using SQL profiles with the “force_match” parameter.

Choosing between a strategic or tactical approach

SPM has been designed so that it can be used strategically. In other words, it will stabilize the execution plans of all your repeatable SQL statements and prevent them from encountering performance regression. If you need to avoid regression at all costs then a strategic, “SQL-plan-baseline-everything” approach is often the most appropriate choice. You should definitely consider creating SQL plan baselines for all queries if you are upgrading and want to continue to use the same SQL execution plans. After upgrade, you can evolve SQL plan baselines once the system is established, taking advantage of new optimizations and better execution plans in an incremental and verified way.
SPM can be implemented in a more tactical manner; using it to target a subset of your workload. This is appropriate if you are confident that you can identify a core set of critical queries and you can be sure that regressions outside this set are unlikely to significantly harm your service levels.

Using SPM proactively or reactively

You might have seen the following image in our documentation; comparing SPM with SQL profiles:
SQL Plan Management vs SQL Profiles

You can see that SPM is usually used proactively to prevent the use of suboptimal plans, whereas SQL Profiles are usually used reactively to correct problems once they have occurred. But don’t assume that this guidance is cast in stone. If you’ve read the earlier SPM series you’ll know that you can use SPM to “repair” queries that have regressed. In my experience, many DBAs use SPM reactively to correct individual queries, avoiding the need to change the application in any way (something that every DBA wants to have in their tool-box).
For the rest of this post, I’ll assume that SPM is being used proactively and that there are SQL plan baselines for all (or a significant part) of your workload.

When to capture SQL plan baselines

Capture queries from your workload when the system is performing well so that, by definition, all of the captured SQL execution plans will be “good” execution plans. SPM auto capture is a very popular choice with DBAs and it's particularly easy to use, but be aware that it will add some overhead to your workload because the execution plans are captured and stored at (hard) parse time. This isn't usually a problem because hard parse rates are generally low of course, but if you're concerned about it you can always capture in bulk from the cursor cache or from SQL tuning sets at any time. Make sure that you create accepted SQL plan baselines for every captured query and if a query has multiple execution plans (because of adaptive cursor sharing) go ahead and accept them all. This approach requires little manual intervention and there is no need to micro-manage which queries to accept.
SQL plan baseline capture is an ongoing-process, but Oracle recommends that you don’t switch on auto-capture indefinitely. Instead, capture new queries when changes have been made to applications and workloads. Bulk capture (from the cursor cache or from SQL tuning sets) is a good on-going approach too. For all types of capture, SQL plan baselines will be created without requiring you to figure out which ones are new and which ones already exist – SPM takes care of that for you.
If you are upgrading a critical system to Oracle Database 12c, then this is a perfect time to consider creating SQL plan baselines for all your queries. You can do this using the pre or post-upgraded system (see here for details).

Implementing SPM and backing out SQL plan baselines

There is always risk associated with making changes to a production system, and change management procedures will usually demand that you have a back-out procedure. It is reasonable to consider what would happen if you add or enable a large number of SQL plan baselines and you encounter problems of some kind. To help you in these circumstances, the DBA_SQL_PLAN_BASELINES view includes timestamps for creation and modification (as well as the origin of the SQL plan baseline). You can use this information to identify recently added and changed SQL plan baselines. The DROP_SQL_PLAN_BASELINE and ALTER_SQL_PLAN_BASELINE procedures are very efficient so, for example, most systems will be capable of enabling and disabling hundreds of SQL plan baselines per second.

Fixing SQL plan baselines

Individual SQL plan baselines can be fixed. This means that new plans will not be captured automatically for SQL statements that have fixed SQL plan baselines (although you can still load new plans manually and make them fixed if you want to). Be aware that if a SQL plan baseline is fixed for a particular query, but other non-accepted plans in the SQL plan history were previously captured (for this query) then these plan history entries will be subject to evolution. I have heard of DBAs encountering this, but it is a reare scenario - it is more usual to fix all baselines for a given SQL statement if you were considering fixing at all.
Some DBAs use SPM very tactically, creating and fixing SQL plan baselines for a carefully chosen set of queries with the intention of never evolving them. However, before fixing SQL plan baselines, consider the possibility of using a slightly different approach. Instead, don't fix them, but allow SPM to capture alternative plans in the SQL plan history. The trick then is to use the SPM auto evolve task to report on the new SQL execution plans without accepting them by default. Like this:
DBMS_SPM.SET_EVOLVE_TASK_PARAMETER(
    task_name => 'SYS_AUTO_SPM_EVOLVE_TASK'
,   parameter => 'ACCEPT_PLANS'
,   value     => 'FALSE'
);
Now you have the option to view the evolve task report at any time and experiment with new plans, accepting the better ones individually and manually if you prefer.

How and when to evolve

If SQL plan baselines have been comprehensively captured for a representative workload, the number of new plans identified is usually very low. In this case, the motivation to evolve plans will also be low. Nevertheless, evolution is still worthwhile to take into account new plans that are generated by change (such as changes to underlying schema objects, application changes or configuration changes).
In Oracle Database 12c, the evolve process is fully automated and enabled by default. We want you to be confident of using SPM with very little manual intervention, but I know that automation has surprised a number of DBAs that had originally intended to implement evolution down-the-line. If you want to postpone evolution for whatever reason, then the best approach is usually to set the evolve auto task parameter ACCEPT_PLANS to FALSE (see above). New excution plans will be verified and you can view the auto task report at any time to identify improvements and then be as selective as you like about which ones to accept using the evolve API on a case-by-case basis. For example:
DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE( … sql_handle=>'...', plan_name=>'...' …);
When upgrading a database, a strategic SPM approach will enable you to carry forward your existing SQL execution plans into the new release. Once the new environment is established, it is likely that you will capture a large number of new execution plans in the SQL plan history. After all, the Optimizer will be different. The motivation to evolve baselines is likely to be stronger because it makes good sense to make full use of the most up-to-date Optimizer but you might want it to verify that it is better (and find out why if it’s not). For this scenario, it is not unreasonable to use ACCEPT_PLANS = FALSE to help you identify improved SQL execution plans to evolve individually. This will give you the opportunity to gain trust in evolution before you set ACCEPT_PLANS to TRUE.

Troubleshooting

Systems are rarely static. Schema changes or environment changes can sometimes make it impossible to continue to use a previously-valid execution plan defined in a SQL plan baseline. For example, if a SQL plan baseline specifies an index range scan but the corresponding index has been dropped, then it will become impossible for the SQL plan baseline excution plan to be used. This will not cause the corresponding SQL statement to fail; there is no error condition or error message, but it can mean that there will be a SQL statement that is no longer “protected” by a SQL plan baseline so it could conceivably suffer from a performance regression. If evolve is used frequently then this scenario can be avoided to a large extent because plan changes are captured in the SQL plan history where they can be verified and accepted when necessary.
I have seen cases where a customer evolves their SQL plan history infrequently, and subsequently some SQL statements have stopped matching their intended SQL plan baselines without being noticed. Also, this can sometimes happen because of a bug or where there has been a schema change without the DBAs realizing that it could make some execution plans in SQL plan baselines non-viable.  Fortunately, it is not difficult to identify queries in the SQL cursor cache that have a corresponding SQL plan baseline but (for whatever reason) it is not being successfully matched and used. Take a look at the utility scripts I’ve uploaded to GitHub – and in particular “nomatch.sql” shows you how it’s possible to identify SQL statements in the cursor cache that are failing to match their SQL plan baselines. As usual, if you have any suggestions on how I could improve the scripts, just let me know in the comments.

1 comment: