Friday, February 24, 2012

Different execution plan between stored proc and ad-hoc query

We have just encountered a problem with a particular stored procedure which I would appreciate anyone's thoughts on.

The proc in question contains a single select statement which simply joins four tables and filters based on two parameters. It normally runs in well under a second but this morning we came in to find it was running in anything up to a minute.

After investigation we found that it was running a clustered index scan on one of the larger tables (approx 10.5m rows) rather than using the appropriate index. However, when we took the select statement out and ran it as an ad-hoc query it was utilising the index and performing correctly.

We tried recompiling the proc and also ran sp_updatestats on the database but we were still getting the different execution plans being generated. It was only after we ran UPDATE STATISTICS WITH FULLSCAN on the table in question that the proc went back to performing normally.

Can anyone shed any light on why the proc and query were using different execution plans, even after recompiling the proc? The table is heavily utilised in all areas of the system so I could understand if the stats got out of date (although autostats is on and sp_updatestats does get run on the database 2 or 3 times a day through an automated process).

Also, this is the second week in a row where this behaviour has occured so we need to try and mitigate the risks next week. Obviously we could schedule a stats update with fullscan early Monday morning but we would like to understand what is causing this and try and fix it "properly".

cheers

James

If the system is heavily updated, the stats could get stale. Consider running [sp_updatestats 'resample'] or explicitly forcing an index (the latter is not really recommended unless you fully understand the consequences).

e.g.

select * from tb with (index(myindex))

|||

It's very quite possible to get two different plans. Take for example these two TSQL statements:

a)

declare @.x int

set @.x = 99

select * from t1 where col1 = @.x

b)

select * from t1 where col1=99

In a), the optimizer evalutes the whole batch, and cannot evalute the literal value of 99 since it's a parameter value set inside a batch. So it makes a guesstimate and optimizes for a given value which it thinks may be the most correct/optimal. Often times it is, often times it is not. In b), the statement is evaluated and immediately it knows the value of 99, so it can get a very accurate estimate, thus producing a more reliable plan.

The same holds true for stored procedures. If you create a stored procedure and pass in a parameter value used in your WHERE clause, the optimizer will most likely produce a far better estimate than if you created a procedure which declared variables, set them, and then used them in your WHERE clause. It's always beneficial in a proc to try not to declare variables, set them and then use them in a query. If you can pass them as a parameter, the query has a better chance of producing an optimal plan based on accurate estimates.

Where you can also get into trouble is if the plan is cached with a value that gave good estimates/performance at the time it was created, but then as time passes and changes occur to the tables, yes the stats can get out of date, and you'll need to either update stats or clear the cache to remove the stale stats/plan.

If you got lost with what i was trying to say above, forgive me, I'm sure it's documented in some whitepaper somewhere, but I know this is an issue as merge replication procs and triggers hit this kind of problem often.

|||

Thank you both for your comments. Greg, that makes sense and does explain why we may have been seeing the different plan.

This problem actually appeared again yesterday, although this time the ad-hoc query was using the same (flawed) plan as the proc. We have decided in this instance to use an index hint as there are really no circumstances where it should be doing a full scan. As OJ says, I know these are not really recommended but I think I'm happy for this to be one of the rare expections!

cheers

James

|||Which version of SQL Server are you using? If 2005 then you might opt for the "Optimize For" hint or "Option Recompile". When using an index hint, you run the risk of accidentally breaking code later if that index is dropped or changed to include different columns.

No comments:

Post a Comment