Today I want to address two very common issues when it comes to query plan tuning or query optimizing. At first I will have a look on implicit converts and the consequences. The second topic is all about the query optimizer and how quick one can have bad execution plans. For both issues I will five advice what to do in such a situation.

Implicit converts

Whenever you tell SQL Server to compare or evaluate two expressions that have different incomparable data types the SQL Server will adjust one of the expressions to a fitting data type so it can evaluate the expression. A very basic example is comparing integer with float numbers. Most likely the integer number will get transferred into a floating number to then compare it with the other float number.

But in SQL this can have fatal consequences. At first the optimizer does not know how a transformed column is distributed. The underlying statistics only apply for the original data type. So if the column gets converted due to be comparable the optimizer has no information how the compared value is actually distributed within the column. This problem is called CardinalityEstimate.

Another consequence is that the optimizer also cannot use an index anymore (for seeking). So it will do a scan because the B-tree is built with the original data type. It is just incompatible now. This problem is called SeekPlan. You can find the information in the query plan.

Have a look at that easy example. Keep in mind that the Column NationalIDNumber is of data type varchar. 

Query 1:
SELECT BusinessEntityID, NationalIDNumber, LoginID
FROM HumanResources.Employee
WHERE NationalIDNumber = 112457891
Query 2:
SELECT BusinessEntityID, NationalIDNumber, LoginID
FROM HumanResources.Employee
WHERE NationalIDNumber = '112457891'

 

When you have a look at the query plans you can see the situation I described. Because in the first query the constant 112457891 is changed to varchar, it is not possible for the optimizer to use the index on NationalIDNumber.

Bad Query:

 

Good Query:

Also you can see that the second plan using an index seek is faster (relative to the batch)

  

How can I fix this? 

Check your plan for Warnings (yellow exclamation mark). Try to resolve the issues. If data type mismatch is in your query as constant, just change the constants. If mismatch occurs because data types mismatches in tables consider a persisted calculated column that converts the value in the correct format. Then statistics can be calculated on that persisted column. 

Query optimizer 

The second issue I want to address is also quite common but still very unknown. What do you think how many tables have to be involved in a join / subquery scenario to confuse the query optimizer? What is the threshold where the optimizer starts struggling and not returning always the perfect plan?

The answer is quite complex but let me just show you this example. If three tables are involved in a query via join, the query optimizer has 3! = 1*2*3 = 6 possibilities for different join order. With ten tables you already have 10! = 3628800 possibilities.

The optimizer cannot have a look at all these tables.

 

SELECT	        Sales.SalesTerritory.TerritoryID, 
		Sales.SalesTerritoryHistory.BusinessEntityID, 
		Sales.SalesReason.SalesReasonID
FROM			Sales.SalesOrderDetail 
INNER JOIN		Sales.SalesOrderHeader 
	ON Sales.SalesOrderDetail.SalesOrderID = Sales.SalesOrderHeader.SalesOrderID 
INNER JOIN		Sales.SalesOrderHeaderSalesReason 
	ON Sales.SalesOrderHeader.SalesOrderID = Sales.SalesOrderHeaderSalesReason.SalesOrderID 
INNER JOIN		Sales.SalesPerson 
	ON Sales.SalesOrderHeader.SalesPersonID = Sales.SalesPerson.BusinessEntityID 
INNER JOIN		Sales.SalesPersonQuotaHistory 
	ON Sales.SalesPerson.BusinessEntityID = Sales.SalesPersonQuotaHistory.BusinessEntityID 
INNER JOIN		Sales.SalesReason 
	ON Sales.SalesOrderHeaderSalesReason.SalesReasonID = Sales.SalesReason.SalesReasonID 
INNER JOIN 		Sales.SalesTerritory 
	ON Sales.SalesOrderHeader.TerritoryID = Sales.SalesTerritory.TerritoryID 
	AND Sales.SalesPerson.TerritoryID = Sales.SalesTerritory.TerritoryID 
INNER JOIN		Sales.SalesTerritoryHistory 
	ON Sales.SalesPerson.BusinessEntityID = Sales.SalesTerritoryHistory.BusinessEntityID 
	AND Sales.SalesTerritory.TerritoryID = Sales.SalesTerritoryHistory.TerritoryID

 

It is taken from AdventureWorks2012 and basically joins everything from the sales schema.

If you have a look at the query plan you realize this hint:

 

 

It tells us that the optimizer stopped evaluating plans at timeout and just chose the best one so far.

So is there a possibility to give it more time to evaluate? Yes. Have a look at the query plan with Traceflags 2301 and 8780 turned on:

 

If you compare those two plans you will see that the second plan is actually better.

 

How can I fix this? 

  • Don’t do unnecessary joins
  • Try to persist intermediate results to temp tables to lower the number of joins at a time
  • You could also force join order but this is not recommended
  • Redesign your database or queries so you lower the number of joins.

 

The tricky thing about that fact is that maybe your query runs fine for a year. But there can be a tipping point where decisions are turned around from query optimizer and with bad luck it won’t find a good plan for you in time.