-
Notifications
You must be signed in to change notification settings - Fork 3.3k
Description
We usually translate Contains to an uncorrelated IN subquery, e.g. WHERE x.Item IN (SELECT value FROM OPENJSON(@p))
. However, when the item is nullable, we translate to a correlated EXISTS subquery instead: WHERE EXISTS (SELECT value FROM OPENJSON(..) WHERE value = x.Item
.
The problem is that during translation of primitive collections, we cannot examine parameter contents, and so we must assume that there are possible nulls in the collection (unless we're dealing with a non-nullable value). Later, in SqlNullabilityProcessor, when processing the InExpression we see that both sides (item and subquery) are nullable, meaning that we can't use SQL IN for that (impossible to distinguish between a NULL result that indicates that the item wasn't found, and a NULL result that indicates that NULL was found). As a result, we currently transform the IN to an EXISTS, to make sure results are accurate.
The fix here would be to look into the parameterized collection in SqlNullabilityProcessor (where we can access parameters), and rewrite it to remove any null values, compensating with an additional OR clause. Once nulls are cleared out, we can keep the uncorrelated IN translation, rather than switching to correlated EXISTS. Note that this would be the first time we rewrite parameters in the 2nd part of the query pipeline; it also means we can't cache the resulting SQL, since the parameter rewriting must happen again for each query.
(/cc @ajcvickers regarding value converters and null: since the query pipeline cares about nulls vs. non-nulls, if value converters support converting between nulls and non-nulls, that has to happen early, otherwise we'll get incorrect results)
Note that any other type of IN+subquery will continue to be translated as before, i.e. transforming IN to correlated EXISTS when both sides are nullable. This isn't a regression from previous releases.