Search This Blog

Wednesday, October 5, 2011

NULLs and the NOT IN predicate



create table a (id int , colour varchar(10))
insert into a select 1,'Red' union all select 2,'Green' union all select 3,null

select * from a

create table b (colour varchar(10))insert into b select 'Red'union allselect 'Green' union all select 'Blue'


select * from b where b.colour not in (select a.colour from a )


Obviously this is 'incorrect'. What is the problem? It's simply that SQL uses three-valued logic, driven by the existence of NULL, which is not a value but a marker to indicate missing (or UNKNOWN) information. When the NOT operator is applied to the list of values from the subquery, in the IN predicate, it is translated like this:
The expression "color=NULL" evaluates to UNKNOWN and, according to the rules of three-valued logic, NOT UNKNOWN also evaluates to UNKNOWN. As a result, all rows are filtered out and the query returns an empty set.
This mistake will often surface if requirements change, and a non-nullable column is altered to allow NULLs. It also highlights the need for thorough testing. Even if, in the initial design, a column disallows NULLs, you should make sure your queries continue to work correctly with NULLs.
One solution is to use the EXISTS predicate in place of IN, since EXISTS uses two-valued predicate logic evaluating to TRUE/FALSE:


correct:
select * from b where not exists (select a.colour from a where a.colour =b.colour )
 

No comments:

Post a Comment