oracle中not exists对外层查询的影响详解_Oracle

前言

最近同事发现了一个问题，在12c中跑的buffer get很高，但是在10g中跑的buffer很低。怀疑是不是12c的优化器有问题。

这个10g的环境和12c的环境，数据量大致一样，只是有很少部分的不同，但是就是这个很少部分不同，造成了not exists中的子查询返回不同的值，进而对外层查询产生不同的影响。

我们来用如下的代码模拟一下。

初始化数据：

				?

									--10g

									drop table t1;

									drop table t2;

									create table t1 (id number,name varchar2(20),dep_id varchar2(10));

									create table t2 (id number,name varchar2(20),dep_id varchar2(10));

									insert into t1 select rownum,'a','kk' from dual connect by level <=3000000;

									insert into t2 select rownum,'a','kk' from dual connect by level <=1000000;

									insert into t2 select rownum,'a','mm' from dual;

									commit;

									--12c

									drop table t1;

									drop table t2;

									create table t1 (id number,name varchar2(20),dep_id varchar2(10));

									create table t2 (id number,name varchar2(20),dep_id varchar2(10));

									insert into t1 select rownum,'a','kk' from dual connect by level <=3000000;

									insert into t2 select rownum,'a','kk' from dual connect by level <=1000000;

									commit;

我们看到，12c的数据和10g只是有很少的差别，t1表12c和10g都一样，t2表在12c只是少了一行数据。

				?

									--10g

									SQL> select dep_id,count(*) from t1 group by dep_id;

									DEP_ID     COUNT(*)

									-------------------- ----------

									kk      3000000

									SQL> select dep_id,count(*) from t2 group by dep_id;

									DEP_ID     COUNT(*)

									-------------------- ----------

									mm       1

									kk      1000000

									SQL>

									--12c

									SQL> select dep_id,count(*) from t1 group by dep_id;

									DEP_ID     COUNT(*)

									-------------------- ----------

									kk      3000000

									SQL> select dep_id,count(*) from t2 group by dep_id;

									DEP_ID     COUNT(*)

									-------------------- ----------

									kk      1000000

									SQL>

我们将要执行的sql语句是：

				?

									select count(*)

									 from t1, t2

									 where t1.id = t2.id

									 and t1.dep_id = 'kk'

									 and not exists (select 1

									   from t1, t2

									   where t1.id = t2.id

									   and t2.dep_id = 'mm');

我们先来看执行情况的差距，10g的bufferget小，12c多：

				?

									--10g

									SQL> select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm');

									 COUNT(*)

									----------

									   0

									SQL> select* from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

									PLAN_TABLE_OUTPUT

									----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

									SQL_ID 22t5mb43w55pr, child number 0

									-------------------------------------

									select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not

									exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm')

									Plan hash value: 3404612428

									------------------------------------------------------------------------------------------------------------------

									| Id | Operation   | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |

									------------------------------------------------------------------------------------------------------------------

									| 0 | SELECT STATEMENT  |  |  1 |  |  1 |00:00:00.02 | 2086 |  |  |   |

									| 1 | SORT AGGREGATE  |  |  1 |  1 |  1 |00:00:00.02 | 2086 |  |  |   |

									|* 2 | FILTER    |  |  1 |  |  0 |00:00:00.02 | 2086 |  |  |   |

									|* 3 | HASH JOIN   |  |  0 | 901K|  0 |00:00:00.01 |  0 | 39M| 5518K|   |

									| 4 |  TABLE ACCESS FULL| T2 |  0 | 901K|  0 |00:00:00.01 |  0 |  |  |   |

									|* 5 |  TABLE ACCESS FULL| T1 |  0 | 2555K|  0 |00:00:00.01 |  0 |  |  |   |

									|* 6 | HASH JOIN   |  |  1 |  23 |  1 |00:00:00.02 | 2086 | 1517K| 1517K| 612K (0)|

									|* 7 |  TABLE ACCESS FULL| T2 |  1 |  23 |  1 |00:00:00.02 | 2082 |  |  |   |

									| 8 |  TABLE ACCESS FULL| T1 |  1 | 2555K|  1 |00:00:00.01 |  4 |  |  |   |

									------------------------------------------------------------------------------------------------------------------

									Predicate Information (identified by operation id):

									---------------------------------------------------

									 2 - filter( IS NULL)

									 3 - access("T1"."ID"="T2"."ID")

									 5 - filter("T1"."DEP_ID"='kk')

									 6 - access("T1"."ID"="T2"."ID")

									 7 - filter("T2"."DEP_ID"='mm')

									Note

									-----

									 - dynamic sampling used for this statement

									34 rows selected.

									SQL>

									--12c

									SQL> select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm');

									 COUNT(*)

									----------

									 1000000

									SQL> select* from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

									PLAN_TABLE_OUTPUT

									------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

									SQL_ID 22t5mb43w55pr, child number 0

									-------------------------------------

									select /*+ gather_plan_statistics */ count(*) from t1,t2 where

									t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2

									where t1.id=t2.id and t2.dep_id='mm')

									Plan hash value: 1692274438

									--------------------------------------------------------------------------------------------------------------------

									| Id | Operation    | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |

									--------------------------------------------------------------------------------------------------------------------

									| 0 | SELECT STATEMENT  |  |  1 |  |  1 |00:00:00.79 | 10662 |  | |  |

									| 1 | SORT AGGREGATE  |  |  1 |  1 |  1 |00:00:00.79 | 10662 |  | |  |

									|* 2 | FILTER    |  |  1 |  | 1000K|00:00:00.74 | 10662 |  | |  |

									|* 3 | HASH JOIN   |  |  1 | 1215K| 1000K|00:00:00.52 | 8579 | 43M| 6111K| 42M (0)|

									| 4 |  TABLE ACCESS FULL | T2 |  1 | 1215K| 1000K|00:00:00.01 | 2083 |  | |  |

									|* 5 |  TABLE ACCESS FULL | T1 |  1 | 2738K| 3000K|00:00:00.07 | 6496 |  | |  |

									|* 6 | HASH JOIN RIGHT SEMI|  |  1 |  35 |  0 |00:00:00.02 | 2083 | 1245K| 1245K| 461K (0)|

									|* 7 |  TABLE ACCESS FULL | T2 |  1 |  23 |  0 |00:00:00.02 | 2083 |  | |  |

									| 8 |  TABLE ACCESS FULL | T1 |  0 | 2738K|  0 |00:00:00.01 |  0 |  | |  |

									--------------------------------------------------------------------------------------------------------------------

									Predicate Information (identified by operation id):

									---------------------------------------------------

									 2 - filter( IS NULL)

									 3 - access("T1"."ID"="T2"."ID")

									 5 - filter("T1"."DEP_ID"='kk')

									 6 - access("T1"."ID"="T2"."ID")

									 7 - filter("T2"."DEP_ID"='mm')

									Note

									-----

									 - dynamic statistics used: dynamic sampling (level=2)

									35 rows selected.

									SQL>

									SQL>

可以看到第23，24行，在10g中运行时，buffers是0，而在12c中，即78，79行，buffer是2083+6496。

也就是说在10g中，外层查询不进行t1和t2的扫描，直接返回结果了，而在12c中，外层查询还要进行t1表和t2表层扫描才返回结果。

这其实不是10g和12c的差别，而是not exists的返回数据对外层的影响。子查询要返回0行记录，才满足not exist的条件，从而返回外层查询结果。

在10g中，子查询返回了一行记录

				?

									--10g

									SQL> select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='mm';

									   1

									----------

									   1

									SQL>

不满足not exists（即0行才满足），所以，也就不用在外层继续查询了。直接返回记录0行。

在12c中，子查询返回0行记录，满足not exist的条件，所以还需要在外层查询中继续查询。

				?

									--12c

									SQL> select count(*) from t1,t2 where t1.id=t2.id and t2.dep_id='kk';

									 COUNT(*)

									----------

									 1000000

									SQL> set line 1000

									SQL> set pages 1000

									SQL> col PLAN_TABLE_OUTPUT for a250

									SQL>

									SQL>

									SQL> select /*+ gather_plan_statistics */ count(*) from t1,t2 where t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2 where t1.id=t2.id and t2.dep_id='kk');

									 COUNT(*)

									----------

									   0

									SQL> select* from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

									PLAN_TABLE_OUTPUT

									---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

									SQL_ID c5hj2p2jt1fxf, child number 0

									-------------------------------------

									select /*+ gather_plan_statistics */ count(*) from t1,t2 where

									t1.id=t2.id and t1.dep_id='kk' and not exists (select 1 from t1,t2

									where t1.id=t2.id and t2.dep_id='kk')

									Plan hash value: 1692274438

									--------------------------------------------------------------------------------------------------------------------

									| Id | Operation    | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |

									--------------------------------------------------------------------------------------------------------------------

									| 0 | SELECT STATEMENT  |  |  1 |  |  1 |00:00:00.28 | 2087 |  | |  |

									| 1 | SORT AGGREGATE  |  |  1 |  1 |  1 |00:00:00.28 | 2087 |  | |  |

									|* 2 | FILTER    |  |  1 |  |  0 |00:00:00.28 | 2087 |  | |  |

									|* 3 | HASH JOIN   |  |  0 | 1215K|  0 |00:00:00.01 |  0 | 69M| 7428K|   |

									| 4 |  TABLE ACCESS FULL | T2 |  0 | 1215K|  0 |00:00:00.01 |  0 |  | |  |

									|* 5 |  TABLE ACCESS FULL | T1 |  0 | 2738K|  0 |00:00:00.01 |  0 |  | |  |

									|* 6 | HASH JOIN RIGHT SEMI|  |  1 | 2738K|  1 |00:00:00.28 | 2087 | 43M| 6111K| 42M (0)|

									|* 7 |  TABLE ACCESS FULL | T2 |  1 | 1215K| 1000K|00:00:00.12 | 2083 |  | |  |

									| 8 |  TABLE ACCESS FULL | T1 |  1 | 2738K|  1 |00:00:00.01 |  4 |  | |  |

									--------------------------------------------------------------------------------------------------------------------

									Predicate Information (identified by operation id):

									---------------------------------------------------

									 2 - filter( IS NULL)

									 3 - access("T1"."ID"="T2"."ID")

									 5 - filter("T1"."DEP_ID"='kk')

									 6 - access("T1"."ID"="T2"."ID")

									 7 - filter("T2"."DEP_ID"='kk')

									Note

									-----

									 - dynamic statistics used: dynamic sampling (level=2)

									35 rows selected.

									SQL>