Wednesday, June 11, 2014

What is the diff between Exist and In Operator?

The main difference would be performance.

If the conditions are correct the results would be the same, but the queries would be executed in a different way by the database engine. When the appropriate index exists, the EXISTS approach usually performs better. You could use the autotrace option in SQL*Plus or SQL Command Line to see the execution plan for your queries.

In my point of view, EXISTS is better because it requires to specify a join condition, which can invoke an INDEX scan. However, IN is often better if the results of the sub query are very small. Usually to run the query that returns the smaller set of results first.


SQL Statements that use the SQL EXIST Condition are very inefficient since the sub-query is RE-RUN for EVERY row in the outer query's table.

The EXISTS operator tests for existence of rows in the result set of the subquery.

. If a subquery row value is found :

- The condition is flagged TRUE

- The search does not continue in the inner query.


. If a subquery row  value is not found :

- The condition is flagged FALSE

- The search continues in the inner query.


-EXISTS operator is frequently used in Correlated subqueries to test whether a value retrieved by the outer query exists in the result set of the values retrieved
by the inner query.


If the subquery returns at least one row , the operator returns TRUE. If the value does not exits , it returns FALSE.

Accordingly, NOT EXISTS tests whether a value retrived by the outer query is not part of the results set of the values retrived by the inner query.


Q: Find employees who have at least one person reporting to them.

SELECT employee_id,last_name, job_id, department_id

FROM employees outer

WHERE EXISTS (SELECT 'X'
               FROM employees
              WHERE manager_id= outer.employee_id);


The EXISTS operator ensures that the search in the inner query doesn't continue when at least one match is found for the manager and employee_id condition:

WHERE manager_id =outer.employee_id;


Note that the inner SELECT query doesn't need to return a specific value , so a constant can be selected.
From a performance stand point , it is faster to select a constant than a column.

NOTE : Having employee_id in the SELECT clause of the inner query causes a table scan for that column.

Replacing it with a literal 'X' or any constant improves performance.

This is more efficient than using in operator.

A IN construct can be used as an alternative for a EXISTS operator , as shown below.

SELECT employee_id,last_name, job_id, department_id
FROM employees outer
WHERE employee_id  IN  (SELECT manager_id
                        FROM employees
                       WHERE manager_id IS NOT NULL);
             




No comments:

Post a Comment