Sunday, January 26, 2014

correlated subquery

The Correlated subqueries are used for row-by-row processing .Each subquery is executed once for each row of the outer query.


  • GET candidate row from outer query
  • EXECUTE inner query using the candidate row value
  • USE values from inner query to qualify or disqualify candidate row.

The non-correlated subquery or Nested Subquery is  that ,  the subquery can be run independently of the outer query. Basically, the subquery has no relationship with the outer query.

Now, a correlated subquery has the opposite property – the subquery can not be run independently of the outer query. You can take a look at this example of a correlated subquery below and easily see the difference yourself:


SELECT *
FROM Employee Emp1
WHERE (1) = (
SELECT COUNT(DISTINCT(Emp2.Salary))
FROM Employee Emp2
WHERE Emp2.Salary > Emp1.Salary)

It’s important to understand the order of operations in a correlated subquery. First, a row is processed in the outer query. Then, for that particular row the subquery is executed – so for each row processed by the outer query, the subquery will also be processed. In our example of a correlated subquery above, every time a row is processed for Emp1, the subquery will also choose that row’s value for Emp1.Salary and run. And then the outer query will move on to the next row, and the subquery will execute for that row’s value of Emp1.Salary. This will continue until the “WHERE (1) = (… )” condition is satisfied.

The Oracle server performs  a correlated subquery when the subquery reference a column from a table refererred to in the parent statement. A correlated subquery is evaluated once for each row processed by the parent statement . The parent statement can be SELECT ,UPDATE or DELETE statement.

Nested subqueries versus Correlated Subquery:

With a normal subquery, the inner SELECT query runs first and execute once, returning values to be used by the main query.  A Correlated subquery however executes once for each candidate row considered by the outer query. In other words, the inner query is driven by the outer query.

Nested subqueries :

  • The inner query executes first and find a value.
  • The outer query executes once, using the value from the inner query.

Correlated Subqueries:


  • Get a candidate row (fethched by the outer query).
  • Executes the inner query using the value of the candidate row.
  • Use the values resulting from the inner query to qualify or disqualify the candidate.
  • Repeat until no candidate row remains.


Ex: Using Correlated Subquery find all employees who earn more than the average salary in their department.

SELECT last_name, salary,department_id     -- Parent Query
FROM employees outer
WHERE salary > ( SELECT AVG(salary)
                             FROM  employees
                          WHERE department_id=outer.department_id);


-- The subquery references a column from a table in the parent query. each time a row in the outer  query is processed , the inner query is evaluated.

The example in the slide determines which employee earns more than the avearge salary of their department. In this case, the correlated subquery specifically computes the avearage salary for each department.

Because both the outer query and inner query use the EMPLOYEES table in the FROM clause , an alias is given to the EMPLOYEES in the outer SELECT statement, for calrity. Not only does the alias make the entire SELECT statement more readable, but without the alias the query would not work properly , because the inner statement would not be able to distinguish the inner table column from the outer table column.

/* Using Correlated subquery display details of those employees who has switched jobs at least twice*/

SELECT e.employee_id, last_name,e.job_id
FROM employees e
WHERE 2 <= ( SELECT COUNT(*)
                 FROM  job_history
                 WHERE employee_id=e.employee_id);


The Oracle server evaluates the  a correlated subquery as follows.

1. Select a row from the table specified in the outer query . This will be current candidate row .
2.Store the value of the column referenced in the subquery from the candidate row  (here the column referenced in the subquery is e. employee_id)
3.Perform the subquery with its condition referencing the value from the outer query's candidate row. ( here group funtion count(*) is evaluated based on the value of                                              
                                                                                                                                                   e.employee_id column based on the step 2)
4. Evaluate the WHERE clause of the outer query on the basis of the results of the subquery performed on step3. this determines if the candidate row is selected for output. ( Here, the number of times the employee has switched jobs, evaluated by the subquery , is compared with 2 in the WHERE clause of the outer query.If the condition is satisfied that employee record is displayed.)

5.Repeat the procedure for the next candidate row of the table , and so until all row in the table have been processed.

The Correlation is established by using an element from the outer query in the subquery.In this example , the correlation is established by the statement employee_id =e.employee_id in which you compare employee_id from the table in the subquery with the employee_id from table in the outer query.

No comments:

Post a Comment