Paper #11: Optimization of nested sql queries revisited

R. Ganski and H. Wong.
Optimization of nested sql queries revisited.
In Proceedings of the ACM-SIGMOD International Conference on Management of Data, San Francisco, California, pages 23-33, May 1987.
Presented by: Poolavari

One powerful feature of SQL is the nesting of query blocks. The conventional techniques used in the implementation of query nesting is inefficient. Tables referenced in the inner query block of a nested query have to be retrieved once for each tuple of the relation referenced in the outer query block. This summary classifies nested queries and proposes transformation algorithms to transform it to a logical equivalent single-level query. The efficiency of the query is examined by a query optimizer to determine efficient order and method of evaluation. Classifications being discussed are Type-A nesting, Type-N nesting, Type-J nesting, Type-JA nesting and are based on the join predicate that references the outer relation and the existence of an aggregate function in the select clause of the inner relation.

Kim's algorithms for nested queries:

For Type-N and Type-J nested queries, the nested iteration method for processing nested queries is equivalent to performing a join between the inner and outer relations. These can be transformed to logically equivalent single-level queries containing explicit single-level join predicates. The query optimizer can then choose a merge join method resulting in cost reduction.

Type-N, Type-J Nested Query:

Type-N nested two-relation query is equivalent to a canonical two-relation query with a join predicate. The NEST-N-J algorithm is developed which results in canonical query logically equivalent to the original nested query.

Type-JA Nested Query

Type-JA nested query can be transformed to Type-J nested query which references a new temporary relation. The action of the nested iteration processing of a type-JA query can be captured in a temporary table formed with a GROUP BY clause. This leads to the NEST-JA algorithm which transforms a type-JA nested query of depth one to an equivalent type-J nested query of depth 1.

Costs of Kim's algorithms

The analyses of the algorithms compares the costs of processing N, J and JA-type nested queries using the nested iteration method and the transformation method followed by merge joins. Cost functions for each type of nesting is developed, using the sizes of relations, available memory buffer space and selectivity factors. A cost savings of 80% to 95% is possible with his transformation method.

Solution

A problem arises when a type-JA nested query contains the COUNT function and when the nested join predicates contain operators other than equality. A solution is a join in the temporary table which contains the aggregate information. Outer join is used if the aggregate function is COUNT. This causes the temporary table to include aggregate values over the proper range of join column values. The join predicate in the original query must be changed to equality and be performed on a projection of the outer table in order to avoid an increase in the aggregate values due to duplicates in the outer table. The resulting algorithm NEST-JA2 reduces the cost of processing.

The transformation algorithms are extended to more useful operators like EXISTS, ANY, ALL and larger class of predicates and queries with any level of nesting using recursive approach and query graphs.