Optimized join processing

Joins can be very expensive unless they are handled properly. The OpenAccess SDK SQL engine handles all join processing; the IP is only responsible for accessing data from a single table at a time. To process a join, the engine calls the IP to process a SELECT for each table.

The simplest way to implement a join is for the IP to execute the query on one of the tables, execute the query on the next table, and then form the Cartesian product of the two result sets. The Cartesian product of two result sets of M and N rows contains M*N rows. This number grows quickly as M and N grow, which happens often when a single piece of information is spread among many tables. The OpenAccess SDK SQL engine avoids this cost by first building a result set for the first table and then, for each row in that set, passing the required values to the query on the next table as restrictions. This way, row combinations that do not satisfy the join condition (the off-diagonal elements) are never created. The IP is called to process a SELECT on the second table as many times as there are rows in the first result set, and it never has to build the M*N set, most of which would be thrown away.
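The difference between the two strategies can be illustrated with a small Python sketch. It is not OpenAccess SDK code: the tables, the column names, and the functions cartesian_join, restricted_join, and select_orders are hypothetical, and select_orders only stands in for an IP answering a single-table SELECT with an equality restriction. The sketch counts how many row pairs the naive approach materializes and how many single-table calls the restricted approach makes.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Pair = Tuple[Row, Row]

# Hypothetical sample tables; the names and columns are illustrative only.
customers: List[Row] = [
    {"cust_id": 1, "name": "Ann"},
    {"cust_id": 2, "name": "Bob"},
    {"cust_id": 3, "name": "Cy"},
]
orders: List[Row] = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 1},
    {"order_id": 12, "cust_id": 3},
]


def select_orders(column: str, value: object) -> List[Row]:
    """Stand-in for an IP processing a SELECT on one table with a restriction."""
    return [row for row in orders if row[column] == value]


def cartesian_join(left: List[Row], right: List[Row], key: str) -> Tuple[List[Pair], int]:
    """Naive join: build all M*N pairs, then discard the non-matching ones."""
    candidates = [(l, r) for l in left for r in right]
    matches = [(l, r) for l, r in candidates if l[key] == r[key]]
    return matches, len(candidates)


def restricted_join(
    left: List[Row],
    select_right: Callable[[str, object], List[Row]],
    key: str,
) -> Tuple[List[Pair], int]:
    """For each left row, query the right table with the join value as a restriction."""
    matches: List[Pair] = []
    calls = 0
    for l in left:
        calls += 1  # one single-table SELECT per row of the first result set
        for r in select_right(key, l[key]):
            matches.append((l, r))
    return matches, calls


if __name__ == "__main__":
    naive, pairs_built = cartesian_join(customers, orders, "cust_id")
    restricted, selects_issued = restricted_join(customers, select_orders, "cust_id")
    print("pairs built by the Cartesian product:", pairs_built)
    print("SELECT calls made by the restricted join:", selects_issued)
    print("same result:", naive == restricted)
```

Both functions return the same matching rows, but the restricted version only ever creates pairs that satisfy the join condition, at the price of one call to the second table per row of the first result set.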

The OpenAccess SDK SQL engine also supports Block Join and Join Pushdown optimizations. These features modify the standard join processing algorithm to take advantage of a data source that can perform joins itself or that is efficient at returning blocks of data. See Designing and coding the IP for more information.