select a.column1, b.column2 from a, /*+ optional */ b WHERE a.key = b.key
Federated Optimizations
Access patterns are used on both physical tables and views to specify the need for criteria against a set of columns. Failure to supply the criteria will result in a planning error, rather than a run-away source query. Access patterns can be applied in a set such that only one of the access patterns is required to be satisfied.
Currently any form of criteria referencing an affected column may satisfy an access pattern.
In federated database systems pushdown refers to decomposing the user level query into source queries that perform as much work as possible on their respective source system. Pushdown analysis requires knowledge of source system capabilities, which is provided to Teiid though the Connector API. Any work not performed at the source is then processed in Federate’s relational engine.
Based upon capabilities, Teiid will manipulate the query plan to ensure that each source performs as much joining, filtering, grouping, and so forth, as possible. In many cases, such as with join ordering, planning is a combination of standard relational techniques along with costing information, and heuristics for pushdown optimization.
Criteria and join push down are typically the most important aspects of the query to push down when performance is a concern. For information about how to read a plan to ensure that source queries are as efficient as possible, see Query plans.
A special optimization called a dependent join is used to reduce the rows returned from one of the two relations involved in a multi-source join. In a dependent join, queries are issued to each source sequentially rather than in parallel, with the results obtained from the first source used to restrict the records returned from the second. Dependent joins can perform some joins much faster by drastically reducing the amount of data retrieved from the second source and the number of join comparisons that must be performed.
The conditions when a dependent join is used are determined by the query planner based on access patterns, hints, and costing information. Teiid supports the following types of dependent joins:
- Join based on in/equality support
-
Where the engine will determine how to break up large queries based upon translator capabilities.
- Key pushdown
-
Where the translator has access to the full set of key values and determines what queries to send.
- Full pushdown
-
Where the translator ships the all data from the independent side to the translator. Can be used automatically by costing or can be specified as an option in the hint.
Teiid supports the following types of hints to control dependent join behavior:
- MAKEIND
-
Indicates that the clause should be the independent side of a dependent join.
- MAKEDEP
-
Indicates that the clause should be the dependent side of a join. MAKEDEP as a non-comment hint supports the following optional max and join arguments.
- MAKEDEP(JOIN)
-
meaning that the entire join should be pushed.
- MAKEDEP(MAX:5000)
-
meaning that the dependent join should only be performed if there are less than the maximum number of values from the independent side.
- MAKENOTDEP
-
Prevents the clause from being the dependent side of a join.
These can be placed in either the OPTION Clause or directly in the FROM Clause. As long as all Access Patterns can be met, the MAKEIND, MAKEDEP, and MAKENOTDEP hints override any use of costing information. MAKENOTDEP supersedes the other hints.
Tip
|
The MAKEDEP or MAKEIND hints should only be used if the proper query plan is not chosen by default. You should ensure that your costing information is representative of the actual source cardinality. |
Note
|
An inappropriate MAKEDEP or MAKEIND hint can force an inefficient join structure and may result in many source queries. |
Tip
|
While these hints can be applied to views, the optimizer will by default remove views when possible. This can result in the hint placement being significantly different than the original intention. You should consider using the NO_UNNEST hint to prevent the optimizer from removing the view in these cases. |
In the simplest scenario the engine will use IN clauses (or just equality predicates) to filter the values coming from the dependent side.
If the number of values from the independent side exceeds the translators MaxInCriteriaSize
, the values
will be split into multiple IN predicates up to MaxDependentPredicates
.
When the number of independent values exceeds MaxInCriteriaSize*MaxDependentPredicates
,
then multiple dependent queries will be issued in parallel.
If the translator returns true for supportsDependentJoins
, then the engine may provide the entire set of
independent key values. This will occur when the number of independent values exceeds
MaxInCriteriaSize*MaxDependentPredicates
so that the translator may use specific logic to avoid issuing
multiple queries as would happen in the simple scenario.
If the translator returns true for both supportsDependentJoins
and supportsFullDependentJoins
then
a full pushdown may be chosen by the optimizer A full pushdown, sometimes also called as data-ship pushdown,
is where all the data from independent side of the join is sent to dependent side. This has an added
benefit of allowing the plan above the join to be eligible for pushdown as well.
This is why the optimizer may choose to perform a full pushdown even when the number of independent values
does not exceed MaxInCriteriaSize*MaxDependentPredicates
. You may also force full pushdown using the MAKEDEP(JOIN)
hint.
The translator is typically responsible for creating, populating, and removing a temporary table that
represents the independent side. For more information about how to use custom translators to support dependent, key,
and full pushdown, see Dependent join pushdown in Translator Development.
NOTE: Key/Full Pushdown is currently only supported out-of-the box by a subset of JDBC translators.
To enable support, set the translator override property enableDependentJoins
to true
.
The JDBC source must support the creation of temporary tables, which typically requires a Hibernate dialect.
Translators that should support this feature include: DB2, Derby, H2, Hana, HSQL, MySQL, Oracle,
PostgreSQL, SQL Server, SAP IQ, Sybase, Teiid, and Teradata.
Copy criteria is an optimization that creates additional predicates based upon combining join and where clause criteria.
For example, equi-join predicates (source1.table.column = source2.table.column)
are used to create
new predicates by substituting source1.table.column
for source2.table.column
and vice versa.
In a cross-source scenario, this allows for where criteria applied to a single side of the join to be applied to both source queries
Teiid ensures that each pushdown query only projects the symbols required for processing the user query. This is especially helpful when querying through large intermediate view layers.
Partial aggregate pushdown allows for grouping operations above multi-source joins and unions to be decomposed so that some of the grouping and aggregate functions may be pushed down to the sources.
An optional or redundant join is one that can be removed by the optimizer. The optimizer will automatically remove inner joins based upon a foreign key or left outer joins when the outer results are unique.
The optional join hint goes beyond the automatic cases to indicate to the optimizer that a joined table should be omitted if none of its columns are used by the output of the user query or in a meaningful way to construct the results of the user query. This hint is typically only used in view layers containing multi-source joins.
The optional join hint is applied as a comment on a join clause. It can be applied in both ANSI and non-ANSI joins. With non-ANSI joins an entire joined table may be marked as optional.
Suppose this example defines a view layer X
. If X
is queried in such a way as to not need b.column2
,
then the optional join hint will cause b
to be omitted from the query plan. The result would be the same as if X
were defined as:
select a.column1 from a
select a.column1, b.column2, c.column3 from /*+ optional */ (a inner join b ON a.key = b.key) INNER JOIN c ON a.key = c.key
In this example the ANSI join syntax allows for the join of a and b to be marked as optional.
Suppose this example defines a view layer X. Only if both column a.column1
and b.column2
are not needed,
for example, SELECT column3 FROM X
will the join be removed.
The optional join hint will not remove a bridging table that is still required.
select a.column1, b.column2, c.column3 from /*+ optional */ a, b, c WHERE ON a.key = b.key AND a.key = c.key
Suppose this example defines a view layer X
. If b.column2
or c.column3
are solely required by a query to X
,
then the join on a be removed. However if a.column1
or both b.column2
and c.column3
are needed,
then the optional join hint will not take effect.
When a join clause is omitted via the optional join hint, the relevant criteria is not applied. Thus it is possible that the query results may not have the same cardinality or even the same row values as when the join is fully applied.
Left/right outer joins where the inner side values are not used and whose rows under go a distinct operation will automatically be treated as an optional join and do not require a hint.
select distinct a.column1 from a LEFT OUTER JOIN /*+optional*/ b ON a.key = b.key
Note
|
A simple "SELECT COUNT(*) FROM VIEW" against a view where all join tables are marked as optional will not return a meaningful result. |
Teiid user and transformation queries can contain a meta source hint that can provide additional information to source queries. The source hint has the following form:
/*+ sh[[ KEEP ALIASES]:'arg'] source-name[ KEEP ALIASES]:'arg1' ... */
-
The source hint is expected to appear after the query (SELECT, INSERT, UPDATE, DELETE) keyword.
-
Source hints can appear in any subquery, or in views. All hints applicable to a given source query will be collected and pushed down together as a list. The order of the hints is not guaranteed.
-
The sh arg is optional and is passed to all source queries via the
ExecutionContext.getGeneralHints
method. The additional args should have a source-name that matches the source name assigned to the translator in the VDB. If the source-name matches, the hint values will be supplied via theExecutionContext.getSourceHints
method. For more information about using an ExecutionContext, see Translator Development . -
Each of the arg values has the form of a string literal - it must be surrounded in single quotes and a single quote can be escaped with another single quote. Only the Oracle translator does anything with source hints by default. The Oracle translator will use both the source hint and the general hint (in that order) if available to form an Oracle hint enclosed in
/*+ … */
. -
If the KEEP ALIASES option is used either for the general hint or on the applicable source specific hint, then the table/view aliases from the user query and any nested views will be preserved in the push-down query. This is useful in situations where the source hint may need to reference aliases and the user does not wish to rely on the generated aliases (which can be seen in the query plan in the relevant source queries — see above). However in some situations this may result in an invalid source query if the preserved alias names are not valid for the source or result in a name collision. If the usage of KEEP ALIASES results in an error, the query could be modified by preventing view removal with the NO_UNNEST hint, the aliases modified, or the KEEP ALIASES option could be removed and the query plan used to determine the generated alias names.
SELECT /*+ sh:'general hint' */ ...
SELECT /*+ sh KEEP ALIASES:'general hint' my-oracle:'oracle hint' */ ...
Union partitioning is inferred from the transformation/inline view. If one (or more) of the UNION columns is defined by constants and/or has WHERE clause IN predicates containing only constants that make each branch mutually exclusive, then the UNION is considered partitioned. UNION ALL must be used and the UNION cannot have a LIMIT, WITH, or ORDER BY clause (although individual branches may use LIMIT, WITH, or ORDER BY). Partitioning values should not be null.
create view part as select 1 as x, y from foo union all select z, a from foo1 where z in (2, 3)
The view is partitioned on column x, since the first branch can only be the value 1 and the second branch can only be the values 2 or 3.
Note
|
More advanced or explicit partitioning will be considered for future releases. |
The concept of a partitioned union is used for performing partition-wise joins, in Updatable Views, and Partial Aggregate Pushdown. These optimizations are also applied when using the multi-source feature as well - which introduces an explicit partitioning column.
Partition-wise joins take a join of unions and convert the plan into a union of joins,
such that only matching partitions are joined against one another.
If you want a partition-wise join to be performed implicit without the need for an
explicit join predicate on the partitioning column, set the model/schema property
implicit_partition.columnName
to name of the partitioning column used on each partitioned view in the model/schema.
CREATE VIRTUAL SCHEMA all_customers SERVER server OPTIONS ("implicit_partition.columnName" 'theColumn');
In an XML VDB:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<vdb name="partition" version="1">
<model name="all_customers" type="VIRTUAL">
<property name="implicit_partition.columnName" value="theColumn"/>
...
Teiid also incorporates many standard relational techniques to ensure efficient query plans.
-
Rewrite analysis for function simplification and evaluation.
-
Boolean optimizations for basic criteria simplification.
-
Removal of unnecessary view layers.
-
Removal of unnecessary sort operations.
-
Advanced search techniques through the left-linear space of join trees.
-
Parallelizing of source access during execution.
Some source systems only allow "relationship" queries logically producing left outer join results
even when queried with an inner join Teiid will attempt to form an appropriate left outer join.
These sources are restricted to only supporting key joins. In some circumstances foreign key relationships
on the same source should not be traversed at all or with the referenced table on the outer side of join.
The extension property teiid_rel:allow-join
can be used on the foreign key to further restrict
the pushdown behavior. With a value of "false" no join pushdown will be allowed, and with a value
of "inner" the referenced table must be on the inner side of the join. If the join pushdown is prevented,
the join will be processed as a federated join.