Internal Materialization

Internal materialization creates Teiid Spring Boot temporary tables to hold the materialized table. While these tables are not fully durable, they perform well in most circumstances and the data is present at each Teiid Spring Boot instance which removes the single point of failure and network overhead of an external database. Internal materialization also provides built-in facilities for refreshing and monitoring.

View Options

The materialized option must be set for the view to be materialized. The Cache Hint, when used in the context of an internal materialized view transformation query, provides the ability to fine tune the materialized table. The caching options are also settable via extension metadata:

Property Name Description Optional Default

materialized

Set for the view to be materialized

false

true

UPDATABLE

Allow updating Materialized View via DML UPDATE

true

false

teiid_rel:ALLOW_MATVIEW_MANAGEMENT

Allow Teiid Spring Boot based management of the ttl and initial load rather than the implicit behavior.

true

false

teiid_rel:MATVIEW_PREFER_MEMEORY

Same as the pref_mem cache hint option.

true

false

teiid_rel:MATVIEW_TTL

Trigger a Scheduled ExecutorService which execute refreshMatView repeatedly with a specified time to live

true

null

teiid_rel:MATVIEW_UPDATABLE

Allow updating Materialized View via refreshMatView, refreshMatViewRow, refreshMatViewRows

true

false.

teiid_rel:MATVIEW_SCOPE

Same as the scope cache hint option.

true

VDB

teiid_rel:MATVIEW_WRITE_THROUGH

When true Teiid Spring Boot will perform both the underlying update and the corresponding update against the materialization target for an insert/update/delete issued against the view.

true

false

teiid_rel:MATVIEW_POLLING_QUERY

This property defines a query that must return a single timestamp value. If the value is greater than the last update time of the materialization table, it will be reloaded.

true

n/a

teiid_rel:MATVIEW_POLLING_INTERVAL

This property defines the polling interval, in milliseconds, used with the polling query.

true

60000

teiid_rel:MATVIEW_PART_LOAD_COLUMN

This property defines the partitioned load column. If specified the default load strategy will be updated to refresh the materialization one partition at a time. NOTE: this does not yet work for the initial load. This must specify a column that exists on the view and it must be of a comparable type that is convertable to string values.

true

n/a

teiid_rel:MATVIEW_PART_LOAD_VALUES

If MATVIEW_PART_LOAD_COLUMN is specified, this may be a query expression that returns a single column providing the partition values. e.g. for multi-source you can get the source names via the query "select s.name from (exec sysadmin.schemaSources('schema name')) s"

true

the distinct values for the MATVIEW_PART_LOAD_COLUMN selected with option no cache.

The pref_mem option also applies to internal materialized views. Internal table index pages already have a memory preference, so the perf_mem option indicates that the data pages should prefer memory as well.

All internal materialized view refresh and updates happen atomically. Internal materialized views support READ_COMMITTED (used also for READ_UNCOMMITED) and SERIALIZABLE (used also for REPEATABLE_READ) transaction isolation levels.

A sample VDB defining an internal materialization
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<vdb name="sakila" version="1">

    <model name="pg">
        <source name="pg" translator-name="postgresql" connection-jndi-name="java:/sakila-ds"/>
    </model>

    <model name="sakila" type="VIRTUAL">
    <metadata type="DDL"><![CDATA[
        CREATE VIEW actor (
           actor_id integer,
           first_name varchar(45) NOT NULL,
           last_name varchar(45) NOT NULL,
           last_update timestamp NOT NULL
        ) OPTIONS (materialized true,
               UPDATABLE 'TRUE',
               "teiid_rel:MATVIEW_TTL" 120000,
               "teiid_rel:MATVIEW_PREFER_MEMORY" 'true',
               "teiid_rel:ALLOW_MATVIEW_MANAGEMENT" 'true',
               "teiid_rel:MATVIEW_UPDATABLE" 'true',
               "teiid_rel:MATVIEW_SCOPE" 'vdb')
            AS SELECT actor_id, first_name, last_name, last_update from pg."public".actor;
     ]]>
    </metadata>
    </model>
</vdb>

Loading And Refreshing

An internal materialized view table is initially in an invalid state (there is no data).

  • If teiid_rel:ALLOW_MATVIEW_MANAGEMENT is specified as true, then the initial load will occur on vdb startup.

  • If teiid_rel:ALLOW_MATVIEW_MANAGEMENT is not specified or false, then the load of the materialization table will occur on implicit on the first query that accesses the table.

When a refresh happens while the materialization table is invalid all other queries against the materialized view will block until the load completes.

Using System Procedure

In some situations administrators may wish to better control when the cache is loaded with a call to SYSADMIN.refreshMatView. The initial load may itself trigger the initial load of dependent materialized views. After the initial load user queries against the materialized view table will only block if it is in an invalid state. The valid state may also be controlled through the SYSADMIN.refreshMatView procedure.

Invalidating Refresh
CALL SYSADMIN.refreshMatView(viewname=>'schema.matview', invalidate=>true)

matview will be refreshed and user queries will block until the refresh is complete (or fails).

While the initial load may trigger a transitive loading of dependent materialized views, subsequent refreshes performed with refreshMatView will use dependent materialized view tables if they exist. Only one load may occur at a time. If a load is already in progress when the SYSADMIN.refreshMatView procedure is called, it will return -1 immediately rather than preempting the current load.

Using TTL Snapshot Refresh

The Cache Hint or extension properties may be used to automatically trigger a full snapshot refresh after a specified time to live (ttl). The behavior is different depending on whether the materialization is managed or non-managed.

For non-managed views the ttl starts from the time the table is finished loading and the refresh will be initiated after the ttl has expired on a view access.

For managed views the ttl is a fixed interval and refreshes will be triggered regardless of view usage.

In either case the refresh is equivalent to CALL SYSADMIN.refreshMatView('view name', *), where the invalidation behavior * is determined by the vdb property lazy-invalidate. By default ttl refreshes are invalidating, which will cause other user queries to block while loading. That is once the ttl has expired, the next access will be required to refresh the materialized table in a blocking manner. If you would rather that the ttl is enforced lazily, such that the current contents are not replaced until the refresh completes, set the vdb property lazy-invalidate=true.

Auto-refresh Transformation Query*
/*+ cache(ttl:3600000) */ select t.col, t1.col from t, t1 where t.id = t1.id

The resulting materialized view will be reloaded every hour (3600000 milliseconds).

TTL Snapshot Refresh Limitations

  • The automatic ttl refresh may not be suitable for complex loading scenarios as nested materialized views will be used by the refresh query.

  • The non-managed ttl refresh is performed lazily, that is it is only trigger by using the table after the ttl has expired. For infrequently used tables with long load times, this means that data may be used well past the intended ttl.

Updatable

In advanced use-cases the cache hint may also be used to mark an internal materialized view as updatable. An updatable internal materialized view may use the SYSADMIN.refreshMatViewRow procedure to update a single row in the materialized table. If the source row exists, the materialized view table row will be updated. If the source row does not exist, the correpsonding materialized row will be deleted. To be updatable the materialized view must have a single column primary key. Composite keys are not yet supported by SYSADMIN.refreshMatViewRow. Transformation Query:

/*+ cache(updatable) */ select t.col, t1.col from t, t1 where t.id = t1.id

Update SQL:

CALL SYSADMIN.refreshMatViewRow(viewname=>'schema.matview', key=>5)

Given that the schema.matview defines an integer column col as its primary key, the update will check the live source(s) for the row values.

The update query will not use dependent materialized view tables, so care should be taken to ensure that getting a single row from this transformation query performs well. See the Reference Guide for information on controlling dependent joins, which may be applicable to increasing the performance of retrieving a single row. The refresh query does use nested caches, so this refresh method should be used with caution.

When the updatable option is not specified, accessing the materialized view table is more efficient because modifications do not need to be considered. Therefore, only specify the updatable option if row based incremental updates are needed. Even when performing row updates, full snapshot refreshes may be needed to ensure consistency.

The EventDistributor also exposes the updateMatViewRow as a lower level API for Programmatic Control - care should be taken when using this update method.

Indexes

Internal materialized view tables will automatically create a unique index for each unique constraint and a non-unique index for each index defined on the materialized view. The primary key (if it exists) of the view will automatically be part of a clustered index.

The secondary indexes are always created as ordered trees - bitmap or hash indexes are not supported. Teiid Spring Boot’s metadata for indexes is currently limited. We are not currently able to capture additional information, sort direction, additional columns to cover, etc. You may workaround some of these limitations though.

  • Function based index are supported, but can only be specified through DDL metadata.  If you are not using DDL metadata, consider adding another column to the view that projects the function expression, then place an index on that new column. Queries to the view will need to be modified as appropriate though to make use of the new column/index.

  • If additional covered columns are needed, they may simply be added to the index columns. This however is only applicable to comparable types. Adding additional columns will increase the amount of space used by the index, but may allow its usage to result in higher performance when only the covered columns are used and the main table is not consulted.

Running Multiple Instances

Each instance will maintain its own copy of each materialized table and associated indexes as their is no built-in mechanism for clustering provide with Teiid Spring Boot.

results matching ""

    No results matching ""