Amazon S3 Translator

The Amazon S3 translator, known by the type name amazon-s3, exposes stored procedures to leverage Amazon S3 object resources. The Web Service Data Source resource-adapter will be used for access. This will commonly be used with the TEXTTABLE or XMLTABLE table functions to use CSV or XML formatted data or read excel files, or other object files stored in the Amazon S3. This translator supports access to Amazon S3 using access-key and secret-key.

Usage

Here is sample VDB that is reading CSV file from Amazon S3 with name 'g2.txt' in the Amazon S3 bucket called 'teiidbucket'

e1,e2,e3
5,'five',5.0
6,'six',6.0
7,'seven',7.0
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<vdb name="example" version="1">
    <model name="s3">
        <source name="web-connector" translator-name="user-s3" connection-jndi-name="java:/amazon-s3"/>
    </model>
    <model name="Stocks" type="VIRTUAL">
        <metadata type="DDL"><![CDATA[
        CREATE VIEW G2 (e1 integer, e2 string, e3 double,PRIMARY KEY (e1))
            AS  SELECT SP.e1, SP.e2,SP.e3
                FROM (EXEC s3.getTextFile(name=>'g2.txt')) AS f,
                TEXTTABLE(f.file COLUMNS e1 integer, e2 string, e3 double HEADER) AS SP;
        ]]> </metadata>
    </model>
    <translator name="user-s3" type="amazon-s3">
      <property name="accesskey" value="xxxx"/>
      <property name="secretkey" value="xxxx"/>
      <property name="region" value="us-east-1"/>
      <property name="bucket" value="teiidbucket"/>
    </translator>
</vdb>

Execution Properties

Use the translator override mechanism to supply the following properties.

Name Description Default

Encoding

The encoding that should be used for CLOBs returned by the getTextFiles procedure. The value should match an encoding known to the JRE.

The system default encoding

Accesskey

Amazon Security Access Key. Log in to Amazon console to find your security access key. When provided, this becomes the default access key

n/a

Secretkey

Amazon Security secret Key. Log in to Amazon console to find your security secret key. When provided, this becomes the default secret key.

n/a

Region

Amazon Region to be used with the request. When provided this will be default region used.

n/a

Bucket

Amazon S3 bucket name, if provided this will serve as default bucket to be used for all the requests

n/a

Encryption

When SSE-C type encryption used, where customer supplies the encryption key, this key will be used for defining the "type" of encryption algorithm used. Supported are AES-256, AWS-KMS. If provided this will be used as default algorithm for all "get" based calls

n/a

Encryptionkey

When SSE-C type encryption used, where customer supplies the encryption key, this key will be used for defining the "encryption key". If provided this will be used as default key for all "get" based calls

n/a

Tip
See override an execution property and the example below to set the properties.

Procedures Exposed by Translator

When you add the a model (schema) like above in the example, the following procedure calls are available for user to execute against Amazon S3.

Note
Please note that bucket, region, accesskey, secretkey, encryption and encryptionkey are optional or nullable parameters in most of the methods provided. i.e. user do not need to provide them, if they are already configured using translator override properties as shown in above vdb example.

getTextFile(…​)

Retrieves the given named object as text file from specified bucket and region using the provided security credentials as clob.

getTextFile(string name NOT NULL, string bucket, string region,
   string endpoint, string accesskey, string secretkey,string encryption, string encryptionkey, boolean stream default false)
   returns TABLE(file blob, endpoint string, lastModified string, etag string, size long);
Note
endpoint is optional, when provided the this URL will be used instead of the one constructed by the supplied properties. Use encryption and encryptionkey only in when server side security with customer supplied keys (SSE-C) in force.

If stream is true, then returned lobs may only be read once and will not typically be buffered to disk.

examples

exec getTextFile(name=>'myfile.txt');

SELECT SP.e1, SP.e2,SP.e3, f.lastmodified
   FROM (EXEC getTextFile(name=>'myfile.txt')) AS f,
   TEXTTABLE(f.file COLUMNS e1 integer, e2 string, e3 double HEADER) AS SP;

getFile(…​)

Retrieves the given named object as binary file from specified bucket and region using the provided security credentials as blob.

getFile(string name NOT NULL, string bucket, string region,
   string endpoint, string accesskey, string secretkey, string encryption, string encryptionkey, boolean stream default false)
   returns TABLE(file blob, endpoint string, lastModified string, etag string, size long)
Note
endpoint is optional, when provided the this URL will be used instead of the one constructed by the supplied properties. Use encryption and encryptionkey only in when server side security with customer supplied keys (SSE-C) in force.

If stream is true, then returned lobs may only be read once and will not typically be buffered to disk.

examples

exec getFile(name=>'myfile.xml', bucket=>'mybucket', region=>'us-east-1', accesskey=>'xxxx', secretkey=>'xxxx');

select b.* from (exec getFile(name=>'myfile.xml', bucket=>'mybucket', region=>'us-east-1', accesskey=>'xxxx', secretkey=>'xxxx')) as a,
XMLTABLE('/contents' PASSING XMLPARSE(CONTENT a.result WELLFORMED) COLUMNS e1 integer, e2 string, e3 double) as b;

saveFile(…​)

Save the CLOB, BLOB, or XML value to given name and bucket. In the below procedure signature contents parameter can be any of the lob types.

call saveFile(string name NOT NULL, string bucket, string region, string endpoint,
   string accesskey, string secretkey, contents object)
Note
currently saveFile does NOT support streaming/chuncked based upload of the contents. i.e. if you try to load very large objects there is risk of reaching out of memory issues. This method does not support SSE-C based security encryption either.

exmaples

exec saveFile(name=>'g4.txt', contents=>'e1,e2,e3\n1,one,1.0\n2,two,2.0');

deleteFile(…​)

Delete the named object from the bucket.

call deleteFile(string name NOT NULL, string bucket, string region, string endpoint, string accesskey, string secretkey)

examples

exec deleteFile(name=>'myfile.txt');

list(…​)

Lists the contents of the bucket.

call list(string bucket, string region, string accesskey, string secretkey, nexttoken string)
    returns Table(result clob)

The result is the XML file that Amazon S3 provides in following format

<?xml version="1.0" encoding="UTF-8"?>/n
<ListBucketResult
    xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <Name>teiidbucket</Name>
    <Prefix></Prefix>
    <KeyCount>1</KeyCount>
    <MaxKeys>1000</MaxKeys>
    <IsTruncated>false</IsTruncated>
    <Contents>
        <Key>g2.txt</Key>
        <LastModified>2017-08-08T16:53:19.000Z</LastModified>
        <ETag>&quot;fa44a7893b1735905bfcce59d9d9ae2e&quot;</ETag>
        <Size>48</Size>
        <StorageClass>STANDARD</StorageClass>
    </Contents>
</ListBucketResult>

You can parse this into view using a example query like below

select b.* from (exec list(bucket=>'mybucket', region=>'us-east-1')) as a,
 XMLTABLE(XMLNAMESPACES(DEFAULT 'http://s3.amazonaws.com/doc/2006-03-01/'), '/ListBucketResult/Contents'
 PASSING XMLPARSE(CONTENT a.result WELLFORMED) COLUMNS Key string, LastModified string, ETag string, Size string,
 StorageClass string,	NextContinuationToken string PATH '../NextContinuationToken') as b;

When all properties like bucket, region, accesskey and secretkey are defined as translator override properties one can also issue simply

SELECT * FROM Bucket

Note: if there are more then 1000 object in the bucket, then the value 'NextContinuationToken' need to be supplied as 'nexttoken' into the list call to fetch the next batch of objects. This can be automated in Teiid with enhancement request.

JCA Resource Adapter

The resource adapter for this translator provided through "Web Service Data Source", Refer to Admin Guide for configuration information.

results matching ""

    No results matching ""