Teradata Batch Source - CDAP Documentation

The Teradata Batch source is available in the Hub.

Plugin version: 1.7.0

Reads from a Teradata database using a configurable SQL query. Outputs one record for each row returned by the query.

Use this source when you need to read from a Teradata database. For example, you may want to create daily snapshots of a database table by using this source and writing to a table in BigQuery.

Configuration

| Property | Macro Enabled? | Version Introduced | Description |
| --- | --- | --- | --- |
| Reference Name | No | | Required. Name used to uniquely identify this source for lineage, annotating metadata, etc. |
| Driver Name | No | | Required. Name of the JDBC driver to use. Default is teradata. |
| Host | Yes | | Required. Host that Teradata is running on. Default is localhost. |
| Port | Yes | | Required. Port that Teradata is running on. Default is 1025. |
| Database | Yes | | Required. Teradata database name. |
| Import Query | Yes | | Required. The SELECT query to use to import data from the specified table. You can specify an arbitrary number of columns to import, or import all columns using *. The query should contain the ‘$CONDITIONS’ string. For example, ‘SELECT * FROM table WHERE $CONDITIONS’. The ‘$CONDITIONS’ string will be replaced by the Split-By Field Name field limits specified by the bounding query. The ‘$CONDITIONS’ string is not required if Number of Splits to Generate is set to 1. |
| Bounding Query | Yes | | Optional. The bounding query should return the minimum and maximum values of the ‘splitBy’ field. For example, ‘SELECT MIN(id),MAX(id) FROM table’. Not required if Number of Splits to Generate is set to 1. |
| Split-By Field Name | Yes | | Optional. Name of the field used to generate splits. Not required if Number of Splits to Generate is set to 1. |
| Number of Splits to Generate | Yes | | Optional. Number of splits to generate. Default is 1. |
| Fetch Size | Yes | 6.6.0/1.7.0 | Optional. The number of rows to fetch at a time per split. A larger fetch size can result in a faster import, with the trade-off of higher memory usage. Default is 1000. |
| Username | Yes | | Optional. User identity for connecting to the specified database. |
| Password | Yes | | Optional. Password to use to connect to the specified database. |
| Connection Arguments | Yes | | Optional. A list of arbitrary string key/value pairs as connection arguments. These arguments will be passed to the JDBC driver as connection arguments for JDBC drivers that may need additional configurations. |
| Schema | | | Required. The schema of records output by the source. This will be used in place of whatever schema comes back from the query. However, it must match the schema that comes back from the query, except that it can mark fields as nullable and can contain a subset of the fields. |
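
The interaction between Import Query, Bounding Query, Split-By Field Name, and Number of Splits to Generate can be illustrated with a small sketch. It assumes an integer split column and evenly sized ranges; the function names `split_conditions` and `expand_import_query` are hypothetical, and the plugin's real split generation happens inside the execution framework and may produce different boundaries.

```python
def split_conditions(field, lo, hi, num_splits):
    """Return one WHERE fragment per split, together covering [lo, hi].

    lo and hi stand for the MIN and MAX returned by the bounding query.
    Assumption: the split-by field is an integer column.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    if hi < lo:
        raise ValueError("hi must not be smaller than lo")
    span = hi - lo + 1
    num_splits = min(num_splits, span)  # avoid creating empty splits
    conditions = []
    start = lo
    for i in range(num_splits):
        # Spread the remainder over the first splits so sizes differ by at most 1.
        size = span // num_splits + (1 if i < span % num_splits else 0)
        end = start + size
        if i == num_splits - 1:
            # The last split includes the upper bound itself.
            conditions.append(f"{field} >= {start} AND {field} <= {hi}")
        else:
            conditions.append(f"{field} >= {start} AND {field} < {end}")
        start = end
    return conditions


def expand_import_query(import_query, field, lo, hi, num_splits):
    """Substitute each split's condition for the $CONDITIONS placeholder."""
    return [
        import_query.replace("$CONDITIONS", condition)
        for condition in split_conditions(field, lo, hi, num_splits)
    ]


queries = expand_import_query(
    "SELECT * FROM users WHERE $CONDITIONS", "id", 1, 100, 4
)
for query in queries:
    print(query)
```

Each resulting query reads a disjoint range of the split-by field, which is why the placeholder and the bounding query are only needed when more than one split is requested.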

Example

Suppose you want to read data from a Teradata database named “prod” that is running on “localhost”, port 1025, as the “dbc” user with the “dbc” password. Ensure that the driver for Teradata is installed; you can also provide the name of a specific driver, otherwise “teradata” will be used. Configure the plugin with:

| Property | Value |
| --- | --- |
| Reference Name | src1 |
| Driver Name | teradata |
| Host | localhost |
| Port | 1025 |
| Database | prod |
| Import Query | select id, name, email, phone from users |
| Number of Splits to Generate | 1 |
| Username | dbc |
| Password | dbc |
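
The Host, Port, Database, and Connection Arguments values above ultimately describe a JDBC connection. As a rough sketch, the Teradata JDBC driver accepts URLs of the form `jdbc:teradata://host/NAME=value,NAME=value`, with `DATABASE` and `DBS_PORT` among its parameters. The helper name `teradata_jdbc_url` is hypothetical, and whether the plugin assembles exactly this string is an assumption.

```python
def teradata_jdbc_url(host, port, database, connection_arguments=None):
    """Build a Teradata JDBC URL from the plugin-style settings.

    Assumption: connection arguments are appended as extra
    comma-separated NAME=value pairs after DATABASE and DBS_PORT.
    """
    params = [f"DATABASE={database}", f"DBS_PORT={port}"]
    for key, value in (connection_arguments or {}).items():
        params.append(f"{key}={value}")
    return f"jdbc:teradata://{host}/" + ",".join(params)


# Values taken from the example configuration.
url = teradata_jdbc_url("localhost", 1025, "prod")
print(url)

# An extra driver parameter supplied as a connection argument.
url_with_args = teradata_jdbc_url("localhost", 1025, "prod", {"TMODE": "ANSI"})
print(url_with_args)
```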

With this import query, if the ‘id’ column is a primary key of type int and the other columns are non-nullable varchars, output records will have this schema:

| Field | Type |
| --- | --- |
| id | int |
| name | string |
| email | string |
| phone | string |
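
CDAP schemas are commonly written as Avro-style JSON, where a nullable field is a union of its type and "null". The sketch below builds such a document for the schema above and marks one field as nullable, which the Schema property permits. The helper name `record_schema` and the record name "output" are assumptions made for illustration.

```python
import json


def record_schema(name, fields, nullable=()):
    """Build an Avro-style record schema as a plain dict.

    fields is a list of (field name, type name) pairs; any field listed
    in nullable becomes a union of its type and "null".
    """
    out = []
    for field_name, field_type in fields:
        if field_name in nullable:
            out.append({"name": field_name, "type": [field_type, "null"]})
        else:
            out.append({"name": field_name, "type": field_type})
    return {"type": "record", "name": name, "fields": out}


schema = record_schema(
    "output",  # hypothetical record name
    [("id", "int"), ("name", "string"), ("email", "string"), ("phone", "string")],
    nullable={"phone"},  # the configured schema may relax a field to nullable
)
print(json.dumps(schema, indent=2))
```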

Data Types Mapping

Teradata-specific data types are mapped to string; they can have multiple input formats and one ‘canonical’ output form. To determine the proper formats, see the Teradata data types documentation.

| Teradata Data Type | CDAP Schema Data Type |
| --- | --- |
| BYTEINT | INT |
| SMALLINT | INT |
| INTEGER | INT |
| BIGINT | LONG |
| DECIMAL/NUMERIC | DECIMAL |
| FLOAT/REAL/DOUBLE PRECISION | DOUBLE |
| NUMBER | DECIMAL |
| BYTE | BYTES |
| VARBYTE | BYTES |
| BLOB | BYTES |
| CHAR | STRING |
| VARCHAR | STRING |
| CLOB | STRING |
| DATE | DATE |
| TIME | TIME_MICROS |
| TIMESTAMP | TIMESTAMP_MICROS |
| TIME WITH TIME ZONE | TIME_MICROS |
| TIMESTAMP WITH TIME ZONE | TIMESTAMP_MICROS |
| INTERVAL YEAR | STRING |
| INTERVAL YEAR TO MONTH | STRING |
| INTERVAL MONTH | STRING |
| INTERVAL DAY | STRING |
| INTERVAL DAY TO HOUR | STRING |
| INTERVAL DAY TO MINUTE | STRING |
| INTERVAL DAY TO SECOND | STRING |
| INTERVAL HOUR | STRING |
| INTERVAL HOUR TO MINUTE | STRING |
| INTERVAL HOUR TO SECOND | STRING |
| INTERVAL MINUTE | STRING |
| INTERVAL MINUTE TO SECOND | STRING |
| INTERVAL SECOND | STRING |
| ST_Geometry | STRING |
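
When checking a configured schema against a table definition, the mapping above can be expressed as a lookup. This is a minimal sketch that mirrors the table only; the helper name `cdap_type` is hypothetical, and the normalization of declared types (dropping length and precision, ignoring case) is an assumption about how column types might be written, not plugin behavior.

```python
import re

# Mirrors the mapping table; INTERVAL types are handled separately below.
TERADATA_TO_CDAP = {
    "BYTEINT": "INT", "SMALLINT": "INT", "INTEGER": "INT", "BIGINT": "LONG",
    "DECIMAL": "DECIMAL", "NUMERIC": "DECIMAL", "NUMBER": "DECIMAL",
    "FLOAT": "DOUBLE", "REAL": "DOUBLE", "DOUBLE PRECISION": "DOUBLE",
    "BYTE": "BYTES", "VARBYTE": "BYTES", "BLOB": "BYTES",
    "CHAR": "STRING", "VARCHAR": "STRING", "CLOB": "STRING",
    "DATE": "DATE",
    "TIME": "TIME_MICROS", "TIME WITH TIME ZONE": "TIME_MICROS",
    "TIMESTAMP": "TIMESTAMP_MICROS",
    "TIMESTAMP WITH TIME ZONE": "TIMESTAMP_MICROS",
    "ST_GEOMETRY": "STRING",
}


def cdap_type(teradata_type):
    """Return the CDAP schema type for a declared Teradata column type."""
    # Drop parenthesized length/precision, e.g. DECIMAL(10,2) or TIMESTAMP(6).
    name = re.sub(r"\([^)]*\)", " ", teradata_type)
    # Upper-case and collapse runs of whitespace.
    name = " ".join(name.upper().split())
    # Every INTERVAL variant in the table maps to STRING; this shortcut
    # does not validate the interval qualifier.
    if name.startswith("INTERVAL "):
        return "STRING"
    return TERADATA_TO_CDAP[name]  # raises KeyError for unlisted types


for declared in ["DECIMAL(10,2)", "varchar(255)", "TIMESTAMP(6) WITH TIME ZONE"]:
    print(declared, "->", cdap_type(declared))
```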
