Connecting to Oracle
This section contains details for Oracle connections.
General information
Field | Description |
---|---|
Data source | Oracle |
Supported versions |
12.2.0.1.0 |
Connection string | jdbc:oracle:thin
|
Packaged? |
Yes |
Certified? |
Yes |
Supported features | |
Estimate job
|
Yes |
Analyze data
|
Yes |
Schedule
|
Yes |
Processing capabilities | |
Pushdown
|
No |
Spark agent
|
Yes |
Yarn agent
|
Yes |
Parallel JDBC
|
Yes |
Java Platform version compatibility | |
JDK 8
|
Yes |
JDK 11
|
Yes |
Minimum user permissions
In order to bring your Oracle data into Collibra Data Quality & Observability, you need the following permissions.
- Read access on Oracle database tables.
- The Kerberos user has read permissions on Oracle tables (if using Kerberos authentication)
- ROLE_ADMIN assigned to your user in Collibra DQ.
Recommended and required connection properties
Required | Connection Property | Type | Value |
---|---|---|---|
Yes |
Name | Text | The unique name of your connection. Ensure that there are no spaces in your connection name. |
Yes |
Connection URL | String |
The connection string path of your Oracle connection. When referring to the example below, replace the Example Using SID Example Using database service name |
Yes |
Driver Name | String |
The driver class name of your connection.
|
Yes |
Port | Integer |
The port number to establish a connection to the datasource. The default port is |
No |
Source Name | String | N/A |
No |
Target Agent | Option | The Agent that submits your Spark job for processing. |
Yes |
Auth Type | Option |
The method to authenticate your connection. Note The configuration requirements are different depending on the Auth Type you select. See Authentication for more details on available authentication types. |
No |
Properties | String |
The configurable driver properties for your connection. Multiple properties must be comma delimited. For example, abc=123,test=true |
Connection properties recognized by Oracle JDBC drivers
Name | Short Name | Type | Description |
---|---|---|---|
user | n/a | String | The user name for logging into the database |
password | n/a | String | The password for logging into the database |
database | server | String | The connect string for the database |
internal_logon | n/a | String | A role, such as sysdba or sysoper , that allows you to log on as sys |
defaultRowPrefetch | prefetch | String (containing integer value) | The default number of rows to prefetch from the server (default value is "10") |
remarksReporting | remarks | String (containing boolean value) | "true" if getTables() and getColumns() should report TABLE_REMARKS; equivalent to using setRemarksReporting() (default value is "false") |
defaultBatchValue | batchvalue | String (containing integer value) | The default batch value that triggers an execution request (default value is "10") |
processEscapes | n/a | String (containing boolean value) | "false" to disable escape processing for statements (Statement or PreparedStatement) created from this connection. Set this to "false" if you want to avoid many calls to Statement.setEscapeProcessing(false); . This is espcially usefull for PreparedStatement where a call to setEscapeProcessing(false) would have no effect. The default is "true". |
defaultNChar | n/a | String (containing boolean value) |
"false" is the default. If set to "true", the default behavior for handling character datatypes is changed so that NCHAR/NVARCHAR2 become the default. This means that setFormOfUse() won't be needed anymore when using NCHAR/NVARCHAR2. This can also be set as a java property : |
useFetchSizeWithLongColumn | n/a | String (containing boolean value) |
"false" is the default. Important This is a thin-only property and should not be used with any other drivers. If set to "true", the performance when retrieving data in a 'SELECT' will be improved but the default behavior for handling LONG columns will be changed to fetch multiple rows (prefetch size). It means that enough memory will be allocated to read this data. So if you want to use this property, make sure that the LONG columns you are retrieving are not too big or you may run out of memory. This property can also be set as a java property : |
SetFloatAndDoubleUseBinary | n/a | String (containing boolean value) |
"false" is the default. If set to "true", causes the java.sql.PreparedStatment setFloat and setDouble API's to use internal binary format as for BINARY_FLOAT and BINARY_DOUBLE parameters. See oracle.jdbc.OraclePreparedStatement setBinaryFloat and setBinaryDouble |
https://docs.oracle.com/cd/E11882_01/appdev.112/e13995/oracle/jdbc/OracleDriver.html
Authentication
Select an authentication type from the dropdown menu. The options available in the dropdown menu are the currently supported authentication types for this data source.
Required | Field | Description |
---|---|---|
Yes |
Username |
The username of your Oracle account. |
Yes |
Password |
The password of your Oracle account. |
Yes |
Script |
The file path that contains the script file that the password manager uses to interact with and authenticate a user account. Example /tmp/keytab/oracle_pwd_mgr.sh |
No |
Param $1 | Optional. An additional parameters to authenticate your Oracle connection. |
No |
Param $2 | Optional. An additional parameter to authenticate your Oracle connection. |
No |
Param $3 | Optional. An additional parameter to authenticate your Oracle connection. |
Yes |
Principal |
The Kerberos entity to authenticate and grant access to your connection. Example [email protected] |
Yes |
Keytab |
The file path of the keytab file that contains the encrypted key for a Kerberos principal. Example /tmp/keytab/oracle_user.keytab Important You must update your keytab secrets file with the following values. Key: cdq_user.keytab Label: base64 encoded string |
Yes |
Password | The secret credential associated with your Kerberos principal. |
Yes |
Script |
The file path that contains the script file used to interact with and authenticate a Kerberos user. Example /tmp/keytab/oracle_pwd_mgr.sh |
No |
Param $1 | Optional. Additional Kerberos parameter. |
No |
Param $2 | Optional. Additional Kerberos parameter. |
No |
Param $3 | Optional. Additional Kerberos parameter. |
Yes |
TGT Cache | The ticket-granting ticket cache that stores the TGT to authenticate your connection. |
Limitations
- There is an Apache Spark limitation where Pullup jobs on Oracle datasets that use Timestamp with Time Zone column types fail in the LOAD activity. For more information about the limitation and a workaround, see the support article.
- Because Oracle normally converts all characters in table and schema names to uppercase, Oracle also treats lower case characters as special characters. A possible workaround is to do the following:
- Go to Explorer and click Create DQ Job on an Oracle table.
- In the Query box, wrap the schema and table name with double quotes
"
. For example,select * from "schema"."tablename"
. - Go to the Run page and click the Run CMD tab.
- Click the lock icon to unlock the command line.
- Wrap backslashes
\
around the double quoted schema and table names, and add a whitespace before the ending double quote of the -q section. For example,-q "select * from \"schema\".\"table\" "
- The SDO_GEOMETRY column data type is not supported on Oracle connections.
- When you add a timeslice to a column in an Oracle dataset, you must use the "TO_DATE" function to ensure the successful execution of your DQ Job. When a column includes a timeslice without the "TO_DATE" function, the DQ Job will fail with the error, "ORA-00907: missing right parenthesis."
- When you include
to_date
within the command line query, your DQ Job will run successfully with a non-zero time. In the following example of a command line query with ato_date
,TRADE_DATE
is the timestamp column:-owluser user -numexecutors 1 -lib "/opt/owl/drivers/oracle/" -executormemory 1g -bhminon -h dq:5432/dq?currentSchema=validation -executorcores 2 -ds ORACLE_DQUSER.NYSE_2 -master k8s:// -drivermemory 1g -deploymode cluster -q "select * from DQUSER.NYSE where TRADE_DATE = to_date('${rd}', 'yyyy-mm-dd hh24:mi:ss')" -rd "2018-01-16 05:00:00" -bhlb 10 -driver "oracle.jdbc.OracleDriver" -bhmaxon -loglevel INFO -cxn ORACLE
- When you include