Connecting to Oracle

This section contains details for Oracle connections.

General information

Field Description
Data source Oracle
Supported versions

12.2.0.1.0

Connection string jdbc:oracle:thin
Packaged?

Yes

Certified?

Yes

Supported features
Estimate job

Yes

Analyze data

Yes

Schedule

Yes

Processing capabilities
Pushdown

No

Spark agent

Yes

Yarn agent

Yes

Parallel JDBC

Yes

Java Platform version compatibility
JDK 8

Yes

JDK 11

Yes

Minimum user permissions

In order to bring your Oracle data into Collibra Data Quality & Observability, you need the following permissions.

  • Read access on Oracle database tables.
  • The Kerberos user has read permissions on Oracle tables (if using Kerberos authentication)
  • ROLE_ADMIN assigned to your user in Collibra DQ.

Recommended and required connection properties

Required Connection Property Type Value

Yes

Name Text The unique name of your connection. Ensure that there are no spaces in your connection name.

Yes

Connection URL String

The connection string path of your Oracle connection.

When referring to the example below, replace the ${value} sections of the connection URL with your actual value.

Example Using SID

jdbc:oracle:thin:@${host}:SID

Example Using database service name

jdbc:oracle:thin:@//${host}:${port}/service_name

Yes

Driver Name String

The driver class name of your connection.

oracle.jdbc.OracleDriver

Yes

Port Integer

The port number to establish a connection to the datasource.

The default port is 1521

No

Source Name String N/A

No

Target Agent Option The Agent that submits your Spark job for processing.

Yes

Auth Type Option

The method to authenticate your connection.

Note The configuration requirements are different depending on the Auth Type you select. See Authentication for more details on available authentication types.

No

Properties String

The configurable driver properties for your connection. Multiple properties must be comma delimited. For example, abc=123,test=true

Connection properties recognized by Oracle JDBC drivers

Name Short Name Type Description
user n/a String The user name for logging into the database
password n/a String The password for logging into the database
database server String The connect string for the database
internal_logon n/a String A role, such as sysdba or sysoper, that allows you to log on as sys
defaultRowPrefetch prefetch String (containing integer value) The default number of rows to prefetch from the server (default value is "10")
remarksReporting remarks String (containing boolean value) "true" if getTables() and getColumns() should report TABLE_REMARKS; equivalent to using setRemarksReporting() (default value is "false")
defaultBatchValue batchvalue String (containing integer value) The default batch value that triggers an execution request (default value is "10")
processEscapes n/a String (containing boolean value) "false" to disable escape processing for statements (Statement or PreparedStatement) created from this connection. Set this to "false" if you want to avoid many calls to Statement.setEscapeProcessing(false);. This is espcially usefull for PreparedStatement where a call to setEscapeProcessing(false) would have no effect. The default is "true".
defaultNChar n/a String (containing boolean value)

"false" is the default. If set to "true", the default behavior for handling character datatypes is changed so that NCHAR/NVARCHAR2 become the default. This means that setFormOfUse() won't be needed anymore when using NCHAR/NVARCHAR2. This can also be set as a java property :java -Doracle.jdbc.defaultNChar=true myApplication

useFetchSizeWithLongColumn n/a String (containing boolean value)

"false" is the default.

Important  This is a thin-only property and should not be used with any other drivers.

If set to "true", the performance when retrieving data in a 'SELECT' will be improved but the default behavior for handling LONG columns will be changed to fetch multiple rows (prefetch size). It means that enough memory will be allocated to read this data. So if you want to use this property, make sure that the LONG columns you are retrieving are not too big or you may run out of memory. This property can also be set as a java property : java -Doracle.jdbc.useFetchSizeWithLongColumn=true myApplication

SetFloatAndDoubleUseBinary n/a String (containing boolean value)

"false" is the default. If set to "true", causes the java.sql.PreparedStatment setFloat and setDouble API's to use internal binary format as for BINARY_FLOAT and BINARY_DOUBLE parameters. See oracle.jdbc.OraclePreparedStatement setBinaryFloat and setBinaryDouble

https://docs.oracle.com/cd/E11882_01/appdev.112/e13995/oracle/jdbc/OracleDriver.html

Authentication

Select an authentication type from the dropdown menu. The options available in the dropdown menu are the currently supported authentication types for this data source.

Required Field Description

Yes

Username

The username of your Oracle account.

Yes

Password

The password of your Oracle account.

Yes

Script

The file path that contains the script file that the password manager uses to interact with and authenticate a user account.

Example /tmp/keytab/oracle_pwd_mgr.sh

No

Param $1 Optional. An additional parameters to authenticate your Oracle connection.

No

Param $2 Optional. An additional parameter to authenticate your Oracle connection.

No

Param $3 Optional. An additional parameter to authenticate your Oracle connection.

Yes

Principal

The Kerberos entity to authenticate and grant access to your connection.

Example [email protected]

Yes

Keytab

The file path of the keytab file that contains the encrypted key for a Kerberos principal.

Example /tmp/keytab/oracle_user.keytab

Important You must update your keytab secrets file with the following values.

Key: cdq_user.keytab

Label: base64 encoded string

Yes

Password The secret credential associated with your Kerberos principal.

Yes

Script

The file path that contains the script file used to interact with and authenticate a Kerberos user.

Example /tmp/keytab/oracle_pwd_mgr.sh

No

Param $1 Optional. Additional Kerberos parameter.

No

Param $2 Optional. Additional Kerberos parameter.

No

Param $3 Optional. Additional Kerberos parameter.

Yes

TGT Cache The ticket-granting ticket cache that stores the TGT to authenticate your connection.

Limitations

  • There is an Apache Spark limitation where Pullup jobs on Oracle datasets that use Timestamp with Time Zone column types fail in the LOAD activity. For more information about the limitation and a workaround, see the support article.
  • Because Oracle normally converts all characters in table and schema names to uppercase, Oracle also treats lower case characters as special characters. A possible workaround is to do the following:
    1. Go to Explorer and click Create DQ Job on an Oracle table.
    2. In the Query box, wrap the schema and table name with double quotes ". For example, select * from "schema"."tablename".
    3. Go to the Run page and click the Run CMD tab.
    4. Click the lock icon to unlock the command line.
    5. Wrap backslashes \ around the double quoted schema and table names, and add a whitespace before the ending double quote of the -q section. For example, -q "select * from \"schema\".\"table\" "
  • The SDO_GEOMETRY column data type is not supported on Oracle connections.
  • When you add a timeslice to a column in an Oracle dataset, you must use the "TO_DATE" function to ensure the successful execution of your DQ Job. When a column includes a timeslice without the "TO_DATE" function, the DQ Job will fail with the error, "ORA-00907: missing right parenthesis."
    • When you include to_date within the command line query, your DQ Job will run successfully with a non-zero time. In the following example of a command line query with a to_date, TRADE_DATE is the timestamp column:
      • -owluser user -numexecutors 1 -lib "/opt/owl/drivers/oracle/" -executormemory 1g -bhminon -h dq:5432/dq?currentSchema=validation -executorcores 2 -ds ORACLE_DQUSER.NYSE_2 -master k8s:// -drivermemory 1g -deploymode cluster -q "select * from DQUSER.NYSE where TRADE_DATE = to_date('${rd}', 'yyyy-mm-dd hh24:mi:ss')" -rd "2018-01-16 05:00:00" -bhlb 10 -driver "oracle.jdbc.OracleDriver" -bhmaxon -loglevel INFO -cxn ORACLE