Connecting to Hive
This section contains details for Hive connections.
General information
Field | Description |
---|---|
Data source | Hive |
Supported versions | 2.6.19.1022 |
Connection string | jdbc:hive2:// |
Packaged? | No |
Certified? | Yes |
Supported features | Estimate job: Yes. Analyze data: Yes. Schedule: Yes. |
Processing capabilities | Pushdown: No. Spark agent: Yes. Yarn agent: Yes. Parallel JDBC: Yes. |
Java Platform version compatibility | JDK 8: Yes. JDK 11: Yes. |
Minimum user permissions
To bring your Hive data into Collibra Data Quality & Observability, you need the following permissions:
- The Kerberos user has read permissions on Hive tables.
- ROLE_ADMIN assigned to your user in Collibra DQ.
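A quick way to confirm connectivity and the Kerberos user's read permissions end to end is a small JDBC smoke test. The sketch below is a minimal example, assuming the hive-jdbc driver is on the classpath; the host, database, and user are hypothetical placeholders, and 10000 is the default HiveServer2 port.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveSmokeTest {
    public static void main(String[] args) throws Exception {
        // Register the Hive JDBC driver (the Driver Name used by the connection).
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Hypothetical host and database; 10000 is the default HiveServer2 port.
        String url = "jdbc:hive2://hive-host.example.com:10000/default";

        try (Connection conn = DriverManager.getConnection(url, "hive_user", "");
             Statement stmt = conn.createStatement();
             // Listing tables exercises the read access the Kerberos user needs.
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```

If this query fails with an authorization error, the user is likely missing read permissions on the Hive tables.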
Recommended and required connection properties
Required | Connection Property | Type | Value |
---|---|---|---|
Yes | Name | Text | The unique name of your connection. Ensure that there are no spaces in your connection name. |
No | Is Hive | Option | Uses the Hive server engine for distributed speed and scale. |
Yes | Connection URL | String | The connection string path of your Hive connection. |
Yes | Driver Name | String | The driver class name of your connection, for example org.apache.hive.jdbc.HiveDriver. |
Yes | Port | Integer | The port number to establish a connection to the data source. The default HiveServer2 port is 10000. |
No | Source Name | String | N/A |
No | Target Agent | Option | The Agent that submits your Spark job for processing. |
Yes | Auth Type | Option | The method to authenticate your connection. Note: The configuration requirements differ depending on the Auth Type you select. See Authentication for details on the available authentication types. |
No | Properties | String | The configurable driver properties for your connection. Multiple properties must be comma-delimited, for example abc=123,test=true. Optionally add hive.resultset.use.unique.column.names=false to remove the enforcement of unique column names, or set the Spark conf spark.hadoop.hive.metastore.uris to allow Spark to read through a warehouse catalog, such as Hive. |
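To illustrate how such a driver property takes effect, the sketch below appends hive.resultset.use.unique.column.names=false to the Hive configuration list of the connection URL (the part after ?). This is a minimal example, not the exact mechanism Collibra DQ uses to pass the comma-delimited Properties field; the host, database, user, and sample_table are hypothetical placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

public class HiveDriverPropertiesExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Hive conf variables follow the '?' in a HiveServer2 URL:
        // jdbc:hive2://<host>:<port>/<db>;<session vars>?<hive conf list>
        String url = "jdbc:hive2://hive-host.example.com:10000/default"
                + "?hive.resultset.use.unique.column.names=false";

        try (Connection conn = DriverManager.getConnection(url, "hive_user", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM sample_table LIMIT 1")) {
            // With unique column names disabled, labels come back as "col"
            // rather than "sample_table.col".
            ResultSetMetaData md = rs.getMetaData();
            for (int i = 1; i <= md.getColumnCount(); i++) {
                System.out.println(md.getColumnLabel(i));
            }
        }
    }
}
```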
Authentication
Select an authentication type from the dropdown menu. The options available in the dropdown menu are the currently supported authentication types for this data source.
Required | Field | Description |
---|---|---|
Yes | Principal | The Kerberos entity to authenticate and grant access to your connection. |
Yes | Keytab | The file path of the keytab file that contains the encrypted key for a Kerberos principal. Example: /tmp/keytab/hive_user.keytab |
Yes | Password | The secret credential associated with your Kerberos principal. |
Yes | Script | The file path of the script file used to interact with and authenticate a Kerberos user. Example: /tmp/keytab/hive_pwd_mgr.sh |
No | Param $1 | Optional. Additional Kerberos parameter. |
No | Param $2 | Optional. Additional Kerberos parameter. |
No | Param $3 | Optional. Additional Kerberos parameter. |
Yes | TGT Cache | The ticket-granting ticket cache that stores the TGT to authenticate your connection. |
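To make the Kerberos fields above concrete, the following sketch shows the common principal-plus-keytab pattern using the Hadoop UserGroupInformation API: log in from the keytab, then open a HiveServer2 connection whose URL names the service principal. All principals, hosts, and paths are hypothetical placeholders, and the sketch assumes hadoop-common and hive-jdbc are on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class HiveKerberosConnect {
    public static void main(String[] args) throws Exception {
        // Tell the Hadoop security layer to authenticate with Kerberos.
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // Principal and Keytab correspond to the fields in the table above.
        UserGroupInformation.loginUserFromKeytab(
                "hive_user@EXAMPLE.COM", "/tmp/keytab/hive_user.keytab");

        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // The principal= session variable names the HiveServer2 service
        // principal; _HOST is expanded to the server's hostname.
        String url = "jdbc:hive2://hive-host.example.com:10000/default;"
                + "principal=hive/_HOST@EXAMPLE.COM";

        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Kerberos connection open: " + !conn.isClosed());
        }
    }
}
```

Password, script, and TGT cache logins follow the same overall shape; only the credential source that obtains the Kerberos ticket changes.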