Connecting to Snowflake
Important Click the Pullup or Pushdown tab to display the Snowflake connection information for either processing method.
- Pullup
- Pushdown
This section contains details for Snowflake Pullup connections.
General information
Field | Description |
---|---|
Data source | Snowflake |
Supported versions | 3.14.4 |
Connection string | jdbc:snowflake://
|
Packaged? |
Yes |
Certified? |
Yes |
Supported features | |
Estimate job
|
Yes |
Analyze data
|
Yes |
Schedule
|
Yes |
Processing capabilities | |
Pushdown
|
No Important To see information for Snowflake Pushdown, click the Pushdown tab above. |
Spark agent
|
Yes |
Yarn agent
|
Yes |
Parallel JDBC
|
Yes |
Java Platform version compatibility | |
JDK 8
|
Yes |
JDK 11
|
Yes |
Minimum user permissions
In order to bring your Snowflake data into Collibra Data Quality & Observability, you need the following permissions.
- Read access on your Snowflake database tables.
- ROLE_ADMIN assigned to your user in Collibra DQ.
Recommended and required connection properties
Required | Connection Property | Type | Value |
---|---|---|---|
Yes |
Name | Text | The unique name of your connection. Ensure that there are no spaces in your connection name. |
Yes |
Connection URL | String |
The connection string path of your Snowflake connection. When referring to the example below, replace the Example Important If your connection URL contains special characters and your Kubernetes deployment of Collibra DQ is a version before 2023.08, then you need to set the following Helm value to ensure you do not receive an invalid connection exception: |
Yes |
Driver Name | String |
The driver class name of your Snowflake connection.
|
Yes |
Port | Integer |
The port number to establish a connection to the datasource. The default port is |
No |
Source Name | String | N/A |
No |
Target Agent | Option | The Agent that submits your Spark job for processing. |
Yes |
Auth Type | Option |
The method to authenticate your connection. Note The configuration requirements are different depending on the Auth Type you select. See Authentication for more details on available authentication types. |
No |
Pushdown | Option |
Switches Pushdown processing on or off. Important This option should not be selected for Pullup connections. |
No |
Properties | String |
The configurable driver properties for your connection. Multiple properties must be comma delimited. For example, abc=123,test=true Optionally add the following connection property to limit the databases and tables displayed in Explorer to only the tables to which a user has access. CLIENT_METADATA_REQUEST_USE_CONNECTION_CTX=TRUE |
For more information about optional JDBC connection parameters, see the official Snowflake documentation.
Authentication
Select an authentication type from the dropdown menu. The options available in the dropdown menu are the currently supported authentication types for this data source.
Required | Field | Description |
---|---|---|
Yes |
Username |
The username of your Snowflake service account. |
Yes |
Password |
The password of your Snowflake service account. |
Yes |
Script |
The file path that contains the script file that the password manager uses to interact with and authenticate a user account. Example /tmp/keytab/snowflake_pwd_mgr.sh |
No |
Param $1 | Optional. An additional parameter to authenticate your Snowflake connection. |
No |
Param $2 | Optional. An additional parameter to authenticate your Snowflake connection. |
No |
Param $3 | Optional. An additional parameter to authenticate your Snowflake connection. |
Tip To use JWT authentication, enter any value as the Username and your token value as the Password.
Configuring key pair authentication via the Connection URL
The table below shows the properties used to configure key pair authentication with a Connection URL in the New JDBC Connection modal.
Property | Description |
---|---|
Connection URL |
jdbc:snowflake://<URL>?db=owluserdb&warehouse=owluserdb&authenticator=snowflake_jwt&private_key_file=/<your-file-path>/keytab/snowflake_key.p8&private_key_file_pwd=<your-password> The file name should include a .p8 extension, as shown in the example above. Note Connection URL parameters must be appended by ampersand ( |
Username | The username for which the private key is created. For example, DQ_User. |
Password | The password field must be left empty. |
Configuring key pair authentication via the Driver Properties
The table below shows the properties used to configure key pair authentication with the Driver Properties in the New JDBC Connection modal.
Property | Description |
---|---|
Username | The username for which the private key is created. For example, DQ_User. |
Password | The password field must be left empty. |
Driver Properties |
authenticator=snowflake_jwt,private_key_file=/<your-file-path>/snowflakeKeys/rsa_key2.p8,private_key_file_pwd=<your-password> The file name should include a .p8 extension, as shown in the example above. Note Driver properties must be comma separated, as shown in the example above. |
Additional connection parameters
Parameter | Description |
---|---|
<account_identifier>
|
Specifies the account identifier for your Snowflake account. For more details, see Account Identifiers. For examples of the account identifier used in a JDBC connection string, see Examples. |
<connection_params>
|
Specifies a series of one or more parameters, in the form of <param>=<value> , with each parameter separated by the ampersand character (& ), and no spaces anywhere in the connection string. |
For more information on individual connection parameters, see the official Snowflake documentation about JDBC Driver Connection Parameter Reference.
Other parameters
Any session parameter can be included in the connection string. For example:
Parameter | Description |
---|---|
CLIENT_SESSION_KEEP_ALIVE=<Boolean>
|
Specifies whether to keep the current session active after a period of inactivity, or to force the user to login again. When the boolean value is When the boolean value is The default boolean value is |
For descriptions of all the session parameters, see the official Snowflake documentation about Parameters.
Optional parameter examples
The following is an example of the connection string that uses an account identifier that specifies the account myaccount
in the organization myorganization
.
Copyjdbc:snowflake://myorganization-myaccount.snowflakecomputing.com/?user=peter&warehouse=mywh&db=mydb&schema=public
The following is an example of a connection string that uses the account locator xy12345
as the account identifier:
Copyjdbc:snowflake://xy12345.snowflakecomputing.com/?user=peter&warehouse=mywh&db=mydb&schema=public
Note This example uses an account in the AWS US West (Oregon) region. If the account is in a different region or if the account uses a different cloud provider, you need to specify additional segments after the account locator.
PrivateLink
Please let us know if you are using PrivateLink for Snowflake. Setup can vary depending on the endpoint that is created. In most cases, use the private endpoint as a normal JDBC connection.
For additional information, see the Snowflake community article about troubleshooting AWS PrivateLink.
Advanced PrivateLink and proxy
The following is an example of a JDBC string connection we used to set up a Snowflake connection :
- <ACCOUNT_NAME> is the full link to the Snowflake instance with the PrivateLink.
- DQ installed on-prem in a private IaaS and DQ is behind a proxy.
- If the Snowflake instance is using a PrivateLink, whitelist the private link URL to bypass the proxy.
- In addition to connectivity to the Snowflake instance, the JDBC driver tries to access Snowflake Blob storage by connecting directly to some S3 buckets managed by Snowflake.
- Those need to be whitelisted as well.
Command line example for basic spark-submit job
-h <IP_ADDRESS>:5432/postgres \
-drivermemory 4g \
-master spark://<SPARK_MASTER>:7077 \
-ds PUBLIC.TRANSLATION \
-deploymode client \
-q "select * from PUBLIC.TRANSLATION" \
-rd "2021-07-24" \
-driver "net.snowflake.client.jdbc.SnowflakeDriver" \
-cxn snowflake
This section contains details for Snowflake Pushdown connections.
General information
Field | Description |
---|---|
Data source | Snowflake |
Supported versions | 3.14.4 |
Connection string | jdbc:snowflake://
|
Packaged? |
Yes |
Certified? |
Yes |
Supported features | |
Estimate job
|
Yes |
Analyze data
|
Yes |
Schedule
|
Yes |
Processing capabilities | |
Pushdown
|
Yes |
Spark agent
|
No |
Yarn agent
|
No |
Parallel JDBC
|
No |
Java Platform version compatibility | |
JDK 8
|
Yes |
JDK 11
|
Yes |
Prerequisites
For first-time configurations, we recommend that you also successfully run the Pushdown setup script. If you do not run the setup script, ensure that you meet all the criteria that the script attempts to accomplish.
Minimum user permissions
In order to bring your Snowflake data into Collibra Data Quality & Observability, you need the following permissions.
- USAGE and MONITOR privileges on all Snowflake database tables, external tables, views, schemas, and streams.
- OPERATE, USAGE, and MONITOR privileges on the Collibra DQ virtual data warehouse.
- When using Archive Break Records, ensure that you have the following:
- USAGE privilege on the database and/or schema.
- INSERT privilege on tables.
- CREATE, READ, UPDATE, and DELETE privileges to prune records.
- ROLE_ADMIN assigned to your user in Collibra DQ.
Recommended and required connection properties
Required | Connection Property | Type | Value |
---|---|---|---|
Yes |
Name | Text | The unique name of your connection. Ensure that there are no spaces in your connection name. |
Yes |
Connection URL | String |
The connection string path of your Snowflake connection. When referring to the example below, replace the Example Important If your connection URL contains special characters and your Kubernetes deployment of Collibra DQ is a version before 2023.08, then you need to set the following Helm value to ensure you do not receive an invalid connection exception: |
Yes |
Driver Name | String |
The driver class name of your Snowflake connection.
|
Yes |
Port | Integer |
The port number to establish a connection to the datasource. The default port is |
No |
Source Name | String | N/A |
No |
Target Agent | Option |
The Agent that submits your Spark job for processing. You do not need to select an option because Pushdown Jobs do not require an Agent to submit them to the data warehouse. |
Yes |
Auth Type | Option |
The method to authenticate your connection. Note The configuration requirements are different depending on the Auth Type you select. See Authentication for more details on available authentication types. |
Yes |
Pushdown | Option |
Switches Pushdown processing on or off. Important This option must be selected for Pushdown processing to work correctly. |
No |
Archive Break Records | Option |
Allows you to write break records directly to a Snowflake database or schema. Important Ensure that you have the necessary privileges to write break records to Snowflake. |
Yes |
Properties | String |
The configurable driver properties for your connection. Multiple properties must be comma delimited. For example, abc=123,test=true Add the following driver property: CLIENT_METADATA_REQUEST_USE_CONNECTION_CTX=TRUE |
For more information about optional JDBC connection parameters, see the official Snowflake documentation.
Authentication
Select an authentication type from the dropdown menu. The options available in the dropdown menu are the currently supported authentication types for this data source.
Required | Field | Description |
---|---|---|
Yes |
Username |
The username of your Snowflake service account. |
Yes |
Password |
The password of your Snowflake service account. |
Yes |
Script |
The file path that contains the script file that the password manager uses to interact with and authenticate a user account. Example /tmp/keytab/snowflake_pwd_mgr.sh |
No |
Param $1 | Optional. An additional parameter to authenticate your Snowflake connection. |
No |
Param $2 | Optional. An additional parameter to authenticate your Snowflake connection. |
No |
Param $3 | Optional. An additional parameter to authenticate your Snowflake connection. |
Tip To use JWT authentication, enter any value as the Username and your token value as the Password.
Configuring key pair authentication via the Connection URL
The table below shows the properties used to configure key pair authentication with a Connection URL in the New JDBC Connection modal.
Property | Description |
---|---|
Connection URL |
jdbc:snowflake://<URL>?db=owluserdb&warehouse=owluserdb&authenticator=snowflake_jwt&private_key_file=/<your-file-path>/keytab/snowflake_key.p8&private_key_file_pwd=<your-password> The file name should include a .p8 extension, as shown in the example above. Note Connection URL parameters must be appended by ampersand ( |
Username | The username for which the private key is created. For example, DQ_User. |
Password | The password field must be left empty. |
Configuring key pair authentication via the Driver Properties
The table below shows the properties used to configure key pair authentication with the Driver Properties in the New JDBC Connection modal.
Property | Description |
---|---|
Username | The username for which the private key is created. For example, DQ_User. |
Password | The password field must be left empty. |
Driver Properties |
authenticator=snowflake_jwt,private_key_file=/<your-file-path>/snowflakeKeys/rsa_key2.p8,private_key_file_pwd=<your-password> The file name should include a .p8 extension, as shown in the example above. Note Driver properties must be comma separated, as shown in the example above. |