Connecting to NetApp

This section contains details for NetApp connections.

General information

NetApp is a data storage platform that Collibra Data Quality & Observability recognizes as a Remote File Connection, allowing you to access data stored in Amazon S3 buckets.

Field Description
Data source NetApp
Supported versions N/A
Connection string https://
Packaged? Yes
Certified? Yes
Supported features
Analyze data Yes
Archive breaking records Yes
Estimate job Yes
Pushdown No
Processing capabilities
Spark agent Yes
Yarn agent Yes

Minimum user permissions

For Collibra DQ to access your Amazon S3 bucket, you need the following permissions:

  • ROLE_ADMIN in Collibra DQ.
  • Read access on the Amazon S3 bucket where your data is stored.
  • Read and write access on the Amazon S3 bucket where breaking records are archived from Collibra DQ. This is only necessary if you use the archive breaking records feature.

Recommended and required connection properties

Required Connection Property Type Value

Yes Name Text The unique name of your connection. Ensure that there are no spaces in your connection name.

No Path Style Access Option Allows access to object stores along HTTPS paths.

Yes Connection URL String The connection string path of your NetApp connection.
Example: https://<domain>.netapp.com

Yes Region Option The AWS region in which the Amazon S3 bucket resides. The default region is US_EAST_1.

Yes Bucket Name String The exact name of the Amazon S3 bucket along the URI path you are attempting to access.
Example: example_bucket

Yes Authentication Type Option The method to authenticate your connection.
Note: The configuration requirements differ depending on the Auth Type you select. See Authentication for more details on available authentication types.

Yes Save Credentials Option Select this option after you enter your connection details. We recommend selecting this option to allow credentials to be shared with users across the Collibra DQ platform.

No Target Agent Option The Agent used to submit your DQ Job.

No Archive Breaking Records Option Select this option to archive breaking records to the path you specify in Archive Location.

No Archive Location Option The path to store archived breaking records when you select the Archive Breaking Records option.
Example: /folder/path/

Yes Driver Properties String The configurable driver properties for your connection. Multiple properties must be comma delimited. For example, abc=123,test=true

To connect to a NetApp endpoint in URI format, select the HTTPS option, specify a Bucket Name, and then add the following property to the Properties tab:

s3-endpoint=netapp
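
The comma-delimited key=value format used by Driver Properties can be sketched in Python as follows. This is a minimal illustration only: the function names are hypothetical, and because this page does not describe how to escape commas inside a value, the sketch simply rejects them.

```python
from typing import Dict


def format_driver_properties(props: Dict[str, str]) -> str:
    """Join properties into a single string such as 'abc=123,test=true'."""
    for key, value in props.items():
        # Assumption: commas and '=' cannot be escaped, so reject ambiguous input.
        if not key or "," in key or "=" in key or "," in value:
            raise ValueError(f"unsupported property: {key!r}={value!r}")
    return ",".join(f"{key}={value}" for key, value in props.items())


def parse_driver_properties(text: str) -> Dict[str, str]:
    """Split a comma-delimited property string back into a dictionary."""
    props: Dict[str, str] = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma or empty input
        key, sep, value = item.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {item!r}")
        props[key.strip()] = value.strip()
    return props


if __name__ == "__main__":
    print(format_driver_properties({"s3-endpoint": "netapp"}))
    print(parse_driver_properties("abc=123,test=true"))
```

Checking the string this way before pasting it into the Driver Properties field can catch a missing `=` or a stray comma early.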

Authentication

Select an authentication type from the dropdown menu. The options available in the dropdown menu are the currently supported authentication types for this data source.

Field Description
Key AWS security credentials that use an access key ID and secret access key combination to access S3 buckets.
Key
The access key ID for your Amazon S3 storage account.
Secret
The secret access key for your Amazon S3 storage account.
Instance Profile

The instance profile that grants the EC2 instance access to your S3 bucket.

Optionally select Assume Role and add an accessRole when you use Instance Profile.

Username
The username credential required for the IdP service to authenticate a user.
Password
The password credential required for the IdP service to authenticate a user.
Assume Role

This is optional for Instance Profile and Key authentication.

Select this option to generate a set of temporary security credentials that you can use to access AWS resources.

To assume an S3 IAM Role on EKS-based deployments of Collibra DQ, select Assume Role, then optionally enter the information in the Role to Assume field.

Role to Assume

The IAM role associated with your EC2 instance. This is optional when assuming an S3 IAM Role on EKS-based deployments of Collibra DQ.

Connect to NetApp in URI format

To connect to a NetApp endpoint in URI format, select the HTTPS option, specify a Bucket Name, and then add the following property to the Properties tab:

Connection Property
NetApp s3-endpoint=netapp
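
To make the relationship between the Connection URL and the Bucket Name concrete, the following Python sketch composes a path-style object URL, in which the bucket appears in the URI path rather than in the host name. It illustrates general S3 path-style addressing and is not a description of how Collibra DQ builds its requests; the function name, the host, and the object key are hypothetical placeholders.

```python
from urllib.parse import quote, urlsplit


def path_style_url(connection_url: str, bucket: str, key: str = "") -> str:
    """Return https://<host>/<bucket>/<key> for an S3-compatible endpoint."""
    parts = urlsplit(connection_url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("connection URL must look like https://<host>")
    if not bucket or " " in bucket or "/" in bucket:
        raise ValueError("bucket name must be a single path segment")
    url = f"{parts.scheme}://{parts.netloc}/{bucket}"
    if key:
        # Percent-encode the object key but keep '/' as the folder separator.
        url += "/" + quote(key.lstrip("/"), safe="/")
    return url


if __name__ == "__main__":
    # Hypothetical endpoint and object key, for illustration only.
    print(path_style_url("https://storage.example.com", "example_bucket",
                         "folder/path/data.csv"))
```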