Add the GCS synchronization capability
After you have created a connection to the Google Cloud Platform (GCP) in your Edge site, you have to add the GCS synchronization capability to the connection.
Before you start
- You have created and installed an Edge site.
- You have created a connection to the Google Cloud Platform (GCP) in your Edge site.
Required permissions
- You have a global role that has the Manage connections and capabilities global permission, for example, Edge integration engineer.
Steps
- Open an Edge site.
  - On the main toolbar, click [icon], and then click Settings.
    The Collibra settings page opens.
  - In the tab pane, click Edge.
    The Sites tab opens and shows a table with an overview of the Edge sites.
  - In the table, click the name of the Edge site whose status is Healthy.
    The Edge site page opens.
- In the Capabilities section, click Add capability.
  The Add capability page appears.
- Select the GCS synchronization capability template.
- Enter the required information.
| Field | Description | Required |
| --- | --- | --- |
| Capability | This section contains general information about the capability. | |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | No |
| Capability template | The capability template. The value that you select in this field determines which sections appear on the page. Select the following Edge capability: GCS synchronization. | Yes |
| GCP service account | This section contains information on how to connect to Google Cloud Storage. | |
| GCP Connection | The GCP connection to be used. | Yes |
| Configuration | This section contains information on the configuration of the crawlers. | |
| Maximum number of files per crawler | The maximum number of files that can be registered per crawler. The default value is 1,000. | Yes |
| Save input metadata | Select the checkbox if you want to save the input metadata extracted from the data source in ZIP files. The files can be useful for troubleshooting. Select this option only on request of Collibra Support. The Collibra Support team can provide the location of the saved ZIP files after the synchronization. This checkbox is not selected by default. | No |
| Integrate Schemas from Dataplex | Select the checkbox if you want to integrate the schemas from Dataplex based on the crawler path that will be specified in the GCS integration configuration. If the checkbox is not selected, no Dataplex data is ingested. This checkbox is selected by default. | No |
| Project IDs | Add a comma-separated list of the Project IDs where Dataplex is enabled. The capability searches these projects for schemas based on the crawler path that will be specified in the GCS integration configuration. If the Project IDs field is empty, the integration searches the project included in the provided GCP Service Account Credentials JSON. | No |
| Advanced Configuration | Logging configuration, Memory, and JVM arguments. These configuration options help when investigating issues with the capability. Important: Only complete the fields Save Input Metadata, Logging configuration, Memory (MiB), and JVM arguments on request of or together with Collibra Support. | No |
| Debug | This setting is not valid for this integration. It should be set to false. An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false. Note: We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set the option to true, it automatically reverts to false after 24 hours. | No |
| Log level | This setting is not valid for this integration. It should be set to No logging. An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging. | No |
- Click Create.
The capability is added to the Edge site.
The fields become read-only.
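Because the fields become read-only after you click Create, it can help to check your values beforehand. The following Python sketch is not a Collibra API. It only mirrors the Project IDs rule from the table: an explicit comma-separated list takes precedence, and an empty field falls back to the project named in the service account credentials JSON. The class `GcsSyncCapabilityForm`, the functions `parse_project_ids` and `projects_to_search`, and the sample values are hypothetical names introduced for illustration. Trimming whitespace and dropping duplicate IDs are assumptions, not documented behavior of the capability.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass
class GcsSyncCapabilityForm:
    """Hypothetical stand-in for the values entered on the Add capability page."""

    name: str                                     # required
    gcp_connection: str                           # required
    description: str = ""                         # optional
    max_files_per_crawler: int = 1000             # required, default 1,000
    save_input_metadata: bool = False             # not selected by default
    integrate_schemas_from_dataplex: bool = True  # selected by default
    project_ids: str = ""                         # optional, comma-separated


def parse_project_ids(raw: str) -> list[str]:
    """Split a comma-separated string into IDs.

    Assumption: surrounding whitespace is ignored and duplicates are dropped.
    """
    result: list[str] = []
    for part in raw.split(","):
        project = part.strip()
        if project and project not in result:
            result.append(project)
    return result


def projects_to_search(form: GcsSyncCapabilityForm, credentials_json: str) -> list[str]:
    """Return the projects that would be searched for Dataplex schemas."""
    if not form.integrate_schemas_from_dataplex:
        return []  # no Dataplex data is ingested
    explicit = parse_project_ids(form.project_ids)
    if explicit:
        return explicit
    # Empty field: fall back to the project in the service account key file.
    return [json.loads(credentials_json)["project_id"]]


# Sample values only; replace them with your own.
credentials = '{"type": "service_account", "project_id": "proj-default"}'
form = GcsSyncCapabilityForm(
    name="gcs-sync",
    gcp_connection="my-gcp-connection",
    project_ids="proj-a, proj-b",
)
print(projects_to_search(form, credentials))  # ['proj-a', 'proj-b']
```

If `project_ids` is left empty in this sketch, the function returns only the `project_id` found in the credentials JSON, which matches the fallback described for the Project IDs field.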
What's next?
You can now register a GCS file system.