Steps overview: Integrate a Google Cloud Storage file system via Edge
You can configure Collibra to register and synchronize a Google Cloud Storage (GCS) file system via Edge.
Tip: If you use schemas with table files that you want to integrate as File Group assets with tables and columns instead of File assets, you can use Google Dataplex. The Dataplex zone in which the GCS buckets are registered must be in the same project as the GCP service account. For information on how to add a GCS asset to a Dataplex zone that can then be discovered by our GCS integration, go to the Google Dataplex documentation.
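The same-project requirement in the tip can be checked before you configure anything. The following is a minimal sketch, not part of Collibra or Google tooling: the function name `same_project` is hypothetical, and it assumes a user-managed service account email of the form `NAME@PROJECT_ID.iam.gserviceaccount.com` and a Dataplex zone resource name of the form `projects/PROJECT/locations/LOCATION/lakes/LAKE/zones/ZONE`. If the zone resource name uses a project number instead of a project ID, the comparison below does not apply.

```python
import re

# Assumed shape of a user-managed service account email:
#   NAME@PROJECT_ID.iam.gserviceaccount.com
_SA_EMAIL = re.compile(
    r"^[^@]+@(?P<project>[a-z][a-z0-9-]*)\.iam\.gserviceaccount\.com$"
)

# Assumed shape of a Dataplex zone resource name:
#   projects/PROJECT/locations/LOCATION/lakes/LAKE/zones/ZONE
_ZONE_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/[^/]+/lakes/[^/]+/zones/[^/]+$"
)


def same_project(service_account_email: str, zone_resource_name: str) -> bool:
    """Return True if both identifiers name the same project ID.

    Hypothetical helper: it only compares strings and makes no API calls.
    """
    sa = _SA_EMAIL.match(service_account_email)
    zone = _ZONE_NAME.match(zone_resource_name)
    if sa is None or zone is None:
        raise ValueError("unrecognized service account email or zone resource name")
    return sa.group("project") == zone.group("project")
```

For example, `same_project("edge-sa@my-project.iam.gserviceaccount.com", "projects/my-project/locations/us-central1/lakes/lake1/zones/raw")` compares `my-project` with `my-project`; all names in this call are placeholders.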
# | Step | Description |
---|---|---|
1 | Define that you want to integrate GCS via Edge. Note: If you have defined an outbound (forward) proxy on your Edge site, the integration will take that configuration into account when connecting to the data source. The following proxies are supported for GCS: | |
2 | Create a GCP connection to your Edge site. | Create a connection to the Google Cloud Platform (GCP) in an Edge site. |
3 | Add a GCS synchronization capability to your Edge site. | Add the GCS synchronization capability to the GCP Edge connection. The capability allows data to be retrieved from the GCS file system. |
4 | Register a GCS file system. | |
5 | Connect the GCS file system asset to the Edge capability. | Link the registered GCS file system to the Edge capability. |
6 | Create crawlers. | Create crawlers to define the folders that you want to synchronize. |
7 | Synchronize GCS. | You can manually synchronize GCS or you can add a synchronization schedule to automatically synchronize it. |
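
Step 6 scopes synchronization to specific folders. In GCS, a folder is effectively a shared object-name prefix ending in `/`, so folder selection can be pictured as prefix matching. The sketch below only illustrates that idea: it is not how Collibra crawlers are implemented, and the function name `objects_in_folders` and the sample object names are hypothetical.

```python
def objects_in_folders(object_names, folder_prefixes):
    """Return the object names that fall under any of the given folders.

    Illustration only: a folder is treated as a name prefix ending in "/".
    """
    # Normalize so "sales", "/sales" and "sales/" all mean the folder "sales/".
    # Empty entries are dropped so they cannot match every object.
    prefixes = tuple(
        p.strip("/") + "/" for p in folder_prefixes if p.strip("/")
    )
    # str.startswith accepts a tuple and matches any of its entries.
    return [name for name in object_names if name.startswith(prefixes)]


names = [
    "sales/2023/q1.csv",
    "sales/2024/q1.csv",
    "hr/people.csv",
    "salesforce/export.csv",
]
selected = objects_in_folders(names, ["sales"])
```

Because the prefix is normalized to `sales/`, the object `salesforce/export.csv` is not selected even though its name starts with `sales`.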