Steps overview: Integrate Google BigQuery via Edge
The integration steps vary slightly depending on how you choose to connect to your data source.
Integrate via JDBC connection
| # | Step | Description |
|---|---|---|
| 1 | Review the preflight checks. | Key considerations to help ensure successful integration, including required Edge, technical lineage, and data source-specific permissions, network requirements and more. |
| 2 | Create a JDBC connection to Google BigQuery. | For Collibra Data Lineage to connect to and retrieve metadata from your data source, create a JDBC connection. If you already set up a JDBC connection when you registered the data source via Edge, you can use that existing connection. |
| 3 | Add the Technical Lineage for BigQuery capability for JDBC connections. | Add the technical lineage capability to your Edge or Collibra Cloud site. The capability allows the lineage harvester to retrieve data from your data source. |
| 4 | Synchronize the technical lineage. | You can synchronize your technical lineage manually or automatically by adding a synchronization schedule. |
Integrate via Shared Storage connection
| # | Step | Description |
|---|---|---|
| 1 | Review the preflight checks. | Key considerations to help ensure successful integration, including required Edge, technical lineage, and data source-specific permissions, network requirements and more. |
| 2 | Prepare and store your SQL files in a directory. | You need to provide SQL files that include your SQL queries. Collibra Data Lineage processes the metadata based on your queries to create the technical lineage. |
| 3 | Create a Shared Storage connection. | For best technical lineage results, use the JDBC connection to ingest JDBC sources when possible, instead of the Shared Storage connection with SQL files. The Shared Storage connection is not supported for Collibra Cloud sites. |
| 4 | Add the Technical Lineage for SqlDirectory capability for Shared Storage connections. | Add the technical lineage capability to your Edge or Collibra Cloud site. The capability allows the lineage harvester to retrieve data from your data source. |
| 5 | Synchronize the technical lineage. | You can synchronize your technical lineage manually or automatically by adding a synchronization schedule. |
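
Before you point a Shared Storage connection at a directory, it can help to confirm that the directory actually contains usable SQL files. The following is a minimal sketch of such a local check, not part of Collibra's tooling. The function names, the `.sql` extension filter, and the UTF-8 encoding check are assumptions made for illustration. Check the preflight checks for the actual file requirements.

```python
from pathlib import Path


def find_sql_files(root: str) -> list[Path]:
    """Return all non-empty .sql files under root, sorted by path.

    The .sql extension filter is an assumption for this sketch.
    """
    base = Path(root)
    if not base.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    return sorted(
        p for p in base.rglob("*.sql")
        if p.is_file() and p.stat().st_size > 0
    )


def unreadable_files(paths: list[Path], encoding: str = "utf-8") -> list[Path]:
    """Return the files that cannot be decoded with the given encoding.

    UTF-8 is an assumed default, not a documented requirement.
    """
    bad = []
    for p in paths:
        try:
            p.read_text(encoding=encoding)
        except UnicodeDecodeError:
            bad.append(p)
    return bad
```

For example, `unreadable_files(find_sql_files("/path/to/sql"))` returns the files that would need attention under these assumptions. The path shown is a placeholder.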
Integrate via Cloud Storage connection
| # | Step | Description |
|---|---|---|
| 1 | Review the preflight checks. | Key considerations to help ensure successful integration, including required Edge, technical lineage, and data source-specific permissions, network requirements and more. |
| 2 | Prepare and store your SQL files in your cloud-based storage system. | You need to provide SQL files that include your SQL queries. Collibra Data Lineage processes the metadata based on your queries to create the technical lineage. Your SQL files must be stored in one of the following: |
| 3 | Create a Cloud Storage connection. | For guidance on how to create a connection between your cloud-based storage system and Edge or Collibra Cloud site, go to the appropriate topic: |
| 4 | Add the Technical Lineage for SqlDirectory (Cloud) capability for Cloud Storage connections. | Add the technical lineage capability to your Edge or Collibra Cloud site. The capability allows the lineage harvester to retrieve data from your data source. |
| 5 | Synchronize your technical lineage. | You can synchronize your technical lineage manually or automatically by adding a synchronization schedule. |
After you synchronize the technical lineage, you can view the ingestion report. The report shows the impact of the technical lineage synchronization on the assets in Collibra.
Helpful resources
- Google BigQuery integration preflight checks
- Edge harvester network requirements
- Connect to a Collibra Data Lineage service instance via OAuth authentication
- Connect to a proxy server
- Edit a JDBC connection to Google BigQuery
- Delete a JDBC connection to Google BigQuery
- Supported SQL statements
- Automatic stitching for technical lineage
- Technical lineage admin options
- Sharing database models across data sources
- Delete a technical lineage