Dataplex: Supported transformation details
- Collibra Data Lineage visualizes lineage for Google Dataplex down to column level. To view the technical lineage for Google Dataplex, ensure that you select Objects in the toolbar of your technical lineage graph.
- Collibra Data Lineage ingests lineage from BigQuery and other Google Cloud services only when they contribute lineage for BigQuery assets. Currently, only Column, Table, and File assets are processed and included in the technical lineage.
Collibra Data Lineage does not collect metadata directly from other Google Cloud services. However, if these services generate lineage for BigQuery assets, that lineage is captured by Dataplex and included in the exported lineage file. Collibra Data Lineage then ingests this exported lineage, so any indirect lineage created by these services is reflected in the technical lineage for BigQuery assets.
Note: The column-level lineage generated in Collibra Data Lineage is subject to the limitations of the data lineage feature in Dataplex. For details, go to Limitations in the About data lineage topic of the Dataplex Universal Catalog documentation.
- Technical lineage for Google Dataplex can start from GCS or BigQuery and end in BigQuery.
- You can choose to create table-level lineage or column-level lineage for Google Dataplex when you synchronize the Technical Lineage for Google Dataplex capability.
- Transformations are ingested by calling the GCP Process and subsequently the GCP Jobs. Therefore, the Service Account user that is defined in the Edge connection requires, at a minimum, the bigquery.jobs.get permission, and optionally the bigquery.admin role, which lets the capability ingest the details of all the jobs in the project.
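The access requirement above can be sketched as a small pre-flight check. This is only an illustration, not part of Collibra Data Lineage or any Google client library: the function name `job_access_level`, the returned labels, and the full role ID `roles/bigquery.admin` are assumptions, and the sets of permissions and roles are expected to come from your own IAM review of the Service Account.

```python
# Hypothetical pre-flight check for the Service Account used in the Edge connection.
# Names and return labels are illustrative assumptions, not a documented API.

MIN_PERMISSION = "bigquery.jobs.get"   # minimum requirement named in this topic
ADMIN_ROLE = "roles/bigquery.admin"    # assumed full ID of the optional bigquery.admin role


def job_access_level(permissions, roles):
    """Classify what the capability could ingest with the given grants.

    permissions: iterable of permission strings granted to the Service Account
    roles: iterable of role IDs granted to the Service Account
    """
    if ADMIN_ROLE in set(roles):
        # The admin role lets the capability ingest details of all jobs in the project.
        return "all-jobs"
    if MIN_PERMISSION in set(permissions):
        # The minimum permission is present, but not the optional admin role.
        return "minimum"
    return "insufficient"


if __name__ == "__main__":
    print(job_access_level({"bigquery.jobs.get"}, set()))
```

A result of "insufficient" would indicate that the minimum permission described above is missing and should be granted before synchronizing.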
Differences between technical lineage for Google Dataplex and Google BigQuery
You can create technical lineage for Google BigQuery by using a JDBC connection or for Google Dataplex by using a Google Cloud Platform (GCP) connection. Consider the following differences to determine which data source and connection type to use.
| Feature | Support in technical lineage for Google Dataplex (column-level lineage) | Support in technical lineage for Google Dataplex (table-level lineage) | Support in technical lineage for Google BigQuery |
|---|---|---|---|
| SQL transformation code | Yes | No | Yes |
| Executed SQL in stored procedures | Yes | Yes | No |
| Ingest lineage from... | BigQuery and other Google Cloud services supported by the data lineage feature in Dataplex | BigQuery and other Google Cloud services supported by the data lineage feature in Dataplex | BigQuery |
| Stitching | Yes | No | Yes |
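
To make the comparison easier to apply, the table can be encoded as data and filtered by the features you need. The following is a minimal sketch: the class `LineageOption`, its field names, and the helper `matching_options` are hypothetical names introduced for illustration, and the values are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageOption:
    """One column of the comparison table (field names are illustrative)."""
    name: str
    sql_transformation_code: bool
    executed_sql_in_stored_procedures: bool
    stitching: bool
    ingests_from_other_gcp_services: bool


# Values transcribed from the comparison table in this topic.
OPTIONS = [
    LineageOption("Google Dataplex (column-level lineage)", True, True, True, True),
    LineageOption("Google Dataplex (table-level lineage)", False, True, False, True),
    LineageOption("Google BigQuery (JDBC connection)", True, False, True, False),
]


def matching_options(**required):
    """Return the names of options that support every feature passed as True."""
    needed = [feature for feature, wanted in required.items() if wanted]
    return [
        option.name
        for option in OPTIONS
        if all(getattr(option, feature) for feature in needed)
    ]


if __name__ == "__main__":
    # Example: stitching plus executed SQL in stored procedures.
    print(matching_options(stitching=True, executed_sql_in_stored_procedures=True))
```

For example, requiring both stitching and executed SQL in stored procedures leaves only column-level lineage for Google Dataplex, while requiring only SQL transformation code leaves both the column-level Dataplex option and Google BigQuery.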