Setting up cross-project access for BigQuery datasets

This page shows you how to make cross-account BigQuery datasets accessible to the GCP IAM service account in your project, and how to configure a BigQuery connection in Collibra DQ to run DQ checks against cross-account BigQuery datasets.

Note 
In Pullup mode, BigQuery connections can only support one additional project.
In Pushdown mode, BigQuery connections can support multiple additional projects. However, we do not recommend exceeding 3 additional projects.

The following diagram shows the architecture of cross-account access for BigQuery datasets. In this topic, Project A refers to the billing account in the panel on the left of this diagram, and Project B refers to the hosting account in the panel on the right.

bigquery dataset cross-project access diagram

Before you begin

  • You have created a BigQuery connection.

Prerequisites

  • You have a GCP IAM service account owned by Project A and its corresponding access JSON file.
  • You have read access on BigQuery datasets hosted in Project B for the GCP service account owned by Project A.
    • You have at least roles/bigquery.dataViewer permissions on the BigQuery datasets.
  • You have read and write access on the temporary dataset hosted by Project A for the GCP IAM service account.
    • You have at least roles/bigquery.jobUser and roles/bigquery.readSessionUser at the project level, and roles/bigquery.dataOwner permissions on the temporary dataset.
  • The datasets of Project B and the temporary dataset of Project A are in the same zone.

Steps

  1. Sign in to Collibra Data Quality & Observability and click the Cogwheel icon in the left navigation pane.

  2. Click Connections.
  3. The Connection Management page opens.
  4. Locate your BigQuery connection, then click Edit icon.
  5. The Edit Connection modal appears.
  6. Append the following properties to the Connection URL:
  7. AdditionalProjects=project-b;QueryProperties=dataset_project_id=project-b

    Note Set the value of project-b according to your specific requirements.

  8. Click the Properties tab.
  9. Add the following Driver Properties to enable materialized view for optimized performance of the connector:
  10. materializationProject=project-a,materializationDataset=temporary_dataset_name

    Note Set the values of project-a and the temporary_dataset_name according to your specific requirements.

  11. Click Submit.
  12. Click explorer icon Explorer and verify that the records in Explorer are the same as in your BigQuery connection.
  13. When you create a new DQ Job, update the -q in the SQL Query editor with the following template query to identify the project:
  14. -q "select * from project-b.dataset_name.table_name"

    Note Set the values of project-b, dataset_name, and table_name according to your specific requirements.

    Warning If you do not update the query with these values, the default project is used and Collibra DQ may throw an error that it cannot find your cross-project dataset in the expected location.