About the Azure Data Lake Storage file system integration

The Azure Data Lake Storage file system integration lets you register Azure Data Lake Storage (ADLS) as a data source in Collibra and synchronize its metadata. ADLS is a storage service built on Microsoft Azure Blob Storage. After the integration, the files and directories of the ADLS file system are represented in Collibra by specific asset types and keep their original names.

Important 
  • The ADLS integration supports Azure Data Lake Storage Gen2.
    Azure Data Lake Storage Gen1 is not supported. To verify which ADLS generation you are using, check the Account Kind in the Overview section of your Azure storage account details. StorageV2 indicates that you are using Gen2.

  • You can integrate an Azure Data Lake Storage file system only via Edge.
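
The Account Kind check described above can also be scripted. The following is a minimal sketch, assuming you have exported the storage account properties as JSON (for example, with the Azure CLI command az storage account show). The function name account_kind_indicates_gen2 and the sample account are hypothetical and serve only as an illustration.

```python
import json


def account_kind_indicates_gen2(account_json: str) -> bool:
    """Return True when the account's kind is StorageV2.

    This mirrors the manual check on this page: StorageV2 in the
    Account Kind field indicates Gen2.
    """
    properties = json.loads(account_json)
    return properties.get("kind") == "StorageV2"


# Hypothetical sample, trimmed to the one field this check reads.
sample = '{"name": "examplelake", "kind": "StorageV2"}'
print(account_kind_indicates_gen2(sample))
```

The sketch reads only the kind property, so it is no more authoritative than the manual check in the Azure portal.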

For detailed information on Microsoft Azure Data Lake Storage Gen2, go to the Azure documentation.

About ADLS integration with Microsoft Purview

The ADLS integration supports Microsoft Purview, a service for schema discovery. With Microsoft Purview, you can integrate tables and columns from files. If you integrate only ADLS, the synchronization creates multiple File assets. With Microsoft Purview, a single File asset can represent multiple files that follow a specific naming pattern, and that File asset has a schema (Table and Columns). To use the ADLS integration with Microsoft Purview, select ADLS with Purview in the Synchronization Source field when you configure the ADLS synchronization.
For information on the asset types, go to the ADLS operating model.
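
To make the idea of files that follow a specific naming pattern more concrete, here is a simplified sketch. It is not how Microsoft Purview detects patterns. It assumes, purely for illustration, that files belong together when their paths differ only in runs of digits. The function names naming_pattern and group_by_pattern, the {N} placeholder, and the sample paths are all hypothetical.

```python
import re
from collections import defaultdict


def naming_pattern(path: str) -> str:
    """Replace every run of digits with a placeholder so that paths
    differing only in numbers (dates, sequence IDs) share one pattern."""
    return re.sub(r"\d+", "{N}", path)


def group_by_pattern(paths):
    """Group paths by their naming pattern (illustrative assumption only)."""
    groups = defaultdict(list)
    for path in paths:
        groups[naming_pattern(path)].append(path)
    return dict(groups)


# Hypothetical file paths in an ADLS file system.
files = [
    "sales/2024/01/orders_001.csv",
    "sales/2024/02/orders_002.csv",
    "sales/readme.txt",
]
for pattern, members in group_by_pattern(files).items():
    print(pattern, len(members))
```

In this sketch, the two orders files fall under one pattern, in the same way that one File asset can stand for several files, while readme.txt remains on its own.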

Important 
  • Even if you use Microsoft Purview to integrate tables and columns, we don't currently support profiling and sampling.
  • Currently, the ADLS integration can ingest up to 100,000 assets from Purview.

For detailed information on Microsoft Purview, go to the Purview documentation.

What's next?

Steps overview: Integrate an Azure Data Lake Storage file system

Learn more

To learn about the ADLS integration and watch videos, follow our University course.