About the Azure Data Lake Storage file system integration

The Azure Data Lake Storage file system integration lets you register Azure Data Lake Storage (ADLS) as a data source in Collibra and synchronize its metadata. ADLS is a Microsoft Azure service built on Azure Blob Storage. After the integration, the files and directories of the ADLS file system are represented in Collibra by specific asset types that retain the original names.
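To make the idea of files and directories becoming named assets more concrete, here is a minimal Python sketch. It is not the integration's implementation and does not call any Collibra or Azure API. The function name `to_assets`, the sample paths, and the "Directory" type label are assumptions made for illustration only; see the ADLS operating model for the asset types the integration actually creates.

```python
from pathlib import PurePosixPath


def to_assets(paths):
    """Map ADLS-style paths to (asset_type, name) pairs, keeping original names.

    Hypothetical illustration: the type labels are assumptions, not the
    integration's actual operating model.
    """
    assets = []
    seen = set()
    for raw in paths:
        path = PurePosixPath(raw)
        # Walk the parent folders from the top down; each one is recorded once.
        for parent in reversed(path.parents):
            if parent.name and parent not in seen:
                seen.add(parent)
                assets.append(("Directory", parent.name))
        # The file itself keeps its original name.
        assets.append(("File", path.name))
    return assets


if __name__ == "__main__":
    for asset_type, name in to_assets(["sales/2024/orders.csv",
                                       "sales/2024/returns.csv"]):
        print(asset_type, name)
```

In this sketch, each directory appears once no matter how many files it contains, and every name is carried over unchanged from the path.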

Important 

You can integrate an Azure Data Lake Storage file system only via Edge.

For detailed information on Microsoft Azure Data Lake Storage, go to the Azure documentation.

About Microsoft Purview

The ADLS integration supports Microsoft Purview, a service used for schema discovery. With Purview, you can integrate the schemas, tables, and columns from the files into a single File asset in Collibra rather than multiple File assets. For more details, go to the ADLS operating model.

Important 
  • Profiling and sampling are not currently supported, even if you use Microsoft Purview to integrate schemas and tables.
  • Currently, the ADLS integration can ingest up to 100,000 assets from Purview.

For detailed information on Microsoft Purview, go to the Purview documentation.

What's Next?

Steps overview: Integrate an Azure Data Lake Storage file system

To learn about the ADLS integration and watch videos, follow our University course.