Add the ADLS synchronization capability
After you have created a connection to the Azure Data Lake Storage (ADLS) file system in your Edge site, you have to add the ADLS synchronization capability to the connection.
Before you begin
- You have created and installed an Edge site.
- You have created a connection to ADLS in your Edge site.
Required permissions
You have a global role that has the Manage connections and capabilities global permission, for example, Edge integration engineer.
Steps
- Open an Edge site.
  - On the main toolbar, click the icon that opens the menu, and then click Settings.
    The Collibra settings page opens.
  - In the tab pane, click Edge.
    The Sites tab opens and shows a table with an overview of the Edge sites.
  - In the table, click the name of the Edge site whose status is Healthy.
    The Edge site page opens.
- In the Capabilities section, click Add capability.
  The Add capability page is shown.
- Select the ADLS synchronization capability template.
- Enter the required information.
  | Field | Description | Required |
  |---|---|---|
  | Capability | This section contains general information about the capability. | |
  | Name | The name of the Edge capability. | Yes |
  | Description | The description of the Edge capability. | No |
  | Capability template | The capability template. The value that you select in this field determines which sections appear on the page. Select the following Edge capability: ADLS synchronization. | Yes |
  | ADLS service account | This section contains the information on how to connect to Azure Data Lake Storage. | |
  | Azure Connection | The ADLS connection to be used. | Yes |
  | Synchronization Source | Choose which Microsoft data sources you want to integrate from. Note: This option is available for private beta testing only and should not yet be used outside the private beta tasks. | No |
  | Microsoft Purview Account Name | The name of your Microsoft Purview account. If you enter a Purview account name, the integration uses Microsoft Purview. | No |
  | Save Input Metadata | If you select this option, the metadata extracted from the data source is saved in a file that can be used for troubleshooting. Select this option only on request of Collibra Support. | No |
  | Max Schema Level | For columns that have a structured technical data type, Array or Struct, you can register the structure of the data. This is supported for AVRO, CSV, JSON, ORC, PARQUET, PSV, SSV, TSV, TXT, and XML. In this field, enter the maximum level of the structure you want to see. For example, 3. Note: A high number of levels can have an impact on the integration performance. | No |
  | Advanced Configuration | Contains the Logging configuration, Memory, and JVM arguments fields. These configuration options help when investigating issues with the capability. Important: Only complete the fields Save Input Metadata, Logging configuration, Memory (MiB), and JVM arguments on request of or together with Collibra Support. | No |
  | Debug | This setting is not valid for this integration. It should be set to false. An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false. Note: We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours. | No |
  | Log level | This setting is not valid for this integration. It should be set to No logging. An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging. | No |
- Click Create.
The capability is added to the Edge site.
The fields become read-only.
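Because the fields become read-only after you click Create, it can help to check the planned values against the required fields first. The following is a minimal sketch in Python, not a Collibra API: the field names are copied from the table in the steps, while the function name, the dictionary layout, and the example values (such as `adls-sync`) are hypothetical.

```python
# Required fields as listed in the capability table. This checklist is an
# illustration only; it does not call or represent any Collibra API.
REQUIRED_FIELDS = ("Name", "Capability template", "Azure Connection")


def missing_required_fields(values):
    """Return the required fields that are absent or blank in `values`."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = values.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical planned values for the Add capability page.
planned = {
    "Name": "adls-sync",                            # hypothetical name
    "Capability template": "ADLS synchronization",  # value given in the table
    "Azure Connection": "",                         # not chosen yet
    "Max Schema Level": 3,                          # optional field
}

print(missing_required_fields(planned))
```

With the values above, the check reports that Azure Connection still needs a value.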
What's next?
You can now register the ADLS file system.
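To build intuition for the Max Schema Level field before you register the file system, the following Python sketch shows one way a maximum level can limit how much of a nested Array or Struct column is kept. It is an illustration under stated assumptions, not how the integration is implemented: the schema representation (a dict for a Struct, a one-element list for an Array), the function name `limit_schema`, and the choice to count top-level columns as level 1 are all hypothetical.

```python
def limit_schema(schema, max_level, level=1):
    """Keep nested fields up to max_level; deeper structures collapse to a type name.

    Assumption: top-level columns count as level 1.
    """
    limited = {}
    for name, field_type in schema.items():
        if isinstance(field_type, dict):
            # Struct: descend only while below the maximum level.
            if level < max_level:
                limited[name] = limit_schema(field_type, max_level, level + 1)
            else:
                limited[name] = "struct"
        elif isinstance(field_type, list):
            # Array with a single element type; descend into struct elements only.
            element = field_type[0]
            if isinstance(element, dict):
                if level < max_level:
                    limited[name] = [limit_schema(element, max_level, level + 1)]
                else:
                    limited[name] = "array"
            else:
                limited[name] = list(field_type)
        else:
            # Primitive type: always kept as is.
            limited[name] = field_type
    return limited


# Hypothetical nested schema of a file.
schema = {
    "id": "long",
    "customer": {
        "name": "string",
        "address": {"city": "string", "geo": {"lat": "double", "lon": "double"}},
    },
    "orders": [{"sku": "string", "price": "double"}],
}

print(limit_schema(schema, 2))
```

In this sketch, a maximum level of 2 keeps `customer.name` but collapses `customer.address` to a plain `struct`, while a level of 3 also exposes `city` and collapses only `geo`. Each extra level adds more fields to process, which is consistent with the note that a high number of levels can affect integration performance.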