About Data Catalog

Data Catalog aims to create and maintain an inventory of an organization’s data assets across its entire digital landscape, so that data consumers can find and trust those assets more easily and use them to drive informed business decisions.

In the Catalog application in CPSH, you can integrate metadata from multiple data sources: databases, data lakes, warehouses, enterprise applications, ETL tools, and BI solutions. Metadata provides information such as the format of the data, the structure of the data, and when assets were created. For an example, see Metadata synchronization results for JDBC data sources.
Once the metadata is integrated, you can enrich it by adding profiling information, defining the data class, showing sample data, linking the metadata to its business context, showing lineage and data quality, and more.
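
To make "metadata" and "profiling information" more concrete, the following is a minimal sketch of collecting the structure of a table and a simple per-column profile. It is an illustration only and does not show how Catalog itself synchronizes or profiles data. It assumes Python's built-in sqlite3 module as a stand-in for a JDBC data source; the function name harvest_table_metadata, the customers table, and the layout of the returned dictionary are hypothetical.

```python
import sqlite3


def harvest_table_metadata(conn: sqlite3.Connection, table: str) -> dict:
    """Collect structural metadata and a simple profile for one table.

    Hypothetical helper for illustration; not part of any Catalog API.
    """
    # PRAGMA table_info returns one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    columns = [
        {"name": row[1], "type": row[2], "nullable": not row[3]}
        for row in conn.execute(f'PRAGMA table_info("{table}")')
    ]
    row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    # Simple profiling: number of NULL values in each column.
    for col in columns:
        col["null_count"] = conn.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE "{col["name"]}" IS NULL'
        ).fetchone()[0]
    return {"table": table, "row_count": row_count, "columns": columns}


# Build a small in-memory database to stand in for a registered data source.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, email TEXT NOT NULL, country TEXT)"
)
conn.executemany(
    "INSERT INTO customers (email, country) VALUES (?, ?)",
    [("a@example.com", "BE"), ("b@example.com", None)],
)

metadata = harvest_table_metadata(conn, "customers")
for col in metadata["columns"]:
    print(col)
```

Here the column names and types correspond to the structure and format of the data, while the row count and NULL counts are a very small example of profiling information that could enrich an asset.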

About metadata, samples, profiling data, classification, lineage, and more in Catalog asset pages

Catalog asset pages can include detailed information about the data they represent, such as metadata, samples, profiling data, classification, and lineage.

Data Catalog submenu pages


The following table describes each of the submenu items of the Catalog application.

Page | Description
Overview | The landing page for Catalog. This page is designed to help you quickly and easily find Data Catalog-related assets.
Reports | All report assets.
Data Sets | All Data Set assets, shown as a set of tiles or as a table, with their name, description and, if there are any, connections to existing assets in CPSH.
Data Sources | Data sources that are used for data source registrations.
Data Dictionary | All data assets in CPSH.
Technology Assets | All technology assets in CPSH.
Access Requests | The history of your access requests and their status.
Integrations | Allows you to register a data source. This page contains two tabs. The Data Source Registration tab allows you to create a Database or File System asset from which you can start the synchronization of a data source. Use this tab for JDBC, S3, GCS, and ADLS integrations. The Integration Configuration tab allows you to configure all other Metadata, ETL, and BI integrations and start the synchronization. For example, Synchronize Databricks Unity Catalog, or Create a technical lineage via Edge.

Helpful resources

Courses on Collibra University:

  • Navigating catalog assets for data consumers
  • Register a data source: Bring your metadata over from Google Cloud Storage
  • Register a data source: Bring your metadata over from Databricks Unity Catalog
  • Register a data source: Bring your metadata over from Azure Data Lake Storage
  • Register a data source: Bring your metadata over from Snowflake
  • Register a data source: Bring your metadata over from Amazon Redshift