Data product basics
About data products
A data product is a reusable package that provides data to answer a business question or solve a specific business problem. It includes everything you need to understand, access, and use the data. This makes it actionable and ready to support business decisions. A data product is secure, easy to use, and designed for anyone, including those who are not domain experts.
A data product includes not only the data elements but also context about the data and how to access it. It consists of 3 main components: Context, Data, and Access.
Context | The context includes background information, such as why the data product was created, who owns it, and details related to quality and privacy. |
Data | The data can refer to a table, view, or a business asset such as a report or a model. |
Access | The access information includes details on how to access the data and the policies that govern access. |
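The three components can be pictured as a small record. The following is a minimal sketch only, not a Collibra API: the class names, field names, and the `missing_components` helper are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Context:
    """Background information about the data product (hypothetical fields)."""
    purpose: str              # why the data product was created
    owner: str                # who owns it
    quality_notes: str = ""   # details related to quality
    privacy_notes: str = ""   # details related to privacy


@dataclass
class Access:
    """How to reach the data and which policies govern access (hypothetical fields)."""
    instructions: str
    policies: List[str] = field(default_factory=list)


@dataclass
class DataProductSummary:
    """Illustrative container for the Context, Data, and Access components."""
    name: str
    context: Optional[Context] = None
    data: List[str] = field(default_factory=list)  # names of tables, views, reports, or models
    access: Optional[Access] = None

    def missing_components(self) -> List[str]:
        """Return the names of the components that have not been filled in yet."""
        missing = []
        if self.context is None:
            missing.append("Context")
        if not self.data:
            missing.append("Data")
        if self.access is None:
            missing.append("Access")
        return missing


draft = DataProductSummary(name="Customer 360", data=["customer_view"])
print(draft.missing_components())
```

A check like this can help confirm that a draft data product carries all three components before it is offered to consumers.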
Data product asset types and operating model
Data products use specific operating model asset types, relations, and attributes.
Tip Using the specific data product operating model allows you to scale as Collibra data product features grow over time.
- Asset types
- Data Product
A data product is a reusable package that provides data to answer a business question or solve a specific business problem.
- Data Product Port
A data product port defines the interface through which a data product interacts with the broader ecosystem. A link between a Data Product Port and a Data Product via the "exposes data as" relation can be considered an "output port" and a link with the "consumes data through" relation can be considered an "input port".
In advanced governance use cases, you can use the child asset types Data Product Output Port and Data Product Input Port, and their relations.
- Data Contract
A data contract defines the commitments a data product owner makes to data consumers. The data contract documents the structure, format, service level, quality, and terms of use.
Example The contract can outline service-level objectives (SLOs) related to system uptime and latency for a data product. It can also include details about pipelines or data delivery mechanisms and provide information about the data, such as schema and expected quality metrics.
- Asset type groups
- Data Product Output: Groups the physical data for a data product.
- Data Product Input: The physical input for a data product.
Note Use this asset type group only in advanced governance use cases that involve the Data Product Input Port asset type.
- Relations
- Data Product exposes data as/is output port for Data Product Port
- Data Product Port is input port for/consumes data through Data Product
- Data Product Port is implemented as/implements Data Product Output
Note When you create a diagram, you can add this relation type from the Data Product Port asset type to each asset type assigned to the Data Product Output asset type group.
- Data Product Input implements/is implemented as Data Product Input Port
Note Although this relation is still available, we recommend using Data Product Port is implemented as/implements Data Product Output instead.
- Data Contract governs functioning of / should operate according to Data Product Port
- Data Contract information to be provided / is mentioned in the terms of Data Attribute
- System implements/is implemented in Data Product Port
- Data Product relates to/is related to Measure
- Data Product relates to/is related to Business Term
- Data Product relates to/is related to Data Domain
- Data Product is explained in/explains Data Notebook
- Data attributes
Some specific attributes have been defined, such as:
- Data product category: Defines whether a data product is targeting business users or more technical users. The out-of-the-box values are Derived (for business users) and Foundational (for more technical users).
This attribute is available for Data Product.
- Target delivery date: Indicates the date when the requested resource should be made available.
This attribute is available for Data Product and Data Product Port.
- Access method: Indicates the method that can be used to access the data.
This attribute is available for Data Product Port.
Tip This attribute can be used in workflows to create output port assets per access method.
- Access instructions: Provides instructions on how to access the data.
This attribute is available for Data Product Port.
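To make the operating model above easier to reason about, the relations and attributes can be written down as plain lookup data. This is a minimal sketch and not a Collibra API: `RelationType`, `phrases_from`, `port_direction`, and `attributes_for` are hypothetical names, only a few of the listed relations are included, and the sketch assumes each relation in the list reads as "head role/co-role tail".

```python
from typing import Dict, List, NamedTuple, Set


class RelationType(NamedTuple):
    head: str      # asset type on the left side of the relation
    role: str      # phrase read from head to tail
    co_role: str   # phrase read from tail to head
    tail: str      # asset type on the right side of the relation


# A subset of the relations listed above, for illustration only.
RELATION_TYPES = [
    RelationType("Data Product", "exposes data as", "is output port for", "Data Product Port"),
    RelationType("Data Product Port", "is input port for", "consumes data through", "Data Product"),
    RelationType("Data Product Port", "is implemented as", "implements", "Data Product Output"),
    RelationType("Data Contract", "governs functioning of", "should operate according to", "Data Product Port"),
]

# Attribute availability per asset type, as stated in the attribute list.
ATTRIBUTE_ASSET_TYPES: Dict[str, Set[str]] = {
    "Data product category": {"Data Product"},
    "Target delivery date": {"Data Product", "Data Product Port"},
    "Access method": {"Data Product Port"},
    "Access instructions": {"Data Product Port"},
}


def phrases_from(source: str, target: str) -> List[str]:
    """Return every phrase that reads from the source asset type to the target asset type."""
    phrases = []
    for rt in RELATION_TYPES:
        if rt.head == source and rt.tail == target:
            phrases.append(rt.role)
        if rt.tail == source and rt.head == target:
            phrases.append(rt.co_role)
    return phrases


def port_direction(phrase_from_product: str) -> str:
    """Classify a port by the phrase that links the Data Product to the port."""
    if phrase_from_product == "exposes data as":
        return "output"
    if phrase_from_product == "consumes data through":
        return "input"
    raise ValueError(f"not a Data Product to Data Product Port phrase: {phrase_from_product!r}")


def attributes_for(asset_type: str) -> List[str]:
    """Return the attributes available for an asset type, sorted by name."""
    return sorted(name for name, types in ATTRIBUTE_ASSET_TYPES.items() if asset_type in types)


print(phrases_from("Data Product", "Data Product Port"))
print(port_direction("consumes data through"))
print(attributes_for("Data Product Port"))
```

Keeping the role and co-role together in one record mirrors how each relation is listed with both reading directions, so the same entry answers the question from either side.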
Data product lifecycle and workflows
The detailed lifecycle of a data product can look like this:
- Data products can be created beforehand or requested by consumers. (Request)
- The data product is built and published in the following steps:
  - The requirements are defined. (Discover)
  - The port details are refined. (Discover)
  - The sources are prepared. (Build)
  - The data product is published and made available as curated data. (Publish)
- The requestor or any data consumer can find and use the data product. (Access)
- The data product is monitored and updated when needed. (Monitor)
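The phase labels in parentheses above can be read as an ordered sequence. The sketch below is illustrative only and not part of any Collibra workflow: the `Phase` enum and `next_phase` helper are hypothetical, and the strictly linear progression is a simplifying assumption, since monitoring can lead to further updates.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Lifecycle phases, in the order they appear in the list above."""
    REQUEST = "Request"
    DISCOVER = "Discover"
    BUILD = "Build"
    PUBLISH = "Publish"
    ACCESS = "Access"
    MONITOR = "Monitor"


# Enum members iterate in definition order, which matches the lifecycle order.
_ORDER = list(Phase)


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that follows the current one, or None after the last phase."""
    index = _ORDER.index(current)
    if index + 1 < len(_ORDER):
        return _ORDER[index + 1]
    return None


phase: Optional[Phase] = Phase.REQUEST
while phase is not None:
    print(phase.value)
    phase = next_phase(phase)
```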
Workflows can support requesting a data product, creating it, and requesting access to it.
Tip You can find some data product-specific workflows in Collibra Marketplace.
These workflows are published as community offerings and include dedicated documentation.
The following example is based on the community workflows: