Reference data lifecycle
Reference data is relatively easy to govern because its content is predictable. Often, the code sets are related to the assets in the Business Glossary application.
The process of managing reference data in Collibra Platform generally involves the following phases.
| Phase | Description |
|---|---|
| 1. Create | Gather all existing reference data content, analyze it, and enter the relevant parts in Collibra Platform as Code Set and Code Value assets. We recommend that you use a specific Codelist domain for each code set. Tip: You can create the assets manually, but it is usually easier to use the import functionality to enter thousands of assets at the same time. To describe the code set completely, you can create relations between the Code Set and Code Value assets, as well as to other relevant assets. The outcome consists of Code Set and Code Value assets, organized in different Codelist domains. The assets can have relations to other assets and still have the Candidate status. |
| 2. Complete | Create responsibilities by assigning users or user groups to roles for the respective Codelist domains. Use the Approval and Simple Approval workflows to update and approve the Code Set and Code Value assets. The outcome consists of Code Set and Code Value assets with the Approved status. |
| 3. Map | Data stewards map code values and define crosswalks between corresponding Code Value assets. A Crosswalk asset may have additional attributes to describe the transformation logic. Often, this transformation logic is hidden or implicit. Crosswalk assets also start with the Candidate status, so they should likewise be reviewed and approved via the Approval and Simple Approval workflows. |
| 4. Publish and trace | After you have created the required assets and added the required relations, you can use diagrams to trace the lineage. The approved code values can also be provided to the business users in different ways. To indicate that the code sets are published, you can create a status, for example, Published. |
| 5. Use and maintain | Finally, the business users use the published code sets in their own applications, for example, in reporting software. Typically, the code sets contain inconsistencies or gaps. Users can report these issues, which starts a workflow to fix them. |
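
To make the phases more concrete, the following minimal sketch walks two small code sets through the lifecycle in plain Python. It is an illustration only and does not use any Collibra API: the `CodeSet` and `Crosswalk` classes, the `approve` helper, and the sample country codes are all assumptions introduced for this example. In Collibra Platform, the status changes would be the result of the Approval workflows rather than a direct assignment.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory model for illustration; not a Collibra API.
CANDIDATE, APPROVED, PUBLISHED = "Candidate", "Approved", "Published"


@dataclass
class CodeSet:
    name: str
    values: dict = field(default_factory=dict)  # code -> label
    status: str = CANDIDATE


@dataclass
class Crosswalk:
    source: CodeSet
    target: CodeSet
    mapping: dict = field(default_factory=dict)  # source code -> target code
    status: str = CANDIDATE

    def add_mapping(self, source_code: str, target_code: str) -> None:
        # Reject mappings that point at codes missing from either code set.
        if source_code not in self.source.values:
            raise KeyError(f"{source_code!r} is not in {self.source.name}")
        if target_code not in self.target.values:
            raise KeyError(f"{target_code!r} is not in {self.target.name}")
        self.mapping[source_code] = target_code

    def unmapped(self) -> list:
        # Source codes without a target: the kind of gap reported in phase 5.
        return sorted(set(self.source.values) - set(self.mapping))


def approve(asset) -> None:
    # Stands in for the outcome of an approval workflow (phases 2 and 3).
    asset.status = APPROVED


# Phase 1: create code sets (sample data: ISO 3166-1 country codes).
alpha2 = CodeSet("Country alpha-2", {"BE": "Belgium", "FR": "France", "US": "United States"})
alpha3 = CodeSet("Country alpha-3", {"BEL": "Belgium", "FRA": "France", "USA": "United States"})

# Phase 2: approve the code sets.
approve(alpha2)
approve(alpha3)

# Phase 3: map code values through a crosswalk, then approve it.
crosswalk = Crosswalk(alpha2, alpha3)
crosswalk.add_mapping("BE", "BEL")
crosswalk.add_mapping("US", "USA")
approve(crosswalk)

# Phase 4: mark the code sets as published.
alpha2.status = alpha3.status = PUBLISHED

# Phase 5: consumers translate codes; unmapped codes surface as issues.
print(crosswalk.mapping["US"])   # USA
print(crosswalk.unmapped())      # ['FR']
```

The crosswalk is deliberately left incomplete: `unmapped()` returns the source codes that have no target yet, which is the kind of gap a business user would report in the last phase.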