Linking data assets to data categories

This example shows you how to create a simple optimized derived relation type (DRT), "Data Asset is categorized by / categorizes Data Category", using the JSON editor. You can use the JSON editor to create complex, optimized relation paths that can fork and join, unlike the visual relation type builder, which supports only linear paths. The example guides you through defining non-linear relation paths using JSON syntax, testing the paths, assigning the derived relation type to an asset type, and viewing the derived relation on an asset page.

Context

In the example Linking columns to databases, you created a single linear relation path. A relation path is linear if it runs from the head to the tail without branching or forking: every intermediate node has exactly one incoming relation type and one outgoing relation type, and only the head can have multiple outgoing relation types and only the tail multiple incoming ones. The relation type builder in the derived relation type editor supports only linear paths. This restriction can lead to significant repetition, especially when long paths differ by only a few relation types. To avoid such repetition and build optimized, non-linear paths, use the JSON editor.

Unlike paths built with the relation type builder, relation paths declared in the JSON editor can be non-linear: any intermediate node can have more than one incoming relation type and more than one outgoing relation type. The resulting relation paths are more condensed, which leads to more efficient database queries.
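
To make the difference concrete, the following Python sketch expands the optimized definition built later in this example into the linear paths it stands for. It is an illustration only, not part of the product: the function name linear_paths is hypothetical, the definition is reduced to the properties needed here, and the sketch assumes the path graph contains no cycles.

```python
# Reduced copy of the definition from this example (only the properties used here).
definition = {
    "headNode": {
        "name": "Data Asset",
        "outgoingEdges": [
            {"direction": "co-role", "type": "ColumnIsPartOfTable", "nextNodeName": "Column"},
            {"direction": "role", "type": "DataSetContainsDataElement", "nextNodeName": "Column"},
        ],
    },
    "intermediateNodes": [
        {
            "name": "Column",
            "outgoingEdges": [
                {
                    "direction": "co-role",
                    "type": "DataCategoryCategorizesDataAsset",
                    "nextNodeName": "Data Category",
                },
            ],
        }
    ],
    "tailNode": {"name": "Data Category"},
}


def linear_paths(defn):
    """Return every head-to-tail path as a list of relation type public IDs."""
    nodes = {node["name"]: node for node in defn["intermediateNodes"]}
    tail_name = defn["tailNode"]["name"]

    def walk(node, prefix):
        for edge in node.get("outgoingEdges", []):
            path = prefix + [edge["type"]]
            if edge["nextNodeName"] == tail_name:
                yield path  # reached the tail: one complete linear path
            else:
                yield from walk(nodes[edge["nextNodeName"]], path)

    return list(walk(defn["headNode"], []))


paths = linear_paths(definition)
for path in paths:
    print(" -> ".join(path))
```

The single Column node is shared by both paths. In the relation type builder, each of the two printed paths would have to be built separately, repeating the step from Column to Data Category.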

The relation paths in most out-of-the-box derived relation types don't appear in the relation type builder because they are optimized. They can be visualized only in the JSON editor.

Objective

  • Show all data categories that categorize a data set on the data set's asset page.
  • Show all data categories that categorize a table on the table's asset page.

Overview of phases

This example includes the following phases:

  1. Create a derived relation type using the JSON editor.
  2. Assign the derived relation type to asset types.
  3. View the derived relation on asset pages.

For clarity, the phase of creating a derived relation type using the JSON editor is split into multiple parts.

Prerequisites

  • Your environment uses the latest user interface.
  • You have a global role with the Product Rights > System administration global permission.
  • The Derived relation support setting in Collibra Console is activated. When the setting is activated, the Add derived relation type button is shown on the Relation types page in the Operating Model settings.
  • The following table lists the out-of-the-box operating model elements used in this example. If these elements were renamed in your environment, use the public IDs to identify them.
    Type | Name | Resource ID | Public ID
    Asset type | Column | 00000000-0000-0000-0000-000000031008 | Column
    Asset type | Data Category | 00000000-0000-0000-0000-000000031109 | DataCategory
    Asset type | Data Set | 00000000-0000-0000-0001-000400000001 | DataSet
    Asset type | Table | 00000000-0000-0000-0000-000000031007 | Table
    Relation type | Column is part of / contains Table | 00000000-0000-0000-0000-000000007042 | ColumnIsPartOfTable
    Relation type | Data Category categorizes / is categorized by Data Asset | c0e00000-0000-0000-0000-000000007315 | DataCategoryCategorizesDataAsset
    Relation type | Data Set contains / is part of Data Element | 00000000-0000-0000-0000-000000007062 | DataSetContainsDataElement
  • Your environment contains data that is compatible with the derived relation type. The example assumes that your environment contains the following test data.
    Community | Domain | Asset | Relation
    DRT test community | DRT test data categories (type Business Dimensions) | DRT test - PII (type Data Category) | N/A
    DRT test community | DRT test data (type Physical Data Dictionary) | DRT column (type Column) | “is categorized by Data Category” → DRT test - PII
    DRT test community | DRT test data (type Physical Data Dictionary) | DRT table (type Table) | N/A
    DRT test community | DRT test data sets (type Data Usage Registry) | DRT test - data set (type Data Set) | “contains Data Element” → DRT column

Depending on your environment, you may need to add the “contains Data Element” relation type to the Data Set asset type layout first.

1 Create a DRT using the JSON editor

The following image shows the relation paths used in the example.

Image of DRT with data asset as head and data category as tail

To connect a data asset to a data category, you can define your derived relation type as "Data Asset is categorized by / categorizes Data Category", where:

  • Head: Data Asset
  • Role: is categorized by
  • Co-role: categorizes
  • Tail: Data Category

1.1 Start the DRT creation process

  1. On the main toolbar, click the Products icon, and then click Settings (cogwheel icon).
    The Settings page opens.
  2. In the Operating model section, click Relation types.
  3. On the Relation types page, click Add derived relation type.
    The derived relation type editor opens.
  4. On the Details tab, enter the following information:
    • Role: is categorized by
    • Co-role: categorizes
    • Description: A derived relation that shows the kind of data stored in a Data Asset using connected Data Categories

1.2 Define the nodes

  1. Click the JSON tab.
    The JSON tab shows a JSON template to get you started.
    Tip: As you continue with the next steps, the editor shows syntax errors. These errors are expected and resolve automatically once the relation path is complete and valid.
  2. In headNode:
    • Set the name property to Data Asset.
    • Set the headType property to DataAsset.
  3. In tailNode:
    • Set the name property to Data Category.
    • Set the tailType property to DataCategory.
  4. In intermediateNodes, change the name property from IntermediateNode1 to Column.

1.3 Connect the nodes

  1. In headNode > outgoingEdges:
    • In line 8, set the type property to ColumnIsPartOfTable.
    • In line 9, change the nextNodeName property from IntermediateNode1 to Column.
  2. Copy lines 6 through 10 to the clipboard, paste them at the beginning of line 11, and move ] to the next line.
    The content on the JSON tab shows 34 lines in total.
  3. Add the comma character at the end of line 10.
    You now have a second element in the outgoingEdges array.
    Image of JSON tab
  4. In headNode > outgoingEdges:
    • In line 12, change the direction property from co-role to role.
    • In line 13, change the type property from ColumnIsPartOfTable to DataSetContainsDataElement.
  5. In intermediateNodes > outgoingEdges:
    • In line 24, set the type property to DataCategoryCategorizesDataAsset.
    • In line 25, change the nextNodeName property from TailNodeName to Data Category.
  6. Click Save relation type.

Content on the JSON tab

{
  "headNode": {
    "name": "Data Asset",
    "headType": "DataAsset",
    "outgoingEdges": [
      {
        "direction": "co-role",
        "type": "ColumnIsPartOfTable",
        "nextNodeName": "Column"
      },
      {
        "direction": "role",
        "type": "DataSetContainsDataElement",
        "nextNodeName": "Column"
      }
    ]
  },
  "intermediateNodes": [
    {
      "name": "Column",
      "outgoingEdges": [
        {
          "direction": "co-role",
          "type": "DataCategoryCategorizesDataAsset",
          "nextNodeName": "Data Category"
        }
      ]
    }
  ],
  "tailNode": {
    "name": "Data Category",
    "tailType": "DataCategory"
  }
}
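
While you edit the JSON, a common source of errors is an edge whose nextNodeName no longer matches any declared node, for example after renaming IntermediateNode1 to Column in only one place. The following Python sketch shows one way to check a definition for this outside the editor. It is an assumption-laden illustration, not the validation that the JSON editor performs: the function name check_definition is hypothetical, and the set of allowed direction values is taken only from the two values that appear in this example.

```python
import copy

# Reduced copy of the definition from this example.
definition = {
    "headNode": {
        "name": "Data Asset",
        "headType": "DataAsset",
        "outgoingEdges": [
            {"direction": "co-role", "type": "ColumnIsPartOfTable", "nextNodeName": "Column"},
            {"direction": "role", "type": "DataSetContainsDataElement", "nextNodeName": "Column"},
        ],
    },
    "intermediateNodes": [
        {
            "name": "Column",
            "outgoingEdges": [
                {
                    "direction": "co-role",
                    "type": "DataCategoryCategorizesDataAsset",
                    "nextNodeName": "Data Category",
                },
            ],
        }
    ],
    "tailNode": {"name": "Data Category", "tailType": "DataCategory"},
}


def check_definition(defn):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    intermediates = {node["name"]: node for node in defn.get("intermediateNodes", [])}
    known_targets = set(intermediates) | {defn["tailNode"]["name"]}

    for node in [defn["headNode"], *intermediates.values()]:
        for edge in node.get("outgoingEdges", []):
            # Assumption: only the two direction values seen in this example are valid.
            if edge["direction"] not in ("role", "co-role"):
                problems.append(f'{node["name"]}: unexpected direction {edge["direction"]!r}')
            if edge["nextNodeName"] not in known_targets:
                problems.append(f'{node["name"]}: unknown nextNodeName {edge["nextNodeName"]!r}')
    return problems


# A half-finished edit: the second head edge still points at the template name.
half_finished = copy.deepcopy(definition)
half_finished["headNode"]["outgoingEdges"][1]["nextNodeName"] = "IntermediateNode1"

print(check_definition(definition))
print(check_definition(half_finished))
```

The check deliberately covers only node references and directions. It does not verify that the relation type public IDs exist in your operating model.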

2 Assign the DRT to asset types

  1. On the main toolbar, click the Products icon, and then click Settings (cogwheel icon).
    The Settings page opens.
  2. In the Operating model section, click Asset types.
  3. On the Asset types page, click the name of the Table asset type.
  4. On the Table asset type page, in the left pane, expand the global assignment and click Characteristics.
  5. On the Characteristics page, click Edit layout.
  6. On the Edit layout page, in the left pane, click Add a Characteristic.
  7. In the Add a Characteristic dialog box, find and select your new derived relation type, Data Asset is categorized by Data Category.
  8. In the is categorized by Data Category dialog box, select the Add directly to layout checkbox and click Add.
    The derived relation is categorized by Data Category is added to the layout.
    Tip: You can change the position of the derived relation in the layout.
  9. Click Publish and close the asset type page.
  10. Follow the same steps for the Data Set asset type.

3 View the derived relation on asset pages

  1. Open the DRT test - data set and DRT table asset pages.
  2. On the Summary tab, verify that the asset pages show the derived relation is categorized by Data Category with the Data Category asset DRT test - PII.
    Image of asset page with derived relation
    Tip: You can identify a derived relation on an asset page by the diagram icon. Clicking the diagram icon shows the relation paths on which the derived relation is based, and clicking an asset name opens the corresponding asset page.
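
To see why both asset pages show DRT test - PII, the following Python sketch walks the test data through the relation paths. It is a conceptual model under stated assumptions, not a description of how the product evaluates derived relations. The function names follow and derive_tails are hypothetical. The sketch assumes that direction "role" means the current asset is the head of the relation and "co-role" means it is the tail, it ignores asset type checks, and it includes a "DRT column is part of DRT table" relation that the expected result for DRT table implies but that the test data table does not list.

```python
definition = {
    "headNode": {
        "name": "Data Asset",
        "outgoingEdges": [
            {"direction": "co-role", "type": "ColumnIsPartOfTable", "nextNodeName": "Column"},
            {"direction": "role", "type": "DataSetContainsDataElement", "nextNodeName": "Column"},
        ],
    },
    "intermediateNodes": [
        {
            "name": "Column",
            "outgoingEdges": [
                {
                    "direction": "co-role",
                    "type": "DataCategoryCategorizesDataAsset",
                    "nextNodeName": "Data Category",
                },
            ],
        }
    ],
    "tailNode": {"name": "Data Category"},
}

# Each relation is (head asset, relation type public ID, tail asset).
relations = [
    ("DRT test - PII", "DataCategoryCategorizesDataAsset", "DRT column"),
    ("DRT test - data set", "DataSetContainsDataElement", "DRT column"),
    # Assumption: not listed in the test data table, but needed for DRT table.
    ("DRT column", "ColumnIsPartOfTable", "DRT table"),
]


def follow(asset, edge, relations):
    """Yield the assets reached from `asset` over one edge of the path."""
    for head, rel_type, tail in relations:
        if rel_type != edge["type"]:
            continue
        if edge["direction"] == "role" and head == asset:
            yield tail
        elif edge["direction"] == "co-role" and tail == asset:
            yield head


def derive_tails(asset, defn, relations):
    """Return the set of tail assets reachable from `asset` along the paths."""
    nodes = {node["name"]: node for node in defn["intermediateNodes"]}
    tail_name = defn["tailNode"]["name"]
    found = set()

    def walk(current, node):
        for edge in node["outgoingEdges"]:
            for reached in follow(current, edge, relations):
                if edge["nextNodeName"] == tail_name:
                    found.add(reached)
                else:
                    walk(reached, nodes[edge["nextNodeName"]])

    walk(asset, defn["headNode"])
    return found


print(derive_tails("DRT test - data set", definition, relations))
print(derive_tails("DRT table", definition, relations))
```

Under these assumptions, both calls reach DRT test - PII through DRT column: the data set through “contains Data Element”, and the table through the assumed column-to-table relation.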