Best practices for derived relations
This topic explains best practices for building derived relation types that are efficient, meaningful, and aligned with your use cases. It also includes examples that show how to create and optimize derived relation types for both simple and complex scenarios.
Core guidelines
When you create a derived relation type, you define how two asset types of interest are related. This is done by specifying a series of interconnected relations called a relation path. There are multiple ways to relate two asset types through relations. Follow these guidelines when choosing relations for a relation path.
| No. | Guideline | Description |
|---|---|---|
| 1 | Use the most meaningful relation paths. | Ask yourself: Which paths are truly relevant to my objective? Not all paths add value. Roles and co-roles give meaning to the relations and help you assess their relevance to your use case. The shortest path between two assets isn't always the most useful. Sometimes, it may even be irrelevant and introduce unnecessary noise. |
| 2 | Add only used relations. | Ask yourself: Are all the relation paths that I found actually used in my environment? Even if you find multiple meaningful relation paths to relate two asset types, some paths may not be used in your environment. If possible, avoid registering unused paths. This helps reduce the size of the derived relation type, resulting in smaller and faster queries. |
| 3 | Define the smallest possible derived relation type to achieve your goal. | Ask yourself: Is my path too large or too long to perform efficiently? A derived relation type with many relations or deeply nested paths runs slower than one with shorter, simpler paths. While system limits help prevent excessively slow queries, you may still need to simplify the paths based on your performance requirements. |
| Tip | Reuse derived relation types in other derived relation types. | You can include a derived relation type within another derived relation type. This allows you to define your most frequent paths, such as natural business or technical hierarchies, in derived relation types that you can reuse in others. Reusing derived relation types makes creating and maintaining relation paths faster and safer. |
Examples
To better understand the concept of derived relations, consider the following examples. The first example highlights basic concepts, while the second example shows how to approach creating a derived relation type in a more complex scenario.
Identifying the database a column belongs to
The following derived relation type helps you identify the database a column belongs to by navigating several explicit relations from the column to the table, to the schema, and finally to the database.
Derived relation type definition
Like an explicit relation type, a derived relation type has a head, tail, role, and co-role.
- Head: Column
- Role: is part of
- Co-role: contains
- Tail: Database
Relation path
The derived relation type definition follows specific relations to form a relation path between the head and the tail.
- "Column is part of Table"
- "Table is part of Schema"
- "Schema belongs to Technology Asset" (narrowed to Database, which is a descendant of Technology Asset)
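The chaining rule behind this path can be sketched in Python. This is an illustrative model only, not the product's API or JSON format: the `Hop` class, the `PARENT` mapping, and the `is_valid_path` helper are hypothetical names introduced here. The sketch assumes that each relation must start at the asset type reached by the previous one, and that the last relation may be narrowed to a descendant of its tail.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative parent links only (assumption); a real operating model has many more.
PARENT = {"Database": "Technology Asset"}


def is_same_or_descendant(asset_type: str, ancestor: str) -> bool:
    """Return True if asset_type equals ancestor or descends from it."""
    current: Optional[str] = asset_type
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


@dataclass(frozen=True)
class Hop:
    """One relation in a relation path (hypothetical model)."""
    head: str
    role: str
    tail: str
    narrowed_to: Optional[str] = None  # optional descendant of tail

    @property
    def end_type(self) -> str:
        return self.narrowed_to or self.tail


def is_valid_path(path: List[Hop], head: str, tail: str) -> bool:
    """Check that the path starts at head, ends at tail, and every hop connects."""
    if not path:
        return False
    current = head
    for hop in path:
        # The asset type reached so far must be able to use this relation.
        if not is_same_or_descendant(current, hop.head):
            return False
        # Narrowing is only allowed to a descendant of the relation's tail.
        if hop.narrowed_to and not is_same_or_descendant(hop.narrowed_to, hop.tail):
            return False
        current = hop.end_type
    return current == tail


COLUMN_TO_DATABASE = [
    Hop("Column", "is part of", "Table"),
    Hop("Table", "is part of", "Schema"),
    Hop("Schema", "belongs to", "Technology Asset", narrowed_to="Database"),
]

print(is_valid_path(COLUMN_TO_DATABASE, head="Column", tail="Database"))
```

In this model, dropping the narrowing on the last hop would end the path at Technology Asset rather than Database, so the check would fail.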
Implementation
To use the derived relation type, add it to the Column asset type assignment and the asset page layout. This shows the parent Database asset as a derived relation on a Column asset page. If your asset type assignment has a dynamic or out-of-the-box asset layout, you need to add the derived relation type only to the assignment (not to the layout).
Knowing if an AI use case can leak sensitive information
In the fictional Acme organization, you want to identify which data categories are related to an AI use case. Your objective is to ensure that sensitive data represented by the PII and PHI data categories isn't exposed through training and inference data. This example is more complex than the previous one because there are multiple ways to relate a data category to an AI use case. To address this, you need to analyze your operating model and determine how to relate an AI use case to a data category.
Examine the starting points
Your focus is on training and inference data. To begin, examine how an AI use case is related to training and inference data. The following two relations can serve as starting points for two different relation paths:
- "AI Use Case infers from Asset"
- "AI Use Case trained by Asset"
Notice that an AI use case can consume inference and training data through an AI model using the following relations:
- "AI Use Case uses AI Model"
- "AI Model infers from Asset"
- "AI Model trained by Asset"
This makes two additional relation paths available for the derived relation type. However, Acme doesn’t govern AI Model assets yet. Therefore, following guideline #2 (Add only used relations), you won’t add these relation paths yet. You can always include them later if Acme starts governing AI Model assets.
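The decision to skip the AI Model paths can be expressed as a simple filter. The following Python sketch is only an illustration of guideline #2: the path labels, the `UNGOVERNED_TYPES` set, and the `is_in_use` helper are hypothetical, and the relations are the ones named above.

```python
from typing import Dict, List, Set, Tuple

Relation = Tuple[str, str, str]  # (head, role, tail)

# Candidate starting paths from the AI use case; the labels are hypothetical.
CANDIDATE_PATHS: Dict[str, List[Relation]] = {
    "direct inference": [("AI Use Case", "infers from", "Asset")],
    "direct training": [("AI Use Case", "trained by", "Asset")],
    "inference via model": [
        ("AI Use Case", "uses", "AI Model"),
        ("AI Model", "infers from", "Asset"),
    ],
    "training via model": [
        ("AI Use Case", "uses", "AI Model"),
        ("AI Model", "trained by", "Asset"),
    ],
}

# Asset types that Acme does not govern yet (from the scenario above).
UNGOVERNED_TYPES: Set[str] = {"AI Model"}


def is_in_use(path: List[Relation], ungoverned: Set[str]) -> bool:
    """A path is considered in use only if it touches no ungoverned asset type."""
    return all(
        head not in ungoverned and tail not in ungoverned
        for head, _role, tail in path
    )


paths_to_register = {
    name: path
    for name, path in CANDIDATE_PATHS.items()
    if is_in_use(path, UNGOVERNED_TYPES)
}

print(sorted(paths_to_register))  # ['direct inference', 'direct training']
```

Emptying `UNGOVERNED_TYPES` later would bring the two AI Model paths back into scope without changing anything else.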
Navigate to the data
The next step is to navigate from the Asset asset type to the Data Category asset type. Different asset types have different relations assigned. Even if the current paths both lead to the generic Asset asset type, you need to identify the specific asset types that are used in your organization to determine the next possible relations in the paths.
Acme uses data sets and tables as training data and only tables as inference data. Although other asset types can be included, you should limit the derived relation type to only the asset types that are currently used in your organization.
A table can reach a data category through a column using the following relation path:
- "Table contains Column"
- "Data Asset is categorized by Data Category" (Column is a descendant of Data Asset)
Similarly, a data set can also reach a data category through a column using the following relation path:
- "Data Set contains Data Element" (Column is a descendant of Data Element)
- "Data Asset is categorized by Data Category" (Column is a descendant of Data Asset)
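To make the navigation concrete, the following Python sketch follows one such path over a tiny in-memory graph and checks whether it reaches a sensitive data category. Everything in it is an assumption made for illustration: the asset names ("Churn Prediction", "CUSTOMERS", "EMAIL", "SIGNUP_DATE") are invented sample data, and the `follow` helper is hypothetical rather than part of any product API.

```python
from collections import defaultdict
from typing import List, Set

# Hypothetical instance data: (source asset, role, target asset).
EDGES = [
    ("Churn Prediction", "trained by", "CUSTOMERS"),
    ("CUSTOMERS", "contains", "EMAIL"),
    ("CUSTOMERS", "contains", "SIGNUP_DATE"),
    ("EMAIL", "is categorized by", "PII"),
]

SENSITIVE_CATEGORIES = {"PII", "PHI"}


def follow(start: str, roles: List[str]) -> Set[str]:
    """Follow the given roles in order from start and return the assets reached."""
    index = defaultdict(set)
    for source, role, target in EDGES:
        index[(source, role)].add(target)
    frontier = {start}
    for role in roles:
        # Move every asset in the frontier one relation further along the path.
        frontier = {target for asset in frontier for target in index[(asset, role)]}
    return frontier


reached = follow("Churn Prediction", ["trained by", "contains", "is categorized by"])
print(sorted(reached & SENSITIVE_CATEGORIES))  # ['PII']
```

The SIGNUP_DATE column has no data category in this sample, so it drops out at the last step and only PII is reported.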
Resulting relation paths
Based on the analysis of Acme’s operating model, the following relation paths are required to connect an AI use case to a data category.
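| For | Relation paths |
|---|---|
| Training data | Path 1: "AI Use Case trained by Asset" (narrowed to Table) → "Table contains Column" → "Data Asset is categorized by Data Category" (Column is a descendant of Data Asset). Path 2: "AI Use Case trained by Asset" (narrowed to Data Set) → "Data Set contains Data Element" (narrowed to Column) → "Data Asset is categorized by Data Category" (Column is a descendant of Data Asset). |
| Inference data | Path 3: "AI Use Case infers from Asset" (narrowed to Table) → "Table contains Column" → "Data Asset is categorized by Data Category" (Column is a descendant of Data Asset). |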
Visualizing the relation path logic
The following diagram represents how the relation paths navigate the knowledge graph from the AI use case (head) to the data category (tail).
Simplifying the representation
Notice that the same asset types (Asset and Column) are used in all three relation paths. You can simplify the representation by joining these paths at the Column node.
Optimization strategy
Both the linear and joined representations are useful, but they suit different administrative needs.
- Graphical builder: The graphical relation type builder accepts only linear paths, which are paths that go directly from the head to the tail without branching or joining. Therefore, you need to add the three paths listed above to the derived relation type. The graphical relation type builder automatically suggests all possible relations that can follow from the current asset type. This guides you in building the correct relation paths without requiring you to manually look up or define the next steps.
- JSON editor: In contrast, the JSON editor allows for more optimized paths and accepts both versions, including the one where the paths join at the Column node. Following guideline #3 (Define the smallest possible derived relation type to achieve your goal), the simplified version is preferred because it contains fewer relations (five instead of nine). However, the linear version with three separate paths is still within acceptable limits, so a derived relation type built with the graphical relation type builder remains efficient.
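The "five instead of nine" comparison can be checked with a short Python sketch. It assumes that the three linear paths consist of the relations identified in the analysis above (two training paths and one inference path, three relations each). The constant names are hypothetical, and the sketch only counts relations; it does not reproduce the JSON editor format.

```python
# Relations identified in the analysis, as (head, role, tail) tuples.
TRAINED_BY = ("AI Use Case", "trained by", "Asset")
INFERS_FROM = ("AI Use Case", "infers from", "Asset")
TABLE_CONTAINS = ("Table", "contains", "Column")
DATA_SET_CONTAINS = ("Data Set", "contains", "Data Element")
CATEGORIZED_BY = ("Data Asset", "is categorized by", "Data Category")

# Linear representation: every path repeats the relations it shares with others.
LINEAR_PATHS = [
    [TRAINED_BY, TABLE_CONTAINS, CATEGORIZED_BY],     # training data, via Table
    [TRAINED_BY, DATA_SET_CONTAINS, CATEGORIZED_BY],  # training data, via Data Set
    [INFERS_FROM, TABLE_CONTAINS, CATEGORIZED_BY],    # inference data, via Table
]

linear_count = sum(len(path) for path in LINEAR_PATHS)

# Joined representation: each distinct relation appears only once.
joined_relations = {relation for path in LINEAR_PATHS for relation in path}

print(linear_count)           # 9
print(len(joined_relations))  # 5
```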