Custom technical lineage JSON file examples

This topic shows sample lineage.json files that create a simple custom technical lineage and an advanced custom technical lineage.

Each sample can be used to generate technical lineage graphs in Collibra to represent the IOT_JSON and IOT_DEVICES_PER_COUNTRY tables with the following columns:

IOT_JSON       IOT_DEVICES_PER_COUNTRY
CCA3           COUNTRY
DEVICE_ID      NUMBER_DEVICES

Sample custom technical lineage definition for a simple custom technical lineage

In the following sample, the tree section defines the IOT_JSON and IOT_DEVICES_PER_COUNTRY tables and columns. The tables are in a schema named COLLIBRA. The COLLIBRA schema is in a database named COLLIBRA and a system named Databricks. Technical lineage via Edge ignores the Collibra system name setting for custom technical lineage. The Databricks system is used for stitching, regardless of whether Collibra system name is set to True or False.

To show the transformation code at the bottom of the Collibra technical lineage graph for a simple custom technical lineage, specify the mapping and source_code properties in the lineages section.

{
  "version": "1.0",
  "tree": [
    {
      "name": "Databricks",
      "type": "system",
      "children": [
        {
          "name": "COLLIBRA",
          "type": "database",
          "children": [
            {
              "name": "COLLIBRA",
              "type": "schema",
              "children": [
                {
                  "name": "IOT_JSON",
                  "type": "table",
                  "leaves": [
                    { "name": "CCA3", "type": "column" },
                    { "name": "DEVICE_ID", "type": "column" }
                  ]
                },
                {
                  "name": "IOT_DEVICES_PER_COUNTRY",
                  "type": "table",
                  "leaves": [
                    { "name": "COUNTRY", "type": "column" },
                    { "name": "NUMBER_DEVICES", "type": "column" }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  ],
  "lineages": [
    {
      "src_path": [
        { "system": "Databricks" },
        { "database": "COLLIBRA" },
        { "schema": "COLLIBRA" },
        { "table": "IOT_JSON" },
        { "column": "CCA3" }
      ],
      "trg_path": [
        { "system": "Databricks" },
        { "database": "COLLIBRA" },
        { "schema": "COLLIBRA" },
        { "table": "IOT_DEVICES_PER_COUNTRY" },
        { "column": "COUNTRY" }
      ],
      "mapping": "dev_no_bat_per_country_view",
      "source_code": "INSERT INTO ... SELECT CCA3 AS COUNTRY...FROM IOT_JSON"
    }
  ]
}
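
Before uploading a file like the one above, it can help to confirm that every src_path and trg_path in the lineages section points at a node that the tree section actually defines. The following Python sketch is not part of Collibra; the function names tree_paths and unresolved_paths are hypothetical, and the embedded data is a trimmed copy of the simple sample with one column per table. It assumes, as the sample suggests, that a path must match the tree exactly from system down to column.

```python
def tree_paths(nodes, prefix=()):
    """Yield the full path of every node in a tree section as (type, name) pairs."""
    for node in nodes:
        path = prefix + ((node["type"], node["name"]),)
        yield path
        # Containers nest under "children"; tables list their columns under "leaves".
        yield from tree_paths(node.get("children", []) + node.get("leaves", []), path)


def unresolved_paths(lineage_doc):
    """Return the src_path/trg_path entries that match no node in the tree."""
    known = set(tree_paths(lineage_doc["tree"]))
    missing = []
    for entry in lineage_doc["lineages"]:
        for key in ("src_path", "trg_path"):
            # Each path step is a single-key object such as {"table": "IOT_JSON"}.
            path = tuple((t, n) for step in entry[key] for t, n in step.items())
            if path not in known:
                missing.append((key, path))
    return missing


# Trimmed copy of the simple sample (one column per table), for illustration only.
sample = {
    "version": "1.0",
    "tree": [{
        "name": "Databricks", "type": "system",
        "children": [{
            "name": "COLLIBRA", "type": "database",
            "children": [{
                "name": "COLLIBRA", "type": "schema",
                "children": [
                    {"name": "IOT_JSON", "type": "table",
                     "leaves": [{"name": "CCA3", "type": "column"}]},
                    {"name": "IOT_DEVICES_PER_COUNTRY", "type": "table",
                     "leaves": [{"name": "COUNTRY", "type": "column"}]},
                ],
            }],
        }],
    }],
    "lineages": [{
        "src_path": [{"system": "Databricks"}, {"database": "COLLIBRA"},
                     {"schema": "COLLIBRA"}, {"table": "IOT_JSON"},
                     {"column": "CCA3"}],
        "trg_path": [{"system": "Databricks"}, {"database": "COLLIBRA"},
                     {"schema": "COLLIBRA"}, {"table": "IOT_DEVICES_PER_COUNTRY"},
                     {"column": "COUNTRY"}],
    }],
}

print(unresolved_paths(sample))
```

An empty list means both paths resolve. A misspelled table or column name would show up as a (key, path) pair in the result.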

Sample custom technical lineage definition for an advanced custom technical lineage

In the following sample, the tree section defines the IOT_JSON and IOT_DEVICES_PER_COUNTRY tables and columns. The tables are in a schema named COLLIBRA. The COLLIBRA schema is in a database named COLLIBRA and a system named Databricks. Technical lineage via Edge ignores the Collibra system name setting for custom technical lineage. The Databricks system is used for stitching, regardless of whether Collibra system name is set to True or False.

{
  "version": "1.0",
  "tree": [
    {
      "name": "Databricks",
      "type": "system",
      "children": [
        {
          "name": "COLLIBRA",
          "type": "database",
          "children": [
            {
              "name": "COLLIBRA",
              "type": "schema",
              "children": [
                {
                  "name": "IOT_JSON",
                  "type": "table",
                  "leaves": [
                    { "name": "CCA3", "type": "column" },
                    { "name": "DEVICE_ID", "type": "column" }
                  ]
                },
                {
                  "name": "IOT_DEVICES_PER_COUNTRY",
                  "type": "table",
                  "leaves": [
                    { "name": "COUNTRY", "type": "column" },
                    { "name": "NUMBER_DEVICES", "type": "column" }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  ],
  "lineages": [
    {
      "src_path": [
        { "system": "Databricks" },
        { "database": "COLLIBRA" },
        { "schema": "COLLIBRA" },
        { "table": "IOT_JSON" },
        { "column": "CCA3" }
      ],
      "trg_path": [
        { "system": "Databricks" },
        { "database": "COLLIBRA" },
        { "schema": "COLLIBRA" },
        { "table": "IOT_DEVICES_PER_COUNTRY" },
        { "column": "COUNTRY" }
      ],
      "mapping_ref": {
        "source_code": "transforms.sql",
        "mapping": "dev_no_bat_per_country_view",
        "codebase_pos": [
          { "pos_start": 71, "pos_len": 69 }
        ]
      }
    }
  ],
  "codebase_files": {
    "transforms.sql": {
      "mapping_refs": {
        "dev_no_bat_per_country_view": {
          "pos_start": 0,
          "pos_len": 246
        }
      }
    }
  }
}
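
In the advanced sample, the mapping_ref in the lineages section names a file (transforms.sql) and a mapping (dev_no_bat_per_country_view) that both appear under codebase_files. The following Python sketch checks that pairing. It is an illustration only: the function name dangling_mapping_refs is hypothetical, and the check assumes that every mapping_ref is expected to resolve to an entry in codebase_files, which the sample shows but this topic does not state as a rule.

```python
def dangling_mapping_refs(lineage_doc):
    """Return (source_code, mapping) pairs used in lineages but absent from codebase_files."""
    files = lineage_doc.get("codebase_files", {})
    dangling = []
    for entry in lineage_doc["lineages"]:
        ref = entry.get("mapping_ref")
        if ref is None:
            # Simple lineages carry "mapping" and "source_code" inline instead.
            continue
        declared = files.get(ref["source_code"], {}).get("mapping_refs", {})
        if ref["mapping"] not in declared:
            dangling.append((ref["source_code"], ref["mapping"]))
    return dangling


# Only the parts of the advanced sample that this check reads.
advanced = {
    "lineages": [{
        "mapping_ref": {
            "source_code": "transforms.sql",
            "mapping": "dev_no_bat_per_country_view",
            "codebase_pos": [{"pos_start": 71, "pos_len": 69}],
        },
    }],
    "codebase_files": {
        "transforms.sql": {
            "mapping_refs": {
                "dev_no_bat_per_country_view": {"pos_start": 0, "pos_len": 246},
            },
        },
    },
}

print(dangling_mapping_refs(advanced))
```

An empty list means every mapping_ref found a matching entry.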

Sample technical lineage graphs

Both sample lineage.json files generate the following technical lineage graph, which contains 2 nodes and 1 edge.

The following technical lineage graph is generated by using the sample lineage.json file for an advanced custom technical lineage. The bottom part shows the transformation code that generated the data flow.

In the lineages section, the pos_start property is set to 71 and the pos_len property is set to 69. These values indicate that the 69 characters of transformation code starting at position 71 are highlighted in blue. Line 2 in the technical lineage graph contains the highlighted transformation code.
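
Counting offsets by hand is error-prone, so a small helper can derive pos_start and pos_len from the code text itself. The following Python sketch is an illustration under stated assumptions: the function name codebase_pos_for is hypothetical, the SQL string is made up and is not the transforms.sql file from the sample, and positions are treated as zero-based character offsets. The pos_start value of 0 in the sample's codebase_files section suggests zero-based counting, but this topic does not confirm it.

```python
def codebase_pos_for(code, snippet):
    """Return a pos_start/pos_len pair for the first occurrence of snippet in code."""
    start = code.find(snippet)
    if start < 0:
        raise ValueError("snippet not found in code")
    return {"pos_start": start, "pos_len": len(snippet)}


# Hypothetical SQL for illustration; not the transforms.sql used by the sample.
sql = "CREATE VIEW v AS\nSELECT CCA3 AS COUNTRY\nFROM IOT_JSON;\n"

pos = codebase_pos_for(sql, "SELECT CCA3 AS COUNTRY")

# Slice the code with the computed values to see what would be highlighted.
highlighted = sql[pos["pos_start"]:pos["pos_start"] + pos["pos_len"]]
print(pos)
print(repr(highlighted))
```

Slicing the text with the computed values, as in the last lines, is a quick way to confirm that the offsets cover exactly the intended code before adding them to codebase_pos.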