Prepare the lineage harvester configuration file for MicroStrategy

Before you run the lineage harvester, you have to prepare a configuration file. The lineage harvester collects your MicroStrategy metadata and sends it to the Collibra Data Lineage server, where it is processed and analyzed. Collibra Data Intelligence Cloud then imports the MicroStrategy assets and relations into Data Catalog.

Prerequisites

Steps

  1. Run the following command to start the lineage harvester:
    • Windows: .\bin\lineage-harvester.bat
    • For other operating systems: chmod +x bin/lineage-harvester and then bin/lineage-harvester
    An empty configuration file is created in the config folder.
  2. Open the lineage-harvester.conf file and enter the values for each property.
    Properties | Description
    general

    This section describes the connection information between the lineage harvester and Data Catalog.

    catalog

    This section contains information that is necessary to connect to Data Catalog.

    url

    The URL of your Collibra Data Intelligence Cloud environment.

    Note You can only enter the public URL of your Collibra DGC environment. Other URLs will not be accepted.

    username

    The username that you use to sign in to Collibra.

    useSharedDbModel

    Optional property that enables the sharing of metadata batches from multiple SQL data sources. Set this property to true to help avoid potential analysis errors on the Collibra Data Lineage server.

    To use this property, you need lineage harvester 2022.07 or newer.

    If you set this property to true, you have to run the lineage harvester twice. Read the following details about the issue and solution.

    sources

    This section contains all MicroStrategy connection properties.

    type

    The kind of data source. In this case, the value has to be MicroStrategy.

    collibraSystemName

    This property is deprecated for MicroStrategy integration. The lineage harvester does not take into account any value that you enter here.

    id

    The unique ID of your MicroStrategy metadata. For example, my_microstrategy.

    Warning In the sources section of your lineage harvester configuration file, you can specify only one id property per MicroStrategy Intelligence Server. If you specify multiple id properties for a single MicroStrategy Intelligence Server, ingestion fails. Use multiple id properties only if you want to ingest from multiple distinct MicroStrategy Intelligence Servers.

    Tip This value can be anything, as long as it is unique and human-readable. The ID identifies the batch of MicroStrategy metadata on the Collibra Data Lineage server.

    domainId

    The unique reference ID of the domain in Collibra Data Intelligence Cloud in which you want to ingest the MicroStrategy assets.

    username

    The username that you use to sign in to MicroStrategy.

    hostname

    The endpoint that you use to access the PostgreSQL repository or remote data source, depending on where you installed the lineage harvester.

    For example, remote.postgres.com.

    port

    The port number that you use to access the PostgreSQL repository or remote data source.

    databaseName

    Optionally, the name of your database. For example, poc_metadata.

  3. Save the configuration file.
  4. In the console, start the lineage harvester again by running the following command:
    • Windows: .\bin\lineage-harvester.bat full-sync
    • For other operating systems: ./bin/lineage-harvester full-sync
  5. When prompted, enter the password or client secret for your Collibra Data Intelligence Cloud environment and for your MicroStrategy environment.
    The passwords are encrypted and stored in /config/pwd.conf.
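
Before you run the full-sync command, a quick local check can catch missing or duplicate values in the configuration file. The following Python sketch is not part of the lineage harvester. It assumes that the file is valid JSON, that every property the table does not mark as optional or deprecated is required, and that sources is either a single object or a list of objects. The names check_config and check_config_file are hypothetical. The check only verifies that id values are unique within the file. It cannot detect two different id values that point to the same MicroStrategy Intelligence Server.

```python
import json

# Assumption: every property that the table does not mark as optional or
# deprecated is treated as required. Adjust these tuples to your setup.
REQUIRED_CATALOG_KEYS = ("url", "username")
REQUIRED_SOURCE_KEYS = ("type", "id", "domainId", "username", "hostname", "port")


def check_config(config):
    """Return a list of problems found in a parsed configuration (empty if none)."""
    problems = []

    # Check the Collibra connection properties under general.catalog.
    catalog = config.get("general", {}).get("catalog", {})
    for key in REQUIRED_CATALOG_KEYS:
        if catalog.get(key) in (None, ""):
            problems.append(f"general.catalog.{key} is missing or empty")

    # Accept a single source object as well as a list of source objects.
    sources = config.get("sources") or []
    if isinstance(sources, dict):
        sources = [sources]
    if not sources:
        problems.append("sources is missing or empty")

    seen_ids = set()
    for index, source in enumerate(sources):
        for key in REQUIRED_SOURCE_KEYS:
            if source.get(key) in (None, ""):
                problems.append(f"sources[{index}].{key} is missing or empty")

        # Each id must be unique within the configuration file.
        source_id = source.get("id")
        if source_id in (None, ""):
            continue
        if source_id in seen_ids:
            problems.append(f"sources[{index}].id duplicates an earlier id: {source_id}")
        seen_ids.add(source_id)

    return problems


def check_config_file(path):
    """Load the JSON configuration file at `path` and check it."""
    with open(path, encoding="utf-8") as handle:
        return check_config(json.load(handle))
```

For example, check_config_file("config/lineage-harvester.conf") returns an empty list if the sketch finds no problems. Otherwise, it returns one message per problem, such as a missing hostname or a repeated id.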

Example

The following example shows a configuration file for MicroStrategy.

{
  "general": {
    "catalog": {
      "url": "https://<organization>.collibra.com",
      "username": "<your-collibra-username>"
    },
    "useSharedDbModel": true,
    "useCollibraSystemName": false
  },
  "sources": [
    {
      "type": "Microstrategy",
      "id": "microstrategy-batch",
      "collibraSystemName": "system-name",
      "domainId": "<domain-resource-id>",
      "username": "mstr",
      "hostname": "remote.postgres.com",
      "port": 5432,
      "databaseName": "poc_metadata"
    }
  ]
}