Prepare the lineage harvester configuration file for MicroStrategy (NEW)(Beta)

Before you run the lineage harvester, prepare the lineage harvester configuration file. The file provides the connection information that the lineage harvester needs to connect your MicroStrategy server and remote data source to the Collibra Data Lineage service instance and to the domain in which you want to ingest the MicroStrategy assets.

Before you begin

Requirements and permissions

  • Collibra Data Intelligence Cloud.
  • A global role with the following global permissions:
    • Catalog, for example Catalog Author
    • Data Stewardship Manager
    • Manage all resources
    • System administration
    • Technical lineage
  • A resource role with the following resource permissions on the community level in which you created the BI Data Catalog domain:
    • Asset: add
    • Attribute: add
    • Domain: add
    • Attachment: add
  • Necessary permissions to all database objects that the lineage harvester accesses.
    Tip 

    Some data sources require specific permissions.

    Ensure that you meet the Azure Data Factory prerequisites.
    You need read access on the SYS schema.
    You need read access on the SYS schema and the View Definition Permission in your SQL Server.
    You need read access on information_schema:
    • bigquery.datasets.get
    • bigquery.tables.get
    • bigquery.tables.list
    • bigquery.jobs.create
    • bigquery.routines.get
    • bigquery.routines.list
    GRANT SELECT, at table level. Grant this to every table for which you want to create a technical lineage.
    You need read access on information_schema. Only views that you own are processed.
    SELECT, at table level. Grant this to every table for which you want to create a technical lineage.
    The role of the user that you specify in the username property in the lineage harvester configuration file must be the owner of the views in PostgreSQL.
    A role with the LOGIN option.
    SELECT WITH GRANT OPTION, at Table level.
    CONNECT ON DATABASE
    Note The following permissions are the same, regardless of the ingestion mode: SQL or SQL-API.
    You need a role that can access the Snowflake shared read-only database. To access the shared database, the account administrator must grant the IMPORTED PRIVILEGES privilege on the shared database to the user that runs the lineage harvester.
    Tip If the default role in Snowflake does not have the IMPORTED PRIVILEGES privilege, you can use the customConnectionProperties property in the lineage harvester configuration file to assign the appropriate role to the user. For example:
    "customConnectionProperties": "role=METADATA"
    You need read access on the DBC.
    You need read access to the following dictionary views:
    • all_tab_cols
    • all_col_comments
    • all_objects
    • ALL_DB_LINKS
    • all_mviews
    • all_source
    • all_synonyms
    • all_views
    You need read access on definition_schema.
    • Your user role must have privileges to export assets.
    • You must have read permission on all assets that you want to export.
    • You have added the Matillion certificate to a Java truststore.
    • You have at least a Matillion Enterprise license.
  • In MicroStrategy:
    • Admin API permissions.
    • Permissions to access the library server.
    • The lineage harvester uses port 443. If the port is not open, you also need permissions to access the repository.

Steps

  1. Open the lineage-harvester.conf file that was created when you installed the lineage harvester, and enter the values for each property.
    Properties    Description
    general

    This section describes the connection information between the lineage harvester and Data Catalog.

    techlin

    This section contains information that is necessary to connect to the Collibra Data Lineage service instance.

    Warning This applies only to US government customers.

    url

    The URL of the Collibra Data Lineage service instance.
    "url": "https://techlin-gov.collibra.com"

    Warning This applies only to US government customers.

    userKey

    The unique API key to connect to the Collibra Data Lineage service instance.

    A unique user key is needed for each Collibra environment. If you're not sure what your user key is, please contact your Collibra Customer Success Manager.

    Warning This applies only to US government customers.

    catalog

    This section contains information that is necessary to connect to Data Catalog.

    url

    The URL of your Collibra Data Intelligence Cloud environment.

    Note You can only enter the public URL of your Collibra DGC environment. Other URLs will not be accepted.

    username

    The username that you use to sign in to Collibra.

    useCollibraSystemName

    Indicates whether or not you want to use the system or server name of a data source to match to the System asset in Data Catalog during automatic stitching. This is useful when you have multiple databases with the same name.

    By default, the useCollibraSystemName property is set to false. If you want to use it, set it to true.

    Important 
    • If you set the useCollibraSystemName property to true, the lineage harvester reads the value of the collibraSystemName property in your MicroStrategy <source ID> configuration file.
    • If you set the useCollibraSystemName property to false, the lineage harvester ignores the collibraSystemName property in your MicroStrategy <source ID> configuration file.
    sources

    This section contains all MicroStrategy connection properties.

    id

    The unique ID of your MicroStrategy metadata. For example, my_microstrategy.

    Warning In the sources section of your lineage harvester configuration file, specify exactly one id property per MicroStrategy Intelligence Server. If a single MicroStrategy Intelligence Server has multiple id properties, ingestion fails. Multiple id properties in the configuration file indicate that you intend to ingest from multiple distinct MicroStrategy Intelligence Servers.

    Tip This value can be anything as long as it is unique and human readable. The ID identifies the batch of MicroStrategy metadata on the Collibra Data Lineage service.

    type

    The kind of data source. In this case, the value has to be MSTR_V2.

    url

    The URL of your MicroStrategy account.

    username

    The username that you use to sign in to MicroStrategy.

    maxParallelRequests

    This optional property specifies the internal sizing, that is, the number of tasks that can run at the same time.

    The default value is "1", which means that HTTP requests run one at a time instead of in parallel. A value of "5", for example, means that as many as 5 HTTP requests can run in parallel.

    A lower value reduces the chances of experiencing HTTP 401 Unauthorized errors.

    deleteRawMetadataAfterProcessing

    The lineage harvester harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing.

    You can use this optional property to specify whether the raw metadata is deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    The default value is false.

    If the property is set to true, the raw source metadata is deleted after processing. If set to false, it is stored in the Collibra infrastructure.

    Note Setting this property to true can negatively impact performance.

    appUrlSuffix

    This optional property ensures that the correct URL to data objects in MicroStrategy is included on the asset pages of corresponding MicroStrategy assets. The required value depends on the platform on which you run MicroStrategy:

    • For J2EE, use: "appUrlSuffix": "MicroStrategy/servlet/mstrWeb"
    • For .NET, use: "appUrlSuffix": "MicroStrategy/asp/Main.aspx"

  2. Save the configuration file.
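
The properties described in step 1 could be combined roughly as in the following sketch. It is an illustration under assumptions, not a definitive template: the nesting of the sections is inferred from the order of the property table, the value types (for example, maxParallelRequests written as a number) are assumed, and every value in angle brackets is a hypothetical placeholder that you replace with your own. The techlin section is shown only because it is listed in the table; it applies only to US government customers. The id value my_microstrategy and the J2EE appUrlSuffix value are taken from the descriptions in step 1.

```json
{
  "general": {
    "techlin": {
      "url": "https://techlin-gov.collibra.com",
      "userKey": "<your-user-key>"
    },
    "catalog": {
      "url": "https://<your-collibra-environment>",
      "username": "<your-collibra-username>"
    },
    "useCollibraSystemName": false
  },
  "sources": [
    {
      "id": "my_microstrategy",
      "type": "MSTR_V2",
      "url": "https://<your-microstrategy-server>",
      "username": "<your-microstrategy-username>",
      "maxParallelRequests": 1,
      "deleteRawMetadataAfterProcessing": false,
      "appUrlSuffix": "MicroStrategy/servlet/mstrWeb"
    }
  ]
}
```

If you ingest from more than one MicroStrategy Intelligence Server, the sources section would presumably contain one such entry per server, each with its own unique id.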

What's next?

Prepare your <source ID> configuration file.