DGC service configuration settings

The settings in this topic apply to Collibra Data Intelligence Platform 2024.04. For on-premises installations, check the compatibility table to find out which version of Collibra Data Intelligence Platform is equivalent to the version of Collibra Data Governance Center that you are using.

Note 
  • You can edit the DGC service configuration in Collibra Console or Collibra settings. To edit the DGC service configuration, you need the ADMIN role, unless it is specified that a SUPER role is required.
  • The SUPER role does not exist in cloud environments. Therefore, settings that require the SUPER role are not available in Collibra Data Intelligence Platform environments. If you need to edit one of these settings, create a support ticket.
  • The numbers used to identify the following sections may differ from those shown on the Services Configuration page of the Collibra settings.

1 General settings

The general settings of Collibra Data Intelligence Platform.

Setting Description
Default locale (Requires restart) The default locale for new users. It has to contain a language code and may contain a country code. Examples: pl, en_US, nl_BE.
Enable view rights
  • True (default): The view permissions feature is enabled.
  • False: The view permissions feature is disabled.

Base URL

This setting requires the SUPER role.

The base URL for this Collibra instance, for example http://dgc.example.com. It is the consistent part of the URL to access Collibra Data Intelligence Platform.

Among other things, the base URL is used to:

  • construct hyperlinks to the system, for example in emails.
  • display your profile picture.
  • display Tableau report images.
  • ...
Enable auditing (Requires restart)

This setting requires the SUPER role.
  • True (default): Audit and history information is stored.
  • False: Audit and history information is not stored.

Google Analytics tracking ID

This setting requires the SUPER role.

The Google Analytics 3 Tracking ID for capturing web analytics for your Collibra environment.

The tracking ID is used to embed the Google Analytics code snippet in Collibra pages. The snippet captures default Google Analytics 3 events, such as page visits and form submissions.

Google Analytics 4 tracking ID

This setting requires the SUPER role.

The Google Analytics 4 Tracking ID for capturing web analytics for your Collibra environment.

The tracking ID is used to embed the Google Analytics code snippet in Collibra pages. The snippet captures default Google Analytics 4 events, such as page visits and form submissions.

Show target asset type above relation table

  • True (default): Show the asset type of the target asset in the title of relation tables on an asset page. The target asset can be either the head or the tail of the relation, depending on which asset page you have open.
  • False: Hide the asset type of the target asset.

The default value is true.

Collect Application Usage Data

This setting requires the SUPER role.

The usage data is used to understand how users interact with Collibra. The information can be used to provide reporting and recommendations.

When you enable this setting in a cloud environment, the data collection starts immediately. When you enable the setting in an on-premises environment, you may need to approve the tracking script (pendo.io, app.pendo.io, and cdn.pendo.io).

  • True (default): Usage data is gathered and sent to Collibra.
  • False: Gathering usage data is disabled.

Usage Data API key

This setting requires the SUPER role.

The Pendo usage data API key that you want to use to collect the usage data.

Homepage

  • True: Enables the Homepage. When you sign in to Collibra, the Homepage is shown. The Homepage replaces your default dashboard.
  • False: Disables the Homepage. When you sign in to Collibra, your default dashboard is shown, instead of the Homepage.
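
The format described for the Default locale setting (a language code, optionally followed by a country code) can be checked with a small sketch like the one below. The pattern is an assumption inferred from the examples pl, en_US, and nl_BE; Collibra may accept other forms, and `looks_like_locale` is a hypothetical helper name.

```python
import re

# Assumed shape, inferred from the documented examples only:
# a lowercase language code, optionally followed by "_" and an
# uppercase two-letter country code.
LOCALE_PATTERN = re.compile(r"[a-z]{2,3}(?:_[A-Z]{2})?")


def looks_like_locale(value: str) -> bool:
    """Return True if value has the language[_COUNTRY] shape."""
    return LOCALE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("pl", "en_US", "nl_BE", "en-US"):
        print(candidate, looks_like_locale(candidate))
```

A check like this only catches malformed values before you save the setting. It does not confirm that the locale is actually supported.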

1.1 Help Menu

The configuration of the Help menu in Collibra Data Intelligence Platform.

Setting Description
Links The list of links in the help menu.
  Menu item name
The name of the menu item as it will appear in Collibra Data Intelligence Platform's help menu.
  Menu index
The position of the menu item in the help menu. The top position has the value 1.
  Menu URL
The target URL of the menu item.
  Show admin only
  • True: The menu item is only visible to users with the Sysadmin role.
  • False: The menu item is visible to every user.

2 Email configuration

The configuration of email notifications.

Note In a Collibra Data Intelligence Platform environment, you cannot update the email server settings, such as host and port. For more information, see Collibra Data Intelligence Platform infrastructure.

Setting Description
Default schedule (Requires restart)

The Cron schedule that determines when emails are sent. Use it to send emails in batches at specific times and avoid an overload of emails.

This schedule applies only to workflow emails. It is independent of the notification schedule.

If you create an invalid Cron pattern, Collibra Data Intelligence Platform stops responding.

Template map The location of template emails.

Password

This setting requires the SUPER role.

The password paired with your username to sign in to your SMTP server.
From address

The email address used as the sender of all outgoing emails.

To change the From address, contact Collibra Support. See also Email configuration.

Port

This setting requires the SUPER role.

The port to connect to your SMTP server. The default value is 25.

Host

This setting requires the SUPER role.

The hostname or URL of your SMTP server.

Start TLS

This setting requires the SUPER role.

  • True: Use TLS (Transport Layer Security) to connect to your SMTP server.
  • False (default): Do not use TLS to connect to your SMTP server.

Username

This setting requires the SUPER role.

The username to sign in to your SMTP server.

Sending threads

This setting requires the SUPER role.

The number of threads that are used to send emails. The default value is 3.

Max retries

This setting requires the SUPER role.

The maximum number of retries before the system aborts the sending of an email. The default value is 5.

Email address change notification

This setting is only available for the ADMIN role.

If you change the email address to which notifications are sent, notification of the change is sent to the old email address.

2.1 Notifications

The configuration of notification emails to users.

Note These settings can be overridden for every user in the preferences.xml file.

Setting Description
Notification days

The days of the week on which Collibra sends notifications. The days are represented by numbers from 1 to 7, where 1 represents Sunday.

You can add one day per row.

Daily roles The roles that receive notifications on the days defined in Notification days.
Enable monthly notifications
  • True: The users receive a monthly summary.
  • False (default): The users do not receive a monthly summary.
Roles for monthly notifications The roles that receive monthly notification emails. This is only relevant if Enable monthly notifications is True.
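
The numbering used by Notification days can be illustrated with a short sketch. The source only states that 1 represents Sunday; the sketch assumes that the remaining numbers follow in calendar order. The function names are hypothetical.

```python
from datetime import date

# Assumption: 1 = Sunday (documented), 2..7 = Monday..Saturday (inferred).
DAY_NAMES = ("Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday")


def notification_day_name(number: int) -> str:
    """Translate a Notification days value (1-7) into a day name."""
    if not 1 <= number <= 7:
        raise ValueError("Notification days must be between 1 and 7")
    return DAY_NAMES[number - 1]


def notification_day_number(day: date) -> int:
    """Convert a date to the 1-7 numbering, where 1 is Sunday.

    Python's date.weekday() uses Monday = 0 ... Sunday = 6.
    """
    return (day.weekday() + 1) % 7 + 1
```

For example, `notification_day_number(date(2024, 4, 7))` returns 1, because 7 April 2024 is a Sunday.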

2.2 Handlers

A mail handler can poll for emails on a mail server, process those emails and perform actions based on the contents.

Setting Description

Host

This setting requires the SUPER role.

The hostname or URL of the incoming mail server.

Port

This setting requires the SUPER role.

The port to connect to your incoming mail server.

Protocol

This setting requires the SUPER role.

The protocol to connect to your incoming mail server, with or without SSL (POP3, POP3S, IMAP, IMAPS).

Note The additional S at the end of an abbreviation indicates the secure version of the protocol, which uses SSL. Using it requires correctly configured SSL certificates.

    Force domain

    This setting requires the SUPER role.

    • True: Only handle emails from the same domain as the handler's email address.
    • False (default): Handle emails from any domain.

    Handler list

    This setting requires the SUPER role.

    The configuration of email handlers, which can poll emails on an email server, process those emails and perform actions based on the contents.
     

    Enabled

    This setting requires the SUPER role.

    • True: The handler is enabled.
    • False (default): The handler is not enabled.
     

    Name

    This setting requires the SUPER role.

    The name of the mail handler. We recommend using a meaningful name so that you can easily identify the purpose of the handler.
     

    Username

    This setting requires the SUPER role.

    The username to connect to the incoming mail server.
     

    Password

    This setting requires the SUPER role.

    The password to connect to the incoming mail server.
     

    Email address

    This setting requires the SUPER role.

    The email address to which workflow action mails are sent.
     

    Polling interval

    This setting requires the SUPER role.

    The time in milliseconds between two consecutive polls of the mail server.
     

    Delete

    This setting requires the SUPER role.

    • True: Delete messages from the mail server once the mail is processed.
    • False (default): Keep messages on the mail server after the mail is processed.

    This option is only relevant if Protocol is IMAP or IMAPS.

     

    Alias filter

    This setting requires the SUPER role.

    • True (default): Retrieve only emails whose To field contains the email address of the handler.
    • False: Do not filter on the To field.

    3 Hyperlinking configuration

    The configuration of automatic hyperlinks. When you change a setting, you have to rebuild the hyperlinks.

    Setting Description
    Enable hyperlinking
    • True: Hyperlinks are created automatically.
    • False (default): Hyperlinks are not created automatically.

    For more information about automatic hyperlinks, see Hyperlinking.

    Warning If you enable this setting, the performance of Collibra can decrease.
    Enable case sensitivity
    • True: Hyperlinks are case-sensitive.
    • False (default): Hyperlinks are not case-sensitive.

    Note If you edit this setting, you have to reindex Collibra.

    Excluded asset type IDs

    The list of asset types that are ignored by automatic hyperlinking. You can use this setting to exclude particular asset types so that not all the assets are potential targets for automatic hyperlinking.

    Each time you enter an asset type ID in a field, a new field appears for you to add another ID. To exclude multiple asset types, use separate fields, instead of entering multiple IDs in a single field separated by commas.

    Excluding asset types reduces the number of hyperlinks, which improves performance.

    Tip We recommend that you exclude technical asset types such as Column, Field, Table, Code Value, and Code Set.

    Note If you edit this setting, you have to reindex Collibra.

    4 Recommender configuration

    The configuration of the recommender.
    We recommend that you do not change the default values of this configuration.

    Setting Impacts Description
    Catalog recommender enabled All recommendations
    • True (default): The "Data sets you might like" section is included on the Data Catalog Home page. This section shows data sets you might be interested in, as determined by the recommender, which takes into account your data sets and the data sets of similar users.
    • False: The "Data sets you might like" section is not included on the Data Catalog Home page.
    Data set recommender execution time Recommendations of data sets to users

    The schedule (CRON job) by which the data set recommender looks for recommended data sets for a user.

    By default the data set recommender does this every night.

    Asset recommender execution time Recommendations of business assets to data assets The schedule (CRON job) by which the asset recommender looks for suggested relations between business assets and data sets.
    Data set matcher execution time Data set matcher The schedule (CRON job) by which the data set matcher looks for similar data sets.
    Data set similarity threshold Data set matcher

    The percentage of business assets that must be related to both data sets before the data sets are considered similar.

    This percentage is expressed as a decimal, where 1,00 equals 100%.

    Duplicate schema threshold Schema matcher

    The percentage of assets that must be related to both schemas before the schemas are considered similar.

    This percentage is expressed as a decimal, where 1,00 equals 100%.

    Fuzzy vs exact matching strategy for business assets Recommendations of business assets to data sets and of business assets to column assets

    The percentage that determines to what extent assets with a similar name become more important.

    The ranking in the search engine results always has an impact on the suggestion score. However, similarity between the asset names can also be taken into account. If you decrease this percentage, the ranking of the search results becomes more important for the suggestion score, while the similarity between the asset names becomes less important. If you increase the percentage, assets with similar names will receive a higher suggestion score.

    This percentage is expressed as a decimal, where 1,00 equals 100%. You can enter a value greater than 1,00.

    Recommendation weights for data sets Recommendations of data sets to users

    An ordered, comma-separated list of values that define the importance of properties for recommendations. The order of the values reflects their importance.

    This setting is only used for data set recommendations if your Collibra environment does not yet have enough data for the active recommendation algorithms to produce relevant results.

    Possible values:

    • CERTIFIED: Data sets that are certified are considered more relevant.
    • POPULARITY: The number of visits to the data set page.
    Active recommendation algorithms Recommendations of data sets to users and of business assets to data sets

    A comma-separated list of algorithms that calculate recommendations. By default, all available algorithms are listed.

    Possible values:

    Data set elements threshold Recommendations of data sets to users

    The maximum number of elements per data set that the recommender will use to train the model. The data set elements are taken randomly.
    Lowering this number can prevent out-of-memory issues but also impacts the accuracy of recommendations for large data sets.

    Warning If you create an invalid Cron pattern, Collibra Data Intelligence Platform stops responding.

    5 Search index configuration

    The configuration of the search index.

    Setting Description
    UI search appends wildcard
    • True (default): A wildcard (asterisk) is automatically added to each search query. An asterisk is not added if:
      • The query contains a tilde (~).
      • The query ends with a quotation mark (").

      Note This applies only to queries via the user interface. A wildcard is not added automatically for REST API queries.

    • False: A wildcard is not added to the search query.
    Maximum batch size

    The number of resources that are scanned in a single operation for the search query.

    • Default value: 5,000
    • Maximum value: 30,000

    Maximum batch size for relations

    The maximum batch size for reindexing relations.

    • Default value: 500
    • Minimum value: 50
    • Maximum value: 10,000

    Stop words

    (Requires restart)

    A list of stop words that are ignored as tokens for the index.

    The default list of English stop words includes the following:

    a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, such, that, the, their, then, there, these, they, this, to, was, will, with

    If you do not create your own list of stop words, the default list is used.

    If you create your own list of stop words, you need to:

    1. Reindex Collibra Data Intelligence Platform.
    2. Restart the environment to apply your changes. For more information, go to Stop an environment and Start an environment.

    Relation-based search
    • True (default in new environments): The Data Marketplace search considers certain assets and relation types between assets. As a result, your search results not only include assets that directly match the search criteria, but also assets that match the criteria through specific relation types.

      Example A column named Order is included in a data set named Customer. If the relation-based search is enabled and you search for Order in Data Marketplace, then the data set Customer appears in the search results because the data set contains this column.

      Tip For more information about this feature and the default relation types, go to Filtering and searching based on relations in Data Marketplace.

    • False: The Data Marketplace search results do not consider relations.

    After you enable this setting, you must reindex Data Marketplace relations or reindex Collibra completely.

    Note In new Collibra environments, this setting is enabled by default. In upgraded Collibra environments, the previous status of this setting is retained.
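
The rule described for UI search appends wildcard can be sketched as follows. This is an illustration of the two documented exceptions only, not Collibra's implementation; `append_wildcard` is a hypothetical name, and the behavior for queries that already end with an asterisk is not covered by the description above.

```python
def append_wildcard(query: str) -> str:
    """Append an asterisk unless one of the documented exceptions applies."""
    # Exception 1: the query contains a tilde (~).
    # Exception 2: the query ends with a quotation mark (").
    if "~" in query or query.endswith('"'):
        return query
    return query + "*"


if __name__ == "__main__":
    for text in ("cust", "custmer~", '"customer order"'):
        print(repr(text), "->", repr(append_wildcard(text)))
```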

    5.1 Tokenizer

    The configuration of the tokenizer of the indexing mechanism. If you edit these settings, you need to restart and reindex your environment.

    Setting Description
    Type

    The tokenizer that determines how the search text is split.

    The Type field must contain either Standard (default) or Character.

    • Standard: The Standard tokenizer is a method of splitting the search text into individual terms and is based on a default set of characters. This tokenizer follows the word break rules from the Unicode Text Segmentation algorithm.
    • Character: The Character tokenizer allows you to customize when you want a search text to be split into individual terms. For more information, go to When and how to use the Character tokenizer.
    Parameter map

    A list of characters that the Character tokenizer allows.

    If you entered Character in the Type field, you must add a parameter map.

    To add a parameter map:

    1. Click Add.
      The Add map option dialog box appears.
    2. In the Field key field, enter the following value: allowedCharacters
    3. In the Field value field, enter the set of characters that you want the tokenizer to allow. For more information, go to When and how to use the Character tokenizer.

    5.2 Boosting

    The configuration of the boosting function.

    Setting Description
    Asset The boost factor of assets.

    Class Match

    The boost factor of data classes.

    Community The boost factor of communities.
    Domain The boost factor of domains.
    User The boost factor of users.
    User group The boost factor of user groups.
    Name The boost factor of names.
    Comment The boost factor of comments.
    Tag The boost factor of tags.
    Attribute boost map

    The boost factor of attribute types.

    • Field key: The attribute type ID.
    • Field value: The boost factor of the attribute type.

    Display exact match of name as first

    • True (default): If the name of an asset is exactly the same as the search text, put it at the top of the search results regardless of boost factors.
    • False: Use the regular search order, taking into account boost factors.

    Asset boost map

    The boost factor of asset types.

    • Field key: The asset type ID.
    • Field value: The boost factor of the asset type.

    Partial exact match enabled

    Enables partial exact matching when searching for multi-word phrases.

    • True (default): For multi-word search text, the search engine considers the exact match percentage with the resource name, when ordering the results.
      Example You enter search text "scheduled maintenance". Two example assets are ordered as follows:
      1. An asset named "daily scheduled maintenance", as two of the three words (66%) match exactly.
      2. An asset named "daily scheduled maintenance revised", as two of the four words (50%) match exactly.
    • False: The exact match percentage is not taken into account in the score calculation.
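
The percentages in the Partial exact match enabled example can be reproduced with a small calculation. The sketch below only mirrors the arithmetic of that example (matching words divided by the number of words in the resource name, rounded down). How the search engine actually computes and weighs this value is not documented here, and `exact_match_percentage` is a hypothetical name.

```python
def exact_match_percentage(search_text: str, resource_name: str) -> int:
    """Share of words in resource_name that also occur in search_text."""
    search_words = set(search_text.lower().split())
    name_words = resource_name.lower().split()
    if not name_words:
        return 0
    matched = sum(1 for word in name_words if word in search_words)
    # Integer division rounds down, so 2 of 3 words gives 66.
    return matched * 100 // len(name_words)


if __name__ == "__main__":
    query = "scheduled maintenance"
    for name in ("daily scheduled maintenance",
                 "daily scheduled maintenance revised"):
        print(name, exact_match_percentage(query, name))
```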

    5.3 Slow logs configuration

    The configuration of the slow logs function.

    Setting Description
    Indexing threshold

    The time limit, in milliseconds, after which an index query is logged in Elasticsearch.

    If the value is set to 0 (zero), all index queries are logged.

    Changes to this setting require a full reindex of your Collibra Data Intelligence Platform environment.

    Fetching threshold

    The time limit, in milliseconds, after which a fetch query is logged in Elasticsearch.

    If the value is set to 0 (zero), all fetch queries are logged.

    Changes to this setting require a full reindex of your Collibra Data Intelligence Platform environment.

    5.4 Search Event Log configuration

    The configuration of indexing.

    Setting Description
    Automatic relation indexing

    This setting keeps Data Marketplace up to date if relations between assets are created, updated, or removed.

    Example If the relation between asset A and asset B changes and this relation is used in relation-based filters or relation-based search, then the Data Marketplace search considers this change.

    • True: Automatically index certain relation type changes between assets so that the relation information remains consistent between Collibra and Data Marketplace. The relation types that are considered are the relation paths used by relation-based search and filters. If such a relation type between assets changes, the change is reflected in the search index after some time.
      Tip For more information about this feature and the default relation types, go to Filtering and searching based on relations in Data Marketplace.
      Note Collibra does not automatically reindex relations between assets for relation paths that end with an attribute. You need to manually reindex the relations.
      For example, suppose you created a path that ends with an attribute: Table A contains Column B with attribute Privacy. If a column has the attribute Privacy with the value "sensitive data", a user searching for "sensitive data" can find Table A based on the relation path. However, changes to the attribute value are not picked up during automatic reindexing.
    • False (default): Changes to relations are not automatically indexed. This can cause inconsistencies between Collibra and Data Marketplace. You can, however, manually reindex Data Marketplace relations.

    6 Upload configuration

    The configuration of the file upload service.

    The file upload restrictions apply to the following actions in Collibra:

    Setting Description
    Max file size

    The maximum file size in bytes for uploads.

    • For cloud environments, the default value is 512 MB or 536,870,912 bytes. This value cannot be changed.
    • For on-premises environments, the default is 10 MB or 10,485,760 bytes.
    Max per day

    The maximum number of uploads per user per day.

    • For cloud environments, the default value is 1,235,465 uploads. This value cannot be changed.
    • For on-premises environments, the default is 150 uploads.
    Accepted content types

    The MIME type names of the files you want to allow for uploads.

    For example, type application/pdf for PDF files.

    Restricted content types

    Content types in MIME type format that cannot be uploaded. Restricted content types take precedence over accepted content types:

    • application/vnd.ms-excel.addin.macroenabled.12
    • application/vnd.ms-excel.sheet.binary.macroenabled.12
    • application/vnd.ms-excel.sheet.macroenabled.12
    • application/vnd.ms-excel.template.macroenabled.12
    • application/vnd.ms-powerpoint.addin.macroenabled.12
    • application/vnd.ms-powerpoint.presentation.macroenabled.12
    • application/vnd.ms-powerpoint.slide.macroenabled.12
    • application/vnd.ms-powerpoint.slideshow.macroenabled.12
    • application/vnd.ms-powerpoint.template.macroenabled.12
    • application/vnd.ms-word.document.macroenabled.12
    • application/vnd.ms-word.template.macroenabled.12
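
The byte values quoted for Max file size and the precedence of restricted over accepted content types can be illustrated with a minimal sketch. It assumes that the accepted list acts as an explicit allow-list. The function and parameter names are hypothetical and do not reflect Collibra's implementation.

```python
from typing import Set

MIB = 1024 * 1024
CLOUD_MAX_BYTES = 512 * MIB    # 536,870,912 bytes
ON_PREM_MAX_BYTES = 10 * MIB   # 10,485,760 bytes


def is_upload_allowed(content_type: str, size_bytes: int,
                      accepted: Set[str], restricted: Set[str],
                      max_bytes: int) -> bool:
    """Apply the restricted list first, then the accepted list, then size."""
    mime = content_type.lower()
    if mime in restricted:
        # Restricted content types win, even if the type is also accepted.
        return False
    if mime not in accepted:
        # Assumption: only explicitly accepted types may be uploaded.
        return False
    return size_bytes <= max_bytes
```

With this ordering, a macro-enabled Word document is rejected even if its MIME type also appears in the accepted list.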

    7 Statistics configuration

    The configuration of statistics.

    Setting Description
    Buffer size

    The maximum number of statistics entries that the buffer can contain before they are saved in the database.

    The default value is 10.

    Buffer flush time

    The maximum amount of time in milliseconds to keep statistics entries in memory before saving them in the database.

    The default value is 10,000.

    Cron map

    The list of statistics, identified by their Cron name, each with a Cron interval.

    These are the default values:

    Field key Field value
    workflow-task 0 59 23 * * ?
    active-users 0 0/15 * * * ?
    term-count 0 59 23 * * ?
    vocabulary-count 0 59 23 * * ?
    page-hit 0 0 * * * ?
    task-count 0 0 * * * ?

    If you create an invalid Cron pattern, Collibra Data Intelligence Platform stops responding.
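
Because an invalid Cron pattern has such severe consequences, it can help to inspect a pattern before you save it. The sketch below splits the default Cron map values into named fields. It assumes the six-field, seconds-first layout that the default values appear to follow (as in Quartz-style expressions) and checks the field count only, not the validity of each field. `split_cron` is a hypothetical helper.

```python
from typing import Dict

# Assumed field order, inferred from the default values (seconds first,
# "?" meaning "no specific value" in the last field).
FIELD_NAMES = ("second", "minute", "hour",
               "day_of_month", "month", "day_of_week")

DEFAULT_CRON_MAP = {
    "workflow-task": "0 59 23 * * ?",
    "active-users": "0 0/15 * * * ?",
    "term-count": "0 59 23 * * ?",
    "vocabulary-count": "0 59 23 * * ?",
    "page-hit": "0 0 * * * ?",
    "task-count": "0 0 * * * ?",
}


def split_cron(expression: str) -> Dict[str, str]:
    """Split a six-field Cron expression into named fields."""
    parts = expression.split()
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))


if __name__ == "__main__":
    for name, expression in DEFAULT_CRON_MAP.items():
        print(name, split_cron(expression))
```

Under that assumption, workflow-task runs daily at 23:59:00, active-users runs every 15 minutes, and page-hit runs at the start of every hour.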

    8 Import configuration

    The configuration for imports.

    Setting Description
    Enable workflows during import
    • True: Allow starting workflows upon importing assets.
    • False (default): Do not allow starting workflows upon importing assets.
    Asset responsibilities support
    • True: Enable importing responsibilities at asset level.
    • False (default): Disable importing responsibilities at asset level.

    Warning Setting specific responsibilities on a large number of resources will affect the performance and stability of the system.

    Number of failed commands before stopping import job

    An import job that has the continue-on-error option enabled stops after the specified number of commands have failed. Valid commands are still committed to the database until the job stops, so some resources may be imported even though the job did not finish.

    The default and maximum value is 100.

    Temporary data location

    The location of the temporary files used by the import job.

    The default value is FILE.

    Import UI v2

    This setting requires the SUPER role.

    • True (default): Use the new import interface for importing assets and complex relations, with improved usability and performance.
    • False: Use the original import interface for importing assets and complex relations.
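
The behavior described for Number of failed commands before stopping import job can be pictured with the following minimal sketch. It is an illustration only, not Collibra's implementation: `run_import`, the callables in `commands`, and `max_failures` are hypothetical names.

```python
from typing import Any, Callable, Iterable, List, Tuple


def run_import(commands: Iterable[Callable[[], Any]],
               max_failures: int = 100) -> Tuple[List[Any], int]:
    """Run commands in order, continuing on error until max_failures."""
    committed: List[Any] = []
    failures = 0
    for command in commands:
        try:
            # A successful command is committed immediately and stays
            # committed even if the job stops later.
            committed.append(command())
        except Exception:
            failures += 1
            if failures >= max_failures:
                break  # Stop the job; earlier results remain committed.
    return committed, failures
```

The sketch shows why a stopped job can still leave resources behind: every command that succeeded before the threshold was reached has already been applied.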

    8.1 Excel import configuration

    The configuration of Excel import.

    Setting Description
    The default CSV separator character The default separator character of the CSV fields for complex relations.
    The default CSV quote character The default quote character of the CSV fields for complex relations.

    Number of rows per chunk of data

    When importing views, the database is called repeatedly, each time importing a chunk of data from the import file. This option defines how many rows each chunk of data can contain.

    Lower values reduce the burden on memory. Higher values require more memory, but may slightly increase the speed of the import.

    The default value is 5,000.

    9 Excel export configuration

    The configuration of Excel export.

    Setting Description
    The default CSV separator character The default separator character of the CSV fields for complex relations.
    The default CSV quote character The default quote character of the CSV fields for complex relations.

    Number of rows per chunk of data

    When exporting views, the database is called repeatedly, each time fetching a chunk of data to build the export file. This option defines how many rows each chunk of data can contain.

    Lower values reduce the burden on memory. Higher values require more memory, but may slightly increase the speed of the export.

    The default value is 5,000.

    10 CSV export configuration

    The configuration of CSV export.

    Setting Description
    Always use quotes
    • True: Use quotes for every cell in the CSV.
    • False (default): Only use quotes when necessary.

    Number of rows per chunk of data

    When exporting views, the database is called repeatedly, each time fetching a chunk of data to build the export file. This option defines how many rows each chunk of data can contain.

    Lower values reduce the burden on memory. Higher values require more memory, but may slightly increase the speed of the export.

    The default value is 5,000.
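
The Number of rows per chunk of data setting, which appears in the import and export sections above, describes a common pattern: process rows in fixed-size chunks to limit memory use. The following generic sketch shows the idea. `iter_chunks` is a hypothetical helper and is not part of Collibra.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_chunks(rows: Iterable[T], chunk_size: int = 5000) -> Iterator[List[T]]:
    """Yield lists of at most chunk_size rows."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk: List[T] = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        # Emit the final, partially filled chunk.
        yield chunk


if __name__ == "__main__":
    print([len(chunk) for chunk in iter_chunks(range(12000))])
```

Only one chunk is held in memory at a time, which is why lower values reduce memory pressure while higher values mean fewer round trips.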

    11 User interface configuration

    The configuration of user interface features.

    Setting Description

    Optimize CSS

    This setting requires the SUPER role.

    • True (default): The CSS files are optimized to improve performance of the user interface.
    • False: The CSS files are not optimized.

    Optimize JavaScript

    This setting requires the SUPER role.

    • True (default): The JavaScript code is optimized to improve the performance of the user interface.
    • False: The JavaScript code is not optimized.

    Concatenate JavaScript

    This setting requires the SUPER role.

    • True (default)
    • False

    Velocity cache

    This setting requires the SUPER role.

    • True: The Velocity cache is enabled.
    • False: The Velocity cache is disabled. This allows you to reload Velocity templates without restarting Collibra.

    Modules JSON overrides

    This setting requires the SUPER role.

    DISCLAIMER: If you choose to customize any aspects of Collibra, including CSS or other modules/page-definition customizations, you must thoroughly test them between upgrades. Customizations are unsupported and can break between upgrades. We recommend that your organization and the responsible parties maintain a list of the customizations applied to Collibra and use it as a checklist for validating upgrades in a test or lower environment. If the customizations need changes, prepare and test those changes appropriately before you promote them to your production instance.

    Modules properties overrides

    This setting requires the SUPER role.

    DISCLAIMER: If you choose to customize any aspects of Collibra, including CSS or other modules/page-definition customizations, you must thoroughly test them between upgrades. Customizations are unsupported and can break between upgrades. We recommend that your organization and the responsible parties maintain a list of the customizations applied to Collibra and use it as a checklist for validating upgrades in a test or lower environment. If the customizations need changes, prepare and test those changes appropriately before you promote them to your production instance.

    Page definition overrides

    This setting requires the SUPER role.

    DISCLAIMER: If you choose to customize any aspects of Collibra, including CSS or other modules/page-definition customizations, you must thoroughly test them between upgrades. Customizations are unsupported and can break between upgrades. We recommend that your organization and the responsible parties maintain a list of the customizations applied to Collibra and use it as a checklist for validating upgrades in a test or lower environment. If the customizations need changes, prepare and test those changes appropriately before you promote them to your production instance.

    12 API call logging

    The configuration of the API call logging.

    Setting Description
    Enabled
    • True: API call logging is enabled for REST Core API v1 (deprecated) calls.
    • False (default): API call logging is disabled for REST Core API v1 (deprecated) calls.
    Maximum number of log entries (Requires restart)

    The maximum number of API calls to store in the component_call_logging database table. Once this number is reached, the oldest records are overwritten.

    The default value is 1,000,000.

    Pattern duration list The list of methods and a corresponding minimum duration time. The minimum duration time is the minimum time before the method is stored in the database.
      Minimum duration
    The time in milliseconds that an API call must last before it is logged.
      Method pattern
    The method that you want to log in the database. For each pattern that you want to log, you have to add a new pattern.
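
The interaction between the pattern duration list and the maximum number of log entries can be sketched as follows. This is only an illustration, not Collibra's implementation: the glob-style pattern syntax (via `fnmatchcase`), the method names, and the helper names `should_log` and `make_log` are all assumptions made for the example.

```python
from collections import deque
from fnmatch import fnmatchcase

# Hypothetical pattern duration list: (method pattern, minimum duration in ms).
# The glob syntax used here is an assumption, not the documented pattern format.
PATTERN_DURATIONS = [
    ("AssetApi.*", 500),
    ("*.find*", 2000),
]


def should_log(method, duration_ms, rules=PATTERN_DURATIONS):
    """Return True if a rule matches the method and the call lasted long enough."""
    return any(
        fnmatchcase(method, pattern) and duration_ms >= min_ms
        for pattern, min_ms in rules
    )


def make_log(max_entries):
    """A bounded log: once it is full, appending drops the oldest entry."""
    return deque(maxlen=max_entries)


log = make_log(3)
calls = [
    ("AssetApi.getAsset", 700),
    ("AssetApi.getAsset", 100),      # too fast, not logged
    ("DomainApi.findDomains", 2500),
    ("AssetApi.addAsset", 900),
    ("AssetApi.removeAsset", 600),   # pushes out the oldest logged call
]
for method, duration_ms in calls:
    if should_log(method, duration_ms):
        log.append((method, duration_ms))
```

With a maximum of three entries, the first logged call is overwritten as soon as a fourth qualifying call arrives.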

    13 System metrics

    The configuration of metric collection.

    Setting Description
    Enable (Requires restart)
    • True (default): Metric collection is enabled.
    • False: Metric collection is disabled.
    Enable JVM metrics (Requires restart)
    • True (default): JVM metric collection is enabled.
    • False: JVM metric collection is disabled.
    Enable advanced metrics (Requires restart)
    • True: Advanced metrics collection is enabled. Enabling this option has a negative impact on the performance of your environment.
    • False (default): Advanced metrics collection is disabled.

    Enable minimal monitoring (Requires restart)

    • True: Monitoring of the metrics is enabled.
    • False (default): Monitoring of the metrics is disabled.

    14 API configuration

    The configuration of API settings.

    Setting Description
    Enable maximum paging limit
    • True (default for new environments): The maximum paging limit is set to 1,000 data elements per API call.

      Note  Once the maximum paging limit has been enabled, it cannot be disabled.

    • False: There is no maximum paging limit per API call. We recommend you enable the maximum paging limit, as too many data elements per API call can cause your environment to crash.
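
With the maximum paging limit enabled, a client has to retrieve large result sets page by page. The sketch below shows one way to do that, assuming an offset/limit style of paging. `fetch_all` and `fake_fetch` are hypothetical names, and `fake_fetch` is an in-memory stand-in for a real API call.

```python
from typing import Callable, Iterator, List

MAX_PAGE_SIZE = 1000  # the maximum paging limit described above


def fetch_all(fetch_page: Callable[[int, int], List[dict]],
              page_size: int = MAX_PAGE_SIZE) -> Iterator[dict]:
    """Yield every element by requesting pages of at most MAX_PAGE_SIZE items."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    page_size = min(page_size, MAX_PAGE_SIZE)
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:  # a short page means there is nothing left
            break
        offset += len(page)


# Stand-in for a real API call: serves slices of an in-memory list.
DATA = [{"id": i} for i in range(2500)]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    return DATA[offset:offset + limit]


items = list(fetch_all(fake_fetch))
```

In this example the 2,500 elements are retrieved in three calls of 1,000, 1,000 and 500 elements.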

    15 Security configuration

    The configuration of security.

    Setting Description
    X-Frame options (Requires restart)

    The content of the HTTP-header X-Frame-Options. This is set on all rendered pages and is used to avoid clickjacking attacks. By default, only pages with the same origin can use the rendered pages in a frame.

    Limit user sessions
    • True: A user can only open one session.
    • False (default): A user can open multiple sessions.

    Office research guest access

    • True: The Office research integration is always allowed guest access via REST, regardless of the general Guest access setting.
    • False (default): The general Guest access setting is kept.

    Note Currently, the Office research integration is only available when Collibra Data Intelligence Platform is publicly available, which is why this override setting is necessary.

    Prevent advanced html features in text dashboard

    Text widgets can contain full HTML. However, this means an attacker could potentially execute an XSS attack by injecting malicious HTML.

    • True: Potentially dangerous HTML elements are removed from text attributes when you save the text field.
    • False (default): No HTML elements are removed from text attributes when you save the text field.
    Note 

    If you enable this setting, the following HTML elements are deleted when you save:

    • script, including JavaScript
    • svg
    • frame
    • frameset
    • iframe
    • any event handlers
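
To illustrate the kind of filtering listed above, here is a rough sketch built on Python's standard `html.parser`. It is not Collibra's sanitizer and is not suitable as a production XSS defense; the class and function names are hypothetical, and treating every attribute that starts with `on` as an event handler is an assumption. Comments and doctype declarations are simply dropped.

```python
from html import escape
from html.parser import HTMLParser

BLOCKED = {"script", "svg", "frame", "frameset", "iframe"}
NO_END_TAG = {"frame"}  # <frame> is a void element without a closing tag


class TextFieldSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.depth = 0  # > 0 while inside a blocked element

    def handle_starttag(self, tag, attrs):
        if tag in BLOCKED:
            if tag not in NO_END_TAG:
                self.depth += 1
            return
        if self.depth:
            return
        rendered = ""
        for name, value in attrs:
            if name.startswith("on"):  # event handlers such as onclick
                continue
            rendered += f" {name}" if value is None else f' {name}="{escape(value)}"'
        self.out.append(f"<{tag}{rendered}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags: emit (or block) the start tag, never an end tag.
        self.handle_starttag(tag, attrs)
        if tag in BLOCKED and tag not in NO_END_TAG:
            self.depth -= 1

    def handle_endtag(self, tag):
        if tag in BLOCKED:
            if tag not in NO_END_TAG and self.depth:
                self.depth -= 1
        elif not self.depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = TextFieldSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


cleaned = sanitize('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>')
```

Harmless markup such as `<p>` and `<b>` passes through, while the `script` element and the `onclick` handler are removed.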

    Guest access

    This setting requires the SUPER role.

    • True: Anyone who can access the URL has viewing rights to the system.
    • False (default): The user is asked to sign in before having access to any data.
    Enable schema introspection
    • True: Schema fields are shown during an introspection.
    • False (default): Schema fields are hidden during an introspection.
    Enable schema introspection for Public GraphQL APIs (Requires restart)
    • True (default): Schema fields are shown when using introspection for public APIs.
    • False: Schema fields are hidden when using introspection.
    Enable custom validation functions
    • True (default): Groovy scripts with custom validation functions can be loaded.
    • False: Groovy scripts with custom validation functions cannot be loaded.

    15.1 LDAP

    The configuration of an LDAP server to handle the authentication.

    Setting Description
    Enable LDAP integration (Requires restart)
    • True: The LDAP integration is enabled.
    • False (default): The LDAP integration is disabled.
    Sync after restore
    • True (default): LDAP data is synchronized with Collibra when an initial data set is bootstrapped.
    • False: LDAP data is synchronized with Collibra only when the LDAP synchronization job is triggered.
    User page size

    The page size that is used when retrieving users during synchronization.

    The default value is 500. You can set it to 0 to disable paging.

    Note This is a global setting. If you are working with multiple LDAP servers, only the value for the main server is taken into account.

    Group page size

    The page size that is used when retrieving groups.

    You can set it to 0 to disable paging.

    Note This is a global setting. If you are working with multiple LDAP servers, only the value for the main server is taken into account.

    Time limit

    Specifies the time limit in milliseconds for all LDAP searches.

    The default value is 120,000.

    You can set it to 0 to disable the time limit.

    Tip 
    • If you get Time limit Exceeded error messages, increase the default value or check why the LDAP search takes too long.
    • We recommend that you modify the User page size and Group page size settings before you modify this setting.
    Sync job enabled
    • True (default): The synchronization job is enabled.
    • False: The synchronization job is disabled.
    Sync job cron

    The schedule to perform an LDAP synchronization (CRON).

    The default value for this setting is daily at midnight.

    If you enter an invalid cron pattern, Collibra Data Intelligence Platform stops responding.

    User field mapping The configuration mapping of all the user fields. This determines which LDAP field is mapped to which user field. Empty fields are ignored during the synchronization.
      Username
    The unique user ID in the LDAP, typically UID. This is a mandatory field.
      Email
    The corresponding email field in the LDAP directory. This is a mandatory field.
      First name
    The first name field in the LDAP directory.
      Last name
    The last name field in the LDAP directory.
      Middle name

    The middle name field of the LDAP directory. This is usually givenName.

      Enabled
    Indication whether a user is active or inactive in LDAP.
      Language

    The language and locale of the user. It has to contain a language code and may contain a country code.

    Examples: pl, en_US, nl_BE.

      Group

    The LDAP property that defines to which groups the user belongs. If there is a group entry in the LDAP directory, use the Group field mapping settings.

      Additional email list

    An additional email list.

      Instant messaging fields
    The mapping for the user's IM locations.
      AIM
    The mapping for the user's AOL IM account.
      Google Talk
    The mapping for the user's Google Talk IM account.
      Icq
    The mapping for the user's ICQ IM account.
      Jabber
    The mapping for the user's Jabber IM account.
      Messenger
    The mapping for the user's Live Messenger IM account.
      Skype
    The mapping for the user's Skype IM account.
      Yahoo Messenger
    The mapping for the user's Yahoo Messenger IM account.
      Website map
    Enter the field value and field key to map a social media website.
      Phone
    The mapping for the user's phone.
      Fax
    The mapping for the user's fax number.
      Mobile
    The mapping for the user's mobile number.
      Pager
    The mapping for the user's pager number.
      Private
    The mapping for the user's private number.
      Work
    The mapping for the user's work number.
      Other
    The mapping for any other phone number for this user.
      Home address
    The mapping for the user's home address.
      Street
    The mapping for the user's street.
      Number
    The mapping for the user's number.
      City
    The mapping for the user's city.
      Post code
    The mapping for the user's postal code.
      State
    The mapping for the user's state.
      Country
    The mapping for the user's country.
      Work address
    The mapping for the user's work address.
      Street
    The mapping for the user's street.
      Number
    The mapping for the user's number.
      City
    The mapping for the user's city.
      Post code
    The mapping for the user's postal code.
      State
    The mapping for the user's state.
      Country
    The mapping for the user's country.
      Gender
    The mapping information for the user's gender.
      Mapping
    The attribute key for the gender value. If the content equals one of the male or female mappings, the user will be saved as male or female. Otherwise a default of UNKNOWN will be used.
      Male value
    The value for male users.
      Female value
    The value for female users.
    Group field mapping Groups can be defined as a separate structure or as a userField. The following section allows you to sync with a group structure that is unrelated to the user structure.
      Group name field

    The name of the group to use in the application.

      Users field
    The user DNs that are members of the group.
    Servers

    The Collibra parameters to map with your LDAP server parameters.

      LDAP server URL
    The URL or IP address to the LDAP server, for example ldap://ldap.yourcompany.com:389 or ldaps://ldap.yourcompany.com:636.
      Bind DN
    The DN of the administrator user that is used for authentication, for example admin.
      Bind password
    The password of the administrator user.
      Base DN
    The base DN for when you are working with relative DNs. This base DN is used for all LDAP look-ups.
      User base
    The base DN of where the LDAP users for Collibra are located. If a base has been specified, it is used as a prefix for this user base. Subtree search is used, so all DNs located below are searched for matching users.
      Authentication user LDAP filter
    The filter that specifies which users can authenticate in the application. By default, all the objects found in the user base are selected, including the root.
      Synchronization user LDAP filter

    The filter that specifies which users are imported by the synchronization job. The users have to be the same as, or a subset of, the Authentication user LDAP filter.

    If you provide no value for this setting, the same filter as specified for the Authentication user LDAP filter setting is used. That allows you to synchronize only the users that have to have access to the application, even if they have not logged in yet. Users in the Authentication user LDAP filter are synchronized each time they authenticate and are only available after the first sign-in to the application. This is the default setting.

      Authentication type

    The authentication mechanism for authenticating users on the LDAP servers.

    Authentication type Explanation
    none No authentication is performed.
    simple Simple authentication is performed, using the Bind DN and Bind password as credentials. The credentials are sent as plain text.
    DIGEST-MD5 Simple authentication is performed, using the Bind DN and Bind Password as credentials. The Bind password is hashed with the MD5 algorithm.
    TLS-SIMPLE A temporary secured TLS connection is set up before the credentials are sent as plain text. SSL must be configured.
    TLS-EXTERNAL A temporary secured TLS connection with external SASL authentication using a client certificate. SSL must be configured.
      Shutdown gracefully
    • True: The LDAP context is destroyed immediately. When using TLS, some servers require the connection to be shut down by the client before the LDAP context is destroyed.
    • False (default): The LDAP context is not destroyed immediately.
      Referral Setting

    Specifies what to do with referrals. Possible values:

    Referral setting Explanation
    throw

    Throws an exception if a referral is encountered.

    ignore (default)

    All referrals are ignored.

    follow

    Follows the referral to the actual location of the entry on another server.

    This is recommended when using Microsoft Active Directory.

    Note If you are experiencing slow searches on Microsoft Active Directory with the follow value for the Referral setting, try using the Global Catalog as Active Directory domain controller. The Global Catalog enables searching for Active Directory objects in any domain in the forest without the need for subordinate referrals. This can dramatically speed up searching. However, the Global Catalog only contains a subset of the attributes of an object. This solution is only viable if the attributes requested for the search results are stored in the global catalog. Note that the Global Catalog is accessible on port 3268/3269, not the standard 389/636 LDAP ports.

      Group base DN
    The base DN of where all the groups are located. If a base has been specified, that base is used as the prefix for this group base.
      Group LDAP filter
    The LDAP filter to which each group has to comply to be synchronized.
    Batch synchronization The synchronization of the users with the LDAP server happens in batches.
      Batch size

    The number of users in each batch. If a batch fails, none of the users in that batch is updated and the user names are listed in the DGC service log. Other batches are processed as normal. After processing all batches, Collibra disables users that are no longer in LDAP, unless one or more batches failed.

    Set the value to 0 to disable batch processing.
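
The batch behavior described above can be sketched as follows. This is an illustration under stated assumptions, not Collibra's code: `sync_in_batches` and `update_batch` are hypothetical names, and treating a batch size of 0 as "process all users in one go" is an interpretation of "disable batch processing".

```python
from typing import Callable, List, Set, Tuple


def sync_in_batches(ldap_users: List[str],
                    existing_users: Set[str],
                    batch_size: int,
                    update_batch: Callable[[List[str]], None]
                    ) -> Tuple[List[str], Set[str]]:
    """Return (user names in failed batches, users to disable)."""
    if batch_size <= 0:
        # Assumption: 0 means no batching, so all users form a single batch.
        batches = [ldap_users]
    else:
        batches = [ldap_users[i:i + batch_size]
                   for i in range(0, len(ldap_users), batch_size)]

    failed: List[str] = []
    for batch in batches:
        try:
            update_batch(batch)   # all-or-nothing per batch
        except Exception:
            failed.extend(batch)  # these names would go to the service log

    # Users missing from LDAP are only disabled if every batch succeeded.
    to_disable = set() if failed else existing_users - set(ldap_users)
    return failed, to_disable


def flaky_update(batch: List[str]) -> None:
    """Stand-in update that fails for any batch containing user 'c'."""
    if "c" in batch:
        raise RuntimeError("update failed")


failed, to_disable = sync_in_batches(
    ["a", "b", "c", "d", "e"], {"a", "b", "z"}, 2, flaky_update)
```

Here the batch containing `c` and `d` fails, so those two names are reported and user `z` is not disabled, even though `z` no longer exists in LDAP.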

    15.2 Password

    The configuration of passwords.

    Setting Description
    Minimum length (Requires restart)

    The minimum length of passwords.

    The default minimum length is 12.

    Maximum length (Requires restart)

    The maximum length of passwords.

    The default maximum length is 1,024.

    Digits required (Requires restart)
    • True (default): Passwords have to contain one or more digits.
    • False: Passwords do not have to contain digits.
    Non alphanumeric required (Requires restart)
    • True (default): Passwords have to contain one or more non-alphanumeric (special) characters.
    • False: Passwords do not have to contain non-alphanumeric characters.
    Uppercase required (Requires restart)
    • True (default): Passwords have to contain one or more upper-case characters.
    • False: Passwords do not have to contain upper-case characters.
    Lowercase required (Requires restart)
    • True (default): Passwords have to contain one or more lower-case characters.
    • False: Passwords do not have to contain lower-case characters.
    Username disallowed (Requires restart)
    • True (default): Passwords cannot be the username.
    • False: Passwords can be the username.
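
The composition rules above, with their default values, can be expressed as a small check. This is a sketch for illustration only: the `PasswordPolicy` class and `violations` function are hypothetical names, and the exact, case-sensitive comparison with the username is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PasswordPolicy:
    # Defaults taken from the settings described above.
    min_length: int = 12
    max_length: int = 1024
    digits_required: bool = True
    non_alphanumeric_required: bool = True
    uppercase_required: bool = True
    lowercase_required: bool = True
    username_disallowed: bool = True


def violations(password: str, username: str,
               policy: PasswordPolicy = PasswordPolicy()) -> List[str]:
    """Return a list of rule violations; an empty list means the password passes."""
    problems = []
    if len(password) < policy.min_length:
        problems.append("too short")
    if len(password) > policy.max_length:
        problems.append("too long")
    if policy.digits_required and not any(c.isdigit() for c in password):
        problems.append("no digit")
    if policy.non_alphanumeric_required and all(c.isalnum() for c in password):
        problems.append("no non-alphanumeric character")
    if policy.uppercase_required and not any(c.isupper() for c in password):
        problems.append("no upper-case character")
    if policy.lowercase_required and not any(c.islower() for c in password):
        problems.append("no lower-case character")
    # Assumption: the comparison with the username is exact and case-sensitive.
    if policy.username_disallowed and password == username:
        problems.append("equals username")
    return problems


ok = violations("Str0ng!Passw0rd", "jdoe")
bad = violations("jdoe", "jdoe")
```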

    Expiration interval (months)

    The number of months before users have to change their passwords.

    Set it to 0 if users never have to change their passwords.

    The default interval is 6 months.

    Allowed login failures

    The number of consecutive failed login attempts that are allowed before the user account is disabled.

    Set it to 0 for unlimited attempts.

    The default is 3 login failures.

    No reuse count

    The number of previous passwords users cannot reuse. The default is 1: users cannot change their password to what it currently is.

    Set this to 0 to allow using the same password.

    Password reset link validity period

    The number of minutes that a link to reset a password remains valid. Beyond this time, the user has to request a new password reset link.

    The default value is 60 minutes.

    The minimum value is 15 minutes, the maximum value is 1,440 minutes (24 hours).

    Account lock-out duration

    The number of minutes that a user cannot sign in after too many failed sign-in attempts. If the number of minutes is set to 0, a Collibra administrator must reset the password to unlock the account. This setting is only applicable if the "Allowed login failures" setting is defined.

    A locked-out account does not mean that your account is disabled.

    15.3 REST

    The security configuration of the REST interface.

    Setting Description
    Limited CSRF

    This option offers limited security, so we recommend upgrading to the Enhanced CSRF.

    • True: The validity of a request is checked with a CSRF token.
    • False (default): The validity of a request is not checked with a CSRF token.
    Enhanced CSRF If enabled, Collibra will check the validity of the request using a Spring Security CSRF token.
    • True: The validity of a request is checked with a CSRF token.
    • False (default): The validity of a request is not checked with a CSRF token.
    Referrer enabled
    • True: The HTTP referrer header is used to identify the origin of the request.
    • False (default): The HTTP referrer header is not used to identify the origin of the request. It is recommended to leave this option disabled.
    Referrer checking allow empty
    • True (default): The HTTP referrer header can be empty.
    • False: The HTTP referrer header cannot be empty.

    15.4 SSL

    The configuration of SSL.

    Setting Description
    Key store name The name of the keystore file. The file is expected to be in the <collibra_data>/dgc/security folder.
    Key store password The password of the keystore.
    Key store type The type of the keystore file. For example, JKS or PKCS12.
    Trust store name The name of the truststore file. The file is expected to be in the <collibra_data>/dgc/security folder.
    Trust store password The password of the truststore.
    Trust store type The type of the truststore file. For example, JKS or PKCS12.

    15.5 SSO

    The configuration of Single Sign-On (SSO) authentication.

    Setting Description
    Mode

    The SSO mode of Collibra.

    The possible values are:

    • SAML_ATTRIBUTES
    • SAML_LDAP
    • SSO_HEADER
    • SSO_HEADER_LDAP
    • DISABLED
    Header

    The name of the header to be checked. The content of this header is used for the search query, which is SSO_HEADER = username.

    The value of the actual query depends on DN and possibly Attribute.

    DN

    If the SSO mode is SSO_HEADER_LDAP or SAML_LDAP, this field determines whether the distinguished name (DN) or attribute is used:

    • True: The header has to contain the distinguished name (DN) of the user in the LDAP.
    • False (default): The header has to contain the value of Attribute.

    If the SSO mode is DISABLED, SSO_HEADER or SAML_ATTRIBUTES, this field is ignored.

    Attribute

    This field is only used if the SSO mode is SSO_HEADER_LDAP or SAML_LDAP, and if DN is False.

    If the above criteria are met, the LDAP has to contain this value.

    Disable automatic user creation when signing in via SSO

    If users try to sign in via SSO, they still need a user account in Collibra. You can either create the user accounts automatically when they sign in, or create the user accounts manually or via LDAP synchronization.

    • True: User accounts are not created automatically.
    • False (default): User accounts are created automatically.

    Disable the Collibra signin page

    When SSO is enabled, a user can still navigate to the /signin page and try to log in via that page. However, you can disable that page.

    • True: Users cannot access the Collibra signin page.
    • False (default): Users can access the Collibra signin page.
    SAML The configuration of SAML.
      Metadata HTTP
    The URL of the SAML metadata file to be used. The URL always has to be reachable by the Collibra environment.
      Entity Provider Entity ID

    The entity ID inside the metadata to be referenced.

    Note A metadata file can describe multiple entity IDs. Make sure to use the entity ID from the correct metadata file.

      Attribute fields

    The mappings of attributes in the SAML response. The values are used as keys to look for in the SAML response.

    Examples of attribute fields are first name, last name, address information, phone numbers and so on.

      First name

    The mapping for the user's first name.

    This attribute is optional. The value can be empty.

      Last name

    The mapping for the user's last name.

    This attribute is optional. The value can be empty.

      Email

    The mapping for the user's email address.

    This attribute is optional for existing users, but mandatory for new users.

    Warning If the email address is invalid when you synchronize, the user is deactivated and the user information is not updated.

      Enabled
    The mapping that indicates whether the account of the incoming user is enabled.
      Group

    The mapping (attribute) which indicates to which Collibra groups the user should be added. If the groups don't exist yet, they will be created. This attribute can have multiple values (groups) or the groups can be sent as a comma-separated list of groups.

    If passing groups in this attribute, you must set Groups DGC Managed to False.
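
Because the Group attribute may arrive either as several values or as one comma-separated string, a receiver has to normalize both shapes. A minimal sketch of that normalization follows. `parse_groups` is a hypothetical helper, and trimming whitespace and dropping duplicates are assumptions.

```python
from typing import Iterable, List


def parse_groups(attribute_values: Iterable[str]) -> List[str]:
    """Flatten multi-valued and comma-separated group values into one list."""
    groups: List[str] = []
    for value in attribute_values:
        for name in value.split(","):
            name = name.strip()
            if name and name not in groups:
                groups.append(name)
    return groups


multi_valued = parse_groups(["Stewards", "Data Owners"])
comma_separated = parse_groups(["Stewards, Data Owners"])
```

Both calls yield the same list of two group names.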

      Phone
    The mapping for the user's phone.
      Fax
    The mapping for the user's fax number.
      Mobile
    The mapping for the user's mobile number.
      Pager
    The mapping for the user's pager number.
      Private
    The mapping for the user's private number.
      Work
    The mapping for the user's work number.
      Other
    The mapping for any other phone number for this user.
      Home address
    The mapping for the user's home address.
      Street
    The mapping for the user's street.
      Number
    The mapping for the user's number.
      City
    The mapping for the user's city.
      Post code
    The mapping for the user's postal code.
      State
    The mapping for the user's state.
      Country
    The mapping for the user's country.
      Work address
    The mapping for the user's work address.
      Street
    The mapping for the user's street.
      Number
    The mapping for the user's number.
      City
    The mapping for the user's city.
      Post code
    The mapping for the user's postal code.
      State
    The mapping for the user's state.
      Country
    The mapping for the user's country.
      Instant messaging
    The mapping for the user's IM locations.
      AIM
    The mapping for the user's AOL IM account.
      Google Talk
    The mapping for the user's Google Talk IM account.
      Icq
    The mapping for the user's ICQ IM account.
      Jabber
    The mapping for the user's Jabber IM account.
      Messenger
    The mapping for the user's Live Messenger IM account.
      Skype
    The mapping for the user's Skype IM account.
      Yahoo Messenger
    The mapping for the user's Yahoo Messenger IM account.
      Gender
    The mapping information for the user's gender.
      Mapping
    The attribute key for the gender value. If the content equals one of the male or female mappings, the user will be saved as male or female. Otherwise a default of UNKNOWN will be used.
      Male value
    The value for male users.
      Female value
    The value for female users.
      Groups DGC managed

    Determines whether groups are managed by Collibra or set by the SAML assertion (SAML+Attributes mode).

    This option is only relevant if Mode is SAML_ATTRIBUTES.

    • True: The groups are fully managed by Collibra. In the UI the admin has the option to assign groups to users, without it being overwritten by SAML.
    • False (default): The groups are managed by the SAML assertions. In this case the groups are managed by the SAML IDP. Be sure to configure the Group attribute in the Attribute Fields section.
      Service Provider Entity ID

    Field that determines the value of the Entity ID parameter in the service provider metadata returned by Collibra. The default value is empty, in which case Collibra uses the value of the Base URL field.

    Enter a custom value if the base URL does not match the audience configured in your SAML identity provider.

    Warning The value of the audience restriction in the SAML response has to be exactly the same as the value of this field.

    Note SSO does not work if the Service Provider Entity ID field contains the base URL with trailing forward slash (for example www.collibra.com/), and the audience of your IDP contains the base URL without a trailing forward slash (for example www.collibra.com).
    Both values need to be exactly the same. In this case, you can resolve the issue by changing the value in the configuration of your IDP, or the value of this field. It does not matter whether both have a trailing forward slash or not, as long as they contain the same value.
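
Since the comparison is an exact string match, a quick check like the one below can help spot a trailing-slash mismatch before you test SSO. The function name and the example URLs are hypothetical. The only behavior taken from the text above is that both values must be identical.

```python
def diagnose_entity_id(sp_entity_id: str, idp_audience: str) -> str:
    """Compare the Service Provider Entity ID with the IDP audience value."""
    if sp_entity_id == idp_audience:
        return "match"
    if sp_entity_id.rstrip("/") == idp_audience.rstrip("/"):
        return "mismatch: values differ only by a trailing forward slash"
    return "mismatch"


same = diagnose_entity_id("https://dgc.example.com", "https://dgc.example.com")
slash = diagnose_entity_id("https://dgc.example.com/", "https://dgc.example.com")
```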

      Sign authentication requests (Requires restart)
    • True: Authentication requests have to be signed.
    • False (default): Authentication requests don't have to be signed.
      Force authn
    • True (default): The SP authentication request forces re-authentication.
    • False: The SP authentication request does not force re-authentication.
      Force passive
    • True: The reauthentication has to happen in the background.
    • False (default): The reauthentication does not have to happen in the background.

    This is only relevant if Force authn is True.

      Name ID

    Name ID that is used in the SP authentication. The default value is urn:oasis:names:tc:SAML:2.0:nameid-format:persistent.

    The Name ID value is mandatory.

      Name ID allow create
    • True (default): The IDP can create a name ID to fulfill the SP authentication request.
    • False: The IDP cannot create a name ID to fulfill the SP authentication request.
      Disable client address
    • True: The validation of the client IP address in the assertion message is disabled.
    • False (default): The validation of the client IP address in the assertion message is enabled.
      SAML Requested authentication context

    Settings for the SAML requested authentication context. The IDP uses the authentication context to authenticate the user. By default, the authentication context mandates user/password authentication over HTTPS.

      Disable
    • True: The requested authentication context section is not sent in the SAML request.
    • False (default): The requested authentication context section is sent in the SAML request.
      Comparison type

    The comparison type that is transmitted in the requested authentication context.

    Possible values:

    • minimum
    • maximum
    • better
    • exact (default value)

    For more information about the comparison type values, refer to the SAML specifications.

      Reference list

    The list of class references in the requested authentication context. You can separate list items with the pipe character (|).

    For more information about this list, refer to the SAML specifications.

      Declaration list

    The list of class declarations in the requested authentication context. You can separate list items with the pipe character (|).

    For more information about this list, refer to the SAML specifications.

      Response decryption mode

    Enable the support for encrypted SAML responses.

    • DISABLED: Collibra only accepts plain-text SAML responses.
    • OPTIONAL: Collibra can handle both encrypted and plain-text SAML responses.
    • FORCED: Collibra only accepts encrypted SAML responses.

    Once OPTIONAL or FORCED is selected, the encryption key pair is generated and added to the Collibra SAML keystore. A self-signed certificate is generated and works in most situations. If your IdP rejects self-signed certificates, you will have to add a certificate that is signed by a trusted 3rd party.

      Validity period of the SAML certificate

    The validity period of the SAML certificate, in years.

    By default, the SAML certificate expires after 20 years.

    15.6 Signout

    The configuration of redirecting after signing out of Collibra.

    Setting Description
    Override signout URL (Requires restart)
    • True: Redirect the user to a specific website after signing out.
    • False (default): Redirect the user to the sign-in page after signing out.
    Signout redirect URL (Requires restart) The URL to be redirected to when signing out.

    15.7 Session

    The configuration related to sessions.

    Setting Description

    Idle Session timeout (Requires restart)

    This setting requires the SUPER role.

    The time after which you are signed out if you are inactive.

    • Minimum value: 5 minutes
    • Maximum value: 24 hours
    • Default value: 1,800 seconds (30 minutes)
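
The limits above mix minutes, hours and seconds, so it can help to normalize them to one unit before choosing a value. The sketch below assumes the setting is entered in seconds, which the text does not state explicitly. The function name is hypothetical.

```python
MIN_TIMEOUT_S = 5 * 60            # 5 minutes
MAX_TIMEOUT_S = 24 * 60 * 60      # 24 hours
DEFAULT_TIMEOUT_S = 1800          # 30 minutes


def check_idle_timeout(seconds=None):
    """Return a valid idle session timeout in seconds, or raise ValueError."""
    if seconds is None:
        return DEFAULT_TIMEOUT_S
    if not MIN_TIMEOUT_S <= seconds <= MAX_TIMEOUT_S:
        raise ValueError(
            f"timeout must be between {MIN_TIMEOUT_S} and {MAX_TIMEOUT_S} seconds")
    return seconds


one_hour = check_idle_timeout(60 * 60)
```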

    15.8 Import/Export

    The configuration to avoid the Formula Injection vulnerability in Excel.

    Setting Description

    Escape Excel formulas

    The option to disable Formula Injection into Excel. When enabling this option, an escape character is added at the beginning of Excel formulas during the export and is removed when importing formulas.

    The escape character will be added to fields that start with one of the following characters:

    • equals sign: =
    • plus: +
    • minus: -
    • at-sign: @

    This option is enabled by default.

    Excel formulas escape character

    The escape character for Excel formulas when exporting or importing data.
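
The escape-on-export and unescape-on-import behavior can be sketched as below. The apostrophe used as escape character is only an assumption for the example, because the actual value comes from the Excel formulas escape character setting. The function names are hypothetical.

```python
TRIGGER_CHARS = ("=", "+", "-", "@")
ESCAPE_CHAR = "'"  # hypothetical value of the escape character setting


def escape_on_export(value: str, escape_char: str = ESCAPE_CHAR) -> str:
    """Prefix values that Excel would otherwise interpret as a formula."""
    if value.startswith(TRIGGER_CHARS):
        return escape_char + value
    return value


def unescape_on_import(value: str, escape_char: str = ESCAPE_CHAR) -> str:
    """Remove the prefix again, but only where export would have added it."""
    rest = value[len(escape_char):]
    if value.startswith(escape_char) and rest.startswith(TRIGGER_CHARS):
        return rest
    return value


exported = escape_on_export("=SUM(A1:A2)")
round_trip = unescape_on_import(exported)
```

Values that do not start with one of the four trigger characters pass through both functions unchanged.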

    15.9 JWT

    The JSON Web Token configuration.

    Setting Description
    JSON Web Key Set URL

    The URL to retrieve public key information needed to verify the authenticity of JSON Web Tokens (JWTs), issued by an authorization server.

    This setting is required to enable JWT authentication.

    JWT Token Types

    A case-insensitive comma-separated list of accepted JWT media types coming in the typ header parameter.

    Leave blank if the authorization server does not provide a media type parameter.

    The default value is at+jwt,jwt.

    JWT Algorithms

    A comma-separated list of accepted JWT algorithms coming in the alg header parameter. See https://tools.ietf.org/html/rfc7518#section-3.1 for details.

    Leave blank to accept all digital signature algorithms.

    JWT Issuer

    The accepted issuer coming in the iss JWT claim.

    Leave blank if the authorization server does not provide an issuer claim.

    JWT Audience

    A comma-separated list of accepted audience values for the aud claim.

    The value for this field is a configuration setting in your authorization server, which identifies your Collibra environment as the intended recipient of the JWT.

    Leave blank if the authorization server does not provide an audience claim.

    JWT Principal ID Claim Name

    The name of the JWT claim containing the principal's identity. See https://tools.ietf.org/html/rfc7519#section-4.1.2 for details.

    Defaults to the standard subject claim, sub.

    Change this setting only if your authorization server has other means of identifying the principal, for example, a client_id claim.

    This setting is required if JWT authentication is enabled.

    JWT Maximum Clock Skew

    The maximum acceptable difference in seconds between the clocks of the machines running the authorization server and Collibra.

    Differences smaller than the given amount are ignored when performing time comparisons for token validation.

    The default value is 60 seconds if left blank.
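
To show how these settings could fit together, the following sketch applies them to an already decoded token. It is not Collibra's validation logic and it deliberately skips signature verification against the JSON Web Key Set, which a real validator must perform. The function name, parameter names, and the `exp`/`nbf` handling (standard claims from RFC 7519) are assumptions for the example.

```python
import time


def check_jwt(header, claims, *, token_types=("at+jwt", "jwt"), algorithms=(),
              issuer=None, audiences=(), principal_claim="sub",
              max_clock_skew=60, now=None):
    """Return the principal ID, or raise ValueError. No signature check here."""
    now = time.time() if now is None else now

    # Blank settings (empty tuple / None) mean "do not check".
    if token_types:
        accepted = [t.lower() for t in token_types]
        if str(header.get("typ", "")).lower() not in accepted:
            raise ValueError("token type not accepted")
    if algorithms and header.get("alg") not in algorithms:
        raise ValueError("algorithm not accepted")
    if issuer is not None and claims.get("iss") != issuer:
        raise ValueError("issuer not accepted")
    if audiences:
        aud = claims.get("aud", [])
        aud = [aud] if isinstance(aud, str) else list(aud)
        if not set(aud) & set(audiences):
            raise ValueError("audience not accepted")

    # Time comparisons tolerate clock differences up to max_clock_skew seconds.
    if "exp" in claims and now > claims["exp"] + max_clock_skew:
        raise ValueError("token expired")
    if "nbf" in claims and now < claims["nbf"] - max_clock_skew:
        raise ValueError("token not yet valid")

    if principal_claim not in claims:
        raise ValueError("principal claim missing")
    return str(claims[principal_claim])


header = {"typ": "AT+JWT", "alg": "RS256"}
claims = {"iss": "https://idp.example.com", "aud": "collibra",
          "sub": "jdoe", "exp": 1000}
principal = check_jwt(header, claims, issuer="https://idp.example.com",
                      audiences=("collibra",), now=1030)
```

In the example the token expired 30 seconds before the check, but it is still accepted because the difference is within the 60-second clock skew.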

    15.10 HTTP headers

    The configuration of the HTTP headers.

    Field Description

    URL pattern

    This setting requires the SUPER role.

    The pattern of the URLs to which the HTTP response header is applied.

    This field supports wildcards such as **, *, and ?.

    Tip The following pattern matches all URLs: /**

    HTTP headers

    This setting requires the SUPER role.

    The HTTP response headers in a key-value format.

    You can add new HTTP response headers by clicking Add at the bottom of the section, and entering the HTTP response header name as the field key and the HTTP response header value as the field value.
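    The wildcard matching of the URL pattern field can be illustrated with a small sketch. The exact semantics are not specified here, so the code assumes Ant-style path matching: ? matches a single character, * matches within one path segment, and ** matches across path segments. The function names and the header values are hypothetical.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    # Assumed Ant-style semantics: '**' spans path segments,
    # '*' stays within one segment, '?' matches a single character.
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def headers_for(url_path, rules):
    """Merge the headers of every rule whose URL pattern matches the path.

    `rules` is a hypothetical list of (url_pattern, headers_dict) pairs.
    """
    merged = {}
    for url_pattern, headers in rules:
        if pattern_to_regex(url_pattern).fullmatch(url_path):
            merged.update(headers)
    return merged


rules = [
    ("/**", {"X-Frame-Options": "SAMEORIGIN"}),
    ("/rest/*", {"Cache-Control": "no-store"}),
]
print(headers_for("/rest/assets", rules))
```

    Under these assumptions, /** applies to every path, while /rest/* applies to /rest/assets but not to /rest/2.0/assets.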

    15.11 Whitelists

    The configuration for whitelist placeholders that can be used in security headers.

    Option Description
    connect-src whitelist The 'connect-src' whitelist. To use this whitelist in a security header, use the '{connectSrcWl}' placeholder.
    font-src whitelist The 'font-src' whitelist. To use this whitelist in a security header, use the '{fontSrcWl}' placeholder.
    frame-src whitelist The 'frame-src' whitelist. To use this whitelist in a security header, use the '{frameSrcWl}' placeholder.
    img-src whitelist The 'img-src' whitelist. To use this whitelist in a security header, use the '{imgSrcWl}' placeholder.
    script-src whitelist The 'script-src' whitelist. To use this whitelist in a security header, use the '{scriptSrcWl}' placeholder.
    style-src whitelist The 'style-src' whitelist. To use this whitelist in a security header, use the '{styleSrcWl}' placeholder.
    frame-ancestors whitelist The 'frame-ancestors' whitelist. To use this whitelist in a security header, use the '{frameAncestorsWl}' placeholder.
    Tableau frame-ancestors whitelist The tableau 'frame-ancestors' whitelist. To use this whitelist in a security header, use the '{tableauFrameAncestorsWl}' placeholder.
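    As an illustration of how such placeholders might be resolved in a security header value, the following sketch substitutes each placeholder with a space-separated list of sources. The whitelist contents and the expand_placeholders function are hypothetical, and the substitution format is an assumption; only the placeholder names come from the table above.

```python
def expand_placeholders(header_value: str, whitelists: dict) -> str:
    # Replace each '{name}' placeholder with its space-separated whitelist.
    for placeholder, sources in whitelists.items():
        header_value = header_value.replace(
            "{" + placeholder + "}", " ".join(sources)
        )
    return header_value


# Hypothetical whitelist contents.
whitelists = {
    "imgSrcWl": ["https://images.example.com"],
    "frameAncestorsWl": ["https://portal.example.com"],
}
csp = "img-src 'self' {imgSrcWl}; frame-ancestors 'self' {frameAncestorsWl}"
print(expand_placeholders(csp, whitelists))
```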

    15.12 Disclaimer

    The configuration of a disclaimer upon signing in to Collibra.

    Setting Description
    Disclaimer
    • True: Upon signing in, show a disclaimer that you have to agree with before you can continue.
    • False (default): Don't show a disclaimer.
    Disclaimer message

    The disclaimer message that is shown after signing in.

    If you leave this field empty, a default message is shown.

    You can use basic HTML tags, such as headers, paragraphs, images and hyperlinks.

    16 Workflow engine configuration

    The configuration of the workflow engine.

    Setting Description

    Activate default escalation

    This setting requires the SUPER role.

    • True (default): Automatically add an escalation function to user tasks after a configurable period of time.
    • False: Don't escalate tasks automatically.

    This option only works for tasks that do not yet have an escalation path configured in Collibra Data Intelligence Platform.

    Default escalation timer duration

    This setting requires the SUPER role.

    The duration before a task is escalated.

    For more information on the format, please refer to ISO 8601.

    Activate default task email notification

    This setting requires the SUPER role.

    • True (default): Send email notifications to any candidate user of a user task in the workflow engine.
    • False: Do not send email notifications to candidate users.
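    The Default escalation timer duration setting above expects an ISO 8601 duration. The following sketch shows what such values look like and how they map to a time span. It is only an illustration: parse_duration is a hypothetical helper, it covers only the day and time designators (not years, months or weeks), and the sample values are not the product defaults.

```python
import re
from datetime import timedelta

_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text: str) -> timedelta:
    match = _DURATION.match(text)
    # Reject strings without any component, such as "P", "PT" or "P1DT".
    if not match or text.endswith(("P", "T")):
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    parts = {name: int(value) for name, value in match.groupdict().items() if value}
    return timedelta(**parts)


print(parse_duration("P14D"))        # 14 days
print(parse_duration("PT36H"))       # 36 hours
print(parse_duration("P1DT12H30M"))  # 1 day, 12 hours and 30 minutes
```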

    17 Collibra Connect

    The configuration to communicate with Collibra Connect.

    Setting Description
    Base URL The URL to Collibra Connect.
    Username The username to connect to Collibra Connect.
    Password The password to connect to Collibra Connect.

    18 Register data source

    Global parameters that apply to Data Source Registration.

    Setting Description
    Table types to ignore A comma-separated list of table types that are not ingested. For example, INDEX and SEQUENCE.

    AWS regions restriction

    A list of AWS regions Data Catalog is allowed to connect to. For example, eu-west-3 and us-east-2. For a list of all AWS locations, see the AWS documentation.

    • If you want to allow Collibra to make a connection to any AWS region, leave the field empty.
    • If you remove a region from this list and the region was previously used for an S3 integration, you may want to delete the Glue database from the previously used region manually. By default, Collibra does not remove it. The Glue database has the following naming convention: collibra_catalog_<Asset Id>_<Domain Id>
      For example: collibra_catalog_d3174a88-5ffe-4d50-8fbe-7bf0832ec3af_5d198ce9-4e56-4d0e-a885-58204da50741
    • When using Edge, a warning is added to the logs if an invalid region is detected in the restricted regions list.
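    The region restriction and the Glue database naming convention described above can be expressed as a short sketch. The function names are hypothetical; the naming convention and the example IDs are taken from the text above.

```python
def is_region_allowed(region, allowed_regions):
    # An empty restriction list means any AWS region may be used.
    return not allowed_regions or region in allowed_regions


def glue_database_name(asset_id, domain_id):
    # Naming convention: collibra_catalog_<Asset Id>_<Domain Id>
    return f"collibra_catalog_{asset_id}_{domain_id}"


print(is_region_allowed("eu-west-3", ["eu-west-3", "us-east-2"]))
print(glue_database_name(
    "d3174a88-5ffe-4d50-8fbe-7bf0832ec3af",
    "5d198ce9-4e56-4d0e-a885-58204da50741",
))
```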
    AWS API call rate

    Allowed number of AWS API calls per second.

    Use this option to limit the number of API calls per second to prevent throttling errors from the AWS API.

    Database registration via Edge

    An option to enable database registration via Edge.

    • True: Register a data source via Edge.
    • False: Register a data source via Jobserver only.

    Note Enabling data source registration via Edge does not prevent you from registering a data source via Jobserver as well.

    Collibra Data Quality & Observability Synchronization UI via DQ Connector on Edge

    An option to enable the Data Quality extraction interface in Collibra.

    • True: The Quality extraction tab is available on the configuration page of a database asset.
    • False (default): The Quality extraction tab is not available and as such, it is not possible to extract and synchronize data quality information.

    You can only enable Collibra Data Quality & Observability synchronization if you also enabled Database registration via Edge.

    Google Cloud Storage synchronization via Edge

    An option to enable Google Cloud Storage file system registration and synchronization via Edge.

    • True: You can register and synchronize a Google Cloud Storage file system via Edge.
    • False: You can't register a Google Cloud Storage file system via Edge.
    Amazon S3 synchronization via Edge

    An option to enable Amazon S3 file system registration and synchronization via Edge.

    • True: You can register and synchronize an Amazon S3 file system via Edge.
    • False: You can only register an Amazon S3 file system via Jobserver.

    Note Enabling the registration of an Amazon S3 file system via Edge does not prevent you from registering an Amazon S3 file system via Jobserver.

    For more information, see Working with Amazon S3.

    Databricks Unity Catalog synchronization via Edge

    An option to enable the integration of Databricks Unity Catalog via Edge.

    • True: You can register and synchronize Databricks Unity Catalog via Edge.
    • False: You cannot integrate Databricks Unity Catalog.

    Set this option to True.

    19 Jobserver

    The configuration of the Jobserver service.

    Setting Description

    Jobserver list

    The list of registered Jobserver instances.

    Name

    The name of the Jobserver as it will appear when you register a data source in Data Catalog.

    The name is a freely chosen name but it is recommended to only use alphanumerical characters and dashes, for example Jobserver-1.

    You will have to use this name as the ID of the gateway and in the address of this configuration.

    Protocol

    The protocol that is used for the communication between the Data Governance Center service and the Jobserver service.

    It is recommended to use HTTPS, especially if the services are hosted in different network segments.

    Address

    The address (IP address, URL, hostname) of the Jobserver.

    Trusted server CA certificate

    The certificate of the trusted CA needed to validate the server certificate. If blank, the default truststore will be used. The default truststore is defined in the SSL configuration section of the DGC service.

    The CA certificate of the server party (Jobserver).

    Client certificate

    The client certificate offered by the DGC service to the server. If blank, you cannot select mutual authentication as the Jobserver service authentication level.

    Client private key

    The private key of the DGC service's certificate.

    Table profiling data size

    The approximate maximum disk size of the data in MB that will be used to profile a table. The value cannot exceed 10,000.

    Test connection timeout

    This timeout is a time limit (in seconds) after which the connection test is stopped and a timeout error is shown. The default value is 60 seconds.

    20 Data profiling

    Profiling must be executed again after a change in this section.

    Setting Description
    Maximum number of samples The maximum number of samples you want to collect for a data source. The default value is 100. The maximum value is 1,000.
    This setting is specific to sample data.
    Maximum value length The maximum length of a value extracted during profiling or sampling. Additional characters are trimmed.
    Default date pattern The default format used to decode dates. It is the default pattern used for detecting dates when the Date Pattern and/or Time Pattern attribute is not specified in Column assets.
    Default time pattern The default format used to decode times. It is the default pattern used for detecting times when the Date Pattern and/or Time Pattern attribute is not specified in Column assets.
    Default combined date and time pattern The default format used to decode combined dates and times. It is the default pattern used for detecting combined dates and times when the Date Pattern and/or Time Pattern attribute is not specified in Column assets.
    Empty values

    A comma-separated list of strings enclosed in double quotes, for example "", "na" and "none". A value that matches one of those expressions is considered an empty value.

    Note that a database null value is always considered an empty value.
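    To make the format of the Empty values setting concrete, the following sketch parses a value such as "", "na", "none" and applies it to column values. The helper names are hypothetical, and exact, case-sensitive matching is an assumption, because the matching rules are not specified here.

```python
import csv


def parse_empty_values(setting: str) -> set:
    # The setting is a comma-separated list of double-quoted strings.
    row = next(csv.reader([setting], skipinitialspace=True))
    return set(row)


def is_empty(value, empty_values) -> bool:
    # A database null (None in this sketch) is always considered empty.
    return value is None or value in empty_values


empty_values = parse_empty_values('"", "na", "none"')
print([is_empty(v, empty_values) for v in [None, "", "na", "42"]])
```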

    Data type detection threshold The percentage of matching Column values to reach for an Advanced Data Type to be considered a possible Data Type for that Column. This is expressed as a value between 0.0 and 1.0.
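    The idea behind the data type detection threshold can be sketched as follows. This is not the actual detection logic: the matchers, the type names and the possible_data_types function are hypothetical, and it is an assumption that a share equal to the threshold counts as reaching it.

```python
import re


def possible_data_types(values, matchers, threshold):
    """Return the data types whose share of matching values reaches the threshold."""
    if not values:
        return []
    result = []
    for name, matches in matchers.items():
        share = sum(1 for v in values if matches(v)) / len(values)
        if share >= threshold:
            result.append(name)
    return result


# Hypothetical matchers for two data types.
matchers = {
    "Whole number": lambda v: re.fullmatch(r"-?\d+", v) is not None,
    "Email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
}
values = ["12", "7", "n/a", "42", "100"]  # 4 of 5 values are whole numbers
print(possible_data_types(values, matchers, 0.75))
```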

    Anonymize data (Jobserver)

    An option to anonymize sensitive data.

    • True: Content in columns with data type Text or Geo is removed or replaced by a random hash value before the profiling results are sent to the cloud.
    • False (default): No content is removed or replaced by a random hash value.

    Tip For anonymization via Edge, see setting "Anonymize Edge profiling results for all data types".

    Database profiling via Edge

    An option to enable profiling and classifying of synchronized metadata via Edge instead of Jobserver.

    • True: Profiling and classification via Edge.
    • False: Profile via Jobserver and classify via the Data Classification Platform.

    Note You can enable Database profiling via Edge only if you also enabled Database registration via Edge.

    Maximum duration of a profiling Edge job

    The maximum time duration, in minutes, that a profiling Edge job can run before Data Profiling stops the job.

    The default value is 20,160 minutes, 2 days.
    You can increase this limit to a maximum of 4 days.

    Parallel database profiling via Edge

    The maximum number of schemas that Edge can profile at the same time.

    By default, the value of this setting is 4. This means Edge processes four profiling jobs at a time. This can have a huge positive impact on the performance of the profiling activity.
    You can increase this number to a maximum of 16.

    Note 
    • If you increase this number to more than four jobs, make sure that your Edge site resources are aligned with the extra requests it will receive.
    • If you decrease this number and the running number of jobs exceeds the limit, no job will be canceled. Instead, there won't be any room to schedule a new job until at least one running job is completed.
    Example 

    The Parallel database profiling via Edge setting is set to 4.

    • For 1 database that contains 3 schemas, we will process all 3 schemas at the same time.
    • For 2 databases that contain 4 schemas in total, we will process all 4 schemas at the same time.
    • For 1 database that contains 8 schemas, we will start with 4 schemas and then proceed to the next ones as soon as a job is completed.
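    The behavior in this example can be simulated with a small sketch. The schedule function and the job durations are hypothetical; the sketch only assumes that schemas are started in order and that a waiting schema starts as soon as any running job completes.

```python
import heapq


def schedule(durations, max_parallel=4):
    """Return the start time of each schema profiling job.

    Jobs start in order; a new job starts as soon as a running one completes.
    """
    running = []  # min-heap with the finish times of the running jobs
    starts = []
    for duration in durations:
        if len(running) < max_parallel:
            start = 0
        else:
            start = heapq.heappop(running)  # wait for the earliest finish
        starts.append(start)
        heapq.heappush(running, start + duration)
    return starts


# One database with 8 schemas; durations in minutes are made up.
print(schedule([10, 20, 30, 40, 5, 5, 5, 5]))
```

    With a limit of 4, the first four schemas start immediately and each of the remaining schemas starts when a slot becomes free.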
    Anonymize Edge profiling results for all data types

    Enable this option to anonymize all Edge profiling results stored in Collibra.

    • True: Profiling results via Edge are anonymized for all columns.
    • False (default): Profiling results via Edge are anonymized only for columns with the Text or Geo data type.

    Calculate Data Similarity

    Important Data similarity is a cloud-only feature and is not certified for Collibra Cloud for Government.

    Enables the data similarity feature in your environment.

    • True (default for cloud environments, except for Public sector): Extra algorithms can run during profiling via Edge allowing the calculation of data similarity scores. The data similarity scores are currently used in Data Marketplace to show similar Table assets.
    • False: The beta feature is not enabled.
    Data Similarity Threshold

    This setting relates to the data similarity feature and defines from which similarity score Table assets must be displayed as similar data.
    Enter a value between 0.1 and 0.9.
    The default value is 0.5, which means that Table assets with a similarity score higher than 50% will show up as similar data.

    21 Beta features

    The configuration of features in beta state.

    Setting Description
    Tableau provisioning enabled
    • True: Provisioning to Tableau is enabled.
    • False (default): Provisioning to Tableau is disabled.

    Max number of concurrent import jobs

    The maximum number of import jobs that can be executed at the same time via the API. This is to avoid memory issues.

    The default value is 4. Set the value to 0 if there is no limit.

    Task sidebar

    • True (default): Workflow tasks appear in the sidebar on both resource pages and the task management page. Task forms appear in the sidebar instead of dialog boxes. Users can seamlessly complete their tasks from the task management page and have a side-by-side view of the tasks and resource details on resource pages.
    • False: Workflow tasks appear in the task bar on resource pages and in a sidebar on the task management page. Task forms appear in dialog boxes. The behavior is the same as with older versions of Collibra.
    Settings landing enabled
    • True (default): Show the new Settings landing page in your Collibra environment.
    • False: Use the classic Settings page in your Collibra environment.
    Allow access to the Workflow Designer
    • True: Enable the Workflow Designer access global permission which allows access to the Workflow Designer, a visual tool for creating process definitions.
    • False (default): Disable the Workflow Designer access global permission.

    Warning Do not enable this setting if you are working in a CPSH environment. Workflow Designer is not supported for CPSH.

    Frontend enabled

    This setting requires the SUPER role.

    • True (default): This setting is a prerequisite for the Homepage, Data Marketplace, Protect, and Usage Analytics applications, and for the new frontend experience. Enabling it allows you to enable these applications.
    • False: If you set the value to False, the Homepage, Data Marketplace, Protect, and Usage Analytics applications, and the new frontend experience will be disabled.

    This setting is not valid for Cloud customers.

    Shell asset page enabled
    • True: Enable the new asset page layout in the user interface.
    • False (default): Disable the new asset page layout in the user interface.

    Unified Classification enabled

    Enables the new Unified Data Classification method on Edge.

    • True (default in new environments from 2024.02): The environment uses the new classification method, Unified Data Classification. This has an impact on the available data classes, the required capabilities, and the way you classify data.

      Note All existing data classes and classifications become unavailable.

      Tip A migration process will be available in a future release. This process will transfer all old data classes and classifications to the Unified Data Classification method. Old transferred data classes will need to be updated to include classification rules to work with the new automatic data classification method.

    • False: The feature is not enabled.

      If you deactivate the setting, you revert to your previous data classification setup and the previously defined data classes and classifications become available again.

    Collections

    Enable this setting to activate the use of collections in Data Marketplace.

    • True: Collections can be used in Data Marketplace.
      • Users can add an asset to a collection from an asset preview in Data Marketplace. They can also remove an asset from a collection from the asset preview.
      • Users can access and manage all their collections from an overview page via their avatar → Collections.
    • False (default): The beta feature is not enabled.
      • When disabled, collections are not accessible via the UI.
      • Existing collections are not removed.
    Sampling optimization enabled

    By enabling this setting, the process of checking the Edge cache for samples is much faster.

    • True: If samples are already cached, they are immediately visible in the asset page.
    • False (default): The optimization feature is not enabled.
    New frontend experience enabled

    Enables the latest Collibra user interface, providing a brand new design.

    Uninterrupted Search
    • True: The search function remains available and the current data is available for search, while the search index is rebuilding in the background. Search results, however, may not be up to date until the rebuild is complete. After the rebuild is complete, all new data and changes to the existing data become available for search.
    • False (default): The search function becomes temporarily unavailable when the search index is rebuilding.
    Important If you enable or disable this setting, you need to rebuild the search index for the change to take effect.
    Generate descriptions with Collibra AI
    Important 

    This is a cloud-only feature and is not certified for Collibra Cloud for Government.

    Allows users to ask Collibra AI for description suggestions for the following asset types:

    • Column
    • Table
    • Database View
    • Data Set

    The possible values are:

    Collibra AI model
    This setting requires the SUPER role.

    The Large Language Model used by Collibra AI to generate descriptions for assets.
    For more information about this setting, contact Collibra Support.

    22 External link notice

    This section contains settings for showing a warning to users when they click a link that redirects them to an external website. These settings are applicable only to the latest user interface (UI).

    Setting Description
    External link notice enabled
    • True: Shows a dialog box when users click external links to warn them about the redirection to third-party websites. Enabling this setting may decrease the UI performance.
    • False (default): Does not show a dialog box when users click external links.
    External link notice dontAskAgain enabled
    • True: Shows a checkbox in the dialog box to allow users to choose whether they want to receive the same warning again on the same system and browser.
    • False (default): Does not show a checkbox in the dialog box.

    23 [Design system] features

    This section contains settings that control the appearance of certain products and features. These settings require the SUPER role and are therefore managed by Collibra.

    24 Throttling

    Throttling is a security mechanism that limits the number of requests per second to ensure the security and performance of your environment.

    Setting

    Description

    Collect metrics without throttling

    This setting requires the SUPER role.

    • True (default): Apply throttle logic without actual throttling to collect metrics.
    • False: No throttle logic is applied to collect metrics.

    The throttling metrics are only used for evaluation by Collibra support engineers if you report significant performance loss of your environment. We only track the number of times the throttling limit is exceeded.

    By default, the throttling limit is 100 API requests per second.

    24.1 REST API version 1.0 throttling

    Throttle configuration for REST API version 1.0.

    Setting

    Description

    Throttling enabled

    This setting requires the SUPER role.

    • True: REST API v1 throttling is enabled.
    • False (default): REST API v1 throttling is disabled.

    Number of requests

    This setting requires the SUPER role.

    The number of allowed requests for the configured number of seconds.

    Number of seconds

    This setting requires the SUPER role.

    The number of seconds during which the configured number of requests can be performed.

    24.2 REST API version 2.0 throttling

    Throttle configuration for REST API version 2.0.

    Setting

    Description

    Throttling enabled

    This setting requires the SUPER role.

    • True: REST API v2 throttling is enabled.
    • False (default): REST API v2 throttling is disabled.

    Number of requests

    This setting requires the SUPER role.

    The number of allowed requests for the configured number of seconds.

    Number of seconds

    This setting requires the SUPER role.

    The number of seconds during which the configured number of requests can be performed.

    24.3 GraphQL throttling

    Throttle configuration for GraphQL.

    Setting

    Description

    Throttling enabled

    This setting requires the SUPER role.

    • True: GraphQL throttling is enabled.
    • False (default): GraphQL throttling is disabled.

    Number of requests

    This setting requires the SUPER role.

    The number of allowed requests for the configured number of seconds.

    Number of seconds

    This setting requires the SUPER role.

    The number of seconds during which the configured number of requests can be performed.
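    How the Number of requests and Number of seconds settings interact can be sketched with a simple limiter. The throttling algorithm is not specified here, so a fixed-window counter is an assumption, and the FixedWindowThrottle class is a hypothetical name. With enforce set to False, the sketch only counts how often the limit is exceeded, similar in spirit to the Collect metrics without throttling setting.

```python
import time


class FixedWindowThrottle:
    def __init__(self, number_of_requests, number_of_seconds,
                 enforce=True, clock=time.monotonic):
        self.limit = number_of_requests
        self.window = number_of_seconds
        self.enforce = enforce
        self.clock = clock
        self.window_start = None
        self.count = 0
        self.exceeded = 0  # how many requests went over the limit

    def allow(self):
        now = self.clock()
        # Start a new window when none exists or the current one has elapsed.
        if self.window_start is None or now - self.window_start >= self.window:
            self.window_start = now
            self.count = 0
        self.count += 1
        if self.count > self.limit:
            self.exceeded += 1
            return not self.enforce  # metrics-only mode still lets the request pass
        return True


throttle = FixedWindowThrottle(number_of_requests=100, number_of_seconds=1)
print(throttle.allow())
```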

    25 Hibernate cache configuration

    The configuration of the Hibernate second-level caching. Hibernate caching uses a buffer memory that lies between Collibra and your repository database. It stores recently used data in this buffer memory to reduce the number of requests to your database, thereby improving the performance of your environment.

    Tip We recommend that you use the default values. If you choose to edit the values, contact the Collibra support department before doing so.

    Setting Description

    Enabled

    This setting requires the SUPER role.

    • True (default): Hibernate caching is enabled.
    • False: Hibernate caching is disabled.

    Configuration list

    This setting requires the SUPER role.

    The list of associated cache configurations.

     

     

    <cache configuration>

    This setting requires the SUPER role.

    The cache configurations.

    Click Add to create a new cache configuration.

    Setting Description
    Enabled
    • True: This cache configuration is enabled.
    • False: This cache configuration is disabled.
    Name

    The code of the assets that you want to cache.

    The codes are displayed in the following table. Do not use other codes.

    Max elements in memory The maximum number of elements in the memory.
    Eternal
    • True: The cache configuration is eternal.
    • False: The cache configuration is not eternal. This is the default value.

    We recommend that you do not change the default value.

    Overflow to disk
    • True: Write data to disk if the cache is full.
    • False: Do not write data to disk if the cache is full. This is the default value.

    We recommend that you do not change the default value.

    Name Resource type Recommendation for Max elements in memory
    GR Groups Set a number that is a percentage of the total amount of groups.
    CO Communities Set a number that is at least the total amount of communities if possible.
    UR Users Set a number that is a percentage of the total amount of users.
    VC Domains Use as many domains as possible in cache.
    RP Assets Set a number that is a percentage of the total amount of assets.
    CT Asset Types Set a number that is at least the total amount of asset types if possible.
    VT Domain Types Set a number that is at least the total amount of domain types if possible.
    TY Attribute Types Set a number that is at least the total amount of attribute types if possible.
    BF Relation Types Set a number that is at least the total amount of relation types if possible.
    ST Status Set a number that is at least the total amount of statuses if possible.

    25.1 Default configuration

    The default configuration for this cache.

    Do not change any of the default values.

    Setting Description

    Max elements in memory

    This setting requires the SUPER role.

    Type the maximum amount of elements in the memory.

    Eternal

    This setting requires the SUPER role.

    • True: The cache configuration is eternal.
    • False: The cache configuration is not eternal.

    Time to idle (seconds)

    This setting requires the SUPER role.

    Type the number of seconds that an element can remain in the cache without being accessed before it expires.

    Time to live (seconds)

    This setting requires the SUPER role.

    Type the number of seconds that an element can remain in the cache after its creation before it expires.

    Overflow to disk

    This setting requires the SUPER role.

    • True: The cache configuration overflows to the disk. When the cache has reached its maximum number of elements, the next elements will be stored on disk.
    • False: The cache configuration does not overflow to the disk. This is the default value.

    Disk persistent

    This setting requires the SUPER role.

    • True: The disk store is persistent between CacheManager instances.
    • False: The disk store is not persistent between CacheManager instances. This is the default value.

    This option is only relevant if Overflow to disk is set to True.

    Disk expiry thread interval (seconds)

    This setting requires the SUPER role.

    Enter the amount of seconds between runs of the disk expiry thread. This value is how often we check for expiry.

    This option is only relevant if Overflow to disk is set to True.

    Memorystore eviction policy

    This setting requires the SUPER role.

    Enter the policy that determines how items are deleted from disk:

    LRU

    Least recently used items are removed first from disk. This is the default value.

    LFU

    Least frequently used items are removed first from disk.

    FIFO

    Items are removed from disk on a first in, first out basis.

    This option is only relevant if Overflow to disk is set to True.
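    The difference between the eviction policies can be illustrated with a generic bounded cache. This sketch is not the cache implementation used by Collibra: BoundedCache is a hypothetical class, it keeps everything in memory, and it covers only the LRU and FIFO policies.

```python
from collections import OrderedDict


class BoundedCache:
    def __init__(self, max_elements, policy="LRU"):
        if policy not in ("LRU", "FIFO"):
            raise ValueError("this sketch only covers LRU and FIFO")
        self.max_elements = max_elements
        self.policy = policy
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        if self.policy == "LRU":
            self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items[key] = value
            if self.policy == "LRU":
                self._items.move_to_end(key)
            return
        if len(self._items) >= self.max_elements:
            self._items.popitem(last=False)  # evict the first entry in order
        self._items[key] = value

    def keys(self):
        return list(self._items)


for policy in ("LRU", "FIFO"):
    cache = BoundedCache(max_elements=2, policy=policy)
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")     # "a" becomes the most recently used item
    cache.put("c", 3)  # the cache is full, so one item is evicted
    print(policy, cache.keys())
```

    With LRU, reading "a" protects it and "b" is evicted. With FIFO, "a" is evicted because it was added first, regardless of later reads.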

    Statistics

    This setting requires the SUPER role.

    This option is not used by a Collibra environment.

    26 Graph query

    The configuration of the Graph query engine which is used to retrieve data from the repository.

    For the general Graph query settings in a cloud environment, you need the SUPER role. Contact Collibra Support if you want to edit these settings. For on-premises environments, you can edit the settings yourself.

    The Graph query settings are not available in on-premises environments.

    Setting

    Description

    work_mem setting for output module queries

    This setting requires the SUPER role.

    A custom amount of memory that is reserved for the output module SQL queries.

    This setting should only be used for diagnosing potential lack of memory in case of performance issues. Performance issues may arise when large sorting or large joins are needed.

    By default, this option is disabled.

    Enable simple joins

    This setting requires the SUPER role.

    • True: Enable an alternative logic in the query engine to execute join operations. The alternative logic improves the query engine performance if queries don't have "to-many" relations.
    • False (default): Keep the default logic of the query engine.

    Enable joins for view permission filtering

    This setting requires the SUPER role.

    • True: Enable an alternative logic to calculate view permissions in output module queries to improve the performance.
    • False (default): Keep the default logic of the output module.

    26.1 Graph query limits

    Setting

    Description

    Enables limiting of the number of root nodes in result
    • True: Enable limiting the number of root elements as result of a Graph query.
    • False (default): Disable limiting the number of root elements as result of a Graph query.
    Maximum number of root nodes that can be requested with graph query API

    The maximum number of root nodes that you can request in the view configuration of an API call (REST or workflow).

    If you exceed this value in the view configuration, an exception is shown. If no value is defined in the view configuration, then the default value is taken.

    The default value is 100,000.

    Note If the number of asset types or domain types exceeds the set number, the hierarchy will be incomplete. Make sure that the limit is always higher than the actual number of asset and domain types.

    Maximum number of nodes that can be requested with the graph query API in a single page The maximum number of both root and children nodes that can be requested through Output API in a single data page. If the value is outside of the allowed range, an exception is thrown. The default value is 1 million.

    26.2 Graph query timeouts

    Setting

    Description

    Maximum number of minutes a graph query can run

    The maximum number of minutes that the graph query runs before it will time out. The maximum is 1,440 minutes (1 day).

    The default value is 480.

    27 Table

    The configuration of tables.

    Setting Description
    Time limit for loading data in tables in seconds

    The time limit after which a table stops loading on a page.

    Example A value of 600 means that if a table hasn’t loaded within 600 seconds, the task is canceled and a timeout error is shown.

    The default value is 60 seconds and the maximum value is 720 seconds.

    27.1 Multi-column sort

    The configuration of multi-column sorting.

    Setting Description
    Multi-column sorting on tables
    • True: Tables can be sorted on multiple columns.
    • False (default): Tables can be sorted on one column.
    Number of columns available for multi-sort

    Type the maximum number of columns that can be used to simultaneously sort tables.

    The default value is 3, the minimum is 1, the maximum is 9.

    This setting is only relevant if Multi-column sorting on tables is True.
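    What sorting on multiple columns means, and how the configured maximum limits it, can be sketched as follows. The multi_sort function and the sample rows are hypothetical; only the default of 3 and the range of 1 to 9 come from the setting described above.

```python
def multi_sort(rows, sort_columns, max_columns=3):
    """Sort rows (dicts) on several columns.

    `sort_columns` is a list of (column_name, descending) pairs,
    most significant column first.
    """
    if not 1 <= max_columns <= 9:
        raise ValueError("max_columns must be between 1 and 9")
    if len(sort_columns) > max_columns:
        raise ValueError(f"at most {max_columns} sort columns are allowed")
    result = list(rows)
    # Python's sort is stable, so sorting on the least significant
    # column first yields the combined multi-column order.
    for name, descending in reversed(sort_columns):
        result.sort(key=lambda row: row[name], reverse=descending)
    return result


rows = [
    {"name": "B", "status": "Accepted"},
    {"name": "A", "status": "Candidate"},
    {"name": "A", "status": "Accepted"},
]
print(multi_sort(rows, [("name", False), ("status", False)]))
```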

    27.2 Inherited responsibilities

    Setting Description
    Enable Inherited Responsibilities
    • True: Shows both direct and inherited responsibilities when filtering or displaying assets in table and tile views.
    • False (default): Shows only direct responsibilities when filtering or displaying assets in table and tile views.
    Note 
    • This setting affects only asset views and tile sets. It does not affect the Responsibilities tab of asset pages.
    • If this setting is set to True, opening a table or tile view may take longer depending on factors such as asset count, selected columns, and applied filters.

    28 Purge configuration

    The configuration of the automatic purging of data from the repository database. Purging means to delete data of a specified age. This helps to keep your data relevant and keep the database from growing infinitely.

    Setting Description

    Purge schedule (Requires restart)

    A Cron expression specifying the timing and frequency of purge cycles.

    The default scheduled time is 0 0 2 * * ?, which equates to 02:00 every day.

    If you create an invalid Cron pattern, Collibra Data Intelligence Platform stops responding.

    Maximum time for each purge cycle (Requires restart)

    Maximum amount of time (in seconds) allowed for each purge cycle.

    The default value is 7,200, which is two hours.

    Any qualifying data that is not purged in the allowed time will be addressed in a subsequent purge cycle, which picks up where the previous cycle left off.

    List of data elements and age at which each will be purged (Requires restart)

    The data elements that will be purged at the specified age, in a key-value format:

    • The data elements to be purged (Field key). Possible data elements:
      • Statistics: Data that has been used to calculate data quality.
      • Authentication events: Data about authentication in your environment.
      • Validation results: Information about data validation.
      • Jobs: Data about all jobs that are created in your environment.
      • Workflows: Data about completed workflow instances and tasks.
      • License usage: Data about how licenses are used in your environment.
    • The age (in months) at which each individual data element will be purged (Field value).
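
    The key-value idea can be sketched as follows. This is an illustration under stated assumptions, not the purge implementation: the key spelling, the helper names, and the rule that a record qualifies once it is at least the configured number of whole months old are all assumptions.

```python
from datetime import date

# Hypothetical key-value configuration: data element -> age in months.
PURGE_AGES_IN_MONTHS = {
    "Statistics": 12,
    "Jobs": 6,
}


def months_between(earlier, later):
    """Number of whole calendar months from earlier to later."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1
    return months


def is_purgeable(element, created_on, today, ages=PURGE_AGES_IN_MONTHS):
    """Return True if a record of this data element has reached its purge age."""
    age_limit = ages.get(element)
    if age_limit is None:
        return False  # assumption: elements without an entry are never purged
    return months_between(created_on, today) >= age_limit


today = date(2024, 4, 15)
print(is_purgeable("Jobs", date(2023, 10, 15), today))     # True
print(is_purgeable("Jobs", date(2023, 10, 16), today))     # False
print(is_purgeable("Workflows", date(2020, 1, 1), today))  # False
```
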
    Enable removal of orphaned tags (Requires restart)

    Option to enable the automatic deletion of tags that are not assigned to any assets.

    • True (default): Orphan tags are deleted according to the timing and frequency that you specify.
    • False: Orphan tags are not deleted.
    Orphaned tags removal schedule (Requires restart)

    A Cron expression specifying the timing and frequency of the deletion of orphan tags.

    The default scheduled time is 0 0 1 * * ?, which equates to 01:00 every day.

    If you create an invalid Cron pattern, Collibra Data Intelligence Platform stops responding.

    29 Cloud Data Classification configuration

    With data classification you can automatically assign data classes to ingested data.

    Note In a Collibra Data Intelligence Platform environment, you have to create a support ticket to configure this feature.

    Setting

    Description

    Machine Learning platform URL

    This setting requires the SUPER role.

    The address of the machine learning platform that will classify your data.

    Requester Name

    This setting requires the SUPER role.

    The unique name that identifies the client when using the Machine Learning platform.

    API key

    This setting requires the SUPER role.

    The API key that authorizes the requester when connecting to the Machine Learning platform.

    Enable Data Classification

    • True: Enable Collibra's data classification technology.
    • False (default): Do not use Collibra's data classification technology.

    29.1 Classification thresholds

    Setting

    Description

    Enable automatic classification acceptance and rejection

    • True: The automatic acceptance and rejection of data classification suggestions is active.
    • False (default): Data classification suggestions are not automatically accepted or rejected.

    Tip Start using this tool by manually accepting and rejecting the data classification suggestions. Only activate the automatic acceptance and rejection feature if you are comfortable with the data classification results.

    Automatic acceptance threshold

    The percentage at or above which data classification suggestions are accepted automatically.
    If you set this value to 75, classification suggestions with a confidence level of 75% or higher are automatically accepted.

    If multiple classification suggestions for a column meet the threshold, the suggestion with the highest confidence level is accepted automatically, provided that no other suggestion has the same confidence level.

    Example 

    You set the automatic acceptance threshold to 85%. You classify a table with 2 columns.

    • For column A, three classification suggestions are possible, one with confidence level 93%, one with 92%, and one with 90%.
    • For column B, two classification suggestions are possible. Their confidence level is the same, 86%.

    The results of the automatic acceptance will be:

    • For column A, the classification suggestion with 93% will be accepted automatically.
    • For column B, nothing is done. Both suggestions remain visible.

    The default acceptance threshold is 90.

    Automatic rejection threshold

    The percentage at or below which data classification suggestions are rejected automatically. If you set this value to 49, all data classification suggestions with a confidence level of 49% or lower are automatically rejected.

    The default rejection threshold is 10.

    Note If the acceptance threshold and rejection threshold are set to the same value, and a data classification suggestion has this confidence level percentage, the classification suggestion will be rejected.
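
    The acceptance and rejection rules described in this section can be sketched in a few lines of Python. This is an illustration, not the classification service's code: the apply_thresholds function and the data class names are hypothetical, and leaving the remaining suggestions of a column as "pending" after one is accepted is an assumption.

```python
def apply_thresholds(suggestions, accept_at=90, reject_at=10):
    """Return {data_class: "accepted" | "rejected" | "pending"} for one column.

    suggestions maps a data class name to its confidence percentage.
    The defaults mirror the documented default thresholds.
    """
    outcome = {}
    for name, confidence in suggestions.items():
        # Rejection is checked first, so it wins when both thresholds are equal.
        outcome[name] = "rejected" if confidence <= reject_at else "pending"

    candidates = {
        name: confidence
        for name, confidence in suggestions.items()
        if outcome[name] == "pending" and confidence >= accept_at
    }
    if candidates:
        best = max(candidates.values())
        winners = [name for name, c in candidates.items() if c == best]
        # Accept only when exactly one suggestion holds the highest confidence.
        if len(winners) == 1:
            outcome[winners[0]] = "accepted"
    return outcome


# The example from this section, with hypothetical data class names.
column_a = {"Email": 93, "Username": 92, "URL": 90}
column_b = {"First name": 86, "Last name": 86}
print(apply_thresholds(column_a, accept_at=85))
print(apply_thresholds(column_b, accept_at=85))
```

    For column A only the 93% suggestion is accepted. For column B both suggestions stay pending, because they share the highest confidence level.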

    30 Reporting

    For more information about these settings, go to Insights Data Access.

    Setting

    Description

    Cloud Provider

    The cloud provider: AWS or GCP.

    Customer GUID

    The GUID of your Collibra environment.

    Note This field is configured by Collibra Cloud Ops.

    Insights download bucket name

    The name of the AWS S3 bucket in which your reporting data is stored.

    Note This field is configured by Collibra Cloud Ops.

    Insights AWS S3 Region

    The AWS S3 region in which your data is processed.

    Note This field is configured by Collibra Cloud Ops.

    Insights zip location pattern

    A pattern with the format "/zip/insights_%s.zip", where "%s" is replaced by the Collibra Insights snapshot date.

    Note This field is configured by Collibra Cloud Ops.
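
    As a rough illustration of how the "%s" placeholder in the zip location pattern is filled in, the following Python sketch substitutes a snapshot date. The zip_location helper and the date format are assumptions. The actual format of the snapshot date is not specified here.

```python
ZIP_LOCATION_PATTERN = "/zip/insights_%s.zip"


def zip_location(snapshot_date, pattern=ZIP_LOCATION_PATTERN):
    """Replace the %s placeholder with the snapshot date (hypothetical helper)."""
    return pattern % snapshot_date


# The ISO date below is an assumed format, used only for illustration.
print(zip_location("2024-04-01"))  # /zip/insights_2024-04-01.zip
```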

    Tableau report URL pattern

    The Tableau URL pattern, which should contain {reportName}.

    Tip You can paste the URL from the Link field in Tableau, as described in Generate the dashboard reports you configured in Collibra Data Intelligence Platform Settings.

    Reports definitions

     

    • Report view name: The report name, as you want it to appear on the report button in the Usage Analytics widget, for example "Data Maturity Dashboard".
    • Report name: The report name, as it appears in the URL of the Tableau report, for example "DataMaturityDashboard".
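
    The relationship between the Tableau report URL pattern and a report definition can be sketched as a simple placeholder substitution. The report_url helper and the example host and path are hypothetical. Only the {reportName} placeholder and the two example names come from the settings above.

```python
def report_url(url_pattern, report_name):
    """Fill the {reportName} placeholder of a Tableau URL pattern."""
    if "{reportName}" not in url_pattern:
        raise ValueError("the URL pattern should contain {reportName}")
    return url_pattern.replace("{reportName}", report_name)


# Hypothetical URL pattern; the report definition maps view name -> report name.
pattern = "https://tableau.example.com/views/{reportName}/Dashboard"
definitions = {"Data Maturity Dashboard": "DataMaturityDashboard"}

for view_name, name in definitions.items():
    print(view_name, "->", report_url(pattern, name))
```
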

    31 Catalog Experience

    Data Catalog Experience improves the layout of Data Catalog's asset pages.

    Setting

    Description

    Enable Catalog experience

    • True: Catalog experience is enabled. This will improve the layout of Data Catalog's asset pages, such as those of Data Set, Schema, Table and Column assets.
    • False: Catalog experience is disabled.

    Catalog Experience Titlebar theme

    The theme for the Catalog experience. You can choose between LIGHT and DARK.

    This option is only applicable if the Enable Catalog experience option is enabled.

    32 Data Notebook features

    Setting

    Description

    Data Notebook enabled

    This setting requires the SUPER role.

    • True: The Data Notebook feature is available in the shell.
    • False (default): The Data Notebook feature is not available in the shell.

    33 Diagrams

    These settings determine diagram loading time and size limits.

    Setting

    Description

    Maximum loading time for the back end

    The time limit, in seconds, after which a diagram stops fetching data.

    The value must be a positive integer and cannot be greater than 3,600 (one hour).

    The default value is 300.

    Example A value of 300 means that if a diagram hasn’t fetched all data within 300 seconds, the diagram stops fetching data and an empty diagram with a notification is shown.

    Size limit for the backend

    The maximum number of nodes plus edges that will be fetched by the backend, to build a diagram.

    The value must be a positive integer and cannot be greater than 100,000.

    The default value is 10,000.

    Example A value of 10,000 means that if the total number of nodes plus edges is greater than 10,000, the diagram does not load and a notification is shown.

    Size limit for the frontend

    The maximum number of visible nodes plus edges that can be shown on the page.

    The value must be a positive integer and cannot be greater than 10,000.

    The default value is 2,000.

    Example A value of 2,000 means that if the total number of visible nodes and edges is greater than 2,000, the diagram does not load and a notification is shown.

    Maximum flow depth

    The system-wide maximum number of flow relations between the start node and any other diagram node.

    The value must be an integer between 1 and 100.

    The default value is 50.

    Note 
    • If the maximum flow depth is specified in the selected diagram view, that value supersedes the maximum you specify here.
    • You can also manually adjust the flow depth in the diagram.
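
    The documented ranges and defaults of the diagram settings lend themselves to a small validation sketch. The setting keys and function names below are hypothetical. The numeric limits and defaults are the ones listed in this section.

```python
# (minimum, maximum, default) per setting, as documented in this section.
# The dictionary keys are hypothetical names, not Collibra identifiers.
DIAGRAM_LIMITS = {
    "max_loading_time_seconds": (1, 3600, 300),
    "backend_size_limit": (1, 100_000, 10_000),
    "frontend_size_limit": (1, 10_000, 2_000),
    "max_flow_depth": (1, 100, 50),
}


def validate_diagram_settings(settings):
    """Return validated values, filling in the documented defaults."""
    validated = {}
    for key, (low, high, default) in DIAGRAM_LIMITS.items():
        value = settings.get(key, default)
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{key} must be an integer")
        if not low <= value <= high:
            raise ValueError(f"{key} must be between {low} and {high}")
        validated[key] = value
    return validated


def diagram_fits(node_count, edge_count, size_limit):
    """A diagram loads only if nodes plus edges do not exceed the limit."""
    return node_count + edge_count <= size_limit


settings = validate_diagram_settings({"frontend_size_limit": 5_000})
print(settings["max_flow_depth"])                                  # 50
print(diagram_fits(1_500, 600, settings["frontend_size_limit"]))   # True
print(diagram_fits(6_000, 4_001, settings["backend_size_limit"]))  # False
```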

    Diagrams Business Qualifier Filter (*)

    • True: Users can filter diagrams by a specified Business Qualifier asset.
    • False (default): Users are unable to filter diagrams by Business Qualifier.

    34 Everywhere Desktop configuration

    These settings determine some of the ways in which Collibra for Desktop interacts with Collibra Data Intelligence Platform.

    Note These settings are only applied in Collibra for Desktop if you have Collibra 2021.01 or newer in combination with Collibra for Desktop 1.2.1 or newer.

    Setting

    Description

    Default search filter

    The filter that is applied, by default, to search results. The value must be the UUID of the filter.

    To find the UUID, open the Collibra environment and click in the Search box. Click the name of a search filter. In the address bar you will see the UUID of the filter.

    Note Specifying a default search filter in the application will override the default filter that you specify here.

    Custom Search box placeholder

    Placeholder text that appears in the Search field before a user enters search text.

    The default text is "Search in Collibra".

    Shortcut Search

    Enable or disable the use of a keyboard shortcut to search for selected text in Collibra Data Intelligence Platform from within your browser or another application.

    • True (default): Users can use the keyboard shortcut you specify in the following setting to search in Collibra Everywhere.
    • False: Keyboard shortcut is disabled.
    Custom Shortcut Search

    The keyboard shortcut to search for selected text in Collibra Data Intelligence Platform from within your browser or another application.

    Tip The keyboard shortcut has to be a combination of Control, Alt or Shift with one letter or number. On macOS you can also use the Command key.

    Note 
    • To make the keyboard shortcut available, you have to enable the feature in the previous setting.
    • Specifying a keyboard shortcut in the application will override the shortcut that you specify here.
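
    The rule in the tip above (one or more modifier keys plus a single letter or number) can be checked with a short Python sketch. The "Control+Shift+K" notation is an assumption, because the exact format that the setting expects is not described here. The is_valid_shortcut helper is hypothetical.

```python
import re

MODIFIERS = {"Control", "Alt", "Shift"}
MACOS_MODIFIERS = MODIFIERS | {"Command"}


def is_valid_shortcut(shortcut, macos=False):
    """Check a shortcut written as modifiers and one key joined by '+'.

    The '+'-separated notation is assumed for illustration only.
    """
    parts = [part.strip() for part in shortcut.split("+")]
    *modifiers, key = parts
    allowed = MACOS_MODIFIERS if macos else MODIFIERS
    # At least one modifier is required, and none may be repeated.
    if not modifiers or len(set(modifiers)) != len(modifiers):
        return False
    if not all(modifier in allowed for modifier in modifiers):
        return False
    # The final part must be exactly one letter or digit.
    return re.fullmatch(r"[A-Za-z0-9]", key) is not None


print(is_valid_shortcut("Control+Shift+K"))       # True
print(is_valid_shortcut("K"))                     # False
print(is_valid_shortcut("Command+K"))             # False
print(is_valid_shortcut("Command+K", macos=True)) # True
```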

    Enable Auto Hyperlinking

    Option to enable automatic hyperlinking within Collibra for Desktop.

    With this option enabled, the name of an asset automatically becomes a hyperlink when you fill out a text attribute.

    This option only works if the Enable hyperlinking option in Collibra Console is also enabled.

    Enable Workflows

    Option to enable workflows in Collibra Everywhere.

    This allows you to complete tasks or start a workflow in the app. The available workflows depend on the ones that you add to the Global workflows and Asset workflow configuration.

    Recommender

    The Recommender helps users by suggesting relevant business assets and data sets, based on certain relation types and the past actions of similar users.

    • True: Recommender is enabled.
    • False (default): Recommender is disabled.

    This feature only works if Analytics is enabled. You can enable Analytics in section 1 General of the DGC service configuration.

    Auto-updater

    Option to automatically upgrade Collibra Everywhere when a new version is available.

    • True (default): Collibra Everywhere is automatically upgraded when a new version is available.
    • False: You need to manually upgrade Collibra Everywhere when a new version is available.

    Note If you enable automatic updates, you have to whitelist the S3 bucket collibra-otg-desktop-installers in the region eu-west-1.

    Allow User Configuration

    Option to allow users to edit personal settings in Collibra Everywhere.

    • True (default): Users can edit personal settings.
    • False: Users cannot edit personal settings.

    Global workflows

    The list of workflows that is available in the app's main menu.

    Enter the UUIDs of the workflows. An example workflow could be "Create issue".

    Asset workflow

    The list of workflows that is available on an asset page in the app.

    Enter the UUIDs of the workflows. An example workflow could be "Ask the expert".

    No search result workflows

    The workflows that are available if there are no search results found.

    Enter the UUIDs of the workflows. An example workflow could be "Propose new business term".

    Enable autostart

    Option to automatically start Collibra for Desktop when signing in to your operating system.

    • True: The app starts automatically when signing in to your operating system.
    • False (default): The app does not start automatically.

    If you have set this option in the Collibra for Desktop settings, this option is ignored.

    35 Everywhere Mobile configuration

    These settings determine some of the ways in which Collibra for Mobile interacts with Collibra Data Intelligence Platform.

    Setting

    Description

    Default search filter

    The filter that is applied, by default, to your search results. The value must be the UUID of the filter.

    This filter overrules any search filter that is set in the app.

    To find the UUID, open the Collibra environment and click in the Search box. Click the name of a search filter. In the address bar you will see the UUID of the filter.

    Custom Search box placeholder

    The text that is shown in the search box of Collibra for Mobile before a user enters search text.

    Enable Workflows

    Option to enable workflows in Collibra for Mobile.

    This allows you to complete tasks or start a workflow in the app. The available workflows depend on the ones that you add to the Global workflows and Asset workflow configuration.

    Global workflows

    The list of workflows that is available in the app's main menu.

    Enter the UUIDs of the workflows. An example workflow could be "Create issue".

    Asset workflow

    The list of workflows that is available on an asset page in the app.

    Enter the UUIDs of the workflows. An example workflow could be "Ask the expert".

    No search result workflows

    The workflows that are available if there are no search results found.

    Enter the UUIDs of the workflows. An example workflow could be "Propose new business term".

    36 Collibra Browser Extension configuration

    The settings determine how and where you can use the Everywhere Chrome Extension.

    Setting Description
    Domains

    Add a web domain, for example of a web application such as Power BI or Tableau, on which the Browser Extension automatically appears as an overlay.

    37 Edge

    Edge configuration options. When you change an option, you only have to refresh the page that runs your Collibra environment.

    Setting Description
    Enable Edge jobs feature (beta)
    • True: Enable the Edge jobs page. This page gives you an overview of all jobs and their status.
    • False (default): Disable the Edge jobs page.

    38 Tableau Metadata API

    You need the Tableau metadata API to ingest Tableau 2020.2 and newer.

    Warning If you upgrade to Tableau version 2020.2 or newer, but previously synchronized an older Tableau version via the REST API and XML mapping, you have to prepare the migration procedure to prevent losing manually added relations, attributes, tags, comments and stitching results.

    Setting

    Description

    Enable Tableau metadata API

    • True: Tableau metadata API is enabled. This enables you to ingest Tableau 2020.2 or newer into Data Catalog.
    • False: Tableau metadata API is disabled. If you ingest Tableau 2020.2 or newer, the ingestion will fail. This prevents data loss of manually added relations and attributes.
    Tableau on-premise instances

    Note This setting is only applicable if you are using the latest Collibra UI.

    A comma-separated list of your on-premises Tableau URLs. This list represents the values that appear in the Tableau URL or endpoint drop-down list on Tableau Server asset pages. You select a URL when setting up a connection to the on-premises Tableau instance.

    Tableau Online URL regular expression

    A regular expression that is used to validate the format of the URL of the on-premises Tableau instance.

    The default expression is ^https?:\/\/[a-zA-Z0-9.-]+\.tableau\.com\/?$. We recommend that you not change the default expression unless you find that it doesn't work.
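
    To see which URLs the default expression accepts, you can try it in Python, whose re module handles this pattern as written. The is_tableau_online_url helper and the sample URLs are made up for illustration. How Collibra itself evaluates the expression is not described here.

```python
import re

# The default expression quoted above, unchanged.
DEFAULT_TABLEAU_ONLINE_URL = re.compile(
    r"^https?:\/\/[a-zA-Z0-9.-]+\.tableau\.com\/?$"
)


def is_tableau_online_url(url):
    """Return True if the URL matches the default expression."""
    return DEFAULT_TABLEAU_ONLINE_URL.match(url) is not None


# Hypothetical sample URLs.
samples = [
    "https://10ax.online.tableau.com",       # matches
    "https://eu-west-1a.online.tableau.com/",  # matches, trailing slash allowed
    "https://tableau.example.com",           # no match, wrong domain
    "https://10ax.online.tableau.com/#/site",  # no match, path not allowed
]
for url in samples:
    print(url, is_tableau_online_url(url))
```

    The expression accepts only a scheme, a host that ends in .tableau.com, and an optional trailing slash. Any path or fragment makes the URL fail validation.
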

    39 Job Service (Activities)

    Setting Description
    Number of executor threads for the Job Service

    The maximum number of threads, or jobs, that the Job Service can run in parallel.

    Generally speaking, increasing the number of jobs running in parallel reduces overall processing time. Conversely, it requires more system resources, which can negatively impact performance. It also increases the risk of job conflicts.
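
    The effect of a thread cap can be illustrated with a generic Python thread pool. This is not the Job Service itself. The run_jobs function and the dummy jobs are hypothetical, and the sketch only shows that the number of jobs running at the same time never exceeds the configured number of executor threads.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_jobs(job_count, executor_threads):
    """Run job_count dummy jobs and return the peak number running at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def job():
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # stand-in for real work
        with lock:
            running -= 1

    # The pool never runs more than executor_threads jobs in parallel.
    with ThreadPoolExecutor(max_workers=executor_threads) as pool:
        for _ in range(job_count):
            pool.submit(job)
    return peak


print(run_jobs(job_count=8, executor_threads=3) <= 3)  # True
```
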

    40 Identity

    Setting

    Description

    Limit user information access
    • True: Limits user access to information related to other users.
    • False (default): Users have access to information related to other users.

    Note Users that have a role with the System or User administration permission have full access regardless of this setting.

    41 Lineage on Edge

    Setting Description
    DGC user name
    The DGC user that is used to ingest technical lineage data into the environment via the technical lineage servers.
    DGC user password
    The password of the DGC user that is used to ingest technical lineage data into the environment via the technical lineage servers.
    Collibra system name flag
    Enable this option if Lineage uses a Collibra system name.
    Power BI and Tableau synchronization enabled
    • True: Enable ingestion and technical lineage for Power BI and Tableau via Edge.
    • False (default): Disable ingestion and technical lineage for Power BI and Tableau via Edge.
    Technical lineage via JDBC connection enabled
    • True: Enable technical lineage for data sources by using JDBC connections via Edge. For an overview of the supported data sources, go to the Technical lineage documentation.
    • False (default): Disable technical lineage for data sources by using JDBC connections via Edge.
    Technical lineage for SQL via folder, ETL Tools, and custom lineage on Edge enabled
    • True: Enable custom technical lineage and technical lineage for ETL tools and SQL data sources by using a Shared Storage connection. Technical lineage ingestion by using the Shared Storage connection type on Edge is equivalent to ingestion by using the folder connection type when you use the lineage harvester. For an overview of the supported data sources, go to the Technical lineage documentation.
    • False (default): Disable custom technical lineage and technical lineage for ETL tools and SQL data sources by using a Shared Storage connection.

    42 Collibra Protect

    Setting Description
    Protect scheduler fixed delay

    The number of minutes in between synchronizations.

    The default value is 60 minutes.

    43 License configuration

    Setting Description

    User license view schedule

    This setting requires the SUPER role.

    This sets how often the license numbers on the user table are refreshed.

    License usage snapshot cron job schedule

    This sets how often the license usage snapshot is refreshed. The interval cannot be shorter than 60 minutes.

    44 Data Marketplace configuration

    The configuration of the Data Marketplace.

    Setting Description
    Data Marketplace
    • True (default): Data Marketplace is enabled. Anyone with the required permissions can use or configure the Data Marketplace application from the Applications icon.

      Note When Data Marketplace is enabled and you reindex Collibra completely, the relations are also reindexed automatically. You don't need to start it manually. However, reindexing the relations will not reindex Collibra completely.

    • False: Data Marketplace is not enabled.

    After you enable this setting, reindex Data Marketplace relations or reindex Collibra completely.

    Note In new Collibra environments, this setting is enabled by default. In upgraded Collibra environments, the previous status of this setting is retained.

    45 Data Privacy

    The configuration of the Data Privacy landing page.

    Setting Description
    Privacy landing page

    46 Export Configuration

    Setting

    Description

    Detailed Relation Export Format

    This setting requires the SUPER role.

    • True (default): The exported relation column headers contain 3 elements of the relation: [Source Asset Type], Relation, and [Target Asset Type].
    • False: The exported relation column headers contain only 2 elements: Relation and [Target Asset Type].

    47 Slack Integration Configuration

    Note This section is not currently applicable.

    48 Asset external link

    Setting

    Description

    URL patterns to external asset pages

    A list of URL patterns to external asset pages:

    • External system id: The external system ID as defined by the externalSystemId field of the import file used with the Collibra REST Import API.
    • External system label: The name that the external system has on the asset page.
    • URL pattern to external asset page: The pattern to construct the URL to the external asset page, for example https://external-system.com/entity/{externalEntityId}. The externalEntityId is the one defined by the same field of the import file used with the Collibra REST Import API.
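
    How the three fields work together can be sketched as a lookup followed by a placeholder substitution. The dictionary keys, the external_asset_url helper, and the "crm" sample values are hypothetical. Only the {externalEntityId} placeholder and the example URL come from the description above.

```python
# Hypothetical configuration entries; the fields mirror the list above.
EXTERNAL_LINKS = [
    {
        "external_system_id": "crm",
        "label": "CRM",
        "url_pattern": "https://external-system.com/entity/{externalEntityId}",
    },
]


def external_asset_url(external_system_id, external_entity_id, links=EXTERNAL_LINKS):
    """Return (label, url) for an asset, or None if the system is unknown."""
    for link in links:
        if link["external_system_id"] == external_system_id:
            url = link["url_pattern"].replace(
                "{externalEntityId}", external_entity_id
            )
            return link["label"], url
    return None


print(external_asset_url("crm", "12345"))
# ('CRM', 'https://external-system.com/entity/12345')
```
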

    49 Assessment Configuration

    Setting

    Description

    Show out of the box templates
    • True (default): Shows all out-of-the-box Collibra templates to all users on the Template Gallery page and also in the template drop-down list box when conducting an assessment.
    • False: Hides all out-of-the-box Collibra templates from all users on the Template Gallery page and also in the template drop-down list box when conducting an assessment.