Services configuration

The services configuration allows you to edit settings that affect your entire platform. For example, you can edit the help menu and configure the search feature.

The Collibra configuration includes the following options:

General settings

The general settings of Collibra Data Intelligence Cloud.

Setting Description
Enable view rights
  • True (default): The view permissions feature is enabled.
  • False: The view permissions feature is disabled.

Show target asset type above relation table

  • True (default): Show the asset type of the target asset in the title of relation tables on an asset page. The target asset can be either the head or the tail of the relation, depending on which asset page you have open.
  • False: Hide the asset type of the target asset.

The default value is true.

Refreshed Navigation

  • True: Use the new, refreshed navigation of the Collibra user interface.
  • False (default): Use the classic navigation of the Collibra user interface.
Add Asset Grid link to menu
  • True: Add the link to Asset Grid in the main menu of your environment.
  • False (default): The link to Asset Grid is not available in the main menu of your environment.

Help Menu

The configuration of the Help menu in Collibra Data Intelligence Cloud.

Setting Description
Links The list of links in the help menu.
  Menu item name
The name of the menu item as it will appear in Collibra Data Intelligence Cloud's help menu.
  Menu index
The position of the menu item in the help menu. The top position starts with the value 1.
  Menu URL
The target URL of the menu item.
  Show admin only
  • True: The menu item is only visible to users with the Sysadmin role.
  • False: The menu item is visible to every user.

Email configuration

The configuration of email notifications.

Note In a Collibra Data Intelligence Cloud environment, you cannot update the email server settings, such as host and port. For more information, see Collibra Data Intelligence Cloud infrastructure.

Setting Description
Default schedule (Requires restart)

The Cron schedule that determines when emails are sent. This lets you send emails in batches and avoid an overload of mails.

Keep in mind that these emails are only workflow emails and have nothing to do with the notification schedule.

If you create an invalid Cron pattern, Collibra Data Intelligence Cloud stops responding.

Template map The location of template emails.
Password (*) The password paired with your username to sign in to your SMTP server.
From address The email address used as the sender of all outgoing emails.
Port (*) The port to connect to your SMTP server. The default value is 25.
Host (*) The hostname or URL of your SMTP server.
Start TLS (*)
  • True: Use TLS (Transport Layer Security) to connect to your SMTP server.
  • False (default): Do not use TLS to connect to your SMTP server.
Username (*) The username to sign in to your SMTP server.
Sending threads (*) The number of threads that are used to send emails. The default value is 3.
Max retries (*) The maximum number of retries before the system aborts the sending of an email. The default value is 5.

Notifications

The configuration of notification emails to users.

Note These settings can be overridden for every user in the preferences.xml file.

Setting Description
Notification days

The days of the week on which Collibra sends notifications. The days are represented by numbers from 1 to 7, where 1 represents Sunday.

You can add one day per row.

Daily roles The roles that receive notifications on the days defined in Notification days.
Enable monthly notifications
  • True: The users receive a monthly summary.
  • False (default): The users do not receive a monthly summary.
Roles for monthly notifications The roles that receive monthly notification emails. This is only relevant if Enable monthly notifications is True.
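
As an illustration of the day numbering used by Notification days, the following minimal sketch translates configured day numbers into day names. Only the fact that 1 represents Sunday comes from the description above; the mapping of 2 to 7 onto Monday to Saturday is assumed from consecutive numbering, and the function name is hypothetical.

```python
# Day numbers for the Notification days setting.
# 1 = Sunday is documented; 2..7 = Monday..Saturday is an assumption.
DAY_NAMES = {
    1: "Sunday",
    2: "Monday",
    3: "Tuesday",
    4: "Wednesday",
    5: "Thursday",
    6: "Friday",
    7: "Saturday",
}


def describe_notification_days(day_numbers):
    """Translate day numbers into day names, rejecting values outside 1-7."""
    names = []
    for number in day_numbers:
        if number not in DAY_NAMES:
            raise ValueError(f"Notification day must be between 1 and 7, got {number}")
        names.append(DAY_NAMES[number])
    return names


if __name__ == "__main__":
    # One row per day in the configuration, for example rows 2, 4 and 6.
    print(describe_notification_days([2, 4, 6]))
```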

Recommender configuration

The configuration of the recommender.

Setting impacts Description
Catalog recommender enabled All recommendations
  • True (default): The "Data sets you might like" section is included on the Data Catalog Home page. This section shows data sets you might be interested in, as determined by the recommender, which takes into account your data sets and the data sets of similar users.
  • False: The "Data sets you might like" section is not included on the Data Catalog Home page.
Data set recommender execution time Recommendations of data sets to users

The schedule (CRON job) by which the data set recommender looks for recommended data sets for a user.

By default the data set recommender does this every night.

Asset recommender execution time Recommendations of business assets to data assets The schedule (CRON job) by which the asset recommender looks for suggested relations between business assets and data sets.
Data set matcher execution time Data set matcher The schedule (CRON job) by which the data set matcher looks for similar data sets.
Data set similarity threshold Data set matcher

The proportion of business assets that have to be related to both data sets before the data sets are considered to be similar.

This percentage is expressed by a decimal where 1,00 equals 100%.

Duplicate schema threshold Schema matcher

The proportion of assets that have to be related to both schemas before the schemas are considered to be similar.

This percentage is expressed by a decimal where 1,00 equals 100%.

Fuzzy vs exact matching strategy for business assets Recommendations of business assets to data sets and of business assets to column assets

The percentage that determines to what extent assets with a similar name become more important.

The ranking in the search engine results always has an impact on the suggestion score. However, similarity between the asset names can also be taken into account. If you decrease this percentage, the ranking of the search results becomes more important for the suggestion score, while the similarity between the asset names becomes less important. If you increase the percentage, assets with similar names will receive a higher suggestion score.

This percentage is expressed by a decimal where 1,00 equals 100%. You can enter a value greater than 1,00.

Recommendation weights for data sets Recommendations of data sets to users

An ordered comma-separated list of values that define the importance of properties for recommendations. The order of the values reflects the importance of the value.

This setting is only used for data set recommendations if your Collibra does not yet have enough data for relevant results from the active recommendations algorithms.

Possible values:

  • CERTIFIED: Data sets that are certified are considered more relevant.
  • POPULARITY: The number of visits to the data set page.
Active recommendation algorithms Recommendations of data sets to users and of business assets to data sets

A comma-separated list of algorithms that calculate recommendations. By default, all available algorithms are listed.

Possible values:

Warning If you create an invalid Cron pattern, Collibra Data Intelligence Cloud stops responding.

Search index configuration

The configuration of the search index.

Setting Description
UI search appends wildcard
  • True (default): A wildcard (asterisk) is automatically added to each search query. An asterisk is not added in the following exceptions:
    • If the query contains a tilde (~).
    • If the query ends with a quotation mark (").

    Note This applies only to queries via the user interface. A wildcard is not added automatically for REST API queries.

  • False: No wildcard is added to the search query.
Maximum batch size

The number of resources scanned in one go for the search query.

The default value is 5,000. The maximum value is 30,000.

Maximum batch size for relations

Maximum batch size for relations reindex.

Stop words (Requires restart)

A list of stop words that are ignored as tokens for the index.

The default list of English stop words includes:

a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, such, that, the, their, then, there, these, they, this, to, was, will, with

If you choose not to create your own list of stop words, the default list applies.

If you create your own list of stop words, you have to:

  1. Reindex Collibra Data Intelligence Cloud.
  2. Restart the environment to apply your changes. See Stop an environment and Start an environment.
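
The rules of the UI search appends wildcard setting, described earlier in this section, can be sketched as a small function. This is only an illustration of the documented rules, not the actual implementation in Collibra; the function and parameter names are hypothetical.

```python
def apply_ui_wildcard(query, append_wildcard=True):
    """Return the query as the UI would send it, following the documented rules."""
    if not append_wildcard:
        # Setting is False: the query is never modified.
        return query
    if "~" in query:
        # Exception 1: the query contains a tilde.
        return query
    if query.endswith('"'):
        # Exception 2: the query ends with a quotation mark.
        return query
    return query + "*"


if __name__ == "__main__":
    for text in ["cust", "custmer~", '"customer name"']:
        print(repr(text), "->", repr(apply_ui_wildcard(text)))
```

Because the wildcard is not added automatically for REST API queries, a client that wants the same behavior would have to apply equivalent logic itself.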

Tokenizer

The configuration of the tokenizer of the indexing mechanism. If you edit these settings, you need to restart and reindex your environment.

Setting Description
Type

The tokenizer that is used. Currently two tokenizers are supported:

  • Standard (default): This tokenizer uses the word break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29.
  • Character: This tokenizer sees words as groups of all alphanumeric characters together with a configurable list of extra characters. This can be used if you know for sure which characters should keep certain words together. For example, if you want to keep words with a dash ( - ) together, you have to add the dash in the allowedCharacters parameter.
Parameter map

The allowed characters if the Type is Character.

  • Field key: This field has to contain allowedCharacters.
  • Field value: The concatenated list of characters that does not split strings into separate tokens. For example, the concatenated list -' allows dashes and apostrophes in tokens.
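
To make the effect of allowedCharacters more concrete, here is a minimal sketch of a character-based tokenizer. It assumes that "alphanumeric" corresponds to Python's `str.isalnum()` and is not the tokenizer used by the search index; the function name is hypothetical.

```python
def character_tokenize(text, allowed_characters=""):
    """Split text into tokens of alphanumeric characters plus any allowed extras."""
    tokens = []
    current = []
    for ch in text:
        if ch.isalnum() or ch in allowed_characters:
            # The character belongs to the current token.
            current.append(ch)
        elif current:
            # Any other character ends the current token.
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


if __name__ == "__main__":
    sample = "state-of-the-art user's guide"
    print(character_tokenize(sample))        # dashes and apostrophes split tokens
    print(character_tokenize(sample, "-'"))  # dashes and apostrophes are kept
```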

Boosting

The configuration of the boosting function.

Setting Description
Asset The boost factor of assets.

Class Match

The boost factor of data classes.

Community The boost factor of communities.
Domain The boost factor of domains.
User The boost factor of users.
User group The boost factor of user groups.
Name The boost factor of names.
Comment The boost factor of comments.
Tag The boost factor of tags.
Attribute boost map

The boost factor of attribute types.

  • Field key: The attribute type ID.
  • Field value: The boost factor of the attribute type.

Display exact match of name as first

  • True (default): If the name of an asset is exactly the same as the search text, put it at the top of the search results regardless of boost factors.
  • False: Use the regular search order, taking into account boost factors.

Asset boost map

The boost factor of asset types.

  • Field key: The asset type ID.
  • Field value: The boost factor of the asset type.

Partial exact match enabled

Enables partial exact matching when searching for multi-word phrases.

  • True (default): For multi-word search text, the search engine considers the exact match percentage with the resource name, when ordering the results.
    Example You enter search text "scheduled maintenance". Two example assets are ordered as follows:
    1. An asset named "daily scheduled maintenance", as two of the three words (66%) match exactly.
    2. An asset named "daily scheduled maintenance revised", as two of the four words (50%) match exactly.
  • False: The exact match percentage is not taken into account in the score calculation.
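
The percentages in the Partial exact match enabled example can be reproduced with a short calculation. The sketch below assumes that the exact match percentage is the share of words in the resource name that also appear in the search text. The real scoring formula is not documented here, so treat this purely as an illustration; the function name is hypothetical.

```python
def exact_match_percentage(search_text, resource_name):
    """Share of words in the resource name that exactly match a search word."""
    search_words = set(search_text.lower().split())
    name_words = resource_name.lower().split()
    if not name_words:
        return 0.0
    matched = sum(1 for word in name_words if word in search_words)
    return 100.0 * matched / len(name_words)


if __name__ == "__main__":
    search_text = "scheduled maintenance"
    names = ["daily scheduled maintenance revised", "daily scheduled maintenance"]
    # Higher exact match percentage first.
    ranked = sorted(
        names,
        key=lambda name: exact_match_percentage(search_text, name),
        reverse=True,
    )
    for name in ranked:
        print(f"{exact_match_percentage(search_text, name):.1f}%  {name}")
```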

Slow logs configuration

The configuration of the slow logs function.

Setting Description
Indexing threshold

The time limit, in milliseconds, after which an index query is logged in Elasticsearch.

If the value is set to 0 (zero), all index queries are logged.

Changes to this setting require a full reindex of your Collibra Data Intelligence Cloud environment.

Fetching threshold

The time limit, in milliseconds, after which a fetch query is logged in Elasticsearch.

If the value is set to 0 (zero), all fetch queries are logged.

Changes to this setting require a full reindex of your Collibra Data Intelligence Cloud environment.

Statistics configuration

The configuration of statistics.

Setting Description
Buffer size

The maximum number of statistics entries that the buffer can contain before saving them in the database.

The default value is 10.

Buffer flush time

The maximum amount of time in milliseconds to keep statistics entries in memory before saving them in the database.

The default value is 10,000.

Cron map

List of statistics, listed by their Cron name, and a Cron interval.

These are the default values:

Field key Field value

workflow-task

0 59 23 * * ?

active-users

0 0/15 * * * ?

term-count

0 59 23 * * ?

vocabulary-count

0 59 23 * * ?

page-hit

0 0 * * * ?

task-count

0 0 * * * ?

If you create an invalid Cron pattern, Collibra Data Intelligence Cloud stops responding.
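
Because an invalid Cron pattern has such severe consequences, it can help to sanity-check a pattern before saving it. The sketch below performs only a rough structural check, assuming Quartz-style patterns with six or seven space-separated fields, as suggested by the default values above. It does not validate value ranges and is no substitute for a real Cron parser; the names are hypothetical.

```python
import re

# Characters commonly allowed in Quartz-style Cron fields (assumption):
# digits, names such as MON or JAN, and the symbols * ? / , - #
FIELD_PATTERN = re.compile(r"^[0-9A-Za-z*?/,\-#]+$")


def looks_like_cron(pattern):
    """Rough structural check: 6 or 7 fields, each using only allowed characters."""
    fields = pattern.split()
    if len(fields) not in (6, 7):
        return False
    return all(FIELD_PATTERN.match(field) is not None for field in fields)


# Default Cron map values from the table above.
DEFAULT_CRON_MAP = {
    "workflow-task": "0 59 23 * * ?",
    "active-users": "0 0/15 * * * ?",
    "term-count": "0 59 23 * * ?",
    "vocabulary-count": "0 59 23 * * ?",
    "page-hit": "0 0 * * * ?",
    "task-count": "0 0 * * * ?",
}

if __name__ == "__main__":
    for name, pattern in DEFAULT_CRON_MAP.items():
        print(name, pattern, looks_like_cron(pattern))
```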

Import configuration

The configuration for imports.

Setting Description
Rebuild hyperlinks after import
  • True (default): Automatically rebuild the hyperlinks after an import.
  • False: Do not rebuild the hyperlinks after an import.

Excel import configuration

The configuration of Excel import.

Setting Description
The default CSV separator character The default separator character of the CSV fields for complex relations.
The default CSV quote character The default quote character of the CSV fields for complex relations.

Number of rows per chunk of data

When importing views, the database is called repeatedly, each time importing a chunk of data from the import file. This option defines how many rows each chunk of data can contain.

Lower values reduce the burden on memory. Higher values require more memory, but may slightly increase the speed of the import.

The default value is 5,000.

Excel export configuration

The configuration of Excel export.

Setting Description
The default CSV separator character The default separator character of the CSV fields for complex relations.
The default CSV quote character The default quote character of the CSV fields for complex relations.

Number of rows per chunk of data

When exporting views, the database is called repeatedly, each time fetching a chunk of data to build the export file. This option defines how many rows each chunk of data can contain.

Lower values reduce the burden on memory. Higher values require more memory, but may slightly increase the speed of the export.

The default value is 5,000.

CSV export configuration

The configuration of CSV export.

Setting Description
Always use quotes
  • True: Use quotes for every cell in the CSV.
  • False (default): Only use quotes when necessary.

Number of rows per chunk of data

When exporting views, the database is called repeatedly, each time fetching a chunk of data to build the export file. This option defines how many rows each chunk of data can contain.

Lower values reduce the burden on memory. Higher values require more memory, but may slightly increase the speed of the export.

The default value is 5,000.
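
The combined effect of Always use quotes and Number of rows per chunk of data can be illustrated with Python's standard `csv` module. This is a sketch of the general technique and not the export code in Collibra. The function and parameter names are hypothetical, and slicing an in-memory list stands in for the repeated database calls.

```python
import csv
import io


def export_csv(rows, always_use_quotes=False, rows_per_chunk=5000):
    """Write rows to CSV text chunk by chunk, with the chosen quoting behavior."""
    # QUOTE_ALL quotes every cell; QUOTE_MINIMAL quotes only when necessary.
    quoting = csv.QUOTE_ALL if always_use_quotes else csv.QUOTE_MINIMAL
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=quoting, lineterminator="\n")
    for start in range(0, len(rows), rows_per_chunk):
        # Each slice stands in for one database call that fetches a chunk.
        chunk = rows[start:start + rows_per_chunk]
        writer.writerows(chunk)
    return buffer.getvalue()


if __name__ == "__main__":
    data = [["Name", "Note"], ["Customer", "a, b"]]
    print(export_csv(data))
    print(export_csv(data, always_use_quotes=True, rows_per_chunk=1))
```

The chunk size changes how much data is held in memory at once, but not the content of the resulting file.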

API call logging

The configuration of the API call logging.

Setting Description
Enabled
  • True: API call logging is enabled.
  • False (default): API call logging is disabled.
Pattern duration list The list of methods and a corresponding minimum duration time. The minimum duration time is the minimum time before the method is stored in the database.
  Minimum duration
The time in milliseconds that an API call must last before it is logged.
  Method pattern
The method that you want to log in the database. For each pattern that you want to log, you have to add a new pattern.

Security configuration

The configuration of security.

Setting Description
X-Frame options (Requires restart)

The content of the HTTP-header X-Frame-Options. This is set on all rendered pages and is used to avoid clickjacking attacks. By default, only pages with the same origin can use the rendered pages in a frame.

Limit user sessions
  • True: A user can only open one session.
  • False (default): A user can open multiple sessions.

Office research guest access

  • True: The Office research integration is always allowed guest access via REST, regardless of the general Guest access setting.
  • False (default): The general Guest access setting is kept.

Note Currently, the Office research integration is only available when Collibra Data Intelligence Cloud is publicly available, which is why this override setting is necessary.

Prevent advanced html features in text dashboard

Text widgets can contain full HTML. However, this means an attacker could potentially execute an XSS attack by injecting malicious HTML. For more information, see the Troubleshooting section.

  • True: Potentially dangerous HTML elements are removed from text attributes when you save the text field.
  • False (default): No HTML elements are removed from text attributes when you save the text field.

Note 

If you enable this setting, the following HTML elements are deleted when you save:

  • script (including JavaScript)
  • svg
  • frame
  • frameset
  • iframe
  • any event handlers

Guest access

  • True: Anyone who can access the URL has viewing rights to the system.
  • False (default): The user is asked to sign in before having access to any data.
Enable schema introspection
  • True: Schema fields are shown during an introspection.
  • False (default): Schema fields are hidden during an introspection.
Enable custom validation functions
  • True (default): Groovy scripts with custom validation functions can be loaded.
  • False: Groovy scripts with custom validation functions cannot be loaded.

LDAP

The configuration of an LDAP server to handle the authentication.

Setting Description
Enable LDAP integration (Requires restart)
  • True: The LDAP integration is enabled.
  • False (default): The LDAP integration is disabled.
Sync after restore
  • True (default): LDAP data is synchronized with Collibra when an initial data set is bootstrapped.
  • False: LDAP data is synchronized with Collibra only when the LDAP synchronization job is triggered.
User page size

The page size that is used when retrieving users during synchronization.

The default value is 500. You can set it to 0 to disable paging.

Note This is a global setting. If you are working with multiple LDAP servers, only the value for the main server is taken into account.

Group page size

The page size that is used when retrieving groups.

You can set it to 0 to disable paging.

Note This is a global setting. If you are working with multiple LDAP servers, only the value for the main server is taken into account.

Time limit

Specifies the time limit in milliseconds for all LDAP searches.

The default value is 120,000.

You can set it to 0 to disable the time limit.

Tip 
  • If you get Time limit Exceeded error messages, increase the default value or check why the LDAP search takes too long.
  • We recommend that you modify the User page size and Group page size settings before you modify this setting.
Sync job enabled
  • True (default): The synchronization job is enabled.
  • False: The synchronization job is disabled.
Sync job cron

The schedule to perform an LDAP synchronization (CRON).

The default value for this setting is daily at midnight.

If you create an invalid Cron pattern, Collibra Data Intelligence Cloud stops responding.

User field mapping The configuration mapping of all the user fields. This determines which LDAP field is mapped to which user field. Empty fields are ignored during the synchronization.
  Username
The unique user ID in the LDAP, typically UID. This is a mandatory field.
  Email
The corresponding email field in the LDAP directory. This is a mandatory field.
  First name
The first name field in the LDAP directory.
  Last name
The last name field in the LDAP directory.
  Middle name

The middle name field of the LDAP directory. This is usually givenName.

  Enabled
Indication whether a user is active or inactive in LDAP.
  Language

The language and locale of the user. It has to contain a language code and may contain a country code.

Examples: pl, en_US, nl_BE.

  Group

The LDAP property that defines to which groups the user belongs. If there is a group entry in the LDAP directory, use the Group field mapping settings.

  Additional email list

An additional email list.

  Instant messaging fields
The mapping for the user's IM locations.
  AIM
The mapping for the user's AOL IM account.
  Google Talk
The mapping for the user's Google Talk IM account.
  Icq
The mapping for the user's ICQ IM account.
  Jabber
The mapping for the user's Jabber IM account.
  Messenger
The mapping for the user's Live Messenger IM account.
  Skype
The mapping for the user's Skype IM account.
  Yahoo Messenger
The mapping for the user's Yahoo Messenger IM account.
  Website map
Enter the field value and field key to map a social media website.
  Phone
The mapping for the user's phone.
  Fax
The mapping for the user's fax number.
  Mobile
The mapping for the user's mobile number.
  Pager
The mapping for the user's pager number.
  Private
The mapping for the user's private number.
  Work
The mapping for the user's work number.
  Other
The mapping for any other phone number for this user.
  Home address
The mapping for the user's home address.
  Street
The mapping for the user's street.
  Number
The mapping for the user's number.
  City
The mapping for the user's city.
  Post code
The mapping for the user's postal code.
  State
The mapping for the user's state.
  Country
The mapping for the user's country.
  Work address
The mapping for the user's work address.
  Street
The mapping for the user's street.
  Number
The mapping for the user's number.
  City
The mapping for the user's city.
  Post code
The mapping for the user's postal code.
  State
The mapping for the user's state.
  Country
The mapping for the user's country.
  Gender
The mapping information for the user's gender.
  Mapping
The attribute key for the gender value. If the content equals one of the male or female mappings, the user will be saved as male or female. Otherwise a default of UNKNOWN will be used.
  Male value
The value for male users.
  Female value
The value for female users.
Group field mapping Groups can be defined as a separate structure or as a userField. The following section allows you to sync with a group structure that is unrelated to the user structure.
  Group name field

The name of the group to use in the application.

  Users field
The user DNs that are members of the group.

Password

The configuration of passwords.

Setting Description
Minimum length (Requires restart) The minimum length of passwords.
Maximum length (Requires restart) The maximum length of passwords.
Digits required (Requires restart)
  • True: Passwords have to contain one or more digits.
  • False (default): Passwords do not have to contain digits.
Non alphanumeric required (Requires restart)
  • True: Passwords have to contain one or more non-alphanumeric (special) characters.
  • False (default): Passwords do not have to contain non-alphanumeric characters.
Uppercase required (Requires restart)
  • True: Passwords have to contain one or more upper-case characters.
  • False (default): Passwords do not have to contain upper-case characters.
Lowercase required (Requires restart)
  • True: Passwords have to contain one or more lower-case characters.
  • False (default): Passwords do not have to contain lower-case characters.
Username disallowed (Requires restart)
  • True: Passwords cannot be the username.
  • False (default): Passwords can be the username.

Expiration interval (months)

The number of months before users have to change their passwords.

Set it to 0 if users never have to change their passwords.

Allowed login failures

The number of consecutive failed login attempts that are allowed before the user account is disabled.

Set it to 0 for unlimited attempts.

No reuse count

The number of previous passwords users cannot reuse. The default is 1: users cannot change their password to what it currently is.

Set this to 0 to allow using the same password.
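
The character and length rules above can be summarized as a simple check. The following sketch shows how such a policy could be evaluated, not how Collibra evaluates it. The dictionary keys and function name are hypothetical, and "non-alphanumeric" is assumed to mean any character for which `str.isalnum()` is false.

```python
def password_violations(password, username, policy):
    """Return a list of the policy rules that the password violates."""
    violations = []
    if len(password) < policy["minimum_length"]:
        violations.append("too short")
    if len(password) > policy["maximum_length"]:
        violations.append("too long")
    if policy["digits_required"] and not any(c.isdigit() for c in password):
        violations.append("digit required")
    if policy["non_alphanumeric_required"] and all(c.isalnum() for c in password):
        violations.append("non-alphanumeric character required")
    if policy["uppercase_required"] and not any(c.isupper() for c in password):
        violations.append("upper-case character required")
    if policy["lowercase_required"] and not any(c.islower() for c in password):
        violations.append("lower-case character required")
    if policy["username_disallowed"] and password == username:
        violations.append("password equals username")
    return violations


# Example policy with every requirement switched on (illustrative values).
STRICT_POLICY = {
    "minimum_length": 8,
    "maximum_length": 64,
    "digits_required": True,
    "non_alphanumeric_required": True,
    "uppercase_required": True,
    "lowercase_required": True,
    "username_disallowed": True,
}

if __name__ == "__main__":
    print(password_violations("Summer2024!", "jdoe", STRICT_POLICY))
    print(password_violations("jdoe", "jdoe", STRICT_POLICY))
```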

REST

The security configuration of the REST interface.

Setting Description
CSRF token enabled
  • True: The validity of a request is checked with a CSRF token.
  • False (default): The validity of a request is not checked with a CSRF token.
Referrer enabled
  • True: The HTTP referrer header is used to identify the origin of the request.
  • False (default): The HTTP referrer header is not used to identify the origin of the request. It is recommended to leave this option disabled.
Referrer checking allow empty
  • True (default): The HTTP referrer header can be empty.
  • False: The HTTP referrer header cannot be empty.

SSL

The configuration of SSL.

Setting Description
Key store name The name of the keystore file. The file is expected to be in the <collibra_data>/dgc/security folder.
Key store password The password of the keystore.
Key store type The type of the keystore file. For example, JKS or PKCS12.
Trust store name The name of the truststore file. The file is expected to be in the <collibra_data>/dgc/security folder.
Trust store password The password of the truststore.
Trust store type The type of the truststore file. For example, JKS or PKCS12.

SSO

The configuration of Single Sign-On (SSO) authentication.

Setting Description
Mode

The SSO mode of Collibra.

The possible values are:

  • SAML_ATTRIBUTES
  • SAML_LDAP
  • SSO_HEADER
  • SSO_HEADER_LDAP
  • DISABLED
Header

The name of the header to be checked. The contents of this header are used for the search query, which is SSO_HEADER = username.

The value of the actual query depends on DN and possibly Attribute.

DN

If the SSO mode is SSO_HEADER_LDAP or SAML_LDAP, this field determines whether the distinguished name (DN) or attribute is used:

  • True: The header has to contain the distinguished name (DN) of the user in the LDAP.
  • False (default): The header has to contain the value of Attribute.

If the SSO mode is DISABLED, SSO_HEADER or SAML_ATTRIBUTES, this field is ignored.

Attribute

This field is only used if the SSO mode is SSO_HEADER_LDAP or SAML_LDAP, and if DN is False.

If the above criteria are met, the LDAP has to contain this value.

Disable automatic user creation when signing in via SSO

If users try to sign in via SSO, they still need a user account in Collibra. You can either create the user accounts automatically when they sign in, or create the user accounts manually or via LDAP synchronization.

  • True: User accounts are not created automatically.
  • False (default): User accounts are created automatically.

Disable the Collibra signin page

When SSO is enabled, a user can still navigate to the /signin page and try to log in via that page. However, you can disable that page.

  • True: Users cannot access the Collibra signin page.
  • False (default): Users can access the Collibra signin page.
SAML The configuration of SAML.
  Metadata HTTP
The URL of the SAML metadata file to be used. The URL always has to be reachable by the Collibra environment.
  Entity Provider Entity ID

The entity ID inside the metadata to be referenced.

Note A metadata file can describe multiple entity IDs. Make sure to use the entity ID from the correct metadata file.

  Attribute fields

The mappings of attributes in the SAML response. The values are used as keys to look for in the SAML response.

Examples of attribute fields are first name, last name, address information, phone numbers and so on.

  First name

The mapping for the user's first name.

This attribute is optional. The value can be empty.

  Last name

The mapping for the user's last name.

This attribute is optional. The value can be empty.

  Email

The mapping for the user's email address.

This attribute is optional for existing users, but mandatory for new users.

Warning If the email address is invalid when you synchronize, the user is deactivated and the user information is not updated.

  Enabled
The mapping that indicates whether the account of the incoming user is enabled.
  Group

The mapping (attribute) which indicates to which Collibra groups the user should be added. If the groups don't exist yet, they will be created. This attribute can have multiple values (groups) or the groups can be sent as a comma-separated list of groups.

If passing groups in this attribute, you must set Groups DGC Managed to False.

  Phone
The mapping for the user's phone.
  Fax
The mapping for the user's fax number.
  Mobile
The mapping for the user's mobile number.
  Pager
The mapping for the user's pager number.
  Private
The mapping for the user's private number.
  Work
The mapping for the user's work number.
  Other
The mapping for any other phone number for this user.
  Home address
The mapping for the user's home address.
  Street
The mapping for the user's street.
  Number
The mapping for the user's number.
  City
The mapping for the user's city.
  Post code
The mapping for the user's postal code.
  State
The mapping for the user's state.
  Country
The mapping for the user's country.
  Work address
The mapping for the user's work address.
  Street
The mapping for the user's street.
  Number
The mapping for the user's number.
  City
The mapping for the user's city.
  Post code
The mapping for the user's postal code.
  State
The mapping for the user's state.
  Country
The mapping for the user's country.
  Instant messaging
The mapping for the user's IM locations.
  AIM
The mapping for the user's AOL IM account.
  Google Talk
The mapping for the user's Google Talk IM account.
  Icq
The mapping for the user's ICQ IM account.
  Jabber
The mapping for the user's Jabber IM account.
  Messenger
The mapping for the user's Live Messenger IM account.
  Skype
The mapping for the user's Skype IM account.
  Yahoo Messenger
The mapping for the user's Yahoo Messenger IM account.
  Gender
The mapping information for the user's gender.
  Mapping
The attribute key for the gender value. If the content equals one of the male or female mappings, the user will be saved as male or female. Otherwise a default of UNKNOWN will be used.
  Male value
The value for male users.
  Female value
The value for female users.
  Groups DGC managed

Determines whether groups are managed by Collibra or set by the SAML assertion (SAML+Attributes mode).

This option is only relevant if Mode is SAML_ATTRIBUTES.

  • True: The groups are fully managed by Collibra. In the UI the admin has the option to assign groups to users, without it being overwritten by SAML.
  • False (default): The groups are managed by the SAML assertions. In this case the groups are managed by the SAML IDP. Be sure to configure the Group attribute in the Attribute Fields section.
  Service Provider Entity ID

Field that determines the value of the Entity ID parameter in the service provider metadata returned by Collibra. The default value is empty, in which case Collibra uses the value of the Base URL field.

Enter a custom value if the base URL does not match the audience configured in your SAML identity provider.

Warning The value of the audience restriction in the SAML response has to be exactly the same as the value of this field.

Note SSO does not work if the Service Provider Entity ID field contains the base URL with a trailing forward slash (for example www.collibra.com/), and the audience of your IDP contains the base URL without a trailing forward slash (for example www.collibra.com).
Both values need to be exactly the same. In this case, you can resolve the issue by changing the value in the configuration of your IDP, or the value of this field. It does not matter whether both have a trailing forward slash or not, as long as they contain the same value.

  Sign authentication requests (Requires restart)
  • True: Authentication requests have to be signed.
  • False (default): Authentication requests don't have to be signed.
  Force authn
  • True (default): The SP authentication request forces re-authentication.
  • False: The SP authentication request does not force re-authentication.
  Force passive
  • True: The reauthentication has to happen in the background.
  • False (default): The reauthentication does not have to happen in the background.

This is only relevant if Force authn is True.

  Name ID

Name ID that is used in the SP authentication. The default value is urn:oasis:names:tc:SAML:2.0:nameid-format:persistent.

The Name ID value is mandatory.

  Name ID allow create
  • True (default): The IDP can create a name ID to fulfill the SP authentication request.
  • False: The IDP cannot create a name ID to fulfill the SP authentication request.
  Disable client address
  • True: The validation of the client IP address in the assertion message is disabled.
  • False (default): The validation of the client IP address in the assertion message is enabled.
  SAML Requested authentication context

Settings for the SAML requested authentication context. The IDP uses the authentication context to authenticate the user. By default, the authentication context mandates user/password authentication over HTTPS.

  Disable
  • True: The requested authentication context section is not sent in the SAML request.
  • False (default): The requested authentication context section is sent in the SAML request.
  Comparison type

The comparison type that is transmitted in the requested authentication context.

Possible values:

  • minimum
  • maximum
  • better
  • exact (default value)

For more information about the comparison type values, refer to the SAML specifications.

  Reference list

The list of class references in the requested authentication context. You can separate list items with the pipe character (|).

For more information about this list, refer to the SAML specifications.
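As an illustration of how a comparison type and a pipe-separated reference list relate to the SAML request, the following Python sketch builds a RequestedAuthnContext element as defined in the SAML 2.0 core specification. The function name is hypothetical, and this is not necessarily how Collibra builds the request internally.

```python
import xml.etree.ElementTree as ET

SAMLP_NS = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion"
COMPARISON_TYPES = {"minimum", "maximum", "better", "exact"}


def build_requested_authn_context(reference_list: str, comparison: str = "exact") -> ET.Element:
    """Build a RequestedAuthnContext element from a pipe-separated list of class references."""
    if comparison not in COMPARISON_TYPES:
        raise ValueError(f"Unsupported comparison type: {comparison}")
    context = ET.Element(f"{{{SAMLP_NS}}}RequestedAuthnContext", {"Comparison": comparison})
    for reference in reference_list.split("|"):
        reference = reference.strip()
        if reference:  # Skip empty items, for example after a trailing pipe.
            child = ET.SubElement(context, f"{{{SAML_NS}}}AuthnContextClassRef")
            child.text = reference
    return context


if __name__ == "__main__":
    refs = (
        "urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport"
        "|urn:oasis:names:tc:SAML:2.0:ac:classes:Password"
    )
    print(ET.tostring(build_requested_authn_context(refs), encoding="unicode"))
```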

  Declaration list

The list of class declarations in the requested authentication context. You can separate list items with the pipe character (|).

For more information about this list, refer to the SAML specifications.

  Response decryption mode

Enable the support for encrypted SAML responses.

  • DISABLED: Collibra only accepts plain-text SAML responses.
  • OPTIONAL: Collibra can handle both encrypted and plain-text SAML responses.
  • FORCED: Collibra only accepts encrypted SAML responses.

Once OPTIONAL or FORCED is selected, the encryption key pair is generated and added to the Collibra SAML keystore. A self-signed certificate is also generated, which works in most situations. If your IDP rejects self-signed certificates, you have to add a certificate that is signed by a trusted third party.

  Validity period of the SAML certificate

The validity period of the SAML certificate, in years.

By default, the SAML certificate expires after 20 years.

Signout

The configuration of redirecting after signing out of Collibra.

Setting Description
Override signout URL (Requires restart)
  • True: Redirect the user to a specific website after signing out.
  • False (default): Redirect the user to the sign-in page after signing out.
Signout redirect URL (Requires restart) The URL to be redirected to when signing out.

Import/Export

The configuration to avoid the Formula Injection vulnerability in Excel.

Setting Description

Escape Excel formulas

The option to disable Formula Injection into Excel. When enabling this option, an escape character is added at the beginning of Excel formulas during the export and is removed when importing formulas.

The escape character will be added to fields that start with one of the following characters:

  • equals sign: =
  • plus: +
  • minus: -
  • at-sign: @

This option is enabled by default.

Excel formulas escape character

The escape character for Excel formulas when exporting or importing data.
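The export and import behavior described for Escape Excel formulas can be sketched in Python as follows. The function names are hypothetical, and the single quote used as the escape character is only an assumption for the example; the actual character is whatever is configured in the Excel formulas escape character setting.

```python
# Characters that mark a cell value as a potential Excel formula.
FORMULA_PREFIXES = ("=", "+", "-", "@")


def escape_on_export(value: str, escape_char: str = "'") -> str:
    """Prefix the escape character if the value starts with a formula character."""
    if value.startswith(FORMULA_PREFIXES):
        return escape_char + value
    return value


def unescape_on_import(value: str, escape_char: str = "'") -> str:
    """Remove the escape character if it precedes a formula character."""
    if value.startswith(escape_char) and value[len(escape_char):].startswith(FORMULA_PREFIXES):
        return value[len(escape_char):]
    return value


if __name__ == "__main__":
    exported = escape_on_export("=SUM(A1:A3)")
    print(exported)
    print(unescape_on_import(exported))
```

Values that do not start with one of the four characters pass through both functions unchanged.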

JWT

The JSON Web Token configuration.

Setting Description
JSON Web Key Set URL

The URL to retrieve public key information needed to verify the authenticity of JSON Web Tokens (JWTs), issued by an authorization server.

This setting is required to enable JWT authentication.

JWT Token Types

A case-insensitive comma-separated list of accepted JWT media types coming in the typ header parameter.

Leave blank if the authorization server does not provide a media type parameter.

The default value is at+jwt,jwt.

JWT Algorithms

A comma-separated list of accepted JWT algorithms coming in the alg header parameter. See https://tools.ietf.org/html/rfc7518#section-3.1 for details.

Leave blank to accept all digital signature algorithms.

JWT Issuer

The accepted issuer coming in the iss JWT claim.

Leave blank if the authorization server does not provide an issuer claim.

JWT Audience

A comma-separated list of accepted audience values for the aud claim.

The value for this field is a configuration setting in your authorization server, which identifies your Collibra environment as the intended recipient of the JWT.

Leave blank if the authorization server does not provide an audience claim.

JWT Principal ID Claim Name

The name of the JWT claim containing the principal's identity. See https://tools.ietf.org/html/rfc7519#section-4.1.2 for details.

Defaults to the standard subject claim, sub.

Change this setting only if your authorization server has other means of identifying the principal, for example, a client_id claim.

This setting is required if JWT authentication is enabled.

JWT Maximum Clock Skew

The maximum acceptable difference in seconds between the clocks of the machines running the authorization server and Collibra.

Differences smaller than the given amount are ignored when performing time comparisons for token validation.

The default value is 60 seconds if left blank.
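To show how these settings could interact during token validation, here is a minimal Python sketch that checks an already decoded JWT header and claim set. It is an illustration under assumptions, not Collibra's implementation: the function and parameter names are hypothetical, signature verification against the JSON Web Key Set is deliberately left out, and rejecting a token without a typ header when token types are configured is an assumption.

```python
import time


def check_jwt(header, claims, *, token_types=("at+jwt", "jwt"), algorithms=(),
              issuer="", audiences=(), principal_claim="sub",
              max_clock_skew=60, now=None):
    """Return True if the decoded header and claims satisfy the configured settings."""
    now = time.time() if now is None else now

    # JWT Token Types: case-insensitive match on the typ header; blank skips the check.
    if token_types:
        typ = header.get("typ")
        if typ is None or typ.lower() not in {t.lower() for t in token_types}:
            return False

    # JWT Algorithms: blank accepts any algorithm.
    if algorithms and header.get("alg") not in algorithms:
        return False

    # JWT Issuer: blank skips the check.
    if issuer and claims.get("iss") != issuer:
        return False

    # JWT Audience: the aud claim may be a single string or a list of strings.
    if audiences:
        aud = claims.get("aud")
        aud_values = [aud] if isinstance(aud, str) else list(aud or [])
        if not set(aud_values) & set(audiences):
            return False

    # JWT Principal ID Claim Name: the claim must be present and non-empty.
    if not claims.get(principal_claim):
        return False

    # JWT Maximum Clock Skew: differences smaller than the skew are ignored.
    if "exp" in claims and now > claims["exp"] + max_clock_skew:
        return False
    if "nbf" in claims and now < claims["nbf"] - max_clock_skew:
        return False
    return True


if __name__ == "__main__":
    header = {"typ": "JWT", "alg": "RS256"}
    claims = {"sub": "jane.doe", "exp": 970}
    # The token expired 30 seconds before "now", which is within the 60-second skew.
    print(check_jwt(header, claims, now=1000))
```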

Whitelists

The configuration for whitelist placeholders that can be used in security headers.

Option Description
connect-src whitelist The 'connect-src' whitelist. To use this whitelist in a security header, use the '{connectSrcWl}' placeholder.
font-src whitelist The 'font-src' whitelist. To use this whitelist in a security header, use the '{fontSrcWl}' placeholder.
frame-src whitelist The 'frame-src' whitelist. To use this whitelist in a security header, use the '{frameSrcWl}' placeholder.
img-src whitelist The 'img-src' whitelist. To use this whitelist in a security header, use the '{imgSrcWl}' placeholder.
script-src whitelist The 'script-src' whitelist. To use this whitelist in a security header, use the '{scriptSrcWl}' placeholder.
style-src whitelist The 'style-src' whitelist. To use this whitelist in a security header, use the '{styleSrcWl}' placeholder.
frame-ancestors whitelist The 'frame-ancestors' whitelist. To use this whitelist in a security header, use the '{frameAncestorsWl}' placeholder.
Tableau frame-ancestors whitelist The tableau 'frame-ancestors' whitelist. To use this whitelist in a security header, use the '{tableauFrameAncestorsWl}' placeholder.
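The placeholder mechanism can be pictured as a simple text substitution in a security header value. The Python sketch below is an assumption-based illustration: the function name, the header template, the example hosts, and the choice to join whitelist entries with spaces are all hypothetical. Only the placeholder names come from the table above.

```python
def render_security_header(template: str, whitelists: dict) -> str:
    """Replace each '{placeholder}' in the header template with its whitelist entries."""
    rendered = template
    for placeholder, sources in whitelists.items():
        rendered = rendered.replace("{" + placeholder + "}", " ".join(sources))
    return rendered


if __name__ == "__main__":
    template = "default-src 'self'; connect-src 'self' {connectSrcWl}; img-src 'self' {imgSrcWl}"
    whitelists = {
        "connectSrcWl": ["https://api.example.com"],
        "imgSrcWl": ["https://images.example.com", "data:"],
    }
    print(render_security_header(template, whitelists))
```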

Collibra Connect

The configuration to communicate with Collibra Connect.

Setting Description
Base URL The URL to Collibra Connect.
Username The username to connect to Collibra Connect.
Password The password to connect to Collibra Connect.

Register data source

Global parameters that apply to Data Source Registration.

Setting Description
Table types to ignore A comma-separated list of table types that are not ingested. For example, INDEX and SEQUENCE.

AWS regions restriction

A list of AWS regions Data Catalog is allowed to connect to. For example, eu-west-3 and us-east-2. For a list of all AWS locations, see the AWS documentation.

If you want to allow Collibra to make a connection to any AWS region, leave the field empty.

Database registration via Edge

An option to enable database registration via Edge.

  • True: Register a data source via Edge.
  • False: Register a data source via Jobserver only.

Note Enabling data source registration via Edge does not prevent you from registering a data source via Jobserver as well.

Data Quality Synchronization UI via DQ Connector on Edge

An option to enable the Data Quality extraction interface in Collibra.

  • True: The Quality extraction tab is available on the configuration page of a database asset.
  • False (default): The Quality extraction tab is not available and as such, it is not possible to extract and synchronize data quality information.

Jobserver (*)

The configuration of the Jobserver service.

Setting Description

Jobserver list

The list of registered Jobserver instances.

Name

The name of the Jobserver as it will appear when you register a data source in Data Catalog.

The name is a freely chosen name but it is recommended to only use alphanumerical characters and dashes, for example Jobserver-1.

You will have to use this name as the ID of the gateway and in the address of this configuration.

Protocol

The protocol that is used for the communication between the Data Governance Center service and the Jobserver service.

It is recommended to use HTTPS, especially if the services are hosted in different network segments.

Address

The address (IP address, URL, hostname) of the Jobserver.

Trusted server CA certificate

The certificate of the trusted CA needed to validate the server certificate. If blank, the default truststore will be used. The default truststore is defined in the SSL configuration section of the DGC service.

The CA certificate of the server party (Jobserver).

Client certificate

The client certificate offered by the DGC service to the server. If blank, you cannot select mutual authentication as the Jobserver service authentication level.

Client private key

The private key of the DGC service's certificate.

Table profiling data size

The approximate maximum disk size of the data in MB that will be used to profile a table. The value cannot exceed 10,000.

Test connection timeout

This timeout is a time limit (in seconds) after which the connection test is stopped and a timeout error is shown. The default value is 60 seconds.

Data profiling

The global configuration of Data Profiling. Profiling must be executed again after a change in this section.

Setting Description
Maximum number of samples The maximum number of rows taken as a sample during profiling.
Maximum value length The maximum length of a value extracted during profiling or sampling. Additional characters are trimmed.
Default date pattern The default format used to decode dates. It is the default pattern used for detecting dates when the Date Pattern and/or Time Pattern attribute is not specified in Column assets.
Default time pattern The default format used to decode times. It is the default pattern used for detecting times when the Date Pattern and/or Time Pattern attribute is not specified in Column assets.
Default combined date and time pattern The default format used to decode combined dates and times. It is the default pattern used for detecting combined dates and times when the Date Pattern and/or Time Pattern attribute is not specified in Column assets.
Empty values

A comma-separated list of strings enclosed in double quotes, for example "", "na" and "none". A value that matches one of those strings is considered an empty value.

Note that a database null value is always considered an empty value.
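As a rough illustration of the Empty values setting, the following Python sketch parses a comma-separated list of double-quoted strings and tests values against it. The function names are hypothetical, a database null is represented by None, and exact, case-sensitive matching is an assumption.

```python
import csv


def parse_empty_values(setting: str) -> list:
    """Parse a comma-separated list of double-quoted strings."""
    # skipinitialspace lets the parser accept a space after each comma.
    return next(csv.reader([setting], skipinitialspace=True))


def is_empty(value, empty_values) -> bool:
    """A database null (None) is always empty; other values must match the list."""
    return value is None or value in empty_values


if __name__ == "__main__":
    empty_values = parse_empty_values('"", "na", "none"')
    for candidate in (None, "", "na", "Brussels"):
        print(repr(candidate), is_empty(candidate, empty_values))
```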

Data type detection threshold The fraction of Column values that must match for an Advanced Data Type to be considered a possible Data Type for that Column. This is expressed as a value between 0.0 and 1.0.
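The threshold can be read as a minimum share of matching values. The Python sketch below illustrates that idea under assumptions: the function name and the regular expression are hypothetical, and whether the comparison is inclusive is not stated in this documentation.

```python
import re


def is_candidate_data_type(values, pattern: str, threshold: float) -> bool:
    """Return True if the share of values matching the pattern reaches the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if not values:
        return False
    matches = sum(1 for value in values if re.fullmatch(pattern, value))
    return matches / len(values) >= threshold


if __name__ == "__main__":
    column_values = ["12345", "54321", "abc", "99999"]
    # Three out of four values match a five-digit pattern.
    print(is_candidate_data_type(column_values, r"\d{5}", 0.75))
    print(is_candidate_data_type(column_values, r"\d{5}", 0.8))
```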

Anonymize data

An option to anonymize sensitive data.

  • True: Content in columns with data type Text or Geo is removed or replaced by a random hash value before the profiling results are sent to the cloud.
  • False (default): No content is removed or replaced by a random hash value.
Database profiling via Edge

An option to enable profiling and classifying synchronized metadata via Edge instead of Jobserver.

  • True: Profile and classify via Edge.
  • False: Profile via Jobserver and classify via the Data Classification Platform.

Note You can only enable Database profiling via Edge if you also enabled Database registration via Edge.

Beta features

The configuration of features in beta state.

Setting Description
Tableau provisioning enabled
  • True: Provisioning to Tableau is enabled.
  • False (default): Provisioning to Tableau is disabled.

Max number of concurrent import jobs

The maximum number of import jobs that can be executed at the same time via the API. This is to avoid memory issues.

The default value is 4. Set the value to 0 to remove the limit.

Task sidebar

  • True: Workflow tasks appear in the sidebar on both resource pages and the task management page. Task forms appear in the sidebar instead of dialog boxes. Users can seamlessly complete their tasks from the task management page and have a side-by-side view of the tasks and resource details on resource pages.
  • False (default): Workflow tasks appear in the task bar on resource pages and in a sidebar on the task management page. Task forms appear in dialog boxes. The behavior is the same as with older versions of Collibra.
Settings landing enabled
  • True: Show the new Settings landing page in your Collibra environment.
  • False (default): Use the classic Settings page in your Collibra environment.
Search reindex using Output Module
  • True: Enable the use of the Output Module when executing a search reindex.
  • False (default): Disable the use of the Output Module when executing a search reindex.

Setting

Description

Throttling enabled
  • True: REST API v1 throttling is enabled.
  • False (default): REST API v1 throttling is disabled.
Number of requests The number of allowed requests for the configured number of seconds.
Number of seconds The number of seconds during which the configured number of requests can be performed.
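One way to picture how Number of requests and Number of seconds work together is a fixed-window counter, sketched below in Python. This is an assumption for illustration only: the documentation does not state which throttling algorithm Collibra uses, and the class and method names are hypothetical.

```python
class FixedWindowThrottle:
    """Allow at most number_of_requests calls per window of number_of_seconds."""

    def __init__(self, number_of_requests: int, number_of_seconds: float):
        self.number_of_requests = number_of_requests
        self.number_of_seconds = number_of_seconds
        self.window_start = None
        self.count = 0

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time 'now' (in seconds) is allowed."""
        # Start a new window on the first request or when the current window has elapsed.
        if self.window_start is None or now - self.window_start >= self.number_of_seconds:
            self.window_start = now
            self.count = 0
        if self.count < self.number_of_requests:
            self.count += 1
            return True
        return False


if __name__ == "__main__":
    throttle = FixedWindowThrottle(number_of_requests=2, number_of_seconds=10)
    for timestamp in (0, 1, 2, 10):
        print(timestamp, throttle.allow(timestamp))
```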

Setting

Description

Throttling enabled
  • True: REST API v2 throttling is enabled.
  • False (default): REST API v2 throttling is disabled.
Number of requests The number of allowed requests for the configured number of seconds.
Number of seconds The number of seconds during which the configured number of requests can be performed.

Setting

Description

Throttling enabled
  • True: GraphQL throttling is enabled.
  • False (default): GraphQL throttling is disabled.
Number of requests The number of allowed requests for the configured number of seconds.
Number of seconds The number of seconds during which the configured number of requests can be performed.

Graph query

The configuration of the Graph query engine which is used to retrieve data from the repository.

Graph query limits

Setting

Description

Enables limiting of the number of root nodes in result
  • True: Enable limiting the number of root elements as result of a Graph query.
  • False (default): Disable limiting the number of root elements as result of a Graph query.
Maximum number of root nodes that can be requested with graph query API

The maximum number of root nodes that you can request in the view configuration of an API call (REST or workflow).

If you exceed this value in the view configuration, an exception is shown. If no value is defined in the view configuration, then the default value is taken.

The default value is 100,000.

Note If the number of asset types or domain types exceeds the set number, the hierarchy will be incomplete. Make sure that the limit is always higher than the actual number of asset and domain types.

Graph query timeouts

Setting

Description

Maximum number of minutes a graph query can run

The maximum number of minutes that the graph query runs before it will time out. The maximum is 1,440 minutes (1 day).

The default value is 480.

Table

The configuration of tables.

Setting Description
Time limit for loading data in tables in seconds

The time limit after which a table stops loading on a page.

Example A value of 600 means that if a table hasn’t loaded within 600 seconds, the task is canceled and a timeout error is shown.

The default value is 60 seconds. The maximum value is 720 seconds.

Multi-column sort

The configuration of multi-column sorting.

Setting Description
Multi-column sorting on tables
  • True: Tables can be sorted on multiple columns.
  • False (default): Tables can be sorted on one column.
Number of columns available for multi-sort

Type the maximum number of columns that can be used to simultaneously sort tables.

The default value is 3, the minimum is 1, the maximum is 9.

This setting is only relevant if Multi-column sorting on tables is True.
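Conceptually, multi-column sorting orders rows by the first selected column and breaks ties with the following columns, up to the configured maximum. The Python sketch below illustrates this; the function name, the row structure, and the choice to raise an error when too many columns are requested are assumptions.

```python
def multi_column_sort(rows, sort_columns, max_columns: int = 3):
    """Sort rows by several columns, respecting the configured maximum."""
    if not 1 <= max_columns <= 9:
        raise ValueError("max_columns must be between 1 and 9")
    if len(sort_columns) > max_columns:
        raise ValueError(f"At most {max_columns} columns can be used for sorting")
    # A tuple key sorts on the first column, then uses later columns as tie-breakers.
    return sorted(rows, key=lambda row: tuple(row[column] for column in sort_columns))


if __name__ == "__main__":
    rows = [
        {"domain": "Sales", "name": "Revenue", "status": "Accepted"},
        {"domain": "Finance", "name": "Profit", "status": "Candidate"},
        {"domain": "Sales", "name": "Churn", "status": "Candidate"},
    ]
    for row in multi_column_sort(rows, ["domain", "name"]):
        print(row["domain"], row["name"])
```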

Inherited responsibilities

Setting Description
Enable Inherited Responsibilities
  • True: Show inherited responsibilities on asset views.
  • False (default): Do not show inherited responsibilities on asset views.

Note This setting only affects asset views and tile sets. It does not affect the Responsibilities tab page of asset pages.

Cloud Data Classification configuration

With data classification you can automatically assign data classes to ingested data.

Note In a Collibra Data Intelligence Cloud environment, you have to create a support ticket to configure this feature.

Setting

Description

Enable Data Classification

  • True: Enable Collibra's data classification technology.
  • False (default): Do not use Collibra's data classification technology.

Reporting

For more information about these settings, see Introduction to the reporting data layer.

Setting

Description

Cloud Provider

The cloud provider: AWS or GCP.

Customer GUID

The GUID of your Collibra environment.

Note This field is configured by Collibra Cloud Ops.

Collibra Insights download bucket name

The name of the AWS S3 bucket in which your reporting data is stored.

Note This field is configured by Collibra Cloud Ops.

Collibra Insights AWS S3 Region

The AWS S3 region in which your data is processed.

Note This field is configured by Collibra Cloud Ops.

Collibra Insights zip location pattern

A pattern with the format "/zip/insights_%s.zip", where "%s" is replaced by the Collibra Insights snapshot date.

Note This field is configured by Collibra Cloud Ops.
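The "%s" substitution in the zip location pattern can be illustrated with Python's printf-style string formatting. The function name and the snapshot date format used below are assumptions; the actual date format is not specified in this documentation.

```python
def zip_location(pattern: str, snapshot_date: str) -> str:
    """Replace the '%s' in the pattern with the snapshot date."""
    return pattern % snapshot_date


if __name__ == "__main__":
    # The date format is hypothetical and only serves as an example.
    print(zip_location("/zip/insights_%s.zip", "2021-06-30"))
```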

Tableau report URL pattern

The Tableau URL pattern, which should contain {reportName}.

Tip You can paste the URL from the Link field in Tableau, as described in Generate the dashboard reports you configured in Collibra Data Intelligence Cloud Settings.
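The {reportName} placeholder in the URL pattern can be thought of as a simple text substitution, as in the following Python sketch. The function name and the Tableau server URL are hypothetical; only the placeholder name comes from this documentation.

```python
def report_url(url_pattern: str, report_name: str) -> str:
    """Replace the {reportName} placeholder with the report name."""
    if "{reportName}" not in url_pattern:
        raise ValueError("The URL pattern should contain {reportName}")
    return url_pattern.replace("{reportName}", report_name)


if __name__ == "__main__":
    # Hypothetical Tableau URL pattern.
    pattern = "https://tableau.example.com/views/{reportName}/Dashboard"
    print(report_url(pattern, "DataMaturityDashboard"))
```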

Reports definitions

  Report view name

The report name, as you want it to appear on the report button in the Collibra Insights widget, for example "Data Maturity Dashboard".

  Report name

The report name, as it appears in the URL of the Tableau report, for example "DataMaturityDashboard".

Catalog Experience

    Data Catalog Experience improves the layout of Data Catalog's asset pages.

    Setting

    Description

    Enable Data Catalog experience

    • True: Data Catalog experience is enabled. This will improve the layout of Data Catalog's asset pages, such as those of Data Set, Schema, Table and Column assets.
    • False: Data Catalog experience is disabled.

    Data Catalog Experience Titlebar theme

    The theme for the Data Catalog experience. You can choose between LIGHT and DARK.

    This option is only applicable if the Enable Data Catalog experience option is enabled.

    Diagrams

    These settings determine diagram loading time and size limits.

    Setting

    Description

    Maximum loading time for the back end

    The time limit, in seconds, after which a diagram stops fetching data.

    The value must be a positive integer and cannot be greater than 3,600 (one hour).

    The default value is 300.

    Example A value of 300 means that if a diagram hasn’t fetched all data within 300 seconds, the diagram stops fetching data and an empty diagram with a notification is shown.

    Size limit for the backend

    The maximum number of nodes plus edges that will be fetched by the backend, to build a diagram.

    The value must be a positive integer and cannot be greater than 100,000.

    The default value is 10,000.

    Example A value of 10,000 means that if the total number of nodes plus edges is greater than 10,000, the diagram does not load and a notification is shown.

    Size limit for the frontend

    The maximum number of visible nodes plus edges that can be shown on the page.

    The value must be a positive integer and cannot be greater than 10,000.

    The default value is 2,000.

    Example A value of 2,000 means that if the total number of visible nodes and edges is greater than 2,000, the diagram does not load and a notification is shown.

    Maximum flow depth

    The system-wide maximum number of flow relations between the start node and any other diagram node.

    The value must be an integer between 1 and 100.

    The default value is 50.

    Note 
    • If the maximum flow depth is specified in the selected diagram view, that value supersedes the maximum you specify here.
    • You can also manually adjust the flow depth in the diagram.
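The allowed ranges and default values of the four diagram settings above can be summarized in a small validation sketch. The Python code below is an illustration only: the dictionary keys and the function name are hypothetical, while the limits and defaults are taken from the descriptions above.

```python
# (minimum, maximum, default) per setting, as described in this section.
DIAGRAM_LIMITS = {
    "max_loading_time_seconds": (1, 3600, 300),
    "backend_size_limit": (1, 100_000, 10_000),
    "frontend_size_limit": (1, 10_000, 2_000),
    "max_flow_depth": (1, 100, 50),
}


def validate_diagram_settings(settings: dict) -> dict:
    """Return the settings with defaults applied, or raise if a value is out of range."""
    validated = {}
    for name, (minimum, maximum, default) in DIAGRAM_LIMITS.items():
        value = settings.get(name, default)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an integer")
        if not minimum <= value <= maximum:
            raise ValueError(f"{name} must be between {minimum} and {maximum}")
        validated[name] = value
    return validated


if __name__ == "__main__":
    print(validate_diagram_settings({"max_flow_depth": 10}))
```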

    Diagrams Business Qualifier Filter (*)

    • True: Users can filter diagrams by a specified Business Qualifier asset.
    • False (default): Users are unable to filter diagrams by Business Qualifier.

    Tableau Metadata API

    You need the Tableau metadata API to ingest Tableau 2020.2 and newer.

    Warning If you upgrade to Tableau version 2020.2 or newer, but previously synchronized an older Tableau version via the REST API and XML mapping, you have to prepare the migration procedure to prevent losing manually added relations, attributes, tags, comments and stitching results.

    Setting

    Description

    Enable Tableau metadata API

    • True: Tableau metadata API is enabled. This enables you to ingest Tableau 2020.2 or newer into Data Catalog.
    • False: Tableau metadata API is disabled. If you ingest Tableau 2020.2 or newer, the ingestion will fail. This prevents data loss of manually added relations and attributes.

    Backup configuration management

    Setting Description
    Backup service URL The URL of the backup service.

    Job Service (Activities)

    Setting Description
    Number of executor threads for the Job Service

    The maximum number of threads, or jobs, that the Job Service can run in parallel.

    Generally speaking, increasing the number of jobs running in parallel reduces overall processing time. However, it requires more system resources, which can negatively impact performance. It also increases the risk of job conflicts.