Search examples

Important In Collibra 2024.05, we launched a new user interface (UI) for Collibra Platform! You can learn more about this latest UI in the UI overview.

This topic explains how certain configuration settings can affect your search results. The topic is intended to:

  • Help Collibra Platform administrators understand how the search configuration settings affect the search results.
  • Help other Collibra users understand why certain search queries might not provide the expected results.
Tip For more information, go to Customizing the search index and Search behavior.

Searching for assets that contain stop words in their names

Prerequisite: An asset named On The Go exists.

Search text: On The Go

Search result: The asset On The Go is found.

Interpretation: In the search results, the word that produced a match is shown in green, and the words that did not produce a match are shown in black. Here, Go is shown in green because it produced the match. The words On The are shown in black because they did not.

Furthermore, if you enter the search text On The, neither the asset On The Go nor any other asset is found. This is because "on" and "the" are stop words, which are filtered out of both indexing and searches.
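The stop-word behavior described above can be illustrated with a minimal sketch. This is not the search engine's implementation: the function names are hypothetical, and the stop-word list contains only the two words named in this example, whereas the real list is likely longer.

```python
# Assumed stop-word list: only "on" and "the" come from the example above.
STOP_WORDS = {"on", "the"}


def searchable_terms(text: str) -> list[str]:
    """Lowercase the text, split it on whitespace, and drop stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


def is_found(search_text: str, asset_name: str) -> bool:
    """An asset is found if at least one remaining search term is in its name."""
    query = searchable_terms(search_text)
    indexed = set(searchable_terms(asset_name))
    return any(term in indexed for term in query)


print(searchable_terms("On The Go"))        # ['go']
print(is_found("On The Go", "On The Go"))   # True
print(is_found("On The", "On The Go"))      # False
```

Because the stop words are removed from both the search text and the indexed name, the search text On The leaves no terms to match against, so nothing is found.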

Tip The best way to ensure thorough and intuitive search results is to name your assets, domains, and communities as thoughtfully as possible.

Searching for assets that contain more than one word in their names

Search text: marketing team summit

Interpretation: "marketing" or "team" or "summit"

Search result: Assets with the following names are found:

Note The order of the actual results may differ from the following order.

  • marketing team summit
  • marketing_campaign_august
  • team123
  • summit_planning
  • marketing_team_summit_august
  • marketingTeamSummit
  • marketing team summit August
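As a rough illustration of the "or" interpretation, the following sketch splits names into lowercase terms and reports a match when the search text and the name share at least one term. The tokenizer is an assumption made for this example (it splits on spaces, underscores, case changes, and letter/digit boundaries). The name budget_report is hypothetical and is included only to show a non-match.

```python
import re

# Hypothetical tokenizer pattern: a lowercase or capitalized word,
# an all-uppercase run, or a run of digits.
TERM_PATTERN = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|[0-9]+")


def terms(text: str) -> set[str]:
    """Return the lowercase terms contained in the text."""
    return {piece.lower() for piece in TERM_PATTERN.findall(text)}


def matches_any(search_text: str, asset_name: str) -> bool:
    """'Or' semantics: one shared term is enough for a match."""
    return bool(terms(search_text) & terms(asset_name))


names = [
    "marketing team summit",
    "marketing_campaign_august",
    "team123",
    "summit_planning",
    "marketing_team_summit_august",
    "marketingTeamSummit",
    "marketing team summit August",
    "budget_report",  # hypothetical name that shares no term with the search text
]

found = [name for name in names if matches_any("marketing team summit", name)]
print(found)  # every name except "budget_report"
```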

Searching for assets that contain CamelCase in their names

Search text: MarketingTeam

Interpretation: "marketing" and "team"

Search result: Assets with the following names are found:

  • MarketingTeam
  • marketingteam
  • marketing_team
  • marketing team summit
  • marketing_team_summit
  • marketing_team_summit_August
  • xyz marketing abc team def summit ghi

However, an asset with the following name is not found: team. This is because the search engine expects both words ("marketing" and "team") in the name.

Image showing the search results when the search text contains CamelCase

Searching for assets that contain underscores (snake_case) in their names

Search text: marketing_team_summit

Interpretation: "marketing" and "team" and "summit"

Search result: Assets with the following names are found:

  • marketing_team_summit
  • marketing team summit
  • MarketingTeamSummit
  • marketing_team_summit_August
  • xyz marketing abc team def summit ghi

However, an asset with the following name is not found: marketingteamsummit. This is because the search engine treats marketingteamsummit as a single term.

An asset with the following name is also not found: marketing_team. This is because the search engine expects all three words ("marketing" and "team" and "summit") in the name.

Image showing the search results when the search text contains underscores
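The "and" interpretation for CamelCase and snake_case search text can be approximated with the sketch below. It rests on two assumptions that are not confirmed by this topic: names are indexed both as whole words and as their parts, and a compound search word matches either as a whole word or when all of its parts are present. All function names are hypothetical.

```python
import re

# Hypothetical splitter: breaks a word on underscores, case changes, and digits.
PIECE = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|[0-9]+")


def split_word(word: str) -> list[str]:
    """Split one word into its lowercase parts."""
    return [piece.lower() for piece in PIECE.findall(word)]


def index_terms(name: str) -> set[str]:
    """Terms stored for a name: each whole word plus the parts of that word."""
    result: set[str] = set()
    for word in name.split():
        result.add(word.lower())
        result.update(split_word(word))
    return result


def is_found(search_word: str, name: str) -> bool:
    """Match the whole search word, or require all of its parts ('and')."""
    indexed = index_terms(name)
    whole_match = search_word.lower() in indexed
    all_parts_match = all(part in indexed for part in split_word(search_word))
    return whole_match or all_parts_match


print(is_found("MarketingTeam", "marketingteam"))                 # True
print(is_found("MarketingTeam", "team"))                          # False
print(is_found("marketing_team_summit", "MarketingTeamSummit"))   # True
print(is_found("marketing_team_summit", "marketingteamsummit"))   # False
print(is_found("marketing_team_summit", "marketing_team"))        # False
```

Under these assumptions, marketingteamsummit is not found because it is indexed as one term that has no parts, while marketing_team is not found because the part "summit" is missing.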

Searching for assets that contain special characters in their names

Search text: team's calendar

Interpretation: "team" or "calendar"

Search result: Assets with the following names are found:

  • Team's calendar
  • team123
  • weekly calendar
  • team_calendar_weekly
  • marketing team calendar
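One way to picture how team's calendar becomes "team" or "calendar" is to strip the possessive ending before splitting the text into terms. The sketch below assumes that this is roughly what happens. The actual handling of special characters may differ, and the function name is hypothetical.

```python
import re


def terms(text: str) -> set[str]:
    """Lowercase the text, drop a possessive 's, and keep letter and digit runs."""
    text = re.sub(r"['’]s\b", "", text.lower())
    return set(re.findall(r"[a-z]+|[0-9]+", text))


query = terms("team's calendar")
print(sorted(query))  # ['calendar', 'team']

names = [
    "Team's calendar",
    "team123",
    "weekly calendar",
    "team_calendar_weekly",
    "marketing team calendar",
]

# 'Or' semantics: a name is found if it shares at least one term with the query.
print(all(query & terms(name) for name in names))  # True
```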

Searching for assets that contain URLs in their names

Search text: scheme://doma.in/optional/path

Interpretation: "scheme://doma.in/optional/path"

Search result: Assets with the following names are found:

  • scheme://doma.in/optional/path
  • scheme://doma.in/optional/path/suffix
  • scheme://doma.in, doma.in/optional/other
  • path, scheme, optional

Note The order of the results may depend on where the match is found within an asset and how often it occurs. For example, because the default boosting weights the name higher than attributes, a match found only in the description might appear lower in the list than a match found in the name.

Searching for database objects

You can search for database objects using dot notation in your search text. Dot notation allows you to specify database objects by connecting their hierarchical components with dots, such as schema.table or database.schema.table.column. This method allows you to quickly locate specific assets within a database structure.

Prerequisites: The following assets exist:

  • A Database asset named datahub.
  • A Schema asset named sales, which is within datahub.
  • A Table asset named transactions, which is within sales.
  • A Column asset named customer_id, which is within transactions.
  • The full name of these assets follows the standard convention for database objects. For example, the full name of the customer_id Column asset is datahub > sales > transactions > customer_id.

Search text: sales.transactions

Interpretation: The "transactions" asset within the "sales" asset

Search result: The following assets are found:

  • The transactions table that is within the sales schema. This result is prioritized over others.
  • Related assets, such as columns that are in the transactions table within the sales schema, and other tables that are within the sales schema.
Tip Searches are case-insensitive, meaning that the capitalization in your search text doesn't affect the results.

The following table contains additional search texts and their prioritized search results.

Search text | Prioritized result
datahub.sales.transactions.customer_id | The customer_id column that is in the transactions table within the sales schema of the datahub database.
datahub.sales.transactions | The transactions table that is within the sales schema of the datahub database.
datahub.sales. | All tables that are within the sales schema of the datahub database.

Handling duplicate asset names

Suppose that three Column assets with the same name, customer_id, have the following full names:

  • datahub1 > sales > transactions > customer_id
  • datahub2 > sales > transactions > customer_id
  • datahub1 > sales1 > transactions > customer_id

The following table contains search texts and their prioritized search results.

Search text | Prioritized result
datahub1.sales.transactions.customer_id | customer_id column in the datahub1 database.
sales.transactions.customer_id | customer_id column in the datahub1 database; customer_id column in the datahub2 database.
transactions.customer_id | customer_id column in the sales schema; customer_id column in the datahub1 database; customer_id column in the datahub2 database; customer_id column in the sales1 schema.
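The effect of more or less specific dot notation can be sketched as a comparison between the dotted search text and the trailing components of each full name. This is a simplified model, not the actual ranking logic: it covers only which assets match the dotted path, and the function names are hypothetical.

```python
FULL_NAMES = [
    "datahub1 > sales > transactions > customer_id",
    "datahub2 > sales > transactions > customer_id",
    "datahub1 > sales1 > transactions > customer_id",
]


def path_of(full_name: str) -> list[str]:
    """Split a full name such as 'a > b > c' into lowercase components."""
    return [part.strip().lower() for part in full_name.split(">")]


def prioritized(search_text: str, full_names: list[str]) -> list[str]:
    """Return the full names whose trailing components equal the dotted search text."""
    wanted = [part.lower() for part in search_text.split(".") if part]
    if not wanted:
        return []
    return [name for name in full_names if path_of(name)[-len(wanted):] == wanted]


# The fully qualified search text identifies exactly one column.
print(prioritized("datahub1.sales.transactions.customer_id", FULL_NAMES))

# Dropping the database leaves two candidates, one per database.
print(len(prioritized("sales.transactions.customer_id", FULL_NAMES)))  # 2
```

The comparison is lowercase on both sides, which mirrors the tip that searches are case-insensitive.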

How boosting specific resource types affects search results

Prerequisites:

  • A resource, user, or user group with the name verylongname exists.
  • The resource type boost factors in Collibra Settings are set as follows:
    • Asset: 2
    • Community: 4
    • Domain: 6
    • User: 8
    • User group: 10

Search text: verylongname

Search result: The search results are ordered according to the boost factor values of the respective resource types. The user group resource type, with a boost factor of 10, is the most relevant of the results. Asset, with a boost factor of 2, is the least relevant resource type.
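The ordering can be pictured as multiplying a base relevance score by the boost factor of the resource type. In the sketch below, the multiplication and the base score of 1.0 are assumptions made for illustration. The actual scoring formula is not described in this topic.

```python
# Boost factors from the prerequisites above.
RESOURCE_TYPE_BOOST = {
    "Asset": 2,
    "Community": 4,
    "Domain": 6,
    "User": 8,
    "User group": 10,
}

# One hit per resource type, all named verylongname, so the assumed
# base score is the same for each of them.
hits = [{"type": resource_type, "name": "verylongname", "base_score": 1.0}
        for resource_type in RESOURCE_TYPE_BOOST]


def rank(results: list[dict]) -> list[dict]:
    """Sort hits by boosted score (base score times boost factor), highest first."""
    return sorted(
        results,
        key=lambda hit: hit["base_score"] * RESOURCE_TYPE_BOOST[hit["type"]],
        reverse=True,
    )


print([hit["type"] for hit in rank(hits)])
# ['User group', 'User', 'Domain', 'Community', 'Asset']
```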

How boosting specific fields affects search results

Prerequisites:

  • The following assets exist:
    • An asset named superfeature
    • An asset named asset1, with the following tag: superfeature
    • An asset named asset2, with the following comment: superfeature
  • Property boost factors are:
    • Name: 1
    • Comment: 5
    • Tag: 10

Search text: superfeature

Search result: The search results are ordered according to the boost factor values of the respective fields. The tag field, with a boost factor of 10, is the most relevant of the results. Name, with a boost factor of 1, is the least relevant field.

How boosting attributes affects search results

Prerequisites:

  • The following assets exist:
    • An asset with the description terminator
    • An asset with the definition terminator
    • An asset with the note terminator
  • Attribute boost factors are:
    • Description: 1
    • Definition: 2
    • Note: 3

Search text: terminator

Search result: The search results are ordered according to the boost factor values of the respective attributes. The note attribute, with a boost factor of 3, is the most relevant of the results. Description, with a boost factor of 1, is the least relevant attribute.

How boosting asset types affects search results

Prerequisites:

  • The following assets exist:
    • A Business Term asset with the name Payment
    • A Data Attribute asset with the name Payment Type
    • A Policy asset with the name Payments
  • Asset type boost factors are:
    • Business Term: 3
    • Data Attribute: 2
    • Policy: 1.5

Search text: payment

Search result: The search results are ordered according to the boost factor values of the respective asset types. The Business Term asset type, with a boost factor of 3, is the most relevant of the results. Policy, with a boost factor of 1.5, is the least relevant asset type.

How the exact match boost feature affects search results, regardless of boost factors

Prerequisites:

  • The following assets exist:
    • A Schema asset with the name Payment
    • A Business Term asset with the name Payment Type
    • A Data Attribute asset with the name Payment Type
    • A Policy asset with the name Payments
    • Other assets, as shown in the following image
  • Asset type boost factors are:
    • Business Term: 3
    • Data Attribute: 2
    • Policy: 1.5

Search text: payment

Search result: Resources whose name exactly matches the search text are shown first. In this example, there is only one exact match: the Schema asset with the name Payment.

After the exact matches, the search results are sorted in order of descending relevance.
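This two-step ordering can be sketched as a sort on two keys: exact name matches first, then descending relevance. The sketch assumes that relevance is determined by the asset type boost factor alone and that asset types without a configured factor, such as Schema here, have a boost of 1. Both are simplifications for illustration.

```python
# Asset type boost factors from the prerequisites above.
ASSET_TYPE_BOOST = {"Business Term": 3, "Data Attribute": 2, "Policy": 1.5}

# (asset type, name) pairs from the prerequisites.
hits = [
    ("Policy", "Payments"),
    ("Data Attribute", "Payment Type"),
    ("Schema", "Payment"),
    ("Business Term", "Payment Type"),
]


def sort_key(hit: tuple[str, str], search_text: str) -> tuple[bool, float]:
    """Exact name matches sort first; the rest sort by descending boost."""
    asset_type, name = hit
    is_exact = name.lower() == search_text.lower()
    boost = ASSET_TYPE_BOOST.get(asset_type, 1)  # assumed default boost of 1
    return (not is_exact, -boost)


ranked = sorted(hits, key=lambda hit: sort_key(hit, "payment"))
for asset_type, name in ranked:
    print(f"{name} ({asset_type})")
# Payment (Schema)
# Payment Type (Business Term)
# Payment Type (Data Attribute)
# Payments (Policy)
```

The Schema asset comes first even though it has the lowest assumed boost, because the exact match takes precedence over the boost factors.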

How term frequency contributes to relevance scores

Prerequisites:

  • Several assets, each with a variation of jedi in the name, exist.
  • Default boost factors for all resource types, properties, and attributes.

Search text: jedi

Search result: The search results are ranked as shown in the following image. Both the total number of matches in a name and the percentage of matching words among all the words in the name affect the relevance score of a result.
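To make the two factors concrete, the sketch below adds the number of matching words to the fraction of words that match. This formula is hypothetical and only illustrates the idea, since the actual relevance formula is not given here. The asset names are also invented for the example.

```python
def relevance(search_term: str, name: str) -> float:
    """Hypothetical score: match count plus the fraction of words that match."""
    words = name.lower().split()
    if not words:
        return 0.0
    matches = words.count(search_term.lower())
    return matches + matches / len(words)


# Invented names, each containing jedi at least once.
names = ["jedi order training manual", "jedi master", "jedi jedi council"]

ranked = sorted(names, key=lambda name: relevance("jedi", name), reverse=True)
print(ranked)
# ['jedi jedi council', 'jedi master', 'jedi order training manual']
```

In this model, a name with two matches outranks names with one match, and among names with one match, the shorter name ranks higher because a larger share of its words match.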