Search examples
This topic explains how certain configuration settings can affect your search results. The topic is intended to:
- Help Collibra Platform administrators understand how the search configuration settings affect the search results.
- Help other Collibra users understand why certain search queries might not provide the expected results.
Searching for assets that contain stop words in their names
Prerequisite: An asset named On The Go exists.
Search text: On The Go
Search result: The asset On The Go is found.
Interpretation: In the search results, the word Go is shown in green, which indicates that Go is the word that produced a match. The words On The, which are shown in black, did not produce a match.
Furthermore, if you enter the search text On The, neither the asset On The Go nor any other asset is found. This is because "on" and "the" are stop words, which are excluded from both indexing and searches.
Tip: The best way to ensure thorough and intuitive search results is to name your assets, domains, and communities as thoughtfully as possible.
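The stop-word behavior described above can be illustrated with a minimal sketch. The stop-word list and the `searchable_terms` function are hypothetical and exist only for this illustration; the actual list used by the search engine is not given in this topic.

```python
# Assumption: a tiny illustrative stop-word list, not the engine's real list.
STOP_WORDS = {"on", "the", "a", "an", "of"}

def searchable_terms(text):
    """Lowercase the text, split it on whitespace, and drop stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]

print(searchable_terms("On The Go"))  # ['go']
print(searchable_terms("On The"))     # []
```

Because the search text On The leaves no terms after filtering, there is nothing left to match against any asset.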
Searching for assets that contain more than one word in their names
Search text: marketing team summit
Interpretation: "marketing" or "team" or "summit"
Search result: Assets with the following names are found:
Note: The order of the actual results may differ from the following order.
- marketing team summit
- marketing_campaign_august
- team123
- summit_planning
- marketing_team_summit_august
- marketingTeamSummit
- marketing team summit August
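The "or" interpretation above can be sketched as follows. This is only an approximation for illustration: the `tokenize` and `matches_any` functions are hypothetical, the tokenizing rule (split at separators, case changes, and letter/digit boundaries) is an assumption inferred from the listed results, and finance_report is a made-up name that is expected not to match.

```python
import re

def tokenize(text):
    """Split text into lowercase terms at separators, case changes, and letter/digit boundaries."""
    return [t.lower() for t in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", text)]

def matches_any(search_text, name):
    """OR semantics: at least one search term appears among the terms of the name."""
    return bool(set(tokenize(search_text)) & set(tokenize(name)))

names = [
    "marketing team summit",
    "marketing_campaign_august",
    "team123",
    "summit_planning",
    "marketingTeamSummit",
    "finance_report",  # hypothetical name that shares no term with the search text
]
hits = [n for n in names if matches_any("marketing team summit", n)]
print(hits)  # every name except finance_report
```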
Searching for assets that contain CamelCase in their names
Search text: MarketingTeam
Interpretation: "marketing" and "team"
Search result: Assets with the following names are found:
- MarketingTeam
- marketingteam
- marketing_team
- marketing team summit
- marketing_team_summit
- marketing_team_summit_August
- xyz marketing abc team def summit ghi
However, an asset with the following name is not found: team. This is because the search engine expects both words ("marketing" and "team") in the name.
Searching for assets that contain underscores (snake_case) in their names
Search text: marketing_team_summit
Interpretation: "marketing" and "team" and "summit"
Search result: Assets with the following names are found:
- marketing_team_summit
- marketing team summit
- MarketingTeamSummit
- marketing_team_summit_August
- xyz marketing abc team def summit ghi
However, an asset with the following name is not found: marketingteamsummit. This is because the search engine treats marketingteamsummit as a single term.
An asset with the following name is also not found: marketing_team. This is because the search engine expects all three words ("marketing" and "team" and "summit") in the name.
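The "and" interpretation of CamelCase and snake_case search text can be approximated with the sketch below. The `parts` and `matches_compound` functions are hypothetical. The rule that the lowercased search text is also tried as one whole term is an assumption, added only to reproduce the CamelCase example in which marketingteam is found; it is not a description of the actual search engine.

```python
import re

def parts(text):
    """Split text into lowercase terms at separators, case changes, and letter/digit boundaries."""
    return [t.lower() for t in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", text)]

def matches_compound(search_text, name):
    """AND semantics: every part of the search text must appear in the name."""
    name_terms = set(parts(name))
    all_parts_present = all(p in name_terms for p in parts(search_text))
    # Assumption: the lowercased search text is also tried as a single term.
    whole_term_present = search_text.lower() in name_terms
    return all_parts_present or whole_term_present

print(matches_compound("MarketingTeam", "marketing team summit"))        # True
print(matches_compound("MarketingTeam", "team"))                         # False
print(matches_compound("marketing_team_summit", "MarketingTeamSummit"))  # True
print(matches_compound("marketing_team_summit", "marketingteamsummit"))  # False
```

In this sketch, marketingteamsummit is not found for the snake_case search text because the name yields a single term that equals none of the three parts.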
Searching for assets that contain special characters in their names
Search text: team's calendar
Interpretation: "team" or "calendar"
Search result: Assets with the following names are found:
- Team's calendar
- team123
- weekly calendar
- team_calendar_weekly
- marketing team calendar
Searching for assets that contain URLs in their names
Search text: scheme://doma.in/optional/path
Interpretation: "scheme://doma.in/optional/path"
Search result: Assets with the following names are found:
- scheme://doma.in/optional/path
- scheme://doma.in/optional/path/suffix
- scheme://doma.in, doma.in/optional/other
- path, scheme, optional
Note: The order of the results may depend on where the match is found within an asset and how often it occurs. For example, because the default boosting weighs name higher than attribute, matches found in only the description might appear lower in the list compared to those in the name.
Searching for database objects
You can search for database objects using dot notation in your search text. Dot notation allows you to specify database objects by connecting their hierarchical components with dots, such as schema.table or database.schema.table.column. This method allows you to quickly locate specific assets within a database structure.
Prerequisites: The following assets exist:
- A Database asset named datahub.
- A Schema asset named sales, which is within datahub.
- A Table asset named transactions, which is within sales.
- A Column asset named customer_id, which is within transactions.
- The full name of these assets follows the standard convention for database objects. For example, the full name of the customer_id Column asset is datahub > sales > transactions > customer_id.
Search text: sales.transactions
Interpretation: The "transactions" asset within the "sales" asset
Search result: The following assets are found:
- The transactions table that is within the sales schema. This result is prioritized over others.
- Related assets, such as columns that are in the transactions table within the sales schema, and other tables that are within the sales schema.
The following table contains additional search texts and their prioritized search results.
Search text | Prioritized result |
---|---|
datahub.sales.transactions.customer_id | The customer_id column that is in the transactions table within the sales schema of the datahub database. |
datahub.sales.transactions | The transactions table that is within the sales schema of the datahub database. |
datahub.sales. | All tables that are within the sales schema of the datahub database. |
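The prioritized results above can be reproduced with a small sketch that compares the dot-separated components of the search text with the end of each asset's full name. The `prioritized` function and its matching rule are assumptions made for illustration only; they are not the actual search implementation, and related assets that rank lower are not modeled.

```python
FULL_NAMES = [
    "datahub",
    "datahub > sales",
    "datahub > sales > transactions",
    "datahub > sales > transactions > customer_id",
]

def ends_with(path, wanted):
    """True if the path ends with all wanted components, in order."""
    return bool(wanted) and len(path) >= len(wanted) and path[-len(wanted):] == wanted

def prioritized(search_text, full_names=FULL_NAMES):
    """Return the full names that the dot notation points at."""
    wants_children = search_text.endswith(".")  # a trailing dot asks for direct children
    wanted = [c for c in search_text.split(".") if c]
    results = []
    for full_name in full_names:
        path = full_name.split(" > ")
        target = path[:-1] if wants_children else path
        if ends_with(target, wanted):
            results.append(full_name)
    return results

print(prioritized("sales.transactions"))  # ['datahub > sales > transactions']
print(prioritized("datahub.sales."))      # ['datahub > sales > transactions']
```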
Handling duplicate asset names
Suppose that three Column assets with the same name, customer_id, have the following full names:
- datahub1 > sales > transactions > customer_id
- datahub2 > sales > transactions > customer_id
- datahub1 > sales1 > transactions > customer_id
The following table contains search texts and their prioritized search results.
Search text | Prioritized result |
---|---|
datahub1.sales.transactions.customer_id | customer_id column in the datahub1 database. |
sales.transactions.customer_id | |
transactions.customer_id | |
How boosting specific resource types affects search results
Prerequisites:
- A resource, user, or user group with the name verylongname exists.
- Edit the resource type boost factors in Collibra Settings as follows:
- Asset: 2
- Community: 4
- Domain: 6
- User: 8
- User group: 10
Search text: verylongname
Search result: The search results are ordered according to the boost factor values of the respective resource types. The user group resource type, with a boost factor of 10, is the most relevant of the results. Asset, with a boost factor of 2, is the least relevant resource type.
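As a rough illustration of this ordering, the sketch below sorts five hits named verylongname by the boost factor of their resource type. It assumes that all hits have the same base relevance and that the boost factor acts as a simple multiplier; both are assumptions, and the `rank` function is hypothetical.

```python
# Resource type boost factors from the prerequisites above.
RESOURCE_TYPE_BOOST = {"Asset": 2, "Community": 4, "Domain": 6, "User": 8, "User group": 10}

def rank(hits, base_score=1.0):
    """Sort (name, resource_type) hits by base score times boost factor, highest first."""
    return sorted(hits, key=lambda hit: base_score * RESOURCE_TYPE_BOOST[hit[1]], reverse=True)

hits = [
    ("verylongname", "Asset"),
    ("verylongname", "User"),
    ("verylongname", "Community"),
    ("verylongname", "User group"),
    ("verylongname", "Domain"),
]
print([resource_type for _, resource_type in rank(hits)])
# ['User group', 'User', 'Domain', 'Community', 'Asset']
```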
How boosting specific fields affects search results
Prerequisites:
- The following assets exist:
- An asset named superfeature
- An asset named asset1, with the following tag: superfeature
- An asset named asset2, with the following comment: superfeature
- Property boost factors are:
- Name: 1
- Comment: 5
- Tag: 10
Search text: superfeature
Search result: The search results are ordered according to the boost factor values of the respective fields. The tag field, with a boost factor of 10, is the most relevant of the results. Name, with a boost factor of 1, is the least relevant field.
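The effect of field boosting can be sketched by scoring each asset according to the fields in which the search text occurs. The `score` function and its rule (add the boost factor of every field that contains the term) are assumptions for illustration; the engine's real scoring formula is not described in this topic.

```python
# Property boost factors from the prerequisites above.
FIELD_BOOST = {"name": 1, "comment": 5, "tag": 10}

ASSETS = [
    {"name": "superfeature", "comment": "", "tag": ""},
    {"name": "asset1", "comment": "", "tag": "superfeature"},
    {"name": "asset2", "comment": "superfeature", "tag": ""},
]

def score(asset, term):
    """Assumed scoring: add the boost factor of every field that contains the term."""
    return sum(boost for field, boost in FIELD_BOOST.items() if term in asset[field].lower())

ranked = sorted(ASSETS, key=lambda asset: score(asset, "superfeature"), reverse=True)
print([asset["name"] for asset in ranked])  # ['asset1', 'asset2', 'superfeature']
```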
How boosting attributes affects search results
Prerequisites:
- The following assets exist:
- An asset with the description terminator
- An asset with the definition terminator
- An asset with the note terminator
- Attribute boost factors are:
- Description: 1
- Definition: 2
- Note: 3
Search text: terminator
Search result: The search results are ordered according to the boost factor values of the respective attributes. The note attribute, with a boost factor of 3, is the most relevant of the results. Description, with a boost factor of 1, is the least relevant attribute.
How boosting asset types affects search results
Prerequisites:
- The following assets exist:
- A Business Term asset with the name Payment
- A Data Attribute asset with the name Payment Type
- A Policy asset with the name Payments
- Asset type boost factors are:
- Business Term: 3
- Data Attribute: 2
- Policy: 1.5
Search text: payment
Search result: The search results are ordered according to the boost factor values of the respective asset types. The Business Term asset type, with a boost factor of 3, is the most relevant of the results. Policy, with a boost factor of 1.5, is the least relevant asset type.
How the exact match boost feature affects search results, regardless of boost factors
Prerequisites:
- The following assets exist:
- A Schema asset with the name Payment
- A Business Term asset with the name Payment Type
- A Data Attribute asset with the name Payment Type
- A Policy asset with the name Payments
- Other assets, as shown in the following image
- Asset type boost factors are:
- Business Term: 3
- Data Attribute: 2
- Policy: 1.5
Search text: payment
Search result: Resources whose Name attribute exactly matches the search text are shown first. In this example, there is only one exact match: the Schema asset with the name Payment.
After the exact matches, the remaining search results are sorted in order of descending relevance.
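This two-step ordering can be sketched as a sort on two keys: first whether the name is an exact match, then the relevance. The sketch assumes that relevance equals the asset type boost factor, that asset types without a configured factor (such as Schema) use a default of 1, and that exact matching ignores case; all three are assumptions, and the `rank` function is hypothetical.

```python
# Asset type boost factors from the prerequisites above.
ASSET_TYPE_BOOST = {"Business Term": 3, "Data Attribute": 2, "Policy": 1.5}
DEFAULT_BOOST = 1  # assumption: used for asset types without a configured factor

HITS = [
    ("Payments", "Policy"),
    ("Payment Type", "Data Attribute"),
    ("Payment", "Schema"),
    ("Payment Type", "Business Term"),
]

def rank(hits, search_text):
    """Exact name matches first, then the remaining hits by descending relevance."""
    def sort_key(hit):
        name, asset_type = hit
        is_exact = name.lower() == search_text.lower()
        relevance = ASSET_TYPE_BOOST.get(asset_type, DEFAULT_BOOST)
        return (not is_exact, -relevance)
    return sorted(hits, key=sort_key)

for name, asset_type in rank(HITS, "payment"):
    print(asset_type, "-", name)
# Schema - Payment
# Business Term - Payment Type
# Data Attribute - Payment Type
# Policy - Payments
```

The Schema asset comes first even though it has the lowest boost factor in this sketch, which mirrors the point that exact matches are shown first regardless of boost factors.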
How term frequency contributes to relevance scores
Prerequisites:
- Several assets, each with a variation of jedi in the name, exist.
- All resource types, properties, and attributes use their default boost factors.
Search text: jedi
Search result: