Dataset Overview

Dataset Overview allows you to execute SQL queries, instantly view results, and conduct initial exploration of new datasets. Additionally, you can directly create and add rules to your jobs. With the power to write SQL to query your dataset, you can accelerate the data discovery process and extrapolate important insights in real time.

From the Metadata Bar on any dataset-level page, click the Dataset Overview icon to open Dataset Overview.

The dataset-level pages where the Metadata Bar is available include:

  • Explorer
  • Profile
  • Findings
  • Alert Builder
  • Dataset Rules

Prerequisites

You have:

  • ROLE_ADMIN, ROLE_VIEW_DATA, or ROLE_CONNECTION_MANAGER assigned to your user account.
  • Access to the connection upon which the dataset was created.

About Dataset Overview

dataset overview modal with numbers corresponding with the table below

No. Component Description
Dataset information Displays the name of the data source connection, database, and table from which your data originates.
Create Rule

Lets you create a rule based on the query criteria in the editor below it. When you click Create Rule, you can give your rule a memorable name and save it to your dataset. You can find your new rule on the Dataset Rules of your dataset.

Note The SQL type of rules created in Pullup mode depend on the rule type. For instance, if your rule type is Freeform SQL, the SQL type is Spark SQL, whereas a Native SQL rule saves as the SQL specific to your datasource. Likewise, Pushdown datasets always use SQL native to your datasource.

Find/Replace Finds and replaces any string value in the SQL editor. For example, if your query in the SQL editor is select * from public.chicago_energy WHERE energy_star_score < '20' AND community_area = 'LOOP', but you want to update your query by replacing 'community_area' with 'community_name', click Find/Replace and in the Find and Replace with input fields, enter community_area and community_name, respectively. Click Replace after you update the input fields to update the SQL editor.
-q Loads the contents of the dataset source query into the SQL editor. For example, if the most recent run of your dataset used the query SELECT * FROM example.dataset limit 100, when you click -q, that exact query will load into the SQL editor, replacing the default select * statement that loads when you open Dataset Overview. This can help you narrow the scope of the data returned, allowing you to more efficiently access the core of the data and insights from your most recent DQ job runs.
Pretty print Formats your SQL query in a more readable layout.

Collibra AI

Opens SQL Assistant for Data Quality (powered by Collibra AI) to allow you to automate SQL rule writing and troubleshooting to help you accelerate the discovery, curation, and visualization of your data.

Note ROLE_GENAI_USER and/or ROLE_ADMIN are required to use SQL Assistant for Data Quality.

SQL editor The SQL editor lets you explore your dataset based on the data points and criteria specific to your particular use case. When you update your SQL and then run the query, the overview table displays the results that meet the conditions of your query.
Run

Runs the SQL in the editor and updates the overview table.

Important If you do not have connection access and ROLE_ADMIN, ROLE_VIEW_DATA, or ROLE_CONNECTION_MANAGER assigned to your user account, you cannot run queries in Dataset Overview.

Results

Contains an overview of the columns and cell values in your dataset in table format. You can click individual cells or click and drag to select a range of cells to copy to the clipboard.

Tip 
Click Compile button near the bottom right of the SQL editor to populate the overview table. The default query select * from [table] populates an overview of the entire table.

Schema Contains a complete list of the columns in your dataset and their data types.
Copy Results Copies the results of the query as shown in the overview table to the clipboard in comma delimited format. Choose to copy the first 5, 10, 20, or all rows in the overview table and then paste in either the SQL editor or elsewhere, such as a notepad. When you copy rows, the column headers also copy in comma delimited format.
Download Results

Downloads a CSV file containing the results of queries you run in the SQL editor.

Using the SQL editor

The SQL editor helps you quickly navigate and extract the information you need from your dataset.

  1. There are two options to get started:
    1. Enter your SQL query in the SQL editor.
      OR
    2. Click Collibra AI and complete the steps in the modal.
  2. Click Compile button near the bottom right of the SQL editor to populate the overview table with the results of your query.
  3. Repeat as necessary or optionally create a rule around your query.

Creating a rule

After you run a query with the SQL editor, you can save it as a rule. When you discover important or otherwise relevant data points from the results of your query, you can save the SQL query as a dataset rule.

  1. Follow the steps outlined in Using the SQL editor.
  2. Click Create Rule.
  3. Give your rule a name in the Name input field.
  4. Click Save.
  5. You can access your newly created rule on the Dataset Rules page of your dataset.

Known limitations

When you name a rule that matches an existing rule name, the previous rule is replaced by the new one.