Data Notebook

Important  This feature is available only in the latest user interface.

Data Notebook is a querying tool integrated directly into Collibra Data Intelligence Platform to enable you to find and query data in real time via an SQL editor. It offers a space where you can register your data sources, create notebooks to run queries against data sources, create assets from notebooks, and collaborate within notebooks. You can also visually represent the query results as charts to make them informative and engaging.

By leveraging Data Catalog, Data Notebook lets you write and run queries against your data sources efficiently, which reduces the time needed to access and explore ingested data. Data Notebook also promotes collaboration: you can create assets from data notebooks, giving your teams a centralized knowledge repository within Collibra.

Image of a notebook

Data Notebook as an asset

To convert a notebook into an asset, publish the notebook. The resulting asset has the asset type Data Notebook.

Secure architecture

Data Notebook relies on Edge for security: all interactions with your data sources occur via Edge. As a result, you don't need to change how your databases are exposed, either to Collibra or globally.

Storage of query results

Data Notebook provides the following three options for storing the results of your SQL queries:

  • Collibra Data Intelligence Platform: Have the results securely stored in the Collibra Platform alongside the rest of your notebook content.
  • Your own database: Have the results stored in a database that you manage and connect through your Edge component.
  • No storage: Choose not to store the results in any database.

You set these options when you register a data source for Data Notebook.
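As a rough illustration, the three storage options can be thought of as a per-data-source setting that is chosen at registration time. The sketch below is only a mental model: `ResultStorage`, `DataSourceRegistration`, and `results_are_persisted` are hypothetical names invented for this example and are not part of any Collibra API.

```python
from dataclasses import dataclass
from enum import Enum


class ResultStorage(Enum):
    """Hypothetical labels for the three storage options listed above."""
    COLLIBRA_PLATFORM = "collibra_platform"  # stored alongside notebook content
    OWN_DATABASE = "own_database"            # stored in a database you manage
    NO_STORAGE = "no_storage"                # results are not stored


@dataclass(frozen=True)
class DataSourceRegistration:
    """Hypothetical record: the storage option is fixed when registering."""
    name: str
    storage: ResultStorage


def results_are_persisted(registration: DataSourceRegistration) -> bool:
    """Return True if query results are kept in some database after a run."""
    return registration.storage is not ResultStorage.NO_STORAGE


# Example: a data source registered with the "No storage" option.
warehouse = DataSourceRegistration("sales_warehouse", ResultStorage.NO_STORAGE)
print(results_are_persisted(warehouse))  # False
```

The record is frozen to mirror the idea that the option belongs to the data source registration rather than to an individual notebook or query.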

Authentication method

Administrators can enforce how you connect to databases to run queries. This may include using personal credentials, service accounts, or authentication protocols such as Open Authorization (OAuth) and Single Sign-On (SSO) connections.

  • Personal credentials: Users need to enter their own credentials to connect to the data source when running queries. Users therefore inherit permissions from the underlying system, meaning they can query only the data they have access to.
  • Service accounts: Users don't need to enter any credentials to connect to the data source when running queries. Data Notebook uses the service account from the Edge data source connection. This option is suitable for testing Data Notebook or for giving organization-wide access to a data source.
  • OAuth or SSO: Users are redirected to a sign-in page when running queries. If the sign-in is successful, they can run queries against the data source using their own identity, similar to personal credentials.
Note 
  • Not all authentication methods are available for all data source providers. For more information, go to Data sources for Data Notebook.
  • Queries run with the permissions that the connecting user or account has in the underlying system. Data Notebook doesn't restrict the SQL statements that users can run, so make sure that appropriate permissions are set in the data source.
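The practical difference between the authentication methods is whose identity, and therefore whose permissions, the data source sees when a query runs. The following minimal sketch models that rule as described above. It is not Collibra code: `AuthMethod`, `QueryIdentity`, `resolve_identity`, and the account names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class AuthMethod(Enum):
    """Hypothetical labels for the authentication methods described above."""
    PERSONAL_CREDENTIALS = "personal_credentials"
    SERVICE_ACCOUNT = "service_account"
    OAUTH_OR_SSO = "oauth_or_sso"


@dataclass(frozen=True)
class QueryIdentity:
    principal: str       # identity whose permissions the data source applies
    user_prompted: bool  # whether the user had to enter credentials or sign in


def resolve_identity(method: AuthMethod, user: str,
                     edge_service_account: str,
                     signed_in: bool = False) -> QueryIdentity:
    """Decide which identity a query runs as (illustrative model only)."""
    if method is AuthMethod.SERVICE_ACCOUNT:
        # No prompt: the service account of the Edge connection is used.
        return QueryIdentity(edge_service_account, user_prompted=False)
    if method is AuthMethod.OAUTH_OR_SSO and not signed_in:
        # Queries can run only after a successful sign-in.
        raise PermissionError("sign-in required before running queries")
    # Personal credentials, or a completed OAuth/SSO sign-in: the user's
    # own identity is used, so the user's own permissions apply.
    return QueryIdentity(user, user_prompted=True)


as_user = resolve_identity(AuthMethod.PERSONAL_CREDENTIALS, "alice", "svc_edge")
as_service = resolve_identity(AuthMethod.SERVICE_ACCOUNT, "alice", "svc_edge")
print(as_user.principal, as_service.principal)  # alice svc_edge
```

In this model, every user shares the same permissions under the service account option, which is why the permissions granted to that account in the data source deserve particular attention.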

Supported data sources

Currently, Data Notebook supports the following data sources:

  • BigQuery
  • Databricks
  • Microsoft SQL Server (beta)
  • Oracle (beta)
  • PostgreSQL
  • Redshift
  • Snowflake