Graph database infrastructure

Some features and applications in Collibra rely on a graph database that holds a partial copy of the main database, including the knowledge graph and operating model. This copy is updated regularly to reflect changes made to the main database.

When you commit a transaction, all changes are transferred to the graph database through a synchronization process called Change Data Capture (CDC). Features and applications that rely on the graph database may be affected by the synchronization status. Some capabilities may become temporarily unavailable or show outdated data. For more information about the specific impact, refer to the documentation for each feature or application.

Snapshot process

When a graph database is first connected to your environment, a full copy of the main database is transferred in a process called a snapshot. The graph database is unavailable during this process, which may also affect the availability of other features and applications.

Note: During a snapshot, the graph database status is shown as Snapshotting in progress.

Occasionally, Collibra may need to re-trigger a full or partial snapshot to deliver new features, fix issues, or prevent more significant problems. When this is planned as part of scheduled maintenance, you are notified in advance through an in-app notification.

Synchronization delay

Changes to the main database are transferred to the graph database after each committed transaction. Transactions are always processed in commit order to ensure database integrity. As a result, a large transaction blocks the transfer of every smaller transaction committed after it, so keep the size and duration of your transactions as small as possible.

Important: Large transactions can significantly delay synchronization and may affect the freshness of data in features that rely on the graph database.
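The effect of commit-order processing can be illustrated with a small simulation. This is a minimal sketch, not Collibra's implementation: the Transaction class, the sync_delays function, and the transfer durations are all hypothetical, and the model assumes a single transfer pipeline that handles one transaction at a time.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    name: str
    commit_time: float       # seconds since start, when the transaction was committed
    transfer_seconds: float  # hypothetical time needed to transfer it to the graph database


def sync_delays(transactions):
    """Return {name: delay}, where delay is transfer completion time minus commit time.

    Assumption: transactions are transferred one at a time, strictly in commit order.
    """
    clock = 0.0
    delays = {}
    for tx in sorted(transactions, key=lambda t: t.commit_time):
        start = max(clock, tx.commit_time)  # wait for earlier transfers to finish
        clock = start + tx.transfer_seconds
        delays[tx.name] = clock - tx.commit_time
    return delays


# A small edit committed 10 seconds after a large import has to wait for the import.
delays = sync_delays([
    Transaction("large_import", commit_time=0.0, transfer_seconds=600.0),
    Transaction("small_edit", commit_time=10.0, transfer_seconds=1.0),
])

# The same small edit on its own is transferred almost immediately.
alone = sync_delays([Transaction("small_edit", commit_time=10.0, transfer_seconds=1.0)])
```

In this model, the small edit is delayed by nearly the full transfer time of the large import, even though it needs only one second itself. Splitting the large import into smaller transactions would let later changes interleave sooner.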

Common sources of large transactions

Large transactions are most commonly observed in the following contexts:

Import jobs
  • Set the continueOnError flag to true on import jobs. This ensures the job relies on smaller, independent transactions instead of one large one committed at the end.

    Note: Starting with version 2026.07, this is the default behavior for the Import API.

Workflows
  • Prefer bulk operations that rely on the import module, output module, or knowledge graph API. These approaches are optimized for faster processing between user interactions.

Graph database status

Collibra measures synchronization delay by regularly sending a heartbeat signal that is inserted between transactions. You can view the synchronization status in the Graph DB synchronization status row under Collibra settings > General > System. The row shows the current status and, if the delay exceeds one minute, the synchronization delay in minutes. Some features and applications also report the delay or status directly. For more information, refer to the documentation for each feature.

Status                    Delay range         Description
Current                   Under 5 minutes     The graph database is up to date. The delay is not shown when under 1 minute.
Slightly behind           5–30 minutes        Minor synchronization lag. Data may be slightly outdated.
Moderately behind         30 minutes–2 hours  Noticeable lag. Some features may show stale data.
Heavily behind            2–4 hours           Significant lag. Review recent large transactions.
Critically behind         Over 4 hours        Severe lag. Consider contacting Collibra Support if this persists and you have not committed large transactions recently.
Snapshotting in progress  Not applicable      A full or partial snapshot is running. The graph database is temporarily unavailable.
Status unavailable        Not applicable      An error occurred while querying the status. Contact Collibra Support if the issue persists.
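The delay ranges in the table above can be expressed as a simple lookup, for example when interpreting a delay value in your own monitoring. This is a minimal sketch under stated assumptions: classify_delay and format_row are hypothetical helpers, not part of any Collibra API, and the output format is illustrative only. The table does not say how exact boundary values are treated, so the sketch assumes each lower bound is inclusive (for example, exactly 30 minutes counts as Moderately behind). The Snapshotting in progress and Status unavailable states are not derived from the delay and are not modeled here.

```python
def classify_delay(delay_minutes):
    """Map a synchronization delay in minutes to a status name from the table.

    Assumption: lower bounds are inclusive and upper bounds are exclusive.
    """
    if delay_minutes < 0:
        raise ValueError("delay_minutes must not be negative")
    if delay_minutes < 5:
        return "Current"
    if delay_minutes < 30:
        return "Slightly behind"
    if delay_minutes < 120:   # 2 hours
        return "Moderately behind"
    if delay_minutes < 240:   # 4 hours
        return "Heavily behind"
    return "Critically behind"


def format_row(delay_minutes):
    """Return the status, plus the delay in whole minutes when it exceeds one minute.

    The "(N min)" format is a hypothetical rendering, not the actual UI text.
    """
    status = classify_delay(delay_minutes)
    if delay_minutes > 1:
        return f"{status} ({int(delay_minutes)} min)"
    return status
```

For example, format_row(0.5) returns only the status name, because the delay is not shown when it is under one minute, while format_row(45) also includes the delay.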

Availability

The graph database synchronization status is only available in environments where the CDC process is enabled. The following availability restrictions apply:

  • At this time, the CDC process can be enabled only for commercial cloud environments.
  • This feature is not available for CPSH or GovCloud environments.
  • Commercial cloud production environments are enabled on demand. Submit a support request to enable it for your environment.

Important: The graph database infrastructure is currently in public preview. Service interruptions may occur more frequently than expected as Collibra stabilizes and scales the infrastructure.