System requirements

In this section, you can find some guidelines for the system requirements per Data Governance Center service.

Note These guidelines are only recommendations. Ultimately, the performance of a Collibra DGC environment depends on many additional factors, for example, network performance, load balancing, and data volume.

DGC service

Number of concurrent users   Number of assets (*)   Recommended requirements
1 - 50                       < 1 million            4 CPUs / 16 GB memory
50 - 100                     1 - 5 million          4 CPUs / 16 GB memory
100 - 200                    5 - 25 million         8 CPUs / 32 GB memory
200 - 500                    25 - 50 million        16 CPUs / 64 GB memory
> 500                        > 50 million           32 CPUs / 128 GB memory

(*) The amount of memory indicates the memory dedicated to the DGC service.
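As a rough illustration, the table above can be expressed as a small lookup. This is only a sketch: the function name `recommend_dgc_sizing` is hypothetical, and two behaviors are assumptions that the guidelines do not state. First, when the number of users and the number of assets fall into different rows, the sketch picks the larger row. Second, values that sit exactly on a row boundary (for example, 50 users) are placed in the lower row.

```python
from bisect import bisect_left

# Upper bounds of the first four rows, copied from the DGC service table.
USER_BOUNDS = [50, 100, 200, 500]
ASSET_BOUNDS = [1_000_000, 5_000_000, 25_000_000, 50_000_000]

# (CPUs, memory in GB) per row; the last entry covers the "> 500 users"
# and "> 50 million assets" row.
DGC_TIERS = [(4, 16), (4, 16), (8, 32), (16, 64), (32, 128)]


def recommend_dgc_sizing(concurrent_users: int, assets: int) -> tuple[int, int]:
    """Return (cpus, memory_gb) for the larger row implied by users or assets.

    Assumption: boundary values belong to the lower row, and the larger
    of the two rows wins when users and assets disagree.
    """
    user_tier = bisect_left(USER_BOUNDS, concurrent_users)
    asset_tier = bisect_left(ASSET_BOUNDS, assets)
    return DGC_TIERS[max(user_tier, asset_tier)]


cpus, memory_gb = recommend_dgc_sizing(concurrent_users=150, assets=500_000)
print(f"{cpus} CPUs / {memory_gb} GB memory")
```

The result is only a starting point, because the guidelines are recommendations and actual performance depends on additional factors.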

CPU power influences the performance of transactions, while memory has a significant influence on data imports.

The recommended number of CPU cores and amount of memory can differ depending on your usage:

  • If you use Collibra DGC for reference data management, you can use more CPU cores than indicated in the table.
  • If you regularly import large amounts of data, we recommend that you increase the memory.

The DGC service does not store any actual Collibra DGC data, so the disk space usage of this component is relatively stable. We recommend at least 50 GB.

Repository service

Number of concurrent users   Number of assets (*)   Recommended requirements
1 - 50                       < 1 million            2 CPUs / 16 GB memory
50 - 100                     1 - 5 million          4 CPUs / 16 GB memory
100 - 200                    5 - 25 million         8 CPUs / 32 GB memory
200 - 500                    25 - 50 million        16 CPUs / 64 GB memory
> 500                        > 50 million           32 CPUs / 128 GB memory

(*) The amount of memory indicates the memory dedicated to the repository service.

The recommended number of CPUs and amount of memory can differ depending on your usage:

  • If you have a large number of assets in your database, the repository requires more RAM. The more RAM available to the operating system, the greater the role the file system cache plays in storing the data.
  • If the size of your content is large, you will need more disk space. For example, the history of performed actions in the system is stored in the database.

We recommend that you start with 125 GB of disk storage and monitor the usage.

Search service

Number of assets   Search service memory
< 500k             1 GB
< 1M               2 GB
> 1M               4 GB

Rule of thumb for assigning memory to the search service:

Number of assets (in millions) x 2 = memory in GB

For example, for 3 million assets in the repository, assign 6 GB of memory.
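The rule of thumb can be sketched as a one-line calculation. The function name `search_memory_gb` is hypothetical, and rounding up to whole gigabytes with a floor of 1 GB is an assumption made here so that small repositories still get the smallest value from the table.

```python
def search_memory_gb(assets: int) -> int:
    """Estimate search service memory: 2 GB per million assets.

    Assumptions: the result is rounded up to a whole number of GB
    and is never lower than 1 GB.
    """
    # Integer ceiling division of (assets * 2) by one million.
    estimate = (assets * 2 + 999_999) // 1_000_000
    return max(1, estimate)


print(search_memory_gb(3_000_000))  # 3 million assets -> 6
```

For 3 million assets this reproduces the 6 GB from the example above.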

Jobserver service

For data ingestion:

  • 64 GB RAM
  • 500 GB free disk space
  • Hard disk type: SSD
  • Number of CPUs: 16

For Tableau ingestion:

  • 6 GB RAM
  • 35-50 GB free disk space
  • Hard disk type: SSD
  • Number of CPUs: 4

We highly recommend that you install the Jobserver on a dedicated server. However, if you install the Jobserver on the same server as other Collibra nodes, you must add the minimum hardware requirements of the Jobserver to those of the other Collibra nodes on that server.
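To illustrate how requirements add up on a shared server, the following sketch sums the Jobserver figures for data ingestion with those of one other node. The `Requirements` type is a hypothetical helper, and the co-located node is an assumed example: a DGC service in the smallest row of its table (4 CPUs / 16 GB memory) with the recommended minimum of 50 GB disk space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hardware figures for one service (hypothetical helper type)."""

    cpus: int
    ram_gb: int
    disk_gb: int

    def __add__(self, other: "Requirements") -> "Requirements":
        # Co-located services need the sum of their individual requirements.
        return Requirements(
            self.cpus + other.cpus,
            self.ram_gb + other.ram_gb,
            self.disk_gb + other.disk_gb,
        )


# Jobserver figures for data ingestion, taken from the list above.
JOBSERVER_DATA_INGESTION = Requirements(cpus=16, ram_gb=64, disk_gb=500)

# Assumed example of a co-located node: smallest DGC service row.
DGC_SMALLEST_ROW = Requirements(cpus=4, ram_gb=16, disk_gb=50)

combined = JOBSERVER_DATA_INGESTION + DGC_SMALLEST_ROW
print(combined)
```

Under these assumptions, the shared server needs 20 CPUs, 80 GB RAM, and 550 GB of free disk space.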

Collibra Console

Collibra Console requires sufficient free disk space because all backups are stored on the node that hosts the Collibra Console component.

Collibra Agent

The agent runs within the requirements of the installed services. No memory scaling is required because its memory consumption remains stable.

Lineage harvester

The recommended software requirements are identical to the minimum software requirements. However, the minimum hardware requirements are most likely insufficient for production environments. We recommend the following hardware:

  • 4 GB RAM
    Tip 4 GB RAM is sufficient in most cases, but larger harvesting tasks may need more memory. For instructions on how to increase the maximum heap size, see the advice on how to resolve Java heap space errors in the Collibra Support Portal.
  • 20 GB free disk space

Note To install and use the lineage harvester, you first have to purchase Collibra Data Lineage. This feature is only available for customers that use Collibra Data Intelligence Platform version 5.7.3 or newer.

Power BI harvester

We recommend the following system requirements:

  • 4 GB RAM
  • 20 GB free disk space
  • Microsoft .NET Framework 4.7.2 or higher
  • Client operating system: Windows 10 April 2018 update, version 1803 or newer
  • Server operating system: Windows Server 2016 version 1803 or newer

Tip To ingest Power BI metadata in Data Catalog, you need to run two different harvesters: the Power BI harvester and the lineage harvester.

Note To install and use the Power BI harvester, you first have to purchase the Power BI integration feature. This feature is only available for customers that use Collibra Data Intelligence Platform version 2020.11 or newer.