Edge FAQ

The following table contains the most frequently asked questions about Edge that are not answered elsewhere in the Edge documentation.

Question

Answer

Who benefits from using Edge?

All customers who want to ingest data into Collibra Data Intelligence Cloud benefit from Edge.

Some of the benefits of using Edge are:

  • Data is processed in the customer's secure environment and only the process results are sent to Collibra Data Intelligence Cloud.
  • Edge can automatically anonymize sensitive profiling data before sending it to Collibra Data Intelligence Cloud.
  • Edge can automatically classify the metadata and send the classification results together with the profiling results to Collibra Data Intelligence Cloud.
  • Edge enables better profiling performance, because data no longer has to be copied or moved.
  • Edge can execute capabilities in parallel, depending on the available resources, whereas Jobserver executes capability jobs only sequentially.
Where can I find Edge API documentation?

You can find the Edge API reference documentation in your Collibra environment at this URL: https://<your_collibra_platform_url>/edge/docs/index.html
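
As a minimal sketch, the URL pattern in the answer above can be assembled from your platform hostname. The function name `edge_api_docs_url` and the hostname `collibra.example.com` are hypothetical; only the `/edge/docs/index.html` path comes from this FAQ. Rejecting non-HTTPS URLs is an assumption based on the `https://` prefix shown above.

```python
from urllib.parse import urlsplit

EDGE_DOCS_PATH = "/edge/docs/index.html"  # path taken from the FAQ answer


def edge_api_docs_url(platform_url: str) -> str:
    """Return the Edge API docs URL for a platform URL or bare hostname."""
    # Accept either "collibra.example.com" or "https://collibra.example.com/".
    if "://" not in platform_url:
        platform_url = "https://" + platform_url
    parts = urlsplit(platform_url)
    # Assumption: the platform is only reachable over HTTPS.
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"expected an HTTPS platform URL, got {platform_url!r}")
    return f"https://{parts.netloc}{EDGE_DOCS_PATH}"


if __name__ == "__main__":
    # Hypothetical hostname, for illustration only.
    print(edge_api_docs_url("collibra.example.com"))
```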

Does Edge replace the Jobserver?

Customers can choose between Edge and Jobserver.

The main differences between Edge and Jobserver are the following:

  • Edge is based on Kubernetes, a distributed runtime, which means:
    • It offers built-in resource management.
    • It has reliable delivery of results to Collibra Data Intelligence Cloud.
  • Edge is a Collibra service compatible with on-premises as well as cloud environments.
  • Edge offers continuous delivery of capability types and updates will be installed on a regular basis.
  • Edge updates are included in Collibra Data Intelligence Cloud releases.

Jobserver features correspond to Edge capabilities, each of which is developed and deployed independently of the others. New capabilities will not be developed for Jobserver, and it will be gradually phased out by early 2024. In the future, we will provide a script for migrating features from Jobserver to Edge where applicable.

Can Edge run alongside Jobserver?

Yes. Both can even be installed on the same server, as long as the server has enough resources to support both. However, we recommend against running both services on a single server.

What does the Edge architecture look like?

You can see how Edge interacts with other components in this architecture and components overview.

Can Edge use Kubernetes provided by a Cloud vendor, for example Google Kubernetes Engine (GKE), Azure Kubernetes Services (AKS) or Amazon Elastic Kubernetes Service (EKS)?

When the Edge site is installed in a Cloud environment, it does not use a managed Kubernetes provided by the Cloud vendor, because Kubernetes is already included in the Edge site installation process.

You can install Edge on Amazon EKS. In the first releases, Edge cannot benefit from seamless integration with the various Cloud services offered by those platforms, for example, embedded authentication, auto-scaling and databases. Edge on AKS and GKE is not part of the short-term road map at this time. Contact your Customer Success Manager if you have any questions.

Can Edge be installed on Windows servers?

No, you cannot install an Edge site on Windows servers. Kubernetes (K8S), and K3S in particular, as well as container technology in general, are poorly supported on Windows without the equivalent of a Linux subsystem. We will continue to prioritize your experience on Linux-based operating systems and, as such, will not support Edge installation on Windows servers until that support is seamless.

What are the supported data sources on Edge?

You can find the list of supported data sources in the Data sources supported by Edge section.

How does authentication from Edge to the customer's data sources work?

Authentication to a data source depends on the source type that the capability connects to. JDBC sources are covered by Edge connection providers. Other sources are accessed in different ways by the capabilities themselves.

Can you connect using a cloud provider key manager such as AWS Secrets Manager, GCP Secret Manager or Azure Key Vault?

Not at this time.

Is CentOS Linux 8 supported for Edge installations?

No, not for Edge installation versions 2022.11 and later. These versions require RedHat 8 in order to receive support. If you have an existing site, everything continues to work as before unless you need to reinstall the site with a later version.

Why are you removing support for CentOS Linux 8?

CentOS Linux 8 has reached end of life. We are committed to using the latest technologies to ensure the best performance of our software, and as such RedHat 8 is required in order to receive support for Edge installations from the 2022.11 release onward.

How does Edge connect to Collibra Data Intelligence Cloud?

An Edge site is installed in the customer's environment, close to the data source. The Edge site communicates with Collibra Data Intelligence Cloud using an outbound HTTPS connection via port 443.

Is Edge on premises or in the Cloud?

Edge is always close to your data, and therefore can be on your premises or in a private or public Cloud setup.

Who controls Edge?

Edge is controlled by the customer through local access via the Collibra Data Intelligence Cloud user interface. You can also use local access via the Linux shell for advanced troubleshooting when Edge is unable to connect.

How is Edge updated?

Edge is updated automatically based on your Collibra Data Governance Center platform. The ability to disable automatic updates is on our road map, but the available Edge installer does not support it yet.

Can an Edge site connect to more than one Collibra environment?

No. Every Edge site belongs and authenticates to only one Collibra Data Intelligence Cloud environment.

Do you need multiple instances of Edge for Data Quality to run?

It depends on your current setup. While you can technically run Collibra Data Quality & Observability and capabilities in the same Edge instance, you will need to ensure resources and space are available if you have a large Edge site.

  • If you have an existing Edge site that runs capabilities without Data Quality, you can update the Edge Config to enable or disable any service or configuration at run time, in order to free up space to run Data Quality.
  • If you have an existing Edge site and are open to reinstalling, then you can enable the Data Quality flags during the reinstallation process in order to keep one instance of Edge.

Note: It is not recommended to run Classification and Data Quality in the same Edge instance, as they will compete for resources. Best practice is to have separate Edge sites for Classification and Data Quality.

Can Edge use customer-provided certificates to connect to Collibra Data Intelligence Cloud?

Currently, we do not support this.

Edge is a Collibra product that can run on the customer's on-premises or cloud environment. The authentication between the Edge site and Collibra Data Intelligence Cloud is controlled and secured by Collibra. The keys and credentials are generated when you install the Edge site.

When do internal K3S certificates expire?

The internal K3S certificates expire 12 months after the initial installation. You should restart the K3S-based Edge site during the last 3 months before expiry to ensure that the internal certificates are rotated. If you miss that window, restart K3S or reinstall the Edge site.
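
The restart window described above can be worked out from the installation date. The following is a minimal sketch, not a Collibra tool: the function names are hypothetical, and counting "12 months" and "3 months" as calendar months is an assumption. To confirm the real expiry date, inspect the certificates themselves.

```python
import calendar
from datetime import date

CERT_LIFETIME_MONTHS = 12  # from the FAQ: expiry 12 months after installation
RESTART_WINDOW_MONTHS = 3  # from the FAQ: restart in the last 3 months


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def restart_window(installed_on: date) -> tuple[date, date]:
    """Return (window_start, expiry) for the certificate-rotation restart."""
    expiry = add_months(installed_on, CERT_LIFETIME_MONTHS)
    window_start = add_months(expiry, -RESTART_WINDOW_MONTHS)
    return window_start, expiry


if __name__ == "__main__":
    start, expiry = restart_window(date(2023, 1, 15))
    print(f"Restart between {start} and {expiry}")
```

A restart scheduled anywhere between the two returned dates falls inside the window in which, according to the answer above, the certificates are rotated.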
Does Edge implement Cross-Site Request Forgery (CSRF) tokens?

Yes, the Edge management user interface can now implement CSRF tokens.

Note: The CSRF token needs to be unique per user session and should be a large, random value.
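
To illustrate what "unique per user session" and "large, random value" mean in practice, here is a minimal, generic sketch using the Python standard library. It does not show how Edge implements CSRF protection; the function names and the 32-byte token size are assumptions chosen for illustration.

```python
import hmac
import secrets


def new_csrf_token(num_bytes: int = 32) -> str:
    """Create a fresh random token; generate one per user session."""
    # token_urlsafe draws from the operating system's secure random source.
    return secrets.token_urlsafe(num_bytes)


def is_valid_csrf_token(session_token: str, submitted_token: str) -> bool:
    """Compare in constant time so timing does not leak the token."""
    return hmac.compare_digest(session_token.encode(), submitted_token.encode())


if __name__ == "__main__":
    token = new_csrf_token()
    print(is_valid_csrf_token(token, token))
```

The server stores the token with the session and rejects any state-changing request whose submitted token does not match it.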

Does Edge support mTLS when connecting to Collibra Data Intelligence Cloud?

Currently, we do not support this.

Is Edge horizontally scalable?

Currently, Edge is not horizontally scalable. You cannot add more nodes.

Does Edge support High Availability and disaster recovery?

Edge does not support High Availability, but core Edge services can be replicated if Edge is installed on a multi-node cluster, and Edge capabilities can be restarted in the event of a failure.

Disaster recovery is supported through regular backups. More information about our disaster recovery process can be found in this overview.

What troubleshooting information is collected and where is it stored?

When Edge is operational and has deployed running capabilities, jobs or services, it can collect information on multiple levels:

  • Infrastructure logs - collected at the default level (info), sent to the Cloud and accessible by Collibra.
  • Edge system monitoring - sent to the Cloud and accessible by Collibra.
  • Metadata connector logs - off by default and accessible by the customer.
  • Edge diagnostics - information is collected on demand by the customer on site and sent to Collibra as part of the support ticket.

Edge Sample Data capability:

  1. Can everybody see sample data?
  2. How is sample data queried from the database?
  3. Which user account pulls the sample data from the database?

The Sample Data capability for Edge is a beta feature and needs to be activated.

  1. Only users with the required permission can view the sample data.
  2. Samples are queried from the data source upon request.
  3. The samples will be pulled from the database using the ID of the account specified in the Edge connection.
Can metrics data from an Edge site be sent to Collibra through a private link instead of over the Internet?

No, this data can only be sent over the Internet.

What are Edge security considerations?

Edge is designed around security-first principles. Several highlights:

  1. No inbound connectivity - the Edge site always polls the platform via a REST endpoint.
  2. Data is not stored on Edge after a job has finished.
  3. Credentials are managed by Edge and not accessible outside of it.
  4. Credentials on the Edge site are encrypted with a key secured in the Collibra Data Governance Center.
  5. Credentials can be updated both for data sources and Collibra Data Governance Center.
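
The "no inbound connectivity" point in the list above relies on an outbound polling pattern: the site asks the platform for work instead of the platform calling in. The sketch below illustrates that pattern in general terms only. It is not Edge's actual protocol; `fetch_job` and `handle_job` are hypothetical callables, and in a real deployment `fetch_job` would be an outbound HTTPS request.

```python
import time
from typing import Callable, Optional


def poll_for_jobs(
    fetch_job: Callable[[], Optional[dict]],
    handle_job: Callable[[dict], None],
    max_polls: int,
    interval_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll outbound for work and return the number of jobs handled."""
    handled = 0
    for _ in range(max_polls):
        job = fetch_job()  # request initiated by the site, never by the platform
        if job is not None:
            handle_job(job)
            handled += 1
        else:
            sleep(interval_seconds)  # nothing queued; wait before polling again
    return handled


if __name__ == "__main__":
    # Simulated platform responses: a job, an empty poll, then another job.
    queue = iter([{"id": 1}, None, {"id": 2}])
    count = poll_for_jobs(lambda: next(queue), print, max_polls=3, sleep=lambda s: None)
    print(f"handled {count} jobs")
```

Because every request originates from the site, the firewall only needs to allow outbound traffic.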
How are secrets stored on an Edge site?

You can find the details of how Edge stores secrets in this Storing secrets overview.