Edge FAQ

The following table contains the most frequently asked questions about Edge that are not answered elsewhere in the Edge documentation.

Question

Answer

Who benefits from using Edge?

All customers who want to ingest data into Collibra Data Intelligence Platform benefit from Edge.

Some of the benefits of using Edge are:

  • Data is processed in the customer's secure environment and only the process results are sent to Collibra Data Intelligence Platform.
  • Edge can automatically anonymize sensitive profiling data before sending it to Collibra Data Intelligence Platform.
  • Edge can automatically classify the metadata and send the classification results together with the profiling results to Collibra Data Intelligence Platform.
  • Edge enables better profiling performance, because data no longer has to be copied or moved.
  • Edge can execute capabilities in parallel, depending on the available resources. Jobserver only executes capability jobs sequentially.

Why should I migrate from Jobserver to Edge?

Edge provides our customers with all of the capabilities provided with Jobserver, but with better security controls and added capabilities. Edge provides seamless native integrations and on-site data processing solutions that prioritize security and proximity to the data, while keeping the processing of your data within your own environment. For more information, go to Migrate to Edge from Jobserver.

The main differences between Edge and Jobserver are the following:

  • Edge is based on Kubernetes, a distributed runtime, which means:
    • It offers built-in resource management.
    • It has reliable delivery of results to Collibra Data Intelligence Platform.
  • Edge provides the ability to mirror images in your private Docker registry to better fit your security policy.
  • Edge offers two upgrade modes to best suit your needs: Automatic and Manual.
  • Edge is a Collibra service compatible with on-premises as well as cloud environments.
  • Edge offers continuous delivery of capability types, with updates delivered on a regular basis.
  • Edge updates are included with Collibra Data Intelligence Platform releases.

New capabilities will not be developed for Jobserver, because Jobserver will reach end of life on September 30, 2024. We recommend migrating to Edge before this date.

Can Edge run alongside Jobserver?

Yes, both can technically run at the same time. However, we strongly recommend that you do not install Jobserver and Edge on the same server. Edge should be installed on its own dedicated server.

What does the Edge architecture look like?

You can see how Edge interacts with other components in this architecture and components overview.

Can Edge use Kubernetes provided by a Cloud vendor, for example Google Kubernetes Engine (GKE), Azure Kubernetes Service (AKS) or Amazon Elastic Kubernetes Service (EKS)?

Yes, you can install Edge on the following managed, dedicated Kubernetes clusters:

  • Azure Kubernetes Service (AKS)
  • AWS Fargate using EKS
  • Amazon EKS
  • Google Kubernetes Engine (GKE)
  • OpenShift

Currently, we only support basic integrations with these Cloud services. Please contact your Customer Success Manager if you have any questions.

Note Alternatively, if you install Edge on a Cloud environment, the Edge site installer includes the k3s Kubernetes version.

Can Edge be installed on Windows servers?

If you use Microsoft technologies, you can install your Edge site on a managed Azure Kubernetes Service (AKS) cluster.

Otherwise, support for other Kubernetes (k8s) clusters, k3s in particular, and for container technology is underserved on Windows without the equivalent of a Linux subsystem. We will continue to prioritize your experience on Linux-based operating systems, and as such, will not support Edge installation on Windows servers for other Kubernetes clusters until the support is seamless.

Why can Edge only be run on a dedicated cluster? Will we be able to run Edge on a shared cluster in the future?

Edge currently runs in two namespaces. This requires Edge to be run on a dedicated cluster for optimized security.

The ability for Edge to run securely on shared clusters for Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), Google Kubernetes Engine (GKE), AWS Fargate using EKS, OpenShift Container Platform and other services is part of the current roadmap. Please contact your Customer Success Manager if you have any questions.

What are the supported data sources on Edge?

You can find the list of supported data sources in the Data sources supported by Edge section.

How does authentication from Edge to the customer's data sources work?

Authentication to data sources depends on the source type that the capability is connecting to. JDBC sources are covered via Edge connection providers. Other sources are accessed in different ways by the capabilities themselves.

Can you connect using a cloud provider key manager such as AWS Secrets Manager, GCP Secret Manager or Azure Key Vault?

Yes, you can integrate your Edge site with the following vault providers:

  • CyberArk Vault
  • HashiCorp Vault
  • Azure Key Vault
  • AWS Secrets Manager
  • Google Secret Manager

Why do you not support CentOS Linux 8?

CentOS Linux 8 has reached end of life. We are committed to using the latest technologies to ensure the best performance of our software, and as such Red Hat 8 is required in order to receive support for Edge installations.

How does Edge connect to Collibra Data Intelligence Platform?

An Edge site is installed in the customer's environment, close to the data source. The Edge site communicates with Collibra Data Intelligence Platform using an outbound HTTPS connection via port 443.
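
Because the connection described above is outbound over port 443, a quick reachability check from the server that hosts the Edge site can help rule out firewall issues. The following is a minimal sketch, not a Collibra tool: the function name `can_reach` is hypothetical, and the check only confirms that a TCP connection can be opened. It does not validate TLS, proxies, or Edge authentication.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port can be opened.

    Hypothetical helper for illustration only; it does not perform a TLS
    handshake or any Edge-specific authentication.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

For example, `can_reach("your-environment.example.com")` would test port 443 on a placeholder host name; replace it with the host name of your own Collibra environment.
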
Is Edge on premises or in the Cloud?

Edge is always close to your data, and therefore can be on your premises or in a private or public Cloud setup.

Who controls Edge?

Edge is controlled by the customer through local access via the Collibra Data Intelligence Platform user interface. You can also use local access via the Linux shell for advanced troubleshooting when Edge is unable to connect. For more information, go to About Edge.

How is Edge updated?

Edge sites can be configured to either upgrade automatically whenever a new version is released, or upgrade manually, so that you control when and to which version your sites are upgraded. For more information, go to Upgrading an Edge site.

Can an Edge site connect to more than one Collibra environment?

No. Every Edge site belongs to and authenticates with only one Collibra Data Intelligence Platform environment.

Do you need multiple instances of Edge for Data Quality to run?

No, only one Edge site is required for DQ Cloud. However, while you can technically run Collibra Data Quality & Observability and capabilities in the same Edge instance, you will need to ensure resources and space are available if you have a large Edge site.
  • Unlike DQ Cloud, Edge is not available for on-prem instances of Collibra DQ.
  • If you have an existing Edge site that runs capabilities without Data Quality, you can update Edge Config to enable/disable any service or configuration during any run time, in order to provide space to run Data Quality.
  • If you have an existing Edge site and are open to reinstalling, then you can enable the Data Quality flags during the reinstallation process in order to keep one instance of Edge.

Note It is not recommended to run Classification and Data Quality in the same Edge instance, as they will compete for resources. Best practice is to have separate Edge sites for Classification and Data Quality.

Can Edge use customer-provided certificates to connect to Collibra Data Intelligence Platform?

Currently, we do not support this.

Edge is a Collibra product that can run on the customer's on-premises or cloud environment. The authentication between the Edge site and Collibra Data Intelligence Platform is controlled and secured by Collibra. The keys and credentials are generated when you install the Edge site.

When do internal K3S certificates expire?

The internal K3S certificates expire 12 months after the initial installation. To ensure the internal certificates are rotated, restart the K3S-based Edge site during the last 3 months before they expire. If the certificates were not rotated in that period, restart K3S or reinstall the Edge site.
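
To plan the restart, it can help to work out the rotation window from the installation date. The following is a minimal sketch under stated assumptions: it reads "12 months" and "3 months" as calendar months, which may differ by a few days from how K3S itself counts, and the function names `add_months`, `rotation_window` and `in_rotation_window` are hypothetical.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by a number of calendar months, clamping the day."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day for shorter months, for example Feb 29 -> Feb 28.
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def rotation_window(installed: date) -> tuple[date, date]:
    """Return (window_start, expiry) for certificates issued at install.

    Assumption: expiry is 12 calendar months after installation and the
    restart window is the last 3 calendar months before expiry.
    """
    expiry = add_months(installed, 12)
    window_start = add_months(expiry, -3)
    return window_start, expiry


def in_rotation_window(installed: date, today: date) -> bool:
    """True if a restart today falls inside the assumed rotation window."""
    window_start, expiry = rotation_window(installed)
    return window_start <= today < expiry
```

Under these assumptions, an Edge site installed on January 15, 2024 would have a window from October 15, 2024 up to the expiry on January 15, 2025.
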
Does Edge implement Cross-Site Request Forgery (CSRF) tokens?

Yes, the Edge management user interface can now implement CSRF tokens.

Note The CSRF token needs to be unique per user session and should be a large, random value.
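
To illustrate what "unique per user session" and "a large, random value" mean in practice, here is a minimal generic sketch. It is not the Edge implementation: the function names are hypothetical, and the 32-byte size is an assumption chosen for the example.

```python
import hmac
import secrets


def new_csrf_token(num_bytes: int = 32) -> str:
    """Generate a fresh, URL-safe random token for one user session.

    The 32-byte default is an illustrative assumption; the token comes
    from the operating system's cryptographically secure random source.
    """
    return secrets.token_urlsafe(num_bytes)


def csrf_token_matches(expected: str, received: str) -> bool:
    """Compare the session token with the submitted one in constant time."""
    return hmac.compare_digest(expected.encode(), received.encode())
```

A server would typically store the result of `new_csrf_token()` with the session and reject any state-changing request for which `csrf_token_matches` returns False.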

Does Edge support mTLS when connecting to Collibra Data Intelligence Platform?

Currently, we do not support this.

Is Edge horizontally scalable?

Currently, Edge is not horizontally scalable. You cannot add more nodes.

Does Edge support High Availability and disaster recovery?

Edge does not support High Availability, but core Edge services can be replicated if Edge is installed on a multi-node cluster, and Edge capabilities can be restarted in the event of a failure.

Disaster recovery is supported through regular backups. More information about our disaster recovery process can be found in this overview.

What troubleshooting information is collected and where is it stored?

When Edge is operational and has deployed running capabilities, jobs or services, it can collect information on multiple levels:

  • Infrastructure logs - collected at the default level (info), sent to the Cloud and accessible by Collibra.
  • Edge system monitoring - sent to the Cloud and accessible by Collibra.
  • Metadata connector logs - off by default and accessible by the customer.
  • Edge diagnostics - information is collected on demand by the customer on site and sent to Collibra as part of the support ticket.

Edge Sample Data capability:

  1. Can everybody see sample data?
  2. How is sample data queried from the database?
  3. Which user account pulls the sample data from the database?

The Sample Data capability for Edge is a feature that needs to be activated.

  1. Only users with the required permission can view the sample data.
  2. Samples are queried from the data source upon request.
  3. The samples are pulled from the database using the ID of the account specified in the Edge connection.

Can metrics data from an Edge site be sent to Collibra through a private link instead of over the Internet?

No, this data can only be sent over the Internet.

What are Edge security considerations?

Edge is designed around security-first principles. Several highlights:

  1. No inbound connectivity - the Edge site always polls the platform via a REST endpoint.
  2. Data is not stored on Edge after a job has finished.
  3. Credentials are managed by Edge and not accessible outside of it.
  4. Credentials on Edge site are encrypted with the key secured in the Collibra Data Governance Center.
  5. Credentials can be updated both for data sources and Collibra Data Governance Center.
  6. With the Edge Smart Upgrade feature, you can configure your Edge sites to upgrade manually. Manual upgrade allows you to run security scans on the images included in a new release version before upgrading your Edge site version. Furthermore, these security scans can be performed in your own private Docker registry. For more information on how your Edge sites can be upgraded, go to Upgrading an Edge site.
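
The first highlight in the list above, no inbound connectivity, relies on a polling pattern: the site asks the platform for work rather than waiting to be called. The following is a minimal generic sketch of that pattern, not Edge's actual implementation. The `fetch` and `handle` callables are hypothetical stand-ins for an outbound REST request and for job execution.

```python
import time
from typing import Callable, Optional


def poll_for_work(
    fetch: Callable[[], Optional[dict]],
    handle: Callable[[dict], None],
    interval_seconds: float,
    max_polls: int,
) -> int:
    """Poll for work a fixed number of times and return the jobs handled.

    fetch  - hypothetical outbound call; returns a job dict or None.
    handle - hypothetical job runner, executed locally.
    """
    handled = 0
    for _ in range(max_polls):
        job = fetch()  # always initiated from the site, never inbound
        if job is not None:
            handle(job)
            handled += 1
        else:
            # Nothing to do: wait before asking again.
            time.sleep(interval_seconds)
    return handled
```

Because every request starts from the site, a firewall in this pattern only needs to allow outbound traffic.
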
How are secrets stored on an Edge site?

You can find the details of how Edge stores secrets in this Storing secrets overview.