System metrics

You can monitor the following system metrics in your environment:

  • CPU usage
  • Memory, used and available
  • File system, used and available

    Tip: We recommend setting an alert for when file system usage exceeds 80%.
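
As a minimal sketch of how the 80% threshold could be supported with the OpenTelemetry Collector's hostmetrics receiver: the filesystem scraper has an optional metric, system.filesystem.utilization, that reports usage as a fraction between 0 and 1. It is disabled by default. The 60-second collection interval is an assumption chosen for illustration. The alert itself (for example, utilization above 0.8) would be defined in your observability tool, not in the collector.

```yaml
receivers:
  hostmetrics:
    # Assumption: a 60-second interval; choose a value that suits your environment.
    collection_interval: 60s
    scrapers:
      filesystem:
        metrics:
          # Optional metric, disabled by default.
          # Reports the fraction of the file system in use (0.0 to 1.0),
          # so an 80% alert corresponds to a threshold of 0.8.
          system.filesystem.utilization:
            enabled: true
```

If your observability tool can compute a ratio from the used and available values, enabling this metric may not be necessary.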

If you are using services or daemons, we recommend monitoring the health of the collibra-agent and collibra-console services.

The following is an example configuration for the OpenTelemetry Collector that collects basic system metrics. Your configuration may differ depending on the observability tool that you use.

receivers:
  # See more details: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/hostmetricsreceiver
  hostmetrics:
    scrapers:
      cpu:
      memory:
      disk:
      load:
      paging:
      processes:
      network:
      filesystem:

processors:
  # See more details: https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/batchprocessor
  batch:

  # See more details: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/resourcedetectionprocessor
  resourcedetection:
    detectors:
      - env
      - system

exporters:
  # See more details: https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter/debugexporter
  debug:

  # Add appropriate exporter(s) here
  # See all available ones: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter

service:
  pipelines:
    metrics:
      receivers:
        - hostmetrics
      processors:
        - resourcedetection
        - batch
      exporters:
        - debug
        # Add appropriate exporter(s) here
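
The following sketch shows one way the exporter placeholders above could be filled in, assuming your observability tool accepts OTLP over gRPC. The endpoint otel-backend.example.com:4317 is hypothetical, and the exporter name can vary by Collector version and distribution, so check the exporter documentation for your setup. Only the exporters and service sections are shown; the receivers and processors stay as in the example above.

```yaml
exporters:
  # Prints collected metrics to the collector's own output; useful while testing.
  debug:

  # Assumption: an OTLP-capable backend at a hypothetical gRPC endpoint.
  otlp:
    endpoint: otel-backend.example.com:4317

service:
  pipelines:
    metrics:
      receivers:
        - hostmetrics
      processors:
        - resourcedetection
        - batch
      exporters:
        - debug
        # An exporter only takes effect once it is listed in a pipeline.
        - otlp
```

Once metrics arrive in your backend, you can remove the debug exporter from the pipeline to reduce log noise.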