Red Hat ACM Observability
Veeam Kasten can be integrated with Red Hat Advanced Cluster Management (ACM) Observability Service to provide centralized monitoring across your Red Hat OpenShift fleet. This integration leverages Prometheus remote_write to push metrics from the K10 cluster to the ACM Hub.
Prerequisites
- Red Hat ACM installed on the Hub cluster.
- MultiClusterObservability enabled and configured on the ACM Hub.
- For detailed instructions, refer to the Red Hat ACM Observability documentation.
- Veeam Kasten installed (or ready to be installed) on the Managed Cluster.
Step 1: Configure ACM Observability
Ensure that the ACM Observability service is running and gather the necessary connection details. You can verify service availability and retrieve the configuration details either through the Red Hat ACM UI or via the CLI, as described below.
Retrieve Configuration via Web Console
- Verify Installation:
  - Log in to your OpenShift Console on the Hub Cluster.
  - Ensure you are in the local-cluster view.
  - Click Search in the left navigation menu.
  - In the Resources dropdown, type and select MultiClusterObservability.
  - Click on the observability resource instance.
  - Ensure the status is Ready.
- Find the Tenant ID:
  - Navigate to Infrastructure > Clusters.
  - Select the local-cluster (the Hub cluster).
  - On the Overview tab, locate the Cluster ID. This is your Tenant ID.
- Find the Thanos Receive Endpoint:
  - Ensure you are in the local-cluster view.
  - Navigate to Networking > Routes.
  - Select the project open-cluster-management-observability.
  - Locate the observatorium-api route.
  - The Location column contains the base URL. Append /api/v1/receive to this URL to form the complete endpoint.
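The base URL copied from the Location column may or may not end with a slash, so appending the path by hand can produce a double slash. A minimal shell sketch for building the endpoint is shown here; the helper name receive_url and the example hostname are illustrative and not part of ACM or K10.

```shell
# Build the Thanos Receive endpoint from the base URL shown in the
# Location column. receive_url is a hypothetical helper name.
receive_url() {
  base="${1%/}"                      # drop one trailing slash, if present
  printf '%s/api/v1/receive\n' "$base"
}

# Example (the hostname is a placeholder):
# receive_url "https://observatorium-api.example.com/"
```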
Retrieve Configuration via CLI
You will need the Thanos Receive Endpoint URL and the Tenant ID to configure K10 on the managed cluster.
- Verify the Observability Service (Run on Hub Cluster):
  oc get multiclusterobservability -n open-cluster-management-observability
  Refer to the Red Hat documentation to verify the state.
- Identify the Thanos Receive Endpoint URL (Run on Hub Cluster):
  - If K10 is on the Hub Cluster, use the internal service URL:
    http://observability-observatorium-api.open-cluster-management-observability.svc:8080/api/v1/receive
  - If K10 is on a Managed Cluster, retrieve the external route and construct the URL:
    HOST=$(oc get route observatorium-api -n open-cluster-management-observability -o jsonpath='{.spec.host}')
    echo "https://$HOST/api/v1/receive"
- Find the Tenant ID (Run on Hub Cluster): The Tenant ID is typically the Cluster ID of the ACM Hub Cluster.
  oc get clusterversion version -o jsonpath='{.spec.clusterID}'
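The two lookups for a K10 instance on a Managed Cluster can be combined into one helper that fails when either value comes back empty. This is a sketch only: it assumes oc is logged in to the Hub cluster, and the function name print_acm_settings and the output variable names REMOTE_WRITE_URL and TENANT_ID are hypothetical.

```shell
# Print the two values K10 needs, using the same oc queries as above.
# Assumes oc is logged in to the Hub cluster and K10 runs on a managed cluster.
print_acm_settings() {
  ns="open-cluster-management-observability"
  host=$(oc get route observatorium-api -n "$ns" -o jsonpath='{.spec.host}') || return 1
  tenant=$(oc get clusterversion version -o jsonpath='{.spec.clusterID}') || return 1
  # Fail loudly rather than emit a half-empty configuration.
  if [ -z "$host" ] || [ -z "$tenant" ]; then
    echo "could not read route host or cluster ID" >&2
    return 1
  fi
  echo "REMOTE_WRITE_URL=https://$host/api/v1/receive"
  echo "TENANT_ID=$tenant"
}
```

Calling print_acm_settings prints two KEY=value lines that can be copied into the Helm values in Step 2.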
Step 2: Configure K10 for Remote Write
Configure K10's embedded Prometheus to push metrics to the ACM Hub. This is done by updating the K10 Helm values.
You can fine-tune the Prometheus remote write behavior by adding standard Prometheus configuration options under prometheus.server.remote_write[0]. Parameters such as queue_config, send_interval, and write_relabel_configs are supported and passed directly to the Prometheus server configuration. For details, see the Prometheus Remote Write documentation.
1. Gather Configuration Values
Before creating the configuration file, ensure you have the following values:
- <YOUR_TENANT_ID>: The Tenant ID required by the ACM Observability endpoint. (Required)
  - This is typically the Hub Cluster ID (see "Find the Tenant ID" in Step 1).
- <YOUR_CLUSTER_NAME>: A unique, friendly name for the cluster where K10 is installed (e.g., us-east-prod-01).
  - OpenShift: Auto-detected from the infrastructure config if not provided.
  - Other Platforms: Required. Must be provided explicitly.
- <YOUR_CLUSTER_ID>: The unique identifier for the managed cluster.
  - OpenShift: Auto-detected if not provided.
  - Other Platforms: Required. Must be provided explicitly.
- <YOUR_K10_DASHBOARD_URL>: The full URL used to access the K10 dashboard on this cluster (e.g., https://k10.apps.example.com/k10/).
  - Recommended. This URL is attached to metrics to enable "deep linking" from the central dashboard back to the specific K10 instance.
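The required-versus-optional rules above can be checked before any Helm values are written. The following is a sketch under stated assumptions: the function name check_acm_values, its argument order, and the "openshift" platform keyword are all illustrative, and the check mirrors only the rules listed in this section.

```shell
# Check the gathered values before writing them into Helm values.
# Argument order and the "openshift" platform keyword are illustrative.
check_acm_values() {
  platform="$1" tenant="$2" name="$3" id="$4" dashboard="$5"
  rc=0
  [ -n "$tenant" ] || { echo "Tenant ID is required" >&2; rc=1; }
  if [ "$platform" != "openshift" ]; then
    # No auto-detection outside OpenShift: both values must be explicit.
    [ -n "$name" ] || { echo "Cluster name is required" >&2; rc=1; }
    [ -n "$id" ] || { echo "Cluster ID is required" >&2; rc=1; }
  fi
  case "$dashboard" in
    ""|http://*|https://*) ;;   # optional, but must be a full URL if given
    *) echo "Dashboard URL must start with http:// or https://" >&2; rc=1 ;;
  esac
  return "$rc"
}

# Example: check_acm_values other "<tenant>" "us-east-prod-01" "<id>" "https://k10.apps.example.com/k10/"
```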
2. Prepare the Helm Values
Create or update your k10-values.yaml file with the ACM configuration.
Minimal Configuration (OpenShift)
On OpenShift, K10 can auto-detect the Cluster Name and Cluster ID. You only need to provide the Tenant ID and the remote write URL.
global:
  acm:
    enabled: true
    hubThanosTenantId: "<YOUR_TENANT_ID>" # Required
prometheus:
  server:
    remote_write:
      - url: "http://observability-observatorium-api.open-cluster-management-observability.svc:8080/api/v1/receive"
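If the Tenant ID and endpoint URL are already held in shell variables, the minimal values file can be generated rather than edited by hand. This is a sketch; write_minimal_values and its arguments are hypothetical names, and the generated keys are exactly those of the minimal configuration above.

```shell
# Write the minimal OpenShift values shown above to a file.
# write_minimal_values and its arguments are hypothetical names.
write_minimal_values() {
  tenant="$1" url="$2" out="${3:-k10-values.yaml}"
  cat > "$out" <<EOF
global:
  acm:
    enabled: true
    hubThanosTenantId: "$tenant"
prometheus:
  server:
    remote_write:
      - url: "$url"
EOF
}

# Example: write_minimal_values "<YOUR_TENANT_ID>" "<REMOTE_WRITE_URL>" k10-values.yaml
```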
Full Configuration (Recommended)
For better control and to enable deep linking, provide all values explicitly.
clusterName: "<YOUR_CLUSTER_NAME>" # Friendly name for the dashboard
global:
  acm:
    enabled: true
    hubThanosTenantId: "<YOUR_TENANT_ID>"
    managedClusterId: "<YOUR_CLUSTER_ID>" # Optional override
prometheus:
  server:
    remote_write:
      - url: "http://observability-observatorium-api.open-cluster-management-observability.svc:8080/api/v1/receive"
        dashboardUrl: "<YOUR_K10_DASHBOARD_URL>"
        # Optional: Filter metrics sent to the remote write endpoint (defaults to K10 core metrics)
        metricsRegex: "(backup|restore|import|export|job|policy|action|catalog|process).*"
    # Ensure resources are sufficient for remote_write
    resources:
      requests:
        cpu: 750m
        memory: 1.5Gi
3. Apply the Configuration
Install or upgrade K10 to apply the changes. Run these commands on the cluster where K10 is installed.
Using Values File (Recommended)
# For new installations
helm install k10 kasten/k10 -n kasten-io --create-namespace -f k10-values.yaml
# For existing installations
helm upgrade k10 kasten/k10 --reuse-values -n kasten-io -f k10-values.yaml
Using Command Line Flags
Alternatively, you can specify the configuration using --set flags:
# Keys containing [0] are quoted so that shells such as zsh do not treat the brackets as a glob pattern.
# For new installations
helm install k10 kasten/k10 \
  --namespace kasten-io \
  --create-namespace \
  --set global.acm.enabled=true \
  --set global.acm.hubThanosTenantId="<YOUR_TENANT_ID>" \
  --set global.acm.managedClusterId="<YOUR_CLUSTER_ID>" \
  --set clusterName="<YOUR_CLUSTER_NAME>" \
  --set "prometheus.server.remote_write[0].url=<REMOTE_WRITE_URL>" \
  --set "prometheus.server.remote_write[0].dashboardUrl=<YOUR_K10_DASHBOARD_URL>"
# For existing installations
helm upgrade k10 kasten/k10 \
  --reuse-values \
  --namespace kasten-io \
  --set global.acm.enabled=true \
  --set global.acm.hubThanosTenantId="<YOUR_TENANT_ID>" \
  --set global.acm.managedClusterId="<YOUR_CLUSTER_ID>" \
  --set clusterName="<YOUR_CLUSTER_NAME>" \
  --set "prometheus.server.remote_write[0].url=<REMOTE_WRITE_URL>" \
  --set "prometheus.server.remote_write[0].dashboardUrl=<YOUR_K10_DASHBOARD_URL>"
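After the install or upgrade, it can be worth confirming that the release actually recorded the ACM settings. The sketch below uses helm get values, which prints the user-supplied values of a release; the helper name acm_values_applied is hypothetical, and the plain-text key match is a rough check rather than a YAML-aware one.

```shell
# Confirm that the k10 release recorded the ACM-related values.
# Assumes the release name k10 and namespace kasten-io used above.
acm_values_applied() {
  values=$(helm get values k10 -n kasten-io -o yaml) || return 1
  for key in 'hubThanosTenantId:' 'remote_write:'; do
    if ! printf '%s\n' "$values" | grep -q "$key"; then
      echo "missing in release values: $key" >&2
      return 1
    fi
  done
}

# Example: acm_values_applied && echo "ACM values are present"
```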
Step 3: Verification
1. Verify K10 Prometheus Logs
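Check the Prometheus logs to ensure remote_write is active and not encountering errors. The --tail=-1 flag requests the full log, because kubectl logs returns only the last 10 lines per pod when a label selector is used.
kubectl logs -n kasten-io -l app=prometheus -c prometheus-server --tail=-1 | grep "remote_write"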
2. Verify Metrics in ACM Grafana
- Open the ACM Grafana dashboard on the Hub cluster.
- Navigate to Explore (or Drilldown in newer versions).
- Select the Thanos (or observatorium) datasource.
- Query for a K10 metric, for example: action_backup_ended_overall.
- Verify that the metric is visible and tagged with your cluster name.
Troubleshooting
Common Installation Errors
"A valid .Values.global.acm.hubThanosTenantId is required"
This error occurs during helm install or helm upgrade if the Tenant ID is missing.
- Cause: global.acm.hubThanosTenantId is not set in your values file.
- Fix: Retrieve the Hub Cluster ID (see Step 1) and add it to your k10-values.yaml.
"global.acm.managedClusterId is required when global.acm.enabled is true"
This error occurs if K10 cannot auto-detect the Cluster ID (e.g., on non-OpenShift platforms) and it was not provided.
- Cause: You are installing on a platform where Cluster ID auto-detection is not supported, or RBAC permissions prevent lookup.
- Fix: Add global.acm.managedClusterId: "<YOUR_CLUSTER_ID>" to your k10-values.yaml.
"clusterName is required when prometheus.server.remote_write is configured"
This error occurs if K10 cannot determine a name for the cluster.
- Cause: You are installing on a non-OpenShift platform (or auto-detection failed) and did not provide a clusterName.
- Fix: Add clusterName: "my-cluster-name" to your k10-values.yaml.
Runtime Errors (Prometheus Logs)
"server returned HTTP status 500 ... no matching hashring to handle tenant"
This error appears in the Prometheus logs (kubectl logs ... -c prometheus-server).
- Cause: The hubThanosTenantId provided is incorrect or does not match a valid tenant on the ACM Hub.
- Fix: Verify that the Tenant ID matches the Hub Cluster ID exactly.
"server returned HTTP status 401 Unauthorized"
- Cause: The ACM Observability endpoint requires authentication (e.g., mTLS) which is not correctly configured, or the endpoint URL is incorrect.
- Fix: Ensure you are using the correct internal or external route for the observatorium-api service.
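The two runtime errors above can be spotted automatically by scanning the Prometheus log for their message text. This is a minimal sketch: the function name diagnose_remote_write and the hint wording are illustrative, and the match strings are taken from the error messages quoted in this section.

```shell
# Read Prometheus log lines on stdin and print a hint for each known
# remote-write failure described above. Returns non-zero if none is found.
diagnose_remote_write() {
  logs=$(cat)
  found=1
  if printf '%s\n' "$logs" | grep -q 'no matching hashring to handle tenant'; then
    echo "Tenant ID mismatch: compare hubThanosTenantId with the Hub Cluster ID"
    found=0
  fi
  if printf '%s\n' "$logs" | grep -q 'HTTP status 401'; then
    echo "Unauthorized: check the endpoint URL and its authentication settings"
    found=0
  fi
  return "$found"
}

# Usage (same selector as the verification step):
# kubectl logs -n kasten-io -l app=prometheus -c prometheus-server --tail=-1 | diagnose_remote_write
```

A non-zero exit status means neither message was found, which does not by itself prove that remote write is healthy.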