Version: 8.5.8 (latest)

Red Hat ACM Observability

Veeam Kasten can be integrated with Red Hat Advanced Cluster Management (ACM) Observability Service to provide centralized monitoring across your Red Hat OpenShift fleet. This integration leverages Prometheus remote_write to push metrics from the Kasten cluster to the ACM Hub.

ACM Observability uses Observatorium — an open-source, multi-tenant metrics and logs platform built on Thanos — as its API gateway. The Observatorium API handles authentication, tenant routing, and forwards metrics to the underlying Thanos Receive component for storage.

Overview

Kasten supports two connectivity modes for pushing metrics to the ACM Observatorium API:

| Mode | When to use | Auth | Protocol |
| --- | --- | --- | --- |
| Same-cluster | Kasten is installed on the ACM Hub cluster | THANOS-TENANT header (auto-injected) | HTTP |
| Cross-cluster (mTLS) | Kasten is on a managed cluster, writing to the Hub | Client certificate | HTTPS |
info

The Observatorium API uses the URL path to identify the tenant and routes requests internally. In same-cluster mode, Kasten writes directly to Thanos Receive (bypassing the Observatorium API), so a THANOS-TENANT header is required. In cross-cluster mode, Kasten writes through the Observatorium API, which extracts the tenant from the URL path and handles routing automatically.

Prerequisites

  • Red Hat ACM installed on the Hub cluster.
  • MultiClusterObservability enabled and configured on the ACM Hub.
  • Veeam Kasten installed (or ready to be installed) on the target cluster.
  • On OpenShift: Kasten must be installed with scc.create: true so that the Prometheus pod can run. Without this, the pod fails with SCC errors (runAsUser / fsGroup not in the allowed range).
  • For cross-cluster (mTLS) mode only:
    • A client certificate signed by the ACM Observability client CA.
    • The client certificate must include a Subject Alternative Name (SAN) and have OU=acm in the subject.
    • See Step 2 below.

Step 1: Gather ACM Configuration

Ensure that the ACM Observability service is running and gather the necessary connection details. You can verify service availability and retrieve configuration details either through the Red Hat ACM UI or via the CLI as described below.

Retrieve Configuration via Web Console

  1. Verify Installation:

    • Log in to your OpenShift Console on the Hub Cluster.
    • Ensure you are in the local-cluster view.
    • Click Search in the left navigation menu.
    • In the Resources dropdown, type and select MultiClusterObservability.
    • Click on the observability resource instance.
    • Ensure the status is Ready.
  2. Find the Tenant ID:

    • Navigate to Infrastructure > Clusters.
    • Select the local-cluster (the Hub cluster).
    • On the Overview tab, locate the Cluster ID. This is your Tenant ID.

Retrieve Configuration via CLI

Hub Cluster

All commands in Step 1 must be run on the ACM Hub cluster.

  1. Verify the Observability Service:

    oc get multiclusterobservability -n open-cluster-management-observability

    Expected output (verify the status is Ready):

    NAME            STATUS   AGE
    observability   Ready    14d
  2. Identify the Remote Write URL — choose the URL that matches your deployment:

    • Same-cluster (HTTP) — writes directly to Thanos Receive: http://observability-observatorium-api.open-cluster-management-observability.svc:8080/api/v1/receive

    • Cross-cluster (HTTPS / mTLS) — writes through the Observatorium API: https://observability-observatorium-api.open-cluster-management-observability.svc:8080/api/metrics/v1/default/api/v1/receive

      If Kasten cannot reach the internal service, use the external route instead:

      ROUTE_HOST=$(oc get route observatorium-api -n open-cluster-management-observability -o jsonpath='{.spec.host}')
      echo "https://${ROUTE_HOST}/api/metrics/v1/default/api/v1/receive"
    warning

    Do not use /api/v1/receive for cross-cluster mode — that path bypasses authentication.

  3. Find the Tenant ID:

    oc get clusterversion version -o jsonpath='{.spec.clusterID}'

    This is the Hub Cluster ID used as hubThanosTenantId in the Kasten Helm values (see Step 3).
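Because a mistyped Tenant ID surfaces only later as a "no matching hashring" error (see Troubleshooting), it is worth sanity-checking the value's shape before wiring it into the Helm values. A small sketch; the ID below is an illustrative placeholder, not a real cluster ID:

```shell
# Illustrative placeholder - in practice, capture the real value with:
#   ACM_TENANT_ID=$(oc get clusterversion version -o jsonpath='{.spec.clusterID}')
ACM_TENANT_ID="1a2b3c4d-5e6f-7890-abcd-ef1234567890"

# Hub cluster IDs are UUIDs; catch copy-paste mistakes early
if echo "$ACM_TENANT_ID" | grep -Eqi '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'; then
  echo "tenant id looks valid"
else
  echo "unexpected tenant id format: ${ACM_TENANT_ID}" >&2
fi
```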

Add Kasten Metrics to the MCO Custom Allowlist

Hub Cluster

Run this command on the ACM Hub cluster, regardless of whether you used the Web Console or CLI path above.

MCO's metrics-collector only forwards metrics that appear in the custom allowlist. Without this step, K10 metrics will not flow through MCO even if Prometheus remote_write is functioning correctly. Apply the following ConfigMap on the Hub cluster:

oc apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: observability-metrics-custom-allowlist
  namespace: open-cluster-management-observability
data:
  metrics_list.yaml: |
    names:
      - action_backup_ended_count
      - action_restore_ended_count
      - action_export_ended_count
      - action_import_ended_count
      - action_run_ended_count
      - catalog_actions_count
      - catalog_storage_artifact_count
      - metering_pvc_size
      - policies_count
      - policy_run_count
EOF
note

If the observability-metrics-custom-allowlist ConfigMap already exists in your environment, merge these entries into the existing metrics_list.yaml rather than replacing the ConfigMap.
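One way to script that merge is to export the current list, append only the missing names, and rebuild the ConfigMap from the merged file. A sketch; the "existing" file contents below are an illustrative stand-in for what you would actually export from the Hub:

```shell
# In practice, export the live list first (on the Hub cluster):
#   oc get cm observability-metrics-custom-allowlist \
#     -n open-cluster-management-observability \
#     -o jsonpath='{.data.metrics_list\.yaml}' > metrics_list.yaml
# Illustrative stand-in for the exported file:
cat > metrics_list.yaml <<'EOF'
names:
  - existing_metric_total
EOF

# Append each Kasten metric only if it is not already listed (idempotent)
for m in action_backup_ended_count action_restore_ended_count catalog_actions_count; do
  grep -qxF "  - $m" metrics_list.yaml || echo "  - $m" >> metrics_list.yaml
done

# Then rebuild the ConfigMap from the merged file (on the Hub cluster):
#   oc create configmap observability-metrics-custom-allowlist \
#     -n open-cluster-management-observability \
#     --from-file=metrics_list.yaml --dry-run=client -o yaml | oc apply -f -
cat metrics_list.yaml
```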

Step 2: Prepare mTLS Client Certificate

Skip this step

If Kasten is on the same cluster as the ACM Hub (same-cluster mode), skip to Step 3.

This step applies only to cross-cluster deployments where Kasten communicates with the ACM Hub over HTTPS with mutual TLS. For detailed instructions on preparing mTLS certificates, refer to the Red Hat ACM documentation:

Exporting metrics to external endpoints — Red Hat ACM 2.15

That documentation covers the tls_config format, certificate requirements, and how to configure external metric endpoints with mTLS.

Key points for Kasten integration:

  • The client certificate must be signed by the ACM client CA (observability-client-ca-certs Secret on the Hub cluster). Certificates from other CAs are rejected.
  • The certificate must include at least one Subject Alternative Name (SAN) — DNS or IP. Certificates with only a Common Name (CN) are rejected.
  • The certificate subject must include OU=acm (Organizational Unit) to map to the RBAC group with write permissions.
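Both requirements can be checked on any candidate certificate with openssl. The self-signed certificate generated below is only a stand-in to make the check reproducible; a real certificate must be signed by the ACM client CA:

```shell
# Stand-in certificate for illustration only - the real k10-client.crt
# must be signed by the ACM client CA (observability-client-ca-certs).
# Requires OpenSSL 1.1.1+ for -addext / -ext.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -keyout demo.key -out demo.crt \
  -subj "/O=kasten/OU=acm/CN=k10-remote-write" \
  -addext "subjectAltName=DNS:k10.example.com" 2>/dev/null

# Requirement 1: subject must carry OU=acm
openssl x509 -in demo.crt -noout -subject

# Requirement 2: at least one SAN (DNS or IP) must be present
openssl x509 -in demo.crt -noout -ext subjectAltName
```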

Once you have the signed client certificate, client key, and server CA, create the Kubernetes resources in the Kasten namespace:

Target Cluster

Run these commands on the cluster where Kasten is (or will be) installed.

# Create the client certificate Secret (must use 'tls' type for tls.crt / tls.key keys)
kubectl create secret tls prometheus-client-cert \
  -n kasten-io \
  --cert=k10-client.crt \
  --key=k10-client.key

# Create the server CA ConfigMap (key must be 'ca.crt')
kubectl create configmap observability-ca-cert \
  -n kasten-io \
  --from-file=ca.crt=ca.crt
Certificate Rotation

When the client certificate expires, Prometheus remote_write will fail with TLS handshake errors. Update the Secret with the renewed certificate and restart the Prometheus pod:

kubectl create secret tls prometheus-client-cert \
  -n kasten-io \
  --cert=k10-client.crt \
  --key=k10-client.key \
  --dry-run=client -o yaml | kubectl apply -f -

kubectl rollout restart deployment/prometheus-server -n kasten-io

Monitor prometheus_remote_storage_samples_failed_total for early detection of certificate expiry.
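To catch expiry before the handshake failures start, openssl's -checkend flag can test a certificate against a rotation window. The throwaway 90-day certificate below stands in for k10-client.crt so the example is reproducible:

```shell
# Throwaway 90-day certificate so the check is self-contained;
# point CERT at the real k10-client.crt in practice
openssl req -x509 -newkey rsa:2048 -nodes -days 90 \
  -keyout demo.key -out demo.crt -subj "/OU=acm/CN=demo" 2>/dev/null
CERT=demo.crt

# -checkend exits 0 if the cert is still valid N seconds from now
if openssl x509 -in "$CERT" -noout -checkend $((30*24*3600)) >/dev/null; then
  echo "OK: more than 30 days of validity left"
else
  echo "WARN: ${CERT} expires within 30 days - rotate the Secret now"
fi
```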

Step 3: Configure Kasten for Remote Write

Target Cluster

Run all commands in this step on the cluster where Kasten is (or will be) installed.

Configure Kasten's embedded Prometheus to push metrics to the ACM Hub. This is done by updating the Kasten Helm values.

Advanced Configuration

You can fine-tune the Prometheus remote write behavior by adding standard Prometheus configuration options under prometheus.server.remote_write[0]. Options such as queue_config and write_relabel_configs are supported and passed directly to the Prometheus server configuration. Find more details in Prometheus Remote Write.
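As an illustration, a values fragment tuning the send queue and adding an extra relabel filter might look like the sketch below. The numbers are illustrative starting points, not recommendations, and the URL is a placeholder:

```yaml
prometheus:
  server:
    remote_write:
      - url: "https://<observatorium-api>/api/metrics/v1/default/api/v1/receive"
        queue_config:
          capacity: 10000              # samples buffered per shard
          max_samples_per_send: 2000   # batch size per request
          batch_send_deadline: 30s     # flush a partial batch after this long
          max_backoff: 1m              # cap retry backoff on transient errors
        write_relabel_configs:
          - source_labels: [__name__]  # drop unwanted series before sending
            regex: "go_.*"
            action: drop
```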

1. Set Your Environment Variables

Copy-paste the block below into your terminal and edit the values as needed:

# ── Required ─────────────────────────────────────────────────────────────────
# Hub Cluster ID — run on the Hub cluster (see Step 1)
export ACM_TENANT_ID="$(oc get clusterversion version -o jsonpath='{.spec.clusterID}')"

# Remote Write URL — keep exactly ONE of the following lines uncommented (see Step 1):
# Same-cluster (HTTP):
export REMOTE_WRITE_URL="http://observability-observatorium-api.open-cluster-management-observability.svc:8080/api/v1/receive"
# Cross-cluster (HTTPS / mTLS):
# export REMOTE_WRITE_URL="https://observability-observatorium-api.open-cluster-management-observability.svc:8080/api/metrics/v1/default/api/v1/receive"

# ── Required on non-OpenShift (auto-detected on OpenShift) ───────────────────
export CLUSTER_NAME="us-east-prod-01"
export CLUSTER_ID="$(oc get clusterversion version -o jsonpath='{.spec.clusterID}' 2>/dev/null || echo 'set-me')"

# ── Optional ─────────────────────────────────────────────────────────────────
# Deep-link from the ACM dashboard back to this Kasten instance
export K10_DASHBOARD_URL="https://k10.apps.example.com/k10/"

2. Prepare the Helm Values

cat > k10-values.yaml <<EOF
clusterName: "${CLUSTER_NAME}"

global:
  acm:
    enabled: true
    hubThanosTenantId: "${ACM_TENANT_ID}"
    managedClusterId: "${CLUSTER_ID}"
    ## Cross-cluster only — uncomment to enable mTLS (see Step 2):
    # tls:
    #   clientCertSecretName: "prometheus-client-cert"
    #   serverCAConfigMapName: "observability-ca-cert"
    #   # insecureSkipVerify: false

prometheus:
  server:
    remote_write:
      - url: "${REMOTE_WRITE_URL}"
        dashboardUrl: "${K10_DASHBOARD_URL}"
        metricsRegex: "(backup|restore|import|export|job|policy|action|catalog|process).*"
    resources:
      requests:
        cpu: 750m
        memory: 1.5Gi

## Required on OpenShift — allows Prometheus to run with the correct security context
scc:
  create: true
EOF
tip

On OpenShift, you can omit clusterName and managedClusterId — Kasten auto-detects both. For cross-cluster mode, uncomment the tls: block, ensure the required Secret and ConfigMap exist (see Step 2), and set REMOTE_WRITE_URL to the HTTPS endpoint.

TLS Helm Values Reference

| Helm Value | Description |
| --- | --- |
| global.acm.tls.clientCertSecretName | Name of a kubernetes.io/tls Secret in the Kasten namespace containing tls.crt and tls.key. Setting this activates mTLS and disables THANOS-TENANT header injection (the tenant is identified via the URL path instead). |
| global.acm.tls.serverCAConfigMapName | Name of a ConfigMap in the Kasten namespace containing the server CA certificate (key: ca.crt). Used for TLS server verification. |
| global.acm.tls.insecureSkipVerify | When true, skips server certificate verification. Use only when connecting to an internal service URL with a self-signed or unverifiable certificate. Default: false. |

3. Apply the Configuration

Install or Upgrade Kasten to apply the changes. Run these commands on the target cluster.

# For new installations
helm install k10 kasten/k10 -n kasten-io --create-namespace -f k10-values.yaml

# For existing installations
helm upgrade k10 kasten/k10 --reuse-values -n kasten-io -f k10-values.yaml
Inline Flags

You can also pass individual values using --set flags instead of a values file. See the Helm documentation for details.

Step 4: Verification

Target Cluster

Verification steps 1–2 run on the cluster where Kasten is installed. Steps 3–4 run on the Hub cluster.

1. Verify Prometheus Remote Write Counters

Port-forward to the Kasten Prometheus and check the remote storage counters:

kubectl port-forward -n kasten-io deployment/prometheus-server 9090:9090 &

# Check samples sent (should increase over time)
curl -s "http://localhost:9090/k10/prometheus/api/v1/query?query=prometheus_remote_storage_samples_total"

# Check for failures (should be 0)
curl -s "http://localhost:9090/k10/prometheus/api/v1/query?query=prometheus_remote_storage_samples_failed_total"

A healthy integration shows samples_total increasing steadily and samples_failed_total at 0:

{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1740000000,"13469"]}]}}

If samples_failed_total returns a non-zero value, check the Troubleshooting section.
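The JSON these queries return can also be checked mechanically. A sketch that parses a response and flags a non-zero failure counter; the canned RESP below stands in for the live curl output:

```shell
# Canned response for illustration - in practice:
#   RESP=$(curl -s "http://localhost:9090/k10/prometheus/api/v1/query?query=prometheus_remote_storage_samples_failed_total")
RESP='{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1740000000,"0"]}]}}'

# Pull the counter value out of the Prometheus query response
FAILED=$(echo "$RESP" | python3 -c "import sys, json; r = json.load(sys.stdin)['data']['result']; print(r[0]['value'][1] if r else '0')")

if [ "$FAILED" = "0" ]; then
  echo "remote_write healthy: no failed samples"
else
  echo "remote_write failing: $FAILED samples dropped" >&2
fi
```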

2. Verify Kasten Prometheus Logs

Check the Prometheus logs to ensure remote_write is active and not encountering sustained errors:

kubectl logs -n kasten-io -l app=prometheus -c prometheus-server --tail=50 | grep -E "WARN|ERR|remote"
note

Occasional Failed to send batch, retrying warnings during startup (WAL replay) are normal and resolve automatically. Sustained errors indicate a configuration problem.

Hub Cluster

Steps 3–4 must be run on the ACM Hub cluster.

3. Verify Metrics in ACM Thanos

Query the Thanos query-frontend for a Kasten metric:

kubectl port-forward -n open-cluster-management-observability \
  svc/observability-thanos-query-frontend 9090:9090 &

curl -s "http://localhost:9090/api/v1/query?query=catalog_actions_count" | python3 -m json.tool

Look for results tagged with your cluster name and "application": "k10". Each metric also carries a cluster_uid label (the OpenShift cluster ID), which gives a stable identifier for queries and dashboards even if cluster names overlap:

{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "catalog_actions_count",
          "application": "k10",
          "cluster_name": "us-east-prod-01",
          "cluster_uid": "1a2b3c4d-5e6f-7890-abcd-ef1234567890"
        },
        "value": [1740000000, "221"]
      }
    ]
  }
}

If no results appear, allow 2–5 minutes for metrics to propagate, then recheck.

4. Verify Metrics in ACM Grafana

  1. Open the ACM Grafana dashboard on the Hub cluster.
  2. Navigate to Explore (or Drilldown in newer versions).
  3. Select the Thanos (or observatorium) datasource.
  4. Query for a Kasten metric, for example: catalog_actions_count.
  5. Verify that the metric is visible and tagged with your cluster name.

Troubleshooting

Common Installation Errors

Prometheus pod stuck in FailedCreate on OpenShift

The Prometheus server ReplicaSet reports FailedCreate with a message like:

unable to validate against any security context constraint: [...] .spec.securityContext.fsGroup: Invalid value: []int64{65534}: 65534 is not an allowed group

  • Cause: Kasten was installed without scc.create: true. The Prometheus pod runs as UID/GID 65534, which is outside the default OpenShift SCC allowed range.

  • Fix: Upgrade Kasten with scc.create: true:

    helm upgrade k10 kasten/k10 --reuse-values -n kasten-io --set scc.create=true

"A valid .Values.global.acm.hubThanosTenantId is required"

This error occurs during helm install or helm upgrade if the Tenant ID is missing.

  • Cause: global.acm.hubThanosTenantId is not set in your values file.
  • Fix: Retrieve the Hub Cluster ID (see Step 1) and add it to your k10-values.yaml.

"global.acm.managedClusterId is required when global.acm.enabled is true"

This error occurs if Kasten cannot auto-detect the Cluster ID (e.g., on non-OpenShift platforms) and it was not provided.

  • Cause: You are installing on a platform where Cluster ID auto-detection is not supported, or RBAC permissions prevent lookup.
  • Fix: Add global.acm.managedClusterId to your k10-values.yaml (use the value from $CLUSTER_ID).

"clusterName is required when prometheus.server.remote_write is configured"

This error occurs if Kasten cannot determine a name for the cluster.

  • Cause: You are installing on a non-OpenShift platform (or auto-detection failed) and did not provide a clusterName.
  • Fix: Add clusterName: "my-cluster-name" to your k10-values.yaml.

Runtime Errors (Prometheus Logs)

"server returned HTTP status 400 Bad Request: Client sent an HTTP request to an HTTPS server"

  • Cause: The remote write URL uses http:// but the Observatorium API expects https://.
  • Fix: Change the URL scheme to https:// and configure the TLS settings (global.acm.tls).

"server returned HTTP status 401 Unauthorized"

  • Cause: The Observatorium API requires authentication. This typically occurs when using the authenticated URL path (/api/metrics/v1/{tenant}/...) without a valid client certificate.
  • Fix: Configure mTLS by setting global.acm.tls.clientCertSecretName and ensuring the client certificate meets the requirements (SAN present, OU=acm).

"server returned HTTP status 400 ... could not determine subject"

  • Cause: The client certificate does not contain a Subject Alternative Name (SAN). The Observatorium API extracts the client identity from SANs, not from the Common Name (CN).
  • Fix: Regenerate the client certificate with at least one SAN (DNS or IP). See Step 2.

"server returned HTTP status 500 ... no matching hashring to handle tenant"

This error appears in the Prometheus logs (kubectl logs ... -c prometheus-server).

  • Cause: The hubThanosTenantId provided is incorrect, or the tenant name in the URL path does not match a configured tenant on the Observatorium API.
  • Fix: Verify that the Tenant ID matches the Hub Cluster ID exactly. For cross-cluster mode, also verify the tenant name in the URL path (typically default).

"EOF" or "use of closed network connection"

  • Cause: Transient TLS connection issues, often seen during Prometheus startup (WAL replay) when it sends a burst of samples.
  • Fix: These are typically self-resolving — Prometheus retries automatically. Check prometheus_remote_storage_samples_failed_total; if it stays at 0, the retries are succeeding. If errors persist, verify network connectivity to the Observatorium API service.

malformed HTTP response "\x00\x00\x06\x04..." in Prometheus logs

  • Cause: You may be connecting to the Thanos Receive service on the wrong port. Prometheus remote_write speaks HTTP/1.1; if the target port speaks gRPC, it returns a binary frame that Prometheus cannot parse.

  • To diagnose: Inspect the ports exposed by the Thanos Receive service on the Hub cluster:

    oc get service -n open-cluster-management-observability \
      -l app.kubernetes.io/name=thanos-receive \
      -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{range .spec.ports[*]} {.name}: {.port}{"\n"}{end}{end}'

    Look for the port named remote-write (typically 19291). Port 10901 is gRPC and will produce this error; port 10902 is HTTP admin/health. Only the remote-write port accepts Prometheus remote_write traffic.

  • Fix: If you are using the Observatorium API URL (observability-observatorium-api at port 8080) as documented in Step 1, this error should not occur. If you are connecting directly to the Thanos Receive service, update your remote_write URL to use the port named remote-write (typically 19291) instead of port 10901.

Connectivity Reference

| Deployment | URL | Auth |
| --- | --- | --- |
| Kasten on Hub cluster | http://observability-observatorium-api.open-cluster-management-observability.svc:8080/api/v1/receive | THANOS-TENANT header (auto-injected by Kasten) |
| Kasten on managed cluster (internal) | https://observability-observatorium-api.open-cluster-management-observability.svc:8080/api/metrics/v1/default/api/v1/receive | mTLS client certificate |
| Kasten on managed cluster (external route) | https://&lt;route-host&gt;/api/metrics/v1/default/api/v1/receive | mTLS client certificate |

Deploying the Kasten ACM Dashboard

Once Kasten metrics are flowing to the ACM Thanos backend (see Step 4: Verification), you can deploy the Kasten multi-cluster Grafana dashboard into the ACM Observability Grafana instance. The dashboard provides a unified view of backup jobs, restore points, storage usage, and policy status across all clusters managed by RHACM.

What You Get

A single Grafana dashboard accessible from ACM Observe → Dashboards → General with:

  • Cluster selector that filters all panels by protected Kasten cluster
  • Policy and application selectors
  • Backup job history, success/failure rates, and restore point counts
  • Local snapshot storage usage and PVC utilization

How Dashboard Injection Works

MCO runs a dashboard loader sidecar (grafana-dashboard-loader) inside the Grafana pod. The loader watches for ConfigMaps in the open-cluster-management-observability namespace that carry the label grafana-custom-dashboard: "true" and automatically posts them to Grafana's internal API. This is the supported injection path — no direct Grafana API access is required.

ConfigMap in open-cluster-management-observability with label grafana-custom-dashboard: "true"
  → grafana-dashboard-loader sidecar detects the ConfigMap
  → loader POSTs the dashboard JSON to the Grafana API (internal)
  → dashboard appears in ACM Observe → Dashboards → General

The loader retries up to 40 times at 10-second intervals if the upload fails, so it is safe to apply the ConfigMap before Grafana is fully ready.

Additional Metrics Allowlist

The Kasten ACM Dashboard uses several metrics beyond the core set added to the allowlist in Step 1. Before deploying the dashboard, update the allowlist ConfigMap on the Hub cluster to include these additional entries:

Hub Cluster

Run this command on the ACM Hub cluster.

oc apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: observability-metrics-custom-allowlist
  namespace: open-cluster-management-observability
data:
  metrics_list.yaml: |
    names:
      - action_backup_ended_count
      - action_backup_ended_overall
      - action_backup_duration_seconds_sum_overall
      - action_restore_ended_count
      - action_restore_ended_overall
      - action_restore_duration_seconds_sum_overall
      - action_export_ended_overall
      - action_export_duration_seconds_sum_overall
      - action_import_ended_overall
      - action_import_duration_seconds_sum_overall
      - action_run_ended_count
      - catalog_actions_count
      - catalog_storage_artifact_count
      - compliance_count
      - metering_pvc_size
      - policies_count
      - policy_run_count
      - profiles_count
      - kubelet_volume_stats_used_bytes
      - kubelet_volume_stats_capacity_bytes
EOF
note

kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes are Kubernetes kubelet metrics used by the PV Capacity section of the dashboard. ACM does not collect these by default. Without them, the PV Capacity panels show "No data." If the observability-metrics-custom-allowlist ConfigMap already exists, merge these entries into the existing metrics_list.yaml rather than replacing the whole ConfigMap.

ACM's metrics-collector picks up allowlist changes automatically — no restarts are needed.

Step 1 — Download the Dashboard JSON

Download the dashboard JSON from the Grafana Cloud Dashboards repository:

  1. Go to the Grafana Cloud Dashboards page for the Kasten multi-cluster dashboard.
  2. Click Download JSON to save the raw dashboard JSON file locally (for example, kasten-acm-dashboard.json).

Step 2 — Wrap the JSON in a ConfigMap

MCO's dashboard loader requires the JSON to be delivered as a ConfigMap with specific labels. Create a file named kasten-acm-dashboard-cm.yaml.

A quick way to produce a correctly structured ConfigMap from the downloaded JSON:

oc create configmap kasten-acm-dashboard \
  --from-file=kasten-acm-dashboard.json=./kasten-acm-dashboard.json \
  --dry-run=client -o yaml | \
  oc label --local -f - grafana-custom-dashboard=true general-folder=true -o yaml \
  > kasten-acm-dashboard-cm.yaml

Review the output before applying to confirm the labels and namespace are correct.

Required JSON edits before running the command above:

  • Remove the __inputs, __elements, and __requires fields if present — these are Grafana provisioning metadata and will cause import errors in MCO Grafana.
  • Set "id": null — Grafana assigns a new numeric ID on import.
  • Keep the "uid" field exactly as downloaded — the loader uses the uid as a stable identifier for updates. Removing or changing it causes a name-exists 412 conflict if the dashboard was previously loaded.
warning

Do not use Red Hat's generate-dashboard-configmap-yaml.sh tool — it strips the uid field and causes 412 conflicts on subsequent updates.
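The JSON edits above can be scripted instead of done by hand. The sketch below operates on a minimal stand-in file (the real kasten-acm-dashboard.json comes from the download step) and deliberately keeps the uid field intact:

```shell
# Minimal stand-in for the downloaded dashboard JSON (illustration only)
cat > kasten-acm-dashboard.json <<'EOF'
{"__inputs": [], "__requires": [], "id": 5, "uid": "kasten-acm", "title": "Kasten Multi-Cluster"}
EOF

python3 - <<'PY'
import json

with open("kasten-acm-dashboard.json") as f:
    dash = json.load(f)

# Provisioning metadata causes import errors in MCO Grafana
for key in ("__inputs", "__elements", "__requires"):
    dash.pop(key, None)

dash["id"] = None  # Grafana assigns a new numeric id on import
assert "uid" in dash, "keep the uid - the loader uses it for updates"

with open("kasten-acm-dashboard.json", "w") as f:
    json.dump(dash, f, indent=2)
PY

cat kasten-acm-dashboard.json
```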

The resulting ConfigMap must look like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kasten-acm-dashboard
  namespace: open-cluster-management-observability
  labels:
    grafana-custom-dashboard: "true"  # Required — loader checks for this exact key/value
    general-folder: "true"            # Places dashboard in General folder
data:
  kasten-acm-dashboard.json: |
    { ... dashboard JSON ... }

Step 3 — Apply the ConfigMap to the Hub Cluster

Hub Cluster

Run this command on the ACM Hub cluster. The namespace must be open-cluster-management-observability — the loader only watches its own namespace. Applying to any other namespace has no effect.

oc apply -f kasten-acm-dashboard-cm.yaml -n open-cluster-management-observability

Step 4 — Verify the Loader Detected the ConfigMap

oc logs -n open-cluster-management-observability \
  $(oc get pod -n open-cluster-management-observability \
    -l app=multicluster-observability-grafana \
    -o jsonpath='{.items[0].metadata.name}') \
  -c grafana-dashboard-loader --since=2m

Look for a line like:

Successfully updated dashboard kasten-acm-dashboard

If you see name-exists or 412, see Dashboard Troubleshooting below.

Step 5 — Open ACM Grafana

Navigate to the ACM console → Observe → Dashboards → General → Kasten Multi-Cluster.

Dashboard Variables

The dashboard provides four drop-down variables at the top:

| Variable | Description |
| --- | --- |
| datasource | Thanos datasource (auto-selected) |
| cluster_name | Filter by managed cluster; All shows aggregate across all clusters |
| policy | Filter backup panels by Kasten policy name |
| app | Filter by application/namespace |

Updating the Dashboard

When a new version of the dashboard is published to Grafana Cloud Dashboards:

  1. Download the updated JSON from the Grafana Cloud Dashboards page.

  2. Rebuild the ConfigMap YAML using the same method in Step 2, keeping the same ConfigMap name and uid field from the JSON.

  3. Re-apply:

    oc apply -f kasten-acm-dashboard-cm.yaml -n open-cluster-management-observability

The loader detects the ConfigMap update and re-posts the dashboard to Grafana with overwrite: true. The existing dashboard is replaced in-place — no manual deletion needed.

Dashboard Metrics Reference

The following Kasten Prometheus metrics are used in the dashboard. All metrics must flow to the MCO Thanos backend via Prometheus remote_write for the dashboard to display data.

| Metric | Description |
| --- | --- |
| action_backup_ended_overall | Counter of completed backup actions, labeled by state (success/failed/cancelled) |
| action_backup_ended_count | Count of ended backup actions per policy and application |
| action_backup_duration_seconds_sum_overall | Cumulative backup action duration in seconds |
| action_restore_ended_overall | Counter of completed restore actions, labeled by state |
| action_restore_duration_seconds_sum_overall | Cumulative restore action duration in seconds |
| action_export_ended_overall | Counter of completed export actions, labeled by state |
| action_export_duration_seconds_sum_overall | Cumulative export action duration in seconds |
| action_import_ended_overall | Counter of completed import actions, labeled by state |
| action_import_duration_seconds_sum_overall | Cumulative import action duration in seconds |
| catalog_actions_count | Gauge of current actions in the Kasten catalog, labeled by type, status, and namespace |
| catalog_storage_artifact_count | Count of stored artifacts in the Kasten catalog, labeled by category and retirement status |
| compliance_count | Count of policy compliance states across managed applications |
| policies_count | Count of Kasten policies, labeled by action type |
| profiles_count | Count of Kasten location profiles, labeled by status |
| metering_pvc_size | Total PVC capacity allocated to the Kasten catalog storage |

Cluster Name Configuration

The cluster_name label on Kasten metrics is set by Kasten's bundled Prometheus as an external label on the remote_write path — it is not present on metrics when Kasten's Prometheus is queried directly. Kasten auto-detects this value from the OpenShift infrastructure name (the node prefix, for example aro-mycluster-4kmhs), which may not match the ACM managed cluster name shown in the dashboard variable list.

To ensure cluster_name in the dashboard matches the ACM managed cluster name, two changes are required per cluster:

1. Set Prometheus external labels explicitly in Kasten Helm values:

helm upgrade k10 kasten/k10 --reuse-values -n kasten-io \
  --set prometheus.server.global.external_labels.cluster_name=<acm-managed-cluster-name>

Or add the following to your k10-values.yaml and run helm upgrade:

prometheus:
  server:
    global:
      external_labels:
        cluster_name: <acm-managed-cluster-name>
note

Setting global.clusterName alone is insufficient — it controls Kasten's application-level cluster identity but does not update the Prometheus external label used in remote_write.

2. Update the ServiceMonitor (if using User Workload Monitoring federation):

If you applied kasten-k10-acm-servicemonitor.yaml, it contains a metricRelabelings entry that overrides cluster_name on the UWM federation path. Before applying the ServiceMonitor to each cluster, replace the placeholder with the correct ACM cluster name:

replacement: REPLACE_WITH_ACM_CLUSTER_NAME

Change it to match the ACM managed cluster name for that cluster, then apply:

oc apply -f kasten-k10-acm-servicemonitor.yaml -n kasten-io

Dashboard Troubleshooting

Dashboard not appearing after 5 minutes:

oc logs -n open-cluster-management-observability \
  $(oc get pod -n open-cluster-management-observability \
    -l app=multicluster-observability-grafana \
    -o jsonpath='{.items[0].metadata.name}') \
  -c grafana-dashboard-loader --since=10m
| Log message | Cause | Fix |
| --- | --- | --- |
| No log lines mentioning kasten | Loader did not detect the ConfigMap | Check the ConfigMap label: oc get cm kasten-acm-dashboard -n open-cluster-management-observability -o jsonpath='{.metadata.labels}' — must have grafana-custom-dashboard: "true" |
| name-exists / 412 | Dashboard with same title but different uid exists in Grafana | Delete the conflicting dashboard from the Grafana UI, then delete and re-apply the ConfigMap |
| version-mismatch / 412 | Dashboard uid exists with a version mismatch | Loader retries automatically with overwrite: true — wait for a retry cycle (~40 retries × 10 s) |
| context deadline exceeded | Grafana pod not ready | Wait for Grafana to fully start; the loader will retry |

Check if a conflicting dashboard exists:

Port-forward to Grafana and query the API:

oc port-forward -n open-cluster-management-observability \
  $(oc get pod -n open-cluster-management-observability \
    -l app=multicluster-observability-grafana \
    -o jsonpath='{.items[0].metadata.name}') 3001:3001 &

curl -s "http://localhost:3001/api/search?query=Kasten" \
  -H "X-Forwarded-User: WHAT_YOU_ARE_DOING_IS_VOIDING_SUPPORT_0000000000000000000000000000000000000000000000000000000000000000"

If multiple results appear, delete the one whose uid does not match the downloaded JSON's uid field from the Grafana UI.

PV Utilization panel shows "No data" but kubelet metrics exist in Thanos:

ACM's metrics-collector labels kubelet metrics with cluster, not cluster_name. The dashboard queries kubelet panels using cluster=~"$cluster_name" for this reason. If you have customized the dashboard JSON, ensure kubelet metric queries filter on the cluster label, not cluster_name.

All panels show "No data":

Kasten metrics are not yet flowing to MCO Thanos. Check remote_write on each Kasten cluster:

oc logs -n kasten-io \
  $(oc get pod -n kasten-io -l app=prometheus,release=k10 \
    -o jsonpath='{.items[0].metadata.name}') \
  -c prometheus-server | grep "remote_write\|send"

If using the Thanos Receive direct endpoint, verify the port. The remote-write port is typically 19291 — port 10901 is gRPC and returns a protocol error. Refer to Troubleshooting for the full port reference.