Resource Requirements

Veeam Kasten's resource requirements are almost always related to the number of applications in your Kubernetes cluster and the kind of data management operations being performed (e.g., snapshots vs. backups).

Some of the resource requirements are static (base resource requirements) while other resources are only required when certain work is done (dynamic resource requirements). The auto-scaling nature of Veeam Kasten ensures that resources consumed by dynamic requirements will always scale down to zero when no work is being performed.

The recommendations below for both requests and limits should apply to most clusters, but the final requirements will depend on your cluster and application scale, total amount of data, file size distribution, and data churn rate. To check your particular requirements, you can use Prometheus or Kubernetes Vertical Pod Autoscaling (VPA) with updates disabled.
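
As a minimal sketch of the VPA approach mentioned above, the manifest below asks the Vertical Pod Autoscaler to compute recommendations for a single deployment without applying them. It assumes that the VPA components are installed in the cluster, that Veeam Kasten runs in the `kasten-io` namespace, and that `catalog-svc` is the deployment of interest; the object name `catalog-svc-vpa` is hypothetical.

```yaml
# Recommendation-only VPA: suggests requests but never evicts or updates Pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: catalog-svc-vpa        # hypothetical name
  namespace: kasten-io         # assumed install namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: catalog-svc          # repeat with one object per deployment as needed
  updatePolicy:
    updateMode: "Off"          # updates disabled; recommendations only
```

Once the recommender has observed the workload for a while, the suggested values should appear in the status of the object, for example through `kubectl describe vpa catalog-svc-vpa -n kasten-io`.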

Requirement Types

Base Requirements: These are the core resources needed for Veeam Kasten's internal scheduling and cleanup services, which are mostly driven by monitoring and catalog scale requirements. The resource footprint for these base requirements is usually static and generally does not grow noticeably with either catalog size (number of Kubernetes resources protected) or the number of applications protected.
Disaster Recovery: These are the resources needed to perform a DR of the Veeam Kasten install and are predominantly used to compress, deduplicate, encrypt, and transfer the Veeam Kasten catalog to object storage. Providing additional resources can also speed up the DR operation. The DR resource footprint is dynamic and scales down to zero when a DR is not being performed.
Backup Requirements: Resources for backup are required when data is transferred from volume snapshots to object storage or NFS file storage. While the backup requirements depend on your data, churn rate, and file system layout, the requirements are not unbounded and can easily fit in a relatively narrow band. Providing additional resources can also speed up backup operations. To prevent unbounded parallelism when protecting a large number of workloads, Veeam Kasten bounds the number of simultaneous backup jobs (default 9). The backup resource footprint is dynamic and scales down to zero when a backup is not being performed.

Requirement Guidelines

The table below lists the resource requirements for a Veeam Kasten install protecting 100 applications or namespaces.

DR jobs are also counted against the maximum parallelism limit (N). You can therefore run either N simultaneous backup jobs, or N-1 simultaneous backup jobs concurrently with 1 DR job.

Veeam Kasten Resource Guidelines
Type	Requested CPU (Cores)	Limit CPU (Cores)	Requested Memory (GB)	Limit Memory (GB)
Base	1	2	1	4
DR	1	1	0.3	0.3
Dynamic (per parallel job)	1	1	0.4	0.4
Total	3	4	1.8	4.8

Configuring Veeam Kasten Resource Usage for Core Pods

Using Helm values, resource requests and limits can be set for the core Pods that make up Veeam Kasten's base requirements. Kubernetes resource management is at the container level, so in order to set resource values, you will need to provide both the deployment and container names. Custom resource usage can be set through Helm in two ways:

Providing the path to one or more YAML files during helm install or helm upgrade with the --values flag:

resources:
  <deployment-name>:
    <container-name>:
      requests:
        memory: <value>
        cpu: <value>
      limits:
        memory: <value>
        cpu: <value>

Note

See Resource units in Kubernetes for details on how to specify valid memory and CPU values.

For example, this file will modify the settings for the catalog-svc container, the upgrade-init init container, and the kanister-sidecar sidecar container, all of which run in the pod created by the catalog-svc deployment:

resources:
  catalog-svc:
    catalog-svc:
      requests:
        memory: "1.5Gi"
        cpu: "300m"
      limits:
        memory: "3Gi"
        cpu: "1"
    upgrade-init:
      requests:
        memory: "120Mi"
        cpu: "100m"
      limits:
        memory: "360Mi"
        cpu: "300m"
    kanister-sidecar:
      requests:
        memory: "800Mi"
        cpu: "250m"
      limits:
        memory: "950Mi"
        cpu: "900m"

Modifying the resource values one at a time with the --set flag during helm install or helm upgrade:

--set=resources.<deployment-name>.<container-name>.[requests|limits].[memory|cpu]=<value>

For the equivalent behavior of the example above, the following values can be provided:

--set=resources.catalog-svc.catalog-svc.requests.memory=1.5Gi \
--set=resources.catalog-svc.catalog-svc.requests.cpu=300m \
--set=resources.catalog-svc.catalog-svc.limits.memory=3Gi \
--set=resources.catalog-svc.catalog-svc.limits.cpu=1 \
--set=resources.catalog-svc.upgrade-init.requests.memory=120Mi \
--set=resources.catalog-svc.upgrade-init.requests.cpu=100m \
--set=resources.catalog-svc.upgrade-init.limits.memory=360Mi \
--set=resources.catalog-svc.upgrade-init.limits.cpu=300m \
--set=resources.catalog-svc.kanister-sidecar.requests.memory=800Mi \
--set=resources.catalog-svc.kanister-sidecar.requests.cpu=250m \
--set=resources.catalog-svc.kanister-sidecar.limits.memory=950Mi \
--set=resources.catalog-svc.kanister-sidecar.limits.cpu=900m

When adjusting a container's resource limits or requests, if any setting is left empty, the Helm chart will assume it should be unspecified. Likewise, providing empty settings for a container will result in no limits/requests being applied.

For example, the following Helm values file will yield no specified resource requests or limits for the kanister-sidecar container and only a CPU limit for the jobs-svc container, which runs in the pod created by the jobs-svc deployment:

resources:
  catalog-svc:
    kanister-sidecar:
  jobs-svc:
    jobs-svc:
      limits:
        cpu: "50m"

Prometheus and Grafana Pods Resources

Resource requests and limits can be added to the Prometheus and Grafana pods through the values of the Grafana and Prometheus child Helm charts. Custom resource usage can be set through Helm in two ways:

Providing the path to one or more YAML files during helm install or helm upgrade with the --values flag:

grafana:
  resources:
    requests:
      memory: <value>
      cpu: <value>
    limits:
      memory: <value>
      cpu: <value>
prometheus:
  server:
    resources:
      requests:
        memory: <value>
        cpu: <value>
      limits:
        memory: <value>
        cpu: <value>
  configmapReload:
    prometheus:
      resources:
        requests:
          memory: <value>
          cpu: <value>
        limits:
          memory: <value>
          cpu: <value>

Modifying the resource values one at a time with the --set flag during helm install or helm upgrade:

--set=grafana.resources.[requests|limits].[memory|cpu]=<value> \
--set=prometheus.server.resources.[requests|limits].[memory|cpu]=<value> \
--set=prometheus.configmapReload.prometheus.resources.[requests|limits].[memory|cpu]=<value>

Configuring Veeam Kasten Resource Usage for Worker Pods

By default, Veeam Kasten does not assign resource requests or limits to temporary worker Pods. This allows Pods responsible for data movement during backup or restore operations to scale up as needed to ensure timely completion. If explicit resource settings are required in the environment, see the available methods below.

Configuring Cluster-Wide Worker Pod Resource Usage

Using Helm values, resource requests and limits can be set for the temporary worker Pods created by Veeam Kasten to perform operations. This method will set the same requests and limits for all injected Kanister sidecar containers used for Generic Volume Backup, as well as all other temporary worker Pods provisioned by Veeam Kasten.

As it may be undesirable to configure the same requests and limits for all temporary worker Pods across all applications, a granular alternative is described in the following section.

Note

Veeam Kasten affinity-pvc-group- Pods have fixed resource requests and limits that cannot be modified. These Pods are used only for PVC/node placement prior to data restore and do not require customization based on workload.

Note

If namespace-level resource limitations have been configured using LimitRange or ResourceQuota, the values below may not be applied, or the operation may fail. Make sure the resources you specify take those restrictions into account.
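
To illustrate the kind of restriction the note above refers to, the following is a sketch of a LimitRange. In a namespace with such an object, Kubernetes rejects containers whose limits exceed the `max` values, and containers created without explicit requests or limits receive the `defaultRequest` and `default` values. The object name, the `app-ns` namespace, and all values are illustrative assumptions.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits   # hypothetical name
  namespace: app-ns        # example namespace only
spec:
  limits:
    - type: Container
      max:                 # upper bound for any container limit
        cpu: "1"
        memory: 1Gi
      defaultRequest:      # requests applied when none are specified
        cpu: 100m
        memory: 128Mi
      default:             # limits applied when none are specified
        cpu: 500m
        memory: 512Mi
```

Under these assumptions, worker Pod limits configured through Helm would need to stay at or below 1 CPU and 1Gi of memory in that namespace.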

Cluster-wide resource requests and limits for Veeam Kasten worker Pods can be applied through Helm in two ways:

Providing the path to one or more YAML files during helm install or helm upgrade with the --values flag:

genericVolumeSnapshot:
  resources:
    requests:
      memory: <value>
      cpu: <value>
    limits:
      memory: <value>
      cpu: <value>

Modifying the resource values one at a time with the --set flag during helm install or helm upgrade:
```
--set=genericVolumeSnapshot.resources.[requests|limits].[memory|cpu]=<value>
```

Configuring Granular Worker Pod Resource Usage

The following approach allows granular resource requests and limits to be specified for different types of Veeam Kasten worker Pods, and also allows configuration on a per-application basis. The purpose is to accommodate "right-sizing" across different application profiles within a single cluster, rather than sizing all worker Pods based on the data mover requirements of the largest applications.

Granular resource configuration uses the Veeam Kasten-specific custom resources, ActionPodSpec and ActionPodSpecBinding. An ActionPodSpec specifies the Pod categories and their associated request and limit values. An ActionPodSpec resource may be applied to a specific namespace using an ActionPodSpecBinding, or may be applied via reference within a policy.

An ActionPodSpec can be created in any namespace and may be cross-referenced from other namespaces. An ActionPodSpecBinding must be created in the application namespace to which the referenced ActionPodSpec is being applied.

Note

To enable granular resource control, set the Helm flag workerPodCRDs.enabled to true.

You can also define a default ActionPodSpec during installation. This will have the lowest priority and will be used if there is no ActionPodSpec defined for the namespace or action.

--set=workerPodCRDs.defaultActionPodSpec.[name|namespace]=<value>
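
The same settings could equally be supplied through a values file passed with the --values flag. The sketch below mirrors the Helm paths shown above and assumes an ActionPodSpec named `aps-example` in the `kasten-io` namespace, as used in the examples on this page; substitute your own name and namespace.

```yaml
workerPodCRDs:
  enabled: true                 # enables granular resource control
  defaultActionPodSpec:         # lowest-priority fallback ActionPodSpec
    name: aps-example
    namespace: kasten-io
```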

Examples

An ActionPodSpec with an explicit resource configuration for the export-volume-to-repository Pod type and a default configuration for all other temporary worker Pods:

apiVersion: config.kio.kasten.io/v1alpha1
kind: ActionPodSpec
metadata:
  name: aps-example
  namespace: kasten-io
spec:
  options:
    - podType: "*"
      resources:
        requests:
          cpu: 125m
          memory: 128Mi
    - podType: "export-volume-to-repository"
      resources:
        limits:
          cpu: 2000m
          memory: 512Mi
        requests:
          cpu: 1000m
          memory: 256Mi

An ActionPodSpecBinding applying an ActionPodSpec to the temporary worker Pods within a specific namespace:

apiVersion: config.kio.kasten.io/v1alpha1
kind: ActionPodSpecBinding
metadata:
  name: apsb-example
  namespace: app-ns
spec:
  actionPodSpecRef:
    name: aps-example
    namespace: kasten-io

Alternatively, a Policy referencing an ActionPodSpec to affect temporary worker Pod resources provisioned as part of the backup action:

apiVersion: config.kio.kasten.io/v1alpha1
kind: Policy
metadata:
  name: policy-template-2
  namespace: kasten-io
spec:
  actions:
  - action: backup
    backupParameters:
      filters: {}
      ignoreExceptions: true
      profile:
        name: aws
        namespace: kasten-io
      actionPodSpec:
        name: aps-example
        namespace: kasten-io

Action Pod Types

This list contains Pod types that are affected by the specified settings. Typically, Pod types are associated with specific operations; however, on rare occasions, a single Pod type may be used for several similar operations. Pod type can also be found in the k10.kasten.io/actionPodType annotation on worker Pods.

Pod Type Value	Pod Type Description
`*`	Wildcard value to configure any worker Pod types not explicitly specified
`check-repository`	Performs checks on a repository during the export process
`create-repository`	Initializes a backup repository in an export location
`delete-block-data-from-repository`	Deletes backup data exported using block mode
`delete-collection`	Deletes the application manifest data associated with a backup
`delete-data-from-repository`	Deletes backup data exported using the default filesystem mode
`export-block-volume-to-repository`	Exports a snapshot using block mode
`export-volume-to-repository`	Exports a snapshot using the default filesystem mode
`image-copy`	Exports and restores container images from ImageStreams
`kanister-job`	Created by Blueprint actions to perform custom operations
`list-data-from-repository`	Lists data in the repository when retiring backups
`repository-operations`	Performs background operations such as repository scans and maintenance
`repository-server`	API server used for multiple operations including export, restore, and import
`restore-block-volume-from-repository`	Restores a volume exported using block mode
`restore-data-dr`	Restores Veeam Kasten from a Disaster Recovery backup
`restore-data-from-repository`	Restores a volume exported using the default filesystem mode
`upgrade-repository`	Upgrades the repository
`validate-repository`	Validates a remote repository when restoring a Disaster Recovery backup
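
As an illustration of the annotation mentioned above, the metadata of a worker Pod performing a filesystem-mode export might contain a fragment like the one below. Only the annotation key and the Pod type value are taken from this page; the rest of the Pod manifest is omitted.

```yaml
metadata:
  annotations:
    k10.kasten.io/actionPodType: export-volume-to-repository
```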

Worker Pod Resource Usage Configuration Priority

The cluster-wide genericVolumeSnapshot Helm setting holds the lowest priority and will be overridden by any applied ActionPodSpec. ActionPodSpec resources that are bound to a namespace using an ActionPodSpecBinding hold a lower priority than an ActionPodSpec explicitly specified as part of a policy. ActionPodSpec configurations from multiple sources are not merged; therefore, it is not possible to apply namespace-bound resources to one type of Pod and policy-specific resources to another.

It is also possible to override the resources of worker Pods globally through the Kanister Pod Override, but this approach is not recommended.

Configuring Metric Sidecar Resource Usage for Worker Pods

By default, Veeam Kasten provisions a sidecar container on temporary worker Pods to collect metrics for monitoring resource utilization. Using Helm, resource requests and limits can be set for this metric sidecar container. This setting is independent of both methods previously detailed for configuring worker Pod resources.

Custom resource requests and limits can be set through Helm in two ways:

Providing the path to one or more YAML files during helm install or helm upgrade with the --values flag:

kanisterPodMetricSidecar:
  resources:
    requests:
      memory: <value>
      cpu: <value>
    limits:
      memory: <value>
      cpu: <value>

Modifying the resource values one at a time with the --set flag during helm install or helm upgrade:
```
--set=kanisterPodMetricSidecar.resources.[requests|limits].[memory|cpu]=<value>
```