Resource Requirements
Veeam Kasten's resource requirements are almost always related to the number of applications in your Kubernetes cluster and the kind of data management operations being performed (e.g., snapshots vs. backups).
Some of the resource requirements are static (base resource requirements) while other resources are only required when certain work is done (dynamic resource requirements). The auto-scaling nature of Veeam Kasten ensures that resources consumed by dynamic requirements will always scale down to zero when no work is being performed.
While the recommendations below for both requests and limits should apply to most clusters, the final requirements will depend on your cluster and application scale, total amount of data, file size distribution, and data churn rate. To measure your particular requirements, you can use Prometheus or the Kubernetes Vertical Pod Autoscaler (VPA) with updates disabled.
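As a minimal sketch of the "VPA with updates disabled" approach, the manifest below asks the VPA recommender to observe one Veeam Kasten deployment without ever changing its Pods. It assumes the VPA custom resource definitions and recommender are already installed in the cluster, that Veeam Kasten runs in the `kasten-io` namespace, and that `catalog-svc` is the deployment of interest; the object name `catalog-svc-vpa` is hypothetical.

```yaml
# Recommendation-only VPA: collects usage data and publishes
# suggested requests, but never evicts or resizes Pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: catalog-svc-vpa      # hypothetical name
  namespace: kasten-io       # assumed install namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: catalog-svc        # deployment to observe
  updatePolicy:
    updateMode: "Off"        # updates disabled: recommend only
```

After the recommender has gathered enough data, the suggested values should appear in the object's status, for example via `kubectl describe vpa catalog-svc-vpa -n kasten-io`.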
Requirement Types
Base Requirements: These are the core resources needed for Veeam Kasten's internal scheduling and cleanup services, which are mostly driven by monitoring and catalog scale requirements. The resource footprint for these base requirements is usually static and generally does not noticeably grow with either a growth in catalog size (number of Kubernetes resources protected) or number of applications protected.
Disaster Recovery: These are the resources needed to perform a DR of the Veeam Kasten install and are predominantly used to compress, deduplicate, encrypt, and transfer the Veeam Kasten catalog to object storage. Providing additional resources can also speed up the DR operation. The DR resource footprint is dynamic and scales down to zero when a DR is not being performed.
Backup Requirements: Resources for backup are required when data is transferred from volume snapshots to object storage or NFS file storage. While the backup requirements depend on your data, churn rate, and file system layout, the requirements are not unbounded and can easily fit in a relatively narrow band. Providing additional resources can also speed up backup operations. To prevent unbounded parallelism when protecting a large number of workloads, Veeam Kasten bounds the number of simultaneous backup jobs (default 9). The backup resource footprint is dynamic and scales down to zero when a backup is not being performed.
Requirement Guidelines
The table below lists the resource requirements for a Veeam Kasten install protecting 100 applications or namespaces.
Note that DR jobs also count toward the maximum parallelism limit (`N`). You can therefore run either `N` simultaneous backup jobs, or `N-1` simultaneous backup jobs concurrently with 1 DR job.
| Type | Requested CPU (Cores) | Limit CPU (Cores) | Requested Memory (GB) | Limit Memory (GB) |
|---|---|---|---|---|
| Base | 1 | 2 | 1 | 4 |
| DR | 1 | 1 | 0.3 | 0.3 |
| Dynamic (per parallel job) | 1 | 1 | 0.4 | 0.4 |
| Total | 3 | 4 | 1.8 | 4.8 |
Note
Veeam Kasten temporarily consumes approximately 3 GB of ephemeral storage on a worker node for each concurrent volume export or restore operation. Default worker node storage configuration can vary by distribution. Configuring worker nodes with 100 GB of storage is recommended to prevent interruptions to Veeam Kasten operations.
Configuring Veeam Kasten Resource Usage for Core Pods
Using Helm values, resource requests and limits can be set for the core Pods that make up Veeam Kasten's base requirements. Kubernetes resource management is at the container level, so in order to set resource values, you will need to provide both the deployment and container names. Custom resource usage can be set through Helm in two ways:
- Providing the path to one or more YAML files during `helm install` or `helm upgrade` with the `--values` flag:

      resources:
        <deployment-name>:
          <container-name>:
            requests:
              memory: <value>
              cpu: <value>
            limits:
              memory: <value>
              cpu: <value>

  Note
  See Resource units in Kubernetes for details on how to specify valid memory and CPU values.

  For example, this file will modify the settings for the `catalog-svc` container, `upgrade-init` init container, and `kanister-sidecar` sidecar container, which runs in the pod created by the `catalog-svc` deployment:

      resources:
        catalog-svc:
          catalog-svc:
            requests:
              memory: "1.5Gi"
              cpu: "300m"
            limits:
              memory: "3Gi"
              cpu: "1"
          upgrade-init:
            requests:
              memory: "120Mi"
              cpu: "100m"
            limits:
              memory: "360Mi"
              cpu: "300m"
          kanister-sidecar:
            requests:
              memory: "800Mi"
              cpu: "250m"
            limits:
              memory: "950Mi"
              cpu: "900m"

- Modifying the resource values one at a time with the `--set` flag during `helm install` or `helm upgrade`:

      --set=resources.<deployment-name>.<container-name>.[requests|limits].[memory|cpu]=<value>

  For the equivalent behavior of the example above, the following values can be provided:

      --set=resources.catalog-svc.catalog-svc.requests.memory=1.5Gi \
      --set=resources.catalog-svc.catalog-svc.requests.cpu=300m \
      --set=resources.catalog-svc.catalog-svc.limits.memory=3Gi \
      --set=resources.catalog-svc.catalog-svc.limits.cpu=1 \
      --set=resources.catalog-svc.upgrade-init.requests.memory=120Mi \
      --set=resources.catalog-svc.upgrade-init.requests.cpu=100m \
      --set=resources.catalog-svc.upgrade-init.limits.memory=360Mi \
      --set=resources.catalog-svc.upgrade-init.limits.cpu=300m \
      --set=resources.catalog-svc.kanister-sidecar.requests.memory=800Mi \
      --set=resources.catalog-svc.kanister-sidecar.requests.cpu=250m \
      --set=resources.catalog-svc.kanister-sidecar.limits.memory=950Mi \
      --set=resources.catalog-svc.kanister-sidecar.limits.cpu=900m

When adjusting a container's resource limits or requests, if any setting is left empty, the Helm chart will assume it should be unspecified. Likewise, providing empty settings for a container will result in no limits/requests being applied.
For example, the following Helm values file will yield no specified resource requests or limits for the `kanister-sidecar` container and only a CPU limit for the `jobs-svc` container, which runs in the pod created by the `jobs-svc` deployment:

    resources:
      catalog-svc:
        kanister-sidecar:
      jobs-svc:
        jobs-svc:
          limits:
            cpu: "50m"

Prometheus Pod's Resources
Resource requests and limits can be added to the Prometheus pod through the values of the Prometheus child Helm chart. Custom resource usage can be set through Helm in two ways:

- Providing the path to one or more YAML files during `helm install` or `helm upgrade` with the `--values` flag:

      prometheus:
        server:
          resources:
            requests:
              memory: <value>
              cpu: <value>
            limits:
              memory: <value>
              cpu: <value>
        configmapReload:
          prometheus:
            resources:
              requests:
                memory: <value>
                cpu: <value>
              limits:
                memory: <value>
                cpu: <value>

- Modifying the resource values one at a time with the `--set` flag during `helm install` or `helm upgrade`:

      --set=prometheus.server.resources.[requests|limits].[memory|cpu]=<value> \
      --set=prometheus.configmapReload.prometheus.resources.[requests|limits].[memory|cpu]=<value>

Configuring Veeam Kasten Resource Usage for Worker Pods
By default, Veeam Kasten does not assign resource requests or limits to temporary worker Pods. This allows Pods responsible for data movement during backup or restore operations to scale up as needed to ensure timely completion. If explicit resource settings are required in the environment, see the available methods below.
Configuring Cluster-Wide Worker Pod Resource Usage
Using Helm values, resource requests and limits can be set for the temporary worker Pods created by Veeam Kasten to perform operations. This method will set the same requests and limits for all injected Kanister sidecar containers used for Generic Volume Backup, as well as all other temporary worker Pods provisioned by Veeam Kasten.
As it may be undesirable to configure the same requests and limits for all temporary worker Pods across all applications, a granular alternative is described in the following section.
Note
Veeam Kasten affinity-pvc-group- Pods have fixed resource requests and limits that cannot be modified. These Pods are used only for PVC/node placement prior to data restore and do not require customization based on workload.
Note
If namespace level resource limitations have been configured using LimitRange or ResourceQuota, the values below may be prevented from being applied or result in failure of the operation. Make sure resources are specified taking those restrictions into consideration.
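To illustrate the kind of conflict described in the note above, here is a sketch of a standard Kubernetes `LimitRange`. The object name `app-ns-limits` and all values are hypothetical; `app-ns` is reused from the examples later on this page. With this object in place, any container created in `app-ns` with a CPU limit above `1` or a memory limit above `1Gi` would be rejected at admission, which would cause an operation relying on that worker Pod to fail.

```yaml
# Hypothetical namespace policy that caps per-container resources.
apiVersion: v1
kind: LimitRange
metadata:
  name: app-ns-limits    # hypothetical name
  namespace: app-ns
spec:
  limits:
    - type: Container
      max:               # upper bound for any container's limits
        cpu: "1"
        memory: 1Gi
      min:               # lower bound for any container's requests
        cpu: 50m
        memory: 64Mi
```

Worker Pod settings chosen for a namespace governed by such a policy would need to fall between the `min` and `max` values.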
Cluster-wide resource requests and limits for Veeam Kasten worker Pods can be applied through Helm in two ways:
- Providing the path to one or more YAML files during `helm install` or `helm upgrade` with the `--values` flag:

      genericVolumeSnapshot:
        resources:
          requests:
            memory: <value>
            cpu: <value>
          limits:
            memory: <value>
            cpu: <value>

- Modifying the resource values one at a time with the `--set` flag during `helm install` or `helm upgrade`:

      --set=genericVolumeSnapshot.resources.[requests|limits].[memory|cpu]=<value>

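As a concrete illustration of the cluster-wide setting, a values file might look like the sketch below. The numbers are placeholders chosen only for illustration, not sizing recommendations, and the file name in the comment is hypothetical.

```yaml
# values-worker-pods.yaml (hypothetical file name)
# Applies the same requests and limits to all temporary worker Pods
# and injected Kanister sidecar containers.
genericVolumeSnapshot:
  resources:
    requests:
      memory: 256Mi   # placeholder value
      cpu: 250m       # placeholder value
    limits:
      memory: 1Gi     # placeholder value
      cpu: "1"        # placeholder value
```

The file would then be passed to `helm install` or `helm upgrade` with the `--values` flag. Any setting omitted from the file is simply left unspecified.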
Configuring Granular Worker Pod Resource Usage
The following approach allows for specifying granular resource requests and limits for different types of Veeam Kasten worker Pods, as well as allowing configuration on a per application basis. The purpose is to accommodate "right-sizing" across different application profiles within a single cluster, rather than sizing all worker Pods based on the data mover requirements of the largest applications.
Granular resource configuration uses the Veeam Kasten-specific custom resources `ActionPodSpec` and `ActionPodSpecBinding`. An `ActionPodSpec` specifies the Pod categories and their associated request and limit values. An `ActionPodSpec` resource may be applied to a specific namespace using an `ActionPodSpecBinding`, or may be applied via reference within a policy.
An `ActionPodSpec` can be created in any namespace and may be cross-referenced from other namespaces. An `ActionPodSpecBinding` must be created in the application namespace to which the referenced `ActionPodSpec` is being applied.
Note
To enable granular resource control, set the Helm flag `workerPodCRDs.enabled` to `true`.
You can also define a default `ActionPodSpec` during installation. This will have the lowest priority and will be used if there is no `ActionPodSpec` defined for the namespace or action.

    --set=workerPodCRDs.defaultActionPodSpec.[name|namespace]=<value>

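The same settings can presumably be expressed in a Helm values file. The sketch below is derived only from the `--set` paths shown above, so the nesting is an assumption; the `aps-example` name and `kasten-io` namespace are borrowed from the examples that follow and stand in for whichever `ActionPodSpec` should act as the default.

```yaml
# Enables the ActionPodSpec / ActionPodSpecBinding custom resources
# and points the install at a default ActionPodSpec.
workerPodCRDs:
  enabled: true
  defaultActionPodSpec:
    name: aps-example      # assumed: an existing ActionPodSpec
    namespace: kasten-io   # assumed: namespace of that ActionPodSpec
```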
Examples
An `ActionPodSpec` with an explicit resource configuration for the `export-volume-to-repository` Pod type and a default configuration for all other temporary worker Pods:

    apiVersion: config.kio.kasten.io/v1alpha1
    kind: ActionPodSpec
    metadata:
      name: aps-example
      namespace: kasten-io
    spec:
      options:
        - podType: "*"
          resources:
            requests:
              cpu: 125m
              memory: 128Mi
        - podType: "export-volume-to-repository"
          resources:
            limits:
              cpu: 2000m
              memory: 512Mi
            requests:
              cpu: 1000m
              memory: 256Mi

An `ActionPodSpecBinding` applying an `ActionPodSpec` to the temporary worker Pods within a specific namespace:

    apiVersion: config.kio.kasten.io/v1alpha1
    kind: ActionPodSpecBinding
    metadata:
      name: apsb-example
      namespace: app-ns
    spec:
      actionPodSpecRef:
        name: aps-example
        namespace: kasten-io

Alternatively, a `Policy` referencing an `ActionPodSpec` to affect temporary worker Pod resources provisioned as part of the `backup` action:

    apiVersion: config.kio.kasten.io/v1alpha1
    kind: Policy
    metadata:
      name: policy-template-2
      namespace: kasten-io
    spec:
      actions:
        - action: backup
          backupParameters:
            filters: {}
            ignoreExceptions: true
            profile:
              name: aws
              namespace: kasten-io
          actionPodSpec:
            name: aps-example
            namespace: kasten-io

Action Pod Types
This list contains the Pod types that are affected by the specified settings. Typically, Pod types are associated with specific operations; however, on rare occasions, a single Pod type may be used for several similar operations. The Pod type can also be found in the `k10.kasten.io/actionPodType` annotation on worker Pods.
| Pod Type Value | Pod Type Description |
|---|---|
| `*` | Wildcard value to configure any worker Pod types not explicitly specified |
| | Performs checks on a repository during the export process |
| | Initializes a backup repository in an export location |
| | Deletes backup data exported using block mode |
| | Deletes the application manifest data associated with a backup |
| | Deletes backup data exported using the default filesystem mode |
| | Exports a snapshot using block mode |
| | Exports a snapshot using the default filesystem mode |
| | Exports and restores container images from ImageStreams |
| | Created by Blueprint actions to perform custom operations |
| | Lists data in the repository when retiring backups |
| | Performs background operations such as repository scans and maintenance |
| | API server used for multiple operations including export, restore, and import |
| | Restores a volume exported using block mode |
| | Restores Veeam Kasten from a Disaster Recovery backup |
| | Restores a volume exported using the default filesystem mode |
| | Upgrades the repository |
| | Validates a remote repository when restoring a Disaster Recovery backup |
Worker Pod Resource Usage Configuration Priority
The cluster-wide `genericVolumeSnapshot` Helm setting holds the lowest priority and will be overridden by any applied `ActionPodSpec`. `ActionPodSpec` resources that are bound to a namespace using an `ActionPodSpecBinding` hold a lower priority than an `ActionPodSpec` explicitly specified as part of a policy. `ActionPodSpec` configurations from multiple sources are not merged; therefore, it is not possible to apply namespace-bound resources to one type of Pod and policy-specific resources to another.
It is possible to override resources of worker pods globally through the Kanister Pod Override, but this approach is not recommended.
Configuring Metric Sidecar Resource Usage for Worker Pods
By default, Veeam Kasten provisions a sidecar container on temporary worker Pods used to collect metrics to monitor resource utilization. Using Helm, resource requests and limits can be added to the metric sidecar container added to worker Pods. This setting is independent of either method previously detailed for configuring worker Pod resources.
Custom resource requests and limits can be set through Helm in two ways:
- Providing the path to one or more YAML files during `helm install` or `helm upgrade` with the `--values` flag:

      workerPodMetricSidecar:
        resources:
          requests:
            memory: <value>
            cpu: <value>
          limits:
            memory: <value>
            cpu: <value>

- Modifying the resource values one at a time with the `--set` flag during `helm install` or `helm upgrade`:

      --set=workerPodMetricSidecar.resources.[requests|limits].[memory|cpu]=<value>