Because Veeam Kasten is a stateful application running on the
cluster, it must back up its own data to enable recovery in the
event of a disaster. This is handled by the
Veeam Kasten Disaster Recovery (KDR) policy. In particular, KDR
provides the ability to recover the Veeam Kasten platform
from a variety of disasters, such as the unintended deletion of
Veeam Kasten or its restore points, the failure of the underlying
storage used by Veeam Kasten, or even the accidental
destruction of the Kubernetes cluster on which Veeam Kasten is deployed.
The KDR mode specifies how internal Veeam Kasten resources are protected. The
mode can be set either before or after enabling the KDR policy. Changes
to the KDR mode only apply to future KDR policy runs.
All installations default to Legacy DR mode. Quick DR mode is available
and recommended for installations using snapshot-capable storage.
Warning
Quick DR mode should only be enabled if the storage provisioner
used for Veeam Kasten PVCs supports both the creation of snapshots
and the ability to restore the existing volume from a snapshot.
To enable Quick DR mode, install or upgrade Veeam Kasten
with the --set kastenDisasterRecovery.quickMode.enabled=true Helm value.
To enable Legacy DR mode, install or upgrade Veeam Kasten
with the --set kastenDisasterRecovery.quickMode.enabled=false Helm value.
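As a minimal sketch, the same setting can be kept in a values file instead of being passed with --set. The file name below is hypothetical, and the release name k10, chart kasten/k10, and namespace kasten-io in the commented command are assumptions that should be adjusted to match the actual installation.

```shell
# Write a Helm values fragment that enables Quick DR mode.
# The file name is arbitrary (hypothetical).
cat > kdr-mode-values.yaml <<'EOF'
kastenDisasterRecovery:
  quickMode:
    enabled: true   # set to false to return to Legacy DR mode
EOF

# Assumed release name, chart, and namespace; uncomment and adjust before use.
# helm upgrade k10 kasten/k10 --namespace kasten-io --reuse-values \
#   --values kdr-mode-values.yaml
```

Because changes to the KDR mode only apply to future KDR policy runs, the new mode takes effect the next time the policy runs.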
Refer to the details below to understand the key differences between
each mode.
Quick DR
- Snapshot-capable storage for Veeam Kasten PVCs required
- Incrementally exports only necessary data from the catalog database
  and creates a local snapshot of the catalog PVC on each policy run
- Enables recovery of exported restore points on any cluster
- Enables recovery of local restore points, exported
  restore points, and action history only where the local catalog
  snapshot is available (i.e. in-place recovery on the
  original cluster)
- Faster KDR backup and recovery versus Legacy DR
- Consumes less location profile storage versus Legacy DR
- Protects additional Veeam Kasten resource types versus Legacy DR
Legacy DR
- No dependency on snapshot-capable storage for Veeam
  Kasten PVCs
- Exports a full dump of the catalog database
  on each policy run
- Enables recovery of local restore points, exported
  restore points, and action history
Enabling Veeam Kasten Disaster Recovery (KDR) creates a dedicated
policy within Veeam Kasten to back up its resources and catalog data
to an external location profile.
Note
Veeam Repository location profiles cannot
be used as a destination for KDR backups.
Note
It is strongly recommended to use a location profile
that supports immutable backups to ensure
restore point catalog data can be recovered in the event of
incidents including ransomware and accidental deletion.
The Veeam Kasten Disaster Recovery settings are accessible via the
Setup Kasten DR page under the Settings menu in the
navigation sidebar. For new installations, these settings are
also accessible using the link located within the alerts panel.
Select the Setup Kasten DR page under the Settings menu in the
navigation sidebar.
Enabling KDR requires selecting a Location
Profile for the exported KDR backups and providing
a passphrase to encrypt the data using AES-256-GCM.
The passphrase can be provided as a raw string
or as a reference to a secret in HashiCorp Vault or AWS Secrets Manager.
Enable KDR by selecting a valid location profile and providing
either a raw passphrase or secret management credentials, then clicking
the Enable Kasten DR button.
Note
If providing a raw passphrase,
save it securely outside the cluster.
A confirmation message with the clusterID will be displayed
when KDR is enabled. This ID is used as a prefix to
the object storage or NFS file storage location where Veeam Kasten
saves its exported backup data.
Warning
After enabling Veeam Kasten Disaster Recovery, it is essential
to retain the following to successfully recover Veeam Kasten
from a disaster:
- The source clusterID
- The KDR passphrase (or external secret manager details)
- The KDR location profile details and credentials
Without this information, restore point catalog recovery will not be possible.
The clusterID value can also be accessed by using the
following kubectl command.
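The command itself is not reproduced here. As a sketch, and assuming that the clusterID corresponds to the UID of the default namespace, it could be retrieved with a small helper like the one below. The function name is hypothetical, and the output should be checked against the clusterID shown in the confirmation message.

```shell
# Print the UID of the "default" namespace.
# Assumption: this UID is the value Veeam Kasten reports as the clusterID.
get_cluster_id() {
  kubectl get namespace default -o jsonpath="{.metadata.uid}{'\n'}"
}
```

Calling get_cluster_id against the source cluster prints a single UID, which can then be stored alongside the passphrase and location profile details.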
Managing the Veeam Kasten Disaster Recovery Policy
A policy named k10-disaster-recovery-policy that implements
Veeam Kasten Disaster Recovery (KDR) will automatically be created when
KDR is enabled. This policy can be viewed through the Policies
page in the navigation sidebar.
Click Run Once on the k10-disaster-recovery-policy to start a
manual backup.
Click Edit to modify the frequency and retention settings. It is
recommended that the KDR policy match the frequency of the lowest RPO
policy on the cluster.
Veeam Kasten Disaster Recovery can be disabled by clicking
the Disable Kasten DR button on the Setup Kasten DR page,
which is found under the Settings menu in the navigation sidebar.
Warning
It is not recommended to run Veeam Kasten without KDR enabled.
To recover from a KDR backup using the UI, follow these steps:
1. On a new cluster, install a fresh Veeam Kasten instance in the same
   namespace as the original Veeam Kasten instance.
2. On the new cluster, create a location profile by providing the
   bucket information and credentials for the object storage
   location or NFS file storage location where previous Veeam
   Kasten backups are stored.
3. On the new cluster, navigate to the Restore Kasten
   page under the Settings menu in the navigation sidebar.
4. In the Profile drop-down, select the location profile created
   in step 2.
5. For Cluster ID, provide the ID of the original cluster with
   Veeam Kasten Disaster Recovery enabled. This ID can be found
   on the Setup Kasten DR page of the original cluster that
   currently has Veeam Kasten Disaster Recovery enabled.
6. Provide the passphrase using the method chosen when Disaster
   Recovery was enabled:
   - Raw passphrase: Provide the passphrase used when enabling
     Disaster Recovery.
   - HashiCorp Vault: Provide the Key Value Secrets Engine Version,
     Mount, Path, and Passphrase Key stored in a HashiCorp Vault secret.
   - AWS Secrets Manager: Provide the secret name, its associated region,
     and the key.
   Note
   For immutable location profiles, a previous
   point in time can be provided to filter out any restore points
   newer than the specified time in the next step. If no specific
   date is chosen, all available restore points are displayed,
   with the most recent ones appearing first.
7. Click the Next button to start the validation process.
   If validation succeeds, a drop-down containing the available
   restore points will be displayed.
   Note
   All times are displayed in the local timezone of the
   client's browser.
8. Select the desired restore point and click the Next button.
9. Review the summary and click the Start Restore button to
   begin the restore process.
After a successful restore, a link to the dashboard is displayed,
along with information about ownership and deletion of
the configmap.
In Veeam Kasten v7.5.0 and above, KDR recoveries can be performed via
API or CLI using DR API Resources.
Recovering from a KDR backup using the CLI involves the following
sequence of steps:
1. Create a Kubernetes Secret, k10-dr-secret, using the passphrase
   provided while enabling Disaster Recovery as described in
   Specifying a Disaster Recovery Passphrase.
2. Install a fresh Veeam Kasten instance in the same namespace as the above
   Secret.
3. Provide bucket information and credentials for the object storage
   location or NFS file storage location where previous Veeam Kasten backups
   are stored.
4. Create a KastenDRReview resource providing
   the source cluster information.
5. Create a KastenDRRestore resource
   referring to the KastenDRReview resource and choosing one of the restore
   points provided in the KastenDRReview status.
6. Alternatively, skip steps 4 and 5 and create the KastenDRRestore
   resource directly with the source cluster information.
7. Delete the KastenDRReview and KastenDRRestore resources after the restore
   completes.
The k10restore Helm chart is deprecated as of the Veeam Kasten v7.5.0
release and will be removed in a future release.
Recovering from a KDR backup using k10restore involves the
following sequence of actions:
1. Create a Kubernetes Secret, k10-dr-secret, using the passphrase
   provided while enabling Disaster Recovery.
2. Install a fresh Veeam Kasten instance in the same namespace as the above
   Secret.
3. Provide bucket information and credentials for the object storage
   location or NFS file storage location where previous Veeam Kasten backups
   are stored.
4. Restore the Veeam Kasten backup.
5. Uninstall the Veeam Kasten restore instance after recovery
   (recommended).
Note
If Veeam Kasten was previously installed in FIPS mode, ensure the fresh
Veeam Kasten instance is also installed in FIPS mode.
Note
If the Veeam Kasten backup is stored using an
NFS File Storage Location, it is
important that the same NFS share is reachable from the recovery cluster
and is mounted on all nodes where Veeam Kasten is installed.
Currently, Veeam Kasten Disaster Recovery encrypts all artifacts using
the AES-256-GCM algorithm. The passphrase entered while enabling
Disaster Recovery is used for this encryption. On the cluster used for
Veeam Kasten recovery, the Secret k10-dr-secret must therefore be
created using that same passphrase in the Veeam Kasten
namespace (default kasten-io).
The passphrase can be provided as a raw string or as a reference
to a secret in HashiCorp Vault or AWS Secrets Manager.
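For the raw string case, a minimal sketch follows. It assumes the raw passphrase is stored under the literal named key, mirroring the key= literal used in the HashiCorp Vault and AWS Secrets Manager commands; the helper function name is hypothetical.

```shell
# Create k10-dr-secret from a raw passphrase.
# Usage: create_dr_secret_raw <passphrase> [namespace]
# Assumption: the raw passphrase is stored under the literal named "key".
create_dr_secret_raw() {
  kubectl create secret generic k10-dr-secret \
    --namespace "${2:-kasten-io}" \
    --from-literal key="$1"
}
```

For example, create_dr_secret_raw '<passphrase>' would create the Secret in the default kasten-io namespace. The passphrase must be the same one supplied when Disaster Recovery was enabled.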
Specifying the passphrase as a HashiCorp Vault secret:
$ kubectl create secret generic k10-dr-secret \
    --namespace kasten-io \
    --from-literal source=vault \
    --from-literal vault-kv-version=<version-of-key-value-secrets-engine> \
    --from-literal vault-mount-path=<path-where-key-value-engine-is-mounted> \
    --from-literal vault-secret-path=<path-from-mount-to-passphrase-key> \
    --from-literal key=<name-of-passphrase-key>

# Example
$ kubectl create secret generic k10-dr-secret \
    --namespace kasten-io \
    --from-literal source=vault \
    --from-literal vault-kv-version=KVv1 \
    --from-literal vault-mount-path=secret \
    --from-literal vault-secret-path=k10 \
    --from-literal key=passphrase
The supported values for vault-kv-version are KVv1 and KVv2.
Note
Using a passphrase from HashiCorp Vault also requires enabling
HashiCorp Vault authentication when installing the kasten/k10restore
Helm chart. Refer to Enabling HashiCorp Vault using
Token Auth or
Kubernetes Auth.
Specifying the passphrase as an AWS Secrets Manager secret:
$ kubectl create secret generic k10-dr-secret \
    --namespace kasten-io \
    --from-literal source=aws \
    --from-literal aws-region=<aws-region-for-secret> \
    --from-literal key=<aws-secret-name>

# Example
$ kubectl create secret generic k10-dr-secret \
    --namespace kasten-io \
    --from-literal source=aws \
    --from-literal aws-region=us-east-1 \
    --from-literal key=k10/dr/passphrase
When reinstalling Veeam Kasten on the same cluster, it is
important to clean up the namespace in which Veeam Kasten was
previously installed before creating the passphrase Secret described above.
The overrideResources flag must be set to true when using
Quick Disaster Recovery. Since the Disaster Recovery operation involves
creating or replacing resources, confirmation should be provided
by setting this flag.
Veeam Kasten provides the ability to apply labels and annotations to all
temporary worker pods created during Veeam Kasten recovery as part of its
operation. The labels and annotations can be set through the podLabels and
podAnnotations Helm flags, respectively. For example, if using a
values.yaml file:
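The example itself is not reproduced here. A minimal sketch of such a values file follows; the file name and every label and annotation key and value are hypothetical placeholders, and only the podLabels and podAnnotations keys come from the text above.

```shell
# Write a values file that labels and annotates the temporary worker pods.
# All keys and values under podLabels/podAnnotations are illustrative only.
cat > k10restore-pod-metadata-values.yaml <<'EOF'
podLabels:
  app.kubernetes.io/component: "database"
  topology.kubernetes.io/region: "us-east-1"
podAnnotations:
  config.kubernetes.io/local-config: "true"
  kubernetes.io/description: "Description"
EOF
```

The file can then be supplied to the restore chart with the --values Helm flag.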
The restore job always restores the restore point catalog and artifact
information. If the restore of other resources (options include profiles,
policies, secrets) needs to be skipped, the skipResource flag can be used.
The timeout for the entire restore process can be configured with the Helm
value restore.timeout. This value is an integer and is expressed
in minutes.
If the Disaster Recovery Location Profile was configured for
Immutable Backups, Veeam Kasten can be
restored to an earlier point in time. The protection period chosen when
creating the profile determines how far in the past the point-in-time
can be. Set the pointInTime Helm value to the desired timestamp.
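The restore options described above (overrideResources, skipResource, restore.timeout, and pointInTime) could be collected in a single values file, as in the sketch below. The file name, the 90-minute timeout, the timestamp, and the comma-separated format of skipResource are assumptions chosen for illustration.

```shell
# Write a values file combining the k10restore options discussed above.
# All values shown are examples, not recommendations.
cat > k10restore-options-values.yaml <<'EOF'
# Confirm that resources may be created or replaced (required for Quick DR).
overrideResources: true
# Skip restoring these resource types; the catalog is always restored.
skipResource: "profiles,policies"
# Overall restore timeout, in minutes.
restore:
  timeout: 90
# Restore to an earlier point in time (immutable location profiles only).
pointInTime: "2022-01-02T15:04:05Z"
EOF
```

Keeping these options in a file makes it easier to review exactly what a restore will do before running it.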
Restoring Veeam Kasten Backup with Iron Bank Kasten Images
The general instructions found in
Restoring Veeam Kasten with k10restore
can be used for restoring Veeam Kasten using Iron Bank
hardened images with a few changes.
Specific helm values are used to ensure that the Veeam Kasten
restore helm chart only uses Iron Bank images.
The values file must be downloaded by running:
This file is protected and should not be modified. It is necessary
to specify all other values using the corresponding helm flags, such as
--set, --values, etc.
Credentials for Registry1 must be provided in order to successfully pull
the images. These should already have been created as part of re-deploying a
new Veeam Kasten instance; therefore, only the name of the secret should be
used here.
The following set of flags should be added to the instructions found in
Restoring Veeam Kasten with k10restore to use
Iron Bank images for Veeam Kasten recovery:
To ensure that certified cryptographic modules are utilized, you must install
the k10restore chart with additional Helm values that can be found here: FIPS
values. These should be added to the
instructions found in
Restoring Veeam Kasten with k10restore
for Veeam Kasten disaster recovery:
Restoring Veeam Kasten Backup in an Air-Gapped Environment
For air-gapped installations, it is assumed that the k10offline tool was
used to push the images to a private container registry.
The following command can be used to instruct k10restore to run in air-gapped mode.
Restoring Veeam Kasten Backup with Google Workload Identity Federation
Veeam Kasten can be restored from a Google Cloud Storage bucket using
Google Workload Identity Federation. Follow the instructions
provided here to restore Veeam Kasten with
this option.
Recovering from a Veeam Kasten backup involves the following sequence of
actions:
1. Install a fresh Veeam Kasten instance.
2. Configure a Location Profile from
   which the Veeam Kasten backup will be restored.
3. Create a Kubernetes Secret named k10-dr-secret in the same namespace
   as the Veeam Kasten install, with the passphrase given when disaster
   recovery was enabled on the previous Veeam Kasten instance.
   The commands are detailed here.
4. Create a K10restore instance. The required values are:
   - Cluster ID: value given when disaster recovery was enabled
     on the previous Veeam Kasten instance.
   - Profile name: name of the Location Profile configured in step 2.
   The optional values are:
   - Point in time: time (RFC3339) at which to evaluate restore data.
     Example "2022-01-02T15:04:05Z".
   - Resources to skip: can be used to skip restore of specific resources.
     Example "profile,policies".
5. After recovery, deleting the K10restore instance is recommended.
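The point-in-time value must be an RFC3339 timestamp. As a small sketch, a UTC timestamp in that form can be produced with the standard date utility; the helper name is hypothetical.

```shell
# Print the current UTC time in RFC3339 form (same shape as 2022-01-02T15:04:05Z).
rfc3339_now() {
  date -u +"%Y-%m-%dT%H:%M:%SZ"
}

rfc3339_now
```

Recording such a timestamp at a known-good moment gives a value that can later be supplied as the point in time.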
Operator K10restore form view with Enable HashiCorp Vault set to False
Operator K10restore form view with Enable HashiCorp Vault set to True
Using the Restored Veeam Kasten in Place of the Original
The newly restored Veeam Kasten includes a safety mechanism to prevent
it from performing critical background maintenance operations on backup
data in storage. These operations are exclusive, meaning that only one
Veeam Kasten instance should perform them at a time.
The DR-restored Veeam Kasten initially assumes that it does not have
permission to perform these maintenance tasks. This assumption is
made in case the original source Veeam Kasten instance is still running,
for example when testing the DR restore procedure in
a secondary test cluster while the primary production Veeam Kasten is
still active.
If no other Veeam Kasten instances are accessing the same sets of backup
data (i.e., the original Veeam Kasten has been uninstalled and only the new
DR-restored Veeam Kasten remains), it can be signaled that the new Veeam
Kasten is now eligible to take over the maintenance duties by deleting
the following resource:
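The resource is not named here. The sketch below assumes it is a ConfigMap called k10-dr-remove-to-get-ownership in the Veeam Kasten namespace. Both the name and the helper function are assumptions, so confirm the resource exists (for example, by listing the ConfigMaps in the namespace) before deleting anything.

```shell
# Signal that the DR-restored instance may take over maintenance duties.
# Assumptions: the resource is a ConfigMap named k10-dr-remove-to-get-ownership,
# and Veeam Kasten is installed in kasten-io unless a namespace is passed.
take_kdr_ownership() {
  kubectl delete configmap k10-dr-remove-to-get-ownership \
    --namespace "${1:-kasten-io}"
}
```

The deletion is deliberately wrapped in a function so that it only runs when take_kdr_ownership is called explicitly.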
It is critical that you delete this resource only when you are prepared
to make the permanent cutover to the new DR-restored Veeam Kasten instance.
Running multiple Veeam Kasten instances simultaneously, each assuming
ownership, can corrupt backup data.