Storage Integration
K10 supports direct integration with public cloud storage vendors and Ceph, as well as CSI integration. While most integrations are transparent, the sections below document the configuration needed for the exceptions.
Direct Provider Integration
K10 supports seamless and direct storage integration with a number of storage providers. The following storage providers are either automatically discovered and configured within K10 or can be configured for direct integration:
Azure Managed Disks
Ceph (RBD)
Cinder-based providers on OpenStack
Veeam Backup (snapshot data export only)
Container Storage Interface (CSI)
Apart from direct storage provider integration, K10 also supports invoking volume snapshots operations via the Container Storage Interface (CSI). To ensure that this works correctly, please ensure the following requirements are met.
CSI Requirements
Kubernetes v1.14.0 or higher
The VolumeSnapshotDataSource feature has been enabled in the Kubernetes cluster
A CSI driver that has Volume Snapshot support. Please look at the list of CSI drivers to confirm snapshot support.
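These requirements can be checked quickly with kubectl before installing K10. The commands below are a minimal sketch; the exact output varies by distribution and CSI driver.
# Confirm the cluster version meets the minimum requirement
$ kubectl version

# List the CSI drivers registered with the cluster
$ kubectl get csidrivers

# Confirm the volume snapshot CRDs are installed
$ kubectl get crd volumesnapshotclasses.snapshot.storage.k8s.io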
Pre-Flight Checks
Assuming that the default kubectl context is pointed to a cluster with CSI enabled, CSI pre-flight checks can be run by deploying the primer tool with a specified StorageClass. This tool runs in a pod in the cluster and performs the following operations:
Creates a sample application with a persistent volume and writes some data to it
Takes a snapshot of the persistent volume
Creates a new volume from the persistent volume snapshot
Validates the data in the new persistent volume
First, run the following command to derive the list of provisioners along with their StorageClasses and VolumeSnapshotClasses.
$ curl -s https://docs.kasten.io/tools/k10_primer.sh | bash
Then, run the following command with a valid StorageClass to deploy the pre-check tool:
$ curl -s https://docs.kasten.io/tools/k10_primer.sh | bash /dev/stdin -s ${STORAGE_CLASS}
CSI Snapshot Configuration
For each CSI driver, ensure that a VolumeSnapshotClass has been added with the K10 annotation (k10.kasten.io/is-snapshot-class: "true").
Note that CSI snapshots are not durable. In particular, CSI snapshots have a namespaced VolumeSnapshot object and a non-namespaced VolumeSnapshotContent object. With the default (and recommended) deletionPolicy, if there is a deletion of a volume or the namespace containing the volume, the cleanup of the namespaced VolumeSnapshot object will lead to the cascading delete of the VolumeSnapshotContent object and therefore the underlying storage snapshot.
Setting deletionPolicy to Retain isn't sufficient either, as some storage systems will force snapshot deletion if the associated volume is deleted (the snapshot lifecycle is not independent of the volume). Similarly, it might be possible to force-delete snapshots through the storage array's native management interface. Enabling backups together with volume snapshots is therefore required for a durable backup.
K10 creates a clone of the original VolumeSnapshotClass with the DeletionPolicy set to 'Retain'. When restoring a CSI VolumeSnapshot, an independent replica is created using this cloned class to avoid any accidental deletions of the underlying VolumeSnapshotContent.
VolumeSnapshotClass Configuration
For clusters using the v1alpha1 snapshot API:

apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshotClass
metadata:
  annotations:
    k10.kasten.io/is-snapshot-class: "true"
  name: csi-hostpath-snapclass
snapshotter: hostpath.csi.k8s.io

For clusters using the v1beta1 snapshot API:

apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  annotations:
    k10.kasten.io/is-snapshot-class: "true"
  name: csi-hostpath-snapclass
driver: hostpath.csi.k8s.io
Given the configuration requirements, the above code illustrates a correctly configured VolumeSnapshotClass for K10. If the VolumeSnapshotClass does not match the above template, please follow the instructions below to modify it. If the existing VolumeSnapshotClass cannot be modified, a new one can be created with the required annotation.
Whenever K10 detects volumes that were provisioned via a CSI driver, it will look for a VolumeSnapshotClass with the K10 annotation for the identified CSI driver and use it to create snapshots. You can easily annotate an existing VolumeSnapshotClass using:
$ kubectl annotate volumesnapshotclass ${VSC_NAME} \
    k10.kasten.io/is-snapshot-class=true
Verify that only one VolumeSnapshotClass per storage provisioner has the K10 annotation. Currently, if no VolumeSnapshotClass or more than one has the K10 annotation, snapshot operations will fail.
# List the VolumeSnapshotClasses with K10 annotation
$ kubectl get volumesnapshotclass -o json | \
    jq '.items[] | select (.metadata.annotations["k10.kasten.io/is-snapshot-class"]=="true") | .metadata.name'
k10-snapshot-class
StorageClass Configuration
As an alternative to the above method, a StorageClass can be annotated with k10.kasten.io/volume-snapshot-class: "VSC_NAME". All volumes created with this StorageClass will be snapshotted by the specified VolumeSnapshotClass:
$ kubectl annotate storageclass ${SC_NAME} \
    k10.kasten.io/volume-snapshot-class=${VSC_NAME}
Migration Requirements
If application migration across clusters is needed, ensure that the VolumeSnapshotClass names match between both clusters. As the VolumeSnapshotClass is also used for restoring volumes, an identical name is required.
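For example, assuming kubectl contexts named source-cluster and destination-cluster (hypothetical names), the class names on both clusters can be compared before a migration:
# The VolumeSnapshotClass names returned by both clusters should match
$ kubectl --context=source-cluster get volumesnapshotclass -o name
$ kubectl --context=destination-cluster get volumesnapshotclass -o name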
CSI Snapshotter Minimum Requirements
Finally, ensure that the csi-snapshotter container for all CSI drivers you might have installed has a minimum version of v1.2.2. If your CSI driver ships with an older version that has known bugs, it might be possible to transparently upgrade in place using the following code.
# For example, if you installed the GCP Persistent Disk CSI driver
# in namespace ${DRIVER_NS} with a statefulset (or deployment)
# name ${DRIVER_NAME}, you can check the snapshotter version as below:
$ kubectl get statefulset ${DRIVER_NAME} --namespace=${DRIVER_NS} \
-o jsonpath='{range .spec.template.spec.containers[*]}{.image}{"\n"}{end}'
gcr.io/gke-release/csi-provisioner:v1.0.1-gke.0
gcr.io/gke-release/csi-attacher:v1.0.1-gke.0
quay.io/k8scsi/csi-snapshotter:v1.0.1
gcr.io/dyzz-csi-staging/csi/gce-pd-driver:latest
# Snapshotter version is old (v1.0.1), update it to the required version.
$ kubectl set image statefulset/${DRIVER_NAME} csi-snapshotter=quay.io/k8scsi/csi-snapshotter:v1.2.2 \
--namespace=${DRIVER_NS}
AWS Storage
K10 supports Amazon Web Services (AWS) storage integration, including Amazon Elastic Block Storage (EBS) and Amazon Elastic File System (EFS).
Amazon Elastic Block Storage (EBS) Integration
K10 currently supports backup and restore of EBS CSI volumes as well as native (in-tree) volumes. In order to work with the in-tree provisioner, or to migrate snapshots within AWS, K10 requires an Infrastructure Profile. Please refer to AWS Infrastructure Profile on how to create one.
Amazon Elastic File System (EFS) Integration
K10 currently supports backup and restore of statically provisioned EFS CSI volumes. Since statically provisioned volumes use the entire file system, we are able to utilize AWS APIs to take backups.
While the EFS CSI driver has begun supporting dynamic provisioning, it does not create new EFS volumes. Instead, it creates and uses access points within existing EFS volumes. The current AWS APIs do not support backups of individual access points.
However, K10 can take backups of these dynamically provisioned EFS volumes using the Shareable Volume Backup and Restore mechanism.
For all other operations, EFS requires an Infrastructure Profile. Please refer to AWS Infrastructure Profile on how to create one.
AWS Infrastructure Profile
To enable K10 to take snapshots and restore volumes from AWS, an Infrastructure Profile must be created from the Settings menu.
The AWS Access Key and AWS Secret fields are required.
Using the AWS IAM Service Account Credentials that K10 was installed with is also possible with the Authenticate with AWS IAM Role checkbox. An additional AWS IAM Role can be provided if the user requires K10 to assume a different role.
The provided credentials are verified for both EBS and EFS.
Currently, K10 also supports the legacy mode of providing AWS credentials via Helm. In this case, an AWS Infrastructure Profile will be created automatically with the values provided through Helm, and can be seen on the Dashboard. This profile can later be replaced or updated manually if necessary, such as when the credentials change.
In future releases, providing AWS credentials via Helm will be deprecated.
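As a sketch of the legacy Helm-based approach (the secrets.* value names follow the K10 Helm chart and the release name and namespace are only conventions; verify both against the chart version in use):
# Legacy approach: pass AWS credentials at install time via Helm values
$ helm install k10 kasten/k10 --namespace=kasten-io \
    --set secrets.awsAccessKeyId="${AWS_ACCESS_KEY_ID}" \
    --set secrets.awsSecretAccessKey="${AWS_SECRET_ACCESS_KEY}"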
Azure Managed Disks
K10 supports backups and restores of CSI volumes as well as in-tree volumes for Azure Managed Disks. In order to work with the Azure in-tree provisioner, K10 requires an Infrastructure Profile to be created from the Settings menu.
K10 supports authentication with Azure Active Directory using Azure Client Secret credentials, as well as Azure Managed Identity.
To authenticate with Azure Client Secret credentials, K10 requires Tenant ID, Client ID, and Client Secret.
To authenticate with Azure Managed Identity, clusters must have Azure Managed
Identity enabled.
If Use Azure Managed Identities is chosen, users also have the option of using the default Managed Identity by choosing Use Default Client ID, or supplying K10 with a specific Client ID.
In addition to authentication credentials, K10 also requires Subscription ID and Resource Group. For information on how to retrieve the required data, please refer to Install K10 with Azure.
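For instance, the Subscription ID and the available resource groups can be looked up with the Azure CLI (a minimal sketch; see Install K10 with Azure for the authoritative steps):
# Show the current Subscription ID
$ az account show --query id --output tsv

# List resource groups in the subscription
$ az group list --query "[].name" --output tsv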
Additionally, information for Azure Stack, such as Storage Environment Name, Resource Manager Endpoint, AD Endpoint, and AD Resource, can also be specified. These fields are not mandatory, and the default values below will be used if they are not provided by the user.
Field | Value
---|---
Storage Environment Name | AzurePublicCloud
Resource Manager Endpoint | https://management.azure.com/
AD Endpoint | https://login.microsoftonline.com/
AD Resource | https://management.azure.com/
K10 also supports the legacy method of providing Azure credentials via Helm. In this case, an Azure Infrastructure Profile will be created automatically with the values provided through Helm, and can be seen on the Dashboard. This profile can later be replaced or updated manually if necessary, such as when the credentials change.
In future releases, providing Azure credentials via Helm will be deprecated.
Pure Storage
For integrating K10 with Pure Storage, please follow Pure Storage's instructions on deploying the Pure Storage Orchestrator and the VolumeSnapshotClass.
Once the above two steps are completed, follow the instructions for K10 CSI integration. In particular, the Pure VolumeSnapshotClass needs to be edited using the following commands.
$ kubectl annotate volumesnapshotclass pure-snapshotclass \
k10.kasten.io/is-snapshot-class=true
NetApp Trident
For integrating K10 with NetApp Trident, please follow NetApp's instructions on deploying Trident as a CSI provider and then follow the instructions above.
Google Persistent Disk
K10 supports Google Persistent Disk (GPD) storage integration with both CSI and native (in-tree) drivers. In order to use GPD native driver, an Infrastructure Profile must be created from the Settings menu.
The GCP Project ID and GCP Service Key fields are required. The GCP Service Key takes the complete content of the service account JSON file created when a new service account is made.
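For example, a service account key file whose full contents can be pasted into the GCP Service Key field can be generated with the gcloud CLI (a sketch; the service account and project names are placeholders):
# Create a JSON key for an existing service account
$ gcloud iam service-accounts keys create k10-sa-key.json \
    --iam-account="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"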
Currently, K10 also supports the legacy mode of providing Google credentials via Helm. In this case, a Google Infrastructure Profile will be created automatically with the values provided through Helm, and can be seen on the Dashboard. This profile can later be replaced or updated manually if necessary, such as when the credentials change.
In future releases, providing Google credentials via Helm will be deprecated.
Ceph
K10 supports Ceph RBD and Ceph FS snapshots and backups via their CSI drivers.
CSI Integration
Note
If you are using Rook to install Ceph, K10 only supports Rook v1.3.0 and above. Previous versions had bugs that prevented restore from snapshots.
K10 supports integration with Ceph (RBD and FS) via its CSI interface by following the instructions for CSI integration. In particular, the Ceph VolumeSnapshotClass needs to be edited using the following commands.
$ kubectl annotate volumesnapshotclass csi-snapclass \
k10.kasten.io/is-snapshot-class=true
Direct Ceph Integration (RBD only)
Note
Non-CSI support for Ceph will be deprecated in an upcoming release in favor of direct CSI integration.
Apart from integration with Ceph's CSI driver, K10 also has native support for Ceph (RBD) to protect persistent volumes provisioned using the Ceph RBD provisioner. The imageFormat: "2" and imageFeatures: "layering" parameters are required in the Ceph StorageClass definition. A correctly configured Ceph StorageClass is illustrated below.
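The following is an illustrative sketch of such a StorageClass for the in-tree RBD provisioner; the monitor address, pool, and secret names are placeholders, and the provisioner name will differ if an external RBD provisioner is used.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
provisioner: kubernetes.io/rbd
parameters:
  # Placeholder Ceph monitor address, pool, and credential secrets
  monitors: 10.0.0.1:6789
  pool: rbd
  adminId: admin
  adminSecretName: ceph-admin-secret
  adminSecretNamespace: kube-system
  userId: kube
  userSecretName: ceph-user-secret
  # Parameters required by K10's direct Ceph (RBD) integration
  imageFormat: "2"
  imageFeatures: "layering"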
An Infrastructure Profile must be created from the Settings menu.
The Monitor, Pool, User, and Keyring fields must be specified.
Cinder/OpenStack
K10 supports snapshots and backups of OpenStack's Cinder block storage.
To enable K10 to take snapshots, an OpenStack Infrastructure Profile must be created from the settings menu.
The Keystone Endpoint, Project Name, Domain Name, Username, and Password are required fields. If the OpenStack environment spans multiple regions, the Region field must also be specified.
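The Keystone endpoint can be looked up with the OpenStack CLI, for example (a sketch, assuming the CLI is already configured for the target cloud):
# List the identity (Keystone) service endpoints
$ openstack endpoint list --service identity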
vSphere
K10 supports vSphere storage integration with PersistentVolumes provisioned using the vSphere CSI provisioner. The available functionality varies by the type of cluster infrastructure used and is summarized in the table below:
| | | vSphere with Tanzu [1] | Other Kubernetes infrastructures [1] |
|---|---|---|---|
| vSphere | Supported versions | 7.0 U3 or higher | 7.0 U1 or higher |
| | vCenter access required [2] | Required | Required |
| Export | Export in filesystem mode | Not Supported [3] | Supported |
| | Export in block mode [4] | To an Object Storage Location, an NFS File Storage Location or a Veeam Repository [5] | To an Object Storage Location, an NFS File Storage Location or a Veeam Repository [5] |
| Restore | Restore from a snapshot | Not Supported [3] | Supported |
| | Restore from an export (any mode) | Supported | Supported |
| | Instant Recovery restore | Not Supported [3] | From a Veeam Repository |
| Import | Import a filesystem mode export | Supported | Supported |
| | Import a block mode export | From an Object Storage Location, an NFS File Storage Location or a Veeam Repository [5] | From an Object Storage Location, an NFS File Storage Location or a Veeam Repository [5] |
[1] vSphere with Tanzu supervisor clusters and VMware Tanzu Kubernetes Grid management clusters are not supported.
[2] Access to vCenter is required with all types of cluster infrastructures as K10 directly communicates with vSphere to snapshot a First Class Disk (FCD), resolve paravirtualized volume handles, set tags and access volume data with the VMware VDDK API.
[3] The guest clusters of vSphere with Tanzu use paravirtualized PersistentVolumes. These clusters do not support the static provisioning of a specific FCD from within the guest cluster itself. This disables K10's ability to restore applications from their local snapshots, Instant Recovery and the ability to export snapshot data in filesystem mode.
[4] Block mode snapshot exports are available in all types of vSphere cluster infrastructures. Snapshot content is accessed at the block level directly through the network using the VMware VDDK API. Enable changed block tracking on the VMware cluster nodes to reduce the amount of data transferred during export. See this K10 knowledge base article for how to do so in vSphere with Tanzu guest clusters.
[5] Block mode snapshot exports can be saved in an Object Storage Location, an NFS File Storage Location or a Veeam Repository.
A vSphere Infrastructure Profile must be created from the Settings menu to identify the vCenter server.
The vCenter Server field is required and must be a valid IP address or hostname that points to the vSphere infrastructure. The vSphere User and vSphere Password fields are also required.
The Enable Tagging check box allows K10 to tag snapshots. This aids in identifying snapshots taken by K10. If it is enabled, a vSphere tagging category will be created and displayed under Infrastructure Profiles.
Note
It is recommended that a dedicated user account be created for K10. To authorize the account, create a role with the following privileges:
- Datastore privileges:
  - Allocate space
  - Browse datastore
  - Low level file operations
- Global privileges:
  - Disable methods
  - Enable methods
  - Licenses
- Virtual Machine Snapshot Management privileges:
  - Create snapshot
  - Remove snapshot
  - Revert to snapshot
vSphere with Tanzu clusters require the following additional privilege to resolve paravirtualized volume handles:
- Searchable
To enable tagging of FCD volumes the following privileges must be set:
- Create a category
- Create a tag
- Delete a tag
Assign this role to the dedicated K10 user account on the following objects:
The root vCenter object
The datacenter objects (propagate down each subtree to reach datastore and virtual machine objects)
There is an upper limit on the maximum number of snapshots for a VMware Kubernetes PersistentVolume. Refer to this or more recent VMware knowledge base articles for the limit and for recommendations on the number of snapshots to maintain. A K10 backup policy provides control over the number of local K10 restore points retained, and by implication, the number of local snapshots retained. A K10 backup and export policy allows separate retention policies for local and exported K10 restore points.
The K10 default timeout for vSphere snapshot related operations may be too short if you are dealing with very large volumes. If you encounter timeout errors then adjust the vmWare.taskTimeoutMin Helm option accordingly.
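For example, assuming K10 was installed as the Helm release k10 in the kasten-io namespace (adjust both to your installation), the timeout could be raised with a Helm upgrade; the value of 120 minutes is only illustrative:
# Increase the vSphere task timeout (in minutes) on an existing installation
$ helm upgrade k10 kasten/k10 --namespace=kasten-io \
    --reuse-values \
    --set vmWare.taskTimeoutMin=120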
Note
You may observe that an application's PersistentVolumes do not get deleted even if their Reclaim Policy is Delete. This can happen when using K10 to restore an application in the same namespace or when deleting or uninstalling an application previously backed up by K10.
This is because the VMware CSI driver fails in the deletion of PersistentVolumes containing snapshots: a VMware snapshot is embedded in its associated FCD volume and does not exist independent of this volume, and it is not possible to delete an FCD volume if it has snapshots. The VMware CSI driver leaves such PersistentVolumes in the Released state with a "failed to delete volume" warning (visible with kubectl describe). You may also see errors flagged for this operation in the vCenter GUI. The driver re-attempts the deletion operation periodically, so when all snapshots get deleted the PersistentVolume will eventually be deleted. One can also attempt to manually delete the PersistentVolume again at this time.
When K10 restores an application in the same namespace from some restore point, new Kubernetes PersistentVolume objects (with new FCD volumes) are created for the application. However, any restore point that involves local snapshots will now point into FCD volumes associated with PersistentVolume objects in the Released state! Deletion of these K10 restore points (manually or by schedule) will delete the associated FCD snapshots after which the PersistentVolume objects and their associated FCD volumes will eventually be released.
When uninstalling or deleting an application, do not force delete Kubernetes PersistentVolume objects in the Released state as this would orphan the associated FCD volumes! Instead, use the vCenter GUI or a CLI tool like govc to manually delete the snapshots.
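As a sketch of the govc route (the datastore name, FCD identifier and snapshot identifier are placeholders; verify the subcommands against your govc version):
# List the snapshots of a First Class Disk, then delete one by ID
$ govc disk.snapshot.ls -ds ${DATASTORE} ${FCD_ID}
$ govc disk.snapshot.rm -ds ${DATASTORE} ${FCD_ID} ${SNAPSHOT_ID}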
Portworx
Apart from CSI-level support, K10 also directly integrates with the Portworx storage platform.
To enable K10 to take snapshots and restore volumes from Portworx, an Infrastructure Profile must be created from the settings menu.
The Namespace and Service Name fields are used to determine the Portworx endpoint. If these fields are left blank, the Portworx defaults of kube-system and portworx-service will be used respectively.
In an authorization-enabled Portworx setup, the Issuer and Secret fields must be set. The Issuer must represent the JWT issuer. The Secret is the JWT shared secret, which is represented by the Portworx environment variable PORTWORX_AUTH_JWT_SHAREDSECRET. Refer to Portworx Security for more information.
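For example, to confirm how the shared secret is configured on the cluster (a sketch, assuming Portworx runs as the portworx DaemonSet in the kube-system namespace):
# Inspect the PORTWORX_AUTH_JWT_SHAREDSECRET environment variable on the Portworx DaemonSet
$ kubectl get daemonset portworx --namespace=kube-system \
    -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="PORTWORX_AUTH_JWT_SHAREDSECRET")]}'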
Veeam Backup
A Veeam Repository can be used as the destination for exported snapshot data from persistent volumes provisioned by the vSphere CSI provider in supported vSphere clusters. See the Integration with Veeam Backup Repositories for Kasten K10 Guide for additional details, including the Veeam user account permissions needed, network ports used and licensing information.
A Veeam Repository Location Profile must be created to identify the desired repository on a particular Veeam Backup server (immutable repositories are also supported; refer to the setup instructions for more details).
A Veeam Repository can only store the image-based volume data from the backup, so a policy which uses a Veeam Repository location profile will always be used in conjunction with another location profile that stores the remaining data in a K10 restore point.
Note
A Veeam Repository Location Profile cannot be used as a destination for Kanister actions in a Backup policy.
A Veeam Backup Policy will be created in the Veeam Backup server for each distinct K10 protected application and K10 backup and export policy pair encountered when the K10 backup and export policy is executed. The K10 catalog identifier is added to the name to ensure uniqueness across multiple clusters that back up to the same Veeam Backup server.
Data from a manual (i.e. not associated with a K10 backup and export policy) export of an application's volumes is associated with a fixed policy called Kasten K10 Manual Backup and is saved as a VeeamZIP backup.
K10 will delete any Veeam restore point associated with a K10 restore point being retired.
Import and restoration of K10 restore points that contain snapshot data exported to a Veeam Repository is possible in supported vSphere clusters using volumes provisioned by the vSphere CSI driver. As K10 restore points are not saved in the Veeam Repository the import action is actually performed on the location profile that contains the K10 restore point being imported. A Veeam Repository Location Profile K10 object with the same name as that used on the exporting system must be present in the importing system and will be referenced during the restore action.
Snapshot data is accessed in block mode directly through the VMware VDDK API. If changed block tracking is enabled on the VMware cluster nodes, K10 will send incremental changes to the Veeam Backup server where possible; if incremental upload is not possible, a full backup will be done on each export. Regardless, K10 will convert Veeam restore points into a synthetic full to satisfy K10 retirement functionality.
Instant Recovery
Instant Recovery will get an exported restore point up and running much faster than a regular restore. This feature requires vSphere 7.0.3+ and a Veeam Backup server version V12 or higher. This is not supported on vSphere with Tanzu clusters at this time. Before using Instant Recovery, you should ensure that all Storage Classes in your Kubernetes clusters are configured to avoid placing new volumes in the Instant Recovery datastore. Please see this Knowledge Base article for recommendations on Storage Classes for use with Instant Recovery.
When a K10 Instant Recovery is triggered, rather than creating volumes and populating them with data from VBR, K10 asks the Veeam Backup server to do an Instant Recovery of the FCDs (vSphere First Class Disks) that are needed and then creates PVs that use those FCDs. The FCDs exist in a vPower NFS datastore created by the Veeam Backup server and attached to the vSphere cluster hosting the Kubernetes cluster.
Once the Instant Recovery has completed, the application will be running using the Veeam Backup server storage. At that point, the virtual disks can be migrated into their permanent home with no interruption in service. The application will not see any differences in how it is using the storage and all of the pods using the disks will continue operating without any restarts. Alternatively, the restored application can be removed and VBR instructed to "stop publishing" the Instant Recovery session.
Currently Instant Recovery is only supported for Restore Actions, not Restore Policies. To use Instant Recovery, select the Enable Instant Recovery checkbox (this will only appear if all compatibility criteria are met) or set the InstantRecovery property in the RestoreAction spec.
All restore features are supported with Instant Recovery.
When the Instant Recovery has completed, your restored application will be ready to use. You may use it while the disks are stored in the Veeam Backup server datastore; however, performance and reliability may be lower than with your primary datastore. For production workloads, we recommend migrating the Instant Recovery volumes to your primary datastore. To migrate a workload, use the Veeam Backup server UI, select the FCD Instant Recovery session that corresponds to your K10 restore, select "Migrate to production", then follow the prompts. Detailed information is available in the Finalizing Instant FCD Recovery section of the Veeam Backup & Replication documentation.
For workloads which you have not migrated, the Instant Recovery session in the Veeam Backup server must remain active while the workload is in use. When you are finished using the workload, remove it and the PVs that were restored. Once that has completed, you can go to the Veeam Backup server UI, select the Instant Recovery session and use the Stop Publishing option. Please see Finalizing Instant FCD Recovery section of the Veeam Backup & Replication documentation for detailed instructions.