Protecting Applications

Protecting an application with K10, usually accomplished by creating a policy, requires the understanding and use of three concepts:

  • Snapshots and Backups: Depending on your environment and requirements, you might need just one or both of these data capture mechanisms

  • Scheduling: Specification of application capture frequency and snapshot/backup retention objectives

  • Selection: This defines which applications are protected by a policy and, when finer-grained control is needed, allows resource filtering to restrict what is captured on a per-application basis

This section demonstrates how to use these concepts in the context of a K10 policy to protect applications. While you can always create a policy from scratch on the policies page, the easiest way to define policies for unprotected applications is to click on the Applications card on the main dashboard. This will take you to a page where you can see all applications in your Kubernetes cluster.

../_images/overview_apps.png

To protect any unmanaged application, simply click Create a policy and, as shown below, that will take you to the policy creation section with an auto-populated policy name that you can change. The concepts highlighted above are described in the sections below in the context of the policy creation workflow.

../_images/policies_create.png

Snapshots and Backups

All policies center around the execution of actions and, for protecting applications, you start by selecting the snapshot action, with an optional backup (currently called export) added to that action.

Snapshots

../_images/policies_snapshot.png

Note

A number of public cloud providers (e.g., AWS, Azure, Google Cloud) actually store snapshots in object storage, where they are retained independently of the lifecycle of the primary volume. However, this is not true of all public clouds (e.g., IBM Cloud), and you might also need to enable backups in public clouds for safety. Please check your cloud provider's documentation for more information.

Snapshots are the basis of persistent data capture in K10. They are usually used in the context of disk volumes (PVC/PVs) used by the application but can also apply to application-level data capture (e.g., with Kanister).

In most storage systems, snapshots are very efficient: they have a low performance impact on the primary workload, require no downtime, support fast restore times, and capture data incrementally.

However, storage snapshots usually also suffer from constraints such as having relatively low limits on the maximum number of snapshots per volume or per storage array. Most importantly, snapshots are not always durable. First, catastrophic storage system failure will destroy your snapshots along with your primary data. Further, in a number of storage systems, a snapshot's lifecycle is tied to the source volume. So, if the volume is deleted, all related snapshots might automatically be garbage collected at the same time. It is therefore highly recommended that you create backups of your application snapshots too.

Backups

../_images/policies_export_backup.png

Note

In most cases, when application-level capture mechanisms (e.g., logical database dumps via Kanister) are used, these artifacts are directly sent to an object store. Backups should not be needed in those scenarios unless a mix of application and volume-level data is being captured or if you have a more specific use case.

Given the limitations of snapshots, it is often advisable to set up backups of your application stack. However, even if your snapshots are durable, backups might still be useful in a variety of use cases including lowering costs with K10's data deduplication or backing your snapshots up in a different infrastructure provider for cross-cloud resiliency.

To convert your snapshots into backups, as sketched in the policy fragment following this list, you need to:

  • Select Enable Backups via Snapshot Exports during policy creation

  • Ensure that the export profile selected was created with the data portability option enabled (verify during policy creation that portable restore points will be created)
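
For reference, the same choice can be expressed directly in the Policy custom resource. The fragment below is only a sketch: it assumes an existing location profile named my-export-profile in the kasten-io namespace, and the exact field names should be verified against the Policy CR documentation for your K10 version.

    spec:
      actions:
        # Snapshot the application (the primary capture action).
        - action: backup
        # Export those snapshots into backups using a location profile.
        - action: export
          exportParameters:
            profile:
              name: my-export-profile   # hypothetical profile name
              namespace: kasten-io
            exportData:
              enabled: true             # export volume data for portable restore points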

Scheduling

There are three components to scheduling:

  • How frequently the primary snapshot action should be performed

  • How often snapshots should be exported into backups

  • Retention schedule of snapshots and backups

Action Frequency

../_images/policies_action_frequency.png

Actions can be set to execute at an hourly, daily, weekly, monthly, or yearly granularity. Actions set to hourly will execute at the top of the hour while other actions will execute at midnight UTC.

It is also possible to select sub-hourly frequencies as a sub-option of hourly frequencies. This is useful when you are protecting mostly Kubernetes objects or small data sets. Care should be taken with more general-purpose workloads because of the risk of stressing the underlying storage infrastructure or running into storage API rate limits. Further, sub-hourly frequencies also interact with retention (described below). For example, retaining 24 hourly snapshots at 15-minute intervals would only retain 6 hours of snapshots.
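
In the Policy custom resource, the action frequency is a single field. A minimal sketch, assuming the '@hourly'-style frequency values and the subFrequency field used by recent K10 versions (verify both against your version's Policy CR reference):

    spec:
      frequency: '@hourly'   # other values: '@daily', '@weekly', '@monthly', '@yearly'
      # Sub-hourly example (assumed field names): run every 15 minutes.
      # Combined with "retention.hourly: 24", this keeps only 6 hours of snapshots.
      # subFrequency:
      #   minutes: [0, 15, 30, 45]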

Snapshot Exports to Backups

../_images/policies_backup_selection.png

Backups performed via exports, by default, will be set up to export every snapshot into a backup. However, it is also possible to select a subset of snapshots for exports (e.g., only convert every daily snapshot into a backup).
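
In the Policy CR this corresponds to a frequency on the export action itself, distinct from the policy's snapshot frequency. A hedged fragment (field names assumed, not authoritative):

    spec:
      frequency: '@hourly'           # take hourly snapshots...
      actions:
        - action: backup
        - action: export
          exportParameters:
            frequency: '@daily'      # ...but only export the daily snapshot into a backup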

Retention Schedules

../_images/policies_snapshot_retention.png

A powerful scheduling feature in K10 is the ability to use a GFS (Grandfather-Father-Son) retention scheme for cost savings and compliance reasons. With this backup rotation scheme, hourly snapshots and backups are rotated on an hourly basis, with one graduating to the daily tier every day, and so on. It is possible to set the number of hourly, daily, weekly, monthly, and yearly copies that need to be retained, and K10 will take care of both cleanup at every retention tier and graduation to the next one.

../_images/policies_backup_retention.png

By default, backup retention schedules are the same as snapshot retention schedules, but they can be set to independent schedules if needed. This allows users to create policies where a limited number of snapshots are retained for fast recovery from accidental outages while a larger number of backups are stored for long-term recovery needs. A separate retention schedule is also valuable when the volume supports only a limited number of snapshots but a larger backup retention count is needed for compliance reasons.
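
Put together, a hedged sketch of independent snapshot and backup retention in the Policy CR might look like the following (tier names and the per-action retention override are assumptions to verify against your K10 version):

    spec:
      retention:              # snapshot (restore point) retention, GFS tiers
        hourly: 24
        daily: 7
        weekly: 4
        monthly: 12
        yearly: 7
      actions:
        - action: backup
        - action: export
          exportParameters:
            frequency: '@daily'
          retention:          # independent, longer retention for exported backups
            daily: 30
            monthly: 24
            yearly: 7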

Application Selection and Resource Filtering

This section describes both how policies can be bound to applications and, once bound, how specific application resources can either be included or excluded from capture.

Application Selection

You can select applications by two specific methods:

  • Application Names

  • Labels

Selecting By Application Name

../_images/policies_select_name.png

The most straightforward way to apply a policy to an application is to use its name (which is derived from the namespace name). Note that you can select multiple application names in the same policy.
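
Under the hood, name-based selection is a selector on the application's namespace. A sketch using the k10.kasten.io/appNamespace selector key (an assumption to verify against the Policy CR reference for your K10 version):

    spec:
      selector:
        matchExpressions:
          - key: k10.kasten.io/appNamespace
            operator: In
            values:
              - mysql        # hypothetical application (namespace) names
              - wordpress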

Selecting By Labels

../_images/policies_select_label.png

Note

K10 will protect the entire application if any component's labels match the policy's label selection. If you need to only protect certain application components, resource filtering, described below, can be used.

For policies that need to span multiple applications (e.g., protect all applications that use MongoDB or applications that have been annotated with the gold label), you can also select applications by label. Any application (namespace) that has a matching label as defined in the policy will be selected. If multiple labels are selected, a union (logical OR) will be performed when deciding what applications the policy should be applied to. That is, applications with at least one matching label will be selected.

Note that label-based selection can be used to create forward-looking policies as the policy will automatically apply to any future application that has the matching label. For example, using the heritage: Tiller (Helm v2) or heritage: Helm (Helm v3) selector will apply the policy you are creating to any new Helm-deployed applications as the Helm package manager automatically adds that label to any Kubernetes workload it creates.
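
A label-based selector in the Policy CR uses standard Kubernetes label selector syntax. A minimal sketch, assuming the selector field layout shown, that targets all Helm v3 deployed applications:

    spec:
      selector:
        matchLabels:
          heritage: Helm    # automatically added by Helm v3 to the workloads it creates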

Resource Filtering

../_images/policies_resource_filters.png

Warning

Filters should be used with care. It is easy to accidentally define a policy that might leave out essential components of your application.

Resource filtering is supported for both backup policies and restore actions. The recommended best practice is to create backup policies that capture all resources to future-proof restores and to use filters to limit what is restored.

In K10, filters describe which Kubernetes resources should be included or excluded in the backup. If no filters are specified, all the API resources in a namespace are captured by the BackupActions created by this Policy.

Resource types are identified by group, version, and resource type names, or GVR (e.g., networking.k8s.io/v1/networkpolicies). Core Kubernetes types do not have a group name and are identified by just a version and resource type name (e.g., v1/configmaps). Individual resources are identified by their resource type and resource name, or GVRN. In a filter, an empty or omitted group, version, resource type or resource name matches any value. For example, if you set Group: apps and Resource: statefulsets, it will capture all StatefulSets no matter the API Version (e.g., v1 or v1beta1).

Filters reduce the resources in the backup by first selectively including and then excluding resources:

  • If no include or exclude filters are specified, all the API resources belonging to an application are included in the set of resources to be backed up

  • If only include filters are specified, resources matching any GVRN entry in the include filter are included in the set of resources to be backed up

  • If only exclude filters are specified, resources matching any GVRN entry in the exclude filter are excluded from the set of resources to be backed up

  • If both include and exclude filters are specified, the include filters are applied first and then exclude filters will be applied only on the GVRN resources selected by the include filter

For a full list of API resources in your cluster, run kubectl api-resources.
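
A hedged example of include and exclude filters in the backup action of a Policy CR (field names such as includeResources and excludeResources are assumptions to check against your K10 version's filter reference):

    actions:
      - action: backup
        backupParameters:
          filters:
            includeResources:
              # Omitted fields match any value: all StatefulSets, regardless of API version.
              - group: apps
                resource: statefulsets
              # Core types have no group, e.g. ConfigMaps.
              - version: v1
                resource: configmaps
            excludeResources:
              # Exclude one specific ConfigMap (GVRN) from the included set.
              - version: v1
                resource: configmaps
                name: debug-config    # hypothetical resource name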

Safe Backup

For safety, K10 automatically includes namespaced and non-namespaced resources such as associated volumes (PVCs and PVs) and StorageClasses when a StatefulSet, Deployment, or DeploymentConfig is included by filters. If you do not want to capture these auto-included resources, they can be removed by specifying an exclude filter.

Similarly, given the strict dependency between the objects, K10 protects Custom Resource Definitions (CRDs) if a Custom Resource (CR) is included in a backup. However, it is not possible today to exclude a CRD via exclude filters; any such exclusion will be silently ignored.

Finally, note that it is not possible today to include non-namespaced objects (e.g., CRDs or StorageClasses) if they are not referred to by an included resource.

Working With Policies

Viewing Policy Activity

Once you have created a policy and have navigated back to the main dashboard, you will see the selected applications quickly switch from unmanaged to non-compliant (i.e., a policy covers the objects but no action has been taken yet). Soon after, they will switch to compliant as snapshots and backups are invoked and the applications enter a protected state. You can also scroll down on the page to see the activity, how long each snapshot took, and the generated artifacts. Your page will now look similar to this:

../_images/dashboard_compliant.png

More detailed job information can be obtained by clicking on the in-progress or completed jobs.

Manual Policy Runs

../_images/policies_manual_run.png

It is possible to manually create a policy run by going to the policy page and clicking the Run button on the desired policy. Note that any artifacts created by this action will not be eligible for automatic retirement and will need to be manually cleaned up.

Editing Policies

It is also possible to edit created policies by clicking the edit button on the policies page.

../_images/policies_edit.png

Policies are Kubernetes Custom Resources (CRs) and can also be edited directly by modifying the CR's YAML through the dashboard or the command line.
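
For example, from the command line (assuming the policy lives in the kasten-io namespace and the config.kio.kasten.io API group; adjust for your installation):

    # List K10 policies, then edit one in place.
    kubectl get policies.config.kio.kasten.io -n kasten-io
    kubectl edit policies.config.kio.kasten.io <policy-name> -n kasten-io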

Changes made to the policy (e.g., new labels added or resource filtering applied) will take effect during the next scheduled policy run.

Careful attention should be paid to changing a policy's retention schedule as that action will automatically retire and delete restore points that no longer fall under the new retention scheme.

Deleting A Policy

You can easily delete a policy from the policies page or using the API. However, in the interests of safety, deleting a policy will not delete all the restore points that were generated by it. They will need to be manually deleted from the Application restore point view or via the API.

Policy Exceptions

Even if a namespace is covered by a policy, it is possible to have the namespace ignored by that policy. Add the k10.kasten.io/ignorebackuppolicy annotation to the namespace(s) you want to be ignored; annotated namespaces will be skipped during scheduled backup operations.
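
For example (the annotation value shown is an assumption; confirm the expected value in the K10 documentation):

    # Skip this namespace during scheduled policy runs.
    kubectl annotate namespace <namespace> k10.kasten.io/ignorebackuppolicy=true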