Cloud Disaster Recovery Guide #4470
Tracking comments in ClickHouse Cloud Disaster Recovery - Public Docs. Adding more directed comments in the review.
@aashishkohli updated PR for comments. PTAL.
title: 'Disaster recovery'
description: 'This guide provides an overview of disaster recovery.'
doc_type: 'guide'
---
Reading this document in the broader scope of backups in our docs:
- Features > Backups > Overview: move to Guides > Backups > Review and Restore Backups
- Features > Backups > Configurable Backups: move to Guides > Backups > Configure Backup Schedules
- Features > Backups > Export Backups to your Own Cloud Account: move to Guides > Export Backups
This document would then move to Reference > Data Resiliency. Comments below for x-references.
Add a new document under Features > Backups. The new document should cover:
- Review and Restore Backups
- Configure Backup Schedules
- Export Backups
These should have brief descriptions and links to the Guide pages above.
It is helpful to cover some definitions first.
**RPO (Recovery Point Objective)**: The maximum acceptable data loss, measured in time, following a disruptive event. Example: an RPO of 30 minutes means that after a failure the database should be restorable to data no older than 30 minutes. The achievable RPO depends on how frequently backups are taken.
TIP:
Customers should perform periodic backup restore testing to understand the specific RTO for their service size and configuration.
**Default backups**: By default, ClickHouse Cloud takes a backup of your service every 24 hours. These backups are stored in the same region as the service, in ClickHouse's cloud service provider (CSP) storage bucket. If the data in the primary service becomes corrupted, the backup can be used to restore to a new service.
**External backups (in customer's own storage bucket)**: Enterprise Tier customers can export backups to object storage in their own account, either in the same region or in another region. Cross-cloud backup export support is coming soon. Data transfer charges apply to cross-region and cross-cloud backups.
INFO:
This feature is not currently available in PCI/HIPAA or encrypted services.
**Configurable backups**: Customers can configure backups to run more frequently, up to every 6 hours, to improve the RPO. Customers can also configure a longer retention period.
INFO:
The size of the database plays a significant role in how quickly a backup completes and when the next backup starts if backups are scheduled close to each other. Customers should monitor the first few backups and test recovery to verify the correct RTO/RPO for the service.
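
To make the relationship between backup frequency, backup duration, and RPO concrete, here is a minimal Python sketch. It assumes a simplified model in which each backup captures data as of its start time and only a completed backup can be restored. The function names and the model are illustrative assumptions, not ClickHouse Cloud behavior or API; actual figures should come from monitoring real backups and restore tests.

```python
def worst_case_data_loss_hours(interval_hours: float,
                               backup_duration_hours: float = 0.0) -> float:
    """Estimate the worst-case data loss window under the assumed model.

    Assumption: a backup reflects data as of its start time and becomes
    usable only once it completes. A failure just before the next backup
    completes therefore loses up to one interval plus one backup duration.
    """
    if interval_hours <= 0:
        raise ValueError("interval_hours must be positive")
    if backup_duration_hours < 0:
        raise ValueError("backup_duration_hours must not be negative")
    return interval_hours + backup_duration_hours


def meets_rpo(interval_hours: float,
              rpo_hours: float,
              backup_duration_hours: float = 0.0) -> bool:
    """Return True if the estimated worst-case loss fits within the RPO."""
    return worst_case_data_loss_hours(interval_hours,
                                      backup_duration_hours) <= rpo_hours


if __name__ == "__main__":
    # Default policy from the guide: one backup every 24 hours.
    print(worst_case_data_loss_hours(24))
    # Configurable policy: every 6 hours, with a hypothetical 1.5-hour backup.
    print(worst_case_data_loss_hours(6, backup_duration_hours=1.5))
    # Check both policies against a hypothetical 8-hour RPO target.
    print(meets_rpo(24, rpo_hours=8))
    print(meets_rpo(6, rpo_hours=8, backup_duration_hours=1.5))
```

The backup duration term is why the INFO note recommends monitoring the first few backups: on a large database, a long-running backup widens the effective data loss window beyond the configured interval.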
### Primary service data corruption {#primary-service-data-corruption}
In this case, the data can be restored from the backup to another service in the same region. The backup could be up to 24 hours old with the default backup policy, or up to 6 hours old with configurable backups set to a 6-hour frequency.
restored from the backup
Link this to https://clickhouse.com/docs/cloud/manage/backups/overview#restore-a-backup
### Primary region downtime {#primary-region-downtime}
Customers in the Enterprise Tier can export backups to their own cloud provider bucket. If you are concerned about regional failures, we recommend exporting backups to a different region. Keep in mind that cross-region data transfer charges will apply.
export backups
Link this to https://clickhouse.com/docs/cloud/manage/backups/export-backups-to-own-cloud-account
Summary
Adding ClickHouse Cloud Disaster Recovery Guide