Support Point-in-Time Recovery (PITR)
Description
PITR allows users to recover the database to any point in time.
In principle, an initial full backup archive plus the incremental changes made after it are enough to restore a snapshot at any given time.
The PITR feature should balance the cluster's performance while the continuous backup is running against the restore speed when doing PITR.
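To make the principle concrete, here is a minimal sketch of how a restore could be planned from one full backup and a change log. The `FullBackup` and `LogEntry` types and the `plan_restore` function are hypothetical names used only for illustration; they are not part of BR or TiCDC.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FullBackup:
    name: str
    backup_ts: int  # timestamp of the snapshot this backup captures (hypothetical field)


@dataclass(frozen=True)
class LogEntry:
    commit_ts: int  # commit timestamp of the change (hypothetical field)
    change: str     # opaque description of the change


def plan_restore(
    backups: List[FullBackup], logs: List[LogEntry], target_ts: int
) -> Tuple[FullBackup, List[LogEntry]]:
    """Pick the newest full backup at or before target_ts, plus the log
    entries with backup_ts < commit_ts <= target_ts, in commit order."""
    candidates = [b for b in backups if b.backup_ts <= target_ts]
    if not candidates:
        raise ValueError("no full backup at or before the target timestamp")
    base = max(candidates, key=lambda b: b.backup_ts)
    replay = sorted(
        (e for e in logs if base.backup_ts < e.commit_ts <= target_ts),
        key=lambda e: e.commit_ts,
    )
    return base, replay
```

In this sketch, the interval between full backups is the knob for the trade-off mentioned above: fewer full backups cost the cluster less, but leave more log entries to replay during restore.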
Task List
Sprint 1 - 100 person-days
- [x] P0 https://github.com/pingcap/ticdc/issues/768 TiCDC supports external storage (including snapshot-level consistent log backup).
- [x] P0 https://github.com/pingcap/br/issues/440 BR supports restoring data from TiCDC log backup.
- [ ] P0 https://github.com/pingcap/br/issues/441 BR supports generating undo SQL from TiCDC log backup.
- [ ] P0 https://github.com/pingcap/br/issues/511 Add a checksum mechanism for CDC log restore.
- [ ] P1 https://github.com/pingcap/br/issues/445 PITR building block features in SQL
- [ ] P1 https://github.com/pingcap/br/issues/442 BR supports generating differential backups with log backup.
Sprint 2 - 100 person-days
- [ ] https://github.com/pingcap/tidb-operator/issues/3017 TiDB Operator supports PITR with TiCDC and BR
- [ ] Implement local external storage, for users who do not have cloud storage?
- [ ] PITR backup management
- [ ] Automatic backup retention strategy
- [ ] Automatically generate full/differential backup data
- [ ] Automatic backup data verification
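As one possible shape for the retention item above, the sketch below assumes a policy of "keep everything needed to restore to any point inside a retention window". The policy itself and the `select_expired` function are assumptions made for illustration, not a decided design.

```python
from typing import List, Optional, Tuple


def select_expired(
    full_backup_ts: List[int], window_start_ts: int
) -> Tuple[List[int], Optional[int]]:
    """Return (expired_backups, log_cutoff_ts) under the assumed policy.

    The newest full backup at or before window_start_ts is kept as the
    anchor, because restoring to window_start_ts still needs it. Older
    full backups are expired, and log entries with commit_ts <= the
    anchor's timestamp are no longer needed. If no full backup is at or
    before the window start, nothing can be expired yet.
    """
    older = sorted(ts for ts in full_backup_ts if ts <= window_start_ts)
    if not older:
        return [], None
    anchor = older[-1]
    return older[:-1], anchor
```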
Category
- Feature
- Reliability
Value
Value description
(TBD)
Value score
- (TBD) / 5
Workload estimation
- (TBD) person-day
Time
GanttStart: 2020-07-01
GanttDue: 2020-10-22
GanttProgress: 80%
There are currently two solutions we may choose from:
- Solution 3: Perform a BR full backup regularly at some sparse interval, and use CDC to capture the incremental changes between two BR full backups.
- Solution 5: Use RocksDB checkpoints.
We will decide which to use after investigating the viability of Solution 5.
After discussion, we have confirmed the initial requirements for PITR:
- Support recovery to the latest available point in time (RPO close to 0).
- Implement Point-in-Time Recovery on a new cluster.
- Archive incremental logs and manage backup meta information.
- Roll back mistaken DML operations.
- Support recovery at the database level and the table level.
We also confirmed the initial form of PITR:
- Based on the current tools (BR/CDC), provide backup data recovery and change log recording capabilities; the platform (DBaaS) and users implement the integration solution.
PITR Product Ideas
Building blocks features
- BR provides full and differential data backup/recovery (based on external storage such as S3, GCS, or HTTP; see https://github.com/pingcap/br/issues/308)
- TiCDC provides log backup/recovery/rollback (based on external storage such as S3 or GCS)
- Need to investigate how to implement an S3 sink in TiCDC, since S3 does not support long-lived connections (for log streaming)
- Do we need a new tool for TiCDC log recovery/logical rollback, or do we implement these features in BR/TiCDC/TiDB?
- Recover specific full/differential data and logs to a specified TiDB cluster
Management feature
Provide the entry point for PITR management:
- Recover a new cluster to a point in time
- Roll back some DML operations (one specified transaction, or to a point in time)?
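A rough sketch of how rolling back DML from a change log could work: each row change is inverted, and the inverted statements are applied in reverse commit order. The event dictionary layout (`type`, `table`, `old`, `new`, `commit_ts`) is a hypothetical format chosen for illustration and is not the TiCDC log format; the literal quoting is deliberately simplistic.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (illustrative, not exhaustive)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def match(row):
    """Build a WHERE condition that matches the given row image."""
    parts = []
    for col, val in row.items():
        if val is None:
            parts.append(f"`{col}` IS NULL")
        else:
            parts.append(f"`{col}` = {sql_literal(val)}")
    return " AND ".join(parts)


def undo_sql(event):
    """Return the statement that reverses one row change event."""
    table, kind = event["table"], event["type"]
    old, new = event.get("old"), event.get("new")
    if kind == "insert":  # undo an insert by deleting the new row
        return f"DELETE FROM `{table}` WHERE {match(new)};"
    if kind == "delete":  # undo a delete by re-inserting the old row
        cols = ", ".join(f"`{c}`" for c in old)
        vals = ", ".join(sql_literal(v) for v in old.values())
        return f"INSERT INTO `{table}` ({cols}) VALUES ({vals});"
    if kind == "update":  # undo an update by restoring the old values
        assigns = ", ".join(f"`{c}` = {sql_literal(v)}" for c, v in old.items())
        return f"UPDATE `{table}` SET {assigns} WHERE {match(new)};"
    raise ValueError(f"unknown event type: {kind}")


def undo_script(events):
    """Undo statements must run newest change first."""
    ordered = sorted(events, key=lambda e: e["commit_ts"], reverse=True)
    return [undo_sql(e) for e in ordered]
```

Rolling back to a point in time would then mean passing only the events committed after that point; rolling back one transaction would mean passing only that transaction's events.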
Provide backup data management for the fundamental data, including:
- Manage backup data/logs
- Find the proper backup data/logs to recover from
Hi guys, I am really interested in this feature, are there any updates on this?