原 PG 17新特性之使用pg_basebackup做增量备份恢复实战
Tags: 原创PGPostgreSQL备份恢复增量备份PG 17pg_basebackup
简介
PostgreSQL 17 引入了一个重要的新功能:支持使用 pg_basebackup 执行增量备份。这一改进极大地简化了备份和恢复的操作流程,并提高了备份效率,尤其适用于备份数据量较大的场景。
参数变化
PG 17提供了两个参数来支持增量备份功能而无需借助第三方备份工具(pg_probackup、pg_rman、pgbackrest等)
第一个参数是summarize_wal,控制是否开启walsummarizer进程来记录WAL摘要信息。
默认情况下,此进程不会启用,但需要启用后增量备份才能正常工作。设置为on之后,从操作系统能看到walsummarizer进程,使用如下SQL语句也能观测到该后台进程。
该进程会在pg_wal/summaries目录下创建 WAL 文件的摘要信息,内容是非文本形式。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 | ALTER SYSTEM SET summarize_wal = 'on'; SELECT pg_reload_conf(); SELECT * FROM pg_stat_activity WHERE backend_type = 'walsummarizer'; [pg17@lhrpgalloe bk]$ ps -ef|grep walsummarizer pg17 667311 3450 0 15:07 ? 00:00:00 postgres: walsummarizer pg17 670529 667844 0 15:55 pts/7 00:00:00 grep --color=auto walsummarizer [pg17@lhrpgalloe bk]$ [pg17@lhrpgalloe pg_wal]$ cd summaries/ [pg17@lhrpgalloe summaries]$ ll total 32 -rw------- 1 pg17 pg17 3376 Dec 4 15:07 00000001000000088D0E4D80000000088D24FEC0.summary -rw------- 1 pg17 pg17 986 Dec 4 15:07 00000001000000088D24FEC0000000088D2A4B30.summary -rw------- 1 pg17 pg17 7300 Dec 4 15:07 00000001000000088D2A4B30000000088EDBC4F8.summary -rw------- 1 pg17 pg17 32 Dec 4 15:29 00000001000000088EDBC4F8000000088F000028.summary -rw------- 1 pg17 pg17 32 Dec 4 15:34 00000001000000088F0000280000000891000028.summary -rw------- 1 pg17 pg17 32 Dec 4 15:39 0000000100000008910000280000000892000060.summary -rw------- 1 pg17 pg17 32 Dec 4 15:52 0000000100000008920000600000000893000028.summary [pg17@lhrpgalloe summaries]$ [pg17@lhrpgalloe summaries]$ pwd /pg17/pgdata/pg_wal/summaries [pg17@lhrpgalloe summaries]$ more 00000001000000088D2A4B30000000088EDBC4F8.summary {+eM@ postgres=# \x Expanded display is on. postgres=# SELECT * FROM pg_stat_activity WHERE backend_type = 'walsummarizer'; -[ RECORD 1 ]----+------------------------------ datid | datname | pid | 667311 leader_pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | 2024-12-04 15:07:04.998946+08 xact_start | query_start | state_change | wait_event_type | Activity wait_event | WalSummarizerWal state | backend_xid | backend_xmin | query_id | query | backend_type | walsummarizer postgres=# select pg_available_wal_summaries(); pg_available_wal_summaries ---------------------------- (1,8/8D0E4D80,8/8D24FEC0) (1,8/8D24FEC0,8/8D2A4B30) (1,8/8D2A4B30,8/8EDBC4F8) (1,8/8EDBC4F8,8/8F000028) (1,8/8F000028,8/91000028) (1,8/91000028,8/92000060) (1,8/92000060,8/93000028) (7 rows) postgres=# |
walsummarizer进程会将WAL的摘要信息写入PGDATA/pg_wal/summaries目录下。
第二个参数wal_summary_keep_time设置为非0的值时,根据时间周期自动清理WAL摘要信息文件,默认策略是自动删除10天前的文件。
1 2 3 4 5 | postgres=# show wal_summary_keep_time; wal_summary_keep_time ----------------------- 10d (1 row) |
工具pg_combinebackup 介绍
在 PostgreSQL 17 中,新增了工具 pg_combinebackup,这是一个用于合并全量备份和增量备份的实用工具。它的主要作用是将一系列增量备份整合到全量备份中,生成一个新的完整备份,简化管理并提高恢复的效率。
1 | pg_combinebackup full_20241204152955 inc_20241204153430 -o bk_all |
若是备份为tar格式,则需要解压缩才可以,否则报错:
1 2 | [pg17@lhrpgalloe bk]$ pg_combinebackup full_20241204152955 inc_20241204153430 -o bk_all pg_combinebackup: error: could not open file "inc_20241204153430/PG_VERSION": No such file or directory |
pg_basebackup示例
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 | [pg17@lhrpgalloe bk]$ pg_basebackup --help pg_basebackup takes a base backup of a running PostgreSQL server. Usage: pg_basebackup [OPTION]... Options controlling the output: -D, --pgdata=DIRECTORY receive base backup into directory -F, --format=p|t output format (plain (default), tar) -i, --incremental=OLDMANIFEST take incremental backup -r, --max-rate=RATE maximum transfer rate to transfer data directory (in kB/s, or use suffix "k" or "M") -R, --write-recovery-conf write configuration for replication -t, --target=TARGET[:DETAIL] backup target (if other than client) -T, --tablespace-mapping=OLDDIR=NEWDIR relocate tablespace in OLDDIR to NEWDIR --waldir=WALDIR location for the write-ahead log directory -X, --wal-method=none|fetch|stream include required WAL files with specified method -z, --gzip compress tar output -Z, --compress=[{client|server}-]METHOD[:DETAIL] compress on client or server as specified -Z, --compress=none do not compress tar output General options: -c, --checkpoint=fast|spread set fast or spread (default) checkpointing -C, --create-slot create replication slot -l, --label=LABEL set backup label -n, --no-clean do not clean up after errors -N, --no-sync do not wait for changes to be written safely to disk -P, --progress show progress information -S, --slot=SLOTNAME replication slot to use -v, --verbose output verbose messages -V, --version output version information, then exit --manifest-checksums=SHA{224,256,384,512}|CRC32C|NONE use algorithm for manifest checksums --manifest-force-encode hex encode all file names in manifest --no-estimate-size do not estimate backup size in server side --no-manifest suppress generation of backup manifest --no-slot prevent creation of temporary replication slot --no-verify-checksums do not verify checksums --sync-method=METHOD set method for syncing files to disk -?, --help show this help, then exit Connection options: -d, --dbname=CONNSTR connection string -h, --host=HOSTNAME database server host or socket directory -p, --port=PORT database server port number -s, --status-interval=INTERVAL time between status packets sent to server (in seconds) -U, --username=NAME connect as specified database user -w, --no-password never prompt for password -W, --password force password prompt (should happen automatically) Report bugs to <pgsql-bugs@lists.postgresql.org>. PostgreSQL home page: <https://www.postgresql.org/> |
增量备份恢复示例
全量+增量备份
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | -- 配置参数 ALTER SYSTEM SET summarize_wal = 'on'; SELECT pg_reload_conf(); -- 造数据(每2秒持续做update操作) create database db1; \c db1 CREATE TABLE t_hash AS SELECT id, md5(id::text) FROM generate_series(1, 2000000) AS id; CREATE TABLE t1 (last_updated timestamptz); INSERT INTO t1 (last_updated) VALUES (now()); db1=# \dt+ List of relations Schema | Name | Type | Owner | Persistence | Access method | Size | Description --------+--------+-------+----------+-------------+---------------+------------+------------- public | t1 | table | postgres | permanent | heap | 8192 bytes | public | t_hash | table | postgres | permanent | heap | 131 MB | (2 rows) db1=# UPDATE t1 SET last_updated = now(); \watch 2 UPDATE 1 UPDATE 1 UPDATE 1 。。。。。。。。 <=== (每2秒持续做update操作) -- 全量备份 pg_basebackup -D /bk/full_$(date +%Y%m%d%H%M%S) -Ft -z -c fast -P -v -- 增量备份 pg_basebackup -D /bk/inc_$(date +%Y%m%d%H%M%S)_full_20241204173216 -Ft -z -c fast --incremental=/bk/full_20241204173216/backup_manifest -P -v |
日志:


