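Greenplum 7.4.0 New Features
Introduction
Greenplum 7.4.0 was released on 2025-01-31 and brings many new features.
VMware Greenplum 7.4.0 is a minor release that includes new and changed features and resolves several issues.
New and Changed Features
New Features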
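- gpmigrate utility: Tanzu Greenplum 7.4.0 introduces the gpmigrate utility, which checks major-version compatibility between Tanzu Greenplum 6 and Greenplum 7 for gpbackup and gprestore migrations. It can also detect user-defined objects in pg_catalog.
- Database-level and role-level GUC check: a new check detects removed GUCs that are still set at the database or role level.
- gp_bloat_estimates view: the new gp_toolkit.gp_bloat_estimates view displays table bloat in terms of both dead tuples and unused space.
- gp_resqueue_assignment_user GUC: the new gp_resqueue_assignment_user GUC toggles whether a resource queue is assigned based on the current user or the authenticated user.
- Renaming resource queues and resource groups: both can now be renamed.
- gp_cleanup_unused_rp_snapshots() function: the new UDF gp_toolkit.gp_cleanup_unused_rp_snapshots() removes unused restore-point based snapshots on the hot standby, preventing snapshot accumulation.
- gp_orphaned_backends view: the new gp_toolkit.gp_orphaned_backends view lists all orphaned backend processes in the system. It is built on gp_stat_activity and uses the same schema.
- Bloat diagnosis views for append-optimized (AO) tables:
  - gp_toolkit.gp_bloat_diag_appendoptimized: diagnoses bloat in AO tables.
  - gp_toolkit.gp_bloat_estimates_appendoptimized: collects bloat statistics for AO tables; similar to gp_bloat_estimates, but for AO tables.
- Orca query optimizer enhancements:
  - Reduced memory consumption of metadata ID objects, which helps avoid memory limits during the optimization stage.
  - Support for Row Expressions.
  - Support for SIRV functions.
- Password management: the new UDF advanced_password_check.role_password_status(rolname TEXT) displays the password creation time, the expiration time, and the role's valid-until time.
- PostgresML has been upgraded to version 2.8.2.
- gpload SSL protocol support: gpload supports the ssl_min_protocol_version option, which can be set to TLSv1.2 or TLSv1.3 in the YAML configuration file.
- gpcheckcat: the new --skip-leaked-temp-schema option skips the temporary schema (pg_temp and pg_toast_temp) cleanup check.
- VACUUM options: vacuumdb gains the --ao-aux-only option, which runs VACUUM only on the auxiliary tables of AO tables.
- gpinitstandby: the new -A <standby_hostaddress> option specifies the host address of the standby coordinator. If it is omitted, the standby hostname is used.
- Pointcloud extension: supports storing and managing point cloud data.
- pgRouting extension: adds geospatial routing and network analysis capabilities to the PostgreSQL database.
- pg_stat_activity: can now list <early unassign> entries.
- Greenplum Virtual Postmaster HA: adds support for EL9.
- gpctl utility: initializes Greenplum using the gRPC communication framework. It can create mirrored or mirrorless clusters and accepts input configuration files in YAML, JSON, or TOML format.
- gpservice utility: starts and manages the gRPC-based hub and agent servers, and supports task scheduling across hosts.
- gp_resgroup_enable_early_unassign: the early unassign mechanism can be enabled through the gp_resgroup_enable_early_unassign GUC.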
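Several of the new options in this list are plain command-line flags or configuration values. As a minimal sketch, the Python helpers below only assemble argument lists and validate one setting; they do not run anything. The flag names (--skip-leaked-temp-schema, --ao-aux-only, -A) and the allowed TLS values come from the release notes. The helper names, the example database name, the example host names, and passing the database name as a positional argument are assumptions made for illustration, and -s is assumed to be the existing gpinitstandby option for the standby hostname.

```python
from typing import List

# Values the release notes list for ssl_min_protocol_version in gpload.
ALLOWED_SSL_MIN_PROTOCOL_VERSIONS = ("TLSv1.2", "TLSv1.3")


def check_ssl_min_protocol_version(value: str) -> str:
    """Return the value unchanged if it is one of the documented options."""
    if value not in ALLOWED_SSL_MIN_PROTOCOL_VERSIONS:
        raise ValueError(f"unsupported ssl_min_protocol_version: {value!r}")
    return value


def gpcheckcat_skip_temp_schema(dbname: str) -> List[str]:
    """gpcheckcat call that skips the temporary schema cleanup check."""
    return ["gpcheckcat", "--skip-leaked-temp-schema", dbname]


def vacuumdb_ao_aux_only(dbname: str) -> List[str]:
    """vacuumdb call restricted to the auxiliary tables of AO tables."""
    return ["vacuumdb", "--ao-aux-only", dbname]


def gpinitstandby_with_address(standby_host: str, standby_address: str = "") -> List[str]:
    """gpinitstandby call; -A is added only when an address is given.

    Without -A, the release notes say the standby hostname is used.
    """
    args = ["gpinitstandby", "-s", standby_host]
    if standby_address:
        args += ["-A", standby_address]
    return args


if __name__ == "__main__":
    # Hypothetical database and host names, for illustration only.
    for cmd in (
        gpcheckcat_skip_temp_schema("sales"),
        vacuumdb_ao_aux_only("sales"),
        gpinitstandby_with_address("standby-host", "10.0.0.12"),
    ):
        print(" ".join(cmd))
```

Keeping the commands as argument lists rather than shell strings avoids quoting problems if they are later passed to subprocess.run.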
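Enhancements
- View naming conventions: added support for view naming conventions for core counting.
- gpcheckcat online checks: gpcheckcat now treats the dependency and distribution_policy checks as online checks.
- gpexpand:
  - gpexpand.status_detail is now populated during the initialization phase, using parallel workers.
  - Optimized gpexpand connection management, making initialization more efficient.
  - gpexpand now compresses the segment template tar before distributing it, improving network performance.
  - gpexpand can now ignore the SIGHUP signal, so the job keeps running after the terminal exits.
- gp_toolkit:
  - Added gp_toolkit.__gp_aoseg_all(), gp_toolkit.__gp_aocsseg_all(), and gp_toolkit.__gp_seg_all_summary(), which collect metadata from the on-disk segment files of AO tables.
  - Added the gp_toolkit.gp_resgroup_bypassed_queries view, which shows queries that bypassed resource group assignment.
- Resource queues: resource queue changes restored during recovery are now applied correctly without a cluster restart.
- Hot standby: the number of snapshot imports is now limited, eliminating unnecessary overhead.
- Synchronous replication: the throttling strategy was adjusted to reduce the risk of deadlocks.
- EXPLAIN plans: the EXPLAIN plan now indicates when a Result node serves as a hash filter.
- Orca query optimization:
  - Reduced planning time and memory usage for queries on large partitioned tables.
  - Improved query plans for multi-join queries over randomly distributed partitioned tables.
- gpcheckcat -l: now indicates which checks can run online and which must run offline.
- gprecoverseg:
  - gprecoverseg -c <content_id> rebalances the segments with the specified content IDs (it must be combined with the -r option).
  - gprecoverseg now displays a message when segments are in startup recovery.
- Logging: a new directory, /var/log/greenplum-7/, holds gpservice logs.
- gpconfig change records: when a parameter is changed with gpconfig, the corresponding entry in postgresql.conf now includes a timestamp.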
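The rule that -c only works together with -r can be captured in a small helper. The sketch below assembles a gprecoverseg argument list and always includes -r, so the -c option can never be emitted on its own. The function name is hypothetical, and because the release notes do not say how several content IDs are passed in one call, the helper accepts a single ID. Rejecting negative IDs is also an assumption, based on the coordinator using content ID -1.

```python
from typing import List, Optional


def gprecoverseg_rebalance(content_id: Optional[int] = None) -> List[str]:
    """Build a gprecoverseg rebalance command.

    With no content_id, this is a plain rebalance (-r).
    With a content_id, -c is appended after -r, as the option requires.
    """
    args = ["gprecoverseg", "-r"]
    if content_id is not None:
        if content_id < 0:
            # Assumption: only segment content IDs (0 and above) make sense here.
            raise ValueError("content_id must be zero or greater")
        args += ["-c", str(content_id)]
    return args


if __name__ == "__main__":
    print(" ".join(gprecoverseg_rebalance()))
    print(" ".join(gprecoverseg_rebalance(3)))
```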
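Resolved Issues
Server
- Query hang: fixed an interconnect issue where a query would hang because the receiver had cached an incorrect sender address.
- Startup crash: fixed a crash that could occur when starting or restarting with a disk quota-enabled database containing more than 10,000 tables.
- CLUSTER command: CLUSTER can now be run on mapped non-shared system relations (such as pg_class).
- Temporary table cleanup race: fixed race conditions that autovacuum could hit while cleaning up temporary tables, and added the gp_autovacuum_enable_temp_table_cleanup GUC to control whether this cleanup is enabled.
- Foreign key constraints: fixed an issue where a different index could be used incorrectly while adding a FOREIGN KEY constraint.
- Fixed inconsistent AO auxiliary table names.
- Fixed query-plan related errors.
- Fixed a performance regression caused by collecting AO table segment file statistics during gpbackup backups.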
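The new gp_autovacuum_enable_temp_table_cleanup GUC can presumably be changed like any other server configuration parameter. As a rough sketch, the helpers below build gpconfig argument lists using its -c (change), -v (value), and -s (show) options. Treating the GUC as a boolean with on/off values is an assumption, as are the helper names. The release notes do not say whether a reload or a restart is needed for the change to take effect.

```python
from typing import List

# GUC name taken from the release notes.
GUC_NAME = "gp_autovacuum_enable_temp_table_cleanup"


def gpconfig_set_temp_table_cleanup(enabled: bool) -> List[str]:
    """gpconfig call that turns the cleanup on or off (boolean is assumed)."""
    value = "on" if enabled else "off"
    return ["gpconfig", "-c", GUC_NAME, "-v", value]


def gpconfig_show_temp_table_cleanup() -> List[str]:
    """gpconfig call that shows the current setting across the cluster."""
    return ["gpconfig", "-s", GUC_NAME]


if __name__ == "__main__":
    print(" ".join(gpconfig_set_temp_table_cleanup(False)))
    print(" ".join(gpconfig_show_temp_table_cleanup()))
```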
Original Text
VMware Greenplum 7.4.0 is a minor release that includes new and changed features and resolves several issues.
New and Changed Features
New Features
- Tanzu Greenplum 7.4.0 introduces the gpmigrate utility to check major version compatibility between Tanzu Greenplum 6 and Greenplum 7 for gpbackup and gprestore migrations. This utility can also detect user defined objects in pg_catalog.
- Introduced a new check for removed GUCs set at the database or role level.
- Tanzu Greenplum 7.4.0 introduces the gp_toolkit.gp_bloat_estimates view to display bloat in terms of both dead tuples and unused space.
- Tanzu Greenplum 7.4.0 introduces the gp_resqueue_assignment_user GUC to toggle between assigning a resource queue to a statement based on the current user or the authenticated user.
- Tanzu Greenplum 7.4.0 allows you to rename resource queues and groups.
- Tanzu Greenplum 7.4.0 adds a UDF, gp_toolkit.gp_cleanup_unused_rp_snapshots(), to remove unused restore-point based snapshots on the hot standby, preventing snapshot accumulation.
- Tanzu Greenplum 7.4.0 introduces a new view, gp_toolkit.gp_orphaned_backends, which lists all orphaned backend processes in the system. This view acts as a wrapper for gp_stat_activity and uses the same schema.
- Tanzu Greenplum 7.4.0 introduces the gp_toolkit.gp_bloat_diag_appendoptimized view for table bloat diagnosis in append-optimized tables. Additionally, it introduces the gp_toolkit.gp_bloat_estimates_appendoptimized view to collect table bloat statistics for append-optimized tables, similar to the gp_toolkit.gp_bloat_estimates view for heap tables.
- Tanzu Greenplum 7.4.0 introduces a new feature to optimize memory consumption of metadata ID objects during Orca's optimization stage, helping users avoid optimization-related memory limitations.
- Orca now supports Row Expressions.
- Orca now supports SIRV functions.
- Tanzu Greenplum 7.4.0 introduces a new UDF, advanced_password_check.role_password_status(rolname TEXT), to display the password creation time, expiration time, and the role's valid until time.
- The PostgresML version has been updated to 2.8.2.
- Tanzu Greenplum 7.4.0 introduces support for ssl_min_protocol_version in gpload for gpfdists connections via the configuration file. The new setting is located in the YAML file under input::source::ssl_min_protocol_version. Supported values align with the gpfdist --ssl_min_protocol_version option, allowing TLSv1.2 and TLSv1.3.
- Tanzu Greenplum 7.4.0 introduces a new optional flag, --skip-leaked-temp-schema, in gpcheckcat to allow skipping the temporary schema (pg_temp, pg_toast_temp) cleanup check that runs at the start of gpcheckcat.
- Tanzu Greenplum 7.4.0 introduces a new optional flag, --ao-aux-only, to vacuumdb to allow vacuum on append-optimized aux tables.
- Tanzu Greenplum 7.4.0 introduces a new option in gpinitstandby to specify the host address of the standby coordinator. This can be done using the -A <standby_hostaddress> option. If the standby address is not specified, it will default to the standby hostname.
- Tanzu Greenplum 7.4.0 introduces the Pointcloud extension that enables storing and managing point cloud data.
- Tanzu Greenplum 7.4.0 introduces the pgRouting extension for the PostgreSQL database that enhances its capabilities by introducing advanced geospatial routing and network analysis.
- Tanzu Greenplum 7.4.0 introduces the ability for pg_stat_activity to list <early unassign> entries.
- Tanzu Greenplum 7.4.0 introduces EL9 support for Greenplum Virtual Postmaster HA service.
- Tanzu Greenplum 7.4.0 introduces the gpsetup utility to configure system kernel parameters on Greenplum coordinator and segment hosts.
- Tanzu Greenplum 7.4.0 introduces the gpctl utility to initialize Greenplum using the new gRPC communication framework. gpctl can be used to create a mirrored or mirrorless cluster and also supports input configuration files in various formats such as YAML, JSON, and TOML.
- Tanzu Greenplum 7.4.0 introduces the gpservice utility for initiating and managing the hub and agent gRPC servers used for inter-host communication. Additionally, it offers the capability to schedule tasks using the hub gRPC server.
- Tanzu Greenplum 7.4.0 lets you enable early unassign by setting the server configuration parameter gp_resgroup_enable_early_unassign.
Enhancements
- Tanzu Greenplum 7.4.0 now supports view naming conventions for core counting.
- Tanzu Greenplum 7.4.0 now supports gpcheckcat to designate the 'dependency' and 'distribution_policy' checks as online.
- Tanzu Greenplum 7.4.0 introduces the following enhancements for the gpexpand utility:
  - gpexpand now populates gpexpand.status_detail during the initialization phase with parallel workers.
  - Optimized connection usage during gpexpand's initialization phase.
  - gpexpand now compresses the segment template tar before distributing it to the expansion segments, improving network performance.
  - gpexpand now ignores the SIGHUP signal and continues running, even if the surrounding terminal exits.
- Tanzu Greenplum 7.4.0 introduces the following enhancements for the gp_toolkit utility:
  - The gp_toolkit utility now supports gp_toolkit.__gp_aoseg_all(), gp_toolkit.__gp_aocsseg_all(), and gp_toolkit.__gp_seg_all_summary() to collect metadata information from the on-disk segment files of all append-optimized tables in the database.
  - The gp_toolkit utility now supports gp_toolkit.gp_resgroup_bypassed_queries, which displays queries running on the cluster that have bypassed resource group assignment.
- This version of Tanzu Greenplum supports resource queue changes being restored from continuous recovery without requiring a cluster restart.
- Enhanced performance in hot standby by limiting the number of snapshot imports, eliminating unnecessary overhead.
- Enhanced syncrep by implementing throttling once in a non-critical path, reducing the risk of issues like deadlocks that occur from throttling at multiple points.
- Improved the EXPLAIN plan by indicating when the Result node serves as a hash filter.
- Improved planning time and reduced memory usage for queries using Orca with large partitioned tables.
- Improved Orca query plans for cases with multiple joins involving randomly distributed partitioned tables.
- Enhanced gpcheckcat -l to indicate which checks can be run in online mode and which cannot.
- The gprecoverseg command now supports the following options:
  - The -c option of gprecoverseg enables users to rebalance specific segments by specifying their content IDs. This option must be used in conjunction with the -r option.
  - gprecoverseg rebalance now displays a message when segments are undergoing startup recovery as part of the rebalance operation.
- Enhanced logging by introducing a new log folder, /var/log/greenplum-7, to hold gpservice logs. This folder is created as part of the gpdb rpm installation and will serve as the default location for gpservice logs.
- When a parameter is changed using gpconfig, the corresponding entry in postgresql.conf will now include a timestamp marking the change.
Resolved Issues
Server
35872626
Resolves an interconnect issue where a query would hang if the receiver used an incorrect cached sender address.
35841158
Resolves a crash that occurred when starting or rebooting the cluster with a disk quota-enabled database containing over 10,000 tables.
35819266/35817157
Resolves an issue that allows CLUSTER to be run on mapped non-shared system relations (such as pg_class).
35582363
Resolves race conditions in autovacuum during temporary table cleanup. Also, added a GUC parameter gp_autovacuum_enable_temp_table_cleanup to optionally enable or disable this cleanup.
35537176
Resolves an issue where different indexes could be used while adding FOREIGN KEY CONSTRAINT.
35940969
Resolved an issue where unnecessary resource throttling occurred when the Greenplum cluster was started in restricted mode.
N/A
Resolves an issue where residual QE processes were not cleared after being disconnected from the QD.
N/A
Resolves an issue where the AO auxiliary table name was inconsistent with the main table OID.
N/A
Resolves an integer overflow issue with pg_class.relpages.
N/A
Resolves an issue with inconsistent output when creating and dropping external tables.
N/A
Resolves an issue with gp_toolkit.gp_num_physical_cores_per_host that could show incorrect results on a GPDR cluster.
N/A
Resolves an issue with VACUUM of append-optimized tables that might cause incorrect SELECT results or PANIC errors.
N/A
Resolves an issue by providing a new UDF in gp_toolkit to retrieve all tables and their sizes in the database.
N/A
Resolves an error that occurred during the scan of external tables with check constraints.
N/A
Resolves an issue causing high memory peak during a single INSERT SELECT query.