Greenplum 7.4.0 New Features Overview

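Introduction

Greenplum 7.4.0 was released on 2025-01-31 and brings many new features.

VMware Greenplum 7.4.0 is a minor release that includes new and changed features and resolves several issues.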

New and Changed Features

New Features

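  • gpmigrate utility: Tanzu Greenplum 7.4.0 introduces the gpmigrate utility, which checks major-version compatibility between Tanzu Greenplum 6 and Greenplum 7 for gpbackup and gprestore migrations. It can also detect user-defined objects in pg_catalog.
  • Database-level and role-level GUC check: a new check detects removed GUCs that are still set at the database or role level.
  • gp_bloat_estimates view: the new gp_toolkit.gp_bloat_estimates view displays table bloat in terms of both dead tuples and unused space.
  • gp_resqueue_assignment_user GUC: the new gp_resqueue_assignment_user GUC toggles whether a statement's resource queue is assigned based on the current user or the authenticated user.
  • Renaming resource queues and resource groups: resource queues and resource groups can now be renamed.
  • gp_cleanup_unused_rp_snapshots() function: the new gp_toolkit.gp_cleanup_unused_rp_snapshots() UDF removes unused restore-point-based snapshots on the hot standby, preventing snapshot accumulation.
  • gp_orphaned_backends view: the new gp_toolkit.gp_orphaned_backends view lists all orphaned backend processes in the system. It is a wrapper around gp_stat_activity and uses the same schema.
  • Bloat diagnosis views for append-optimized tables:
    • gp_toolkit.gp_bloat_diag_appendoptimized: diagnoses bloat in append-optimized (AO) tables.
    • gp_toolkit.gp_bloat_estimates_appendoptimized: collects bloat statistics for AO tables, as gp_toolkit.gp_bloat_estimates does for heap tables.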
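  • Orca query optimizer enhancements:
    • Reduced memory consumption of metadata ID objects during the optimization stage, which helps avoid optimization-related memory limits.
    • Support for Row Expressions.
    • Support for SIRV functions.
  • Password management enhancement: the new advanced_password_check.role_password_status(rolname TEXT) UDF displays the password creation time, the expiration time, and the role's valid-until time.
  • PostgresML has been updated to version 2.8.2.
  • gpload SSL protocol support: gpload now supports the ssl_min_protocol_version option for gpfdists connections. It is set in the YAML configuration file and accepts TLSv1.2 and TLSv1.3.
  • gpcheckcat enhancement: the new --skip-leaked-temp-schema option skips the temporary schema (pg_temp, pg_toast_temp) cleanup check.
  • VACUUM option enhancement: vacuumdb gains the --ao-aux-only option, which runs VACUUM on AO auxiliary tables only.
  • gpinitstandby enhancement: the new -A <standby_hostaddress> option specifies the host address of the standby coordinator. When it is omitted, the standby hostname is used.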
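  • New Pointcloud extension: stores and manages point cloud data.
  • New pgRouting extension: adds advanced geospatial routing and network analysis to the PostgreSQL database.
  • pg_stat_activity: can now list <early unassign> entries.
  • Greenplum Virtual Postmaster HA: adds EL9 support.
  • gpsetup utility: configures system kernel parameters on Greenplum coordinator and segment hosts.
  • gpctl utility: initializes Greenplum using the new gRPC communication framework. It can create a mirrored or mirrorless cluster and accepts input configuration files in YAML, JSON, or TOML format.
  • gpservice utility: starts and manages the hub and agent gRPC servers used for inter-host communication, and can schedule tasks through the hub gRPC server.
  • gp_resgroup_enable_early_unassign: early unassign can be enabled by setting the gp_resgroup_enable_early_unassign GUC.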
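
To make the gpload item above concrete, here is a minimal sketch that renders a control file with a minimum TLS version for gpfdists. Only the ssl_min_protocol_version setting, its location under the input source, and its two accepted values come from the release notes. The key casing, the database, host, file, certificate path, and table names, and the helper render_gpload_control are all assumptions made for illustration.

```python
# Sketch: render a gpload control file that sets the minimum TLS version
# for gpfdists connections. All concrete values below are placeholders.

ALLOWED_TLS_VERSIONS = ("TLSv1.2", "TLSv1.3")  # values named in the release notes


def render_gpload_control(min_tls: str) -> str:
    """Return YAML text for a gpload control file using the given TLS floor."""
    if min_tls not in ALLOWED_TLS_VERSIONS:
        raise ValueError(f"unsupported ssl_min_protocol_version: {min_tls!r}")
    lines = [
        "VERSION: 1.0.0.1",
        "DATABASE: demo_db",                       # placeholder database
        "USER: gpadmin",
        "HOST: cdw",                               # placeholder coordinator host
        "PORT: 5432",
        "GPLOAD:",
        "  INPUT:",
        "    - SOURCE:",
        "        FILE:",
        "          - /data/staging/orders.txt",    # placeholder input file
        "        SSL: true",
        "        CERTIFICATES_PATH: /home/gpadmin/certs",  # placeholder path
        # Key spelling follows the release notes; exact casing is an assumption.
        f"        ssl_min_protocol_version: {min_tls}",
        "    - FORMAT: text",
        "  OUTPUT:",
        "    - TABLE: public.orders",              # placeholder target table
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_gpload_control("TLSv1.3"))
```

Validating the value before writing the file surfaces a typo such as TLSv1.1 early, rather than at load time.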

Enhancements

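  • View naming conventions: view naming conventions for core counting are now supported.
  • gpcheckcat online checks: gpcheckcat now designates the dependency and distribution_policy checks as online checks.
  • gpexpand enhancements:
    • gpexpand.status_detail is now populated during the initialization phase, using parallel workers.
    • Connection usage during gpexpand's initialization phase has been optimized.
    • gpexpand now compresses the segment template tar before distributing it to the expansion segments, improving network performance.
    • gpexpand now ignores the SIGHUP signal and keeps running even if the surrounding terminal exits.
  • gp_toolkit enhancements:
    • New gp_toolkit.__gp_aoseg_all(), gp_toolkit.__gp_aocsseg_all(), and gp_toolkit.__gp_seg_all_summary() functions collect metadata from the on-disk segment files of all AO tables in the database.
    • New gp_toolkit.gp_resgroup_bypassed_queries view displays queries running on the cluster that have bypassed resource group assignment.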
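  • Resource queue enhancement: resource queue changes restored through continuous recovery now take effect without a cluster restart.
  • Hot standby optimization: the number of snapshot imports is limited, eliminating unnecessary overhead.
  • Synchronous replication optimization: throttling is now applied once, in a non-critical path, which reduces the risk of deadlocks caused by throttling at multiple points.
  • EXPLAIN improvement: the plan now indicates when a Result node serves as a hash filter.
  • Orca improvements:
    • Reduced planning time and memory usage for queries on large partitioned tables.
    • Better query plans for multi-join queries involving randomly distributed partitioned tables.
  • gpcheckcat -l enhancement: the output now indicates which checks can run in online mode and which cannot.
  • New gprecoverseg options:
    • gprecoverseg -c <content_id> rebalances specific segments by content ID. It must be used together with the -r option.
    • gprecoverseg now displays a message when segments are undergoing startup recovery as part of a rebalance.
  • Logging enhancement: a new /var/log/greenplum-7/ directory holds gpservice logs.
  • gpconfig change tracking: when a parameter is changed with gpconfig, the corresponding postgresql.conf entry now includes a timestamp.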
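
The rule that -c only makes sense together with -r can be captured in a small helper. This is a sketch, not part of Greenplum: gprecoverseg_rebalance_argv is a hypothetical function, it handles a single content ID because the notes do not spell out how several IDs are passed, and it only builds the argument list without running anything.

```python
import shlex
from typing import List, Optional


def gprecoverseg_rebalance_argv(content_id: Optional[int] = None) -> List[str]:
    """Build the argument list for a rebalance, optionally for one content ID."""
    argv = ["gprecoverseg", "-r"]            # -r requests a rebalance
    if content_id is not None:
        if content_id < 0:
            # Segment content IDs are non-negative; reject anything else.
            raise ValueError("content_id must be a non-negative integer")
        argv += ["-c", str(content_id)]      # -c is always emitted alongside -r
    return argv


if __name__ == "__main__":
    # Show the command lines instead of executing them.
    print(shlex.join(gprecoverseg_rebalance_argv()))
    print(shlex.join(gprecoverseg_rebalance_argv(2)))
```

Because the helper always starts from -r, a caller cannot produce a -c invocation without it.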

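Resolved Issues

Server

  • Query hang: fixed an issue in the interconnect where the receiver cached an incorrect sender address, causing queries to hang.
  • Startup crash: fixed a crash that could occur when starting or restarting a database that has disk quota enabled and contains more than 10,000 tables.
  • CLUSTER command: CLUSTER is now allowed on mapped, non-shared system relations such as pg_class.
  • Temporary table cleanup race condition: fixed a race condition that autovacuum could hit while cleaning up temporary tables, and added the gp_autovacuum_enable_temp_table_cleanup GUC to control whether this cleanup is enabled.
  • Foreign key constraint: fixed an issue where adding a FOREIGN KEY constraint could incorrectly use a different index.
  • Fixed inconsistent names of AO auxiliary tables.
  • Fixed errors related to query plans.
  • Fixed a performance regression caused by collecting AO table segment file statistics during gpbackup backups.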

Original Text

VMware Greenplum 7.4.0 is a minor release that includes new and changed features and resolves several issues.

New and Changed Features

New Features

  • Tanzu Greenplum 7.4.0 introduces the gpmigrate utility to check major version compatibility between Tanzu Greenplum 6 and Greenplum 7 for gpbackup and gprestore migrations. This utility can also detect user defined objects in pg_catalog.
  • Introduced a new check for removed GUCs set at the database or role level.
  • Tanzu Greenplum 7.4.0 introduces the gp_toolkit.gp_bloat_estimates view to display bloat in terms of both dead tuples and unused space.
  • Tanzu Greenplum 7.4.0 introduces the gp_resqueue_assignment_user GUC to toggle between assigning a resource queue to a statement based on the current user or the authenticated user.
  • Tanzu Greenplum 7.4.0 allows you to rename the resource queues and groups.
  • Tanzu Greenplum 7.4.0 adds a UDF gp_toolkit.gp_cleanup_unused_rp_snapshots(), to remove unused restore-point based snapshots on the hot standby, preventing snapshot accumulation.
  • Tanzu Greenplum 7.4.0 introduces a new view, gp_toolkit.gp_orphaned_backends, which lists all orphaned backend processes in the system. This view acts as a wrapper for gp_stat_activity and uses the same schema.
  • Tanzu Greenplum 7.4.0 introduces the gp_toolkit.gp_bloat_diag_appendoptimized view for table bloat diagnosis in append-optimized tables. Additionally, introduced the gp_toolkit.gp_bloat_estimates_appendoptimized view to collect table bloat statistics for append-optimized tables, similar to the gp_toolkit.gp_bloat_estimates for heap tables.
  • Tanzu Greenplum 7.4.0 introduces a new feature to optimize memory consumption of metadata ID objects during Orca's optimization stage, helping users avoid optimization-related memory limitations.
  • Orca now supports Row Expressions.
  • Orca now supports SIRV functions.
  • Tanzu Greenplum 7.4.0 introduces a new UDF advanced_password_check.role_password_status(rolname TEXT) to display the password creation time, expiration time, and the role's valid until time.
  • The PostgresML version has been updated to 2.8.2.
  • Tanzu Greenplum 7.4.0 introduces support for ssl_min_protocol_version in gpload for gpfdists connections via the configuration file. The new setting is located in the YAML file under input::source::ssl_min_protocol_version. Supported values align with the gpfdist --ssl_min_protocol_version option, allowing TLSv1.2 and TLSv1.3.
  • Tanzu Greenplum 7.4.0 introduces a new optional flag, --skip-leaked-temp-schema, in gpcheckcat to allow skipping the temporary schema (pg_temp, pg_toast_temp) cleanup check that runs at the start of gpcheckcat.
  • Tanzu Greenplum 7.4.0 introduces a new optional flag, --ao-aux-only, in vacuumdb to allow vacuuming only the auxiliary tables of append-optimized tables.
  • Tanzu Greenplum 7.4.0 introduces a new option in gpinitstandby to specify the host address of the standby coordinator. This can be done using the -A <standby_hostaddress> option. If the standby address is not specified, it will default to the standby hostname.
  • Tanzu Greenplum 7.4.0 introduces the Pointcloud extension that enables storing and managing point cloud data.
  • Tanzu Greenplum 7.4.0 introduces the pgRouting extension for the PostgreSQL database that enhances its capabilities by introducing advanced geospatial routing and network analysis.
  • Tanzu Greenplum 7.4.0 introduces the ability for pg_stat_activity to list <early unassign> entries.
  • Tanzu Greenplum 7.4.0 introduces EL9 support for Greenplum Virtual Postmaster HA service.
  • Tanzu Greenplum 7.4.0 introduces gpsetup utility to configure system kernel parameters on Greenplum coordinator and segment hosts.
  • Tanzu Greenplum 7.4.0 introduces the gpctl utility to initialize Greenplum using the new gRPC communication framework. gpctl can be used to create a mirrored or mirrorless cluster and also supports input configuration file of various formats like YAML, JSON, TOML.
  • Tanzu Greenplum 7.4.0 introduces the gpservice utility for initiating and managing the hub and agent gRPC servers used for inter-host communication. Additionally, it offers the capability to schedule tasks using the hub gRPC server.
  • Tanzu Greenplum 7.4.0 lets you enable early unassign by setting the server configuration parameter gp_resgroup_enable_early_unassign.
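
The new command-line flags in the list above can be sketched as argument lists. The flags --ao-aux-only, --skip-leaked-temp-schema, and -A come from these notes. The use of -d for vacuumdb, a positional database name for gpcheckcat, and -s for gpinitstandby reflects the existing options of those utilities as I understand them, and the helper names, host names, and addresses are hypothetical.

```python
import shlex
from typing import List, Optional


def vacuumdb_ao_aux_argv(dbname: str) -> List[str]:
    """vacuumdb limited to append-optimized auxiliary tables."""
    return ["vacuumdb", "--ao-aux-only", "-d", dbname]


def gpcheckcat_argv(dbname: str, skip_leaked_temp_schema: bool = False) -> List[str]:
    """gpcheckcat, optionally skipping the leaked temp schema cleanup check."""
    argv = ["gpcheckcat"]
    if skip_leaked_temp_schema:
        argv.append("--skip-leaked-temp-schema")
    argv.append(dbname)                      # database name as positional argument
    return argv


def gpinitstandby_argv(standby_host: str,
                       standby_address: Optional[str] = None) -> List[str]:
    """gpinitstandby with an optional explicit standby host address."""
    argv = ["gpinitstandby", "-s", standby_host]
    if standby_address is not None:
        argv += ["-A", standby_address]      # omitted: the hostname is used
    return argv


if __name__ == "__main__":
    # Print the commands rather than running them against a cluster.
    for argv in (
        vacuumdb_ao_aux_argv("demo_db"),
        gpcheckcat_argv("demo_db", skip_leaked_temp_schema=True),
        gpinitstandby_argv("scdw", "10.0.0.12"),
    ):
        print(shlex.join(argv))
```

Keeping the commands as lists rather than strings avoids shell-quoting mistakes if they are later passed to subprocess.run.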

Enhancements

  • Tanzu Greenplum 7.4.0 now supports view naming conventions for core counting.
  • Tanzu Greenplum 7.4.0 now supports gpcheckcat to designate the 'dependency' and 'distribution_policy' checks as online.
  • Tanzu Greenplum 7.4.0 introduces the following enhancements for the gpexpand utility:
    • gpexpand now populates gpexpand.status_detail during the initialization phase with parallel workers.
    • Optimized connection usage during gpexpand's initialization phase.
    • gpexpand now compresses the segment template tar before distributing it to the expansion segments, improving network performance.
    • gpexpand now ignores the SIGHUP signal and continues running, even if the surrounding terminal exits.
  • Tanzu Greenplum 7.4.0 introduces the following enhancements for the gp_toolkit utility:
    • The gp_toolkit utility now supports gp_toolkit.__gp_aoseg_all(), gp_toolkit.__gp_aocsseg_all(), and gp_toolkit.__gp_seg_all_summary() to collect metadata information from the on-disk segment files of all append-optimized tables in the database.
    • The gp_toolkit utility now supports gp_toolkit.gp_resgroup_bypassed_queries, which displays queries running on the cluster that have bypassed resource group assignment.
  • This version of Tanzu Greenplum supports resource queue changes being restored from continuous recovery without requiring a cluster restart.
  • Enhanced performance in hot standby by limiting the number of snapshot imports, eliminating unnecessary overhead.
  • Enhanced syncrep by implementing throttling once in a non-critical path, reducing the risk of issues like deadlocks that occur from throttling at multiple points.
  • Improved the EXPLAIN plan by indicating when the Result node serves as a hash filter.
  • Improved planning time and reduced memory usage for queries using Orca with large partitioned tables.
  • Improved Orca query plans for cases with multiple joins involving randomly distributed partitioned tables.
  • Enhanced gpcheckcat -l to indicate which checks can be run in online mode and which cannot.
  • The gprecoverseg command now supports the following options:
    • The -c option of gprecoverseg enables users to rebalance specific segments by specifying their content IDs. This option must be used in conjunction with the -r option.
    • gprecoverseg now displays a message when segments are undergoing startup recovery as part of the rebalance operation.
  • Enhanced logging by introducing a new log folder, /var/log/greenplum-7, to hold gpservice logs. This folder is created as part of the gpdb rpm installation and will serve as the default location for gpservice logs.
  • When a parameter is changed using gpconfig, the corresponding entry in postgresql.conf will now include a timestamp marking the change.

Resolved Issues

Server

Author: 小麦苗