3.0.5 Release Notes #50333

gavinchou opened this issue Apr 23, 2025 · 1 comment

gavinchou commented Apr 23, 2025

Behavior Changes

New Features

Lakehouse

  • Added Catalog/Database/Table quantity monitoring metrics to FE Metrics (#47891)
  • MaxCompute Catalog now supports Timestamp type (#48768)

Asynchronous Materialized Views

Query Execution

  • Added URL processing functions: top_level_domain, first_significant_subdomain, cut_to_first_significant_subdomain (#42488)
  • Added year_of_week function with Trino-compatible implementation (#48870)
  • percentile_array function now supports Float and Double data types (#48094)
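For readers unfamiliar with the week-year idea behind year_of_week: the note only says the function is Trino-compatible, so the sketch below assumes it returns the ISO 8601 week-numbering year, as Trino's function of the same name does. It uses Python's standard library to illustrate that rule; it is not Doris code, and the helper name `iso_year_of_week` is hypothetical.

```python
from datetime import date


def iso_year_of_week(d: date) -> int:
    """Return the ISO 8601 week-numbering year of d (hypothetical helper)."""
    return d.isocalendar()[0]


# 2005-01-01 is a Saturday and falls in ISO week 53 of 2004.
print(iso_year_of_week(date(2005, 1, 1)))    # 2004
# 2024-12-30 is a Monday and opens ISO week 1 of 2025.
print(iso_year_of_week(date(2024, 12, 30)))  # 2025
```

Under this assumption, the week-year differs from the calendar year only for a few days around 1 January.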

Semi-structured Data Management

Storage-Compute Separation

  • Added compute group renaming support (#46221)

Improvements

Storage

  • Stream Load now supports compressed JSON file ingestion (#49044)

  • Enhanced error messages for various ingestion scenarios (#48436 #47721 #47804 #48638 #48344 #49287 #48009)

  • Added multiple metrics for Routine Load (#49045 #48764)

  • Optimized Routine Load scheduling algorithm to prevent single job failure from affecting overall scheduling (#47847)

  • Added Routine Load system table (#49284)

  • Improved query performance for Merge-On-Write (MOW) tables under high-frequency ingestion (#48968)

  • Enhanced Profile information display for Key Range queries (#48191)

  • Accelerated Compaction task generation to improve performance (#49547)
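As a rough illustration of the compressed JSON ingestion item above, the sketch below prepares the pieces of a Stream Load request with a gzip-compressed JSON body. It only builds the URL, headers, and body and sends nothing. The host, database, table, and label are placeholders, port 8030 is assumed to be the FE HTTP port, authentication is omitted, and using the `compress_type` header for JSON (as it is used for compressed CSV) is an assumption the release note does not spell out.

```python
import gzip
import json


def build_stream_load_request(fe_host, db, table, rows, label):
    """Build (url, headers, body) for a gzip-compressed JSON Stream Load.

    Hypothetical helper: nothing is sent; hand the pieces to an HTTP client.
    """
    body = gzip.compress(json.dumps(rows).encode("utf-8"))
    url = f"http://{fe_host}:8030/api/{db}/{table}/_stream_load"
    headers = {
        "label": label,
        "format": "json",
        "strip_outer_array": "true",  # rows is sent as one JSON array of objects
        "compress_type": "gz",        # assumption: same header as compressed CSV loads
        "Expect": "100-continue",
    }
    return url, headers, body


url, headers, body = build_stream_load_request(
    "fe.example.com", "demo_db", "demo_tbl", [{"id": 1, "name": "a"}], "load_001"
)
```

Building the request separately from sending it keeps the example independent of any particular HTTP client; check the Stream Load documentation for the exact header values your version accepts.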

Storage-Compute Separation

Lakehouse

  • Optimized BE Scanner closure logic for Trino Connector Catalog to accelerate memory release (#47857)
  • ClickHouse JDBC Catalog now auto-adapts to different driver versions (#46026)

Asynchronous Materialized Views

  • Enhanced planning performance for transparent rewrite (#48782)
  • Optimized performance of the mv_infos TVF (#47415)
  • Disabled catalog metadata refresh during external table-based MV construction to reduce memory usage (#48767)

Query Optimizer

  • Improved statistics collection performance for key columns and partition columns (#46534)
  • Query result aliases now strictly match user input (#47093)
  • Enhanced column pruning after common subexpression extraction in aggregation operators (#46627)
  • Improved error messages for function binding failures and unsupported subqueries (#47919 #47985)

Semi-structured Data Management

  • json_object function now supports complex type parameters (#47779)
  • Added support for writing UInt128 to IPv6 type (#48802)
  • Enabled inverted index support for ARRAY fields in VARIANT type (#47688 #48117)

Security

  • Improved Ranger authorization performance (#49352)

Others

  • Optimized JVM Metrics interface performance (#49380)

Bug Fixes

Storage

  • Fixed data loss in Stream Load on ARM architecture (#49666)

  • Fixed missing error URL return for data quality issues in Insert Into Select (#49687)

  • Fixed error URL reporting for multi-table Routine Load data quality issues (#49130)

  • Fixed incorrect results when using Insert Into Values during Schema Change (#49338)

  • Fixed core dump caused by tablet commit info reporting (#48732)

  • Added Azure China region support for S3 Load (#48642)

  • Fixed data correctness issues in several edge cases (#48056 #48399 #48400 #48748 #48775 #48867 #49165 #49193 #49350 #49710 #49825)

  • Fixed delayed cleanup of completed transactions (#49564)

  • Changed JSONB default value to {} for partial column updates (#49066)

  • Fixed delete bitmap update lock release issue in Storage-Compute Separation model (#47766)

  • Fixed "get image failed" error in K8s environment (#49072)

  • Reduced CPU consumption in dynamic partition scheduling (#48577)

  • Fixed column exception after materialized view renaming (#48328)

  • Fixed memory and file cache leakage after failed Schema Change (#48426)

  • Fixed base compaction failure for tables with empty partitions (#49062)

  • Fixed data correctness issues in complex type modifications (#49452)

  • Fixed core dump in cold compaction (#48329)

  • Fixed cumulative point failing to advance when delete operations are present (#47282)

  • Fixed insufficient-memory errors during full compaction of large data volumes (#48958)

Storage-Compute Separation

  • Fixed file cache cleanup failure in K8s environment (#49199)
  • Fixed FE CPU spike caused by read-write locks during high-frequency ingestion (#48564)

Lakehouse

Data Lakes

  • Fixed BE core dump during concurrent writes to Hive/Iceberg tables (#49842)
  • Fixed write failures to Hive/Iceberg tables on AWS S3 (#47162)
  • Fixed incorrect Iceberg Position Deletion reads (#47977)
  • Added Tencent Cloud COS support for Iceberg table creation (#49885)
  • Fixed Kerberos authentication for Paimon data on HDFS (#47192)
  • Fixed memory leak in Hudi Jni Scanner (#48955)
  • Fixed multi-partition list reading in MaxCompute Catalog (#48325)

JDBC

  • Fixed NPE when fetching row count from JDBC Catalog (#49442)
  • Fixed OceanBase Oracle mode connection test (#49442)
  • Fixed column type length inconsistency in concurrent JDBC Catalog access (#48541)
  • Fixed Classloader leak in JDBC Catalog BE (#46912)
  • Fixed connection thread leak in PostgreSQL JDBC Catalog (#49568)

Export

  • Fixed EXPORT job stuck in EXPORTING state (#47974)
  • Disabled OUTFILE auto-retry to prevent duplicate files (#48095)

Others

  • Fixed NPE when executing TVF queries via FE WebUI (#49213)
  • Fixed Hadoop Libhdfs thread local null pointer exception (#48280)
  • Fixed "Filesystem already closed" error in FE Hadoop access (#48351)
  • Fixed Catalog comment persistence issue (#46946)
  • Fixed Parquet complex type reading errors (#47734)

Asynchronous Materialized Views

  • Fixed slow MV construction in extreme scenarios (#48074)
  • Fixed nested MV transparent rewrite failure (#48222)

Query Optimizer

Query Execution

  • Fixed hangs and performance issues caused by pipeline task scheduling (#49976 #49007)
  • Fixed memory corruption on FE connection failure (#48370 #48313)
  • Fixed memory corruption with lambda and array functions (#49140)
  • Fixed BE core dump caused by null values in String-to-JSONB conversion (#49810)
  • Standardized undefined behaviors in parse_url (#49149)
  • Fixed array_overlap null handling (#49403)
  • Fixed case conversion errors for non-ASCII characters (#49763)
  • Fixed BE core dump in the percentile function in some scenarios (#48563)
  • Fixed multiple memory corruption issues (#48288 #49737 #48018 #47964)
  • Fixed incorrect SET operator results (#48001)
  • Reduced default Arrow Flight thread pool size to prevent FD exhaustion (#48530)
  • Fixed window function memory corruption (#48458)

Semi-structured Data Management

  • Fixed chunked Stream Load JSON import (#48474)
  • Enhanced JSONB format validation (#48731)
  • Fixed crash with large STRUCT fields (#49552)
  • Extended VARCHAR length support in complex types (#48025)
  • Fixed array_avg crash with specific parameters (#48691)
  • Fixed ColumnObject::pop_back crash in VARIANT type (#48935 #48978)
  • Disabled index building on VARIANT type (#49844)
  • Disabled inverted index v1 format for VARIANT type (#49890)
  • Fixed multi-layer CAST errors in VARIANT type (#47954)
  • Optimized inverted index metadata lookup for VARIANT with many subcolumns (#48153)
  • Reduced VARIANT schema memory consumption in Storage-Compute Separation mode (#47629 #48463)
  • Fixed PreparedStatement ID overflow (#48116)
  • Fixed an issue when row storage is combined with DELETE operations (#49609)

Inverted Index

  • Fixed ARRAY type null bitmap handling (#48052)
  • Fixed Date/Datetimev1 Bloomfilter comparison (#47005)
  • Fixed UTF-8 4-byte character truncation (#48792)
  • Fixed index loss after immediate column addition (#48547)
  • Fixed empty data handling in ARRAY inverted index (#48264)
  • Improved FE metadata upgrade compatibility (#49283)
  • Fixed match_phrase_prefix cache error (#46517)
  • Fixed file cache cleanup after compaction (#49738)
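The UTF-8 item above concerns 4-byte characters being cut apart. The note does not say where in the index path the truncation happened, so the following is only a generic illustration of the failure mode and of a boundary-aware cut, not the Doris fix; `truncate_utf8` is a hypothetical helper.

```python
def truncate_utf8(data: bytes, max_bytes: int) -> bytes:
    """Cut data to at most max_bytes without splitting a UTF-8 character."""
    if len(data) <= max_bytes:
        return data
    end = max_bytes
    # Continuation bytes look like 0b10xxxxxx; step back to a character start.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


text = "ab😀".encode("utf-8")   # 'a', 'b', then one 4-byte character
naive = text[:4]                # ends in the middle of the 4-byte character
safe = truncate_utf8(text, 4)   # backs up to the character boundary
```

Here `naive` is no longer valid UTF-8, while `safe` drops the whole character instead of splitting it.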

Security

  • Removed Select_Priv check for DELETE operations (#49239)
  • Prevented non-root users from modifying root privileges (#48752)
  • Fixed intermittent LDAP PartialResultException (#47858)

Others

  • Fixed JAVA_OPTS_FOR_JDK_17 recognition (#48170)
  • Fixed BDB metadata write failure caused by InterruptException (#47874)
  • Improved SQL hash generation for multi-statement requests (#48242)
  • User attribute variables now override session variables (#48548)
@gavinchou

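Behavior Changes

New Features

Lakehouse

  • FE Metrics now include monitoring metrics for the number of Catalogs, Databases, and Tables (#47891)
  • MaxCompute Catalog supports the Timestamp type (#48768)

Query Execution

  • Added URL processing functions: top_level_domain, first_significant_subdomain, cut_to_first_significant_subdomain (#42488)
  • Added the year_of_week function, implemented to be compatible with Trino syntax (#48870)
  • percentile_array function supports Float and Double data types (#48094)

Storage-Compute Separation

  • Support for renaming compute groups (Rename Compute Group) (#46221)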

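Improvements

Storage

  • Stream Load supports importing compressed JSON files (#49044)
  • Improved error messages for several ingestion scenarios (#48436 #47721 #47804 #48638 #48344 #49287 #48009)
  • Added several monitoring metrics for Routine Load (#49045 #48764)
  • Optimized the Routine Load scheduling algorithm so that a single failing job does not affect overall scheduling (#47847)
  • Added a Routine Load system table (#49284)
  • Improved query performance for primary key (MOW) tables under high-frequency ingestion (#48968)
  • Improved the Profile information shown for Key Range queries (#48191)
  • Sped up Compaction task generation to improve performance (#49547)

Storage-Compute Separation

Lakehouse

  • Optimized the BE-side Scanner close logic of Trino Connector Catalog to release memory faster (#47857)
  • ClickHouse JDBC Catalog automatically adapts to both old and new driver versions (#46026)

Asynchronous Materialized Views

  • Improved planning performance of transparent rewrite (#48782)
  • Improved performance of the mv_infos TVF (#47415)
  • Catalog metadata refresh is skipped when building materialized views based on external tables, reducing memory usage (#48767)

Query Optimizer

  • Improved statistics collection performance for Key columns and partition columns (#46534)
  • Query result aliases now match user input exactly (#47093)
  • Improved column pruning after common subexpression extraction in aggregation operators (#46627)
  • Improved error messages for function binding failures and unsupported subqueries (#47919 #47985)

Semi-structured Data Management

  • json_object function supports complex type parameters (#47779)
  • Support for writing UInt128 to the IPv6 type (#48802)
  • Support for inverted indexes on ARRAY fields in the VARIANT type (#47688 #48117)

Security

  • Improved Ranger authorization performance (#49352)

Others

  • Optimized JVM Metrics interface performance (#49380)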

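Bug Fixes

Load

  • Fixed data loss in Stream Load on ARM architecture (#49666)
  • Fixed Insert Into Select not returning the error URL when data quality errors occur (#49687)
  • Fixed multi-table Routine Load not returning the error URL when data quality errors occur (#49130)
  • Fixed abnormal results from Insert Into Values during Schema Change (#49338)
  • Fixed a core dump caused by Tablet Commit info reporting (#48732)
  • Fixed S3 Load not supporting Azure China region domain names (#48642)

Primary Key Model

Storage

  • Fixed the FE "get image failed" error in K8s environments (#49072)
  • Reduced CPU consumption of dynamic partition scheduling (#48577)
  • Fixed column exceptions caused by renaming a materialized view (MV) (#48328)
  • Fixed memory and File Cache not being released after a failed Schema Change (#48426)
  • Fixed Base Compaction failure for tables with empty partitions (#49062)
  • Fixed data correctness issues caused by complex type changes (#49452)
  • Fixed a core dump caused by Cold Compaction (#48329)
  • Fixed the Cumulative Point not advancing when Delete operations are present (#47282)
  • Fixed insufficient memory during Full Compaction of large data volumes (#48958)

Storage-Compute Separation

  • Fixed File Cache cleanup failure in K8s environments (#49199)
  • Fixed FE CPU spikes caused by read-write locks during high-frequency ingestion (#48564)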

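Lakehouse

Data Lakes

  • Fixed a possible BE core dump during concurrent writes to Hive/Iceberg tables (#49842)
  • Fixed write failures for Hive/Iceberg tables stored on AWS S3 (#47162)
  • Fixed incorrect results when reading Iceberg Position Deletion (#47977)
  • Fixed Iceberg tables not being creatable on Tencent Cloud COS (#49885)
  • Fixed failure to access Paimon data on HDFS with Kerberos authentication (#47192)
  • Fixed a memory leak in the Hudi Jni Scanner (#48955)
  • Fixed multi-partition list reading errors in MaxCompute Catalog (#48325)

JDBC

  • Fixed a null pointer exception when querying table row counts in JDBC Catalog (#49442)
  • Fixed OceanBase Oracle mode connection test failure (#49442)
  • Fixed incorrect column type lengths in JDBC Catalog under concurrent access (#48541)
  • Fixed a Classloader leak on the BE side of JDBC Catalog (#46912)
  • Fixed a connection thread leak in PostgreSQL JDBC Catalog (#49568)

Export

  • Fixed EXPORT jobs getting stuck in the EXPORTING state (#47974)
  • Disabled OUTFILE auto-retry to prevent duplicate exported files (#48095)

Others

  • Fixed a null pointer exception when running TVF queries from the FE WebUI (#49213)
  • Fixed a Hadoop Libhdfs Thread Local null pointer exception (#48280)
  • Fixed the "Filesystem already closed" error when FE accesses a Hadoop Filesystem (#48351)
  • Fixed Catalog Comment not being persisted (#46946)
  • Fixed errors when reading Parquet complex types (#47734)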

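Asynchronous Materialized Views

  • Fixed materialized view build tasks stalling in extreme scenarios (#48074)
  • Fixed transparent rewrite failing for nested materialized views (#48222)

Query Optimizer

Query Execution

  • Fixed hangs and performance issues caused by Pipeline task scheduling (#49976 #49007)
  • Fixed out-of-bounds memory access when the FE connection fails (#48370 #48313)
  • Fixed out-of-bounds memory access when Lambda functions are used together with array functions (#49140)
  • Fixed BE core dump caused by null values in String-to-JSONB conversion (#49810)
  • Standardized undefined behaviors in parse_url (#49149)
  • Fixed abnormal results of array_overlap with null values (#49403)
  • Fixed case conversion errors for non-ASCII characters (#49763)
  • Fixed BE core dump in the percentile function in some scenarios (#48563)
  • Fixed multiple out-of-bounds memory access issues (#48288 #49737 #48018 #47964)
  • Fixed incorrect SET operator results (#48001)
  • Reduced the default Arrow Flight thread pool size to avoid file handle exhaustion (#48530)
  • Fixed BE core dump caused by out-of-bounds memory access in window functions (#48458)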

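Semi-structured Data Management

  • Fixed Stream Load JSON import failures with Transfer-Encoding: chunked (#48474)
  • Enhanced JSONB format validation (#48731)
  • Fixed a crash caused by STRUCT types with too many fields (#49552)
  • Support for extending VARCHAR length in complex types (#48025)
  • Fixed an array_avg crash with specific parameters (#48691)
  • Fixed the ColumnObject::pop_back crash in the VARIANT type (#48935 #48978)
  • Disabled index build operations on the VARIANT type (#49844)
  • Disabled the inverted index V1 format for the VARIANT type (#49890)
  • Fixed incorrect results of multi-layer CAST on VARIANT (#47954)
  • Optimized inverted index metadata lookup performance for VARIANT with many subcolumns (#48153)
  • Reduced VARIANT Schema memory consumption in Storage-Compute Separation mode (#47629 #48463)
  • Fixed PreparedStatement ID overflow (#48116)
  • Fixed an issue when row storage is combined with Delete operations (#49609)

Inverted Index

  • Fixed Null Bitmap errors in inverted indexes on ARRAY types (#48052)
  • Fixed Bloomfilter index comparison errors for Date/Datetimev1 types (#47005)
  • Fixed truncation of 4-byte UTF-8 characters (#48792)
  • Fixed index loss when an inverted index is created immediately after adding a column (#48547)
  • Fixed empty data handling in ARRAY inverted indexes (#48264)
  • Fixed FE metadata upgrade compatibility for inverted indexes (#49283)
  • Fixed a match_phrase_prefix cache error (#46517)
  • Fixed inverted index File Cache not being cleaned up after Compaction (#49738)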

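Security

  • DELETE operations no longer check the Select_Priv privilege (#49239)
  • Non-root users are prevented from modifying root privileges (#48752)
  • Fixed intermittent LDAP Partial Result Exception (#47858)

Others

  • Fixed JAVA_OPTS recognition in JDK17 environments (#48170)
  • Fixed BDB metadata write failure caused by InterruptException (#47874)
  • Improved SQL Hash generation for multi-statement requests (#48242)
  • User attribute variables take precedence over Session variables (#48548)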
