October 2023 Update

Documentation

  • Add documentation for simple UDAF interface.

  • Add blog post about reduce_agg lambda aggregate function.

  • Extend documentation for datetime Presto functions to explain handling of time zones.

  • Extend documentation for reduce_agg() Presto lambda aggregate function.

Core Library

  • Add spill support to Window, RowNumber and TopNRowNumber operators.

  • Add support to HashAggregation operator for spilling after all input has been received. #6903

  • Add spill stats to the output of printPlanWithStats.

  • Add logic to adaptively abandon partial TopNRowNumber if cardinality reduction is not sufficient. #7195

  • Add optimized version of Window operator for the case when inputs are already partitioned and sorted. #5437

  • Add support for order-able and comparable arguments to function signatures.

  • Add support for order-able and comparable arguments to the Simple Function interface. #7293

  • Fix Unnest operator to honor preferred_output_batch_rows configuration property and avoid producing huge vectors. #7051

Presto Functions

Spark Functions

Hive Connector

  • Add support for reading from Azure Storage. #6675

Performance and Correctness

  • Optimize spilling by switching from std::sort to gfx::timsort. #6745

  • Add support for disabling caching in expression evaluation to reduce memory usage via enable_expression_evaluation_cache configuration property. #6898

  • Add support for validating output of every operator via debug.validate_output_from_operators configuration property. #6687

  • Add support for order-able function arguments to the Fuzzer. #6950

  • Fix edge cases in datetime processing during daylight saving transition. #7011

  • Fix comparisons of complex type values containing floating point numbers in the RowContainer. #5833

  • Fix window aggregations for empty frames. #6872

  • Fix GroupID operator with duplicate grouping keys in the output. #6738

  • Fix global grouping set aggregations for empty inputs. #7112

  • Fix aggregation function framework to require raw input types for all aggregates to avoid confusion and incorrect results. #7037

Build Systems

  • Add support for Conda Environments. #6282

Credits

Alex, Alex Hornby, Amit Dutta, Ann Rose Benny, Bikramjeet Vig, Chengcheng Jin, Christian Zentgraf, Cody Ohlsen, Daniel Munoz, David Tolnay, Deepak Majeti, Genevieve (Genna) Helsel, Huameng (Michael) Jiang, Jacob Wujciak-Jens, Jaihari Loganathan, Jason Sylka, Jia Ke, Jialiang Tan, Jimmy Lu, John Elliott, Jubin Chheda, Karteekmurthys, Ke, Kevin Wilfong, Krishna Pai, Krishna-Prasad-P-V, Laith Sakka, Ma-Jian1, Mahadevuni Naveen Kumar, Mark Shroyer, Masha Basmanova, Orri Erling, PHILO-HE, Patrick Sullivan, Pedro Eugenio Rocha Pedreira, Pramod, Prasoon Telang, Pratik Joseph Dabre, Pratyush Verma, Rong Ma, Sergey Pershin, Wei He, Zac, aditi-pandit, dependabot[bot], duanmeng, joey.ljy, lingbin, rrando901, rui-mo, usurai, wypb, xiaoxmeng, xumingming, yan ma, yangchuan, yingsu00, zhejiangxiaomai, 高阳阳