October 2023 Update
Documentation
Add documentation for simple UDAF interface.
Add blog post about reduce_agg lambda aggregate function.
Extend documentation for datetime Presto functions to explain handling of time zones.
Extend documentation for reduce_agg() Presto lambda aggregate function.
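For readers new to reduce_agg(), the sketch below models the behavior described in the Presto documentation: reduce_agg(inputValue, initialState, inputFunction, combineFunction) folds each non-null input into a running state with inputFunction and merges partial states with combineFunction. The name reduce_agg_model and the partitions argument are hypothetical, introduced only for illustration. This is a plain Python model of the semantics, not Velox code.

```python
from functools import reduce


def reduce_agg_model(partitions, initial_state, input_fn, combine_fn):
    """Illustrative model of reduce_agg semantics (hypothetical helper).

    partitions: list of lists, standing in for the input of each
    partial aggregation.
    """
    partial_states = []
    for values in partitions:
        # Every partial aggregation starts from the initial state.
        state = initial_state
        for value in values:
            if value is None:
                # The input function is only invoked for non-null inputs.
                continue
            state = input_fn(state, value)
        partial_states.append(state)

    if not partial_states:
        return initial_state
    # The combine function merges two states into one.
    return reduce(combine_fn, partial_states)


if __name__ == "__main__":
    data = [[1, 2, None], [3, 4]]
    total = reduce_agg_model(data, 0, lambda s, x: s + x, lambda a, b: a + b)
    product = reduce_agg_model(data, 1, lambda s, x: s * x, lambda a, b: a * b)
    print(total, product)
```

Because every partial aggregation starts from the initial state, the result is independent of how the input is partitioned only when the initial state is an identity for the combine function (0 for a sum, 1 for a product).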
Core Library
Add spill support to Window, RowNumber and TopNRowNumber operators.
Add spill support to HashAggregation operator after it has received all input. #6903
Add spill stats to the output of printPlanWithStats.
Add logic to adaptively abandon partial TopNRowNumber if cardinality reduction is not sufficient. #7195
Add optimized version of Window operator for the case when inputs are already partitioned and sorted. #5437
Add support for order-able and comparable arguments to function signatures.
Add support for order-able and comparable arguments to the Simple Function interface. #7293
Fix Unnest operator to honor preferred_output_batch_rows configuration property and avoid producing huge vectors. #7051
Presto Functions
Add find_first() and find_first_index() scalar lambda functions.
Add any_match(), all_match(), none_match() scalar lambda functions.
Add all_keys_match(), any_keys_match(), any_values_match(), no_keys_match(), no_values_match() scalar lambda functions.
Add remove_nulls() scalar function.
Add ends_with() and starts_with() scalar functions.
Add to_ieee754_32() scalar function.
Add support for non-constant patterns and escape characters to like() function. #6917
Add support for BOOLEAN inputs to least() and greatest() scalar functions.
Add support for INTEGER inputs to poisson_cdf() and binomial_cdf() scalar functions.
Add support for maps with keys of UNKNOWN type in map_filter() scalar lambda function.
Add support for REAL inputs to geometric_mean() aggregate function.
Add support for floating point keys to map_union_sum() aggregate function.
Add support for CAST to and from complex types with nested JSON values. #7256
Fix 1ms-off issue in from_unixtime() scalar function. #7047
Fix array_min() and array_max() for floating point numbers to match Presto. #7128
Fix checksum() aggregate function. #6910
Fix array_sort() and contains() scalar functions to reject inputs with nested nulls.
Fix map_agg(), set_agg(), min_by() and max_by() aggregate functions to reject inputs with nested nulls.
Fix array_sort() and array_sort_desc() to restrict inputs to order-able types. #6928
Fix min(), min_by(), max(), max_by() aggregate functions to restrict inputs to order-able types. #7232
Fix CAST(VARCHAR as JSON) for Unicode characters. #7119
Fix CAST(JSON as ROW) to use case-insensitive match for keys. #7016
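The new array lambda functions listed above follow SQL three-valued logic, which is easy to get wrong when the predicate returns NULL. The Python sketch below models the behavior described in the Presto documentation, with None standing in for NULL. It is an illustrative model only, not Velox code, and it ignores the optional start-index argument of find_first() and find_first_index() as well as all error cases. The helper greater_than_two is hypothetical.

```python
def any_match(array, predicate):
    """True if any element matches; None (NULL) if no element matches
    but the predicate returned NULL at least once; otherwise False."""
    saw_null = False
    for element in array:
        result = predicate(element)
        if result is True:
            return True
        if result is None:
            saw_null = True
    return None if saw_null else False


def all_match(array, predicate):
    """False if any element fails; None (NULL) if no element fails
    but the predicate returned NULL at least once; otherwise True."""
    saw_null = False
    for element in array:
        result = predicate(element)
        if result is False:
            return False
        if result is None:
            saw_null = True
    return None if saw_null else True


def none_match(array, predicate):
    """Logical negation of any_match, with NULL propagated."""
    result = any_match(array, predicate)
    return None if result is None else not result


def find_first(array, predicate):
    """First element for which the predicate is true, else None."""
    for element in array:
        if predicate(element) is True:
            return element
    return None


def find_first_index(array, predicate):
    """1-based index of the first matching element, else None."""
    for index, element in enumerate(array, start=1):
        if predicate(element) is True:
            return index
    return None


def greater_than_two(x):
    # Hypothetical predicate: comparing NULL yields NULL.
    return None if x is None else x > 2


if __name__ == "__main__":
    print(any_match([1, None], greater_than_two))
    print(all_match([3, None], greater_than_two))
    print(find_first_index([1, 5, 7], greater_than_two))
```

For an empty array the model returns False for any_match and True for all_match and none_match, since the loops never run.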
Spark Functions
Add array_min(), array_max(), add_months(), conv(), substring_index(), datediff() scalar functions.
Add support for DECIMAL inputs to multiply() and divide().
Fix sum() aggregate function for BIGINT inputs to allow overflow.
Hive Connector
Add support for reading from Azure Storage. #6675
Performance and Correctness
Optimize spilling by switching from std::sort to gfx::timsort. #6745
Add support for disabling caching in expression evaluation to reduce memory usage via enable_expression_evaluation_cache configuration property. #6898
Add support for validating output of every operator via debug.validate_output_from_operators configuration property. #6687
Add support for order-able function arguments to the Fuzzer. #6950
Fix edge cases in datetime processing during daylight saving transition. #7011
Fix comparisons of complex type values using floating point numbers in the RowContainer. #5833
Fix window aggregations for empty frames. #6872
Fix GroupID operator with duplicate grouping keys in the output. #6738
Fix global grouping set aggregations for empty inputs. #7112
Fix aggregation function framework to require raw input types for all aggregates to avoid confusion and incorrect results. #7037
Build Systems
Add support for Conda Environments. #6282
Credits
Alex, Alex Hornby, Amit Dutta, Ann Rose Benny, Bikramjeet Vig, Chengcheng Jin, Christian Zentgraf, Cody Ohlsen, Daniel Munoz, David Tolnay, Deepak Majeti, Genevieve (Genna) Helsel, Huameng (Michael) Jiang, Jacob Wujciak-Jens, Jaihari Loganathan, Jason Sylka, Jia Ke, Jialiang Tan, Jimmy Lu, John Elliott, Jubin Chheda, Karteekmurthys, Ke, Kevin Wilfong, Krishna Pai, Krishna-Prasad-P-V, Laith Sakka, Ma-Jian1, Mahadevuni Naveen Kumar, Mark Shroyer, Masha Basmanova, Orri Erling, PHILO-HE, Patrick Sullivan, Pedro Eugenio Rocha Pedreira, Pramod, Prasoon Telang, Pratik Joseph Dabre, Pratyush Verma, Rong Ma, Sergey Pershin, Wei He, Zac, aditi-pandit, dependabot[bot], duanmeng, joey.ljy, lingbin, rrando901, rui-mo, usurai, wypb, xiaoxmeng, xumingming, yan ma, yangchuan, yingsu00, zhejiangxiaomai, 高阳阳