Array Functions¶
- aggregate(array(E), start, merge, finish) -> array(E)¶
Applies a binary operator to an initial state and all elements in the array, and reduces this to a single state. The final state is converted into the final result by applying a finish function.
SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x, acc -> acc * 10); -- 60
- array(E, E1, ..., En) -> array(E, E1, ..., En)¶
Returns an array with the given elements.
SELECT array(1, 2, 3); -- [1,2,3]
- array_contains(array(E), value) boolean ¶
Returns true if the array contains the value.
SELECT array_contains(array(1, 2, 3), 2); -- true
- array_distinct(array(E)) -> array(E)¶
Remove duplicate values from the input array.
SELECT array_distinct(ARRAY [1, 2, 3]); -- [1, 2, 3] SELECT array_distinct(ARRAY [1, 2, 1]); -- [1, 2] SELECT array_distinct(ARRAY [1, NULL, NULL]); -- [1, NULL]
- array_except(array(E) x, array(E) y) -> array(E)¶
Returns an array of the elements in array
x
but not in arrayy
, without duplicates.SELECT array_except(ARRAY [1, 2, 3], ARRAY [4, 5, 6]); -- [1, 2, 3] SELECT array_except(ARRAY [1, 2, 3], ARRAY [1, 2]); -- [3] SELECT array_except(ARRAY [1, 2, 2], ARRAY [1, 1, 2]); -- [] SELECT array_except(ARRAY [1, 2, 2], ARRAY [1, 3, 4]); -- [2] SELECT array_except(ARRAY [1, NULL, NULL], ARRAY [1, 1, NULL]); -- []
- array_intersect(array(E), array(E1)) -> array(E2)¶
Returns an array of the elements in the intersection of array1 and array2, without duplicates.
SELECT array_intersect(array(1, 2, 3), array(1, 3, 5)); -- [1,3]
- array_max(array(E)) E ¶
Returns maximum non-NULL element of the array. Returns NULL if array is empty or all elements are NULL. When E is DOUBLE or REAL, returns NaN if any element is NaN.
SELECT array_max(array(1, 2, 3)); -- 3 SELECT array_max(array(-1, -2, -2)); -- -1 SELECT array_max(array(-1, -2, NULL)); -- -1 SELECT array_max(array()); -- NULL SELECT array_max(array(-0.0001, -0.0002, -0.0003, float('nan'))); -- NaN
- array_min(array(E)) E ¶
Returns minimum non-NULL element of the array. Returns NULL if array is empty or all elements are NULL. When E is DOUBLE or REAL, NaN value is considered greater than any non-NaN value.
SELECT array_min(array(1, 2, 3)); -- 1 SELECT array_min(array(-1, -2, -2)); -- -2 SELECT array_min(array(-1, -2, NULL)); -- -2 SELECT array_min(array(NULL, NULL)); -- NULL SELECT array_min(array()); -- NULL SELECT array_min(array(4.0, float('nan')]); -- 4.0 SELECT array_min(array(NULL, float('nan'))); -- NaN
- array_position(x, element) bigint ¶
Returns the position (1-based) of the first occurrence of the
element
in arrayx
(or 0 if not found).SELECT array_position(array(1, 2, 3), 2); -- 2 SELECT array_position(array(1, 2, 3), 4); -- 0 SELECT array_position(array(1, 2, 3, 2), 2); -- 2
- array_remove(x, element) array ¶
Remove all elements that equal
element
from arrayx
. Returns NULL as result ifelement
is NULL. If arrayx
is empty array, returns empty array. If all elements in arrayx
are NULL butelement
is not NULL, returns arrayx
.SELECT array_remove(array(1, 2, 3), 3); -- [1, 2] SELECT array_remove(array(2, 1, NULL), 1); -- [2, NULL] SELECT array_remove(array(1, 2, NULL), NULL); -- NULL SELECT array_remove(array(), 1); -- [] SELECT array_remove(array(NULL, NULL), -1); -- [NULL, NULL]
- array_repeat(element, count)¶
Returns an array containing
element
count
times. Ifcount
is negative or zero, returns empty array. Ifelement
is NULL, returns an array containingcount
NULLs. Ifcount
is NULL, returns NULL as result. Throws an exception ifcount
exceeds 10’000.SELECT array_repeat(100, 3); -- [100, 100, 100] SELECT array_repeat(NULL, 3); -- [NULL, NULL, NULL] SELECT array_repeat(100, NULL); -- NULL SELECT array_repeat(100, 0); -- [] SELECT array_repeat(100, -1); -- []
- array_sort(array(E)) -> array(E)¶
Returns an array which has the sorted order of the input array(E). The elements of array(E) must be orderable. Null elements will be placed at the end of the returned array.
SELECT array_sort(array(1, 2, 3)); -- [1, 2, 3] SELECT array_sort(array(3, 2, 1)); -- [1, 2, 3] SELECT array_sort(array(2, 1, NULL); -- [1, 2, NULL] SELECT array_sort(array(NULL, 1, NULL)); -- [1, NULL, NULL] SELECT array_sort(array(NULL, 2, 1)); -- [1, 2, NULL]
- concat(array(E), array(E1), ..., array(En)) -> array(E, E1, ..., En)¶
Returns the concatenation of array(E), array(E1), …, array(En).
SELECT concat(array(1, 2, 3), array(4, 5), array(6)); -- [1, 2, 3, 4, 5, 6]
- exists(array(T), function(T, boolean)) → boolean
Returns whether at least one element of an array matches the given predicate.
Returns true if one or more elements match the predicate; Returns false if none of the elements matches (a special case is when the array is empty); Returns NULL if the predicate function returns NULL for one or more elements and false for all other elements. Throws an exception if the predicate fails for one or more elements and returns false or NULL for the rest.
- filter(array(E), func) -> array(E)¶
Filters the input array using the given predicate.
SELECT filter(array(1, 2, 3), x -> x % 2 == 1); -- [1, 3] SELECT filter(array(0, 2, 3), (x, i) -> x > i); -- [2, 3] SELECT filter(array(0, null, 2, 3, null), x -> x IS NOT NULL); -- [0, 2, 3]
- flatten(array(array(E))) -> array(E)¶
Transforms an array of arrays into a single array. Returns NULL if the input is NULL or any of the nested arrays is NULL.
SELECT flatten(array(array(1, 2), array(3, 4))); -- [1, 2, 3, 4] SELECT flatten(array(array(1, 2), array(3, NULL))); -- [1, 2, 3, NULL] SELECT flatten(array(array(1, 2), NULL, array(3, 4))); -- NULL
- forall(array(T), function(T, boolean)) → boolean
Returns whether all elements of an array match the given predicate.
Returns true if all the elements match the predicate (a special case is when the array is empty); Returns false if one or more elements don’t match; Returns NULL if the predicate function returns NULL for one or more elements and true for all other elements. Throws an exception if the predicate fails for one or more elements and returns true or NULL for the rest.
- get(array(E), index) E ¶
Returns an element of the array at the specified 0-based index. Returns NULL if index points outside of the array boundaries.
SELECT get(array(1, 2, 3), 0); -- 1 SELECT get(array(1, 2, 3), 3); -- NULL SELECT get(array(1, 2, 3), -1); -- NULL SELECT get(array(1, 2, 3), NULL); -- NULL SELECT get(array(1, 2, NULL), 2); -- NULL
- in(value, array(E)) boolean ¶
Returns true if value matches at least one of the elements of the array. Supports BOOLEAN, REAL, DOUBLE, BIGINT, VARCHAR, TIMESTAMP, DATE input types.
- shuffle(array(E), seed) -> array(E)¶
Generates a random permutation of the given
array
using a seed derived from the parameterseed
and the configuration spark.partition_id.seed
must be constant.SELECT shuffle(array(1, 2, 3), 0); -- [3, 1, 2] SELECT shuffle(array(0, 0, 0), 0); -- [0, 0, 0] SELECT shuffle(array(1, NULL, 1, NULL, 2), 0); -- [2, 1, NULL, NULL, 1]
- size(array(E), legacySizeOfNull) integer ¶
Returns the size of the array. Returns null for null input if legacySizeOfNull is set to false. Otherwise, returns -1 for null input.
SELECT size(array(1, 2, 3), true); -- 3 SELECT size(NULL, true); -- -1 SELECT size(NULL, false); -- NULL
- sort_array(array(E)) -> array(E)¶
Returns an array which has the sorted order of the input array. The elements of array must be orderable. Null elements will be placed at the beginning of the returned array.
SELECT sort_array(array(1, 2, 3)); -- [1, 2, 3] SELECT sort_array(array(NULL, 2, 1)); -- [NULL, 1, 2]
- sort_array(array(E), ascendingOrder) -> array(E)
Returns an array which has the sorted order of the input array. The elements of array must be orderable. Null elements will be placed at the beginning of the returned array in ascending order or at the end of the returned array in descending order.
SELECT sort_array(array(3, 2, 1), true); -- [1, 2, 3] SELECT sort_array(array(2, 1, NULL, true); -- [NULL, 1, 2] SELECT sort_array(array(NULL, 1, NULL), false); -- [1, NULL, NULL]
- transform(array(E), function) -> array(E)¶
Transforms elements in an array using the function.
SELECT transform(array(1, 2, 3), x -> x + 1); -- [2,3,4] SELECT transform(array(1, 2, 3), (x, i) -> x + i); -- [1,3,5]
- zip_with(array(T), array(U), function(T, U, R)) -> array(R)¶
Merges the two given arrays, element-wise, into a single array using
function
. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applyingfunction
SELECT zip_with(ARRAY[1, 3, 5], ARRAY['a', 'b', 'c'], (x, y) -> (y, x)); -- [ROW('a', 1), ROW('b', 3), ROW('c', 5)] SELECT zip_with(ARRAY[1, 2], ARRAY[3, 4], (x, y) -> x + y); -- [4, 6] SELECT zip_with(ARRAY['a', 'b', 'c'], ARRAY['d', 'e', 'f'], (x, y) -> concat(x, y)); -- ['ad', 'be', 'cf'] SELECT zip_with(ARRAY['a'], ARRAY['d', null, 'f'], (x, y) -> coalesce(x, y)); -- ['a', null, 'f']