==================================== String Functions ==================================== .. note:: These functions assume that the input strings contain valid UTF-8 encoded Unicode code points. There are no explicit checks for valid UTF-8 and the functions may return incorrect results on invalid UTF-8. Additionally, the functions operate on Unicode code points and not user visible *characters* (or *grapheme clusters*). Some languages combine multiple code points into a single user-perceived *character*, the basic unit of a writing system for a language, but the functions will treat each code point as a separate unit. The :func:`lower` and :func:`upper` functions do not perform locale-sensitive, context-sensitive, or one-to-many mappings required for some languages. Specifically, this will return incorrect results for Lithuanian, Turkish and Azeri. .. function:: chr(n) -> varchar Returns the Unicode code point ``n`` as a single character string. .. function:: codepoint(string) -> integer Returns the Unicode code point of the only character of ``string``. .. function:: concat(string1, ..., stringN) -> varchar Returns the concatenation of ``string1``, ``string2``, ``...``, ``stringN``. This function provides the same functionality as the SQL-standard concatenation operator (``||``). .. function:: ends_with(string, substring) -> boolean Returns whether ``string`` ends_with with ``substring``. .. function:: from_utf8(binary) -> varchar Decodes a UTF-8 encoded string from ``binary``. Invalid UTF-8 sequences are replaced with the Unicode replacement character ``U+FFFD``. Each invalid UTF-8 codepoint, including those of multi-byte long, is replaced with one replacement character. .. function:: from_utf8(binary, replace) -> varchar :noindex: Decodes a UTF-8 encoded string from ``binary``. Invalid UTF-8 sequences are replaced with `replace`. Each invalid UTF-8 codepoint, including those of multi-byte long, is replaced with one replacement character. The `replace` argument can be either Unicode code point (bigint), a single character or empty string. When `replace` is an empty string invalid characters are removed. .. function:: hamming_distance(string1, string2) -> bigint Returns the Hamming distance of ``string1`` and ``string2``, i.e. the number of positions at which the corresponding characters are different. Note that the two strings must have the same length. .. function:: length(string) -> bigint Returns the length of ``string`` in characters. .. function:: levenshtein_distance(string_1, string_2) -> bigint Returns the Levenshtein edit distance of 2 strings. I.e. the minimum number of single-character edits (insertions, deletions or substitutions) needed to convert ``string_1`` to ``string_2``. .. function:: lower(string) -> varchar Converts ``string`` to lowercase. .. function:: lpad(string, size, padstring) -> varchar Left pads ``string`` to ``size`` characters with ``padstring``. If ``size`` is less than the length of ``string``, the result is truncated to ``size`` characters. ``size`` must not be negative and ``padstring`` must be non-empty. .. function:: ltrim(string) -> varchar Removes leading whitespace from string. See :func:`trim` for the set of recognized whitespace characters. .. function:: ltrim(string, chars) -> varchar :noindex: Removes the longest substring containing only characters in ``chars`` from the beginning of the ``string``. :: SELECT ltrim('test', 't'); -- est SELECT ltrim('tetris', 'te'); -- ris .. function:: replace(string, search) -> varchar Removes all instances of ``search`` from ``string``. .. function:: replace(string, search, replace) -> varchar :noindex: Replaces all instances of ``search`` with ``replace`` in ``string``. If ``search`` is an empty string, inserts ``replace`` in front of every character and at the end of the ``string``. .. function:: reverse(string) -> varchar :noindex: Returns input string with characters in reverse order. .. function:: reverse(varbinary) -> varbinary :noindex: Returns input binary with bytes in reversed order. .. function:: rpad(string, size, padstring) -> varchar Right pads ``string`` to ``size`` characters with ``padstring``. If ``size`` is less than the length of ``string``, the result is truncated to ``size`` characters. ``size`` must not be negative and ``padstring`` must be non-empty. .. function:: rtrim(string) -> varchar Removes trailing whitespace from string. See :func:`trim` for the set of recognized whitespace characters. .. function:: rtrim(string, chars) -> varchar :noindex: Removes the longest substring containing only characters in ``chars`` from the end of the ``string``. :: SELECT rtrim('test', 't'); -- tes SELECT rtrim('test...', '.'); -- test .. function:: split(string, delimiter) -> array(string) Splits ``string`` on ``delimiter`` and returns an array. .. function:: split(string, delimiter, limit) -> array(string) :noindex: Splits ``string`` on ``delimiter`` and returns an array of size at most ``limit``. The last element in the array always contains everything left in the string. ``limit`` must be a positive number. .. function:: split_part(string, delimiter, index) -> string Splits ``string`` on ``delimiter`` and returns the part at index ``index``. Field indexes start with 1. If the index is larger than the number of fields, then null is returned. .. function:: split_to_map(string, entryDelimiter, keyValueDelimiter) -> map Splits ``string`` by ``entryDelimiter`` and ``keyValueDelimiter`` and returns a map. ``entryDelimiter`` splits ``string`` into key-value pairs. ``keyValueDelimiter`` splits each pair into key and value. Note that ``entryDelimiter`` and ``keyValueDelimiter`` are interpreted literally, i.e., as full string matches. ``entryDelimiter`` and ``keyValueDelimiter`` must not be empty and must not be the same. ``entryDelimiter`` is allowed to be the trailing character. Raises an error if there are duplicate keys. .. function:: starts_with(string, substring) -> boolean Returns whether ``string`` starts with ``substring``. .. function:: strpos(string, substring) -> bigint Returns the starting position of the first instance of ``substring`` in ``string``. Positions start with ``1``. If not found, ``0`` is returned. .. function:: strpos(string, substring, instance) -> bigint :noindex: Returns the position of the N-th ``instance`` of ``substring`` in ``string``. ``instance`` must be a positive number. Positions start with ``1``. If not found, ``0`` is returned. It takes into account overlapping strings when counting occurrences. :: SELECT strpos('aaa', 'aa', 2); -- 2 .. function:: strrpos(string, substring) -> bigint Returns the starting position of the last instance of ``substring`` in ``string``. Positions start with ``1``. If not found, ``0`` is returned. .. function:: strrpos(string, substring, instance) -> bigint :noindex: Returns the position of the N-th ``instance`` of ``substring`` in ``string`` starting from the end of the string. ``instance`` must be a positive number. Positions start with ``1``. If not found, ``0`` is returned. It takes into account overlapping strings when counting occurrences. :: SELECT strrpos('aaa', 'aa', 2); -- 1 .. function:: substr(string, start) -> varchar Returns the rest of ``string`` from the starting position ``start``. Positions start with ``1``. A negative starting position is interpreted as being relative to the end of the string. Returns empty string if absolute value of ``start`` is greater then length of the ``string``. .. function:: substr(string, start, length) -> varchar :noindex: Returns a substring from ``string`` of length ``length`` from the starting position ``start``. Positions start with ``1``. A negative starting position is interpreted as being relative to the end of the string. Returns empty string if absolute value of ``'start`` is greater then length of the ``string``. .. function:: trim(string) -> varchar Removes starting and ending whitespaces from ``string``. Recognized whitespace characters: ====== =========================== ====== =========================== Code Description Code Description ====== =========================== ====== =========================== 9 TAB (horizontal tab) U+1680 Ogham Space Mark 10 LF (NL line feed, new line) U+2000 En Quad 11 VT (vertical tab) U+2001 Em Quad 12 FF (NP form feed, new page) U+2002 En Space 13 CR (carriage return) U+2003 Em Space 28 FS (file separator) U+2004 Three-Per-Em Space 29 GS (group separator) U+2005 Four-Per-Em Space 30 RS (record separator) U+2006 Four-Per-Em Space 31 US (unit separator) U+2008 Punctuation Space 32 Space U+2009 Thin Space _ _ U+200a Hair Space _ _ U+200a Hair Space _ _ U+2028 Line Separator _ _ U+2029 Paragraph Separator _ _ U+205f Medium Mathematical Space _ _ U+3000 Ideographic Space ====== =========================== ====== =========================== .. function:: trim(string, chars) -> varchar :noindex: Removes the longest substring containing only characters in ``chars`` from the beginning and end of the ``string``. :: SELECT trim('test', 't'); -- es SELECT trim('.t.e.s.t.', '.t'); -- e.s .. function:: upper(string) -> varchar Converts ``string`` to uppercase. Unicode Functions ----------------- .. function:: to_utf8(string) -> varbinary Encodes ``string`` into a UTF-8 varbinary representation.