Unicode Collation Algorithm (UCA)

The Unicode Collation Algorithm is an algorithm for sorting the entire Unicode character set. It provides linguistically correct comparison, ordering, and case conversion. The UCA was developed as part of the Unicode standard. SQL Anywhere implements the UCA using the International Components for Unicode (ICU) open source library, developed and maintained by IBM.


The default UCA ordering sorts most characters in most languages into an appropriate order. However, languages that share characters can sort and compare them differently, so the default ordering cannot be correct for every language. To handle these cases, ICU provides a syntax for tailoring the UCA.

The UCA provides advanced comparison, ordering, and case conversion at a small cost in space and time: the mapped form of a string is longer than the original string. In return, the algorithm provides sophisticated handling of more complex characters.

The Unicode Collation Algorithm differs from the SQL Anywhere Collation Algorithm in two ways: it can be used only with single-byte and UTF-8 character sets, and it separates each character into one or more attributes. For letters, these attributes are base character, accent, and case.

Non-letters typically have only one attribute, the base character.

UCA compares character strings as follows:

The original string values are equal if and only if the base characters, accents, and case are the same for both strings.


Suppose UCA is used to compare the strings in the first column of the table below. The subsequent columns describe the three attributes for each string. Notice that the base characters are identical; the words differ only in accents and case.

| String | Base characters | Accents | Case |
|---|---|---|---|
| noel | noel | none, none, none, none | lower, lower, lower, lower |
| noël | noel | none, none, accent, none | lower, lower, lower, lower |
| Noel | noel | none, none, none, none | upper, lower, lower, lower |
| Noël | noel | none, none, accent, none | upper, lower, lower, lower |
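
The attribute split shown in the table can be approximated with a few lines of Python. This is only an illustrative sketch, not how ICU or SQL Anywhere computes collation weights: the function name `split_attributes` is hypothetical, and it assumes that an accent is any combining mark exposed by Unicode canonical decomposition (NFD).

```python
import unicodedata


def split_attributes(text):
    """Return (base characters, accents, cases), one entry per character.

    Hypothetical helper for illustration only; real UCA implementations
    use weight tables rather than Unicode decomposition.
    """
    # Compose first so that "e" + combining diaeresis is seen as one character.
    text = unicodedata.normalize("NFC", text)
    base, accents, cases = [], [], []
    for ch in text:
        # NFD splits an accented letter into its base letter plus combining marks.
        decomposed = unicodedata.normalize("NFD", ch)
        letter = decomposed[0]
        has_accent = any(unicodedata.combining(c) for c in decomposed[1:])
        base.append(letter.lower())
        accents.append("accent" if has_accent else "none")
        # Characters without case (digits, punctuation) are reported as "lower".
        cases.append("upper" if letter.isupper() else "lower")
    return "".join(base), accents, cases


if __name__ == "__main__":
    for word in ["noel", "noël", "Noel", "Noël"]:
        print(word, split_attributes(word))
```

All four words yield the same base characters, `noel`, and differ only in the accent and case lists.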

The following table shows the ordering that would occur in the four possible combinations of accent- and case-sensitivity using UCA:

| Accent sensitive | Case sensitive | ORDER BY result | Explanation |
|---|---|---|---|
| No | No | Noel, noël, Noël, noel in any order | Accents ignored; case ignored; all values considered equal; arbitrary order within the set of four |
| Yes | No | Noel, noel in any order, followed by noël, Noël in any order | No-accents before accents, so e before ë; case ignored, so N and n are in arbitrary order within each set of two |
| No | Yes | Noel, Noël in any order, followed by noël, noel in any order | Uppercase before lowercase, so N before n; accents ignored, so e and ë are in arbitrary order within each set of two |
| Yes | Yes | Noel, noel, Noël, noël | No-accents before accents, so e before ë; uppercase before lowercase, so N before n |
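
The four combinations can be reproduced with a small Python sort key. This is a sketch under stated assumptions, not the ICU implementation: `collation_key` is a hypothetical function, accents are detected through Unicode decomposition, and the key compares base characters first, then accents, then case, with unaccented before accented and uppercase before lowercase as described above.

```python
import unicodedata


def collation_key(text, accent_sensitive=True, case_sensitive=True):
    """Build a tuple key: base characters, then accents, then case.

    Hypothetical illustration only; a real UCA sort key is built from
    collation weight tables.
    """
    text = unicodedata.normalize("NFC", text)
    base, accents, cases = [], [], []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        letter = decomposed[0]
        base.append(letter.lower())
        # 0 sorts before 1: no-accents before accents.
        has_accent = any(unicodedata.combining(c) for c in decomposed[1:])
        accents.append(1 if has_accent else 0)
        # 0 sorts before 1: uppercase before lowercase.
        cases.append(0 if letter.isupper() else 1)
    key = [tuple(base)]
    if accent_sensitive:
        key.append(tuple(accents))
    if case_sensitive:
        key.append(tuple(cases))
    return tuple(key)


if __name__ == "__main__":
    words = ["noel", "noël", "Noel", "Noël"]
    for accent in (False, True):
        for case in (False, True):
            ordered = sorted(words, key=lambda w: collation_key(w, accent, case))
            print(f"accent sensitive={accent}, case sensitive={case}: {ordered}")
```

When an attribute is ignored, the strings that differ only in that attribute receive identical keys, so their relative order is whatever the sort happens to produce. Note also that the key is longer than the string it was built from, which mirrors the remark that the mapped form of a string is longer than the original.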