Returns its first argument as a string converted to a different encoding. The two-argument form changes the case of the characters within a character set. The three-argument form changes the encoding scheme.
The format for the $ZCONVERT() function is:
$ZCO[NVERT](expr1,expr2[,expr3])
The first expression is the string to convert. If the expression contains a code-point value that is not in the character set, $ZCONVERT() generates a run-time error.
In the two-argument form, the second expression specifies a code that determines the form of the result. In the three-argument form, the second expression specifies a code that controls the character set interpretation of the first argument. If the expression does not evaluate to one of the codes defined for the number of arguments supplied, $ZCONVERT() generates a run-time error.
The optional third expression specifies a code that determines the character set of the result. If the expression does not evaluate to one of the defined codes, $ZCONVERT() generates a run-time error. The three-argument form is not supported in M mode.
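Because an invalid code raises a run-time error rather than returning a value, callers may want to trap it. The following is a minimal sketch, not a prescribed pattern: the label trycvt and its parameters are hypothetical names, and the handler relies on the standard $ETRAP, $ZSTATUS and $ECODE facilities.

```mumps
trycvt(str,code) ; hypothetical subroutine; call it with DO
 new $etrap
 ; On a run-time error: report it, clear $ECODE, and return to the caller
 set $etrap="write ""$ZCONVERT() failed: "",$zstatus,! set $ecode="""""
 write $zconvert(str,code),!
 quit
```

Called as do trycvt("abc","U"), the sketch writes the converted string; called with a code that is not defined for the two-argument form, such as "X", it reports the error text from $ZSTATUS instead of halting the caller.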
The valid (case insensitive) character codes for expr2 in the two-argument form are:
U converts the string to UPPER-CASE. "UPPER-CASE" refers to words where all the characters are converted to their "capital letter" equivalents. $ZCONVERT() retains characters already in UPPER-CASE "capital letter" form unchanged.
L converts the string to lower-case. "lower-case" refers to words where all the letters are converted to their "small letter" equivalents. $ZCONVERT() retains characters already in lower-case or having no lower-case equivalent unchanged.
T converts the string to title case. "Title case" refers to a string with the first character of each word in upper-case and the remaining characters in lower-case. $ZCONVERT() retains characters already conforming to "Title case" unchanged.
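The three case codes can be compared side by side on one input. This is an illustrative sketch only; the routine name casedemo and the sample string are assumptions, and the comments restate the definitions given above.

```mumps
casedemo ; hypothetical routine name, for illustration only
 new str
 set str="Happy new YEAR"
 write $zconvert(str,"U"),!  ; upper-case: HAPPY NEW YEAR
 write $zconvert(str,"l"),!  ; lower-case; the code itself is case insensitive
 write $zconvert(str,"T"),!  ; title case: Happy New Year
 quit
```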
The valid (case insensitive) codes for character set encoding for expr2 and expr3 in the three-argument form are:
"UTF-8"-- a multi-byte variable length Unicode® encoding form.
"UTF-16LE"-- a multi-byte 16-bit Unicode® encoding form in little-endian.
"UTF-16BE"-- a multi-byte 16-bit Unicode® encoding form in big-endian.
"UTF-16"-- a multi-byte 16-bit Unicode® encoding form that uses the same endianness as the current system.
Note
When UTF-8 mode is enabled, GT.M uses the ICU Library to perform case conversion. As mentioned in the Theory of Operation section, the case conversion of the strings occurs according to UTF-8 code-point values. This may not be the linguistically or culturally correct case conversion, for example, of the names in the telephone directories. Therefore, application developers must ensure that the actual case conversion is linguistically and culturally correct for their specific needs. The two-argument form of the $ZCONVERT() function in M mode does not use the ICU Library to perform operations related to the case conversion of the strings.
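The three-argument form described earlier can be sketched as a round trip between two encodings. This is a hedged illustration rather than a reference implementation: the routine name rtdemo and the sample string are assumptions, and the guard assumes that the $ZCHSET intrinsic special variable reports "UTF-8" when UTF-8 mode is enabled.

```mumps
rtdemo ; hypothetical routine name, for illustration only
 new u8,u16,back
 ; The three-argument form is not supported in M mode
 if $zchset'="UTF-8" write "three-argument form needs UTF-8 mode",! quit
 set u8="Año"
 set u16=$zconvert(u8,"UTF-8","UTF-16LE")    ; UTF-8 to UTF-16LE
 set back=$zconvert(u16,"utf-16le","utf-8")  ; encoding codes are case insensitive
 write $select(back=u8:"round trip preserved the string",1:"strings differ"),!
 quit
```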
Example:
GTM>write $zconvert("Happy New Year","U")
HAPPY NEW YEAR
Example:
GTM>Write $zconvert("HAPPY NEW YEAR","T")
Happy New Year
Example:
GTM>Set T8="主要雨在西班牙停留在平原"
GTM>Write $Length(T8)
12
GTM>Set T16=$zconvert(T8,"UTF-8","UTF-16LE")
GTM>Write $length(T16)
%GTM-E-BADCHAR, $ZCHAR(129,137,232,150) is not a valid character in the UTF-8 encoding form
GTM>Set T16=$ZCOnvert(T16,"UTF-16LE","UTF-8")
GTM>Write $length(T16)
12
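In the above example, the first $LENGTH() call on T16 triggers a BADCHAR error because $LENGTH() accepts only UTF-8 encoded strings as its argument, and T16 holds UTF-16LE data at that point. After T16 is converted back to UTF-8, $LENGTH() succeeds.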
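When the size of UTF-16 data is needed, one option is to measure it in bytes instead of characters. The sketch below assumes GT.M's byte-oriented $ZLENGTH() function, which counts bytes and does not interpret its argument as UTF-8; the routine name lendemo is hypothetical.

```mumps
lendemo ; hypothetical routine name, for illustration only
 new t8,t16
 set t8="主要雨在西班牙停留在平原"
 set t16=$zconvert(t8,"UTF-8","UTF-16LE")
 write $length(t8),!    ; character count of the UTF-8 string
 write $zlength(t8),!   ; byte count of the UTF-8 string
 write $zlength(t16),!  ; byte count of the UTF-16LE data, no UTF-8 interpretation
 quit
```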