Chapter 12. Internationalization

Revision History
Revision V6.3-011 20 December 2019
Revision V6.3-010 31 October 2019
Revision V6.3-006 26 October 2018
Revision V6.3-001 20 March 2017
Revision V6.1-000 28 August 2014

In “Using the %GBLDEF Utility”, added changes for globals spanning regions.

Table of Contents

Collation Sequence Definitions
Creating the Shared Library holding the alternative sequencing routines
Defining the Environment Variable
Defining a Default Database Collation Method
Establishing A Local Collation Sequence
Creating the Alternate Collation Routines
Transformation Routine (gtm_ac_xform_1 or gtm_ac_xform)
Inverse Transformation Routine (gtm_ac_xback or gtm_ac_xback_1)
Transform Utility Routine (gtm_ac_xutil)
Version Control Routines (gtm_ac_version and gtm_ac_verify)
Using the %GBLDEF Utility
Example of Upper and Lower Case Alphabetic Collation Sequence
Example of Collating Alphabets in Reverse Order using gtm_ac_xform_1 and gtm_ac_xback_1
Implementing an Alternative Collation Sequence for Unicode® characters
Matching Alternative Patterns
Pattern Code Definition
Pattern Code Selection

This chapter describes GT.M facilities for applications that use characters encoded in other than eight-bit bytes (octets). Before using UTF-8 features, ensure that your system has the required infrastructure installed and configured for the languages you wish to support, including International Components for Unicode (ICU / libicu), UTF-8 locale(s), and terminal emulators with appropriate fonts. This chapter addresses two specific issues: defining alternative collation sequences, and defining unique patterns for use with the pattern match operator.

Alternative collation sequences (or an alternative ordering of strings) can be defined for global and local variable subscripts. They can be established for specified globals or for an entire database. The alternative sequences are defined by a series of routines in an executable file pointed to by an environment variable. As the collation sequence is implemented by a user-supplied program, virtually any collation policy may be implemented. Detailed information on establishing alternative collation sequences and defining the environment variable is provided in “Collation Sequence Definitions”.
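As a minimal sketch of the idea, assuming plain ASCII subscripts, the following C fragment shows a transformation that makes byte-wise comparison order letters as A a B b ... Z z, together with the inverse that recovers the original string. The names sketch_xform and sketch_xback and their argument lists are hypothetical and chosen only for illustration; they are not the gtm_ac_xform and gtm_ac_xback interfaces that GT.M calls, which are covered in the sections listed in the table of contents.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustration only: hypothetical helpers, NOT the gtm_ac_* signatures.
 * Assumes ASCII. Bytes 65..122 are permuted so that letters interleave
 * (A a B b ... Z z); the six punctuation bytes 91..96 move to 117..122
 * so the mapping stays one-to-one. All other bytes are unchanged. */
static unsigned char map_forward(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return (unsigned char)(65 + 2 * (c - 'A'));
    if (c >= 'a' && c <= 'z') return (unsigned char)(65 + 2 * (c - 'a') + 1);
    if (c >= 91 && c <= 96)   return (unsigned char)(117 + (c - 91));
    return c;
}

/* Exact inverse of map_forward. */
static unsigned char map_back(unsigned char t)
{
    if (t >= 65 && t <= 116) {
        int idx = t - 65;
        return (unsigned char)((idx % 2 == 0) ? 'A' + idx / 2 : 'a' + idx / 2);
    }
    if (t >= 117 && t <= 122) return (unsigned char)(91 + (t - 117));
    return t;
}

/* Transform src into a key whose byte order is the desired collation.
 * Returns the number of bytes written to dst. */
size_t sketch_xform(const unsigned char *src, size_t len,
                    unsigned char *dst, size_t dst_cap)
{
    size_t i;
    assert(dst_cap >= len);          /* caller must supply enough room */
    for (i = 0; i < len; i++)
        dst[i] = map_forward(src[i]);
    return len;
}

/* Recover the original string from a key produced by sketch_xform. */
size_t sketch_xback(const unsigned char *src, size_t len,
                    unsigned char *dst, size_t dst_cap)
{
    size_t i;
    assert(dst_cap >= len);
    for (i = 0; i < len; i++)
        dst[i] = map_back(src[i]);
    return len;
}
```

The design point this illustrates is that the transformation must be reversible: subscripts are compared in their transformed form, so an inverse is needed to present the original values back to the application.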

M has defined pattern classes that serve as arguments to the pattern match operator. GT.M supports user definition of additional pattern classes as well as redefinition of the standard pattern classes. Specific patterns are defined in a text file that is pointed to by an environment variable. Pattern classes may be re-defined dynamically. The details of defining these pattern classes and the environment variable are described in the section called “Matching Alternative Patterns”.

For some languages (such as Chinese), the ordering of strings according to Unicode® code-points (character values) may not be the linguistically or culturally correct ordering. Supporting applications in such languages requires development of collation modules. GT.M natively supports M collation, but does not include pre-built collation modules for any specific natural language. Therefore, applications that use characters in Unicode may need to implement their own collation functions. For more information on developing a collation module for Unicode, refer to “Implementing an Alternative Collation Sequence for Unicode® characters”.
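To make the code-point issue concrete, here is a small sketch in C that compares two UTF-8 strings byte by byte, which yields the same result as comparing them by Unicode code-point. The function name codepoint_order is hypothetical and not part of GT.M. Under this ordering the French word "école" sorts after "zebra", because the code-point of "é" (U+00E9) is greater than that of "z" (U+007A), whereas a French reader would expect it near the other words beginning with "e".

```c
#include <assert.h>
#include <string.h>

/* Illustration only: hypothetical helper, not a GT.M routine.
 * Compares two NUL-terminated UTF-8 strings by byte value. For valid
 * UTF-8 this is the same as ordering by Unicode code-point.
 * Returns -1, 0, or 1. */
int codepoint_order(const char *a, const char *b)
{
    int r = strcmp(a, b);   /* strcmp compares bytes as unsigned char */
    return (r > 0) - (r < 0);
}

/* "école" spelled with explicit UTF-8 bytes for U+00E9; the literal is
 * split so the hex escape does not absorb the following 'c'. */
static const char ECOLE[] = "\xC3\xA9" "cole";
```

A collation module for such a language would transform each string into a key that sorts in the expected linguistic order, rather than relying on code-point order.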