Chapter 12. Internationalization


Revision History
Revision V6.3-011	20 December 2019	In “Establishing A Local Collation Sequence”, moved some material to the Utilities chapter; in “Examining Global Collation Characteristics”, moved some material to the Utilities chapter; in “Using the %GBLDEF Utility”, moved some material to the Utilities chapter.
Revision V6.3-010	31 October 2019	In “Transform Utility Routine (gtm_ac_xutil)”, added new section; in “Transformation Routine (gtm_ac_xform_1 or gtm_ac_xform)”, corrected section titles for input and output arguments and fixed typos.
Revision V6.3-006	26 October 2018	In “Implementing an Alternative Collation Sequence for Unicode® characters”, minor corrections. In “Pattern Code Definition”, UTF-8 tweaks
Revision V6.3-001	20 March 2017	In “Pattern Code Selection”, added a compiler warning.
Revision V6.1-000	28 August 2014	In “Using the %GBLDEF Utility”, added changes for globals spanning regions.

Table of Contents

Collation Sequence Definitions

Creating the Shared Library holding the alternative sequencing routines
Defining the Environment Variable
Defining a Default Database Collation Method
Establishing A Local Collation Sequence

Creating the Alternate Collation Routines

Transformation Routine (gtm_ac_xform_1 or gtm_ac_xform)
Inverse Transformation Routine (gtm_ac_xback or gtm_ac_xback_1)
Transform Utility Routine (gtm_ac_xutil)
Version Control Routines (gtm_ac_version and gtm_ac_verify)
Using the %GBLDEF Utility
Example of Upper and Lower Case Alphabetic Collation Sequence
Example of Collating Alphabets in Reverse Order using gtm_ac_xform_1 and gtm_ac_xback_1

Implementing an Alternative Collation Sequence for Unicode® characters

Matching Alternative Patterns

Pattern Code Definition
Pattern Code Selection

This chapter describes GT.M facilities for applications that use characters encoded in something other than single eight-bit bytes (octets). Before using UTF-8 features, ensure that your system has the required infrastructure installed and configured for the languages you wish to support, including International Components for Unicode (ICU / libicu), UTF-8 locale(s), and terminal emulators with appropriate fonts. This chapter addresses two specific issues: defining alternative collation sequences, and defining unique patterns for use with the pattern match operator.

Alternative collation sequences (that is, alternative orderings of strings) can be defined for global and local variable subscripts. They can be established for specified globals or for an entire database. The alternative sequences are defined by a series of routines in an executable file (a shared library) pointed to by an environment variable. Because the collation sequence is implemented by a user-supplied program, virtually any collation policy may be implemented. Detailed information on establishing alternative collation sequences and defining the environment variable is provided in “Collation Sequence Definitions”.
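The idea behind a pair of transformation and inverse transformation routines can be illustrated with a minimal, self-contained C sketch. It is not the GT.M interface: the names sketch_xform, sketch_xback, and sketch_compare, their argument lists, and the 64-byte key limit are hypothetical and chosen only for illustration, and the byte mapping assumes ASCII. The actual gtm_ac_* routine signatures are described in “Creating the Alternate Collation Routines”. The sketch maps each byte to a sort-key byte so that plain byte-wise comparison of the keys yields the order A a B b ... Z z, and the inverse mapping recovers the original string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup tables (not part of GT.M); ASCII is assumed. */
static unsigned char fwd[256];   /* original byte -> sort-key byte */
static unsigned char inv[256];   /* sort-key byte -> original byte */
static int tables_ready = 0;

/* Build a byte permutation that interleaves upper and lower case letters. */
static void build_tables(void)
{
    int i, next;

    if (tables_ready)
        return;
    for (i = 0; i < 256; i++)
        fwd[i] = (unsigned char)i;              /* identity by default */
    for (i = 0; i < 26; i++) {
        fwd['A' + i] = (unsigned char)('A' + 2 * i);      /* A, B, C ... */
        fwd['a' + i] = (unsigned char)('A' + 2 * i + 1);  /* a, b, c ... */
    }
    next = 'A' + 52;                            /* first slot after the letters */
    for (i = 'Z' + 1; i < 'a'; i++)             /* the six bytes between Z and a */
        fwd[i] = (unsigned char)next++;
    for (i = 0; i < 256; i++)
        inv[fwd[i]] = (unsigned char)i;         /* invert the permutation */
    tables_ready = 1;
}

/* Transformation: write the sort key for in[0..len) to out. Returns 0, or -1 if out is too small. */
int sketch_xform(const unsigned char *in, size_t len, unsigned char *out, size_t outcap)
{
    size_t i;

    assert(in != NULL && out != NULL);
    if (outcap < len)
        return -1;
    build_tables();
    for (i = 0; i < len; i++)
        out[i] = fwd[in[i]];
    return 0;
}

/* Inverse transformation: recover the original bytes from a sort key. */
int sketch_xback(const unsigned char *in, size_t len, unsigned char *out, size_t outcap)
{
    size_t i;

    assert(in != NULL && out != NULL);
    if (outcap < len)
        return -1;
    build_tables();
    for (i = 0; i < len; i++)
        out[i] = inv[in[i]];
    return 0;
}

/* Compare two strings (each at most 64 bytes) by their sort keys. */
int sketch_compare(const char *a, size_t alen, const char *b, size_t blen)
{
    unsigned char ka[64], kb[64];
    size_t n = alen < blen ? alen : blen;
    int r;

    assert(alen <= sizeof ka && blen <= sizeof kb);
    sketch_xform((const unsigned char *)a, alen, ka, sizeof ka);
    sketch_xform((const unsigned char *)b, blen, kb, sizeof kb);
    r = memcmp(ka, kb, n);
    if (r != 0)
        return r;
    return (alen > blen) - (alen < blen);       /* shorter string sorts first */
}
```

With this mapping, sketch_compare("a", 1, "B", 1) is negative even though the byte value of "a" is greater than that of "B". The requirement that the inverse routine exactly undo the transformation is what allows keys to be stored in transformed order and still be returned to the application in their original form.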

M has defined pattern classes that serve as arguments to the pattern match operator. GT.M supports user definition of additional pattern classes as well as redefinition of the standard pattern classes. Specific patterns are defined in a text file that is pointed to by an environment variable. Pattern classes may be re-defined dynamically. The details of defining these pattern classes and the environment variable are described in the section called “Matching Alternative Patterns”.

For some languages (such as Chinese), the ordering of strings according to Unicode® code-points (character values) may not be the linguistically or culturally correct ordering. Supporting applications in such languages requires development of collation modules. GT.M natively supports M collation, but does not include pre-built collation modules for any specific natural language. Therefore, applications that use characters in Unicode may need to implement their own collation functions. For more information on developing a collation module for Unicode, refer to “Implementing an Alternative Collation Sequence for Unicode® characters”.
