HP OpenVMS Systems Documentation |
Common Desktop Environment: Internationalization Programmer's Guide 5 Xt and Xlib DependenciesContents of Chapter: This chapter discusses tasks related to internationalizing with Xt and Xlib. Locale ManagementThe following defines support for the locale mechanism that controls all locale-dependent Xlib and Common Desktop Environment functions.X Locale ManagementX locale supports one or more of the locales defined by the host environment. The Xlib conforms to the American National Standards Institute (ANSI) C library, and the locale announcement method is thesetlocale() function. This function configures the locale operation of both the host C library and Xlib. The operation of Xlib is governed by the LC_CTYPE category; this is called the current locale.The XSupportsLocale() function is used to determine whether the current locale is supported by X. The client is responsible for selecting its locale and X modifiers. Clients should provide a means for the user to override the clients' locale selection at client invocation. Most single-display X clients operate in a single locale for both X and the host-processing environment. They configure the locale by calling three functions: setlocale(), XSupportsLocale(), and XSetLocaleModifiers(). The semantics of certain categories of X internationalization capabilities can be configured by setting modifiers. Modifiers are named by implementation-dependent and locale-specific strings. The only standard use for this capability at present is selecting one of several styles of keyboard input methods. The XSetLocaleModifiers() function is used to configure Xlib locale modifiers for the current locale. The recommended procedure for clients initializing their locale and modifiers is to obtain locale and modifier announcers separately from one of the following prioritized sources:
Note: When a locale command-line option or locale resource is defined, the effect should be to set all categories to the specified locale, overriding any category-specific settings in the local host environment. Locale and Modifier DependenciesThe internationalized Xlib functions operate in the current locale configured by the host environment and in the X locale modifiers set by theXSetLocaleModifiers() function, or in the locale and modifiers configured at the time some object supplied to the function was created. For each locale-dependent function, Table 5-1 lists locale and modifier dependencies.
Table 5-1 Locale and Modifier Dependencies
All text strings processed by internationalized Xlib functions are assumed to begin in the initial state of the encoding of the locale, if the encoding is state-dependent. All Xlib functions behave as if they do not change the current locale or X modifier setting. (This means that any function, provided within a library either by Xlib or by the application, that changes the locale or calls the Xt Locale ManagementXt locale management includes the following two functions:
XtSetLanguageProcBefore the initialization of the Xt Toolkit, applications should normally call theXtSetLanguageProc() function with one of the following functions:
Note: The locale is not actually set until the toolkit is initialized (for example, by way of the XtAppInitialize() function). Therefore, the setlocale() function may be needed after the XtSetLanguageProc() function and the initializing of the toolkit (for example, if calling the catopen() function).
Resource databases are created in the current process locale. During display initialization prior to creating the per-screen resource database, the Intrinsics call to a specified application procedure to set the locale according to options found on the command line or in the per-display resource specifications.
The callout procedure provided by the application is of type
XtDisplayInitialize() function. The function returns a new language string that is subsequently used by the XtDisplayInitialize() function to establish the path for loading resource files. This string is cached and is the locale of the display.
Initially, no language procedure is set by the intrinsics. To set the language procedure for use by the
XtSetLanguageProc() function sets the language procedure that is called from the XtDisplayInitialize() function for all subsequent displays initialized in the specified application context. If the applicationcontext parameter is null, the specified language procedure is registered in all application contexts created by the calling process, including any future application contexts that may be created. If the procedure parameter is null, a default language procedure is registered. The XtSetLanguageProc() function returns the previously registered language procedure. If a language procedure has not yet been registered, the return value is unspecified; but if this return value is used in a subsequent call to the XtSetLanguageProc() function, it causes the default language procedure to be registered.The default language procedure does the following:
XtSetLanguageProc() function prior to the XtDisplayInitialize() function, as in the following example.
XtDisplayInitializeTheXtDisplayInitialize() function first determines the language string to be used for the specified display and loads the application's resource database for this display-host-application combination from the following sources in order of precedence:
XtDisplayInitialize() function creates a unique resource database for each display parameter specified. When a database is created, a language string is determined for the display parameter in a manner equivalent to the following sequence of actions.
The
The database constructed from the command line is then queried for the resource name.
The application-specific class resource file name is constructed from the class name of the application. It points to a localized resource file that is usually installed by the site manager when the application is installed. The file is found by calling the
The application-specific user resource file name points to a user-specific resource file and is constructed from the class name of the application. This file is owned by the application and typically stores user customizations. Its name is found by calling the If the resulting resource file exists, it is merged into the resource database. This file can be provided with the application or created by the user. The temporary database created from the server resource property or user resource file during language determination is then merged into the resource database. The server resource file is created entirely by the user and contains both display-independent and display-specific user preferences.
If one exists, a user's environment resource file is then loaded and merged into the resource database. This file name is user- and host-specific. The user's environment resource file name is constructed from the value of the user's Font ManagementInternational text drawing is done using a set of one or more fonts, as needed for the locale of the text.
The two methods of internationalized drawing within the system environment allow clients to choose one of the static output widgets (for example,
Static output widgets require that text be converted to The following information explains the mechanism for managing fonts using the Xlib routines and functions. Creating and Freeing a Font SetXlib international text drawing is done using a set of one or more fonts, as needed for the locale of the text. Fonts are loaded according to a list of base font names supplied by the client and the charsets required by the locale. TheXFontSet is an opaque type.
Obtaining Font Set MetricsMetrics for the internationalized text drawing functions are defined in terms of a primary draw direction, which is the default direction in which the character origin advances for each succeeding character in the string. The Xlib interface is currently defined to support only a left-to-right primary draw direction. The drawing origin is the position passed to the drawing function when the text is drawn. The baseline is a line drawn through the drawing origin parallel to the primary draw direction. Character ink is the pixels painted in the foreground color and does not include interline or intercharacter spacing or image text background pixels.The drawing functions are allowed to implement implicit text direction control, reversing the order in which characters are rendered along the primary draw direction in response to locale-specific lexical analysis of the string.
Regardless of the character rendering order, the origins of all characters are on the primary draw direction side of the drawing origin. The screen location of a particular character image may be determined with the
The drawing functions are allowed to implement context-dependent rendering, where the glyphs drawn for a string are not simply a combination of the glyphs that represent each individual character. A string of two characters drawn with the The drawing functions do not interpret newline characters, tabs, or other control characters. The behavior when nonprinting characters are drawn (other than spaces) is implementation-dependent. It is the client's responsibility to interpret control characters in a text stream.
To find out about context-dependent rendering, use the Drawing Localized TextThe functions defined in this section draw text at a specified location in a drawable. They are similar to theXDrawText() , XDrawString() , and XDrawImageString() functions except that they work with font sets instead of single fonts, and they interpret the text based on the locale of the font set instead of treating the bytes of the string as direct font indexes. If a BadFont error is generated, characters prior to the offending character may have been drawn. The text is drawn using the fonts loaded for the specified font set; the font in the graphics context (GC) is ignored and may be modified by the functions. No validation that all fonts conform to some width rule is performed.
Use the Inputting Localized TextThe following discusses the Xlib and desktop mechanisms used for international text input. If you are using MotifText[Field] widgets or you are using the XmIm APIs for text input, this section provides background information. However, it will not impact your application design or coding practice. If you are not interested in how character input is achieved from the keyboard with low-level Xlib calls, you can proceed to "Interclient Communications Conventions for Localized Text" .Xlib Input Method OverviewThis section provides definitions for terms and concepts used for internationalized text input and a brief overview of the intended use of the mechanisms provided by Xlib.A large number of languages in the world use alphabets consisting of a small set of symbols (letters) to form words. To enter text into a computer in an alphabetic language, a user usually has a keyboard on which there are key symbols corresponding to the alphabet. Sometimes, a few characters of an alphabetic language are missing on the keyboard. Many computer users who speak a Latin-alphabet-based language only have an English-based keyboard. They need to press a combination of keystrokes to enter a character that does not exist directly on the keyboard. A number of algorithms have been developed for entering such characters, known as European input methods, the compose input method, or the dead-keys input method. Japanese is an example of a language with a phonetic symbol set, where each symbol represents a specific sound. There are two phonetic symbol sets in Japanese: Katakana and Hiragana. In general, Katakana is used for words that are of foreign origin, and Hiragana for writing native Japanese words. Collectively, the two systems are called Kana. Hiragana consists of 83 characters; Katakana, 86 characters. Korean also has a phonetic symbol set, called Hangul. Each of the 24 basic phonetic symbols (14 consonants and 10 vowels) represent a specific sound. A syllable is composed of two or three parts: the initial consonants, the vowels, and the optional last consonants. With Hangul, syllables can be treated as the basic units on which text processing is done. For example, a delete operation may work on a phonetic symbol or a syllable. Korean code sets include several thousands of these syllables. A user types the phonetic symbols that make up the syllables of the words to be entered. The display may change as each phonetic symbol is entered. For example, when the second phonetic symbol of a syllable is entered, the first phonetic symbol may change its shape and size. Likewise, when the third phonetic symbol is entered, the first two phonetic symbols may change their shape and size. Not all languages rely solely on alphabetic or phonetic systems. Some languages, including Japanese and Korean, employ an ideographic writing system. In an ideographic system, rather than taking a small set of symbols and combining them in different ways to create words, each word consists of one unique symbol (or, occasionally, several symbols). The number of symbols may be very large: approximately 50,000 have been identified in Hanzi, the Chinese ideographic system. There are two major aspects of ideographic systems for their computer usage. First, the standard computer character sets in Japan, China, and Korea include roughly 8,000 characters, while sets in Taiwan have between 15,000 and 30,000 characters, which make it necessary to use more than one byte to represent a character. Second, it is obviously impractical to have a keyboard that includes all of a given language's ideographic symbols. Therefore a mechanism is required for entering characters so that a keyboard with a reasonable number of keys can be used. Those input methods are usually based on phonetics, but there are also methods based on the graphical properties of characters. In Japan, both Kana and Kanji are used. In Korea, Hangul and sometimes Hanja are used. Now, consider entering ideographs in Japan, Korea, China, and Taiwan. In Japan, either Kana or English characters are entered and a region is selected (sometimes automatically) for conversion to Kanji. Several Kanji characters can have the same phonetic representation. If that is the case, with the string entered, a menu of characters is presented and the user must choose the appropriate option. If no choice is necessary or a preference has been established, the input method does the substitution directly. When Latin characters are converted to Kana or Kanji, it is called a Romaji conversion. In Korea, it is usually acceptable to keep Korean text in Hangul form, but some people may choose to write Hanja-originated words in Hanja rather than in Hangul. To change Hangul to Hanja, a region is selected for conversion and the user follows the same basic method as described for Japanese. Probably because there are well-accepted phonetic writing systems for Japanese and Korean, computer input methods in these countries for entering ideographs are fairly standard. Keyboard keys have both English characters and phonetic symbols engraved on them, and the user can switch between the two sets. The situation is different for Chinese. While there is a phonetic system called Pinyin promoted by authorities, there is no consensus for entering Chinese text. Some vendors use a phonetic decomposition (Pinyin or another), others use ideographic decomposition of Chinese words, with various implementations and keyboard layouts. There are about 16 known methods, none of which is a clear standard. Also, there are actually two ideographic sets used: Traditional Chinese (the original written Chinese) and Simplified Chinese. Several years ago, the People's Republic of China launched a campaign to simplify some ideographic characters and eliminate redundancies altogether. Under the plan, characters would be streamlined every five years. Characters have been revised several times now, resulting in the smaller, simpler set that makes up Simplified Chinese. Input Method ArchitectureAs shown in the previous section, there are many different input methods used today, each varying with language, culture, and history. A common feature of many input methods is that the user can type multiple keystrokes to compose a single character (or set of characters). The process of composing characters from keystrokes is called preediting. It may require complex algorithms and large dictionaries involving substantial computer resources.Input methods may require one or more areas in which to show the feedback of the actual keystrokes, to show ambiguities to the user, to list dictionaries, and so on. The following are the input method areas of concern.
Using an input server implies communications overhead, but applications can be migrated without relinking. Input methods can be implemented either as a token communicating to an input server or as a local library.
The abstraction used by a client to communicate with an input method is an opaque data structure represented by the XIM data type. This data structure is returned by the A single input server can be used for one or more languages, supporting one or more encoding schemes. But the strings returned from an input method are always encoded in the (single) locale associated with the XIM object. Input ContextsXlib provides the ability to manage a multithreaded state for text input. A client may be using multiple windows, each window with multiple text entry areas, with the user possibly switching among them at any time. The abstraction for representing the state of a particular input thread is called an input context. The Xlib representation of an input context is an XIC. See Figure 5-1 for an illustration.Figure 5-1 Input method and input contexts
An input context is the abstraction retaining the state, properties, and semantics of communication between a client and an input method. An input context is a combination of an input method, a locale specifying the encoding of the character strings to be returned, a client window, internal state information, and various layout or appearance characteristics. The input context concept somewhat matches for input the graphics context abstraction defined for graphics output.
One input context belongs to exactly one input method. Different input contexts can be associated with the same input method, possibly with the same client window. An XIC is created with the Considering the example of a client window with multiple text entry areas, the application programmer can choose to implement the following:
Keyboard InputTo obtain characters from an input method, a client must call theXmbLookupString() function or XwcLookupString() function with an input context created from that input method. Both a locale and display are bound to an input method when they are opened, and an input context inherits this locale and display. Any strings returned by the XmbLookupString() or XwcLookupString() function are encoded in that locale. Xlib Focus ManagementFor each text-entry area in which theXmbLookupString() or XwcLookupString() function is used, there is an associated input context.
When the application focus moves to a text-entry area, the application must set the input context focus to the input context associated with that area. The input context focus is set by calling the
Also, when the application focus moves out of a text-entry area, the application should unset the focus for the associated input context by calling the
Note: To set and unset the input context focus correctly, it is necessary to track application-level focus changes. Such focus changes do not necessarily correspond to X server focus changes. If a single input context is used to do input for multiple text-entry areas, it is also necessary to set the focus window of the input context whenever the focus window changes. Xlib Geometry ManagementIn most input method architectures (OnTheSpot being the notable exception), the input method performs the display of its own data. To provide better visual locality, it is often desirable to have the input method areas embedded within a client. To do this, the client may need to allocate space for an input method. Xlib provides support that allows the client to provide the size and position of input method areas. The input method areas that are supported for geometry management are the status area and the preedit area.The fundamental concept on which geometry management for input method windows is based is the proper division of responsibilities between the client (or toolkit) and the input method. The division of responsibilities is the following:
Before a client provides geometry management for an input method, it must determine if geometry management is needed. The input method indicates the need for geometry management by setting the After a client has established with the input method that it will do geometry management, the client must negotiate the geometry with the input method. The geometry is negotiated by the following steps:
XNFontSet and XNLineSpacing values may change the geometry desired by the input method. It is the responsibility of the client to renegotiate the geometry of the input method window when it is needed. In addition, a geometry management callback is provided by which an input method can initiate a geometry change. Event FilteringA filtering mechanism is provided to allow input methods to capture X events transparently to clients. It is expected that toolkits (or clients) using theXmbLookupString() or XwcLookupString() function call this filter at some point in the event processing mechanism to make sure that events needed by an input method can be filtered by that input method. If there is no filter, a client can receive and discard events that are necessary for the proper functioning of an input method. The following provides a few examples of such events:
CallbacksWhen an OnTheSpot input method is implemented, only the client can insert or delete preedit data in place and possibly scroll existing text. This means the echo of the keystrokes has to be achieved by the client itself, tightly coupled with the input method logic.
When a keystroke is entered, the client calls the There are a number of cases where the input method logic has to call back the client. Each of those cases is associated with a well-defined callback action. It is possible for the client to specify, for each input context, which callback is to be called for each action. There are also callbacks provided for feedback of status information and a callback to initiate a geometry request for an input method. X Server Keyboard ProtocolThis section discusses the server and keyboard groups.
A keysym is the encoding of a symbol on a keycap. The goal of the server's keysym mapping is to reflect the actual key caps on the physical keyboards. The user can redefine the keyboard by running the X Version 11 Release 4 (X11R4) allows for definition of a bilingual keyboard at the server. The following describes this capability. A list of keysyms is associated with each key code. The following list discusses the set of symbols on the corresponding key:
The first four elements of the list are split into two groups of keysyms. Group 1 contains the first and second keysyms; Group 2 contains the third and fourth keysyms. Within each group, if the second element of the group is NoSymbol, the group is treated as if the second element were the same as the first element, except when the first element is an alphabetic keysym The standard rules for obtaining a keysym from an event make use of the Group 1 and Group 2 keysyms only; no interpretation of other keysyms in the list is given here. The modifier state determines which group to use. Switching between groups is controlled by the keysym named MODE SWITCH by attaching that keysym to some key code and attaching that key code to any one of the modifiers Mod1 through Mod5. This modifier is called the group modifier. For any key code, Group 1 is used when the group modifier is off, and Group 2 is used when the group modifier is on. Within a group, the keysym to use is also determined by the modifier state. The first keysym is used when the Shift and Lock modifiers are off. The second keysym is used when the Shift modifier is on, when the Lock modifier is on, and when the second keysym is uppercase alphabetic, or when the Lock modifier is on and is interpreted as ShiftLock. Otherwise, when the Lock modifier is on and is interpreted as CapsLock, the state of the Shift modifier is applied first to select a keysym; if that keysym is lowercase alphabetic, the corresponding uppercase keysym is used instead. No spatial geometry of the symbols on the key is defined by their order in the keysym list, although a geometry might be defined on a vendor-specific basis. The server does not use the mapping between key codes and keysyms. Rather, it stores it merely for reading and writing by clients. The KeyMask modifier named Lock is intended to be mapped to either a CapsLock or a ShiftLock key, but which one it is mapped to is left as an application-specific decision, user-specific decision, or both. However, it is suggested that users determine mapping according to the associated keysyms of the corresponding key code. Interclient Communications Conventions for Localized TextThe following information explains how components use Interclient Communications Conventions (ICCC) to communicate text data and is offered as a guideline to understand how ICCC selections are performed. TheXmText widget , XmTextField widget, and the dtterm command adhere to these guidelines.
The toolkit is enhanced for internationalized ICCC compliance. The selection mechanism of For developers who use the toolkit to write their applications, the toolkit enables the application to be ICCC-compliant. However, for developers who may use another non-ICCC-compliant toolkit to develop applications that communicate with toolkit-based applications, the following may be helpful. Owner of SelectionAny owner returns at least the following atom list when XA_TARGETS is requested on some localized text:
XA_TEXT is requested, the owner returns its text as is with the encoding type of the property set to the code set of the current locale (no data conversion). An atom is created, representing the name of the code set of the locale.
When
When
Note: XA_STRING is defined to be ISO8859-1.
Requester of SelectionA requester first requestsXA_TARGET when text data is to be communicated with the selection owner.The requester then searches for one of the following atoms in priority order:
XA_TEXT atom is used only if none of the other atoms is found. Because the owner returns a property with a type representing its encoding, the requester attempts to convert to the code set of its locale.
If the type
When converting from XmClipboardXmClipboard is also enhanced to be ICCC-compliant in conjunction with the XmText and XmTextField widgets. When text is being put on the clipboard by way of the XmText and XmTextField widgets, the following ICCC protocol is implemented:
When text is being retrieved from the clipboard by way of the
Note: If text is put directly on the clipboard, the application needs to specify the format, or encoding type in the form of an atom, along with the text to put on the clipboard. Similarly, if text is retrieved directly from the clipboard, the retrieving application needs to check the format to see what encoding the data on the clipboard is encoded in and take the appropriate action. Passing Window Title and Icon Name to Window ManagersThe default of theXtNtitleEncoding and XtNiconNameEncoding resources for the VendorShell class is set to None. This is done only when using the libXm.a library. The libXt.a library still retains XA_STRING as the default for the resources.
This is done so that, as a default case, the
It is recommended that the user not set the
Assuming the Window Manager being communicated with is ICCC-compliant, that Window Manager is able to use the encoding type of
When setting the
The Window Manager is enhanced so that it always converts the client title and icon name passed from clients to the encoding of its current locale, and an
Note: This allows clients running with different code sets but with similar character sets to communicate their titles to the Window Manager. For example, both a PC code client and an ISO8859-1 client can display their titles regardless of the code set of the Window Manager. MessagesPart of internationalizing a system environment toolkit-based application is not to have any locale-specific data hardcoded within the application source. One common locale-specific item is messages (error and warning) returned by the application of the standard I/O (input/output).In general, for any error or warning messages to be displayed to the user through a system environment toolkit widget or gadget, the messages need to be externalized through message catalogs.
For dialog messages to be displayed through a toolkit component, the messages need to be externalized through localized resource files. This is done in the same way as localizing resources, such as the
For example, if a warning message is to be displayed through an
The localized resource files can be put in the The preceding two choices are left as design decisions for the application developer.
|