HP OpenVMS Systems Documentation


Compaq C Run-Time Library Utilities Reference Manual




ICONV COMPILE

Creates a conversion table file from a conversion source file. The conversion table file is used by the ICONV CONVERT command to convert characters in a file from one codeset to another.

Format

ICONV COMPILE sourcefile tablefile


Parameters

sourcefile

Required.

Name of the conversion source file. The default file type is .ISRC. The file naming convention that Compaq uses for conversion source files is:


fromcodeset_tocodeset.isrc

tablefile

Required.

Name of the conversion table file to be created. The default file type is .ICONV. The required file naming convention for conversion table files is:


fromcodeset_tocodeset.iconv

Public conversion table files are in the directory defined by the logical name SYS$I18N_ICONV. Put new conversion table files in the same directory if you want to make them available systemwide.


Qualifiers

/LISTING[=listfile]

Directs ICONV COMPILE to produce a listing file, which contains the source file listing and any error messages generated during compilation. If the file name is omitted from the qualifier, the default listing file name is sourcefile.LIS.

Description

The ICONV commands support any 1- to 4-byte codesets that are state independent. They do not support state-dependent codesets.

Note

This implementation places one restriction on tocodeset encodings: a character in the tocodeset must not use 0XFF as its fourth byte.

The conversion source file contains the character conversion rules for a specific conversion.

The format of a codeset conversion source file is defined as follows:


<fromcodeset_mb_cur_max>    value
<fromcodeset_mb_cur_min>    value
<tocodeset_mb_cur_max>      value
<tocodeset_mb_cur_min>      value
<fallback_code>             value
<escape_char>               value
<comment_char>              value
<fromcodeset_range>         value...value;value...value;...;value...value
ICONV_TABLE
fromvalue                         tovalue
fromvalue                         tovalue
   .                                 .
   .                                 .
   .                                 .
fromvalue                         tovalue
END ICONV_TABLE

where the <...> symbols and their associated values are codeset declarations, and the fromvalue/tovalue pairs are character conversion rules.

Codeset Declarations

The codeset declarations must precede the character conversion rules. Each declaration consists of a symbol, starting in column 1 and including the surrounding brackets, followed by one or more blanks (tabs or spaces), followed by the value to be assigned to the symbol. See Table 4-2.

Table 4-2 Codeset Declarations
Symbol Value
<fromcodeset_mb_cur_max> The maximum number of bytes in a character in the fromcodeset. This value defaults to 1.
<fromcodeset_mb_cur_min> The minimum number of bytes in a character in the fromcodeset. This value must be less than or equal to fromcodeset_mb_cur_max. If this value is not specified, it defaults to the value of fromcodeset_mb_cur_max.
<tocodeset_mb_cur_max> The maximum number of bytes in a character in the tocodeset. This value defaults to 1.
<tocodeset_mb_cur_min> The minimum number of bytes in a character in the tocodeset. This value must be less than or equal to tocodeset_mb_cur_max. If this value is not specified, it defaults to the value of tocodeset_mb_cur_max.
<fallback_code> The tovalues for the fromvalues that appear in the <fromcodeset_range> but are not specified between ICONV_TABLE and END ICONV_TABLE. Specify one of three kinds of values:
  • SAME --- Specifies that the tovalues are the same as the fromvalues.
  • ERROR --- Specifies that the conversion from the fromvalue to a tovalue is not supported. ICONV CONVERT issues a warning and ignores the rest of the record read. The Compaq C Run-Time Library routine iconv returns to the caller with an "illegal character" error.
  • User-defined tovalue --- The fromvalues are converted to the specified user-defined tovalue.

    The user-defined tovalue can represent a multibyte character with the restriction that 0XFF cannot be used as the value in the fourth byte. The settings for user-defined tovalues for <fallback_code> are the same as the settings for character conversion rule values. You can use octal, decimal, or hexadecimal digits. If the <fallback_code> is not specified, it defaults to SAME.

<escape_char> The escape character used to indicate that subsequent characters are interpreted in a special way. The escape character defaults to backslash (\).
<comment_char> The character that, when placed in column 1 of a line, indicates that the line will be ignored. The default comment character is the number sign (#).
<fromcodeset_range> The fromcodeset encoding ranges. Specify this declaration if the fromcodeset is a multibyte codeset. If this declaration is omitted, the fromcodeset is assumed to be a single-byte codeset, and the table created by ICONV COMPILE supports only single-byte fromcodeset conversions.

When specifying codeset encoding ranges for the fromcodeset, every zone of characters must be specified. If any zones of characters are missing from the <fromcodeset_range> specification, the codeset conversion might be incorrect. It is very important to specify the codeset encoding ranges correctly for the fromcodesets supported by the rest of the Compaq C Run-Time Library. If this is not done, the codeset support for iconv and the rest of the Compaq C Run-Time Library will not be consistent.

For example, the fromcodeset ranges for EUCJP are specified as:


<fromcodeset_range>  \x0...\x7f;\x8e\xa1...\x8e\xfe;
                     \xa1\xa1...\xfe\xfe;\x8f\xa1\xa1...\x8f\xfe\xfe

The settings for <fromcodeset_range> values are the same as the settings for character conversion rule values. You can use octal, decimal, or hexadecimal digits.
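The three kinds of <fallback_code> value described in Table 4-2 can be pictured with a short Python sketch. This is an illustration only, not how ICONV CONVERT or the iconv routine is implemented: the function name lookup_tovalue, the dictionary used for the table, and the use of ValueError to stand in for the "illegal character" error are all assumptions made for the example.

```python
def lookup_tovalue(table, fallback, fromvalue):
    """Return the tovalue for fromvalue, applying a <fallback_code>-style rule.

    table     -- dict mapping fromvalue bytes to tovalue bytes (hypothetical
                 stand-in for the rules between ICONV_TABLE and END ICONV_TABLE)
    fallback  -- the string "SAME", the string "ERROR", or a bytes tovalue
    fromvalue -- bytes of one character, assumed to lie in <fromcodeset_range>
    """
    if fromvalue in table:
        return table[fromvalue]          # an explicit conversion rule exists
    if fallback == "SAME":
        return fromvalue                 # tovalue is the same as fromvalue
    if fallback == "ERROR":
        # Stand-in for the "illegal character" error described in the table.
        raise ValueError("illegal character: %r" % (fromvalue,))
    return fallback                      # user-defined tovalue
```

For example, with a table that holds only the rule for byte 0x20, a lookup of byte 0x41 returns 0x41 under SAME, raises an error under ERROR, and returns the given bytes under a user-defined fallback.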

Character Conversion Rules

The character conversion rules are all the lines between the string ICONV_TABLE starting in column 1 and END ICONV_TABLE starting in column 1.

Character conversion rules must begin in column 1.

Empty lines and lines containing a comment_char in the first column are ignored. Comments are optional.

Character conversion rules can have one of two forms:


fromvalue                   tovalue

fromvalue...fromvalue       tovalue

Place one or more blanks (tabs or spaces) between fromvalue and tovalue.

Use the first format to define a single-character conversion rule. For example:


\d32       \d101
\d37       \d106

Use the second format to define a range of character conversion rules. In this format, the ending fromvalue must be equal to or greater than the starting fromvalue. The subsequent fromvalues defined by the range are converted to tovalues in increasing order.

For example, consider the following line:


\d223\d32...\d223\d35       \d129\d254

This line is interpreted as:


\d223\d32       \d129\d254
\d223\d33       \d129\d255
\d223\d34       \d130\d0
\d223\d35       \d130\d1
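The expansion above can be reproduced with a short Python sketch. It assumes that a multibyte value is stepped as a big-endian integer, so that the last byte carries into the preceding byte; this matches the example (\d129\d255 is followed by \d130\d0) but is otherwise an assumption, and the function name expand_range is hypothetical.

```python
def expand_range(first_from, last_from, first_to):
    """Expand a 'fromvalue...fromvalue  tovalue' rule into single rules.

    All arguments are bytes objects. Returns a list of
    (fromvalue, tovalue) pairs in increasing order.
    """
    if len(first_from) != len(last_from):
        raise ValueError("range endpoints must have the same byte length")
    start = int.from_bytes(first_from, "big")
    end = int.from_bytes(last_from, "big")
    if end < start:
        raise ValueError("ending fromvalue must be >= starting fromvalue")
    to_start = int.from_bytes(first_to, "big")
    pairs = []
    for offset in range(end - start + 1):
        # Step both values together; to_bytes carries across byte boundaries.
        src = (start + offset).to_bytes(len(first_from), "big")
        dst = (to_start + offset).to_bytes(len(first_to), "big")
        pairs.append((src, dst))
    return pairs
```

Calling expand_range(bytes([223, 32]), bytes([223, 35]), bytes([129, 254])) yields the four pairs listed above.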

For settings of fromvalue and tovalue:

  • A decimal constant is defined as one, two, or three decimal digits preceded by the escape character and lowercase d. For example: \d42.
  • An octal constant is defined as one, two, or three octal digits preceded by the escape character. For example: \141.
  • A hexadecimal constant is defined as one or two hexadecimal digits preceded by the escape character and a lowercase x. For example: \x6a.

Each constant represents a single-byte value. You can represent multibyte values by concatenating two or more decimal, octal, or hexadecimal constants.

Note

When constants are concatenated for multibyte values, they must have the same radix (decimal, octal, or hexadecimal). Only characters in the Portable Character Set can be used to construct conversion source files.
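As an illustration of the constant syntax, the following Python sketch turns a fromvalue or tovalue string into its byte values. It assumes the default escape character (backslash), accepts hexadecimal digits in either case, and rejects values above 255 and mixed radixes; the function name parse_value is hypothetical, and the checks that ICONV COMPILE actually performs may differ.

```python
import re

# One constant: \dNNN (decimal), \xNN (hexadecimal), or \NNN (octal).
_CONSTANT = re.compile(r"\\(?:d([0-9]{1,3})|x([0-9A-Fa-f]{1,2})|([0-7]{1,3}))")


def parse_value(text):
    """Convert a string such as r'\\x8e\\xa1' into the bytes it denotes."""
    out = bytearray()
    radixes = set()
    pos = 0
    while pos < len(text):
        match = _CONSTANT.match(text, pos)
        if match is None:
            raise ValueError("bad constant at offset %d in %r" % (pos, text))
        dec, hexa, octal = match.groups()
        if dec is not None:
            value, radix = int(dec, 10), "decimal"
        elif hexa is not None:
            value, radix = int(hexa, 16), "hexadecimal"
        else:
            value, radix = int(octal, 8), "octal"
        if value > 0xFF:
            raise ValueError("constant does not fit in one byte: %r" % text)
        out.append(value)
        radixes.add(radix)
        pos = match.end()
    if not out:
        raise ValueError("empty value")
    if len(radixes) > 1:
        raise ValueError("concatenated constants must share one radix")
    return bytes(out)
```

With this sketch, the three constants shown above (\d42, \141, and \x6a) parse to the single bytes 42, 97, and 106 respectively, and a string that mixes radixes, such as \d42\x6a, is rejected.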

Also see the ICONV CONVERT command.

Errors

If ICONV COMPILE encounters an error during processing, it does not generate the output table file. If it issues only warnings, it still creates a valid table file. However, because a warning can indicate a user error, always check the returned warning messages.

Some ICONV COMPILE error messages and their descriptions follow.


%ICONV-E-INVFCSRNG, syntax error in <fromcodeset_range> definition

The previous error occurs when the definition of the <fromcodeset_range> symbol does not conform to the required syntax. The <fromcodeset_range> symbol defines encoding ranges and is required for multibyte codesets.


%ICONV-E-INVSYNTAX, invalid file syntax

The previous error occurs when a line in the source does not conform to the required syntax.


%ICONV-E-BADTABLE, bad table caused by invalid value for <fromcodeset_range> definition

The previous error occurs when an invalid value is specified for the codeset encoding ranges. The encoding ranges are defined by the <fromcodeset_range> symbol.


Examples

#1

$ ICONV COMPILE /LISTING EUCTW_DECHANYU.ISRC EUCTW_DECHANYU.ICONV

      

This example shows how to create a conversion table file to convert the EUCTW codeset to the DECHANYU codeset. The listing file, EUCTW_DECHANYU.LIS, contains a listing of the source file and any error messages generated by the compiler.


ICONV CONVERT

Converts characters in a file from one codeset to another codeset. The converted characters are written to an output file.

Format

ICONV CONVERT infile outfile


Parameters

infile

Required.

Name of the file that contains the characters to be converted. The /FROMCODE qualifier specifies the codeset of the characters in this file.

outfile

Required.

Name of the file created by ICONV CONVERT. The /TOCODE qualifier specifies the codeset of the characters in this file.


Qualifiers

/FROMCODE=fromcodeset

Required.

Specifies the codeset of the characters in infile.

/TOCODE=tocodeset

Required.

Specifies the codeset of the characters in outfile.


Description

The ICONV CONVERT command converts the characters in infile from the codeset identified by the /FROMCODE qualifier to the codeset identified by the /TOCODE qualifier. The converted file is written to outfile.

The conversion is done in one of two ways:

  • Using a conversion table file to look up the converted characters. This is the default method. Conversion table files are created by the DCL command ICONV COMPILE.
  • Using a shareable image file that implements the required conversion. Use this method when a table-based converter is either inconvenient (for example, when a table would need a huge virtual address space but an algorithm needs little) or impossible (for example, for a state-dependent encoding such as ISO2022).

The converter's file naming convention, which applies to both table and shareable image implementations, is:


fromcodeset_tocodeset.iconv

Note

If you add conversion files to your system, they must use the same file-naming convention.

ICONV CONVERT searches your current directory for a converter file. If it cannot find the file, it then searches the system directory defined by the logical name SYS$I18N_ICONV.
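The naming convention and search order can be sketched in Python as follows. This is only an illustration of the lookup described above, not the actual ICONV CONVERT logic: the function names are hypothetical, and the caller is assumed to pass ordinary directory paths standing in for the current directory and the directory that SYS$I18N_ICONV points to.

```python
from pathlib import Path


def converter_file_name(fromcodeset, tocodeset):
    """Build the converter file name fromcodeset_tocodeset.iconv."""
    return "%s_%s.iconv" % (fromcodeset, tocodeset)


def find_converter(fromcodeset, tocodeset, search_dirs):
    """Return the first matching converter file, or None if none exists.

    search_dirs is an ordered list of directories: the current directory
    first, then the system directory (both supplied by the caller).
    """
    name = converter_file_name(fromcodeset, tocodeset)
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

Because the directories are tried in order, a private converter in the current directory takes precedence over a systemwide converter with the same name.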


Examples

#1

$ ICONV CONVERT /FROMCODE=EUCTW /TOCODE=DECHANYU -
_$ FROMFILE.DAT TOFILE.DAT

      

This example shows a conversion from EUCTW characters to DECHANYU characters. The EUCTW characters in the file FROMFILE.DAT are converted to the corresponding DECHANYU characters. The converted characters are stored in the file TOFILE.DAT.


LOCALE COMPILE

Converts a locale source file into a binary locale file. The binary locale file is used by those utilities and C routines that are dependent on the setting of the international environment logical names.

Format

LOCALE COMPILE sourcefile


Parameters

sourcefile

Required.

Name of the locale source file, which defines each category of the locale. The default file type for the source file is .LSRC. For the definition of the locale source file format, see Chapter 2.


Qualifiers

/CHARACTER_DEFINITIONS=filename

/NOCHARACTER_DEFINITIONS

Optional. Default: /NOCHARACTER_DEFINITIONS

Specifies a character-set description file (charmap) for the locale. This file maps characters to their actual character encodings.

If a charmap is not specified, no symbolic names (other than collating symbols defined in a collating symbol keyword) are allowed in the locale source file.

For a definition of the charmap file format, see Chapter 3. The default file type for a charmap is .CMAP.

/DISPLAY[=[NO]HOLE]

Optional. Default: /DISPLAY=NOHOLE

Used with certain Chinese locales and terminals to specify that 4-byte characters occupy four printing positions (columns) on the terminal display. The default value (/DISPLAY=NOHOLE) specifies that 4-byte characters occupy two printing positions.

/IGNORE=WARNINGS

/NOIGNORE

Optional. Default: /NOIGNORE

Generates an output file even if LOCALE COMPILE issues warning messages. Use the /IGNORE qualifier cautiously because the warnings could indicate user errors that you might want to correct before using the resulting locale file.

/LISTING[=filename]

/NOLISTING

Optional. Batch default: /LISTING; interactive default: /NOLISTING

Name of the listing file. The /SHOW qualifier controls the information included in the listing file. If the file name is omitted, the default is sourcefile.LIS.

/OUTPUT[=filename]

/NOOUTPUT

Optional. Default: /OUTPUT=sourcefile.LOCALE

Name of the output file. Public locales are stored in the directory defined by the logical name SYS$I18N_LOCALE. If the output file is in any other location, the locale is private.

/NOOUTPUT results in no output file creation, even if the compilation succeeds.

/SHOW[=(keyword[,...])]

Optional. Default: /SHOW=(SOURCE,TERMINAL)

/SHOW, together with /LISTING, controls the information included in the listing file. You can specify the following keywords:

Keyword                      Description
ALL                          Include all information.
BRIEF                        Include a summary of the symbol table.
[NO]CHARACTER_DEFINITIONS    Include or omit the charmap file.
NONE                         Do not print any information. The listing file contains only the generated error messages.
[NO]SOURCE                   Include or omit a listing of the source file.
[NO]STATISTICS               Include or omit compiler performance information.
[NO]SYMBOLS                  Include or omit a listing of the charmap symbol table.
[NO]TERMINAL                 Display compiler messages at the terminal.

Description

Use the LOCALE COMPILE command to add new locales to your system in addition to those supplied by Compaq. To compile a locale, LOCALE COMPILE requires two files:
  • A charmap file that defines the character set for the locale. If you do not specify a charmap file, symbolic names cannot be used in the locale source file. If the source file uses symbolic names anyway, LOCALE COMPILE issues an error or warning message, depending on the category processed, and no output file is produced. (Also see the /IGNORE qualifier.)
  • A locale source file. This file describes one or more of the locale categories: LC_CTYPE, LC_COLLATE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and LC_TIME.

Errors

The following error messages are related to the LOCALE COMPILE command:
  • %LOCALE-E-CASEALRDY, case conversion already exists for 'character'
    Where character is a character from the codeset. This error can occur when the locale compiler is processing the LC_CTYPE category. It indicates that more than one case conversion is specified for character.

  • %LOCALE-E-PREOFCMAP, premature end of file in charmap file
    Occurs if there is no END CHARMAP statement in the charmap file.
  • %LOCALE-E-PREEOFSRC, premature end of file in source file
    Occurs if there is an error with the END statements in the locale source file.
  • %LOCALE-F-NOADDSYM, failed to add symbol to symbol table
    Occurs when there is insufficient memory to finish the compilation. Check the amount of memory available to your process.
  • %LOCALE-F-NOINITSYM, failed to initialize symbol table
    Occurs if memory is insufficient to finish the compilation. Check the amount of memory available to your process.

Examples

#1

$ LOCALE COMPILE EN_GB_ISO8859-1 /CHARACTER_DEFINITIONS=ISO8859-1 -
_$ /LIST /SHOW=(CHARACTER_DEFINITIONS,SYMBOLS,STATISTICS)

      

This example shows how to generate a locale file named EN_GB_ISO8859-1.LOCALE from the source file EN_GB_ISO8859-1.LSRC, using the charmap file ISO8859-1.CMAP. To use this locale file, copy it to the SYS$I18N_LOCALE directory and set the LANG logical to "EN_GB.ISO8859-1". The listing file contains a listing of the charmap file, the symbol table, performance information, and any error messages generated by the compiler.


LOCALE LOAD

Loads the specified locale into the system's memory as shared, read-only global data.

Format

LOCALE LOAD locale_identifier


Parameters

locale_identifier

Required.

Character string that identifies the locale to be loaded. Specify one of the following:

  • Name of the public locale
    Specifies the public locale. Use the format:


    language_country.codeset[@modifier]
    

    LOCALE LOAD searches for the public locale binary file in the location defined by the logical name SYS$I18N_LOCALE. The file type defaults to .LOCALE. The period (.) and at-sign (@) characters in the name specified are replaced by underscore (_) characters.
    For example, if the name specified is "zh_CN.dechanzi@radical", LOCALE LOAD searches for the following binary locale file:
    SYS$I18N_LOCALE:ZH_CN_DECHANZI_RADICAL.LOCALE
  • Name of a file
    Specifies the binary locale file. This can be any valid file specification. If either the device or directory is not specified, LOCALE LOAD first applies the current caller's device and directory as defaults. If the file is not found, the device and directory defined by the SYS$I18N_LOCALE logical name are used as defaults. The file type defaults to .LOCALE.
    Wildcards are not valid. The binary locale file cannot reside on a remote node.
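The mapping from a public locale name to its binary locale file can be sketched in Python as shown below. The function name public_locale_file is hypothetical, and the conversion to uppercase is an assumption based on the example above rather than a documented rule.

```python
def public_locale_file(locale_name):
    """Map a public locale name to the file specification that is searched.

    Periods and at-signs are replaced by underscores, and the default
    file type .LOCALE is appended.
    """
    stem = locale_name.replace(".", "_").replace("@", "_")
    # Uppercasing mirrors the documented example; it is an assumption.
    return "SYS$I18N_LOCALE:%s.LOCALE" % stem.upper()
```

For the name "zh_CN.dechanzi@radical", this returns SYS$I18N_LOCALE:ZH_CN_DECHANZI_RADICAL.LOCALE, the file specification given in the example above.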

Qualifiers

None.


Description

The LOCALE LOAD command loads the specified locale into the system's memory as several shared, read-only global sections. All processes that access the loaded locale then use this one copy of the locale, thereby reducing overall demand on system memory.

This DCL command is privileged and is typically issued by the system manager. The following privileges are required:

  • SYSGBL
  • PRMGBL

Examples

#1

$ LOCALE LOAD JA_JP_DECKANJI
      

This example shows how to load the JA_JP_DECKANJI locale.

