HP OpenVMS/Hanzi RTL Chinese Processing (HSY$) Manual

Order Number: BA322-90018


May 2005

This manual documents the library routines contained in the HSY$ facility of the OpenVMS/Hanzi Run-Time Library.

Revision/Update Information: This document supersedes the Introduction to the Multi-byte Processing Run Time Library HSYSHR manual, Version 6.0.

Software Version: OpenVMS/Hanzi I64 Version 8.2; OpenVMS/Hanzi Alpha Version 7.3-2




Hewlett-Packard Company Palo Alto, California


© Copyright 2005 Hewlett-Packard Development Company, L.P.

Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

Intel and Itanium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Printed in Singapore


Preface

This manual provides users of the HP OpenVMS/Hanzi operating system with detailed usage and reference information on library routines supplied in the HSY$ facility of the OpenVMS/Hanzi Run-Time Library for Chinese processing.

Intended Audience

This manual is intended for application programmers who want to write applications for Chinese processing.

Document Structure

This manual is organized into two parts as follows:

Associated Document

A description of how the Run-Time Library routines are accessed is presented in OpenVMS Programming Interface: Calling a System Routine. The HSY$ Run-Time Library routines can be used with other RTL facilities provided in OpenVMS and OpenVMS/Hanzi. Descriptions of the other RTL facilities and their corresponding routines are presented in the following books:

Application programmers using any programming language can refer to Guide to Creating OpenVMS Modular Procedures for writing modular and reentrant code, and OpenVMS/Hanzi User Guide for understanding the DEC Hanzi character set.

High-level language programmers will find additional information on calling Run-Time Library routines in their language reference manuals. Additional information may also be found in the programming language user's guide provided with your OpenVMS programming language software.

For a complete list and description of the manuals in the OpenVMS documentation set, see Overview of OpenVMS Documentation.

For additional information about HP OpenVMS products and services, visit the following World Wide Web address:


http://www.hp.com/go/openvms 

Conventions

The following conventions may be used in this manual:
Ctrl/x A sequence such as Ctrl/x indicates that you must hold down the key labeled Ctrl while you press another key or a pointing device button.
PF1 x A sequence such as PF1 x indicates that you must first press and release the key labeled PF1 and then press and release another key or a pointing device button.
[Return] In examples, a key name enclosed in a box indicates that you press a key on the keyboard. (In text, a key name is not enclosed in a box.)

In the HTML version of this document, this convention appears as brackets, rather than a box.

... A horizontal ellipsis in examples indicates one of the following possibilities:
  • Additional optional arguments in a statement have been omitted.
  • The preceding item or items can be repeated one or more times.
  • Additional parameters, values, or other information can be entered.
.
.
.
A vertical ellipsis indicates the omission of items from a code example or command format; the items are omitted because they are not important to the topic being discussed.
( ) In command format descriptions, parentheses indicate that you must enclose choices in parentheses if you specify more than one.
[ ] In command format descriptions, brackets indicate optional choices. You can choose one or more items or no items. Do not type the brackets on the command line. However, you must include the brackets in the syntax for OpenVMS directory specifications and for a substring specification in an assignment statement.
| In command format descriptions, vertical bars separate choices within brackets or braces. Within brackets, the choices are optional; within braces, at least one choice is required. Do not type the vertical bars on the command line.
{ } In command format descriptions, braces indicate required choices; you must choose at least one of the items listed. Do not type the braces on the command line.
bold type Bold type represents the introduction of a new term. It also represents the name of an argument, an attribute, or a reason.
italic type Italic type indicates important information, complete titles of manuals, or variables. Variables include information that varies in system output (Internal error number), in command lines (/PRODUCER=name), and in command parameters in text (where dd represents the predefined code for the device type).
Example This typeface indicates code examples, command examples, and interactive screen displays. In text, this type also identifies URLs, UNIX commands and pathnames, PC-based commands and folders, and certain elements of the C programming language.
UPPERCASE TYPE Uppercase type indicates a command, the name of a routine, the name of a file, or the abbreviation for a system privilege.
- A hyphen at the end of a command format description, command line, or code line indicates that the command or statement continues on the following line.
numbers All numbers in text are assumed to be decimal unless otherwise noted. Nondecimal radixes (binary, octal, or hexadecimal) are explicitly indicated.


Chapter 1
INTRODUCTION

The OpenVMS/Hanzi Chinese Processing Run-Time Library (or simply HSYSHR) is a library of prewritten, commonly used routines that perform a wide variety of multi-byte Chinese language processing operations. It represents the HSY$ facility of the OpenVMS/Hanzi Run-Time Library. All HSY$ routines follow the OpenVMS Procedure Calling Standard, so they are callable from any programming language supported in OpenVMS/Hanzi, which increases program flexibility.

1.1 Organization of HSYSHR

Routines in HSYSHR are grouped according to the types of tasks they perform. Altogether, there are nine groups of routines. All routine names are prefixed by the facility code HSY$. Routines prefixed by HSY$DX_ pass strings by descriptor; all other routines pass a string by the address of its starting position. Table 1-1 shows the nine groups of HSY$ routines.

Table 1-1 HSYSHR routine groups
Group                    Types of Tasks Performed
String Routines          Perform manipulation of strings containing multi-byte or mixed ASCII and multi-byte characters.
Read Write Routines      Perform read and write of ASCII and multi-byte characters in user buffers.
Pointer Routines         Perform character pointer manipulation.
Comparison Routines      Perform comparison of strings containing multi-byte or mixed ASCII and multi-byte characters.
Searching Routines       Perform searching of substrings in buffers containing multi-byte or mixed ASCII and multi-byte characters.
Counting Routines        Perform counting of bytes and characters in buffers containing multi-byte or mixed ASCII and multi-byte characters.
Character Type Routines  Perform checking of different classes of local language symbols and characters.
Date Time Routines       Provide local language date time format.
Conversion Routines      Perform various multi-byte character specific conversions.
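
The difference between the two string-passing styles can be illustrated with a small Python sketch. It is not the HSY$ interface: the StringDescriptor class and both function names are hypothetical, and the rule that a byte with its high bit set starts a two-byte character is a simplifying assumption about the encoding.

```python
from dataclasses import dataclass


@dataclass
class StringDescriptor:
    """Hypothetical stand-in for a string descriptor: the length travels with the buffer."""
    length: int    # number of bytes in the string
    buffer: bytes  # storage that holds the string


def first_char_by_address(buffer: bytes, start: int, end: int) -> bytes:
    """Address style: the caller supplies the start position and the end bound."""
    if start >= end:
        return b""
    # Assumed rule: a byte with its high bit set starts a two-byte character.
    size = 2 if buffer[start] >= 0x80 and start + 1 < end else 1
    return buffer[start:start + size]


def first_char_by_descriptor(desc: StringDescriptor) -> bytes:
    """Descriptor style: the bounds are taken from the descriptor itself."""
    return first_char_by_address(desc.buffer, 0, desc.length)


text = b"\xd6\xd0A"  # one two-byte character followed by ASCII 'A'
print(first_char_by_address(text, 0, len(text)))
print(first_char_by_descriptor(StringDescriptor(len(text), text)))
```

In the address style the caller is responsible for tracking where the string ends; in the descriptor style that information is carried with the string.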

Table 1-2 to Table 1-10 list all routines available for each of the aforementioned groups, followed by brief statements of the routines' functions.

Table 1-2 String Routines
Routine Name Function
HSY$CH_MOVE Moves a substring from a specified source buffer to a specified destination buffer.
HSY$TRIM Trims trailing one-byte and multi-byte spaces and TAB characters.
HSY$TRUNC Returns the position of the first character that follows the truncated string.
HSY$DX_TRIM Trims trailing one-byte and multi-byte spaces and TAB characters.
HSY$DX_TRUNC Truncates the input string to the specified length.
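
As a rough illustration of what trimming trailing one-byte and multi-byte spaces involves, the following Python sketch walks the buffer character by character before trimming. The function names are hypothetical, the two-byte space code 0xA1A1 is the GB 2312 value and is assumed here to apply to the DEC Hanzi character set, and the lead-byte rule is a simplification.

```python
FULL_WIDTH_SPACE = b"\xa1\xa1"  # assumed two-byte space code (GB 2312 value)


def char_starts(buf: bytes) -> list:
    """Offsets of the first byte of each character in buf."""
    starts = []
    pos = 0
    while pos < len(buf):
        starts.append(pos)
        # Assumed rule: a byte with its high bit set starts a two-byte character.
        pos += 2 if buf[pos] >= 0x80 and pos + 1 < len(buf) else 1
    return starts


def trim_trailing(buf: bytes) -> bytes:
    """Drop trailing one-byte spaces, TABs, and two-byte spaces."""
    end = len(buf)
    for start in reversed(char_starts(buf)):
        if buf[start:end] in (b" ", b"\t", FULL_WIDTH_SPACE):
            end = start
        else:
            break
    return buf[:end]


print(trim_trailing(b"\xd6\xd0 \xa1\xa1\t"))
```

Scanning forward to find character boundaries first avoids mistaking the second byte of one character plus a following byte for a two-byte space.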

Table 1-3 Read Write Routines
Routine Name Function
HSY$CH_GCHAR Reads the current character.
HSY$CH_GNEXT Reads the current character.
HSY$CH_NEXTG Reads the next character, skipping the current character.
HSY$CH_RCHAR Reads the current character.
HSY$CH_RNEXT Reads the current character.
HSY$CH_RPREV Reads the previous character.
HSY$DX_RCHAR Reads the current character.
HSY$DX_RNEXT Reads the current character.
HSY$CH_PCHAR Writes a specified character to the current position of a buffer.
HSY$CH_PNEXT Writes a specified character to the current position of a buffer.
HSY$CH_WCHAR Writes a specified character to the current position of a buffer.
HSY$CH_WNEXT Writes a specified character to the current position of a buffer.
HSY$DX_WCHAR Writes a specified character.
HSY$DX_WNEXT Writes a specified character.

Table 1-4 Pointer Routines
Routine Name Function
HSY$SKPC Skips a specified character.
HSY$CH_CURR Points to the first byte of the current character.
HSY$CH_NEXT Points to the first byte of the next character.
HSY$CH_PREV Points to the first byte of the previous character.
HSY$POS_CURR Points to the first byte of the current character.
HSY$POS_NEXT Points to the first byte of the next character.
HSY$POS_PREV Points to the first byte of the previous character.
HSY$DX_SKPC Skips a specified character.
HSY$DX_POS_CURR Points to the first byte of the current character.
HSY$DX_POS_NEXT Points to the first byte of the next character.
HSY$DX_POS_PREV Points to the first byte of the previous character.
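
The idea behind the current, next, and previous pointer operations can be sketched in Python as follows. The function names only echo the routine names and are hypothetical; the sketch assumes that any byte with its high bit set starts a two-byte character, and it finds boundaries by scanning from the start of the buffer, which may differ from how the library works.

```python
def char_size(buf: bytes, pos: int) -> int:
    """Assumed rule: a byte with its high bit set starts a two-byte character."""
    return 2 if buf[pos] >= 0x80 and pos + 1 < len(buf) else 1


def pos_curr(buf: bytes, pos: int) -> int:
    """Offset of the first byte of the character containing byte offset pos."""
    start = 0
    while start + char_size(buf, start) <= pos:
        start += char_size(buf, start)
    return start


def pos_next(buf: bytes, pos: int) -> int:
    """Offset of the first byte of the character after the current one."""
    start = pos_curr(buf, pos)
    return start + char_size(buf, start)


def pos_prev(buf: bytes, pos: int) -> int:
    """Offset of the first byte of the character before the current one."""
    start = pos_curr(buf, pos)
    return pos_curr(buf, start - 1) if start > 0 else 0


text = b"A\xd6\xd0B"  # 'A', one two-byte character, 'B'
print(pos_curr(text, 2), pos_next(text, 2), pos_prev(text, 2))
```

Offset 2 is the second byte of the two-byte character, so the current character starts at offset 1, the next at offset 3, and the previous at offset 0.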

Table 1-5 Comparison Routines
Routine Name Function
HSY$COMPARE Compares two specified strings.
HSY$STR_EQUAL Checks if two specified character strings are equal.
HSY$DX_STR_EQUAL Checks if two specified character strings are equal.

Table 1-6 Searching Routines
Routine Name Function
HSY$LOCC Locates the position of the first occurrence of the specified character.
HSY$POSITION Searches for the first occurrence of a specified substring in the input string.
HSY$STR_SEARCH Searches for the first occurrence of a specified substring in the input string with conversion performed prior to comparing the characters.
HSY$STR_START Checks if the specified substring is found in another input string and starts from the first byte of the input string.
HSY$DX_LOCC Locates the position of the first occurrence of the specified character.
HSY$DX_POSITION Searches for the first occurrence of a substring in a specified string.
HSY$DX_STR_SEARCH Searches for the first occurrence of a specified substring in the input string.
HSY$DX_STR_START Checks if the specified substring is found in another input string and starts from the first byte of the input string.
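
One reason a multi-byte library needs its own searching routines is that a plain byte search can report a match that straddles two characters. The Python sketch below shows the effect; find_char_aligned is a hypothetical helper, not an HSY$ routine, and it assumes that any byte with its high bit set starts a two-byte character.

```python
def find_char_aligned(buf: bytes, target: bytes) -> int:
    """Return the offset of target in buf, testing only character boundaries."""
    pos = 0
    while pos < len(buf):
        if buf.startswith(target, pos):
            return pos
        # Assumed rule: a byte with its high bit set starts a two-byte character.
        pos += 2 if buf[pos] >= 0x80 and pos + 1 < len(buf) else 1
    return -1


text = b"\xa1\xb2\xa1\xb3"  # two two-byte characters: A1B2 and A1B3
target = b"\xb2\xa1"        # a different two-byte character

print(text.find(target))                # byte search: false match at offset 1
print(find_char_aligned(text, target))  # boundary-aware search: no match
```

The byte search matches the second byte of the first character joined to the first byte of the second, while the boundary-aware search correctly reports that the character is absent.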

Table 1-7 Counting Routines
Routine Name Function
HSY$CH_SIZE Tells the byte length of the specified character.
HSY$CH_NCHAR Returns the number of characters in a specified string.
HSY$CH_NBYTE Counts the number of bytes of a character string.
HSY$DX_NOF_CHAR Returns the number of characters in a specified number of bytes.
HSY$DX_NOF_BYTE Counts the number of bytes of a character string.
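
The distinction between counting bytes and counting characters can be sketched as follows. The helper names are hypothetical and the lead-byte rule is an assumed simplification of the encoding, so this is an illustration of the concept rather than of the routines themselves.

```python
def char_size(buf: bytes, pos: int) -> int:
    """Assumed rule: a byte with its high bit set starts a two-byte character."""
    return 2 if buf[pos] >= 0x80 and pos + 1 < len(buf) else 1


def count_chars(buf: bytes) -> int:
    """Number of characters (not bytes) in buf."""
    count = 0
    pos = 0
    while pos < len(buf):
        pos += char_size(buf, pos)
        count += 1
    return count


def bytes_for_chars(buf: bytes, nchars: int) -> int:
    """Number of bytes occupied by the first nchars characters of buf."""
    pos = 0
    while nchars > 0 and pos < len(buf):
        pos += char_size(buf, pos)
        nchars -= 1
    return pos


text = b"VMS\xba\xba\xd7\xd6"  # "VMS" followed by two two-byte characters
print(len(text), count_chars(text), bytes_for_chars(text, 4))
```

Here the buffer holds 7 bytes but 5 characters, and its first 4 characters occupy 5 bytes.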

Table 1-8 Character Type Routines
Routine Name Function
HSY$IS_VALID Checks if the input character is a valid multi-byte character.
HSY$IS_IDEOGRAPH Checks if the input multi-byte character is an ideographic multi-byte character.
HSY$IS_DESCRIPTION Checks if the input character is a multi-byte local language punctuation.
HSY$IS_TECHNICAL Checks if the input character is a scientific or mathematical multi-byte symbol character.
HSY$IS_UNIT Checks if the input character is a multi-byte standard unit symbol character.
HSY$IS_GENERAL Checks if the input character is a multi-byte general symbol character.
HSY$IS_LINE_DRAWING Checks if the input character is a multi-byte line drawing symbol character.
HSY$IS_DIGIT Checks if the input character is a one-byte or multi-byte numeric digit.
HSY$IS_ROMAN Checks if the input character is a one-byte or multi-byte English letter.
HSY$IS_GREEK Checks if the input character is a multi-byte Greek letter.
HSY$IS_RUSSIAN Checks if the input character is a multi-byte Russian letter.
HSY$IS_ALPHA Checks if the input character is a Greek, Russian or Roman letter.
HSY$IS_UPPER Checks if the input character is an upper case Greek, Russian or Roman letter.
HSY$IS_LOWER Checks if the input character is a lower case Greek, Russian or Roman letter.
HSY$IS_HIRAGANA Checks if the input character is a multi-byte Japanese Hiragana character.
HSY$IS_KATAKANA Checks if the input character is a multi-byte Japanese Katakana character.
HSY$IS_KANA Checks if the input character is a multi-byte Japanese Kana character.
HSY$IS_PARENTHESIS Checks if the input character is a multi-byte parenthesis symbol character.
HSY$IS_LEFT_PARENTHESIS Checks if the input character is a multi-byte left parenthesis symbol character.
HSY$IS_RIGHT_PARENTHESIS Checks if the input character is a multi-byte right parenthesis symbol character.
HSY$IS_NO_FIRST Checks if the input character is a multi-byte "NO FIRST" character.
HSY$IS_NO_LAST Checks if the input character is a multi-byte "NO LAST" character.
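
To give a feel for how such class checks can work, the sketch below classifies a character by its first byte. The row assignments are those of GB 2312 and are assumed here to carry over to the DEC Hanzi character set; the classify function and its class names are hypothetical and do not correspond one-to-one with the routines above.

```python
# Lead byte -> class, following the GB 2312 row layout (assumed to apply).
ROW_CLASSES = {
    0xA3: "full-form ASCII",
    0xA4: "hiragana",
    0xA5: "katakana",
    0xA6: "greek",
    0xA7: "russian",
    0xA9: "line drawing",
}


def classify(ch: bytes) -> str:
    """Classify a single one-byte or two-byte character."""
    if len(ch) == 1 and ch[0] < 0x80:
        return "one-byte"
    if len(ch) != 2 or not 0xA1 <= ch[0] <= 0xFE:
        return "invalid"
    if 0xB0 <= ch[0] <= 0xF7:  # GB 2312 rows 16 to 87 hold the ideographs
        return "ideograph"
    return ROW_CLASSES.get(ch[0], "other")


for sample in (b"A", b"\xba\xba", b"\xa4\xa2", b"\xa6\xc1"):
    print(sample, classify(sample))
```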

Table 1-9 Date Time Routines
Routine Name Function
HSY$DX_DATE_TIME Returns the date and time in local language format.
HSY$DX_TIME Returns the date and time of the system time in local language format.

Table 1-10 Conversion Routines
Routine Name Function
HSY$CHG_KEISEN Converts '0' to '9' and '-' to multi-byte line drawing characters.
HSY$CHG_GENERAL Performs general multi-byte conversion.
HSY$CHG_KANA_HIRA Converts Katakana characters to Hiragana characters.
HSY$CHG_KANA_KATA Converts Hiragana characters to Katakana characters.
HSY$CHG_KANA_KANA Toggles Kana characters to Hiragana or Katakana characters.
HSY$CHG_ROM_FULL Converts half form ASCII to full form ASCII.
HSY$CHG_ROM_HALF Converts full form ASCII to half form ASCII equivalence.
HSY$CHG_ROM_SIZE Toggles the form (full form or half form) of the input character.
HSY$CHG_ROM_UPPER Converts one-byte and multi-byte letters to upper case.
HSY$CHG_ROM_LOWER Converts one-byte and multi-byte letters to lower case.
HSY$CHG_ROM_CASE Toggles the casing of one-byte and multi-byte letters of the input character.
HSY$TRA_KANA_HIRA Converts Katakana character strings to Hiragana character strings.
HSY$TRA_KANA_KATA Converts Hiragana character strings to Katakana character strings.
HSY$TRA_KANA_KANA Toggles Kana character strings to Hiragana or Katakana characters.
HSY$TRA_ROM_FULL Converts half form ASCII to full form ASCII.
HSY$TRA_ROM_HALF Converts full form ASCII to half form ASCII equivalence.
HSY$TRA_ROM_SIZE Toggles the form (full form or half form) of the input string.
HSY$TRA_ROM_UPPER Converts one-byte and multi-byte letters to upper case.
HSY$TRA_ROM_LOWER Converts one-byte and multi-byte letters to lower case.
HSY$TRA_ROM_CASE Toggles the casing of one-byte and multi-byte letters found in the string.
HSY$TRA_SYMBOL Converts the sequence of a one-byte character to a string of multi-byte symbols.
HSY$DX_TRA_KANA_HIRA Converts Katakana character strings to Hiragana character strings.
HSY$DX_TRA_KANA_KATA Converts Hiragana character strings to Katakana character strings.
HSY$DX_TRA_KANA_KANA Toggles Kana character strings to Hiragana or Katakana character strings.
HSY$DX_TRA_ROM_FULL Converts half form ASCII to full form ASCII.
HSY$DX_TRA_ROM_HALF Converts full form ASCII to half form ASCII equivalence.
HSY$DX_TRA_ROM_SIZE Toggles the form (full form or half form) of the input string.
HSY$DX_TRA_ROM_UPPER Converts one-byte and multi-byte letters to upper case.
HSY$DX_TRA_ROM_LOWER Converts one-byte and multi-byte letters to lower case.
HSY$DX_TRA_ROM_CASE Toggles the casing of one-byte and multi-byte letters found in the input string.
HSY$DX_TRA_SYMBOL Converts the sequence of a one-byte character to a string of multi-byte symbols.
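
A half-form to full-form conversion can be sketched as a byte-level mapping. The sketch assumes the GB 2312 layout, in which the full-form counterpart of a printable ASCII code is the two-byte code 0xA3 followed by the ASCII code plus 0x80, and the space maps to 0xA1A1. The function name is hypothetical, and a real conversion table may treat individual symbols differently.

```python
def to_full_form(buf: bytes) -> bytes:
    """Convert half-form (one-byte) ASCII in buf to assumed full-form codes."""
    out = bytearray()
    pos = 0
    while pos < len(buf):
        byte = buf[pos]
        if byte >= 0x80 and pos + 1 < len(buf):
            # Already a two-byte character: copy both bytes unchanged.
            out += buf[pos:pos + 2]
            pos += 2
            continue
        if byte == 0x20:
            out += b"\xa1\xa1"               # assumed two-byte space
        elif 0x21 <= byte <= 0x7E:
            out += bytes([0xA3, byte + 0x80])  # assumed full-form row
        else:
            out.append(byte)                 # control codes pass through
        pos += 1
    return bytes(out)


print(to_full_form(b"A1"))
print(to_full_form(b"\xba\xbaA"))
```

The string-level routines in the table differ from the character-level ones only in scope: the same per-character mapping is applied across the whole buffer, while existing two-byte characters are left alone.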

