HP OpenVMS RTL Library (LIB$) Manual
The TPA$K_LENGTH0 symbol represents the number of bytes (36) in the
basic 32-bit argument block. You can use this symbol to determine the
start of any user-defined fields you add to the argument block.
Table lib-10 describes the argument block fields.
Figure lib-20 LIB$T[ABLE_]PARSE 32-Bit Argument Block
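The following Python sketch models the basic 32-bit argument block so that the length arithmetic can be checked. It is only an illustration, not OpenVMS code: the field order follows the BASIC declaration in Section 2.2, the class names and the two user-defined fields (`report_count`, `verbose`) are hypothetical, and the real definitions come from $TPADEF or tpadef.h.

```python
import ctypes


class ArgBlock32(ctypes.Structure):
    """Illustrative model of the basic 32-bit argument block (nine longwords)."""
    _fields_ = [
        ("count", ctypes.c_uint32),      # TPA$L_COUNT
        ("options", ctypes.c_uint32),    # TPA$L_OPTIONS
        ("stringcnt", ctypes.c_uint32),  # TPA$L_STRINGCNT
        ("stringptr", ctypes.c_uint32),  # TPA$L_STRINGPTR
        ("tokencnt", ctypes.c_uint32),   # TPA$L_TOKENCNT
        ("tokenptr", ctypes.c_uint32),   # TPA$L_TOKENPTR
        ("char", ctypes.c_uint32),       # longword in which TPA$B_CHAR resides
        ("number", ctypes.c_uint32),     # TPA$L_NUMBER
        ("param", ctypes.c_uint32),      # TPA$L_PARAM
    ]


class MyArgBlock(ArgBlock32):
    """Hypothetical extension: user-defined fields follow the basic block."""
    _fields_ = [
        ("report_count", ctypes.c_uint32),  # hypothetical user field
        ("verbose", ctypes.c_uint32),       # hypothetical user field
    ]


# Stands in for TPA$K_LENGTH0: the size of the basic block in bytes.
TPA_K_LENGTH0 = ctypes.sizeof(ArgBlock32)

print(TPA_K_LENGTH0)                   # size of the basic block
print(MyArgBlock.report_count.offset)  # where the first user field starts
```

In this model the first user-defined field starts at the byte offset given by the basic block length, which is the role the text assigns to TPA$K_LENGTH0.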
The 64-bit LIB$T[ABLE_]PARSE argument block accommodates quadword
addresses and values as well as input tokens whose binary
representations require no more than 64 bits.
LIB$T[ABLE_]PARSE defines the first 10 quadwords of the 64-bit argument
block as shown in Figure lib-21. You can add fields to the end of the
argument block as a means of passing data to action routines.
The TPA64$K_LENGTH0 symbol represents the number of bytes (80) in the
basic 64-bit argument block. You can use this symbol to determine the
start of any user-defined fields you add to the argument block.
Table lib-10 describes the argument block fields.
Figure lib-21 LIB$T[ABLE_]PARSE 64-Bit Argument Block (Alpha
and I64 Only)
1.4.2 Symbolic Names for Argument Block Fields
The fields in each type of argument block have symbolic names.
Figure lib-20 and Figure lib-21 show some of these symbolic names. This
section tells you how to access these names in some of the most
commonly used languages:
- MACRO assembly language --- MACRO language programs can define both
the 32-bit and 64-bit argument block names by invoking the macro
$TPADEF (automatically loaded from the system macro library). The field
names define the byte offset of the field from the start of the
argument block. This includes the bit fields ($V_names). In addition,
bit mask values ($M_names) are available for the bit fields.
- BLISS --- The field names are also available to BLISS programs from
the system libraries SYS$LIBRARY:STARLET.L32 and
SYS$LIBRARY:STARLET.L64. Each name (except for the $M_names) is defined
as a fixed-reference macro that operates on a byte-based block. The
$M_names are defined as literals.
- C --- The same field names are available to C programs from the
tpadef.h file. For the 32-bit and 64-bit argument
blocks, the names are defined as elements of the tpadef and
tpa64def structures, respectively.
See Section 2.2 for an example of an argument block
declaration.
1.4.3 32-Bit and 64-Bit Argument Block Fields
Table lib-10 describes the fields of the 32-bit and 64-bit argument
blocks.
Note that most fields have two symbols and one description. The symbol
that begins with the prefix TPA$ is used with a 32-bit argument block,
while the symbol that begins with the prefix TPA64$ is used with a
64-bit argument block. To prevent cumbersome explanations,
Table lib-10 uses only the main part of a field name, without the
prefix used in the actual code, when referring to a field for both the
32-bit and 64-bit argument blocks. For example, the options field is
referred to as OPTIONS rather than specifying both TPA$L_OPTIONS and
TPA64$L_OPTIONS. The complete field name is used only when referring to
a field for one particular form of argument block.
Table lib-10 LIB$T[ABLE_]PARSE Argument Block Fields
Symbol |
Description |
TPA$L_COUNT
TPA64$L_COUNT
|
A longword containing the value of TPA$K_COUNT0 for 32-bit argument
blocks or TPA64$K_COUNT0 for 64-bit argument blocks. TPA$K_COUNT0 is
defined to be 8. TPA64$K_COUNT0 is defined to be -1.
If the value contained in this longword is greater than or equal to
8, LIB$T[ABLE_]PARSE treats the argument block as a 32-bit argument
block. If the value is -1, LIB$T[ABLE_]PARSE treats the argument block
as a 64-bit argument block.
For LIB$TPARSE (VAX only), a longword containing the number of
longwords that make up the rest of the argument block. This longword
functions as the argument count when the argument block becomes the
argument list to an action routine. This field must contain a value
that is greater than or equal to the value of TPA$K_COUNT0, whose
numeric value is 8.
|
TPA$L_OPTIONS
TPA64$L_OPTIONS
|
Contains various flag bits and other options. The defined flags are as
follows:
- TPA$V_BLANKS, TPA64$V_BLANKS (footnote 1) --- Setting this bit causes
LIB$T[ABLE_]PARSE to process blanks and tabs explicitly, rather than
treating them as separators. See Section 3.2 for information about
processing blanks.
- TPA$V_ABBRFM, TPA64$V_ABBRFM (footnote 1) --- Setting this bit allows
keywords to be abbreviated to any length. If an abbreviated keyword
string is ambiguous, the first eligible transition listed in the state
matches it.
- TPA$V_ABBREV, TPA64$V_ABBREV (footnote 1) --- Setting this bit allows
keywords to be abbreviated to the shortest length that is unambiguous
in that state. See the Abbreviating Keywords section.
- TPA$V_AMBIG, TPA64$V_AMBIG (footnote 1) --- LIB$T[ABLE_]PARSE sets
this bit when it has detected an ambiguous keyword string in the
current state.
The OPTIONS field also contains the following option:
TPA$B_MCOUNT, TPA64$B_MCOUNT --- This byte contains the
minimum number of characters allowed for the abbreviation of a keyword.
If its value is zero, abbreviations are not allowed. Preventing
ambiguity is the responsibility of the state table designer. If the
ABBRFM or ABBREV flag is set, LIB$T[ABLE_]PARSE ignores MCOUNT. MCOUNT
is the high byte of the OPTIONS field.
|
TPA64$Q_STRINGDESC
|
For a 64-bit argument block, the three quadwords starting with
TPA64$Q_STRINGDESC form an embedded 64-bit descriptor for the input
string (footnote 2).
On entry, LIB$T[ABLE_]PARSE writes the fields of
TPA64$Q_STRINGDESC as follows:
DSC64$B_CLASS = DSC64$K_CLASS_S
DSC64$B_DTYPE = DSC64$K_DTYPE_T
DSC64$L_MBMO = -1
DSC64$W_MBO = +1
|
TPA$L_STRINGCNT
TPA64$Q_STRINGCNT
|
Contains the number of characters remaining in the input string.
For a 32-bit argument block, TPA$L_STRINGCNT and TPA$L_STRINGPTR
form an embedded 32-bit descriptor for the input string (footnote 2).
|
|
For both 32-bit and 64-bit argument blocks:
- You must initialize the STRINGCNT and STRINGPTR fields to describe
the input string. Use LIB$ANALYZE_SDESC or LIB$ANALYZE_SDESC_64 to read
the string length and address from the string's descriptor and write
them in STRINGCNT and STRINGPTR, respectively.
- Before LIB$T[ABLE_]PARSE calls an action routine, it modifies
STRINGCNT and STRINGPTR to describe the remainder of the input string.
- When LIB$T[ABLE_]PARSE returns, STRINGCNT and STRINGPTR describe
the portion of the input string that LIB$T[ABLE_]PARSE did not process.
This occurs whether LIB$T[ABLE_]PARSE returns success or failure.
|
TPA$L_STRINGPTR
TPA64$Q_STRINGPTR
|
Contains the address of the remainder of the string being parsed.
|
TPA64$Q_TOKENDESC
|
For a 64-bit argument block, the three quadwords starting with
TPA64$Q_TOKENDESC form an embedded 64-bit descriptor for the current
token (footnote 2).
On entry, LIB$T[ABLE_]PARSE writes the fields of
TPA64$Q_TOKENDESC as follows:
DSC64$B_CLASS = DSC64$K_CLASS_S
DSC64$B_DTYPE = DSC64$K_DTYPE_T
DSC64$L_MBMO = -1
DSC64$W_MBO = +1
|
TPA$L_TOKENCNT
TPA64$Q_TOKENCNT
|
Contains the number of characters in the current token.
For a 32-bit argument block, TPA$L_TOKENCNT and TPA$L_TOKENPTR form
an embedded 32-bit descriptor for the input token (footnote 2).
For both 32-bit and 64-bit argument blocks, LIB$T[ABLE_]PARSE
updates TOKENCNT and TOKENPTR to reflect the current token.
|
TPA$L_TOKENPTR
TPA64$Q_TOKENPTR
|
Contains the address of the current token.
|
TPA$B_CHAR (footnote 3)
TPA64$B_CHAR (footnote 3)
|
Contains the character matched by one of the single-character symbol
types: 'x', TPA$_ANY, TPA$_ALPHA, or TPA$_DIGIT.
|
TPA$L_NUMBER (footnote 3)
TPA64$Q_NUMBER (footnote 3)
|
Contains the binary representation of a numeric token that matches
TPA$_OCTAL, TPA$_DECIMAL, TPA$_HEX, TPA$_UIC, or TPA$_IDENT. For a
64-bit argument block, it can also contain the binary representation of
a numeric token that matches TPA$_DECIMAL_64, TPA$_OCTAL_64, or
TPA$_HEX_64.
|
(Alpha and I64 specific) TPA$Q_NUMBER (footnote 3)
|
For a 32-bit argument block on an Alpha or I64 system, contains the
binary representation of a numeric token that matches TPA$_DECIMAL_64,
TPA$_OCTAL_64, or TPA$_HEX_64. LIB$T[ABLE_]PARSE converts the numeric
token using the appropriate radix before storing it in the TPA$Q_NUMBER
field.
In the 32-bit argument block, TPA$Q_NUMBER overlays TPA$L_NUMBER
and the longword in which TPA$B_CHAR resides.
|
TPA$L_PARAM
TPA64$Q_PARAM
|
Contains the optional 32-bit argument value that the state transition
supplies. For a 64-bit argument block, LIB$T[ABLE_]PARSE sign-extends
the argument value before storing it in TPA64$Q_PARAM.
|
Footnote 1: LIB$T[ABLE_]PARSE defines bit masks TPA$M_BLANKS,
TPA$M_ABBRFM, TPA$M_ABBREV, and TPA$M_AMBIG for use by languages such
as MACRO. These bit masks correspond to the location of the $V_ fields
in the OPTIONS field.
Footnote 2: See the HP OpenVMS Calling Standard manual for information
about string descriptor fields.
Footnote 3: LIB$T[ABLE_]PARSE modifies TPA$Q_NUMBER prior to calling an
action routine from a transition whose symbol type is listed in the
TPA$Q_NUMBER Description column. It does not modify this field while
executing a transition that specifies any other symbol type.
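Three of the rules stated in Table lib-10 can be expressed as small bit-level helpers. The Python sketch below is an illustration only: the function names are hypothetical, the "unspecified" result for other COUNT values is an assumption (the table does not say what happens for them), and the individual flag bit positions are deliberately left out because the table does not give them.

```python
TPA_K_COUNT0 = 8      # value given in Table lib-10
TPA64_K_COUNT0 = -1   # value given in Table lib-10


def block_kind(count):
    """Classify an argument block from its signed COUNT longword."""
    if count == TPA64_K_COUNT0:
        return "64-bit"
    if count >= TPA_K_COUNT0:
        return "32-bit"
    return "unspecified"  # assumption: the table does not cover this case


def set_mcount(options, mcount):
    """Store MCOUNT in the high byte (bits 24-31) of a 32-bit OPTIONS value."""
    if not 0 <= mcount <= 0xFF:
        raise ValueError("MCOUNT must fit in one byte")
    return (options & 0x00FFFFFF) | (mcount << 24)


def get_mcount(options):
    """Read MCOUNT back from the high byte of OPTIONS."""
    return (options >> 24) & 0xFF


def sign_extend_32_to_64(value):
    """Return the 64-bit two's-complement pattern of a 32-bit value,
    the operation the table describes for TPA64$Q_PARAM."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value
```

For example, `set_mcount(0, 4)` yields an OPTIONS value whose high byte is 4, which asks for keyword abbreviations of at least four characters as long as neither ABBRFM nor ABBREV is set.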
2 Coding and Using a Simple State Table
LIB$T[ABLE_]PARSE can parse programming languages, command languages,
or any other grammar for which a deterministic parser is the best
choice.
To code a program to use LIB$T[ABLE_]PARSE, perform the following steps:
- Set up state tables to implement the language's grammar (see
Section 2.1).
- Define the argument block and other common variables (see
Section 2.2).
- Include the call to LIB$T[ABLE_]PARSE in the main program (see
Section 2.3).
This section provides examples that demonstrate the use of
LIB$T[ABLE_]PARSE to perform these three steps. The examples parse the
command language of a simple report management utility. This
hypothetical utility allows a user to perform the following activities:
- Obtain a list of available reports (SHOW command).
- Read reports on the terminal (READ command).
- Print reports (PRINT command).
- Store new reports (FILE command).
The examples use the BASIC programming language for everything except
the state and keyword tables, which are coded in BLISS.
This simple state table program does not use any action routines or
other arguments. See Section 3 for information about how to
use these features of LIB$T[ABLE_]PARSE.
2.1 Setting Up a State Table
A state table associates the parser's alphabet with a set of possible
transitions.
It is often helpful to create a graphical representation of a state
table before attempting to code it. The following section illustrates
two possible approaches.
2.1.1 Diagramming the Transitions
One way to set up these tables is to start from a transition diagram of
the language you want to parse. (If you do not know how to construct a
transition diagram, you might find it helpful to read an introductory
text about compiler design and construction before you start.) Each
circle represents a state in the state table. Each arrow, labeled with
an input option, represents a transition out of one state to another
state or within the same state.
Figure lib-22 shows a transition diagram for the hypothetical utility
described in this section.
Figure lib-22 Transition Diagram for a Hypothetical
Utility
Another technique for developing a state table starts with a tabular
diagram in which the first column is the starting state, the second
column identifies the input token, or keyword, and the third gives the
resultant state.
Figure lib-23 is a tabular diagram of the utility that appears in
Figure lib-22.
Figure lib-23 Tabular Diagram of a Hypothetical Utility
In this case, each unique entry in the Starting State or Resulting
State column represents a state in the state table. Each entry in the
Input column represents a possible transition out of the state in the
Starting State column to a state in the Resulting State column.
2.1.2 Coding a State Table
For both MACRO and BLISS, you begin the state table with an $INIT_STATE
macro. If you use MACRO to define your state table, then:
- Use the $STATE macro to define each state.
- Follow each $STATE macro with one instance of the $TRAN macro for
each transition from this state to another state or within the same
state.
If you use BLISS to define the state table, then:
- Use the $STATE macro to define each state and its associated
transitions.
Note
The order in which you define the states is important. If you do not
specify a target state for a transition, LIB$T[ABLE_]PARSE transfers
control to the next state in the state table.
The following MACRO and BLISS examples code the state table for the
hypothetical utility diagrammed in Figure lib-22 and Figure lib-23.
Note that neither of these state tables includes the error state,
because LIB$T[ABLE_]PARSE automatically generates an error if the input
token does not match a transition in the current state. To provide a
transition to your own error state, code the last transition in the
state with the TPA$_LAMBDA symbol type and specify a transition to your
error state. The TPA$_LAMBDA symbol type matches any input token.
The state table, coded using MACRO, for this simple language looks like
this:
        .TITLE  simplelang
        .ident  'v1'
;+
; Define the LIB$TABLE_PARSE control symbols
;-
        $TPADEF
        $INIT_STATE SIMPLE_LANGUAGE_TABLE, SIMPLE_KEYWORD_TABLE
        $STATE  START
        $TRAN   'PRINT', STATE1
        $TRAN   'READ', STATE1
        $TRAN   'FILE', STATE1
        $TRAN   'SHOW', STATE1
        $STATE  STATE1
        $TRAN   TPA$_STRING, STATE1
        $TRAN   TPA$_EOS, TPA$_EXIT
        $END_STATE
        .END
Using the BLISS macros yields the following state table definition:
MODULE simple_statetable =
BEGIN
!+
! These libraries contain the macros and other definitions
! needed to generate the state tables.
!-
LIBRARY 'SYS$LIBRARY:STARLET';
LIBRARY 'SYS$LIBRARY:TPAMAC';
!+
! UFD_STATE is the name you are giving the state table.
! UFD_KEY names the keyword table.
! Be sure to use the same names in the call to LIB$T[ABLE_]PARSE.
!-
$INIT_STATE (UFD_STATE, UFD_KEY);
!+
! Read the command name (to the first blank in the command).
! Each string is a keyword; you are limited to 220 keywords
! per state table.
!-
$STATE (START,                  ! Be careful of your punctuation here.
    ('CREATE', STATE1),         ! Each transition is surrounded by
    ('FILE', STATE1),           ! parentheses; each entry except the
    ('PRINT', STATE1),          ! last is followed by a comma.
    ('READ', STATE1)
    );
$STATE (STATE1,
    (TPA$_STRING, STATE1),      ! If there is more than one report name
    (TPA$_EOS, TPA$_EXIT)       ! specified, go back and process it.
    );                          ! Exit when done.
END
ELUDOM                          ! End of module SIMPLE_STATETABLE
Assemble or compile this module as you would any other program module.
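To see how such a table drives a parse, the following Python sketch walks the same two states as the BLISS example. It is a simplified model rather than a description of the LIB$T[ABLE_]PARSE implementation: the tuple layout, the `parse` and `matches` names, the whitespace tokenizer, and the rule that any remaining token counts as a string are all assumptions made for illustration.

```python
KEYWORD, STRING, EOS = "keyword", "string", "eos"  # hypothetical symbol-type tags
EXIT = "EXIT"                                      # stands in for TPA$_EXIT

# Ordered like the BLISS example: (state name, [(symbol type, keyword, target)]).
# A target of None means "continue with the next state in the table".
STATE_TABLE = [
    ("START", [(KEYWORD, "CREATE", "STATE1"),
               (KEYWORD, "FILE", "STATE1"),
               (KEYWORD, "PRINT", "STATE1"),
               (KEYWORD, "READ", "STATE1")]),
    ("STATE1", [(STRING, None, "STATE1"),
                (EOS, None, EXIT)]),
]


def matches(symbol, keyword, token):
    """Decide whether one transition accepts the current token."""
    if symbol == EOS:
        return token is None
    if token is None:
        return False
    if symbol == KEYWORD:
        return token == keyword
    return symbol == STRING  # simplification: any remaining token is a string


def parse(command):
    """Return (success, unprocessed tokens) for a command line."""
    tokens = command.split()  # blanks act as separators, as in the default mode
    names = [name for name, _ in STATE_TABLE]
    index, pos = 0, 0
    while True:
        token = tokens[pos] if pos < len(tokens) else None
        for symbol, keyword, target in STATE_TABLE[index][1]:
            if matches(symbol, keyword, token):
                break
        else:
            return False, tokens[pos:]  # no transition matched: syntax error
        if token is not None:
            pos += 1                    # consume the matched token
        if target == EXIT:
            return True, tokens[pos:]
        index = index + 1 if target is None else names.index(target)


print(parse("PRINT REPORT1 REPORT2"))
print(parse("DELETE REPORT1"))
```

The second call fails in the START state because no keyword matches, which mirrors the automatic error described above; the returned remainder corresponds to the portion of the input that was not processed.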
2.2 Defining the Argument Block
After you have set up the state tables, you need to declare the
LIB$T[ABLE_]PARSE argument block in such a way that both your program
and LIB$T[ABLE_]PARSE can use it. This means the data must be defined
in an area common to the calling program and the program module
containing the state table definitions.
In most programming languages you will use a combination of EXTERNAL
statements and common data definitions to create and access a separate
data PSECT. If you do not know what mechanisms the language you are
using provides, consult the documentation for that language.
The following example shows the LIB$T[ABLE_]PARSE argument block
defined for use in a BASIC program.
!LIB$T[ABLE_]PARSE requires that TPA$K_COUNT0 be eight.
DECLARE INTEGER CONSTANT TPA$K_COUNT0 = 8, &
BTPA$L_COUNT = 0, &
BTPA$L_OPTIONS=1, &
BTPA$L_STRINGCNT=2, &
BTPA$L_STRINGPTR=3, &
BTPA$L_TOKENCNT=4, &
BTPA$L_TOKENPTR=5, &
BTPA$B_CHAR=6, &
BTPA$L_NUMBER=7, &
BTPA$L_PARAM=8
!+
! The LIB$T[ABLE_]PARSE argument block.
!-
MAP (TPARSE_BLOCK) LONG TPARSE_ARRAY (TPA$K_COUNT0)
!+
! Redefining the map allows you to use the standard
! LIB$T[ABLE_]PARSE symbolic names. TPA$L_STRINGCNT,
! for example, references the same storage location
! as TPARSE_ARRAY(2) and TPARSE_ARRAY(BTPA$L_STRINGCNT).
!-
MAP (TPARSE_BLOCK) LONG &
TPA$L_COUNT , &
TPA$L_OPTIONS, &
TPA$L_STRINGCNT, &
TPA$L_STRINGPTR, &
TPA$L_TOKENCNT, &
TPA$L_TOKENPTR, &
TPA$B_CHAR, &
TPA$L_NUMBER, &
TPA$L_PARAM
Before your program can call LIB$T[ABLE_]PARSE, it must place the
necessary information in the argument block.
The example utility does not need to set any flags because it uses the
LIB$T[ABLE_]PARSE defaults for options such as blanks processing and
abbreviations. However, it must put the length and address of the
string to be parsed into the TPA$L_STRINGCNT and TPA$L_STRINGPTR
fields, respectively.
The address and the length of the string to be parsed are available in
the descriptor of the input string (called COMMAND_LINE in the
following program). However, BASIC, like most high-level languages,
does not allow you to look at the descriptors of your strings. Instead,
you can use LIB$ANALYZE_SDESC or LIB$ANALYZE_SDESC_64 to read the
length and address from the string descriptor and place them in the
argument block.
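The information that LIB$ANALYZE_SDESC extracts can be pictured with a small Python model of a 32-bit fixed-length string descriptor: a word length, a data-type byte, a class byte, and a longword address. This is a sketch under stated assumptions, not the routine itself: the function name `analyze_sdesc` and the address are hypothetical, the constant values are taken from memory of the Calling Standard and should be checked there, and the real routine also handles descriptor classes that this model rejects.

```python
import struct

DSC_K_DTYPE_T = 14  # character-string data type (check against the Calling Standard)
DSC_K_CLASS_S = 1   # fixed-length string class (check against the Calling Standard)


def analyze_sdesc(descriptor):
    """Return (length, address) from an 8-byte class-S string descriptor."""
    length, dtype, dclass, address = struct.unpack("<HBBI", descriptor)
    if dclass != DSC_K_CLASS_S:
        raise ValueError("this sketch handles class-S descriptors only")
    return length, address


# Build a descriptor for a command string at a made-up address.
command = b"READ REPORT"
fake_address = 0x0007A000  # hypothetical; a real address comes from the running program
desc = struct.pack("<HBBI", len(command), DSC_K_DTYPE_T, DSC_K_CLASS_S, fake_address)

# These two values are what the program stores in STRINGCNT and STRINGPTR.
stringcnt, stringptr = analyze_sdesc(desc)
print(stringcnt, hex(stringptr))
```

Because the length occupies only a word in this descriptor form, the routine returns it as a word, a point the comments in the program in Section 2.3 also mention.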
2.3 Coding the Call to LIB$T[ABLE_]PARSE
The following example demonstrates calling LIB$T[ABLE_]PARSE from a
high-level language (BASIC). This program uses the BLISS state table
described in Section 2.1.2.
5 %TITLE "BASIC Program to Call LIB$T[ABLE_]PARSE"
OPTION TYPE=EXPLICIT
!+
! COMMAND_LINE is the string to receive the input
! command from the terminal.
! ERROR_MSG_TEXT is the system error message
! returned from LIB$SYS_GETMSG
! (used in the error handling routine)
!-
DECLARE STRING COMMAND_LINE, ERROR_MSG_TEXT
!+
! RET_STATUS receives the status from the system calls.
! SAVE_STATUS is used when an error occurs
! and the error handling routine calls
! LIB$SYS_GETMSG to obtain the error text.
!-
DECLARE LONG RET_STATUS, SAVE_STATUS
!+
! UFD_STATE is the address of the state table.
! UFD_KEY is the address of the key table.
! Both addresses are set up by the macros in module
! SIMPLE_STATETABLE.
!-
EXTERNAL LONG UFD_STATE, UFD_KEY
!+
! To allow us to compare returned statuses more easily.
!-
EXTERNAL INTEGER CONSTANT SS$_NORMAL, &
LIB$_SYNTAXERR, &
LIB$_INVTYPE
!+
! This program calls the following Run-Time Library
! routines:
!
! LIB$T[ABLE_]PARSE to parse the input string
!
! LIB$ANALYZE_SDESC to get the length and starting
! address of the command string and place them
! in the LIB$T[ABLE_]PARSE argument block.
!
! LIB$SYS_GETMSG to find the facility, severity, and text
! of any system errors that occur
! during program execution.
!-
EXTERNAL LONG FUNCTION LIB$TABLE_PARSE, &
LIB$ANALYZE_SDESC, &
LIB$SYS_GETMSG
!+
20 ! This file defines the argument block that is passed
! to LIB$T[ABLE_]PARSE. It also defines subscripts that
! make it easier to access the array.
!
! Keeping the argument block definitions in a separate
! file makes them easier to modify and lets other
! programs use the same definitions.
!-
%INCLUDE "SIMPLE_TPARSE_BLOCK"
50 ON ERROR GOTO ERROR_HANDLER
60 !+
! LIB$T[ABLE_]PARSE requires that TPA$L_COUNT, the
! first field in the argument block, have a value
! of TPA$K_COUNT0, whose value is 8.
!-
TPA$L_COUNT = TPA$K_COUNT0
75 !+
! Prompt at the terminal for the user's action.
! A real utility should provide a friendlier,
! clearer interface.
!-
GET_INPUT: PRINT "Your options are: " , " READ report "
PRINT , " FILE report "
PRINT , " PRINT report "
PRINT , " CREATE report "
PRINT
INPUT "What would you like to do"; COMMAND_LINE
!+
! Get the length and starting address of the command line
! and place them in the LIB$T[ABLE_]PARSE argument block. Note
! that LIB$ANALYZE_SDESC stores the length as a word.
!-
RET_STATUS = LIB$ANALYZE_SDESC (COMMAND_LINE BY DESC, &
TPARSE_ARRAY (BTPA$L_STRINGCNT) BY REF, &
TPARSE_ARRAY (BTPA$L_STRINGPTR) BY REF)
IF RET_STATUS <> SS$_NORMAL THEN
GOTO ERROR_HANDLER
END IF
100 !+
! Call LIB$T[ABLE_]PARSE to process the input string.
!
! Note that LIB$T[ABLE_]PARSE expects to receive its arguments
! by reference, while BASIC's default for arrays and
! strings is by descriptor. Therefore the BY REF
! clauses are required. Without them, LIB$T[ABLE_]PARSE
! cannot find the input string
! and the parse will always fail.
!-
RET_STATUS = LIB$TABLE_PARSE (TPARSE_ARRAY () BY REF, &
UFD_STATE BY REF, &
UFD_KEY BY REF )
!+
! This simple program provides no information except that
! a valid command was entered. The next section discusses
! techniques for gathering more information.
!-
IF RET_STATUS = SS$_NORMAL
!+
! For now, exit on success.
!-
THEN PRINT "Parse successful"
GOTO 9999
!+
! If the parse failed, give the user a chance to try again.
!-
ELSE IF RET_STATUS = LIB$_SYNTAXERR THEN
PRINT "You did not enter a valid command."
PRINT "Please try again."
GOTO GET_INPUT
!+
! If a more serious error occurred, inform the user
! and exit.
!-
ELSE
Goto ERROR_HANDLER
END IF
END IF
500 ERROR_HANDLER: SAVE_STATUS = RET_STATUS
RET_STATUS = LIB$SYS_GETMSG (SAVE_STATUS,,ERROR_MSG_TEXT)
PRINT "Something went wrong."
PRINT ERL, ERROR_MSG_TEXT
RESUME 9999
9999 END
Compile this program as you would any other BASIC program.
When both the state tables and the main program have been compiled,
link them together to form a single executable image, as follows:
$ LINK SIMPLANG,SIMPLANG_STATETABLE
3 Using Advanced LIB$T[ABLE_]PARSE Features
The LIB$T[ABLE_]PARSE call in the previous program tells you that the
command the user entered was valid, but nothing else---not even which
command was entered. A program usually needs more information than this.
The following sections describe some of the more complicated ways to
process input strings or to gather extra information for your program,
including:
- Action routines (see Section 3.1)
- Blanks in the input string (see Section 3.2)
- Special characters in the input string (see Section 3.3)
- Abbreviated keywords (see Section 3.4)
- Subexpressions (see Section 3.5)
- Modular use of LIB$T[ABLE_]PARSE (see Section 3.6)
3.1 Using Action Routines
After LIB$T[ABLE_]PARSE finds a match between a transition and the
leading portion of the input string, it determines if the transition
that made the match specified an action routine. If it did,
LIB$T[ABLE_]PARSE stores the value of the transition's
argument longword, if any, in the argument block PARAM
field and calls the action routine.
- If the action routine returns success, LIB$T[ABLE_]PARSE processes
the mask or msk-adr arguments, if
any, and continues to execute the transition as it would if there was
no action routine.
- If the action routine returns failure, LIB$T[ABLE_]PARSE does not
execute the transition and continues attempting to match successive
transitions.
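The accept-or-reject behavior described above can be modeled in a few lines of Python. This is a hedged sketch, not the actual calling sequence: the argument block is represented as a dictionary, the tuple layout and every function name are hypothetical, and the mask processing that follows a successful action routine is omitted.

```python
def run_state(transitions, token, arg_block):
    """Try each transition in order and return the chosen target, or None.

    Each transition is (matcher, action, argument, target).
    """
    for matcher, action, argument, target in transitions:
        if not matcher(token):
            continue
        if action is not None:
            if argument is not None:
                arg_block["PARAM"] = argument  # stored before the call
            if not action(arg_block):
                continue  # action returned failure: try the next transition
        return target
    return None  # nothing matched: the caller reports a syntax error


def accept_short_names(arg_block):
    """Hypothetical action routine: accept tokens of at most 8 characters."""
    return len(arg_block["TOKEN"]) <= 8


def record_long_name(arg_block):
    """Hypothetical action routine: count long names using PARAM."""
    arg_block["long_names"] = arg_block.get("long_names", 0) + arg_block["PARAM"]
    return True


transitions = [
    (str.isalnum, accept_short_names, None, "SHORT"),
    (str.isalnum, record_long_name, 1, "LONG"),
]

block = {"TOKEN": "QUARTERLYREPORT"}
print(run_state(transitions, block["TOKEN"], block), block["long_names"])
```

Here the first transition matches the token but its action routine rejects it, so the second transition is tried and taken, which is the fallback behavior the second list item describes.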
3.1.1 Passing Data to an Action Routine
An action routine has only one argument, the argument block. You can
pass additional data to the action routine using:
- The optional argument value that the transition supplies
- Fields you add to the end of the argument block
LIB$TABLE_PARSE and LIB$TPARSE use different linkages for passing the
argument block to the action routine:
- LIB$TABLE_PARSE uses the standard calling mechanism and passes the
argument block, by reference, as the only argument to the action
routine.
Therefore, for OpenVMS systems, action routines are
written as: