HP OpenVMS Systems Documentation

OpenVMS RTL String Manipulation (STR$) Manual


2.3 Selecting String Manipulation Routines

To perform a given string manipulation operation, you can often choose one of several routines from the Run-Time Library. The LIB$, OTS$, and STR$ facilities all contain string copying and dynamic string allocation routines. Furthermore, a MACRO or BLISS program can call several of these routines using either a JSB or CALL entry point.

You should consider the factors discussed in the following sections when choosing a routine to perform the desired operation.

2.3.1 Efficiency

One of the major considerations in choosing among several routines is the efficiency of the various options.

In general, LIB$ and STR$ routines execute more efficiently than the corresponding OTS$ routines. OTS$ routines usually invoke the LIB$ entry point to perform an operation.

JSB entry points usually execute more efficiently than CALL entry points. However, a high-level language cannot explicitly access a JSB entry point. Further, a JSB entry point does not establish a stack frame and executes entirely in the environment of the calling program. This means, for instance, that the called routine cannot establish its own condition handler, so it cannot regain control if an exception occurs during execution. Also, some of the efficiency gained by using the JSB entry point may be lost because the calling routine must explicitly save all of the registers that the called routine uses.

Some routines perform a specific operation that is a subset of a more general capability. These more specialized routines are usually more efficient. For example, if you want to join two strings together, STR$APPEND and STR$PREFIX are more specific, and more efficient, than STR$CONCAT. Similarly, STR$LEFT and STR$RIGHT are subsets of the capabilities of STR$POS_EXTR.

2.3.2 Argument Passing

The mechanism by which a routine passes or receives arguments may also help you to decide among several routines that perform basically the same function.

Routines in the LIB$ and STR$ facilities pass scalar input arguments by reference to CALL entry points and by immediate value to JSB entry points. OTS$ routines pass scalar input arguments by immediate value to all entry points. For most high-level languages, the default passing mechanism is by reference. Thus, if you call a LIB$ or STR$ routine from one of these languages, you do not need to specify the passing mechanism for input scalar arguments.

Some routines require you to set up and pass more arguments than others. For example, some use a single string descriptor, while others require separate arguments for the length and the address of the string. Which routine you choose then depends on the form of the information already available in your program.

2.3.3 Error Handling

Routines from the LIB$, OTS$, and STR$ facilities handle errors in string copying differently:

  • LIB$
    The LIB$ string-copying routines return a completion status. When an output string must be truncated and its length depends on input arguments, LIB$ routines consider this to be a partial success; they therefore return LIB$_STRTRU instead of a severe error. This process corresponds to the convention of many higher-level languages, which do not consider truncation to be an error.
  • OTS$
    The OTS$ string-copying routines signal errors that are considered fatal (such as an invalid descriptor class). In addition, each routine returns in R0 the number of bytes in the source string that were not moved to the destination string. On VAX systems, this matches the result of a MOVC5 instruction. The JSB entry points for OTS$ string-copying routines also leave registers R1 through R5 as they would be after a VAX MOVC5 instruction. See the VAX Architecture Reference Manual for a complete description of the MOVC5 instruction.
  • STR$
    The STR$ string-copying routines generally signal errors instead of returning a completion status. In the case of truncation errors, STR$ routines return an error status with a severity of WARNING (STR$_TRU). STR$ routines consider range errors to be qualified success.
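
The difference in truncation reporting can be modeled with a short sketch. This is Python that only imitates the behavior described above; it does not call the Run-Time Library. The helper names `copy_fixed` and `report` are hypothetical, and blank padding of a short source is an assumption about fixed-length destinations rather than something stated in this section.

```python
def copy_fixed(source: bytes, dest_len: int, fill: bytes = b" "):
    """Model a copy into a fixed-length destination of dest_len bytes.

    Returns (destination, unmoved), where unmoved is the number of source
    bytes that did not fit (the count the text says OTS$ returns in R0).
    """
    moved = source[:dest_len]
    unmoved = len(source) - len(moved)
    # Assumption: a source shorter than the destination is padded with blanks.
    destination = moved + fill * (dest_len - len(moved))
    return destination, unmoved


def report(facility: str, unmoved: int):
    """Show how each facility would report the outcome of the copy."""
    if facility == "OTS$":
        return unmoved                      # byte count, not a status code
    if unmoved == 0:
        return "SS$_NORMAL"
    # LIB$ treats truncation as partial success, STR$ as a warning.
    return {"LIB$": "LIB$_STRTRU", "STR$": "STR$_TRU"}[facility]
```

Copying an 8-byte source into a 5-byte destination leaves 3 bytes unmoved; in this model `report("LIB$", 3)` gives `LIB$_STRTRU`, `report("STR$", 3)` gives `STR$_TRU`, and `report("OTS$", 3)` gives the count itself.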

Table 2-4 indicates the errors and the corresponding message that each facility considers severe.

Table 2-4 Severe Errors, by Facility
Error                        LIB$_      OTS$_      STR$_
Fatal internal error         FATERRLIB  FATINTERR  FATINTERR
Illegal string class         INVSTRDES  INVSTRDES  ILLSTRCLA
Insufficient virtual memory  INSVIRMEM  INSVIRMEM  INSVIRMEM

Some Run-Time Library routines require you to specify the length of a string or the position of a character within a string. When you refer to character positions in a string, the first position is 1. Given a string with length L, containing a substring specified by character positions M to N, the following evaluation rules apply:

  • If M is less than 1, M is considered to equal 1.
  • If M is greater than L, the substring specified is the null string.
  • If N is greater than L, N is considered to equal the length of the source string.
  • If M is greater than N, the substring specified is the null string.

When specifying a substring of length L, the following applies:

  • If L is less than 0, the substring specified is the null string. (A null string is a descriptor with zero length. A descriptor with a nonzero length and a zero pointer generates an error and yields unspecified results.)

If any of these evaluation rules applies, the range error status (qualified success) is returned. STR$POSITION represents the exception to this convention. This routine returns a function value giving the character position of a substring within a string. If the function value is 0, the substring was not found.
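
The evaluation rules for positions M to N can be expressed as a small model. The sketch below is Python that imitates the rules only; the function names `resolve_range` and `substring` are hypothetical, and a Boolean flag stands in for the range error status.

```python
def resolve_range(length: int, m: int, n: int):
    """Apply the evaluation rules to character positions M..N (1-based).

    Returns (start, end, range_error). A result with start > end denotes
    the null string.
    """
    range_error = False
    if m < 1:                       # M less than 1 is treated as 1
        m, range_error = 1, True
    if n > length:                  # N greater than L is treated as L
        n, range_error = length, True
    if m > length or m > n:         # both cases yield the null string
        return 1, 0, True
    return m, n, range_error


def substring(s: str, m: int, n: int):
    """Return (substring, range_error) for positions M..N of s."""
    start, end, range_error = resolve_range(len(s), m, n)
    return s[start - 1:end], range_error
```

For the 6-character string `ABCDEF`, positions 2 to 4 select `BCD` with no range error, positions 4 to 10 select `DEF` with the range error flag set, and positions 5 to 2 select the null string with the flag set.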

2.4 Allocating Resources for Dynamic Strings

This section tells how to use the Run-Time Library string resource allocation routines. These routines allocate virtual memory for a dynamic string and place the address of the allocated memory in a descriptor.

Dynamic strings may be the most convenient type to write, since you need not specify constant length, maximum length, or position for them. However, dynamic strings have some drawbacks:

  • They may cause program execution to be slower at run time.
  • They require larger address space.
  • They are not supported by all OpenVMS Alpha and OpenVMS VAX languages.

In most cases, when you call a Run-Time Library routine to manipulate dynamic strings, the Run-Time Library routine itself allocates the required memory for the string. Your program needs to allocate only the descriptors.

For example, if you are copying a source string into a dynamic destination string, simply use one of the library's string-copying routines. Copy the input string into a dynamic string whose length and address are initialized to zero. The string-copying routine then allocates the space that the calling program needs.

However, if your program must explicitly construct or modify a dynamic string descriptor, it must use the Run-Time Library allocation and deallocation routines. This technique may be necessary, for instance, if you are constructing a string out of components that are not themselves in string form. Further, you can use one of the deallocation routines to free the dynamic string after the string resources are no longer needed, in order to optimize the program's use of resources.

The Run-Time Library provides eight entry points for string resource allocation and deallocation, all with slightly different input arguments, calling techniques, or methods of indicating errors. The following tables summarize these routines and their functions.

The following routines allocate a specified number of bytes of dynamic virtual memory to a specified string descriptor.

Routine JSB Entry Point
LIB$SGET1_DD LIB$SGET1_DD_R6
LIB$SGET1_DD_64 LIB$SGET1_DD_R6
OTS$SGET1_DD OTS$SGET1_DD_R6
STR$GET1_DX STR$GET1_DX_R4
STR$GET1_DX_64 STR$GET1_DX_R4

The following routines return one dynamic string area to free storage, and set the descriptor POINTER and LENGTH fields to zero.

Routine JSB Entry Point
LIB$SFREE1_DD LIB$SFREE1_DD6
OTS$SFREE1_DD OTS$SFREE1_DD6
STR$FREE1_DX STR$FREE1_DX_R4

The following routines return one or more dynamic string areas to free storage, and set the descriptor POINTER and LENGTH fields to zero.

Routine JSB Entry Point
LIB$SFREEN_DD LIB$SFREEN_DD6
OTS$SFREEN_DD OTS$SFREEN_DD6
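
The effect of these routines on a descriptor can be pictured with a toy model. The Python below is only an illustration of the descriptor life cycle described above; the class `DynamicDescriptor` and the functions `get1`, `free1`, and `freen` are hypothetical stand-ins, not the Run-Time Library entry points, and `None` plays the role of a zero POINTER field.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DynamicDescriptor:
    """Toy stand-in for a dynamic string descriptor."""
    length: int = 0                        # LENGTH field
    pointer: Optional[bytearray] = None    # POINTER field (None = zero)


def get1(desc: DynamicDescriptor, nbytes: int) -> None:
    """Model of allocation: attach nbytes of storage to the descriptor."""
    desc.pointer = bytearray(nbytes)
    desc.length = nbytes


def free1(desc: DynamicDescriptor) -> None:
    """Model of deallocation: release storage, zero POINTER and LENGTH."""
    desc.pointer = None
    desc.length = 0


def freen(descs: List[DynamicDescriptor]) -> None:
    """Model of freeing one or more descriptors in a single call."""
    for desc in descs:
        free1(desc)
```

In this model the caller creates only the descriptor, the allocation step supplies the storage, and after deallocation the descriptor is back in its initial zero state and can be reused.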

When you call the dynamic string allocation routines, consider the following factors:

  • When your program calls a string allocation routine, it needs to allocate space only for the string descriptor before making the call. Your program does this with the declaration statements of the particular language, either statically at compile time or dynamically in local stack storage or heap storage.
  • If your routine explicitly allocates dynamic string descriptors in stack storage, it must explicitly free the associated dynamic string areas by calling the LIB$SFREE1_DD, OTS$SFREE1_DD, or STR$FREE1_DX routine. Then your routine must free the storage for the descriptor. After both areas have been freed, your routine can return to the calling program. If the deallocation is not done, the dynamic string area becomes unavailable when the RET instruction removes the descriptors that point to the string area.
  • If a routine has explicitly allocated dynamic string areas, and the routine is then unwound by the Condition Handling Facility (CHF), the allocated address space cannot be referenced again. For this reason, your program should establish a handler that frees the associated dynamic string areas when the SS$_UNWIND condition is signaled. The handler can free these areas by calling one of the deallocation routines. This technique is especially important if a large amount of address space is involved, or if the routine allocates space within a repeating loop.

You can call the string resource allocation routines only from user mode, at asynchronous system trap (AST) or non-AST level. However, be extremely careful if you manipulate dynamic strings at AST level. The string manipulation routines in the Run-Time Library do not prevent the strings that they are manipulating at non-AST level from being modified at AST level.

For example, consider the case in which a string manipulation routine has calculated the lengths and addresses involved in a concatenation operation. This string manipulation routine may be interrupted by an AST. The user, at AST level, may write to the same string, changing its length and address. It is then possible to resume execution of the routine with addresses that are no longer allocated or string lengths that are no longer valid. For this reason, if you use dynamic strings at AST level, you should allocate, use, and deallocate them within the AST code.

The dynamic string manipulation routines are intended for use at user mode only. To manipulate dynamic strings at another access mode, you should allocate and deallocate storage for each string at that access mode to avoid side effects. Link each segment of your program that runs at a different access mode with the /NOSYSSHR qualifier. In this way, you establish a separate copy of the string database for each access mode.

2.4.1 String Zone

All virtual memory for dynamic strings is allocated from a Run-Time Library zone called the string zone.

The string zone has the following benefits:

  • Efficient memory utilization.
  • Allocation and deallocation for long strings (more than 136 bytes for a VAX system and more than 272 bytes for an Alpha system) is twice as fast.
  • Elimination of paging contention with the default zone by isolation of the string virtual memory accesses to a separate zone. A direct side effect of this is that corruptions caused by writing into previously freed strings no longer affect items allocated in the default zone, directly easing the debugging effort for such problems.

Table 2-5 shows attribute values for 32-bit and 64-bit string zones. VAX systems have a 32-bit string zone; Alpha systems have both a 32-bit and a 64-bit string zone.

Table 2-5 String Zone Attributes
Attribute                  32-bit String Zone                       64-bit String Zone
Algorithm                  Quick fit                                Quick fit
Number of lookaside lists  17 (short strings from 8 to 136 bytes)   17 (short strings from 8 to 272 bytes)
Area of initial size       4 pages                                  4 pages
Area of extension size     32 pages                                 32 pages
Block size                 8 bytes                                  16 bytes
Alignment                  Longword boundary                        Quadword boundary
Smallest block size        16 bytes (includes boundary tags)        32 bytes (includes boundary tags)
Boundary tags              Boundary tags are used for long strings  Boundary tags are used for long strings
Page limit                 No page limit                            No page limit
Fill on allocate           No fill on allocate                      No fill on allocate
Fill on free               No fill on free                          No fill on free
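
The figures in Table 2-5 are consistent with one lookaside list per multiple of the block size, since 17 x 8 = 136 and 17 x 16 = 272. The manual does not state this mapping, so the Python sketch below is an inference offered only to make the arithmetic concrete; the function name `lookaside_index` is hypothetical.

```python
from typing import Optional


def lookaside_index(nbytes: int, block_size: int, lists: int = 17) -> Optional[int]:
    """Return the 0-based lookaside list for a request, or None if the
    request is a long string under the assumed mapping.

    Assumption: requests are rounded up to a multiple of block_size and
    each multiple, up to lists * block_size, has its own lookaside list.
    """
    if nbytes <= 0:
        raise ValueError("request size must be positive")
    blocks = -(-nbytes // block_size)      # ceiling division
    return blocks - 1 if blocks <= lists else None
```

Under this assumption, a 136-byte request in the 32-bit zone falls in the last short-string list, while a 137-byte request is treated as a long string; the same boundary falls at 272 and 273 bytes in the 64-bit zone.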


Part 2
STR$ Reference Section

This section contains detailed descriptions of the routines in the OpenVMS RTL String Manipulation (STR$) facility.

STR$ADD

The Add Two Decimal Strings routine adds two decimal strings of digits.

Format

STR$ADD asign ,aexp ,adigits ,bsign ,bexp ,bdigits ,csign ,cexp ,cdigits


RETURNS


OpenVMS usage: cond_value
type: longword (unsigned)
access: write only
mechanism: by value


Arguments

asign


OpenVMS usage: longword_unsigned
type: longword (unsigned)
access: read only
mechanism: by reference

Sign of the first operand. The asign argument is the address of an unsigned longword containing this sign. A value of 0 is considered positive; a value of 1 is considered negative.

aexp


OpenVMS usage: longword_signed
type: longword (signed)
access: read only
mechanism: by reference

Power of 10 by which adigits is multiplied to get the absolute value of the first operand. The aexp argument is the address of a signed longword containing this exponent.

adigits


OpenVMS usage: char_string
type: character string
access: read only
mechanism: by descriptor

Text string of unsigned digits representing the absolute value of the first operand before aexp is applied. The adigits argument is the address of a descriptor pointing to this string. This string must be an unsigned decimal number.

bsign


OpenVMS usage: longword_unsigned
type: longword (unsigned)
access: read only
mechanism: by reference

Sign of the second operand. The bsign argument is the address of an unsigned longword containing the second operand's sign. A value of 0 is considered positive; a value of 1 is considered negative.

bexp


OpenVMS usage: longword_signed
type: longword (signed)
access: read only
mechanism: by reference

Power of 10 by which bdigits is multiplied to get the absolute value of the second operand. The bexp argument is the address of a signed longword containing the second operand's exponent.

bdigits


OpenVMS usage: char_string
type: character string
access: read only
mechanism: by descriptor

Text string of unsigned digits representing the absolute value of the second operand before bexp is applied. The bdigits argument is the address of a descriptor pointing to this string. This string must be an unsigned decimal number.

csign


OpenVMS usage: longword_unsigned
type: longword (unsigned)
access: write only
mechanism: by reference

Sign of the result. The csign argument is the address of an unsigned longword containing the result's sign. A value of 0 is considered positive; a value of 1 is considered negative.

cexp


OpenVMS usage: longword_signed
type: longword (signed)
access: write only
mechanism: by reference

Power of 10 by which cdigits is multiplied to get the absolute value of the result. The cexp argument is the address of a signed longword containing this exponent.

cdigits


OpenVMS usage: char_string
type: character string
access: write only
mechanism: by descriptor

Text string of unsigned digits representing the absolute value of the result before cexp is applied. The cdigits argument is the address of a descriptor pointing to this string. This string is an unsigned decimal number.

Description

STR$ADD adds two strings of decimal numbers (a and b). Each number to be added is passed to STR$ADD in three arguments:
  1. xdigits: the string portion of the number
  2. xexp: the power of ten needed to obtain the absolute value of the number
  3. xsign: the sign of the number

The value of the number x is derived by multiplying xdigits by 10^xexp and applying xsign. For example, if xdigits is '2', xexp is 3, and xsign is 1, then the number represented by the x arguments is 2 * 10^3 with a negative sign, or -2000.

The result of the addition c is also returned in those three parts.
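
The arithmetic on the three-part representation can be checked with a short model. The Python below only imitates the description above; it is not the routine's implementation, the function name `add_decimal` is hypothetical, and the choice to express the result at the smaller of the two exponents is an assumption (the real routine may normalize its result differently).

```python
def add_decimal(asign: int, aexp: int, adigits: str,
                bsign: int, bexp: int, bdigits: str):
    """Add two numbers given as (sign, exponent, digit string).

    Returns (csign, cexp, cdigits) using the same conventions:
    sign 0 is positive, sign 1 is negative.
    """
    # Assumption: express both operands at the smaller exponent so the
    # addition can be done on plain integers.
    cexp = min(aexp, bexp)
    a = int(adigits) * 10 ** (aexp - cexp)
    b = int(bdigits) * 10 ** (bexp - cexp)
    if asign:
        a = -a
    if bsign:
        b = -b
    total = a + b
    csign = 1 if total < 0 else 0
    return csign, cexp, str(abs(total))
```

With the operands from the example later in this section (A = -1000 and B = .0002), `add_decimal(1, 3, "1", 0, -4, "2")` yields sign 1, exponent -4, and digits `9999998`, which agrees with the output listed there.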


Condition Values Returned

SS$_NORMAL Routine successfully completed.
STR$_TRU String truncation warning. The destination string could not contain all the characters in the result string.

Condition Values Signaled

LIB$_INVARG Invalid argument.
STR$_FATINTERR Fatal internal error. An internal consistency check has failed. This usually indicates an internal error in the Run-Time Library and should be reported to your Compaq support representative.
STR$_ILLSTRCLA Illegal string class. The class code found in the class field of a descriptor is not a string class code allowed by the OpenVMS calling standard.
STR$_INSVIRMEM Insufficient virtual memory. STR$ADD could not allocate heap storage for a dynamic or temporary string.
STR$_WRONUMARG Wrong number of arguments.


Example


100 !+
    ! This is a sample arithmetic program
    ! showing the use of STR$ADD to add
    ! two decimal strings.
    !-

    ASIGN% = 1%
    AEXP% = 3%
    ADIGITS$ = '1'
    BSIGN% = 0%
    BEXP% = -4%
    BDIGITS$ = '2'
    CSIGN% = 0%
    CEXP% = 0%
    CDIGITS$ = '0'
    PRINT "A = "; ASIGN%; AEXP%; ADIGITS$
    PRINT "B = "; BSIGN%; BEXP%; BDIGITS$
    CALL STR$ADD        (ASIGN%, AEXP%, ADIGITS$, &
                        BSIGN%, BEXP%, BDIGITS$,  &
                        CSIGN%, CEXP%, CDIGITS$)
    PRINT "C = "; CSIGN%; CEXP%; CDIGITS$
999 END

      

This BASIC example uses STR$ADD to add two decimal strings, where the following values apply:

A = -1000 (ASIGN = 1, AEXP = 3, ADIGITS = '1')
B = .0002 (BSIGN = 0, BEXP = -4, BDIGITS = '2')

The output generated by this program is listed below; note that the decimal value of C equals -999.9998 (CSIGN = 1, CEXP = -4, CDIGITS = '9999998').


A = 1  3 1
B = 0 -4 2
C = 1 -4 9999998

