HP OpenVMS Systems Documentation

Content starts here

OpenVMS MACRO-32 Porting and User's Guide


Previous Contents Index

2.5.4 Preserve Argument for Entry Point Register Declaration

The preserve argument indicates those registers that should be preserved over the routine call. This should include only those registers that are modified and whose full 64-bit contents should be saved and restored.

The preserve argument causes registers to be preserved whether or not they would have been preserved automatically by the compiler's processing of a .CALL_ENTRY or .JSB_ENTRY directive. This is also the only way in a .JSB32_ENTRY routine to save and restore the full 64 bits of a register. Note that because R0 and R1 are scratch registers, by calling standard definition, the compiler never saves and restores them in any routine unless you specify them in the preserve argument at the routine's entry point.

This argument overrides the output and scratch arguments. If you specify a register both in the preserve argument and in the output or scratch arguments, the compiler will preserve the register but will report the following warning:


%AMAC-W-REGDECCON, register declaration conflict in routine A

The preserve argument has no effect on the compiler's temporary register usage.

2.5.5 Help for Specifying Register Sets

When you invoke the compiler, specifying /FLAG=HINTS on the command line, the compiler generates messages that can assist you in constructing the register sets for routine entry points. Among the hints the compiler provides are the following:

  • Registers that might be used as input to the routine, which may not have been your intention. If a register is read before being written, it will be included in this set.
  • Registers that might be used for output values. If a register is written but not subsequently read, the compiler will include it in the list of possible output values. Again, this may not have been your intention.
  • Registers the compiler saves and restores that are not listed in the entry point's preserve argument.

It is recommended that the .CALL_ENTRY, .JSB_ENTRY, and .JSB32_ENTRY register arguments reflect the routine interface, as described in the routine's text header. Wherever possible, you should declare input, output, scratch, and preserve register arguments for all routines. You only need to provide the argument when there are registers to be declared (for instance, input=<> is not necessary).

2.6 Branching Between Local Routines

The compiler allows a branch from the body of one routine into the body of another routine in the same module and psect. However, because this may result in additional overhead in both routines, the compiler reports an information-level message.

Note

The compiler does not recognize a call to $EXIT as terminating a routine. Add an extra RET or RSB, whichever is applicable, after $EXIT to terminate the routine.

If a CALL routine branches into a code path that executes an RSB, an error message is reported. Such a CALL routine, if not corrected, will fail at run time.

If a JSB routine branches into a code path that executes a RET instruction, and the JSB routine preserves any registers, an informational message is issued. This construct will work at run time, but the registers saved by the JSB routine will not be restored.

If routines that share a code path have different register declarations, the register restores will be done conditionally. That is, the registers written on the stack at routine entry will be the same for both routines, but whether or not the register is restored will depend upon which entry point was invoked.

For example:


rout1: .jsb_entry output=r3
            movl    r1, r3          ! R3 is output, not preserved
            movl    r2, r4          ! R4 should be preserved
            blss    lab1
            rsb

rout2:  .jsb_entry                  ! R3 is not output, and
            movl    #4, r3          ! should be auto-preserved
            movl    r0, r4          ! R4 should be preserved
lab1:       clrl    r0
            rsb

For both routines, R3 will be included in the registers saved on the stack at entry. However, at exit, a mask (also in the stack frame) will be tested before restoring R3. The mask will not be tested before R4 is restored, because R4 should be restored for both entry points.

Note that declaring registers that are destroyed in two routines that share code as scratch in one but not the other is actually more expensive than letting them be saved and restored. In this case, declare them as scratch in both, or if one routine requires that they be preserved, as preserve in both.

2.7 Declaring Exception Entry Points

The .EXCEPTION_ENTRY directive, as described in Appendix B, indicates the entry point of an exception service routine. Use the .EXCEPTION_ENTRY directive to declare the entry points for routines serving interrupts such as the following:

  • Interval clock
  • Interprocessor interrupt
  • System/processor correctable error
  • Power failure
  • System/processor machine abort
  • Software interrupt

At routine entry, R3 must contain the address of the procedure descriptor. The routine must exit with an REI instruction.

At exception entry points, the interrupt dispatcher pushes onto the stack registers R2 through R7, the PC, and the PSL. To access the contents of these registers, specify the stack_base argument in the .EXCEPTION_ENTRY directive. The compiler generates code that places the value of the SP at routine entry in the register you specify in stack_base, allowing the exception service routine to use this register to locate the contents of registers on the stack.

The compiler automatically saves and restores all other registers used in the routine, plus, if the service routine issues a CALL or a JSB instruction, all scratch registers, including R0 and R1.

Note

Error handling routines, established when their addresses are stored in the frame at 0(FP), are not .EXCEPTION_ENTRY routines. Such error handlers should be declared as .CALL_ENTRY routines and end with RET instructions.

2.8 Using Packed Decimal Instructions

The packed decimal directive .PACKED and all packed decimal instructions, except EDITPC, are supported for the MACRO-32 compiler by emulation routines that exist outside the compiled module.

2.8.1 Differences Between the VAX and Alpha Implementations

The differences between the implementations on VAX and Alpha systems are noted in the following list:

  • Packed decimal instructions and atomicity
    Since all packed decimal instructions are emulated via subroutine calls, they are not atomic or restartable, and cannot be made atomic by the .PRESERVE ATOMICITY directive or /PRESERVE=ATOMICITY switch. Also, there is no guarantee that Alpha registers from R16 through R28 are preserved across the instruction.
  • Use of argument registers (R16 through R21)
    Arguments are passed to the emulation routines by means of the argument registers (R16 through R21); attempts to use these registers as arguments in a packed decimal instruction will not work correctly.
    The compiler converts references to the VAX argument pointer (AP) into references to the Alpha argument registers (see Section 2.3). For example, the compiler converts a parameter reference off the VAX AP, such as 4(AP), to Alpha R16.
    To prevent these problems with AP-based references when packed decimal or floating-point instructions are used, it is simplest to specify /HOME_ARGS=TRUE on the entry point of the routine that contains these references. This forces all AP-based references to be read from the procedure frame, rather than the argument registers, so that no changes need to be made to the instructions.
    If /HOME_ARGS=TRUE is not used, the source code that contains parameter references based on the AP register as operands to packed-decimal instructions (or floating-point instructions) should be changed. Copy any AP-based operand to a temporary register first, and use that temporary register in the packed-decimal or floating-point instruction. For example, change the following code:


    MOVP     R0,  @8(AP), @4(AP)
    

    to:


    MOVL    8(AP), R1
    MOVL    4(AP), R2
    MOVP    R0,(R1),(R2)
    
  • Overflow traps
    Bits 7 and 5 of the VAX Program Status Word (PSW) are the DV (decimal overflow trap enable) and IV (integer overflow trap enable) bits respectively. These bits are not emulated as part of the emulation of the VAX PSW, but the packed decimal support allows you to enable or disable integer and decimal overflow, although you must do so statically at compile time.
    To enable decimal overflow faults, define the symbol PD_DEC_OVF to be nonzero. If PD_DEC_OVF is not defined or is set to zero, the packed decimal emulation routines will not generate decimal overflow faults. To enable integer overflow faults, define the symbol PD_INT_OVF to be nonzero. If PD_INT_OVF is not defined or is set to zero, the packed decimal emulation routines will not generate integer overflow faults.
    The MACRO qualifier /ENABLE=OVERFLOW and directive .ENABLE OVERFLOW have no effect on overflow trap enabling for the packed decimal emulation routines. You must use PD_DEC_OVF and PD_INT_OVF.
  • Trap routines for reserved operand, decimal divide by zero, integer overflow, and decimal overflow
    The emulation routines have their own trap routines for reserved operand, decimal divide by zero, integer overflow, and decimal overflow. Reserved operand and decimal divide by zero traps are always taken when the events occur. The overflow traps are taken only when they have been explicitly enabled. The traps call LIB$SIGNAL with a severity of fatal.
  • Messages from the packed decimal emulation routines
    All messages from the packed decimal emulation routines of the compiler use the following standard OpenVMS signal values: SS$_ROPRAND,SS$_DECOVF, SS$_INTOVF, and SS$_FLTDIV.
  • Restriction on format of arguments
    Because these instructions are implemented by means of macros, there is one restriction on the format of the arguments. In a macro invocation, an initial circumflex (^) is interpreted to mean that the parameter is a string, and the character immediately following the circumflex is the string delimiter. Because of this, you cannot use arguments that begin with an operand type specification, such as ^x20(SP). Note that immediate mode arguments, such as #^XFF, can use an operand type specification because the circumflex is not the initial character.

2.9 Using Floating-Point Instructions

All floating-point instructions and directives, with the exception of POLYx, EMODx and all H-Floating instructions, are supported.

These instructions are emulated via subroutine calls. This support is provided to allow hands-off compatibility for most existing VAX MACRO modules and is not designed for fast floating-point performance.

Besides the overhead of the emulation routine call, all floating-point operands must be passed through memory because the Alpha architecture does not have instructions to move values directly from the integer registers to the floating-point registers. In addition, on the first floating-point instruction, the FEN (Floating-point enable) bit is set for the process which will cause the entire floating-point register set to be saved and restored on every context switch for the life of the image.

2.9.1 Differences Between the VAX and Alpha Implementations

The differences between the implementations on VAX and Alpha systems are noted in the following list:

  • Floating-point instructions and atomicity
    Since all floating-point instructions are emulated via subroutine calls, they are not atomic or restartable, and cannot be made atomic by the .PRESERVE ATOMICITY directive or /PRESERVE=ATOMICITY switch. Also, there is no guarantee that Alpha registers from R16 through R28 are preserved across the instruction.
  • Use of argument registers (R16 through R21)
    Arguments are passed to the emulation routines by means of the argument registers (R16 through R21); attempts to use these registers as arguments in a floating-point instruction will not work correctly.
    The compiler converts references to the VAX argument pointer (AP) into references to the Alpha argument registers (see Section 2.3). For example, the compiler converts a parameter reference off the VAX AP, such as 4(AP), to Alpha R16.
    To prevent these problems with AP-based references when packed decimal or floating-point instructions are used, it is simplest to specify /HOME_ARGS=TRUE on the entry point of the routine that contains these references. This forces all AP-based references to be read from the procedure frame, rather than the argument registers, so that no changes need to be made to the instructions.
    If /HOME_ARGS=TRUE is not used, the source code that contains parameter references based on the AP register as operands to packed-decimal instructions (or floating-point instructions) should be changed. Copy any AP-based operand to a temporary register first, and use that temporary register in the packed-decimal or floating-point instruction. For example, change the following code:


    MOVP     R0,  @8(AP), @4(AP)
    

    to:


    MOVL    8(AP), R1
    MOVL    4(AP), R2
    MOVP    R0,(R1),(R2)
    
  • D_floating format
    D_floating format is not fully supported in the Alpha architecture. All arithmetic operations or conversions must be done by converting D_floating format to G_floating format, doing the operation in G_floating format, and converting back to D_floating format. This results in the loss of the extra 3 bits of precision in the D_floating format mantissa, besides taking the time in conversion. This means that there is nothing to be gained by using D_floating format. It is available for compatibility only, with the caveat that some precision will be lost.
  • Overflow traps
    It is not possible to enable traps on integer overflow or floating-point underflow. Alpha systems will always trap on floating-point overflow. The /ENABLE qualifier and the .ENABLE directive have no effect on overflow trapping. Overflow traps that lead to predictable results on VAX systems will give the same results on Alpha systems.
  • Dirty zeros
    Code may need modification to eliminate dirty zeros. A true zero in all VAX floating-point formats has all bits set to zero. If the exponent bits are all zero but some of the remaining bits are set, this is a dirty zero, and the VAX system will treat this as a zero. An Alpha system will take a reserved operand trap.
  • Restriction on format of arguments
    Because these instructions are implemented by means of macros, there is one restriction on the format of the arguments. In a macro invocation, an initial circumflex (^) is interpreted to mean that the parameter is a string, and the character immediately following the circumflex is the string delimiter. Because of this, you cannot use arguments that begin with an operand type specification, such as ^x20(SP). Note that immediate mode arguments, such as #^XFF, can use an operand type specification because the circumflex is not the initial character.
  • Floating-point return values
    A MACRO program that calls out to a routine and expects a floating-point return value in R0 may require a " jacket" between the call and the called routines. The purpose of the jacket is to move the returned value from floating-point register 0 to R0.

2.9.2 Impact on Routines in Other Languages

This support does not make the floating-point register set visible to the compiler. It simply allows floating point-operations to be done on the integer registers. This means that routines in other languages that want to interface with a VAX MACRO routine, either calling it or being called by it, must not expect any floating-point values as inputs or outputs. Compilers for other languages will pass these values in the floating-point registers. Floating-point arguments can be passed into or out of a VAX MACRO routine only by pointer.

Calls to run-time library (RTL) routines of other languages fall into this category. For example, a call to MTH$RANDOM returns a floating value in floating-point register F0. The compiler cannot directly read F0. You need to either create a jacket routine (in another language), which makes the call to MTH$RANDOM and then moves the result to R0, or write a separate routine that only does the move.

2.10 Preserving the Atomicity and Granularity of VAX MACRO

The VAX architecture includes instructions that perform a read-modify-write memory operation so that it appears to be a single, noninterruptible operation in a uniprocessing system. Atomicity is the term used to describe the ability to modify memory in one operation. Because the complexity of such instructions severely limits performance, read-modify-write operations on an Alpha system can be performed only by nonatomic, interruptible instruction sequences.

Furthermore, VAX instructions can address single aligned or unaligned byte, word, and longword locations in memory without affecting the surrounding memory locations. (A data item is considered aligned if its address is an even multiple of the item's size in bytes.) Granularity is the term used to describe the ability to independently write to portions of aligned longwords. Because byte, word, and unaligned longword access also severely limits performance, an Alpha system can only access aligned longword or quadword locations. Therefore, a sequence of instructions to write a single byte, word, or unaligned longword causes some of the surrounding bytes to be read and rewritten.

Both architectural differences can cause data to become corrupted under certain conditions. In an Alpha system, atomicity and granularity preservation are not provided by locking out other threads from modifying memory, but by providing a way to determine if a piece of memory may have been modified during the read-modify-write operation. In this case, the read-modify-write operation is retried.

To ensure data integrity, the compiler provides certain qualifiers and directives to be used for the conditions described in the following sections.

2.10.1 Preserving Atomicity

On VAX and Alpha multiprocessing systems alike, an application in which multiple, concurrent threads can modify shared data in a writable global section must have some way of synchronizing their access to that data. On a VAX single processor system, a memory modification instruction is sufficient to provide synchronized access to shared data. However, it is not sufficient on Alpha systems.

The compiler provides the /PRESERVE=ATOMICITY option to guarantee the integrity of read-modify-write operations for VAX instructions that have a memory modify operand. Alternatively, you can insert the .PRESERVE ATOMICITY and .NOPRESERVE ATOMICITY directives in sections of VAX MACRO source code as required to enable and disable atomicity.

For instance, the instruction INCL (R1) requires a read, modify, and write sequence on the data pointed to by R1. In a VAX system, the microcode performs these three operations. Therefore, an interrupt cannot occur until the sequence is fully completed. In an Alpha system, the following three instructions are required to perform the one VAX instruction:


        LDL     R28,(R1)
        ADDL    R28, 1,R28
        STL     R28,(R1)
The problem with this Alpha code sequence is that an interrupt can occur between any of the three instructions. If the interrupt causes an AST routine to execute or causes another process to be scheduled between the LDL and the STL, and the AST or other process updates the data pointed to by R1, the STL will store the result (R1) based on stale data.

When an atomic operation is required, and /PRESERVE=ATOMICITY (or .PRESERVE ATOMICITY) is specified, the compiler generates the following Alpha instruction sequence for INCL (R1):


Retry:  LDL_L   R28,(R1)
        ADDL    R28,#1,R28
        STL_C   R28,(R1)

        BEQ     R28, fail
         .
         .
         .
fail:   BR      Retry

If (R1) is modified by any other code thread on the current or any other processor during this sequence, the Store Longword Conditional instruction (STL_C) will not update (R1), but will indicate an error by writing 0 into R28. In this case, the code branches back and retries the operation until it completes without interference.

The BEQ Fail and BR Retry are done instead of a BEQ Retry because the branch prediction logic of the Alpha architecture assumes that backward conditional branches will be taken. Since this operation will rarely need to be retried, it is more efficient to make a forward conditional branch which is assumed not to be taken.

Because of the way atomicity is preserved on Alpha systems, this guarantee of atomicity applies to both uniprocessor and multiprocessor systems. This guarantee applies only to the actual modify instruction and does not extend interlocking to subsequent or previous memory accesses (see Section 2.10.6).

You should take special care in porting an application to an Alpha system if it involves multiple processes that modify shared data in a writable global section, even if the application executes only on a single processor. Additionally, you should examine any application in which a mainline process routine modifies data in process space that can also be modified by an asynchronous system trap (AST) routine or condition handler. See Migrating to an OpenVMS AXP System: Recompiling and Relinking Applications1 for a more complete discussion of the programming issues involved in read-modify-write operations in an Alpha system.

Warning

When preserving atomicity, the compiler generates aligned memory instructions that cannot be handled by the PALcode unaligned fault handler. They will cause a fatal reserved operand fault on unaligned addresses. Therefore, all memory references for which .PRESERVE ATOMICITY is specified must be to aligned addresses (see Section 2.10.5).

Note

1 This manual has been archived but is available on the OpenVMS Documentation CD-ROM.


Previous Next Contents Index