HP OpenVMS Alpha Version 7.3-2 Release Notes
B.3 Noncompliant Code Characteristics
The areas of noncompliance detected by the SRM_CHECK tool can be
grouped into the following four categories. Most of these can be fixed
by recompiling with new compilers. In rare cases, the source code may
need to be modified. See Section B.5 for information about compiler
versions.
- Some versions of OpenVMS compilers introduce noncompliant code
sequences during an optimization called "loop rotation." This problem
can be triggered only in C or C++ programs that use LDx_L/STx_C
instructions in assembly language code that is embedded in the C/C++
source using the ASM function, or in assembly language written in
MACRO-32 or MACRO-64. In some cases, a branch was introduced between
the LDx_L and STx_C instructions.
This can be addressed by recompiling.
- Some code compiled with very old BLISS, MACRO-32, DEC Pascal, or
DEC COBOL compilers may contain noncompliant sequences. Early versions
of these compilers contained a code scheduling bug where a load was
incorrectly scheduled after a load_locked.
This can be addressed by recompiling.
- In rare cases, the MACRO-32 compiler may generate a noncompliant
code sequence for a BBSSI or BBCCI instruction where there are too few
free registers.
This can be addressed by recompiling.
- Errors may be generated by incorrectly coded MACRO-64 or
MACRO-32 and incorrectly coded assembly language embedded in C or C++
source using the ASM function.
This requires source code changes.
The new MACRO-32 compiler flags noncompliant code at compile time.
If the SRM_CHECK tool finds a violation in an image, you should
recompile the image with the appropriate compiler (see Section B.5).
After recompiling, you should analyze the image again. If violations
remain after recompiling, examine the source code to determine why the
code scheduling violation exists. Then make the appropriate changes to
the source code.
B.4 Coding Requirements
The Alpha Architecture Reference Manual describes how an
atomic update of data between processors must be formed. The Third
Edition, in particular, has much more information on this topic. This
edition details the conventions of the interlocked memory sequence.
Exceptions to the following two requirements are the source of all
known noncompliant code:
- There cannot be a memory operation (load or store) between the
LDx_L (load locked) and STx_C (store conditional) instructions in an
interlocked sequence.
- There cannot be a branch taken between an LDx_L and an STx_C
instruction. Rather, execution must "fall through" from the LDx_L to
the STx_C without taking a branch.
Any branch whose target is
between an LDx_L and matching STx_C creates a noncompliant sequence.
For instance, any branch to "label" in the following example would
result in noncompliant code, regardless of whether the branch
instruction itself was within or outside of the sequence:
        LDx_L   Rx, n(Ry)
        ...
label:  ...
        STx_C   Rx, n(Ry)
Therefore, the SRM_CHECK tool looks for the following:
- Any memory operation (LDx/STx) between an LDx_L and an STx_C
- Any branch that has a destination between an LDx_L and an STx_C
- STx_C instructions that do not have a preceding LDx_L instruction
This typically indicates that a backward branch is taken from an
LDx_L to the STx_C. Note that hardware device drivers that do device
mailbox writes are an exception. These drivers use the STx_C to write
the mailbox. This condition is found only on early Alpha systems and
not on PCI-based systems.
- Excessive instructions between an LDx_L and an STx_C
The Alpha Architecture Reference Manual (AARM) recommends that no more
than 40 instructions appear between an LDx_L and an STx_C. In theory,
more than 40 instructions can cause hardware interrupts to keep the
sequence from completing. However, there are no known occurrences of
this.
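The four checks listed above can be sketched as a small scanner over a
linear list of decoded instructions. This is only an illustration in
Python, not the SRM_CHECK implementation: the Insn record, the check
function, the message strings, and the partial list of load/store
mnemonics are all assumptions made for the example, and the device
mailbox exception is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical instruction record; not SRM_CHECK's real internal form.
@dataclass
class Insn:
    addr: int
    op: str                        # lower-case mnemonic, e.g. "ldq_l"
    target: Optional[int] = None   # branch destination, if any

LOCKED_LOADS = {"ldl_l", "ldq_l"}
COND_STORES = {"stl_c", "stq_c"}
# Partial list of ordinary Alpha loads and stores, enough for illustration.
MEMORY_OPS = {"ldl", "ldq", "ldq_u", "ldbu", "ldwu",
              "stl", "stq", "stq_u", "stb", "stw"}
MAX_BETWEEN = 40                   # recommended limit quoted from the AARM

def check(insns: List[Insn]) -> List[Tuple[int, str]]:
    findings = []
    regions = []                   # (LDx_L address, STx_C address) pairs
    open_at = None                 # index of the most recent unmatched LDx_L
    for i, insn in enumerate(insns):
        if insn.op in LOCKED_LOADS:
            open_at = i
        elif insn.op in COND_STORES:
            if open_at is None:
                findings.append((insn.addr, "STx_C without preceding LDx_L"))
            else:
                if i - open_at - 1 > MAX_BETWEEN:
                    findings.append((insn.addr, "more than 40 instructions in sequence"))
                regions.append((insns[open_at].addr, insn.addr))
                open_at = None
        elif open_at is not None and insn.op in MEMORY_OPS:
            findings.append((insn.addr, "memory operation inside sequence"))
    # A branch to the LDx_L itself (a retry loop) is fine; a destination
    # after the LDx_L, up to and including the STx_C, is not.
    for insn in insns:
        if insn.target is None:
            continue
        for start, end in regions:
            if start < insn.target <= end:
                findings.append((insn.addr, "branch into sequence"))
    return findings

demo = [
    Insn(0x1000, "ldq_l"),
    Insn(0x1004, "ldq"),                  # memory operation in the sequence
    Insn(0x1008, "beq", target=0x1010),   # destination inside the sequence
    Insn(0x100C, "bis"),
    Insn(0x1010, "addq"),
    Insn(0x1014, "stq_c"),
    Insn(0x1018, "beq", target=0x1000),   # retry branch to the LDx_L: allowed
]
for addr, message in check(demo):
    print(f"{addr:08X} {message}")
```

The sketch walks the instructions in address order only, so a rotated
loop in which the STx_C is placed before its LDx_L shows up as an STx_C
without a preceding LDx_L rather than as a branch finding.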
To illustrate, the following are examples of code flagged by SRM_CHECK.
** Found an unexpected ldq at 0008291C
00082914 AC300000 ldq_l R1, (R16)
00082918 2284FFEC lda R20, 0xFFEC(R4)
0008291C A6A20038 ldq R21, 0x38(R2)
In the above example, an LDQ instruction was found after an LDQ_L
before the matching STQ_C. The LDQ must be moved out of the sequence,
either by recompiling or by source code changes. (See Section B.3.)
** Backward branch from 000405B0 to a STx_C sequence at 0004059C
00040598 C3E00003 br R31, 000405A8
0004059C 47F20400 bis R31, R18, R0
000405A0 B8100000 stl_c R0, (R16)
000405A4 F4000003 bne R0, 000405B4
000405A8 A8300000 ldl_l R1, (R16)
000405AC 40310DA0 cmple R1, R17, R0
000405B0 F41FFFFA bne R0, 0004059C
In the above example, a branch was discovered between the LDL_L and
STL_C. In this case, there is no "fall through" path between the LDx_L
and STx_C, which the architecture requires.
Note
This branch backward from the LDx_L to the STx_C is characteristic of
the noncompliant code introduced by the "loop rotation" optimization.
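As a rough illustration of how the rotated-loop pattern can be
recognized in a listing of this shape, the following Python sketch
parses the address, encoding, and mnemonic columns and reports any
backward branch that leaves an open LDx_L and lands ahead of an STx_C.
The parse and rotated_sequences helpers, the regular expression, and the
short list of branch mnemonics are assumptions made for this example;
the sketch does not describe how SRM_CHECK itself works.

```python
import re

# Listing copied from the flagged example; columns are address, encoding,
# mnemonic, operands.
LISTING = """\
00040598 C3E00003 br R31, 000405A8
0004059C 47F20400 bis R31, R18, R0
000405A0 B8100000 stl_c R0, (R16)
000405A4 F4000003 bne R0, 000405B4
000405A8 A8300000 ldl_l R1, (R16)
000405AC 40310DA0 cmple R1, R17, R0
000405B0 F41FFFFA bne R0, 0004059C
"""

LINE = re.compile(r"^\s*([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})\s+(\S+)\s*(.*)$")
# Partial list of Alpha branch mnemonics, enough for illustration.
BRANCHES = {"br", "bsr", "beq", "bne", "blt", "ble", "bgt", "bge", "blbc", "blbs"}
LOCKED_LOADS = {"ldl_l", "ldq_l"}
COND_STORES = {"stl_c", "stq_c"}
INTERLOCKED = LOCKED_LOADS | COND_STORES

def parse(listing):
    """Return (address, mnemonic, branch target or None) per listing line."""
    insns = []
    for line in listing.splitlines():
        m = LINE.match(line)
        if not m:
            continue                      # skip message lines such as "** ..."
        addr = int(m.group(1), 16)
        op = m.group(3).lower()
        target = None
        if op in BRANCHES:
            # The destination is assumed to be the last operand, in hex.
            target = int(m.group(4).split(",")[-1].strip(), 16)
        insns.append((addr, op, target))
    return insns

def rotated_sequences(insns):
    """Find backward branches from an open LDx_L to code ahead of an STx_C."""
    index = {addr: i for i, (addr, _, _) in enumerate(insns)}
    hits = []
    for i, (addr, _, target) in enumerate(insns):
        if target is None or target >= addr or target not in index:
            continue
        # Nearest interlocked instruction before the branch, in address order.
        before = next((o for _, o, _ in reversed(insns[:i]) if o in INTERLOCKED), None)
        # First interlocked instruction at or after the branch destination.
        after = next((o for _, o, _ in insns[index[target]:i] if o in INTERLOCKED), None)
        if before in LOCKED_LOADS and after in COND_STORES:
            hits.append((addr, target))
    return hits

for addr, target in rotated_sequences(parse(LISTING)):
    print(f"Backward branch from {addr:08X} to a STx_C sequence at {target:08X}")
```

Run against the listing above, the sketch reports the branch at 000405B0
with destination 0004059C, the same pair named in the SRM_CHECK message.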
The following MACRO-32 source code demonstrates code where there is a
"fall through" path, but this case is still noncompliant because of the
potential branch and a memory reference in the lock sequence:
getlck: evax_ldql   r0, lockdata(r8)   ; Get the lock data
        movl        index, r2          ; and the current index.
        tstl        r0                 ; If the lock is zero,
        beql        is_clear           ; skip ahead to store.
        movl        r3, r2             ; Else, set special index.
is_clear:
        incl        r0                 ; Increment lock count
        evax_stqc   r0, lockdata(r8)   ; and store it.
        tstl        r0                 ; Did store succeed?
        beql        getlck             ; Retry if not.
To correct this code, the memory access to read the value of INDEX must
first be moved outside the LDQ_L/STQ_C sequence. Next, the branch
between the LDQ_L and STQ_C, to the label IS_CLEAR, must be eliminated.
In this case, it could be done using a CMOVEQ instruction. The CMOVxx
instructions are frequently useful for eliminating branches around
simple value moves. The following example shows the corrected code:
        movl        index, r2          ; Get the current index
getlck: evax_ldql   r0, lockdata(r8)   ; and then the lock data.
        evax_cmoveq r0, r3, r2         ; If zero, use special index.
        incl        r0                 ; Increment lock count
        evax_stqc   r0, lockdata(r8)   ; and store it.
        tstl        r0                 ; Did write succeed?
        beql        getlck             ; Retry if not.
B.5 Compiler Versions
Table B-1 contains information about versions of compilers that
might generate noncompliant code sequences and the recommended minimum
versions to use when you recompile.
Table B-1 Versions of OpenVMS Compilers

Old Version             Recommended Minimum Version
----------------------  ------------------------------
BLISS V1.1              BLISS V1.3
DEC Ada V3.5            HP Ada V3.5A
DEC C V5.x              DEC C V6.0
DEC C++ V5.x            DEC C++ V6.0
DEC COBOL V2.4, V2.5    DEC COBOL V2.6
DEC Pascal V5.0-2       DEC Pascal V5.1-11
MACRO-32 V3.0           V3.1 for OpenVMS Version 7.1-2
                        V4.1 for OpenVMS Version 7.2
MACRO-64 V1.2           See below.

Current versions of the MACRO-64 assembler might still encounter the
loop rotation issue. However, MACRO-64 does not perform code
optimization by default, and this problem occurs only when optimization
is enabled. If SRM_CHECK indicates a noncompliant sequence in the
MACRO-64 code, it should first be recompiled without optimization. If
the sequence is still flagged when retested, the source code itself
contains a noncompliant sequence that must be corrected.
Alpha computers with 21264 processors require strict adherence to the
restrictions for interlocked memory sequences for the LDx_L and STx_C
instructions described in the Alpha Architecture Reference Manual,
Third Edition. To help ensure that uses of interlocked memory
instructions conform to the architectural guidelines, additional
checking has been added to Version 3.1 of the MACRO-32 Compiler for
OpenVMS Alpha.
The Alpha Architecture Reference Manual, Third Edition
describes the rules for instruction use within interlocked memory
sequences in Section 4.2.4. The MACRO-32 for OpenVMS Alpha Version 3.1
compiler observes these rules in the code it generates from MACRO-32
source code. However, the compiler provides EVAX_LDxL and EVAX_STxC
built-ins, which allow these instructions to be written directly in
source code.
The MACRO-32 Compiler for OpenVMS Alpha Version 4.1 now performs
additional code checking and displays warning messages for noncompliant
code sequences.
B.6 Recompiling Code with ALONONPAGED_INLINE or LAL_REMOVE_FIRST
Any MACRO-32 code on OpenVMS Alpha that invokes either the
ALONONPAGED_INLINE or the LAL_REMOVE_FIRST macro from the
SYS$LIBRARY:LIB.MLB macro library must be recompiled on OpenVMS Version
7.2 or higher to obtain a correct version of these macros. The change
to these macros corrects a potential synchronization problem that is
more likely to be encountered on newer processors, starting with Alpha
21264 (EV6).
Note
Source modules that call the EXE$ALONONPAGED routine (or any of its
variants) do not need to be recompiled. These modules
transparently use the correct version of the routine that is included
in this release.