VAX MACRO and Instruction Set Reference Manual
10.6.2 Vector Arithmetic Exceptions
Vector operate instructions are always executed to completion, even if
a vector arithmetic exception occurs. If an exception occurs, a default
result is written. The default result is as follows:
- The low-order 32 bits of the true result for integer overflow.
- Zero for floating underflow if exceptions are disabled.
- An encoded reserved operand for floating divide by zero, floating
overflow, reserved operand, and enabled floating underflow. For vector
convert instructions that convert floating-point data to integer data,
where the source element is a reserved operand, the value written to
the destination element is UNPREDICTABLE.
The exception condition type and destination register number are always
recorded in the Vector Arithmetic Exception Register (VAER) when a
vector arithmetic exception occurs. Refer to Section 10.2.3, Internal
Processor Registers, for more information.
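As a rough illustration of the first two default results listed above, the following Python sketch models the value written for an integer overflow and for a floating underflow with exceptions disabled. The function names are hypothetical, and the encoded reserved operand is represented only by a placeholder string; this is a model of the rule, not of any implementation.

```python
def default_integer_result(true_result: int) -> int:
    """Return the low-order 32 bits of true_result as a signed longword."""
    low = true_result & 0xFFFFFFFF
    # Reinterpret bit 31 as the sign bit of a two's-complement longword.
    return low - 0x1_0000_0000 if low & 0x8000_0000 else low


def default_underflow_result(underflow_enabled: bool):
    """Return 0.0 when floating underflow exceptions are disabled.

    With underflow enabled, the manual specifies an encoded reserved
    operand; the string below is only a placeholder for that encoding.
    """
    return "RESERVED_OPERAND" if underflow_enabled else 0.0


# 0x7FFFFFFF + 1 overflows a signed longword; the low 32 bits are 0x80000000.
wrapped = default_integer_result(0x7FFFFFFF + 1)
```

Here `wrapped` is the most negative longword, which is what a reader of the destination register would see after the overflowing add.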
10.6.3 Vector Processor Disabled
As a result of error conditions or software control, the vector
processor signals the scalar processor not to issue any more vector
instructions. The vector processor is disabled when this signal is
generated and its state is reflected in VPSR<VEN>.
Because the scalar and vector processors can execute asynchronously,
the scalar processor may not receive this signal immediately. As a
result, the scalar processor may continue to view the vector processor
as enabled and send it vector instructions. Once the scalar processor
receives this signal, it will view the vector processor as disabled and
will not send it any more vector instructions (including MFVP/MTVP).
While the vector processor is disabled, and in the absence of hardware
errors, it will complete all pending instructions in its instruction
queue including those sent by the scalar processor after the vector
processor became disabled.
The vector processor can either disable itself or be disabled by
software. The following error conditions cause the vector processor to
disable itself:
- Vector arithmetic exception (flagged by VPSR<AEX>)
- Hardware error (flagged by VPSR<IMP> in some implementations)
- On some implementations, receipt of an illegal vector opcode
(flagged by VPSR<IVO>)
In these cases, the vector processor clears VPSR<VEN> and flags
the error condition by setting the appropriate bit in VPSR. (See
Table 10-1.)
Software disables the vector processor by writing a zero into
VPSR<VEN> using an MTPR instruction. Once the vector processor is
disabled, only software can enable it. The software does this by
writing a one to VPSR<VEN> using an MTPR. Recall that after
performing an MTPR to VPSR, software must then issue an MFPR from VPSR
to ensure that the new state of VPSR will affect the execution of
subsequently issued vector instructions. The MFPR will not complete in
this case until the new state of the vector processor becomes visible
to the scalar processor.
When the vector processor disables itself due to a hardware error, it
is implementation dependent whether the vector processor completes any
pending vector instruction. However, in this case, the vector processor
ensures that, when it is reenabled, all incomplete instructions have
been flushed from the instruction queue.
If the scalar processor attempts to issue a vector instruction after it
views the vector processor as disabled, then a vector processor
disabled fault occurs. The vector processor disabled fault uses SCB
offset 68 (hex). The exception handling software (running on the scalar
processor) can then read the vector internal processor registers (IPRs)
with MFPR instructions to determine what exception conditions are
recorded in the vector processor and if the vector processor is still
busy processing other unfinished instructions.
Once the scalar processor views the vector processor as disabled, the
only operations that can be issued to the vector processor are MTPR and
MFPR to and from the vector IPRs.
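The lag between the actual state of VPSR<VEN> and the scalar processor's view of it can be pictured with a toy Python model. Everything here is an assumption made for illustration: the class and method names are hypothetical, signal delivery is reduced to one explicit call, and an MTPR that is not followed by an MFPR is simply treated as not yet visible.

```python
class VectorPair:
    """Toy model of one scalar/vector processor pair."""

    def __init__(self):
        self.ven = True           # actual state of VPSR<VEN>
        self.scalar_view = True   # what the scalar processor believes

    def self_disable(self):
        # For example, a vector arithmetic exception clears VPSR<VEN>.
        self.ven = False

    def deliver_signal(self):
        # The scalar processor eventually receives the disable signal.
        self.scalar_view = self.ven

    def mtpr_ven(self, value: bool):
        # Software writes VPSR<VEN>; the scalar view is not updated yet.
        self.ven = value

    def mfpr_vpsr(self) -> bool:
        # The MFPR does not complete until the new state is visible.
        self.scalar_view = self.ven
        return self.ven

    def issue_vector_instruction(self) -> str:
        if not self.scalar_view:
            return "vector processor disabled fault"
        return "issued"


pair = VectorPair()
pair.self_disable()
before_signal = pair.issue_vector_instruction()   # still viewed as enabled
pair.deliver_signal()
after_signal = pair.issue_vector_instruction()    # now faults
pair.mtpr_ven(True)
pair.mfpr_vpsr()
after_enable = pair.issue_vector_instruction()    # issued again
```

The model shows why an instruction can still be sent after the vector processor has disabled itself, and why the MTPR that reenables it is paired with an MFPR.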
10.6.4 Handling Disabled Faults and Vector Context Switching
The following flow outlines the required steps for handling a vector
processor disabled fault.
If the new process executing on the scalar processor has a vector
instruction to execute, saving and restoring the state of the vector
processor---that is, vector context switching---is done as part of
handling a subsequent vector processor disabled fault.
If a vector processor disabled fault occurs and the current scalar
process is also the current vector process, then software must perform
the following procedure:
- Obtain the vector processor status by reading the VPSR using the
MFPR instruction.
- Perform the following checks to see if any of these conditions
caused the vector processor to be disabled. If any of these conditions
exist, a decision to not continue this flow may occur.
- If VPSR<IVO> is set, then write one to clear VPSR<IVO>
using the MTPR instruction, and report an illegal vector opcode error.
- If VPSR<IMP> is set, then write one to clear VPSR<IMP>
using the MTPR instruction, and report an implementation-specific error.
- If VPSR<AEX> is set, then write one to clear VPSR<AEX>
using the MTPR instruction, and enter the vector arithmetic exception
handler with information in VAER.
- If the software scalar-context-switch flag is set, indicating that
a scalar context switch has been done, then perform the following:
- Make sure the vector processor has access to correct P0LR, P0BR,
P1LR, and P1BR values.
- If any vector translation buffer needs to be invalidated, then
write zero into the VTBIA IPR using the MTPR instruction. Vector
translation buffer flushing is required if the process was swapped out
and the mapping change has not yet been made known to the vector
translation buffer.
- Clear the software scalar-context-switch flag.
- Enable the vector processor by writing one to VPSR<VEN> using
the MTPR instruction.
Ensure the new state of the vector processor becomes visible to the
scalar processor by reading VPSR with the MFPR instruction.
- REI to retry the vector instruction at the time of the vector
processor disabled fault. If there is an asynchronous memory management
exception pending, it is taken when that vector instruction is reissued
to the vector processor.
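The procedure above, for the case where the current scalar process is also the current vector process, can be sketched as a function that returns the ordered list of actions. This is a Python model only: the function name, the `tb_stale` parameter, and the action strings are hypothetical, VPSR is represented as a set of flag names rather than a bit layout, and the points at which software may decide not to continue the flow are not modelled.

```python
def same_process_fault_actions(vpsr_bits, scalar_context_switched, tb_stale=False):
    """Return the ordered handler actions as a list of strings.

    vpsr_bits: set of VPSR flag names found set, for example {"AEX"}.
    scalar_context_switched: state of the software scalar-context-switch flag.
    tb_stale: True if the vector translation buffer must be invalidated.
    """
    actions = ["MFPR VPSR"]
    checks = (
        ("IVO", "report illegal vector opcode error"),
        ("IMP", "report implementation-specific error"),
        ("AEX", "enter vector arithmetic exception handler with VAER"),
    )
    for bit, report in checks:
        if bit in vpsr_bits:
            actions.append(f"MTPR VPSR<{bit}>=1 (write one to clear)")
            actions.append(report)
    if scalar_context_switched:
        actions.append("verify P0LR, P0BR, P1LR, P1BR")
        if tb_stale:
            actions.append("MTPR 0 to VTBIA")
        actions.append("clear scalar-context-switch flag")
    # Enable, make the new state visible, then retry the faulting instruction.
    actions += ["MTPR VPSR<VEN>=1", "MFPR VPSR", "REI"]
    return actions


simple = same_process_fault_actions(set(), False)
with_aex = same_process_fault_actions({"AEX"}, True, tb_stale=True)
```

In the simplest case (no error bits, no scalar context switch) the list reduces to reading VPSR, enabling the vector processor, reading VPSR again, and the REI.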
If a vector processor disabled fault occurs and the current scalar
process is not the current vector process, then software must perform
the following procedure:
- Check if there is a current vector process. If there is one, then
perform the following procedure:
- Wait for VPSR<BSY> to be clear using the MFPR instruction.
- Perform the following check to see if this condition caused the
vector processor to be disabled. If this condition exists, a decision
to not continue this flow may occur.
- If VPSR<IMP> is set, then report an implementation-specific
error.
- If VPSR<IVO> is set, then set a software IVO flag for this
process. The illegal vector opcode error is handled when this process
next tries to execute in the vector processor.
- If VPSR<AEX> is set, then set a software AEX flag for this
process, and save vector arithmetic exception state from VAER using the
MFPR instruction. Any vector arithmetic exception conditions are
handled when this process next tries to execute in the vector processor.
- At this point there cannot be a synchronous memory management
exception pending. But, if asynchronous memory management handling is
implemented, there may be an asynchronous memory management exception
pending. Because scalar/vector memory synchronization was required
before scalar context switching, all such pending exceptions are known
at this time. So, if VPSR<PMF> is set, then perform the following
procedure:
- Set a software async-memory-exception-pending flag for this
process.
- Store implementation-specific vector state in memory starting at
the address in VSAR by writing one to VPSR<STS> using the MTPR
instruction.
- Reset the vector processor state to clear VAER and VPSR, and enable
the vector processor. Writing a one to both VPSR<RST> and
VPSR<VEN> using the same MTPR instruction accomplishes this.
Ensure the new state of the vector processor becomes visible to the
scalar processor by reading VPSR with the MFPR instruction.
- Store the current vector (V0--V15) and vector control (VLR, VMR,
and VCR) register values using VST and MFVP instructions.
- Read the VMAC IPR using the MFPR instruction. This ensures
scalar/vector memory synchronization and that all hardware errors
encountered by previous vector memory instructions have been reported.
- Make the current scalar process also the current vector process.
- Clear the software scalar-context-switch flag.
- Make sure the vector processor has access to correct P0LR, P0BR,
P1LR, and P1BR values, and invalidate any vector translation buffer by
writing zero to the VTBIA IPR using the MTPR instruction.
- Load the saved vector (V0--V15) and vector control (VLR, VMR, and
VCR) register values using VLD and MTVP instructions.
- If the software IMP, IVO, or AEX flags for this process are set,
perform the following procedure:
- Disable the vector processor by writing zero to VPSR<VEN>
using the MTPR instruction.
Ensure the new state of the vector processor becomes visible to the
scalar processor by reading VPSR with the MFPR instruction.
- If set, clear the software IMP flag for this process and finish
handling the implementation-specific error. A decision to not continue
this flow may occur.
- If set, clear the software IVO flag for this process and report that
an illegal vector opcode error occurred. A decision to not continue this
flow may occur.
- If set, clear the software AEX flag for this process and enter the
vector arithmetic exception handler with saved VAER state. A decision
to not continue this flow may occur.
- If the software async-memory-exception-pending flag for this
process is set, perform the following procedure:
- Clear the software async-memory-exception-pending flag for this
process.
- Send the vector processor the memory address that points to
implementation-specific vector state for this process by writing VSAR
using the MTPR instruction.
- Reload the implementation-specific vector state for this process
and leave the vector processor enabled by writing one to both
VPSR<RLD> and VPSR<VEN> using the same MTPR instruction.
From this state, the vector processor determines if VPSR<PMF>,
VPSR<MF>, or both need to be set, and does it. Ensure the new
state of the vector processor becomes visible to the scalar processor
by reading VPSR with the MFPR instruction.
- REI to retry the vector instruction at the time of the vector
processor disabled fault. If there is an asynchronous memory management
exception pending, it is taken when that vector instruction is reissued
to the vector processor.
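The vector context switch described above falls into two halves: saving the state of the outgoing vector process and restoring the state of the incoming one. The Python sketch below models the order of the steps and the software flags they set and clear. The `SoftwareFlags` class, the function names, and the step strings are hypothetical; VPSR is represented as a set of flag names, and the points at which software may decide not to continue are omitted.

```python
from dataclasses import dataclass


@dataclass
class SoftwareFlags:
    """Hypothetical per-process flags kept by the operating system."""
    imp: bool = False
    ivo: bool = False
    aex: bool = False
    async_mem_pending: bool = False


def save_outgoing(vpsr_bits, flags):
    """Steps for the current vector process; records its pending conditions."""
    steps = ["wait for VPSR<BSY> clear (MFPR)"]
    if "IMP" in vpsr_bits:
        steps.append("report implementation-specific error")
    if "IVO" in vpsr_bits:
        flags.ivo = True
        steps.append("set software IVO flag")
    if "AEX" in vpsr_bits:
        flags.aex = True
        steps.append("set software AEX flag; save VAER (MFPR)")
    if "PMF" in vpsr_bits:
        flags.async_mem_pending = True
        steps.append("set async flag; MTPR VPSR<STS>=1 (state stored at VSAR)")
    steps += [
        "MTPR VPSR<RST>=1,<VEN>=1; MFPR VPSR",
        "store V0-V15, VLR, VMR, VCR (VST, MFVP)",
        "MFPR VMAC",
    ]
    return steps


def restore_incoming(flags):
    """Steps for the process that becomes the current vector process."""
    steps = [
        "make scalar process the current vector process",
        "clear scalar-context-switch flag",
        "verify P0LR, P0BR, P1LR, P1BR; MTPR 0 to VTBIA",
        "load V0-V15, VLR, VMR, VCR (VLD, MTVP)",
    ]
    if flags.imp or flags.ivo or flags.aex:
        steps.append("MTPR VPSR<VEN>=0; MFPR VPSR")
        if flags.imp:
            flags.imp = False
            steps.append("finish handling implementation-specific error")
        if flags.ivo:
            flags.ivo = False
            steps.append("report illegal vector opcode error")
        if flags.aex:
            flags.aex = False
            steps.append("enter arithmetic exception handler with saved VAER")
    if flags.async_mem_pending:
        flags.async_mem_pending = False
        steps += ["MTPR VSAR", "MTPR VPSR<RLD>=1,<VEN>=1; MFPR VPSR"]
    steps.append("REI")
    return steps


flags = SoftwareFlags()
saved = save_outgoing({"AEX", "PMF"}, flags)
restored = restore_incoming(flags)
```

In this run the same flags object is saved and later restored, which stands in for a process that is switched out with a pending arithmetic exception and a pending asynchronous memory management exception and is later made the current vector process again.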
10.6.5 MFVP Exception Reporting Examples
This section gives examples of Move from Vector Processor (MFVP)
exception reporting that the vector processor ensures. The rules
used to determine the correct result for each example come from the
dependency tables in Section 10.5.3.3, the description of MSYNC
in Section 10.7.2, and the description of MFVP in Section 10.15.
Examples of Exceptions That Cause MSYNC to Fault
The following examples illustrate which exceptions are ensured by the
vector processor to always cause MSYNC to fault:
#1 |
VVMULF V1, V1, V2
VVADDF V3, V2, V3
MTVLR #1
VSTL V2, A, #4
VVCVTFD V2, V3
MSYNC R0
|
The MSYNC faults if exceptions occur in the production of V2[0] by the
VVMULF or in the storage of V2[0] by the VSTL. MSYNC need not fault if
exceptions occur in the production of: V2[1..VLR-1] by the VVMULF,
V3[0..VLR-1] by the VVADDF, or V3[0..VLR-1] by the VVCVTFD.
#2 |
VVADDF V1, V1, V0
VLDL A, #4, V0
MSYNC R0
|
The MSYNC faults if exceptions occur in the loading of V0[0..VLR-1]
from memory. MSYNC need not fault if exceptions occur in the production
of V0[0..VLR-1] by the VVADDF.
#3 |
VVADDF V1, V1, V2
VLDL A, #4, V1
MSYNC R0
|
The MSYNC faults if exceptions occur in the loading of V1[0..VLR-1]
from memory. MSYNC need not fault if exceptions occur in the production
of V2[0..VLR-1] by the VVADDF.
#4 |
VVMULF V1, V1, V2
VVGTRF V2, V3
VSTL/1 V0, A, #4
MSYNC R0
|
The MSYNC faults if exceptions occur: in the production of V2[0..VLR-1]
by the VVMULF, in the production of VMR<0..VLR-1> by the VVGTRF,
or in the storage by the VSTL/1 of elements of V0 for which the
corresponding VMR bit is one.
Examples of Exceptions the Processor Reports Prior to MFVP Completion
The following examples illustrate which exceptions the vector processor
will report prior to the completion of an MFVP from a vector control
register:
#1 |
VLDL A, #4, V1
VVMULF V1, V1, V2
MTVLR #1
VVGTRF V2, V3
MFVMRHI R1
MFVMRLO R2
|
Unreported exceptions that occur in the loading of V1[0] from memory
by the VLDL, in the production of V2[0] by the VVMULF, and in the
production of VMR<0> by the VVGTRF are reported by the vector processor
prior to the completion of the MFVMRLO. The vector processor need not at
that time
report any exceptions that occur in the loading of V1[1..63] from
memory by the VLDL or in the production of V2[1..63] by the VVMULF.
Note that the vector processor need not report any exceptions before
completing MFVMRHI.
#2 |
VVGTRF V0, V1
MTVMRLO #patt
MFVMRLO R1
|
For any value of "i" in the range of 0 to 31 inclusive: the
value of VMR<i> delivered by MFVMRLO only depends on the value
placed into VMR<i> by the MTVMRLO. As a result, the vector
processor need not report exceptions that occur in the production of
VMR by the VVGTRF prior to completing the MFVMRLO.
#3 |
VVMULF/1 V1, V1, V2
MTVMRLO #patt
MFVMRLO R1
|
For any value of "i" in the range of 0 to 31 inclusive: the
value of VMR<i> delivered by MFVMRLO only depends on the value
placed into VMR<i> by the MTVMRLO. As a result, the vector
processor need not report exceptions that occur in the production of
V2[0..VLR-1] by the VVMULF/1 prior to completing the MFVMRLO.
#4 |
MTVLR #64
VVMULF V0, V0, V2
VVGTRF V0, V2
MTVLR #32
IOTA #str, V4
MFVCR R1
|
Prior to the completion of the MFVCR, the vector processor must report
any exceptions that occurred in the production of V2[0..31] by the
VVMULF and VMR<0..31> by the VVGTRF. Note that VCR produced by an
IOTA depends only on VMR<0..VLR-1>. Recall that no exceptions can
occur in the production of V4[0..VCR-1] by IOTA.
#5 |
MTVLR #64
VLDL A, #4, V2
VVGTRF V0, V1
VSGTRF/1 #3.0, V2
MFVMRLO R1
|
For any value of "i" in the range of 0 to 31 inclusive: prior
to the completion of the MFVMRLO, the vector processor must report any
exceptions that occurred: in the loading of V2[i] from memory for which
V0[i] is greater than V1[i], in the production of VMR<0..31> by
the VVGTRF, and in the production of VMR<0..31> by the VSGTRF/1.
#6 |
VVMULF V1, V1, V1
VSTL V1, base, #str
MTVMRLO base
MFVMRLO R1
|
In this example, the value of VMR<31:0> delivered by MFVMRLO only
depends on the value placed into VMR<31:0> by the MTVMRLO --
whether this value is V1[0] or the previous value of the location is
UNPREDICTABLE. As a result, the vector processor need not report
exceptions that occur in the production of V1 by the VVMULF or in the
storage of V1 by the VSTL.
10.7 Synchronization
For most cases, it is desirable for the vector processor to operate
concurrently with the scalar processor so as to achieve good
performance. However, there are cases where the operation of the vector
and scalar processors must be synchronized to ensure correct program
results. Rather than forcing the vector processor to detect and
automatically provide synchronization in these cases, the architecture
provides software with special instructions to accomplish the
synchronization. These instructions synchronize the following:
- Exception reporting between the vector and scalar processors
- Memory accesses between the scalar and vector processors
- Memory accesses between multiple load/store units of the vector
processor
Software must determine when to use these synchronization instructions
to ensure correct results.
The following sections describe the synchronization instructions.
10.7.1 Scalar/Vector Instruction Synchronization (SYNC)
A mechanism for scalar/vector instruction synchronization between the
scalar and vector processors is provided by SYNC, which is implemented
by the MFVP instruction. SYNC allows software to ensure that the
exceptions of previously issued vector instructions are reported before
the scalar processor proceeds with the next instruction. SYNC detects
both arithmetic exceptions and asynchronous memory management
exceptions and reports these exceptions by taking the appropriate VAX
instruction fault. Once it issues the SYNC, the scalar processor
executes no further instructions until the SYNC completes or faults.
In beginning the execution of SYNC, the vector processor determines if
any previously issued vector instruction has encountered exceptions
which have yet to be reported to the scalar processor. If so, the SYNC
is faulted; otherwise, the vector processor waits for either of the
following conditions to be true:
- A pending or currently executing vector instruction encounters an
exception---in which case the SYNC faults
- The vector processor determines that all pending and currently
executing vector instructions (including memory instructions in
asynchronous memory management mode) will execute to completion without
encountering vector exceptions. In that case the SYNC completes.
When SYNC completes, a longword value (which is UNPREDICTABLE) is
returned to the scalar processor. The scalar processor writes the
longword value to the scalar destination of the MFVP and then proceeds
to execute the next instruction. If the scalar destination is in
memory, it is UNPREDICTABLE whether the new value of the destination
becomes visible to the vector processor until scalar/vector memory
synchronization is performed.
When SYNC faults, it is not completed by the vector processor and the
scalar processor does not write a longword value to the scalar
destination of the MFVP. Also, depending on the exception condition
encountered, the SYNC itself takes either a vector processor disabled
fault or memory management fault. If both faults are encountered while
the vector processor is performing SYNC, then the SYNC itself takes a
vector processor disabled fault. Note that it is UNPREDICTABLE whether
the vector processor is idle when the fault is generated. After the
appropriate fault has been serviced, the SYNC may be returned to
through an REI.
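The rule for which fault SYNC takes can be summarized in a few lines of Python. This is a sketch of the priority rule only; the function and parameter names are hypothetical, and each boolean stands for whether the corresponding fault condition was encountered while the vector processor was performing the SYNC.

```python
def sync_outcome(disabled_fault: bool, memory_management_fault: bool) -> str:
    """Return how a SYNC ends, given the fault conditions encountered."""
    if disabled_fault:
        # Takes priority when both fault conditions are encountered.
        return "vector processor disabled fault"
    if memory_management_fault:
        return "memory management fault"
    # No exceptions: an UNPREDICTABLE longword is returned to the scalar
    # processor and written to the scalar destination of the MFVP.
    return "completes"


both = sync_outcome(True, True)
```

When either fault is returned, the scalar destination of the MFVP is not written, and the SYNC can be retried through an REI once the fault has been serviced.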
SYNC only affects the scalar/vector processor pair that executed it. It
has no effect on other processors in a multiprocessor system.
10.7.2 Scalar/Vector Memory Synchronization
Scalar/vector memory synchronization allows software to ensure that the
memory activity of the scalar/vector processor pair has ceased and the
resultant memory write operations have been made visible to each
processor in the pair before the pair's scalar processor proceeds with
the next instruction. Two ways are provided to ensure scalar/vector
memory synchronization: using MSYNC, which is implemented by the MFVP
instruction, and using the MFPR instruction to read the VMAC (Vector
Memory Activity Check) internal processor register (IPR). Section 10.7.2.1
discusses MSYNC in detail. Section 10.7.2.2 discusses VMAC in detail.
Scalar/vector memory synchronization does not mean that previously
issued vector memory instructions have completed; it only means that
the vector and scalar processors are no longer performing memory
operations. While both VMAC and MSYNC provide scalar/vector memory
synchronization, MSYNC performs significantly more than just that
function. In addition, VMAC and MSYNC differ in their exception
behavior.
Note that scalar/vector memory synchronization only affects the
scalar/vector processor pair that executed it. It has no effect on
other processors in a multiprocessor system. Scalar/vector memory
synchronization does not ensure that the write operations made by one
scalar/vector pair are visible to any other scalar or vector processor.
Software can make data visible and shared between a scalar/vector pair
and other scalar and vector processors by using the mechanisms
described in the VAX Architecture Reference Manual. Software must first make a memory write
operation by the vector processor visible to its associated scalar
processor through scalar/vector memory synchronization before making
the write operation visible to other processors. Without performing
this scalar/vector memory synchronization, it is UNPREDICTABLE whether
the vector memory write will be made visible to other processors even
by the mechanisms described in the VAX Architecture Reference Manual.
Lastly, waiting for VPSR<BSY> to be clear does not guarantee that
a vector write operation is visible to the scalar processor.