HP OpenVMS Systems: Ask the Wizard
The Question is:
Hi! I know I do not have much information in stack dump, but is there
a way to debug the C code for this program, and then deposit PC or
other register values to find out the location where it crashed? If
yes, could you please show how to do it? It does not recur.
Thanks for your time and help, Narendra Patel
The Answer is:
It is clearly long past time for a "how do I debug application code"
posting, so here it is...
First off, please acquire and read through the OpenVMS manuals --
reading the manuals is not an admission of defeat, it is how you
learn about the OpenVMS operating system, how to use it, how to
program it, and how to debug applications. The manuals also have
pointers to available tools, routines, and options. Put another
way, PLEASE READ THE AVAILABLE MANUALS -- this is not intended
to appear rude, simply that this is very likely the single most
important step and one of the most vital resources, and a step
that is oft-neglected. (If you have already skimmed the manuals,
thank you!)
The specific manuals that will be of interest here are the OpenVMS
User's Guide, the OpenVMS Programming Concepts Manual, the
OpenVMS Debugger Manual, and the manual(s) specific to your
chosen programming language. You will need general familiarity
with and reference access to other manuals -- such as the OpenVMS
LIB Run-Time Library (RTL) manual and the OpenVMS System Services
Reference Manual.
You will also want to learn the basic troubleshooting technique
for hardware and software -- this is a technique often known as
"divide and conquer", or as a real-world version of a recreational
game known as "Mastermind". Per this approach, you need to segment
the environment into two or more smaller units, and to then prove
(or disprove) the existence of the problem within the smaller unit(s).
Most anything that can be used to divide the environment into smaller
ranges can be used to (quickly) localize the problem -- with simpler
applications, dividing up the application into code modules is among
the most common approaches -- then work through each module, proving
(or disproving) the existence of the fault. With larger and more
complex applications, you may need to create a testing structure
around the routine(s) being tested, and you may need to incorporate
debugging into the application -- writing tracking information into
a debugging log file, for instance, can be useful in finding where
the failure occurs at run-time. Additional details are available
in the OpenVMS Debugger Manual and the OpenVMS Programming Concepts
Manual.
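As an illustration of the debugging log file idea, here is a minimal
sketch in portable C. The routine name trace_to and the record layout
are hypothetical, chosen for this example only; they are not OpenVMS
or C RTL routines. The sketch assumes the caller has already opened
the log stream.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper (not an OpenVMS or C RTL routine): append one
 * trace record to an already-open log stream.
 * Returns 0 on success, -1 if any write or the flush failed.
 */
int trace_to(FILE *log, const char *where, int line, const char *fmt, ...)
{
    va_list ap;
    int ok;

    assert(log != NULL && where != NULL && fmt != NULL);

    /* Prefix each record with the source location of the caller. */
    ok = fprintf(log, "%s:%d: ", where, line) >= 0;

    va_start(ap, fmt);
    ok = (vfprintf(log, fmt, ap) >= 0) && ok;
    va_end(ap);

    ok = (fputc('\n', log) != EOF) && ok;

    /* Flush every record so the last line written survives a crash. */
    ok = (fflush(log) == 0) && ok;

    return ok ? 0 : -1;
}
```

A call such as trace_to(log, __FILE__, __LINE__, "count=%d", count)
at the entry and exit of each suspect routine narrows down where the
failure occurs. Flushing on every record costs some speed, but it is
what keeps the final record intact when the image dies.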
Next, you will need to specifically learn about the OpenVMS condition
value format and -- as you learn more advanced techniques on how to
program and how to debug your code -- about the signal and mechanism
arrays and the basics of a signal handler. As a start, the OpenVMS
condition value describes whether the condition indicates
success (odd) or failure (even), and the severity. The condition
value information available also provides details on what other
information can be included with a signal. Details on signals and
signal handlers and condition values are in the OpenVMS Programming
Concepts Manual.
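The condition value layout can be sketched in a few lines of portable
C. The function names below are hypothetical, for illustration only;
on OpenVMS itself, the system-supplied STS$ definitions should be
preferred over hand-written masks. The sketch assumes the usual
32-bit layout: severity in bits 0-2, message number in bits 3-15,
facility number in bits 16-27, and control bits in bits 28-31.

```c
#include <assert.h>

/* Low bit set (an odd value) means success; clear (even) means failure. */
int cond_is_success(unsigned long cond)
{
    return (int)(cond & 0x1UL);
}

/* Severity is the low three bits of the condition value. */
unsigned cond_severity(unsigned long cond)
{
    return (unsigned)(cond & 0x7UL);
}

/* Message number occupies bits 3-15. */
unsigned cond_message(unsigned long cond)
{
    return (unsigned)((cond >> 3) & 0x1FFFUL);
}

/* Facility number occupies bits 16-27. */
unsigned cond_facility(unsigned long cond)
{
    return (unsigned)((cond >> 16) & 0xFFFUL);
}

/* Map the severity field to a readable name. */
const char *cond_severity_name(unsigned long cond)
{
    static const char *const names[8] = {
        "warning", "success", "error", "informational",
        "severe (fatal) error", "reserved", "reserved", "reserved"
    };
    unsigned sev = cond_severity(cond);

    assert(sev < 8);
    return names[sev];
}
```

For example, SS$_NORMAL has the value 1 (odd, severity 1, success),
while SS$_ACCVIO has the value 12 (even, severity 4, a severe error
from facility 0, SYSTEM).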
In this case, an access violation error (ACCVIO) provides you with
a wealth of information. It provides you with details of the type
of access -- whether it was a failed memory read, a failed memory
write, or some other access -- the failing program counter address,
the virtual address reference that failed -- and other details.
Yes, all of these details are readily available within the
information provided by the access violation! (See attached.)
To decode the virtual addresses returned by an access violation or
by another similar OpenVMS display, you need to have created and
retained a listings file -- preferably one with machine code
generation enabled -- and a full link map. Starting with the virtual
address reported, use the link map to find the module that contributed
the code that contains the virtual address. Then use the object listings
for the component that contributed the code to locate the offset of
the failing instruction. (If this information was not maintained,
working backwards is far more difficult -- you are left to use the
binary instruction data around the failure to locate the associated
source code, and this process is far more involved. Keep the maps
and listing files around, in other words.)
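The address arithmetic described above can be sketched as follows.
This is an illustration under stated assumptions, not a map parser:
the struct map_entry table is hypothetical and would be filled in by
hand from the base address and length that the link map reports for
each module's code contribution.

```c
#include <assert.h>
#include <stddef.h>

/* One code contribution, transcribed by hand from the link map. */
struct map_entry {
    const char   *module;  /* module name as shown in the map        */
    unsigned long base;    /* first virtual address of the module    */
    unsigned long length;  /* number of bytes the module contributed */
};

/*
 * Find the module whose address range contains va.  On a match, the
 * offset of va from the module base is stored through offset (when
 * offset is not NULL); that offset is what gets looked up in the
 * machine code listing.  Returns NULL when no module contains va.
 */
const struct map_entry *find_module(const struct map_entry *map,
                                    size_t count,
                                    unsigned long va,
                                    unsigned long *offset)
{
    size_t i;

    assert(map != NULL || count == 0);

    for (i = 0; i < count; i++) {
        if (va >= map[i].base && va - map[i].base < map[i].length) {
            if (offset != NULL)
                *offset = va - map[i].base;
            return &map[i];
        }
    }
    return NULL;
}
```

Feed the failing PC from the ACCVIO display to find_module, then
locate the returned offset in the listing for that module.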
Rather easier than an approach based on virtual address arithmetic
and far easier than working backwards from the instruction stream
is to use integrated debugging -- this inclusion is arguably an
essential component of any non-trivial application -- and to use the
OpenVMS Debugger. The OpenVMS Debugger in particular can be used to
examine the source code, to examine the stack, and can even be
programmed to wait patiently for the incidence of a particular value
within a particular program variable before reporting back to the
programmer. The debugger can also be activated from within a
signal handler, and commands to produce a traceback can be
issued. Details on the debugger are in the OpenVMS Debugger
Manual.
Please learn how to use the debugger. Additional discussions of
some of the features and uses of the debugger are available in
topics (1017), (1314), (1661), (3031) and (4129).
For a list of common programming bugs, please see topic (1661).
For access violations in particular, there are a large number of
topics here in Ask The Wizard, including topics (837), (1705),
(2195), (2223), (3215), (5533), (6065), (6495), (6776), (7110),
(7551), and others.
For shared memory programming requirements and details useful when
debugging shared memory, please see topics (2681), (6984) and (7383),
among others. The OpenVMS Programming Concepts Manual has details
on memory synchronization requirements, shared memory interlocking,
and related topics.
For information on tracking down virtual memory and memory allocation
bugs in your code -- calls to malloc, free, lib$get_vm, lib$free_vm
and similar routines commonly trip over corruptions in the memory
heap that can (and do) result from buffer overruns -- please see
topic (3257).
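One common way to catch a buffer overrun near its source, rather than
at some later and unrelated malloc or free, is to place guard bytes
after each allocation and verify them. The wrapper below is a minimal
sketch of that technique in portable C. All of the names (guarded_alloc,
guarded_check, guarded_free, GUARD_LEN, GUARD_BYTE) are hypothetical
and are not part of the C RTL or of the LIB$ routines. It detects
overruns past the end of a block only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8      /* guard bytes placed after the user block */
#define GUARD_BYTE 0xA5   /* pattern an overrun is likely to disturb */

/* Header stored in front of the user block; the union pads it so
 * that the user pointer stays suitably aligned. */
union guard_header {
    size_t      size;
    long double align_ld;
    void       *align_p;
};

void *guarded_alloc(size_t n)
{
    unsigned char *raw;

    if (n > (size_t)-1 - sizeof(union guard_header) - GUARD_LEN)
        return NULL;                      /* size would overflow */

    raw = malloc(sizeof(union guard_header) + n + GUARD_LEN);
    if (raw == NULL)
        return NULL;

    ((union guard_header *)raw)->size = n;
    memset(raw + sizeof(union guard_header) + n, GUARD_BYTE, GUARD_LEN);
    return raw + sizeof(union guard_header);
}

/* Returns 1 if the guard bytes are intact, 0 if they were overwritten. */
int guarded_check(const void *p)
{
    const unsigned char *user = p;
    const union guard_header *h;
    size_t i;

    assert(p != NULL);
    h = (const union guard_header *)(user - sizeof(union guard_header));

    for (i = 0; i < GUARD_LEN; i++)
        if (user[h->size + i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Checks the guard, releases the block, and reports the check result. */
int guarded_free(void *p)
{
    int intact = guarded_check(p);

    free((unsigned char *)p - sizeof(union guard_header));
    return intact;
}
```

Calling guarded_check at interesting points (or simply testing the
result of guarded_free) ties the corruption to a specific block and
a narrower window of code.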
You will also want to move to more current product releases, lest
you spend your time and effort debugging something that has been
fixed in the interval since the release shipped. In this case,
OpenVMS Alpha V7.1-1H2 was a hardware release, and it is strongly
recommended that any hardware release be upgraded as soon as a
subsequent mainline release is available (and appropriately tested
in your environment).
Please also acquire and install any mandatory ECO kit(s) for the
version of OpenVMS in use.
C V5.7 is also rather old, and in need of an upgrade -- in particular,
this C release predates a correction to the code generator, a fix
needed for correct code generation on the Alpha 21264 and later Alpha
microprocessors.
Please also acquire and install any mandatory ECO kit(s) for the
version of C in use -- and note that the C run-time library (RTL)
ships with OpenVMS, not with C.
And please, please remember that the OpenVMS manuals are your friends
here, and please remember to read these documents! Again, thank you,
and please realize no offense is intended -- while the OpenVMS Wizard
would certainly like to see OpenVMS be entirely intuitive and not
need any accompanying manuals or documentation, this is presently
not the case. (If you cannot find the supporting documentation or
related materials for the problem(s) of interest, PLEASE LET THE
OpenVMS Wizard KNOW WHAT YOU COULD NOT FIND, and WHAT YOU LOOKED FOR.
Armed with this information, the OpenVMS Wizard can update the OpenVMS
manuals or other documentation, and can also address your questions and
your particular concerns.)
--
ACCVIO, access violation, reason mask='xx', virtual
address='location', PC='location', PSL='xxxxxxxx'
Facility: SYSTEM, System Services
Explanation: An image attempted to read from or write to a memory location
that is protected from the current mode. This message
indicates an exception condition and is followed by a register
and stack dump to help locate the error. The reason mask is
a longword whose lowest 5 bits, if set, indicate that the
instruction caused a length violation (bit 0), referenced the
process page table (bit 1), attempted a read/modify operation
(bit 2), was a vector operation on an improperly aligned
vector element (bit 3), or was a vector instruction reference
to an I/O space address (bit 4).
This message is also displayed when an attempt has been made
to make the user stack larger than the user's virtual address
space permits. For example, the automatic user stack expansion
algorithm reports an access violation with the following
two conditions (which should serve as hints that automatic
expansion has failed):
o The reason mask has the low-order bit set. This indicates a
memory reference that is not described in any page table.
o The relatively small P1 address space virtual address is
referenced.
User Action: Examine the PC and virtual address displayed in the message.
The virtual address is often the address to which the access
attempt was made. However, in the case of vector-related
access violations and in some processor implementations, the
reported virtual address may be some other address in the same
page as that address to which the access was attempted. Check
the program listing to verify that instruction operands or
procedure call arguments are correct.
If this message is displayed because of insufficient virtual
memory for automatic user stack expansion, reduce the user
stack requirements of the image or increase the virtual
address space available to the process in which the image
is executed.
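The reason mask bits listed in the message text above can be decoded
mechanically. The following is a minimal sketch in portable C; the
routine name describe_reason_mask is hypothetical, and the bit
meanings are taken directly from the message explanation above, not
from any system header.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Print one line for each of the low five reason mask bits that is
 * set, using the meanings given in the ACCVIO message explanation.
 * Returns the number of recognized bits that were set.
 */
int describe_reason_mask(unsigned long mask, FILE *out)
{
    static const char *const meaning[5] = {
        "bit 0: length violation",
        "bit 1: referenced the process page table",
        "bit 2: read/modify operation attempted",
        "bit 3: vector operation on an improperly aligned vector element",
        "bit 4: vector instruction reference to an I/O space address"
    };
    int bit;
    int count = 0;

    assert(out != NULL);

    for (bit = 0; bit < 5; bit++) {
        if (mask & (1UL << bit)) {
            fprintf(out, "%s\n", meaning[bit]);
            count++;
        }
    }
    return count;
}
```

For a reported reason mask of 05, for instance, the call
describe_reason_mask(0x05UL, stdout) prints the lines for bit 0 and
bit 2.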