HP OpenVMS Systems: Ask the Wizard
The Question is:

Hi! I know I do not have much information in stack dump, but is there a way to debug the C code for this program, and then deposit PC or other register values to find out the location where it crashed? If yes, could you please show how to do it? It does not recur.

Thanks for your time and help,
Narendra Patel

The Answer is:

It is clearly long past time for a "how do I debug application code" posting, so here it is...

First off, please acquire and read through the OpenVMS manuals -- reading the manuals is not an admission of defeat, it is how you learn about the OpenVMS operating system, how to use it, how to program it, and how to debug applications. The manuals also have pointers to available tools, routines, and options. Put another way, PLEASE READ THE AVAILABLE MANUALS -- this is not intended to appear rude, simply that this is very likely the single most important step and one of the most vital resources, and a step that is oft-neglected. (If you have already skimmed the manuals, thank you!)

The specific manuals that will be of interest here are the OpenVMS User's Guide, the OpenVMS Programming Concepts Manual, the OpenVMS Debugger Manual, and the manual(s) specific to your chosen programming language. You will need general familiarity with and reference access to other manuals -- such as the OpenVMS LIB Run-Time Library (RTL) manual and the OpenVMS System Services Reference Manual.

You will also want to learn the basic troubleshooting technique for hardware and software -- this is a technique often known as "divide and conquer", or as a real-world version of a recreational game known as "Mastermind". Per this approach, you need to segment the environment into two or more smaller units, and to then prove (or disprove) the existence of the problem within the smaller unit(s).
Most anything that can be used to divide the environment into smaller ranges can be used to (quickly) localize the problem -- with simpler applications, dividing up the application into code modules is among the most common approaches -- then work through each module, proving (or disproving) the existence of the fault. With larger and more complex applications, you may need to create a testing structure around the routine(s) being tested, and you may need to incorporate debugging into the application -- writing tracking information into a debugging log file, for instance, can be useful in finding where the failure occurs at run-time. Additional details are available in the OpenVMS Debugger Manual and the OpenVMS Programming Concepts Manual.

Next, you will need to specifically learn about the OpenVMS condition value format and -- as you learn more advanced techniques on how to program and how to debug your code -- about the signal and mechanism arrays and the basics of a signal handler. As a start, the OpenVMS condition value describes whether the condition indicates success (odd) or failure (even), and the severity. The condition value information available also provides details on what other information can be included with a signal. Details on signals and signal handlers and condition values are in the OpenVMS Programming Concepts Manual.

In this case, an access violation error (ACCVIO) provides you with a wealth of information. It provides you with details of the type of access -- whether it was a failed memory read, a failed memory write, or some other access -- the failing program counter address, the virtual address reference that failed -- and other details. Yes, all of these details are readily available within the information provided by the access violation! (See attached.)
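As a rough illustration of the condition value format described above, the following C sketch pulls the fields out of a 32-bit condition value. It assumes the standard layout (severity in bits 0-2, message number in bits 3-15, facility number in bits 16-27); the type and function names are hypothetical and are not part of any OpenVMS header, so treat this as a sketch rather than a replacement for the system definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative breakdown of a 32-bit condition value.
   The names here are invented for this sketch. */
typedef struct {
    unsigned severity;   /* bits 0-2 */
    unsigned message;    /* bits 3-15 */
    unsigned facility;   /* bits 16-27 */
    int      success;    /* bit 0: 1 = success (odd), 0 = failure (even) */
} cond_fields;

/* Split a condition value into its fields. */
cond_fields decode_condition(uint32_t value)
{
    cond_fields f;
    f.severity = value & 0x7u;
    f.message  = (value >> 3) & 0x1FFFu;
    f.facility = (value >> 16) & 0xFFFu;
    f.success  = (int)(value & 1u);
    return f;
}

/* Map the 3-bit severity code to a readable name. */
const char *severity_name(unsigned severity)
{
    static const char *const names[] = {
        "warning", "success", "error", "informational", "severe (fatal) error"
    };
    assert(severity <= 7u);
    return severity < 5u ? names[severity] : "reserved";
}

/* Print one line summarizing the condition value. */
void print_condition(uint32_t value)
{
    cond_fields f = decode_condition(value);
    printf("%%X%08lX: %s, severity=%s, facility=%u, message=%u\n",
           (unsigned long)value,
           f.success ? "success" : "failure",
           severity_name(f.severity), f.facility, f.message);
}
```

Calling print_condition(0x0000000C) on the value of SS$_ACCVIO, for example, reports an even (failure) value with the severe (fatal) severity code.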
To decode the virtual addresses returned by an access violation or by another similar OpenVMS display, you need to have created and retained a listings file -- preferably one with machine code generation enabled -- and a full link map. Starting with the virtual address reported, use the link map to find the module that contributed the code that contains the virtual address. Then use the object listings for the component that contributed the code to locate the offset of the failing instruction. (If this information was not maintained, working backwards is far more difficult -- you are left to use the binary instruction data around the failure to locate the associated source code, and this process is far more involved. Keep the maps and listing files around, in other words.)

Rather easier than an approach based on virtual address arithmetic and far easier than working backwards from the instruction stream is to use integrated debugging -- this inclusion is arguably an essential component of any non-trivial application -- and to use the OpenVMS Debugger. The OpenVMS Debugger in particular can be used to examine the source code, to examine the stack, and can even be programmed to wait patiently for the incidence of a particular value within a particular program variable before reporting back to the programmer. The debugger can also be activated from within a signal handler, and commands to generate a traceback can be issued. Details on the debugger are in the OpenVMS Debugger Manual. Please learn how to use the debugger.

Additional discussions of some of the features and uses of the debugger are available in topics (1017), (1314), (1661), (3031) and (4129). For a list of common programming bugs, please see topic (1661). For access violations in particular, there are a large number of topics here in Ask The Wizard, including topics (837), (1705), (2195), (2223), (3215), (5533), (6065), (6495), (6776), (7110), (7551), and others.
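The virtual address arithmetic mentioned above (link map first, then the listing offset) can be sketched in C as a simple range lookup. Everything in this example is hypothetical: the map_entry type, the find_module function, and any module names and addresses you feed it stand in for values you would copy by hand from your own link map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One code contribution copied from a link map (illustrative type). */
typedef struct {
    const char *module;  /* module name as shown in the map */
    uint64_t    base;    /* starting virtual address of the contribution */
    uint64_t    length;  /* length of the contribution in bytes */
} map_entry;

/* Return the entry whose [base, base + length) range contains pc,
   or NULL if no entry matches. On a match, *offset (if non-NULL)
   receives the offset of pc within that module, which is the value
   to look up in the machine code listing. */
const map_entry *find_module(const map_entry *map, size_t count,
                             uint64_t pc, uint64_t *offset)
{
    assert(map != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (pc >= map[i].base && pc - map[i].base < map[i].length) {
            if (offset != NULL) {
                *offset = pc - map[i].base;
            }
            return &map[i];
        }
    }
    return NULL;
}
```

With a made-up table holding a module at base 0x20400 and length 0x1200, a reported PC of 0x20A34 resolves to that module at offset 0x634. The same subtraction is what you would do by hand with the map and the listing.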
For shared memory programming requirements and details useful when debugging shared memory, please see topics (2681), (6984) and (7383), among others. The OpenVMS Programming Concepts Manual has details on memory synchronization requirements, shared memory interlocking, and related topics.

For information on tracking down virtual memory and memory allocation bugs in your code -- calls to malloc, free, lib$get_vm, lib$free_vm and similar routines commonly trip over corruptions in the memory heap that can (and do) result from buffer overruns -- please see topic (3257).

You will also want to move to more current product releases, lest you spend your time and effort debugging something that has been fixed in the interval since the release shipped. In this case, OpenVMS Alpha V7.1-1H2 was a hardware release, and it is strongly recommended that any hardware release be upgraded as soon as a subsequent mainline release is available (and appropriately tested in your environment). Please also acquire and install any mandatory ECO kit(s) for the version of OpenVMS in use. C V5.7 is also rather old, and in need of an upgrade -- this C release particularly predates a correction to the code generator; a fix needed for correct code generation for use on the Alpha 21264 and later Alpha microprocessors. Please also acquire and install any mandatory ECO kit(s) for the version of OpenVMS in use -- the C run-time library (RTL) ships with OpenVMS, not with C.

And please, please remember that the OpenVMS manuals are your friends here, and please remember to read these documents! Again, thank you, and please realize no offense is intended -- while the OpenVMS Wizard would certainly like to see OpenVMS be entirely intuitive and not need any accompanying manuals or documentation, this is presently not the case. (If you cannot find the supporting documentation or related materials for the problem(s) of interest, PLEASE LET THE OpenVMS Wizard KNOW WHAT YOU COULD NOT FIND, and WHAT YOU LOOKED FOR.
Armed with this information, the OpenVMS Wizard can update the OpenVMS manuals or other documentation, and can also address your questions and your particular concerns.)

--

ACCVIO, access violation, reason mask='xx', virtual address='location', PC='location', PSL='xxxxxxxx'

Facility: SYSTEM, System Services

Explanation: An image attempted to read from or write to a memory location that is protected from the current mode. This message indicates an exception condition and is followed by a register and stack dump to help locate the error.

The reason mask is a longword whose lowest 5 bits, if set, indicate that the instruction caused a length violation (bit 0), referenced the process page table (bit 1), attempted a read/modify operation (bit 2), was a vector operation on an improperly aligned vector element (bit 3), or was a vector instruction reference to an I/O space address (bit 4).

This message is also displayed when an attempt has been made to make the user stack larger than the user's virtual address space permits. For example, the automatic user stack expansion algorithm reports an access violation with the following two conditions (which should serve as hints that automatic expansion has failed):

o The reason mask has the low-order bit set. This indicates a memory reference that is not described in any page table.

o The relatively small P1 address space virtual address is referenced.

User Action: Examine the PC and virtual address displayed in the message. The virtual address is often the address to which the access attempt was made. However, in the case of vector-related access violations and in some processor implementations, the reported virtual address may be some other address in the same page as that address to which the access was attempted. Check the program listing to verify that instruction operands or procedure call arguments are correct.

If this message is displayed because of insufficient virtual memory for automatic user stack expansion, reduce the user stack requirements of the image or increase the virtual address space available to the process in which the image is executed.
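The reason mask bits listed in the explanation above can be decoded mechanically. The following C sketch takes the bit assignments exactly as the message text gives them; the function name and the wording of the strings are made up for illustration, and the meaning of the mask on a particular processor should be confirmed against the documentation for that platform.

```c
#include <assert.h>
#include <stdio.h>

/* Text for reason mask bits 0-4, in the order given in the
   ACCVIO message explanation. */
static const char *const reason_text[5] = {
    "length violation",
    "reference to the process page table",
    "read/modify operation attempted",
    "vector operation on an improperly aligned vector element",
    "vector instruction reference to an I/O space address"
};

/* Print one line per set bit among bits 0-4 of the reason mask.
   Returns the number of those bits that were set. */
int print_reason_mask(FILE *out, unsigned mask)
{
    int count = 0;
    assert(out != NULL);
    for (int bit = 0; bit < 5; bit++) {
        if (mask & (1u << bit)) {
            fprintf(out, "bit %d: %s\n", bit, reason_text[bit]);
            count++;
        }
    }
    return count;
}
```

A call such as print_reason_mask(stdout, 0x05) reports bits 0 and 2; a mask with the low-order bit set alongside a P1 address is the stack expansion hint described in the explanation.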