HP OpenVMS Systems Documentation

OpenVMS System Manager's Manual


16.6 Writing the System Dump File to the System Disk

If you have more than one path to the system disk, or the system disk is a shadow set with multiple members, you must take additional steps to ensure that a system dump can be written to the system disk.

16.6.1 System Dump to System Disk on Alpha

If there is more than one path to the system disk, the console environment variable DUMP_DEV must describe all paths to the system disk. This ensures that if the original boot path becomes unavailable due to failover, the system can still locate the system disk and write the system dump to it.

If the system disk shadow set has multiple members, the console environment variable DUMP_DEV must describe all paths to all members of the shadow set. This ensures that if the master member changes, the system can still locate the master member and write the system dump to it.

If you do not define DUMP_DEV, the system can write a system dump only to the physical disk used at boot time using only the same physical path used at boot time. For instructions on setting DUMP_DEV, see Section 16.7.1.

Certain configurations (for example, those using Fibre Channel disks) may contain more combinations of paths to the system disk than can be listed in DUMP_DEV. In that case, Compaq recommends that you include in DUMP_DEV all paths to what is usually the master member of the shadow set, because shadow set membership changes occur less often than path changes.

You can write the system dump to an alternate disk (see Section 16.7.1), but when doing so you must still define a path to the system disk for writing error logs. Also, DUMP_DEV should contain all paths to the system disk in addition to the paths to the alternate dump disk.

If there are more paths than DUMP_DEV can contain, Compaq recommends that you define all paths to the dump disk and as many paths as possible (but at least one) to the system disk. Note that the system disk must be the last entry in the list.
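The ordering rule above can be sketched in Python. This is only an illustration under stated assumptions: the function name `build_dump_dev` and the `max_entries` limit are hypothetical (the real limit depends on the console firmware), and the device paths are borrowed from the example in Section 16.7.1.

```python
def build_dump_dev(dump_paths, system_paths, max_entries):
    """Return a DUMP_DEV value: dump disk paths first, system disk paths last.

    max_entries is a hypothetical stand-in for the firmware-specific limit
    on how many paths DUMP_DEV can hold.
    """
    if not system_paths:
        raise ValueError("at least one system disk path is required")
    if len(dump_paths) + 1 > max_entries:
        raise ValueError("no room for all dump disk paths plus one system disk path")
    # Keep every dump disk path, then as many system disk paths as still fit.
    room = max_entries - len(dump_paths)
    return ",".join(list(dump_paths) + list(system_paths)[:room])


dump = ["dua208.4.0.2.3", "dub208.7.0.4.3"]
system = ["dub204.7.0.4.3", "dua204.4.0.2.3"]
print(build_dump_dev(dump, system, max_entries=4))
print(build_dump_dev(dump, system, max_entries=3))
```

With room for four entries the sketch lists every path; with room for three it keeps both dump disk paths and only the first system disk path, which still satisfies the "at least one" rule.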

16.6.2 System Dump to System Disk on VAX

To ensure that the system can locate the system disk and write the system dump to it when there is more than one path to the system disk, or when the system disk shadow set has multiple members, you must follow the platform-specific instructions regarding booting. On some VAX systems, you must set appropriate register values; on other VAX systems, you must set specific environment variables. See the upgrade and installation supplement for your VAX system for details.

Note that if the system has multiple CI star couplers, the shadow set members must all be connected through the same star coupler.

16.7 Writing the System Dump File to an Alternate Disk

You can write the system dump file to a device other than the system disk on OpenVMS systems. This capability is called dump off system disk (DOSD). It is especially useful in large-memory systems and in clusters with common system disks, where a single disk does not always have enough space to meet dump file requirements.

Requirements for DOSD differ somewhat between VAX and Alpha systems. On both, however, you must set the DUMPSTYLE system parameter correctly so that the bugcheck code can write the system dump file to an alternate device.

The following sections describe the requirements for DOSD on Alpha and VAX systems.

16.7.1 DOSD Requirements on Alpha Systems

On Alpha systems, DOSD has the following requirements:

  • The dump device directory structure must resemble the current system disk structure. The [SYSn.SYSEXE]SYSDUMP.DMP file will reside there, with the same boot time system root.
    Use AUTOGEN to create this file. In the MODPARAMS.DAT file, the following symbol prompts AUTOGEN to create the file:


    DUMPFILE_DEVICE = $nnn$ddcuuuu
    

    You can enter a list of devices.
  • The dump disk must have an ODS-2 file structure.
  • The dump device cannot be part of a volume set.
  • Although not a requirement, Compaq recommends that you mount the dump device during system startup. If the dump device is mounted, it can be accessed by CLUE and AUTOGEN and for the analysis of crash dumps. For best results, include the MOUNT command in SYS$MANAGER:SYCONFIG.COM.
  • For the Crash Log Utility Extractor (CLUE) to support DOSD, you must define the logical name CLUE$DOSD_DEVICE to point to the dump file to be analyzed after a system crash. For instructions, refer to Section 16.9.
  • The dump device cannot be part of a shadow set unless it is also the system device and the master member of the shadow set.
  • Use the following format to specify the dump device environment variable DUMP_DEV at the console prompt:

    >>> SET DUMP_DEV device-name[...]

    Note

    On DEC 3000 series systems, the following restrictions on the use of the DUMP_DEV environment variable exist:
    • This variable is not preserved across system power failures because DEC 3000 series systems do not have enough nonvolatile RAM to save the contents of the variable. You must reset the DUMP_DEV variable after a power failure. (DUMP_DEV is preserved across all other types of restarts and bootstraps, however.)
    • You cannot clear DUMP_DEV (except by power-cycling the system).
    • You must use console firmware Version 6.0 or greater because earlier versions do not provide support for DUMP_DEV.

    On some CPU types, you can enter only one device; on other CPU types, you can enter a list of devices. The list can include various alternate paths to the system disk and the dump disk.
    When you specify an alternate path with DUMP_DEV, the disk can fail over to that path while the system is running. If the system subsequently crashes, the bugcheck code can use the alternate path by referring to the contents of DUMP_DEV.
    When you enter a list of devices, however, the system disk must come last.
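As a quick way to check an existing DUMP_DEV value against the last rule, the following Python sketch tests whether every system disk path trails the other entries. The function name `system_disk_is_last` is hypothetical, and the check is purely textual; it knows nothing about the actual hardware configuration.

```python
def system_disk_is_last(dump_dev, system_paths):
    """Return True if DUMP_DEV names a system disk path and no other
    entry appears after the first system disk path."""
    entries = [e.strip().lower() for e in dump_dev.split(",") if e.strip()]
    system = {p.lower() for p in system_paths}
    seen_system = False
    for entry in entries:
        if entry in system:
            seen_system = True
        elif seen_system:
            # A non-system entry follows a system disk entry.
            return False
    return seen_system


system_paths = ["dub204.7.0.4.3", "dua204.4.0.2.3"]
good = "dua208.4.0.2.3,dub208.7.0.4.3,dub204.7.0.4.3,dua204.4.0.2.3"
bad = "dub204.7.0.4.3,dua208.4.0.2.3,dub208.7.0.4.3,dua204.4.0.2.3"
print(system_disk_is_last(good, system_paths))  # True
print(system_disk_is_last(bad, system_paths))   # False
```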

How to Perform This Task

To designate the dump device with the DUMP_DEV environment variable, and enable the DUMPSTYLE system parameter, follow these steps:

  1. Display the value of BOOTDEF_DEV; for example:


    >>> SHOW BOOTDEF_DEV
    


    BOOTDEF_DEV             dub204.7.0.4.3,dua204.4.0.2.3
    
  2. Display the devices on the system as follows:


    >>> SHOW DEVICES
    


    Resetting IO subsystem...
    
    dua204.4.0.2.3     $4$DUA204 (RED70A)        RA72
    dua206.4.0.2.3     $4$DUA206 (RED70A)        RA72
    dua208.4.0.2.3     $4$DUA208 (RED70A)        RA72
    
    polling for units on cixcd1, slot 4, xmi0...
    
    dub204.7.0.4.3     $4$DUA204 (GRN70A)        RA72
    dub206.7.0.4.3     $4$DUA206 (GRN70A)        RA72
    dub208.7.0.4.3     $4$DUA208 (GRN70A)        RA72
    >>>
    

    In this example:
    • DUA204 is the system disk device.
    • DUA208 is the DOSD device.
  3. To provide two paths to the system disk, with the dump disk as DUA208 (also with two paths), set DUMP_DEV as follows:


    >>> SET DUMP_DEV dua208.4.0.2.3,dub208.7.0.4.3,dub204.7.0.4.3,dua204.4.0.2.3
    

    In this example, dua208.4.0.2.3 and dub208.7.0.4.3 are paths to the dump device; dub204.7.0.4.3 and dua204.4.0.2.3 are paths to the boot device.

    Note

    The system chooses the first valid device that it finds in the list as the dump device. Therefore, the dump disk path entries must appear before the system disk entries in the list.
  4. Display all environment variables on the system by entering the SHOW * command; for example:


    >>> SHOW *
    


    auto_action             HALT
    baud                    9600
    boot_dev  dua204.4.0.2.3
    boot_file
    boot_osflags            0,0
    boot_reset              ON
    bootdef_dev             dub204.7.0.4.3,dua204.4.0.2.3
    booted_dev  dua204.4.0.2.3
    booted_file
    booted_osflags          0,0
    cpu                     0
    cpu_enabled             ff
    cpu_primary             ff
    d_harderr               halt
    d_report                summary
    d_softerr               continue
    dump_dev  dua208.4.0.2.3,dub208.7.0.4.3,dub204.7.0.4.3,dua204.4.0.2.3
    enable_audit            ON
    interleave              default
    language                36
    pal                     V5.48-3/O1.35-2
    prompt                  >>>
    stored_argc             2
    stored_argv0            B
    stored_argv1            dua204.4.0.2.3
    system_variant          0
    version                 T4.3-4740 Jun 14 2000 15:16:38
    >>>
    
  5. Enable the DOSD bit of the DUMPSTYLE system parameter by setting bit 2. For example, enter the value 4 at the SYSBOOT> prompt (available during a conversational boot) to designate an uncompressed physical dump to an alternate disk with minimal console output:


    >>> BOOT
    SYSBOOT> SET DUMPSTYLE 4
    

The OpenVMS System Management Utilities Reference Manual and online help contain details about the DUMPSTYLE system parameter.
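The relationship between "bit 2" and "the value 4" in step 5 is plain bit arithmetic, sketched here in Python. The helper names are hypothetical, and only bit 2 is interpreted; the meaning of the other DUMPSTYLE bits is left to the reference manual.

```python
DOSD_BIT = 2  # bit 2 of DUMPSTYLE selects dump off system disk


def enable_dosd(dumpstyle):
    """Return dumpstyle with the DOSD bit set, leaving other bits unchanged."""
    return dumpstyle | (1 << DOSD_BIT)


def dosd_enabled(dumpstyle):
    """Return True if the DOSD bit is set in dumpstyle."""
    return bool(dumpstyle & (1 << DOSD_BIT))


print(enable_dosd(0))   # 4
print(dosd_enabled(4))  # True
print(dosd_enabled(1))  # False
```

Setting the bit with OR rather than assigning 4 outright preserves whatever other DUMPSTYLE options are already in effect.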

Note

The error log dump file is always created on the system disk so that error log buffers can be restored when the system is rebooted. This file is not affected by setting the DUMPSTYLE system parameter or the DUMP_DEV environment variable.

16.7.2 DOSD Requirements on VAX Systems

On VAX systems, DOSD has the following requirements:

  • The system must be connected directly to, and must boot from, CI controllers.
  • The dump device must physically connect to the same two HSx CI controllers as the boot device. These two controllers must be connected through the same CI star coupler.
  • The dump device directory structure must resemble the current system disk structure. The [SYSn.SYSEXE]SYSDUMP.DMP file will reside there, with the same boot time system root.
    Use AUTOGEN to create this file. In the MODPARAMS.DAT file, the following symbol prompts AUTOGEN to create the file:


    DUMPFILE_DEVICE = $nnn$ddcuuuu
    

    You can list only one device.
  • The volume label can be up to 12 characters long. The ASCII string DOSD_DUMP must be part of this volume label. For example, DOSD_DUMP, DOSD_DUMP_12, and 12_DOSD_DUMP are all valid volume labels. The label is read and retained in a memory boot data structure.
  • The dump device cannot be part of a volume set. Compaq strongly recommends that the dump device also not be part of a shadow set.
  • The dump device cannot be MSCP unit zero (0); only units 1 to 4095 (1 to FFF hexadecimal) are supported.
    You can designate the dump device as follows:
    • On VAX 7000 configurations, by using bits 16 through 27 of the DUMPSTYLE system parameter. Note that the DUMP_DEV environment variable that is provided on VAX 7000 configurations is not used by OpenVMS VAX.
    • On configurations other than the VAX 7000, by using bits 16 through 27 of register 3 (R3). You can use this portion of the register to specify the dump device.

    The OpenVMS System Management Utilities Reference Manual and online help contain details about the DUMPSTYLE system parameter.
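Two of the VAX requirements above lend themselves to a small Python sketch: the volume label rule, and placing a unit number in bits 16 through 27. The function names are hypothetical, and the sketch assumes that this 12-bit field holds the MSCP unit number, which the supported range of 1 to 4095 suggests but which you should confirm in the reference manual.

```python
UNIT_SHIFT = 16    # the field starts at bit 16
UNIT_MASK = 0xFFF  # 12 bits wide: bits 16 through 27


def label_is_valid(label):
    """Check the DOSD volume label rule: at most 12 characters,
    containing the string DOSD_DUMP."""
    return len(label) <= 12 and "DOSD_DUMP" in label


def set_dump_unit(value, unit):
    """Return value with bits 16-27 replaced by unit (assumed to be
    the MSCP unit number); other bits are left unchanged."""
    if not 1 <= unit <= UNIT_MASK:
        raise ValueError("unit must be in the range 1 to 4095")
    cleared = value & ~(UNIT_MASK << UNIT_SHIFT)
    return cleared | (unit << UNIT_SHIFT)


print(label_is_valid("12_DOSD_DUMP"))   # True
print(hex(set_dump_unit(0x4, 208)))     # 0xd00004
```

The same arithmetic applies whether the value is destined for DUMPSTYLE on a VAX 7000 or for R3 on other configurations.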

Note

To restore error log buffers when the system is rebooted after a system crash, the error logs must be saved on the system disk. For this purpose, AUTOGEN creates a SYSDUMP.DMP file on the system disk; the file is large enough to contain the maximum size of error log buffers.

16.8 Using SDA to Analyze the Contents of a Crash Dump

The System Dump Analyzer utility (SDA) lets you interpret the contents of the system dump file to investigate the probable causes of the crash. For information about analyzing a crash dump, refer to the OpenVMS VAX System Dump Analyzer Utility Manual or the OpenVMS Alpha System Analysis Tools Manual.

If your system fails, use SDA to make a copy of the system dump file written at the time of the failure and contact your Compaq support representative. For information about copying the system dump file, see Section 16.12.

16.9 Using SDA CLUE Commands to Analyze Crash Dump Files (Alpha Only)

SDA CLUE (Crash Log Utility Extractor) commands automate the analysis of crash dumps and maintain a history of all fatal bugchecks on a standalone system or cluster. You can use SDA CLUE commands in conjunction with SDA to collect and decode additional dump file information not readily accessible through standard SDA. You can also use SDA CLUE with Dump Off System Disk (DOSD) to analyze a system dump file that resides on a disk other than the system disk.

16.9.1 Understanding CLUE (Alpha Only)

On Alpha systems, SDA is automatically invoked by default when you reboot the system after a system failure. To better facilitate crash dump analysis, SDA CLUE commands automatically capture and archive summary dump file information in a CLUE listing file.

A startup command procedure initiates commands that:

  • Invoke SDA
  • Issue an SDA CLUE HISTORY command
  • Create a listing file called CLUE$nodename_ddmmyy_hhmm.LIS

The CLUE HISTORY command adds a one-line summary entry to a history file and saves the following output from SDA CLUE commands in the listing file:

  • Crash dump summary information
  • System configuration
  • Stack decoder
  • Page and swap files
  • Memory management statistics
  • Process DCL recall buffer
  • Active XQP processes
  • XQP cache header

The contents of this CLUE list file can help you analyze a system failure.

If these listing files together occupy more space than the threshold allows (5000 blocks by default), the oldest files are deleted until the total falls within the threshold. You can change the threshold by defining the CLUE$MAX_BLOCK logical name.
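The cleanup policy described above (oldest first, until the total fits) can be illustrated with a short Python sketch. It models the policy only and is not CLUE's actual implementation; the function name, the tuple layout, and the sample file names and sizes are all hypothetical.

```python
def files_to_purge(listings, max_blocks=5000):
    """Return the names of listing files to delete, oldest first.

    listings is an iterable of (name, blocks, created) tuples, where
    created is any sortable timestamp.
    """
    ordered = sorted(listings, key=lambda item: item[2])
    total = sum(blocks for _, blocks, _ in ordered)
    doomed = []
    for name, blocks, _ in ordered:
        if total <= max_blocks:
            break
        doomed.append(name)
        total -= blocks
    return doomed


listings = [
    ("CLUE$NODE1_140600_1516.LIS", 2500, "2000-06-14"),
    ("CLUE$NODE1_010300_0900.LIS", 3000, "2000-03-01"),
    ("CLUE$NODE1_200700_2210.LIS", 1000, "2000-07-20"),
]
print(files_to_purge(listings))  # ['CLUE$NODE1_010300_0900.LIS']
```

Here the three files total 6500 blocks, so deleting the oldest one (3000 blocks) is enough to get back under the default threshold.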

To prevent CLUE from running at system startup, define the logical name CLUE$INHIBIT as TRUE in the system logical name table (DEFINE with the /SYSTEM qualifier) in the SYLOGICALS.COM file.

It is important to remember that CLUE$nodename_ddmmyy_hhmm.LIS contains only an overview of the crash dump and does not always contain enough information to determine the cause of the crash. If you must do an in-depth analysis of the system crash, Compaq recommends that you always use the SDA COPY command to save the dump file.

16.9.2 Displaying Data Using SDA CLUE Commands (Alpha Only)

Invoke CLUE commands at the SDA prompt as follows:


SDA> CLUE CONFIG

CLUE commands provide summary information of a crash dump captured from a dump file. When debugging a crash dump interactively, you can use SDA CLUE commands to collect and decode some additional information from a dump file, which is not easily accessible through standard SDA. For example, CLUE can quickly provide detailed XQP summaries.

You can also use CLUE commands interactively on a running system to help identify performance problems.

You can use all CLUE commands when analyzing crash dumps; the only CLUE commands that are not allowed when analyzing a running system are CLUE CRASH, CLUE ERRLOG, CLUE HISTORY, and CLUE STACK.

Refer to the OpenVMS Alpha System Analysis Tools Manual for more information about using SDA CLUE commands.

16.9.3 Using SDA CLUE with Dump Off System Disk (Alpha Only)

Dump off system disk (DOSD) allows you to write the system dump file to a device other than the system disk. For SDA CLUE to be able to correctly find the dump file to be analyzed after a system crash, perform the following steps:

  1. Modify the command procedure SYS$MANAGER:SYCONFIG.COM to add the system logical name CLUE$DOSD_DEVICE to point to the device where the dump file resides. You need to supply only the physical or logical device name without a file specification.
  2. Modify the command procedure SYS$MANAGER:SYCONFIG.COM to mount systemwide the device where the dump file resides. Otherwise, SDA CLUE cannot access and analyze the dump file.

In the following example, the dump file is placed on device $3$DUA25, with the label DMP$DEV. You need to add the following commands to SYS$MANAGER:SYCONFIG.COM:


$ mount/system/noassist $3$dua25: dmp$dev dmp$dev
$ define/system clue$dosd_device dmp$dev

16.10 Using CLUE to Obtain Historical Information About Crash Dumps (VAX Only)

On VAX systems, the Crash Log Utility Extractor (CLUE) displays the contents of a crash history file. By examining the contents of the crash history file, you can understand and resolve the issues responsible for failures (crashes), and you might also obtain other useful data.

16.10.1 Understanding CLUE (VAX Only)

The crash history file, which is created and updated by CLUE, contains key parameters from crash dump files. Unlike crash dumps, which are overwritten with each system failure and are therefore typically available only for the most recent failure, the crash history file is a permanent record of system failures.

After a system fails and physical memory is copied to the crash dump file, CLUE automatically appends the relevant parameters to the file CLUE$OUTPUT:CLUE$HISTORY.DATA when the system is restarted. The remainder of this section describes how you can use CLUE to display the data it has collected; reference information about CLUE is available in the OpenVMS System Management Utilities Reference Manual.

Note

The history file typically grows by about 10 to 15 blocks for each entry. You can limit the number of entries in the binary file by defining the logical name CLUE$MAX_ENTRIES to be the maximum number desired. When this number is reached, the oldest entries are deleted from the history file.
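To pick a value for CLUE$MAX_ENTRIES, it can help to turn the 10 to 15 blocks per entry figure into a size estimate. The Python sketch below does that arithmetic; the function names and the 1500-block budget are hypothetical, and the per-entry figures are the approximate ones quoted above.

```python
def history_blocks(entries, low=10, high=15):
    """Return the (minimum, maximum) estimated history file size in blocks."""
    return entries * low, entries * high


def max_entries_for(budget_blocks, high=15):
    """Return a conservative entry limit that keeps the history file
    within budget_blocks, using the larger per-entry estimate."""
    return budget_blocks // high


print(history_blocks(100))    # (1000, 1500)
print(max_entries_for(1500))  # 100
```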

By default, operator shutdowns are recorded in the history file. You can exclude information from operator shutdowns in the history file by defining the logical name CLUE$EXCLUDE_OPERS as being TRUE, for example by including the following line in SYS$MANAGER:SYSTARTUP_VMS.COM:


$ DEFINE /SYSTEM CLUE$EXCLUDE_OPERS TRUE

16.10.2 Displaying Data Using CLUE (VAX Only)

To display data using CLUE, you must first define the following symbol:


$ CLUE :== $CLUE

After defining the symbol, you can use CLUE to display information by entering the following command:


$ CLUE/DISPLAY
CLUE_DISPLAY>

At the CLUE_DISPLAY> prompt, you can issue commands to perform the following actions:

  • Use the DIRECTORY command to list failures that have occurred since a specified date, failures of a particular type, failures that contain a specified module, and failures that have a specified offset.
    For example, you can list all the failures in the history file using the DIRECTORY command, as follows:


    CLUE_DISPLAY> DIRECTORY
    
  • Use the SHOW command to generate information similar to that obtained from certain commands in the System Dump Analyzer utility (SDA).
    For example, if you wanted complete information about the crash listed as crash number 7, the following SHOW command would provide the information:


    CLUE_DISPLAY> SHOW ALL 7
    
  • Use the EXTRACT command to write the data from an entry to a file.
    For example, the following command writes the data from entry number 7 in the crash history file to a file named 15MAYCRASH.TXT:


    CLUE_DISPLAY> EXTRACT 7/OUTPUT=15MAYCRASH.TXT
    

For more information about CLUE commands, refer to the OpenVMS System Management Utilities Reference Manual.

16.11 Saving the Contents of the System Dump File After a System Failure

If the system fails, it overwrites the contents of the system crash dump file and the previous contents are lost. For this reason, ensure that your system automatically analyzes and copies the contents of the system dump file each time the system reboots.

On Alpha systems, SDA is invoked by default during startup, and a CLUE list file is created. Generated by a set sequence of commands, the CLUE list file contains only an overview of the crash and might not provide enough information to determine the cause of the crash. Compaq, therefore, recommends that you always copy the system dump file.

Refer to the OpenVMS Alpha System Analysis Tools Manual for information about modifying your site-specific command procedure to execute additional commands such as SDA COPY upon startup after a system failure.

On VAX systems, modify the site-specific startup command procedure SYSTARTUP_VMS.COM so that it invokes the System Dump Analyzer utility (SDA) when the system is booted.

Be aware of the following facts:

  • When invoked from the site-specific startup procedure in the STARTUP process, SDA executes the specified commands only if the system is booting immediately after a system failure. If the system is rebooting after it was shut down with SHUTDOWN.COM or OPCCRASH.EXE, SDA exits without executing the commands.
  • Although you can use the DCL command COPY to copy the dump file, the SDA command COPY is preferable because it copies only the blocks occupied by the dump and it marks the dump file as copied. The SDA COPY command is preferable also when the dump was written into the paging file, SYS$SYSTEM:PAGEFILE.SYS, because the SDA COPY command releases to the pager those pages occupied by the dump. For more information, see Section 16.13.
  • Because a system dump file can contain privileged information, protect copies of dump files from world read access. For more information about file protection, refer to the OpenVMS Guide to System Security.
  • System dump files have the NOBACKUP attribute, so the Backup utility (BACKUP) does not copy them unless you use the qualifier /IGNORE=NOBACKUP when invoking BACKUP. When you use the SDA command COPY to copy the system dump file to another file, the operating system does not automatically set the new file to NOBACKUP. If you want to set the NOBACKUP attribute on the copy, use the SET FILE command with the /NOBACKUP qualifier as described in the OpenVMS DCL Dictionary.

Example

The SDA command COPY in the following example saves the contents of the file SYS$SYSTEM:PAGEFILE.SYS and performs some analysis of the file. Note that the COPY command is the final command because the blocks of the page file used by the dump are released as soon as the COPY command completes, and can be used for paging before any other SDA commands can be executed.


$ !
$ !      Print dump listing if system just failed
$ !
$ ANALYZE/CRASH_DUMP SYS$SYSTEM:PAGEFILE.SYS
    SET OUTPUT DISK1:SYSDUMP.LIS        ! Create listing file
    READ/EXECUTIVE                      ! Read in symbols for kernel
    SHOW CRASH                          ! Display crash information
    SHOW STACK                          ! Show current stack
    SHOW SUMMARY                        ! List all active processes
    SHOW PROCESS/PCB/PHD/REG            ! Display current process
    COPY SYS$SYSTEM:SAVEDUMP.DMP        ! Save system dump file
    EXIT
$ SET FILE/NOBACKUP SYS$SYSTEM:SAVEDUMP.DMP

