OpenVMS System Manager's Manual
16.6 Writing the System Dump File to the System Disk
If you have more than one path to the system disk, or the system disk
is a shadow set with multiple members, you must take additional steps
to ensure that a system dump can be written to the system disk.
16.6.1 System Dump to System Disk on Alpha
If there is more than one path to the system disk, the console
environment variable DUMP_DEV must describe all paths to the system
disk. This ensures that if the original boot path becomes unavailable
due to failover, the system can still locate the system disk and write
the system dump to it.
If the system disk shadow set has multiple members, the console
environment variable DUMP_DEV must describe all paths to all members of
the shadow set. This ensures that if the master member changes, the
system can still locate the master member and write the system dump to
it.
If you do not define DUMP_DEV, the system can write a system dump only
to the physical disk used at boot time, and only over the same physical
path used at boot time. For instructions on setting DUMP_DEV, see
Section 16.7.1.
Certain configurations (for example, those using Fibre Channel disks)
may contain more combinations of paths to the system disk than can be
listed in DUMP_DEV. In that case, Compaq recommends that you include in
DUMP_DEV all paths to what is usually the master member of the shadow
set, because shadow set membership changes occur less often than path
changes.
You can write the system dump to an alternate disk (see
Section 16.7.1), but when doing so you must still define a path to the
system disk for writing error logs. Also, DUMP_DEV should contain all
paths to the system disk in addition to the paths to the alternate dump
disk.
If there are more paths than DUMP_DEV can contain, Compaq recommends
that you define all paths to the dump disk and as many paths as
possible (but at least one) to the system disk. Note that the system
disk must be the last entry in the list.
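The ordering rules above can be sketched in a few lines of Python. This
is only an illustration, not a Compaq-supplied tool: the function name
build_dump_dev and the max_entries parameter are hypothetical, the
device names are illustrative console device names, and the real limit
on the number of entries depends on your console firmware.

```python
def build_dump_dev(dump_disk_paths, system_disk_paths, max_entries=None):
    """Build a DUMP_DEV value: dump disk paths first, system disk paths last.

    max_entries is a hypothetical stand-in for whatever limit your
    console firmware imposes; None means no limit.
    """
    if not system_disk_paths:
        raise ValueError("at least one path to the system disk is required")
    entries = list(dump_disk_paths)          # all paths to the dump disk
    if max_entries is None:
        room = len(system_disk_paths)
    else:
        room = max_entries - len(entries)    # space left for system disk paths
    if room < 1:
        raise ValueError("no room left for a system disk path")
    entries.extend(system_disk_paths[:room]) # system disk entries come last
    return ",".join(entries)


dump_paths = ["dua208.4.0.2.3", "dub208.7.0.4.3"]
system_paths = ["dub204.7.0.4.3", "dua204.4.0.2.3"]
print("SET DUMP_DEV " + build_dump_dev(dump_paths, system_paths))
```

If there is no alternate dump disk, pass an empty list for the dump
disk paths; the result then contains only the system disk paths.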
16.6.2 System Dump to System Disk on VAX
To ensure that the system can locate the system disk and write the
system dump to it when there is more than one path to the system disk,
or when the system disk shadow set has multiple members, you must
follow the platform-specific instructions regarding booting. On some
VAX systems, you must set appropriate register values; on other VAX
systems, you must set specific environment variables. See the upgrade
and installation supplement for your VAX system for details.
Note that if the system has multiple CI star couplers, the shadow set
members must all be connected through the same star coupler.
16.7 Writing the System Dump File to an Alternate Disk
On OpenVMS systems, you can write the system dump file to a device
other than the system disk. This capability, known as dump off system
disk (DOSD), is especially useful in large-memory systems and in
clusters with common system disks, where a single disk does not always
have enough space to support customer dump file requirements.
Requirements for DOSD are somewhat different on VAX and Alpha systems.
On both systems, however, you must set the DUMPSTYLE system parameter
correctly so that the bugcheck code can write the system dump file to
an alternate device.
The following sections describe the requirements for DOSD on Alpha and
VAX systems.
16.7.1 DOSD Requirements on Alpha Systems
On Alpha systems, DOSD has the following requirements:
- The dump device directory structure must resemble the current
system disk structure. The [SYSn.SYSEXE]SYSDUMP.DMP file will
reside there, with the same boot time system root. Use AUTOGEN to
create this file. In the MODPARAMS.DAT file, the following symbol
prompts AUTOGEN to create the file:
DUMPFILE_DEVICE = $nnn$ddcuuuu
You can enter a list of devices.
- The dump disk must have an ODS-2 file structure.
- The dump device cannot be part of a volume set.
- Although not a requirement, Compaq recommends that you mount the
dump device during system startup. If the dump device is mounted, it
can be accessed by CLUE and AUTOGEN and for the analysis of crash
dumps. For best results, include the MOUNT command in
SYS$MANAGER:SYCONFIG.COM.
- For the Crash Log Utility Extractor (CLUE) to support DOSD, you
must define the logical name CLUE$DOSD_DEVICE to point to the dump file
to be analyzed after a system crash. For instructions, refer to
Section 16.9.
- The dump device cannot be part of a shadow set unless it is also
the system device and the master member of the shadow set.
- Use the following format to specify the dump device environment
variable DUMP_DEV at the console prompt:
>>> SET DUMP_DEV device-name[...]
Note
On DEC 3000 series systems, the following restrictions on the use of
the DUMP_DEV environment variable exist:
- This variable is not preserved across system power failures because
DEC 3000 series systems do not have enough nonvolatile RAM to save the
contents of the variable. You must reset the DUMP_DEV variable after a
power failure. (DUMP_DEV is preserved across all other types of
restarts and bootstraps, however.)
- You cannot clear DUMP_DEV (except by power-cycling the system).
- You must use console firmware Version 6.0 or greater because
earlier versions do not provide support for DUMP_DEV.
On some CPU types, you can enter only one device; on other CPU
types, you can enter a list of devices. The list can include various
alternate paths to the system disk and the dump disk. If you specify
an alternate path in DUMP_DEV, the disk can fail over to the
alternate path while the system is running. If the system crashes
subsequently, the bugcheck code can use the alternate path by referring
to the contents of DUMP_DEV. When you enter a list of devices,
however, the system disk must come last.
How to Perform This Task
To designate the dump device with the DUMP_DEV environment variable,
and enable the DUMPSTYLE system parameter, follow these steps:
- Display the value of BOOTDEF_DEV; for example:
>>> SHOW BOOTDEF_DEV
BOOTDEF_DEV dub204.7.0.4.3,dua204.4.0.2.3
- Display the devices on the system as follows:
>>> SHOW DEVICE
Resetting IO subsystem...
dua204.4.0.2.3 $4$DUA204 (RED70A) RA72
dua206.4.0.2.3 $4$DUA206 (RED70A) RA72
dua208.4.0.2.3 $4$DUA208 (RED70A) RA72
polling for units on cixcd1, slot 4, xmi0...
dub204.7.0.4.3 $4$DUA204 (GRN70A) RA72
dub206.7.0.4.3 $4$DUA206 (GRN70A) RA72
dub208.7.0.4.3 $4$DUA208 (GRN70A) RA72
>>>
In this example:
- DUA204 is the system disk device.
- DUA208 is the DOSD device.
- To provide two paths to the system disk, with the dump disk as
DUA208 (also with two paths), set DUMP_DEV as follows:
>>> SET DUMP_DEV dua208.4.0.2.3,dub208.7.0.4.3,dub204.7.0.4.3,dua204.4.0.2.3
In this example, dua208.4.0.2.3 and dub208.7.0.4.3 are paths to the
dump device; dub204.7.0.4.3 and dua204.4.0.2.3 are paths to the boot
device.
Note
The system chooses the first valid device that it finds in the list as
the dump device. Therefore, the dump disk path entries must appear
before the system disk entries in the list.
- Display all environment variables on the system by entering the
SHOW * command; for example:
>>> SHOW *
auto_action HALT
baud 9600
boot_dev dua204.4.0.2.3
boot_file
boot_osflags 0,0
boot_reset ON
bootdef_dev dub204.7.0.4.3,dua204.4.0.2.3
booted_dev dua204.4.0.2.3
booted_file
booted_osflags 0,0
cpu 0
cpu_enabled ff
cpu_primary ff
d_harderr halt
d_report summary
d_softerr continue
dump_dev dua208.4.0.2.3,dub208.7.0.4.3,dub204.7.0.4.3,dua204.4.0.2.3
enable_audit ON
interleave default
language 36
pal V5.48-3/O1.35-2
prompt >>>
stored_argc 2
stored_argv0 B
stored_argv1 dua204.4.0.2.3
system_variant 0
version T4.3-4740 Jun 14 2000 15:16:38
>>>
- Enable the DOSD bit of the DUMPSTYLE system parameter by setting
bit 2. For example, enter the value of 4 at the SYSBOOT> prompt to
designate an uncompressed physical dump to an alternate disk with
minimal console output:
>>> BOOT
SYSBOOT> SET DUMPSTYLE 4
The OpenVMS System Management Utilities Reference Manual and online help contain details about the DUMPSTYLE
system parameter.
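As a rough aid to reading DUMPSTYLE values, the following Python sketch
decodes the low-order bits. Only bit 2 (DOSD) and the meaning of the
value 4 come from this section; the meanings assigned to bits 0, 1, and
3 are assumptions inferred from the description of the value 4 and
should be checked against the reference manual. The function name
decode_dumpstyle is hypothetical.

```python
def decode_dumpstyle(value):
    """Decode the low-order DUMPSTYLE bits into named flags.

    Bit 2 (DOSD) is documented in this section. The meanings of bits
    0, 1, and 3 are assumptions; verify them in the reference manual.
    """
    return {
        "selective_dump": bool(value & 0x1),        # assumed: 0 = physical dump
        "full_console_output": bool(value & 0x2),   # assumed: 0 = minimal output
        "dump_off_system_disk": bool(value & 0x4),  # bit 2, from this section
        "compressed": bool(value & 0x8),            # assumed: 0 = uncompressed
    }


for name, enabled in decode_dumpstyle(4).items():
    print(f"{name}: {enabled}")
```

With the value 4, only the dump_off_system_disk flag is set, which
matches the description of an uncompressed physical dump to an
alternate disk with minimal console output.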
Note
The error log dump file is always created on the system disk so that
error log buffers can be restored when the system is rebooted. This
file is not affected by setting the DUMPSTYLE system parameter or the
DUMP_DEV environment variable.
16.7.2 DOSD Requirements on VAX Systems
On VAX systems, DOSD has the following requirements:
- The system must be connected directly to, and must boot from, CI
controllers.
- The dump device must physically connect to the same two
HSx CI controllers as the boot device. These two controllers
must be connected through the same CI star coupler.
- The dump device directory structure must resemble the current
system disk structure. The [SYSn.SYSEXE]SYSDUMP.DMP file will
reside there, with the same boot time system root. Use AUTOGEN to
create this file. In the MODPARAMS.DAT file, the following symbol
prompts AUTOGEN to create the file:
DUMPFILE_DEVICE = $nnn$ddcuuuu
You can list only one device.
- The volume label can be up to 12 characters long. The ASCII string
DOSD_DUMP must be part of this volume label. For example, valid volume
labels are DOSD_DUMP, DOSD_DUMP_12, 12_DOSD_DUMP. The label is read and
retained in a memory boot data structure.
- The dump device cannot be part of a volume set. Compaq strongly
recommends that the dump device also not be part of a shadow set.
- The dump device cannot be MSCP unit zero (0); only units 1 to 4095
(1 to FFF hexadecimal) are supported.
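The volume label rule in the list above is easy to check before you
initialize the disk. The following Python sketch is a minimal
illustration; the function name is_valid_dosd_label is hypothetical,
and the check assumes the label is already in uppercase.

```python
def is_valid_dosd_label(label):
    """Return True if label meets the DOSD volume label rule for VAX.

    The rule from the requirements list: at most 12 characters, and the
    string DOSD_DUMP must appear somewhere in the label.
    """
    return len(label) <= 12 and "DOSD_DUMP" in label


for candidate in ("DOSD_DUMP", "DOSD_DUMP_12", "12_DOSD_DUMP", "CRASH_DISK"):
    print(candidate, is_valid_dosd_label(candidate))
```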
You can designate the dump device as follows:
- On VAX 7000 configurations, by using bits 16 through 27 of the
DUMPSTYLE system parameter. Note that the DUMP_DEV environment variable
that is provided on VAX 7000 configurations is not used by OpenVMS VAX.
- On configurations other than the VAX 7000, by using bits 16 through
27 of register 3 (R3). You can use this portion of the register to
specify the dump device.
The OpenVMS System Management Utilities Reference Manual and online help contain details about the
DUMPSTYLE system parameter.
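The following Python sketch shows the bit arithmetic for a 12-bit field
in bits 16 through 27. It assumes that the field holds the MSCP unit
number of the dump device, which is an inference from the supported
unit range (1 to 4095 fits exactly in 12 bits) rather than something
this section states; confirm the encoding in the reference manual. The
function names are hypothetical.

```python
UNIT_SHIFT = 16    # field starts at bit 16
UNIT_MASK = 0xFFF  # 12 bits: bits 16 through 27


def set_dump_unit(value, unit):
    """Return value with the assumed unit-number field set to unit."""
    if not 1 <= unit <= 4095:
        raise ValueError("unit must be in the range 1 to 4095")
    cleared = value & ~(UNIT_MASK << UNIT_SHIFT)  # clear bits 16-27
    return cleared | (unit << UNIT_SHIFT)


def get_dump_unit(value):
    """Extract the assumed unit-number field from bits 16 through 27."""
    return (value >> UNIT_SHIFT) & UNIT_MASK


combined = set_dump_unit(4, 25)  # low-order bits 4, unit 25
print(hex(combined), get_dump_unit(combined))
```

Unit 0 is rejected, in line with the restriction that the dump device
cannot be MSCP unit zero.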
Note
To restore error log buffers when the system is rebooted after a system
crash, the error logs must be saved on the system disk. For this
purpose, AUTOGEN creates a SYSDUMP.DMP file on the system disk; the
file is large enough to contain the maximum size of error log buffers.
16.8 Using SDA to Analyze the Contents of a Crash Dump
The System Dump Analyzer utility (SDA) lets you interpret the contents
of the system dump file to investigate the probable causes of the
crash. For information about analyzing a crash dump, refer to the
OpenVMS VAX System Dump Analyzer Utility Manual or the OpenVMS Alpha System Analysis Tools Manual.
If your system fails, use SDA to make a copy of the system dump file
written at the time of the failure and contact your Compaq support
representative. For information about copying the system dump file, see
Section 16.12.
16.9 Using SDA CLUE Commands to Analyze Crash Dump Files (Alpha Only)
SDA CLUE (Crash Log Utility Extractor) commands automate the analysis
of crash dumps and maintain a history of all fatal bugchecks on a
standalone system or cluster. You can use SDA CLUE commands in
conjunction with SDA to collect and decode additional dump file
information not readily accessible through standard SDA. You can also
use SDA CLUE with Dump Off System Disk (DOSD) to analyze a system dump
file that resides on a disk other than the system disk.
16.9.1 Understanding CLUE (Alpha Only)
On Alpha systems, SDA is automatically invoked by default when you
reboot the system after a system failure. To better facilitate crash
dump analysis, SDA CLUE commands automatically capture and archive
summary dump file information in a CLUE listing file.
A startup command procedure initiates commands that:
- Invoke SDA
- Issue an SDA CLUE HISTORY command
- Create a listing file called CLUE$nodename_ddmmyy_hhmm.LIS
The CLUE HISTORY command adds a one-line summary entry to a history
file and saves the following output from SDA CLUE commands in the
listing file:
- Crash dump summary information
- System configuration
- Stack decoder
- Page and swap files
- Memory management statistics
- Process DCL recall buffer
- Active XQP processes
- XQP cache header
The contents of this CLUE list file can help you analyze a system
failure.
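If you need to locate listing files from a script, the file name
pattern CLUE$nodename_ddmmyy_hhmm.LIS can be reproduced as in the
following Python sketch. The function name clue_listing_name and the
node name NODE1 are hypothetical, and this section does not say which
timestamp CLUE uses for the name, so treat the date argument as an
assumption.

```python
from datetime import datetime


def clue_listing_name(nodename, when):
    """Format a name following the CLUE$nodename_ddmmyy_hhmm.LIS pattern."""
    return f"CLUE${nodename}_{when:%d%m%y_%H%M}.LIS"


print(clue_listing_name("NODE1", datetime(2000, 6, 14, 15, 16)))
```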
If these files occupy more space than the threshold allows (5000 blocks
by default), the oldest files are deleted until the total falls within
the threshold. You can change the threshold by defining the
CLUE$MAX_BLOCK logical name.
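The retention rule can be pictured with the following Python sketch,
which models the described behavior and is not the CLUE implementation.
The function name prune_oldest and the tuple layout (name, creation
order, size in blocks) are hypothetical.

```python
def prune_oldest(files, max_blocks=5000):
    """Delete the oldest files until the total size fits the threshold.

    files is a list of (name, created, blocks) tuples, where created is
    any sortable value. Returns (kept, deleted), both oldest first.
    """
    kept = sorted(files, key=lambda f: f[1])  # oldest first
    deleted = []
    total = sum(f[2] for f in kept)
    while kept and total > max_blocks:
        oldest = kept.pop(0)                  # remove the oldest file
        total -= oldest[2]
        deleted.append(oldest)
    return kept, deleted


listing_files = [("A.LIS", 1, 3000), ("B.LIS", 2, 2000), ("C.LIS", 3, 1500)]
kept, deleted = prune_oldest(listing_files)
print([f[0] for f in kept], [f[0] for f in deleted])
```

In this example the three files total 6500 blocks, so only the oldest
file is deleted, leaving 3500 blocks.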
To prevent CLUE from running at system startup, define the logical
name CLUE$INHIBIT as TRUE with the /SYSTEM qualifier in the
SYLOGICALS.COM file.
It is important to remember that CLUE$nodename_ddmmyy_hhmm.LIS
contains only an overview of the crash dump and does not always contain
enough information to determine the cause of the crash. If you must do
an in-depth analysis of the system crash, Compaq recommends that you
always use the SDA COPY command to save the dump file.
16.9.2 Displaying Data Using SDA CLUE Commands (Alpha Only)
Invoke CLUE commands at the SDA prompt as follows:
CLUE commands provide summary information of a crash dump captured from
a dump file. When debugging a crash dump interactively, you can use SDA
CLUE commands to collect and decode some additional information from a
dump file, which is not easily accessible through standard SDA. For
example, CLUE can quickly provide detailed XQP summaries.
You can also use CLUE commands interactively on a running system to
help identify performance problems.
You can use all CLUE commands when analyzing crash dumps; the only CLUE
commands that are not allowed when analyzing a running system are CLUE
CRASH, CLUE ERRLOG, CLUE HISTORY, and CLUE STACK.
Refer to OpenVMS Alpha System Analysis Tools Manual for more information about using SDA CLUE
commands.
16.9.3 Using SDA CLUE with Dump Off System Disk (Alpha Only)
Dump off system disk (DOSD) allows you to write the system dump file to
a device other than the system disk. For SDA CLUE to be able to
correctly find the dump file to be analyzed after a system crash,
perform the following steps:
- Modify the command procedure SYS$MANAGER:SYCONFIG.COM to add the
system logical name CLUE$DOSD_DEVICE to point to the device where the
dump file resides. You need to supply only the physical or logical
device name without a file specification.
- Modify the command procedure SYS$MANAGER:SYCONFIG.COM to mount
systemwide the device where the dump file resides. Otherwise, SDA CLUE
cannot access and analyze the dump file.
In the following example, the dump file is placed on device $3$DUA25,
with the label DMP$DEV. You need to add the following commands to
SYS$MANAGER:SYCONFIG.COM:
$mount/system/noassist $3$dua25: dmp$dev dmp$dev
$define/system clue$dosd_device dmp$dev
16.10 Using CLUE to Obtain Historical Information About Crash Dumps (VAX Only)
On VAX systems, the Crash Log Utility Extractor (CLUE) displays the
contents of a crash history file. By examining the
contents of the crash history file, you can understand and resolve the
issues responsible for failures (crashes), and you might also obtain
other useful data.
16.10.1 Understanding CLUE (VAX Only)
The crash history file, which is created and updated by CLUE, contains
key parameters from crash dump files. Unlike crash dumps, which are
overwritten with each system failure and are therefore typically
available only for the most recent failure, the crash history file is a
permanent record of system failures.
After a system fails and physical memory is copied to the crash dump
file, CLUE automatically appends the relevant parameters to the file
CLUE$OUTPUT:CLUE$HISTORY.DATA when the system is restarted. The
remainder of this section describes how you can use CLUE to display the
data it has collected; reference information about CLUE is available in
the OpenVMS System Management Utilities Reference Manual.
Note
The history file typically grows by about 10 to 15 blocks for each
entry. You can limit the number of entries in the binary file by
defining the logical name CLUE$MAX_ENTRIES to be the maximum number
desired. When this number is reached, the oldest entries are deleted
from the history file.
By default, operator shutdowns are recorded in the history file. You
can exclude information from operator shutdowns in the history file by
defining the logical name CLUE$EXCLUDE_OPERS as being TRUE, for example
by including the following line in SYS$MANAGER:SYSTARTUP_VMS.COM:
$ DEFINE /SYSTEM CLUE$EXCLUDE_OPERS TRUE
16.10.2 Displaying Data Using CLUE (VAX Only)
To display data using CLUE, you must first define the following symbol:
After defining the symbol, you can use CLUE to display information by
entering the following command:
$ CLUE/DISPLAY
CLUE_DISPLAY>
At the CLUE_DISPLAY> prompt, you can issue commands to perform the
following actions:
- Use the DIRECTORY command to list failures that have occurred since
a specified date, failures of a particular type, failures that contain
a specified module, and failures that have a specified offset. For
example, you can list all the failures in the history file using the
DIRECTORY command, as follows:
- Use the SHOW command to generate information similar to that
obtained from certain commands in the System Dump Analyzer utility
(SDA). For example, if you wanted complete information about the
crash listed as crash number 7, the following SHOW command would
provide the information:
- Use the EXTRACT command to write the data from an entry to a file.
For example, the following command writes the data from entry
number 7 in the crash history file to a file named 15MAYCRASH.TXT:
CLUE_DISPLAY> EXTRACT 7/OUTPUT=15MAYCRASH.TXT
For more information about CLUE commands, refer to the OpenVMS System Management Utilities Reference Manual.
16.11 Saving the Contents of the System Dump File After a System Failure
If the system fails, it overwrites the contents of the system crash
dump file and the previous contents are lost. For this reason, ensure
that your system automatically analyzes and copies the contents of the
system dump file each time the system reboots.
On Alpha systems, SDA is invoked by default during startup, and a CLUE
list file is created. Generated by a set sequence of commands, the CLUE
list file contains only an overview of the crash and might not provide
enough information to determine the cause of the crash. Compaq,
therefore, recommends that you always copy the system dump file.
Refer to the OpenVMS Alpha System Analysis Tools Manual for information about modifying your
site-specific command procedure to execute additional commands such as
SDA COPY upon startup after a system failure.
On VAX systems, modify the site-specific startup command procedure
SYSTARTUP_VMS.COM so that it invokes the System Dump Analyzer utility
(SDA) when the system is booted.
Be aware of the following facts:
- When invoked from the site-specific startup procedure in the
STARTUP process, SDA executes the specified commands only if the system
is booting immediately after a system failure. If the system is
rebooting after it was shut down with SHUTDOWN.COM or OPCCRASH.EXE, SDA
exits without executing the commands.
- Although you can use the DCL command COPY to copy the dump file,
the SDA command COPY is preferable because it copies only the blocks
occupied by the dump and it marks the dump file as copied. The SDA COPY
command is preferable also when the dump was written into the paging
file, SYS$SYSTEM:PAGEFILE.SYS, because the SDA COPY command releases to
the pager those pages occupied by the dump. For more information, see
Section 16.13.
- Because a system dump file can contain privileged information,
protect copies of dump files from world read access. For more
information about file protection, refer to the OpenVMS Guide to System Security.
- System dump files have the NOBACKUP attribute, so the Backup
utility (BACKUP) does not copy them unless you use the qualifier
/IGNORE=NOBACKUP when invoking BACKUP. When you use the SDA command
COPY to copy the system dump file to another file, the operating system
does not automatically set the new file to NOBACKUP. If you want to set
the NOBACKUP attribute on the copy, use the SET FILE command with the
/NOBACKUP qualifier as described in the OpenVMS DCL Dictionary.
Example
The SDA command COPY in the following example saves the contents of the
file SYS$SYSTEM:PAGEFILE.SYS and performs some analysis of the file.
Note that the COPY command is the final command because the blocks of
the page file used by the dump are released as soon as the COPY command
completes, and can be used for paging before any other SDA commands can
be executed.
$ !
$ ! Print dump listing if system just failed
$ !
$ ANALYZE/CRASH_DUMP SYS$SYSTEM:PAGEFILE.SYS
SET OUTPUT DISK1:SYSDUMP.LIS ! Create listing file
READ/EXECUTIVE ! Read in symbols for kernel
SHOW CRASH ! Display crash information
SHOW STACK ! Show current stack
SHOW SUMMARY ! List all active processes
SHOW PROCESS/PCB/PHD/REG ! Display current process
COPY SYS$SYSTEM:SAVEDUMP.DMP ! Save system dump file
EXIT
$ SET FILE/NOBACKUP SYS$SYSTEM:SAVEDUMP.DMP