Debugging ASR
Updated June 17, 2004
Created February 05, 2004
If your HP ProLiant server experiences an ASR (Automatic Server Recovery) reboot, the following steps will help you debug what is going on:
- Disable ASR
The purpose of ASR is to restart the HP ProLiant server once it is determined that the system is not responding. The ASR procedure allows for normal operations of the HP ProLiant server to resume rather than remaining in a halted state.
Sysrq can be used to debug a system while it is still in its crashed state. Once Linux has been rebooted, sysrq can no longer provide useful information about the crash. To use sysrq, ASR should therefore be disabled so that the server does not reboot before the crash information has been gathered.
To disable ASR, reboot the server and press F9 during bootup to enter the RBSU (ROM-Based Setup Utility). From the menu, select "disable asr".
- Enable the serial console
Follow the serial console instructions to set up a workstation and to configure the server.
Be sure to test the serial console configuration to make sure it works.
Logging in is not required to perform sysrq tasks.
- Enable the SysRq key
Either echo 1 >/proc/sys/kernel/sysrq (which only lasts until the next reboot)
or
edit /etc/sysctl.conf and turn on the SysRq key (this method persists across reboots)
The /etc/sysctl.conf file will have the following entry; just change the 0 to a 1, as shown here:
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 1
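The edit to /etc/sysctl.conf can also be scripted. The following is only a minimal sketch: the function name enable_sysrq_conf is hypothetical, and it assumes the file uses the "kernel.sysrq = N" form shown above. It rewrites an existing entry to 1, or appends one if none is present.

```shell
#!/bin/sh
# Hypothetical helper (not part of any HP or distribution tooling):
# make sure a sysctl-style config file contains "kernel.sysrq = 1".
enable_sysrq_conf() {
    conf="$1"
    if grep -q '^[[:space:]]*kernel\.sysrq[[:space:]]*=' "$conf"; then
        # An entry already exists: rewrite its value to 1.
        # Write through a temp file, then copy back so the original
        # file keeps its ownership and permissions.
        sed 's/^[[:space:]]*kernel\.sysrq[[:space:]]*=.*/kernel.sysrq = 1/' \
            "$conf" > "$conf.tmp" &&
            cat "$conf.tmp" > "$conf" &&
            rm -f "$conf.tmp"
    else
        # No entry yet: append one at the end of the file.
        printf '%s\n' 'kernel.sysrq = 1' >> "$conf"
    fi
}

# Example (as root):
#   enable_sysrq_conf /etc/sysctl.conf
#   sysctl -p                      # reload settings without a reboot
#   cat /proc/sys/kernel/sysrq     # show the value currently in effect
```

Trying the function on a copy of the file first is a reasonable precaution before pointing it at /etc/sysctl.conf.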
- Use sysrq to debug when the crash occurs
The following will demonstrate sysrq information gathering via minicom:
- Start minicom without modem reset
minicom -o
- Enable logging and provide a filename for the log
CTRL-a l
sysrq-200402051343.log
- Gather sysrq reports (let's gather p, t, and m)
CTRL-a f p (SysRq-p should be done once for each processor (CPU) in the system)
CTRL-a f t
CTRL-a f m
- Be sure all the necessary reports have been gathered before continuing. Other sysrq commands are listed in /usr/src/linux-2.4/Documentation/sysrq.txt
It is probably best to cat the logfile from another terminal session on the workstation to verify that the logs were gathered.
- Since the server is hung, we can minimize data loss by performing a sync, unmount, and reboot:
CTRL-a f s
Note that the sync hasn't taken place until you see the "OK" and "Done" appear on the screen.
CTRL-a f u
Again, the unmount (remount read-only) hasn't taken place until you see the "OK" and "Done" message appear on the screen.
CTRL-a f b
- Stop logging if you haven't already done so
CTRL-a l
- (Optional) Exit minicom without reset
CTRL-a q
- Use the log to debug the system crash
If you are working with technical support, then provide the sysrq log to them.
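Before sending the log anywhere, it can help to confirm on the workstation that the capture actually contains the reports. The sketch below is an illustration only: the function names are hypothetical, and it assumes the kernel announces each request with a banner line containing "SysRq :" (or "sysrq:" on other kernel versions). Check your own log to see which form appears.

```shell
#!/bin/sh
# Hypothetical helpers for a quick sanity check of a captured sysrq log.

# Count the lines that look like a SysRq banner.
# Assumption: each request is announced by a line containing
# "SysRq :" or "sysrq:"; the match is case-insensitive.
sysrq_report_count() {
    grep -ci 'sysrq *:' "$1"
}

# Fail if the log is missing or empty; otherwise print a one-line summary.
check_sysrq_log() {
    log="$1"
    if [ ! -s "$log" ]; then
        echo "log is missing or empty: $log" >&2
        return 1
    fi
    echo "$(sysrq_report_count "$log") SysRq banner(s) found in $log"
}

# Example:
#   check_sysrq_log sysrq-200402051343.log
```

A count lower than the number of requests sent suggests that a report was not captured, in which case the log is worth reading in full before relying on it.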