Can't boot! Hardware problem?

From: Peter Chapin <pchapin_at_vtc.vsc.edu>
Date: Tue, 01 May 2001 12:13:51 -0400 (EDT)

I am in the process of upgrading from Tru64 v5.0A to v5.1. I'm using a
DS10 system. I've encountered a problem that prevents my system from
booting and I'm not certain how to proceed to fix it. I don't believe this
problem has anything to do with the upgrade. I believe it is a hardware
problem.

Before doing the system upgrade I wanted to upgrade my firmware to v5.8
(came with TU5.1). I shut down to the console with "shutdown -h now" and
then booted the firmware CD with "boot dqa0". The boot appeared to start
okay, but when it said "jumping to bootstrap code" (or something like
that) I then got several messages of the form

        "waiting for pkb0.7.0.15.0 to stop"

These messages were separated by several seconds. I got the impression
that something was timing out over and over again.

After about 10 messages referring to pkb the boot continued normally. I
performed the firmware upgrade without incident. I then powered off the
system for a good 20-30 seconds (the firmware upgrade documentation
recommends a power cycle but asks for only 10 seconds) and rebooted to
continue with the upgrade installation.

Again, I received about 10 (or so) "waiting for pkb..." messages. The
system got over that and the boot continued. However, the boot process
seemed abnormally slow to me. Later, after printing

        "isp0: Fast RAM timing enabled"

(and several other messages related to isp0) the boot process hung. I
tried cycling the power and, amazingly, the system booted normally. For
what it's worth, in a normal boot the console font changes immediately
after the message above is printed.

I continued with my upgrade. The installation of TU5.1 appeared to go
fine. There were no problems. When the system rebooted after the upgrade
installation, there were no odd messages about pkb0.7.0.15.0. However, the
boot process hung again at the same point as before. I tried cycling the
power again, but it didn't help.

At the console when I do "show device" I do not see any mention of
pkb0.7.0.15.0. However, I KNOW the device was there when I started this
process. The device pka0.7.0.14.0 is shown as being present as it was
before (there used to be both pka and pkb).

The console command "show config" shows

        Bus 00 Slot 14: QLogic ISP10x0
                        pka0.7.0.14.0 SCSI BUS ID 7
                        ... devices connected to this SCSI bus ...

There is currently no mention of Bus 00, Slot 15, or of anything named pkb.

If I run the "test" command at the console, the system locks up and I have
to power off to recover. During boot, no unusual messages are displayed in
the "Testing the System" phase.

My guess is that my second SCSI adapter went west and the console software
just deleted it from the configuration. However, the system still tries to
probe it during the boot and that causes the hang. As it happens, I don't
currently have any important resources on that bus. All my disks, etc.,
are on the first SCSI bus. Thus my immediate question is: how can I
convince this system to entirely forget about the second bus and boot
without it? In the longer term: how do I determine the nature of the
problem precisely enough to get it fixed? I could use a bit of education
about the architecture of these DS10 boxes.

P.S. I honestly don't think this has anything to do with the upgrade or
with the installation of the new firmware since pkb was manifesting some
sort of strangeness before I applied the new firmware. Is there any point
in downgrading to the old firmware and trying that? Additional note: there
was no evidence of any problems in the running system before I started my
upgrade procedure.

Thanks.

Peter
Received on Tue May 01 2001 - 16:15:21 NZST
