SUMMARY: system stuck after overheating crash

From: Carmen San Martin <carmen_at_mail.wistar.upenn.edu>
Date: Wed, 23 Jan 2002 16:22:04 -0500 (EST)

  Many thanks to
  
  Benton, Marco; Jeff Berliner; Warren Sturm; Joerg Bruehe; "system
administration account"; Selden E Ball Jr; Dr. Thomas.Blinn; alan.

  The consensus seemed to be that the overheating fried the Ethernet card. We
booted in standalone mode by changing /etc/svc.conf to local instead of
bind, and the machine seems to be working with no problems.
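
  For anyone facing the same hang, the svc.conf change can be sketched
roughly as below. This is only an illustration: it assumes the file holds
lines of the form "hosts=bind,local", the function names are made up, and it
works on a throwaway sample file rather than the live /etc/svc.conf. A POSIX
shell is assumed.

```shell
#!/bin/sh
# Sketch only: assumes svc.conf-style lines such as "hosts=bind,local".
# hosts_order and force_local_hosts are hypothetical helper names.

# Print the lookup order configured for host names.
hosts_order() {
    sed -n 's/^hosts=//p' "$1"
}

# Write a copy of the file to stdout with host lookups forced to local files.
force_local_hosts() {
    sed 's/^hosts=.*/hosts=local/' "$1"
}

# Demonstration on a throwaway sample, never on the live /etc/svc.conf.
sample=${TMPDIR:-/tmp}/svc.conf.sample.$$
printf 'aliases=local\nhosts=bind,local\npasswd=local\n' > "$sample"
hosts_order "$sample"
force_local_hosts "$sample" | sed -n 's/^hosts=//p'
rm -f "$sample"
```

  On a real system one would review the rewritten copy before putting it in
place, and keep a backup of the original file.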

  We learned the interesting fact that the desktop (CDE) depends heavily on
the network working properly.

  Also, we have been warned that there is no way to tell what else might
have suffered from the overheating. Well, once we replace the Ethernet
card we'll see.

  The original answers and question follow. Thanks again for this great list,
  
            Carmen


 ------------------------------------------------------------




From: "Benton, Marco "

usually the boot process will hang if the system can't talk to the
nameserver, or if the system name isn't found in /etc/hosts.

either fix the network or change /etc/svc.conf to local instead of bind
(dns) which will at least let the system boot.

maybe your SRM env variables got squashed...


From: Jeff Berliner

        Sounds a lot like your networking is not working! Could the
overheating have fried your network card? Do you have a link light on
the back of the NIC? Could the overheating have affected some other
network electronics in the same location? Otherwise, I'd boot into
single user mode (>>> boot -fl s) and ensure that your gateways and
netmask are properly set.

                                                        - jeff
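
Once in single user mode, the gateway and netmask check mentioned above
could be done along these lines. This is a rough sketch, not a procedure
from the thread: net_report is a made-up helper name, a POSIX shell is
assumed, and the numeric (-n) forms are used so that the commands do not
themselves wait on a name server that cannot be reached.

```shell
#!/bin/sh
# Sketch only: lists interface and routing state so that netmask and
# gateway can be checked by eye. net_report is a hypothetical helper name.
net_report() {
    for cmd in "ifconfig -a" "netstat -rn" "netstat -in"; do
        echo "== $cmd =="
        tool=${cmd%% *}
        if command -v "$tool" >/dev/null 2>&1; then
            # Unquoted on purpose so the command and its flag split into words.
            $cmd 2>&1 || echo "($cmd failed)"
        else
            echo "($tool not found on this system)"
        fi
    done
}

net_report
```

The interface lines show the configured address and netmask, and the routing
table shows whether a default route exists.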

From: Warren Sturm

Have you tried booting the generic kernel (genvmunix)? If that goes ok,
run doconfig and rebuild your configuration. Also check your
/etc/rc.config and make sure it looks ok.

Portmap is needed for CDE to work properly.



From: Joerg Bruehe


This is just a guess:
On an AIX machine I use, CDE (the graphic desktop) would not come up
(block indefinitely) when the OS had failed to start the "loopback"
interface for local TCP/IP sockets (for some still unknown reason).
My impression is that all X11 stuff uses local sockets for
communication.

Your mentioning that "ping localhost" fails while "ping 1.2.3.4"
(local IP) works makes me assume that you also have no working
loopback communication.
You should use command line means to re-establish loopback sockets
(but unfortunately I cannot tell you how to do it).
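
One thing that can be checked from the command line is whether the name
"localhost" can be resolved from the local hosts file at all, since a
failing "ping localhost" may be a name lookup problem rather than a loopback
problem. The sketch below only performs that check and does not re-establish
anything. has_host_entry is a made-up helper name, and a POSIX shell and awk
are assumed.

```shell
#!/bin/sh
# Sketch only: has_host_entry is a hypothetical helper, not a system command.
# Succeeds if NAME appears as a host name or alias on a non-comment line
# of a hosts-format FILE (address first, then names).
has_host_entry() {
    awk -v name="$1" '
        { sub(/#.*/, "") }    # drop comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$2"
}

if has_host_entry localhost /etc/hosts; then
    echo "localhost is listed in /etc/hosts"
else
    echo "localhost is missing from /etc/hosts"
fi
```

The same helper can be pointed at the machine's own host name to test the
/etc/hosts condition raised earlier in the thread.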

From: system administration account


Sounds like the system can't talk to the rest of your network.



From: Selden E Ball Jr

Carmen,

The hangs you describe are consistent with the system's network
hardware malfunctioning. Replacing the network card will probably
fix them.

I do hope you have the system on a maintenance contract.
Overheating usually causes intermittent failures which may
not happen for months afterward. They will continue until
all of the damaged components have been replaced.

You have my sympathies. We've been through such problems in the past.
As a result, our computer rooms now have redundant cooling so that
they'll stay cool even if half of the cooling units are offline
(e.g. for scheduled maintenance).

Good luck!



From: "Dr. Thomas.Blinn_at_Compaq.com"

You're right, the network isn't working. CDE is heavily dependent
on the network working. Maybe you fried your Ethernet card, maybe
it's something else. Fix the network problem and then you'll find
that the system might be back to normal, or not, depending on what
else got damaged by the overheating problem.


From: alan_at_nabeth.cxo.cpqcorp.net


        I'd check the network interface card. If the system shut down
        because it was too hot, then any number of other things may
        have suffered.
        
        
        
------------- Original posting --------------------------------

  Dear managers,

  our Alphaserver 4100 (OSF/1 4.0D) crashed due to an overheating
problem. After restarting, the boot process gets stuck at "ONC portmap
service started".

  If we hit ^C it goes on booting, but then it hangs again when starting
the desktop. If we use command line login instead, the only thing we have
noticed is that the network is not running properly: we can only
ping the machine itself, and only if we use the IP address.

  We found an old message in the list describing a similar problem, which
seemed to relate it to NFS. We tried not setting up NFS at boot and got rid
of the "ONC portmap" hang, but we still get stuck at desktop startup
and the network is still not working.

  We would appreciate ideas about what else to check...

  Thanks,

         Carmen
Received on Wed Jan 23 2002 - 21:48:36 NZDT
