We had our AlphaServer 1000A NFS server stop wanting to be an NFS
server last night. The system didn't actually crash, and it was
still possible to log in to the machine after NFS services
stopped. Our Networker backups actually ran to completion after
the NFS failure (although we haven't checked whether the tapes
make sense yet). Any program that tried to look up kernel
information, like 'ps', 'top', or 'uptime', would stall
indefinitely and become unkillable, but other programs like
'swapon' and 'df' still worked. Eventually, once we shut down
all the NFS client systems that depended on it, we were able to
reboot the system and it's running fine again.

The only messages we got in /var/adm/messages that seemed to be
related to this were repeats of:

    Aug 11 14:21:00 aserver1 vmunix: malloc_mem_alloc: no space in map

These started several hours before NFS failed, but recurred only
every half hour to an hour.
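
For anyone who wants to check the spacing of these warnings in their
own log, here is a minimal sketch. It assumes the classic syslog line
layout of the message quoted above. The function names warning_times
and gaps_in_minutes are made up for illustration, and the year is
passed in by hand because syslog timestamps omit it.

```python
import re
from datetime import datetime

# Matches lines of the form:
#   Aug 11 14:21:00 aserver1 vmunix: malloc_mem_alloc: no space in map
# (assumption: month, day, time, hostname, then the "vmunix:" tag)
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2}) (\d{2}:\d{2}:\d{2}) \S+ vmunix: "
    r"malloc_mem_alloc: no space in map"
)


def warning_times(lines, year=1998):
    """Return a datetime for every malloc_mem_alloc warning in lines."""
    times = []
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            month, day, clock = m.groups()
            # %b expects English month abbreviations (C/English locale).
            stamp = f"{year} {month} {day} {clock}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def gaps_in_minutes(times):
    """Return the minutes elapsed between consecutive warnings."""
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]
```

Passing an open file to warning_times (for example
open("/var/adm/messages")) and then the result to gaps_in_minutes
would show whether the warnings were getting closer together in the
hours before NFS stopped responding.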

Does anyone have any ideas?

Received on Wed Aug 12 1998 - 18:46:08 NZST