Alpha 3000-600S - OSF/1 2.0 hanging

From: Olivier T CALLE <francais_at_spu.edu>
Date: Fri, 20 Jan 1995 18:40:01 -0800 (PST)

Hello all,

The problem described below is occurring on an Alpha 3000-600S with 128MB
of memory, running OSF/1 2.0 rev 240.
It has about 2000 user accounts, with 40-100 users logged in at once
during the day.

My questions are:
Is this a known problem? Is it a bug? Is it fixable? Or do we need to
upgrade?

Here is a short synopsis of what has been happening to our server since
last night (Thursday-Friday):

time - what happened
========================================================================
2129 - Server reboots itself with the following panic string:
       vmunix: panic: lat_scl1_new: bugs tcb pointer
0052 - Server "hangs" (see definition below)
0055 - Press interrupt button to get to console and reboot multiuser
0134 - Server "hangs" again
0137 - Go to console and reboot to single user mode
0209 - Type "shutdown -r now" to go back to multiuser mode
1117 - Server hangs again, power off, reboot
1208 - Same thing
1332 - Same thing
1513 - Same thing
Now - Still up...
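
To make the spacing of the incidents easier to see, here is a small sketch
that turns the HHMM stamps from the table above into gaps in minutes. The
function name and the choice of which rows count as incidents (the panic
plus each hang) are my own assumptions, and it assumes every gap is shorter
than 24 hours so that a wrap past midnight can be handled with a modulo.

```python
def minutes_between(times):
    """Gaps in minutes between consecutive HHMM stamps.

    Assumes each gap is under 24 hours, so a negative difference
    means the clock wrapped past midnight.
    """
    mins = [int(t[:2]) * 60 + int(t[2:]) for t in times]
    return [(later - earlier) % 1440 for earlier, later in zip(mins, mins[1:])]


# Panic at 2129, then each hang, copied from the table above.
incidents = ["2129", "0052", "0134", "1117", "1208", "1332", "1513"]
print(minutes_between(incidents))  # [203, 42, 583, 51, 84, 101]
```

The long 583 minute gap is the overnight stretch, which includes the
reboot at 0209.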

Definition/description of a "hang":

1. The system stops responding to user input on both LAT and network
   (telnet) connections.
2. During this "hang", the system will allow you to connect, but as soon as
   you hit return on your password (whether it's correct or not), you
   are stuck and must terminate your connection.
3. The system _does_ respond to ping, gopher and http connections.
4. A telnet connection to port 25 just sits there, without displaying the
   sendmail greeting.
5. The system does _not_ respond to a remote finger.
6. The system does not appear to respond to an rsh. It appeared to let me
   connect, but I couldn't execute any commands (like w, uptime).

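Symptom 4 (a port that accepts the connection but never sends its greeting)
can be checked from another machine without a human at a telnet prompt. The
following is only a sketch under my own assumptions: the function name, the
result labels, and the timeout are invented for illustration, and it uses
nothing beyond the standard Python socket module.

```python
import socket


def probe_banner(host, port, timeout=5.0):
    """Classify a TCP service by whether it sends a greeting.

    Returns one of:
      "unreachable" - the connection could not be opened
      "silent"      - connected, but nothing arrived before the timeout
      "closed"      - connected, but the peer closed without sending data
      "banner"      - connected and received at least one byte
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"
    with conn:
        conn.settimeout(timeout)
        try:
            data = conn.recv(256)
        except socket.timeout:
            return "silent"
        except OSError:
            return "unreachable"
    return "banner" if data else "closed"
```

Calling probe_banner with the server's name and port 25 would be expected
to give "silent" during a hang like the one described and "banner" when
sendmail is answering normally. It only suits services that speak first
(such as SMTP); a finger or http server waits for the client, so "silent"
means nothing there.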
Ancillary information:
1. During the very first hang and the subsequent self-reboot, LAT
   completely disallowed connections and kicked off users with a
   message about "No response within timeout period" (or something along
   those lines). In the following few hangs, it still allowed
   connections, as described in #2 above.
2. /usr/bin/login has been dumping core with the following stack trace:

DGSE> dbx /usr/bin/login /core
dbx version 3.11.4
Type 'help' for help.
Core file created by program "login"

warning: /usr/bin/login has no symbol table -- very little is supported without it

signal Segmentation fault at [strlen:54 ,0x3ff800a6ae0] Source not available
(dbx) where
> 0 strlen(0x3ffc00966d0, 0x0, 0x0, 0x1, 0x140007000) ["../../../../../src/usr/ccs/lib/libc/alpha/strlen.s":54, 0x3ff800a6ae0]
   1 __bsd_siad_getpwnam(0x0, 0x140007000, 0x3, 0x1, 0x3ff800b0648) ["../../../../../src/usr/ccs/lib/libc/SIA/siad_getpass.c":1267, 0x3ff8013100c]
   2 __siad_getpwnam(0x3, 0x1, 0x3ff800b0648, 0x8570000000000, 0x3ff800dbacc) ["../../../../../src/usr/ccs/lib/libc/SIA/siad_getpass.c":1237, 0x3ff80130f28]
   3 __sia_switch(0x3ff800f5a88, 0x3ffc008df90, 0x100000001, 0xffffffffffffffff, 0x0) ["../../../../../src/usr/ccs/lib/libc/SIA/sia_switch.c":235, 0x3ff800dbac8]
   4 __sia_getpasswd(0x3ff00000002, 0x0, 0x11fffef18, 0x3ff00000000, 0x3ff8012d518) ["../../../../../src/usr/ccs/lib/libc/SIA/sia_getpass.c":166, 0x3ff80118ce8]
   5 __getpwnam(0x3ff80123770, 0x0, 0x0, 0x3ffc00a90c0, 0x3ffc00a9108) ["../../../../../src/usr/ccs/lib/libc/getpasswd.c":124, 0x3ff800bfb78]
   6 __sia_ses_estab(0x3ff80123770, 0x140007000, 0x11ffffee8, 0x816322f1f6c91, 0x0) ["../../../../../src/usr/ccs/lib/libc/SIA/sia_s_estab.c":259, 0x3ff800e16a8]
(dbx) q

Thanks for any insight,

Olivier T. CALLE

internet: <francais_at_spu.edu>
work tel: 206-281-2435 home tel: 206-286-7115
US mail: MAILSTOP 1686, SPU, Seattle, WA 98119-1997
callsign: N7TAP, class: Tech+
jobs: Computer and Information Systems Hired Hacker
WWW Page URL: http://www.spu.edu/~francais/
Psalm 48:14 SPU Electrical Engineering Major

To err is human, to really foul up requires the root password...
Received on Fri Jan 20 1995 - 21:39:46 NZDT
