Disappearing processes

From: Ian Wojtowicz <i_at_woj.com>
Date: Wed, 23 Dec 1998 09:15:14 -0800

Hi everyone.

I have a strange problem with my DU 4.0B machine. Every once in a while, my
Apache, ircd, sshd and cron daemon processes die for no reason (that I can
find). This usually happens when I'm not logged in, which just adds to the
pandemonium.

The only clues that I've managed to piece together are that these daemons are
all standalone (i.e. not spawned from inetd), and the following kern.log excerpt:

Dec 23 10:31:02 nation1 vmunix: MACHINE CHECK type 0x660 Machine check abort
Dec 23 10:31:02 nation1 vmunix: ptr[0-1] = 0000000100000092 0000000000000000
Dec 23 10:31:02 nation1 vmunix: ptr[2-3] = 000c06f800000004 0000000000000020
Dec 23 10:31:02 nation1 vmunix: ptr[4-5] = 0000000000000018 0000000000000006
Dec 23 10:31:02 nation1 vmunix: ptr[6-7] = 0000000000000000 0000000000104000
Dec 23 10:31:02 nation1 vmunix: ptr[8-9] = 0000000000000000 0000000000000008
Dec 23 10:31:02 nation1 vmunix: ptr[10-11] = fffffc0000430e30 0000000000000000
Dec 23 10:31:02 nation1 vmunix: ptr[12-13] = fffffc00004311d0 fffffc0000431200
Dec 23 10:31:02 nation1 vmunix: ptr[14-15] = fffffc0000431260 fffffc0000430fd0
Dec 23 10:31:02 nation1 vmunix: ptr[16-17] = fffffc0000430ca0 00000001201a931c
Dec 23 10:31:02 nation1 vmunix: ptr[18-19] = 000000011fffd6d0 ffffffff84ec7a38
Dec 23 10:31:02 nation1 vmunix: ptr[20-21] = fffffc000057f940 0000000000000005
Dec 23 10:31:02 nation1 vmunix: ptr[22-23] = 6068686c7c7c7c7c 000003ffc0183508
Dec 23 10:31:02 nation1 vmunix: ptr[24-25] = 0000000000000000 0000000000010000
Dec 23 10:31:02 nation1 vmunix: ptr[26-27] = 0000000140308150 0000000000000000
Dec 23 10:31:02 nation1 vmunix: ptr[28-29] = 00000000029b0000 fffffffc00000000
Dec 23 10:31:02 nation1 vmunix: ptr[30-31] = 0000000000000001 000000000085ba38
Dec 23 10:31:02 nation1 vmunix: exc_addr = 00000001201a8f26
Dec 23 10:31:02 nation1 vmunix: exc_sum = 0000000000000000
Dec 23 10:31:02 nation1 vmunix: exc_mask = 0000000000000000
Dec 23 10:31:02 nation1 vmunix: iccsr = 0000000000000000
Dec 23 10:31:02 nation1 vmunix: pal_base = 0000000000060000
Dec 23 10:31:02 nation1 vmunix: hier = 00000000000018f0
Dec 23 10:31:02 nation1 vmunix: hirr = 0000000000000000
Dec 23 10:31:02 nation1 vmunix: mm_csr = 0000000000005020
Dec 23 10:31:02 nation1 vmunix: dc_stat = 0000000000000007
Dec 23 10:31:02 nation1 vmunix: dc_addr = 00000007ffffffff
Dec 23 10:31:02 nation1 vmunix: abox_ctl = 000000000000042e
Dec 23 10:31:03 nation1 vmunix: biu_stat = 0000000000002440
Dec 23 10:31:03 nation1 vmunix: biu_addr = 00000000012a9510
Dec 23 10:31:03 nation1 vmunix: biu_ctl = 0000000e10006335
Dec 23 10:31:03 nation1 vmunix: fill_syndrome = 0000000000000080
Dec 23 10:31:03 nation1 vmunix: fill_addr = 00000000012a9510
Dec 23 10:31:03 nation1 vmunix: va = 00000000001081e8
Dec 23 10:31:03 nation1 vmunix: bc_tag = 0000000000401295
Dec 23 10:31:03 nation1 vmunix: ident = 92
Dec 23 10:31:03 nation1 vmunix: mcr_stat = 40404040
Dec 23 10:31:03 nation1 vmunix: intr = 00000000
Dec 23 10:31:03 nation1 vmunix: tc_status = 00000000
Dec 23 10:31:03 nation1 vmunix: config = 00000000
Dec 23 10:31:03 nation1 vmunix: panic (cpu 0): Machine check - Hardware error
Dec 23 10:31:03 nation1 vmunix: syncing disks... DUMP.prom: dev SCSI 0 4 0 0 300 0 FLAMG-IO, block 131072
Dec 23 10:31:03 nation1 vmunix: DUMP.prom: dev SCSI 0 4 0 0 300 0 FLAMG-IO, block 131072
Dec 23 10:31:03 nation1 vmunix: Alpha boot: available memory from 0x72e000 to 0x4000000
Dec 23 10:31:03 nation1 vmunix: Digital UNIX V4.0B (Rev. 564); Mon Aug 4 17:26:28 EDT 1997
Dec 23 10:31:03 nation1 vmunix: physical memory = 64.00 megabytes.
Dec 23 10:31:03 nation1 vmunix: available memory = 56.82 megabytes.
Dec 23 10:31:03 nation1 vmunix: using 238 buffers containing 1.85 megabytes of memory
Dec 23 10:31:03 nation1 vmunix: tc0 at nexus
Dec 23 10:31:03 nation1 vmunix: tcds0 at tc0 slot 4
Dec 23 10:31:03 nation1 vmunix: scsi0 at tcds0 slot 0
Dec 23 10:31:03 nation1 vmunix: rz3 at scsi0 target 3 lun 0 (LID=0) (DEC RZ29B (C) DEC 0014)
Dec 23 10:31:03 nation1 vmunix: ln0: DEC LANCE Module Name: PMAD-BA
Dec 23 10:31:03 nation1 vmunix: ln0 at tc0 slot 5
Dec 23 10:31:03 nation1 vmunix: ln0: DEC LANCE Ethernet Interface, hardware address: 08-00-2B-BA-AB-8F
Dec 23 10:31:04 nation1 vmunix: scc0 at tc0 slot 5
Dec 23 10:31:04 nation1 vmunix: bba0 at tc0 slot 5
Dec 23 10:31:04 nation1 vmunix: 1280X1024
Dec 23 10:31:04 nation1 vmunix: DEC 3000 - M300 system
Dec 23 10:31:04 nation1 vmunix: Firmware revision: 6.9
Dec 23 10:31:04 nation1 vmunix: PALcode: OSF version 1.45
Dec 23 10:31:04 nation1 vmunix: dli: configured
Dec 23 10:31:04 nation1 vmunix: ADVFS: using 566 buffers containing 4.42 megabytes of memory

I'm guessing that this is a soft reboot. Is that correct? If so, does anyone
have any idea how it could be happening without my explicit written consent?
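
One way to check whether the log shows a crash followed by a reboot, rather than daemons dying on their own, is to scan it for three markers that appear in the excerpt above: "MACHINE CHECK", "panic" and the "Alpha boot:" banner. The following is only a minimal sketch in Python; it assumes the syslog prefix looks exactly like the lines quoted here, and the function name find_crash_events is made up for illustration.

```python
import re

# Assumed line shape, taken from the excerpt:
#   "Dec 23 10:31:02 nation1 vmunix: <message>"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+vmunix:\s+(.*)$"
)


def find_crash_events(lines):
    """Return (timestamp, kind) pairs for machine checks, panics and boots.

    Hypothetical helper: lines without the syslog prefix are skipped.
    """
    events = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # no syslog prefix, not a kernel message line
        stamp, _host, msg = m.groups()
        if msg.startswith("MACHINE CHECK"):
            events.append((stamp, "machine_check"))
        elif msg.startswith("panic"):
            events.append((stamp, "panic"))
        elif msg.startswith("Alpha boot:"):
            events.append((stamp, "boot"))
    return events


if __name__ == "__main__":
    sample = [
        "Dec 23 10:31:02 nation1 vmunix: MACHINE CHECK type 0x660 Machine check abort",
        "Dec 23 10:31:03 nation1 vmunix: panic (cpu 0): Machine check - Hardware error",
        "Dec 23 10:31:03 nation1 vmunix: Alpha boot: available memory from 0x72e000 to 0x4000000",
        "Dec 23 10:31:04 nation1 vmunix: DEC 3000 - M300 system",
    ]
    for stamp, kind in find_crash_events(sample):
        print(stamp, kind)
```

A machine check and a panic immediately followed by a boot banner would point to the kernel going down and the machine restarting, which would also explain why every standalone daemon disappears at once.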

Hunting for clues...

ian

___________________________________________________________________________
ian wojtowicz <i_at_woj.com> http://woj.com
Received on Wed Dec 23 1998 - 17:15:43 NZDT
