Hello,
I've been on the phone with DEC for about a month now, and
figured I'd try another avenue, since they have yet to provide answers.
I work with an AlphaServer 2100A with the following
hardware configuration:
2 CPUs (was a single CPU, still the same problem that I'll describe in a sec)
20 GB of RAID 5 on a RAID controller whose model I can't remember, but
it comes up as xcr0 in uerf
640 MB of RAM on two RAM boards
100 Mbit Ethernet 21140-AA
10 Mbit Ethernet 21040-AA
DU 4.0a is the software on this puppy. Now, we have a 3.2g machine
(actually any 3.2g machine will do, not just this one) serving NFS, with
maillocking set to 0 (lockf). Everything works just fine except when the
4.0a machine, as the NFS client of the 3.2g machine, tries to lockf a file
on an NFS-mounted filesystem. I've even written a stupid C program to test
this, and it just sits there. It never even times out with an error
... I let it run all night!
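
A lockf test of this kind can be sketched roughly as below. This is a
minimal illustration, not the exact program mentioned above: the function
name try_lockf, the alarm-based timeout, and the idea of locking the whole
file are all assumptions made for the example.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Empty handler: its only job is to make a blocked lockf() return EINTR. */
static void on_alarm(int sig) { (void)sig; }

/*
 * Hypothetical helper: take and release an exclusive lock on the whole
 * file at `path`. Returns 0 if the lock was granted, -1 on error or if
 * it was not granted within `timeout_sec` seconds (errno is EINTR then).
 */
int try_lockf(const char *path, unsigned timeout_sec)
{
    struct sigaction sa;
    int fd, rc, saved;

    /* lockf() needs a descriptor that is open for writing. */
    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* No SA_RESTART, so SIGALRM interrupts the call instead of restarting it. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    alarm(timeout_sec);
    rc = lockf(fd, F_LOCK, 0);   /* blocks until granted or interrupted */
    saved = errno;
    alarm(0);                    /* cancel any pending alarm */

    if (rc == 0)
        lockf(fd, F_ULOCK, 0);   /* release the lock again */
    close(fd);
    errno = saved;
    return rc;
}
```

Calling try_lockf() from a small main() with a file on the NFS mount (any
path you pick for the test) and printing strerror(errno) on failure is
enough to tell "granted" from "hung". One caveat: on a hard NFS mount
without the intr option the signal may not interrupt the blocked call, in
which case even this version would just sit there.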
Anywho, I also run 4.0a on a Multia/UDB, and the same lockf tests that
fail on the 2100 WORK on the Multia.. Uhhh. ok..
The UDB is a stock UDB plus a 2 GB external disk and 48 MB of RAM.
So, I'm at a complete loss. I have even replaced rpc.lockd and rpc.statd
with 3.2g bins just to see if that would change anything. Nope.
Yes, they're running. Yes, the machine's been rebooted (many many
times). DEC has NO clue. I have NO clue.
I'll explain the software mix on this machine, but I have a feeling it's
more system/kernel-related than the software.
It runs
wu-ftpd (the academ-beta virtual ftp version)
apache stronghold
radius
sendmail 8.8.5
procmail (who knows what version)
and probably a few other misc. things
The 4.0a system that works is not running really anything - it's more of
a toy.
/etc/sm on the failing system HAS had entries in it, though I've seen it
empty while I'm failing on a lockf. /etc/state has a 1 in it.. I reset
that yesterday. I've seen it as high as 27.
binmail is always complaining about lockf, so I've mounted the queue on
another system and run it from there to work around the problem (cute
solution, huh?) ...
Can't think of any other info that would help.. Anyway.. Trust me, I've
gone through everything, and my primary job, though not a long-time job,
is system admin, so I know what I'm doing (to an extent :), and I've been
through all the hoops with DEC.. it's not like I'm missing something
stupid (though I could be..) anyhow.. anyone else seen this?
Oh, by the way, 4.0a -> 4.0a (udb -> 2100), where the lockf test is
performed on the 2100a, also fails. But a lockf from the udb works
fine.. Go figure.
Something just ain't right. Ideas? I'd _really_ appreciate it!!
TIA,
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
: Benjamin C. Goodwin - ben_at_acol.com :
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Received on Fri Feb 28 1997 - 02:18:11 NZDT