Wow, three minutes, a new record! Thanks to Sean O'Connell
[sean_at_stat.Duke.EDU].
His response:
I believe this has been fixed in later versions (patchkits?) of
Tru64. There is a global kernel lock that gets invoked on an
rmfset. This was apparently quite nettlesome with respect to using
clonefset/rmfset during backups of large AdvFS filesystems.
My original question:
This is just a question, not really important, as I don't use rmfset too
often. I'm actually just curious.
I added some storage to my machine, a 4100 running 4.0d pk3, and as a
result was moving around some file systems to make more effective use of
the extra storage. This left me with 3 filesets on 3 domains that I no longer
needed. I ran the first rmfset and it took a while, as it was a fairly large
fileset I was removing. My system monitoring package, BigBrother, suddenly
turned red and indicated that ftp was down on the machine I was working on.
I tried to ftp to the machine myself from two other machines. It said it
was "connecting" but just sat there stuck, with no login prompt. I checked
to see if the rmfset was using a lot of CPU and just making response slow.
That was not the problem: top revealed that rmfset was the top user, but
the CPU was 63% idle. I checked inetd and it was running, and telnet was
working, as well as ssh. Then just as suddenly BigBrother was green again,
and the two machines
that were stuck were now asking for a name and I was able to log in. I
checked on my rmfset and it was complete.
I ran my next rmfset and again ftp went down on that machine. Again two
separate machines got stuck in the same place. I had multiple windows open
and this time I noticed that as soon as the prompt returned on the rmfset
machine, the other two machines immediately asked for the ftp login, and I
was able to get in.
Just a coincidence? I tried it with my final rmfset and got identical
results. ftp stopped working just as soon as I started the rmfset, and came
back just as soon as the rmfset completed. I should add that none of these
filesets were in the root_domain or usr_domain. All three were in domains
created to hold local data, on a mirrored HSZ controller.
I'm just starting to get familiar with the Alpha architecture, and as I
said the CPU was not overloaded, and you could log in to the system via
telnet and ssh and run other commands, so I was wondering if this was some
kind of I/O bus problem. I can't think of any other logical reason why one
program would cause a problem with an unrelated program. Anybody seen this?
Received on Wed Feb 16 2000 - 18:29:27 NZDT