-----BEGIN PGP SIGNED MESSAGE-----
Hi Folks,
There seems to be a problem with keystrokes not being delivered to a
process with a read pending on /dev/tty, under DU 4.0b.
There are two circumstances in which we have seen this.
1) The operators start an xterm on their workstation, and rsh to a machine
with a tape drive, then by means of sudo proceed to run a script called
backup with root privs.
2) From my own workstation I use ssh to start an xterm running on the
machine with the tape drive and then su, then run the same backup script.
The script determines which machine to back up, then tests
connectivity to it by rcp-ing /etc/fstab. Then it tests the tape drive by
writing something and reading it back. Then it starts the backup whose
central theme is:
for each fileset or partition on remotehost
rsh remotehost dump or vdump it to standard output | dd of=tape obs=whatever
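
A rough sketch of that loop in /bin/sh terms, for anyone trying to picture it. This is not the real script: the function names, the default tape device, the block size and the dump command line are all assumptions made for illustration, and the list of filesystems is assumed to come from the fstab copied over during the connectivity test.

```shell
#!/bin/sh
# Sketch only. Every name and default below is hypothetical.
REMOTE=${REMOTE:-remotehost}
TAPE=${TAPE:-/dev/nrmt0h}           # assumed no-rewind tape device
OBS=${OBS:-64k}                     # stands in for "obs=whatever"
RSH=${RSH:-rsh}
DUMP_CMD=${DUMP_CMD:-"dump 0f -"}   # or a vdump equivalent for AdvFS filesets

# Print the mount point (second field) of every non-comment fstab entry.
list_filesystems() {
    awk '$1 !~ /^#/ && NF >= 2 { print $2 }' "$1"
}

# For each filesystem, dump it on the remote host to standard output
# and let dd write the stream to tape.
backup_all() {
    fstab=$1
    for fs in $(list_filesystems "$fstab"); do
        $RSH "$REMOTE" "$DUMP_CMD $fs" | dd of="$TAPE" obs="$OBS"
    done
}
```

Something like `backup_all /tmp/fstab.remotehost` would then drive it. Note that dd's standard input is the pipe from rsh, so any operator prompt has to use some other descriptor.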
Generally this works. But two of our machines have disks or
filesets bigger than one tape.
In those cases, some hours after the backup was started, dd will
write a message to the screen saying `end of media, wait for closing, press
return when next tape loaded [q]', or words to that effect.
At this point `lsof -p <dd process>' shows that the controlling
terminal is open on fd's 1 and 2, the pipe from rsh is open on fd's 0 and 3,
and /dev/tty is open on fd 4. As expected. And the offset for fd 4 is 0t0.
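
For illustration, the pattern the lsof output suggests (stdin busy with the pipe, prompt on the terminal, answer read from /dev/tty) looks roughly like this in shell. This is not dd's actual code; the function name and the prompt text are made up, and the optional argument exists only so that something other than /dev/tty can be substituted when trying it out.

```shell
#!/bin/sh
# Sketch only: prompt on stderr, read the reply from the controlling
# terminal rather than from stdin, which is still the data pipe.
wait_for_next_tape() {
    tty_dev=${1:-/dev/tty}          # hypothetical override for experiments
    printf 'end of media, load next tape and press return [q]: ' >&2
    # Read one line from the terminal device; fail on EOF or error.
    IFS= read -r reply < "$tty_dev" || return 1
    # Anything except "q" means "carry on".
    [ "$reply" != q ]
}
```

A stub like this, run in place of dd under the same xterm/rsh/sudo arrangement, might be one way to check whether the read on /dev/tty misbehaves without waiting hours for a tape to fill.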
Now one would expect that changing the tape and typing <return>
would make the backup continue. But that very rarely happens.
On several occasions the operator had to type two or more, sometimes
many many more, returns before the backup continued.
Indeed, in one case I saw myself, I typed <return> and then went to
the other window to rerun the lsof. The offset of fd4 was still 0t0. I
then typed `y<return>a<return>'. The offset was now 0t2, and the tape drive
had not been re-opened. I then typed around 24 returns. That made no
difference to the offset.
The actual terminal (/dev/ttyp0 or whatever) open on fd's 1 and 2
did have its offsets incremented appropriately according to lsof. Only
/dev/tty open on fd 4 remained unmoved.
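
Rerunning lsof by hand in another window can be replaced by a small loop. The sketch below assumes an lsof that accepts -a (AND the selections), -p (process id) and -d (descriptor list); the function name, the interval and the repeat count are illustrative only.

```shell
#!/bin/sh
# Sketch only: print a timestamp and the lsof line for one descriptor
# of one process at intervals, for as long as the process exists.
LSOF=${LSOF:-lsof}

watch_fd() {
    pid=$1
    fd=$2
    interval=${3:-5}    # seconds between samples
    count=${4:-0}       # 0 means keep going until the process exits
    n=0
    while kill -0 "$pid" 2>/dev/null; do
        date
        $LSOF -a -p "$pid" -d "$fd"
        n=$((n + 1))
        if [ "$count" -gt 0 ] && [ "$n" -ge "$count" ]; then
            break
        fi
        sleep "$interval"
    done
}
```

For example, `watch_fd <dd process> 4` would show whether the offset on /dev/tty moves as returns are typed.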
(One of our disks requires three tapes. After (eventually) accepting
a return on fd 4 for the second tape, dd opens the tape on fd 5. When
asking for confirmation of the second tape swap it again behaves badly,
requiring many returns before opening the third tape on fd 5. /dev/tty
remained open on fd 4 throughout.)
All the characters typed are echo-ed, returns move the cursor to
the left column of the next line and so on.
It doesn't make any difference if the commands are typed immediately
after the end of media message or hours later.
It doesn't make any difference if I use /usr/bin/dd or /sbin/dd or
a /sbin/dd copied from a DU 3.2 machine.
It doesn't make any difference if the tape is a DLT or an Exabyte.
As far as we can remember this never happened under DU 3.2.
Applying the duv40bas00003-19970425 jumbo patch made no difference.
(Some of the patch didn't apply though, because we don't have UUCP installed).
Because it takes so long to create the situation, I've tried
replacing the name of the tape drive with /dev/rfd0c which fills up in about
a minute. I pressed return every minute or so for most of a day without
seeing the problem.
The only non-standard thing we do is use tcsh as the user shell. The
backup script itself is a /bin/sh script.
We can only partially work around this. In one case we can do the
backup on three tapes, but in another case there is one single fileset
bigger than the tape capacity.
Anyone know of a patch guaranteed to fix this problem? Or any
workaround?
Keith
-----BEGIN PGP SIGNATURE-----
Version: 2.6.2i
iQCVAwUBM6hwAnEpE0nRVDfpAQFuAQP+OvDcW2SNXs8okup7hO9fW+58BEAh7fQ4
kb7UuoC18lq8h6MbWoM2xHxph0RQVVMaC2+q8u14Cxo86YG/GeO5tUrL5ByvT4bA
wDy4SwfC+1f5Rn9gkEEt0S5wROXfKqnyA8TMgezfsWuvf8hg+w0gVnKuK0ONVLqL
ULFmYy5cBfQ=
=avIP
-----END PGP SIGNATURE-----
Received on Thu Jun 19 1997 - 01:40:31 NZST