OS bug in tu50, tu51

From: Michael A. Crowley <mcrowley_at_MtHolyoke.edu>
Date: Sun, 28 Jan 2001 13:24:26 -0500 (EST)

This is an update of an earlier posting with the subject:
        ?Bug in mv in TU50
I have now seen the problem in TU5.1 as well and have found a
new, related symptom that may be more fundamental.

If you wish to see more detail on this problem, please check:
        http://www.mtholyoke.edu/lits/network/tu5x-bug/

Generally, here is what is happening.

I am having an intermittent problem with the "mv" command
in some very critical areas. After a command such as:
        mv a b
both names are gone: neither a nor b exists afterward.

This is occurring in /etc on a UFS filesystem and is showing
up with files such as passwd, passwd.pag, passwd.dir, or group
vanishing when updated versions are moved into place.

The symptoms of the general problem began with TU5.0 running on a DS20.
However, I have installed TU5.1 pl2 on another DS20 and have
experienced the same problem there. Thus the problem is not tied
to one operating system version or to one machine.

The symptoms never occurred in Ultrix or DU4.0b, systems which ran
the script that hashed the passwd files in a subdirectory of /etc
(in the root file system) and moved the hashed files into place.

Although not documented here, I have also had problems with
a script that periodically moved the /etc/group file into place
after it was generated on another system. So the passwd files
are not the only ones affected.
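
For anyone running similar update scripts, here is a minimal sketch of a
more defensive "move into place" step. It is not the script I run; the
function name install_file and the ".bak" suffix are made up for
illustration. It keeps a copy of the old file and checks that the
destination exists after the mv. Given how intermittent the symptom is,
a single check right after the mv may well miss it, so treat this as a
way to limit the damage rather than as a fix.

```shell
#!/bin/sh
# Hypothetical helper: move NEW into place as DEST, keeping a backup copy.
# Usage: install_file NEW DEST
install_file() {
    new=$1
    dest=$2

    # Keep a copy of the current destination so it can be restored
    # if the destination is missing after the move.
    if [ -f "$dest" ]; then
        cp -p "$dest" "$dest.bak" || return 1
    fi

    mv "$new" "$dest" || return 1

    # Confirm that the destination exists and is non-empty.
    if [ ! -s "$dest" ]; then
        echo "install_file: $dest missing or empty after mv" >&2
        if [ -f "$dest.bak" ]; then
            cp -p "$dest.bak" "$dest"
        fi
        return 2
    fi
    return 0
}
```

A call such as "install_file group.new group" leaves the previous
contents in group.bak (both file names are only examples).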

I have found and documented a second symptom which suggests
that the "mv" problem is itself a symptom of something else.
On some occasions a file like "passwd" was fading in and out
of existence. Since that description sounds implausible,
you'll need to see the data from the operations that show the
"fading in and out" symptom. I have appended a data file with
this as an example.

In the /etc directory, I found that a command like:
        ls -li passwd
would sometimes report the file as missing and sometimes not.
I have this well documented. It is as if the file is fading in
and out of existence. When the file is present, its inode number
does not appear to have changed.

I constructed a program to watch for the occurrence of this
kind of event and log data when it occurs. (See the web site
listed above for the program.) I added "lsof /etc"
to the program to see if any files were listed as being open
by any process, but I have not found the fading file to
be open according to lsof.
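
The program I actually use is on the web site. For anyone who wants to
try replicating this, the idea can be sketched roughly as below. This is
an illustration only: the function names probe_state and watch_file, the
log line format, and the WATCH_LSOF switch are all made up for this
sketch. It polls one file with "ls -di" and appends a line to a log
whenever the file changes between present and missing or its inode
number changes.

```shell
#!/bin/sh
# Hypothetical sketch of a watcher; not the program from the web site.

# Print "present <inode>" or "missing" for the given path.
probe_state() {
    if out=$(ls -di "$1" 2>/dev/null); then
        set -- $out          # first word of the ls -i output is the inode
        echo "present $1"
    else
        echo "missing"
    fi
}

# Usage: watch_file FILE COUNT INTERVAL LOGFILE
# Polls FILE COUNT times, sleeping INTERVAL seconds between polls, and
# appends a timestamped line to LOGFILE each time the state changes.
watch_file() {
    file=$1
    count=$2
    interval=$3
    log=$4
    prev=""
    i=0
    while [ "$i" -lt "$count" ]; do
        state=$(probe_state "$file")
        if [ "$state" != "$prev" ]; then
            echo "$(date) $file $state" >> "$log"
            # Optionally record what lsof reports for the directory.
            if [ "${WATCH_LSOF:-0}" = 1 ]; then
                lsof "$(dirname "$file")" >> "$log" 2>&1 || true
            fi
            prev=$state
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, "WATCH_LSOF=1 watch_file /etc/passwd 3600 1 /var/tmp/fade.log"
would poll once a second for an hour; the log path is only an example.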

I found that when the problem of fading in and out was occurring,
I could mv the fading file to a new name and then cp the contents
back to the original name. At that point, the symptoms totally
disappeared. I have documented this occurring twice.
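
The workaround can be written as a small function, sketched here with a
verification step added. The function name recreate_file and the
".saved" suffix are made up for illustration; the mv and cp are the same
two commands I typed by hand. The cp creates a new file under the
original name, so that name ends up with a new inode while the old inode
stays under the saved name.

```shell
#!/bin/sh
# Hypothetical helper: recreate FILE under a new inode.
# Usage: recreate_file FILE
recreate_file() {
    file=$1
    saved=$file.saved

    # Move the original aside; its inode now lives under the saved name.
    mv "$file" "$saved" || return 1

    # Copy the contents back, which creates a new file at the old name.
    cp "$saved" "$file" || return 1

    # Succeed only if the two files have identical contents.
    cmp -s "$saved" "$file"
}
```

I do not know why this makes the symptom go away, so this is a
workaround and not a fix.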

Any help here would be appreciated, including someone replicating
this thing or figuring out how to make it _less_ intermittent so
the cause might be determined. (The CSC ref# is c010126-2732.l)
One of the data files is appended below.

-mike

-----------------------------------------------------------------------------
 Michael A. Crowley Director of Networking
 mcrowley_at_mtholyoke.edu 216 Dwight Hall, Mount Holyoke College
 413-538-2140 fax: 413-538-2331 South Hadley, MA 01075-6415
 http://www.mtholyoke.edu/~mcrowley http://www.mtholyoke.edu/lits/network
-----------------------------------------------------------------------------

Here are the data from one such fading episode. For more data, see
the web site listed above. Doing a "mv" and "cp" back fixed it:
=============================================================================
Script started on Wed Jan 17 15:37:36 2001

# ls -li passwd ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 passwd
Wed Jan 17 15:38:26 EST 2001
# ls -li passwd ; date
ls: passwd not found
Wed Jan 17 15:38:27 EST 2001
# ls -li passwd ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 passwd
Wed Jan 17 15:38:28 EST 2001
# ls -li passwd ; date
ls: passwd not found
Wed Jan 17 15:38:28 EST 2001
# ls -li passwd ; date
ls: passwd not found
Wed Jan 17 15:38:29 EST 2001
# ls -li passwd ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 passwd
Wed Jan 17 15:38:30 EST 2001
# ls -li passwd ; date
ls: passwd not found
Wed Jan 17 15:38:30 EST 2001
# ls -li passwd ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 passwd
Wed Jan 17 15:38:31 EST 2001
# ls -li passwd ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 passwd
Wed Jan 17 15:38:32 EST 2001
# ls -li passwd ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 passwd
Wed Jan 17 15:38:32 EST 2001
# ls -li passwd ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 passwd
Wed Jan 17 15:38:33 EST 2001
#
#
# mv passwd p
# cp p passwd
#
#
# ls -li passwd ; date
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:38:55 EST 2001
# ls -li passwd ; date
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:38:56 EST 2001
# ls -li passwd ; date
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:38:57 EST 2001
# ls -li passwd ; date
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:38:58 EST 2001
# ls -li passwd ; date
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:38:58 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:07 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:12 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:13 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:13 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:14 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:15 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:15 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:16 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:17 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:17 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:18 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:18 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:19 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:19 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:20 EST 2001
# ls -li passwd p ; date
63444 -rw-r--r-- 1 root system 967593 Jan 17 15:20 p
88967 -rw-r--r-- 1 root system 967593 Jan 17 15:38 passwd
Wed Jan 17 15:39:20 EST 2001
# exit
#
script done on Wed Jan 17 15:39:32 2001
Received on Sun Jan 28 2001 - 18:26:33 NZDT
