SUMMARY: Infinite Inode Loop

From: hans van der heide <hans.vanderheide_at_bluewin.ch>
Date: Fri, 10 Mar 2000 15:22:12 +0100

Hi,

Thanks to Larry Garrett, Oisin McGuinness, Corinne Haesaerts, Saulo Murguia,
Joerg Bruehe, St. Suika Fenderson Roberts, and last but not least Steve
Hancock for answering my question so quickly.

Some suggested simply removing the directories, but that results in an
endless loop.
Others said I should identify the inode and remove it, which I could not do
for lack of a suitable program.
Many said the "." and ".." entries were damaged and that I should try to
find a way to write to the directory file, changing the ">" character to a
".". As this is a production system belonging to one of our customers, I did
not take the risk of playing around.
Some also proposed to "unlink" or "clri" the directory; neither worked.
Most also warned me against doing too many tricky things.

The most valuable answer came from Steve Hancock, who wrote:

> You have a corrupted directory (inode 1544). Is this UFS or
> AdvFS? If UFS, then you need to remove the directory using clri(8)
> and fsck(8) the file system. If AdvFS, I can give you a procedure
> for removing it, if you like. I need "showfile -x" output from
> the directory itself and showfdmn from the domain.
>
> Steve

and:

> It is hard to say what caused the problem. If I had to guess I would
> suspect a bad block was taken. When a bad block is revectored, it will try
> to read the data over to the new block, but sometimes it only partially
> succeeds. You may have experienced this phenomenon here. I would surf the
> binary error log and look for recent BBR entries. Also, the kern.log is a
> good place to look, but your system may have automatically aged off the
> one you want if it was over 10 days ago.
>
> You could leave the bad directory where it is and, hopefully, you won't
> run into any additional bad areas. Or, you can try to correct this
> localized corruption. One of the first things to try is use verify and see
> if that will clean it up. I haven't had much luck with that, however.
> verify(8) is much better for metadata corruptions which this isn't and it
> could cloud the problem by changing the procedure below.
>
> The (somewhat risky) procedure to follow is:
>
> 1) vdump(8) the entire domain.
>
> 2) unmount the filesets.
>
> 3) Write zeros over the bad area.
>
> use:
>
> dd if=/dev/zero of=<char_dev> bs=512 oseek=<location> count=<count>
>
> where the values for you are:
>
> char_dev = /dev/rrz0g
> location = 1357264
> count = 16
>
> 4) Mount the file system.
>
> 5) Remove the directory with rmdir(1).
>
> 6) If something goes wrong, remake and vrestore(8) the file system.
>
>
> This should get rid of only that bad directory, leaving the rest of the
> domain intact.
>
> Steve

(the numbers come from "showfile" and "showfdmn")
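
As a sanity check on those numbers, the block values can be converted to byte
offsets before anything is written. The sketch below only does the arithmetic
and prints the dd command from step 3 as a dry run; it does not execute it.
The variable names byte_offset and byte_length are my own, and the values are
the ones quoted above for this particular disk, so they must not be reused
elsewhere.

```shell
#!/bin/sh
# Values quoted in the procedure above (specific to this one disk).
char_dev=/dev/rrz0g
location=1357264   # first 512-byte block to overwrite
count=16           # number of 512-byte blocks to overwrite
bs=512             # block size passed to dd

# Convert block units to bytes (hypothetical helper variables).
byte_offset=$((location * bs))
byte_length=$((count * bs))

echo "start offset: $byte_offset bytes"
echo "length:       $byte_length bytes"

# Dry run only: print the command instead of executing it.
echo "dd if=/dev/zero of=$char_dev bs=$bs oseek=$location count=$count"
```

The length works out to 16 x 512 = 8192 bytes, the same size that ls reports
for the directory in the listing further down, which suggests the range covers
the directory data and nothing more.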

and:

> If you did not see any BBRs (or other hardware problems on this disk)
> back to June, it is probably not failing hardware. If the system's been
> in operation for some time, the problem may have been there for a
> while, but doesn't seem likely. Check your drive firmware and make sure
> it's up-to-date. It could have been some kind of transient software or
> hardware problem, perhaps (some things in su mode do not get logged).
> A CPU or memory exception could cause corruptions too. Also, if this
> domain was created and used before you upgraded to 4.0D, then there
> were some known issues and you should have verified everything before
> the upgrade, per the installation guide.
>
> Patches are a prudent measure. If the patches they have installed are
> quite old, I would make sure you rule out any known kernel corruptors
> that have been fixed by newer patches. They should be planning for patch
> kits as soon as they come out. I think you can get on a mailing list
> from the Services web site which lets you know when new ones come out.
>
> Recreating the file system is a good option, in my opinion. You can try
> to vdump selected directories and perform selected restores, if the
> data is important enough. I've seen that work in these cases. Otherwise,
> remake it with mkfdmn -o and restore from the old backup tape (check
> to make sure you can read it first :-).
>
> Please try to convince your customer of the value of regular backups. I
> have a simple script I wrote years ago to perform nightly backups which
> has saved my bacon many times (this week, in fact :-). It doesn't take
> long to write such a thing and you could charge them a few bux to do
> it, if they lack the expertise :-).
>
> Good luck,
>
> Steve

which I will try next week when I am on site.

Original question:

> I have the following trivial problem:
>
> On one of our systems (Tru64 4.0D, advfs) I find a directory with an
> infinite loop of inodes in it, a subdirectory points to the directory
> itself. Since this system cannot be shut down and the directory is in /usr,
> I cannot run verify, which might eventually fix the problem by itself.
> How can I remove / patch the directory in a running system?

and my follow-up

> Maybe I did not make myself clear. The problem looks like this:
>
> cd /usr/inodes_with_infinite_loops
>
> ls -ali
>
> 1544 drwxr-xr-x 9 root system 8192 Mär 9 13:38 .
> 2 drwxr-xr-x 27 root system 8192 Jul 26 1999 ..
> 17977 drwxr-xr-x 2 root adm 8192 Jul 26 1999 15-Jul-15:24
> 994 drwxr-xr-x 2 root system 8192 Jul 26 1999 cron
> 1689 drwxr-xr-x 2 root system 8192 Jul 26 1999 esnmp
> 1687 drwxrwxrwx 2 root system 8192 Jul 26 1999 samba
> 637 drwxr-xr-x 2 root system 8192 Jul 26 1999 sambavar
> 1042 drwxr-xr-x 2 root adm 8192 Jul 26 1999 syslog.dated
> 1542 drwxrwxrwx 2 root system 8192 Jul 26 1999 tmp1
>
> ls -aliR tmp1
>
> total 16
> 1542 drwxrwxrwx 2 root system 8192 Jul 26 1999 >
> 1544 drwxr-xr-x 9 root system 8192 Mär 9 13:38 >.
>
> tmp1/>:
> total 16
> 1542 drwxrwxrwx 2 root system 8192 Jul 26 1999 >
> 1544 drwxr-xr-x 9 root system 8192 Mär 9 13:38 >.
>
> tmp1/>/>:
> total 16
> 1542 drwxrwxrwx 2 root system 8192 Jul 26 1999 >
> 1544 drwxr-xr-x 9 root system 8192 Mär 9 13:38 >.
>
> tmp1/>/>/>:
> total 16
> 1542 drwxrwxrwx 2 root system 8192 Jul 26 1999 >
> 1544 drwxr-xr-x 9 root system 8192 Mär 9 13:38 >.
>
> tmp1/>/>/>/>:
> total 16
> 1542 drwxrwxrwx 2 root system 8192 Jul 26 1999 >
> 1544 drwxr-xr-x 9 root system 8192 Mär 9 13:38 >.
>
> and so on. Running the following shows:
>
> ls -ali tmp1
>
> total 16
> 1542 drwxrwxrwx 2 root system 8192 Jul 26 1999 >
> 1544 drwxr-xr-x 9 root system 8192 Mär 9 13:38 >.
>
> cd tmp1 ; ls -ali
>
> . not found
>
> and find results in
>
> find tmp1 -inum 1544
>
> find:
>
tmp1/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
> />/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
>
/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
> />/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
>
/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
> />/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
>
/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
> />/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
>
/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
> />/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
>
/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
> />/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
>
/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
> />/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
>
/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>/>
> />/>/>/>/>/>/>/>/>/>/>/>/>/>/>. Path name too long.
> find: bad directory <..>
>
> rmlink tmp1 does not change anything.
>
> Note the ">" in the "." and ".." entries. These look very strange to me.
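
The loop in the listings can be recognised from inode numbers alone: a
directory entry that carries the same inode as one of its ancestors leads back
into the same directory. Below is a minimal sketch of that check in plain sh.
The function names inode_of and find_inode_loop are made up for illustration,
and the script only reads inode numbers through ls; it repairs nothing.

```shell
#!/bin/sh
# Print the inode number of a path, following symbolic links (-L).
inode_of() {
    ls -dLi "$1" 2>/dev/null | awk '{print $1}'
}

# Walk the prefixes of a relative path (a, a/b, a/b/c, ...) and report the
# first prefix whose inode already appeared earlier on the same path.
# Returns 0 if a repeat was found, 1 if not, 2 if an inode could not be read.
find_inode_loop() {
    rest=$1
    prefix=""
    seen=" "
    while [ -n "$rest" ]; do
        # Split off the next path component.
        part=${rest%%/*}
        case $rest in
            */*) rest=${rest#*/} ;;
            *)   rest="" ;;
        esac
        [ -n "$part" ] || continue
        prefix=${prefix:+$prefix/}$part
        ino=$(inode_of "$prefix")
        [ -n "$ino" ] || return 2
        case $seen in
            *" $ino "*)
                echo "loop: $prefix repeats inode $ino"
                return 0 ;;
        esac
        seen="$seen$ino "
    done
    return 1
}
```

For the listing above, find_inode_loop 'tmp1/>/>' would be expected to stop at
tmp1/>, because tmp1 and its ">" entry both show inode 1542.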

Thanks to all who helped put me on the right track,

Hans van der Heide (hans.vanderheide_at_bdl.ch)
BDL Informatik GmbH (www.bdl.ch)
Tel: +41 (0) 56 442'40'31
Fax: +41 (0) 56 442'40'33
Received on Fri Mar 10 2000 - 14:21:41 NZDT
