[Q] df and quotas going haywire!

From: Clare West <clare_at_cs.auckland.ac.nz>
Date: Wed, 23 Oct 1996 17:20:56 +1300

Well it's me again. No more kernel panics, but things are very strange.

Here is the output from my (formatted) df -k command for local disks:

# df -k
Filesystem           1024-blocks    Used  Available  Capacity  Mounted on
root_domain#root           98304   78566      13432       86%  /
/proc                          0       0          0      100%  /proc
usr_domain#usr            861184  486482     358136       58%  /usr
/dev/rz6c                 635230  446958     188272       71%  /cdrom
tmp_dmn#tmp_fs            200200  547025          0      274%  /tmp
r0_dmn#grad_fs           2064384  551646      10288       99%  /users/studs/grad
r0_dmn#ugrad_fs          2064384  508660      10288       99%  /users/studs/ugrad
r1_dmn#ass_fs            2064384  549169    1281408       30%  /ass
r1_dmn#staffc_fs         2065408  538328    1281408       30%  /users/staffc
r2_dmn#grada_fs          2064384  546119    1518265       27%  /users/studs/grada
r2_dmn#foo               2064384  547783    1516601       27%  /foo
gradb_dmn#gradb_fs        895048  547783     329144       63%  /users/studs/gradb

The first strange occurrence was when I noticed the 274% full /tmp. The
file system seemed relatively empty: I couldn't see any strange files
with lsof, and I could still create a 20MB file with dd. Currently a du
reports:

# du -sk /tmp
672 /tmp
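
For the record, the checks were along these lines (the +L1 option, which
lists open-but-unlinked files, may not be in every lsof build, and the
ddtest file name is just for illustration):

# lsof /tmp
# lsof +L1 /tmp
# dd if=/dev/zero of=/tmp/ddtest bs=1024 count=20480
# rm /tmp/ddtest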

The other filesystems with obviously incorrect data are:

r0_dmn#grad_fs, r0_dmn#ugrad_fs: Available is probably right, but Used is
                                 definitely wrong
r1_dmn#ass_fs, r1_dmn#staffc_fs,
r2_dmn#grada_fs, r2_dmn#foo: Used and Available both look wrong, e.g.:

# du -sk /users/staffc
1635466 /users/staffc

dxadvfs shows similar screwy data. For file domains the numbers look about
right, but for individual file sets they are similar to the df results
above. showfdmn seems to give ok data -- but just for file domains.
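
In case it is useful, the command-line version of the comparison is
roughly this -- showfdmn for the domain totals (which look OK, as above)
and showfsets for the per-file-set block counts and quota limits (which I
would expect to show the same bad numbers as df):

# showfdmn r0_dmn
# showfsets r0_dmn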

The r[012]_dmn domains are all on 4GB Seagate disks, configured as JBOD
on our RAID controller (output from uerf):

xcr0 at pci0 slot 7
re0 at xcr0 unit 0 (unit status = _ONLINE, raid level = JBOD)
re1 at xcr0 unit 1 (unit status = _ONLINE, raid level = JBOD)
re2 at xcr0 unit 2 (unit status = _ONLINE, raid level = JBOD)

Each file set is kept below 2GB with file set quotas, as we NFS-mount
them on old machines that do not cope correctly with file systems larger
than that.
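
(For reference, these are AdvFS file set quotas set with chfsets; the
option letter and block units below are from memory, not the man page,
so check chfsets(8) before relying on them.)

# chfsets -B 2097152 r0_dmn grad_fs    (2GB hard limit, assuming 1K blocks)
# showfsets r0_dmn grad_fs             (confirm the limit and current usage)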

In the process of looking at the file systems, I noticed that quotas had
gone mad too. Consider one of our students who uses about 70MB on
r0_dmn#grad_fs. Using vedquota I see the following:

/users/studs/grad: blocks in use: 0, limits (soft = 81920, hard = 87040)
        inodes in use: 0, limits (soft = 0, hard = 0)

yet a du -sk on his home directory shows:

73844 fredfish

(I just checked: the files in his home directory are owned by him, and
he is the only person in our YP database with his UID.)
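
For anyone who wants to repeat the cross-check, something like the
following should do it (I believe quota -v reports the same per-user
numbers as vedquota, and the home directory path here is assumed):

# quota -v fredfish
# du -sk /users/studs/grad/fredfish
# find /users/studs/grad -user fredfish -print | wc -l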

I am also still getting lots of these in /var/adm/messages:

Oct 23 17:07:36 cs26 vmunix: chk_blk_quota: user/group underflow
Oct 23 17:07:36 cs26 vmunix: chk_bf_quota: user/group underflow

From looking at the archives it seemed that vquotacheck was the answer, but...

 Things I have done recently:

  -- run vquotacheck (and quotacheck -- but they are the same?)
     The first few times I ran this it seemed great -- it fixed up a
     problem with a student whose usage was about 3MB but who the quota
     system thought was using 20MB. I just ran it now, and it's fixing
     people's usages to 0. I first ran this program on Friday, 11/10/96
     (10/11/96 in US usage).

Examples from the vquotacheck -guv -a I just ran:

/users/staffc: clare fixed: inodes 17 -> 0 blocks 36 -> 0

This is me! I'm using up plenty of blocks on this filesystem.

/users/studs/ugrad: bwue001 fixed: inodes 16755 -> 16684 blocks 547898 -> 545601
/users/staffc: bwue001 fixed: inodes 16755 -> 16684 blocks 547898 -> 545601
/foo: bwue001 fixed: inodes 16755 -> 16684 blocks 547898 -> 545601

As far as I know, this user has no files on any of these file systems,
although the numbers look about right for what he does have on the one
file system where I do expect him to have files. I just looked with
vedquota and it reports him using 545601 blocks (which I think works out
to roughly 545MB) on every file system with quotas turned on.

  -- installed the pmap.o patch OSF400-032 mentioned in my previous message
     ([S] panic (cpu 1): kernel memory fault)
  -- installed these patches: OSF400DX-001, OSF400-045, OSF400-062 in a vain
     attempt to fix the problem. I can back them out if needed.

The semester has just finished here, so these problems are not urgent. I
happened to run into a Digital engineer today, and he seemed to think
that the df bug is an old one cropping up again. I will be opening a call
with Digital in the not too distant future (probably next week).

Any advice as to how to proceed would be gratefully accepted.

clare

--
Clare West, Rm 107, Ext 8266
clare_at_cs.auckland.ac.nz