SUMMARY: 32k limit to a directory?

From: Jerome M Berkman <jerry_at_uclink.berkeley.edu>
Date: Tue, 10 Jul 2001 09:56:30 -0700 (PDT)

The short answer to my question is that a directory cannot have more
than 32k subdirectories, but it can hold more than 32k files, as long
as they are not all hard links to the same file.

        - Jerry Berkman, UC Berkeley

Thanks to:

1. Claude Scarpelli <claude_at_genoscope.cns.fr>

He had posted a description of the problem in April, see:
http://www.xray.mpe.mpg.de/mailing-lists/tru64-unix-managers/2001-04/msg00158.html

2. Matthias Dolder <matthias.dolder_at_compaq.com>

I've run the script to create 25k and 75k files on my XP1000 6/667
running V5.1, patchkit 3. System was writing to a single ultra-scsi
disk via kzpba ultra-scsi controller. As you can see from the attached
'scripts' [not attached to the SUMMARY], creation times remain constant
up to 75k files. Also total elapsed time increases linearly. Note that
the advfs domain was created under v5.1. The on-disk format changed
from Tru64 v4 to v5; v5 supports both the old and the new on-disk
formats and does not convert existing domains. So if you plan to go
from T64 v4 to v5, you need to recreate the advfs domains to get the
better 'lots of files' support.
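
[A minimal sketch of this kind of timing test, since the original
script is not attached. It is not the script used above: the function
name, file names and batch size are made up for illustration, and it
measures wall-clock time per batch rather than system time.]

```python
import os
import tempfile
import time


def time_file_creation(total, batch, parent=None):
    """Create `total` empty files in one fresh directory.

    Returns a list with the elapsed seconds for each batch of `batch` files.
    The directory and its files are removed before returning.
    """
    timings = []
    with tempfile.TemporaryDirectory(dir=parent) as d:
        start = time.perf_counter()
        for i in range(1, total + 1):
            # Open and close immediately: only the directory insert matters here.
            with open(os.path.join(d, f"f{i:06d}"), "w"):
                pass
            if i % batch == 0:
                now = time.perf_counter()
                timings.append(now - start)
                start = now
    return timings
```

Calling time_file_creation(75000, 25000) would give one figure each for
the first, second and third 25k files; roughly equal figures suggest
that creation cost is not growing with directory size.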

3. "Dr. Thomas.Blinn_at_Compaq.com" <tpb_at_doctor.zk3.dec.com>

The problem is apparently with hard links to a given file. If you make
your script create a file, then try to create over 32K hard links to the
same file, I believe it will fail on AdvFS (and succeed on UFS). When
you create a new subdirectory in a given directory, the subdirectory
has a ".." entry which is "hard-linked" to the parent directory's "."
and if you look at the "." entry in a directory that has subdirectories
with "ls -ald ." you will see it has multiple hard links. Hard linking
LOTS of names to a single file is uncommon, but part of the way that the
directory structure's integrity is maintained (and the way you walk up
the tree from a given directory) is through hard-linking the special name
".." in each subdirectory to the parent directory. (You can't hard link
a random file name/entry to a directory by yourself; this is file system
stuff.) I haven't actually tried the "hard link lots of files to one
i-node" test, but the directory linking does fail. Some idiot used a
short where there should have been an int or long and it took a long time
to find and fix it, because it was in a basic data structure used in a
lot of places in the AdvFS file system code and utilities; it wasn't an
easy problem to fix.
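
[The link-count behaviour described above can be observed with a short
sketch like the following. The function name is hypothetical, and the
"2 + number of subdirectories" convention is an assumption about the
filesystem under the temporary directory: traditional Unix filesystems
follow it, but some newer ones report a fixed link count for
directories, so the observed value may differ.]

```python
import os
import tempfile


def directory_link_counts(n_subdirs):
    """Return (observed, expected) link counts for a directory
    that contains `n_subdirs` subdirectories."""
    with tempfile.TemporaryDirectory() as parent:
        for i in range(n_subdirs):
            os.mkdir(os.path.join(parent, f"sub{i}"))
        # st_nlink is what "ls -ald ." prints in its second column.
        observed = os.stat(parent).st_nlink
    # One link for the name in the parent's parent, one for its own ".",
    # and one ".." entry from each subdirectory.
    expected = n_subdirs + 2
    return observed, expected
```

On a filesystem that follows the convention, directory_link_counts(3)
reports the same number twice.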

4. Chris Adams <cmadams_at_hiwaay.net>

The limitation is on the number of subdirectories in a single directory.

For example, we had our users' home directories like /home/<username>.
When we hit 32K users, we had 32K subdirectories in /home. That is a
limit because of the way Unix filesystems work. When you create a
subdirectory, it includes the ".." entry as a hard link to the parent
directory. You can only have a fixed number of hard links (32K on
AdvFS) to an object in the filesystem, so when you create 32K
subdirectories, there are 32K hard links to the parent directory, so you
cannot create another subdirectory.

Creating lots of files works just fine. It will slow down the system
access to that directory and those files some, because the system has to
look through more data before it can find what you are looking for, but
with AdvFS that isn't a huge penalty.

5. Serguei Patchkovskii <patchkov_at_ucalgary.ca>

There is a problem for directories with a large number of -directories-
in them. Each directory must contain an entry for its parent directory
(represented by the ".." filename). Additionally, each directory must
contain an entry for itself (represented by "."). Moreover, each
directory (apart from the root "/") must appear in a higher-level
directory. Altogether, a directory with N subdirectories in it must
be referenced N+2 times.

Now, filesystems used by Tru64 (at least as of 4.0d, which I have
access to), store the number of references to a file or directory
in a field declared as "short di_nlink". The largest non-negative
number that can be stored in this field is 32767. As a consequence,
no directory can have more than 32765 subdirectories in it. The
actual limit may be somewhat lower, as the link count is also used
for keeping track of open file descriptors, which refer to the file
or directory.
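
[The arithmetic above, written out as a small sketch. It assumes that
"short" is a signed 16-bit integer; the helper names are made up for
illustration and are not taken from any Tru64 source.]

```python
import struct

SHORT_MAX = 2**15 - 1  # largest value of a signed 16-bit integer


def fits_in_short(value):
    """True if `value` can be stored in a signed 16-bit field."""
    try:
        struct.pack("<h", value)  # "h" is the struct code for a 16-bit short
        return True
    except struct.error:
        return False


def max_subdirectories(max_links=SHORT_MAX):
    """A directory with N subdirectories needs N + 2 links,
    so N + 2 <= max_links gives N <= max_links - 2."""
    return max_links - 2
```

With these definitions SHORT_MAX is 32767 and max_subdirectories() is
32765, matching the figures quoted above; fits_in_short(32768) is false,
which is the point where the link count no longer fits.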

Original question:

> Ian Veach wrote that Chris Adams reminded us that there are
> 32K directory limitations that require workarounds ...
>
> We will soon have a directory with about 40,000 entries (for a product
> for which we don't have source), so this worries me.
>
> We are currently running DU 4.0F using ADVFS. I just ran a simple
> script (see below) to create 75,000 files in a single directory,
> and the only problem was speed. It takes roughly twice the system
> time to create the 50,000th file as the 25,000th file. I've heard
> Tru64 5.1 will solve this. Are there other problems with directories
> with large numbers of files?
Received on Tue Jul 10 2001 - 16:57:24 NZST