System tuning for directories with large numbers of files

From: Lamont Granquist <lamontg_at_raven.genome.washington.edu>
Date: Tue, 10 Aug 1999 13:06:59 -0700

We're developing software that takes large numbers of files containing
genetic sequence and compiles them into contiguous, good-quality genetic
sequence. We therefore fairly routinely find ourselves reading
directories containing 80,000 files or more, and we notice a substantial
performance hit in file access compared with smaller sets of data.

So my question is pretty simple: is there some way to improve filesystem
performance in these kinds of situations? Or would we be better off
solving the problem in userland by breaking up the datasets into 80
directories of 1,000 files each, hashing each filename to pick the
directory a given file lives in?
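
(For illustration, here is a minimal sketch of that userland hashing
approach -- the bucket count of 80, the /data/seqs base path, and the
filename used are just assumptions for the example; nothing here is
specific to AdvFS.)

    /*
     * Sketch: spread files across a fixed number of subdirectories
     * by hashing the filename.  Bucket count and paths are assumptions.
     */
    #include <stdio.h>

    #define NBUCKETS 80

    /* Simple string hash (djb2); any reasonable hash would do. */
    static unsigned long hash_name(const char *s)
    {
        unsigned long h = 5381;
        while (*s)
            h = h * 33 + (unsigned char)*s++;
        return h;
    }

    /* Build "basedir/NN/filename", where NN is derived from the name. */
    static void bucket_path(char *buf, size_t len,
                            const char *basedir, const char *filename)
    {
        unsigned long bucket = hash_name(filename) % NBUCKETS;
        snprintf(buf, len, "%s/%02lu/%s", basedir, bucket, filename);
    }

    int main(void)
    {
        char path[1024];
        bucket_path(path, sizeof(path), "/data/seqs", "AB012345.seq");
        printf("%s\n", path);   /* e.g. /data/seqs/37/AB012345.seq */
        return 0;
    }

As long as reads and writes both go through the same hashing function,
each directory stays around 1,000 entries instead of 80,000.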

We're using Digital UNIX 4.0D with AdvFS partitions.

-- 
Lamont Granquist                       lamontg_at_genome.washington.edu
Dept. of Molecular Biotechnology       (206)616-5735  fax: (206)685-7344
Box 352145 / University of Washington / Seattle, WA 98195
PGP pubkey: finger lamontg_at_raven.genome.washington.edu | pgp -fka
Received on Tue Aug 10 1999 - 20:12:17 NZST
