---- Kevin Reardon <kreardon_at_na.astro.it>

Check in the archives for two summaries on this subject /snip/:

<included by Richard Bemrose>
http://www.ornl.gov/its/archives/mailing-lists/alpha-osf-managers/1995/07/msg00087.html
http://www.ornl.gov/its/archives/mailing-lists/alpha-osf-managers/1996/04/msg00377.html

Most people claim going down to 1-3% doesn't cause any problems. You will want to be sure to run tunefs -o time on your file system after lowering the free space. Maybe the default switch from 'time' to 'space' allocation techniques when free space is set to less than 10% is the real source of the speed hit that tunefs is talking about?

I include below an explanation by Dr. Wayward Inode himself from one of the above summaries that describes the only objective way I've seen to determine an optimal free space value for a disk.

From: alan_at_nabeth.cxo.dec.com

Background.

The Berkeley Fast File System is divided up into a set of groups of cylinders, typically 32 cylinders per group. Each cylinder group has its own inode table, cylinder group summary, backup superblock and data space. When a new directory is created, the cylinder group with the most free space is selected. When files are created in that directory, the allocation algorithm prefers to use the cylinder group where the directory was allocated. For time-sharing workloads this allows generally related files to be close together.

When blocks are allocated to a file, the allocation code prefers to use the same cylinder group as the file, then nearby cylinder groups, then a quadratic hash search, and finally a linear search. To help keep sufficient free space in cylinder groups for the allocations, large files are split up over multiple cylinder groups.

To help the file system have free space, some amount is reserved (the minfree of 10%). As cylinder groups fill up and the file system fills up, the slower search algorithms are used, reducing performance.
More importantly, for read performance, the blocks of a poorly allocated file will be scattered all over the disk.

The 10% minfree default.

The 10% value was selected over 10 years ago when the largest available disks were around 512 MB. Given the geometries of disks at the time and typical cylinder group arrangement, 1/2 MB to 1 MB was reserved per cylinder group (averaging across the disk). Unfortunately, the 10% value has been virtually enshrined as a fundamental law of the universe, without much work to ensure that it is the right value for modern disks.

---- Nick Batchelor <Nick.Batchelor_at_unilever.com>

As far as I understand it, the 10% limit only applies to traditional UFS file systems. I think it has to do with the algorithms UFS uses to try to position related blocks adjacent to each other. With less than 10% free space, the system will spend a disproportionate amount of time just trying to work out where to put new blocks into the file system. I don't think the same problem occurs with more advanced file systems like JFS and AdvFS, which use extent-based algorithms for allocating space in the file system.

---- Alan Rollow <alan_at_nabeth.cxo.dec.com>

The 10% default and the comment about its effect on performance came from a time when large disks on UNIX systems were 256 MB and bigger than many mid-range AlphaServers. Few vendors have bothered to update that text as inherited from Berkeley, or even to test what values should be used for larger disks. In Digital's case it doesn't help that UFS is the poor 2nd cousin to AdvFS these days.

Space on UFS is organized around the cylinder group. Each group of cylinders has a backup of the superblock, its own summary block and its own inode table. UFS allocates new data to a file by trying to keep it close to the existing data of the file. As files get large, the space is spread out over the disk so that one file doesn't use all the space in a cylinder group. As the disk becomes full, so do the cylinder groups.
If a group is full, but the file system would have preferred to allocate data in it, it has to find a nearby group for the space. If it can't find a nearby group, then it will eventually take space in the first group it finds, spreading the file out more than is desired. By keeping some percentage of the space free, normal allocation of space will spread this evenly among the cylinder groups. As the percentage of reserved space is reduced, more groups will fill up and you may get poor allocations for medium-sized and large files.

On those older disks, when the 10% number was made the default, that 10% represented an average of between 256 KB and 1 MB per cylinder group, with the typical group being 16 cylinders at the claimed geometry. As the capacity of disks has increased, so have the size and number of cylinders. Keeping those same capacities per group (on average), today's large disks can get by with reserved space of 2-5%. A really large disk could probably get by with 1%.

I've never tried to measure the effect of using these smaller percentages of reserved space, and I'm not sure that providing an average of .n MB per group is the right goal. I think someone did study this once. It may have been a paper presented in a USENIX Proceedings, or something from a university. I haven't read the paper, but I've read of it, and I recall that it recommended smaller percentages for reserved space on large disks.

Some groups are bound to fill up sooner than expected, especially if large files dominate the file system, but that can be controlled by tuning maxbpg so that it is smaller than the size of the cylinder group (or by making the groups larger).

---- Ryan Niemes <NIEMES_at_opus.oca.udayton.edu>

This is a very good question that I would like answered as well. I did read the man pages on one of our Solaris boxes, and it says the same thing (guess I never noticed it before). I will try to look into it there, but if you get an answer please let me know.
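
Alan Rollow's point above, that 10% once meant roughly 256 KB to 1 MB held back per cylinder group and that larger groups can keep the same average reserve with a smaller percentage, can be put into numbers. What follows is a minimal sketch in Python and is not from the thread: the function names and the group sizes (5, 16, 32 and 64 MB) are hypothetical, picked only so that the first row matches 10% = 512 KB. As Alan notes, it is not certain that a fixed average reserve per group is the right goal, so treat the percentages as illustration rather than a recommendation.

```python
def reserved_kb_per_group(group_mb: float, minfree_pct: float) -> float:
    """Average space (in KB) held back in one cylinder group at a given minfree."""
    return group_mb * 1024 * minfree_pct / 100


def minfree_for_reserve(group_mb: float, target_kb: float) -> float:
    """minfree percentage that keeps target_kb free per group, on average."""
    return 100 * target_kb / (group_mb * 1024)


if __name__ == "__main__":
    # Hypothetical cylinder group sizes in MB (assumptions, not measured values).
    # A 5 MB group at 10% reserves 512 KB, inside the 256 KB - 1 MB range quoted above.
    for group_mb in (5, 16, 32, 64):
        pct = minfree_for_reserve(group_mb, target_kb=512)
        print(f"{group_mb:>3} MB group: minfree of {pct:.1f}% keeps 512 KB free on average")
```

With these assumed sizes, the percentage needed to hold back the same 512 KB falls as the group grows, which is the shape of the argument for 2-5% (or less) on larger disks. Real cylinder group sizes depend on the geometry given when the file system was created, so substitute the values for the disk in question.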
---- LBRO <lbro_at_dscc.dk>

This smells of RAID and UFS.

UFS divides the disk into 'cylinder groups', which are sets of adjacent cylinders. The performance of UFS depends on its ability to find the next free block in the same cylinder group as the one where the disk head is already located. The likelihood of that decreases as free disk space drops. At nearly zero free space on a disk with a well distributed UFS filesystem, the available blocks will be scattered evenly over the entire disk.

The funny thing about UFS is that when space usage drops again, the files that were scattered around can be 'defragmented' just by copying them, because UFS then allocates nice optimal blocks for the new copy. That is why there is no need for a defragment utility on UFS.

BUT: On a RAID volume, how can you tell the 'disk geometry'? If you want to have any benefit of the 'cylinder group' stuff, you must be able to tell exactly how the RAID controller works (you must express the function of three or more disks in terms of cylinders and sectors of one disk).

So maybe it is time for you to turn over to AdvFS, which does not optimize based on disk geometry. (AdvFS has a defragment program instead.)

/snip,snip due to LBRO's request/

So, my opinion is: Disk systems are so complex today (striping, parity...) that we will never know if the UFS optimizer really helps us. I would say it doesn't, though we may be sure that it harms performance when filling is high. So, use AdvFS and go fill the disk.

---- Anthony Talltree <aad_at_nwnet.net>

More like 1.9G, since the 23G drives have about 19G of usable space.

Some OS's, e.g. BSDI's, default minfree to 5%. The 10% figure was informally picked in an age when filesystems and their usage were much different. I suggest using tunefs to set minfree to 5%, then forcing optimization back to time.

Much of the minfree issue depends on the use of the filesystem.
If it's being used for big preallocated files (Oracle, Cyclone, Diablo), then minfree can happily be set to 0.

----

Regards,
Rich

Richard A Bemrose
Polymers and Colloids Group, Cavendish Laboratory
University of Cambridge, Madingley Road, Cambridge, CB3 0HE, UK
email: rb237_at_phy.cam.ac.uk
Tel: +44 (0)1223 337 267
Fax: +44 (0)1223 337 000
http://www.poco.phy.cam.ac.uk/~rb237

"Life is everything and nothing all at once"
  -- Billy Corgan, Smashing Pumpkins

Received on Tue Apr 28 1998 - 14:19:35 NZST