-- 
I have a firm grip on reality. Now I can strangle it!

-----Original Message-----
From: Loucks Guy [mailto:Guy.Loucks_at_det.nsw.edu.au]
Sent: Wednesday, April 12, 2000 1:30 PM
To: tru64-unix-managers_at_ornl.gov
Cc: 'Nikola Milutinovic'
Subject: TRU64 and optimizing Disk Performance

People,

Sorry for the delay in posting this. I received a fair bit of information; probably the most relevant was from Alan. I did, however, receive some contradictory information on ADVFS, some of which concurs with our experience of it with the SQUID proxy (a similar situation to our WEB DNS: lots of small files, and many UFS partitions handle this better than one ADVFS).

I think the final solution will be a set of solid state DASD. MFS may have been an option, but state persistence is desirable.

A few people mentioned the use of a database. However, I do not see this as being a benefit: long table scans over a data table with 2.2 million+ rows do not scale well, as much as Oracle and the other vendors may believe, and we would end up holding replicated data, snapshots, ... not manageable. Alan's note about a few larger files might be an option, and the use of a C-ISAM / VSAM / HSAM database may be appropriate.

The answers and the original questions are below, along with the URLs for WEB DNS, as some people were interested in this:

Web: http://www.cc.utah.edu/~keide/Software/UofU_DNS_Tools/
Author: http://www.utah.edu/~keide/Kirk_Eide.html

Original Post:

People,

I am looking for people's feedback on the following situation:

We have a data base structure consisting of some 3280 directories containing about 6560 files. The directory structure is about 65MB in size. The average file size is quite small.

The change rate is fairly slow, say a couple hundred transactions / hour (This is absolute TOP rate).

This is a WEB interface to our DNS management system. We need to be able to search and locate entries in this structure rapidly (let's say sub 10sec). I am interested in people's experience and options.

The ideas on the table so far:
* RAMDISK, it is small enough, keep a snapshot in RAM, access at memory speed (What RAM DISK OPTIONS USED IN TRU64?)
* Hardware, this is always an option, I would like to keep this as a last resort, use existing resources to their maximum. We could always throw an EMC with a few gigs and pre-emptive frame access...
* Alternate FS: currently using ADVFS, go to UFS. Muck around with UBC....
* other options???

If people can forward their thoughts I will summarise.

Cheers,

Guy

Responses:

Nikola Milutinovic [Nikola.Milutinovic_at_ev.co.yu]

UBC is supposed to handle caching vs. vmem issues quite well. Of course, you can create a RAM disk (see man for mfs and newfs). That is like locking a part of a filesystem in memory.

What Web software are you using for DNS management? I was planning to write my own, but if there is something free, I wouldn't mind using it until I get my own.

Nix.

alan_at_nabeth.cxo.dec.com

First, a general comment. The time to open and close a file will always dominate the access time to the data in the file when the file is sufficiently small. I don't know where the crossover is, but it is probably above the 10 KB average file size you have. If better performance is the goal you might want to look at something that uses fewer, larger files.

I think there are kernel parameters that can adjust the size of the hash tables used by the namei code (file name lookup). Sizing those tables correctly can reduce the amount of disk access needed to look up a file name when it is opened, which may improve performance. I think modern versions of sys_check can look at namei cache hit statistics and will make recommendations.

I don't think the directory is large enough to make search times a significant issue. A switch to UFS might have an adverse effect since the directories will probably be further apart on disk, increasing seek time when you have to go to the disk for data.

AdvFS in V5 also improves the internal search structure, which may help.

Finally, there are two ways to interpret "ramdisk": a memory based file system or a solid state SCSI device. You can use MFS, but it is volatile, making it unsuitable for persistent read-write data. A 65 MB MFS isn't particularly hard to create, if you have the memory to back it. Such memory is a candidate for paging and there isn't an easy way to lock it down.

SCSI based solid state disks generally provide exceptional seek performance. Data rates may not be much better than rotating disks, simply because the SCSI bus data rate limits transfer speed as often as not. Still, they generally support a backend disk to allow the data to be persistent, and they may offer better performance in some applications. You don't seem to have said what version of Tru64 UNIX you're using, so check the SPD for that version to see what solid state disks are supported (EZxx model names). I think the SPD is kept in the DOCUMENTATION directory of the base operating system installation CDROM.

Andrew Leahy [alf_at_cit.nepean.uws.edu.au]

On Mon, 10 Apr 2000, Loucks Guy wrote:

> The ideas on the table so far:
> * RAMDISK, it is small enough, keep a snapshot in RAM access at memory speed (What RAM DISK OPTIONS USED IN TRU64?)

man mfs - The mfs command builds a memory file system (mfs), which is a UFS file system in virtual memory, and mounts it on the specified mount-node. When the file system is unmounted, mfs exits and the contents of the file system are lost.

> * Hard Ware this is always an option, I would like to keep this as a last resort, use existing resources to their maximum. We could always throw an EMC with a few gig's and pre-emptive frame access...
> * Alternate FS: currently using ADVFS, go to UFS. muck around with UBC....

I'd certainly try a small UFS partition. I moved our Squid proxy servers from AdvFS to UFS because of poor AdvFS performance when handling hundreds of thousands of small files.

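A rough way to see the per-file overhead that Alan and Andrew describe above is to time reading the same number of bytes from many small files and from one large file. The following is only a sketch under stated assumptions, not something any of the respondents ran: it is written in Python rather than anything mentioned in the thread, the function names and the `zoneNNNNN.db` / `all.db` file names are made up for illustration, and because the files are freshly written they will be served from the buffer cache, so the numbers mostly reflect open/close and name lookup cost rather than disk seeks.

```python
import os
import tempfile
import time


def make_small_files(root, count, size):
    """Create `count` files of `size` bytes each under `root`; return their paths."""
    payload = b"x" * size
    paths = []
    for i in range(count):
        path = os.path.join(root, f"zone{i:05d}.db")  # hypothetical file name
        with open(path, "wb") as fh:
            fh.write(payload)
        paths.append(path)
    return paths


def read_small_files(paths):
    """Open, read and close every file in turn; return the total bytes read."""
    total = 0
    for path in paths:
        with open(path, "rb") as fh:
            total += len(fh.read())
    return total


def read_one_large_file(path, chunk):
    """Read a single file in `chunk`-sized pieces; return the total bytes read."""
    total = 0
    with open(path, "rb") as fh:
        while True:
            data = fh.read(chunk)
            if not data:
                break
            total += len(data)
    return total


def compare(count=500, size=10 * 1024):
    """Return (small_bytes, large_bytes, small_seconds, large_seconds)."""
    with tempfile.TemporaryDirectory() as root:
        paths = make_small_files(root, count, size)
        big = os.path.join(root, "all.db")  # hypothetical file name
        with open(big, "wb") as fh:
            fh.write(b"x" * (count * size))

        t0 = time.perf_counter()
        small_bytes = read_small_files(paths)
        t1 = time.perf_counter()
        large_bytes = read_one_large_file(big, size)
        t2 = time.perf_counter()
    return small_bytes, large_bytes, t1 - t0, t2 - t1
```

Calling `compare(count=6560, size=10 * 1024)` approximates the 6560 files and roughly 10 KB average size from the original post. To compare AdvFS with UFS, the temporary directory would have to be placed on the file system under test, and the cache flushed or the machine rebooted between the write and read phases.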
Jim Belonis [belonis_at_dirac.phys.washington.edu]

I still say a database is the way to go. High numbers of records just make it all the more important to do it right. And rapidly increasing size means you can't afford to screw around with solutions that don't scale.

I would (if I were a database guru) set up a database that is easy and fast to search (a few million records should give reasonable speed if hashed properly). [ I'm not sure a simple perl-hash scales that large with speed, since I've never had to use one with more than a few thousand records. ] And use that database to generate the text files to be used by the DNS service periodically, or just keep the database and your DNS service files in sync by modifying them together.

Depending on the complexity of the records and whether you want to search on sub-parts of the records (like individual words in a TXT record or HINFO record) you might want to do a fully indexed full-text search database. I've never used one, but I understand they can be incredibly fast. My only practical knowledge about this is that the original altavista search engine was essentially this and searched billions of records in a few seconds (but the whole index was in RAM).

Come to think of it, it might be neat to consider using a web search engine even though your 'pages' are not out on the web. If your files are ordinary text files, I believe they can be treated as web pages even though they are not written in HTML.

Alternatively, you can throw hardware at the problem as you suggested in your original message and get a RAMdisk, which should be findable. I remember one came with VMS that they used for standalone backup or booting off CDROM or something. And I used it for some other purpose. But I don't remember one for Digital Unix.

Good luck.
Jim Belonis

> Thanks Jim,
>
> Even using a real database would be a problem.
> There are currently a little over 2.2 million records (we are searching the content), so doing a text search or even a hash search could be problematic.
>
> I am now considering setting up the likes of a data store to search for the required info. Essentially we want to ensure that when an A or PTR record is removed (or, more to the point, when someone tries to remove it) it does not leave any CNAME or MX entries behind. Our managed zones are going to double in the next 6 months as we bring on-line the 2850 schools we connected to our network last year.
>
> The objective is to push out / delegate the DNS management, without maintaining 3 500 DNS servers across the state. (NB: NSW is about the size of or a little larger than Texas.)
>
> Any suggestions on databases would be entertained; the store management is held within a few routines, and using PERL DBI is always an option...
>
> Cheers,
>
> Guy
>
> Guy R. Loucks
> Senior Unix Systems Administrator
> Networks Branch
> NSW Department of Education & Training
> Information Technology Bureau
> Direct +61 2 9942 9887
> Fax +61 2 9942 9600
> Mobile +61 (0)429 041 186
> Email guy.loucks_at_det.nsw.edu.au
>
> -----Original Message-----
> From: Jim Belonis [mailto:belonis_at_dirac.phys.washington.edu]
> Sent: Monday, April 10, 2000 8:37 PM
> To: Loucks Guy
> Subject: Re: TRU64 and optimizing Disk Performance
>
> If you are searching by filename, even 'find' should be able to search 65MB in 6500 files in under 10 seconds.
>
> If you are searching the actual content of the files, I have nothing much to say, except "why not use a real database ?" You need not answer, I assume you have your reasons.
>
> --
> J.James(Jim)Belonis II, U of Washington Physics Computer Cost Center Manager
> belonis_at_phys.washington.edu Internet University of Washington Physics Dept.
> http://www.phys.washington.edu/~belonis r. B234 Physics Astronomy Building
> 1pm to midnite 7 days (206) 685-8695 Box 351560 Seattle, WA 98195-1560

Guy R. Loucks

Received on Thu Apr 20 2000 - 03:23:12 NZST
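
The check Guy describes in his reply to Jim (refuse to remove a host if CNAME or MX entries still refer to it) does not need a scan over all 2.2 million records if the records are indexed by the name they point at. The following is a minimal sketch of that idea, not the WEB DNS implementation and not what Guy settled on. It assumes Python with its bundled sqlite3 module as a stand-in for whatever store sits behind PERL DBI; the table name `records`, its columns, and the function names are all hypothetical; and it covers only the host-name case, not PTR records.

```python
import sqlite3


def open_index():
    """Create an in-memory store with an index on the name each record points at."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records ("
        " owner TEXT NOT NULL,"   # name the record belongs to
        " rtype TEXT NOT NULL,"   # A, CNAME, MX, ...
        " target TEXT NOT NULL)"  # address, or the host name pointed at
    )
    # The index lets the dependency check use a B-tree lookup instead of a table scan.
    conn.execute("CREATE INDEX records_target ON records (target)")
    return conn


def add_record(conn, owner, rtype, target):
    """Store one record, normalising case so lookups are case-insensitive."""
    conn.execute(
        "INSERT INTO records (owner, rtype, target) VALUES (?, ?, ?)",
        (owner.lower(), rtype.upper(), target.lower()),
    )


def dependents(conn, host):
    """Return (owner, rtype) for every CNAME or MX record that points at `host`."""
    cur = conn.execute(
        "SELECT owner, rtype FROM records"
        " WHERE target = ? AND rtype IN ('CNAME', 'MX')"
        " ORDER BY owner, rtype",
        (host.lower(),),
    )
    return cur.fetchall()


def can_remove_host(conn, host):
    """True when no CNAME or MX record would be left dangling."""
    return not dependents(conn, host)
```

A web front end could call `can_remove_host()` before deleting an A record and show the result of `dependents()` to the user when the answer is no. The zone files themselves could stay as they are, with the index rebuilt or updated alongside them, which is close to Jim's suggestion of keeping a database and the DNS files in sync.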