--
+-----------------------------------+---------------------------------+
| Tom Webster                       | "Funny, I've never seen it      |
| SysAdmin MDA-SSD ISS-IS-HB-S&O    |  do THAT before...."            |
| webster_at_ssdpdc.lgb.cal.boeing.com |  - Any user support person  |
+-----------------------------------+---------------------------------+
| Unless clearly stated otherwise, all opinions are my own.           |
+---------------------------------------------------------------------+

------------------------------------------------------------

>From harm1_at_llnl.gov Mon Aug 10 13:55:19 1998
Date: Thu, 6 Aug 1998 13:33:24 -0800
From: Jim Harm <harm1_at_llnl.gov>
To: Richard Bemrose <rb237_at_phy.cam.ac.uk>
Subject: Re: another defragcron question

Please recall that he wrote "should not leave the filesystem in an
inconsistent state". The fact is that it frequently was left in an
inconsistent state (in addition to the system crash). Some of the
inconsistencies in the filesystem we were instructed to ignore, and some
we were told would be fixed by running the verify program on the
filesystem while it was unmounted; it did fix some of them, and when
there were still uglies in the filesystem we were almost always able to
recover the data with the salvage program. Inconvenient, but effective.
DEC has made several patches to chip away at the AdvFS defragmentation
problems, but Digital UNIX 5.0 promises to REALLY fix them.

We, too, have requested a tool or some logic to tell us when we are "at
risk" of having problems with AdvFS. It seems that the worst case is
when a fragmented file that is about one third the size of the
filesystem is moved to an available space of the same size in the
filesystem. This is usually only on large filesystems (large = over
2 GB?). The log space becomes insufficient and confusion reigns. You can
reduce the frequency of occurrence by increasing the log size from 512
to something larger (the problem might go away completely). It is not
just defragment that can cause the problem (it occurs when running the
defragment, balance, rmvol, and migrate AdvFS utilities); it can also
occur from simply piping /dev/null into an existing big file.

We run defragment nightly on all our small (less than 2 GB) filesystems.
We also run defragment nightly on our larger (10 to 20 GB) filesystems,
where we have increased the AdvFS log size to 64K. If you defragment,
defragment nightly to keep fragmentation low enough to avoid the
defragment crash. We have found it just too dangerous to run defragment
on our 120 GB filesystems, because in our environment a few very large
fragmented files can be put into those filesystems in a short time when
there is a large free space (the worst case), and we are being very
cautious about data loss.

}}}===============>> LLNL
James E. Harm (Jim); jharm_at_llnl.gov
(925) 422-4018    Page: 423-7705x57152
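As a sketch of what the nightly run described above typically looks
like, here is a root crontab entry driving the AdvFS defragment
utility. The domain name, the -t time limit and the log path are
illustrative assumptions rather than details from Jim's message (check
defragment(8) for the exact options on your release), and the AdvFS log
size itself is changed separately (e.g. with the switchlog utility),
which is not shown here.

  # Defragment the AdvFS domain "usr_domain" every night at 02:30,
  # limit the run to roughly an hour, and keep the output for review.
  30 2 * * * /usr/sbin/defragment -v -t 60 usr_domain >> /var/adm/defragment.log 2>&1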
------------------------------------------------------------

>From tburke_at_davidjones.com.au Mon Aug 10 13:55:19 1998
Date: Fri, 7 Aug 1998 10:40:41 +1000
From: Tony Burke <tburke_at_davidjones.com.au>
To: rb237_at_phy.cam.ac.uk
Subject: Re: another defragcron question

Rich,

I sent a note to Judith Reed after she posted the summary. We have had
serious problems with defrag also - but I believe we have got to the
bottom of it, and it is not defrag as such... I have attached below a
copy of some information we received from Oracle.

Regards,
Tony Burke

_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Oracle Worldwide Customer Support       Internet: response_at_au.oracle.com
Asia Pacific Global Support Centre      Facsimile: +61 3 9696 3081
Level 3, 324 St. Kilda Road             Call Response: +61 3 9246 0400
Melbourne Victoria Australia
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

Attachment:
-----------

This document describes the necessary changes to the AdvFS parameters
for optimal operation of Oracle. If the parameters are not set this way
you will see poor system performance during tablespace creation; this
can even appear as a system hang if the datafile exceeds 1.5 GB. With
the following parameters you get a smooth I/O pattern. Here is the
recommendation:

    AdvfsCacheMaxPercent    = 1
    AdvfsMaxDevQueueLength  = 16 or 32
    AdvfsFavorBlockingQueue = 0

These AdvFS parameters are necessary for smooth I/O operation with
Oracle. Here is some insight:

AdvfsCacheMaxPercent defines the size of the AdvFS buffer cache as a
percentage of system memory. The default value of 7% (in your case 10%)
is used to buffer data. On a dedicated DB server this buffering is
already done by Oracle: the Oracle DB engine is comparable to a huge
buffer cache, and buffering the same data again in AdvFS does not make
sense, so we recommend the minimum size (1%). When you create a
tablespace (adding a datafile) you start to fill the AdvFS buffer. An
algorithm is in place to start flushing the AdvFS buffer as soon as it
reaches a certain usage level (90% of the AdvFS cache), and the flushing
creates very high I/O and CPU activity. This algorithm will be improved
in the next major OS version, but if the cache is kept small the
flushing never has the same hanging effect.

The recommendation for AdvfsMaxDevQueueLength is based on the I/O
subsystem. We have seen the best performance results for Oracle DB
servers with values of 16 or 32. Here is an explanation: the
AdvfsMaxDevQLen sysconfig parameter was added in V4.0 to handle the
tradeoff between quick synchronous I/O response times and maximum I/O
throughput to AdvFS volumes. A default value of 80 was chosen as a
compromise between these needs; essentially, response time was favored
over I/O throughput in choosing that default for general system
workload environments. The range is 0, or 1 to 65536; from my testing a
reasonable range is between 20 and 2000. Too low a value will hurt
potential I/O throughput. Too high a value will cause excessively long
user response times for synchronous I/O requests, measured in seconds
to minutes. The value 0 actually deactivates the AdvFS per-volume
threshold, causing any and all I/O requests to be issued to the disk
immediately whenever there is a sync request; we do not recommend this
value. Since AdvfsMaxDevQLen applies to all AdvFS volumes in the
system, you need to choose the value wisely if you plan to change it
from the default. If your environment is such that hardly anyone or any
application needs to wait synchronously on I/O, then you might consider
higher values, up to 2000; I have seen this help, for example, on
systems that generally write asynchronous data to files but rarely have
users waiting for data. If your system environment contains mixed user
applications or is sensitive to synchronous I/O response times, then
use values less than 300. On a dedicated DB server the load is mostly
synchronous I/O (e.g. Oracle does all writes with the O_SYNC option).

A general and simplistic way to figure this is:

    synchronous I/O response time = AdvfsMaxDevQLen * average I/O response time

assuming there are already AdvfsMaxDevQLen I/O requests being processed
to the disk when the synchronous request is issued. The average I/O
response time here means how long one I/O request takes to complete
when there is no other traffic to the disk; check the output from LSM
volstat to see the average read and write response times.
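To make the recommendation above concrete, here is a minimal sketch of
how such settings are usually applied on Tru64. The sysconfigtab stanza
format is standard, but the exact sysconfigdb invocation, the choice of
16 rather than 32, and the 8 ms service time in the worked figures are
assumptions for illustration, not values from the Oracle note.

  # Check the current AdvFS subsystem settings:
  sysconfig -q advfs

  # Write the recommended values as a sysconfigtab stanza and merge it
  # into /etc/sysconfigtab so they persist across reboots (see
  # sysconfigdb(8) for the exact flags on your release).  Note the note
  # calls the queue-length attribute both AdvfsMaxDevQueueLength and
  # AdvfsMaxDevQLen; use the spelling your "sysconfig -q advfs" shows.
  cat > /tmp/advfs_oracle.stanza <<'EOF'
advfs:
        AdvfsCacheMaxPercent = 1
        AdvfsMaxDevQueueLength = 16
        AdvfsFavorBlockingQueue = 0
EOF
  sysconfigdb -m -f /tmp/advfs_oracle.stanza advfs

  # Worked example of the rule of thumb above, with an assumed average
  # per-request service time of 8 ms:
  #   default AdvfsMaxDevQLen = 80  ->  80 * 8 ms = 640 ms worst-case sync wait
  #   tuned   AdvfsMaxDevQLen = 16  ->  16 * 8 ms = 128 ms worst-case sync wait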
Another parameter worth mentioning is AdvfsFavorBlockingQueue = 0. This
causes AdvFS to mix synchronous I/O data flushing with asynchronous
I/O, instead of the default behaviour of first flushing all synchronous
I/O and then the asynchronous I/O. In this case, synchronous I/O means
read requests, explicit user synchronous writes, and fsync writes of
modified cached data. This parameter is useful for mixed DB/application
servers.

I hope this helps to explain the "hang" behaviour to the customer.
Please update the case as soon as possible.

------------------------------------------------------------

>From rjackson_at_portal.gmu.edu Mon Aug 10 13:55:19 1998
Date: Fri, 7 Aug 1998 07:38:34 -0400 (EDT)
From: Richard L Jackson Jr <rjackson_at_portal.gmu.edu>
Reply-To: Richard L Jackson Jr <rjackson_at_gmu.edu>
To: Richard Bemrose <rb237_at_phy.cam.ac.uk>
Subject: Re: another defragcron question

I have had the panic as well and discussed it with Digital. I was
informed it is, of course, a known problem. Digital UNIX 4.0D has
enhancements that reduce the chance of this condition, and Digital
Engineering is working on additional fixes. So, I disabled defragcron
and run defrag once a week. I also try to make sure none of my disks
are close to full, and I am in the process of upgrading from DU 4.0B to
DU 4.0D. Good luck.

Digital does have a workaround for this problem. Contact CSC for the
workaround. Note that a spare volume will be needed in the
workaround...

--------------------
PROBLEM: Under some circumstances, AdvFS can panic with the message
"log half full". At this time, the following situations are known to
cause it:

1) When a very large file truncate is performed (this can occur when a
   file is overwritten by another file or by an explicit truncate
   system call), and the fileset containing the file has a clone
   fileset.

2) When very large, highly fragmented files are migrated (this occurs
   when running the defragment, balance, rmvol, and migrate AdvFS
   utilities).
--------------------

Regards,
Richard Jackson

Computer Center Lead Engineer
Mgr, Central Systems & Dept. UNIX Consulting
University Computing & Information Systems (UCIS)
George Mason University, Fairfax, Virginia

------------------------------------------------------------

Regards,
Rich

/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/  _  \_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\
/_/ Richard A Bemrose          /_\  Polymers and Colloids Group  \_\
/_/ email: rb237_at_phy.cam.ac.uk /_\ Cavendish Laboratory     \_\
/_/ Tel: +44 (0)1223 337 267   /_\  University of Cambridge      \_\
/_/ Fax: +44 (0)1223 337 000   /_\  Madingley Road               \_\
/_/ (space for rent)           / \  Cambridge, CB3 0HE, UK       \_\
/_/_/_/_/_/_/  http://www.poco.phy.cam.ac.uk/~rb237  \_\_\_\_\_\_\
       "Life is everything and nothing all at once"
            -- Billy Corgan, Smashing Pumpkins