SUMMARY: another defragcron question

From: Richard Bemrose <rb237_at_phy.cam.ac.uk>
Date: Mon, 10 Aug 1998 14:47:25 +0100 (BST)

Hello gurus,

I first must thank the following people for their quick and informative
replies: "C.Ruhnke" <i769646_at_smrs013a.mdc.com>, Bruce Kelly
<kellybe_at_llnl.gov>, Tom Webster <webster_at_ssdpdc.lgb.cal.boeing.com>, Jim
Harm <harm1_at_llnl.gov>, Tony Burke <tburke_at_davidjones.com.au> and Richard L
Jackson Jr <rjackson_at_portal.gmu.edu>.

In my original post, I asked how I could reduce the likelihood of
defrag causing a system panic. The answer is not that simple, since the
problem is not isolated to defrag. Richard L Jackson Jr forwarded
Digital's explanation:
     Under some circumstances, AdvFS can panic with a message "log half
     full". At this time, the following situations are known to cause it:
     1) When a very large file truncate is performed (this can occur when
        a file is overwritten by another file or by an explicit truncate
        system call), and the fileset containing the file has a clone
        fileset.
     2) When very large, highly fragmented files are migrated (this
        occurs when running the defragment, balance, rmvol, and migrate
        AdvFS utilities).

I've appended all replies for additional information and advice. To
reduce the likelihood of a system panic, I conclude the following
points:
    1) if at all possible, do not run defrag on a group/production server
    2) obtain the latest patches from Digital and/or wait for Digital
       UNIX V5
    3) increase the log size (suggested size is 65536 [max.])
    4) how often to run defrag is system dependent; the general consensus
       was to run defrag nightly on domains of less than 20GB to keep
       fragmentation low enough to avoid the defragment crash (a wrapper
       sketch follows below). Larger domains may pose problems; some
       thought should be given to how to handle them.
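
As a rough illustration of point 4 above (and of item 2 in my original
question), a minimal wrapper sketch is below. It is not a tested tool:
the script name, the 20 GB threshold, the mail recipient and the
defragment path are all illustrative, and the flags should be checked
against defragment(8) on your own system.

    #!/bin/sh
    # defrag_wrapper.sh -- sketch only: run defragment on a domain only
    # if the fileset mounted from it is below a size threshold,
    # otherwise mail root instead of defragmenting.
    DOMAIN=$1              # AdvFS domain name, e.g. data_domain
    MOUNT=$2               # mount point of a fileset in that domain
    MAX_KB=20971520        # 20 GB expressed in 1K blocks

    # total size of the mounted fileset in 1K blocks (2nd field of df -k)
    SIZE_KB=`df -k $MOUNT | tail -1 | awk '{print $2}'`

    if [ "$SIZE_KB" -le "$MAX_KB" ]; then
        /usr/sbin/defragment -v $DOMAIN    # path and -v flag assumed
    else
        echo "$DOMAIN is ${SIZE_KB}KB, over threshold; defragment skipped" \
            | mailx -s "defrag_wrapper: skipped $DOMAIN" root
    fi

Run nightly from root's crontab, for example:

    0 2 * * * /usr/local/sbin/defrag_wrapper.sh data_domain /data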

These comments are mine alone and do not represent the University of
Cambridge and/or Digital. Please contact your local Digital support for
more details.

On Thu, 6 Aug 1998, Richard Bemrose wrote:
> Following on from Judith Reed's <jreed_at_wukon.appliedtheory.com> summary on
> Tue, 04 Aug 1998 regarding "SUMMARY: defragcron causes corruption in ADVFS
> filesys?" I was pleased to read that defrag does not leave a file system
> in an inconsistent state. However I was alarmed to hear, under certain
> conditions, defrag could cause a system panic (not desirable on our main
> group server!).
>
> So I was wondering how I could reduce the likelihood of a system panic
> (assuming I leave defrag enabled). What are fellow system admins' views on
> the following:
> 1) perform defrag monthly (or weekly)
> 2) write a wrapper which only performs defrag if the domain is less than
> a predefined capacity (any recommendations?) else mail root
> 3) within "/usr/sbin/defragcron", reduce the "def_defrag_threshold"
> value (again, any recommendations?)
> 4) a mixture of 1), 2) and/or 3)
>
> Any suggestions are welcome. Summary to follow.

------------------------------------------------------------
>From i769646_at_smrs013a.mdc.com Mon Aug 10 13:55:19 1998
Date: Thu, 6 Aug 1998 07:44:47 -0500 (CDT)
From: "C.Ruhnke" <i769646_at_smrs013a.mdc.com>
To: Richard Bemrose <rb237_at_phy.cam.ac.uk>
Subject: Re: another defragcron question

The most likely cause for defrag to crash a system is having a file with
too many extents (hyper-fragmentation, as I call it). How many are too
many? Depends... The default log file (meta-data) size for an AdvFS
domain is 512 blocks. According to Digital support this is enough to
handle files with up to 40,000 extents. How do you know how many extents
a file has? Use the "showfile -x <filename>" command. In the
above-mentioned discussion of defragcron, Judith summarized my reply on
how I use this command; a good script programmer can easily better my
procedure...
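
That procedure is not reproduced in this thread, so here is a rough
sketch of the idea. The "extentCnt" field it looks for, the threshold
and the paths are assumptions on my part -- check the showfile -x
output format on your own system before trusting it.

    #!/bin/sh
    # Sketch: walk a mounted AdvFS fileset and report files whose
    # extent count exceeds a threshold, using showfile -x as Chris
    # describes. The output parsing is an assumption; verify against
    # showfile(8) on your release.
    FSET=${1:-/data}       # mounted fileset to scan
    LIMIT=${2:-10000}      # report files with more extents than this

    find $FSET -xdev -type f -print | while read f
    do
        # add up the extentCnt values showfile -x prints for the file
        extents=`showfile -x "$f" 2>/dev/null | \
                 awk '/extentCnt/ { n += $NF } END { print n+0 }'`
        if [ "$extents" -gt "$LIMIT" ]; then
            echo "$extents extents: $f"
        fi
    done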

How can you reduce the likelihood of a crash? Good question! Note that
defrag is not the culprit here. Hyper-fragmentation is the root evil with
AdvFS the main accomplice. ANY AdvFS activity against a file with too
many extents can potentially cause DUNIX to panic. Thus the "best"
answer is to reduce the likelihood of hyper-fragmentation.

     1) Try to keep plenty of free space on a domain. A domain with 50%
        free space is less likely to hyper-fragment than one with 10%
        free. Yes, I know 50% free space is wasteful, but you can shoot
        for 25%. Disks are cheap these days; how much does the downtime
        to recover filesystems corrupted during a crash cost?

     2) Regularly defragment the domains. What is regular? Depends on
        the usage of the domain. If the files on the domain are fairly
        static and grow slowly, you could get away with defrag every year.
        If the domain is relatively active -- say, lots of users doing
        lots of edits and compiles and creates and deletes -- you might
        need to defrag weekly or even daily (extreme case!).

     3) The def_defrag_threshold value is not a perfect evaluation of
        fragmentation. It only takes ONE badly fragmented file to
        cause trouble. In my case, "defrag -n -v <domain>" showed an
        average fragmentation of several hundred for a domain that had
        one file with over 188,000 fragments -- there were a lot of files
        with only 1 extent! This resulted from the creation of one file
        that was several GB in size. The BEST test is to use the "find"
        and "showfile" combination and then check the number of extents
        in the MOST fragmented file on the filesystem.

     4) Increase the log file size for larger domains. As disks become
        bigger and cheaper, domain sizes are growing to mondo proportions.
        I have 50GB domains now and some users are talking of needing
        several hundred GB in the near future. 512-block log files
        are NOT appropriate for these domains. What's the "right"
        log file size? I dunno... I think one of the key determinants
        is file size on the domain. If the files are "small", in
        the several KB or MB range, then 1024 blocks should be fine!
        If you get the user who suddenly decides to see how big he can
        grow a test file, 10240 blocks may be needed. I personally
        think the guy who was complaining that 65535 blocks wasn't
        enough had some other problem besides AdvFS.
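
For reference, the current log size of a domain shows up in the LogPgs
column of showfdmn output, and moving or resizing the log is done with
the switchlog utility (presumably why the Digital workaround Richard
Jackson mentions below needs a spare volume). The switchlog syntax here
is an assumption on my part -- check switchlog(8) before relying on it.

    # show the domain's attributes, including its log size (LogPgs)
    showfdmn data_domain

    # move the log to volume 2 of the domain and grow it to 8192 pages
    # (flags and argument order assumed; the domain name and volume
    # index are only examples)
    switchlog -l 8192 data_domain 2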

CAVEAT EMPTOR!!! I am NOT a UNIX wizard, nor an AdvFS expert. These
musings are based on my experiences of the last several months with
hyper-fragmentation and trying to learn how to analyze it. Also, these
comments are mine alone and do not represent any commitments by Digital
or my employer, IBM.

--CHRis

-
=============================================================================
Chris H. Ruhnke                      Phone: (314)233-7314
IBM Global Services M/S S306-6340    FAX:   (314)234-2262
325 J.S. McDonnell Blvd              Email: i769646_at_smrs013.mdc.com
Hazelwood, MO 63042

------------------------------------------------------------
>From kellybe_at_llnl.gov Mon Aug 10 13:55:19 1998
Date: Thu, 6 Aug 1998 07:40:46 -0800
From: Bruce Kelly <kellybe_at_llnl.gov>
To: Richard Bemrose <rb237_at_phy.cam.ac.uk>
Subject: Re: another defragcron question

The problem is not with when or how long to perform the defragmentation.
It is with the number of extents or the size of the files that cause
panics. If you have a file with a large number of extents (~10000) or a
very large file (10GB), it is possible to get "log half full" problems.
The only thing that helps is to increase the log file size for the file
system. How to do this was explained in the previous responses. However,
it is still possible to get the problem, since for a 100GB file system
you would need over 7 million pages for the AdvFS log file, and the max
is only 64K. On the other hand, it is working fine for our 120GB file
systems with 65500 pages in the log file at our site.

Bruce Kelly

Bruce Kelly PO Box 808, Livermore, CA 94451
510-423-0640 L-73, Computer Systems Group
fax 510-422-9429 Lawrence Livermore National Laboratory
kellybe_at_llnl.gov University of California


------------------------------------------------------------
>From webster_at_ssdpdc.lgb.cal.boeing.com Mon Aug 10 13:55:19 1998
Date: Thu, 6 Aug 1998 09:48:17 -0700
From: Tom Webster <webster_at_ssdpdc.lgb.cal.boeing.com>
To: rb237_at_phy.cam.ac.uk
Subject: Re: another defragcron question


The #1 solution to this appears to be: don't run 4.0d on production
servers. I know that some of the Tru-Cluster users are stuck and there
is no good way to back a system down to an older version of DU, but I
thought I should bring it up. We defrag our DU4.0b production servers
over the weekends using a homegrown script and haven't had any trouble
under various versions of DU (3.2g, 4.0a, 4.0b). DU4.0d appears to have
some unresolved AdvFS problems.
[Sorry for the rant, I'm getting some mild pressure to go to 4.0d
because it is Y2K compliant -- regardless of the apparent AdvFS
problems.]
 
> So I was wondering how I could reduce the likelihood of a system panic
> (assuming I leave defrag enabled). What are fellow system admins' views
> on the following:
> 1) perform defrag monthly (or weekly)
> 2) write a wrapper which only performs defrag if the domain is less
> than a predefined capacity (any recommendations?) else mail root
> 3) within "/usr/sbin/defragcron", reduce the "def_defrag_threshold"
> value (again, any recommendations?)
> 4) a mixture of 1), 2) and/or 3)

If I had to guess, I'd say:

1. Install the latest patch kit and get the additional AdvFS patches
   from DEC; they were discussed in previous messages in this thread.

2. Verify the domains before you start trying to automate defragging,
   to make sure you don't have any pre-existing AdvFS problems (a
   sketch follows this list).

3. Defragment early and often to keep the work light.

4. The threshold on filesystem capacity is a good idea.
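
As a rough sketch of point 2, the verify check might look like the
following. The /sbin/advfs path, the bare invocation and the domain
name are assumptions; check verify(8) on your release for the options
and the mount/unmount requirements (Jim Harm's note below ran it with
the filesystem unmounted).

    # check a domain for pre-existing AdvFS damage before putting
    # defragment into cron (bare invocation only; path and behaviour
    # should be confirmed against verify(8) on your system)
    /sbin/advfs/verify data_domain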

Just my $0.02 worth,

Tom
--
+-----------------------------------+---------------------------------+
| Tom Webster                       |  "Funny, I've never seen it     |
| SysAdmin MDA-SSD ISS-IS-HB-S&O    |   do THAT before...."           |
| webster_at_ssdpdc.lgb.cal.boeing.com |   - Any user support person     |
+-----------------------------------+---------------------------------+
|      Unless clearly stated otherwise, all opinions are my own.      |  
+---------------------------------------------------------------------+
------------------------------------------------------------
>From harm1_at_llnl.gov Mon Aug 10 13:55:19 1998
Date: Thu, 6 Aug 1998 13:33:24 -0800
From: Jim Harm <harm1_at_llnl.gov>
To: Richard Bemrose <rb237_at_phy.cam.ac.uk>
Subject: Re: another defragcron question

Please recall that he wrote "should not leave the filesystem in an
inconsistent state". The fact is that it frequently was left in an
inconsistent state (in addition to the system crash).
Some of the inconsistencies in the filesystem we were instructed to
ignore, and some we were told would be fixed by running the verify
program on the filesystem while it was unmounted; it did fix some of
them, and when there were still uglies in the filesystem we were almost
always able to recover the data with the salvage program. Inconvenient,
but effective.
DEC has made several patches to chip away at the problems with AdvFS
and defragmentation, but Digital UNIX 5.0 promises to REALLY fix them.
We, too, have requested a tool or logic to resolve when we were "at risk"
of having problems with AdvFS. It seems that the worst case is when a
fragmented file that is about one third the size of the filesystem is
moved to an available space in the filesystem of the same size. This is
usually only on large filesystems (large = over 2 GB?). The log space
becomes insufficient and confusion reigns.
You will be able to reduce the frequency of occurrence by increasing the
log size from 512 to something larger (the problem might go away
completely).
It is not just defragment that can cause the problem ("this occurs when
running the defragment, balance, rmvol, and migrate AdvFS utilities");
it can also occur from simply piping /dev/null into an existing big
file.
We run defragment on all our small (less than 2 GB) filesystems nightly.
We also run defragment nightly on our larger (10 to 20 GB) filesystems,
where we have increased the AdvFS log size to 64K.
If you defragment, defragment nightly to keep fragmentation low enough
to avoid the defragment crash.
We have found it just too dangerous to run defragment on our 120 GB
filesystems, because in our environment a few very large fragmented
files can land in those filesystems (with a large free space -- the
worst case) in a short time, and we are being very cautious about data
loss.
}}}===============>>  LLNL
James E. Harm (Jim); jharm_at_llnl.gov
(925) 422-4018 Page: 423-7705x57152
------------------------------------------------------------
>From tburke_at_davidjones.com.au Mon Aug 10 13:55:19 1998
Date: Fri, 7 Aug 1998 10:40:41 +1000
From: Tony Burke <tburke_at_davidjones.com.au>
To: rb237_at_phy.cam.ac.uk
Subject: Re: another defragcron question

Rich,

I sent a note to Judith Reed after she posted the summary.
We have had serious problems with defrag also - but I believe we have
got to the bottom of it, and it is not defrag as such...
I have attached below a copy of some information we received from
Oracle.

Regards,
Tony Burke.
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Oracle Worldwide Customer Support   Internet:      response_at_au.oracle.com
Asia Pacific Global Support Centre  Facsimile:     +61 3 9696 3081
Level 3, 324 St. Kilda Road         Call Response: +61 3 9246 0400
Melbourne  Victoria  Australia
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Attachment:
---------
This document describes the necessary changes to the AdvFS parameters
for optimal operation of Oracle. If the parameters are not set in this
way you will see poor system performance during tablespace creations;
this can even be seen as a system hang if the datafile exceeds 1.5 GB.
Using the following parameters you get a smooth I/O pattern.
Here is the recommendation:
AdvfsCacheMaxPercent = 1
AdvfsMaxDevQueueLength = 16 or 32
AdvfsFavorBlockingQueue = 0
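
For reference, these would normally go into /etc/sysconfigtab as an
advfs stanza, roughly as below. The stanza name is an assumption on my
part, and the text above and below uses both AdvfsMaxDevQueueLength and
AdvfsMaxDevQLen for the queue-length attribute, so confirm the exact
attribute names with "sysconfig -q advfs" before editing anything.

    advfs:
            AdvfsCacheMaxPercent = 1
            AdvfsMaxDevQLen = 16
            AdvfsFavorBlockingQueue = 0

Whether a given attribute can be changed at run time or needs a reboot
varies, so check the sysconfig documentation for your release.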

The AdvFS parameters are necessary for smooth I/O operation with Oracle.
Here is some insight:

AdvfsCacheMaxPercent defines the size of the AdvFS buffer cache as a
percentage of system memory. The default value of 7% (in your case 10%)
is used to buffer data. In the case of a dedicated DB server this
buffering is already done by Oracle; Oracle as the DB engine is
comparable to a huge buffer cache. To buffer this data again in AdvFS
doesn't make sense, so we recommend the minimum size (1%).
When you create a tablespace (adding a datafile), you start to fill the
AdvFS buffer. There is an algorithm in place to start flushing the AdvFS
buffer as soon as it reaches a certain usage level (90% of the AdvFS
cache). The flushing actually creates very high I/O and CPU activity.
This algorithm will be improved in the next major OS version, but if the
cache is kept small, the flushing activity will never have the same
hanging effect.
The recommendation for AdvfsMaxDevQueueLength is based on the I/O
subsystem. We have seen the best performance results for Oracle DB
servers with values of 16 or 32.

Here is an explanation:
The AdvfsMaxDevQLen sysconfig parameter was added in V4.0 to handle the
tradeoff between quick synchronous IO response times and maximizing IO
throughput to AdvFS volumes. A default value of 80 was chosen as a
compromise between these needs. Essentially, response time was favored
over IO throughput in choosing that default for general system workload
environments.
The range is 0, 1 to 65536. From my testing, a reasonable range is
between 20 and 2000. Too low a value will hurt potential IO throughput.
Too high will cause excessively long user response times for synchronous
IO requests, measured in seconds to minutes.
Value 0 actually deactivates the AdvFS per-volume threshold, causing any
and all IO requests to be issued immediately to the disk whenever there
is a sync request. We don't recommend you choose this value.
Since AdvfsMaxDevQLen applies to all AdvFS volumes in the system, you
need to choose the value wisely if you plan to change it from the
default.
If your environment is such that hardly anyone or any application needs
to synchronously wait on IO, then you might consider higher values, up
to 2000. I have seen this help, for example, on systems that generally
write asynchronous data to files but rarely have users waiting for data.
If your system environment contains mixed user applications or is
sensitive to synchronous IO response times, then use values less than
300. In the case of a dedicated DB server this is mostly synchronous IO
(e.g. Oracle does all writes with the O_SYNC option).
A general and simplistic way to figure this is:
synchronous IO response time = AdvfsMaxDevQLen * average IO response time
assuming there are AdvfsMaxDevQLen IO requests already being processed
to the disk when the synchronous request is issued. The average IO
response time here means how long it takes one IO request to complete when
there is no other traffic to the disk.
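For example (my numbers, purely illustrative): with the default
AdvfsMaxDevQLen of 80 and an average IO response time of 10 ms, a
synchronous request arriving behind a full device queue could wait on
the order of 80 * 10 ms = 800 ms, while a setting of 16 would bound that
wait at roughly 160 ms.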
Check the output from LSM volstat to see the average read and write
response times.
Another parameter worth mentioning is
AdvfsFavorBlockingQueue = 0
This causes AdvFS to mix the ratio of synchronous IO data flushing with
asynchronous IO, instead of the default, which is to flush all
synchronous IO first and then asynchronous IO. In this case, synchronous
IO means read requests, explicit user synchronous writes, and fsync
writes of modified cached data.
This parameter is useful for mixed DB/Application servers.
I hope this helps to explain the "Hang" behaviour to the customer.
Please update the case as soon as possible.
------------------------------------------------------------
>From rjackson_at_portal.gmu.edu Mon Aug 10 13:55:19 1998
Date: Fri, 7 Aug 1998 07:38:34 -0400 (EDT)
From: Richard L Jackson Jr <rjackson_at_portal.gmu.edu>
Reply-To: Richard L Jackson Jr <rjackson_at_gmu.edu>
To: Richard Bemrose <rb237_at_phy.cam.ac.uk>
Subject: Re: another defragcron question

I have had the panic as well and discussed this with Digital. I was
informed it is, of course, a known problem. Digital UNIX 4.0D has
enhancements that reduce the chance of this condition, and Digital
Engineering is working on additional fixes. So, I disabled defragcron
and run defrag once a week. I also try to make sure none of my disks
are close to full, and I am in the process of upgrading to DU 4.0D from
DU 4.0B. Good luck.

Digital does have a workaround for this problem. Contact CSC for the
workaround. Note a spare volume will be needed in the workaround...
--------------------
PROBLEM:
       Under some circumstances, AdvFS can panic with a message "log half
       full". At this time, the following situations are known to cause
       it:
       1) When a very large file truncate is performed (this can occur
          when a file is overwritten by another file or by an explicit
          truncate system call), and the fileset containing the file has
          a clone fileset.
       2) When very large, highly fragmented files are migrated (this
          occurs when running the defragment, balance, rmvol, and migrate
          AdvFS utilities).
--------------------
Regards,
Richard Jackson
Computer Center Lead Engineer
Mgr, Central Systems & Dept. UNIX Consulting
University Computing & Information Systems (UCIS)
George Mason University, Fairfax, Virginia
------------------------------------------------------------
Regards,
Rich
 /_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ _ \_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\
/_/       Richard A Bemrose     /_\ Polymers and Colloids Group \_\
/_/ email: rb237_at_phy.cam.ac.uk  /_\    Cavendish  Laboratory    \_\  
/_/   Tel: +44 (0)1223 337 267  /_\   University of Cambridge   \_\   
/_/   Fax: +44 (0)1223 337 000  /_\       Madingley  Road       \_\   
/_/       (space for rent)      / \   Cambridge,  CB3 0HE, UK   \_\   
 /_/_/_/_/_/_/  http://www.poco.phy.cam.ac.uk/~rb237 \_\_\_\_\_\_\
             "Life is everything and nothing all at once"
              -- Billy Corgan, Smashing Pumpkins