Thanks to:
Dennis MacDonell
Chang Song
Richard Jackson
Tom Blinn
Henry McCracke
Arrigo Triulzi
Larry Scott
Most people pointed to smoothsync as being the likely
culprit, but Larry Scott gave an in-depth analysis
and the cure. I was also fiddling with the mount option
smsync2, thinking I could turn smoothsync on/off on a
per-filesystem basis using this option. But careful
reading of 'man mount' shows that smsync2 only controls
whether dirty-pages or dirty-and-idle pages are flushed
by smoothsync. Thanks again Larry.
It's great to see this list being monitored by Compaq engineers.
Cheers,
Terry.
The cure:
Edit your /etc/inittab file if necessary and comment out
the line:
smsync:23:wait:/sbin/sysconfig -r vfs smoothsync-age=30 > /dev/null 2>&1
Then run:
/sbin/sysconfig -r vfs smoothsync-age=0
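to suppress smoothsync action system-wide.
No need to reboot; commenting out the inittab line just keeps smoothsync
from being re-enabled on the next transition to multi-user mode.
Those irritating click-click-clicks should now disappear.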
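To confirm the change took effect, the running value can be queried. This is a minimal sketch, assuming the query option (-q) of sysconfig on Tru64 UNIX; the guard is my addition and only makes the snippet harmless on other systems.

```shell
# Query the running value of smoothsync-age in the vfs subsystem.
# After the cure above it should report 0.
if [ -x /sbin/sysconfig ]; then
    /sbin/sysconfig -q vfs smoothsync-age
else
    # Assumption: we are not on Tru64 UNIX, so there is nothing to query.
    echo "sysconfig not found; this check applies to Tru64 UNIX only"
fi
```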
The cause:
> The disk activity seems to occur in clusters of between 5 and 10
> blips at 1 second intervals, each cluster at 15 second intervals.
This is smoothsync behavior. Smoothsync pushes data blocks according to
an aging algorithm applied to each individual page's timestamp. Metadata
blocks, such as stats, however, do not have a timestamp, so smoothsync
amortizes the metadata writes across the second half of the smoothsync
time period. For example, given smoothsync-age=30 and 90 dirty metadata
pages, 6 pages will get pushed each second starting 15 seconds into the
aging period.
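The arithmetic in that example can be written out as below. This is only an illustration of the numbers quoted above, not the kernel's actual code; the variable names are made up for the sketch, and how the kernel handles page counts that do not divide evenly is not described here.

```shell
#!/bin/sh
# Worked arithmetic for the smoothsync-age=30, 90-page example.
age=30                        # smoothsync-age, in seconds
dirty=90                      # dirty metadata pages to be written
window=$((age / 2))           # metadata writes use the second half of the period
start=$((age - window))       # seconds into the period before pushing begins
per_sec=$((dirty / window))   # pages pushed each second (integer division)
echo "start=${start}s window=${window}s per_sec=${per_sec}"
```

With these inputs the script reports a 15-second window starting 15 seconds in, at 6 pages per second.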
In V4, smoothsync applied only to data pages; all dirty metadata pages
would get scheduled together. Smoothsync of metadata was introduced in
V5.
Example: cat'ing or exec'ing a few files which already exist in cache. No data
I/O is necessary, as the files are in cache, but async metadata writes
are performed to update the file stats. On V4, all of these I/Os
would get scheduled together. On a lightly loaded system, they would
likely occur within a single 1-second interval. On V5, these I/Os
get spread out over the smoothsync period. In the case of 1000 dirty
pages, this amortization is a good thing. In the case of a few dirty pages,
it can cause a "disk tick" each second until all the pages
are pushed. (Might be worth changing this behavior; keep the disks a
little quieter.)
One of the major benefits of not flooding a device with too many requests
is an improvement in response time, resulting from a decreased latency for
synchronous I/O requests. Without smoothsync, it is easy to flood a
device queue; until the I/Os complete, the system could appear to be
hung.
> If I boot the machine to single-user, (>>> boot -flags s) there is
> no problem. At this point, the running processes are:
Smoothsync gets enabled on the transition into multi-user mode, and
disabled on transition back into single-user.
> If I kill all processes which have arisen since single-user mode,
> the situation doesn't change.
Smoothsync is a kernel thread, so it will not be visible in ps output and
cannot be killed via the kill command. Smoothsync can be disabled via:
# sysconfig -r vfs smoothsync-age=0
Received on Wed Apr 04 2001 - 13:26:16 NZST