SUMMARY: large NFS server tuning

From: Steve VanDevender <stevev_at_hexadecimal.uoregon.edu>
Date: Thu, 24 Oct 1996 15:29:16 -0700 (PDT)

I received responses from, and thank:

Alan Rollow <alan_at_nabeth.cxo.dec.com>
Dave Golden <golden_at_falcon.invincible.com>
Don Weyel <weyel_at_bme.unc.edu>
Jim Zelenka <jz1j+_at_andrew.cmu.edu>
<werme_at_zk3.dec.com>

Highlights of their responses are attached below.

A couple respondents pointed out that I had not mentioned the topology
of the network of machines that use the NFS server, or some critical
details of the HSZ. The five main machines that use the NFS server are
on a Fast Ethernet (100base-T) in the same IP subnet. Our HSZ is
configured for RAID-5, has a 32M write-back cache which is enabled, and
the disks are supposed to be fairly well-balanced on the HSZ internal
SCSI busses.

Since posting my original query we discovered one application that was
writing a 2 megabyte checkpoint file every couple of minutes over NFS.
While this alone should not cause major performance problems with a
well-tuned NFS server, we have reconfigured the application to write to
local disk and this seems to have substantially reduced problems with
our current configuration. Our clients run NFS v3.

I see substantial periods of heavy use of the metadata cache controlled
by "bufcache", and sometimes high rates of cache misses during those
times.

My reading of the responses indicated:

1. Increasing the "maxusers" kernel parameter should help with high NFS
loads.

2. The UBC in Digital UNIX caches file data from UFS, but not
metadata (inodes, indirect blocks, directories), so raising the
"bufcache" kernel parameter too much will actually take memory away
from caching file data without any performance benefit.

3. A modest increase in "bufcache" may help for dedicated file servers.

I intend to experiment with raising "bufcache" from 3% to 6% and
raising "maxusers" from the default 32 to 128 as a first step.

Alan Rollow writes:
> Digital UNIX uses a unified buffer cache that will use all free
> memory for buffer space given the chance. The 3% bufcache
> setting is for buffer headers, not the buffers themselves.
>
> Where supported, use NFS V3 clients since that will help
> performance, notably for writes. If you can't have NFS V3
> clients, consider the addition of Prestoserve to the system.
> This is software, license and an NVRAM board that provides
> a fast write capability. A problem with V2 writes is that
> they are synchronous on the client and server. This makes
> the write load on the server very hard to optimize.
>
> I think you almost have to use the write-back cache on the
> HSZ for RAID-5, but if not required, you should use it. The
> synchronous write load doesn't allow the HSZ much opportunity
> to do the writes efficiently and you get all the classic
> performance problems of RAID-5.
>
> Next look for the network adapters and disks being a
> bottleneck. Ethernet is limited to about 800 KB/sec before it
> saturates. This is only about 100 I/Os per second when the
> I/O size is 8 KB, which is common for file system I/O. RAID-5
> and RAID-0 will help distribute this among multiple disks,
> but you should verify.
>
> On the HSZ make sure that the members of the RAID are on
> different back-end busses where possible.
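
Alan's Ethernet arithmetic checks out; as a quick sanity check of the
figures he quotes (800 KB/sec of usable throughput, 8 KB per file
system I/O):

```python
# 10 Mbit/s Ethernet saturates at roughly 800 KB/sec of useful
# throughput; at the common 8 KB file-system I/O size that is
# only about 100 I/Os per second across the whole wire.
ethernet_kb_per_sec = 800
io_size_kb = 8
print(ethernet_kb_per_sec // io_size_kb)  # → 100
```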

Dave Golden writes:
> Check out your value for "maxusers" in your /sys/conf/SYSNAME file.
> You want many more than the default 32 for the number of clients
> that you are serving. (try 256 or 512).
>
> You didn't say how much storage you have on the HSZ40, but I'm betting
> it's considerable.
>
> You probably want to consider a RAM upgrade. That will improve caching
> in the UBC as well as speed up your directory lookups, get attributes, etc.
>
> Don't overlook your network. You didn't describe its topology, so I have
> no idea if you're "armed for bear" or running everything through one segment
> and a bunch of $200 twisted pair hubs. That's usually the cause of sporadic
> "sluggishness". Check your collision/error counts with netstat -i and get a
> tool such as ethermon to see if your utilization is too high. A bogged network
> will make the fastest server look like junk.
>
> Good luck,
>
> Dave
>
> --
> Dave Golden golden_at_invincible.com
> Invincible Technologies Corporation
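
The checks Dave suggests can be run with the standard tools; a quick
sketch (column names and available flags vary somewhat by release, so
check the manual pages on your own system):

```shell
# Look for non-zero error and collision counts (Ierrs/Oerrs/Coll)
# on the interface that carries the NFS traffic:
netstat -i

# Server-side NFS/RPC statistics; high badcalls or retransmission
# counts point at the network rather than at the server itself:
nfsstat -s
```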

Don Weyel writes:
> Steve,
>
> Our configuration is much like yours, although we have fewer users online
> at any time, but we do have many compute intensive jobs, and all of our
> exported directories are ADVfs. I, too, have looked at NFS tuning, and
> about all I can recommend is to get the O'Reilly publication entitled
> "Managing NFS and NIS" by Hal Stern, in which an entire chapter is
> dedicated to tuning NFS. It is highly informative, although a little
> depressing, in that your options are _relatively_ limited. Adding nfsd
> daemons on the server and biod (nfsiod on OSF 3.2) daemons on clients is
> the most overtly 'available' tuning mechanism, but you have to take a
> fair amount of time to twiddle, watch, and twiddle again. The more
> systems you have, the more time you have to dedicate to it.
>
> Anyway, the O'Reilly book goes into detail on these issues and is well
> worth the expenditure.
>
> Good luck.
>
> Don

Jim Zelenka writes:
> Excerpts from internet.computing.alpha-osf-managers: 22-Oct-96 large NFS
> server tuning Steve VanDevender_at_hexade (792)
>
> > All of the filesystems on the AlphaServer system are UFS, and the
> > AlphaServer has 128M of RAM. My suspicion is that because of this the
> > 3% bufcache setting may be much too low to allow for useful caching (3%
> > of 128M is only about 4M of cache).
>
>
> What 3% setting are you referring to? DU dynamically balances physical
> page usage between vm and the ubc based on hit/miss rate and frequency
> for both. It's not unusual to see an I/O bound system using most of its
> pages for bufcache rather than vm.
>
> -Jim Zelenka

werme_at_zk3.dec.com writes:
> Tidbits:
>
> * All file data is in the UBC (Unified Buffer Cache). This is willing to
>   use all available memory.
>
> * The bufcache you refer to is used for metadata (inodes, directories,
>   indirect blocks, etc.). Worth increasing for dedicated file servers.
>
> * It may also be worthwhile to increase the size of the namei cache. I
>   don't know how it's currently sized, but I don't know what release
>   you're running anyway.
>
> Read the system tuning manual, it has a lot of hints.
>
> You may be running into disk updates issues if you have a lot of NFS V3
> traffic and are receiving writes faster than the disks can do them.
> (This generally means FDDI or fast Ethernet.)
>
> If you only have Ethernet, then I would expect the network to suffer
> congestion before the file server starts struggling. Tcpdump traces
> are a big help; see the Digital UNIX FAQ for configuration help.
>
> Eric (Ric) Werme | werme_at_zk3.dec.com
> Digital Equipment Corp. | This space intentionally left blank.
Received on Fri Oct 25 1996 - 00:50:18 NZDT

This archive was generated by hypermail 2.4.0 : Wed Nov 08 2023 - 11:53:47 NZDT