nfs connections failing when accessing large image files from an application. from MacDonell, Dennis on 2001-04-27 (tru64-unix-managers)

From: MacDonell, Dennis <DennisMacDonell_at_auslig.gov.au>
Date: Fri, 27 Apr 2001 14:56:50 +1000

Hi,

I'm not sure whether this is particular to our setup, but in this case I
have a Sun UltraSparc10 (10/100 Mbps Ethernet) Solaris 5.7 NFS client and a
DEC 3000 (oldish 10 Mbps Ethernet, TURBOchannel) Tru64 5.0A NFS server. Both
the server and the client are on the same VLAN, so they are in direct
communication with each other. I can cp files from the server to the
client with no problem, but if we run an application like Imagine using as
input a large 9500x13000 single-band Spot4 image (around 125 MB all up), the
application just hangs: the status bar that it puts up stops advancing.

My low-capacity brain seems to think that the cp command and the
application should be doing similar things, so why does one fail (or hang)
and the other succeed? Some questions come to mind:
(a) Does the cp command recognise that the copy is from a different file
system type (i.e. NFS rather than UFS) and do something different?
(b) Can we track down where the bottleneck is? That is, are packets being
discarded (unlikely, as the machines are on the same VLAN, so there should
be no packet discard, assuming we are using UDP rather than TCP), is one or
other machine waiting for an ACK, or is the application timing out a read
and then restarting (I wouldn't think that is very likely)?
(c) Are there ways of forcing NFS connections to be more reliable for large
files (small files don't seem to be a problem)?
(d) Is there any significant difference between data being pushed across a
connection (i.e. a write command executed on the client) and data being
pulled across a connection (a read command executed on the client)?
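
One rough way to probe questions (a) and (b) is to time the same file read
with different request sizes. The sketch below is only an illustration: the
helper name time_reads and the example path are hypothetical, and it assumes
the application reads the image sequentially, which has not been confirmed.
Note also that the NFS client may cache or read ahead, so the size passed to
read() is not necessarily the size that goes over the wire.

```python
import time


def time_reads(path, chunk_size, max_bytes=None):
    """Read `path` sequentially in `chunk_size`-byte requests.

    Returns (bytes_read, elapsed_seconds). Stops at end of file, or after
    `max_bytes` bytes if that limit is given.
    """
    total = 0
    start = time.monotonic()
    # buffering=0 avoids Python's own buffering, so each read() call
    # becomes one read request of the chosen size to the operating system.
    with open(path, "rb", buffering=0) as f:
        while max_bytes is None or total < max_bytes:
            want = chunk_size
            if max_bytes is not None:
                want = min(chunk_size, max_bytes - total)
            data = f.read(want)
            if not data:  # end of file
                break
            total += len(data)
    return total, time.monotonic() - start


# Hypothetical usage on the NFS client (the path is an assumption):
#   time_reads("/mnt/images/spot4.img", 9500)    # row-sized reads
#   time_reads("/mnt/images/spot4.img", 65536)   # larger, cp-like reads
```

If the row-sized run stalls while the larger-block run completes, that would
point at the read pattern rather than at the file size as such.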

I think we can assume that the application is trying to grab 9500 bytes
(presumably one image row) with each read. Given that TCP/IP has a packet
limit of around 1K bytes (I believe, or is it 4K?), I expect that each read
is being sent over the wire in a number of packets and that the NFS software
is having to reassemble each read at the client end. Perhaps there is a way
of tuning the client and/or the server so that they perform this operation
more efficiently.
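
To put rough numbers on the packet question, the arithmetic can be sketched
as below. This is an estimate under stated assumptions, not a measurement of
this network: it assumes standard Ethernet with a 1500-byte MTU, IPv4 headers
of 20 bytes with no options, NFS running over UDP, and it ignores the RPC and
NFS headers that travel with the data. The 8192-byte figure is an assumed NFS
read size, not one taken from the mounts described here.

```python
import math


def ip_fragments(udp_payload_bytes, mtu=1500, ip_header=20, udp_header=8):
    """Estimate how many IPv4 fragments one UDP datagram needs.

    Assumes IPv4 without options. Each fragment carries its own IP header;
    the UDP header is sent once, as part of the first fragment's data.
    """
    # Data in every fragment except the last must be a multiple of 8 bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    datagram_data = udp_payload_bytes + udp_header
    return math.ceil(datagram_data / per_fragment)


print(ip_fragments(1024))  # 1
print(ip_fragments(8192))  # 6
print(ip_fragments(9500))  # 7
```

Over UDP, losing any one fragment means the whole datagram is discarded, and
the complete request has to be retransmitted after a timeout, so large reads
are more exposed to occasional packet loss than small ones.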

We have had similar sorts of problems with Samba connections from PCs
running NT4. Our current Samba configuration uses the following socket
options to help with reliability:
socket options = SO_KEEPALIVE TCP_NODELAY
However, I'm a little unsure as to whether this is related. I believe
software can use stream/socket connections or something else. So I guess
another question is: how does one discover what sort of connection is being
established between the client and the server?

Dennis

######################################
Dennis Macdonell
Systems Administrator
AUSLIG
mail: PO Box 2, Belconnen, ACT 2617
email: mcdonell_at_auslig.gov.au
ph: 61 2 6201 4326
fax: 61 2 6201 4377
######################################
Received on Fri Apr 27 2001 - 04:57:54 NZST
