Routing and NFS

From: Serguei Patchkovskii <patchkov_at_ucalgary.ca>
Date: Thu, 04 Mar 1999 17:39:30 -0700 (MST)

Hi managers,

We've run into a curious situation which I am at a loss to explain,
so I'd like to get some opinions here. We run a cluster of
about a hundred DPW 500au's in a remote boot setup. When the
whole thing is booting (e.g. after a power loss), the collective
traffic easily saturates 100Mbit Ethernet, so we installed a
second network interface on the server node (the nodes are connected
by a 3COM 3300 switch, which can easily handle the load). Both
interfaces have IP addresses in the same subnet, and traffic
is spread over the interfaces with static routes, both on client
and server nodes. This works fine for its intended purpose
(i.e. booting), but:

here comes the strange bit I can't understand. Static host routes
on the server side work fine for all TCP traffic, as well as UDP
traffic EXCEPT for packets originating from the NFS server (port 2049).
For example (.34 is the primary interface on the server, .219 is
the secondary):

# netstat -rn
...
Destination Gateway Flags Refs Use Interface
...
XXX.XXX.XXX.28 XXX.XXX.XXX.219 UHS 1 98934 tu1
...

Now, if I attempt to tftp a file from the client .28, tcpdump on tu1
sees the whole exchange:

# tcpdump -i tu1 \( host XXX.XXX.XXX.34 or host XXX.XXX.XXX.219 \) \
                and host XXX.XXX.XXX.28 and udp
...
17:04:49.485072 XXX.XXX.XXX.28.2109 > XXX.XXX.XXX.34.tftp: 40 RRQ "Censored"[|tftp]
17:04:49.601216 XXX.XXX.XXX.34.3117 > XXX.XXX.XXX.28.2109: udp 516
17:04:49.601216 XXX.XXX.XXX.28.2109 > XXX.XXX.XXX.34.3117: udp 4
17:04:49.601216 XXX.XXX.XXX.34.3117 > XXX.XXX.XXX.28.2109: udp 246
17:04:49.602192 XXX.XXX.XXX.28.2109 > XXX.XXX.XXX.34.3117: udp 4

BUT an NFS request from .28 arrives over tu1, while the response
goes back over tu0 - even though the static route indicates tu1 as
the right destination (the traces below are obviously for different
packets, but the pattern is always the same: NFS packets arrive on
tu1, and leave over tu0 REGARDLESS of the static route):

# tcpdump -i tu1 \( host XXX.XXX.XXX.34 or host XXX.XXX.XXX.219 \) \
                and host XXX.XXX.XXX.28 and udp
...
17:05:13.135664 XXX.XXX.XXX.28.1017 > XXX.XXX.XXX.219.2049: udp 108
...

# tcpdump -i tu0 \( host XXX.XXX.XXX.34 or host XXX.XXX.XXX.219 \) \
                and host XXX.XXX.XXX.28 and udp
...
17:12:18.366976 XXX.XXX.XXX.34.2049 > XXX.XXX.XXX.28.1014: udp 168
...
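
For anyone wanting to check their own traces for the same pattern, the
comparison above can be automated. What follows is only a minimal Python
sketch, not part of the original setup: the helper names (parse,
nfs_asymmetry) are hypothetical, and it assumes tcpdump prints UDP lines
in exactly the "SRC.PORT > DST.PORT: udp LEN" form shown in the traces.
It records on which interface, and to or from which server address, NFS
requests and replies were seen.

```python
import re

# Assumed line format, taken from the traces above:
#   HH:MM:SS.usec SRC.PORT > DST.PORT: udp LEN
LINE = re.compile(r"^\S+ (\S+)\.(\w+) > (\S+)\.(\w+): udp \d+")


def parse(line):
    """Return (src_addr, src_port, dst_addr, dst_port), or None if no match."""
    m = LINE.match(line.strip())
    return m.groups() if m else None


def nfs_asymmetry(captures, nfs_port="2049"):
    """captures maps an interface name to a list of tcpdump output lines.

    Returns two sets of (interface, server_address) pairs: one for
    requests sent to the NFS port, one for replies sent from it.
    """
    requests, replies = set(), set()
    for iface, lines in captures.items():
        for line in lines:
            parsed = parse(line)
            if parsed is None:
                continue  # not a plain "udp LEN" line (e.g. decoded tftp)
            src, sport, dst, dport = parsed
            if dport == nfs_port:
                requests.add((iface, dst))
            elif sport == nfs_port:
                replies.add((iface, src))
    return requests, replies


# The two NFS lines quoted above, keyed by the interface they were seen on.
captures = {
    "tu1": ["17:05:13.135664 XXX.XXX.XXX.28.1017 > XXX.XXX.XXX.219.2049: udp 108"],
    "tu0": ["17:12:18.366976 XXX.XXX.XXX.34.2049 > XXX.XXX.XXX.28.1014: udp 168"],
}
requests, replies = nfs_asymmetry(captures)
```

If the requests and replies sets name different interfaces or different
server addresses, the replies are not following the path of the requests,
which is the behaviour described here.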


The whole situation is mostly a curiosity in our setup: a single
100Mbit link handles our steady-state NFS requirements quite well,
and the routes do work for the most network-intensive part of the
startup. Still, the fact that NFS UDP (and ONLY NFS UDP) traffic
does not obey the static route means that there is something
strange going on - and I'd like to know why...

I would appreciate any suggestions and ideas.

/Serge.P
Received on Fri Mar 05 1999 - 00:42:42 NZDT
