We are having puzzling errors with ftp.
I have an AS4000 running Digital Unix 4.0d. It talks to about 20 PC's
running Windows NT 4.0 on a dedicated network (recently upgraded from
10 Mbit/s to 100 Mbit/s). The PC's will typically ftp large data sets
(30-100MB) from the Alpha, crunch the data sets for a while, then ftp
the modified data back to the Alpha. Each PC will grab a different data
set ("job"), work on it and return it, then go looking for another job.
At any given moment, nobody may be ftp'ing, or several boxes may be
ftp'ing at the same time.
A couple of weeks ago, we started seeing read and write failures for
ftp. Some PC's were unable to grab their initial data sets, some could
not return the finished data sets. A few timeout errors were logged as
well. All these errors are seen only in the APPLICATION logs on the NT
boxes. As yet, I have seen no system errors reported by the Alpha.
There are some read/write failures reported by the NIC itself (netstat
-I tu1 -s). They are a bit cryptic though.
***************************************************************
tu1 Ethernet counters at Fri Sep 4 08:48:08 1998
65535 seconds since last zeroed
4294967278 bytes received
4294967284 bytes sent
159343879 data blocks received
234554725 data blocks sent
6241221 multicast bytes received
55944 multicast blocks received
1558996 multicast bytes sent
10899 multicast blocks sent
6445193 blocks sent, initially deferred
687580 blocks sent, single collision
644871 blocks sent, multiple collisions
374 send failures, reasons include:
    Excessive collisions
    Carrier check failed
0 collision detect check failure
5443 receive failures, reasons include:
    Block check error
    Framing Error
    Frame too long
0 unrecognized frame destination
0 data overruns
0 system buffer unavailable
0 user buffer unavailable
***************************************************************
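To put those counters in proportion, here is a rough sketch in Python that turns them into percentages of the total traffic. The numbers are copied from the output above; the helper names (rate_pct, near_limit) are my own and purely illustrative. It also flags values that sit at or just below a 16-bit or 32-bit ceiling, since the "seconds since last zeroed" and byte counters look as though they may have stopped counting. That last part is a guess on my side, not something the output states.

```python
# Counters copied from the "netstat -I tu1 -s" output above.
counters = {
    "seconds_since_zeroed": 65535,
    "bytes_received": 4294967278,
    "bytes_sent": 4294967284,
    "blocks_received": 159343879,
    "blocks_sent": 234554725,
    "sent_deferred": 6445193,
    "sent_single_collision": 687580,
    "sent_multiple_collisions": 644871,
    "send_failures": 374,
    "receive_failures": 5443,
}


def rate_pct(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


def near_limit(value, bits, slack=64):
    """True if value is within `slack` of the largest unsigned `bits`-bit number.

    The slack of 64 is an arbitrary choice for this sketch.
    """
    return value >= (1 << bits) - 1 - slack


sent = counters["blocks_sent"]
received = counters["blocks_received"]

rates = {
    "deferred_pct": rate_pct(counters["sent_deferred"], sent),
    "collision_pct": rate_pct(
        counters["sent_single_collision"] + counters["sent_multiple_collisions"],
        sent,
    ),
    "send_failure_pct": rate_pct(counters["send_failures"], sent),
    "receive_failure_pct": rate_pct(counters["receive_failures"], received),
}

for name, value in rates.items():
    print(f"{name:22s} {value:.4f}%")

# Counters that may have hit their ceiling cannot be trusted for rates over time.
for name, bits in (
    ("seconds_since_zeroed", 16),
    ("bytes_received", 32),
    ("bytes_sent", 32),
):
    if near_limit(counters[name], bits):
        print(f"{name} is at or near the {bits}-bit maximum; possibly saturated")
```

If the time and byte counters really are pegged, the block counts are cumulative over an unknown period that includes the old 10 Mbit/s network, so zeroing the counters and sampling again would give a cleaner picture of the current setup.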
Our first attempt to fix this was to upgrade to 100Mbit on the network,
assuming that we were flooding it with too many simultaneous transfers.
We now see about half the number of errors, but they have not gone away.
Maybe we are still flooding it.
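As a back-of-the-envelope check on the flooding theory, the sketch below works out the best-case time to move one data set at each link speed. It assumes 1 MB = 10**6 bytes, no protocol overhead, and that simultaneous transfers share the link equally. None of that is measured, and the function name transfer_seconds is just for illustration.

```python
def transfer_seconds(size_mb, link_mbit_per_s, concurrent=1):
    """Ideal seconds to move size_mb megabytes over a shared link.

    Assumes 1 MB = 10**6 bytes, zero overhead, and an equal split of the
    link between `concurrent` simultaneous transfers.
    """
    bits = size_mb * 8_000_000
    share_bits_per_s = link_mbit_per_s * 1_000_000 / concurrent
    return bits / share_bits_per_s


# Smallest and largest data sets, old and new link speed, 1 or 5 transfers at once.
for link in (10, 100):
    for size in (30, 100):
        for concurrent in (1, 5):
            secs = transfer_seconds(size, link, concurrent)
            print(f"{size:3d} MB at {link:3d} Mbit/s, {concurrent} transfer(s): {secs:6.1f} s")
```

Real transfers will be slower than these figures, but the ratio shows how much longer the wire stays busy when several boxes fetch or return jobs at the same moment.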
Where would ftp log the errors if the OS on my alpha did receive
notification of ftp failures? I have poked around in the syslog files
and found nothing. It bothers me that the app on the NT side is logging
errors, but that my Alpha is oblivious. Do I have to turn the logging
on in some way? I would think that the ftpd process would report
failures, even if it did not initiate the ftp.
Also - what about thresholds with ftp? Is there a max number of
connections, or some maximum buffer limit that I might be hitting? The
man pages have given me no real clues. As far as I can tell the only
maximum would be how much the NIC can take all at once.
Suggestions, pointers, guesses are all welcome.
TIA
susrod_at_hbsi.com
Received on Fri Sep 04 1998 - 16:21:57 NZST