Followup #3: transfer times 10x difference from Dan Kirkpatrick on 1999-05-08 (tru64-unix-managers)

From: Dan Kirkpatrick <dkirk_at_suhep.phy.syr.edu>
Date: Fri, 07 May 1999 15:49:36 -0400

Ok... quick recap...
ftp from a 100mbps machine to a 100mbps machine, remote:/dev/null, is 4500kbytes/sec
ftp from a 100mbps machine to a 100mbps machine, remote:/tmp/FILE, is 33kbytes/sec
If I do the same from a 10mbps machine to the same 100mbps machine, it goes
1100kbytes/sec whether to a file or /dev/null on that same 100mbps machine.
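
For scale, here is a rough sketch of how those three rates compare with the
theoretical ceiling of the slowest link in each path. It is only arithmetic
on the figures above, not a new measurement. It assumes 1 kbyte = 1000 bytes
and ignores all TCP/IP and Ethernet overhead, and the function names are
made up for illustration.

```python
# Assumptions (not from the original post): 1 kbyte = 1000 bytes, and no
# allowance for TCP/IP or Ethernet framing overhead.

def line_rate_kbytes(mbps):
    """Theoretical ceiling, in kbytes/sec, of a link running at `mbps` Mbit/s."""
    return mbps * 1_000_000 / 8 / 1000


def utilisation(observed_kbytes, mbps):
    """Fraction of the theoretical ceiling that a transfer actually reached."""
    return observed_kbytes / line_rate_kbytes(mbps)


if __name__ == "__main__":
    # (description, observed kbytes/sec, speed of the slowest link in Mbit/s)
    cases = [
        ("100mbps -> 100mbps, /dev/null", 4500, 100),
        ("100mbps -> 100mbps, /tmp/FILE", 33, 100),
        ("10mbps  -> 100mbps, either", 1100, 10),
    ]
    for label, rate, mbps in cases:
        print(f"{label}: {utilisation(rate, mbps):.1%} of line rate")
    print(f"/dev/null vs file, 100 to 100: {4500 / 33:.0f}x")
```

Under those assumptions the 10mbps sender is running at close to 90% of what
its link can carry, while the 100-to-100 transfer to a real file is well
under 1%.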

If you look at the timing to /dev/null, it shows the network is fine. I've
checked duplex settings, cables, etc.
It's not related to disk speeds since they are wide devices and the
consistent 1100kbytes/sec xfer is to slower disks.
It's not load on the machine since I get the same timing for 100% load and
0% load.
I also tried adding speed 100 or speed 200 to /etc/rc.config and matching
these duplex settings with the switch, and verified all sides are forced to
the same settings. I don't suspect hardware since it's the same
between 3 machines, and I've tried a direct crossover cable (eliminating
the switch as a possibility) with the same results.
Packet filtering (tcpdump) shows 1-2 sec lags between streams of packets
when ftping to a remote file, but no lags when ftping to remote /dev/null...
I think I've ruled out the disks since a transfer from a 10mbps machine
runs at the rate it should to and from these disks, and local copies are
acceptable. I've also tried writing to different disks...
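
To put numbers on those lags rather than eyeballing the trace, something
along these lines could scan saved tcpdump text output and list every pause
above a threshold. This is a sketch, not the method used above: it assumes
each packet line starts with an hh:mm:ss.fraction timestamp (the usual
tcpdump default), it does not handle a capture that crosses midnight, and
`find_gaps` is a made-up name.

```python
import re

# Leading hh:mm:ss.fraction timestamp, assumed to start every packet line.
TS = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\.(\d+)")


def find_gaps(lines, threshold=1.0):
    """Return (previous_ts, current_ts, gap_seconds) for each gap >= threshold.

    Timestamps are expressed as seconds past midnight. Lines that do not
    start with a timestamp are skipped.
    """
    gaps = []
    prev = None
    for line in lines:
        m = TS.match(line)
        if not m:
            continue
        hours, minutes, seconds, frac = m.groups()
        t = int(hours) * 3600 + int(minutes) * 60 + int(seconds) + float("0." + frac)
        if prev is not None and t - prev >= threshold:
            gaps.append((prev, t, t - prev))
        prev = t
    return gaps


if __name__ == "__main__":
    import sys

    # Reads tcpdump text output from standard input.
    for prev, cur, gap in find_gaps(sys.stdin):
        print(f"{gap:.3f}s gap ending at {cur:.6f} seconds past midnight")
```

Comparing the gap list for a transfer to /tmp/FILE with one to /dev/null
should show whether the pauses are regular (which would hint at a timer or
retransmit) or scattered.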

Hmmm.... ?
So far it's baffled me, this group, and DEC support for over 2 weeks.
Perhaps there's a way to make a RAM drive, to try a write to something other
than /dev/null and the disks on this SCSI bus.
Any other ideas!?
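
One way to test along those lines without ftpd in the picture would be a
small TCP receiver that writes whatever it is sent to a path of your
choosing and reports the longest pause it saw. The following is a sketch
only, assuming a Python interpreter is available on the receiving machine
(not a given on a stock V4.0B install). The names `drain` and `serve_once`
and the port number 5001 are arbitrary choices for illustration.

```python
import socket
import time


def drain(conn, out, bufsize=65536):
    """Read `conn` until the sender closes it, writing every chunk to `out`.

    Returns (total_bytes, elapsed_seconds, longest_gap_seconds). The gap is
    the interval between finishing one chunk and finishing the next, so it
    covers both waiting on the network and the write itself.
    """
    total = 0
    longest = 0.0
    start = last = time.monotonic()
    while True:
        chunk = conn.recv(bufsize)
        if not chunk:
            break
        out.write(chunk)  # the disk (or /dev/null) write happens here
        now = time.monotonic()
        longest = max(longest, now - last)
        last = now
        total += len(chunk)
    return total, time.monotonic() - start, longest


def serve_once(path, port=5001):
    """Accept one TCP connection on `port` and drain it into `path`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("", port))
        srv.listen(1)
        conn, _peer = srv.accept()
        with conn, open(path, "wb") as out:
            return drain(conn, out)
```

Running serve_once("/dev/null") and then serve_once("/tmp/FILE") on the
100mbps receiver, and sending the same data from another 100mbps machine
each time, should show whether the 1-2 sec pauses follow the destination
file or the ftp daemon.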

Just in case it may spark something...
Here's an excerpt from /var/adm/messages (2 SCSI cards on this one):

Apr 30 13:58:43 hepsu02 vmunix: Alpha boot: available memory from 0xae8000 to 0xfffe000
Apr 30 13:58:43 hepsu02 vmunix: Digital UNIX V4.0B (Rev. 564); Fri Apr 30 13:29:46 EDT 1999
Apr 30 13:58:43 hepsu02 vmunix: physical memory = 256.00 megabytes.
Apr 30 13:58:43 hepsu02 vmunix: available memory = 245.21 megabytes.
Apr 30 13:58:43 hepsu02 vmunix: using 975 buffers containing 7.61 megabytes of memory
Apr 30 13:58:43 hepsu02 vmunix: Digital AlphaPC 164LX 533 MHz system
Apr 30 13:58:43 hepsu02 vmunix: Firmware revision: 4.9
Apr 30 13:58:44 hepsu02 vmunix: Digital UNIX PALcode version 1.22
Apr 30 13:58:44 hepsu02 vmunix: Module 1095:646 not in pci option table, can't configure it
Apr 30 13:58:44 hepsu02 vmunix: pci0 at nexus
Apr 30 13:58:44 hepsu02 vmunix: itpsa0 at pci0 slot 5
Apr 30 13:58:44 hepsu02 vmunix: ITPSA VERSION V1.1.25 1998/03/26
Apr 30 13:58:44 hepsu02 vmunix: IntraServer ROM Version V1.0 c1998
Apr 30 13:58:44 hepsu02 vmunix: scsi0 at itpsa0 slot 0
Apr 30 13:58:44 hepsu02 vmunix: rz0 at scsi0 target 0 lun 0 (LID=0) (SEAGATE ST34555W 0930) (Wide16)
Apr 30 13:58:44 hepsu02 vmunix: rz1 at scsi0 target 1 lun 0 (LID=1) (SEAGATE ST423451W 0013) (Wide16)
Apr 30 13:58:44 hepsu02 vmunix: rz2 at scsi0 target 2 lun 0 (LID=2) (SEAGATE ST423451W 0013) (Wide16)
Apr 30 13:58:44 hepsu02 vmunix: tz4 at scsi0 target 4 lun 0 (LID=3) (Quantum DLT4000 CD50)
Apr 30 13:58:44 hepsu02 vmunix: trio0 at pci0 slot 6
Apr 30 13:58:44 hepsu02 vmunix: trio0: S3 Trio64V+ (SVGA) Plug-N-Play, 2.0 Mb
Apr 30 13:58:44 hepsu02 vmunix: tu0: DECchip 21140: Revision: 2.0
Apr 30 13:58:44 hepsu02 vmunix: tu0: auto negotiation capable device
Apr 30 13:58:44 hepsu02 vmunix: tu0 at pci0 slot 7
Apr 30 13:58:44 hepsu02 vmunix: tu0: DEC TULIP (10/100) Ethernet Interface, hardware address: 00-00-F8-06-87-E0
Apr 30 13:58:45 hepsu02 vmunix: tu0: auto negotiation off: selecting 100BaseTX (UTP) port: half duplex
Apr 30 13:58:45 hepsu02 vmunix: isa0 at pci0
Apr 30 13:58:45 hepsu02 vmunix: gpc0 at isa0
Apr 30 13:58:45 hepsu02 vmunix: ace0 at isa0
Apr 30 13:58:45 hepsu02 vmunix: ace1 at isa0
Apr 30 13:58:45 hepsu02 vmunix: lp0 at isa0
Apr 30 13:58:45 hepsu02 vmunix: fdi0 at isa0
Apr 30 13:58:45 hepsu02 vmunix: fd0 at fdi0 unit 0
Apr 30 13:58:45 hepsu02 vmunix: itpsa1 at pci0 slot 9
Apr 30 13:58:45 hepsu02 vmunix: ITPSA VERSION V1.1.25 1998/03/26
Apr 30 13:58:45 hepsu02 vmunix: IntraServer ROM Version V1.0 c1998
Apr 30 13:58:45 hepsu02 vmunix: scsi1 at itpsa1 slot 0
Apr 30 13:58:45 hepsu02 vmunix: rz10 at scsi1 target 2 lun 0 (LID=4) (SEAGATE ST118273LW 5766) (Wide16)
Apr 30 13:58:45 hepsu02 vmunix: rz11 at scsi1 target 3 lun 0 (LID=5) (SEAGATE ST118273LW 5766) (Wide16)
Apr 30 13:58:45 hepsu02 vmunix: rz12 at scsi1 target 4 lun 0 (LID=6) (SEAGATE ST118273LW 5766) (Wide16)
Apr 30 13:58:45 hepsu02 vmunix: rz13 at scsi1 target 5 lun 0 (LID=7) (SEAGATE ST118273LW 5766) (Wide16)
Apr 30 13:58:45 hepsu02 vmunix: lvm0: configured.
Apr 30 13:58:46 hepsu02 vmunix: lvm1: configured.
Apr 30 13:58:46 hepsu02 vmunix: kernel console: ace0
Apr 30 13:58:46 hepsu02 vmunix: dli: configured
Apr 30 13:59:03 hepsu02 vmunix: SuperLAT. Copyright 1994 Meridian Technology Corp. All rights reserved.
Apr 30 13:59:15 hepsu02 vmunix: pcxal_init_keyboard: keyboard init unsuccessful
Apr 30 13:59:16 hepsu02 vmunix: tu0: transmit FIFO underflow: threshold raised to: 256 bytes
Apr 30 13:59:16 hepsu02 vmunix: tu0: transmit FIFO underflow: threshold raised to: 512 bytes
Apr 30 14:02:04 hepsu02 vmunix: tu0: transmit FIFO underflow: threshold raised to: 1024 bytes
Apr 30 14:19:33 hepsu02 vmunix: tu0: transmit FIFO underflow: using store-forward:

And here's the rest of the thread...

>I did have to disable autonegotiation, but that didn't resolve the problem
>either. It did eliminate the "late collisions" and "FDS errors" which are
>symptomatic of duplex mismatch.
>
>Turns out it must not be the switch. I've connected two machines directly
>with a crossover cable and it still has the same long delay (~1mbps, vs.
>10mbps or 100mbps).
>They are TULIP (10/100) Ethernet Interface (on a DEC Alpha 566, running
>Digital Unix 4.0b with jumbo patch of 7/1/98).
>
>Does Dunix need to be told it's 100mbps? The ewa0_mode is set at boot level
>to force 100mbps half or full duplex. I've tried both and both are the
>same. All I can think of next is to bump them down to 10mbps and see what
>happens, but in the end we need 100mbps...
>I've tried both nfs copies and ftp of non nfs partitions.
>Seems fine though with the same nfs mounted and ftp from one of these
>machines to and from a 10mbps machine.
>
>Anything else to recommend?
>
>
>>The predominant suggestion was to check/force the duplex on the machine and
>>on the switch to either full or half duplex. I tried it both ways,
>>disabling the port, setting to half or full on both sides, then reenabling
>>the port.
>>All I have in /etc/rc.config is IFCONFIG_0="<machineip> netmask 255.255.255.0"
>>so I tried changing the duplex at boot level.
>>
>>I tried all combinations... at boot level, the machines had ewa0_mode FastFD
>>and the switch was set at auto-negotiate which selected half duplex. Ok...
>>so just change it right?
>>I tried explicitly telling the switch to use full duplex... same problem
>>I tried explicitly telling the machine and the switch to do half duplex...
>>same problem
>>And disabled ports before change, and enabled after change. Even tried a
>>powerdown.
>>I don't expect it's the cables (cat 5, 1meter) since they transfer 5-10x
>>faster to 10mbps machines.
>>
>>I realize cpu/disk speed may result in a greater bottleneck than the
>>switch/network, but it should at least approach the performance of the same
>>file from a 10mbps machine. I've tried it with both nfs copies, and with
>>ftp (eliminating the cause of an nfs problem?).
>>netstat -i shows virtually no Ierrs Oerrs or Coll. "monitor" also didn't
>>show anything strange.
>>
>>Still scratching my head...
>>
>>Here's the original thread...
>>
>>>Ok... we have a bunch of servers... 3 of which are 100mbps and 6 are 10mbps.
>>>Here's the times of nfs copy/ftp of a ~3.5mb file
>>>
>>>10mbps machine --> 10mbps machine 4.3 sec
>>>10mbps machine --> 100mbps machine 7.6 sec
>>>100mbps machine --> 10mbps machine 5.6 sec
>>>100mbps machine --> 100mbps machine 142.5 sec ----WHY?!
>>>
>>>I think I've ruled out nfs problems since all machines are using automount
>>>and all are mounted using 2mb read/write cache (I've tried disabling cache
>>>too).
>>>They are all on the same subnet and same Cisco 2916 10/100 switch, and
>>>settings on the switch are pretty much the defaults it comes with out of
>>>the box.
>>>The 100mbps machines are set at eprom to do 100mbps, and the switch
>>>autoconfigures as 100mbps, half duplex
>>>
>>>I don't think the gateway is the problem since they are all on the same
>>>subnet but how does a machine determine which of 2 gateways to use? All
>>>machines are running /usr/sbin/gated and most or all don't have a
>>>/etc/gated.conf or /etc/gateways file
>>>
>>>one 100mbps machine's ifconfig tu0:
>>>tu0: flags=c63<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST,SIMPLEX>
>>> inet 128.230.57.16 netmask ffffff00 broadcast 128.230.57.255 ipmtu 1500
>>>
>>>one 10mbps machine's ifconfig ln0:
>>>ln0: flags=c63<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST,SIMPLEX>
>>> inet 128.230.57.3 netmask ffffff00 broadcast 128.230.57.255 ipmtu 1500
>>>
>>>Any other ideas of why the 10x difference or something I may be missing?
>>>
--------------------------------------------------------------------------
Dan Kirkpatrick dkirk_at_phy.syr.edu
Computer Systems Manager
Department of Physics
Syracuse University, Syracuse, NY
http://www.phy.syr.edu/~dkirk Fax: (315) 443-9103
--------------------------------------------------------------------------
Received on Fri May 07 1999 - 19:53:07 NZST
