--
PT>take care! -- contact Compaq support for assistance!
PT>
PT>I've seen the 30 second sync affect a DU system to the point where it can't
PT>even respond to pings for a few seconds, even though the NIC is not
PT>saturated. I've also seen X emulators affected by this comms "brownout" and
PT>clients time out, though this symptom would normally indicate a different
PT>sort of network bandwidth problem. It's possible
PT>
PT>From my reading of the FIFO underflow, the sync freeze problem may actually
PT>be causing these messages indirectly (because the kernel is unable to
PT>complete transferring a packet to the tulip NIC), and adjusting the
PT>threshold for early transmission is the normal operation of the driver. I
PT>think this message is another symptom of the real problem...
PT>
PT>On the other hand, if your freezes aren't 30 seconds apart, and you have
PT>definite signs of a saturated network, then I've probably led you astray, so
PT>continue with your current line of investigation! Note that the FIFO
PT>underflow message is more likely triggered by a busy system (run queue? free
PT>mem? paging out?) rather than a busy network.
PT>
PT>have fun!
PT>Phil T
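Phil's test (are the freezes about 30 seconds apart?) can be checked mechanically once you have noted the times at which freezes began. The following is only a minimal sketch in Python: the function names, the 2-second slack and the sample timestamps are hypothetical choices for illustration, and it assumes every freeze was recorded (a missed one would make the spacing look like 60 seconds).

```python
def intervals(times):
    """Return the differences between consecutive timestamps (seconds)."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def looks_like_sync_freeze(freeze_times, period=30.0, slack=2.0):
    """Return True when every gap between freezes is close to `period`.

    freeze_times: sorted start times of observed freezes, in seconds.
    period:       expected spacing (30 s for the periodic sync).
    slack:        allowed deviation in seconds (an arbitrary assumption).
    """
    gaps = intervals(freeze_times)
    if not gaps:
        # Fewer than two observations: nothing can be concluded.
        return False
    return all(abs(gap - period) <= slack for gap in gaps)


# Hypothetical observations, in seconds since monitoring started.
print(looks_like_sync_freeze([10.0, 40.5, 70.2, 99.8]))
print(looks_like_sync_freeze([5.0, 17.0, 80.0]))
```

If the spacing is irregular, as in the second call, the sync explanation becomes less likely and the network side deserves a closer look.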
Dr Blinn's reply points to the tu driver itself:
TPB>As my esteemed colleague who maintains the "tulip" ("tu") driver noted
TPB>to me:
TPB>
TPB>I found the following in the v4.0D patch documentation.
TPB>
TPB>Of course, for different versions, the numbers change, but
TPB>the patch remains the same.
TPB>
TPB>--------------------------------------------------
TPB>
TPB>
TPB>NEW PATCHID: 681.00
TPB>PATCH ID: OSF425-651
TPB>REQUIRED PATCHES: NONE
TPB>CONDITIONALLY REQUIRED PATCHES: NONE
TPB>SUPERSEDED PATCHES: OSF425-388-2 (297.02), OSF425-562 (597.00)
TPB>SPECIAL INSTRUCTIONS: NONE
TPB>FULL DESCRIPTION:
TPB>PROBLEM: (QAR 55766 QAR 60909) (Patch ID: OSF425-388)
TPB> =*=*=
TPB>This patch fixes the following problems that may occur on some DE500
TPB>adapters:
TPB>
TPB>o The hardware setup operation may interrupt a pending ARP packet
TPB>  transmission.
TPB>
TPB>o If the cable to the adapter is not connected, the hardware setup
TPB>  operation will not execute.
TPB>
TPB>PROBLEM: (CLD ALC-08171) (Patch ID: OSF425-562)
TPB> =*=*=
TPB>
TPB>When using a DE504-BA in an AS800 with a second SCSI controller on the
TPB>shared PCI bus, the DE504 experiences DMA latency and increases its
TPB>buffer size from 128 bytes to 1024 bytes in four steps during heavy load
TPB>and finally goes into a store/forward mode. When this happens the
TPB>device does a reset and stops working for approximately 2 seconds.
TPB>During this time all incoming datagrams and messages are lost.
TPB>
TPB>This patch adds a kernel global variable to the tu driver that
TPB>specifies whether store/forward mode should be permanently enabled when
TPB>the tu driver starts. To enable this mode, patch the variable
TPB>tu_sf_mode using dbx:
TPB>
TPB># dbx -k /vmunix
TPB>
TPB>(dbx) patch tu_sf_mode=1
TPB>1
TPB>(dbx) quit
TPB>#
TPB>
TPB>And the following DOES require an actual patch for V4.0D; I believe the
TPB>fix is in the later releases.
TPB>
TPB>PROBLEM: (QAR 65058, QAR 65259) (Patch ID: OSF425-651)
TPB> =*=*=
TPB>The patch fixes a problem in the tulip driver. The tulip driver needs to
TPB>support DC21143-xD Errata V4.0 for ethernet connections. This chip is
TPB>currently being used on Compaq Professional Workstation XP1000
TPB>(as well as several others in the near future).
TPB>
TPB>
TPB>FILES:
TPB>./sys/BINARY/tu.mod
TPB> CHECKSUM: 61916 43
TPB> SUBSET: OSFHWBIN425
TPB> ./kernel/io/dec/netif/if_tu.c,v RCS ID: 1.1.145.4
TPB>SUPPORT NOTES: NONE
On my side, I was able to perform a few checks on the network side during
the weekend, and I finally found an acceptable solution (for us at
least):
In short, the observed network slowdown occurred because the switch
ports connecting our Unix servers were incorrectly configured to
participate in Spanning Tree Protocol (STP) exchanges.
In more detail:
I observed that the DE500 briefly takes the link down whenever it
enlarges its FIFO buffer. The Ethernet switch port (with STP enabled)
saw this brief link transition as a network topology change, which
caused the port to enter the blocking state for the duration specified
by the "bridge forward delay timer", which is 15 seconds by default on
our network hardware. So, every time the FIFO size was adjusted, the
server was cut off from the rest of the network for this interval.
Since our servers do not themselves act as bridging devices in the
network, there is no need to have STP enabled on the switch ports they
are connected to. Disabling STP on the server ports prevents a brief
link transition from triggering the 15-second wait.
Note that the interruption in network traffic caused by the DE500 itself
still exists, but it now lasts less than 1 second, which is acceptable
for our type of network traffic (no real-time data is carried on the
network).
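For anyone who wants to measure such interruptions on their own servers, here is a minimal sketch in Python. It assumes you already have a sorted list of the times (in seconds) at which probe replies, for example ping replies sent once per second, were received. The function name find_outages, the 1-second probe interval, the 0.5-second tolerance and the sample data are all hypothetical choices for illustration, not part of any Tru64 or switch tooling.

```python
def find_outages(reply_times, probe_interval=1.0, tolerance=0.5):
    """Return (start, duration) pairs for gaps between consecutive replies.

    reply_times:    sorted timestamps (seconds) of received probe replies.
    probe_interval: assumed spacing between probes when all is well.
    tolerance:      extra delay allowed before a gap counts as an outage.
    The reported duration is the gap minus one normal probe interval.
    """
    outages = []
    for prev, cur in zip(reply_times, reply_times[1:]):
        gap = cur - prev
        if gap > probe_interval + tolerance:
            outages.append((prev, gap - probe_interval))
    return outages


# Hypothetical data: replies every second, with one long hole (as seen
# with STP enabled on the port) and one sub-second hiccup.
replies = [0.0, 1.0, 2.0, 18.0, 19.0, 20.0, 21.8, 22.8]
for start, duration in find_outages(replies):
    print(f"outage after t={start:.1f}s lasting about {duration:.1f}s")
```

Comparing the reported durations before and after changing the switch port configuration gives a rough idea of how much of the outage came from the switch rather than from the adapter itself.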
Thanks again to all who replied.
===============================================
Charles Vachon tel: (418) 627-6355 x2760
email: cvachon2_at_mrn.gouv.qc.ca
System Administrator
FRCQ/Ministère des Ressources
Naturelles du Québec
===============================================
Received on Mon Jun 26 2000 - 15:04:56 NZST