-- PT>take care! -- contact Compaq support for assistance!
PT>
PT>I've seen the 30 second sync affect a DU system to the point where it can't
PT>even respond to pings for a few seconds, even though the NIC is not
PT>saturated. I've also seen X emulators affected by this comms "brownout" and
PT>clients time out, though this symptom would normally indicate a different
PT>sort of network bandwidth problem. It's possible
PT>
PT>From my reading of the FIFO underflow, the sync freeze problem may actually
PT>be causing these messages indirectly (because the kernel is unable to
PT>complete transferring a packet to the tulip NIC), and adjusting the
PT>threshold for early transmission is the normal operation of the driver. I
PT>think this message is another symptom of the real problem...
PT>
PT>On the other hand, if your freezes aren't 30 seconds apart, and you have
PT>definite signs of saturated network, then I've probably led you astray, so
PT>continue with your current line of investigation! Note that the FIFO
PT>underflow message is more likely triggered by a busy system (run queue? free
PT>mem? paging out?) rather than a busy network.
PT>
PT>have fun!
PT>Phil T

Dr Blinn's reply points to the tu driver itself:

TPB>As my esteemed colleague who maintains the "tulip" ("tu") driver noted
TPB>to me:
TPB>
TPB>I found the following in the v4.0D patch documentation.
TPB>
TPB>Of course, for different versions, the numbers change, but
TPB>the patch remains the same.
TPB>
TPB>--------------------------------------------------
TPB>
TPB>
TPB>NEW PATCHID: 681.00
TPB>PATCH ID: OSF425-651
TPB>REQUIRED PATCHES: NONE
TPB>CONDITIONALLY REQUIRED PATCHES: NONE
TPB>SUPERSEDED PATCHES: OSF425-388-2 (297.02), OSF425-562 (597.00)
TPB>SPECIAL INSTRUCTIONS: NONE
TPB>FULL DESCRIPTION:
TPB>PROBLEM: (QAR 55766 QAR 60909) (Patch ID: OSF425-388)
TPB>        =*=*=
TPB>This patch fixes the following problems that may occur on some DE500
TPB>adapters:
TPB>
TPB>o The hardware setup operation may interrupt a pending ARP packet
TPB>  transmission.
TPB>
TPB>o If the cable to the adapter is not connected, the hardware setup
TPB>  operation will not execute.
TPB>
TPB>PROBLEM: (CLD ALC-08171) (Patch ID: OSF425-562)
TPB>        =*=*=
TPB>
TPB>When using a DE504-BA in an AS800 with a second SCSI controller on the
TPB>shared PCI bus, the DE504 experiences DMA latency and increases its
TPB>buffer size from 128 bytes to 1024 bytes in four steps during heavy load
TPB>and finally goes into a store/forward mode. When this happens the
TPB>device does a reset and stops working for approximately 2 seconds.
TPB>During this time all incoming datagrams and messages are lost.
TPB>
TPB>This patch adds a kernel global variable to the tu driver that
TPB>specifies whether store/forward mode should be permanently enabled when
TPB>the tu driver starts. To enable this mode, patch the variable
TPB>tu_sf_mode using dbx:
TPB>
TPB># dbx -k /vmunix
TPB>
TPB>(dbx) patch tu_sf_mode=1
TPB>1
TPB>(dbx) quit
TPB>#
TPB>
TPB>And the following DOES require an actual patch for V4.0D; I believe the
TPB>fix is in the later releases.
TPB>
TPB>PROBLEM: (QAR 65058, QAR 65259) (Patch ID: OSF425-651)
TPB>        =*=*=
TPB>The patch fixes a problem in the tulip driver. The tulip driver needs to
TPB>support DC21143-xD Errata V4.0 for ethernet connections. This chip is
TPB>currently being used on Compaq Professional Workstation XP1000
TPB>(as well as several others in the near future).
TPB>
TPB>
TPB>FILES:
TPB>./sys/BINARY/tu.mod
TPB>        CHECKSUM: 61916 43
TPB>        SUBSET: OSFHWBIN425
TPB>        ./kernel/io/dec/netif/if_tu.c,v  RCS ID: 1.1.145.4
TPB>SUPPORT NOTES: NONE

On my side, I was able to perform a few checks on the network side during
the weekend, and I finally found an acceptable solution (for us at least):

In a word, the source of the observed network slowdown is that the switch
ports connecting our Unix servers were incorrectly configured to
participate in Spanning Tree Protocol exchanges.

In more detail: I was able to observe that the DE500 briefly takes the
link down when it enlarges its FIFO buffer. This brief link transition was
seen by the Ethernet switch port (with STP enabled) as a network topology
change, which caused the port to enter the blocking state for the duration
specified by the "bridge forward delay timer", which is 15 seconds by
default with our network hardware. So, every time the FIFO size was
adjusted, the server was cut off from the rest of the network for this
interval.

Since our servers do not themselves act as bridging devices in the
network, there is no need to have STP enabled on the switch ports they are
connected to. Disabling STP on the server ports effectively prevents a
momentary link transition from triggering the 15-second wait.

Note that the interruption in network traffic caused by the DE500 itself
still exists, but it is now below 1 second in duration, which is acceptable
for our type of network traffic (no real-time data is carried on the
network).

Thanks again to all who replied.

===============================================
Charles Vachon tel: (418) 627-6355 x2760
email: cvachon2_at_mrn.gouv.qc.ca
Administrateur de système
FRCQ/Ministère des Ressources Naturelles du Québec
===============================================

Received on Mon Jun 26 2000 - 15:04:56 NZST