--
Craig
                       ,,,              Wot, NO mountains!
======================oOO=(o o)=OOo===================================
Craig Morgan              (_)           Lecturer, CS Group
School of Computing                     Email: C.Morgan_at_soc.staffs.ac.uk
Staffordshire University                Phone: +44 (0)1785 353466
Beaconside                              Fax:   +44 (0)1785 353497
Stafford, UK  ST18 0DG                  Pager: +44 (0)839 453754
"It's the downhill thrills, that make the uphill slog worthwhile..."
======================================================================

===========================================================================
From: Hellebo Knut <Knut.Hellebo_at_nho.hydro.com>

Regards,

At least for 3.0 I know there are patches for the tulip drivers. Maybe they
didn't make it in time for 3.2 and you still have to install these??
Contact DEC for info.

--
******************************************************************
* Knut Helleboe                            | DAMN GOOD COFFEE !! *
* Norsk Hydro a.s                          | (and hot too)       *
* Phone: +47 55 996870, Fax: +47 55 996342 |                     *
* Pager: +47 96 500718                     |                     *
* E-mail: Knut.Hellebo_at_nho.hydro.com    | Dale Cooper, FBI    *
******************************************************************

===========================================================================
From: Martyn Johnson <Martyn.Johnson_at_cl.cam.ac.uk>

I think I've read somewhere that different ethernet controller chips
REPORT collisions differently. For example, if a packet collides 3 times
and then goes on the fourth attempt, some chips report that as 1 collision
(because one packet collided) whereas others report it as 3 collisions
(because that's what happened on the wire).

Fundamentally, whether a transmission collides or not is going to depend
on what is on the wire rather than the particular controller chip. Apart
from pathological timing effects, the performance of a particular chip or
board is unlikely to have any effect, except in so far as a
high-performance interface will load the network more and hence increase
the general collision rate.
My guess is that the general difference you are seeing between lance-based
and tulip-based interfaces is an artefact. I suspect that there is some
hardware problem with the machine that is absurdly bad - either the machine
itself is faulty or there is some problem with its connection.

I only have one tulip-based machine, and its ethernet performance seems
fine to me (about 7.1 to 7.6 Mbit throughput using TCP with the machine in
normal service). It is running 3.2A. It is not meaningful for me to compare
collision rates because we are using switched ethernet, so traffic levels
on different segments vary anyway.

I suggest that you pay less attention to collision rate and start measuring
throughput with something like ttcp. Throughput is, after all, what
actually matters.

--
Martyn Johnson          maj_at_cl.cam.ac.uk
University of Cambridge Computer Lab
Cambridge UK

===========================================================================
From: Dave Cherkus <cherkus_at_UniMaster.COM>

You can't directly compare lance and tulip reports this way. Here's
something I wrote a while ago on this topic:

Newsgroups: comp.unix.osf.osf1
Subject: Re: V3.0 E-Net Collisions with ftp
Organization: UniMaster, Inc.
Date: Wed, 4 Jan 1995 02:29:32 GMT

You are making a reasonable yet inaccurate assumption that the counters are
maintained the same way on both machines, but they are not because the
interfaces use two different chips and the chips used in the tu0 interface
are more accurate than the ones used in the 2000/300 (ln0?) interface.

The AMD LANCE ethernet chip, used in the 2000/300 and also used for many
years in DEC and many other vendors' equipment, tells the kernel one of the
following things happened after a frame is transmitted:

  - no collisions occurred
  - exactly one collision occurred
  - two or more collisions occurred

The Ethernet standard says that up to 15 collisions can occur before the
transmission is aborted, so the LANCE does not communicate the full story
back to the kernel.
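
To make the information loss concrete, here is a minimal Python sketch. It
is not taken from any driver source. It assumes the kernel credits a
"single" outcome as one collision and a "two or more" outcome as two, which
is the rule the next paragraph describes; the function names and the sample
per-frame counts are hypothetical.

```python
def lance_status(collisions: int) -> str:
    """Collapse an exact per-frame collision count into the three
    outcomes a LANCE-style chip reports after a transmit."""
    if collisions == 0:
        return "none"
    if collisions == 1:
        return "single"
    return "multiple"  # two or more; the exact number is lost here


def exact_total(per_frame: list[int]) -> int:
    """Counter as kept by a chip that reports the exact count per frame."""
    return sum(per_frame)


def three_outcome_total(per_frame: list[int]) -> int:
    """Counter built only from the three outcomes (assumed weights 0/1/2)."""
    weights = {"none": 0, "single": 1, "multiple": 2}
    return sum(weights[lance_status(c)] for c in per_frame)


if __name__ == "__main__":
    # Hypothetical collisions seen by six transmitted frames.
    frames = [0, 1, 3, 5, 0, 2]
    print("exact counter:        ", exact_total(frames))
    print("three-outcome counter:", three_outcome_total(frames))
```

With these made-up counts the exact counter reaches 11 while the
three-outcome counter reaches only 7, although the traffic on the wire is
identical. The two counters agree only when no frame collides more than
twice.
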
The kernel increments the netstat collision counter once when exactly one
collision occurred, and by two when two or more collisions occurred. This
is inaccurate, but it's the best the kernel could do. It's not just
inaccurate, it's always optimistic. This is why you think you are getting
'excessive' collisions - you've been lied to by the AMD LANCE in the past.

The older DEC SGEC chip (ne0) and the newer DEC TGEC chip (te0, tu0) can
tell the kernel exactly how many collisions occurred, and this is what
netstat reports. The AMD LANCE used in TurboChannel and ISA systems is
fading into the sunset...

You can identify which chip is being used by the message that appears at
boot time, or by the interface name (ln0 is AMD LANCE, most of the others
are tu0).

If you feel more comfortable with the 'classic' statistic, you can run the
command

    # netstat -I tu0 -is

and look for 'single collision' and 'multiple collision', then add the
'single collision' count to two times the 'multiple collision' count to
get the 'classic' statistic.

--
Dave Cherkus ----- UniMaster, Inc. ----- Contract Software Development
Specialties: UNIX TCP/IP X OSF/1 AlphaAXP AIX RS/6000 Performance ISDN
Email: cherkus_at_UniMaster.COM  Tel: (603) 888-8308  Fax: (603) 888-8308
if (cpu.type == PENTIUM && cpu.step < 8) { panic("Intel Inside!"); }

===========================================================================
From: Mike Iglesias <iglesias_at_draco.acs.uci.edu>

See the message included below for an answer to your question. I got it
from the WAIS search feature of the
http://www-archive.stanford.edu/lists/alpha-osf-managers/hyper/ archive.

Mike

[S] Tulip Ethernet Controller Collision Rate
Bivins, Jeff (BIVINS_at_nebeng.otis.utc.com)
Sat, 30 Sep 1995 10:36:32 -0600 (CST)

My Original question is:

> Hello all,
> I have 35 AlphaStation 250 4/266 workstations and 2 AlphaServer 2100 4/233
> servers. All of these machines have a DEC TULIP PCI ethernet card.
> When using the 'monitor' tool I see on average 30-40 percent collisions
> on a high throughput transfer.
> When I send a large file from one of these machines to a DECsystem 5900,
> the high collision rate only exists on the Alpha side and not the
> DECsystem side.
> Is this a tuning issue ?

Nope. It's normal.

> How can I resolve this issue ?

Thanks to those who responded

   Matt Thomas
   Dave Cherkus
   J. Dean Brock
   Dave Golden

The consensus is that the TULIP controller reveals accurate statistics on
collisions, where the LANCE controller does not. I will look at this
problem from a network perspective.

Thanks,
Jeff

===========================================================================
From: David Lucas <dlucas_at_worldbank.org>

Jim -

We noticed the same problem with our 2 2100s in a DECsafe ASE environment.
One of our Digital support people dug around in the internal archives and
found a paper entitled, "The Ethernet Capture Effect: Analysis and
Solution", K.K. Ramakrishnan and Henry Yang, (rama,
yang_at_erlang.enet.dec.com). In a nutshell, the abstract describes the
effect as a situation "where a station transmits consecutive packets
exclusively for a prolonged period despite other stations contending for
access."

Essentially, the Tulip interfaces, when transmitting, take over the wire,
never giving other systems a chance to send their packets. The solution is
a proposed algorithm, Capture Avoidance Binary Exponential Backoff, that
includes "an enhanced backoff algorithm for collision resolution in the
special case when a station attempts to capture the channel subsequent to
an uninterrupted consecutive transmit."

Of course, none of this offers much practical advice on how to fix the
immediate problem. In our case, we believed our Alphas were having a
negative effect on our overall network, and simply bridged them onto their
own segment.
It hasn't much improved the performance for those 2 systems, but at least
our network guys can't point the finger at us when they do have
problems. :)

The paper is 31 pages long, and I don't have an electronic copy. What I can
try and do is scan it and mail it to you. (I have no way of making a
document available for anonymous ftp.) It may take a day or so, as it's a
bit hectic today.

Hope this is of some help to you.

d.
=======================================================================
David Lucas                     E-mail: dlucas_at_worldbank.org
The World Bank                  Phone: 202.458.5214
Practice random, senseless acts.

===========================================================================
From: Selden E Ball Jr <SEB_at_LNS62.LNS.CORNELL.EDU>

Jim,

I just took a quick look at the e'net interfaces on our Alphas. We have old
and new "tulip" systems as well as lots of 3000 series systems.

As best I can tell, the collision rates of both types are consistent with
the traffic on the ethernet segments to which they are connected.

Have you compared the collision rates of all of the systems which are
plugged into the same hub? I'd expect the ratio of Opkts/Coll to be about
the same there.

Selden

===========================================================================
From: "Jonathan B. Craig" <jcraig_at_i2k.net>

I don't know but I have been testing DEC NSR and have found that network
backups on my (very early model) DEC 2100 w/ Tulip cards have an incredible
amount of collisions (50% normal). If you get a suitable response let me
know!

--
Jonathan B. Craig       jcraig_at_gfoods.com
Gordon Food Service

===========================================================================
From: nick_at_alldata.com (Frank "Nick" Riley)

I was reading through the archive a month or so ago, and I recall reading a
bunch of messages regarding a bug in the TULIP driver in DU 3.? that
required a patch. The symptom was intermittent "voids" in the interface
where absolutely no traffic passed.
Look through the archive at http://www.ornl.gov/cts/archives/mailing-lists/
and search for "TULIP".

===========================================================================
From: ccult1!bommel!dehartog_at_relay.nl.net

Hello Jim,

You may want to ask your friendly Digital support people for the patch:
OSF350-070 (it's mandatory!).

Good luck!

===========================================================================
From: em_at_icess.ucsb.edu (Ed Mehlschau)

We received a tulip interface in a new AlphaStation that yielded very poor
performance until it was configured to run half-duplex instead of
full-duplex. Apparently DEC ships them in the full dux configuration. I
have been told that the config is changed from the boot PROM, but I don't
know the exact incantation offhand, sorry.

-- Ed

===========================================================================
From: anthony baxter <anthony.baxter_at_aaii.oz.au>

Just as a data point, I just checked our 4/233's and they all show similar
numbers (anything from 20% to 30%). These are 3.2A systems (they go to 3.2C
next week), and they show the same boot info for the tulip card as your
systems. They're plugged into a switching hub, so there is no way in hell
they should be seeing that level of errors.

tu0: DECchip 21040-AA: Revision: 2.3
tu0: DEC TULIP Ethernet Interface, _hardware address: 08-00-2B-E4-56-EF

I'd be very interested in anything you find out - I'm hoping it's just a
bug in the reporting code, but in any case it would be good to have it
fixed...

Anthony

===========================================================================

And a couple things I found in the A-O-M archives.

===========================================================================
Subject: (belated) SUMMARY: ethernet constipation on 2100 A500MP
X-Url: http://www.ornl.gov/its/archives/mailing-lists/alpha-osf-managers/1995/02/msg00346.html

Back in (I think) October I posted a description of a problem with the
Sable's ethernet interface.
(Periodically, and for no apparent reason, inbound packets would get stuck.
As soon as the system sent a packet to some other machine, the inbound clog
would clear.) Through a combination of absentmindedness and overwork, I
never did get around to posting a summary. So better late than never, here
it is ...

I got some really helpful replies from a couple of DEC folks (who shall
remain nameless to keep them from getting swamped with unsolicited mail).
The first reply I got said

| [...] I believe you're seeing a bug in the Tulip driver. One
| that was recently discovered, and that too quite by accident.
| (A line of code was deleted and did not get reinstated.)
| It has to do with the driver failing to reset a timer when the
| transmit ring transitions to an inactive state (0 entries pending).
| Each time a transmit packet is given to the device, a timer is
| reset to go off after 5 seconds. This timer therefore never goes
| off if the device is kept busy. If, however, a new transmit does
| not come in within 5 seconds of the last one, then the timer
| goes off and the interface is reset. I believe this reset is what
| causes things to get hung-up.

The bug apparently first appeared in V2.0b, but was discovered too late for
a fix to make it into V3.0. Anyway, the helpful DEC person sent me a
patched version of the TULIP driver, and the problems disappeared. He also
mentioned that he had arranged for the patches to be made available through
Digital's Customer Support Center (for folks covered by a support contract,
of course). The relevant patch numbers are OSFV20-065 (for OSF/1 V2.0b) and
OSFV30-40 (for OSF/1 V3.0)

Mark Bartelt                      416/978-5619
Canadian Institute for            mark_at_cita.toronto.edu
Theoretical Astrophysics          mark_at_cita.utoronto.ca

"Clothes not busy being worn are busy drying."
- Dylan, on laundry day [ singing "It's all right, ma (I'm only bleaching)" ]

===========================================================================
Subject: SUMMARY: tuo: packet dropped: no mbuf (again).
X-Url: http://www.ornl.gov/its/archives/mailing-lists/alpha-osf-managers/1995/07/msg00190.html

Thanks for the reply. DEC was very quick in getting back to me, and I was
able to ftp the patch, install it and rebuild the kernel within an hour of
my call to DEC. I am including the response I received from Matt Thomas
describing the patch.

thanks again,
dan cambron

ORIGINAL:
---------------------
>I included a previous summary for reference. I am at V3.2a on a 2100 using
>AdvFS and I'm still getting crashes and the message "tu0: packet dropped: no
>mbuf". The move to v3.2a doesn't seem to be working. Any thing else I should
>do. Is there a patch to v3.2a? I also have a call in to DEC.
>thanks
>dan

REPLIES:
-----------------------------
There is a patch.

/usr/sys/BINARY/if_tu.o (USG-01533)    CHECKSUM: 33316 54
/usr/sys/data/if_tu_data.c             CHECKSUM: 13750 7
----------------------
Patch ID: OSF320-044, OSF320-059

The Tulip (DECchip 21040) driver does not support software selection of the
10Base2 (Thinwire) and 10Base5 (Thickwire) ports. As per the Tulip
specification, this selection is expected to be carried out in hardware,
and is done so on the DE425 and DE435 modules produced by Digital.

In the absence of a jumper solution or auto-sensing hardware, software can
also select between the 10Base2 and 10Base5 ports if the hardware
implementation utilizes a certain (undocumented) feature of the chip. In
particular, the 3-port PCI Ethernet card made by Standard Microsystem
Corporation (SMC) makes use of this feature, and the driver as shipped
today (since V2.0B), cannot select between the two AUI ports on this
module.

This patch contains an enhanced media-sensing algorithm to allow software
selection of the 10Base2 and 10Base5 ports.
This improved algorithm will also provide better diagnostics on boards that
use a jumper (such as the DE425 and DE435). For example, the driver will
now warn the user if the jumper position was set for Thinwire but no cable
was connected to that port. The driver will now display the following
message:

    tu0: auto sensing: selected BNC (10Base2) port: no carrier

This patch also contains a fix for a problem where the driver will print
out 'packet dropped: no mbuf' messages to the console repeatedly. While
this happens, the system becomes unusable for all other activity and is
effectively hung from a user's point-of-view.

A kernel rebuild is required.

===========================================================================
Received on Fri Dec 22 1995 - 05:08:18 NZDT