I received two replies:
------------------
Matt Theriault, knash_at_mnsi.net
Who had had similar problems that turned out to be bad thinwire connectors.
On the next recurrence of our problem I will certainly check these -
I can believe that this may be a problem area.
He also confirmed that there is no way to zero the counters.
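Since the counters cannot be zeroed, one workaround is to save a baseline
snapshot and subtract it from later readings. The following is only a minimal
sketch in Python: it assumes the counters have already been extracted into
name/value pairs by some other means, and the counter names and values shown
are made up for illustration (they are not real netstat field names).

```python
def counter_deltas(baseline, current):
    """Return how much each counter has grown since the baseline snapshot."""
    deltas = {}
    for name, value in current.items():
        before = baseline.get(name, 0)
        # A value smaller than the baseline means the counters were reset
        # in between (for example by a reboot), so count from zero.
        deltas[name] = value - before if value >= before else value
    return deltas


# Hypothetical counter names and values, for illustration only.
baseline = {"crc_errors": 150000, "ring_inits": 10000}
current = {"crc_errors": 150012, "ring_inits": 10000}
print(counter_deltas(baseline, current))
```

A counter that has stuck at its maximum value would show a delta of zero here,
so a saturated reading needs to be checked for separately.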
------------------
"Matt Pound" <matt_at_iso.port.ac.uk>
Who was having similar or even worse problems - DEC had just changed
an FDDI board on a 2100 without any improvement and were claiming that
there was a firmware bug on the PCI DEFTA and were waiting for a fix.
------------------
We have now gone for a week without a CRC error, so I am just keeping
my fingers crossed and have ordered an FDDI book to try to get some
background so that I can understand what might be happening.
Thanks to all on this list
Tim.
>
>
> Hello All,
>
> I hope that as the majority of machines on our FDDI are alphas (and
> all DEC) you consider that this question is appropriate to this list.
>
> We have an FDDI loop that until 2 weeks ago consisted of:
>
> DecConcentrator 500 with 6 DecStation 500/2xx (via thin coax)
> 3COM Linkbuilder FDDI concentrator with 8 Alphas 3000/[46]00 (via UTP)
> 3COM Lanplex 2500 Ethernet switch with 8 Ethernets populated with more
> Decstations & Alphas and PC's
>
> Every couple of months we would be plagued with Ring Init messages
> (10 - 100 per hour) - we were able to stop these by rebooting one or
> more network items - we never really tracked down what cured the
> problem but always could.
>
> The weekend about 10 days ago we:-
> a) Upgraded all alphas from OSF/1 3.0 to 3.2C
> b) added to the FDDI network one 3000/600 as a second main NFS server
> and 6 x 250 4/266
>
> All went well for a week until on Friday we started getting MAC CRC
> errors - just a trickle at first but by Sunday the whole system had
> become unusable (hangs of 5+ minutes), with thousands of errors.
>
> By Sunday evening all machines had logged approx 3000 MAC CRC errors
> and 10,000 Ring Inits, except the new 3000/600 NFS server, which had
> logged 150,000 CRC errors and (>?) 65535 LEM events (all other machines
> reported LEM events < 3).
>
> I took a gamble and rebooted this machine.
>
> The MAC CRC errors then fell back to a trickle, which continued until
> Monday evening; since then there have been no errors at all.
>
> So the questions:-
>
> 1) Anyone any idea what can possibly be happening here?
>
> 2) With 3.2C Ring Inits are no longer reported via syslog - are they
> really so trivial that they can be safely ignored?
>
> 3) What is an LEM event? Does this indicate a hardware problem on the
> only machine reporting these events?
>
> 4) Is there any way to reset the netstat -I fta0 -s counters?
>
> 5) More of a comment - but a reboot resets all counters to zero except
> the seconds since last zeroed field.
>
> 6) Can anyone recommend a good UK based FDDI expert to call in to try
> to solve these problems?
>
> I have appended the output of netstat -I fta0 -s on the 'rogue' machine,
> and also extracts from the kernel logs on both the 'rogue' machine (joyce)
> and our other main NFS server (byron) at the peak of the problem.
>
> As always Many Thanks to this invaluable list.
>
> Tim.
>
> Tim Janes | e-mail : janes_at_signal.dra.hmg.gb
> Defence Research Agency | tel : +44 684 894100
> Malvern Worcs | fax : +44 684 894384
> Gt Britain | #include <std/disclaim.h>
... FDDI Counters and LOG REMOVED ...
Received on Tue Nov 21 1995 - 01:30:35 NZDT