Occasional fopen failures

From: Tim W. Janes <janes_at_signal.dra.hmg.gb>
Date: Mon, 28 Aug 1995 10:45:30 +0100 (BST)

Hi all,

We have a long-running problem: fopen on an existing file very
occasionally fails when it should succeed.

The error reported by perror is either
    No such file or directory
or
    Stale NFS file handle
Very occasionally we also see shell scripts wrongly failing with the error
    Command not found.

Our environment
Mixed Ultrix 4.3A and OSF/1 V3.0 (problem also existed on V1.2 and V1.3)
AMD automounter - NFS2 Protocols
UFS filesystems.

We carry out a lot of batch work, with program runs lasting from a few
hours to a few weeks. Programs typically loop sequentially through
very many files many times and may well do thousands or even tens of
thousands of fopens, but with only a small number of files open at any
one time.

The problem is seen when the program runs on either Ultrix or OSF/1.
The problem is ONLY seen if the fileserver is OSF/1.
We have no information as to whether the problem exists when the filesystem
is local (we don't run batch jobs on our fileservers).
The problem exists on both Ethernet and FDDI connected fileservers.

We gained some insight a couple of weekends ago when one of our 9 Gbyte
disks died, giving many read errors and SCSI resets.

At the time a user was accessing another disk on the same system with 55
jobs, each lasting 2 hours and each looping through 40 files 30 times. Of
these, 45 jobs fell over, and many failed at the same instant as a SCSI
reset caused by the failing disk.

In general, though, the failures are not associated with any uerf entries.

Many user programs now enclose fopen in a loop that retries it a number
of times, waiting a few seconds between attempts. I have seen several
occasions when the first fopen failed but the second try succeeded.
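
For anyone wanting to try the same workaround, here is a minimal sketch of
such a retry loop. It is not the code from our users' programs: the name
fopen_retry and its max_attempts and delay_secs parameters are made up for
illustration, and treating only ENOENT and ESTALE as transient is an
assumption based on the two perror messages we see.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any library): call fopen() up to
 * max_attempts times, sleeping delay_secs seconds between attempts.
 * Only ENOENT ("No such file or directory") and ESTALE ("Stale NFS
 * file handle") are assumed to be possibly transient; any other
 * error is returned to the caller at once.
 * On failure returns NULL with errno set to the last fopen() error.
 */
FILE *fopen_retry(const char *path, const char *mode,
                  int max_attempts, unsigned delay_secs)
{
    int attempt;
    int saved_errno = 0;

    assert(max_attempts > 0);

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        FILE *fp = fopen(path, mode);
        if (fp != NULL)
            return fp;

        /* Save errno before fprintf() or sleep() can overwrite it. */
        saved_errno = errno;
        fprintf(stderr, "fopen(%s) attempt %d of %d failed: %s\n",
                path, attempt, max_attempts, strerror(saved_errno));

        /* Give up immediately on errors we do not expect to clear. */
        if (saved_errno != ENOENT && saved_errno != ESTALE)
            break;

        /* Wait before the next try, but not after the last one. */
        if (attempt < max_attempts)
            sleep(delay_secs);
    }

    errno = saved_errno;
    return NULL;
}
```

A caller would replace fopen(name, "r") with something like
fopen_retry(name, "r", 5, 3). The drawback is that a file which really is
missing now takes several attempts to be reported, and logging every failed
attempt to stderr is only there to help collect evidence.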

Has anyone else seen similar behaviour, or does anyone have any suggestions?

As I have so little concrete evidence as to where the problem lies, I
have not yet raised the issue with DEC.

TIA

Tim Janes

 
Tim Janes | e-mail : janes_at_signal.dra.hmg.gb
Defence Research Agency | tel : +44 1684 894100
Malvern Worcs WR14 3PS | fax : +44 1684 895103
Gt Britain | #include <std/disclaim.h>
Received on Mon Aug 28 1995 - 12:05:40 NZST
