-- 
$$$ Emanuele Lombardi
$$$ mail: AMB-GEM-CLIM ENEA Casaccia
$$$       I-00060 S.M. di Galeria (RM) ITALY
$$$ mailto:emanuele.lombardi_at_casaccia.enea.it
$$$ tel +39 06 30483366 fax +39 06 30483591
$$$
$$$ What does a process need to become a daemon ?
$$$ - a fork
$$$
$$$* Contrary to popular belief, UNIX is user friendly.
$$$  It's just very particular about who it makes friends with.
$$$* Computers are not intelligent, but they think they are.
$$$* True programmers never die, they just branch to an odd address
$$$* THIS TRANSMISSION WAS MADE POSSIBLE BY 100% RECYCLED ELECTRONS

-----Original Message-----
> From: emanuele.lombardi_at_casaccia.enea.it
> [mailto:emanuele.lombardi_at_casaccia.enea.it]
> Sent: Tuesday, January 30, 2001 4:44 PM
> To: tru64-unix-managers_at_ornl.gov
> Cc: emanuele.lombardi_at_casaccia.enea.it
> Subject: undetected data corruption reading 600Mb files
>
>
> Hardware:    ES40 6/500, 4 CPUs, 3 GB RAM (it also happened with 4 GB)
> Firmware:    5.8
> Software:    T64 Unix 5.1, 2nd patch applied
>              WEBES V3.1 Build 12 09/28/2000 SP 1 Build 4 1 Dec 2000
> File System: AdvFS version 4
> Problem:     when managing large data files (600 MB), data is changed
>              without any notice to the user
>
>
> Dear friends,
>
> This was supposed to be the summary of my mail with the subject
> "gzip & gunzip not always returning original data", but I prefer to
> open a new subject since it proved to be a different (and worse)
> matter.
>
> The problem is that, when managing large data files (600 MB), data is
> changed without any notice to the user.
> A user of mine discovered the problem while gzipping/gunzipping his
> large data file: gunzip sometimes returned strange errors, while other
> times (not always) the gunzipped data was different from the original
> data.
>
> At the beginning, soon after the "gzip & gunzip not always returning
> original data" mail, I suspected a memory error detected by CA to be
> the cause of the problem. Unfortunately, the memory cards have been
> replaced and CA doesn't see any hardware problem, but I still get
> strange undetected data corruption (even without gzip/gunzip).
>
> I have to thank our doctor, Tom Blinn, very much for his very fast and
> useful help. Following his suggestion I found out that the problem is
> NOT in gzip/gunzip, since I get undetected data corruption even with
> the following few lines of code. In it I repeatedly copy an input file
> (../a) to files b and c, and then check for differences among the 3
> files using "diff" and "cksum". Well, it happens that differences
> sometimes really occur, and that there is no noticeable warning or
> error message.
>
> #!/bin/csh
> unset verbose
> set echo
> echo pwd=`pwd`
> uname -a
> unlimit
> limit
> set n=0
> set echo
> loop:
> @ n ++
> echo "==================================================== begin loop $n"
> echo start loop n=$n at `date`
> ls -ls ../a
> cksum ../a
> cp ../a b
> cksum ../a b
> cp b c
> cksum ../a b c
> ls -ls b c
> diff ../a b >/dev/null || echo ERROR 1: FILES ../a and b DIFFER at loop $n
> cksum ../a b c
> diff b c >/dev/null || echo ERROR 2: FILES b and c DIFFER at loop $n
> cksum ../a b c
> diff c b >/dev/null || echo ERROR 3: FILES c and b DIFFER at loop $n
> cksum ../a b c
> diff ../a c >/dev/null || echo ERROR 4: FILES ../a and c DIFFER at loop $n
> cksum ../a b c
> diff c ../a >/dev/null || echo ERROR 5: FILES c and ../a DIFFER at loop $n
> cksum ../a b c
> diff ../a b >/dev/null || echo ERROR 6: FILES ../a and b DIFFER at loop $n
> cksum ../a b c
> diff b ../a >/dev/null || echo ERROR 7: FILES b and ../a DIFFER at loop $n
> cksum ../a b c
> diff b c >/dev/null || echo ERROR 8: FILES b and c DIFFER at loop $n
> cksum ../a b c
> echo end loop n=$n at `date`
> echo "==================================================== end loop $n"
> goto loop
>
> I ran the above script using the file ../a, which has the following
> size and checksum:
>
> ls -ls ../a
> 610032 -rw-r--r-- 1 root system 624672000 Jan 18 17:34 ../a
> cksum ../a
> 2785050943 624672000 ../a
>
>
> While I'm writing, the script is running in the background, and here
> are the results obtained up to now:
>
> loop  ERROR1  ERROR2  ERROR3  ERROR4  ERROR5  ERROR6  ERROR7  ERROR8
>    1  no      no      no      no      no      no      no      no
>    2  no      no      no      no      no      no      no      no
>    3  no      no      no      no      no      no      no      no
>    4  YES     YES     YES     no      no      YES     YES     YES
>    5  no      no      YES     no      no      YES     YES     YES
>    6  no      YES     YES     no      no      YES     YES     YES
>    7  no      no      no      no      no      no      no      no
> ....
> ....
>
> Of course, when some ERRORx occurs (that is, when diff finds some
> difference), the cksum values of the files are not what is expected
> (2785050943, as for file ../a).
>
> Now I kill the background job and edit the script, eliminating all the
> "diff" commands. The script now contains only the following commands:
> cp, ls, and cksum.
>
> The results are ugly! The checksum of a given file often changes
> within the same loop: the sizes are always the same, but the contents
> of the files vary!!
> To prove my words I submitted the script in the background, sending
> stdout to a log file. Look at the following, which shows the resulting
> cksums (which should all be the same):
>
> grep 624672000 CHECK_nogz.log | grep -v system | sort -u
>
> 1680138362 624672000 b
> 2046682359 624672000 c
> 2095653778 624672000 b
> 218351582 624672000 b
> 2371670479 624672000 c
> 2785050943 624672000 ../a
> 2785050943 624672000 b
> 2785050943 624672000 c
> 2992181696 624672000 b
> 3216358513 624672000 b
> 3442014270 624672000 c
>
>
> What else to say ?
> Please, help me!
>
> Thanks to everybody,
> Emanuele

Received on Wed Jan 31 2001 - 13:21:02 NZDT
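
For readers who want to repeat the copy-and-compare test from the quoted message on another system, here is a minimal sketch in Python. It is not the author's script. The function names (file_digest, first_difference, check_round) and the small stand-in input file are hypothetical, and it uses SHA-256 instead of cksum, so its digests will not match the cksum values listed above. It copies a -> b -> c as the csh loop does, reads every file several times, and reports any read whose digest disagrees with the first read of the source.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first mismatch, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # One chunk is a prefix of the other: the files differ in length.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended without any mismatch
            offset += len(a)


def check_round(src, workdir, reads=3):
    """Copy src -> b -> c, then digest each file `reads` times.

    Returns a list of (name, digest) pairs whose digest differs from the
    first digest of src; an empty list means every read agreed.
    """
    workdir = Path(workdir)
    b = workdir / "b"
    c = workdir / "c"
    shutil.copyfile(src, b)
    shutil.copyfile(b, c)
    reference = file_digest(src)
    mismatches = []
    for _ in range(reads):
        for name, path in (("a", src), ("b", b), ("c", c)):
            digest = file_digest(path)
            if digest != reference:
                mismatches.append((name, digest))
    return mismatches


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # 1 MiB stand-in for the 600 MB input file (assumption for the demo).
        src = Path(tmp) / "a"
        src.write_bytes(bytes(range(256)) * 4096)
        for n in range(1, 4):
            bad = check_round(src, tmp)
            print(f"loop {n}: {'OK' if not bad else bad}")
```

On a healthy system every loop should report OK. If a mismatch does show up, first_difference can be run on the source and the bad copy to see where the contents start to diverge, which may help tell a bad copy on disk from an unstable read of a good one.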