HP OpenVMS Systems - Ask The Wizard
The Question is: We have a number of applications whose performance just recently went right into the dumpster. The common thread seems to be that the files read by these applications were all recently ANAlyzed and CONVerted using Ana/RMS/FDL/Nointeractive..., and that all have descending string keys, though the applications may not necessarily be accessing the files via this descending key. These are, for the most part, very large files, with one million records or more. There have been no application changes, and the applications have been running against these files with acceptable performance since 1992. A portion of the descending key in each file in question is the date in YYYYMM format. Is there something in the algorithm for reading or writing descending keys that would suddenly cause unusual overhead, or a poorly optimized file, when YYYYMM = 199810?

The Answer is: ANALYZE/RMS followed by EDIT/FDL/NOINTERACTIVE performs a generic file tuning that takes only a limited number of inputs into account. Specifically, it looks at the number of data records, the average record size, and the disk cluster factor. It does NOT try to use data key values or their distribution. If your original file was properly designed, taking application data and usage patterns into account, then the automated tuning is likely to be less efficient. You should try to revive the old FDL files and compare the assigned COMPRESSION, FILL FACTOR and AREA numbers for the KEYs, and the BUCKET_SIZE for those areas.

- The key you mention, YYYYMM, is rather prone to a large number of duplicates, which in turn can dramatically impact performance. Each duplicate will need a 7-byte pointer. If there are thousands (millions?) of duplicates, then you may need an exceptionally large bucket size to minimize the pain. (There will be pain!) An FDL sketch of the attributes involved appears after this answer.

- Large files nowadays live on very large disks, sometimes with largish cluster sizes (> 50). This may lead EDIT/FDL astray, causing it to select overly large buckets. You can force more reasonable choices by replacing the actual cluster size in the analysis FDL with a 'generic' one such as 12. Now rerun EDIT/FDL and use $ DIFFERENCES on the old and new output FDLs; a command sketch of this cycle also appears after this answer.

- The old file may have been relatively fragmented, with records coming and going and random free space available for new records. The new file may be tightly packed (FILL FACTOR?), so any new record may cause a bucket split: additional I/O and time! This should become stable after a while, but a re-convert with a lower fill factor (70%?) may be needed; see the fill-factor example after this answer.

Good luck!
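A sketch of the analyze/tune/compare/convert cycle described above. The file names BIGFILE.IDX, BIGFILE_ANALYSIS.FDL, BIGFILE_OLD.FDL and BIGFILE_NEW.FDL are placeholders, and 12 is only the example 'generic' cluster size from the answer:

  $ ANALYZE/RMS_FILE/FDL=BIGFILE_ANALYSIS.FDL BIGFILE.IDX
  $ ! Before optimizing, edit BIGFILE_ANALYSIS.FDL and replace the
  $ ! FILE CLUSTER_SIZE value with a 'generic' one such as 12.
  $ EDIT/FDL/ANALYSIS=BIGFILE_ANALYSIS.FDL/NOINTERACTIVE BIGFILE_NEW.FDL
  $ DIFFERENCES BIGFILE_OLD.FDL BIGFILE_NEW.FDL
  $ CONVERT/FDL=BIGFILE_NEW.FDL BIGFILE.IDX BIGFILE.IDX

With /NOINTERACTIVE and /ANALYSIS, EDIT/FDL applies its optimize script without prompting, which is exactly the generic tuning the answer warns about; the $ DIFFERENCES step is what lets you see which COMPRESSION, FILL and BUCKET_SIZE choices it changed from your original design.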
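For the duplicate-key point, a minimal FDL fragment showing the attributes in question. The area number, allocation, bucket size and key segment positions are made up for illustration and must come from your own file design:

  AREA 2
          ALLOCATION              100000
          BUCKET_SIZE             24      ! large buckets help absorb the
                                          ! 7-byte duplicate pointers

  KEY 1
          TYPE                    dstring ! descending string key
          SEG0_POSITION           10
          SEG0_LENGTH             6       ! the YYYYMM portion
          DUPLICATES              yes
          DATA_AREA               2
          INDEX_AREA              2
          DATA_KEY_COMPRESSION    yes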
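And for the fill-factor experiment in the last point, these are the KEY attributes to lower before re-converting; 70 is just the example percentage from the answer:

  KEY 1
          DATA_FILL               70      ! leave roughly 30% free per data bucket
          INDEX_FILL              70      ! and per index bucket

  $ CONVERT/FDL=BIGFILE_NEW.FDL BIGFILE.IDX BIGFILE.IDX

Leaving free space in each bucket gives newly inserted records somewhere to go, so the bucket splits (and their extra I/O) taper off sooner.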