HP OpenVMS Systems: Ask the Wizard
The Question is: This is a rather complex scenario. It involves an old, donated VAX 6000-440 running VMS V6.0, with which I have a casual involvement (.edu, no maintenance, so this problem, tracked over many months, cannot be addressed by the CSC). Mass storage is provided by HSC70-served disks. A VAXstation 4000/90 makes up the rest of the small cluster. I do not have physical access to this system, which makes crashing it for dump information during problem periods difficult.

My application, written in C and accessed by multiple concurrent clients, uses user-mode AST-driven RMS to create, populate and delete files, and ACP QIOs to read and modify file revision dates and times.

Every now and then the system goes ($ MON MODES) to 70% kernel mode and 15-20% interrupt, with the application process showing 100% of one CPU (presumably being billed for the 70% kernel, as this is basically the only CPU activity on the system). The file cache ($ MON FILE) shows abnormal Dir FCB and Dir Data attempt rates (140) with no actual disk activity ($ MON DISK). There are lots (300) of lock ENQs/DEQs ($ MON LOCK), correspondingly abnormal amounts of DLOCK ENQs/DEQs and Dir Function Out activity, and lots (140) of buffered I/O. A $ SHOW PROC/CONT on the application shows considerable periods spent at priority 16. Attempts at tracking the PC show much time spent around 8D0xxxxx and much less time at 7Fxxxxxx, and I cannot find anywhere it might be in my user code (0xxxxxxx).

I have applied a number of relevant 6.0 patches (from the public FTP site) for shadowing, C RTL, F11, LIBR, etc., without mitigating the problem. Faster CPUs added mid-year seem to have exacerbated the problem. I have also increased the SYSGEN ACP_xxx cache sizes and have just disabled the virtual block cache experimentally (impact still to be determined), but am running out of ideas. Can the Wizard suggest anything likely? Thanks (and sorry for the novella).
The Answer is: Beyond providing pointers to the previous discussions of correctly synchronizing applications and memory accesses (topics 1661 and 2681), there is far too little detail included here to even begin to diagnose the cause of this problem. (In particular, migrations to faster CPUs, migrations between VAX and Alpha systems, and moves from uniprocessor to SMP configurations do tend to expose latent synchronization flaws in application code.)

Various system service entry points are located in the 7Fxxxxxx range. For details on these entry points, please see the OpenVMS VAX system map file (SYS$SYSTEM:SYS.MAP) and the SYS$P1_VECTOR module in the STARLET.OLB library.

Through the use of the techniques discussed in topics 1661 and 2681, the debugger, and the system map, it may be possible to locate the particular application trigger for this problem. The problem may of course be due to a flaw in OpenVMS, but the flaw is equally (or more) likely to reside in the application code.

OpenVMS VAX V6.0 is no longer supported and ECOs are no longer being generated for this release, so an upgrade to at least V6.2 is strongly recommended. From OpenVMS VAX V6.0, an upgrade to V7.2 is generally a simple task, and one that does not normally have any particular application-mode or kernel-mode implications. (The upgrade from OpenVMS VAX V5.x to V6.0 is a major upgrade with various implications for kernel-mode code and kernel-mode applications. The V6.x to V7.0 upgrade was a major upgrade only on the OpenVMS Alpha platform, not on the OpenVMS VAX platform.)
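As a rough aid to interpreting the sampled PC values mentioned in the question, recall that the VAX architecture splits the 32-bit virtual address space into four regions selected by the top two address bits: P0 (program region), P1 (control region), S0 (system region) and S1. The sketch below classifies a PC sample accordingly. The names `vax_region`, `classify_pc` and `region_name` are hypothetical helpers invented for this illustration and are not part of OpenVMS or the C RTL. It only narrows down which region a sample falls in. Identifying the actual routine still requires the system map.

```c
#include <stdint.h>

/* Hypothetical illustration only: these names are not OpenVMS APIs.
 * VAX virtual address regions, selected by address bits <31:30>. */
typedef enum {
    REGION_P0 = 0,  /* 0x00000000-0x3FFFFFFF: program region (user images) */
    REGION_P1 = 1,  /* 0x40000000-0x7FFFFFFF: control region */
    REGION_S0 = 2,  /* 0x80000000-0xBFFFFFFF: system region */
    REGION_S1 = 3   /* 0xC0000000-0xFFFFFFFF: second system-space quarter */
} vax_region;

/* Return the region that contains a sampled program counter value. */
vax_region classify_pc(uint32_t pc)
{
    return (vax_region)(pc >> 30);
}

/* Short printable name for a region, for use in a sampling report. */
const char *region_name(vax_region r)
{
    switch (r) {
    case REGION_P0: return "P0";
    case REGION_P1: return "P1";
    case REGION_S0: return "S0";
    default:        return "S1";
    }
}
```

Under this classification, a sample of the form 8D0xxxxx lies in S0 system space, 7Fxxxxxx lies near the top of P1, and 0xxxxxxx lies in P0. That would be consistent with most of the time being spent in executive code rather than in the application image, which is why the system map is the place to look.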