HP OpenVMS Systems: Ask the Wizard
The Question is:

How can I perform VMS hardware monitoring? Can I use DECevent for this? If so, how can I configure it to monitor single-bit and double-bit memory errors?

The Answer is:

DECevent and Compaq Analyze are the usual tools for translating logged errors into text. The error log entries themselves are generated by OpenVMS kernel-mode components, by OpenVMS components such as RMS, and by layered products.

OpenVMS tends to suppress the logging of recoverable (correctable) memory errors until specified thresholds are exceeded, because some number of single-bit (correctable) memory errors is entirely normal and expected.

On various systems, the memory hardware is explicitly designed to detect, and potentially to correct, memory errors; the two most common techniques are parity and ECC. There are, however, memory errors and memory error patterns that can go undetected and thus uncorrected; these are multi-bit errors.

There are (undocumented) kernel-mode data cells containing various error counts; these are the cells used by tools such as SHOW ERROR. The details of the internal counts that are maintained, and of the error correction and machine check mechanisms, depend on the specific OpenVMS platform and the specific system model in use.

There is at least one ECO kit in this area, for the AlphaServer GS60 and AlphaServer GS140 series. Please see the ECO kit VMS712_UPDATE for additional details.
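
As an illustration of why single-bit errors are correctable while some multi-bit patterns are not, here is a minimal Python sketch of a textbook SECDED code (Hamming(7,4) plus one overall parity bit). It is a toy model only: it does not describe how OpenVMS or any AlphaServer memory controller implements ECC, real hardware uses much wider codes, and the names `encode`, `decode`, and `flip` are hypothetical helpers introduced for this example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (list of 0/1)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0: overall parity; 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # check bit for positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # check bit for positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # check bit for positions 4,5,6,7
    code[0] = sum(code[1:]) % 2             # makes the total parity even
    return code


def flip(code, *positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    damaged = list(code)
    for pos in positions:
        damaged[pos] ^= 1
    return damaged


def decode(code):
    """Return (status, data); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of the positions of all set bits
    parity_bad = sum(code) % 2 == 1     # overall parity violated?

    if syndrome == 0 and not parity_bad:
        status = "ok"
    elif parity_bad:
        # Looks like one flipped bit: the syndrome names it (0 = parity bit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with good overall parity: two bits flipped.
        return "uncorrectable", None

    data = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, data
```

In this toy code, any one flipped bit is repaired and any two flipped bits are reported as uncorrectable. Three flipped bits can slip through: `decode(flip(encode(0b1011), 1, 2, 3))` returns `("corrected", 0b1010)`, that is, the decoder reports a successful correction while the data is wrong.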