HP OpenVMS Systems: Ask the Wizard
The Question is:

We noticed, several times, that batch jobs planned to start at day 1 at 00:00 were actually starting at day 0 at 23:59:26.94 (as seen in the log file characteristics and in the program results). We rely on the OpenVMS scheduler and the standard queueing system. Is there something already known about such timing problems?

The Answer is:

In an OpenVMS Cluster, the clocks on individual nodes can and do drift apart. Only one node within the cluster is the "timekeeper" for jobs scheduled on queues. If the current timekeeper's clock is running faster and the job being released executes on a node with a slower clock, the job will appear to have started before its due time.

Remember that the scheduled time is "/AFTER". Queue management does not guarantee exactly when the job will start, only that it will start AFTER the scheduled time, as determined by the clock on the node that is serving and controlling the queues; that is, on the timekeeping node.

Please see the OpenVMS FAQ for the architectural hardware clock accuracy (drift) specifications, and realize that the system software clock can drift further, since high-IPL activity can block clock interrupt processing. This means that the clocks within a cluster can drift apart over time, increasing the likelihood that the situation described here will arise. Again, the FAQ has the technical details.

Remember that your OpenVMS system is a computer, not a chronometer. Customer and business requirements, arcane topics such as the temperature stability of the reference source crystal, and the particular application mechanisms involved all dictate the clock accuracy required. Higher accuracy is certainly technically possible and is available through a variety of optional external hardware and/or software, as described in the FAQ.

What can be done? There are several approaches to minimizing the effects of clock drift and to keeping the clocks more closely synchronized:

1) Correcting drift

   a) Keep nodes in sync with SET TIME/CLUSTER, either manually or periodically. (The SET TIME/CLUSTER command uses the same system mechanisms as the SYSMAN CONFIGURATION SET TIME command that some folks will reference, but SET TIME/CLUSTER has the benefit of being a directly accessible DCL command.) A minimal periodic example appears after this reply.

   b) Employ a time service such as DTSS or NTP to keep the nodes in sync.

   c) Purchase external chronometric hardware and/or software to maintain time to the accuracy that your business requires.

2) Reducing the impact

   a) Don't schedule jobs for exactly midnight; consider using /AFTER="TOMORROW+00:05" or another similar combination time instead. (The offset should be larger than the largest expected cluster time skew.)

   b) Learn the typical clock drift of your systems and make sure programs and algorithms don't expect higher accuracy.

   c) Put a short delay, say one minute, at the start of jobs to allow for cluster time drift. (Items a and c are combined in the second sketch after this reply.)

   d) Do not depend on queue manager timing for high-accuracy events. If you need something to happen at an accurate, specific time, use a permanent job or a job scheduler, running at a higher or real-time priority, and using direct system service calls (e.g., $creprc) rather than SUBMIT commands or $sndjbc calls.

Note that you cannot depend on any particular node being the queue manager timekeeper, nor can you (nor should you) predict, or even easily determine, which node is the timekeeper at a particular moment, so don't even think about it!

For additional details on timekeeping and on clock synchronization techniques and tools, please see the OpenVMS FAQ.
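As a minimal sketch of item 1a above, a self-resubmitting batch procedure can issue SET TIME/CLUSTER on a regular schedule. The file name SYNC_TIME.COM, the six-hour interval, and the use of SYS$BATCH are assumptions for illustration only; the job would need to run from a suitably privileged account.

  $! SYNC_TIME.COM -- hypothetical periodic time-sync job (name, queue, and
  $! interval are illustrative assumptions, not part of the reply above).
  $!
  $! Queue the next run first, so the cycle continues even if this run fails.
  $ SUBMIT/QUEUE=SYS$BATCH/AFTER="+6:00" SYS$MANAGER:SYNC_TIME.COM
  $!
  $! Propagate the timekeeper's notion of the time to all cluster members.
  $ SET TIME/CLUSTER
  $ EXIT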
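And as a sketch of items 2a and 2c combined, a nightly job can be submitted for five minutes past midnight and begin with a short WAIT, so that neither the due time nor the first timestamped action falls within the expected cluster clock skew. The file name NIGHTLY.COM is again an illustrative assumption.

  $! NIGHTLY.COM -- hypothetical nightly batch job (illustrative name only).
  $!
  $! Resubmit ourselves for tomorrow, five minutes past midnight; the offset
  $! should exceed the largest clock skew expected across the cluster.
  $ SUBMIT/QUEUE=SYS$BATCH/AFTER="TOMORROW+00:05" SYS$LOGIN:NIGHTLY.COM
  $!
  $! Short startup delay to absorb any remaining drift between the queue
  $! manager's timekeeping node and the node actually running this job.
  $ WAIT 00:01:00
  $!
  $! ... the real nightly processing goes here ...
  $ EXIT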