Introduction
One of the challenges customers face with legacy OpenVMS environments is having the OpenVMS servers that run their business applications monitored in real time, with incidents fixed as soon as they occur. The difficulty lies either in deploying a monitoring system in the first place or, if one already exists, in integrating it with up-to-date solutions to keep pace with technology developments. The situation worsens when the customer is running very old versions of OpenVMS, for which no tool currently on the market can be deployed because of compatibility issues.
The SYSMON (System Monitor) utility is designed to address these challenges.
What is SYSMON?
SYSMON is a DCL (DIGITAL Command Language) based solution that works on all OpenVMS versions and all supported hardware architectures. Currently, this solution is successfully deployed in three customer environments.
Features
The following are the features of SYSMON:
- Built on a client-server model: one system acts as the server, and the remaining systems, the clients, report incidents to it. The server is itself monitored by another system (the secondary server), which sends a notification if the server goes down.
- Automatic failover from the SYSMON primary server to the secondary server should the primary fail.
- Highly scalable and customizable.
- Automatic status tracking of incidents, and automatic closure of an incident when the issue is resolved.
- Can be installed and set up on the fly, with no system downtime required.
- Automatic filtering of duplicate incidents.
- Real-time monitoring interface for viewing the list of open issues at any time.
- Optional business-hours setting; any monitoring can be turned off dynamically.
- Generic alarm interface that end users can call from their own scripts to report incidents through SYSMON.
Working Theory
SYSMON is a set of OpenVMS command procedures that use native OpenVMS DCL commands, each monitoring a specific entity of an OpenVMS system. Examples of these entities are the free space available on disks, the availability of print and batch queues, and so on. SYSMON consists of the following components:
- Client
- Primary Server
- Monitor Utility
- Secondary Server
Client
The client component comprises the monitoring routines and a scheduler that triggers them at a predefined interval. Each monitoring script has its own data file, which specifies the entities to be monitored. The monitoring scripts watch their assigned OpenVMS entities and report any anomalies to the server. An incident is reported by transferring an alarm file to a unique location on the server. Similarly, when the incident is resolved at the client's end, the client signals the server that the specific incident is resolved, and the incident is closed at the server end.
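The client behaviour described above can be illustrated with a minimal sketch. It is written in Python for readability, not in DCL as SYSMON itself is, and it is not SYSMON's actual code: the function name `run_disk_check`, the alarm file naming scheme, the `OPEN`/`CLOSED` markers, and the use of a local directory in place of a network transfer are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def run_disk_check(node, limits, free_blocks, open_alarms, drop_dir):
    """One pass of a hypothetical disk-space monitoring routine.

    limits      -- {disk: minimum free blocks}, standing in for the script's data file
    free_blocks -- {disk: current free blocks}, standing in for the measured values
    open_alarms -- set of alarm keys this client has already reported (mutated here)
    drop_dir    -- directory standing in for this client's location on the server
    Returns the list of (alarm key, status) pairs written during this pass.
    """
    drop_dir = Path(drop_dir)
    drop_dir.mkdir(parents=True, exist_ok=True)
    sent = []
    for disk in sorted(limits):
        key = f"{node}.DISKSPACE.{disk}"
        free = free_blocks.get(disk, 0)
        is_low = free < limits[disk]
        if is_low and key not in open_alarms:
            status = "OPEN"      # new anomaly: report it once
            open_alarms.add(key)
        elif not is_low and key in open_alarms:
            status = "CLOSED"    # anomaly gone: signal the resolution
            open_alarms.discard(key)
        else:
            continue             # nothing changed since the last pass
        alarm_file = drop_dir / f"{key}.{status}"
        alarm_file.write_text(f"{key} free={free} limit={limits[disk]}\n")
        sent.append((key, status))
    return sent


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as drop:
        state = set()
        limits = {"DKA100": 1000}
        # Low space, same condition again, then recovered.
        for free in (500, 500, 5000):
            print(run_disk_check("NODEA", limits, {"DKA100": free}, state, drop))
```

A scheduler would call such a routine at the predefined interval; in this sketch the second pass writes nothing because the condition is unchanged, and the third pass writes the resolution marker.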
Primary Server
The server component periodically polls each of the client locations and reports the incidents it finds to the HP support team by SMTP email. In addition, the server tracks the status of each incident in a local master database. When the server finds that an issue has been resolved (as signaled by the client), it marks the corresponding incident in the master database as closed.
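The server-side bookkeeping described above can be sketched as follows. This is an illustration in Python under stated assumptions, not SYSMON's implementation: the function name `process_reports`, the in-memory dictionary standing in for the master database, and the `OPEN`/`CLOSED` status strings are all hypothetical, and the actual SMTP delivery is left out.

```python
def process_reports(incidents, reports):
    """Apply one polling cycle's reports to the incident database.

    incidents -- dict mapping incident key -> "OPEN" or "CLOSED" (mutated here);
                 a stand-in for the server's master database
    reports   -- list of (incident key, status) pairs collected from client locations
    Returns the keys of incidents that are newly opened and should be emailed.
    """
    to_notify = []
    for key, status in reports:
        current = incidents.get(key)
        if status == "OPEN":
            if current == "OPEN":
                continue              # already tracked as open: no second email
            incidents[key] = "OPEN"
            to_notify.append(key)
        elif status == "CLOSED" and current == "OPEN":
            incidents[key] = "CLOSED"  # client signaled that the issue is resolved
    return to_notify


if __name__ == "__main__":
    db = {}
    print(process_reports(db, [("NODEA.QUEUE.SYS$BATCH", "OPEN"),
                               ("NODEA.QUEUE.SYS$BATCH", "OPEN")]))
    print(process_reports(db, [("NODEA.QUEUE.SYS$BATCH", "CLOSED")]))
    print(db)
```

Keying incidents by a stable identifier is one simple way to both filter duplicates and match a later resolution signal to the incident it closes.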
Monitor Utility
The monitor component is a menu-based utility that lets HP support staff track the status of open incidents and close them manually when needed. Work is underway to add scripts for further entities such as performance and security monitoring. At present, SYSMON can monitor the following entities:
- Node being Unreachable
- System Process Missing
- Disk Status change
- Error count increases on the devices
- Disk Space
- Highest File Version Check
- Memory Page File Utilization
- Monitor OPCOM messages
- Queue Status Monitoring
- Batch job Monitoring
- Shadow set members Decrease/Increase
- SCS Paths between cluster systems
- Queue Managers' status
Secondary Server
The SYSMON secondary server is essentially a client of the primary server. It periodically polls the primary server to check that the server component is running correctly. If the server component has not been running properly for a period of time, or if the server is down, the secondary server promotes itself to primary server (figure b) and broadcasts the change to the rest of the clients. The clients then transfer their incidents to the new server. When the original primary server comes back up, it demotes itself to a client and assumes the role of secondary server.
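The promotion logic described above can be sketched as a small state machine. This is a Python illustration only, not SYSMON's DCL code: the class name `SecondaryServer`, the `max_missed` threshold (standing in for "a period of time"), and the broadcast message format are assumptions introduced for the example.

```python
class SecondaryServer:
    """Hypothetical model of the secondary server's health polling."""

    def __init__(self, name, primary, max_missed=3):
        self.name = name
        self.primary = primary        # node currently believed to be primary
        self.max_missed = max_missed  # assumed: failed polls tolerated before takeover
        self.missed = 0
        self.role = "secondary"

    def poll(self, primary_ok):
        """Record one poll result; return a broadcast message on takeover, else None."""
        if self.role == "primary":
            return None               # already promoted, nothing left to watch
        if primary_ok:
            self.missed = 0           # a healthy poll resets the failure count
            return None
        self.missed += 1
        if self.missed >= self.max_missed:
            self.role = "primary"
            self.primary = self.name
            return f"NEW_PRIMARY {self.name}"  # message for the remaining clients
        return None


if __name__ == "__main__":
    secondary = SecondaryServer("NODEB", "NODEA", max_missed=2)
    for healthy in (True, False, False):
        print(secondary.poll(healthy), secondary.role)
```

Requiring several consecutive failed polls before promotion is one way to avoid a takeover on a single transient failure; the return of the original primary as a secondary is not modelled here.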
SYSMON Architecture Overview
a) SYSMON during normal operation
b) SYSMON after migration of secondary to primary
View of the open incidents through the MONITOR Interface

Evidence that the Solution Works
SYSMON has been successfully deployed at three customer sites.
Competitive Approaches
The constraint is that commercial monitoring tools (for example, HP OpenView Operations, OVO) cannot be deployed on older environments because they do not support old OpenVMS versions. On the other hand, the tools that were available in the past (for example, Polycenter Watchdog) are no longer developed or supported on these legacy environments. Although SYSMON is targeted at these old platforms, it also runs on the latest OpenVMS versions without modification. This is demonstrated by the fact that it is presently running successfully on the systems of three of our customers, covering all three hardware architectures (VAX, Alpha, and Itanium) and OpenVMS versions from VAX 5.5-1H3 to V8.3 on Integrity servers.
Current Status
SYSMON has been running properly on all the customer systems since its deployment. It has also undergone a few enhancements that introduced new monitoring entities, such as monitoring the members of a mirror set and monitoring cluster communications.
Next Steps
The client transfers alarm files to the server using either the DECnet COPY command, which supports DECnet proxies, or FTP over TCP/IP. Because FTP transmits passwords in clear text, we are currently enhancing the tool to use a secure method, SFTP, for alarm file transfers. We are also working to integrate SYSMON with an OpenVMS-based web server so that all management tasks can be performed through a web interface.