HP OpenVMS Systems: ask the wizard
The Question is:

In general, would it be accurate to say that the more processes accessing an indexed RMS file, the less efficient I/O is? That is, if you had a few server processes accessing the file rather than many single processes directly accessing it, would the same amount of I/O be completed more rapidly? If so, why?

Perhaps related: in a cluster environment, would the number of nodes in the cluster have an effect on I/O if a device were accessible throughout the cluster?

The Answer is:

It depends on many things. For example, are these processes reading or writing the file? Is the access pattern uniform across all records, or are there hot spots? Does the file have global buffers? How much contention is there for records?

Consider one extreme: all processes are READING the same record on a file with global buffers. The first process to access the record reads the file, which places the bucket containing the record in a global buffer. Subsequent requests for the same record are satisfied from the buffer. In a cluster environment, the buffers must also be coordinated across nodes.

In the same case with a "server" process model, the processes reading the record still need to communicate with the server. An I/O by any other name is still an I/O! Or in this case TWO I/Os, as we need both a request and a response. There is also the overhead of context switching.

At another extreme, consider all the processes WRITING the same record. The situation is similar, except that we may need to communicate cache coherency across the cluster.

Now think about each of the processes reading random records. The effectiveness of the cache may be reduced, but why would the different models result in extra I/O to the file or reduce the "efficiency" of the I/Os (whatever that means!)?

There are benefits in using the server process model for accessing data files. For example, it gives more control over the data files in terms of security. It can also lead to more flexible application designs, since the communication with the server can be implemented through any convenient transport mechanism.

The downside of the server model is increased overall I/O and process management overhead: every request results in a minimum of three I/Os (send request, read data, send response), only one of which (the data read) is a candidate for caching, plus two context switches.

In a cluster environment you must also consider how the device is accessed. If it's MSCP served, then it IS a single process directly accessing the device already, but at a lower level.

The Wizard would recommend that an application be designed with a data access layer which presents the data to the application in whatever form is convenient for the application. This layer can then be implemented as direct RMS access, client/server, a database product, an in-memory database, and so on. Don't limit the application to a specific physical implementation by exposing too much detail in the application logic.

This approach allows the application to be written without committing to one implementation up front. Different implementations can be tried and compared without affecting the application. For the same reason, it is much simpler to write an application which is portable across multiple platforms, because all the system-specific code is hidden in the lower levels.
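
As an illustration of the data access layer recommended above, here is a minimal sketch in Python. It is not RMS code: `RecordStore`, `InMemoryStore`, `ServerStore` and `rename_customer` are hypothetical names invented for this example, and the "server" is simulated with an ordinary method call rather than a real transport. The point is only that the application logic depends on the interface, so the implementation behind it can be swapped.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class RecordStore(ABC):
    """Hypothetical data access layer: the only interface the application sees."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the record stored under key, or None if there is none."""

    @abstractmethod
    def put(self, key: str, record: bytes) -> None:
        """Store record under key, replacing any existing record."""


class InMemoryStore(RecordStore):
    """Stand-in for a direct implementation (here, a plain dictionary)."""

    def __init__(self) -> None:
        self._records: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._records.get(key)

    def put(self, key: str, record: bytes) -> None:
        self._records[key] = record


class ServerStore(RecordStore):
    """Stand-in for the client/server model.

    Assumption: a real version would send a request over some transport and
    wait for a response. Here the round trip is just a method call, and we
    count the requests so the two models can be compared.
    """

    def __init__(self, backend: RecordStore) -> None:
        self._backend = backend
        self.requests = 0

    def get(self, key: str) -> Optional[bytes]:
        self.requests += 1
        return self._backend.get(key)

    def put(self, key: str, record: bytes) -> None:
        self.requests += 1
        self._backend.put(key, record)


def rename_customer(store: RecordStore, key: str, new_name: str) -> None:
    """Application logic: knows nothing about how records are stored."""
    store.put(key, new_name.encode("utf-8"))


# The same application code runs unchanged against either implementation.
direct = InMemoryStore()
served = ServerStore(InMemoryStore())
for store in (direct, served):
    rename_customer(store, "C001", "Ada")
```

Because `rename_customer` only takes a `RecordStore`, trying a different physical implementation means writing one more subclass, not touching the application.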
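
The I/O accounting in the answer (a minimum of three I/Os per request in the server model, only the data read being cacheable, plus two context switches) can be turned into a back-of-envelope model. The sketch below is an illustration, not a measurement: `request_cost` and its `cache_hit_ratio` parameter are hypothetical, it uses the minimum counts quoted in the answer, and it assumes the direct model adds no extra context switches.

```python
from typing import NamedTuple


class RequestCost(NamedTuple):
    ios: float              # expected I/Os per request
    context_switches: int   # extra context switches per request


def request_cost(cache_hit_ratio: float, server_model: bool) -> RequestCost:
    """Expected cost of one record request under the answer's minimum counts.

    cache_hit_ratio is the assumed fraction of data reads satisfied from a
    buffer (0.0 = never cached, 1.0 = always cached).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")

    # The data read is the only I/O a cache can absorb.
    data_read = 1.0 - cache_hit_ratio
    # The server model adds a request and a response, neither of them cacheable.
    messaging = 2.0 if server_model else 0.0
    switches = 2 if server_model else 0
    return RequestCost(ios=data_read + messaging, context_switches=switches)


direct = request_cost(cache_hit_ratio=0.75, server_model=False)
server = request_cost(cache_hit_ratio=0.75, server_model=True)
```

With these assumptions, a better cache shrinks the direct model's cost toward zero, while the server model never drops below two I/Os and two context switches per request.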