Accessing RMS indexed files from non-VMS systems using SOAP and Python

Accessing your legacy RMS indexed files from non-VMS systems: a Python/SOAP approach

Remi Jolin, SysGroup France.
remi.jolin@sysgroup.fr


Context

Accessing OpenVMS data from other systems is increasingly necessary. With databases, we can use techniques like ODBC. For indexed files, there are also products that present them as databases, but if you want a “lighter” solution, or need to add some processing between the data and the end user, you have to find another technology.

SOAP

A broadly deployed way to access data in a distributed environment is SOAP. Many development platforms can access SOAP servers and automatically discover functions and data formats by analysing the WSDL description of the SOAP service. So why not present at least the basic RMS functions as SOAP services for each file?

The Tools

This article will show you, step by step, how to access an indexed file, convert some specific VMS structures to Python structures, set up a SOAP server and build SOAP functions. The tools used for this project are:

  • Python: this scripting language is available on OpenVMS Alpha and IA64. It is well integrated and can access indexed files through the vms.rms package. Python has been ported to OpenVMS and is maintained and supported there by SysGroup. The current version available on OpenVMS Alpha and Itanium is based on Python 2.5.4. A port of version 2.7 is coming.
  • construct: a Python library that maps binary data to the corresponding types (integer, real, string, …). Very useful for describing the RMS record.
  • soaplib: a Python library for building SOAP servers.

We will now see how to combine these tools to set up a simple server that mimics $put, $get, … accesses to an indexed file. This solution can then be modified to present a more business-oriented interface by adding specific processing between one or more files and the SOAP interface.

Our Indexed File

We have a simple indexed file named test.dat with 2 indexes. In Pascal, the record definition would be

rec = record
  part_number : [key(0)] integer;
  description : [key(1, duplicates)] packed array[1..132] of char;
  price : integer64; ! the price in cents
  last_update_timestamp : integer64; ! A VMS timestamp
end;


The record is 152 bytes long.
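
The 152-byte figure can be cross-checked on any system with Python's standard struct module. This is only a sketch: it is not the construct-based definition used in the rest of this article, the helper names pack_rec and unpack_rec are hypothetical, and it assumes little-endian fields with no padding and a description stored as space-padded ASCII.

```python
import struct

# Assumed layout: int32, 132-byte string, int64, int64.
# '<' means little-endian with no alignment padding.
REC_FORMAT = '<i132sqq'
REC_SIZE = struct.calcsize(REC_FORMAT)  # 4 + 132 + 8 + 8

def pack_rec(part_number, description, price, timestamp):
    """Build the raw record bytes; the description is space-padded to 132 bytes."""
    desc = description.encode('ascii').ljust(132)[:132]
    return struct.pack(REC_FORMAT, part_number, desc, price, timestamp)

def unpack_rec(raw):
    """Split raw record bytes into a dict of Python values."""
    part_number, desc, price, timestamp = struct.unpack(REC_FORMAT, raw)
    return {'part_number': part_number,
            'description': desc.decode('ascii').rstrip(),
            'price': price,
            'last_update_timestamp': timestamp}
```

The timestamp is left here as a raw 64-bit integer; converting it to a date is a separate step.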

Accessing the indexed file data

This is done with two packages: vms.rms.IndexedFile, which handles RMS access, and construct, which describes the record layout.


Defining the record structure

We first need to define the format of the record with the construct package:

from construct import Struct, SLInt32, SLInt64, String

test_struct = Struct('test',
    SLInt32('part_number'),
    String('description', 132),
    SLInt64('price'),
    VMSTimeStamp('last_update_timestamp'))

and the keys:

key0_struct = Struct('key0', SLInt32('part_number'))
key1_struct = Struct('key1', String('description', 132))


Timestamp conversion

Of course, VMSTimeStamp is not defined in the construct package, so we have to define how to convert a VMS timestamp to a Python datetime and vice versa.

On VMS, $numtim converts the OpenVMS internal time format to a 7-word array and lib$cvt_vectim converts such an array back to the internal format. $numtim is available in vms.starlet and lib$cvt_vectim is in vms.rtl.lib.

from vms.starlet import numtim
from vms.rtl.lib import cvt_vectim
from datetime import datetime
from construct import Adapter, SLInt64

def v2p_datetime(vms_timestamp):
  ''' converts a VMS timestamp to a Python datetime '''
  state, time_array = numtim(vms_timestamp)
  # a VMS status with the low bit clear indicates a failure
  if state % 2 != 1:
    raise ValueError('VMS error code : %i' % state)
  return datetime(*time_array[:6]) # to simplify, we get rid of 100th of seconds

def p2v_datetime(dt):
  ''' converts a Python datetime to a VMS timestamp '''
  state, vms_time = cvt_vectim((dt.year, dt.month, dt.day, dt.hour,
                                dt.minute, dt.second,
                                int(dt.microsecond / 10000)))
  if state % 2 != 1:
    raise ValueError('VMS error code : %i' % state)
  return vms_time

class VMSTimeStampAdapter(Adapter):
  def _decode(self, obj, context):
    return v2p_datetime(obj)

  def _encode(self, obj, context):
    return p2v_datetime(obj)

def VMSTimeStamp(name):
  ''' the construct VMSTimeStamp type '''
  return VMSTimeStampAdapter(SLInt64(name))
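
On a system where vms.starlet is not available, the same conversion can be approximated with plain date arithmetic, because an OpenVMS absolute time is a 64-bit count of 100-nanosecond ticks since 17-NOV-1858 00:00:00. The sketch below rests on that definition only; the function names are hypothetical, it ignores time zones (the value is taken as the local time of the VMS system) and it does not handle negative values, which denote delta times.

```python
from datetime import datetime, timedelta

# OpenVMS base date: 17-NOV-1858 00:00:00, counted in 100-nanosecond ticks.
VMS_EPOCH = datetime(1858, 11, 17)
TICKS_PER_SECOND = 10 ** 7

def vms_ticks_to_datetime(ticks):
    """Convert a VMS absolute time to a datetime (sub-microsecond part dropped)."""
    return VMS_EPOCH + timedelta(microseconds=ticks // 10)

def datetime_to_vms_ticks(dt):
    """Convert a naive datetime to a VMS absolute time."""
    delta = dt - VMS_EPOCH
    seconds = delta.days * 86400 + delta.seconds
    return seconds * TICKS_PER_SECOND + delta.microseconds * 10
```

Unlike the $numtim version, this keeps the fractional seconds down to the microsecond.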


Dealing with the indexed file

We will use the vms.rms.IndexedFile package to define a specific class for our file. This class will handle the record conversion between the VMS structure and the Python structure. To simplify the example, the file will be opened and closed for each request.

from vms.rms.IndexedFile import IndexedFile

class TestFile(IndexedFile):
  name = 'test.dat'

  def __init__(self):
    super(TestFile, self).__init__(self.name,
                                   test_struct,
                                   auto_open_close=False)

  def __enter__(self):
    ''' will be called automatically when 'with TestFile() as f' is executed
    '''
    self.open()
    return self

  def __exit__(self, type, value, traceback):
    ''' will be called automatically when the 'with …' terminates '''
    self.close()

  def primary_keynum(self):
    return 0


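We now have routines to find, put, get and update a record in our file.

For example, getting the first record whose key 0 is greater than or equal to 12345 (on Python 2.5, the with statement requires from __future__ import with_statement at the top of the module):

with TestFile() as f:
  rec = f.get(0, 12345, RAB_M_KGE)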
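
The three match modes used in this article (exact match, RAB_M_KGT and RAB_M_KGE) can be pictured on a sorted list of keys. The sketch below only imitates the lookup semantics in memory with the standard bisect module; find_position is a hypothetical helper and is not part of vms.rms.

```python
import bisect

def find_position(sorted_keys, key, mode='='):
    """Return the index of the first key matching 'key' under 'mode', or None.

    mode '='  : first key equal to 'key'
    mode '>=' : first key greater than or equal to 'key'
    mode '>'  : first key strictly greater than 'key'
    """
    if mode == '>':
        pos = bisect.bisect_right(sorted_keys, key)
    else:
        pos = bisect.bisect_left(sorted_keys, key)
    if pos >= len(sorted_keys):
        return None          # ran past the last key: no such record
    if mode == '=' and sorted_keys[pos] != key:
        return None          # exact match requested but key is absent
    return pos

# part numbers of the sample file used later in this article
part_numbers = [2, 5, 6, 10]
```

Once a position is found, reading sequentially from it corresponds to iterating over the file after a successful find.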


The SOAP server

To build the SOAP server, we will use soaplib. Unlike the older SOAPpy package, which could be used for both servers and clients, soaplib is dedicated to building SOAP servers.

For each structured record we will use with SOAP, we need to subclass ClassModel. Here we define the SOAP structure corresponding to our rec record type:


from soaplib.core.model.clazz import ClassModel
from soaplib.core.model.primitive import String, Integer, DateTime

class Int(Integer):
  pass

class RecModel(ClassModel):
  __namespace__ = 'rec'
  part_number = Int()
  description = String()
  price = Int()
  last_update_timestamp = DateTime()

  @classmethod
  def from_struct(cls, struct):
    ''' converts a test_struct record to a RecModel instance '''
    rec = cls()
    rec.part_number = struct.part_number
    rec.description = struct.description
    rec.price = struct.price
    rec.last_update_timestamp = struct.last_update_timestamp
    return rec


Then we need to define the functions that will be callable by any SOAP client. These functions are methods of a subclass of DefinitionBase. To act as SOAP functions, the methods must be “decorated” with @soap(list of parameter types).


from construct import Container
from soaplib.core.service import DefinitionBase, soap
from soaplib.core.model.clazz import Array

# RAB_M_KGT / RAB_M_KGE: RMS "key greater than" / "key greater than or equal" options
indexedfile_access_type = {'=': 0,
                           '>': RAB_M_KGT,
                           '>=': RAB_M_KGE,
                          }

class FileManager(DefinitionBase):
  # this method gets 3 integer parameters and returns an array of RecModel
  @soap(Int, Int, Int, _returns=Array(RecModel))
  def get_seq(self, keynum=0, rec_nb=1, offset=0):
    ''' this function will sequentially retrieve rec_nb records
        ordered by keynum, after jumping over offset records
    '''
    # if you don't provide values on the SOAP client side, the corresponding
    # parameters get a value of None, so let's test and put the right values
    if keynum is None:
      keynum = 0
    if rec_nb is None:
      rec_nb = 1
    if offset is None:
      offset = 0
    result = []
    with TestFile() as f:
      f.rewind(keynum)
      # skip 'offset' records
      for k in range(offset):
        f.next()
      rec_nb -= 1
      for k, rec in enumerate(f):
        result.append(RecModel.from_struct(rec))
        if k >= rec_nb:
          break
    return result

  def get_from_key(self, key_num, key_struct, access_type, rec_nb, offset):
    ''' returns an array of RecModel corresponding to the key and access_type
        (=, >, >=, …)
        This method cannot be called directly as a SOAP function.
    '''
    access_type = indexedfile_access_type.get(access_type, 0)
    if rec_nb is None:
      rec_nb = 1
    if offset is None:
      offset = 0
    result = []
    with TestFile() as f:
      if f.find(key_num, key_struct, access_type):
        # skip 'offset' records
        for k in range(offset):
          f.next()
        rec_nb -= 1
        for k, rec in enumerate(f):
          result.append(RecModel.from_struct(rec))
          if k >= rec_nb:
            break
    return result

  @soap(Int, String, Int, Int, _returns=Array(RecModel))
  def get_from_key_0(self, key, access_type, rec_nb, offset):
    ''' get records with key 0 '''
    key0 = key0_struct.build(Container(part_number=key))
    return self.get_from_key(0, key0, access_type, rec_nb, offset)

  @soap(String, String, Int, Int, _returns=Array(RecModel))
  def get_from_key_1(self, key, access_type, rec_nb, offset):
    ''' get records with key 1 '''
    key1 = key1_struct.build(Container(description=key.ljust(132)[:132]))
    # the string must be exactly 132 characters long...
    return self.get_from_key(1, key1, access_type, rec_nb, offset)

  @soap(RecModel, _returns=String)
  def update(self, record):
    ''' update a record

        previous record is retrieved using the primary key
    '''
    key = key0_struct.build(Container(part_number=record.part_number))
    with TestFile() as f:
      rec_struct = f.get(0, key, 0)
      if rec_struct:
        rec_struct.description = record.description.ljust(132)[:132]
        rec_struct.price = record.price
        # we force the time stamp to the system current datetime
        rec_struct.last_update_timestamp = datetime.now()
        f.update_current(rec_struct)
        return 'ok'
      return 'unknown record'
    return 'error'

  @soap(RecModel, _returns=String)
  def put(self, record):
    ''' put a new record in the file '''
    struct = Container()
    struct.part_number = record.part_number
    struct.description = record.description.ljust(132)[:132]
    struct.price = record.price
    struct.last_update_timestamp = datetime.now()
    with TestFile() as f:
      f.put(struct)
      return 'ok'
    return 'error'

  @soap(RecModel, _returns=String)
  def del_rec(self, record):
    ''' delete a record '''
    key = key0_struct.build(Container(part_number=record.part_number))
    with TestFile() as f:
      if f.find(0, key, 0):
        f.delete_current()
        return 'ok'
      return 'unknown record'
    return 'error'
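
The rec_nb/offset handling shared by get_seq and get_from_key amounts to "skip offset records, then return at most rec_nb of them". As an illustration only, the same paging rule can be written with itertools.islice on any iterable; the page function is hypothetical and works on an in-memory list rather than on an RMS file, so it does not show what happens when the skip runs past the end of a real file.

```python
from itertools import islice

def page(records, rec_nb=None, offset=None):
    """Skip 'offset' records, then return at most 'rec_nb' records as a list."""
    # a SOAP client that omits a parameter sends None, so apply the defaults here
    if rec_nb is None:
        rec_nb = 1
    if offset is None:
        offset = 0
    return list(islice(records, offset, offset + rec_nb))

# descriptions of the sample file, in key 1 order
descriptions = ['Hammer', 'Nails (1x100) 100pc',
                'SCREW DRIVER', 'SCREW DRIVER (big)']
```

Asking for more records than exist simply returns what is available, which matches the behaviour of the loops above when the file iterator is exhausted.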


Now, we must set up the server.


import soaplib
from soaplib.core.server import wsgi

def main_soap():
  try:
    from wsgiref.simple_server import make_server
    soap_application = soaplib.core.Application([FileManager], 'tns')
    wsgi_application = wsgi.Application(soap_application)

    host = 'sg1.sysgroup.fr'
    port = 7888
    print "listening to http://%s:%s" % (host, port)
    print "wsdl is at: http://%s:%s/?wsdl" % (host, port)

    server = make_server(host, port, wsgi_application)
    server.serve_forever()

  except ImportError:
    print "Error: example server code requires Python >= 2.5"

if __name__ == '__main__':
  main_soap()
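
What make_server expects is simply a WSGI callable, which is what wsgi.Application provides. To show that contract without soaplib, here is a minimal stand-in using only the standard library; demo_app and build_server are hypothetical names and the XML bodies are placeholders, not real WSDL or SOAP output.

```python
from wsgiref.simple_server import make_server

def demo_app(environ, start_response):
    """Placeholder WSGI callable; a real SOAP application would parse the request."""
    if environ.get('QUERY_STRING') == 'wsdl':
        body = b'<definitions/>'   # placeholder, not a real WSDL
    else:
        body = b'<Envelope/>'      # placeholder, not a real SOAP response
    start_response('200 OK', [('Content-Type', 'text/xml'),
                              ('Content-Length', str(len(body)))])
    return [body]

def build_server(host='localhost', port=7888):
    """Bind demo_app to host:port; call serve_forever() on the result to run it."""
    return make_server(host, port, demo_app)
```

Because the server only sees a callable, the same application object can later be hosted by another WSGI-capable web server without changing the SOAP code.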


And that's it!


Ensuring that a record did not change between calls to get_from_key_0/1 and update

We now want to make sure that the record we are about to update or delete has not been changed by another user between the time we fetched it and the time we update or delete it. SOAP calls are made over HTTP(S), so there is no record locking between calls and no session persistence.


We will slightly modify RecModel to add a soap_checksum field. This field is calculated for each record fetched from the file (MD5 of the whole record). The user is not supposed to change this field. During an update or a delete, the checksum of the record currently in the file is compared with the checksum transmitted by the user in the RecModel of the update/delete call. If they do not match, the record has been modified in the meantime and the update or delete is not performed.


You can still force an update or a delete by replacing the checksum value of the “record” parameter with an empty string.
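
The check described above can be sketched with the standard hashlib module. This is an illustration under assumptions, not the code from the source kit: the function names are hypothetical, and the record is represented by its raw bytes.

```python
import hashlib

def record_checksum(raw_record):
    """MD5 hex digest of the whole raw record (bytes)."""
    return hashlib.md5(raw_record).hexdigest()

def check_before_write(current_raw, client_checksum):
    """Return 'ok' if the write may proceed, 'invalid checksum' otherwise.

    current_raw is the record as it is in the file right now;
    client_checksum is the value the client got when it fetched the record.
    An empty client checksum forces the write.
    """
    if not client_checksum:
        return 'ok'
    if client_checksum == record_checksum(current_raw):
        return 'ok'
    return 'invalid checksum'
```

The server keeps no state between calls: the client carries the checksum back, and the comparison is made against the record as read from the file at update time.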


You'll find this version in the complete source kit available on the SysGroup repository (see References). The examples below are based on this version, so you will see this new soap_checksum field in the RecModel records.



And now, let's play with our soap server...

First, we start the server:
$ python my_soap_server.py
DEBUG:soaplib.core.service:building public methods
DEBUG:soaplib._base:adding method '{tns}del_rec'
DEBUG:soaplib._base:adding method '{tns}get_from_key_0'
DEBUG:soaplib._base:adding method '{tns}get_from_key_1'
DEBUG:soaplib._base:adding method '{tns}get_seq'
DEBUG:soaplib._base:adding method '{tns}put'
DEBUG:soaplib._base:adding method '{tns}update'
listening to http://sg1.sysgroup.fr:7888
wsdl is at: http://sg1.sysgroup.fr:7888/?wsdl


Now we can check that it is responding correctly. The first thing to do is to fetch the WSDL definition in a web browser.


Open Firefox with this URL: http://sg1.sysgroup.fr:7888/?wsdl


You should get something looking like this (truncated):

[Screenshot: the WSDL document displayed in the browser]

You can see there the definitions of RecModel, del_rec and get_from_key_0. The WSDL definition will be used by SOAP clients to discover (introspect) the functions and parameters the service offers.


We will now use a simple Python SOAP client: suds. Here are examples from a Linux machine (it also works on VMS).


rj@LRJN1:~$ python
Python 2.5.2 (r252:60911, Jan 20 2010, 21:48:48)
[GCC 2.4.4 (Ubuntu 4.2.4-lubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from suds.client import Client


Create a new “client”. We disable caching during tests so that we can modify the server and get the current definition each time we call client = Client(...).


>>> url = 'http://sg1.sysgroup.fr:7888/?wsdl'
>>> client = Client(url, cache=None)


Let's see what we've got:


>>> print client

Service ( Application ) tns="tns"
  Prefixes (2)
    ns0 = "rec"
    ns1 = "tns"
  Ports (1):
    (Application)
      Methods (6):
        del_rec(ns0:RecModel record, )
        get_from_key_0(xs:int key, xs:string access_type, xs:int rec_nb, xs:int offset, )
        get_from_key_1(xs:string key, xs:string access_type, xs:int rec_nb, xs:int offset, )
        get_seq(xs:int keynum, xs:int rec_nb, xs:int offset, )
        put(ns0:RecModel record, )
        update(ns0:RecModel record, )
      Types (14):
        ns0:RecModel
        ns0:RecModelArray
        del_rec
        del_recResponse
        get_from_key_0
        get_from_key_0Response
        get_from_key_1
        get_from_key_1Response
        get_seq
        get_seqResponse
        put
        putResponse
        update
        updateResponse


We can see the available functions (methods) and the specific types used in this example. Then, some examples using the functions...


First, retrieve a record (key0 = 2)
>>> r = client.service.get_from_key_0(2, '=')
>>> r.RecModel
[(RecModel){
   description = "SCREW DRIVER
      "
   last_update_timestamp = 2011-08-12 12:34:56
   price = 1040
   soap_checksum = "5969f675052aaf8a793c3c639307a6c3"
   part_number = 2
  }]

We get an array of 1 RecModel. Get its first item...

>>> rec = r.RecModel[0]
>>> rec
(RecModel){
  description = "SCREW DRIVER
    "
  last_update_timestamp = 2011-08-12 12:34:56
  price = 1040
  soap_checksum = "5969f675052aaf8a793c3c639307a6c3"
  part_number = 2
 }


Change the price and update the record


>>> rec.price = 1120
>>> client.service.update(rec)
ok


Now, if we try to update it again with the same object, we get an error: the record has changed in the file since we fetched it, so the checksum we are sending no longer matches.


>>> client.service.update(rec)
invalid checksum


Get the same record again to see the new values


>>> r = client.service.get_from_key_0(2, '=')
>>> r
(RecModelArray){
  RecModel[] =
    (RecModel){
      description = "SCREW DRIVER"
      last_update_timestamp = 2011-08-26 11:54:13
      price = 1120
      soap_checksum = "a5401491d7ca5ca56adc77cac148d85a"
      part_number = 2
    },
}


Get up to 10 records sorted by key 1


>>> client.service.get_from_key_1('', '>=', 10)
(RecModelArray){
  RecModel[] =
    (RecModel){
      description = "Hammer
"
      last_update_timestamp = 2011-08-13 15:34:56
      price = 980
      soap_checksum = "b8b3f602fd34dfa845a310a07be8c2f4"
      part_number = 5
    },
    (RecModel){
      description = "Nails (1x100) 100pc"
      last_update_timestamp = 2011-07-10 08:12:56
      price = 400
      soap_checksum = "2fb95d33c9aa2a6ac4a2e478431fbada"
      part_number = 6
    },
    (RecModel){
      description = "SCREW DRIVER"
      last_update_timestamp = 2011-08-26 11:54:13
      price = 1120
      soap_checksum = "a5401491d7ca5ca56adc77cac148d85a"
      part_number = 2
    },
    (RecModel){
      description = "SCREW DRIVER (big)
"
      last_update_timestamp = 2011-08-20 17:10:40
      price = 1560
      soap_checksum = "2ac83aa9d398db359d1e01e114f326e1"
      part_number = 10
    },
}


There are only 4 records in this file...


Get a record where key0 > 2
>>> r = client.service.get_from_key_0(2, '>')
>>> r
(RecModelArray){
  RecModel[] =
    (RecModel){
      description = "Hammer
"
      last_update_timestamp = 2011-08-13 15:34:56
      price = 980
      soap_checksum = "b8b3f602fd34dfa845a310a07be8c2f4"
      part_number = 5
    },
}


Delete this record


>>> client.service.del_rec(r.RecModel[0])
ok
>>>


Add a new record:


>>> rec = client.factory.create('ns0:RecModel')
>>> rec
(RecModel){
  description = None
  last_update_timestamp = None
  price = None
  soap_checksum = None
  part_number = None
 }
>>> rec.part_number=20
>>> rec.description = "New item"
>>> rec.price=12345
>>> client.service.put(rec)
ok




What to do next?

Handling the workload

By running the service under the WASD web server, you can take advantage of its ability to adapt to the load by dynamically creating and deleting the processes that run the CGIplus services (see CGIplus and “request throttling” in the WASD configuration manual).


Testing the service

The server described in this article is up and running. You can test it. The test.dat file is rebuilt every hour, so feel free to update, delete, create records in it. You can also download the complete source of the server and test it completely at home. The kit also includes a patch for vms.rms.IndexedFile and soaplib.


Accessing the server from other languages

SOAP clients are widely available and often well integrated into the development tools of each language. For example, you can build a simple Flex/Flash interface to any SOAP server in minutes with Flash Builder's WebService component (see Retrieving and handling data using WebService (16:15) in
http://www.adobe.com/devnet/flex/videotraining.html#day2)
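
Whatever the client language, what travels over HTTP is an XML envelope. As a rough illustration, the sketch below builds a SOAP 1.1 request for get_from_key_0 with the standard xml.etree module. The envelope namespace is the standard SOAP 1.1 one; the 'tns' namespace, method name and parameter names are taken from the suds listing earlier in this article, but the exact element layout the server expects (in particular whether parameter elements are namespace-qualified) is an assumption and should be checked against the WSDL. build_request is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = 'http://schemas.xmlsoap.org/soap/envelope/'  # SOAP 1.1 envelope namespace
TNS = 'tns'  # target namespace shown in the suds listing

def build_request(method, params):
    """Return a SOAP 1.1 envelope (bytes) calling 'method'.

    params is an ordered list of (name, value) pairs.
    """
    envelope = ET.Element('{%s}Envelope' % SOAP_ENV)
    body = ET.SubElement(envelope, '{%s}Body' % SOAP_ENV)
    call = ET.SubElement(body, '{%s}%s' % (TNS, method))
    for name, value in params:
        # assumption: parameters are child elements in the same namespace
        ET.SubElement(call, '{%s}%s' % (TNS, name)).text = str(value)
    return ET.tostring(envelope)

request = build_request('get_from_key_0', [('key', 2), ('access_type', '=')])
```

In practice a generated client (suds, Flash Builder, and so on) produces this envelope from the WSDL, so hand-building it is mainly useful for debugging.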

References