HP OpenVMS Systems: Ask the Wizard
The Question is:

Dear Wizard,

I've got a sequential file from which I need to remove the carriage returns at the end of each line. I've tried converting the file with an FDL to have no carriage return, but it makes no difference. I know the carriage returns can be removed on UNIX using chomp. How can I do this on VMS? The file is 200,000 blocks in size.

Regards, Martin.

The Answer is:
This depends on the current format of the file. The OpenVMS Wizard
assumes that the reason for the CR characters is that this is a STREAM
file copied from a Microsoft MS-DOS or Microsoft Windows system, as this
is a common reason for seeing apparently extraneous CR characters embedded
within a file.
RMS recognises three types of stream files:
1) STREAM_LF - in which records are delimited by an LF character
2) STREAM_CR - in which records are delimited by a CR character
3) STREAM - in which records are delimited by an LF character,
a CR+LF character pair, or an FF or VT character
Often, text files from MS-DOS or Windows systems will have records ending
with a CR+LF pair. When such a file is copied onto a VMS system as a
STREAM_LF file, the CR character becomes part of the data stream and
therefore will appear at the end of each record.
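To make the effect concrete, here is a small Python sketch that models how the three stream formats listed above would divide the same bytes into records. It is an illustration only, not RMS code: the function name `split_records` is hypothetical, delimiters are simply dropped, and RMS details such as maximum record size are ignored.

```python
import re

def split_records(data: bytes, rfm: str) -> list:
    """Split raw bytes into records under a simplified model of an RMS stream format."""
    if rfm == "STREAM_LF":
        pattern = rb"\n"                    # LF only
    elif rfm == "STREAM_CR":
        pattern = rb"\r"                    # CR only
    elif rfm == "STREAM":
        # CR+LF is listed first so the pair counts as one delimiter;
        # LF, FF (0x0C) and VT (0x0B) also end a record.
        pattern = rb"\r\n|\n|\x0c|\x0b"
    else:
        raise ValueError("unknown record format: " + rfm)
    records = re.split(pattern, data)
    if records and records[-1] == b"":
        records.pop()                       # drop the empty piece after the final delimiter
    return records

dos_text = b"0.18\r\n01 Feb\r\n"            # two lines as written on MS-DOS or Windows
print(split_records(dos_text, "STREAM_LF"))  # [b'0.18\r', b'01 Feb\r']
print(split_records(dos_text, "STREAM"))     # [b'0.18', b'01 Feb']
```

Under the STREAM_LF interpretation every record keeps a trailing CR as data, while the STREAM interpretation absorbs it into the delimiter.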
You can check if your file falls into this category with the following
two commands:
$ DIRECTORY/FULL filespec
Check that the Record Format is Stream_LF:
Record format: Stream_LF, maximum 0 bytes, longest 0 bytes
and that the records contain a CR+LF pair. Be sure to dump enough
blocks to see the ends of several records:
$ DUMP/BLOCK=COUNT:1 filespec
31310962 65462031 300A0D38 312E3009 .0.18..01 Feb.11 000020
                    ^^^^
0A0D is a CR+LF pair (remember that the hex dump reads right to left!).
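If reading the hex right to left is awkward, the following Python sketch turns the hex portion of one DUMP line back into bytes in file order. The helper name `decode_dump_line` is hypothetical, and it assumes you pass only the hex groups, without the ASCII column or the offset.

```python
def decode_dump_line(hex_groups: str) -> bytes:
    """Return the bytes of one DUMP line in file order.

    DUMP prints the lowest-addressed byte at the far right, so reversing
    the whole byte sequence restores the order in which the bytes are stored.
    """
    raw = bytes.fromhex("".join(hex_groups.split()))
    return raw[::-1]

line = "31310962 65462031 300A0D38 312E3009"   # hex groups from the dump above
data = decode_dump_line(line)
print(data)              # b'\t0.18\r\n01 Feb\t11'
print(b"\r\n" in data)   # True: the record ends with CR+LF
```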
If your file satisfies BOTH these conditions, you have two choices for
removing the CR from your data. The first doesn't actually remove the
character, it just tells RMS that the CR is part of the record
delimiter:
$ SET FILE/ATTRIBUTE=(RFM=STM) filespec
Note that this does not involve any conversion or copying of data. The
DIRECTORY/FULL command will now display the record format as:
Record format: Stream, maximum 0 bytes, longest 0 bytes
and applications reading the file will no longer "see" the embedded CR
character.
If you really must physically remove the CR character, you can now do
so with a simple CONVERT command:
$ SET FILE/ATTRIBUTE=(RFM=STM) filespec
$ CONVERT/FDL=SYS$INPUT filespec newfilespec
RECORD
FORMAT STREAM_LF
$
The first command tells RMS that the record delimiter is CR+LF, as before.
The second performs a conversion of the file to STREAM_LF format, so when
the new file is created, records will be delimited by a single LF character.
If your file is NOT a STREAM_LF file, the above will not work. You can
either write a program to remove the CR character, use Perl or a similar
tool, or use a text editor. For example, using EVE, the following
keystrokes will remove all "visible" CR characters from any file (though
with a file the size of yours it might take a while!):
$ EDIT/TPU filespec
Prompt                        Keystroke(s)   Explanation
none                          <DO>           Enter command mode
Command:                      REPLACE<CR>    Enter the REPLACE command
Old String:                   <CTRL V>       Used to enter control characters
Press the key to be added:    <CR>           Enter CR as data
                              <CR>           Terminate the old string
New String:                   <CR>           Terminate the new string
Replace? Type Yes, No, All, Last, or Quit:
                              A<CR>          Replace all instances
                              <CTRL Z>       Write new file and exit
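For a file too large to edit comfortably, the "write a program" option might look like the Python sketch below. It is a sketch under assumptions, not a tested OpenVMS utility: the function name `strip_cr_before_lf` and the chunk size are hypothetical, the input is treated as a plain byte stream, and only a CR immediately followed by an LF is removed (any other CR is left alone).

```python
import io

def strip_cr_before_lf(src, dst, chunk_size=65536):
    """Copy src to dst, turning each CR+LF pair into a single LF.

    The file is processed in chunks so that a very large file never has
    to fit in memory. Returns the number of CR characters removed.
    """
    pending_cr = False   # True when the previous chunk ended with a CR
    removed = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        if pending_cr:
            chunk = b"\r" + chunk        # put back the CR held over from last time
            pending_cr = False
        if chunk.endswith(b"\r"):
            chunk = chunk[:-1]           # hold the CR until we see what follows it
            pending_cr = True
        removed += chunk.count(b"\r\n")
        dst.write(chunk.replace(b"\r\n", b"\n"))
    if pending_cr:
        dst.write(b"\r")                 # a trailing CR with no LF is kept as data
    return removed

# Small in-memory demonstration; the chunk size of 9 forces a CR+LF pair
# to straddle a chunk boundary.
src = io.BytesIO(b"line one\r\nline two\r\n")
dst = io.BytesIO()
print(strip_cr_before_lf(src, dst, chunk_size=9))  # 2
print(dst.getvalue())                              # b'line one\nline two\n'
```

With real files, both would be opened in binary mode (for example `open(name, "rb")` and `open(newname, "wb")`) so that no newline translation is applied along the way.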
Here is a DCL procedure which will remove ONE CR character from each
record in a sequential file and produce a new file (with VFC format
records).
$ ! Remove one CR character from each record of a sequential file
$ IF p1.EQS."" THEN INQUIRE p1 "Input file"
$ IF p2.EQS."" THEN INQUIRE p2 "Output file"
$ ON WARNING THEN GOTO Cleanup
$ ON CONTROL_Y THEN GOTO Cleanup
$ OPEN/READ in 'p1'
$ OPEN/WRITE out 'p2'
$ cr[0,8]=13                      ! CR character
$ loop: READ/END=Cleanup in line
$ line=line-cr                    ! Remove the first CR found in the record
$ WRITE out line
$ GOTO loop
$ Cleanup:
$ CLOSE in
$ CLOSE out
$ EXIT
If you require a non-VFC file, use CREATE/FDL, COPY NLA0: filename,
or another tool to create a file with a non-VFC sequential format, then
use OPEN/APPEND on the file. (The DCL OPEN command defaults to VFC.)