Summary: stupid scripting question from Danielle Georgette on 2000-05-08 (tru64-unix-managers)

From: Danielle Georgette <Danielle.Georgette_at_asx.com.au>
Date: Mon, 08 May 2000 12:03:08 +1000

Morning all,

True to form, this list is great and you people are all wonderful :). I
think I received 10 answers in 10 minutes, with many more to follow. The
most illuminating answers to the 'why' component of my question came from
Bob Vickers and Tom Webster - to paraphrase:

The 'for' shell builtin will set $i to each token in text.txt, where the
tokens are separated by any white space (including an end of line char).

The answers to the scripting component were many and varied, and all great,
thanks !

Of course, methods and tools varied hugely, with awk, sed and perl solutions
proposed (examples of each style are attached). Awk with a printf statement
does seem to be the most elegant solution, but they all taught me something.
A few respondents offered ways to stop echo including a \n at the end of its
output line (using a -n or \\c with the echo statement). This would have
prevented the echo used in the cut statements from breaking up the line. As
it was, the problem was happening before the cut, but otherwise this might
have been a fix.
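
A small check of why the trailing newline was not the culprit here: command
substitution (backticks) strips trailing newlines from whatever it captures,
so the newline written by echo never reaches the variable. The sample line
below is made up for illustration, and printf '%s' is shown as one portable
way to print without a newline.

```shell
#!/bin/sh
# Hypothetical sample line (three 10-character columns), not from the
# original file.
line='a1        a2        a3'

# Backticks drop trailing newlines, so the newline added by echo (and
# passed through by cut) does not end up in colA. Trailing spaces stay.
colA=`echo "$line" | cut -c 1-10`
printf '[%s]\n' "$colA"

# printf '%s' writes its argument with no trailing newline, so wc -l
# counts zero newline characters here.
printf '%s' "$line" | wc -l
```

Where echo -n or \c is not understood by the local echo, printf is usually
the safer choice.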

Thanks to:

Shell script answers:
Bob Vickers [bobv_at_dcs.rhbnc.ac.uk]
Tom Webster [webster_at_ssdpdc.lgb.cal.boeing.com]
Hugh Pritchard [Hugh.Pritchard_at_WCom.com]
Joe comunale [jbc_at_forbin.qc.edu]
Margus Liiv [ml_at_kungla.ee]
Sean O'Connell [sean_at_stat.Duke.EDU]
Roetman, Paul [Paul_Roetman_at_CSXLines.com]
Ralph Wegner [wegner_at_hst.rwth-aachen.de]
Gavin Kreuiter [gavin_at_transactive.usko.com]
Awk, Sed and Perl answers:
Claude Charest [charest_at_Canr.Hydro.Qc.Ca]
Joerg Bruehe [joerg_at_sql.de]
Vangelis Haniotakis [haniotak_at_ucnet.uoc.gr]
Dewhurst, Cy [cy.dewhurst_at_rbch-tr.swest.nhs.uk]
Magali.Bernard_at_univ-st-etienne.fr
Lucio Chiappetti [lucio_at_ifctr.mi.cnr.it]
MacDonell, Dennis [DennisMacDonell_at_auslig.gov.au]
Chad Price [cprice_at_molbio.unmc.edu]
aad_at_lovecraft.talltree.net
Suika Roberts [ssfr_at_unm.edu]

------------------------------------------------------------------

Bob Vickers [bobv_at_dcs.rhbnc.ac.uk]

The line
  for i in `cat text.txt`
will set $i to each token in text.txt, where the tokens are separated by
any white space. You also need double-quote marks in your echo
statements because echo only writes a single space between each
argument. Try

cat text.txt | while read i
do
  colA=`echo "$i" |cut -c 1-10`
  colC=`echo "$i" |cut -c 20-30`
  echo "$colA$colB$colC" >> newtext.txt
done
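
A possible variation on the loop above, as a sketch only. It assumes the
columns are exactly 10 characters wide (so column C is taken from characters
21-30), uses IFS= and -r so that read keeps leading blanks and backslashes,
and pads the output with printf instead of typeset. The function name
replace_middle is hypothetical.

```shell
#!/bin/sh
# replace_middle is a hypothetical helper name. It reads fixed-width
# lines on stdin and writes them back with the middle column set to xx.
replace_middle() {
  while IFS= read -r line
  do
    # Quoting "$line" keeps the runs of spaces intact for cut.
    colA=`printf '%s\n' "$line" | cut -c 1-10`
    colC=`printf '%s\n' "$line" | cut -c 21-30`
    # Pad the first two fields back out to 10 characters each.
    printf '%-10s%-10s%s\n' "$colA" "xx" "$colC"
  done
}

# Usage: replace_middle < text.txt > newtext.txt
```

Redirecting the whole loop once, rather than appending inside it, also
avoids reopening the output file for every line.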

Tom Webster [webster_at_ssdpdc.lgb.cal.boeing.com]

The problem is with the way that the for builtin processes lines in a
file. It is trying to parse out filenames, one per line so you can
use them as input to a command. What you are looking for is a way to
suck an entire line in at once (w/o the terminating \n if it can be
helped).

Many moons ago, that would have been a job for the line(1) command, but
it is considered obsolete and is slated for retirement. Its successor
is the read(1) command. The syntax looks a little weird, but it
works quite nicely.

I'm also using the printf command to format the output; it's more
portable and is more commonly used than typeset, so it should
be easier for someone else to maintain (when they need to look at it
in 20 years). I'm assuming from your note that all of the columns
are 10 characters wide, are left justified and are space padded
(including the last column).

Here's the script:

----- snip ----- snip ----- snip ----- snip ----- snip -----
#!/bin/ksh
touch newtext.txt
while read colA colB colC
do
        colB="xx"
        printf "%-10s%-10s%-10s\n" "$colA" "$colB" "$colC" >> newtext.txt
done < text.txt
----- snip ----- snip ----- snip ----- snip ----- snip -----

Notes:

1. The read command takes care of separating the input into three
   separate bits and assigning them to the variables.

2. If you want the fields to be right justified, just get rid of
   the dashes in the printf format.
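
To illustrate note 2, a short sketch of the two printf forms; the brackets
are only there to make the padding visible.

```shell
#!/bin/sh
# With the dash the value is left justified and padded on the right.
printf '[%-10s]\n' "xx"    # prints [xx        ]

# Without the dash the value is right justified and padded on the left.
printf '[%10s]\n' "xx"     # prints [        xx]
```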

If your needs get much more complex than this, it's time to
start looking at perl (Practical Extraction and Report Language);
this type of work is where it really shines.

Tom

Hugh Pritchard [Hugh.Pritchard_at_WCom.com]

Try something along the lines of this construct:

cat yourfile |
while read CC1 CC2 CC3 ANYTHINGELSEONLINE
do
    ...
    print ...
done > newfile

joe comunale [jbc_at_forbin.qc.edu]

Here it is: thx, it was fun :)

#!/usr/bin/ksh

col1=`cat text.txt | awk '{print $1}'`

cat /dev/null > newtext.txt

for colA in $col1
do
  colC=`cat text.txt | grep "$colA" | awk '{print $3}'`
  echo "$colA xx $colC" >> newtext.txt
done

cat newtext.txt

exit

Sean O'Connell [sean_at_stat.Duke.EDU]

Just change the first two echo's to 'echo -n'

colA=`echo -n $i |cut -c 1-10`
colC=`echo -n $i|cut -c 20-30`
echo $colA$colB$colC >> newtext.txt

the echo command automagically includes a newline at the end. It
may also ignore spaces...

It might be easier to use printf for your
output (as you can define your line format).

Roetman, Paul [Paul_Roetman_at_CSXLines.com]

in your example, the loop is

  for i in `cat file.txt` ; do
    echo $i
    echo ------
  done

results in this:

  a1
  -------
  a2
  -------
  a3
  -------
  b1
  -------
  b2
etc

instead, try this:

  while read -r a b c
  do
    echo $a $b $c
    echo --------
  done < file.txt

this will result in
   a1 a2 a3
   --------
   b1 b2 b3
   --------
   c1 c2 c3
   --------
so your new program would be something like

  while read -r a b c
  do
    echo "$a xx $c"
    echo --------
  done < file.txt

Margus Liiv [ml_at_kungla.ee]

First checkpoint:
    You must add \\c to the echo command to keep a newline character
out of the assigned column values, as follows:
    colA=`echo $i \\c | cut -c 1-10`

Second checkpoint:
    Is the file newtext.txt formatted the same way as text.txt?
    If so, a sed script is a better way to reformat it, as follows:
    sed -e 's/\(..........\)..........\(.*\)/\1xx \2/' < text.txt > newtext.txt

Dewhurst, Cy [cy.dewhurst_at_rbch-tr.swest.nhs.uk]

The following awk script would do it:

awk '{printf "%s xx %s\n", $1, $3}' fullpathtofile

The script below should include a \c to prevent echo from printing a
newline.

Claude Charest [charest_at_Canr.Hydro.Qc.Ca]

awk '{print $1 "xx" $3 }' text.txt

Joerg Bruehe [joerg_at_sql.de]

I would use 'sed' - untested:

   sed -e '1,$s/^\(..........\)..........\(.*\)/\1xx\2/' < text.txt > new.txt

1) This code relies on the exact width of 10 positions per column;
   in 'sed' a single period '.' matches any single character.
2) The code does not insert any spaces around the 'xx'.
3) See the manual on 'ed' and 'sed' to understand the command,
   esp. to adapt it if you do not like the result.
4) Test it before going productive - this is not checked !
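
Following on from note 2, a sketch of one way to keep the middle column 10
characters wide: replace it with xx followed by eight spaces. It assumes a
sed that supports the \{n\} interval syntax, and the function name
sed_middle is hypothetical.

```shell
#!/bin/sh
# sed_middle is a hypothetical helper name. It keeps the first 10
# characters, then replaces the next 10 with "xx" plus 8 spaces, so the
# rest of the line stays in its original position.
sed_middle() {
  sed -e 's/^\(.\{10\}\).\{10\}/\1xx        /'
}

# Usage: sed_middle < text.txt > new.txt
```

Lines shorter than 20 characters do not match the pattern and pass through
unchanged.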

Vangelis Haniotakis [haniotak_at_ucnet.uoc.gr]

I'd heartily recommend using Perl for this sort of work. Perl is your
friend, and this sort of work can be done in perl in a (very) few lines.
Go buy the Camel book (Programming Perl, by Larry Wall), you'll be glad
you did.

Anyway, without further ado:

#!/bin/perl
open TEXT, "<text.txt" or die "text.txt: $!";
open OUT, ">out.txt" or die "out.txt: $!";
while (<TEXT>) {
  ($a, $b, $c) = split;    # split $_ on white space
  print OUT "$a xx $c\n";
}
close TEXT;
close OUT;

Lucio Chiappetti [lucio_at_ifctr.mi.cnr.it]
You should use a dedicated tool ... In awk the thing can be done as follows:

    awk -f myscript.awk < infile > outfile

where myscript.awk is

{print $1,"xx",$3}

Variations on the theme are possible like :

    awk '{print $1,"xx",$3}' < infile > outfile

or even invoking myscript.awk if you make it executable and prefix it
with a line #!/usr/bin/awk -f.

Note : the above takes the first and third field of each line, replaces
the second with xx and separates them with a single space. If you want to
preserve 10-char columns, replace print with an appropriate printf.
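
A sketch of the printf variant mentioned in the note, assuming three
left-justified columns of 10 characters each. The function name awk_middle
is hypothetical.

```shell
#!/bin/sh
# awk_middle is a hypothetical helper name. awk splits each line on
# white space; printf then pads every field back out to 10 characters.
awk_middle() {
  awk '{ printf "%-10s%-10s%-10s\n", $1, "xx", $3 }'
}

# Usage: awk_middle < infile > outfile
```

Because awk splits on white space, a line with an empty first or second
column would have its fields shifted; the sketch assumes every line has all
three fields.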

+-----------------------------------+---------------------------------+
| Danielle Georgette                | Unix is very user friendly, its |
| Unix Admin                        | just rather particular about    |
| danielle.georgette_at_asx.com.au  | who it makes friends with.      |
+-----------------------------------+---------------------------------+
| All opinions are my own unless clearly stated otherwise.            |
+---------------------------------------------------------------------+

Question:

Simple problem: I have a file text.txt containing space formatted text:

a1 a2 a3
b1 b2 b3
c1 c2 c3

etc, where each column is 10 spaces wide.

I want to replace all the text in the middle column with the characters xx.

#!/bin/ksh
typeset -L10 colA
typeset -L10 colB="xx"
typeset -L10 colC
touch newtext.txt
for i in `cat text.txt`
do
colA=`echo $i |cut -c 1-10`
colC=`echo $i|cut -c 20-30`
echo $colA$colB$colC >> newtext.txt
done

But my script is breaking up each input line into the three parts, so what I
see in the output is:

a1
xx
a3
b1
xx
b3

I guess it has something to do with ksh interpreting spaces in the line as
field separators, but how can I suppress/alter the script to stop this
happening? I guess it could be done (probably better) in awk, and any
such answers will be welcome - but I'd also love an explanation as to why
the above happens as well.

Thanks!
Received on Mon May 08 2000 - 02:04:38 NZST
