Thanks to a lot of people.
William_Bochnik_at_acml.com Suggested Perl, Perl, Perl
"Angel R. Rivera" <angel_at_wolf.com> Suggested I use XML.
Mark.Deiss_at_acs-gsg.com Gave a sed script.
I STILL haven't learned Perl, and I've never even smelled XML. "sed", however,
is a different story.
I'm donating Mark's solution.
sed -ne '
# only going to operate on patterns prefaced with the <td...> html tag
/<td/{
# this is a re-entry point to permit continued operation spanning
# multiple lines
: branch1
# check whether the pattern space has the closing </td> html tag
/<\/td>/{
# do some housekeeping edits
# remove any multi-line carriage returns, pad with spaces
# remove the <td...> and </td> html tags
s/\n/ /g
s/<td[^>]*>//
s/<\/td>//
# print the resulting pattern space
p
# branch to the end of the loop
b
}
# have not found closing </td> html tag, read another line into the
# pattern space
N
# jump to the top of the inner loop and check for the </td> html
# closing tag
b branch1
} ' your_html_filename
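For anyone who wants to try it, here is a minimal sketch of a test run. The file name sample.html, its contents, and the extract_td wrapper function are made up for illustration and are not part of Mark's post. The comments are stripped for brevity, and the closing-tag address is written as /<\/td>/{ with both delimiting slashes, which sed needs in order to parse it.

```shell
# Hypothetical sample input: one cell on a single line, one cell split
# across two lines. The file name sample.html is an assumption.
cat > sample.html <<'EOF'
<table>
<tr>
<td>one</td>
<td class="x">two
three</td>
</tr>
</table>
EOF

# Illustrative wrapper around the same sed commands, comments removed.
extract_td() {
  sed -ne '
/<td/{
:branch1
/<\/td>/{
s/\n/ /g
s/<td[^>]*>//
s/<\/td>//
p
b
}
N
b branch1
}' "$1"
}

extract_td sample.html
```

With this input the run should print "one" and then "two three", each on its own line. Since the two tag substitutions carry no g flag, only the first <td...> and the first </td> in each gathered chunk are removed, so the script looks best suited to files with one cell per line.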
Nix.
Received on Sun May 06 2001 - 11:13:25 NZST