Hi all.
This is not Tru64 specific, but is a head-cracker for me.
I have an HTML file which contains an HTML table with a clean structure, 3 columns
throughout. Like this:
<table>
<tr>
<td class="e-mail"><a href=mailto:Name.Surname_at_ev.co.yu>Name Surname</a></td>
<td class="position">Position title</td>
<td class="phone">123456</td>
</tr>
...
</table>
I would like to extract just those table cells - the text between <td>...</td>.
The problem is that, according to the HTML specification, newlines mean nothing
to an HTML parser, so a cell may start on one line and end on another, and I
cannot rely on line-based tools like sed or awk.
I need something that can parse such a file with that specification in mind. I'd
like to get the following:
<a href=mailto:Name.Surname_at_ev.co.yu>Name Surname</a>
Position title
123456
as a result. I'm not really into all those "parser" utilities of the Development
package.
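To make the goal concrete, here is a rough sketch of the behaviour I am after,
assuming a Python 3 interpreter with its standard html.parser module is
available. The names CellExtractor and extract_cells are made up for
illustration, and the sketch assumes every cell is closed with an explicit
</td> and that tables are not nested.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collect the inner markup of every <td>...</td> cell (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.cells = []      # finished cells, in document order
        self._parts = None   # pieces of the current cell; None while outside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._parts = []
        elif self._parts is not None:
            # Keep inner tags such as <a href=...> exactly as written in the source.
            self._parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self._parts is None:
            return
        if tag == "td":
            # Collapse newlines and runs of spaces, since they carry no meaning here.
            self.cells.append(" ".join("".join(self._parts).split()))
            self._parts = None
        else:
            self._parts.append("</%s>" % tag)

    def handle_data(self, data):
        if self._parts is not None:
            self._parts.append(data)


def extract_cells(html):
    """Return the contents of all <td> cells found in the given HTML text."""
    parser = CellExtractor()
    parser.feed(html)
    parser.close()
    return parser.cells


# Sample input: one cell is split across two lines, two cells share a line.
SAMPLE = """<table>
<tr>
<td class="e-mail"><a href=mailto:Name.Surname_at_ev.co.yu>Name
Surname</a></td>
<td class="position">Position title</td><td class="phone">123456</td>
</tr>
</table>"""

for cell in extract_cells(SAMPLE):
    print(cell)
```

The point of the sketch is that cell boundaries come from the tags rather than
from line breaks, so it does not matter how the table is wrapped in the file.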
Any suggestions?
TYIA,
Nix.
Received on Thu May 03 2001 - 09:24:45 NZST