SUMM: Problems displaying accented characters with more and mail

From: Guy Dallaire <dallaire_at_total.net>
Date: Tue, 29 Apr 1997 14:02:35 -0400

Thanks to the many people who replied. I cannot name everybody. Here is the
original post as well as the solutions:

--------- Original Post ----------

I'm having problems getting French accented characters to display
properly when I use 'more' and 'mail'. The characters display OK when I
use 'cat' or 'vi'.

For example, if I edit a file with vi (call it foo.french) and that file
contains:

Le démon sera ici cet été

If I type more foo.french I get something like:

Le dimon sera ici cet iti.

If I mail the file and read it with mail, I get the same kinda junk.

Does anyone know how to fix it ?

----------------- SOLUTIONS -------------------------------------------------

One solution for the 'more' problem is to use 'more -v', or you can
also set the MORE environment variable to '-v'. The drawback is that
control characters other than accents may also be printed and look
strange.
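
A minimal sketch of the second option for Bourne-style shells (sh, ksh),
assuming your 'more' reads the MORE variable as described above. Putting
it in your .profile would make it permanent. The csh form is shown as a
comment only.

```shell
# Make -v the default option for more in sh/ksh.
# Assumption: this system's more honors the MORE environment variable.
MORE=-v
export MORE

# csh equivalent (not sh syntax, so left as a comment):
#   setenv MORE -v
```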

Another solution is to set the locale. For example, for French Canada, you
can set and export LANG=fr_CA.ISO8859-1. That fixes the more problem but
NOT the mail problem. It also changes the behavior of your programs in
many ways: sort order, error messages, prompts, etc. We'll stick to the
default setting, the C locale.

For the mail program, I've not found anything useful. The accented
characters do not display the same way in mail as they do with 'more'.


----------------- ANSWERS -------------------------------------------

> Le démon sera ici cet été

  This sentence appears OK with accented characters in the mail I received
  using Pine (which is surprising since usually Pine says something about
  "using ISO character set" etc.).

> If I type more foo.french I get something like:
> Le dimon sera ici cet iti.

  You may check the options of more. On my system we have a "type" command
  aliased to cat -v | more, which displays control characters. The point is
  to prevent escape sequences from garbling the terminal. Your file appears as

  Le dM-imon sera ici cet M-itM-i

  So it looks like your accented e's are in fact meta-i (i.e., I believe,
  the 7-bit ASCII "i" with the 8th bit set)
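
The arithmetic behind that guess can be checked from the shell. This is
only a sketch: strip_high_bit is a hypothetical helper, and it assumes
the file is encoded in ISO 8859-1, where an e with acute accent is octal
351 (hex E9). Clearing the 8th bit leaves octal 151, which is ASCII "i",
matching the "dimon" and "iti" shown by more.

```shell
# strip_high_bit OCTAL: print the character that remains after clearing
# the 8th bit of the byte whose octal code is given (hypothetical helper).
strip_high_bit() {
    low=$(( 0$1 & 0177 ))                  # leading 0 means octal; 0177 keeps bits 1-7
    printf "\\$(printf '%03o' "$low")"     # turn the number back into a character
}

strip_high_bit 351    # 351 octal: e with acute accent in ISO 8859-1
echo

# Where cat supports -v, the same byte should show up in the M- notation
# quoted above.
printf 'Le d\351mon\n' | cat -v
```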

> Does anyone know how to fix it ?

  Since I believe neither of us is of French mother tongue, I believe I
  may speak frankly. I believe that accented characters, as well as any
  other character with diacritic signs, e.g. the German Umlaut, should
  not be used in electronic documents, particularly in "plain text" files.
  One should limit oneself to the standard 7-bit ASCII set. Anything more
  is non-standard. You may fix it FOR YOU, but it won't necessarily work
  for somebody else.

  All languages should have a means to bypass special characters. For
  instance in German it is customary to use "ae" "oe" and "ue" instead
  of a,o,u with Umlaut. In Italian (MY mother tongue) accented vowels
  are used (though exclusively on the last character of some accented
  words), but when newcomers to the net (the guys with a PC and an Italian
  keyboard) use them the results are weird (they vary from your example to
  funny things stuffed with =2E and worse). The traditional way, used for
  ages by scientists and engineers who have always used computers, is to
  replace the accent with an apostrophe (i.e. citta', tribu', etc.).
  I tried to use the same for a French text (i.e. put an apostrophe instead
  of an accent on the last letter, and ignore all other accents), but a
  friend who was born in France told me that the "official" way should be
  to remove ALL accents wherever located.

----------------------------------------------------------------------------
Lucio Chiappetti - IFCTR/CNR - via Bassini 15 - I-20133 Milano (Italy)
----------------------------------------------------------------------------
Fuscim donca de Miragn E tornem a sta scio' in Bregn
Che i fachign e i cortesagn Magl' insema no stagn begn
Drizza la', compa' Tapogn (Rabisch, II 41, 96-99)
----------------------------------------------------------------------------
For more info : http://www.ifctr.mi.cnr.it/~lucio/personal.html
----------------------------------------------------------------------------

----------------------------------------------------------------------------
NOTE: I received the following in French, I'll translate for you:

We solved it in csh with:
setenv LANG fr_FR.ISO8859-1

which unfortunately also modifies all the displays and prompts, trying to
convert them to French (try ls -l). That can cause problems: for example,
'rm -i toto' must be answered with "o" (oui) and not "y" if you want to
erase the file. It changes habits and it is no fun.

So we decided to create some short scripts for the commands that you are
talking about, morefr, vifr... All these scripts do is to set the LANG
environment variable temporarily, call the program, and unset the variable.

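Example: vifr (in ksh)
#!/usr/bin/ksh
# vifr: run vi with a French ISO 8859-1 locale for this invocation only
#
if test "A$1" = "A"
then
 # No argument given: print a short usage message
 echo " Edit a file containing ISO8859 characters"
 echo " "
 echo " arg1 = file name"
else
 # LANG changes only inside this script's process, not in the caller's shell
 export LANG=fr_FR.ISO8859-1
 vi "$1"
 unset LANG;export LANG
fi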

That still has a drawback: the users must get into the habit of using vifr
instead of vi, etc.
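
A single generic wrapper could stand in for the separate morefr, vifr,
etc. scripts. This is only a sketch, not what was actually deployed:
"frrun" is a hypothetical name, and the locale name is the one quoted
above, which must be installed on your system. Note that if LC_ALL is set
in the environment it takes precedence over LANG.

```shell
# frrun CMD [ARGS...]: run one command with a French ISO 8859-1 locale.
# env sets LANG for that command only; the calling shell keeps its own
# locale, so prompts and sort order elsewhere are unaffected.
frrun() {
    env LANG=fr_FR.ISO8859-1 "$@"
}

# Usage:
#   frrun more foo.french
#   frrun vi foo.french
```

Users still have to remember to type the wrapper name, so the drawback
mentioned above remains.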

> If I mail the file and read it with mail, I get the same kinda junk.

I'm not sure that it is related to the same problem.

-- 
_______________________________________________________________________
Magali BERNARD (magali_at_univ-st-etienne.fr)
CRITeR - 23 rue du Dr Paul Michelon - 42023 St-Etienne Cedex 2 - FRANCE
Tel: 04.77.48.50.62
---------------------------------------------------------------------------
NOTE: This is translated also:
Hi,
I do not have this problem in my environment because it is set to French by
default.
I would advise you to read the man pages for i18n and l10n, which state the
environment variables that you have to set. I am thinking of the "codeset"
ISO8859-1.
NOTE: What are i18n and l10n?
-- 
  Hubert Robitaille                    mailto:hrobitaille_at_encore.fr
  Responsable Support Logiciel
  Encore Computer S.A.                 http://www.encore.com
  B.P. 54     78185 St Quentin en Yvelines Cedex    France
  Tel.: +33/0 1 30 23 36 00             Fax : +33/0 1 34 60 43 95
--------------------------------------------------------------------
Use more -v
(or the MORE environment variable, for example setenv MORE -v in csh or
MORE='-v' in sh)
For mail, use a better mailer
> Le démon sera ici cet été (The demon will be here this summer)
Emmène ton crucifix en vacances. (Bring along your crucifix for the holiday)
Fred
-- 
| From: Frédéric Arenou
|--------------------------------------------------|
| At: DASGAL - CNRS URA335 | E-mail: Frederic.Arenou_at_obspm.fr |
|    Observatoire de Paris | URL: http://www.obspm.fr/cgi-bin/whereis?=Arenou |
|    F-92195 Meudon Cedex  | Tel: (33/0) 1 45 07 78 49 /  Fax: 1 45 07 78 78 |
Guy Dallaire
dallaire_at_total.net
"God only knows if god exists"
Received on Tue Apr 29 1997 - 20:14:25 NZST
