sort problems with 8-bit ISO8859 characters

From: I'm not a real doofus, but I play one at a national laboratory. <BAISLEY_at_fndcd.fnal.gov>
Date: Wed, 26 Nov 1997 11:31:46 -0600

I'm having a problem with sort and 8-bit characters. I'd like to sort
some words and phrases, in dictionary order. It seems that not all of
the sort options are handling the locale properly. I apologize if the
8-bit chars in the following text don't come through the mail nicely.

Given the following sample list (just for demonstration purposes --
not necessarily real spellings!):

emezzxy
Einbahnstraße (German scharfes-S)
mâñana (a with caret, n with tilde)
"Man"
mañ (n with tilde)
mango
mañana (n with tilde)
lack a Batwoman
Émeute (E with accent acute)
l'hopital gown
L'hospital gown
L'hôpital gôwn (o's with carets)

If I set my locale to en_US.ISO8859-1, the command 'sort -f' gives:

"Man"
Einbahnstraße
Émeute
emezzxy
l'hopital gown
L'hôpital gôwn
L'hospital gown
lack a Batwoman
mañ
mañana
mâñana
mango

This is very nearly right. But if I add other qualifiers, things
get ugly. For example, the command 'sort -fd' gives:

Einbahnstraße
emezzxy
lack a Batwoman
l'hopital gown
L'hospital gown
L'hôpital gôwn
mañ
mañana
"Man"
mâñana
mango
Émeute

The â, ô and É are wrong, but the ñ is right. The quotes around
"Man" and the apostrophes are properly ignored. 'sort -fdi' gives:

"Man"
Einbahnstraße
emezzxy
l'hopital gown
L'hospital gown
L'hôpital gôwn
lack a Batwoman
mañ
mañana
mâñana
mango
Émeute

This variation screws up the quotes and apostrophes, still has
the É wrong, but fixes â and ô. Using other locales (I've tried
fr_FR.ISO8859-1 and de_DE.ISO8859-1) makes no difference.
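
For comparison, here is a minimal Python sketch of one reading of the
ordering being asked for: fold case (as with -f), ignore everything
except letters, digits and blanks (as with -d), and compare accented
letters as their base letters. It does not reproduce how sort or the
locale tables work. The function name dictionary_key, the accent
stripping via Unicode decomposition, and the treatment of ß as "ss" are
all assumptions made for illustration.

```python
import unicodedata

# Sample list from the message, without the parenthetical annotations.
words = [
    "emezzxy",
    "Einbahnstraße",
    "mâñana",
    '"Man"',
    "mañ",
    "mango",
    "mañana",
    "lack a Batwoman",
    "Émeute",
    "l'hopital gown",
    "L'hospital gown",
    "L'hôpital gôwn",
]


def dictionary_key(line):
    """Hypothetical sort key: dictionary order, case folded, accents ignored."""
    # Split accented letters into base letter + combining mark (NFD),
    # then drop the combining marks so "É" compares like "E".
    decomposed = unicodedata.normalize("NFD", line)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Like -d: keep only letters, digits and blanks.
    kept = "".join(ch for ch in base if ch.isalnum() or ch.isspace())
    # Like -f: fold case (casefold also maps "ß" to "ss").
    return kept.casefold()


if __name__ == "__main__":
    # sorted() is stable, so lines with equal keys keep their input order.
    for line in sorted(words, key=dictionary_key):
        print(line)
```

Lines whose keys compare equal (for example "Man" and mañ) are left in
input order here; a real collation would need a tie-breaking rule for
accents and case.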

Am I missing something? I'm running Digital UNIX 4.0B with jumbo kit 4.
TIA!

                                                        Cheers,
                                                        Wayne

http://www-oss.fnal.gov/~baisley
Received on Wed Nov 26 1997 - 18:50:24 NZDT
