From: Eli Zaretskii <eliz@gnu.org>
To: Uwe Brauer <oub@mat.ucm.es>
Cc: emacs-devel@gnu.org
Subject: Re: sort-lines including non ASCII
Date: Thu, 07 Jul 2016 18:20:54 +0300
Message-ID: <83d1mplbjd.fsf@gnu.org>
In-Reply-To: <87wpkxx5dc.fsf@mat.ucm.es> (message from Uwe Brauer on Thu, 07 Jul 2016 07:41:03 +0000)

> From: Uwe Brauer <oub@mat.ucm.es>
> Date: Thu, 07 Jul 2016 07:41:03 +0000
> 
> > Because you are thinking Spanish, I presume.  Emacs by default is not
> > sensitive to the current locale or language, when it compares strings,
> > and instead does that in binary order of the characters' Unicode
> > codepoints.  The advantage is that the order comes out the same in any
> > locale.
> 
> Hm, I just made an experiment with Hebrew, with and without niqqud, and
> indeed
>
> בית
> אבא
> אוויר
>
> is sorted correctly, and also
>
> אוויר
> בית
> אַבָא
>
> So the niqqud does not influence the sorting, but the accent in Spanish
> does.  Most likely Unicode is the culprit here, but it is
> counterintuitive.

Unicode has nothing to do with this.  The difference between אַ and Á
is that the former is always two characters (a base letter followed by
a combining point), while the latter is usually only one (a single
precomposed character).  That's why sort-lines produces what looks
like correct results with Hebrew.  To see the problem there, you need
to sort אבא with אַבָא and אתבשא, for example.  Or something similar.
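
To make the character counts and the resulting order concrete, here is
a minimal sketch in Python rather than Emacs Lisp.  It is only an
illustration: the code point sequences are my reading of the words
above, the Spanish town names are made-up test data, and strip_marks
is a hypothetical helper that crudely approximates a comparison which
ignores points and accents.  It is not what sort-lines does and not a
real locale collation.

```python
import unicodedata

# Assumed code point sequences for the strings discussed in the message.
ALEF_PATAH = "\u05D0\u05B7"   # alef followed by the combining point patah
A_ACUTE = "\u00C1"            # precomposed LATIN CAPITAL LETTER A WITH ACUTE


def nfc_len(s):
    """Number of code points after NFC (composing) normalization."""
    return len(unicodedata.normalize("NFC", s))


def strip_marks(s):
    """Hypothetical helper: decompose, then drop all combining marks."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# The three words suggested above, written as escapes to avoid
# any ambiguity from bidirectional display.
plain = "\u05D0\u05D1\u05D0"              # alef bet alef
pointed = "\u05D0\u05B7\u05D1\u05B8\u05D0"  # same letters with patah and qamats
longer = "\u05D0\u05EA\u05D1\u05E9\u05D0"  # alef tav bet shin alef
words = [plain, pointed, longer]

# Python compares strings by code point, like the default described above.
by_codepoint = sorted(words)

# Ignoring the points makes plain and pointed compare equal; the sort is
# stable, so they keep their input order.
by_base_letters = sorted(words, key=strip_marks)

# Illustrative Spanish data (an assumption, not from the thread).
towns = ["Zamora", "\u00C1vila", "Burgos"]

if __name__ == "__main__":
    print(nfc_len(ALEF_PATAH), nfc_len(A_ACUTE), nfc_len("A\u0301"))
    print(by_codepoint)
    print(by_base_letters)
    print(sorted(towns))
    print(sorted(towns, key=strip_marks))
```

In code point order the pointed word comes first, because the point
that follows the initial alef has a lower code point than any Hebrew
letter.  The precomposed Á, by contrast, is a single code point above
all ASCII letters, so it lands after Z.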



