From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alexis
Newsgroups: gmane.emacs.bugs
Subject: bug#20316: 24.5; `string-lessp' doesn't respect value of LC_COLLATE
Date: Tue, 14 Apr 2015 10:55:53 +1000
Message-ID: <87egnn5y06.fsf@gmail.com>
References: <87twwk61re.fsf@gmail.com> <87h9sk5z6a.fsf@gmx.de> <87pp785y21.fsf@gmail.com> <83twwkccdk.fsf@gnu.org>
In-reply-to: <83twwkccdk.fsf@gnu.org>
To: 20316@debbugs.gnu.org
Cc: michael.albinus@gmx.de

Eli Zaretskii writes:

> Emacs is a multi-lingual editor, and it isn't clear how to apply
> locale-specific settings, including collation, to text that can
> potentially include many scripts. E.g., a locale could specify
> a codeset that doesn't cover characters outside of a particular
> script, which will produce undefined results if you try using
> system sorting routines with characters outside of that single
> script.

Okay, fair point.

> Also, using locale-specific sorting would produce different
> results not only in different locales, but also on different
> platforms in the same locale, because the implementation of
> locale collation order differs from platform to platform and
> from one C library to another.

Huh, okay.

> Please also note that locale-sensitive collation order could
> mean more than just order of characters. E.g., it could specify
> that punctuation characters or differences in accents should be
> ignored.

*nod*

> So by default, Emacs sorts disregarding locale-specific
> ordering, basically using the Unicode codepoints of the
> characters to order them.

This makes sense given what you've said above, but can this still be
referred to as 'lexicographic' ordering?
To me, 'lexicographic ordering' is ordering as per a dictionary for
the relevant language, not by codepoint for an arbitrary encoding. Is
this wrong?

> You could use the external 'sort' utility in the meantime, if it
> supports locale-dependent ordering. Once again: the results
> will be not 100% deterministic, even for the same locale, so
> your users should "caveat emptor".

*nod* Thanks, i'll pass that on. (And note it for the future.)

> May I ask what does that package of yours do that it needs
> locale-dependent sorting?

The package itself doesn't do it, but the results it produces might
subsequently require such sorting. The package is `org-vcard':

https://github.com/flexibeast/org-vcard

which allows one to import vCards to contacts in Org-based formats
(and export contacts from Org-based formats to vCards). One of the
package's users had imported a set of contacts, then expected to be
able to sort those contacts according to Croatian rules, using
`org-sort' (from `org.el'). However, to quote the user, this resulted
in the contacts being sorted:

    according to the English alphabet rules where the contact entries
    which start with Croatian characters (Č,Ć,Đ,Š,Ž) are at the end of
    the list, iow. after 'Z' entries, although it should go like this:
    A,B,C,Č,Ć,D,Dž,Đ,..S,Š,..Z,Ž


Alexis.
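
To make the difference between code point order and the order the user
expected concrete, here is a minimal Python sketch. It is only an
illustration: the `croatian_key` helper, the hand-written alphabet
table and the sample names are assumptions invented for this example,
not anything taken from Emacs, `org-sort' or `org-vcard'. The digraph
handling is deliberately naive (it treats every "dž", "lj" and "nj" as
a single letter), and characters outside the table are simply placed
before it.

```python
# Croatian alphabet in dictionary order; the digraphs dž, lj and nj
# each count as a single letter. (Hand-written table, an assumption
# of this sketch rather than data from any library.)
CROATIAN_ALPHABET = [
    "a", "b", "c", "č", "ć", "d", "dž", "đ", "e", "f", "g", "h", "i", "j",
    "k", "l", "lj", "m", "n", "nj", "o", "p", "r", "s", "š", "t", "u", "v",
    "z", "ž",
]
RANK = {letter: i for i, letter in enumerate(CROATIAN_ALPHABET)}
DIGRAPHS = ("dž", "lj", "nj")


def croatian_key(text):
    """Hypothetical sort key: turn text into a list of alphabet ranks."""
    s = text.lower()
    key = []
    i = 0
    while i < len(s):
        # Greedily take a two-character digraph when one starts here.
        letter = s[i:i + 2] if s[i:i + 2] in DIGRAPHS else s[i]
        i += len(letter)
        if letter in RANK:
            key.append((1, RANK[letter]))
        else:
            # Anything outside the table sorts before it, by code point.
            key.append((0, ord(letter)))
    return key


names = ["Zoran", "Čedomir", "Ana", "Šime", "Cvita", "Đuro",
         "Dario", "Žarko", "Sanja", "Džemal"]

# Plain sorted() compares strings by Unicode code point.
by_codepoint = sorted(names)
# The same list ordered by position in the Croatian alphabet.
by_alphabet = sorted(names, key=croatian_key)

print(by_codepoint)
print(by_alphabet)
```

With the code point ordering, Čedomir, Đuro, Šime and Žarko all land
after Zoran, which matches what the user reported. With the alphabet
key they fall where the user expected: Čedomir after Cvita, Džemal and
Đuro after Dario, Šime after Sanja, and Žarko after Zoran.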