From: Alexis
Newsgroups: gmane.emacs.bugs
Subject: bug#20316: 24.5; `string-lessp' doesn't respect value of LC_COLLATE
Date: Wed, 15 Apr 2015 09:55:22 +1000
Message-ID: <87a8ya5kph.fsf@gmail.com>
In-reply-to: <83r3rmbvvn.fsf@gnu.org>
To: 20316@debbugs.gnu.org
Cc: michael.albinus@gmx.de

Eli Zaretskii writes:

> I think we use "lexicographic" for lack of a more accurate word.
> We could use something like "code point (binary) order", but
> would that be clear enough to be useful?

i would certainly find that more useful overall, as i think it's
less ambiguous (to me) than 'lexicographic order' in this context.
i assume it's "code point [according to the overall encoding of the
relevant buffer]"? And given your earlier point, i'm guessing it
would also be useful to say something along the lines of "If the
data being sorted contains multiple encodings, all bets are off"?
(Which is relevant in the `org-vcard' case of people possibly trying
to sort contacts whose names are based in a variety of locales.)

> Note that we are not alone in this; at least this page:
>
> http://en.cppreference.com/w/cpp/string/byte/strcoll
>
> says that the C function 'strcmp' does a "lexicographical
> comparison". So do a few other similar pages; google for
> "difference between strcmp and strcoll".

Well, that to me feels like continued holdover from the C+ASCII (or
at best Latin-1) 'byte == character' mindset ....

>> A,B,C,Č,Ć,D,Dž,Đ,..S,Š,..Z,Ž
>
> That's "collation order" in action, note that the diacritic
> order is applied _after_ the alphabetic order of the base
> characters. That's what string-collate-lessp does.

*nod* That's why my first thoughts about this issue went to
collation settings; given that (it seems to me) Emacs has a far
better handle on i18n and m17n issues than most software, i assumed
that sorting-by-collation-order would already be available in 24.x.
However, given what you've said, i've now got a better understanding
of why implementing this is not straightforward.

Thanks for taking the time to explain all this!


Alexis.
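
The difference between "code point order" and "collation order" discussed above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not how Emacs, `string-lessp', or `string-collate-lessp' are implemented. The helper name `base_then_marks_key` is hypothetical, and the two-level key (base letters first, diacritics as a tie-breaker) is only a rough approximation of real collation: it does not consult LC_COLLATE and would not reproduce locale-specific rules such as the placement of Dž or Đ in the alphabet quoted above.

```python
import unicodedata


def base_then_marks_key(s):
    """Hypothetical sort key: compare base letters first, diacritics second.

    This only approximates collation; it ignores locale rules entirely.
    """
    # NFD splits a precomposed letter such as "Č" into "C" + combining caron.
    decomposed = unicodedata.normalize("NFD", s)
    # Drop the combining marks to get the base letters only.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Primary level: base letters. Secondary level: the decomposed string.
    return (base, decomposed)


# Force precomposed (NFC) forms so plain sorting compares single code points.
letters = [unicodedata.normalize("NFC", s)
           for s in ["Ž", "Z", "Š", "S", "D", "Č", "C", "B", "A"]]

# Plain sorting compares code points, so accented letters land after "Z".
by_code_point = sorted(letters)

# The two-level key keeps each accented letter next to its base letter.
by_base_then_marks = sorted(letters, key=base_then_marks_key)

# With whole words the diacritic only matters once the base letters tie.
words = [unicodedata.normalize("NFC", s) for s in ["Cb", "Ča"]]
words_by_code_point = sorted(words)
words_by_base_then_marks = sorted(words, key=base_then_marks_key)

print(",".join(by_code_point))
print(",".join(by_base_then_marks))
print(",".join(words_by_code_point))
print(",".join(words_by_base_then_marks))
```

Code point order puts Č, Š and Ž after Z, whereas the two-level key places each of them directly after its base letter, and sorts "Ča" before "Cb" because the diacritic is only considered after the base letters have been compared.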