From: "Roland Winkler"
Newsgroups: gmane.emacs.devel
Subject: Re: strip accents and sorting [was: BibTeX issues]
Date: Fri, 30 Aug 2019 14:09:47 -0500
Message-ID: <29819.36697.297846.23913@gargle.gargle.HOWL>
In-Reply-To: <837e6ua47y.fsf@gnu.org>
To: Eli Zaretskii
Cc: rudalics@gmx.at, emacs-devel@gnu.org

On Fri Aug 30 2019 Eli Zaretskii wrote:
> > You could set LC_COLLATE=en_US.utf8 inside Emacs, or even bind it
> > around the call to string-collate-lessp.  I think we support that on
> > GNU/Linux.
>
> Actually, string-collate-lessp accepts an optional argument LOCALE
> that can be used for that.  So it's even easier than I remembered.

Thanks!  Unfortunately, string-collate-lessp with locale en_US.utf8
folds case,

  (sort '("b" "A" "B" "a")
        (lambda (s1 s2) (string-collate-lessp s1 s2 "en_US.utf8")))
  ⇒ ("a" "A" "b" "B")

whereas

  (sort '("b" "A" "B" "a")
        (lambda (s1 s2) (string-collate-lessp s1 s2 "C")))
  ⇒ ("A" "B" "a" "b")

though in both cases the optional arg IGNORE-CASE of
string-collate-lessp is nil.  (I guess this is not a bug of
string-collate-lessp, but it is an intended "feature" of the locale
en_US.utf8.)  Similarly, the locale en_US.utf8 ignores dots ".", which
for my taste bundles too many features.  (Does anybody know where the
feature bundles of different locales are described?  So far, I have
not found anything.)

But something like bibtex-mode could introduce a new user option
bibtex-sort-locale that is used as optional arg when sorting BibTeX
records with string-collate-lessp.
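
A minimal sketch of what such an option might look like. Both names are hypothetical: bibtex-sort-locale does not exist in bibtex.el at this point, and my-bibtex-sort-keys is only a stand-in for whatever code actually compares the keys. The only existing function it relies on is string-collate-lessp with its optional LOCALE argument.

```elisp
;; Hypothetical user option (an assumption, not part of bibtex.el).
(defcustom bibtex-sort-locale nil
  "Locale used by `string-collate-lessp' when sorting BibTeX keys.
If nil, collate according to the current locale."
  :type '(choice (const :tag "Current locale" nil)
                 (string :tag "Locale name"))
  :group 'bibtex)

;; Hypothetical helper, for illustration only.
(defun my-bibtex-sort-keys (keys)
  "Return a sorted copy of the list of strings KEYS.
Comparison uses `string-collate-lessp' with `bibtex-sort-locale'
as its LOCALE argument; IGNORE-CASE is left at nil."
  (sort (copy-sequence keys)
        (lambda (s1 s2)
          (string-collate-lessp s1 s2 bibtex-sort-locale))))
```

With bibtex-sort-locale bound to "C", (my-bibtex-sort-keys '("b" "A" "B" "a")) should give the second ordering shown above; with any other locale the result depends on the collation rules that locale ships on the system.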