From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#59275: Unexpected return value of `string-collate-lessp' on Mac
Date: Wed, 16 Nov 2022 15:00:06 +0200
Message-ID: <83mt8rgill.fsf@gnu.org>
References: <87zgcsdfma.fsf@localhost> <83iljgib4w.fsf@gnu.org> <87h6z0cl6b.fsf@localhost> <837czwi6yp.fsf@gnu.org> <8735ajel7y.fsf@localhost>
In-Reply-To: <8735ajel7y.fsf@localhost> (message from Ihor Radchenko on Wed, 16 Nov 2022 01:34:09 +0000)
To: Ihor Radchenko
Cc: 59275@debbugs.gnu.org

> From: Ihor Radchenko
> Cc: 59275@debbugs.gnu.org
> Date: Wed, 16 Nov 2022 01:34:09 +0000
>
> Eli Zaretskii writes:
>
> >> > string-collate-lessp is inherently platform- (and locale-) dependent.
> >> > Don't use it if you want consistent results across platforms and
> >> > locales.
> >>
> >> Is there a better alternative?
> >
> > Alternative to do what job?
>
> Reliable sorting.
> In particular, I am looking for a better PREDICATE argument for
> `sort-subr' for case-sensitive and case-insensitive sorting of strings.

In the strict order of Unicode codepoints? Use compare-strings.

> >> Also, do I miss something, or is this pitfall not documented in the
> >> docstring of `string-collate-lessp'?
> >
> > It isn't? then what is this about:
> >
> >   This function obeys the conventions for collation order in your
> >   locale settings. For example, punctuation and whitespace characters
> >   might be considered less significant for sorting:
> >
> >   (sort '("11" "12" "1 1" "1 2" "1.1" "1.2") 'string-collate-lessp)
> >     => ("11" "1 1" "1.1" "12" "1 2" "1.2")
> >   [...]
> >   To emulate Unicode-compliant collation on MS-Windows systems,
> >   bind ‘w32-collate-ignore-punctuation’ to a non-nil value, since
> >   the codeset part of the locale cannot be "UTF-8" on MS-Windows.
>
> The above sounds like we just need to worry about some edge cases where
> different approaches may exist to sorting. Like with punctuation,
> numbers, and spaces.
>
> Having
>
> (string-collate-lessp "a" "B" "C" t) ; => nil
>
> is totally unexpected because case-insensitive "a"<"B"<"C" sounds like
> the only reasonable outcome.

It is hard to guess what will be unexpected for people. When the doc
string was written, the example used there was deemed to be the most
striking surprise from using locale-dependent collation, so it was
what we used.

> I'd like the warning to be even more prominent.

You want to make it explicit that for systems where we use
string-lessp the IGNORE-CASE argument is ignored? Or do you want some
other change? Anyway, feel free to suggest some text to that effect.
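
For illustration, here is a minimal sketch of a locale-independent sort
predicate built on compare-strings, as suggested above for sorting in
codepoint order. The helper name my-string-codepoint-lessp is
hypothetical and not part of Emacs; the sketch only assumes the
documented return values of compare-strings (t for equal strings, a
negative number when the first string is less).

```elisp
;; Hypothetical helper; the name is illustrative, not part of Emacs.
(defun my-string-codepoint-lessp (s1 s2 &optional ignore-case)
  "Return non-nil if S1 sorts before S2 in codepoint order.
With IGNORE-CASE non-nil, case is ignored when comparing."
  ;; `compare-strings' returns t when the strings are equal, a negative
  ;; number when S1 is less, and a positive number when S1 is greater.
  (let ((result (compare-strings s1 nil nil s2 nil nil ignore-case)))
    (and (integerp result) (< result 0))))

;; Case-sensitive: uppercase ASCII letters sort before lowercase ones.
(sort (list "b" "C" "a" "B") #'my-string-codepoint-lessp)
;; => ("B" "C" "a" "b")

;; Case-insensitive: wrap the helper so IGNORE-CASE is passed as t.
(sort (list "b" "C" "a")
      (lambda (s1 s2) (my-string-codepoint-lessp s1 s2 t)))
;; => ("a" "b" "C")
```

Unlike string-collate-lessp, this does not consult the locale, so the
result should not vary between platforms; the trade-off is that the
order is by codepoint rather than any language-specific collation.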