From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#59275: Unexpected return value of `string-collate-lessp' on Mac
Date: Sat, 26 Nov 2022 10:06:42 +0200
Message-ID: <83r0xqta0d.fsf@gnu.org>
In-Reply-To: <877czimpz4.fsf@localhost> (message from Ihor Radchenko on
 Sat, 26 Nov 2022 02:03:43 +0000)
To: Ihor Radchenko
Cc: 59275@debbugs.gnu.org

> From: Ihor Radchenko
> Cc: 59275-done@debbugs.gnu.org
> Date: Sat, 26 Nov 2022 02:03:43 +0000
>
> We concluded that a better fallback when collation is not available
> would be using downcase+string-lessp when `string-collate-lessp' is
> called with non-nil IGNORE-CASE argument.

This has caveats, see below.  I won't argue about your Org-local
decision, since I don't know enough about the intended uses of what
you did, but I do have something to say about this decision in
general.  I suggest at least a FIXME comment where you do this stuff,
based on what I tell below.

> Would it be acceptable for Emacs to change the fallback behavior of
> `string-collate-lessp' to:
>
> 1. If string collation is not available and IGNORE-CASE is nil, fallback
>    to `string-lessp';
> 2. If string collation is not available and IGNORE-CASE is non-nil,
>    use `downcase' + `string-lessp'.

'downcase' uses the buffer-local case table if such is defined for the
buffer that happens to be the current when you invoke 'downcase', and
that's another cause of inconsistency and user surprises, especially
when the strings you compare don't really "belong" to the current
buffer.

Also, in some (rarely-used) locales, downcasing has unexpected
results, even with the default case-table.  For example, downcasing
"I" produces "ı", not "i" as expected.
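To make the case-table dependence concrete, here is a minimal sketch.
It assumes an Emacs whose standard case table has not been customized;
the names `my-turkish-like-table' and `my-downcase-under-both-tables'
are hypothetical, invented only for this illustration.  The custom
table imitates the Turkish pairing of "I" with dotless "ı" by calling
`set-case-syntax-pair' on a copy of the standard table.

```elisp
;; Illustration only: both names below are hypothetical, not part of Emacs.
(defvar my-turkish-like-table
  (let ((table (copy-case-table (standard-case-table))))
    ;; Pair uppercase "I" with dotless "ı" in the copy.  Note that
    ;; `set-case-syntax-pair' also marks both characters as word
    ;; constituents in the standard syntax table.
    (set-case-syntax-pair ?I ?ı table)
    table)
  "A copy of the standard case table in which \"I\" downcases to \"ı\".")

(defun my-downcase-under-both-tables (string)
  "Return (DEFAULT . CUSTOM), STRING downcased under two case tables.
DEFAULT is computed in a fresh buffer that uses the standard case
table; CUSTOM in a buffer whose case table is `my-turkish-like-table'."
  (cons (with-temp-buffer
          (downcase string))
        (with-temp-buffer
          ;; `set-case-table' affects only the current (temporary) buffer.
          (set-case-table my-turkish-like-table)
          (downcase string))))

;; The same string downcases differently depending on which buffer
;; happens to be current:
;; (my-downcase-under-both-tables "IŞIK")
```

A comparison built on 'downcase' therefore inherits whatever case
table the calling buffer carries, which need not have anything to do
with the strings being compared.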
Did you think about these cases when making the above decision?

> I also do not think that it will be backwards-incompatible. If the call
> to `string-collate-lessp' explicitly requests ignoring case, `downcase'
> is more expected than bare `string-lessp' that _does not_ ignore case.
>
> WDYT?

See above.  What you suggest is perhaps fine for plain-ASCII text, but
not in general, IMNSHO.  The reason for what Emacs currently does on
systems that lack collation functions is that for such systems
collation rules are indeterminate, and so inventing them by following
naïve rules of plain ASCII, in particular the case-conversion rules,
is potentially very wrong.  These are general-purpose APIs, not
something concrete in specific Org contexts, and as such, these APIs
cannot "mostly work", they should work always and for every possible
use case.

And we are talking about a single system where these problems happen,
which is macOS, right?  Wouldn't it be better for "Someone" who uses
macOS to just bite the bullet and write a proper collation function,
or find a free software implementation of one, and include it in
Emacs?  This is what I did for MS-Windows at the time
string-collate-lessp was added to Emacs.  Why cannot macOS users do
the same?
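
For reference, the fallback proposed in the quoted message could look
roughly like the sketch below.  The function name
`my-string-lessp-ignore-case' is hypothetical, and this is neither
what Emacs does nor necessarily what Org ended up with; it only shows
the downcase+string-lessp idea together with the kind of FIXME comment
suggested above.

```elisp
;; Illustration only: `my-string-lessp-ignore-case' is a hypothetical name.
(defun my-string-lessp-ignore-case (s1 s2)
  "Return non-nil if S1 sorts before S2, ignoring case.
Compares the downcased strings by character code with `string-lessp'."
  ;; FIXME: `downcase' consults the current buffer's case table, and
  ;; case conversion follows ASCII-like rules that are wrong for some
  ;; languages, so this is only a rough approximation of collation.
  (string-lessp (downcase s1) (downcase s2)))

;; With the default case table and plain-ASCII input:
;; (string-lessp "a" "B")                 ; nil, since ?a (97) > ?B (66)
;; (my-string-lessp-ignore-case "a" "B")  ; t, compares "a" with "b"
```

For plain-ASCII strings this behaves as most callers would expect; the
caveats about case tables and locale-specific case rules apply to
everything else.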