From: Ihor Radchenko
Newsgroups: gmane.emacs.bugs
Subject: bug#59275: Unexpected return value of `string-collate-lessp' on Mac
Date: Sat, 26 Nov 2022 08:47:13 +0000
Message-ID: <87v8n2je5q.fsf@localhost>
In-Reply-To: <83r0xqta0d.fsf@gnu.org>
To: Eli Zaretskii
Cc: 59275@debbugs.gnu.org

Eli Zaretskii writes:

>> We concluded that a better fallback when collation is not available
>> would be using downcase+string-lessp when `string-collate-lessp' is
>> called with non-nil IGNORE-CASE argument.
>
> This has caveats, see below.  I won't argue about your Org-local decision,
> since I don't know enough about the intended uses of what you did, but I do
> have something to say about this decision in general.  I suggest at least a
> FIXME comment where you do this stuff, based on what I tell below.

Thanks for the information!

>> Would it be acceptable for Emacs to change the fallback behavior of
>> `string-collate-lessp' to:
>>
>> 1. If string collation is not available and IGNORE-CASE is nil, fallback
>>    to `string-lessp';
>> 2. If string collation is not available and IGNORE-CASE is non-nil,
>>    use `downcase' + `string-lessp'.
>
> 'downcase' uses the buffer-local case table if such is defined for the
> buffer that happens to be the current when you invoke 'downcase', and that's
> another cause of inconsistency and user surprises, especially when the
> strings you compare don't really "belong" to the current buffer.

Interesting. Is there any reason why this is not mentioned in the
docstring for `downcase'?

I now see 4.10 The Case Table section of the manual, and it looks like
case tables should be set mostly automatically (by Emacs?) according to
the language environment. Are details about this process documented
anywhere? Are these case conversion tables independent of glibc?
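
For reference, the two-step fallback proposed in the quoted list above
could be sketched roughly as follows. This is only an illustration, not
how `string-collate-lessp' is implemented: the helper name
`my-collate-lessp-fallback' is hypothetical, and the example results
assume the current buffer uses the standard case table (which is exactly
the assumption questioned above).

```emacs-lisp
;; Hypothetical helper, NOT part of Emacs: a sketch of the proposed
;; fallback for systems without collation support.
(defun my-collate-lessp-fallback (s1 s2 &optional ignore-case)
  "Return non-nil if S1 sorts before S2, without locale collation.
With non-nil IGNORE-CASE, downcase both strings first.  Note that
`downcase' consults the case table of the current buffer."
  (if ignore-case
      (string-lessp (downcase s1) (downcase s2))
    (string-lessp s1 s2)))

;; Case-sensitive: compares character codes, so "a" (97) is not
;; less than "B" (66).
(my-collate-lessp-fallback "a" "B")

;; Case-insensitive: compares "a" with "b" under the standard case table.
(my-collate-lessp-fallback "a" "B" t)
```

With the standard case table, the first call returns nil and the second
returns t. In a buffer with a different case table, the second result
may differ.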

> Also, in
> some (rarely-used) locales, downcasing has unexpected results, even with the
> default case-table.  For example, downcasing "I" produces "ı", not "i" as
> expected.  Did you think about these cases when making the above decision?

I did not. However, I recall reading somewhere that it is possible to
work around this kind of issue by calling case conversion several times:
upcase -> downcase -> upcase -> downcase.

Now, after you reminded me about this caveat, I also recall
https://nullprogram.com/blog/2014/06/13/ that mentioned something
similar about caveats with composition. Just mentioning it for your
reference. (I am not sure if the caveats discussed have been raised on
Emacs devel).

>> I also do not think that it will be backwards-incompatible. If the call
>> to `string-collate-lessp' explicitly requests ignoring case, `downcase'
>> is more expected than bare `string-lessp' that _does not_ ignore case.
>>
>> WDYT?
>
> See above.  What you suggest is perhaps fine for plain-ASCII text, but not
> in general, IMNSHO.
>
> The reason for what Emacs currently does on systems that lack collation
> functions is that for such systems collation rules are indeterminate, and so
> inventing them by following naïve rules of plain ASCII, in particular the
> case-conversion rules, is potentially very wrong.  These are general-purpose
> APIs, not something concrete in specific Org contexts, and as such, these
> APIs cannot "mostly work", they should work always and for every possible
> use case.

I feel that I am missing something. Doesn't Emacs provide Unicode case
conversion tables? Why plain ASCII rules?

> And we are talking about a single system where these problems happen, which
> is macOS, right?  Wouldn't it be better for "Someone" who uses macOS to just
> bite the bullet and write a proper collation function, or find a free
> software implementation of one, and include it in Emacs?
> This is what I did
> for MS-Windows at the time string-collate-lessp was added to Emacs.  Why
> cannot macOS users do the same?

It would be. But how can we ask for this? etc/TODO? Or maybe re-open
this bug report?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at .
Support Org development at ,
or support my work at