From: Roland Winkler
Newsgroups: gmane.emacs.devel
Subject: Re: case-insensitive string comparison
Date: Wed, 20 Jul 2022 13:10:35 -0500
Message-ID: <8735evlko4.fsf@gnu.org>
References: <87ilnsq4cr.fsf@gnu.org> <87mtd3n455.fsf@gnu.org> <83ilnrlnd1.fsf@gnu.org> <87lesnlm7a.fsf@gnu.org> <83fsivlllz.fsf@gnu.org>
In-Reply-To: <83fsivlllz.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 20 Jul 2022 20:50:16 +0300")
To: Eli Zaretskii
Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org

On Wed, Jul 20 2022, Eli Zaretskii wrote:
>> Even mentioning the difficulties could be useful here.
>
> I'm not sure I agree.  To describe all the important aspects of this
> would take too long, and it isn't the job of our manual to document
> this stuff.  Read this if you want to know:
>
>   https://unicode.org/reports/tr10/

A footnote pointing the interested reader to this report would already
be useful.  I am not suggesting that the manual attempt an exhaustive
discussion of this topic.  I am only suggesting that it mention briefly
that the topic is subtle and depends on details "beyond emacs itself".

>> I am not sure I can follow your argument.  Do you suggest that, likely,
>> BBDB will work best if it compares names using compare-strings?
>
> Yes.

Thanks, that's already good to know!
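
For concreteness, here is a minimal sketch of such a comparison.  The
function name `my-name-equal-p' is hypothetical and not taken from
BBDB; the only real API used is `compare-strings' with a non-nil
IGNORE-CASE argument, which folds case according to the case table of
the current buffer.

```elisp
;; Illustrative sketch only; `my-name-equal-p' is a made-up name.
(defun my-name-equal-p (name1 name2)
  "Return non-nil if NAME1 and NAME2 are equal, ignoring case.
Case folding follows the case table of the current buffer."
  ;; `compare-strings' returns t only when the compared ranges are
  ;; equal; otherwise it returns an integer describing the mismatch.
  (eq t (compare-strings name1 nil nil name2 nil nil t)))
```

Because the result depends on whichever case table is current, the
same call may behave differently in different buffers or language
environments.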

> But in addition, you should set up the case table of the current
> buffer when you do so, because otherwise special cases with the likes
> of the Turkish language's dotless I could in rare cases screw you.
>
>> (I'd be glad to hear that.)  This code should work for users who do not
>> want to build their own case table and stuff like that.
>
> Not the users should build the case table, BBDB (or whatever Lisp
> program that needs the comparison) should.  It's not that hard,
> really: if you only need ASCII, use ascii-case-table, otherwise copy
> the standard case-table and modify it to make sure I downcases to i
> and similarly with a few other exceptional letters.

I am not sure it is possible to predict how a default case table for
BBDB should differ from the standard case table.  BBDB might be the
only package a user has that accumulates strings going beyond what
that user otherwise deals with regularly.  If there is a sensible
"BBDB default case table", I'd hope that it is the standard case
table.  If not, can you suggest an Emacs package that I can look into
as a source of inspiration?
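
As I read the suggestion quoted above, it amounts to something like the
following sketch.  The names `my-name-case-table' and
`my-name-equal-in-table-p' are hypothetical and not part of BBDB or
Emacs; `copy-case-table', `standard-case-table', `set-case-syntax-pair',
`with-case-table' and `ascii-case-table' are the existing Emacs
facilities.  Only the I/i pair is pinned here; which other
"exceptional letters" would need the same treatment is exactly what I
cannot predict.

```elisp
;; Illustrative sketch only; the `my-...' names are made up.
(defvar my-name-case-table
  (let ((table (copy-case-table (standard-case-table))))
    ;; Pin the I/i pair so that a language environment with a
    ;; dotless i cannot change how these two letters fold.
    ;; Note: `set-case-syntax-pair' also marks both characters as
    ;; word constituents in the standard syntax table, which they
    ;; already are.
    (set-case-syntax-pair ?I ?i table)
    table)
  "Case table used when comparing names (illustrative).")

(defun my-name-equal-in-table-p (name1 name2 &optional ascii-only)
  "Return non-nil if NAME1 and NAME2 are equal, ignoring case.
With ASCII-ONLY non-nil, fold case using `ascii-case-table';
otherwise use `my-name-case-table'."
  (with-case-table (if ascii-only
                       ascii-case-table
                     my-name-case-table)
    (eq t (compare-strings name1 nil nil name2 nil nil t))))
```

With this approach the comparison no longer depends on the case table
of whatever buffer happens to be current, since `with-case-table'
installs the chosen table only for the duration of the call.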