From: Roland Winkler
Newsgroups: gmane.emacs.devel
Subject: Re: case-insensitive string comparison
Date: Wed, 20 Jul 2022 11:24:38 -0500
Message-ID: <87mtd3n455.fsf@gnu.org>
References: <87ilnsq4cr.fsf@gnu.org>
In-Reply-To: (Stefan Monnier's message of "Tue, 19 Jul 2022 23:01:31 -0400")
To: Stefan Monnier
Cc: emacs-devel@gnu.org

On Tue, Jul 19 2022, Stefan Monnier wrote:
>> PS. Actually, compare-strings/ignore_case is broken because it does,
>> essentially, upcase both arguments, see
>> https://stackoverflow.com/q/319426/850781
>
> Hmm... `string-collate-equalp`?

It would be nice if the node in the elisp manual on "comparison of
characters and strings" included some discussion of which use cases
involving case-folding can or should preferably be covered by the
locale-dependent function string-collate-equalp rather than by
something like compare-strings.

In my narrow world, I can think of two extremes:

- bibtex-mode needs to compare BibTeX keywords, which are ASCII
  strings for which case is insignificant.  So bibtex-string= is
  exactly what Sam suggests to put into subr.el, and I believe that's
  good enough (just as almost any other approach I can think of for
  this particular problem).

- BBDB needs to know whether a name is already present in the database
  or not, ignoring case.  The function bbdb-string= is again what Sam
  suggests to put into subr.el.  The function string-collate-equalp
  might be better suited for this.  But which locale should it use?
  The records in my BBDB cover large parts of the world, and I do not
  even know which locale(s) might work best for each of them, not to
  mention that BBDB needs to loop over all records.  Is there a
  "universal default locale"?
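
The two extremes above can be sketched in Emacs Lisp roughly as follows.
This is a minimal illustration, not the actual definition of bibtex-string=
or bbdb-string=: the helper names my-string=-ignore-case and
my-string-collate=-ignore-case are hypothetical, and the locale
"de_DE.UTF-8" is an arbitrary assumption that may not be installed on a
given system.

```elisp
;;; Sketch only: both helper names are hypothetical, not existing Emacs API.

(defun my-string=-ignore-case (str1 str2)
  "Return t if STR1 and STR2 are equal, ignoring case.
Locale-independent: uses `compare-strings' with IGNORE-CASE non-nil."
  ;; `compare-strings' returns t on equality and an integer otherwise,
  ;; so the result is normalized to t or nil here.
  (eq t (compare-strings str1 0 nil str2 0 nil t)))

(defun my-string-collate=-ignore-case (str1 str2 &optional locale)
  "Return non-nil if STR1 and STR2 collate as equal, ignoring case.
LOCALE is a locale name such as \"en_US.UTF-8\"; nil means the current
locale.  The result depends on the collation support of the system."
  (string-collate-equalp str1 str2 locale t))

;; ASCII keywords, as in the bibtex-mode case:
(my-string=-ignore-case "Article" "ARTICLE")

;; Names, as in the BBDB case.  The locale is an assumption made for
;; illustration; the outcome depends on the locales installed.
(my-string-collate=-ignore-case "Müller" "MÜLLER" "de_DE.UTF-8")
```

The first helper depends only on Emacs's own case conversion, so it behaves
the same everywhere, which is adequate for ASCII keywords.  The second
delegates to the platform's collation rules, so its answer can differ from
one system or locale to the next; which locale to pass for records that
span many countries is exactly the open question raised above.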