From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Letter-case conversions in network protocols
Date: Sat, 08 May 2021 12:50:11 +0300
Message-ID: <83h7jda71o.fsf@gnu.org>
To: emacs-devel@gnu.org
Cc: Fatih Aydin

For the immediate trigger, see bug#44604.

The general problem that I'd like us to discuss is how to deal with
letter-case conversions in code that implements protocols, such as
network-related protocols, that need to recognize certain keywords.

The problem here is that when Emacs starts in certain locales, or
switches to the corresponding language-environments, we modify the
case tables to comply with the rules of those locales.  An example
(though not the only one) is the Turkish locale; see
turkish-case-conversion-enable.  As a result of calling that,
downcasing 'I' no longer produces 'i', and code which attempts to
match keywords that include 'i' case-insensitively fails.

Since we generally use the same text-search and matching APIs both for
implementing the keyword-based protocols and for more general
processing of human-readable code, there are no easy solutions when we
need to ignore language-specific case-conversion rules in some of the
code.  For example, let-binding case-table cannot be done at too high
a level, because it will then affect any text processing below that
level, and a high-level function has no way of knowing what kind of
text processing will be needed by the code it calls, directly or
indirectly.

So what would be the best/easiest solution to this class of problems?

An immediate, but not necessarily easy, candidate is to use

  (with-case-table ascii-case-table

everywhere we use text-search facilities for keyword processing.
However, this means we will have to go over all the places which do
this and manually change the code there, and so will the developers of
any 3rd-party packages.

Are there better solutions?  Ideas are welcome.
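
To make the with-case-table candidate concrete, here is a minimal
sketch of how a single keyword comparison might be wrapped.  The
helper name my-proto-keyword-equal-p and the sample keywords are
hypothetical, chosen only for illustration; with-case-table,
ascii-case-table, case-fold-search and string-match-p are the existing
Emacs facilities.  This is one possible shape, not a proposal for the
actual fix.

```elisp
;; Hypothetical helper, not part of Emacs: compare a protocol keyword
;; case-insensitively using ASCII case rules only.
(defun my-proto-keyword-equal-p (keyword string)
  "Return non-nil if STRING equals KEYWORD, ignoring ASCII letter-case.
The match runs under `ascii-case-table', so language-specific rules
installed in the current buffer's case table are not consulted."
  (with-case-table ascii-case-table
    (let ((case-fold-search t))
      ;; Anchor at both ends so only a whole-string match counts.
      (and (string-match-p (concat "\\`" (regexp-quote keyword) "\\'")
                           string)
           t))))

;; Illustration only; the keywords are arbitrary examples.
(my-proto-keyword-equal-p "content-id" "CONTENT-ID")
```

Because the binding is confined to the body of the helper, text
processing done by callers or by unrelated code keeps using whatever
case table the language environment installed, which is the property
the higher-level let-binding lacks.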