From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.bugs Subject: bug#58168: string-lessp glitches and inconsistencies Date: Tue, 04 Oct 2022 19:24:52 +0300 Message-ID: <83k05fv9nv.fsf@gnu.org> References: <7824372D-8002-4639-8AEE-E80A6D5FEFC6@gmail.com> <877d1l55rn.fsf@gnus.org> <469814C2-197A-4BCA-8E2A-245577340C1E@gmail.com> <878rlzj1zv.fsf@gnus.org> <878rlzfylg.fsf@gnus.org> <017DAAA2-0383-4B47-855E-28348B2E9F06@gmail.com> <831qrnx1jc.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="29958"; mail-complaints-to="usenet@ciao.gmane.io" Cc: 58168@debbugs.gnu.org, larsi@gnus.org To: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Tue Oct 04 18:55:12 2022 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oflCJ-0007ZO-Pa for geb-bug-gnu-emacs@m.gmane-mx.org; Tue, 04 Oct 2022 18:55:12 +0200 Original-Received: from localhost ([::1]:54636 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oflCI-0001DR-S4 for geb-bug-gnu-emacs@m.gmane-mx.org; Tue, 04 Oct 2022 12:55:10 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:51624) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ofkk7-0000Eb-4B for bug-gnu-emacs@gnu.org; Tue, 04 Oct 2022 12:26:03 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]:55925) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1ofkk6-0008NS-7F for bug-gnu-emacs@gnu.org; Tue, 04 Oct 2022 12:26:02 -0400 Original-Received: 
from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1ofkk6-0002yR-2K for bug-gnu-emacs@gnu.org; Tue, 04 Oct 2022 12:26:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 04 Oct 2022 16:26:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 58168 X-GNU-PR-Package: emacs Original-Received: via spool by 58168-submit@debbugs.gnu.org id=B58168.166490070611349 (code B ref 58168); Tue, 04 Oct 2022 16:26:02 +0000 Original-Received: (at 58168) by debbugs.gnu.org; 4 Oct 2022 16:25:06 +0000 Original-Received: from localhost ([127.0.0.1]:54996 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ofkjB-0002wz-Ou for submit@debbugs.gnu.org; Tue, 04 Oct 2022 12:25:06 -0400 Original-Received: from eggs.gnu.org ([209.51.188.92]:42922) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ofkj8-0002wM-2f for 58168@debbugs.gnu.org; Tue, 04 Oct 2022 12:25:04 -0400 Original-Received: from fencepost.gnu.org ([2001:470:142:3::e]:39992) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ofkj2-0006wM-Oz; Tue, 04 Oct 2022 12:24:56 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From: Date; bh=+5tCHjlRgYMn8KWoTdfIK0npJS8xj3RNXMkDVpZwTmc=; b=jUbxsV1JyYc22/6OQt4o PLoRXo+rY47H0KLyowGJBLkTI++LxmO8awAV0v4eZ8fTZqRc81OGVkJ9SIhfZ8KI5A5PYOzIW6jc7 /2Q0oOvsKg0cLSdfpBWEKSceSNOzWUJX01vdtBRa+OT4qeLpObsdOy0QYbHptEGYXgk9ZrToYXCZv 4qTjdd/GT1mbc68HA4HIpLnW0TVdQIRZbWE7css7XhPt2HyxXQk1R+EfskwrCPEN8pO1AAWlMGI3r cuC59KIAs/vxpJ+TrNUemzh2BwhJM1/1xoF/sq9UyKMiGvJUD87mJfSDreFIGShKKJYvWZz51NMfx MvPyqxw1ZRb1aw==; Original-Received: from [87.69.77.57] (port=3602 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa 
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ofkj2-0008P9-1I; Tue, 04 Oct 2022 12:24:56 -0400 In-Reply-To: (message from Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= on Tue, 4 Oct 2022 16:44:17 +0200) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:244449 Archived-At:

> From: Mattias Engdegård
> Date: Tue, 4 Oct 2022 16:44:17 +0200
> Cc: larsi@gnus.org,
>  58168@debbugs.gnu.org
>
> On 4 Oct 2022, at 13:37, Eli Zaretskii wrote:
>
> > First I needed to fix fallout from making STRING_CHAR intolerant of
> > unibyte text, because redisplay-testsuite caused assertion violations
> > in string_char_and_length.
>
> Good catch! Just to satisfy my curiosity:
>
> >        error ("Invalid format operation %%%c",
> > -             STRING_CHAR ((unsigned char *) format - 1));
> > +             multibyte_format
> > +             ? STRING_CHAR ((unsigned char *) format - 1)
> > +             : *((unsigned char *) format - 1));
>
> This treats unibyte format strings as if they were Latin-1 for the purpose of the error message.

No, it doesn't.  It shows the problematic characters as raw bytes, as
in "%\200" (where \200 is a single character).  If you see something
different, please show the recipe.

> Not very important, of course, but maybe there should be a UNIBYTE_TO_CHAR in the alternative branch?

No, that would show the multibyte codepoint, and will confuse users,
because the result would look very different from the problematic
format spec in this case.

> > (Doesn't it abort for you? or do you not
> > build Emacs with --enable-checking?)
>
> Oh I certainly do that occasionally, but it's mostly when I've changed something at the C level or have reason to believe that something is broken there.

Please _always_ test changes related to encoding/decoding and
character representation conversions in a --enable-checking build.  We
should have discovered these bugs in time for Emacs 28.2 to be devoid
of them.

> > I could understand why you'd want to _add_ the larger values, but why
> > replace?
>
> Because it seemed pretty clear that the old code intended to use #x3ffffc for testing display of raw bytes but a typo turned it into #x3fffc instead which isn't a raw byte but a multibyte character. That it's an easy mistake to make (done so several times myself).

Who said anything about #x3fffc?  The original code had #xfc, the
unibyte code for #x3ffffc.  I don't see why we shouldn't test both.

In the other problematic hunk you replaced \777774 with \374 -- why?

> I've now pushed the patch; the code can be improved further if necessary.

I've reverted it.  Please stop this madness of rushing into installing
changes that are still under controversy.
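
For reference, the numeric relationship between the values mentioned above (#xfc, #x3ffffc, #x3fffc, \374, \777774) can be sketched in plain C. This is only an illustration, assuming the documented Emacs convention that raw bytes 0x80..0xFF occupy character codes 0x3FFF80..0x3FFFFF; the helper names below are hypothetical stand-ins and are not copied from the Emacs sources.

```c
/* Hypothetical helpers; they only mirror the documented layout of the
   Emacs character space and are not the real Emacs macros.  */

/* Assumption: a raw byte B is stored as character code 0x3FFF00 + B.  */
enum { RAW_BYTE_OFFSET = 0x3FFF00,
       RAW_BYTE_FIRST  = 0x3FFF80,
       RAW_BYTE_LAST   = 0x3FFFFF };

/* Map a unibyte raw byte (0x80..0xFF) to its multibyte character code.  */
static int
byte8_to_char (unsigned char byte)
{
  return byte + RAW_BYTE_OFFSET;
}

/* Return nonzero if character code C stands for a raw eight-bit byte.  */
static int
char_is_byte8 (int c)
{
  return c >= RAW_BYTE_FIRST && c <= RAW_BYTE_LAST;
}

/* Map a raw-byte character code back to the byte it stands for.
   The caller is expected to check char_is_byte8 first.  */
static unsigned char
char_to_byte8 (int c)
{
  return (unsigned char) (c - RAW_BYTE_OFFSET);
}
```

Under that assumption, byte8_to_char (0xFC) gives 0x3FFFFC, which lies in the raw-byte range, while 0x3FFFC does not. As plain arithmetic, octal 374 equals 0xFC and octal 777774 equals 0x3FFFC.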