From: Karl Fogel
Newsgroups: gmane.emacs.devel
Subject: Re: An interesting line-motion bug.
Date: Thu, 17 Nov 2022 14:58:02 -0600
Message-ID: <877cztqox1.fsf@red-bean.com>
References: <87cz9mywbr.fsf@red-bean.com> <831qq1epk7.fsf@gnu.org>
Reply-To: Karl Fogel
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
In-Reply-To: <831qq1epk7.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 17 Nov 2022 14:24:56 +0200")

On 17 Nov 2022, Eli Zaretskii wrote:
>> From: Karl Fogel
>> Date: Wed, 16 Nov 2022 23:38:32 -0600
>>
>> ;; When you hit that breakpoint, point will already be on the newline
>> ;; at the end of the line ";; in a buffer in which" above.  By then,
>> ;; that newline has a display property that makes it look wide to
>> ;; Emacs -- specifically, one unit wider than a newline normally is.
>> ;; But Emacs never asked whether the display property is on a
>> ;; *newline*.  In other words, Emacs identifies this as a wide
>> ;; character and thus mistakenly computes a "width" that actually
>> ;; extends over a vertical (because newline) span.
>
> Why is it important that the display property "covers" a newline?  A
> display property whose value is a string completely replaces the text
> that it "covers", and that includes the newline in the buffer text.

I don't think I said "cover" anywhere -- maybe you're referring to
where I said "extends over"?  But I think I basically understand
your question, so I'll try to answer it:

From a user's perspective, a display property is not present in
the same way a buffer character is present.  For example, after
running my reproduction sexp, if you then go to the beginning of
the buffer and do an isearch for "⏎" [1], the search won't match
the display properties on the newlines, but it *will* match that
character in the lisp code of my recipe.  To the user, a display
property "extends over" some buffer text but does not really
replace that text.

Now, once you get down into get_char_property_and_overlay() as
called from indent.c:check_display_width(), and from that call
land in the "/* Handle 'display' strings.  */" case later in
check_display_width(), it is reasonable to say that the display
string functionally replaces the buffer text.  So where I wrote
"Emacs never asked whether the display property is on a newline"
above, I could instead have written "Emacs never asked whether
the display property includes a newline", and perhaps that latter
phrasing would have seemed more natural to you.

> What _is_ important is that the display property itself includes a
> newline, which effectively resets the column to zero.

Yes.
Unfortunately, the fact that the caller might care about that
column reset is not known to check_display_width(), and the
latter's current calling discipline offers no way to return that
information.

> Unfortunately, I don't know how to fix this basic issue.  The code
> which does that is AFAIU unable to grasp the situation where moving
> across text _decreases_ the column.

Well, instead of asking whether moving across text could decrease
the column, another way to look at it is this:

check_display_width() only has one caller, scan_for_column(), and
scan_for_column's promise is:

"Scanning from the beginning of the current line, stop at the
buffer position ENDPOS or at the column GOALCOL or at the end of
line, whichever comes first."

The problem we're having is that it's failing to detect that third
possibility: "or at the end of line".

A solution could be to update check_display_width() to return a
bit more information than it currently returns.  For example, it
could return a special flag value to indicate that there is a
newline *somewhere* in a display property at POS, or it could even
take a new 'ptrdiff_t *' parameter and set the value of that
parameter to the actual position of the newline within the display
string (or else to NULL if no newline).

> If someone has ideas how to support such situations, please speak up.
> And keep in mind that this code is used much more than for C-n and
> C-p.

*nod*  Okay, so my worst fear is confirmed -- if you think this is
hard too, then it's not just my ignorance making it hard :-).

But I have outlined a possible solution above.  If it doesn't seem
obviously wrong to you, I'd be happy to make (and of course test)
a patch for review.

Best regards,
-Karl

[1] Just do `(insert 9166)' if an MTA or MUA somewhere along the
way messed things up such that the desired character is not
present for you in my post.
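
To make the out-parameter idea above a little more concrete, here is
a toy, standalone sketch.  It is not Emacs code and does not use the
real signatures from indent.c: toy_display_width(),
toy_scan_for_column(), struct toy_char and the TOY_STOP_* values are
all hypothetical names invented for illustration.  It also assumes
one column per byte (no tabs, no multibyte text) and uses -1, rather
than NULL, to mean "no newline in the display string".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the width check.  Return the number of
   columns DISPLAY_STRING occupies up to its first newline (or its
   whole length if it has none).  Store the offset of that newline
   in *NEWLINE_POS, or -1 if there is no newline.  */
static ptrdiff_t
toy_display_width (const char *display_string, ptrdiff_t *newline_pos)
{
  assert (display_string != NULL && newline_pos != NULL);
  const char *nl = strchr (display_string, '\n');
  if (nl == NULL)
    {
      *newline_pos = -1;
      return (ptrdiff_t) strlen (display_string);
    }
  *newline_pos = nl - display_string;
  return *newline_pos;
}

/* One buffer character plus an optional display string (NULL if the
   character has no display property).  */
struct toy_char
{
  char c;
  const char *display;
};

enum toy_stop { TOY_STOP_ENDPOS, TOY_STOP_GOALCOL, TOY_STOP_EOL };

/* Scan LINE from position 0 and stop at ENDPOS, at column GOALCOL,
   or at end of line, whichever comes first.  A newline inside a
   display string counts as end of line, which is the case the
   out-parameter exists to report.  */
static enum toy_stop
toy_scan_for_column (const struct toy_char *line, ptrdiff_t endpos,
                     ptrdiff_t goalcol, ptrdiff_t *pos_out,
                     ptrdiff_t *col_out)
{
  ptrdiff_t pos = 0, col = 0;
  enum toy_stop why = TOY_STOP_ENDPOS;

  while (pos < endpos)
    {
      if (col >= goalcol)
        {
          why = TOY_STOP_GOALCOL;
          break;
        }
      if (line[pos].display != NULL)
        {
          ptrdiff_t nl;
          col += toy_display_width (line[pos].display, &nl);
          if (nl >= 0)
            {
              /* The display string ends the visual line here.  */
              why = TOY_STOP_EOL;
              break;
            }
          pos++;
          continue;
        }
      if (line[pos].c == '\n')
        {
          why = TOY_STOP_EOL;
          break;
        }
      col++;
      pos++;
    }

  *pos_out = pos;
  *col_out = col;
  return why;
}
```

In this model, a buffer newline carrying the display string "X\n"
makes the scan stop with TOY_STOP_EOL at that newline's position
instead of counting on into the next line.  Whether the real
scan_for_column() can consume such a value as easily is exactly the
open question.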