From mboxrd@z Thu Jan 1 00:00:00 1970
From: Karl Fogel
Newsgroups: gmane.emacs.devel
Subject: An interesting line-motion bug.
Date: Wed, 16 Nov 2022 23:38:32 -0600
Message-ID: <87cz9mywbr.fsf@red-bean.com>
Reply-To: Karl Fogel
To: Emacs Devel
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
User-Agent: Gnus/5.13 (Gnus v5.13)
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:300006

--=-=-=
Content-Type: text/plain; format=flowed

The attached file shows a reproduction recipe for a line-motion bug
that had been nagging at me for some time. The file also includes
some debugging information and a preliminary diagnosis.

I don't know this area of the code very well. While I can think of
possible ways to fix this, each potential fix I've thought of so far
raises questions that I don't feel secure about answering.

If someone who knows this area well can make The Right Fix quickly,
then great. If not, then I would be happy to keep studying and make a
patch for review.

Best regards,
-Karl

--=-=-=
Content-Type: application/emacs-lisp
Content-Disposition: attachment; filename=line-motion-bug-reproduction.el

;; This is a reproduction recipe for a line-motion bug in Emacs
;; (reproduced with commit 7781121c44736a on 'master' as of 2022-11-16,
;; but the bug has been present for at least months and possibly for
;; about 14 years). After the recipe, I've included some debugging
;; information.
;;
;; When you evaluate the sexp below, it will first put point on the
;; newline at the end of "Now we demonstrate a bug in Emacs" and then
;; run `(next-line)'. Normally that would result in point going to
;; the end of the word "which" on the next line. However, because of
;; the bug -- which is stimulated by the `display' property that we've
;; put on newlines -- point instead goes to a wrong place far away,
;; namely, the "t" of the word "properties" on the line after that.
;;
;; --------------------------
;; Begin reproduction recipe:
;; --------------------------
;;
;; Now we demonstrate a bug in Emacs
;; in a buffer in which
;; some newlines have display properties
;; and others don't
;; and still others also don't.

(let ((line-move-visual nil) ;; necessary to stimulate the bug!
      (start (progn (goto-char (point-min))
                    (search-forward "Begin reproduction recipe:")
                    (search-forward "a bug in Emacs")
                    (point)))
      (end (progn (search-forward "and others don't")
                  (point))))
  (goto-char start)
  (while (search-forward "\n" end t)
    (let ((nl (1- (point))))
      (add-text-properties nl (1+ nl) (list 'display "⏎\n"))))
  (goto-char start)
  (message "Now we will do `(next-line)' from here...")
  (sit-for 1)
  (next-line)
  (message "See where point is now. That just ain't right."))

;; ------------------------
;; End reproduction recipe.
;; ------------------------

;; To debug, I used this `M-x gdb' command:
;;
;;   gdb -i=mi --args src/emacs -q --find-file=README.branch
;;
;; Set a breakpoint on the third line ("width = XFIXNUM") of this part
;; of indent.c:check_display_width():
;;
;;       /* Handle 'display' strings. */
;;       else if (STRINGP (val))
;;         width = XFIXNUM (Fstring_width (val, Qnil, Qnil));
;;
;; When you hit that breakpoint, point will already be on the newline
;; at the end of the line ";; in a buffer in which" above. By then,
;; that newline has a display property that makes it look wide to
;; Emacs -- specifically, one unit wider than a newline normally is.
;; But Emacs never asked whether the display property is on a
;; *newline*. In other words, Emacs identifies this as a wide
;; character and thus mistakenly computes a "width" that actually
;; extends over a vertical (because newline) span.
;;
;; By the way, this means you could have *any* number of intervening
;; short-enough lines and point would still go to the "t" of
;; "properties". Try inserting a bunch of lines like this into the
;; recipe (strip off the outer comment and indentation of course):
;;
;;   ;; in a buffer in which
;;   ;; a short line
;;   ;; a short line
;;   ;; a short line
;;   ;; a short line
;;   ;; a short line
;;   ;; a short line
;;   ;; a short line
;;   ;; some newlines have display properties
;;
;; If you re-evaluate the sexp, point will still jump all the way down
;; to the "t" of "properties".
;;
;; Some more context: higher up in the call stack, we're here at this
;; time in indent.c:scan_for_column():
;;
;;       int width = check_display_width (scan, col, &endp);
;;       if (width >= 0)
;;         {
;;           col += width;
;;           if (endp > scan) /* Avoid infinite loops with 0-width overlays. */
;;             {
;;               scan = endp;
;;               scan_byte = CHAR_TO_BYTE (scan);
;;               continue;
;;             }
;;         }
;;
;; So width gets set to 1, which causes us to enter the outer `if',
;; where we will find that endp > scan, because check_display_width()
;; has just set endp to the position at the *start* of the line
;; ";; some newlines have display properties". But endp should have
;; been left on the newline at the end of ";; in a buffer in which".
;; We've just skipped past the newline that was supposed to be the
;; limit for scan_for_column(), and we will now 'continue' back up to
;; the start of the top loop...
;;
;;   /* Scan forward to the target position. */
;;   while (scan < end)
;;
;; ...without ever checking any of the endloop conditions that happen
;; a bit farther down, after 'c = FETCH_BYTE (scan_byte);'.
;;
;; Commit 80e3db569f72e628b8fc999d39833dd4fdfca8d1 was where this
;; check was originally added.

--=-=-=--
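
The control flow described in the attachment can be shown with a small stand-alone C model. This is only a sketch under stated assumptions, not Emacs source: `toy_buffer`, `toy_scan_for_column`, `toy_demo`, and the `guard` flag are hypothetical names invented for illustration. A `display` string is reduced to a per-character width, and "scan = endp" is reduced to stepping one position forward. The guard is there only to contrast the two behaviors; it is not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in Emacs.  A buffer is a
   string plus a per-character display width, where -1 means "no
   `display' string on this character".  */
typedef struct
{
  const char *text;
  const int *disp_width;
  size_t len;
} toy_buffer;

/* Walk forward from SCAN until column GOAL is reached or the line
   ends, and return the position where the walk stops.  With GUARD
   zero, a display width on a newline is consumed like any wide
   character and the loop continues before the newline test, as in
   the report.  With GUARD nonzero, a hypothetical extra check stops
   at such a newline.  */
size_t
toy_scan_for_column (const toy_buffer *buf, size_t scan, int goal, int guard)
{
  int col = 0;

  assert (buf != NULL && scan <= buf->len);
  while (scan < buf->len)
    {
      if (col >= goal)
        break;

      int width = buf->disp_width[scan];
      if (width >= 0)
        {
          if (guard && buf->text[scan] == '\n')
            break;
          col += width;
          scan++;       /* Plays the role of "scan = endp".  */
          continue;     /* Jumps over the newline test below.  */
        }

      if (buf->text[scan] == '\n')
        break;
      col++;
      scan++;
    }
  return scan;
}

/* Three lines, "ab", "cd", "efghij"; the first two newlines carry a
   display width of 1.  Start on "cd" (position 3) with goal column 5.  */
size_t
toy_demo (int guard)
{
  static const char text[] = "ab\ncd\nefghij\n";
  static const int disp_width[] =
    { -1, -1, 1, -1, -1, 1, -1, -1, -1, -1, -1, -1, -1 };
  toy_buffer buf = { text, disp_width, sizeof text - 1 };

  return toy_scan_for_column (&buf, 3, 5, guard);
}
```

In this model `toy_demo (0)` returns 8, a position on the third line, while `toy_demo (1)` returns 5, the newline that ends "cd". Whether a check of that kind belongs in check_display_width(), in scan_for_column(), or somewhere else is the open question raised in the message.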