From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.ciao.gmane.io!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.bugs Subject: bug#41520: 28.0.50; Crash in character.h due to assertion error Date: Tue, 26 May 2020 19:17:15 +0300 Message-ID: <831rn6vfg4.fsf@gnu.org> References: <83imgkw11q.fsf@gnu.org> <837dx0vysk.fsf@gnu.org> <83tv04uhca.fsf@gnu.org> <83pnarvmm2.fsf@gnu.org> Injection-Info: ciao.gmane.io; posting-host="ciao.gmane.io:159.69.161.202"; logging-data="55478"; mail-complaints-to="usenet@ciao.gmane.io" Cc: 41520@debbugs.gnu.org, stefan@marxist.se To: Pip Cet Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Tue May 26 18:18:10 2020 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1jdcHK-000EJP-E3 for geb-bug-gnu-emacs@m.gmane-mx.org; Tue, 26 May 2020 18:18:10 +0200 Original-Received: from localhost ([::1]:40644 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jdcHJ-0003p5-4G for geb-bug-gnu-emacs@m.gmane-mx.org; Tue, 26 May 2020 12:18:09 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:53884) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jdcHC-0003oj-LJ for bug-gnu-emacs@gnu.org; Tue, 26 May 2020 12:18:02 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]:34704) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jdcHC-0007VF-C3 for bug-gnu-emacs@gnu.org; Tue, 26 May 2020 12:18:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1jdcHC-0006g5-8K for bug-gnu-emacs@gnu.org; Tue, 26 May 2020 12:18:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 26 May 2020 16:18:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 41520 X-GNU-PR-Package: emacs Original-Received: via spool by 41520-submit@debbugs.gnu.org id=B41520.159050986125640 (code B ref 41520); Tue, 26 May 2020 16:18:02 +0000 Original-Received: (at 41520) by debbugs.gnu.org; 26 May 2020 16:17:41 +0000 Original-Received: from localhost ([127.0.0.1]:46250 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jdcGq-0006fU-M9 for submit@debbugs.gnu.org; Tue, 26 May 2020 12:17:41 -0400 Original-Received: from eggs.gnu.org ([209.51.188.92]:46416) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jdcGo-0006fH-G5 for 41520@debbugs.gnu.org; Tue, 26 May 2020 12:17:38 -0400 Original-Received: from fencepost.gnu.org ([2001:470:142:3::e]:47257) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jdcGi-0007Rl-Ub; Tue, 26 May 2020 12:17:32 -0400 Original-Received: from [176.228.60.248] (port=2047 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jdcGh-0001F4-Lp; Tue, 26 May 2020 12:17:32 -0400 In-Reply-To: (message from Pip Cet on Mon, 25 May 2020 20:39:06 +0000) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:181049 Archived-At: > From: Pip Cet > Date: Mon, 25 May 2020 20:39:06 +0000 > Cc: stefan@marxist.se, 41520@debbugs.gnu.org > > The plan is to introduce additional struct-valued macros for things > like PT/PT_BYTE: > > #define PT_POS POS (PT, PT_BYTE) > > In particular, it's not an lvalue. That's important to me, since > assigning to PT_POS would be a severe bug. So all the places where we now access PT and PT_BYTE separately will now dereference a struct? (Btw, PT and PT_BYTE are already non-lvalues, so we have this covered.) > > Like GPT, for example? > > That's difficult. GPT is, of course, very special. How so? It's just like any other buffer position. > > What about BEGV and ZV? > > BEG, BEGV, ZV, and Z would all have _POS equivalents, and very often > using them results in more readable code. > > > IOW, I don't understand the goal here. > > There are multiple goals: I think this significantly aids readability, > and I think there might still be some minor bugs to catch, and future > bugs to avoid. > > For debug builds only, it might make sense to include the object that > the bytepos-charpos relation is valid for, to catch cases where one > object's correspondence is used for another object. > > > I think I did understand when we were talking about accessing characters by buffer positions, and > > the bugs related to incorrect usage there, but now it sounds like the plot thickens? > > I hope that most code will follow a basic structure of being passed a > Lisp_Object or two (charpos/marker and object), converting that to a > pos_t, handing that to internal APIs, potentially receiving a pos_t > back and converting it back to a Lisp_Object, with only a few lines of > code deep down the call stacks actually unwrapping the pos_t and > manipulating it directly. That means there are a few more cases than > accessing buffer text: comparing two positions, for example, walking a > buffer or string by incrementing or decrementing them, adding two > positions or subtracting them. > > (It's true that all kinds of crazy experiments would be easier with > code that follows this structure, but that's a side effect: the goal > really is to increase readability a little in a lot of places.) I cannot judge readability: I'm too accustomed to the current variables. But it's clear that this will hurt speed, and in the innermost loops of our code. Having to maintain 2 values, recompute one from another, and move them into and out of a structure each adds overhead, some small, some large. They will add up. I don't think I see how we can justify that, as the current code is not horribly unreadable. Let's see what others think. > ch = bidi_fetch_char (cpos += nc, bpos += clen, &disp_pos, &dpp, &bs, > bidi_it->w, fwp, &clen, &nc) > > "nc" and "clen" belong together, and so do cpos and bpos. I find the > names don't make that very obvious, and simply reducing the number of > arguments bidi_fetch_char takes by two helps a little. We can use more descriptive names, that's easy and has zero overhead. Converting all of our sources to using a struct as positions is something different.