From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Alan Mackenzie Newsgroups: gmane.emacs.devel Subject: Re: "... the window start at a meaningless point within a line." Date: Fri, 16 Oct 2015 09:55:35 +0000 Message-ID: <20151016095535.GA2779@acm.fritz.box> References: <20150930204513.GA3174@acm.fritz.box> <83mvw39dp3.fsf@gnu.org> <20151001094138.GA2515@acm.fritz.box> <83h9maao7w.fsf@gnu.org> <20151001110204.GB2515@acm.fritz.box> <83egheaj9e.fsf@gnu.org> <20151015181642.GA6467@acm.fritz.box> <834mhr99e8.fsf@gnu.org> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1444989288 28587 80.91.229.3 (16 Oct 2015 09:54:48 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Fri, 16 Oct 2015 09:54:48 +0000 (UTC) Cc: emacs-devel@gnu.org To: Eli Zaretskii Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Fri Oct 16 11:54:39 2015 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1Zn1iZ-0006vv-Rb for ged-emacs-devel@m.gmane.org; Fri, 16 Oct 2015 11:54:32 +0200 Original-Received: from localhost ([::1]:52306 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zn1iU-0008EU-4G for ged-emacs-devel@m.gmane.org; Fri, 16 Oct 2015 05:54:26 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:48174) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zn1i8-00089S-GB for emacs-devel@gnu.org; Fri, 16 Oct 2015 05:54:06 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Zn1i5-0002Jo-6d for emacs-devel@gnu.org; Fri, 16 Oct 2015 05:54:04 -0400 Original-Received: from mail.muc.de ([193.149.48.3]:13795) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zn1i4-0002Ix-T7 for emacs-devel@gnu.org; Fri, 16 Oct 2015 05:54:01 -0400 Original-Received: (qmail 30660 invoked by uid 3782); 16 Oct 2015 09:53:58 -0000 Original-Received: from acm.muc.de (p5B14790D.dip0.t-ipconnect.de [91.20.121.13]) by colin.muc.de (tmda-ofmipd) with ESMTP; Fri, 16 Oct 2015 11:53:58 +0200 Original-Received: (qmail 2823 invoked by uid 1000); 16 Oct 2015 09:55:35 -0000 Content-Disposition: inline In-Reply-To: <834mhr99e8.fsf@gnu.org> User-Agent: Mutt/1.5.23 (2014-03-12) X-Delivery-Agent: TMDA/1.1.12 (Macallan) X-Primary-Address: acm@muc.de X-detected-operating-system: by eggs.gnu.org: FreeBSD 9.x X-Received-From: 193.149.48.3 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:191733 Archived-At: Hello, Eli. Thanks for such a rapid reply. On Thu, Oct 15, 2015 at 11:14:39PM +0300, Eli Zaretskii wrote: > > Date: Thu, 15 Oct 2015 18:16:43 +0000 > > Cc: emacs-devel@gnu.org > > From: Alan Mackenzie > > I suspect I still have to modify one or two other primitives, such > > as `compute-motion'. > Why do you need to touch compute-motion? I don't think it's used in > your scenarios. Because it deals with screen lines and columns. But looking more closely at it, it doesn't use struct it's, or anything like that, so perhaps I won't need to do anything to it. > > I have a working proof of concept in a patch below. It is not ready for > > committing for several reasons. One is that I haven't fully grasped what > > SAVE_IT and RESTORE_IT do with the third parameter (bidi data, I think), > > so I've just used SAVE_IT everywhere without much thought. > That argument is where the data is saved and restored. OK. > I'd prefer these [SAVE_IT and RESTORE_IT] to stay private to xdisp.c. > If you really need them elsewhere (I don't think so), there are > alternative solutions. I think most of the time, I just need to assign one struct it to another. > > +/* Get the end of the text line on which POS is. */ > > +static ptrdiff_t > > +pos_line_end_position (struct text_pos *pos) > > +{ > > + ptrdiff_t pt = PT, pt_byte = PT_BYTE; > > + Lisp_Object eol; > > + > > + SET_PT_BOTH (pos->charpos, pos->bytepos); > > + eol = Fline_end_position (Qnil); > > + SET_PT_BOTH (pt, pt_byte); > > + return XINT (eol); > > +} > There are much easier ways to get the end of the line: e.g., scan from > POS for a newline. :-) OK, I'll get rid of that. > > + if (beginning < L0_phys_EOL > > + && end >= ws) > Sorry, but this is a non-starter: you cannot compare character > positions with < and >, assuming characters are displayed on a line in > strict logical order. These comparisons will completely break in > bidirectional text, where character positions change non-linearly with > screen coordinates. This just checks that there is some overlap between (beginning end) and the first text line. > > + if (IT_CHARPOS (*it) < ws) > > + { > > + below = false; /* exact_it is already past `it'. */ > > + SAVE_IT (post_exact_it, exact_it, cache); > > + } > Same here. > > + while (1) > > + { > > + if (IT_CHARPOS (exact_it) < IT_CHARPOS (*it)) > > + SAVE_IT (pre_exact_it, exact_it, cache); > > + if (!forward_to_next_display_line_start (&exact_it)) > > + break; /* protection against infinite looping. */ > > + if (below > > + && IT_CHARPOS (exact_it) >= IT_CHARPOS (*it)) > > + { > > + below = false; > > + SAVE_IT (post_exact_it, exact_it, cache); > > + } > > + if (IT_CHARPOS (exact_it) > end) > > + break; > And here. And everywhere else. > > + if (IT_CHARPOS (exact_it) == IT_CHARPOS (xdisp_it)) > Comparisons with == are OK. There must be some bidi compatible order operators which express "earlier than" etc. I will be needing these. > I also suspect you don't need all those calls to SAVE_IT all over the > place. In any case, you are leaking memory here, because there's not > a single RESTORE_IT to free the memory that SAVE_IT allocates. As I said, straight assignments should do most of the time. But at the very end, when I actually change `it', I think some adjustment of the bidi data will be needed, too. > Can you describe the algorithm of this function in English? Then > perhaps I could help you rewrite it in bidi-aware way. OK, here goes! It's a bit long winded, but that's because the function is little more than an assemblage of special cases. What maybe_move_to_exact_bol does --------------------------------- Normally, when set-window-start is called, Emacs adjusts the position of the window start to the nearest start of display line, calculated from a previous BOL. When `et-window-start's EXACT-START argument is non-nil, the window start is at the exact position supplied. In this situation, the BO(display)Ls are in a different place than where xdisp.c expects them to be. We refer to these BOLs as "exact BOL"s, and "xdisp BOL"s respectively. In the following diagrams, these symbols represent various BOLs: (i) WS is the (exact) window start. (ii) L2, L3, .... are the subsequent exact BOLs on the first text line. (iii) B is the xdisp BOL before WS. (iv) A is the first xdisp BOL after WS. A2, A3, are subsequent xdisp BOLs. (v) C is the first BOL where exact and xdisp BOLs coincide (usually after a LF). C2, C3, ... are subsequent BOLs. They are also called "common" BOLs. So a typical layout of BOLs at the top of the screen would look something like this: B-----------------WS---A-------------L2---A2-------------\nC--------C2--- Here "xdisp_WS", the place where xdisp would have chosen window start, is at A. By contrast, in the following, xdisp_WS would be at B: B---WS-----------------A---L2-------------A2---L3---------\nC-------C2--- When Fvertical_motion runs, it first sets its struct it "it" to a BOL, then possibly moves it to the appropriate column. Between these two actions, maybe_move_to_exact_bol reseats `it' from an xdisp BOL to an exact BOL, or sometimes from a common BOL to an earlier or later BOL. In maybe_move_to_exact_bol, `beginning' and `end' are PT and `it', in the canonical order. Note that `it' will always be at an xdisp (or common) BOL. maybe_move_to_exact_bol first checks that there is some overlap between the interval (beginning end) and the first text line on the window. If there isn't, the function does nothing. Otherwise... The function in essence scans the first text line from WS to C, "counting" the two sorts of BOL which are contained in (beginning end). We have two struct it's, exact_it and xdisp_it, which alternately "hop" forwards over eachother. When an xdisp BOL is encountered, excess_xdisp_BOLs is incremented; on an exact BOL, excess_xdisp_BOLs is decremented. Exact and xdisp BOLs alternate strictly, so the value of excess_xdisp_BOLs is always -1, 0, or 1. Note that a BOL at `beginning' isn't "counted"; one at `end' is. This is due to the way Fvertical_motion first moves `it' to the beginning of the current line before moving it by nlines. It turns out that when nlines > 0, excess_xdisp_BOLs will end up 0 or 1 (since the final change to the variable, at `it', is an increment). When nlines <= 0, the variable will end up -1 or 0 (since the first change, at the first exact BOL after `it', is a decrement). At the end of the function, when excess_xdisp_BOLs is -1 or +1, we move `it' to the exact BOL _after_ `it'. When it's 0, we move `it' to the exact BOL _before_ `it' - that's when `it' is on the first text line. We keep track of these positions in pre_exact_it and post_exact_it, these variables being moved forwards at each exact hop. (They are also incremented when the final xdisp hop reaches C.) When `it' is on a subsequent (common) BOL, and excess_xdisp_BOLs != 0, we advance `it' by one BOL. There is also a specific circumstance when we must move `it' to the previous (common) BOL (see example 5 below). Further notation: (i) ^ represents PT (point) (which Fvertical_motion hasn't yet changed). (ii) "it" is the position of Fvertical_motion's struct it `it'. (iii) T is the "target" - the exact or common BOL we want to end up at. Here is an example, with nlines > 0, and both PT and `it' are on the first text line. The third line in the diagram represents the current value of excess_xdisp_BOLs. It's value is only given where it is changed: nlines = 2. 1. B-----------------WS---A-------------L2---A2-------------\nC--------C2--- ^ it T - 1 0 1 Here, with nlines = 2, Fvertical_motion puts `it' to B, then moves 2 (xdisp) BOLs forward to A2. maybe_move_to_exact_bol starts "counting" _after_ PT, and on reaching A2, excess_xdisp_BOLs has the value 1. So the function moves `it' forward to C, our desired target. In the next example, WS_exact < WS, and nlines < 0: nlines = -1 2. B---WS-----------------A---L2-------------A2---L3---------\nC-------C2--- T it ^ - - - -1 0 Here, with nlines = -1, Fvertical_motion first moves `it' back to A2, then moves one line back to A. maybe_move_to_exact_bol starts "counting" at L2, and finishes at A2, excess_xdisp_BOLs being 0. So we move `it' backwards to WS, our target. Things get a little more complicated when `end' is at or after C, since the common BOLs count as both exact and xdisp BOLs. (Sometimes it is helpful to think of C as an exact BOL and an xdisp BOL in the same place, but in a particular order.) We detect C by comparing exact_it and xdisp_it after every hop. We don't count BOLs beyond C because that would be tiring and unnecessary. Here is an example with nlines > 0, where we go beyond C: nlines = 3 3. B--------------WS---A------------L2---A2-----------\nC--------C2-----C3 ^ it,T - - -1 0 -1,0 Fvertical_motion moves `it' back to A then forward 3 to C2. Our target is also at C2. When xdisp_it reaches C (and detects it by comparing with exact_it), it increments excess_xdisp_BOLs to 0, moves pre_exact_it and post_exact_it forwards to C and C2, and the hopping stops. excess_xdisp_BOLs is 0, but `it' is already beyond C. So we leave `it' where it is, at C2. In the above example, the exact hop arrives at C before the xdisp hop. In the next example, it's the other way around: nlines = 3 4. B---WS---------------A---L2-----------A2---L3--------\nC-------C2------C3 ^ it,T - - - - 1 0 1 Note that when, as here, the exact hop arrives at C second (detecting it), we only decrement excess_xdisp_BOLs when nlines <= 0 (which it isn't here). We do however set put_it_back_a_BOL to true. put_it_back_a_BOL "cancels" excess_xdisp_BOLs's value 1, so we don't move `it' from C2. (Yes, this is an ugly artifice.) The next, and final, example shows why we need the ugly boolean put_it_back_a_BOL: nlines = 2 5. B---WS-------------A---L1------\nC----------C2--- ^ T it - - - -1 0[,-1] In this scenario, Fvertical_motion has advanced `it' to C2, but our target is C. The normal excess_xdisp_BOLs mechanism, when `it' is already beyond C, either advances `it' by one BOL or leaves it alone. Here we need to move `it' one BOL backwards, to C. Hence our boolean variable put_it_back_a_BOL, which will have been set to true when the exact hop arrived at C second. -- Alan Mackenzie (Nuremberg, Germany).