* bug#9771: 24.0.90; Redisplay problems with control characters @ 2011-10-16 22:24 Johan Bockgård 2011-10-17 7:48 ` Eli Zaretskii ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Johan Bockgård @ 2011-10-16 22:24 UTC (permalink / raw) To: 9771 There are several redisplay problems with control characters (GET_FROM_DISPLAY_VECTOR method). 1. Abort in move_it_by_lines (with bidi) emacs -Q C-u 2000 C-q 0 RET M-< M-> Fatal error (6)Aborted (core dumped) (gdb) bt #0 abort () at emacs.c:385 #1 0x0000000000451a48 in move_it_by_lines (it=0x7fffffffc620, dvpos=0) at xdisp.c:8907 #2 0x00000000004c0717 in Frecenter (arg=10) at window.c:5107 #3 0x000000000060ce90 in Ffuncall (nargs=<optimized out>, args=0x7fffffffd460) at eval.c:2974 [...] The following assertion fails: [move_it_by_lines] if (dvpos == 0) { /* DVPOS == 0 means move to the start of the screen line. */ move_it_vertically_backward (it, 0); xassert (it->current_x == 0 && it->hpos == 0); (gdb) p it->current_x $1 = 10 (gdb) p it->hpos $2 = 1 The non-zero values for it->hpos and it->current_x come from move_it_to in this piece of code: [move_it_vertically_backward] /* The above code moves us to some position NLINES down, usually to its first glyph (leftmost in an L2R line), but that's not necessarily the start of the line, under bidi reordering. We want to get to the character position that is immediately after the newline of the previous line. */ if (it->bidi_p && IT_CHARPOS (*it) > BEGV && FETCH_BYTE (IT_BYTEPOS (*it) - 1) != '\n') { EMACS_INT nl_pos = find_next_newline_no_quit (IT_CHARPOS (*it) - 1, -1); move_it_to (it, nl_pos, -1, -1, -1, MOVE_TO_POS); } 2. (Old) problem in BUFFER_POS_REACHED_P The position hpos = 1 above is not just non-zero; it's also in the middle of the ^@ control character (the screen line starts with ^). It's produced by move_it_in_display_line_to: #define BUFFER_POS_REACHED_P() \ [...] && (it->method == GET_FROM_BUFFER \ || (it->method == GET_FROM_DISPLAY_VECTOR \ && it->dpvec + it->current.dpvec_index + 1 >= it->dpend))) According to the condition above, the position in column 0 before the ^ glyph (dpvec_index = 0) is not a possible stop point, but the position between ^ and @ is. Cf. in_display_vector_p: /* Return 1 if IT points into the middle of a display vector. */ in_display_vector_p (struct it *it) { return (it->method == GET_FROM_DISPLAY_VECTOR && it->current.dpvec_index > 0 && it->dpvec + it->current.dpvec_index != it->dpend); } 3. Long lines with display vectors make Emacs really slow (with bidi) emacs -Q C-u 2000 C-q 0 RET M-< Type text... 4. Abort in push_display_prop emacs -Q (setq wrap-prefix "x") C-u 100 C-q 0 RET Fatal error (6)Aborted (core dumped) (gdb) bt #0 abort () at emacs.c:385 #1 0x0000000000477316 in push_display_prop (it=0x7fffffff8cd0, prop=24535585) at xdisp.c:18432 #2 0x0000000000477857 in handle_line_prefix (it=0x7fffffff8cd0) at xdisp.c:18536 #3 0x00000000004782d4 in display_line (it=0x7fffffff8cd0) at xdisp.c:18802 xassert (it->method == GET_FROM_BUFFER || it->method == GET_FROM_STRING); (gdb) p it->method $1 = GET_FROM_DISPLAY_VECTOR ^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#9771: 24.0.90; Redisplay problems with control characters 2011-10-16 22:24 bug#9771: 24.0.90; Redisplay problems with control characters Johan Bockgård @ 2011-10-17 7:48 ` Eli Zaretskii 2011-10-17 14:18 ` Eli Zaretskii 2011-10-17 21:04 ` Johan Bockgård 2011-10-17 17:06 ` Eli Zaretskii 2011-10-17 19:16 ` Eli Zaretskii 2 siblings, 2 replies; 11+ messages in thread From: Eli Zaretskii @ 2011-10-17 7:48 UTC (permalink / raw) To: Johan Bockgård; +Cc: 9771 > From: Johan Bockgård <bojohan@gnu.org> > Date: Mon, 17 Oct 2011 00:24:30 +0200 > > 2. (Old) problem in BUFFER_POS_REACHED_P > > The position hpos = 1 above is not just non-zero; it's also in the > middle of the ^@ control character (the screen line starts with ^). It's > produced by move_it_in_display_line_to: > > #define BUFFER_POS_REACHED_P() \ > [...] > && (it->method == GET_FROM_BUFFER \ > || (it->method == GET_FROM_DISPLAY_VECTOR \ > && it->dpvec + it->current.dpvec_index + 1 >= it->dpend))) > > According to the condition above, the position in column 0 before the > ^ glyph (dpvec_index = 0) is not a possible stop point, but the position > between ^ and @ is. Okay, but what is the practical problem with this? > 3. Long lines with display vectors make Emacs really slow (with bidi) It's not the display vectors in general that cause this. It's specifically the display of control characters. E.g., the following test case, which is a small variation of yours, shows no visible slowdown: emacs -Q M-: (require 'disp-table) RET M-: (aset standard-display-table ?x (vconcat "^A")) RET C-u 2000 x M-< Type text... I'll look into this, but is there some real-life use case behind your recipe? Binary nulls in a file generally cause Emacs to make the buffer unibyte, where bidi reordering is disabled. If some optimizations are in order to speed up redisplay in this situation, it would help to have a real-life use case handy, to make sure the optimizations really make a difference. The other parts of the report are more or less clear. Btw, in the future please make a separate bug report for each issue, it makes references to bugs in log messages more straightforward. Thanks. ^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#9771: 24.0.90; Redisplay problems with control characters 2011-10-17 7:48 ` Eli Zaretskii @ 2011-10-17 14:18 ` Eli Zaretskii 2011-10-17 21:04 ` Johan Bockgård 1 sibling, 0 replies; 11+ messages in thread From: Eli Zaretskii @ 2011-10-17 14:18 UTC (permalink / raw) To: bojohan, 9771 > Date: Mon, 17 Oct 2011 03:48:27 -0400 > From: Eli Zaretskii <eliz@gnu.org> > Cc: 9771@debbugs.gnu.org > > > 3. Long lines with display vectors make Emacs really slow (with bidi) > > It's not the display vectors in general that cause this. It's > specifically the display of control characters. The reason for the slowness is the bidirectional property of the control characters, and a peculiarity of the UBA, the Unicode Bidirectional Algorithms, regarding the processing of sequences of characters with that particular property. In the particular case in point, it causes Emacs to search to the end of the long line, when it is about to display the first null character. So now this question becomes very relevant: > is there some real-life use case behind your recipe? Depending on the answer, it may or may not be a good idea to look for some heuristics for this case before 24.1 is released, because AFAICS the current implementation of the UBA requirements is correct. ^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#9771: 24.0.90; Redisplay problems with control characters 2011-10-17 7:48 ` Eli Zaretskii 2011-10-17 14:18 ` Eli Zaretskii @ 2011-10-17 21:04 ` Johan Bockgård 2011-10-18 12:56 ` Eli Zaretskii 1 sibling, 1 reply; 11+ messages in thread From: Johan Bockgård @ 2011-10-17 21:04 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 9771 Eli Zaretskii <eliz@gnu.org> writes: >> According to the condition above, the position in column 0 before the >> ^ glyph (dpvec_index = 0) is not a possible stop point, but the position >> between ^ and @ is. > > Okay, but what is the practical problem with this? Nothing, really. But it doesn't seem correct. > Binary nulls in a file generally cause Emacs to make the buffer > unibyte, where bidi reordering is disabled. That doesn't seem to be working, then. If I do emacs -Q /usr/bin/emacs the resulting buffer is multibyte. Type M-> and Emacs hangs. (I waited 5 minutes and had to kill the process.) ^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#9771: 24.0.90; Redisplay problems with control characters 2011-10-17 21:04 ` Johan Bockgård @ 2011-10-18 12:56 ` Eli Zaretskii 2011-10-18 16:58 ` Eli Zaretskii 2011-10-19 1:06 ` Johan Bockgård 0 siblings, 2 replies; 11+ messages in thread From: Eli Zaretskii @ 2011-10-18 12:56 UTC (permalink / raw) To: Johan Bockgård, handa; +Cc: 9771 > From: Johan Bockgård <bojohan@gnu.org> > Cc: 9771@debbugs.gnu.org > Date: Mon, 17 Oct 2011 23:04:46 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> According to the condition above, the position in column 0 before the > >> ^ glyph (dpvec_index = 0) is not a possible stop point, but the position > >> between ^ and @ is. > > > > Okay, but what is the practical problem with this? > > Nothing, really. But it doesn't seem correct. I'm not sure. Maybe I'm missing something (move_it_in_display_line_to is one of the trickiest functions in the display engine), so please bear with me as I explain why I think the test is correct. You wrote: > #define BUFFER_POS_REACHED_P() \ > [...] > && (it->method == GET_FROM_BUFFER \ > || (it->method == GET_FROM_DISPLAY_VECTOR \ > && it->dpvec + it->current.dpvec_index + 1 >= it->dpend))) > > According to the condition above, the position in column 0 before the > ^ glyph (dpvec_index = 0) is not a possible stop point, but the position > between ^ and @ is. A display vector has its glyphs stored at indices 0..it->dpend-1. IOW, the last glyph's index is it->dpend-1. BUFFER_POS_REACHED_P is invoked always after a call to get_next_display_element, but before the call to set_iterator_to_next. The former just consumes the glyph at it->current.dpvec_index; the latter advances the index to the next glyph of the display vector. Therefore, after a call to get_next_display_element, the glyph at dpvec_index was already consumed, but the index was not yet advanced to the next position. Thus, when the index is at dpend-1, as the condition in BUFFER_POS_REACHED_P says, we have already consumed the last glyph. Consuming the last glyph of a display vector means that the next call to set_iterator_to_next will detect that the display vector is exhausted, and will advance to the next buffer position, the one after the position which we just passed by consuming all the glyphs from the display vector used to display the character at that position. Thus, "buffer position reached". Does this make sense? In practice, after correcting the bug that caused the assertion violation, I can no longer reproduce the situation where we stop in the middle of the ^@ character. If you can show me a recipe for winding up in the middle of a display vector under these or similar circumstances, I will have another look. > > Binary nulls in a file generally cause Emacs to make the buffer > > unibyte, where bidi reordering is disabled. > > That doesn't seem to be working, then. > > If I do > > emacs -Q /usr/bin/emacs > > the resulting buffer is multibyte. Indeed. However, buffer-file-coding-system is no-conversion, which is confusingly inconsistent with "C-x RET c no-conversion C-x C-f". Perhaps Handa-san could shed some light on this inconsistency. Anyway, this being so, I found a way to optimize a couple of expensive loops inside bidi.c for the important special case of a plain L2R text in the buffer. With these optimizations, the bidi redisplay of the entire window showing a buffer with 2000 binary nulls is only a few milliseconds slower than the unidirectional one. I will install those changes later today, after some more testing. > Type M-> and Emacs hangs. (I waited 5 minutes and had to kill the > process.) It takes 1.5 seconds now (with a 44MB emacs binary ;-) ^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#9771: 24.0.90; Redisplay problems with control characters 2011-10-18 12:56 ` Eli Zaretskii @ 2011-10-18 16:58 ` Eli Zaretskii 2011-10-20 12:43 ` Eli Zaretskii 2011-10-19 1:06 ` Johan Bockgård 1 sibling, 1 reply; 11+ messages in thread From: Eli Zaretskii @ 2011-10-18 16:58 UTC (permalink / raw) To: bojohan; +Cc: 9771 > Date: Tue, 18 Oct 2011 08:56:34 -0400 > From: Eli Zaretskii <eliz@gnu.org> > Cc: 9771@debbugs.gnu.org > > Anyway, this being so, I found a way to optimize a couple of expensive > loops inside bidi.c for the important special case of a plain L2R text > in the buffer. With these optimizations, the bidi redisplay of the > entire window showing a buffer with 2000 binary nulls is only a few > milliseconds slower than the unidirectional one. > > I will install those changes later today, after some more testing. Done (revision 106124 on the trunk). ^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#9771: 24.0.90; Redisplay problems with control characters 2011-10-18 16:58 ` Eli Zaretskii @ 2011-10-20 12:43 ` Eli Zaretskii 0 siblings, 0 replies; 11+ messages in thread From: Eli Zaretskii @ 2011-10-20 12:43 UTC (permalink / raw) To: bojohan, 9771 > Date: Tue, 18 Oct 2011 18:58:41 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: 9771@debbugs.gnu.org > > > Date: Tue, 18 Oct 2011 08:56:34 -0400 > > From: Eli Zaretskii <eliz@gnu.org> > > Cc: 9771@debbugs.gnu.org > > > > Anyway, this being so, I found a way to optimize a couple of expensive > > loops inside bidi.c for the important special case of a plain L2R text > > in the buffer. With these optimizations, the bidi redisplay of the > > entire window showing a buffer with 2000 binary nulls is only a few > > milliseconds slower than the unidirectional one. > > > > I will install those changes later today, after some more testing. > > Done (revision 106124 on the trunk). For the record, revision 106151 speeds up the bidi display in a more general class of use cases. (Previously, appending some character, like `a', to the 2000 nulls in the original example would still slow down redisplay, even after the changes in r106124.) ^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#9771: 24.0.90; Redisplay problems with control characters 2011-10-18 12:56 ` Eli Zaretskii 2011-10-18 16:58 ` Eli Zaretskii @ 2011-10-19 1:06 ` Johan Bockgård 2011-10-19 11:49 ` Eli Zaretskii 1 sibling, 1 reply; 11+ messages in thread From: Johan Bockgård @ 2011-10-19 1:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 9771 Eli Zaretskii <eliz@gnu.org> writes: > Consuming the last glyph of a display vector means that the next call > to set_iterator_to_next will detect that the display vector is > exhausted, and will advance to the next buffer position, the one after > the position which we just passed by consuming all the glyphs from the > display vector used to display the character at that position. Thus, > "buffer position reached". It seems that BUFFER_POS_REACHED_P at least misses the opportunity to stop immediately before *any* glyph in the display vector has been consumed, i.e. when dpvec_index == 0. (Or the problem is not with BUFFER_POS_REACHED_P but with how it's being used.) [...] > In practice, after correcting the bug that caused the assertion > violation, I can no longer reproduce the situation where we stop in > the middle of the ^@ character. If you can show me a recipe for > winding up in the middle of a display vector under these or similar > circumstances, I will have another look. emacs -Q C-u 2000 C-q 0 RET Notice that the lines visible in the window start with "@^". Press <up> until the top of the buffer just scrolls into view. Notice that lines before point start with "^@", and lines after point start with "@^". Change BUFFER_POS_REACHED_P (or something) so that dpvec_index == 0 is an acceptable stop position and this problem doesn't happen. ^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#9771: 24.0.90; Redisplay problems with control characters 2011-10-19 1:06 ` Johan Bockgård @ 2011-10-19 11:49 ` Eli Zaretskii 0 siblings, 0 replies; 11+ messages in thread From: Eli Zaretskii @ 2011-10-19 11:49 UTC (permalink / raw) To: Johan Bockgård; +Cc: 9771-done > From: Johan Bockgård <bojohan@gnu.org> > Cc: handa@m17n.org, 9771@debbugs.gnu.org > Date: Wed, 19 Oct 2011 03:06:09 +0200 > > emacs -Q > > C-u 2000 C-q 0 RET > > Notice that the lines visible in the window start with "@^". > > Press <up> until the top of the buffer just scrolls into view. > > Notice that lines before point start with "^@", and lines after > point start with "@^". Thanks for the test case. > Change BUFFER_POS_REACHED_P (or something) so that dpvec_index == 0 is > an acceptable stop position and this problem doesn't happen. I'm not sure a change in BUFFER_POS_REACHED_P is the right fix for this particular problem. Even if it is, I'm reluctant to make in move_it_in_display_line_to a change that could have effect in many unrelated places, since this function is such a central piece of the display engine. I'd rather make a change that targets this specific problem, certainly while we are in a pretest. I think I found such a change. The problem is that when we start a new redisplay cycle of a window, we begin drawing the window's first line at the last glyph of the display vector used to display the first character position visible in the window. I made a simple change that should fix this (in revision 106131), please see if it resolves the issue. Another problem I found was that one of the redisplay optimizations would redraw too few lines of a window, when the last redrawn line ends in glyphs from a display vector. I fixed that in revision 106133. With all the 4 parts of the original report being addressed now, I'm closing this bug report. If there are any left-overs, please submit separate bug reports about them. Thanks. ^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#9771: 24.0.90; Redisplay problems with control characters 2011-10-16 22:24 bug#9771: 24.0.90; Redisplay problems with control characters Johan Bockgård 2011-10-17 7:48 ` Eli Zaretskii @ 2011-10-17 17:06 ` Eli Zaretskii 2011-10-17 19:16 ` Eli Zaretskii 2 siblings, 0 replies; 11+ messages in thread From: Eli Zaretskii @ 2011-10-17 17:06 UTC (permalink / raw) To: Johan Bockgård; +Cc: 9771 > From: Johan Bockgård <bojohan@gnu.org> > Date: Mon, 17 Oct 2011 00:24:30 +0200 > > 4. Abort in push_display_prop > > emacs -Q > > (setq wrap-prefix "x") > > C-u 100 C-q 0 RET > > Fatal error (6)Aborted (core dumped) This part should be fixed in revision 106102 on the trunk. ^ permalink raw reply [flat|nested] 11+ messages in thread
* bug#9771: 24.0.90; Redisplay problems with control characters 2011-10-16 22:24 bug#9771: 24.0.90; Redisplay problems with control characters Johan Bockgård 2011-10-17 7:48 ` Eli Zaretskii 2011-10-17 17:06 ` Eli Zaretskii @ 2011-10-17 19:16 ` Eli Zaretskii 2 siblings, 0 replies; 11+ messages in thread From: Eli Zaretskii @ 2011-10-17 19:16 UTC (permalink / raw) To: Johan Bockgård; +Cc: 9771 > From: Johan Bockgård <bojohan@gnu.org> > Date: Mon, 17 Oct 2011 00:24:30 +0200 > > 1. Abort in move_it_by_lines (with bidi) > > emacs -Q > C-u 2000 C-q 0 RET > M-< > M-> > > Fatal error (6)Aborted (core dumped) > > (gdb) bt > #0 abort () at emacs.c:385 > #1 0x0000000000451a48 in move_it_by_lines (it=0x7fffffffc620, dvpos=0) > at xdisp.c:8907 > #2 0x00000000004c0717 in Frecenter (arg=10) at window.c:5107 > #3 0x000000000060ce90 in Ffuncall (nargs=<optimized out>, args=0x7fffffffd460) > at eval.c:2974 > [...] > > The following assertion fails: > > [move_it_by_lines] > > if (dvpos == 0) > { > /* DVPOS == 0 means move to the start of the screen line. */ > move_it_vertically_backward (it, 0); > xassert (it->current_x == 0 && it->hpos == 0); > > (gdb) p it->current_x > $1 = 10 > (gdb) p it->hpos > $2 = 1 This part should be fixed in revision 106106 on the trunk. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2011-10-20 12:43 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-10-16 22:24 bug#9771: 24.0.90; Redisplay problems with control characters Johan Bockgård 2011-10-17 7:48 ` Eli Zaretskii 2011-10-17 14:18 ` Eli Zaretskii 2011-10-17 21:04 ` Johan Bockgård 2011-10-18 12:56 ` Eli Zaretskii 2011-10-18 16:58 ` Eli Zaretskii 2011-10-20 12:43 ` Eli Zaretskii 2011-10-19 1:06 ` Johan Bockgård 2011-10-19 11:49 ` Eli Zaretskii 2011-10-17 17:06 ` Eli Zaretskii 2011-10-17 19:16 ` Eli Zaretskii
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).