unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Assertion failure in set_iterator_to_next
@ 2010-04-11  0:52 Stefan Monnier
  2010-04-11  3:14 ` Eli Zaretskii
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Monnier @ 2010-04-11  0:52 UTC (permalink / raw)
  To: emacs-devel

While playing with my nhexl-mode.el I bumped into some assertion failure
in the redisplay code.  More specifically the assertion

	  xassert (IT_BYTEPOS (*it) == CHAR_TO_BYTE (IT_CHARPOS (*it)));

at line 6183.
I think the problem is that the non-bidi code above that line uses
it->len assuming it holds the length of the char at point (well, at
IT_CHARPOS(it)), whereas IIUC it holds the length of "the" glyph.
In my case it->c holds sometimes the "lf symbol" char (u240a) which comes
from a display property and it->len is then 3, which is obviously not
the length of the char at IT_CHARPOS since the buffer I was looking at
only happens to contain ASCII chars.

I use the quick patch below, which seems to fix the problem, but I don't
know enough of the redisplay to know whether that's the right fix, or
whether the right fix is to make sure it->len holds some other value.


        Stefan


--- src/xdisp.c 2010-04-06 16:00:42 +0000
+++ src/xdisp.c 2010-04-10 01:35:11 +0000
@@ -6151,7 +6151,8 @@
 
          if (!it->bidi_p)
            {
-             IT_BYTEPOS (*it) += it->len;
+             IT_BYTEPOS (*it)
+               += BYTES_BY_CHAR_HEAD (*BUF_BYTE_ADDRESS (XBUFFER (it->object), IT_BYTEPOS (*it)));
              IT_CHARPOS (*it) += 1;
            }
          else




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Assertion failure in set_iterator_to_next
  2010-04-11  0:52 Assertion failure in set_iterator_to_next Stefan Monnier
@ 2010-04-11  3:14 ` Eli Zaretskii
  2010-04-11 15:02   ` Stefan Monnier
  0 siblings, 1 reply; 7+ messages in thread
From: Eli Zaretskii @ 2010-04-11  3:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 10 Apr 2010 20:52:45 -0400
> 
> While playing with my nhexl-mode.el I bumped into some assertion failure
> in the redisplay code.  More specifically the assertion
> 
> 	  xassert (IT_BYTEPOS (*it) == CHAR_TO_BYTE (IT_CHARPOS (*it)));
> 
> at line 6183.
> I think the problem is that the non-bidi code above that line uses
> it->len assuming it holds the length of the char at point (well, at
> IT_CHARPOS(it)), whereas IIUC it holds the length of "the" glyph.

No, it->len should be the length of the multibyte sequence of the
character at IT_CHARPOS (*it).  This is how the basic iteration works,
and if the length is incorrect, we will get garbled display, in
anything but pure 7-bit ASCII buffers.

> In my case it->c holds sometimes the "lf symbol" char (u240a) which comes
> from a display property and it->len is then 3, which is obviously not
> the length of the char at IT_CHARPOS since the buffer I was looking at
> only happens to contain ASCII chars.

I think it->c is not always the character at IT_CHARPOS (*it).  In
particular, this code is under `case GET_FROM_BUFFER', so the iterator
is iterating through the buffer, not through any string.  But as you
say, it->c still holds a character from a string.  I think it->c is
loaded only in get_next_display_element, which is called _after_
set_iterator_to_next advances to the next character.  And if it->what
is anything other than a character, it->c will not be updated at all.

> I use the quick patch below, which seems to fix the problem, but I don't
> know enough of the redisplay to know whether that's the right fix, or
> whether the right fix is to make sure it->len holds some other value.

I don't think it's right.  Can you show a simple test case where this
problem happens?




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Assertion failure in set_iterator_to_next
  2010-04-11  3:14 ` Eli Zaretskii
@ 2010-04-11 15:02   ` Stefan Monnier
  2010-04-11 18:14     ` Eli Zaretskii
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Monnier @ 2010-04-11 15:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> While playing with my nhexl-mode.el I bumped into some assertion failure
>> in the redisplay code.  More specifically the assertion
>> 
>> xassert (IT_BYTEPOS (*it) == CHAR_TO_BYTE (IT_CHARPOS (*it)));
>> 
>> at line 6183.
>> I think the problem is that the non-bidi code above that line uses
>> it-> len assuming it holds the length of the char at point (well, at
>> IT_CHARPOS(it)), whereas IIUC it holds the length of "the" glyph.

> No, it->len should be the length of the multibyte sequence of the
> character at IT_CHARPOS (*it).  This is how the basic iteration works,
> and if the length is incorrect, we will get garbled display, in
> anything but pure 7-bit ASCII buffers.

OK, good.  Can you update/complete dispextern.h correspondingly?
It currently says:

  /* If what == IT_CHARACTER, character and length in bytes.  This is
     a character from a buffer or string.  It may be different from
     the character displayed in case that
     unibyte_display_via_language_environment is set.

     If what == IT_COMPOSITION, the first component of a composition
     and length in bytes of the composition.  */
  int c, len;

which lead me to think that it->len is strongly linked to it->c.

>> In my case it->c holds sometimes the "lf symbol" char (u240a) which comes
>> from a display property and it->len is then 3, which is obviously not
>> the length of the char at IT_CHARPOS since the buffer I was looking at
>> only happens to contain ASCII chars.

> I think it->c is not always the character at IT_CHARPOS (*it).  In
> particular, this code is under `case GET_FROM_BUFFER', so the iterator
> is iterating through the buffer, not through any string.  But as you
> say, it->c still holds a character from a string.  I think it->c is
> loaded only in get_next_display_element, which is called _after_
> set_iterator_to_next advances to the next character.  And if it->what
> is anything other than a character, it->c will not be updated at all.

Fair enough.  So the problem now is "why is it->len ==3 rather than ==1"?

>> I use the quick patch below, which seems to fix the problem, but I don't
>> know enough of the redisplay to know whether that's the right fix, or
>> whether the right fix is to make sure it->len holds some other value.
> I don't think it's right.  Can you show a simple test case where this
> problem happens?

I wish I could.  I'll send one as soon as I find it, thanks,


        Stefan




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Assertion failure in set_iterator_to_next
  2010-04-11 15:02   ` Stefan Monnier
@ 2010-04-11 18:14     ` Eli Zaretskii
  2010-04-11 20:04       ` Stefan Monnier
  0 siblings, 1 reply; 7+ messages in thread
From: Eli Zaretskii @ 2010-04-11 18:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Sun, 11 Apr 2010 11:02:22 -0400
> 
> > No, it->len should be the length of the multibyte sequence of the
> > character at IT_CHARPOS (*it).  This is how the basic iteration works,
> > and if the length is incorrect, we will get garbled display, in
> > anything but pure 7-bit ASCII buffers.
> 
> OK, good.  Can you update/complete dispextern.h correspondingly?

Will do.

> So the problem now is "why is it->len ==3 rather than ==1"?

I think it's because it was not yet updated.  It is updated only in
the next_element_from_* functions.  E.g., for a character from a
buffer, we have this in next_element_from_buffer:

      /* Get the next character, maybe multibyte.  */
      p = BYTE_POS_ADDR (IT_BYTEPOS (*it));
      if (it->multibyte_p && !ASCII_BYTE_P (*p))
	it->c = STRING_CHAR_AND_LENGTH (p, it->len);     <<<<<<<<<
      else
	it->c = *p, it->len = 1;                         <<<<<<<<<

where I marked the 2 lines that update it->len.  These functions are
again called only from get_next_display_element, which was not yet
called at the point where you have the faulting xassert.

IOW, it->len is correct _before_ we advance to the next character, and
it holds the length of the multibyte sequence of the character we
already consumed, the one we are stepping over by incrementing bytepos
by it->len.

However, all this does not explain at all why you get the assertion
failure, it just explains why neither the value of it->c nor it->len
are relevant to the failure.  All this notwithstanding, it is still
true that `IT_BYTEPOS (*it) == CHAR_TO_BYTE (IT_CHARPOS (*it))', I
think.

Since you are saying this is an ASCII buffer, are charpos and bytepos
at least identical?  If they are, could it be that CHAR_TO_BYTE has a
bug?  If they are not identical, how large is the discrepancy?




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Assertion failure in set_iterator_to_next
  2010-04-11 18:14     ` Eli Zaretskii
@ 2010-04-11 20:04       ` Stefan Monnier
  2010-04-11 21:12         ` Eli Zaretskii
  2010-04-19 15:30         ` Eli Zaretskii
  0 siblings, 2 replies; 7+ messages in thread
From: Stefan Monnier @ 2010-04-11 20:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> > No, it->len should be the length of the multibyte sequence of the
>> > character at IT_CHARPOS (*it).  This is how the basic iteration works,
>> > and if the length is incorrect, we will get garbled display, in
>> > anything but pure 7-bit ASCII buffers.
>> OK, good.  Can you update/complete dispextern.h correspondingly?
> Will do.

Thanks.

>> So the problem now is "why is it->len ==3 rather than ==1"?
> I think it's because it was not yet updated.  It is updated only in
> the next_element_from_* functions.  E.g., for a character from a
> buffer, we have this in next_element_from_buffer:

>       /* Get the next character, maybe multibyte.  */
>       p = BYTE_POS_ADDR (IT_BYTEPOS (*it));
>       if (it->multibyte_p && !ASCII_BYTE_P (*p))
>         it-> c = STRING_CHAR_AND_LENGTH (p, it->len);     <<<<<<<<<
>       else
>         it-> c = *p, it->len = 1;                         <<<<<<<<<

Oh, thank you very much.  I searched for this spot and couldn't find it
(I grepped with a pattern like "->len =", which only caught the second
line which couldn't explain why I was getting it->len==3.
That should help me move forward.

> Since you are saying this is an ASCII buffer, are charpos and bytepos
> at least identical?

Yes, they're identical before the assignment to IT_CHARPOS and
IT_BYTEPOS, but since it->len==3 a discrepancy of 2 appears which then
triggers the subsequent xassert.


        Stefan




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Assertion failure in set_iterator_to_next
  2010-04-11 20:04       ` Stefan Monnier
@ 2010-04-11 21:12         ` Eli Zaretskii
  2010-04-19 15:30         ` Eli Zaretskii
  1 sibling, 0 replies; 7+ messages in thread
From: Eli Zaretskii @ 2010-04-11 21:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Sun, 11 Apr 2010 16:04:05 -0400
> 
> >       /* Get the next character, maybe multibyte.  */
> >       p = BYTE_POS_ADDR (IT_BYTEPOS (*it));
> >       if (it->multibyte_p && !ASCII_BYTE_P (*p))
> >         it-> c = STRING_CHAR_AND_LENGTH (p, it->len);     <<<<<<<<<
> >       else
> >         it-> c = *p, it->len = 1;                         <<<<<<<<<
> 
> Oh, thank you very much.  I searched for this spot and couldn't find it
> (I grepped with a pattern like "->len =", which only caught the second
> line which couldn't explain why I was getting it->len==3.
> That should help me move forward.
> 
> > Since you are saying this is an ASCII buffer, are charpos and bytepos
> > at least identical?
> 
> Yes, they're identical before the assignment to IT_CHARPOS and
> IT_BYTEPOS, but since it->len==3 a discrepancy of 2 appears which then
> triggers the subsequent xassert.

If this is an all-ASCII buffer, you should be able to say

  (gdb) watch it->len if it->len > 1

and see all the possible suspects.  I would suggest to invoke
"M-x redraw-display" after setting this watchpoint, because that
performs a full redisplay and will iterate over all the characters in
the buffer, its overlays, etc.

You may also find the following commands handy for tracking this
issue:

  (gdb) pit

  (gdb) pgrowx it->glyph_row

The first will show you the current state of the iterator (struct it),
the latter will show the glyphs that have been assembled so far in the
line that is being produced, together with the source (buffer, string,
etc.) from which every glyph comes.  Just remember that using `pit' at
arbitrary places may catch it out of sync, like it->c and it->len that
do not match IT_CHARPOS(*it).

You can also display previously produced glyph rows like this:

  (gdb) pgrowx it->glyph_row - 1
  (gdb) pgrowx it->glyph_row - 2

etc., you get the point.  Finally,

  (gdb) pgrowx MATRIX_ROW(it->w->desired_matrix,N)

will display the Nth glyph row of the glyph matrix being constructed,
which will normally be the Nth screen line (zero-based) on the updated
display of the window it->w.




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Assertion failure in set_iterator_to_next
  2010-04-11 20:04       ` Stefan Monnier
  2010-04-11 21:12         ` Eli Zaretskii
@ 2010-04-19 15:30         ` Eli Zaretskii
  1 sibling, 0 replies; 7+ messages in thread
From: Eli Zaretskii @ 2010-04-19 15:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Sun, 11 Apr 2010 16:04:05 -0400
> 
> >> > No, it->len should be the length of the multibyte sequence of the
> >> > character at IT_CHARPOS (*it).  This is how the basic iteration works,
> >> > and if the length is incorrect, we will get garbled display, in
> >> > anything but pure 7-bit ASCII buffers.
> >> OK, good.  Can you update/complete dispextern.h correspondingly?
> > Will do.
> 
> Thanks.

Done.




^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-04-19 15:30 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-04-11  0:52 Assertion failure in set_iterator_to_next Stefan Monnier
2010-04-11  3:14 ` Eli Zaretskii
2010-04-11 15:02   ` Stefan Monnier
2010-04-11 18:14     ` Eli Zaretskii
2010-04-11 20:04       ` Stefan Monnier
2010-04-11 21:12         ` Eli Zaretskii
2010-04-19 15:30         ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).