* potential bug in display_mode_element?
@ 2005-09-12 0:58 Kenichi Handa
2005-09-12 8:04 ` Kim F. Storm
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Kenichi Handa @ 2005-09-12 0:58 UTC (permalink / raw)
I got a bug report for emacs-unicode-2, and it seems that
the same bug exists in HEAD too. The backtrace is this:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208118624 (LWP 29169)]
0x080b7349 in display_mode_element (it=0xbfffd1b0, depth=10, field_width=0, precision=-63, elt=164196259, props=138499373, risky=0) at xdisp.c:16136
16136 while ((precision <= 0 || n < precision)
(gdb) xbacktrace
"execute-extended-command"
"call-interactively"
(gdb) bt full
#0 0x080b7349 in display_mode_element (it=0xbfffd1b0, depth=10, field_width=0, precision=-63, elt=164196259, props=138499373, risky=0) at xdisp.c:16136
c = 0 '\0'
this = (const unsigned char *) 0xaf7a101 <Address 0xaf7a101 out of bounds>
lisp_string = (const unsigned char *) 0xaf7a0fc <Address 0xaf7a0fc out of bounds>
n = 5
field = 138382657
prec = 5
literal = 0
Here the strange thing is that list_string points an address
out of bounds. It is initialized as this:
this = SDATA (elt);
lisp_string = this;
if (literal)
/* omitted because not relevant now */
while ((precision <= 0 || n < precision)
&& *this
&& (mode_line_target != MODE_LINE_DISPLAY
|| it->current_x < it->last_visible_x))
...
and never changed in the while loop. So the only reason I
can think of why the address pointed by list_string becomes
out of bound is that the string data of ELT was relocated in
the loop and the original address was returned to OS.
Actually, display_string is called in the loop, and it will
run Lisp code.
So, I think we meed this change. What do you think?
*** xdisp.c 10 Sep 2005 09:35:12 +0900 1.1050
--- xdisp.c 10 Sep 2005 18:58:05 +0900
***************
*** 16036,16042 ****
--- 16036,16047 ----
&& (mode_line_target != MODE_LINE_DISPLAY
|| it->current_x < it->last_visible_x))
{
+ /* Never change the value of LAST in this block. */
const unsigned char *last = this;
+ /* String data of ELT may be relocated. In such a case,
+ OFFSET can be used to make THIS correctly points into
+ the string data of ELT. */
+ int offset = this - SDATA (elt);
/* Advance to end of string or next format specifier. */
while ((c = *this++) != '\0' && c != '%')
***************
*** 16171,16176 ****
--- 16176,16182 ----
else /* c == 0 */
break;
}
+ this = SDATA (elt) + offset + (this - last);
}
}
break;
---
Kenichi Handa
handa@m17n.org
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: potential bug in display_mode_element?
2005-09-12 0:58 potential bug in display_mode_element? Kenichi Handa
@ 2005-09-12 8:04 ` Kim F. Storm
2005-09-12 11:54 ` Kenichi Handa
2005-09-12 12:41 ` Potential GC-related problems in compose_chars_in_text Kim F. Storm
2005-09-12 15:34 ` potential bug in display_mode_element? Richard M. Stallman
2 siblings, 1 reply; 14+ messages in thread
From: Kim F. Storm @ 2005-09-12 8:04 UTC (permalink / raw)
Cc: emacs-devel
Kenichi Handa <handa@m17n.org> writes:
> I got a bug report for emacs-unicode-2, and it seems that
> the same bug exists in HEAD too. The backtrace is this:
> and never changed in the while loop. So the only reason I
> can think of why the address pointed by list_string becomes
> out of bound is that the string data of ELT was relocated in
> the loop and the original address was returned to OS.
> Actually, display_string is called in the loop, and it will
> run Lisp code.
>
> So, I think we meed this change. What do you think?
Bravo!! We definitely need this patch.
This is the sort of bug that I've been hunting for re. the
crashes in "compact_small_strings" and the free string list.
I'm not sure if this patch will fix the crashes though.
--
Kim F. Storm <storm@cua.dk> http://www.cua.dk
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: potential bug in display_mode_element?
2005-09-12 8:04 ` Kim F. Storm
@ 2005-09-12 11:54 ` Kenichi Handa
0 siblings, 0 replies; 14+ messages in thread
From: Kenichi Handa @ 2005-09-12 11:54 UTC (permalink / raw)
Cc: emacs-devel
In article <m38xy2zp89.fsf@kfs-l.imdomain.dk>, storm@cua.dk (Kim F. Storm) writes:
> Kenichi Handa <handa@m17n.org> writes:
>> I got a bug report for emacs-unicode-2, and it seems that
>> the same bug exists in HEAD too. The backtrace is this:
>> and never changed in the while loop. So the only reason I
>> can think of why the address pointed by list_string becomes
>> out of bound is that the string data of ELT was relocated in
>> the loop and the original address was returned to OS.
>> Actually, display_string is called in the loop, and it will
>> run Lisp code.
>>
>> So, I think we meed this change. What do you think?
> Bravo!! We definitely need this patch.
I forgot to update lisp_string in that patch. So the
attached one is better (and simpler). I've just installed
it.
---
Kenichi Handa
handa@m17n.org
2005-09-12 Kenichi Handa <handa@m17n.org>
* xdisp.c (display_mode_element): Be sure to make variables THIS
and LISP_STRING point into a string data of ELT.
*** xdisp.c 10 Sep 2005 09:35:12 +0900 1.1050
--- xdisp.c 12 Sep 2005 20:47:28 +0900
***************
*** 16171,16176 ****
--- 16171,16178 ----
else /* c == 0 */
break;
}
+ this += SDATA (elt) - lisp_string;
+ lisp_string = SDATA (elt);
}
}
break;
^ permalink raw reply [flat|nested] 14+ messages in thread
* Potential GC-related problems in compose_chars_in_text
2005-09-12 0:58 potential bug in display_mode_element? Kenichi Handa
2005-09-12 8:04 ` Kim F. Storm
@ 2005-09-12 12:41 ` Kim F. Storm
2005-09-13 1:08 ` Kenichi Handa
2005-09-13 15:54 ` Richard M. Stallman
2005-09-12 15:34 ` potential bug in display_mode_element? Richard M. Stallman
2 siblings, 2 replies; 14+ messages in thread
From: Kim F. Storm @ 2005-09-12 12:41 UTC (permalink / raw)
Cc: emacs-devel
The following code in compose_chars_in_text looks suspicious:
if (INTEGERP (val) && XFASTINT (val) == start)
{
to = Fmatch_end (make_number (0));
val = call4 (XCDR (elt), val, to, XCAR (elt), string);
if (INTEGERP (val) && XINT (val) > 1)
{
start += XINT (val);
if (STRINGP (string))
ptr = SDATA (string) + string_char_to_byte (string, start);
else
ptr = CHAR_POS_ADDR (start);
}
else
{
start++;
ptr += len;
>>>> if string is non-nil, and call4 did GC, then ptr may no longer
>>>> point into "string".
}
break;
Likewise, the `pend' pointer may no longer be valid for the same reason
-- on both branches of the above code!!.
Furthermore, the initialization of pend seems bogus too:
ptr = SDATA (string) + string_char_to_byte (string, start);
pend = ptr + SBYTES (string);
Shouldn't that be
pend = SDATA (string) + SBYTES (string);
Here is a patch (untested):
*** composite.c 14 Aug 2005 14:47:27 +0200 1.35
--- composite.c 12 Sep 2005 14:40:52 +0200
***************
*** 616,622 ****
GCPRO1 (string);
stop = end;
ptr = SDATA (string) + string_char_to_byte (string, start);
! pend = ptr + SBYTES (string);
}
else
{
--- 616,622 ----
GCPRO1 (string);
stop = end;
ptr = SDATA (string) + string_char_to_byte (string, start);
! pend = SDATA (string) + SBYTES (string);
}
else
{
***************
*** 680,689 ****
{
start += XINT (val);
if (STRINGP (string))
! ptr = SDATA (string) + string_char_to_byte (string, start);
else
ptr = CHAR_POS_ADDR (start);
}
else
{
start++;
--- 680,698 ----
{
start += XINT (val);
if (STRINGP (string))
! {
! ptr = SDATA (string) + string_char_to_byte (string, start);
! pend = SDATA (string) + SBYTES (string);
! }
else
ptr = CHAR_POS_ADDR (start);
}
+ else if (STRINGP (string))
+ {
+ start++;
+ ptr = SDATA (string) + string_char_to_byte (string, start);
+ pend = SDATA (string) + SBYTES (string);
+ }
else
{
start++;
--
Kim F. Storm <storm@cua.dk> http://www.cua.dk
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: potential bug in display_mode_element?
2005-09-12 0:58 potential bug in display_mode_element? Kenichi Handa
2005-09-12 8:04 ` Kim F. Storm
2005-09-12 12:41 ` Potential GC-related problems in compose_chars_in_text Kim F. Storm
@ 2005-09-12 15:34 ` Richard M. Stallman
2 siblings, 0 replies; 14+ messages in thread
From: Richard M. Stallman @ 2005-09-12 15:34 UTC (permalink / raw)
Cc: emacs-devel
I think you are right about the bug, but I think your patch does not
completely fix it. This patch, which seems to work, should fix the bug
completely by accessing the string only through its Lisp object pointer.
*** xdisp.c 08 Sep 2005 20:35:28 -0400 1.1050
--- xdisp.c 12 Sep 2005 06:18:48 -0400
***************
*** 15949,15955 ****
{
/* A string: output it and check for %-constructs within it. */
unsigned char c;
! const unsigned char *this, *lisp_string;
if (!NILP (props) || risky)
{
--- 15949,15955 ----
{
/* A string: output it and check for %-constructs within it. */
unsigned char c;
! int offset = 0;
if (!NILP (props) || risky)
{
***************
*** 16007,16014 ****
}
}
! this = SDATA (elt);
! lisp_string = this;
if (literal)
{
--- 16007,16013 ----
}
}
! offset = 0;
if (literal)
{
***************
*** 16031,16072 ****
break;
}
while ((precision <= 0 || n < precision)
! && *this
&& (mode_line_target != MODE_LINE_DISPLAY
|| it->current_x < it->last_visible_x))
{
! const unsigned char *last = this;
/* Advance to end of string or next format specifier. */
! while ((c = *this++) != '\0' && c != '%')
;
! if (this - 1 != last)
{
int nchars, nbytes;
/* Output to end of string or up to '%'. Field width
is length of string. Don't output more than
PRECISION allows us. */
! --this;
! prec = c_string_width (last, this - last, precision - n,
&nchars, &nbytes);
switch (mode_line_target)
{
case MODE_LINE_NOPROP:
case MODE_LINE_TITLE:
! n += store_mode_line_noprop (last, 0, prec);
break;
case MODE_LINE_STRING:
{
! int bytepos = last - lisp_string;
int charpos = string_byte_to_char (elt, bytepos);
int endpos = (precision <= 0
! ? string_byte_to_char (elt,
! this - lisp_string)
: charpos + nchars);
n += store_mode_line_string (NULL,
--- 16030,16073 ----
break;
}
+ /* Handle the non-literal case. */
+
while ((precision <= 0 || n < precision)
! && SREF (elt, offset) != 0
&& (mode_line_target != MODE_LINE_DISPLAY
|| it->current_x < it->last_visible_x))
{
! int last_offset = offset;
/* Advance to end of string or next format specifier. */
! while ((c = SREF (elt, offset++)) != '\0' && c != '%')
;
! if (offset - 1 != last_offset)
{
int nchars, nbytes;
/* Output to end of string or up to '%'. Field width
is length of string. Don't output more than
PRECISION allows us. */
! offset--;
! prec = c_string_width (SDATA (elt) + last_offset,
! offset - last_offset, precision - n,
&nchars, &nbytes);
switch (mode_line_target)
{
case MODE_LINE_NOPROP:
case MODE_LINE_TITLE:
! n += store_mode_line_noprop (SDATA (elt) + last_offset, 0, prec);
break;
case MODE_LINE_STRING:
{
! int bytepos = last_offset;
int charpos = string_byte_to_char (elt, bytepos);
int endpos = (precision <= 0
! ? string_byte_to_char (elt, offset)
: charpos + nchars);
n += store_mode_line_string (NULL,
***************
*** 16077,16083 ****
break;
case MODE_LINE_DISPLAY:
{
! int bytepos = last - lisp_string;
int charpos = string_byte_to_char (elt, bytepos);
n += display_string (NULL, elt, Qnil, 0, charpos,
it, 0, prec, 0,
--- 16078,16084 ----
break;
case MODE_LINE_DISPLAY:
{
! int bytepos = last_offset;
int charpos = string_byte_to_char (elt, bytepos);
n += display_string (NULL, elt, Qnil, 0, charpos,
it, 0, prec, 0,
***************
*** 16088,16099 ****
}
else /* c == '%' */
{
! const unsigned char *percent_position = this;
/* Get the specified minimum width. Zero means
don't pad. */
field = 0;
! while ((c = *this++) >= '0' && c <= '9')
field = field * 10 + c - '0';
/* Don't pad beyond the total padding allowed. */
--- 16089,16100 ----
}
else /* c == '%' */
{
! int percent_position = offset;
/* Get the specified minimum width. Zero means
don't pad. */
field = 0;
! while ((c = SREF (elt, offset++)) >= '0' && c <= '9')
field = field * 10 + c - '0';
/* Don't pad beyond the total padding allowed. */
***************
*** 16113,16119 ****
int bytepos, charpos;
unsigned char *spec;
! bytepos = percent_position - lisp_string;
charpos = (STRING_MULTIBYTE (elt)
? string_byte_to_char (elt, bytepos)
: bytepos);
--- 16114,16120 ----
int bytepos, charpos;
unsigned char *spec;
! bytepos = percent_position;
charpos = (STRING_MULTIBYTE (elt)
? string_byte_to_char (elt, bytepos)
: bytepos);
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text
2005-09-12 12:41 ` Potential GC-related problems in compose_chars_in_text Kim F. Storm
@ 2005-09-13 1:08 ` Kenichi Handa
2005-09-13 15:54 ` Richard M. Stallman
1 sibling, 0 replies; 14+ messages in thread
From: Kenichi Handa @ 2005-09-13 1:08 UTC (permalink / raw)
Cc: emacs-devel
In article <m3mzmixxtc.fsf@kfs-l.imdomain.dk>, storm@cua.dk (Kim F. Storm) writes:
> The following code in compose_chars_in_text looks suspicious:
That's right. But, it's not used anymore and it seems that
I just forgot to delete that function. :-(
I'm sorry for taking your time on it. I've just deleted it.
---
Kenichi Handa
handa@m17n.org
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text
2005-09-12 12:41 ` Potential GC-related problems in compose_chars_in_text Kim F. Storm
2005-09-13 1:08 ` Kenichi Handa
@ 2005-09-13 15:54 ` Richard M. Stallman
2005-09-14 7:29 ` Kenichi Handa
1 sibling, 1 reply; 14+ messages in thread
From: Richard M. Stallman @ 2005-09-13 15:54 UTC (permalink / raw)
Cc: emacs-devel, handa
I think that the cleanest thing to do, in loops that don't need to be
as fast as possible, is avoid saving addresses of string data at all.
I suspect that compose_chars_in_text does not need to be as fast as
possible. Handa, do you think it does?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text
2005-09-13 15:54 ` Richard M. Stallman
@ 2005-09-14 7:29 ` Kenichi Handa
2005-09-15 2:41 ` Richard M. Stallman
0 siblings, 1 reply; 14+ messages in thread
From: Kenichi Handa @ 2005-09-14 7:29 UTC (permalink / raw)
Cc: emacs-devel, storm
In article <E1EFD7U-0000gq-13@fencepost.gnu.org>, "Richard M. Stallman" <rms@gnu.org> writes:
> I think that the cleanest thing to do, in loops that don't need to be
> as fast as possible, is avoid saving addresses of string data at all.
I agree. But, I think display_mode_element is the place
that have to be as fast as possible.
> I think you are right about the bug, but I think your patch does not
> completely fix it.
Yes. As I wrote, the original patch I mailed was not
complete. So, I installed a new one. Have you seen that
change? Does it still need a fix?
> I suspect that compose_chars_in_text does not need to be as fast as
> possible. Handa, do you think it does?
As I wrote, that function is not used anymore, thus I
deleted it.
---
Kenichi Handa
handa@m17n.org
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text
2005-09-14 7:29 ` Kenichi Handa
@ 2005-09-15 2:41 ` Richard M. Stallman
2005-09-15 4:21 ` Kenichi Handa
0 siblings, 1 reply; 14+ messages in thread
From: Richard M. Stallman @ 2005-09-15 2:41 UTC (permalink / raw)
Cc: storm, emacs-devel
> I think that the cleanest thing to do, in loops that don't need to be
> as fast as possible, is avoid saving addresses of string data at all.
I agree. But, I think display_mode_element is the place
that have to be as fast as possible.
I would think that it doesn't take up very much of Emacs cpu time,
and that a small slowdown there would be insignificant. That function
calls subroutines that do a lot of work.
Looking at the new code, I think it is correct. It is correct because
the code consists of many alternatives, and each one does only a
single nontrivial thing and then gets to the end, where lisp_string
and this are updated. But it seems a bit fragile to me.
What if some branch is later changed to do two things that could
relocate? Then it would have a bug again.
So I think I will still put in the change I made.
The reason I worry about the patch you made is that it depends
on knowing exactly where the relocation might occur.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text
2005-09-15 2:41 ` Richard M. Stallman
@ 2005-09-15 4:21 ` Kenichi Handa
2005-09-16 1:01 ` Richard M. Stallman
0 siblings, 1 reply; 14+ messages in thread
From: Kenichi Handa @ 2005-09-15 4:21 UTC (permalink / raw)
Cc: storm, emacs-devel
In article <E1EFjgX-0000gd-7I@fencepost.gnu.org>, "Richard M. Stallman" <rms@gnu.org> writes:
>> I think that the cleanest thing to do, in loops that don't need to be
>> as fast as possible, is avoid saving addresses of string data at all.
> I agree. But, I think display_mode_element is the place
> that have to be as fast as possible.
> I would think that it doesn't take up very much of Emacs cpu time,
> and that a small slowdown there would be insignificant. That function
> calls subroutines that do a lot of work.
> Looking at the new code, I think it is correct. It is correct because
> the code consists of many alternatives, and each one does only a
> single nontrivial thing and then gets to the end, where lisp_string
> and this are updated. But it seems a bit fragile to me.
> What if some branch is later changed to do two things that could
> relocate? Then it would have a bug again.
> So I think I will still put in the change I made.
I agree that your change is cleaner, and if the performance
won't be harmed that much, it is better to use your change
except for this kind of part:
! while ((c = SREF (elt, offset++)) != '\0' && c != '%')
Isn't it better to avoid using something like "offset++" as
an argument of a macro even if we know that it's currently
safe.
---
Kenichi Handa
handa@m17n.org
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text
2005-09-15 4:21 ` Kenichi Handa
@ 2005-09-16 1:01 ` Richard M. Stallman
2005-09-16 15:39 ` Stefan Monnier
0 siblings, 1 reply; 14+ messages in thread
From: Richard M. Stallman @ 2005-09-16 1:01 UTC (permalink / raw)
Cc: storm, emacs-devel
Isn't it better to avoid using something like "offset++" as
an argument of a macro even if we know that it's currently
safe.
I see no danger in it--most of our macros are safe, and we keep them
that way. If you think it would be cleaner to rewrite that, I don't
mind if you do. But there may not be any clean way to do it.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text
2005-09-16 1:01 ` Richard M. Stallman
@ 2005-09-16 15:39 ` Stefan Monnier
2005-09-17 13:39 ` Richard M. Stallman
0 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2005-09-16 15:39 UTC (permalink / raw)
Cc: storm, emacs-devel, Kenichi Handa
> Isn't it better to avoid using something like "offset++" as
> an argument of a macro even if we know that it's currently
> safe.
> I see no danger in it--most of our macros are safe, and we keep them
> that way.
Actually, it often happens that things like ENABLE_CHECKING add assert
statements in the macro expansion which causes the args to be used more
than once.
Stefan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text
2005-09-16 15:39 ` Stefan Monnier
@ 2005-09-17 13:39 ` Richard M. Stallman
2005-09-19 13:43 ` Stefan Monnier
0 siblings, 1 reply; 14+ messages in thread
From: Richard M. Stallman @ 2005-09-17 13:39 UTC (permalink / raw)
Cc: storm, emacs-devel, handa
Actually, it often happens that things like ENABLE_CHECKING add assert
statements in the macro expansion which causes the args to be used more
than once.
As far as I can see, ENABLE_CHECKING has no such effect anywhere.
And there is only one definition of SREF.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text
2005-09-17 13:39 ` Richard M. Stallman
@ 2005-09-19 13:43 ` Stefan Monnier
0 siblings, 0 replies; 14+ messages in thread
From: Stefan Monnier @ 2005-09-19 13:43 UTC (permalink / raw)
Cc: storm, emacs-devel, handa
> Actually, it often happens that things like ENABLE_CHECKING add assert
> statements in the macro expansion which causes the args to be used more
> than once.
> As far as I can see, ENABLE_CHECKING has no such effect anywhere.
It does.
See the definition of XSET:
#define XSET(var, type, ptr) \
(eassert (XTYPE (ptr) == 0), /* Check alignment. */ \
(var) = ((EMACS_INT) (type)) | ((EMACS_INT) (ptr)))
> And there is only one definition of SREF.
Yes, currently SREF does not currently "suffer" from this problem, but it
may become useful at some point in the future to add some kind of check on
the second or of SREF.
I have myself bumped into such problem when I try to add such checks in XINT
(which currently doesn't check that its arg is indeed an integer).
Stefan
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2005-09-19 13:43 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-09-12 0:58 potential bug in display_mode_element? Kenichi Handa
2005-09-12 8:04 ` Kim F. Storm
2005-09-12 11:54 ` Kenichi Handa
2005-09-12 12:41 ` Potential GC-related problems in compose_chars_in_text Kim F. Storm
2005-09-13 1:08 ` Kenichi Handa
2005-09-13 15:54 ` Richard M. Stallman
2005-09-14 7:29 ` Kenichi Handa
2005-09-15 2:41 ` Richard M. Stallman
2005-09-15 4:21 ` Kenichi Handa
2005-09-16 1:01 ` Richard M. Stallman
2005-09-16 15:39 ` Stefan Monnier
2005-09-17 13:39 ` Richard M. Stallman
2005-09-19 13:43 ` Stefan Monnier
2005-09-12 15:34 ` potential bug in display_mode_element? Richard M. Stallman
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).