* potential bug in display_mode_element? @ 2005-09-12 0:58 Kenichi Handa 2005-09-12 8:04 ` Kim F. Storm ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: Kenichi Handa @ 2005-09-12 0:58 UTC (permalink / raw) I got a bug report for emacs-unicode-2, and it seems that the same bug exists in HEAD too. The backtrace is this: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread -1208118624 (LWP 29169)] 0x080b7349 in display_mode_element (it=0xbfffd1b0, depth=10, field_width=0, precision=-63, elt=164196259, props=138499373, risky=0) at xdisp.c:16136 16136 while ((precision <= 0 || n < precision) (gdb) xbacktrace "execute-extended-command" "call-interactively" (gdb) bt full #0 0x080b7349 in display_mode_element (it=0xbfffd1b0, depth=10, field_width=0, precision=-63, elt=164196259, props=138499373, risky=0) at xdisp.c:16136 c = 0 '\0' this = (const unsigned char *) 0xaf7a101 <Address 0xaf7a101 out of bounds> lisp_string = (const unsigned char *) 0xaf7a0fc <Address 0xaf7a0fc out of bounds> n = 5 field = 138382657 prec = 5 literal = 0 Here the strange thing is that list_string points an address out of bounds. It is initialized as this: this = SDATA (elt); lisp_string = this; if (literal) /* omitted because not relevant now */ while ((precision <= 0 || n < precision) && *this && (mode_line_target != MODE_LINE_DISPLAY || it->current_x < it->last_visible_x)) ... and never changed in the while loop. So the only reason I can think of why the address pointed by list_string becomes out of bound is that the string data of ELT was relocated in the loop and the original address was returned to OS. Actually, display_string is called in the loop, and it will run Lisp code. So, I think we meed this change. What do you think? *** xdisp.c 10 Sep 2005 09:35:12 +0900 1.1050 --- xdisp.c 10 Sep 2005 18:58:05 +0900 *************** *** 16036,16042 **** --- 16036,16047 ---- && (mode_line_target != MODE_LINE_DISPLAY || it->current_x < it->last_visible_x)) { + /* Never change the value of LAST in this block. */ const unsigned char *last = this; + /* String data of ELT may be relocated. In such a case, + OFFSET can be used to make THIS correctly points into + the string data of ELT. */ + int offset = this - SDATA (elt); /* Advance to end of string or next format specifier. */ while ((c = *this++) != '\0' && c != '%') *************** *** 16171,16176 **** --- 16176,16182 ---- else /* c == 0 */ break; } + this = SDATA (elt) + offset + (this - last); } } break; --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: potential bug in display_mode_element? 2005-09-12 0:58 potential bug in display_mode_element? Kenichi Handa @ 2005-09-12 8:04 ` Kim F. Storm 2005-09-12 11:54 ` Kenichi Handa 2005-09-12 12:41 ` Potential GC-related problems in compose_chars_in_text Kim F. Storm 2005-09-12 15:34 ` potential bug in display_mode_element? Richard M. Stallman 2 siblings, 1 reply; 14+ messages in thread From: Kim F. Storm @ 2005-09-12 8:04 UTC (permalink / raw) Cc: emacs-devel Kenichi Handa <handa@m17n.org> writes: > I got a bug report for emacs-unicode-2, and it seems that > the same bug exists in HEAD too. The backtrace is this: > and never changed in the while loop. So the only reason I > can think of why the address pointed by list_string becomes > out of bound is that the string data of ELT was relocated in > the loop and the original address was returned to OS. > Actually, display_string is called in the loop, and it will > run Lisp code. > > So, I think we meed this change. What do you think? Bravo!! We definitely need this patch. This is the sort of bug that I've been hunting for re. the crashes in "compact_small_strings" and the free string list. I'm not sure if this patch will fix the crashes though. -- Kim F. Storm <storm@cua.dk> http://www.cua.dk ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: potential bug in display_mode_element? 2005-09-12 8:04 ` Kim F. Storm @ 2005-09-12 11:54 ` Kenichi Handa 0 siblings, 0 replies; 14+ messages in thread From: Kenichi Handa @ 2005-09-12 11:54 UTC (permalink / raw) Cc: emacs-devel In article <m38xy2zp89.fsf@kfs-l.imdomain.dk>, storm@cua.dk (Kim F. Storm) writes: > Kenichi Handa <handa@m17n.org> writes: >> I got a bug report for emacs-unicode-2, and it seems that >> the same bug exists in HEAD too. The backtrace is this: >> and never changed in the while loop. So the only reason I >> can think of why the address pointed by list_string becomes >> out of bound is that the string data of ELT was relocated in >> the loop and the original address was returned to OS. >> Actually, display_string is called in the loop, and it will >> run Lisp code. >> >> So, I think we meed this change. What do you think? > Bravo!! We definitely need this patch. I forgot to update lisp_string in that patch. So the attached one is better (and simpler). I've just installed it. --- Kenichi Handa handa@m17n.org 2005-09-12 Kenichi Handa <handa@m17n.org> * xdisp.c (display_mode_element): Be sure to make variables THIS and LISP_STRING point into a string data of ELT. *** xdisp.c 10 Sep 2005 09:35:12 +0900 1.1050 --- xdisp.c 12 Sep 2005 20:47:28 +0900 *************** *** 16171,16176 **** --- 16171,16178 ---- else /* c == 0 */ break; } + this += SDATA (elt) - lisp_string; + lisp_string = SDATA (elt); } } break; ^ permalink raw reply [flat|nested] 14+ messages in thread
* Potential GC-related problems in compose_chars_in_text 2005-09-12 0:58 potential bug in display_mode_element? Kenichi Handa 2005-09-12 8:04 ` Kim F. Storm @ 2005-09-12 12:41 ` Kim F. Storm 2005-09-13 1:08 ` Kenichi Handa 2005-09-13 15:54 ` Richard M. Stallman 2005-09-12 15:34 ` potential bug in display_mode_element? Richard M. Stallman 2 siblings, 2 replies; 14+ messages in thread From: Kim F. Storm @ 2005-09-12 12:41 UTC (permalink / raw) Cc: emacs-devel The following code in compose_chars_in_text looks suspicious: if (INTEGERP (val) && XFASTINT (val) == start) { to = Fmatch_end (make_number (0)); val = call4 (XCDR (elt), val, to, XCAR (elt), string); if (INTEGERP (val) && XINT (val) > 1) { start += XINT (val); if (STRINGP (string)) ptr = SDATA (string) + string_char_to_byte (string, start); else ptr = CHAR_POS_ADDR (start); } else { start++; ptr += len; >>>> if string is non-nil, and call4 did GC, then ptr may no longer >>>> point into "string". } break; Likewise, the `pend' pointer may no longer be valid for the same reason -- on both branches of the above code!!. Furthermore, the initialization of pend seems bogus too: ptr = SDATA (string) + string_char_to_byte (string, start); pend = ptr + SBYTES (string); Shouldn't that be pend = SDATA (string) + SBYTES (string); Here is a patch (untested): *** composite.c 14 Aug 2005 14:47:27 +0200 1.35 --- composite.c 12 Sep 2005 14:40:52 +0200 *************** *** 616,622 **** GCPRO1 (string); stop = end; ptr = SDATA (string) + string_char_to_byte (string, start); ! pend = ptr + SBYTES (string); } else { --- 616,622 ---- GCPRO1 (string); stop = end; ptr = SDATA (string) + string_char_to_byte (string, start); ! pend = SDATA (string) + SBYTES (string); } else { *************** *** 680,689 **** { start += XINT (val); if (STRINGP (string)) ! ptr = SDATA (string) + string_char_to_byte (string, start); else ptr = CHAR_POS_ADDR (start); } else { start++; --- 680,698 ---- { start += XINT (val); if (STRINGP (string)) ! { ! ptr = SDATA (string) + string_char_to_byte (string, start); ! pend = SDATA (string) + SBYTES (string); ! } else ptr = CHAR_POS_ADDR (start); } + else if (STRINGP (string)) + { + start++; + ptr = SDATA (string) + string_char_to_byte (string, start); + pend = SDATA (string) + SBYTES (string); + } else { start++; -- Kim F. Storm <storm@cua.dk> http://www.cua.dk ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text 2005-09-12 12:41 ` Potential GC-related problems in compose_chars_in_text Kim F. Storm @ 2005-09-13 1:08 ` Kenichi Handa 2005-09-13 15:54 ` Richard M. Stallman 1 sibling, 0 replies; 14+ messages in thread From: Kenichi Handa @ 2005-09-13 1:08 UTC (permalink / raw) Cc: emacs-devel In article <m3mzmixxtc.fsf@kfs-l.imdomain.dk>, storm@cua.dk (Kim F. Storm) writes: > The following code in compose_chars_in_text looks suspicious: That's right. But, it's not used anymore and it seems that I just forgot to delete that function. :-( I'm sorry for taking your time on it. I've just deleted it. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text 2005-09-12 12:41 ` Potential GC-related problems in compose_chars_in_text Kim F. Storm 2005-09-13 1:08 ` Kenichi Handa @ 2005-09-13 15:54 ` Richard M. Stallman 2005-09-14 7:29 ` Kenichi Handa 1 sibling, 1 reply; 14+ messages in thread From: Richard M. Stallman @ 2005-09-13 15:54 UTC (permalink / raw) Cc: emacs-devel, handa I think that the cleanest thing to do, in loops that don't need to be as fast as possible, is avoid saving addresses of string data at all. I suspect that compose_chars_in_text does not need to be as fast as possible. Handa, do you think it does? ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text 2005-09-13 15:54 ` Richard M. Stallman @ 2005-09-14 7:29 ` Kenichi Handa 2005-09-15 2:41 ` Richard M. Stallman 0 siblings, 1 reply; 14+ messages in thread From: Kenichi Handa @ 2005-09-14 7:29 UTC (permalink / raw) Cc: emacs-devel, storm In article <E1EFD7U-0000gq-13@fencepost.gnu.org>, "Richard M. Stallman" <rms@gnu.org> writes: > I think that the cleanest thing to do, in loops that don't need to be > as fast as possible, is avoid saving addresses of string data at all. I agree. But, I think display_mode_element is the place that have to be as fast as possible. > I think you are right about the bug, but I think your patch does not > completely fix it. Yes. As I wrote, the original patch I mailed was not complete. So, I installed a new one. Have you seen that change? Does it still need a fix? > I suspect that compose_chars_in_text does not need to be as fast as > possible. Handa, do you think it does? As I wrote, that function is not used anymore, thus I deleted it. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text 2005-09-14 7:29 ` Kenichi Handa @ 2005-09-15 2:41 ` Richard M. Stallman 2005-09-15 4:21 ` Kenichi Handa 0 siblings, 1 reply; 14+ messages in thread From: Richard M. Stallman @ 2005-09-15 2:41 UTC (permalink / raw) Cc: storm, emacs-devel > I think that the cleanest thing to do, in loops that don't need to be > as fast as possible, is avoid saving addresses of string data at all. I agree. But, I think display_mode_element is the place that have to be as fast as possible. I would think that it doesn't take up very much of Emacs cpu time, and that a small slowdown there would be insignificant. That function calls subroutines that do a lot of work. Looking at the new code, I think it is correct. It is correct because the code consists of many alternatives, and each one does only a single nontrivial thing and then gets to the end, where lisp_string and this are updated. But it seems a bit fragile to me. What if some branch is later changed to do two things that could relocate? Then it would have a bug again. So I think I will still put in the change I made. The reason I worry about the patch you made is that it depends on knowing exactly where the relocation might occur. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text 2005-09-15 2:41 ` Richard M. Stallman @ 2005-09-15 4:21 ` Kenichi Handa 2005-09-16 1:01 ` Richard M. Stallman 0 siblings, 1 reply; 14+ messages in thread From: Kenichi Handa @ 2005-09-15 4:21 UTC (permalink / raw) Cc: storm, emacs-devel In article <E1EFjgX-0000gd-7I@fencepost.gnu.org>, "Richard M. Stallman" <rms@gnu.org> writes: >> I think that the cleanest thing to do, in loops that don't need to be >> as fast as possible, is avoid saving addresses of string data at all. > I agree. But, I think display_mode_element is the place > that have to be as fast as possible. > I would think that it doesn't take up very much of Emacs cpu time, > and that a small slowdown there would be insignificant. That function > calls subroutines that do a lot of work. > Looking at the new code, I think it is correct. It is correct because > the code consists of many alternatives, and each one does only a > single nontrivial thing and then gets to the end, where lisp_string > and this are updated. But it seems a bit fragile to me. > What if some branch is later changed to do two things that could > relocate? Then it would have a bug again. > So I think I will still put in the change I made. I agree that your change is cleaner, and if the performance won't be harmed that much, it is better to use your change except for this kind of part: ! while ((c = SREF (elt, offset++)) != '\0' && c != '%') Isn't it better to avoid using something like "offset++" as an argument of a macro even if we know that it's currently safe. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text 2005-09-15 4:21 ` Kenichi Handa @ 2005-09-16 1:01 ` Richard M. Stallman 2005-09-16 15:39 ` Stefan Monnier 0 siblings, 1 reply; 14+ messages in thread From: Richard M. Stallman @ 2005-09-16 1:01 UTC (permalink / raw) Cc: storm, emacs-devel Isn't it better to avoid using something like "offset++" as an argument of a macro even if we know that it's currently safe. I see no danger in it--most of our macros are safe, and we keep them that way. If you think it would be cleaner to rewrite that, I don't mind if you do. But there may not be any clean way to do it. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text 2005-09-16 1:01 ` Richard M. Stallman @ 2005-09-16 15:39 ` Stefan Monnier 2005-09-17 13:39 ` Richard M. Stallman 0 siblings, 1 reply; 14+ messages in thread From: Stefan Monnier @ 2005-09-16 15:39 UTC (permalink / raw) Cc: storm, emacs-devel, Kenichi Handa > Isn't it better to avoid using something like "offset++" as > an argument of a macro even if we know that it's currently > safe. > I see no danger in it--most of our macros are safe, and we keep them > that way. Actually, it often happens that things like ENABLE_CHECKING add assert statements in the macro expansion which causes the args to be used more than once. Stefan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text 2005-09-16 15:39 ` Stefan Monnier @ 2005-09-17 13:39 ` Richard M. Stallman 2005-09-19 13:43 ` Stefan Monnier 0 siblings, 1 reply; 14+ messages in thread From: Richard M. Stallman @ 2005-09-17 13:39 UTC (permalink / raw) Cc: storm, emacs-devel, handa Actually, it often happens that things like ENABLE_CHECKING add assert statements in the macro expansion which causes the args to be used more than once. As far as I can see, ENABLE_CHECKING has no such effect anywhere. And there is only one definition of SREF. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Potential GC-related problems in compose_chars_in_text 2005-09-17 13:39 ` Richard M. Stallman @ 2005-09-19 13:43 ` Stefan Monnier 0 siblings, 0 replies; 14+ messages in thread From: Stefan Monnier @ 2005-09-19 13:43 UTC (permalink / raw) Cc: storm, emacs-devel, handa > Actually, it often happens that things like ENABLE_CHECKING add assert > statements in the macro expansion which causes the args to be used more > than once. > As far as I can see, ENABLE_CHECKING has no such effect anywhere. It does. See the definition of XSET: #define XSET(var, type, ptr) \ (eassert (XTYPE (ptr) == 0), /* Check alignment. */ \ (var) = ((EMACS_INT) (type)) | ((EMACS_INT) (ptr))) > And there is only one definition of SREF. Yes, currently SREF does not currently "suffer" from this problem, but it may become useful at some point in the future to add some kind of check on the second or of SREF. I have myself bumped into such problem when I try to add such checks in XINT (which currently doesn't check that its arg is indeed an integer). Stefan ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: potential bug in display_mode_element? 2005-09-12 0:58 potential bug in display_mode_element? Kenichi Handa 2005-09-12 8:04 ` Kim F. Storm 2005-09-12 12:41 ` Potential GC-related problems in compose_chars_in_text Kim F. Storm @ 2005-09-12 15:34 ` Richard M. Stallman 2 siblings, 0 replies; 14+ messages in thread From: Richard M. Stallman @ 2005-09-12 15:34 UTC (permalink / raw) Cc: emacs-devel I think you are right about the bug, but I think your patch does not completely fix it. This patch, which seems to work, should fix the bug completely by accessing the string only through its Lisp object pointer. *** xdisp.c 08 Sep 2005 20:35:28 -0400 1.1050 --- xdisp.c 12 Sep 2005 06:18:48 -0400 *************** *** 15949,15955 **** { /* A string: output it and check for %-constructs within it. */ unsigned char c; ! const unsigned char *this, *lisp_string; if (!NILP (props) || risky) { --- 15949,15955 ---- { /* A string: output it and check for %-constructs within it. */ unsigned char c; ! int offset = 0; if (!NILP (props) || risky) { *************** *** 16007,16014 **** } } ! this = SDATA (elt); ! lisp_string = this; if (literal) { --- 16007,16013 ---- } } ! offset = 0; if (literal) { *************** *** 16031,16072 **** break; } while ((precision <= 0 || n < precision) ! && *this && (mode_line_target != MODE_LINE_DISPLAY || it->current_x < it->last_visible_x)) { ! const unsigned char *last = this; /* Advance to end of string or next format specifier. */ ! while ((c = *this++) != '\0' && c != '%') ; ! if (this - 1 != last) { int nchars, nbytes; /* Output to end of string or up to '%'. Field width is length of string. Don't output more than PRECISION allows us. */ ! --this; ! prec = c_string_width (last, this - last, precision - n, &nchars, &nbytes); switch (mode_line_target) { case MODE_LINE_NOPROP: case MODE_LINE_TITLE: ! n += store_mode_line_noprop (last, 0, prec); break; case MODE_LINE_STRING: { ! int bytepos = last - lisp_string; int charpos = string_byte_to_char (elt, bytepos); int endpos = (precision <= 0 ! ? string_byte_to_char (elt, ! this - lisp_string) : charpos + nchars); n += store_mode_line_string (NULL, --- 16030,16073 ---- break; } + /* Handle the non-literal case. */ + while ((precision <= 0 || n < precision) ! && SREF (elt, offset) != 0 && (mode_line_target != MODE_LINE_DISPLAY || it->current_x < it->last_visible_x)) { ! int last_offset = offset; /* Advance to end of string or next format specifier. */ ! while ((c = SREF (elt, offset++)) != '\0' && c != '%') ; ! if (offset - 1 != last_offset) { int nchars, nbytes; /* Output to end of string or up to '%'. Field width is length of string. Don't output more than PRECISION allows us. */ ! offset--; ! prec = c_string_width (SDATA (elt) + last_offset, ! offset - last_offset, precision - n, &nchars, &nbytes); switch (mode_line_target) { case MODE_LINE_NOPROP: case MODE_LINE_TITLE: ! n += store_mode_line_noprop (SDATA (elt) + last_offset, 0, prec); break; case MODE_LINE_STRING: { ! int bytepos = last_offset; int charpos = string_byte_to_char (elt, bytepos); int endpos = (precision <= 0 ! ? string_byte_to_char (elt, offset) : charpos + nchars); n += store_mode_line_string (NULL, *************** *** 16077,16083 **** break; case MODE_LINE_DISPLAY: { ! int bytepos = last - lisp_string; int charpos = string_byte_to_char (elt, bytepos); n += display_string (NULL, elt, Qnil, 0, charpos, it, 0, prec, 0, --- 16078,16084 ---- break; case MODE_LINE_DISPLAY: { ! int bytepos = last_offset; int charpos = string_byte_to_char (elt, bytepos); n += display_string (NULL, elt, Qnil, 0, charpos, it, 0, prec, 0, *************** *** 16088,16099 **** } else /* c == '%' */ { ! const unsigned char *percent_position = this; /* Get the specified minimum width. Zero means don't pad. */ field = 0; ! while ((c = *this++) >= '0' && c <= '9') field = field * 10 + c - '0'; /* Don't pad beyond the total padding allowed. */ --- 16089,16100 ---- } else /* c == '%' */ { ! int percent_position = offset; /* Get the specified minimum width. Zero means don't pad. */ field = 0; ! while ((c = SREF (elt, offset++)) >= '0' && c <= '9') field = field * 10 + c - '0'; /* Don't pad beyond the total padding allowed. */ *************** *** 16113,16119 **** int bytepos, charpos; unsigned char *spec; ! bytepos = percent_position - lisp_string; charpos = (STRING_MULTIBYTE (elt) ? string_byte_to_char (elt, bytepos) : bytepos); --- 16114,16120 ---- int bytepos, charpos; unsigned char *spec; ! bytepos = percent_position; charpos = (STRING_MULTIBYTE (elt) ? string_byte_to_char (elt, bytepos) : bytepos); ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2005-09-19 13:43 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-09-12 0:58 potential bug in display_mode_element? Kenichi Handa 2005-09-12 8:04 ` Kim F. Storm 2005-09-12 11:54 ` Kenichi Handa 2005-09-12 12:41 ` Potential GC-related problems in compose_chars_in_text Kim F. Storm 2005-09-13 1:08 ` Kenichi Handa 2005-09-13 15:54 ` Richard M. Stallman 2005-09-14 7:29 ` Kenichi Handa 2005-09-15 2:41 ` Richard M. Stallman 2005-09-15 4:21 ` Kenichi Handa 2005-09-16 1:01 ` Richard M. Stallman 2005-09-16 15:39 ` Stefan Monnier 2005-09-17 13:39 ` Richard M. Stallman 2005-09-19 13:43 ` Stefan Monnier 2005-09-12 15:34 ` potential bug in display_mode_element? Richard M. Stallman
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).