unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#9496: 24.0.50; Segfault on TAB-only composition
@ 2011-09-13 20:22 Johan Bockgård
  2011-09-14  5:33 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Johan Bockgård @ 2011-09-13 20:22 UTC (permalink / raw)
  To: 9496


(insert (compose-string "\t"))

Program received signal SIGSEGV, Segmentation fault.
0x00000000004d45d4 in x_set_glyph_string_gc (s=0x7fffffffae80) at xterm.c:1061
1061      PREPARE_FACE_FOR_DISPLAY (s->f, s->face);
(gdb) bt
#0  0x00000000004d45d4 in x_set_glyph_string_gc (s=0x7fffffffae80)
    at xterm.c:1061
#1  x_draw_glyph_string (s=0x7fffffffae80) at xterm.c:2683

This problem is not new, but due to other changes it now makes
`describe-char' (C-u C-x =) crash when executed on a tab character.


2011-09-13  Johan Bockgård  <bojohan@gnu.org>

	* xdisp.c (fill_composite_glyph_string): Always set s->face, to
	avoid a crash.


=== modified file 'src/xdisp.c'
--- src/xdisp.c	2011-09-09 01:06:52 +0000
+++ src/xdisp.c	2011-09-11 15:03:56 +0000
@@ -21745,6 +21749,12 @@ fill_composite_glyph_string (struct glyp
     }
   s->cmp_to = i;
 
+  if (s->face == NULL)
+    {
+      s->face = base_face->ascii_face;
+      s->font = s->face->font;
+    }
+
   /* All glyph strings for the same composition has the same width,
      i.e. the width set for the first component of the composition.  */
   s->width = s->first_glyph->pixel_width;






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2011-09-13 20:22 bug#9496: 24.0.50; Segfault on TAB-only composition Johan Bockgård
@ 2011-09-14  5:33 ` Eli Zaretskii
  2011-09-14 12:57   ` Kenichi Handa
  2011-11-11  7:15 ` Kenichi Handa
  2012-02-03 19:28 ` Paul Eggert
  2 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2011-09-14  5:33 UTC (permalink / raw)
  To: Johan Bockgård, handa; +Cc: 9496

> From: Johan Bockgård <bojohan@gnu.org>
> Date: Tue, 13 Sep 2011 22:22:14 +0200
> 
> 
> (insert (compose-string "\t"))
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x00000000004d45d4 in x_set_glyph_string_gc (s=0x7fffffffae80) at xterm.c:1061
> 1061      PREPARE_FACE_FOR_DISPLAY (s->f, s->face);
> (gdb) bt
> #0  0x00000000004d45d4 in x_set_glyph_string_gc (s=0x7fffffffae80)
>     at xterm.c:1061
> #1  x_draw_glyph_string (s=0x7fffffffae80) at xterm.c:2683
> 
> This problem is not new, but due to other changes it now makes
> `describe-char' (C-u C-x =) crash when executed on a tab character.
> 
> 
> 2011-09-13  Johan Bockgård  <bojohan@gnu.org>
> 
> 	* xdisp.c (fill_composite_glyph_string): Always set s->face, to
> 	avoid a crash.

Thanks.  But why not use the value of face_id computed in the loop
above the locus of your patch (which does that for any character but
the TAB)?

Anyway, there's some magic about the TAB character and compositions
that I always wondered about.  Perhaps it's time to find out why
composition-related code tests characters against the TAB in so many
places.  Handa-san, can you tell why, please?





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2011-09-14  5:33 ` Eli Zaretskii
@ 2011-09-14 12:57   ` Kenichi Handa
  2011-09-14 13:07     ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Kenichi Handa @ 2011-09-14 12:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 9496, bojohan

In article <E1R3i6N-0006kA-0X@fencepost.gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> Anyway, there's some magic about the TAB character and compositions
> that I always wondered about.  Perhaps it's time to find out why
> composition-related code tests characters against the TAB in so many
> places.  Handa-san, can you tell why, please?

The usage of TAB in composition is described in the
docstring of compose-region as this:

----------------------------------------------------------------------
(compose-region START END &optional COMPONENTS MODIFICATION-FUNC)

Compose characters in the current region.
[...]
Optional 3rd argument COMPONENTS, if non-nil, is a character, a string
or a vector or list of integers and rules.
[...]
If it is a string, the elements are alternate characters.  In
this case, TAB element has a special meaning.  If the first
character is TAB, the glyphs are displayed with left padding space
so that no pixel overlaps with the previous column.  If the last
character is TAB, the glyphs are displayed with right padding
space so that no pixel overlaps with the following column.
----------------------------------------------------------------------

This feature is typically used in describe-char to avoid a
single combining character overlapping with the surrounding
characters.

By the way, I'm sorry for not actively working on bug
reports related to codes written by me.  I'm now extremely
overloaded.

---
handa@m17n.org





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2011-09-14 12:57   ` Kenichi Handa
@ 2011-09-14 13:07     ` Eli Zaretskii
  2011-09-15  0:28       ` Kenichi Handa
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2011-09-14 13:07 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 9496, bojohan

> From: Kenichi Handa <handa@m17n.org>
> Cc: bojohan@gnu.org, 9496@debbugs.gnu.org
> Date: Wed, 14 Sep 2011 21:57:04 +0900
> 
> (compose-region START END &optional COMPONENTS MODIFICATION-FUNC)
> 
> Compose characters in the current region.
> [...]
> Optional 3rd argument COMPONENTS, if non-nil, is a character, a string
> or a vector or list of integers and rules.
> [...]
> If it is a string, the elements are alternate characters.  In
> this case, TAB element has a special meaning.  If the first
> character is TAB, the glyphs are displayed with left padding space
> so that no pixel overlaps with the previous column.  If the last
> character is TAB, the glyphs are displayed with right padding
> space so that no pixel overlaps with the following column.
> ----------------------------------------------------------------------
> 
> This feature is typically used in describe-char to avoid a
> single combining character overlapping with the surrounding
> characters.

Thanks.

So what is the meaning of the example shown by Bojohan, i.e.:

  (insert (compose-string "\t"))

What is expected from such a 'composition"?

It sounds like the code in fill_composite_glyph_string is not prepared
to deal with a situation where the string's only character is a TAB,
is that right?





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2011-09-14 13:07     ` Eli Zaretskii
@ 2011-09-15  0:28       ` Kenichi Handa
  2011-09-15  0:47         ` Stefan Monnier
  0 siblings, 1 reply; 17+ messages in thread
From: Kenichi Handa @ 2011-09-15  0:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 9496, bojohan

In article <E1R3pBs-0004FS-6i@fencepost.gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> So what is the meaning of the example shown by Bojohan, i.e.:

>   (insert (compose-string "\t"))

> What is expected from such a 'composition"?

I currently have no idea.  It's just a wrong usage.

> It sounds like the code in fill_composite_glyph_string is not prepared
> to deal with a situation where the string's only character is a TAB,
> is that right?

Yes.

---
Kenichi Handa
handa@m17n.org





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2011-09-15  0:28       ` Kenichi Handa
@ 2011-09-15  0:47         ` Stefan Monnier
  2011-09-15  4:10           ` Kenichi Handa
  0 siblings, 1 reply; 17+ messages in thread
From: Stefan Monnier @ 2011-09-15  0:47 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 9496, bojohan

>> So what is the meaning of the example shown by Bojohan, i.e.:
>> (insert (compose-string "\t"))
>> What is expected from such a 'composition"?
> I currently have no idea.  It's just a wrong usage.

I think it would be logical to handle this (as well as "\t\t") in the
same way as the empty string (after all "\t" is a "" with a \t
prepended or appended).


        Stefan





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2011-09-15  0:47         ` Stefan Monnier
@ 2011-09-15  4:10           ` Kenichi Handa
  0 siblings, 0 replies; 17+ messages in thread
From: Kenichi Handa @ 2011-09-15  4:10 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 9496, bojohan

In article <jwvmxe6ljkt.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> So what is the meaning of the example shown by Bojohan, i.e.:
>>> (insert (compose-string "\t"))
>>> What is expected from such a 'composition"?
> > I currently have no idea.  It's just a wrong usage.

> I think it would be logical to handle this (as well as "\t\t") in the
> same way as the empty string (after all "\t" is a "" with a \t
> prepended or appended).

That is one way.  But I'm not sure that the current code
handles the composition component of empty string
correctly.  :-(

Another way is to treat it as the same way as "\t" display
property.   Composition component of one character string is
the same as display property of that one character string.
For instance, the following two has the same visual effect:

(put-text-property 1 2 'display "a")
(compose-region 1 2 "a")

---
Kenichi Handa
handa@m17n.org





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2011-09-13 20:22 bug#9496: 24.0.50; Segfault on TAB-only composition Johan Bockgård
  2011-09-14  5:33 ` Eli Zaretskii
@ 2011-11-11  7:15 ` Kenichi Handa
  2012-02-03 19:28 ` Paul Eggert
  2 siblings, 0 replies; 17+ messages in thread
From: Kenichi Handa @ 2011-11-11  7:15 UTC (permalink / raw)
  To: Johan Bockgård; +Cc: 9496

In article <8739g0tcp5.fsf@gnu.org>, Johan Bockgård <bojohan@gnu.org> writes:

> (insert (compose-string "\t"))

> Program received signal SIGSEGV, Segmentation fault.
> 0x00000000004d45d4 in x_set_glyph_string_gc (s=0x7fffffffae80) at xterm.c:1061
> 1061      PREPARE_FACE_FOR_DISPLAY (s->f, s->face);
> (gdb) bt
> #0  0x00000000004d45d4 in x_set_glyph_string_gc (s=0x7fffffffae80)
>     at xterm.c:1061
> #1  x_draw_glyph_string (s=0x7fffffffae80) at xterm.c:2683

> This problem is not new, but due to other changes it now makes
> `describe-char' (C-u C-x =) crash when executed on a tab character.


> 2011-09-13  Johan Bockgård  <bojohan@gnu.org>

> 	* xdisp.c (fill_composite_glyph_string): Always set s->face, to
> 	avoid a crash.

Thank you.  Although we have not yet decided what is the
right displaying of such a composition, I've just installed
your fix at least to avoid crash.

---
Kenichi Handa
handa@m17n.org





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2011-09-13 20:22 bug#9496: 24.0.50; Segfault on TAB-only composition Johan Bockgård
  2011-09-14  5:33 ` Eli Zaretskii
  2011-11-11  7:15 ` Kenichi Handa
@ 2012-02-03 19:28 ` Paul Eggert
  2012-02-03 21:22   ` Eli Zaretskii
  2 siblings, 1 reply; 17+ messages in thread
From: Paul Eggert @ 2012-02-03 19:28 UTC (permalink / raw)
  To: 9496

When a fix for this was merged into the trunk I noticed a problem from the
trunk's point of view: the fix introduced the possibility of an unchecked
integer overflow which would cause character widths to go negative
and could cause real problems later.  I installed this further fix
to the trunk:

Handle overflow when computing char display width (Bug#9496).
* character.c (char_width): Return EMACS_INT, not int.
(char_width, c_string_width): Check for overflow when
computing the width; this is possible now that individual
characters can have unbounded width.  Problem introduced
by merge from Emacs 23 on 2012-01-19.
=== modified file 'src/character.c'
--- src/character.c	2012-01-19 07:21:25 +0000
+++ src/character.c	2012-02-03 19:19:42 +0000
@@ -311,10 +311,10 @@

 /* Return width (columns) of C considering the buffer display table DP. */

-static int
+static EMACS_INT
 char_width (int c, struct Lisp_Char_Table *dp)
 {
-  int width = CHAR_WIDTH (c);
+  EMACS_INT width = CHAR_WIDTH (c);

   if (dp)
     {
@@ -326,7 +326,12 @@
 	  {
 	    ch = AREF (disp, i);
 	    if (CHARACTERP (ch))
-	      width += CHAR_WIDTH (XFASTINT (ch));
+	      {
+		int w = CHAR_WIDTH (XFASTINT (ch));
+		if (INT_ADD_OVERFLOW (width, w))
+		  string_overflow ();
+		width += w;
+	      }
 	  }
     }
   return width;
@@ -340,7 +345,8 @@
 usage: (char-width CHAR)  */)
   (Lisp_Object ch)
 {
-  int c, width;
+  int c;
+  EMACS_INT width;

   CHECK_CHARACTER (ch);
   c = XINT (ch);
@@ -367,10 +373,14 @@
     {
       int bytes;
       int c = STRING_CHAR_AND_LENGTH (str + i_byte, bytes);
-      int thiswidth = char_width (c, dp);
+      EMACS_INT thiswidth = char_width (c, dp);

-      if (precision > 0
-	  && (width + thiswidth > precision))
+      if (precision <= 0)
+	{
+	  if (INT_ADD_OVERFLOW (width, thiswidth))
+	    string_overflow ();
+	}
+      else if (precision - width < thiswidth)
 	{
 	  *nchars = i;
 	  *nbytes = i_byte;






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2012-02-03 19:28 ` Paul Eggert
@ 2012-02-03 21:22   ` Eli Zaretskii
  2012-02-03 22:07     ` Paul Eggert
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2012-02-03 21:22 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 9496

> Date: Fri, 03 Feb 2012 11:28:29 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> 
> -static int
> +static EMACS_INT
>  char_width (int c, struct Lisp_Char_Table *dp)
>  {
> -  int width = CHAR_WIDTH (c);
> +  EMACS_INT width = CHAR_WIDTH (c);

Oh, come on!  CHAR_WIDTH returns the character's width
_on_the_screen_.  So how in the world could it _ever_ overflow a
32-bit integer??





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2012-02-03 21:22   ` Eli Zaretskii
@ 2012-02-03 22:07     ` Paul Eggert
  2012-02-04  6:58       ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Paul Eggert @ 2012-02-03 22:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 9496

On 02/03/2012 01:22 PM, Eli Zaretskii wrote:
> how in the world could it _ever_ overflow a
> 32-bit integer??

(setq tab-width 1000)
(standard-display-ascii 256 (make-string 2147484 ?\t))

This causes (char-width 256) to return 2147484000,
now that the patch has been installed.





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2012-02-03 22:07     ` Paul Eggert
@ 2012-02-04  6:58       ` Eli Zaretskii
  2012-02-04  7:18         ` Paul Eggert
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2012-02-04  6:58 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 9496

> Date: Fri, 03 Feb 2012 14:07:48 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: 9496@debbugs.gnu.org
> 
> On 02/03/2012 01:22 PM, Eli Zaretskii wrote:
> > how in the world could it _ever_ overflow a
> > 32-bit integer??
> 
> (setq tab-width 1000)
> (standard-display-ascii 256 (make-string 2147484 ?\t))

Then don't do that.





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2012-02-04  6:58       ` Eli Zaretskii
@ 2012-02-04  7:18         ` Paul Eggert
  2012-02-04  8:14           ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Paul Eggert @ 2012-02-04  7:18 UTC (permalink / raw)
  To: 9496

On 02/03/2012 10:58 PM, Eli Zaretskii wrote:
> Then don't do that.

I don't see why not.  GNU code is supposed to avoid arbitrary limits
on the length or number of any data structure.  A 64-bit Emacs should
not arbitrarily impose a 32-bit limit here.






^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2012-02-04  7:18         ` Paul Eggert
@ 2012-02-04  8:14           ` Eli Zaretskii
  2012-02-04 23:43             ` Paul Eggert
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2012-02-04  8:14 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 9496

> Date: Fri, 03 Feb 2012 23:18:50 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> 
> On 02/03/2012 10:58 PM, Eli Zaretskii wrote:
> > Then don't do that.
> 
> I don't see why not.

Because the result is meaningless anyway.  Being able to "express" a
2G column wide "character" is not useful, so avoiding overflow for
that use case is not really a solution to the problem.

It would be a much better solution if char_width would limit the
result to the same sane limit we have in all the related functions and
macros, i.e. to 1000.  If you limit everything in that function to
1000, including its result, there's no danger of overflow anywhere,
and the resulting display is not too preposterous.





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2012-02-04  8:14           ` Eli Zaretskii
@ 2012-02-04 23:43             ` Paul Eggert
  2012-02-05 16:36               ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Paul Eggert @ 2012-02-04 23:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 9496

On 02/04/2012 12:14 AM, Eli Zaretskii wrote:
> It would be a much better solution if char_width would limit the
> result to the same sane limit we have in all the related functions and
> macros, i.e. to 1000.

OK, that's doable, but if done in isolation it would introduce other
bugs, no?  If char-width returns a value that's arbitrarily ceilinged
at 1000, but actual characters can be wider than 1000 columns,
code that uses char-width to count columns will be buggy.

We could address this by changing the code that renders characters,
so that the code limits them to at most 1000 columns on the display.
That's what's done with tabs, so there's good precedent for it.
This would require changes to the rendering code so that it matches
the altered char-width that's ceilinged at 1000.





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2012-02-04 23:43             ` Paul Eggert
@ 2012-02-05 16:36               ` Eli Zaretskii
  2012-02-05 16:39                 ` Paul Eggert
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2012-02-05 16:36 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 9496

> Date: Sat, 04 Feb 2012 15:43:56 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: 9496@debbugs.gnu.org
> 
> On 02/04/2012 12:14 AM, Eli Zaretskii wrote:
> > It would be a much better solution if char_width would limit the
> > result to the same sane limit we have in all the related functions and
> > macros, i.e. to 1000.
> 
> OK, that's doable, but if done in isolation it would introduce other
> bugs, no?  If char-width returns a value that's arbitrarily ceilinged
> at 1000, but actual characters can be wider than 1000 columns,
> code that uses char-width to count columns will be buggy.

How can an actual character (not its display-table replacement) be
wider than that?

> We could address this by changing the code that renders characters,
> so that the code limits them to at most 1000 columns on the display.

That was my intent, yes.





^ permalink raw reply	[flat|nested] 17+ messages in thread

* bug#9496: 24.0.50; Segfault on TAB-only composition
  2012-02-05 16:36               ` Eli Zaretskii
@ 2012-02-05 16:39                 ` Paul Eggert
  0 siblings, 0 replies; 17+ messages in thread
From: Paul Eggert @ 2012-02-05 16:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 9496

On 02/05/2012 08:36 AM, Eli Zaretskii wrote:

>> If char-width returns a value that's arbitrarily ceilinged
>> at 1000, but actual characters can be wider than 1000 columns,
>> code that uses char-width to count columns will be buggy.
> 
> How can an actual character (not its display-table replacement) be
> wider than that?

By "actual character" I meant to include how the character is
displayed, since char-width measures display width.  So that
includes the display-table replacement if any.

>> We could address this by changing the code that renders characters,
>> so that the code limits them to at most 1000 columns on the display.
> 
> That was my intent, yes.

OK, that sounds reasonable.





^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2012-02-05 16:39 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-13 20:22 bug#9496: 24.0.50; Segfault on TAB-only composition Johan Bockgård
2011-09-14  5:33 ` Eli Zaretskii
2011-09-14 12:57   ` Kenichi Handa
2011-09-14 13:07     ` Eli Zaretskii
2011-09-15  0:28       ` Kenichi Handa
2011-09-15  0:47         ` Stefan Monnier
2011-09-15  4:10           ` Kenichi Handa
2011-11-11  7:15 ` Kenichi Handa
2012-02-03 19:28 ` Paul Eggert
2012-02-03 21:22   ` Eli Zaretskii
2012-02-03 22:07     ` Paul Eggert
2012-02-04  6:58       ` Eli Zaretskii
2012-02-04  7:18         ` Paul Eggert
2012-02-04  8:14           ` Eli Zaretskii
2012-02-04 23:43             ` Paul Eggert
2012-02-05 16:36               ` Eli Zaretskii
2012-02-05 16:39                 ` Paul Eggert

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).