unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#11082: 24.0.94; u.glyphless member in struct glyph does not fit in 32 bits
From: YAMAMOTO Mitsuharu @ 2012-03-24  5:23 UTC (permalink / raw)
  To: 11082

In dispextern.h:

struct glyph
{
(snip)
  /* A union of sub-structures for different glyph types.  */
  union
  {
(snip)
    /* Sub-stretch for type == GLYPHLESS_GLYPH.  */
    struct
    {
      /* Value is an enum of the type glyphless_display_method.  */
      unsigned method : 2;
      /* 1 iff this glyph is for a character of no font. */
      unsigned for_no_font : 1;
      /* Length of acronym or hexadecimal code string (at most 8).  */
      unsigned len : 4;
      /* Character to display.  Actually we need only 22 bits.  */
      unsigned ch : 26;
    } glyphless;

    /* Used to compare all bit-fields above in one step.  */
    unsigned val;
  } u;
};

The member `u.glyphless' above requires at least 33 bits and does not
fit in the size (32 bits) of `u.val' on many environments.  As a
result, equality with respect to the `u.val' member (e.g., used in
GLYPH_EQUAL_P) does not necessarily mean the equality of glyphless
glyphs.

According to the comment above, it seems to be OK to shorten the
length of `u.glyphless.ch' member from 26 to 25.  Could someone
confirm this?
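
The arithmetic behind the report can be checked with a small standalone sketch. The type names `glyphless_bits' and `glyph_u' and the helper `val_misses_ch' are hypothetical stand-ins, not taken from dispextern.h. The sketch assumes a 32-bit `unsigned' and a compiler (such as GCC or Clang on common ABIs) that does not let a bit-field straddle a storage-unit boundary; bit-field layout is implementation-defined, so other compilers may behave differently.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical stand-in for the glyphless sub-structure quoted above.  */
struct glyphless_bits
{
  unsigned method : 2;
  unsigned for_no_font : 1;
  unsigned len : 4;
  unsigned ch : 26;
};

/* Hypothetical stand-in for the union `u' in struct glyph.  */
union glyph_u
{
  struct glyphless_bits glyphless;
  unsigned val;
};

/* Sum of the declared bit-field widths: 2 + 1 + 4 + 26.  */
enum { GLYPHLESS_BITS = 2 + 1 + 4 + 26 };

/* Return 1 if two glyphless values that differ only in `ch' still
   compare equal through `val', 0 otherwise.  */
int
val_misses_ch (unsigned ch1, unsigned ch2)
{
  union glyph_u a, b;

  /* Zero every byte first so that padding cannot affect the result.  */
  memset (&a, 0, sizeof a);
  memset (&b, 0, sizeof b);
  a.glyphless.ch = ch1;
  b.glyphless.ch = ch2;
  return a.val == b.val;
}
```

Under those assumptions the union is wider than `val', and calling val_misses_ch (0xE0100, 0xE0101) reports that the two values look equal through `val' even though their `ch' members differ.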

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp






* bug#11082: 24.0.94; u.glyphless member in struct glyph does not fit in 32 bits
From: Eli Zaretskii @ 2012-03-24  7:01 UTC (permalink / raw)
  To: YAMAMOTO Mitsuharu; +Cc: 11082

> Date: Sat, 24 Mar 2012 14:23:28 +0900
> From: YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>
> 
> In dispextern.h:
> 
>    316	struct glyph
>    317	{
> (snip)
>    418	  /* A union of sub-structures for different glyph types.  */
>    419	  union
>    420	  {
> (snip)
>    447	    /* Sub-stretch for type == GLYPHLESS_GLYPH.  */
>    448	    struct
>    449	    {
>    450	      /* Value is an enum of the type glyphless_display_method.  */
>    451	      unsigned method : 2;
>    452	      /* 1 iff this glyph is for a character of no font. */
>    453	      unsigned for_no_font : 1;
>    454	      /* Length of acronym or hexadecimal code string (at most 8).  */
>    455	      unsigned len : 4;
>    456	      /* Character to display.  Actually we need only 22 bits.  */
>    457	      unsigned ch : 26;
>    458	    } glyphless;
>    459	
>    460	    /* Used to compare all bit-fields above in one step.  */
>    461	    unsigned val;
>    462	  } u;
>    463	};
> 
> The member `u.glyphless' above requires at least 33 bits and does not
> fit in the size (32 bits) of `u.val' on many environments.  As a
> result, equality with respect to the `u.val' member (e.g., used in
> GLYPH_EQUAL_P) does not necessarily mean the equality of glyphless
> glyphs.

?? Isn't the size of a union defined by its widest member?  If so, we
just end up wasting some storage here, but we should never truncate a
bit field.  Do you have an actual test case that shows this kind of
bug?

> According to the comment above, it seems to be OK to shorten the
> length of `u.glyphless.ch' member from 26 to 25.  Could someone
> confirm this?

Confirmed.  From the ELisp manual:

     To support this multitude of characters and scripts, Emacs closely
  follows the "Unicode Standard".  The Unicode Standard assigns a unique
  number, called a "codepoint", to each and every character.  The range
  of codepoints defined by Unicode, or the Unicode "codespace", is
  `0..#x10FFFF' (in hexadecimal notation), inclusive.  Emacs extends this
  range with codepoints in the range `#x110000..#x3FFFFF', which it uses
  for representing characters that are not unified with Unicode and "raw
  8-bit bytes" that cannot be interpreted as characters.  Thus, a
  character codepoint in Emacs is a 22-bit integer number.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I would actually suggest using 22 bits for this field, to avoid
confusion in the future.
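
The 22-bit figure from the manual can be verified with a short sketch. The constant names `UNICODE_MAX' and `EMACS_CHAR_MAX' and the helper `bits_needed' are made up for this illustration; only the numeric limits come from the manual text quoted above.

```c
/* Upper limits quoted from the ELisp manual; the names are hypothetical.  */
#define UNICODE_MAX    0x10FFFFul  /* End of the Unicode codespace.  */
#define EMACS_CHAR_MAX 0x3FFFFFul  /* End of the extended Emacs range.  */

/* Return the number of bits needed to represent V (0 needs 0 bits).  */
int
bits_needed (unsigned long v)
{
  int n = 0;

  while (v != 0)
    {
      n++;
      v >>= 1;
    }
  return n;
}
```

bits_needed (UNICODE_MAX) is 21 and bits_needed (EMACS_CHAR_MAX) is 22, so any width of 22 bits or more is enough for the `ch' member.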






* bug#11082: 24.0.94; u.glyphless member in struct glyph does not fit in 32 bits
From: Andreas Schwab @ 2012-03-24  8:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 11082

Eli Zaretskii <eliz@gnu.org> writes:

>> The member `u.glyphless' above requires at least 33 bits and does not
>> fit in the size (32 bits) of `u.val' on many environments.  As a
>> result, equality with respect to the `u.val' member (e.g., used in
>> GLYPH_EQUAL_P) does not necessarily mean the equality of glyphless
>> glyphs.

This has been broken since GLYPHLESS_GLYPH was added.

> ?? Isn't the size of a union defined by its widest member?

The size of u.val is defined by the size of unsigned.

> If so, we just end up wasting some storage here, but we should never
> truncate a bit field.

It's not about truncation, but about ignored bits in GLYPH_EQUAL_P.

> I would actually suggest to use 22-bit for this field, to avoid
> confusion in the future.

Making the struct exactly 32 bits may be better since it can make access
to the ch member simpler.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."






* bug#11082: 24.0.94; u.glyphless member in struct glyph does not fit in 32 bits
From: YAMAMOTO Mitsuharu @ 2012-03-25  0:29 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: 11082

>>>>> On Sat, 24 Mar 2012 09:54:09 +0100, Andreas Schwab <schwab@linux-m68k.org> said:

>> ?? Isn't the size of a union defined by its widest member?

> The size of u.val is defined by the size of unsigned.

>> If so, we just end up wasting some storage here, but we should
>> never truncate a bit field.

> It's not about truncation, but about ignored bits in GLYPH_EQUAL_P.

Yes, I meant that.

A test case is as follows.  I checked it with Ubuntu 11.10, GTK+
build.

1. emacs -Q
2. (insert #xe0100) C-j
3. C-p C-p C-e C-b C-b C-d 1 C-e
   Now the line at the cursor is "(insert #xe0101)".
4. C-j
   The glyphless glyph just added is shown as "0E0100" instead of
   "0E0101".

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp






* bug#11082: 24.0.94; u.glyphless member in struct glyph does not fit in 32 bits
From: Kenichi Handa @ 2012-03-26  5:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 11082

In article <83zkb6trdb.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> >    447	    /* Sub-stretch for type == GLYPHLESS_GLYPH.  */
> >    448	    struct
> >    449	    {
> >    450	      /* Value is an enum of the type glyphless_display_method.  */
> >    451	      unsigned method : 2;
> >    452	      /* 1 iff this glyph is for a character of no font. */
> >    453	      unsigned for_no_font : 1;
> >    454	      /* Length of acronym or hexadecimal code string (at most 8).  */
> >    455	      unsigned len : 4;
> >    456	      /* Character to display.  Actually we need only 22 bits.  */
> >    457	      unsigned ch : 26;
> >    458	    } glyphless;
[...]
> > According to the comment above, it seems to be OK to shorten the
> > length of `u.glyphless.ch' member from 26 to 25.  Could someone
> > confirm this?

> Confirmed.  From the ELisp manual:
[...]
> I would actually suggest to use 22-bit for this field, to avoid
> confusion in the future.

I agree with changing the bit length.  I don't remember well, but I
think the current bit length setting was just my mistake.

In article <jwv4ntch6w6.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

> >   dispextern.h (struct glyph): Change the bit length of glyphless.ch to 22
> > to make the member glyphless fit in 32 bits.

> I think it's safer to reduce it to 25 bits, otherwise `val' field will
> refer to undefined bits.

Ok.  I've just installed that change.
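
As a rough illustration of why 25 bits works, here is a sketch of the layout with the `ch' width reduced. The names `glyphless_fixed', `glyph_fixed_u' and `val_sees_ch' are hypothetical, and the sketch assumes a 32-bit `unsigned' with no padding bits, as on common GCC and Clang targets. It is not the installed change itself.

```c
#include <string.h>

/* Hypothetical stand-in for the glyphless sub-structure with `ch'
   narrowed to 25 bits.  */
struct glyphless_fixed
{
  unsigned method : 2;
  unsigned for_no_font : 1;
  unsigned len : 4;
  unsigned ch : 25;
};

/* Hypothetical stand-in for the union `u' in struct glyph.  */
union glyph_fixed_u
{
  struct glyphless_fixed glyphless;
  unsigned val;
};

/* Sum of the declared bit-field widths: 2 + 1 + 4 + 25.  */
enum { FIXED_BITS = 2 + 1 + 4 + 25 };

/* Return 1 if two glyphless values with the given `ch' members differ
   when compared through `val', 0 if they compare equal.  */
int
val_sees_ch (unsigned ch1, unsigned ch2)
{
  union glyph_fixed_u a, b;

  memset (&a, 0, sizeof a);
  memset (&b, 0, sizeof b);
  a.glyphless.ch = ch1;
  b.glyphless.ch = ch2;
  return a.val != b.val;
}
```

With the widths summing to exactly 32, every bit of `val' belongs to some bit-field, so a comparison through `val' neither skips part of `ch' nor reads padding bits. A 22-bit `ch' would leave 3 bits of `val' as padding, which is the concern raised in the quoted remark above.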

---
Kenichi Handa
handa@m17n.org




