unofficial mirror of emacs-devel@gnu.org 
* Bug in Unicode character width in Emacs 25.1, bisected to a761fbf (Unicode 9.0.0beta import)
@ 2016-09-18 18:52 Ævar Arnfjörð Bjarmason
  2016-09-19 16:34 ` Eli Zaretskii
  0 siblings, 1 reply; 5+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2016-09-18 18:52 UTC (permalink / raw)
  To: emacs-devel-mXXj517/zsQ; +Cc: mu-discuss-/JYPxA39Uh5TLH3MbocFFw, Eli Zaretskii

[I'm sending this to the ML instead of bug-* because I figure a bug
caused by the Unicode 9 import will garner some wider interest than
your typical regression]

The mu4e mode has a mu4e-use-fancy-chars option which if set will use
e.g. ⚓ (Unicode ANCHOR; U+2693) instead of "a" in the vertically
aligned headers view to show that an E-Mail has an attachment.

In Emacs 25.1 this vertical alignment is off, consistent with ⚓ being
considered a zero-width character, i.e. the content to the right-hand
side of the ⚓ character is shifted 1 character to the left.

I bisected this to Eli's a761fbf,
http://git.savannah.gnu.org/cgit/emacs.git/commit/?id=a761fbf

I also manually verified that the bisect was correct by testing both
that commit & the preceding commit, 06aad39. The issue doesn't happen
with 06aad39, but does with a761fbf.

The commit also reverts cleanly on top of the emacs-25.1 tag, and
reverting it resolves this issue for me.

Both with and without a761fbf, the M-x describe-char output for that
character looks the same.

I'm sorry that I don't have a more isolated test case than "run mu4e,
turn on mu4e-use-fancy-chars and check out the misalignment in the
header view" but I figure with the bisect + my successfully testing a
revert of a761fbf on top of emacs-25.1 we have enough info to get
started in narrowing this down.

-- 
You received this message because you are subscribed to the Google Groups "mu-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mu-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit https://groups.google.com/d/optout.



* Re: Bug in Unicode character width in Emacs 25.1, bisected to a761fbf (Unicode 9.0.0beta import)
  2016-09-18 18:52 Bug in Unicode character width in Emacs 25.1, bisected to a761fbf (Unicode 9.0.0beta import) Ævar Arnfjörð Bjarmason
@ 2016-09-19 16:34 ` Eli Zaretskii
       [not found]   ` <831t0fj1c0.fsf-mXXj517/zsQ@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Eli Zaretskii @ 2016-09-19 16:34 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: mu-discuss, emacs-devel

> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Date: Sun, 18 Sep 2016 20:52:44 +0200
> Cc: mu-discuss@googlegroups.com, Eli Zaretskii <eliz@gnu.org>
> 
> [I'm sending this to the ML instead of bug-* because I figure a bug
> caused by the Unicode 9 import will garner some wider interest than
> your typical regression]

IMO, that was a mistake.  Bugs should be reported to the bug tracker,
and all those who might be interested are reading the bug mailing list
anyway.  Reporting a bug with "M-x report-emacs-bug" has the advantage
of including in the report important details about your system
configuration that might be relevant to the issue.

> The mu4e mode has a mu4e-use-fancy-chars option which if set will use
> e.g. ⚓ (Unicode ANCHOR; U+2693) instead of "a" in the vertically
> aligned headers view to show that an E-Mail has an attachment.
> 
> In Emacs 25.1 this vertical alignment is off consistent with ⚓ being
> considered a zero-width character, i.e. the content to the right-hand
> side of the ⚓ character is shifted 1 character to the left.

This character's width is 2, not zero:

  (char-width ?⚓) => 2

> I'm sorry that I don't have a more isolated test case than "run mu4e,
> turn on mu4e-use-fancy-chars and check out the misalignment in the
> header view" but I figure with the bisect + my successfully testing a
> revert of a761fbf on top of emacs-25.1 we have enough info to get
> started in narrowing this down.

Unfortunately, this description is not enough.  And since it is
unlikely we'll decide to revert that commit, we need more information
to understand what code (or font?) is the culprit and how to fix that.

For starters, I don't yet have a clear idea of what display problems
are caused by that character; a screenshot would help.  The results of
"C-u C-x =" with point on the anchor character would also be of value.

Finally, does selecting a different font for this character fix the
problem?

Thanks.




* Re: Bug in Unicode character width in Emacs 25.1, bisected to a761fbf (Unicode 9.0.0beta import)
       [not found]   ` <831t0fj1c0.fsf-mXXj517/zsQ@public.gmane.org>
@ 2016-09-19 19:12     ` Ævar Arnfjörð Bjarmason
       [not found]       ` <CACBZZX6zZ0zDttq6f9FAXFmks5NzOOq+a4akm+7fquGrabDGNQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2016-09-19 20:17       ` Eli Zaretskii
  0 siblings, 2 replies; 5+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2016-09-19 19:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel-mXXj517/zsQ, mu-discuss-/JYPxA39Uh5TLH3MbocFFw


On Mon, Sep 19, 2016 at 6:34 PM, Eli Zaretskii <eliz-mXXj517/zsQ@public.gmane.org> wrote:
>> From: Ævar Arnfjörð Bjarmason <avarab-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> Date: Sun, 18 Sep 2016 20:52:44 +0200
>> Cc: mu-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org, Eli Zaretskii <eliz-mXXj517/zsQ@public.gmane.org>
>>
>> [I'm sending this to the ML instead of bug-* because I figure a bug
>> caused by the Unicode 9 import will garner some wider interest than
>> your typical regression]
>
> IMO, that was a mistake.  Bugs should be reported to the bug tracker,
> and all those who might be interested are reading the bug mailing list
> anyway.  Reporting a bug with "M-x report-emacs-bug" has the advantage
> of including in the report important details about your system
> configuration that might be relevant to the issue.
>
>> The mu4e mode has a mu4e-use-fancy-chars option which if set will use
>> e.g. ⚓ (Unicode ANCHOR; U+2693) instead of "a" in the vertically
>> aligned headers view to show that an E-Mail has an attachment.
>>
>> In Emacs 25.1 this vertical alignment is off consistent with ⚓ being
>> considered a zero-width character, i.e. the content to the right-hand
>> side of the ⚓ character is shifted 1 character to the left.
>
> This character's width is 2, not zero:

Sorry, I had it the other way around; I should have said 2, not zero. Anyway:

>   (char-width ?⚓) => 2

That seems like the bug in question. According to the docs of
char-width it returns "width of CHAR when displayed in the current
buffer".

In any fixed-width font I try this:

    b|1
    æ|2
    ✔|3
    ⚓|4

Always shows un unbroken vertical line. I.e. the characters all have
the same display width of one. Does that display differently for you?
I.e. is the vertical bar on the line with the ⚓ in the same column as
the bars on the other lines?

If not, ⚓ reporting a width of 2 seems like an isolated test case for
the bug (and who knows what other characters also changed...).
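
A minimal sketch of such an isolated check, assuming only the built-in
char-width function; the helper name my/char-width-report is
hypothetical. It collects the column width Emacs reports for each of
the four sample characters, so the result can be compared between two
builds. The values depend on the Emacs version, so none are shown.

```elisp
;; Sketch only: `my/char-width-report' is a hypothetical helper name.
(defun my/char-width-report (chars)
  "Return a list of (CHAR . WIDTH) pairs, WIDTH as reported by `char-width'."
  (mapcar (lambda (ch) (cons ch (char-width ch))) chars))

;; Report the four sample characters from the alignment test above.
(dolist (pair (my/char-width-report '(?b ?æ ?✔ ?⚓)))
  (message "%c (U+%04X): char-width %d" (car pair) (car pair) (cdr pair)))
```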

Before your patch:

    (char-width ?⚓) --> 2

But now it's:

    (char-width ?⚓) --> 1

The display issues I'm seeing are consistent with the rendering
machinery thinking it has a width of two, and thus subsequent columns
fall out of alignment.

Also, applying this monkeypatch works around the issue:

    diff --git a/src/character.c b/src/character.c
    index 9f60aa7..583357c 100644
    --- a/src/character.c
    +++ b/src/character.c
    @@ -314,6 +314,7 @@ usage: (char-width CHAR)  */)
       CHECK_CHARACTER (ch);
       c = XINT (ch);
       width = char_width (c, buffer_display_table ());
    +  width = 1;
       return make_number (width);
     }

That patch is obviously not meant to be applied as a fix, but just
shows that pretending that everything has a width of 1 again (which in
the case of what mu4e shows, everything does) makes things align
properly again.

>> I'm sorry that I don't have a more isolated test case than "run mu4e,
>> turn on mu4e-use-fancy-chars and check out the misalignment in the
>> header view" but I figure with the bisect + my successfully testing a
>> revert of a761fbf on top of emacs-25.1 we have enough info to get
>> started in narrowing this down.
>
> Unfortunately, this description is not enough.  And since it is
> unlikely we'll decide to revert that commit, we need more information
> to understand what code (or font?) is the culprit and how to fix that.
>
> For starters, I don't yet have a clear idea of what display problems
> are caused by that character; a screenshot would help.  The results of
> "C-u C-x =" with point on the anchor character would also be of value.

I attached a minimal screenshot. The topmost line is correctly
aligned, but the subsequent two are out of alignment with it.

I get the exact same output with C-u C-x = before & after a761fbf, so
it's surely the same output you're seeing:

> Finally, does selecting a different font for this character fix the
> problem?


[-- Attachment #2: emacs-bug-a761fbf.png --]
[-- Type: image/png, Size: 5221 bytes --]


* Re: Bug in Unicode character width in Emacs 25.1, bisected to a761fbf (Unicode 9.0.0beta import)
       [not found]       ` <CACBZZX6zZ0zDttq6f9FAXFmks5NzOOq+a4akm+7fquGrabDGNQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-09-19 19:16         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 5+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2016-09-19 19:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel-mXXj517/zsQ, mu-discuss-/JYPxA39Uh5TLH3MbocFFw

On Mon, Sep 19, 2016 at 9:12 PM, Ævar Arnfjörð Bjarmason
<avarab-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, Sep 19, 2016 at 6:34 PM, Eli Zaretskii <eliz-mXXj517/zsQ@public.gmane.org> wrote:
>>> From: Ævar Arnfjörð Bjarmason <avarab-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>> Date: Sun, 18 Sep 2016 20:52:44 +0200
>>> Cc: mu-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org, Eli Zaretskii <eliz-mXXj517/zsQ@public.gmane.org>
>>>
>>> [I'm sending this to the ML instead of bug-* because I figure a bug
>>> caused by the Unicode 9 import will garner some wider interest than
>>> your typical regression]
>>
>> IMO, that was a mistake.  Bugs should be reported to the bug tracker,
>> and all those who might be interested are reading the bug mailing list
>> anyway.  Reporting a bug with "M-x report-emacs-bug" has the advantage
>> of including in the report important details about your system
>> configuration that might be relevant to the issue.
>>
>>> The mu4e mode has a mu4e-use-fancy-chars option which if set will use
>>> e.g. ⚓ (Unicode ANCHOR; U+2693) instead of "a" in the vertically
>>> aligned headers view to show that an E-Mail has an attachment.
>>>
>>> In Emacs 25.1 this vertical alignment is off consistent with ⚓ being
>>> considered a zero-width character, i.e. the content to the right-hand
>>> side of the ⚓ character is shifted 1 character to the left.
>>
>> This character's width is 2, not zero:
>
> Sorry, I had it the other way around, I should have said 2, not zero, anyway:
>
>>   (char-width ?⚓) => 2
>
> That seems like the bug in question. According to the docs of
> char-width it returns "width of CHAR when displayed in the current
> buffer".
>
> In any fixed-width font I try this:
>
>     b|1
>     æ|2
>     ✔|3
>     ⚓|4

Sorry, I should really read my E-Mails over a couple more times
before I send them.

> Always shows un unbroken vertical line. I.e. the characters all have

"shows an unbroken"...

> the same display width of one. Does that display differently for you?
> I.e. is the vertical bar for the line with the ⚓ on the column as the
> digits for the rest?
>
> If not, ⚓ reporting a width of 2 seems like an isolated test case for
> the bug (and who knows what other characters also changed...).
>
> Before your patch:
>
>     (char-width ?⚓) --> 2
>
> But now it's:
>
>     (char-width ?⚓) --> 1

I rewrote this a few times and got this mixed up. I mean before it was
1, *now* it's 2.

> The display issues I'm seeing are consistent with the rendering
> machinery thinking it has a width of two, and thus subsequent columns
> fall out of alignment.
>
> Also, applying this monkeypatch works around the issue:
>
>     diff --git a/src/character.c b/src/character.c
>     index 9f60aa7..583357c 100644
>     --- a/src/character.c
>     +++ b/src/character.c
>     @@ -314,6 +314,7 @@ usage: (char-width CHAR)  */)
>        CHECK_CHARACTER (ch);
>        c = XINT (ch);
>        width = char_width (c, buffer_display_table ());
>     +  width = 1;
>        return make_number (width);
>      }
>
> That patch is obviously not meant to be applied as a fix, but just
> shows that pretending that everything has a width of 1 again (which in
> the case of what mu4e shows, everything does) makes things align
> properly again.
>
>>> I'm sorry that I don't have a more isolated test case than "run mu4e,
>>> turn on mu4e-use-fancy-chars and check out the misalignment in the
>>> header view" but I figure with the bisect + my successfully testing a
>>> revert of a761fbf on top of emacs-25.1 we have enough info to get
>>> started in narrowing this down.
>>
>> Unfortunately, this description is not enough.  And since it is
>> unlikely we'll decide to revert that commit, we need more information
>> to understand what code (or font?) is the culprit and how to fix that.
>>
>> For starters, I don't yet have a clear idea of what display problems
>> are caused by that character; a screenshot would help.  The results of
>> "C-u C-x =" with point on the anchor character would also be of value.
>
> I attached a minimal screenshot. The topmost line is correctly
> aligned, but the subsequent two are out of alignment with it.
>
> I get the exact same output with C-u C-x = before & after a761fbf, so
> it's surely the same output you're seeing:
>
>> Finally, does selecting a different font for this character fix the
>> problem?




* Re: Bug in Unicode character width in Emacs 25.1, bisected to a761fbf (Unicode 9.0.0beta import)
  2016-09-19 19:12     ` Ævar Arnfjörð Bjarmason
       [not found]       ` <CACBZZX6zZ0zDttq6f9FAXFmks5NzOOq+a4akm+7fquGrabDGNQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2016-09-19 20:17       ` Eli Zaretskii
  1 sibling, 0 replies; 5+ messages in thread
From: Eli Zaretskii @ 2016-09-19 20:17 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: mu-discuss, emacs-devel

> From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Date: Mon, 19 Sep 2016 21:12:00 +0200
> Cc: emacs-devel@gnu.org, mu-discuss@googlegroups.com
> 
> Sorry, I had it the other way around, I should have said 2, not zero, anyway:
> 
> >   (char-width ?⚓) => 2
> 
> That seems like the bug in question. According to the docs of
> char-width it returns "width of CHAR when displayed in the current
> buffer".
> 
> In any fixed-width font I try this:
> 
>     b|1
>     æ|2
>     ✔|3
>     ⚓|4
> 
> Always shows un unbroken vertical line. I.e. the characters all have
> the same display width of one. Does that display differently for you?

Yes, I see a different display: the last 2 lines have their vertical
lines shifted to the right, as these two characters are wider.

> I.e. is the vertical bar for the line with the ⚓ on the column as the
> digits for the rest?

This is clearly a matter of the font used for this (here, it's Symbola
for the two symbols and Courier for the letters).

> If not, ⚓ reporting a width of 2 seems like an isolated test case for
> the bug (and who knows what other characters also changed...).

It's not a bug.  mu4e should either align text in pixels (e.g., using
the 'space' display property), or live with this limitation.

In general, an application that uses unusual symbols in the middle of
plain text should expect misalignment due to different fonts having
different sizes for the same characters, and because unusual symbols
might come from a font that is different from what the default face
uses.
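
The pixel-based alignment mentioned above could be sketched roughly as
follows. This is an illustration under stated assumptions, not how
mu4e builds its headers: the helper name my/pad-to-column and the
demo buffer name are hypothetical, and only the documented `space'
display specification with :align-to is relied on.

```elisp
;; Sketch only: `my/pad-to-column' is a hypothetical helper, not part of mu4e.
(defun my/pad-to-column (text column)
  "Return TEXT followed by a stretch space that ends at COLUMN.
The space is one character whose `display' property asks the display
engine to extend it up to COLUMN, measured in units of the normal
character width, so the rendered width of TEXT does not shift what
follows it."
  (concat text (propertize " " 'display `(space :align-to ,column))))

;; Both flags are followed by a stretch space ending at column 4, so the
;; subjects start at the same horizontal position on both lines.
(with-current-buffer (get-buffer-create "*align-demo*")
  (erase-buffer)
  (insert (my/pad-to-column "a" 4) "Subject one\n"
          (my/pad-to-column "⚓" 4) "Subject two\n"))
```

The padding is computed at display time, so it adapts to whichever
font ends up rendering the symbol. If TEXT is already wider than
COLUMN, the stretch space cannot pull the following text back.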

> The display issues I'm seeing are consistent with the rendering
> machinery thinking it has a width of two, and thus subsequent columns
> fall out of alignment.

As expected: the character is wider than normal, at least with the
fonts I have here, so the misalignment is expected, because counting
in columns, like mu4e does, is inaccurate in these use cases.
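
One way to see the gap between counting in columns and the rendered
width is to compare the two directly. This sketch assumes Emacs 29 or
later, where string-pixel-width is available; the helper name
my/width-report is hypothetical, and the numbers depend on the fonts
installed, so none are shown.

```elisp
(require 'subr-x) ; `string-pixel-width' lives here in Emacs 29 and later

;; Sketch only: `my/width-report' is a hypothetical helper name.
(defun my/width-report (string)
  "Return (COLUMNS . PIXELS) for STRING.
COLUMNS comes from `string-width'; PIXELS is the rendered width
reported by `string-pixel-width' with the fonts currently in use."
  (cons (string-width string) (string-pixel-width string)))

;; If PIXELS differs from COLUMNS times the default character width,
;; padding computed from the column count cannot line up on screen.
(let ((report (my/width-report "⚓")))
  (message "columns=%d pixels=%d default-char-pixels=%d"
           (car report) (cdr report) (default-font-width)))
```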

> That patch is obviously not meant to be applied as a fix, but just
> shows that pretending that everything has a width of 1 again (which in
> the case of what mu4e shows, everything does) makes things align
> properly again.

Which is clearly wrong, since not all characters have the same width.

> I get the exact same output with C-u C-x = before & after a761fbf, so
> it's surely the same output you're seeing:

We cannot see the same output because we use different fonts.

> > Finally, does selecting a different font for this character fix the
> > problem?

What about this question?

Thanks.



