unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
@ 2013-09-05 14:08 Xue Fuqiao
  2013-09-05 14:33 ` Eli Zaretskii
  2013-09-05 16:48 ` Jan Djärv
  0 siblings, 2 replies; 53+ messages in thread
From: Xue Fuqiao @ 2013-09-05 14:08 UTC (permalink / raw)
  To: 15273

[-- Attachment #1: Type: text/plain, Size: 2608 bytes --]

To reproduce:

  emacs -Q
  !                      ;; input an exclamation mark (#x21)
  C-x 8 RET 2 0 E 4 RET  ;; COMBINING ENCLOSING UPWARD POINTING TRIANGLE
  RET                    ;; Newline
  C-x 8 2 6 A 0 RET      ;; WARNING SIGN

I tried many fonts, but all results look weird (the first line is
incomplete and too large).

(I haven't tried it on other platforms yet, so I'm not sure whether it's
NS-port specific.)

In GNU Emacs 24.3.50.1 (x86_64-apple-darwin12.4.0, NS apple-appkit-1187.39)
 of 2013-08-31 on xfq.local
Bzr revision: 114077 rgm@gnu.org-20130830174039-3aiddsbwhbn5tf9x
Windowing system distributor `Apple', version 10.3.1187
Configured using:
 `configure --with-ns --enable-checking --disable-silent-rules'

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
! C-x 8 <return> 2 0 e 4 <return> <return> C-x 8 <return>
2 6 a 0 <return> <escape> x r e - e m - m <backspace>
b <tab> <return>

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
read-number: Command attempted to use minibuffer while in minibuffer

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr misearch multi-isearch emacsbug message
format-spec rfc822 mml easymenu mml-sec mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mail-utils iso-transl time-date
tooltip ediff-hook vc-hooks lisp-float-type mwheel ns-win tool-bar dnd
fontset image regexp-opt fringe tabulated-list newcomment lisp-mode
prog-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev
minibuffer nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote make-network-process ns
multi-tty emacs)


-- 
Best regards, Xue Fuqiao.
http://www.gnu.org/software/emacs/

[-- Attachment #2: combining-character-sequences.png --]
[-- Type: image/png, Size: 50883 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-05 14:08 bug#15273: 24.3.50; Combining character sequences are displayed weirdly Xue Fuqiao
@ 2013-09-05 14:33 ` Eli Zaretskii
  2013-09-05 23:26   ` Xue Fuqiao
  2013-09-05 16:48 ` Jan Djärv
  1 sibling, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-05 14:33 UTC (permalink / raw)
  To: Xue Fuqiao; +Cc: 15273

> Date: Thu, 5 Sep 2013 22:08:21 +0800
> From: Xue Fuqiao <xfq.free@gmail.com>
> 
>   emacs -Q
>   !                      ;; input an exclamation mark (#x21)
>   C-x 8 RET 2 0 E 4 RET  ;; COMBINING ENCLOSING UPWARD POINTING TRIANGLE
>   RET                    ;; Newline
>   C-x 8 2 6 A 0 RET      ;; WARNING SIGN
> 
> I tried many fonts, but all results look weird (the first line is
> incomplete and too large).

What font(s) were used to display this character?

If you go to the first line, which shows the character only partially,
and type C-a, does that fix the problem?

What about "M-x redraw-display RET"?

Finally, if you go to the first partially displayed character and type
"C-u C-x =", what does Emacs show?






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-05 14:08 bug#15273: 24.3.50; Combining character sequences are displayed weirdly Xue Fuqiao
  2013-09-05 14:33 ` Eli Zaretskii
@ 2013-09-05 16:48 ` Jan Djärv
  2013-09-05 17:12   ` Eli Zaretskii
  2013-09-05 23:27   ` Xue Fuqiao
  1 sibling, 2 replies; 53+ messages in thread
From: Jan Djärv @ 2013-09-05 16:48 UTC (permalink / raw)
  To: Xue Fuqiao; +Cc: 15273

[-- Attachment #1: Type: text/plain, Size: 1104 bytes --]

Hello.

5 sep 2013 kl. 16:08 skrev Xue Fuqiao <xfq.free@gmail.com>:

> To reproduce:
> 
>  emacs -Q
>  !                      ;; input an exclamation mark (#x21)
>  C-x 8 RET 2 0 E 4 RET  ;; COMBINING ENCLOSING UPWARD POINTING TRIANGLE
>  RET                    ;; Newline
>  C-x 8 2 6 A 0 RET      ;; WARNING SIGN
> 

I assume you mean C-x 8 RET 2 6 A 0 RET.

> I tried many fonts, but all results look weird (the first line is
> incomplete and too large).
> 
> (I haven't tried it on other platforms yet, so I'm not sure whether it's
> NS-port specific.)

On Fedora 19, the first sequence does not combine; I just get ! followed by an upward-pointing triangle.
The second is very small; see the screenshot.

The NS one looks better, but when the cursor moves over the first row, it redraws incorrectly.  The metrics are also off: the distance between the first triangle and the aaa to its right is actually only one space.  The aaa to the left is also followed by a space, but there the distance is too short, which is the source of the redraw error.

W32 anyone?

	Jan D.



[-- Attachment #2.1: Type: text/html, Size: 2575 bytes --]

[-- Attachment #2.2: fedora19.png --]
[-- Type: image/png, Size: 7700 bytes --]

[-- Attachment #2.3: ns.png --]
[-- Type: image/png, Size: 8155 bytes --]

[-- Attachment #2.4: ns-redrawerror.png --]
[-- Type: image/png, Size: 7990 bytes --]


* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-05 16:48 ` Jan Djärv
@ 2013-09-05 17:12   ` Eli Zaretskii
  2013-09-05 17:24     ` Eli Zaretskii
  2013-09-05 17:29     ` Jan Djärv
  2013-09-05 23:27   ` Xue Fuqiao
  1 sibling, 2 replies; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-05 17:12 UTC (permalink / raw)
  To: Jan Djärv; +Cc: xfq.free, 15273

> From: Jan Djärv <jan.h.d@swipnet.se>
> Date: Thu, 5 Sep 2013 18:48:43 +0200
> Cc: 15273@debbugs.gnu.org
> 
> 
> >  emacs -Q
> >  !                      ;; input an exclamation mark (#x21)
> >  C-x 8 RET 2 0 E 4 RET  ;; COMBINING ENCLOSING UPWARD POINTING TRIANGLE
> >  RET                    ;; Newline
> >  C-x 8 2 6 A 0 RET      ;; WARNING SIGN
> > 
> 
> I assume you mean C-x 8 RET 2 6 A 0 RET.
> 
> > I tried many fonts, but all results look weird (the first line is
> > incomplete and too large).
> > 
> > (I haven't tried it on other platforms yet, so I'm not sure whether it's
> > NS-port specific.)
> 
> On Fedora 19, the first does not combine, I just get ! followed by an upward pointing triangle.
> The second is very small, see screen shot.
> 
> The NS one looks better, but when moving the cursor over the first row, it redraws funny.  And metrics are off, the distance between the first triangle and the aaa to the right is actually only one space.  The aaa to the left is also followed by a space, but there the distance is too short, and the source of the redraw error.
> 
> W32 anyone?

On w32, the Uniscribe font driver does not compose these two
characters.  So I guess this composition is done by the driver used by
NS, and perhaps the composition information Emacs gets is incorrect.

Once again, please show the full output of "C-u C-x =" on the composed
character, the pixel-level data of the composition shown there will
probably tell us what is wrong.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-05 17:12   ` Eli Zaretskii
@ 2013-09-05 17:24     ` Eli Zaretskii
  2013-09-05 17:33       ` Jan Djärv
  2013-09-05 17:29     ` Jan Djärv
  1 sibling, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-05 17:24 UTC (permalink / raw)
  To: jan.h.d; +Cc: xfq.free, 15273

> Date: Thu, 05 Sep 2013 20:12:13 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: xfq.free@gmail.com, 15273@debbugs.gnu.org
> 
> > The NS one looks better, but when moving the cursor over the first row, it redraws funny.  And metrics are off, the distance between the first triangle and the aaa to the right is actually only one space.  The aaa to the left is also followed by a space, but there the distance is too short, and the source of the redraw error.
> > 
> > W32 anyone?
> 
> On w32, the Uniscribe font driver does not compose these two
> characters.  So I guess this composition is done by the driver used by
> NS, and perhaps the composition information Emacs gets is incorrect.
> 
> Once again, please show the full output of "C-u C-x =" on the composed
> character, the pixel-level data of the composition shown there will
> probably tell us what is wrong.

On one OSX system, the result of "! C-x 8 RET 2 0 E 4 RET" _looks_
like a single character, but actually isn't: I can move cursor twice
across the result, and "C-u C-x =" doesn't show the composition
results.  Is that what you see?






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-05 17:12   ` Eli Zaretskii
  2013-09-05 17:24     ` Eli Zaretskii
@ 2013-09-05 17:29     ` Jan Djärv
  2013-09-05 17:56       ` Eli Zaretskii
  1 sibling, 1 reply; 53+ messages in thread
From: Jan Djärv @ 2013-09-05 17:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

Hi.

5 sep 2013 kl. 19:12 skrev Eli Zaretskii <eliz@gnu.org>:

>> From: Jan Djärv <jan.h.d@swipnet.se>
>> Date: Thu, 5 Sep 2013 18:48:43 +0200
>> Cc: 15273@debbugs.gnu.org
>> 
>> 
>>> emacs -Q
>>> !                      ;; input an exclamation mark (#x21)
>>> C-x 8 RET 2 0 E 4 RET  ;; COMBINING ENCLOSING UPWARD POINTING TRIANGLE
>>> RET                    ;; Newline
>>> C-x 8 2 6 A 0 RET      ;; WARNING SIGN
>>> 
>> 
>> I assume you mean C-x 8 RET 2 6 A 0 RET.
>> 
>>> I tried many fonts, but all results look weird (the first line is
>>> incomplete and too large).
>>> 
>>> (I haven't tried it on other platforms yet, so I'm not sure whether it's
>>> NS-port specific.)
>> 
>> On Fedora 19, the first does not combine, I just get ! followed by an upward pointing triangle.
>> The second is very small, see screen shot.
>> 
>> The NS one looks better, but when moving the cursor over the first row, it redraws funny.  And metrics are off, the distance between the first triangle and the aaa to the right is actually only one space.  The aaa to the left is also followed by a space, but there the distance is too short, and the source of the redraw error.
>> 
>> W32 anyone?
> 
> On w32, the Uniscribe font driver does not compose these two
> characters.  So I guess this composition is done by the driver used by
> NS, and perhaps the composition information Emacs gets is incorrect.
> 
> Once again, please show the full output of "C-u C-x =" on the composed
> character, the pixel-level data of the composition shown there will
> probably tell us what is wrong.


I don't know if it says anything.  In Emacs the square is actually a triangle, displayed too far to the left, i.e. covering half the letter to the left.  The mailer I use can't display it.

             position: 4 of 4 (75%), column: 3
            character: ⃤ (displayed as ⃤) (codepoint 8420, #o20344, #x20e4)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x20E4
               script: symbol
               syntax: w 	which means: word
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xE2 #x83 #xA4
            file code: #xE2 #x83 #xA4 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    nil:-apple-STIXGeneral-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1 (#x359)

Character code properties: customize what to show
  name: COMBINING ENCLOSING UPWARD POINTING TRIANGLE
  general-category: Me (Mark, Enclosing)
  decomposition: (8420) ('⃤')

There are text properties here:
  fontified            t

	Jan D.







* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-05 17:24     ` Eli Zaretskii
@ 2013-09-05 17:33       ` Jan Djärv
  2013-09-05 17:56         ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Djärv @ 2013-09-05 17:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

Hello.

5 sep 2013 kl. 19:24 skrev Eli Zaretskii <eliz@gnu.org>:

> On one OSX system, the result of "! C-x 8 RET 2 0 E 4 RET" _looks_
> like a single character, but actually isn't: I can move cursor twice
> across the result, and "C-u C-x =" doesn't show the composition
> results.  Is that what you see?

Yes.

	Jan D.







* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-05 17:33       ` Jan Djärv
@ 2013-09-05 17:56         ` Eli Zaretskii
  2013-09-06  5:08           ` Jan Djärv
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-05 17:56 UTC (permalink / raw)
  To: Jan Djärv; +Cc: xfq.free, 15273

> From: Jan Djärv <jan.h.d@swipnet.se>
> Date: Thu, 5 Sep 2013 19:33:04 +0200
> Cc: xfq.free@gmail.com,
>  15273@debbugs.gnu.org
> 
> Hello.
> 
> 5 sep 2013 kl. 19:24 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> > On one OSX system, the result of "! C-x 8 RET 2 0 E 4 RET" _looks_
> > like a single character, but actually isn't: I can move cursor twice
> > across the result, and "C-u C-x =" doesn't show the composition
> > results.  Is that what you see?
> 
> Yes.

Thanks.  So this is not a composition at all, and the wicked way the
triangle is displayed is not Emacs's fault.  I guess either the font
used by NS or the font driver is at fault here.  FWIW, on MS-Windows,
I see the ! and the triangle after it correctly, as 2 separate
characters (using the Code2000 font).






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-05 17:29     ` Jan Djärv
@ 2013-09-05 17:56       ` Eli Zaretskii
  0 siblings, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-05 17:56 UTC (permalink / raw)
  To: Jan Djärv; +Cc: xfq.free, 15273

> From: Jan Djärv <jan.h.d@swipnet.se>
> Date: Thu, 5 Sep 2013 19:29:24 +0200
> Cc: xfq.free@gmail.com,
>  15273@debbugs.gnu.org
> 
>              position: 4 of 4 (75%), column: 3
>             character: ⃤ (displayed as ⃤) (codepoint 8420, #o20344, #x20e4)
>     preferred charset: unicode (Unicode (ISO10646))
> code point in charset: 0x20E4
>                script: symbol
>                syntax: w 	which means: word
>              to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>           buffer code: #xE2 #x83 #xA4
>             file code: #xE2 #x83 #xA4 (encoded by coding system utf-8-unix)
>               display: by this font (glyph code)
>     nil:-apple-STIXGeneral-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1 (#x359)
> 
> Character code properties: customize what to show
>   name: COMBINING ENCLOSING UPWARD POINTING TRIANGLE
>   general-category: Me (Mark, Enclosing)
>   decomposition: (8420) ('⃤')
> 
> There are text properties here:
>   fontified            t

This is not a composition.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-05 14:33 ` Eli Zaretskii
@ 2013-09-05 23:26   ` Xue Fuqiao
  0 siblings, 0 replies; 53+ messages in thread
From: Xue Fuqiao @ 2013-09-05 23:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 15273

On Thu, Sep 5, 2013 at 10:33 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Thu, 5 Sep 2013 22:08:21 +0800
>> From: Xue Fuqiao <xfq.free@gmail.com>
>>
>>   emacs -Q
>>   !                      ;; input an exclamation mark (#x21)
>>   C-x 8 RET 2 0 E 4 RET  ;; COMBINING ENCLOSING UPWARD POINTING TRIANGLE
>>   RET                    ;; Newline
>>   C-x 8 RET 2 6 A 0 RET      ;; WARNING SIGN
>>
>> I tried many fonts, but all results look weird (the first line is
>> incomplete and too large).
>
> What font(s) were used to display this character?

Menlo, Times New Roman, Courier New, FreeMono, FreeSans, and FreeSerif.

> If you go to the first line, which shows the character only partially,
> and type C-a, does that fix the problem?

No.

> What about "M-x redraw-display RET"?

Nothing changes.

> Finally, if you go to the first partially displayed character and type
> "C-u C-x =", what does Emacs show?

Similar to Jan's.  The general-category for this character is "Me (Mark,
Enclosing)", so it is a nonspacing mark, which is a kind of combining
character.

See:
http://www.unicode.org/glossary/#enclosing_mark
http://www.unicode.org/glossary/#nonspacing_mark
http://www.unicode.org/glossary/#combining_character
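
The general category quoted above can also be checked outside Emacs.  The
following is a minimal sketch, assuming Python's standard `unicodedata`
module (which nobody in this thread used); the helper name `describe` is
made up for illustration.  It only reports what the Unicode Character
Database records and says nothing about how a font driver draws the pair.

```python
import unicodedata

TRIANGLE = "\u20e4"   # COMBINING ENCLOSING UPWARD POINTING TRIANGLE
WARNING = "\u26a0"    # WARNING SIGN

def describe(ch):
    """Return (name, general category, canonical combining class) of ch."""
    return (unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.combining(ch))

print(describe(TRIANGLE))
print(describe(WARNING))

# "!" followed by the enclosing mark remains two code points after NFC
# normalization: there is no precomposed character to substitute, so any
# visual combination has to come from the display layer.
sequence = "!" + TRIANGLE
print(len(unicodedata.normalize("NFC", sequence)))
```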

-- 
Best regards, Xue Fuqiao.
http://www.gnu.org/software/emacs/






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-05 16:48 ` Jan Djärv
  2013-09-05 17:12   ` Eli Zaretskii
@ 2013-09-05 23:27   ` Xue Fuqiao
  1 sibling, 0 replies; 53+ messages in thread
From: Xue Fuqiao @ 2013-09-05 23:27 UTC (permalink / raw)
  To: Jan Djärv; +Cc: 15273

[-- Attachment #1: Type: text/plain, Size: 563 bytes --]

On Fri, Sep 6, 2013 at 12:48 AM, Jan Djärv <jan.h.d@swipnet.se> wrote:

    Hello.
    5 sep 2013 kl. 16:08 skrev Xue Fuqiao <xfq.free@gmail.com>:
>     To reproduce:
>
>      emacs -Q
>      !                      ;; input an exclamation mark (#x21)
>      C-x 8 RET 2 0 E 4 RET  ;; COMBINING ENCLOSING UPWARD POINTING TRIANGLE
>      RET                    ;; Newline
>      C-x 8 2 6 A 0 RET      ;; WARNING SIGN
>
    I assume you mean C-x 8 RET 2 6 A 0 RET.

Yes, sorry.

-- 
Best regards, Xue Fuqiao.
http://www.gnu.org/software/emacs/

[-- Attachment #2: Type: text/html, Size: 882 bytes --]


* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-05 17:56         ` Eli Zaretskii
@ 2013-09-06  5:08           ` Jan Djärv
  2013-09-06  6:29             ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Djärv @ 2013-09-06  5:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

Hello.

5 sep 2013 kl. 19:56 skrev Eli Zaretskii <eliz@gnu.org>:

>> From: Jan Djärv <jan.h.d@swipnet.se>
>> Date: Thu, 5 Sep 2013 19:33:04 +0200
>> Cc: xfq.free@gmail.com,
>> 15273@debbugs.gnu.org
>> 
>> Hello.
>> 
>> 5 sep 2013 kl. 19:24 skrev Eli Zaretskii <eliz@gnu.org>:
>> 
>>> On one OSX system, the result of "! C-x 8 RET 2 0 E 4 RET" _looks_
>>> like a single character, but actually isn't: I can move cursor twice
>>> across the result, and "C-u C-x =" doesn't show the composition
>>> results.  Is that what you see?
>> 
>> Yes.
> 
> Thanks.  So this is not a composition at all, and the wicked way the
> triangle is displayed is not Emacs's fault at all.  I guess the font
> used by NS or the font driver are at fault here.  FWIW, on MS-Windows,
> I see the ! and the triangle after it correctly, as 2 separate
> characters (using Code2000 font).

What do you mean by composition?  Is it that the two characters are replaced by another, equivalent character?  That may not be possible: since we can combine any character with the triangle, such a glyph may not be available.  Or is it simply that Emacs treats the two characters as one?
Why is displaying the ! and the triangle after it as separate characters correct?  The triangle is a combining character and should be displayed above the !.

	Jan D.







* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-06  5:08           ` Jan Djärv
@ 2013-09-06  6:29             ` Eli Zaretskii
  2013-09-06  6:42               ` Andreas Schwab
  2013-09-08 13:05               ` Kenichi Handa
  0 siblings, 2 replies; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-06  6:29 UTC (permalink / raw)
  To: Jan Djärv; +Cc: xfq.free, 15273

> From: Jan Djärv <jan.h.d@swipnet.se>
> Date: Fri, 6 Sep 2013 07:08:56 +0200
> Cc: xfq.free@gmail.com,
>  15273@debbugs.gnu.org
> 
> > Thanks.  So this is not a composition at all, and the wicked way the
> > triangle is displayed is not Emacs's fault at all.  I guess the font
> > used by NS or the font driver are at fault here.  FWIW, on MS-Windows,
> > I see the ! and the triangle after it correctly, as 2 separate
> > characters (using Code2000 font).
> 
> What do you mean by composition?  Is it that the two characters are replaced by another character that is equivalent?  That may not be possible since we can combine any character with the triangle, such a glyph may not be available.

Character composition in Emacs can happen in 1 of 2 ways:

 . The font driver tells Emacs to compose several characters into a
   single grapheme cluster, by drawing all of them as a single unit,
   and by drawing the 2nd, 3rd, etc. character glyphs at certain pixel
   offsets relative to the base glyph.

 . Emacs itself has composition rules for 2 or more characters; in
   this case, the same pixel offsets come from those rules.

The first possibility includes possible substitution of a single glyph
for several characters, but that's not the only possibility, because
the font driver tells Emacs both the glyphs to draw and their relative
pixel positions.

In both cases, Emacs shows the composition details in "C-u C-x =",
here's an example:

  Composed with the following character(s) "ִ" using this font:
    uniscribe:-outline-Courier New-bold-normal-normal-mono-15-*-*-*-c-*-iso10646-1
  by these glyphs:
    [0 1 1506 690 9 0 8 12 5 nil]
    [0 1 1460 657 9 0 5 12 5 [-7 0 0]]

In the case in point, there was no such display of composition
details.  So I concluded that no composition was done.
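
To make the glyph vectors in that example easier to read, here is a
minimal sketch in Python that labels their fields.  The field order is an
assumption taken from the docstring of `composition-get-gstring`
([FROM-IDX TO-IDX C CODE WIDTH LBEARING RBEARING ASCENT DESCENT
ADJUSTMENT], where ADJUSTMENT is nil or [X-OFF Y-OFF WADJUST]); the
`Glyph` type and the `summarize` helper are hypothetical names, not Emacs
APIs.

```python
import unicodedata
from typing import NamedTuple, Optional, Tuple

class Glyph(NamedTuple):
    """One glyph of a composition, with the field order assumed above."""
    from_idx: int
    to_idx: int
    char: int        # character code point
    code: int        # glyph code in the font
    width: int
    lbearing: int
    rbearing: int
    ascent: int
    descent: int
    adjustment: Optional[Tuple[int, int, int]]  # (x_off, y_off, wadjust)

# The two vectors from the "C-u C-x =" output quoted above.
GLYPHS = [
    Glyph(0, 1, 1506, 690, 9, 0, 8, 12, 5, None),
    Glyph(0, 1, 1460, 657, 9, 0, 5, 12, 5, (-7, 0, 0)),
]

def summarize(glyph):
    """Describe in words where a glyph is drawn."""
    name = unicodedata.name(chr(glyph.char))
    if glyph.adjustment is None:
        return f"{name}: drawn at its normal position"
    x_off, y_off, _wadjust = glyph.adjustment
    return f"{name}: shifted by x={x_off}, y={y_off} pixels"

for g in GLYPHS:
    print(summarize(g))
```

Read this way, the second glyph carries a pixel offset while the first has
none, which is the kind of composition detail that was absent from the
output for the triangle.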

> Or is it simply that Emacs treats the two characters as one?

Emacs will treat them as one, more or less, if they are composed by
one of the above two methods.  ("More or less" because we still allow
certain operations, such as delete-char, to act on individual
characters that were composed.)

> Why is displaying the ! and the triangle after it as separate characters correct?  The triangle is a composing character and should be displayed above the !.

I meant "correct" in the sense that there's no apparent redisplay bug:
the display engine behaves according to the information it has.

The OP's bug report was about the partial display of the triangle, not
about the lack of composition.

If the font driver doesn't tell us that the characters need to be
combined, and we don't have in Emacs a rule to do that ourselves, then
the problem, if there is one, is not in the display engine.  If we
want to combine this character, we should write a composition rule for
it.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-06  6:29             ` Eli Zaretskii
@ 2013-09-06  6:42               ` Andreas Schwab
  2013-09-06  7:32                 ` Eli Zaretskii
  2013-09-08 13:05               ` Kenichi Handa
  1 sibling, 1 reply; 53+ messages in thread
From: Andreas Schwab @ 2013-09-06  6:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

Eli Zaretskii <eliz@gnu.org> writes:

> Character composition in Emacs can happen in 1 of 2 ways:
>
>  . The font driver tells Emacs to compose several characters into a
>    single grapheme cluster, by drawing all of them as a single unit,
>    and by drawing the 2nd, 3rd, etc. character glyphs at certain pixel
>    offsets relative to the base glyph.
>
>  . Emacs itself has composition rules for 2 or more characters; in
>    this case, the same pixel offsets come from those rules.

The first way can only work if both characters are coming from the same
font.  Not sure if that is also true for the second way, but I'd guess
yes.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-06  6:42               ` Andreas Schwab
@ 2013-09-06  7:32                 ` Eli Zaretskii
  2013-09-06 14:37                   ` Xue Fuqiao
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-06  7:32 UTC (permalink / raw)
  To: Andreas Schwab, Kenichi Handa; +Cc: xfq.free, 15273

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: Jan Djärv <jan.h.d@swipnet.se>,  xfq.free@gmail.com,
>   15273@debbugs.gnu.org
> Date: Fri, 06 Sep 2013 08:42:23 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Character composition in Emacs can happen in 1 of 2 ways:
> >
> >  . The font driver tells Emacs to compose several characters into a
> >    single grapheme cluster, by drawing all of them as a single unit,
> >    and by drawing the 2nd, 3rd, etc. character glyphs at certain pixel
> >    offsets relative to the base glyph.
> >
> >  . Emacs itself has composition rules for 2 or more characters; in
> >    this case, the same pixel offsets come from those rules.
> 
> The first way can only work if both characters are coming from the same
> font.

Yes.

> Not sure if that is also true for the second way, but I'd guess yes.

I'm not sure, either.  The second way happens when we generate the
glyph matrices -- do we select font before or after that?  If after,
we probably look for a font that supports all of the characters in the
composition.  Perhaps Handa-san could comment on this.

But, at least in my testing, I used a font that definitely supports
both characters, and the composition still didn't happen.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-06  7:32                 ` Eli Zaretskii
@ 2013-09-06 14:37                   ` Xue Fuqiao
  2013-09-06 15:53                     ` Eli Zaretskii
  2013-09-07 12:25                     ` Wolfgang Jenkner
  0 siblings, 2 replies; 53+ messages in thread
From: Xue Fuqiao @ 2013-09-06 14:37 UTC (permalink / raw)
  To: 15273

[-- Attachment #1: Type: text/plain, Size: 66 bytes --]

I did some further experiments.  See the attachments for details.

[-- Attachment #2: bug15273.txt --]
[-- Type: text/plain, Size: 1253 bytes --]

ã ;; LATIN SMALL LETTER A WITH TILDE
ã ;; LATIN SMALL LETTER A + COMBINING TILDE

ȧ ;; LATIN SMALL LETTER A WITH DOT ABOVE
ȧ ;; LATIN SMALL LETTER A + COMBINING DOT ABOVE

ạ̃ ;; LATIN SMALL LETTER A WITH TILDE + COMBINING DOT BELOW
ạ̃ ;; LATIN SMALL LETTER A + COMBINING TILDE + COMBINING DOT BELOW
ạ̃ ;; LATIN SMALL LETTER A WITH DOT BELOW + COMBINING TILDE
ạ̃ ;; LATIN SMALL LETTER A + COMBINING DOT BELOW + COMBINING TILDE

ạ̇ ;; LATIN SMALL LETTER A WITH DOT BELOW + COMBINING DOT ABOVE
ạ̇ ;; LATIN SMALL LETTER A + COMBINING DOT BELOW + COMBINING DOT ABOVE
ạ̇ ;; LATIN SMALL LETTER A WITH DOT ABOVE + COMBINING DOT BELOW
ạ̇ ;; LATIN SMALL LETTER A + COMBINING DOT ABOVE + COMBINING DOT BELOW

ấ ;; LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE
â ;; LATIN SMALL LETTER A WITH CIRCUMFLEX + COMBINING ACUTE ACCENT
ấ ;; LATIN SMALL LETTER A + COMBINING CIRCUMFLEX ACCENT + COMBINING ACUTE ACCENT

á̂ ;; LATIN SMALL LETTER A ACUTE + COMBINING CIRCUMFLEX ACCENT
á̂ ;; LATIN SMALL LETTER A + COMBINING ACUTE ACCENT + COMBINING CIRCUMFLEX ACCENT

ἄ ;; GREEK SMALL LETTER ALPHA + COMBINING COMMA ABOVE + COMBINING ACUTE ACCENT
ά̓ ;; GREEK SMALL LETTER ALPHA + COMBINING ACUTE ACCENT + COMBINING COMMA ABOVE

[-- Attachment #3: bug15273.png --]
[-- Type: image/png, Size: 194199 bytes --]


* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-06 14:37                   ` Xue Fuqiao
@ 2013-09-06 15:53                     ` Eli Zaretskii
  2013-09-06 22:17                       ` Xue Fuqiao
  2013-09-07 12:25                     ` Wolfgang Jenkner
  1 sibling, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-06 15:53 UTC (permalink / raw)
  To: Xue Fuqiao; +Cc: 15273

> Date: Fri, 6 Sep 2013 22:37:44 +0800
> From: Xue Fuqiao <xfq.free@gmail.com>
> 
> I did some further experiments.  See the attachments for details.

What did you do this for?  What did you expect to see?






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-06 15:53                     ` Eli Zaretskii
@ 2013-09-06 22:17                       ` Xue Fuqiao
  2013-09-06 22:37                         ` Xue Fuqiao
  2013-09-07  7:26                         ` Eli Zaretskii
  0 siblings, 2 replies; 53+ messages in thread
From: Xue Fuqiao @ 2013-09-06 22:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 15273

On Fri, Sep 6, 2013 at 11:53 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Fri, 6 Sep 2013 22:37:44 +0800
>> From: Xue Fuqiao <xfq.free@gmail.com>
>>
>> I did some further experiments.  See the attachments for details.
>
> What did you do this for?  What did you expect to see?

Sorry for not being clear.  I expect the same result for characters in
one group, but there are all kinds of results.  Some characters are
partially displayed, like the OP.

-- 
Best regards, Xue Fuqiao.
http://www.gnu.org/software/emacs/






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-06 22:17                       ` Xue Fuqiao
@ 2013-09-06 22:37                         ` Xue Fuqiao
  2013-09-07  7:27                           ` Eli Zaretskii
  2013-09-07  7:26                         ` Eli Zaretskii
  1 sibling, 1 reply; 53+ messages in thread
From: Xue Fuqiao @ 2013-09-06 22:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 15273

On Sat, Sep 7, 2013 at 6:17 AM, Xue Fuqiao <xfq.free@gmail.com> wrote:
> On Fri, Sep 6, 2013 at 11:53 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>>> Date: Fri, 6 Sep 2013 22:37:44 +0800
>>> From: Xue Fuqiao <xfq.free@gmail.com>
>>>
>>> I did some further experiments.  See the attachments for details.
>>
>> What did you do this for?  What did you expect to see?
>
> I expect the same result for characters in one group.

Except for the last group, because the correct order of codes is base
character code + breathing mark code + accent mark code (which is the
first one).
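
This expectation can be stated in terms of Unicode canonical equivalence.
The following is a minimal sketch, assuming Python's standard
`unicodedata` module and code points read off the comments in
bug15273.txt; the helper name `canonically_equivalent` is made up for
illustration.  It checks equivalence of the character sequences only and
says nothing about how any font or font driver renders them.

```python
import unicodedata

def canonically_equivalent(*strings):
    """Return True if all strings share the same NFD form."""
    forms = {unicodedata.normalize("NFD", s) for s in strings}
    return len(forms) == 1

# The group with tilde and dot below, in all four spellings.
tilde_dot_below = [
    "\u00e3\u0323",    # a with tilde + combining dot below
    "a\u0303\u0323",   # a + combining tilde + combining dot below
    "\u1ea1\u0303",    # a with dot below + combining tilde
    "a\u0323\u0303",   # a + combining dot below + combining tilde
]

# The last (Greek) group: both marks have the same combining class,
# so their relative order is significant.
greek = [
    "\u03b1\u0313\u0301",  # alpha + comma above + acute accent
    "\u03b1\u0301\u0313",  # alpha + acute accent + comma above
]

print(canonically_equivalent(*tilde_dot_below))
print(canonically_equivalent(*greek))
```

The first group is equivalent because the dot below and the tilde have
different canonical combining classes and are therefore reordered during
normalization; the two Greek sequences are not, which matches the
exception made for the last group.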

-- 
Best regards, Xue Fuqiao.
http://www.gnu.org/software/emacs/






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-06 22:17                       ` Xue Fuqiao
  2013-09-06 22:37                         ` Xue Fuqiao
@ 2013-09-07  7:26                         ` Eli Zaretskii
  2013-09-07  7:36                           ` Jan Djärv
  1 sibling, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-07  7:26 UTC (permalink / raw)
  To: Xue Fuqiao; +Cc: 15273

> Date: Sat, 7 Sep 2013 06:17:02 +0800
> From: Xue Fuqiao <xfq.free@gmail.com>
> Cc: 15273@debbugs.gnu.org
> 
> On Fri, Sep 6, 2013 at 11:53 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> >> Date: Fri, 6 Sep 2013 22:37:44 +0800
> >> From: Xue Fuqiao <xfq.free@gmail.com>
> >>
> >> I did some further experiments.  See the attachments for details.
> >
> > What did you do this for?  What did you expect to see?
> 
> Sorry for not being clear.  I expect the same result for characters in
> one group, but there are all kinds of results.  Some characters are
> partially displayed, as in the original report.

Again, please tell (using "C-u C-x =") which ones are actually
composed, and which ones are displayed as several separate
characters.  If any of these characters are composed and the results
are displayed incorrectly, please show the last portion of "C-u C-x ="
display, where Emacs describes the composition.

Can you try some font that is known to be good at displaying Unicode,
such as Code2000 or DejaVu Sans?

FWIW, I see no problems with Emacs display of these characters on
MS-Windows: characters in each group are indeed displayed the same.
So I'm not sure what are we still discussing here, since the quality
of fonts on any given platform is hardly on-topic in the bug tracker,
and the NS font driver is not part of Emacs.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-06 22:37                         ` Xue Fuqiao
@ 2013-09-07  7:27                           ` Eli Zaretskii
  0 siblings, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-07  7:27 UTC (permalink / raw)
  To: Xue Fuqiao; +Cc: 15273

> Date: Sat, 7 Sep 2013 06:37:03 +0800
> From: Xue Fuqiao <xfq.free@gmail.com>
> Cc: 15273@debbugs.gnu.org
> 
> > I expect the same result for characters in one group.
> 
> Except for the last group, because the correct order of codes is base
> character code + breathing mark code + accent mark code (which is the
> first one).

I see the same results in the last group as well (on Windows XP),
FWIW.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  7:26                         ` Eli Zaretskii
@ 2013-09-07  7:36                           ` Jan Djärv
  2013-09-07  7:57                             ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Djärv @ 2013-09-07  7:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Xue Fuqiao, 15273


7 sep 2013 kl. 09:26 skrev Eli Zaretskii <eliz@gnu.org>:

> So I'm not sure what are we still discussing here, since the quality
> of fonts on any given platform is hardly on-topic in the bug tracker,
> and the NS font driver is not part of Emacs.

nsfont.m:

 Copyright (C) 2006-2013 Free Software Foundation, Inc.

This file is part of GNU Emacs.

	Jan D.








* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  7:36                           ` Jan Djärv
@ 2013-09-07  7:57                             ` Eli Zaretskii
  2013-09-07  8:02                               ` Jan Djärv
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-07  7:57 UTC (permalink / raw)
  To: Jan Djärv; +Cc: xfq.free, 15273

> From: Jan Djärv <jan.h.d@swipnet.se>
> Date: Sat, 7 Sep 2013 09:36:16 +0200
> Cc: Xue Fuqiao <xfq.free@gmail.com>,
>  15273@debbugs.gnu.org
> 
> 
> 7 sep 2013 kl. 09:26 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> > So I'm not sure what are we still discussing here, since the quality
> > of fonts on any given platform is hardly on-topic in the bug tracker,
> > and the NS font driver is not part of Emacs.
> 
> nsfont.m:
> 
>  Copyright (C) 2006-2013 Free Software Foundation, Inc.
> 
> This file is part of GNU Emacs.

The OP said that he tried many fonts, and none worked correctly.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  7:57                             ` Eli Zaretskii
@ 2013-09-07  8:02                               ` Jan Djärv
  2013-09-07  8:10                                 ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Djärv @ 2013-09-07  8:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

Hi.

7 sep 2013 kl. 09:57 skrev Eli Zaretskii <eliz@gnu.org>:

>> From: Jan Djärv <jan.h.d@swipnet.se>
>> Date: Sat, 7 Sep 2013 09:36:16 +0200
>> Cc: Xue Fuqiao <xfq.free@gmail.com>,
>> 15273@debbugs.gnu.org
>> 
>> 
>> 7 sep 2013 kl. 09:26 skrev Eli Zaretskii <eliz@gnu.org>:
>> 
>>> So I'm not sure what are we still discussing here, since the quality
>>> of fonts on any given platform is hardly on-topic in the bug tracker,
>>> and the NS font driver is not part of Emacs.
>> 
>> nsfont.m:
>> 
>> Copyright (C) 2006-2013 Free Software Foundation, Inc.
>> 
>> This file is part of GNU Emacs.
> 
> The OP said that he tried many fonts, and none worked correctly.

So what?  The NS font driver is still part of Emacs even if it has bugs.

	Jan D.







* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  8:02                               ` Jan Djärv
@ 2013-09-07  8:10                                 ` Eli Zaretskii
  2013-09-07  8:27                                   ` Jan Djärv
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-07  8:10 UTC (permalink / raw)
  To: Jan Djärv; +Cc: xfq.free, 15273

> From: Jan Djärv <jan.h.d@swipnet.se>
> Date: Sat, 7 Sep 2013 10:02:44 +0200
> Cc: xfq.free@gmail.com,
>  15273@debbugs.gnu.org
> 
> >>> So I'm not sure what are we still discussing here, since the quality
> >>> of fonts on any given platform is hardly on-topic in the bug tracker,
> >>> and the NS font driver is not part of Emacs.
> >> 
> >> nsfont.m:
> >> 
> >> Copyright (C) 2006-2013 Free Software Foundation, Inc.
> >> 
> >> This file is part of GNU Emacs.
> > 
> > The OP said that he tried many fonts, and none worked correctly.
> 
> So what?  The NS font driver is still part of Emacs even if it has bugs.

What kind of bugs do you have in mind?  If the characters aren't
composed, then the only bug I can think of is that nsfont.m somehow
processes the character metrics in the font incorrectly.  To test this
hypothesis Someone(TM) should show the character metrics using some
external tool, and compare that with what nsfont.m calculates.  Or,
alternatively, show that exactly the same font does produce a correct
display on another platform; then we could compare the two font
back-ends we have for these platforms.

As long as none of this is done, adding more pictures to this
discussion doesn't help us make any progress.  And I cannot understand
why requests for showing "C-u C-x =" are consistently ignored by the OP.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  8:10                                 ` Eli Zaretskii
@ 2013-09-07  8:27                                   ` Jan Djärv
  2013-09-07  8:40                                     ` Eli Zaretskii
  2013-09-07  8:47                                     ` Jan Djärv
  0 siblings, 2 replies; 53+ messages in thread
From: Jan Djärv @ 2013-09-07  8:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

Hello.

7 sep 2013 kl. 10:10 skrev Eli Zaretskii <eliz@gnu.org>:

>> From: Jan Djärv <jan.h.d@swipnet.se>
>> Date: Sat, 7 Sep 2013 10:02:44 +0200
>> Cc: xfq.free@gmail.com,
>> 15273@debbugs.gnu.org
>> 
>>>>> So I'm not sure what are we still discussing here, since the quality
>>>>> of fonts on any given platform is hardly on-topic in the bug tracker,
>>>>> and the NS font driver is not part of Emacs.
>>>> 
>>>> nsfont.m:
>>>> 
>>>> Copyright (C) 2006-2013 Free Software Foundation, Inc.
>>>> 
>>>> This file is part of GNU Emacs.
>>> 
>>> The OP said that he tried many fonts, and none worked correctly.
>> 
>> So what?  The NS font driver is still part of Emacs even if it has bugs.
> 
> What kind of bugs do you have in mind?  If the characters aren't
> composed, then the only bug I can think of is that nsfont.m somehow
> processes the character metrics in the font incorrectly.

Which is what I said was happening in comment #11.

>  To test this
> hypothesis Someone(TM) should show the character metrics using some
> external tool, and compare that with what nsfont.m calculates.  Or,
> alternatively, show that exactly the same font does produce a correct
> display on another platform; then we could compare the two font
> back-ends we have for these platforms.

nsfont.m has problems with composition; there are even comments about this in the code.
But from what I see on X11 and what you reported from W32, no other platform does produce the correct result (i.e. the ! and the triangle in the same place), so comparing to those backends doesn't help.

> 
> As long as none of this is done, adding more pictures to this
> discussion doesn't help us make any progress.  And I cannot understand
> why requests for showing "C-u C-x =" are consistently ignored by the OP.

Adding more pictures does indeed not help much.

	Jan D.








* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  8:27                                   ` Jan Djärv
@ 2013-09-07  8:40                                     ` Eli Zaretskii
  2013-09-07  8:54                                       ` Jan Djärv
  2013-09-07  8:47                                     ` Jan Djärv
  1 sibling, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-07  8:40 UTC (permalink / raw)
  To: Jan Djärv; +Cc: xfq.free, 15273

> From: Jan Djärv <jan.h.d@swipnet.se>
> Date: Sat, 7 Sep 2013 10:27:45 +0200
> Cc: xfq.free@gmail.com,
>  15273@debbugs.gnu.org
> 
> But from what I see on X11 and what you reported from W32, no other platform does produce the correct result (i.e. the ! and the triangle in the same place), so comparing to those backends doesn't help.

In that case, yes, I think we can conclude that there's no Emacs
issue.  But Xue now seems to say that there are other similar
situations with other combining characters, so we need to know if any
compositions are involved in those.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  8:27                                   ` Jan Djärv
  2013-09-07  8:40                                     ` Eli Zaretskii
@ 2013-09-07  8:47                                     ` Jan Djärv
  2013-09-07  9:22                                       ` Eli Zaretskii
  2013-09-09  0:52                                       ` YAMAMOTO Mitsuharu
  1 sibling, 2 replies; 53+ messages in thread
From: Jan Djärv @ 2013-09-07  8:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273


7 sep 2013 kl. 10:27 skrev Jan Djärv <jan.h.d@swipnet.se>:

> Hello.
> 
> 7 sep 2013 kl. 10:10 skrev Eli Zaretskii <eliz@gnu.org>:
> 
>> To test this
>> hypothesis Someone(TM) should show the character metrics using some
>> external tool, and compare that with what nsfont.m calculates.  Or,
>> alternatively, show that exactly the same font does produce a correct
>> display on another platform; then we could compare the two font
>> back-ends we have for these platforms.
> 
> nsfont.m has problems with composition; there are even comments about this in the code.
> But from what I see on X11 and what you reported from W32, no other platform does produce the correct result (i.e. the ! and the triangle in the same place), so comparing to those backends doesn't help.

FWIW, I tested the mac port that has another font driver, and it doesn't work correctly either.  It does not find a good font for the triangle and only displays an empty square.  No composition happens.

	Jan D.








* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  8:40                                     ` Eli Zaretskii
@ 2013-09-07  8:54                                       ` Jan Djärv
  2013-09-07  9:59                                         ` Eli Zaretskii
  2013-09-07 22:50                                         ` Xue Fuqiao
  0 siblings, 2 replies; 53+ messages in thread
From: Jan Djärv @ 2013-09-07  8:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

Hello.

7 sep 2013 kl. 10:40 skrev Eli Zaretskii <eliz@gnu.org>:

>> From: Jan Djärv <jan.h.d@swipnet.se>
>> Date: Sat, 7 Sep 2013 10:27:45 +0200
>> Cc: xfq.free@gmail.com,
>> 15273@debbugs.gnu.org
>> 
>> But from what I see on X11 and what you reported from W32, no other platform does produce the correct result (i.e. the ! and the triangle in the same place), so comparing to those backends doesn't help.
> 
> In that case, yes, I think we can conclude that there's no Emacs
> issue.  

Except the larger bug that Emacs does not handle combining characters correctly in the general case, i.e. not combining the ! and the triangle on X11 and W32 (and Mac port). And if it is a bug in the NS font driver, that is also an Emacs issue.  There must be one, either metrics should be corrected or the NS font driver should be modified to behave in the same (incorrect) way as the other platforms.

So it is too early to dismiss this as not an Emacs issue.

> But Xue now seems to say that there are other similar
> situations with other combining characters, so we need to know if any
> compositions are involved in those.

The exact key sequences that produced them would help also.

	Jan D.








* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  8:47                                     ` Jan Djärv
@ 2013-09-07  9:22                                       ` Eli Zaretskii
  2013-09-09  0:52                                       ` YAMAMOTO Mitsuharu
  1 sibling, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-07  9:22 UTC (permalink / raw)
  To: Jan Djärv; +Cc: xfq.free, 15273

> From: Jan Djärv <jan.h.d@swipnet.se>
> Date: Sat, 7 Sep 2013 10:47:05 +0200
> Cc: xfq.free@gmail.com,
>  15273@debbugs.gnu.org
> 
> FWIW, I tested the mac port that has another font driver, and it doesn't work correctly either.  It does not find a good font for the triangle and only displays an empty square.  No composition happens.

The support for this character seems to be very scarce.  This page:

  http://www.fileformat.info/info/unicode/char/20e4/fontsupport.htm

lists some fonts, but at least some of them seem to not have a glyph
for this character, at least on Windows.  Code2000 does display the
characters, but that font is not free.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  8:54                                       ` Jan Djärv
@ 2013-09-07  9:59                                         ` Eli Zaretskii
  2013-09-07 13:44                                           ` Jan Djärv
  2013-09-07 22:50                                         ` Xue Fuqiao
  1 sibling, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-07  9:59 UTC (permalink / raw)
  To: Jan Djärv; +Cc: xfq.free, 15273

> From: Jan Djärv <jan.h.d@swipnet.se>
> Date: Sat, 7 Sep 2013 10:54:58 +0200
> Cc: xfq.free@gmail.com,
>  15273@debbugs.gnu.org
> 
> Except the larger bug that Emacs does not handle combining characters correctly in the general case, i.e. not combining the ! and the triangle on X11 and W32 (and Mac port).

Is there any application out there that does combine these 2
characters, with any font?  If not, perhaps it's not an Emacs issue
after all.

E.g., Code2000, which is really good, produces a composed character
for a followed by u+20d0, but not for a followed by u+20e4.  You can
experiment with other combining diacriticals from that Unicode block,
and you will see that some of them combine, while others do not.

If the fonts do not tell us to combine characters, the only way to do
that is to provide an Emacs composition rule for those characters.

> And if it is a bug in the NS font driver, that is also an Emacs issue.  There must be one, either metrics should be corrected or the NS font driver should be modified to behave in the same (incorrect) way as the other platforms.

If you mean the incorrect display of the lone u+20e4, then I just
installed the STIX fonts (from
http://sourceforge.net/projects/stixfonts/files/?source=navbar) on my
Windows box, and didn't see any problems as originally reported here.
Can you install the latest STIX fonts and try with that on NS?  If the
problem persists, then I agree that nsfont.m is probably the culprit.

> > But Xue now seems to say that there are other similar
> > situations with other combining characters, so we need to know if any
> > compositions are involved in those.
> 
> The exact key sequences that produced them would help also.

You can always use "C-x 8 RET" to type the characters, so I don't see
why the key sequences would be an issue.
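
The two marks compared above, u+20d0 (which composes with Code2000) and
u+20e4 (which does not), can be told apart by their Unicode properties
without involving any font.  A minimal sketch in Python, using only the
standard unicodedata module; the dictionary name `marks` is illustrative
and not from this thread.  It only reports character properties and does
not by itself show why a given font or font driver places a mark where
it does.

```python
import unicodedata

# Code points taken from the message above; the dict name is hypothetical.
marks = {}
for codepoint in (0x20D0, 0x20E4):
    ch = chr(codepoint)
    marks[codepoint] = (
        unicodedata.name(ch),       # official character name
        unicodedata.category(ch),   # Mn = nonspacing mark, Me = enclosing mark
        unicodedata.combining(ch),  # canonical combining class
    )

for codepoint, (name, category, ccc) in marks.items():
    print(f"U+{codepoint:04X} {name}: category={category}, ccc={ccc}")
```

The enclosing triangle belongs to a different general category than the
ordinary nonspacing marks, which is at least consistent with fonts and
shaping code treating the two kinds of marks differently.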






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-06 14:37                   ` Xue Fuqiao
  2013-09-06 15:53                     ` Eli Zaretskii
@ 2013-09-07 12:25                     ` Wolfgang Jenkner
  2013-09-07 15:18                       ` Eli Zaretskii
  1 sibling, 1 reply; 53+ messages in thread
From: Wolfgang Jenkner @ 2013-09-07 12:25 UTC (permalink / raw)
  To: Xue Fuqiao; +Cc: 15273

On Fri, Sep 06 2013, Xue Fuqiao wrote:

> â ;; LATIN SMALL LETTER A WITH CIRCUMFLEX + COMBINING ACUTE ACCENT

The COMBINING ACUTE ACCENT seems to be missing here, by the way:

             position: 788 of 925 (85%), restriction: <502-926>, column: 2
            character: â (displayed as â) (codepoint 226, #o342, #xe2)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xE2
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), j:Japanese, l:Latin, v:Viet
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC3 #xA2
            file code: #xC3 #xA2 (encoded by coding system utf-8-emacs)
              display: by this font (glyph code)
    xft:-unknown-DejaVu Sans Mono-normal-normal-normal-*-15-*-*-*-m-0-iso10646-1 (#xA4)

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH CIRCUMFLEX
  old-name: LATIN SMALL LETTER A CIRCUMFLEX
  general-category: Ll (Letter, Lowercase)
  decomposition: (97 770) ('a' '̂')

There are text properties here:
  face                 (gnus-cite-1 message-cited-text)
  fontified            t

[back]








* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  9:59                                         ` Eli Zaretskii
@ 2013-09-07 13:44                                           ` Jan Djärv
  2013-09-07 15:20                                             ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Djärv @ 2013-09-07 13:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

Hello.

7 sep 2013 kl. 11:59 skrev Eli Zaretskii <eliz@gnu.org>:

>> From: Jan Djärv <jan.h.d@swipnet.se>
>> Date: Sat, 7 Sep 2013 10:54:58 +0200
>> Cc: xfq.free@gmail.com,
>> 15273@debbugs.gnu.org
>> 
>> Except the larger bug that Emacs does not handle combining characters correctly in the general case, i.e. not combining the ! and the triangle on X11 and W32 (and Mac port).
> 
> Is there any application out there that does combine these 2
> characters, with any font?  If not, perhaps it's not an Emacs issue
> after all.

Yes, the builtin TextEdit application does it.

> 
> E.g., Code2000, which is really good, produces a composed character
> for a followed by u+20d0, but not for a followed by u+20e4.  You can
> experiment with other combining diacriticals from that Unicode block,
> and you will see that some of them combine, while others do not.
> 
> If the fonts do not tell us to combine characters, the only way to do
> that is to provide an Emacs composition rule for those characters.
> 
>> And if it is a bug in the NS font driver, that is also an Emacs issue.  There must be one, either metrics should be corrected or the NS font driver should be modified to behave in the same (incorrect) way as the other platforms.
> 
> If you mean the incorrect display of the lone u+20e4, then I just
> installed the STIX fonts (from
> http://sourceforge.net/projects/stixfonts/files/?source=navbar) on my
> Windows box, and didn't see any problems as originally reported here.
> Can you install the latest STIX fonts and try with that on NS?  If the
> problem persists, then I agree that nsfont.m is probably the culprit.

The behaviour is the same with that STIX version as it is with the OSX supplied STIX version.

	Jan D.







* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07 12:25                     ` Wolfgang Jenkner
@ 2013-09-07 15:18                       ` Eli Zaretskii
  2013-09-07 21:38                         ` Wolfgang Jenkner
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-07 15:18 UTC (permalink / raw)
  To: Wolfgang Jenkner; +Cc: xfq.free, 15273

> From: Wolfgang Jenkner <wjenkner@inode.at>
> Date: Sat, 07 Sep 2013 14:25:36 +0200
> Cc: 15273@debbugs.gnu.org
> 
> On Fri, Sep 06 2013, Xue Fuqiao wrote:
> 
> > â ;; LATIN SMALL LETTER A WITH CIRCUMFLEX + COMBINING ACUTE ACCENT
> 
> The COMBINING ACUTE ACCENT seems to be missing here, by the way:
> 
>              position: 788 of 925 (85%), restriction: <502-926>, column: 2
>             character: â (displayed as â) (codepoint 226, #o342, #xe2)
>     preferred charset: unicode (Unicode (ISO10646))
> code point in charset: 0xE2
>                script: latin
>                syntax: w 	which means: word
>              category: .:Base, L:Left-to-right (strong), j:Japanese, l:Latin, v:Viet
>              to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
>           buffer code: #xC3 #xA2
>             file code: #xC3 #xA2 (encoded by coding system utf-8-emacs)
>               display: by this font (glyph code)
>     xft:-unknown-DejaVu Sans Mono-normal-normal-normal-*-15-*-*-*-m-0-iso10646-1 (#xA4)
> 
> Character code properties: customize what to show
>   name: LATIN SMALL LETTER A WITH CIRCUMFLEX
>   old-name: LATIN SMALL LETTER A CIRCUMFLEX
>   general-category: Ll (Letter, Lowercase)
>   decomposition: (97 770) ('a' '̂')

You are showing information about a different character, the character
called out by LATIN SMALL LETTER A WITH CIRCUMFLEX + COMBINING ACUTE
ACCENT has the codepoint of u+1ea5.
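
The relationship between the two-character sequence and u+1ea5 can be
checked outside Emacs.  A minimal sketch in Python, using only the
standard unicodedata module; the variable names are illustrative and not
from this thread.

```python
import unicodedata

# U+00E2 (a with circumflex) followed by U+0301 (COMBINING ACUTE ACCENT).
sequence = "\u00e2\u0301"

# Canonical composition (NFC) folds the pair into one precomposed character.
composed = unicodedata.normalize("NFC", sequence)

# Canonical decomposition (NFD) of U+1EA5 expands it to base letter plus marks.
decomposed = unicodedata.normalize("NFD", "\u1ea5")

print(f"U+{ord(composed):04X}", unicodedata.name(composed))
print([f"U+{ord(ch):04X}" for ch in decomposed])
```

A buffer holding only u+00e2, as in the "C-u C-x =" output quoted above,
is therefore not the sequence that the comment in the attachment
describes.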






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07 13:44                                           ` Jan Djärv
@ 2013-09-07 15:20                                             ` Eli Zaretskii
  2013-09-08  8:26                                               ` Jan Djärv
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-07 15:20 UTC (permalink / raw)
  To: Jan Djärv; +Cc: xfq.free, 15273

> From: Jan Djärv <jan.h.d@swipnet.se>
> Date: Sat, 7 Sep 2013 15:44:53 +0200
> Cc: xfq.free@gmail.com,
>  15273@debbugs.gnu.org
> 
> Hello.
> 
> 7 sep 2013 kl. 11:59 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> >> From: Jan Djärv <jan.h.d@swipnet.se>
> >> Date: Sat, 7 Sep 2013 10:54:58 +0200
> >> Cc: xfq.free@gmail.com,
> >> 15273@debbugs.gnu.org
> >> 
> >> Except the larger bug that Emacs does not handle combining characters correctly in the general case, i.e. not combining the ! and the triangle on X11 and W32 (and Mac port).
> > 
> > Is there any application out there that does combine these 2
> > characters, with any font?  If not, perhaps it's not an Emacs issue
> > after all.
> 
> Yes, the builtin TextEdit application does it.

Does it use the same font?

> > If you mean the incorrect display of the lone u+20e4, then I just
> > installed the STIX fonts (from
> > http://sourceforge.net/projects/stixfonts/files/?source=navbar) on my
> > Windows box, and didn't see any problems as originally reported here.
> > Can you install the latest STIX fonts and try with that on NS?  If the
> > problem persists, then I agree that nsfont.m is probably the culprit.
> 
> The behaviour is the same with that STIX version as it is with the OSX supplied STIX version.

Then I guess something is indeed wrong with nsfont.m.  Let me know if
you need any information from the w32 code.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07 15:18                       ` Eli Zaretskii
@ 2013-09-07 21:38                         ` Wolfgang Jenkner
  2013-09-07 22:29                           ` Xue Fuqiao
  0 siblings, 1 reply; 53+ messages in thread
From: Wolfgang Jenkner @ 2013-09-07 21:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

On Sat, Sep 07 2013, Eli Zaretskii wrote:

> You are showing information about a different character, the character
> called out by LATIN SMALL LETTER A WITH CIRCUMFLEX + COMBINING ACUTE
> ACCENT has the codepoint of u+1ea5.

Indeed, and I pointed out that the character actually contained in the
OP's attachment was not what he meant it to be.

Wolfgang






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07 21:38                         ` Wolfgang Jenkner
@ 2013-09-07 22:29                           ` Xue Fuqiao
  2013-09-07 22:48                             ` Xue Fuqiao
  0 siblings, 1 reply; 53+ messages in thread
From: Xue Fuqiao @ 2013-09-07 22:29 UTC (permalink / raw)
  To: Wolfgang Jenkner; +Cc: 15273

On Sun, Sep 8, 2013 at 5:38 AM, Wolfgang Jenkner <wjenkner@inode.at> wrote:
> On Sat, Sep 07 2013, Eli Zaretskii wrote:
>
>> You are showing information about a different character, the character
>> called out by LATIN SMALL LETTER A WITH CIRCUMFLEX + COMBINING ACUTE
>> ACCENT has the codepoint of u+1ea5.
>
> Indeed, and I pointed out that the character actually contained in the
> OP's attachment was not what he meant it to be.

Sorry, it's my fault.  It should be:

--8<---------------cut here---------------start------------->8---
             position: 757 of 1217 (62%), column: 0
            character: â (displayed as â) (codepoint 226, #o342, #xe2)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xE2
               script: latin
               syntax: w     which means: word
             category: .:Base, L:Left-to-right (strong), j:Japanese, l:Latin, v:Viet
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC3 #xA2
            file code: #xC3 #xA2 (encoded by coding system utf-8-unix)
              display: composed to form "ấ" (see below)

Composed with the following character(s) "́" using this font:
  nil:-apple-Menlo-medium-normal-normal-*-12-*-*-*-m-0-iso10646-1
by these glyphs:
  [0 1 226 164 7 0 7 9 0 nil]
  [0 1 769 646 7 0 5 2 0 [-6 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH CIRCUMFLEX
  old-name: LATIN SMALL LETTER A CIRCUMFLEX
  general-category: Ll (Letter, Lowercase)
  decomposition: (97 770) ('a' '̂')
--8<---------------cut here---------------end--------------->8---

I'll also show "C-u C-x =" for other characters later.
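
The "by these glyphs" lines above can be read field by field.  A rough
sketch in Python follows.  It ASSUMES the field order
[FROM-IDX TO-IDX C CODE WIDTH LBEARING RBEARING ASCENT DESCENT ADJUSTMENT],
which is how the glyph vectors are recalled to be documented for Emacs'
composition-get-gstring; that layout is an assumption here, not something
stated in this thread.  The names Glyph and parse_glyph are hypothetical
helpers.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Glyph(NamedTuple):
    # Field order is an assumption; see the note above the code.
    from_idx: int
    to_idx: int
    char: int        # character code point
    code: int        # glyph code in the font
    width: int
    lbearing: int
    rbearing: int
    ascent: int
    descent: int
    adjustment: Optional[Tuple[int, int, int]]  # assumed (x-off y-off wadjust), or None for nil


GLYPH_RE = re.compile(
    r"\[((?:-?\d+ ){9})(nil|\[(-?\d+) (-?\d+) (-?\d+)\])\]"
)


def parse_glyph(line: str) -> Glyph:
    """Parse one glyph vector as printed by "C-u C-x ="."""
    match = GLYPH_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a glyph vector: {line!r}")
    numbers = [int(n) for n in match.group(1).split()]
    if match.group(2) == "nil":
        adjustment = None
    else:
        adjustment = tuple(int(match.group(i)) for i in (3, 4, 5))
    return Glyph(*numbers, adjustment)


# The two glyph lines quoted in the message above.
base = parse_glyph("[0 1 226 164 7 0 7 9 0 nil]")
mark = parse_glyph("[0 1 769 646 7 0 5 2 0 [-6 0 0]]")

print(f"base U+{base.char:04X}, adjustment {base.adjustment}")
print(f"mark U+{mark.char:04X}, adjustment {mark.adjustment}")
```

Under that assumed layout, the base glyph carries no adjustment while the
combining acute accent is shifted by -6 in its first adjustment field.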

-- 
Best regards, Xue Fuqiao.
http://www.gnu.org/software/emacs/






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07 22:29                           ` Xue Fuqiao
@ 2013-09-07 22:48                             ` Xue Fuqiao
  2013-09-08 11:03                               ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Xue Fuqiao @ 2013-09-07 22:48 UTC (permalink / raw)
  To: 15273

[-- Attachment #1: Type: text/plain, Size: 42 bytes --]

I've attached my "C-u C-x =" information.

[-- Attachment #2: what-cursor-position.txt --]
[-- Type: text/plain, Size: 20635 bytes --]

The first group:

--------------------------------------------------------------------------------------------------------------------------------------
             position: 1 of 1217 (0%), column: 0
            character: ã (displayed as ã) (codepoint 227, #o343, #xe3)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xE3
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), j:Japanese, l:Latin, v:Viet
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC3 #xA3
            file code: #xC3 #xA3 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1 (#xA5)

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH TILDE
  old-name: LATIN SMALL LETTER A TILDE
  general-category: Ll (Letter, Lowercase)
  decomposition: (97 771) ('a' '̃')

             position: 38 of 1217 (3%), column: 0
            character: a (displayed as a) (codepoint 97, #o141, #x61)
    preferred charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x61
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #x61
            file code: #x61 (encoded by coding system utf-8-unix)
              display: composed to form "ã" (see below)

Composed with the following character(s) "̃" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 1 97 68 7 0 8 6 0 nil]
  [0 1 771 692 2 -7 6 1 0 [-3 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A
  general-category: Ll (Letter, Lowercase)
  decomposition: (97) ('a')
--------------------------------------------------------------------------------------------------------------------------------------

The second group:
--------------------------------------------------------------------------------------------------------------------------------------
             position: 84 of 1217 (7%), column: 0
            character: ȧ (displayed as ȧ) (codepoint 551, #o1047, #x227)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x0227
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), l:Latin
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC8 #xA7
            file code: #xC8 #xA7 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1 (#x1E9)

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH DOT ABOVE
  general-category: Ll (Letter, Lowercase)
  decomposition: (97 775) ('a' '̇')

             position: 125 of 1217 (10%), column: 0
            character: a (displayed as a) (codepoint 97, #o141, #x61)
    preferred charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x61
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #x61
            file code: #x61 (encoded by coding system utf-8-unix)
              display: composed to form "ȧ" (see below)

Composed with the following character(s) "̇" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 1 97 68 7 0 8 6 0 nil]
  [0 1 775 696 2 -6 3 2 0 [-2 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A
  general-category: Ll (Letter, Lowercase)
  decomposition: (97) ('a')
--------------------------------------------------------------------------------------------------------------------------------------

The third group:
--------------------------------------------------------------------------------------------------------------------------------------
             position: 175 of 1217 (14%), column: 0
            character: ã (displayed as ã) (codepoint 227, #o343, #xe3)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xE3
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), j:Japanese, l:Latin, v:Viet
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC3 #xA3
            file code: #xC3 #xA3 (encoded by coding system utf-8-unix)
              display: composed to form "ạ̃" (see below)

Composed with the following character(s) "̣" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 1 227 165 7 0 8 9 0 nil]
  [0 1 803 724 2 -6 3 0 2 [-2 1 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH TILDE
  old-name: LATIN SMALL LETTER A TILDE
  general-category: Ll (Letter, Lowercase)
  decomposition: (97 771) ('a' '̃')

             position: 235 of 1217 (19%), column: 0
            character: a (displayed as a) (codepoint 97, #o141, #x61)
    preferred charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x61
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #x61
            file code: #x61 (encoded by coding system utf-8-unix)
              display: composed to form "ạ̃" (see below)

Composed with the following character(s) "̣̃" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 2 97 68 7 0 8 6 0 nil]
  [0 2 771 692 2 -7 6 1 0 [-3 0 0]]
  [0 2 803 724 2 -6 3 0 2 [-2 1 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A
  general-category: Ll (Letter, Lowercase)
  decomposition: (97) ('a')

             position: 303 of 1217 (25%), column: 0
            character: ạ (displayed as ạ) (codepoint 7841, #o17241, #x1ea1)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x1EA1
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), l:Latin, v:Viet
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xE1 #xBA #xA1
            file code: #xE1 #xBA #xA1 (encoded by coding system utf-8-unix)
              display: composed to form "ạ̃" (see below)

Composed with the following character(s) "̃" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 1 7841 2458 7 0 8 6 2 nil]
  [0 1 771 692 2 -7 6 1 0 [-3 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH DOT BELOW
  general-category: Ll (Letter, Lowercase)
  decomposition: (97 803) ('a' '̣')

             position: 363 of 1217 (30%), column: 0
            character: a (displayed as a) (codepoint 97, #o141, #x61)
    preferred charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x61
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #x61
            file code: #x61 (encoded by coding system utf-8-unix)
              display: composed to form "ạ̃" (see below)

Composed with the following character(s) "̣̃" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 2 97 68 7 0 8 6 0 nil]
  [0 2 803 724 2 -6 3 0 2 [-2 1 0]]
  [0 2 771 692 2 -7 6 1 0 [-3 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A
  general-category: Ll (Letter, Lowercase)
  decomposition: (97) ('a')
--------------------------------------------------------------------------------------------------------------------------------------

The fourth group:
--------------------------------------------------------------------------------------------------------------------------------------
             position: 432 of 1217 (35%), column: 0
            character: ạ (displayed as ạ) (codepoint 7841, #o17241, #x1ea1)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x1EA1
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), l:Latin, v:Viet
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xE1 #xBA #xA1
            file code: #xE1 #xBA #xA1 (encoded by coding system utf-8-unix)
              display: composed to form "ạ̇" (see below)

Composed with the following character(s) "̇" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 1 7841 2458 7 0 8 6 2 nil]
  [0 1 775 696 2 -6 3 2 0 [-2 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH DOT BELOW
  general-category: Ll (Letter, Lowercase)
  decomposition: (97 803) ('a' '̣')

             position: 496 of 1217 (41%), column: 0
            character: a (displayed as a) (codepoint 97, #o141, #x61)
    preferred charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x61
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #x61
            file code: #x61 (encoded by coding system utf-8-unix)
              display: composed to form "ạ̇" (see below)

Composed with the following character(s) "̣̇" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 2 97 68 7 0 8 6 0 nil]
  [0 2 803 724 2 -6 3 0 2 [-2 1 0]]
  [0 2 775 696 2 -6 3 2 0 [-2 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A
  general-category: Ll (Letter, Lowercase)
  decomposition: (97) ('a')

             position: 568 of 1217 (47%), column: 0
            character: ȧ (displayed as ȧ) (codepoint 551, #o1047, #x227)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x0227
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), l:Latin
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC8 #xA7
            file code: #xC8 #xA7 (encoded by coding system utf-8-unix)
              display: composed to form "ạ̇" (see below)

Composed with the following character(s) "̣" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 1 551 489 7 0 8 9 0 nil]
  [0 1 803 724 2 -6 3 0 2 [-2 1 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH DOT ABOVE
  general-category: Ll (Letter, Lowercase)
  decomposition: (97 775) ('a' '̇')

             position: 632 of 1217 (52%), column: 0
            character: a (displayed as a) (codepoint 97, #o141, #x61)
    preferred charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x61
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #x61
            file code: #x61 (encoded by coding system utf-8-unix)
              display: composed to form "ạ̇" (see below)

Composed with the following character(s) "̣̇" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 2 97 68 7 0 8 6 0 nil]
  [0 2 775 696 2 -6 3 2 0 [-2 0 0]]
  [0 2 803 724 2 -6 3 0 2 [-2 1 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A
  general-category: Ll (Letter, Lowercase)
  decomposition: (97) ('a')
--------------------------------------------------------------------------------------------------------------------------------------

The fifth group:
--------------------------------------------------------------------------------------------------------------------------------------
             position: 705 of 1217 (58%), column: 0
            character: ấ (displayed as ấ) (codepoint 7845, #o17245, #x1ea5)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x1EA5
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), l:Latin, v:Viet
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xE1 #xBA #xA5
            file code: #xE1 #xBA #xA5 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1 (#x99E)

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE
  general-category: Ll (Letter, Lowercase)
  decomposition: (226 769) ('â' '́')

             position: 757 of 1217 (62%), column: 0
            character: â (displayed as â) (codepoint 226, #o342, #xe2)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xE2
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), j:Japanese, l:Latin, v:Viet
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC3 #xA2
            file code: #xC3 #xA2 (encoded by coding system utf-8-unix)
              display: composed to form "ấ" (see below)

Composed with the following character(s) "́" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 1 226 164 7 0 8 9 0 nil]
  [0 1 769 690 2 -6 5 2 0 [-3 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH CIRCUMFLEX
  old-name: LATIN SMALL LETTER A CIRCUMFLEX
  general-category: Ll (Letter, Lowercase)
  decomposition: (97 770) ('a' '̂')

             position: 825 of 1217 (68%), column: 0
            character: a (displayed as a) (codepoint 97, #o141, #x61)
    preferred charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x61
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #x61
            file code: #x61 (encoded by coding system utf-8-unix)
              display: composed to form "ấ" (see below)

Composed with the following character(s) "̂́" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 2 97 68 7 0 8 6 0 nil]
  [0 2 770 691 2 -7 6 2 0 [-3 0 0]]
  [0 2 769 690 2 -6 5 2 0 [-3 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A
  general-category: Ll (Letter, Lowercase)
  decomposition: (97) ('a')
--------------------------------------------------------------------------------------------------------------------------------------

The sixth group:
--------------------------------------------------------------------------------------------------------------------------------------
             position: 909 of 1217 (75%), column: 0
            character: á (displayed as á) (codepoint 225, #o341, #xe1)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0xE1
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), c:Chinese, j:Japanese, l:Latin, v:Viet
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xC3 #xA1
            file code: #xC3 #xA1 (encoded by coding system utf-8-unix)
              display: composed to form "á̂" (see below)

Composed with the following character(s) "̂" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 1 225 163 7 0 8 9 0 nil]
  [0 1 770 691 2 -7 6 2 0 [-3 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A WITH ACUTE
  old-name: LATIN SMALL LETTER A ACUTE
  general-category: Ll (Letter, Lowercase)
  decomposition: (97 769) ('a' '́')

             position: 972 of 1217 (80%), column: 0
            character: a (displayed as a) (codepoint 97, #o141, #x61)
    preferred charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x61
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #x61
            file code: #x61 (encoded by coding system utf-8-unix)
              display: composed to form "á̂" (see below)

Composed with the following character(s) "́̂" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 2 97 68 7 0 8 6 0 nil]
  [0 2 769 690 2 -6 5 2 0 [-3 0 0]]
  [0 2 770 691 2 -7 6 2 0 [-3 0 0]]

Character code properties: customize what to show
  name: LATIN SMALL LETTER A
  general-category: Ll (Letter, Lowercase)
  decomposition: (97) ('a')
--------------------------------------------------------------------------------------------------------------------------------------

The last group:
--------------------------------------------------------------------------------------------------------------------------------------
             position: 1056 of 1217 (87%), column: 0
            character: α (displayed as α) (codepoint 945, #o1661, #x3b1)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x03B1
               script: greek
               syntax: w 	which means: word
             category: .:Base, G:2-byte Greek, L:Left-to-right (strong), c:Chinese, g:Greek, h:Korean, j:Japanese
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xCE #xB1
            file code: #xCE #xB1 (encoded by coding system utf-8-unix)
              display: composed to form "ἄ" (see below)

Composed with the following character(s) "̓́" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 2 945 837 8 0 9 6 0 nil]
  [0 2 787 708 2 -6 3 2 0 [-3 0 0]]
  [0 2 769 690 2 -6 5 2 0 [-4 0 0]]

Character code properties: customize what to show
  name: GREEK SMALL LETTER ALPHA
  general-category: Ll (Letter, Lowercase)
  decomposition: (945) ('α')

             position: 1137 of 1217 (93%), column: 0
            character: α (displayed as α) (codepoint 945, #o1661, #x3b1)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x03B1
               script: greek
               syntax: w 	which means: word
             category: .:Base, G:2-byte Greek, L:Left-to-right (strong), c:Chinese, g:Greek, h:Korean, j:Japanese
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xCE #xB1
            file code: #xCE #xB1 (encoded by coding system utf-8-unix)
              display: composed to form "ά̓" (see below)

Composed with the following character(s) "́̓" using this font:
  nil:-apple-DejaVu_Sans-medium-normal-normal-*-12-*-*-*-p-0-iso10646-1
by these glyphs:
  [0 2 945 837 8 0 9 6 0 nil]
  [0 2 769 690 2 -6 5 2 0 [-4 0 0]]
  [0 2 787 708 2 -6 3 2 0 [-3 0 0]]

Character code properties: customize what to show
  name: GREEK SMALL LETTER ALPHA
  general-category: Ll (Letter, Lowercase)
  decomposition: (945) ('α')
--------------------------------------------------------------------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 53+ messages in thread

* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  8:54                                       ` Jan Djärv
  2013-09-07  9:59                                         ` Eli Zaretskii
@ 2013-09-07 22:50                                         ` Xue Fuqiao
  1 sibling, 0 replies; 53+ messages in thread
From: Xue Fuqiao @ 2013-09-07 22:50 UTC (permalink / raw)
  To: Jan Djärv; +Cc: 15273

On Sat, Sep 7, 2013 at 4:54 PM, Jan Djärv <jan.h.d@swipnet.se> wrote:
>> But Xue now seems to say that there are other similar
>> situations with other combining characters, so we need to know if any
>> compositions are involved in those.
>
> The exact key sequences that produced them would help also.

I just used C-x 8 RET, entered the Unicode name of each character (the
names are in the attachment), and pressed RET.
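
For anyone who would rather not type the names by hand, here is a
minimal sketch that inserts a few of the sequences from the attachment
by code point.  The function name is made up for illustration; the code
points are the ones shown in the "C-u C-x =" output.

```elisp
;; Hypothetical helper, not part of Emacs: insert some of the
;; sequences from the attachment, one per line.
(defun my-insert-combining-samples ()
  "Insert sample base + combining sequences at point."
  (interactive)
  (insert ?a #x0303 "\n"          ; a + COMBINING TILDE
          ?a #x0307 "\n"          ; a + COMBINING DOT ABOVE
          ?a #x0323 #x0303 "\n"   ; a + COMBINING DOT BELOW + COMBINING TILDE
          ;; alpha + COMBINING COMMA ABOVE + COMBINING ACUTE ACCENT
          #x03B1 #x0313 #x0301 "\n"))
```

Running it in a scratch buffer and then typing C-u C-x = on each base
character should give output comparable to the attachment.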

-- 
Best regards, Xue Fuqiao.
http://www.gnu.org/software/emacs/






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07 15:20                                             ` Eli Zaretskii
@ 2013-09-08  8:26                                               ` Jan Djärv
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Djärv @ 2013-09-08  8:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

Hi.

7 sep 2013 kl. 17:20 skrev Eli Zaretskii <eliz@gnu.org>:

>> From: Jan Djärv <jan.h.d@swipnet.se>
>> Date: Sat, 7 Sep 2013 15:44:53 +0200
>> Cc: xfq.free@gmail.com,
>> 15273@debbugs.gnu.org
>> 
>> Hello.
>> 
>> 7 sep 2013 kl. 11:59 skrev Eli Zaretskii <eliz@gnu.org>:
>> 
>>>> From: Jan Djärv <jan.h.d@swipnet.se>
>>>> Date: Sat, 7 Sep 2013 10:54:58 +0200
>>>> Cc: xfq.free@gmail.com,
>>>> 15273@debbugs.gnu.org
>>>> 
>>>> Except the larger bug that Emacs does not handle combining characters correctly in the general case, i.e. not combining the ! and the triangle on X11 and W32 (and Mac port).
>>> 
>>> Is there any application out there that does combine these 2
>>> characters, with any font?  If not, perhaps it's not an Emacs issue
>>> after all.
>> 
>> Yes, the builtin TextEdit application does it.
> 
> Does it use the same font?

It does.

	Jan D.







* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07 22:48                             ` Xue Fuqiao
@ 2013-09-08 11:03                               ` Eli Zaretskii
  0 siblings, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-08 11:03 UTC (permalink / raw)
  To: Xue Fuqiao, Kenichi Handa; +Cc: 15273

> Date: Sun, 8 Sep 2013 06:48:55 +0800
> From: Xue Fuqiao <xfq.free@gmail.com>
> 
> I've attached my "C-u C-x =" information.

Thanks.  This is quite different from what I get on MS-Windows with
the Uniscribe shaping engine.  I don't know the explanation of these
differences; in fact I cannot even identify the code in nsfont.m that
generates these data structures.  Perhaps Handa-san could help us out
here.

What shaping engine is used by the NS port, btw?

I summarize some of the differences in the computed composition data
below.  The data format is defined in the doc string of
composition-get-gstring as follows:

    [ FROM-IDX TO-IDX C CODE WIDTH LBEARING RBEARING ASCENT DESCENT
      [ [X-OFF Y-OFF WADJUST] | nil] ]
where
    FROM-IDX and TO-IDX are used internally and should not be touched.
    C is the character of the glyph.
    CODE is the glyph-code of C in FONT-OBJECT.
    WIDTH thru DESCENT are the metrics (in pixels) of the glyph.
    X-OFF and Y-OFF are offsets to the base position for the glyph.
    WADJUST is the adjustment to the normal width of the glyph.
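
To make the vectors below easier to read, here is a small sketch that
names the fields of one glyph vector according to the layout just
quoted.  The helper name is hypothetical, and reporting a nil
adjustment slot as zero offsets is my assumption for readability.

```elisp
;; Hypothetical helper: label the fields of one glyph vector as
;; printed by "C-u C-x =".
(defun my-glyph-fields (glyph)
  "Return a plist naming the fields of GLYPH."
  (let ((adj (aref glyph 9)))         ; [X-OFF Y-OFF WADJUST] or nil
    (list :char (aref glyph 2)
          :code (aref glyph 3)
          :width (aref glyph 4)
          :lbearing (aref glyph 5)
          :rbearing (aref glyph 6)
          :ascent (aref glyph 7)
          :descent (aref glyph 8)
          ;; Assumption: treat a missing adjustment as all zeros.
          :x-off (if adj (aref adj 0) 0)
          :y-off (if adj (aref adj 1) 0)
          :wadjust (if adj (aref adj 2) 0))))

;; The NS glyph for U+0303 in the "a + u+0303" sequence:
(plist-get (my-glyph-fields [0 1 771 692 2 -7 6 1 0 [-3 0 0]]) :x-off)
;; => -3
```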

It looks like NS consistently produces more negative values of X-OFF,
which might explain why portions of the combining characters are drawn
off screen.

 Character seq.              W32                        NS
===============================================================================
  a + u+0303    [0 1 97 165 7 0 7 12 3 nil]    [0 1 97 68 7 0 8 6 0 nil]
                                               [0 1 771 692 2 -7 6 1 0 [-3 0 0]]

  a + u+0307    [0 1 97 489 7 0 7 12 3 nil]    [0 1 97 68 7 0 8 6 0 nil]
                                               [0 1 775 696 2 -6 3 2 0 [-2 0 0]]


u+00e3 + u+0323 [0 1 227 165 7 0 7 12 3 nil]   [0 1 227 165 7 0 8 9 0 nil]
                [0 1 227 724 0 -3 -2 12 3 nil] [0 1 803 724 2 -6 3 0 2 [-2 1 0]]

a + u+0303 + u+0323
                [0 2 97 68 7 0 7 12 3 nil]         [0 2 97 68 7 0 8 6 0 nil]
                [0 2 97 692 0 -4 -1 12 3 [-1 0 0]] [0 2 771 692 2 -7 6 1 0 [-3 0 0]]
                [0 2 97 724 0 -3 -2 12 3 [-1 0 0]] [0 2 803 724 2 -6 3 0 2 [-2 1 0]]

u+1ea1 + u+0303 [0 1 7841 2458 7 0 7 12 3 nil]  [0 1 7841 2458 7 0 8 6 2 nil]
                [0 1 7841 692 0 -4 -1 12 3 nil] [0 1 771 692 2 -7 6 1 0 [-3 0 0]]

a + u+0323 + u+0303
                [0 2 97 68 7 0 7 12 3 nil]         [0 2 97 68 7 0 8 6 0 nil]
                [0 2 97 724 0 -3 -2 12 3 [-1 0 0]] [0 2 803 724 2 -6 3 0 2 [-2 1 0]]
                [0 2 97 692 0 -4 -1 12 3 [-1 0 0]] [0 2 771 692 2 -7 6 1 0 [-3 0 0]]

u+1ea1 + u+0307 [0 1 7841 2458 7 0 7 12 3 nil]  [0 1 7841 2458 7 0 8 6 2 nil]
                [0 1 7841 696 0 -3 -2 12 3 nil] [0 1 775 696 2 -6 3 2 0 [-2 0 0]]

a + u+0323 + u+0307
                [0 2 97 68 7 0 7 12 3 nil]         [0 2 97 68 7 0 8 6 0 nil]
                [0 2 97 724 0 -3 -2 12 3 [-1 0 0]] [0 2 803 724 2 -6 3 0 2 [-2 1 0]]
                [0 2 97 696 0 -3 -2 12 3 [-1 0 0]] [0 2 775 696 2 -6 3 2 0 [-2 0 0]]

u+0227 + u+0323 [0 1 551 489 7 0 7 12 3 nil]    [0 1 551 489 7 0 8 9 0 nil]
                [0 1 551 724 0 -3 -2 12 3 nil]  [0 1 803 724 2 -6 3 0 2 [-2 1 0]]

a + u+0307 + u+0323
                [0 2 97 68 7 0 7 12 3 nil]         [0 2 97 68 7 0 8 6 0 nil]
                [0 2 97 696 0 -3 -2 12 3 [-1 0 0]] [0 2 775 696 2 -6 3 2 0 [-2 0 0]]
                [0 2 97 724 0 -3 -2 12 3 [-1 0 0]] [0 2 803 724 2 -6 3 0 2 [-2 1 0]]

u+00e1 + u+0302 [0 1 225 163 7 0 7 12 3 nil]   [0 1 225 163 7 0 8 9 0 nil]
                [0 1 225 691 0 -4 -1 12 3 nil] [0 1 770 691 2 -7 6 2 0 [-3 0 0]]
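
The X-OFF comparison can be checked mechanically.  The sketch below
pulls X-OFF out of each glyph vector for one row of the table; the
helper name is hypothetical, and a nil adjustment slot is counted as 0,
which is my assumption.

```elisp
;; Hypothetical helper: collect X-OFF for each glyph vector in GLYPHS,
;; counting a nil adjustment slot as 0.
(defun my-glyph-x-offsets (glyphs)
  "Return the list of X-OFF values for GLYPHS."
  (mapcar (lambda (g)
            (let ((adj (aref g 9)))
              (if adj (aref adj 0) 0)))
          glyphs))

;; Row "a + u+0303 + u+0323", W32 column:
(my-glyph-x-offsets '([0 2 97 68 7 0 7 12 3 nil]
                      [0 2 97 692 0 -4 -1 12 3 [-1 0 0]]
                      [0 2 97 724 0 -3 -2 12 3 [-1 0 0]]))
;; => (0 -1 -1)

;; Same row, NS column:
(my-glyph-x-offsets '([0 2 97 68 7 0 8 6 0 nil]
                      [0 2 771 692 2 -7 6 1 0 [-3 0 0]]
                      [0 2 803 724 2 -6 3 0 2 [-2 1 0]]))
;; => (0 -3 -2)
```

For this row the NS offsets are 1 to 2 pixels further left than the W32
ones.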







* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-06  6:29             ` Eli Zaretskii
  2013-09-06  6:42               ` Andreas Schwab
@ 2013-09-08 13:05               ` Kenichi Handa
  2013-09-08 14:59                 ` Eli Zaretskii
  1 sibling, 1 reply; 53+ messages in thread
From: Kenichi Handa @ 2013-09-08 13:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

In article <83r4d2mpi2.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

> Character composition in Emacs can happen in 1 of 2 ways:

>  . The font driver tells Emacs to compose several characters into a
>    single grapheme cluster, by drawing all of them as a single unit,
>    and by drawing the 2nd, 3rd, etc. character glyphs at certain pixel
>    offsets relative to the base glyph.

>  . Emacs itself has composition rules for 2 or more characters; in
>    this case, the same pixel offsets come from those rules.

Right.  And U+20E4 has this entry.

(aref composition-function-table #x20E4)
=> (["\\c.\\c^+" 1 compose-gstring-for-graphic]
    [nil 0 compose-gstring-for-graphic])

This says that a base character (char-category ".") followed
by a combining character (char-category "^") should be
composed by the function compose-gstring-for-graphic if
those characters are displayed by the same font.
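
A sketch for inspecting such entries: the helper below (a hypothetical
name) labels the slots of each rule vector.  The labels PATTERN,
PREV-CHARS and FUNC follow my reading of the doc string of
composition-function-table; any entry that is not a vector is returned
unchanged.

```elisp
;; Hypothetical helper: show the composition rules registered for CHAR
;; with their slots labeled.
(defun my-composition-rules (char)
  "Return the rules for CHAR in `composition-function-table' as plists."
  (mapcar (lambda (rule)
            (if (vectorp rule)
                (list :pattern (aref rule 0)
                      :prev-chars (aref rule 1)
                      :function (aref rule 2))
              rule))                  ; leave other entry forms alone
          (aref composition-function-table char)))

(my-composition-rules #x20E4)
```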

But, I found a bug in characters.el.  In it, U+20E4's
char-category is not set as "^".  So, Emacs couldn't compose
it with the preceding base character.  I've just installed a
fix.

Now, Emacs composes U+20E4 with the preceding "!" if the
same font is selected for those two characters.
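
To check whether a given build has the category set, something like the
following sketch can be evaluated.  Both helper names are hypothetical,
and the anchored regexp is only my approximation of the pattern in the
rule above.

```elisp
;; Hypothetical helper: non-nil if CHAR has the "combining" category (^).
(defun my-combining-category-p (char)
  (aref (char-category-set char) ?^))

;; Hypothetical helper: non-nil if STR is a base character followed by
;; one or more combining characters, judged by category only.
(defun my-composable-sequence-p (str)
  (string-match-p "\\`\\c.\\c^+\\'" str))

(my-combining-category-p #x20E4)
(my-composable-sequence-p (string ?! #x20E4))
```

On a build without the fix, evaluating (modify-category-entry #x20E4 ?^)
might serve as a session-local stopgap; that is an assumption on my
part, not a description of the installed change.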

Next, I'm not sure whether Emacs composes them correctly if
the underlying font driver doesn't support OTF's "mark" and
"mkmk" features.  The function compose-gstring-for-graphic
has code for such a case, but I don't remember writing
code for "enclosing combining marks".

---
Kenichi Handa
handa@gnu.org






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-08 13:05               ` Kenichi Handa
@ 2013-09-08 14:59                 ` Eli Zaretskii
  2013-09-10 13:58                   ` Kenichi Handa
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-08 14:59 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: xfq.free, 15273

> From: Kenichi Handa <handa@gnu.org>
> Cc: jan.h.d@swipnet.se, xfq.free@gmail.com, 15273@debbugs.gnu.org
> Date: Sun, 08 Sep 2013 22:05:41 +0900
> 
> (aref composition-function-table #x20E4)
> => (["\\c.\\c^+" 1 compose-gstring-for-graphic]
>     [nil 0 compose-gstring-for-graphic])
> 
> This says that a base character (char-category ".") followed
> by a combining character (char-category "^") should be
> composed by the function compose-gstring-for-graphic if
> those characters are displayed by the same font.
> 
> But, I found a bug in characters.el.  In it, U+20E4's
> char-category is not set as "^".  So, Emacs couldn't compose
> it with the preceding base character.  I've just installed a
> fix.
> 
> Now, Emacs composes U+20E4 with the preceding "!" if the
> same font is selected for those two characters.

Thanks.

> Next, I'm not sure whether Emacs composes them correctly if
> the underlying font driver doesn't support OTF's "mark" and
> "mkmk" features.  The function compose-gstring-for-graphic
> has code for such a case, but I don't remember writing
> code for "enclosing combining marks".

Looks like Uniscribe should need this, at least on XP: I do get a
composition, but the left part of the triangle is off-screen.






* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-07  8:47                                     ` Jan Djärv
  2013-09-07  9:22                                       ` Eli Zaretskii
@ 2013-09-09  0:52                                       ` YAMAMOTO Mitsuharu
  2013-09-09  5:17                                         ` Jan Djärv
  1 sibling, 1 reply; 53+ messages in thread
From: YAMAMOTO Mitsuharu @ 2013-09-09  0:52 UTC (permalink / raw)
  To: Jan Djärv; +Cc: xfq.free, 15273

[-- Attachment #1: Type: text/plain, Size: 1725 bytes --]

>>>>> On Sat, 7 Sep 2013 10:47:05 +0200, Jan Djärv <jan.h.d@swipnet.se> said:

>> nsfont.m has problems with composition; there are even comments
>> about this in the code.  But from what I see on X11 and what you
>> reported from W32, no other platform produces the correct
>> result (i.e. the ! and the triangle in the same place), so
>> comparing to those backends doesn't help.

> FWIW, I tested the mac port that has another font driver, and it
> doesn't work correctly either.  It does not find a good font for the
> triangle and only displays an empty square.  No composition happens.

That square is actually the glyph for U+20E4 in the PCMyungjo font.
That is not correct, of course, but TextEdit.app also selects
PCMyungjo for U+20E4 by default and displays the square.

We need some tricks to make the composition happen for this case on
the Mac port.

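;; Use STIXGeneral for U+20E4 in the default fontset.
(set-fontset-font t #x20e4 "STIXGeneral")
;; Prepend a rule matching "!" followed by any character, shaped by
;; font-shape-gstring, and keep the existing rules after it.
(set-char-table-range composition-function-table #x20e4
		      (cons ["!." 1 font-shape-gstring 0]
			    (aref composition-function-table #x20e4)))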

The last element in the vector is an extension specific to the Mac
port, and it specifies the position of the character (relative to that
for the key) from which the font object is obtained.  It was
originally introduced to support Variation Selectors 15 (text-style)
and 16 (emoji-style).

The result may not be satisfactory, but its consistency with that of
TextEdit.app (upper window in the screenshot) indicates it is the
behavior of the standard text shaper on OS X.

The other Latin example seems to work perfectly on the Mac port
without any settings.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp

[-- Attachment #2: composition.png --]
[-- Type: image/png, Size: 181223 bytes --]


* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-09  0:52                                       ` YAMAMOTO Mitsuharu
@ 2013-09-09  5:17                                         ` Jan Djärv
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Djärv @ 2013-09-09  5:17 UTC (permalink / raw)
  To: YAMAMOTO Mitsuharu; +Cc: xfq.free@gmail.com, 15273@debbugs.gnu.org

Hello. 

9 sep 2013 kl. 02:52 skrev YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp>:

>>>>>> On Sat, 7 Sep 2013 10:47:05 +0200, Jan Djärv <jan.h.d@swipnet.se> said:
> 
>>> nsfont.m has problems with composition; there are even comments
>>> about this in the code.  But from what I see on X11 and what you
>>> reported from W32, no other platform produces the correct
>>> result (i.e. the ! and the triangle in the same place), so
>>> comparing to those backends doesn't help.
> 
>> FWIW, I tested the mac port that has another font driver, and it
>> doesn't work correctly either.  It does not find a good font for the
>> triangle and only displays an empty square.  No composition happens.
> 
> That square is actually the glyph for U+20E4 in the PCMyungjo font.
> That is not correct, of course, but TextEdit.app also selects
> PCMyungjo for U+20E4 by default and displays the square.

I wonder if it is locale-dependent.  For me, TextEdit.app selects STIXGeneral and the composition is correct.

     Jan D. 





* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-08 14:59                 ` Eli Zaretskii
@ 2013-09-10 13:58                   ` Kenichi Handa
  2013-09-10 15:37                     ` Eli Zaretskii
  2013-09-12 14:52                     ` Kenichi Handa
  0 siblings, 2 replies; 53+ messages in thread
From: Kenichi Handa @ 2013-09-10 13:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

In article <834n9vl5q1.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:

>> Next, I'm not sure whether Emacs composes them correctly if
>> the underlying font driver doesn't support OTF's "mark" and
>> "mkmk" features.  The function compose-gstring-for-graphic
>> has code for such a case, but I don't remember writing
>> code for "enclosing combining marks".

> Looks like Uniscribe should need this, at least on XP: I do get a
> composition, but the left part of the triangle is off-screen.

I saw the same rendering on GNU/Linux with a font
which doesn't have a proper feature for U+20E4.  I'm now
investigating how to improve compose-gstring-for-graphic.

---
Kenichi Handa
handa@gnu.org





^ permalink raw reply	[flat|nested] 53+ messages in thread

* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-10 13:58                   ` Kenichi Handa
@ 2013-09-10 15:37                     ` Eli Zaretskii
  2013-09-12 14:52                     ` Kenichi Handa
  1 sibling, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-10 15:37 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: xfq.free, 15273

> From: Kenichi Handa <handa@gnu.org>
> Cc: jan.h.d@swipnet.se, xfq.free@gmail.com, 15273@debbugs.gnu.org
> Date: Tue, 10 Sep 2013 22:58:20 +0900
> 
> In article <834n9vl5q1.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Next, I'm not sure whether Emacs composes them correctly if
> >> the underlying font driver doesn't support OTF's "mark" and
> >> "mkmk" features.  The function compose-gstring-for-graphic
> >> has a code for such a case, but I don't remember that I
> >> wrote a code for "enclosing combining marks".
> 
> > Looks like Uniscribe should need this, at least on XP: I do get a
> > composition, but the left part of the triangle is off-screen.
> 
> I saw the same rendering on GNU/Linux with a font
> which doesn't have a proper feature for U+20E4.

Then I guess it's something related to the fact that the base
character '!' is much narrower than the triangle.

> I'm now investigating how to improve compose-gstring-for-graphic.

Thank you.





^ permalink raw reply	[flat|nested] 53+ messages in thread

* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-10 13:58                   ` Kenichi Handa
  2013-09-10 15:37                     ` Eli Zaretskii
@ 2013-09-12 14:52                     ` Kenichi Handa
  2013-09-12 16:08                       ` Eli Zaretskii
  2013-09-14  8:51                       ` Jan Djärv
  1 sibling, 2 replies; 53+ messages in thread
From: Kenichi Handa @ 2013-09-12 14:52 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: xfq.free, 15273

In article <87li34ye0z.fsf@gnu.org>, Kenichi Handa <handa@gnu.org> writes:

> I saw the same rendering on GNU/Linux with a font
> which doesn't have a proper feature for U+20E4.  I'm now
> investigating how to improve compose-gstring-for-graphic.

I've just installed a fix.  Now at least on GNU/Linux with
the FreeSans font for ASCII and U+20E4, the layout is
improved.

---
Kenichi Handa
handa@gnu.org





^ permalink raw reply	[flat|nested] 53+ messages in thread

* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-12 14:52                     ` Kenichi Handa
@ 2013-09-12 16:08                       ` Eli Zaretskii
  2020-11-18 15:13                         ` Stefan Kangas
  2013-09-14  8:51                       ` Jan Djärv
  1 sibling, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2013-09-12 16:08 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: xfq.free, 15273

> From: Kenichi Handa <handa@gnu.org>
> Cc: eliz@gnu.org, xfq.free@gmail.com, 15273@debbugs.gnu.org
> Date: Thu, 12 Sep 2013 23:52:38 +0900
> 
> In article <87li34ye0z.fsf@gnu.org>, Kenichi Handa <handa@gnu.org> writes:
> 
> > I saw the same rendering on GNU/Linux with a font
> > which doesn't have a proper feature for U+20E4.  I'm now
> > investigating how to improve compose-gstring-for-graphic.
> 
> I've just installed a fix.  Now at least on GNU/Linux with
> FreeSans font for ASCII and U+20e4, the layouting is
> improved.

Thanks, it's improved on Windows as well.





^ permalink raw reply	[flat|nested] 53+ messages in thread

* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-12 14:52                     ` Kenichi Handa
  2013-09-12 16:08                       ` Eli Zaretskii
@ 2013-09-14  8:51                       ` Jan Djärv
  2013-09-14  9:12                         ` Jan Djärv
  1 sibling, 1 reply; 53+ messages in thread
From: Jan Djärv @ 2013-09-14  8:51 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: xfq.free, 15273

Hello.

2013-09-12 16:52, Kenichi Handa skrev:
> In article <87li34ye0z.fsf@gnu.org>, Kenichi Handa <handa@gnu.org> writes:
>
>> I saw the same rendering on GNU/Linux with a font
>> which doesn't have a proper feature for U+20E4.  I'm now
>> investigating how to improve compose-gstring-for-graphic.
>
> I've just installed a fix.  Now at least on GNU/Linux with
> FreeSans font for ASCII and U+20e4, the layouting is
> improved.
>

How?  I don't see any difference compared to earlier.

	Jan D.







^ permalink raw reply	[flat|nested] 53+ messages in thread

* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-14  8:51                       ` Jan Djärv
@ 2013-09-14  9:12                         ` Jan Djärv
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Djärv @ 2013-09-14  9:12 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: xfq.free, 15273

Hello.

2013-09-14 10:51, Jan Djärv skrev:
> Hello.
>
> 2013-09-12 16:52, Kenichi Handa skrev:
>> In article <87li34ye0z.fsf@gnu.org>, Kenichi Handa <handa@gnu.org> writes:
>>
>>> I saw the same rendering on GNU/Linux with a font
>>> which doesn't have a proper feature for U+20E4.  I'm now
>>> investigating how to improve compose-gstring-for-graphic.
>>
>> I've just installed a fix.  Now at least on GNU/Linux with
>> FreeSans font for ASCII and U+20e4, the layouting is
>> improved.
>>
>
> How?  I don't see any difference compared to earlier.
>

Okay, now I see it; I had selected the wrong Sans font.

	Jan D.







^ permalink raw reply	[flat|nested] 53+ messages in thread

* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2013-09-12 16:08                       ` Eli Zaretskii
@ 2020-11-18 15:13                         ` Stefan Kangas
  2020-11-18 17:09                           ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Stefan Kangas @ 2020-11-18 15:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: xfq.free, 15273

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Kenichi Handa <handa@gnu.org>
>> Cc: eliz@gnu.org, xfq.free@gmail.com, 15273@debbugs.gnu.org
>> Date: Thu, 12 Sep 2013 23:52:38 +0900
>>
>> In article <87li34ye0z.fsf@gnu.org>, Kenichi Handa <handa@gnu.org> writes:
>>
>> > I saw the same rendering on GNU/Linux with a font
>> > which doesn't have a proper feature for U+20E4.  I'm now
>> > investigating how to improve compose-gstring-for-graphic.
>>
>> I've just installed a fix.  Now at least on GNU/Linux with
>> FreeSans font for ASCII and U+20e4, the layouting is
>> improved.
>
> Thanks, it's improved on Windows as well.

(That was 7 years ago.)

The recipe to reproduce this bug is:

  !                      ;; input an exclamation mark (#x21)
  C-x 8 RET 2 0 E 4 RET  ;; COMBINING ENCLOSING UPWARD POINTING TRIANGLE
  RET                    ;; Newline
  C-x 8 RET 2 6 A 0 RET  ;; WARNING SIGN

This gives me, on current master running on GNU/Linux:

!⃤
⚠

Besides the fact that the upper triangle is large and the warning sign
is small in my case, I see nothing untoward about this.

Is this all working correctly now, or is there anything left to be done?
Could this bug be closed?

Thanks in advance.
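
For anyone who wants to regenerate the two test lines from the recipe
above without typing them, here is a minimal sketch in Python (not
Emacs Lisp, and not part of the fix itself).  It only builds the strings
from their code points and prints their Unicode properties; how they are
rendered still depends on the font and the display engine.  The names
BASE, ENCLOSING, WARNING and describe are made up for this illustration.

```python
import unicodedata

BASE = "!"            # U+0021 EXCLAMATION MARK
ENCLOSING = "\u20e4"  # U+20E4 COMBINING ENCLOSING UPWARD POINTING TRIANGLE
WARNING = "\u26a0"    # U+26A0 WARNING SIGN


def describe(text):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


# First line of the recipe: base character followed by the enclosing mark.
composed_line = BASE + ENCLOSING

for line in (composed_line, WARNING):
    print(line)
    for entry in describe(line):
        print("   ", entry)
```

The first line stays two separate code points even after normalization,
since there is no precomposed character for this pair, so placing the
triangle around the '!' is left entirely to the renderer.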





^ permalink raw reply	[flat|nested] 53+ messages in thread

* bug#15273: 24.3.50; Combining character sequences are displayed weirdly
  2020-11-18 15:13                         ` Stefan Kangas
@ 2020-11-18 17:09                           ` Eli Zaretskii
  0 siblings, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2020-11-18 17:09 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: xfq.free, 15273-done

> From: Stefan Kangas <stefan@marxist.se>
> Date: Wed, 18 Nov 2020 07:13:12 -0800
> Cc: Kenichi Handa <handa@gnu.org>, xfq.free@gmail.com, 15273@debbugs.gnu.org
> 
> The recipe to reproduce this bug is:
> 
>   !                      ;; input an exclamation mark (#x21)
>   C-x 8 RET 2 0 E 4 RET  ;; COMBINING ENCLOSING UPWARD POINTING TRIANGLE
>   RET                    ;; Newline
>   C-x 8 RET 2 6 A 0 RET  ;; WARNING SIGN
> 
> This gives me, on current master running on GNU/Linux:
> 
> !⃤
> ⚠
> 
> Besides the fact that the upper triangle is large and the warning sign
> is small in my case, I see nothing untoward about this.
> 
> Is this all working correctly now, or is there anything left to be done?
> Could this bug be closed?

Yes, closing.





^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2020-11-18 17:09 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-05 14:08 bug#15273: 24.3.50; Combining character sequences are displayed weirdly Xue Fuqiao
2013-09-05 14:33 ` Eli Zaretskii
2013-09-05 23:26   ` Xue Fuqiao
2013-09-05 16:48 ` Jan Djärv
2013-09-05 17:12   ` Eli Zaretskii
2013-09-05 17:24     ` Eli Zaretskii
2013-09-05 17:33       ` Jan Djärv
2013-09-05 17:56         ` Eli Zaretskii
2013-09-06  5:08           ` Jan Djärv
2013-09-06  6:29             ` Eli Zaretskii
2013-09-06  6:42               ` Andreas Schwab
2013-09-06  7:32                 ` Eli Zaretskii
2013-09-06 14:37                   ` Xue Fuqiao
2013-09-06 15:53                     ` Eli Zaretskii
2013-09-06 22:17                       ` Xue Fuqiao
2013-09-06 22:37                         ` Xue Fuqiao
2013-09-07  7:27                           ` Eli Zaretskii
2013-09-07  7:26                         ` Eli Zaretskii
2013-09-07  7:36                           ` Jan Djärv
2013-09-07  7:57                             ` Eli Zaretskii
2013-09-07  8:02                               ` Jan Djärv
2013-09-07  8:10                                 ` Eli Zaretskii
2013-09-07  8:27                                   ` Jan Djärv
2013-09-07  8:40                                     ` Eli Zaretskii
2013-09-07  8:54                                       ` Jan Djärv
2013-09-07  9:59                                         ` Eli Zaretskii
2013-09-07 13:44                                           ` Jan Djärv
2013-09-07 15:20                                             ` Eli Zaretskii
2013-09-08  8:26                                               ` Jan Djärv
2013-09-07 22:50                                         ` Xue Fuqiao
2013-09-07  8:47                                     ` Jan Djärv
2013-09-07  9:22                                       ` Eli Zaretskii
2013-09-09  0:52                                       ` YAMAMOTO Mitsuharu
2013-09-09  5:17                                         ` Jan Djärv
2013-09-07 12:25                     ` Wolfgang Jenkner
2013-09-07 15:18                       ` Eli Zaretskii
2013-09-07 21:38                         ` Wolfgang Jenkner
2013-09-07 22:29                           ` Xue Fuqiao
2013-09-07 22:48                             ` Xue Fuqiao
2013-09-08 11:03                               ` Eli Zaretskii
2013-09-08 13:05               ` Kenichi Handa
2013-09-08 14:59                 ` Eli Zaretskii
2013-09-10 13:58                   ` Kenichi Handa
2013-09-10 15:37                     ` Eli Zaretskii
2013-09-12 14:52                     ` Kenichi Handa
2013-09-12 16:08                       ` Eli Zaretskii
2020-11-18 15:13                         ` Stefan Kangas
2020-11-18 17:09                           ` Eli Zaretskii
2013-09-14  8:51                       ` Jan Djärv
2013-09-14  9:12                         ` Jan Djärv
2013-09-05 17:29     ` Jan Djärv
2013-09-05 17:56       ` Eli Zaretskii
2013-09-05 23:27   ` Xue Fuqiao
