all messages for Emacs-related lists mirrored at yhetil.org
* Help with unicode diacritics
@ 2020-12-28  7:07 Stephen Eglen
  2020-12-28 12:33 ` Skip Montanaro
  2020-12-28 14:34 ` Eli Zaretskii
  0 siblings, 2 replies; 14+ messages in thread
From: Stephen Eglen @ 2020-12-28  7:07 UTC (permalink / raw)
  To: help-gnu-emacs

Hello,

can anyone out there help me debug a Unicode font issue? In
particular, I'd like to use COMBINING OVERLINE (U+0305) to put a bar
over "x" (i.e. x bar, to represent the mean of x).

I'm on manjaro arch, with Emacs 27.1.  I wrote the following snippet to
demonstrate the problem. See the line of a character a with various
diacritics under the call to (code) to demonstrate the problem.  Out of
the three fonts I tested, Monaco does the best, but ideally I'd like to
use JuliaMono.  If I view this same file in xfce4-terminal, using
JuliaMono, all the diacritics appear fine.  This makes me suspect Emacs.
On the page: https://damtp.cam.ac.uk/user/sje30/temp/accents.png is what
I see (using EXWM, with emacs rendering on left and xfce-terminal on the
right).

How might I debug further?

The JuliaMono font was installed using the arch package
https://github.com/cormullion/juliamono


Alternatively, does anyone recommend a monospace font with good Unicode
performance in Emacs?

Thanks for any pointers, Stephen


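(defun code ()
  "Insert \"a\" followed by each of 20 combining diacritics into the buffer.
The diacritics start at U+0301 (decimal 769)."
  (dotimes (i 20)
    ;; `insert' accepts characters (integers) as well as strings.
    (insert "a" (+ #x0301 i) " ")))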

;; U+0301 (#x0301) is code point (+ (* 3 256) 1) => 769
;; (code)
;; á â ã ā a̅ ă ȧ ä ả å a̋ ǎ a̍ a̎ ȁ a̐ ȃ a̒ a̓ a̔ 

;; Monaco renders ok except a̐
;;  (set-face-attribute 'default nil :font "Monaco")
;;
;; quite a few missing 
;; (set-face-attribute 'default nil :font "JuliaMono")
;;
;; quite a few missing
;; (set-face-attribute 'default nil :font "Roboto Mono")






(emacs-version)
"GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.22, cairo version 1.17.3)
 of 2020-08-28"




* Re: Help with unicode diacritics
  2020-12-28  7:07 Help with unicode diacritics Stephen Eglen
@ 2020-12-28 12:33 ` Skip Montanaro
  2020-12-28 13:43   ` Stephen Eglen
  2020-12-28 16:50   ` Drew Adams
  2020-12-28 14:34 ` Eli Zaretskii
  1 sibling, 2 replies; 14+ messages in thread
From: Skip Montanaro @ 2020-12-28 12:33 UTC (permalink / raw)
  To: Stephen Eglen; +Cc: Help GNU Emacs


I was curious, as I also use Manjaro and Emacs 27.1. I installed JuliaMono
from the git repo. I had trouble getting it to appear in the xfce4 terminal
preferences, but I could see it in Emacs's font picker. Running your code
function and pasting into my *scratch* buffer I see:

[image: emacs-fonts.png]

so, fairly inaccurate placement of the diacritical marks.

Punting on JuliaMono in xfce-terminal I went with Monospace Regular:

[image: fonts.png]

That seems to be pretty bad as well, and Monospace regular should be well
sorted. I know nothing about fonts or diacritical marks, and have a US
keyboard, so sometime ago I stashed a file in my home directory where I
could look up the occasional accented character. I visited that file and
the letters all look fine. Here's a chunk of small letter a code points:

[image: compose.png]

That suggests to me that if there's something wrong it might be with the
width of some of the diacritical marks themselves, perhaps too wide for the
letter they sit above? Still, that theory doesn't explain the wacky
rendering of LATIN SMALL LETTER A WITH CIRCUMFLEX. It also renders fine in
my compose file:

[image: circumflex.png]

I'm just shooting in the dark though, so I could well be off-base. Are you
programmatically inserting characters in your buffer in general, or just
for this example? Does your ā also render badly when inserted via your
keyboard? (FWIW, it looks fine for me using C-x 8 RET LATIN SMALL LETTER A
WITH MACRON RET.)

Skip Montanaro



* Re: Help with unicode diacritics
  2020-12-28 12:33 ` Skip Montanaro
@ 2020-12-28 13:43   ` Stephen Eglen
  2020-12-28 14:48     ` Eli Zaretskii
  2020-12-28 16:50   ` Drew Adams
  1 sibling, 1 reply; 14+ messages in thread
From: Stephen Eglen @ 2020-12-28 13:43 UTC (permalink / raw)
  To: Skip Montanaro; +Cc: Help GNU Emacs, Stephen Eglen

Thanks Skip.

> I was curious, as I also use Manjaro and Emacs 27.1. I installed JuliaMono
> from the git repo. I had trouble getting it to appear in the xfce4 terminal
> preferences

(I installed via `yay -S ttf-juliamono`)


Thanks for replicating the issue.

> I'm just shooting in the dark though, so I could well be off-base. Are you
> programmatically inserting characters in your buffer in general, or just
> for this example? Does your ā also render badly when inserted via your
> keyboard? (FWIW, it looks fine for me using C-x 8 RET LATIN SMALL LETTER A
> WITH MACRON RET.)

If I do this, then I get a fine-looking a with an overline (ā); the issue
is that I'd like the same for other letters, e.g. x, for which there is no
precomposed code point such as "LATIN SMALL LETTER X WITH MACRON".
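
The difference between the two cases can be checked from Lisp. The
following is only a sketch: the helper name `my/nfc-length' is made up
for illustration, and it relies on `ucs-normalize-NFC-string' from the
ucs-normalize library that ships with Emacs.

```elisp
(require 'ucs-normalize)

(defun my/nfc-length (base combining)
  "Return the number of characters left after NFC-normalizing BASE + COMBINING.
BASE and COMBINING are characters.  A result of 1 means Unicode has a
precomposed code point for the pair; 2 means the pair stays decomposed."
  (length (ucs-normalize-NFC-string (string base combining))))

;; a + COMBINING MACRON (U+0304) has a precomposed form, U+0101.
(my/nfc-length ?a #x0304)   ; => 1
;; x + COMBINING OVERLINE (U+0305) has none, so it stays as two
;; characters and has to be composed at display time.
(my/nfc-length ?x #x0305)   ; => 2
```

A result of 2 is the case where rendering depends on the font and the
text shaper rather than on a single glyph.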

In the meantime, I've also found this old bug report but that seems to
predate the harfbuzz implementation now in 27.1:

> https://lists.gnu.org/archive/html/bug-gnu-emacs/2015-07/msg00334.html

Thanks, Stephen






* Re: Help with unicode diacritics
  2020-12-28  7:07 Help with unicode diacritics Stephen Eglen
  2020-12-28 12:33 ` Skip Montanaro
@ 2020-12-28 14:34 ` Eli Zaretskii
  1 sibling, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2020-12-28 14:34 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Stephen Eglen <sje30@cam.ac.uk>
> Date: Mon, 28 Dec 2020 07:07:36 +0000
> 
> I'm on manjaro arch, with Emacs 27.1.  I wrote the following snippet to
> demonstrate the problem. See the line of a character a with various
> diacritics under the call to (code) to demonstrate the problem.  Out of
> the three fonts I tested, Monaco does the best, but ideally I'd like to
> use JuliaMono.  If I view this same file in xfce4-terminal, using
> JuliaMono, all the diacritics appear fine.  This makes me suspect Emacs.
> On the page: https://damtp.cam.ac.uk/user/sje30/temp/accents.png is what
> I see (using EXWM, with emacs rendering on left and xfce-terminal on the
> right).
> 
> How might I debug further?

You need to use a font that has glyphs both for a (every font will
fulfill that requirement) and the diacritics.  Emacs can only compose
characters if their glyphs come from the same font.

> Alternatively, does anyone recommend a monospace font with good Unicode
> performance in Emacs?

You are not looking for a font with good Unicode coverage, you are
looking for a font with good coverage of Latin diacritics.
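
One way to check the same-font rule for a given piece of text is to
ask Emacs which font it picked for each character.  This is a rough
sketch, not a definitive recipe: `my/fonts-in-region' is a
hypothetical helper, and it assumes it is evaluated (e.g. with M-:)
in a graphical frame whose selected window shows the current buffer,
which is what `font-at' expects.

```elisp
(defun my/fonts-in-region (beg end)
  "Return a list of (CHAR-STRING . FONT-NAME) for text between BEG and END.
Return nil on a non-graphical display, where there are no fonts to report."
  (when (display-graphic-p)
    (let (result)
      (dotimes (i (- end beg))
        (let* ((pos (+ beg i))
               ;; `font-at' returns the font object used at POS, or nil.
               (font (font-at pos)))
          (push (cons (string (char-after pos))
                      (and font (font-xlfd-name font)))
                result)))
      (nreverse result))))

;; Example: report the fonts used on the current line.
(my/fonts-in-region (line-beginning-position) (line-end-position))
```

If the base letter and the combining mark report different font
names, the pair falls under the rule described above.  C-u C-x = on
a single character gives similar information interactively.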




* Re: Help with unicode diacritics
  2020-12-28 13:43   ` Stephen Eglen
@ 2020-12-28 14:48     ` Eli Zaretskii
  0 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2020-12-28 14:48 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Stephen Eglen <sje30@cam.ac.uk>
> Date: Mon, 28 Dec 2020 13:43:45 +0000
> Cc: Help GNU Emacs <help-gnu-emacs@gnu.org>, Stephen Eglen <sje30@cam.ac.uk>
> 
> In the meantime, I've also found this old bug report but that seems to
> predate the harfbuzz implementation now in 27.1:
> 
> > https://lists.gnu.org/archive/html/bug-gnu-emacs/2015-07/msg00334.html

I don't think that old bug is related to what you see, for two
reasons:

  . the bug was in libm17n-flt, and you are using HarfBuzz
  . the basic rule that Emacs only composes characters supported by
    the same font doesn't depend on the text-shaping engine, so it
    cannot be fixed in one particular engine

You need to find a better font.




* RE: Help with unicode diacritics
  2020-12-28 12:33 ` Skip Montanaro
  2020-12-28 13:43   ` Stephen Eglen
@ 2020-12-28 16:50   ` Drew Adams
  2020-12-28 17:45     ` AproposUnicode (was: Help with unicode diacritics) Janusz S. Bień
  2020-12-30  9:51     ` Help with unicode diacritics Stephen Eglen
  1 sibling, 2 replies; 14+ messages in thread
From: Drew Adams @ 2020-12-28 16:50 UTC (permalink / raw)
  To: Skip Montanaro, Stephen Eglen; +Cc: Help GNU Emacs

> sometime ago I stashed a file in my home directory where I
> could look up the occasional accented character. I visited that file and
> the letters all look fine. Here's a chunk of small letter a code points:

Only partly related to this thread, and just FYI.

My library `apu.el' (Apropos Unicode) can sometimes
help with showing info about Unicode chars.

https://www.emacswiki.org/emacs/AproposUnicode

https://www.emacswiki.org/emacs/download/apu.el




* AproposUnicode (was: Help with unicode diacritics)
  2020-12-28 16:50   ` Drew Adams
@ 2020-12-28 17:45     ` Janusz S. Bień
  2020-12-28 18:11       ` Drew Adams
  2020-12-30  9:51     ` Help with unicode diacritics Stephen Eglen
  1 sibling, 1 reply; 14+ messages in thread
From: Janusz S. Bień @ 2020-12-28 17:45 UTC (permalink / raw)
  To: Drew Adams; +Cc: Help GNU Emacs

On Mon, Dec 28 2020 at  8:50 -08, Drew Adams wrote:
>> sometime ago I stashed a file in my home directory where I
>> could look up the occasional accented character. I visited that file and
>> the letters all look fine. Here's a chunk of small letter a code points:
>
> Only partly related to this thread, and just FYI.
>
> My library `apu.el' (Apropos Unicode) can sometimes
> help with showing info about Unicode chars.
>
> https://www.emacswiki.org/emacs/AproposUnicode
>
> https://www.emacswiki.org/emacs/download/apu.el

I've made a quick test and it looks like I would like it, but it seems that
you completely ignore private use characters. For example,  U+E8BF
described by ‘describe-char’ as general-category: Co (Other, Private
Use) is completely ignored by `describe-chars-in-region'.

Best regards

Janusz

-- 
             ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien




* RE: AproposUnicode (was: Help with unicode diacritics)
  2020-12-28 17:45     ` AproposUnicode (was: Help with unicode diacritics) Janusz S. Bień
@ 2020-12-28 18:11       ` Drew Adams
  2020-12-28 18:16         ` AproposUnicode Janusz S. Bień
  0 siblings, 1 reply; 14+ messages in thread
From: Drew Adams @ 2020-12-28 18:11 UTC (permalink / raw)
  To: jsbien; +Cc: Help GNU Emacs

> > Only partly related to this thread, and just FYI.
> > My library `apu.el' (Apropos Unicode) can sometimes
> > help with showing info about Unicode chars.
>
> I've made a quick test and it looks like I would like it, but it seems that
> you completely ignore private use characters. For example,  U+E8BF
> described by ‘describe-char’ as general-category: Co (Other, Private
> Use) is completely ignored by `describe-chars-in-region'.

Thanks for trying it.

First, I'm no expert on Unicode, by any means.

I use `get-char-code-property' to check for Unicode
chars, checking properties `name' and `old-name':

(or (get-char-code-property character 'name)
    (get-char-code-property character 'old-name))

If a char doesn't have either of those properties
then I flag it as not being a Unicode char.

(Perhaps the message shouldn't say it's not a Unicode
char.  I'm no expert on the terminology.  Perhaps it
should just say that the char has no name.)

Maybe you can suggest a code change to do something
more like what you expect?  If so, please do.  Thx.
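
For reference, the properties involved can be inspected side by side.
This is a minimal sketch and not code from apu.el: the names
`my/char-summary' and `my/unassigned-p' are invented here, and testing
the general category instead of the name is only one possible
approach.

```elisp
(defun my/char-summary (char)
  "Return a plist with the Unicode name, old name, and category of CHAR."
  (list :name (get-char-code-property char 'name)
        :old-name (get-char-code-property char 'old-name)
        :category (get-char-code-property char 'general-category)))

(defun my/unassigned-p (char)
  "Return non-nil if CHAR's general category is Cn (unassigned)."
  (eq (get-char-code-property char 'general-category) 'Cn))

;; COMBINING OVERLINE: has a name, category Mn.
(my/char-summary #x0305)
;; The private-use character from this thread: category Co.
(my/char-summary #xE8BF)
```

A check based on `general-category' would treat private-use
characters (Co) as known characters even when they carry no name.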




* Re: AproposUnicode
  2020-12-28 18:11       ` Drew Adams
@ 2020-12-28 18:16         ` Janusz S. Bień
  2020-12-28 18:40           ` AproposUnicode Drew Adams
  0 siblings, 1 reply; 14+ messages in thread
From: Janusz S. Bień @ 2020-12-28 18:16 UTC (permalink / raw)
  To: Drew Adams; +Cc: Help GNU Emacs

On Mon, Dec 28 2020 at 10:11 -08, Drew Adams wrote:
>> > Only partly related to this thread, and just FYI.
>> > My library `apu.el' (Apropos Unicode) can sometimes
>> > help with showing info about Unicode chars.
>>
>> I've made a quick test and it looks like I would like it, but it seems that
>> you completely ignore private use characters. For example,  U+E8BF
>> described by ‘describe-char’ as general-category: Co (Other, Private
>> Use) is completely ignored by `describe-chars-in-region'.
>
> Thanks for trying it.
>
> First, I'm no expert on Unicode, by any means.
>
> I use `get-char-code-property' to check for Unicode
> chars, checking properties `name' and `old-name':
>
> (or (get-char-code-property character 'name)
>     (get-char-code-property character 'old-name))
>
> If a char doesn't have either of those properties
> then I flag it as not being a Unicode char.
>
> (Perhaps the message shouldn't say it's not a Unicode
> char.  I'm no expert on the terminology.  Perhaps it
> should just say that the char has no name.)

I see in the code the message

"The following chars are not recognized as Unicode:\n%s"

but it is not printed in the described situation. This seems to be just
a bug.

> Maybe you can suggest a code change to do something
> more like what you expect?  If so, please do.  Thx.

With pleasure. Should I do this off the list?

Best regards

Janusz

-- 
             ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien




* RE: AproposUnicode
  2020-12-28 18:16         ` AproposUnicode Janusz S. Bień
@ 2020-12-28 18:40           ` Drew Adams
  0 siblings, 0 replies; 14+ messages in thread
From: Drew Adams @ 2020-12-28 18:40 UTC (permalink / raw)
  To: jsbien; +Cc: Help GNU Emacs

> I see in the code the message
>   "The following chars are not recognized as Unicode:\n%s"
> but it is not printed in the described situation.
> This seems to be just a bug.

It likely got shown in the echo area but was
overwritten by some subsequent message.  Look
in buffer *Messages* for it.
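
Searching the log can also be done from Lisp.  A small sketch,
assuming the message was logged as usual; `my/find-in-log' is a
hypothetical helper, not part of Emacs or apu.el.

```elisp
(defun my/find-in-log (regexp &optional buffer)
  "Return the last line matching REGEXP in BUFFER, or nil if there is none.
BUFFER defaults to the *Messages* buffer."
  (with-current-buffer (or buffer (messages-buffer))
    (save-excursion
      (goto-char (point-max))
      (when (re-search-backward regexp nil t)
        (buffer-substring-no-properties
         (line-beginning-position) (line-end-position))))))

;; Look for the message discussed above.
(my/find-in-log "not recognized as Unicode")
```

Since the message text contains a newline, the flagged characters
would be on the line after the one this returns.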

> > Maybe you can suggest a code change to do something
> > more like what you expect?  If so, please do.  Thx.
> 
> With pleasure. Should I do this off the list?

Yes, please.  We can look into the hidden message
as well.




* Re: Help with unicode diacritics
  2020-12-28 16:50   ` Drew Adams
  2020-12-28 17:45     ` AproposUnicode (was: Help with unicode diacritics) Janusz S. Bień
@ 2020-12-30  9:51     ` Stephen Eglen
  2020-12-30 17:05       ` Janusz S. Bień
  2020-12-30 20:22       ` Eli Zaretskii
  1 sibling, 2 replies; 14+ messages in thread
From: Stephen Eglen @ 2020-12-30  9:51 UTC (permalink / raw)
  To: Drew Adams; +Cc: Skip Montanaro, Help GNU Emacs, Stephen Eglen

Thanks Drew, and to the others for replying.  (Apologies I can't reply
to individual messages that were not CC'ed to me, as I'm not on the
mailing list yet.)

To answer Eli's suggestions:

1. I don't need a latin font, I need a unicode font for handling
mathematical expressions like x bar (for the mean of x) as I said in my
original email.

2. I think the juliamono font does contain glyphs for both "a" and the
"overlinecomb".  I asked the font author:
https://github.com/cormullion/juliamono/issues/87

3. If anyone knows of a good font for handling these diacritics, I'd
like to hear recommendations, thank you.  In the meantime, I will
investigate firacode.

best wishes to all for 2021.

Stephen




On Mon, Dec 28 2020, Drew Adams wrote:

>> sometime ago I stashed a file in my home directory where I
>> could look up the occasional accented character. I visited that file and
>> the letters all look fine. Here's a chunk of small letter a code points:
>
> Only partly related to this thread, and just FYI.
>
> My library `apu.el' (Apropos Unicode) can sometimes
> help with showing info about Unicode chars.
>
> https://www.emacswiki.org/emacs/AproposUnicode
>
> https://www.emacswiki.org/emacs/download/apu.el





* Re: Help with unicode diacritics
  2020-12-30  9:51     ` Help with unicode diacritics Stephen Eglen
@ 2020-12-30 17:05       ` Janusz S. Bień
  2020-12-30 17:50         ` Stephen Eglen
  2020-12-30 20:22       ` Eli Zaretskii
  1 sibling, 1 reply; 14+ messages in thread
From: Janusz S. Bień @ 2020-12-30 17:05 UTC (permalink / raw)
  To: Stephen Eglen; +Cc: Skip Montanaro, Help GNU Emacs

On Wed, Dec 30 2020 at  9:51 +00, Stephen Eglen wrote:
> Thanks Drew, and to the others for replying.  (Apologies I can't reply
> to individual messages that were not CC'ed to me, as I'm not on the
> mailing list yet.)
>
> To answer Eli's suggestions:
>
> 1. I don't need a latin font, I need a unicode font for handling
> mathematical expressions like x bar (for the mean of x) as I said in my
> original email.
>
> 2. I think the juliamono font does contain glyphs for both "a" and the
> "overlinecomb".  I asked the font author:
> https://github.com/cormullion/juliamono/issues/87

Are you on Linux?

A useful tool is fc-search-codepoint:

https://unix.stackexchange.com/questions/162305/find-the-best-font-for-rendering-a-codepoint

Regards - Janusz

-- 
             ,   
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien




* Re: Help with unicode diacritics
  2020-12-30 17:05       ` Janusz S. Bień
@ 2020-12-30 17:50         ` Stephen Eglen
  0 siblings, 0 replies; 14+ messages in thread
From: Stephen Eglen @ 2020-12-30 17:50 UTC (permalink / raw)
  To: jsbien; +Cc: Skip Montanaro, Help GNU Emacs, Stephen Eglen


> Are you on Linux?
>
> A useful tool is fc-search-codepoint:
>
> https://unix.stackexchange.com/questions/162305/find-the-best-font-for-rendering-a-codepoint

Thanks - yes, I'm on Manjaro.  That is helpful.  In the meantime, I've
submitted this as a bug to Emacs following useful feedback from the
HarfBuzz team:

https://github.com/harfbuzz/harfbuzz/discussions/2790
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=45557

Best wishes, Stephen




* Re: Help with unicode diacritics
  2020-12-30  9:51     ` Help with unicode diacritics Stephen Eglen
  2020-12-30 17:05       ` Janusz S. Bień
@ 2020-12-30 20:22       ` Eli Zaretskii
  1 sibling, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2020-12-30 20:22 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Stephen Eglen <sje30@cam.ac.uk>
> Date: Wed, 30 Dec 2020 09:51:43 +0000
> Cc: Skip Montanaro <skip.montanaro@gmail.com>,
>  Help GNU Emacs <help-gnu-emacs@gnu.org>, Stephen Eglen <sje30@cam.ac.uk>
> 
> 1. I don't need a latin font, I need a unicode font for handling
> mathematical expressions like x bar (for the mean of x) as I said in my
> original email.

That might be so, but the characters and the diacritics you mentioned
are all from the Latin blocks, so a font with good Latin coverage
should do this particular job for you.



