font selection mechanism (e.g., Japanese)

* font selection mechanism (e.g., Japanese)
From: David Reitter @ 2010-01-29 15:14 UTC
  To: emacs-devel@gnu.org discussions

I'm trying to understand the font selection system in order to fix
problems, e.g., with the display of `han' characters in Japanese.  I'd
appreciate your help if you're knowledgeable about font.c.

A good example of font selection problems in Emacs 23 is Japanese,
which mixes characters from different scripts.  For example, we have
Latin characters, Chinese characters (kanji, script: `han'), and other
scripts (hiragana, katakana).
I have four issues.

1. Selecting a better `han' font.

Finding a font to display 'han' characters is difficult for the  
current algorithm.  What is needed is a font that is similar to the  
context font (the font chosen by the user for the face).  In Emacs 23  
(at least on the NS port with my set of fonts), han characters look  
very different in weight when combined with fonts common on my system,  
e.g. Monaco or Lucida Grande.
See a user's complaint:  http://lists.aquamacs.org/pipermail/aquamacs-devel/2009-August/002271.html

The reason for this is that font selection prefers fonts with high
coverage of the script; it finds some fonts on my system that have
high (90%) coverage and then chooses among them, even though other
fonts would have sufficient coverage and look much better.  `Han' is
obviously a pretty big set of characters, so only a few fonts cover
that many characters.  At the same time, only a small portion of
these is commonly used (in Japanese at least), from what I understand.

Reducing the threshold for script coverage, e.g., in ns_findfonts for
the NS port, addresses that: the `list' function of the font driver
then returns a much bigger set of fonts, so that font_select_entity()
can do its job.
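
To illustrate the effect of the threshold, here is a toy sketch in C.
It is not code from font.c or nsfont.m: struct candidate, pick_similar
and the numeric weight scale are all hypothetical, and similarity is
reduced to weight alone.  It only shows how a coverage filter applied
before the similarity comparison can exclude the better-looking font.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical candidate record; not an Emacs data structure.  */
struct candidate
{
  const char *family;
  double coverage;   /* fraction of the script covered, 0..1 */
  int weight;        /* arbitrary numeric weight scale (assumption) */
};

/* Return the index of the candidate whose weight is closest to
   CONTEXT_WEIGHT among those with coverage >= THRESHOLD, or -1 if
   none qualifies.  Ties go to the earlier candidate.  */
int
pick_similar (const struct candidate *c, int n,
              double threshold, int context_weight)
{
  int best = -1;
  int best_diff = 0;

  assert (threshold >= 0.0 && threshold <= 1.0);
  for (int i = 0; i < n; i++)
    {
      if (c[i].coverage < threshold)
        continue;   /* dropped before similarity is ever considered */
      int diff = abs (c[i].weight - context_weight);
      if (best < 0 || diff < best_diff)
        {
          best = i;
          best_diff = diff;
        }
    }
  return best;
}
```

With a heavy font at 0.95 coverage and a lighter one at 0.60, a
threshold of 0.90 leaves only the heavy font to choose from, while a
threshold of 0.50 lets the lighter font win for a light context font.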

The problem I have now is to get it to choose different fonts within  
the same script in cases where a low-coverage font does not provide a  
glyph.  The above threshold change makes things work better in  
practice, but the HELLO file shows serious regressions.

Where in the code would one get it to choose a different font for a  
character if the current font can't display it?  This is by-character  
selection, not by-script.

2. face-font-family-alternatives : broken?

face-font-family-alternatives does not work at all for me.  In
font_find_for_lface(), "val" seems to be empty; printing SDATA
(attrs[LFACE_FAMILY_INDEX]) shows something better, like
"Lucida_Grande".  But that's not what the alist is queried for.

3. font driver specific matching

As an observation: the matching provided by the font driver (as a
backup to listing the entities) is usually not called, at least in the
NS port.  This is because font_find_for_lface() usually seems to widen
the search so much (pretty much looking for all fonts) that the
matching algorithm never gets a chance.

4. Searching for fonts by foundry.

The patch below makes the selection algorithm a little more sensible
(fonts of the same foundry rarely have much in common graphically).
But it doesn't address the problem.

commit d858f9ea2f60b37aa6f44b3d824cbaf0f1f867ae
Author: David Reitter <david.reitter@gmail.com>
Date:   Fri Jan 29 00:31:17 2010 -0500

     font_find_for_lface: do not try to find face by Foundry (Author) name only
     This is not sensible.

diff --git a/src/font.c b/src/font.c
index 557f1fb..b578c04 100644
--- a/src/font.c
+++ b/src/font.c
@@ -3451,6 +3451,10 @@ font_find_for_lface (f, attrs, spec, c)
        ASET (work, FONT_FAMILY_INDEX, family[i]);
        for (j = 0; SYMBOLP (foundry[j]); j++)
         {
+         if (NILP (family[i]) && ! NILP (foundry[j]))
+           /* do not look for "some foundry, any family".
+              That doesn't tend to yield similar fonts. */
+           continue;
           ASET (work, FONT_FOUNDRY_INDEX, foundry[j]);
           for (k = 0; SYMBOLP (registry[k]); k++)
             {

5. Fontsets are not the solution.

I'm looking for an automatic procedure.


* Re: font selection mechanism (e.g., Japanese)
From: Kenichi Handa @ 2010-02-02 11:51 UTC
  To: David Reitter; +Cc: adrian.b.robert, emacs-devel

In article <9775B394-9A01-4730-A4AE-98949DCA54DF@gmail.com>, David Reitter <david.reitter@gmail.com> writes:

> I'm trying to understand the font selection system in order to fix
> problems, e.g., with the display of `han' characters in Japanese.  I'd
> appreciate your help if you're knowledgeable about font.c.

I know about font.c, but know nothing about the NS port.

> A good example of font selection problems in Emacs 23 is Japanese,
> which mixes characters from different scripts.  For example, we have
> Latin characters, Chinese characters (kanji, script: `han'), and other
> scripts (hiragana, katakana).
> I have four issues.

> 1. Selecting a better `han' font.

> Finding a font to display 'han' characters is difficult for the  
> current algorithm.  What is needed is a font that is similar to the  
> context font (the font chosen by the user for the face).  In Emacs 23  
> (at least on the NS port with my set of fonts), han characters look  
> very different in weight when combined with fonts common on my system,  
> e.g. Monaco or Lucida Grande.
> See a user's complaint:  http://lists.aquamacs.org/pipermail/aquamacs-devel/2009-August/002271.html

> The reason for this is that font selection prefers fonts with high
> coverage of the script; it finds some fonts on my system that have
> high (90%) coverage and then chooses among them, even though other
> fonts would have sufficient coverage and look much better.  `Han' is
> obviously a pretty big set of characters, so only a few fonts cover
> that many characters.  At the same time, only a small portion of
> these is commonly used (in Japanese at least), from what I understand.

> Reducing the threshold for script coverage, e.g., in ns_findfonts for
> the NS port, addresses that: the `list' function of the font driver
> then returns a much bigger set of fonts, so that font_select_entity()
> can do its job.

Hmmm, then NS's font-backend should be improved.  Could
someone working on the NS port please take a look at this
problem?

> The problem I have now is to get it to choose different fonts within  
> the same script in cases where a low-coverage font does not provide a  
> glyph.  The above threshold change makes things work better in  
> practice, but the HELLO file shows serious regressions.

Regression in which part?  Only for Han scripts, or for all scripts?

> Where in the code would one get it to choose a different font for a  
> character if the current font can't display it?  This is by-character  
> selection, not by-script.

The function fontset_find_font in fontset.c does that job.
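
As a rough illustration of by-character selection in general, here is
a toy sketch in C.  It does not reproduce fontset_find_font; toy_font,
has_glyph and find_font_for_char are hypothetical names, and each font
is assumed to cover a single contiguous range of code points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical font record: covers code points in [first, last].  */
struct toy_font
{
  const char *name;
  unsigned first, last;
};

static int
has_glyph (const struct toy_font *f, unsigned c)
{
  return c >= f->first && c <= f->last;
}

/* Return the index of the first font in FONTS (given in priority
   order) that has a glyph for C, or -1 if no font has one.  */
int
find_font_for_char (const struct toy_font *fonts, int n, unsigned c)
{
  assert (fonts != NULL || n == 0);
  for (int i = 0; i < n; i++)
    if (has_glyph (&fonts[i], c))
      return i;
  return -1;
}
```

The lookup runs once per character, so a low-coverage font at the head
of the list is used wherever it has a glyph and a later font fills in
the rest.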

> 2. face-font-family-alternatives : broken?

> face-font-family-alternatives does not work at all for me.  In
> font_find_for_lface(), "val" seems to be empty; printing SDATA
> (attrs[LFACE_FAMILY_INDEX]) shows something better, like
> "Lucida_Grande".  But that's not what the alist is queried for.

Do you mean this code in font_find_for_lface?

  if (NILP (val) && STRINGP (attrs[LFACE_FAMILY_INDEX]))
    {
      val = attrs[LFACE_FAMILY_INDEX];
      val = font_intern_prop ((char *) SDATA (val), SBYTES (val), 1);
    }

If val is not correctly set, perhaps you compiled Emacs with
some optimization.  Please try recompiling Emacs with
something like the following:

% make CFLAGS=-g clean all

> 3. font driver specific matching

> As an observation: the matching provided by the font driver (as a
> backup to listing the entities) is usually not called, at least in the
> NS port.  This is because font_find_for_lface() usually seems to widen
> the search so much (pretty much looking for all fonts) that the
> matching algorithm never gets a chance.

In font selection, specs in a fontset are mandatory, but specs
from face attributes are just preferences.  So
font_find_for_lface widens the search by setting only the
preferred specs to nil, one by one.
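
The widening can be pictured with the following toy sketch in C.  It is
not the code in font.c: toy_spec, the three fields, and the relaxation
order passed in RELAX are assumptions made for illustration only.
Mandatory fields are simply never listed in RELAX.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { F_REGISTRY, F_FAMILY, F_FOUNDRY, F_COUNT };

/* In a search spec, NULL means "any".  Fonts must set every field.  */
struct toy_spec { const char *field[F_COUNT]; };

static int
matches (const struct toy_spec *spec, const struct toy_spec *font)
{
  for (int i = 0; i < F_COUNT; i++)
    if (spec->field[i] && strcmp (spec->field[i], font->field[i]) != 0)
      return 0;
  return 1;
}

static int
list_first (const struct toy_spec *spec,
            const struct toy_spec *fonts, int n)
{
  for (int i = 0; i < n; i++)
    if (matches (spec, &fonts[i]))
      return i;
  return -1;
}

/* Try SPEC as given; on failure clear the preferred fields named in
   RELAX one at a time, retrying after each.  Fields not named in
   RELAX are never cleared.  Return a font index or -1.  */
int
find_with_relaxation (struct toy_spec spec,
                      const struct toy_spec *fonts, int n,
                      const int *relax, int n_relax)
{
  int found = list_first (&spec, fonts, n);
  for (int r = 0; found < 0 && r < n_relax; r++)
    {
      assert (relax[r] >= 0 && relax[r] < F_COUNT);
      spec.field[relax[r]] = NULL;
      found = list_first (&spec, fonts, n);
    }
  return found;
}
```

Once every preferred field has been cleared, only the mandatory part
of the spec still restricts the result, which is consistent with the
observation that the search ends up looking at nearly all fonts.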

By the way, I have code locally that respects the order of
fonts returned by font_driver->list () in font sorting.  I'm
going to commit it to the post-23.2 branch.  Then each driver
can return fonts in its preferred order.

> 4. Searching for fonts by foundry.

> The patch below makes the selection algorithm a little more sensible
> (fonts of the same foundry rarely have much in common graphically).
> But it doesn't address the problem.

Thank you for the patch.  It looks good.  I'll adopt it for
the post-23.2 branch.

> commit d858f9ea2f60b37aa6f44b3d824cbaf0f1f867ae
> Author: David Reitter <david.reitter@gmail.com>
> Date:   Fri Jan 29 00:31:17 2010 -0500

>      font_find_for_lface: do not try to find face by Foundry (Author) name only
>      This is not sensible.

> diff --git a/src/font.c b/src/font.c
> index 557f1fb..b578c04 100644
> --- a/src/font.c
> +++ b/src/font.c
> @@ -3451,6 +3451,10 @@ font_find_for_lface (f, attrs, spec, c)
>         ASET (work, FONT_FAMILY_INDEX, family[i]);
>         for (j = 0; SYMBOLP (foundry[j]); j++)
>          {
> +         if (NILP (family[i]) && ! NILP (foundry[j]))
> +           /* do not look for "some foundry, any family".
> +              That doesn't tend to yield similar fonts. */
> +           continue;
>            ASET (work, FONT_FOUNDRY_INDEX, foundry[j]);
>            for (k = 0; SYMBOLP (registry[k]); k++)
>              {

---
Kenichi Handa
handa@m17n.org






* Re: font selection mechanism (e.g., Japanese)
From: David Reitter @ 2010-02-02 14:36 UTC
  To: Kenichi Handa; +Cc: adrian.b.robert, emacs-devel

On Feb 2, 2010, at 6:51 AM, Kenichi Handa wrote:

> Hmmm, then NS's font-backend should be improved.  Could
> someone working on the NS port please take a look at this
> problem?

FWIW, I have experimented with other variants of ns_findfonts() (for "match", not for "list"), but ultimately found that it's mostly "list" that is called while the code in font.c iteratively reduces the constraints.

I also found one font that is returned with the first batch ("Osaka"), which has sufficient (.95) coverage for the 'han' script.  However, it is still not chosen, and I assume that this is because there is not sufficient information present about the font weight.  That's the NS font driver's fault.  I have filed a bug report about this; of course I've tried to provide the weight information (with odd results - not sure why).   Similarly, we don't get the "ADSTYLE" information about fonts either, so that the alternative font can't be chosen according to that either.   Someone with better knowledge of nsfont.m might be able to debug this.

>> The problem I have now is to get it to choose different fonts within  
>> the same script in cases where a low-coverage font does not provide a  
>> glyph.  The above threshold change makes things work better in  
>> practice, but the HELLO file shows serious regressions.
> 
> Regression in which part?  Only for Han scripts, or for all scripts?

For more than han.  What I have since noticed is that we seem unable or unwilling to select a font foo for a given script and then, when foo lacks glyphs for some of the needed characters, switch to another font bar for the same script for just those characters.  Is that correct?

>> Where in the code would one get it to choose a different font for a  
>> character if the current font can't display it?  This is by-character  
>> selection, not by-script.
> 
> The function fontset_find_font in fontset.c does that job.

OK, I'll look into that.  I assume it does that automatically, i.e. without explicit specifications in a fontset?
(All I'm talking about here is automatic selection.  It's clear that a fontset can be constructed manually.)

> By the way, I have code locally that respects the order of
> fonts returned by font_driver->list () in font sorting.  I'm
> going to commit it to the post-23.2 branch.  Then each driver
> can return fonts in its preferred order.

But then the driver also needs an argument such as a full font spec for the target font, so that the order can be determined by similarity.  Right now, the only information it has is a set of hard constraints.  
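
To make the point concrete, here is a toy sketch in C of ordering by
similarity to a target.  Nothing here is an existing driver interface:
toy_entity, distance, sort_by_similarity and the factors 4, 2 and 1 are
hypothetical.  The sketch only shows that the sort needs the target's
style values as input, which hard constraints alone do not supply.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical font entity with arbitrary numeric style values.  */
struct toy_entity
{
  const char *family;
  int weight, slant, width;
};

/* Smaller means more similar.  Counting weight more than slant and
   width is an assumption made for this sketch.  */
static int
distance (const struct toy_entity *a, const struct toy_entity *target)
{
  return 4 * abs (a->weight - target->weight)
         + 2 * abs (a->slant - target->slant)
         + abs (a->width - target->width);
}

/* Stable insertion sort of E[0..N-1] by distance to TARGET.  */
void
sort_by_similarity (struct toy_entity *e, int n,
                    const struct toy_entity *target)
{
  assert (n >= 0);
  for (int i = 1; i < n; i++)
    {
      struct toy_entity cur = e[i];
      int d = distance (&cur, target);
      int j = i;
      while (j > 0 && distance (&e[j - 1], target) > d)
        {
          e[j] = e[j - 1];
          j--;
        }
      e[j] = cur;
    }
}
```

A driver that received such a target could return its list already in
this order; without it, any order it picks is independent of the face
being displayed.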

- D
