unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#40097: 28.0.50; Preferred font ignored for specific charset
@ 2020-03-17  4:31 Sergey Organov
  2020-03-17 15:24 ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Sergey Organov @ 2020-03-17  4:31 UTC (permalink / raw)
  To: 40097


Hello,

[Note: this was originally observed in GNU Emacs 26.1 running on
Debian Buster GNU/Linux, and then reproduced on the latest Emacs
snapshot, 28.0.50.]

When text carries a particular charset property, Emacs chooses
to render it using a font that has a matching encoding, such as:

x:-xos4-terminus-medium-r-normal--16-160-72-72-c-80-microsoft-cp1251 (#xEF)

for the windows-1251 charset, rather than the default font:

ftcrhb:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-15-*-*-*-m-0-iso10646-1

even though the default (Unicode) font does support the corresponding
characters.

This behavior results in a rather unpleasant mixture of fonts.

To reproduce this starting with emacs -Q, evaluate this form:

(let ((buf (get-buffer-create "test encodings")))
  (with-current-buffer buf
    (erase-buffer)
    (insert "Encoding windows-1251: "
            (propertize "привет\n" 'charset 'windows-1251))
    (insert "Encoding      unicode: "
            (propertize "привет\n" 'charset 'unicode)))
  (switch-to-buffer-other-window buf))

Be warned that this is known not to be reproducible on at least some
systems. A snapshot of the Emacs window showing what I see in "emacs -Q"
is attached. Notice how different the two strings look where the
encodings differ.

Here is the output of C-u C-x = on each of the two differing texts:

--- >8 ---
             position: 24 of 60 (38%), column: 23
            character: п (displayed as п) (codepoint 1087, #o2077, #x43f)
              charset: windows-1251 (WINDOWS-1251 (Cyrillic))
code point in charset: 0xEF
               script: cyrillic
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), Y:2-byte Cyrillic, c:Chinese, h:Korean, j:Japanese, y:Cyrillic
             to input: type "C-x 8 RET 43f" or "C-x 8 RET CYRILLIC SMALL LETTER PE"
          buffer code: #xD0 #xBF
            file code: #xD0 #xBF (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    x:-xos4-terminus-medium-r-normal--16-160-72-72-c-80-microsoft-cp1251 (#xEF)

Character code properties: customize what to show
  name: CYRILLIC SMALL LETTER PE
  general-category: Ll (Letter, Lowercase)
  decomposition: (1087) ('п')

There are text properties here:
  charset              windows-1251
--- >8 ---
             position: 54 of 60 (88%), column: 23
            character: п (displayed as п) (codepoint 1087, #o2077, #x43f)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0x043F
               script: cyrillic
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), Y:2-byte Cyrillic, c:Chinese, h:Korean, j:Japanese, y:Cyrillic
             to input: type "C-x 8 RET 43f" or "C-x 8 RET CYRILLIC SMALL LETTER PE"
          buffer code: #xD0 #xBF
            file code: #xD0 #xBF (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    ftcrhb:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-15-*-*-*-m-0-iso10646-1 (#x37E)

Character code properties: customize what to show
  name: CYRILLIC SMALL LETTER PE
  general-category: Ll (Letter, Lowercase)
  decomposition: (1087) ('п')

There are text properties here:
  charset              unicode
--- >8 ---

For reference, here is a link to the original report/discussion:

https://lists.gnu.org/archive/html/help-gnu-emacs/2020-03/msg00049.html


In GNU Emacs 28.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.5, cairo version 1.16.0)
 of 2020-03-08, unofficial emacs-snapshot build: http://emacs.ganneff.de/, git commit 0a3f8da6e1a56ada409cf1677ac40fcc75a8a33c built on runner-19980c3f-project-26-concurrent-0
Repository revision: d01cf197911a365e4422a5561a0cd77fed4d8fc3
Repository branch: HEAD
Windowing system distributor 'The X.Org Foundation', version 11.0.12004000
System Description: Debian GNU/Linux 10 (buster)

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Defining kbd macro...
Keyboard macro defined
completion--do-completion: Keyboard macro terminated by a command ringing the bell
Quit
#<buffer test encodings>
Configured using:
 'configure --build x86_64-linux-gnu --prefix=/usr
 --sharedstatedir=/var/lib --libexecdir=/usr/lib
 --localstatedir=/var/lib --infodir=/usr/share/info
 --mandir=/usr/share/man --with-pop=yes
 --enable-locallisppath=/etc/emacs-snapshot:/etc/emacs:/usr/local/share/emacs/28.0.50/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/28.0.50/site-lisp:/usr/share/emacs/site-lisp
 --build x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib
 --libexecdir=/usr/lib --localstatedir=/var/lib
 --infodir=/usr/share/info --mandir=/usr/share/man --with-pop=yes
 --enable-locallisppath=/etc/emacs-snapshot:/etc/emacs:/usr/local/share/emacs/28.0.50/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/28.0.50/site-lisp:/usr/share/emacs/site-lisp
 --with-x=yes --with-x-toolkit=gtk3 --with-toolkit-scroll-bars
 'CFLAGS=-g -O2
 -fdebug-prefix-map=/builds/joerg/emacs/buster_amd64/emacs-snapshot-20200308+emacs-27.0.90-434-g0a3f8da6e1=. -fstack-protector-strong
 -Wformat -Werror=format-security -Wall -fno-omit-frame-pointer'
 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' LDFLAGS=-Wl,-z,relro'

Configured features:
XPM JPEG TIFF GIF PNG RSVG CAIRO SOUND GPM DBUS GSETTINGS GLIB NOTIFY
INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE HARFBUZZ M17N_FLT LIBOTF
ZLIB TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM MODULES THREADS PDUMPER LCMS2
GMP

Important settings:
  value of $LC_MONETARY: en_US.UTF-8
  value of $LC_NUMERIC: en_US.UTF-8
  value of $LC_TIME: en_US.UTF-8
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Fundamental

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs
format-spec rfc822 mml easymenu mml-sec password-cache epa derived epg
epg-config gnus-util rmail rmail-loaddefs text-property-search time-date
subr-x seq byte-opt gv bytecomp byte-compile cconv mm-decode mm-bodies
mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail
rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils kmacro
cl-loaddefs cl-lib tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel term/x-win x-win term/common-win x-dnd tool-bar
dnd fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode elisp-mode lisp-mode prog-mode register page tab-bar menu-bar
rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock
syntax facemenu font-core term/tty-colors frame minibuffer cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote threads dbusbind
inotify lcms2 dynamic-setting system-font-setting font-render-setting
cairo move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 45267 8851)
 (symbols 48 6071 1)
 (strings 32 15735 1935)
 (string-bytes 1 511300)
 (vectors 16 9543)
 (vector-slots 8 130234 10600)
 (floats 8 25 44)
 (intervals 56 261 6)
 (buffers 1000 13))


[-- Attachment #2: Emacs fonts test --]
[-- Type: image/png, Size: 69197 bytes --]


* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-17  4:31 bug#40097: 28.0.50; Preferred font ignored for specific charset Sergey Organov
@ 2020-03-17 15:24 ` Eli Zaretskii
  2020-03-17 16:12   ` Sergey Organov
  0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2020-03-17 15:24 UTC (permalink / raw)
  To: Sergey Organov; +Cc: 40097

> From: Sergey Organov <sorganov@gmail.com>
> Date: Tue, 17 Mar 2020 07:31:02 +0300
> 
> When there is some particular charset property on text, Emacs chooses
> to render it using font that has matching encoding, such as:
> 
> x:-xos4-terminus-medium-r-normal--16-160-72-72-c-80-microsoft-cp1251 (#xEF)
> 
> for windows-1251 charset rather than the default font:
> 
> ftcrhb:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-15-*-*-*-m-0-iso10646-1
> 
> even though the default (unicode) font does support corresponding
> characters.

Can you show all the fonts on your system that have microsoft-cp1251
as their registry/encoding?  AFAIK, this is done with xlsfonts.






* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-17 15:24 ` Eli Zaretskii
@ 2020-03-17 16:12   ` Sergey Organov
  2020-03-17 16:35     ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Sergey Organov @ 2020-03-17 16:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 40097

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Sergey Organov <sorganov@gmail.com>
>> Date: Tue, 17 Mar 2020 07:31:02 +0300
>> 
>> When there is some particular charset property on text, Emacs chooses
>> to render it using font that has matching encoding, such as:
>> 
>> x:-xos4-terminus-medium-r-normal--16-160-72-72-c-80-microsoft-cp1251 (#xEF)
>> 
>> for windows-1251 charset rather than the default font:
>> 
>> ftcrhb:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-15-*-*-*-m-0-iso10646-1
>> 
>> even though the default (unicode) font does support corresponding
>> characters.
>
> Can you show all the fonts on your system that have microsoft-cp1251
> as their registry/encoding?  AFAIK, this is done with xlsfonts.

Sure:

$ xlsfonts | grep cp1251
-xos4-terminus-bold-o-normal--0-0-72-72-c-0-microsoft-cp1251
-xos4-terminus-bold-o-normal--12-120-72-72-c-60-microsoft-cp1251
-xos4-terminus-bold-o-normal--14-140-72-72-c-80-microsoft-cp1251
-xos4-terminus-bold-o-normal--16-160-72-72-c-80-microsoft-cp1251
-xos4-terminus-bold-o-normal--18-180-72-72-c-100-microsoft-cp1251
-xos4-terminus-bold-o-normal--20-200-72-72-c-100-microsoft-cp1251
-xos4-terminus-bold-o-normal--22-220-72-72-c-110-microsoft-cp1251
-xos4-terminus-bold-o-normal--24-240-72-72-c-120-microsoft-cp1251
-xos4-terminus-bold-o-normal--28-280-72-72-c-140-microsoft-cp1251
-xos4-terminus-bold-o-normal--32-320-72-72-c-160-microsoft-cp1251
-xos4-terminus-bold-r-normal--0-0-72-72-c-0-microsoft-cp1251
-xos4-terminus-bold-r-normal--12-120-72-72-c-60-microsoft-cp1251
-xos4-terminus-bold-r-normal--14-140-72-72-c-80-microsoft-cp1251
-xos4-terminus-bold-r-normal--16-160-72-72-c-80-microsoft-cp1251
-xos4-terminus-bold-r-normal--18-180-72-72-c-100-microsoft-cp1251
-xos4-terminus-bold-r-normal--20-200-72-72-c-100-microsoft-cp1251
-xos4-terminus-bold-r-normal--22-220-72-72-c-110-microsoft-cp1251
-xos4-terminus-bold-r-normal--24-240-72-72-c-120-microsoft-cp1251
-xos4-terminus-bold-r-normal--28-280-72-72-c-140-microsoft-cp1251
-xos4-terminus-bold-r-normal--32-320-72-72-c-160-microsoft-cp1251
-xos4-terminus-medium-o-normal--0-0-72-72-c-0-microsoft-cp1251
-xos4-terminus-medium-o-normal--12-120-72-72-c-60-microsoft-cp1251
-xos4-terminus-medium-o-normal--14-140-72-72-c-80-microsoft-cp1251
-xos4-terminus-medium-o-normal--16-160-72-72-c-80-microsoft-cp1251
-xos4-terminus-medium-o-normal--18-180-72-72-c-100-microsoft-cp1251
-xos4-terminus-medium-o-normal--20-200-72-72-c-100-microsoft-cp1251
-xos4-terminus-medium-o-normal--22-220-72-72-c-110-microsoft-cp1251
-xos4-terminus-medium-o-normal--24-240-72-72-c-120-microsoft-cp1251
-xos4-terminus-medium-o-normal--28-280-72-72-c-140-microsoft-cp1251
-xos4-terminus-medium-o-normal--32-320-72-72-c-160-microsoft-cp1251
-xos4-terminus-medium-r-normal--0-0-72-72-c-0-microsoft-cp1251
-xos4-terminus-medium-r-normal--12-120-72-72-c-60-microsoft-cp1251
-xos4-terminus-medium-r-normal--14-140-72-72-c-80-microsoft-cp1251
-xos4-terminus-medium-r-normal--16-160-72-72-c-80-microsoft-cp1251
-xos4-terminus-medium-r-normal--18-180-72-72-c-100-microsoft-cp1251
-xos4-terminus-medium-r-normal--20-200-72-72-c-100-microsoft-cp1251
-xos4-terminus-medium-r-normal--22-220-72-72-c-110-microsoft-cp1251
-xos4-terminus-medium-r-normal--24-240-72-72-c-120-microsoft-cp1251
-xos4-terminus-medium-r-normal--28-280-72-72-c-140-microsoft-cp1251
-xos4-terminus-medium-r-normal--32-320-72-72-c-160-microsoft-cp1251
terminus-cp1251-12
terminus-cp1251-14
terminus-cp1251-16
terminus-cp1251-18
terminus-cp1251-20
terminus-cp1251-22
terminus-cp1251-24
terminus-cp1251-28
terminus-cp1251-32
terminus-cp1251-bold-12
terminus-cp1251-bold-14
terminus-cp1251-bold-16
terminus-cp1251-bold-18
terminus-cp1251-bold-20
terminus-cp1251-bold-22
terminus-cp1251-bold-24
terminus-cp1251-bold-28
terminus-cp1251-bold-32

-- Sergey






* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-17 16:12   ` Sergey Organov
@ 2020-03-17 16:35     ` Eli Zaretskii
  2020-03-18  4:43       ` Sergey Organov
  2020-03-18 11:07       ` Robert Pluim
  0 siblings, 2 replies; 15+ messages in thread
From: Eli Zaretskii @ 2020-03-17 16:35 UTC (permalink / raw)
  To: Sergey Organov, Kenichi Handa; +Cc: 40097

> From: Sergey Organov <sorganov@gmail.com>
> Cc: 40097@debbugs.gnu.org
> Date: Tue, 17 Mar 2020 19:12:33 +0300
> 
> > Can you show all the fonts on your system that have microsoft-cp1251
> > as their registry/encoding?  AFAIK, this is done with xlsfonts.
> 
> Sure:

So there's a single font, Terminus, which supports that charset.  I
think you can work around this problem locally by adding that font to
face-ignored-fonts.

We could perhaps introduce a customizable variable that would allow
users who want that to disable the preference of charset-supporting
fonts when the text has the 'charset' property.  CC'ing Handa-san, who
could comment on how important this feature is nowadays.






* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-17 16:35     ` Eli Zaretskii
@ 2020-03-18  4:43       ` Sergey Organov
  2020-03-18 14:35         ` Eli Zaretskii
  2020-03-18 11:07       ` Robert Pluim
  1 sibling, 1 reply; 15+ messages in thread
From: Sergey Organov @ 2020-03-18  4:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 40097

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Sergey Organov <sorganov@gmail.com>
>> Cc: 40097@debbugs.gnu.org
>> Date: Tue, 17 Mar 2020 19:12:33 +0300
>> 
>> > Can you show all the fonts on your system that have microsoft-cp1251
>> > as their registry/encoding?  AFAIK, this is done with xlsfonts.
>> 
>> Sure:
>
> So there's a single font, Terminus, which supports that charset.  I
> think you can work around this problem locally by adding that font to
> face-ignored-fonts.

Yeah:

(setq face-ignored-fonts '(".*-cp1251$"))

does the trick for me indeed, thanks!
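
To check which of the names in the xlsfonts listing this pattern
covers, here is a minimal sketch in C. It is only an approximation: it
uses POSIX regcomp/regexec rather than Emacs's own regexp engine, the
case-insensitive comparison is an assumption, and the helper name
is_ignored_font is hypothetical. For a pattern this simple the two
regexp syntaxes agree.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not an Emacs function: return true if the font
   NAME matches the regular expression PATTERN.  Matching ignores case,
   which is an assumption made for this sketch.  */
bool
is_ignored_font (const char *pattern, const char *name)
{
  regex_t re;
  bool matched;

  assert (pattern != NULL && name != NULL);

  /* REG_NOSUB: only a yes/no answer is needed, no match positions.  */
  if (regcomp (&re, pattern, REG_ICASE | REG_NOSUB) != 0)
    return false;               /* Treat an invalid pattern as no match.  */

  matched = regexec (&re, name, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

With the pattern ".*-cp1251$", the full XLFD names ending in
microsoft-cp1251 match, while the DejaVu Sans Mono name ending in
iso10646-1 does not. Short alias names such as terminus-cp1251-16 do
not end in -cp1251, so this sketch would not match them either.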

> We could perhaps introduce a customizable variable that would allow
> users who want that to disable the preference of charset-supporting
> fonts when the text has the 'charset' property.  CC'ing Handa-san who
> could comment on how important is this feature nowadays.

When I wrote the original question, I was sure I had read somewhere in
the docs that Emacs prefers fonts with a particular encoding, but I
lost track of it and can't find it anymore. I wonder whether it is even
documented, or whether I read it somewhere else, maybe in some relevant
discussion?

I mean, if it's undocumented and not important nowadays, maybe
it's indeed better to drop it rather than bother with customizations.
And if it is to be customizable, it should probably be a fontset feature
rather than a global one?

Thanks,
-- Sergey






* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-17 16:35     ` Eli Zaretskii
  2020-03-18  4:43       ` Sergey Organov
@ 2020-03-18 11:07       ` Robert Pluim
  2020-03-18 14:29         ` Eli Zaretskii
  1 sibling, 1 reply; 15+ messages in thread
From: Robert Pluim @ 2020-03-18 11:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Sergey Organov, 40097

>>>>> On Tue, 17 Mar 2020 18:35:05 +0200, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Sergey Organov <sorganov@gmail.com>
    >> Cc: 40097@debbugs.gnu.org
    >> Date: Tue, 17 Mar 2020 19:12:33 +0300
    >> 
    >> > Can you show all the fonts on your system that have microsoft-cp1251
    >> > as their registry/encoding?  AFAIK, this is done with xlsfonts.
    >> 
    >> Sure:

    Eli> So there's a single font, Terminus, which supports that charset.  I
    Eli> think you can work around this problem locally by adding that font to
    Eli> face-ignored-fonts.

    Eli> We could perhaps introduce a customizable variable that would allow
    Eli> users who want that to disable the preference of charset-supporting
    Eli> fonts when the text has the 'charset' property.  CC'ing Handa-san who
    Eli> could comment on how important is this feature nowadays.

Ah, now I see where this is coming from: I was looking down in font.c,
but this is a fontset.c feature.

Iʼm not sure how useful it is; I donʼt think fontconfig has any notion
of 'charset' beyond 'does this font support this Unicode character'.

Robert






* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-18 11:07       ` Robert Pluim
@ 2020-03-18 14:29         ` Eli Zaretskii
  2020-03-18 16:28           ` Robert Pluim
  0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2020-03-18 14:29 UTC (permalink / raw)
  To: Robert Pluim; +Cc: sorganov, 40097

> From: Robert Pluim <rpluim@gmail.com>
> Cc: Sergey Organov <sorganov@gmail.com>,  Kenichi Handa <handa@gnu.org>,
>   40097@debbugs.gnu.org
> Date: Wed, 18 Mar 2020 12:07:54 +0100
> 
> Ah, now I see where this is coming from: I was looking down in font.c,
> but this is a fontset.c feature.

Yes.

> Iʼm not sure how useful it is, I donʼt think fontconfig has any notion
> of 'charset' beyond 'does this font support this Unicode character'.

Indeed; and you will see in ftfont.c that ftfont_list (or, rather, one
of the subroutines it calls) concocts a "charset" by using the few
representative characters from fc_charset_table, and then asks
Fontconfig to list fonts which support those characters (at least
that's my reading of the code).  Which is why I'm puzzled how come
DejaVu Sans Mono is not in the list and we proceed to the xfont
backend, because I'm quite sure DejaVu Sans does support the Cyrillic
characters that represent windows-1251.  If you can tell what I'm
missing here, maybe we could make some progress even without changing
the design.
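
The idea described above, deciding that a font supports a charset when
it covers a few representative characters, can be sketched as follows.
This is a simplified model and not the code in ftfont.c: it does not
call Fontconfig, and struct coverage_range, covers_char and covers_all
are hypothetical names introduced only for illustration. The two code
points used in the note below (0x0401 and 0x0490) are the ones listed
for windows-1251 in fc_charset_table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a font's coverage: a list of inclusive
   ranges of Unicode code points.  */
struct coverage_range
{
  unsigned first;
  unsigned last;
};

/* Return true if code point C lies inside one of the N ranges.  */
bool
covers_char (const struct coverage_range *ranges, size_t n, unsigned c)
{
  assert (ranges != NULL);
  for (size_t i = 0; i < n; i++)
    if (ranges[i].first <= c && c <= ranges[i].last)
      return true;
  return false;
}

/* Return true if every one of the NCHARS representative characters
   is covered.  In this model, that is what "the font supports the
   charset" means.  */
bool
covers_all (const struct coverage_range *ranges, size_t n,
            const unsigned *chars, size_t nchars)
{
  assert (chars != NULL);
  for (size_t i = 0; i < nchars; i++)
    if (!covers_char (ranges, n, chars[i]))
      return false;
  return true;
}
```

In this model a font covering the whole Cyrillic block
(0x0400..0x04FF) passes the test for 0x0401 and 0x0490, while a font
covering only 0x0400..0x045F fails it because 0x0490 is missing.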






* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-18  4:43       ` Sergey Organov
@ 2020-03-18 14:35         ` Eli Zaretskii
  2020-03-18 15:10           ` Sergey Organov
  0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2020-03-18 14:35 UTC (permalink / raw)
  To: Sergey Organov; +Cc: 40097

> From: Sergey Organov <sorganov@gmail.com>
> Cc: Kenichi Handa <handa@gnu.org>,  40097@debbugs.gnu.org
> Date: Wed, 18 Mar 2020 07:43:39 +0300
> 
> > We could perhaps introduce a customizable variable that would allow
> > users who want that to disable the preference of charset-supporting
> > fonts when the text has the 'charset' property.  CC'ing Handa-san who
> > could comment on how important is this feature nowadays.
> 
> When I wrote original question, I was sure I've read somewhere in the
> docs that Emacs does prefer fonts with particular encoding, and then I
> missed it and can't find it anymore. I wonder if it is even documented,
> or did I read it somewhere else, maybe in some relevant discussion?
> 
> I mean if it's even undocumented and is not important nowadays, maybe
> it's indeed better to drop it rather than bother with customizations.
> And if it is to be customizable, it should probably be a fontset feature
> rather than global?

The customizable option indeed only makes sense if the feature is
still useful to some users in some use cases; otherwise we should just
remove this.

The fact that this is or isn't documented has no importance: we don't
document the internal implementation details, but we keep them as long
as they do what users expect and like.






* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-18 14:35         ` Eli Zaretskii
@ 2020-03-18 15:10           ` Sergey Organov
  0 siblings, 0 replies; 15+ messages in thread
From: Sergey Organov @ 2020-03-18 15:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 40097

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Sergey Organov <sorganov@gmail.com>
>> Cc: Kenichi Handa <handa@gnu.org>,  40097@debbugs.gnu.org
>> Date: Wed, 18 Mar 2020 07:43:39 +0300
>> 
>> > We could perhaps introduce a customizable variable that would allow
>> > users who want that to disable the preference of charset-supporting
>> > fonts when the text has the 'charset' property.  CC'ing Handa-san who
>> > could comment on how important is this feature nowadays.
>> 
>> When I wrote original question, I was sure I've read somewhere in the
>> docs that Emacs does prefer fonts with particular encoding, and then I
>> missed it and can't find it anymore. I wonder if it is even documented,
>> or did I read it somewhere else, maybe in some relevant discussion?
>> 
>> I mean if it's even undocumented and is not important nowadays, maybe
>> it's indeed better to drop it rather than bother with customizations.
>> And if it is to be customizable, it should probably be a fontset feature
>> rather than global?
>
> The customizable option indeed only makes sense if the feature is
> still useful to some users in some use cases; otherwise we should just
> remove this.
>
> The fact that this is or isn't documented has no importance: we don't
> document the internal implementation details, but we keep them as long
> as they do what users expect and like.

IMHO, in general, when a thing is documented, its removal must receive
more thought. Documentation is just another factor in favor of future
support. Not much, but still... And this one apparently doesn't have it.

-- Sergey






* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-18 14:29         ` Eli Zaretskii
@ 2020-03-18 16:28           ` Robert Pluim
  2020-03-18 18:18             ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Robert Pluim @ 2020-03-18 16:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sorganov, 40097

>>>>> On Wed, 18 Mar 2020 16:29:23 +0200, Eli Zaretskii <eliz@gnu.org> said:

    Eli> Indeed; and you will see in ftfont.c that ftfont_list (or, rather, one
    Eli> of the subroutines it calls) concocts a "charset" by using the few
    Eli> representative characters from fc_charset_table, and then asks
    Eli> Fontconfig to list fonts which support those characters (at least
    Eli> that's my reading of the code).  Which is why I'm puzzled how come
    Eli> DejaVu Sans Mono is not in the list and we proceed to the xfont
    Eli> backend, because I'm quite sure DejaVu Sans does support the Cyrillic
    Eli> characters that represent windows-1251.  If you can tell what I'm
    Eli> missing here, maybe we could make some progress even without changing
    Eli> the design.

ftfont.c:

    { "windows-1251", { 0x0401, 0x0490 }, "ru"},

Thread 1 "emacs" hit Breakpoint 3, ftfont_get_charset (registry=XIL(0x394540)) at ftfont.c:486
486	  for (i = j = 0; i < SBYTES (SYMBOL_NAME (registry)); i++, j++)
(gdb) pp registry
microsoft-cp1251
(gdb) 

So correcting the name of the registry in ftfont.c fixes this. Thanks
for the hint, Eli.
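
Why the lookup misses can be shown with a small stand-alone sketch. It
assumes a plain exact-name comparison, which is a simplification: the
real ftfont_get_charset does more work on the registry string than
this. struct charset_entry, old_table, fixed_table and find_charset are
hypothetical names; only the table entries themselves are taken from
ftfont.c as quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for an fc_charset_table entry: registry name,
   representative characters, and language tag.  */
struct charset_entry
{
  const char *name;
  unsigned chars[3];
  const char *lang;
};

/* The table before the fix: the entry is named "windows-1251".  */
const struct charset_entry old_table[] = {
  { "windows-1251", { 0x0401, 0x0490 }, "ru" },
  { "koi8-r", { 0x0401, 0x2219 }, "ru" },
};

/* The table after the fix: the entry uses the registry name that is
   actually passed in, "microsoft-cp1251".  */
const struct charset_entry fixed_table[] = {
  { "microsoft-cp1251", { 0x0401, 0x0490 }, "ru" },
  { "koi8-r", { 0x0401, 0x2219 }, "ru" },
};

/* Return the entry whose name equals REGISTRY, or NULL if none does.
   Exact comparison is an assumption made for this sketch.  */
const struct charset_entry *
find_charset (const struct charset_entry *table, size_t n,
              const char *registry)
{
  assert (table != NULL && registry != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (table[i].name, registry) == 0)
      return &table[i];
  return NULL;
}
```

Looking up "microsoft-cp1251" (the value gdb printed for registry)
returns NULL with old_table and the Russian entry with fixed_table.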

Robert






* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-18 16:28           ` Robert Pluim
@ 2020-03-18 18:18             ` Eli Zaretskii
  2020-03-18 20:47               ` Robert Pluim
  0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2020-03-18 18:18 UTC (permalink / raw)
  To: Robert Pluim; +Cc: sorganov, 40097

> From: Robert Pluim <rpluim@gmail.com>
> Cc: sorganov@gmail.com,  handa@gnu.org,  40097@debbugs.gnu.org
> Date: Wed, 18 Mar 2020 17:28:47 +0100
> 
> ftfont.c:
> 
>     { "windows-1251", { 0x0401, 0x0490 }, "ru"},
> 
> Thread 1 "emacs" hit Breakpoint 3, ftfont_get_charset (registry=XIL(0x394540)) at ftfont.c:486
> 486	  for (i = j = 0; i < SBYTES (SYMBOL_NAME (registry)); i++, j++)
> (gdb) pp registry
> microsoft-cp1251
> (gdb) 
> 
> So correcting the name of the registry in ftfont.c fixes this.

You mean, fixing that makes DejaVu Sans Mono be used in this case?
That's great, let's fix this in emacs-27 then.






* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-18 18:18             ` Eli Zaretskii
@ 2020-03-18 20:47               ` Robert Pluim
  2020-03-19  3:25                 ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Robert Pluim @ 2020-03-18 20:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: sorganov, 40097


>>>>> On Wed, 18 Mar 2020 20:18:11 +0200, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: sorganov@gmail.com,  handa@gnu.org,  40097@debbugs.gnu.org
    >> Date: Wed, 18 Mar 2020 17:28:47 +0100
    >> 
    >> ftfont.c:
    >> 
    >> { "windows-1251", { 0x0401, 0x0490 }, "ru"},
    >> 
    >> Thread 1 "emacs" hit Breakpoint 3, ftfont_get_charset (registry=XIL(0x394540)) at ftfont.c:486
    >> 486	  for (i = j = 0; i < SBYTES (SYMBOL_NAME (registry)); i++, j++)
    >> (gdb) pp registry
    >> microsoft-cp1251
    >> (gdb) 
    >> 
    >> So correcting the name of the registry in ftfont.c fixes this.

    Eli> You mean, fixing that makes DejaVu Sans Mono be used in this case?

Yes.

    Eli> That's great, let's fix this in emacs-27 then.

If you want me to be conservative, I could *add* a microsoft-cp1251
entry instead of replacing windows-1251, but as far as I know
windows-1251 is not a valid registry name.


[-- Attachment #2: 0001-Use-correct-registry-name-for-windows-1251-charset.patch --]
[-- Type: text/x-patch, Size: 1042 bytes --]

From c0001bb3cd097809f152fe9ca985aa319a957678 Mon Sep 17 00:00:00 2001
From: Robert Pluim <rpluim@gmail.com>
Date: Wed, 18 Mar 2020 21:37:55 +0100
Subject: [PATCH] Use correct registry name for windows-1251 charset
To: emacs-devel@gnu.org

* src/ftfont.c (fc_charset_table): The registry to use to lookup
windows-1251 charset is microsoft-cp1251, not windows-1251.
(Bug#40097)
---
 src/ftfont.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/ftfont.c b/src/ftfont.c
index 2b442ead4b..6b549c3ddf 100644
--- a/src/ftfont.c
+++ b/src/ftfont.c
@@ -119,7 +119,7 @@ #define SYMBOL_FcChar8(SYM) (FcChar8 *) SDATA (SYMBOL_NAME (SYM))
     { "jisx0213.2004-1", { 0x20B9F }},
     { "viscii1.1-1", { 0x1EA0, 0x1EAE, 0x1ED2 }, "vi"},
     { "tis620.2529-1", { 0x0E01 }, "th"},
-    { "windows-1251", { 0x0401, 0x0490 }, "ru"},
+    { "microsoft-cp1251", { 0x0401, 0x0490 }, "ru"},
     { "koi8-r", { 0x0401, 0x2219 }, "ru"},
     { "mulelao-1", { 0x0E81 }, "lo"},
     { "unicode-sip", { 0x20000 }},
-- 
2.19.1.816.gcd69ec8cde



* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-18 20:47               ` Robert Pluim
@ 2020-03-19  3:25                 ` Eli Zaretskii
  2020-03-19  8:26                   ` Robert Pluim
  0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2020-03-19  3:25 UTC (permalink / raw)
  To: Robert Pluim; +Cc: sorganov, 40097

> From: Robert Pluim <rpluim@gmail.com>
> Cc: sorganov@gmail.com,  40097@debbugs.gnu.org
> Date: Wed, 18 Mar 2020 21:47:19 +0100
> 
>     >> So correcting the name of the registry in ftfont.c fixes this.
> 
>     Eli> You mean, fixing that makes DejaVu Sans Mono be used in this case?
> 
> Yes.
> 
>     Eli> That's great, let's fix this in emacs-27 then.
> 
> If you want me to be conservative, I could *add* a microsoft-cp1251
> entry instead of replacing windows-1251, but as far as I know
> windows-1251 is not a valid registry name.

I see no need to add it: this registry value cannot be used much, or
we'd have had many bug reports about this problem long ago.  Let's just
replace the incorrect value with the correct one.

Thanks.






* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-19  3:25                 ` Eli Zaretskii
@ 2020-03-19  8:26                   ` Robert Pluim
  2020-03-19 11:12                     ` Sergey Organov
  0 siblings, 1 reply; 15+ messages in thread
From: Robert Pluim @ 2020-03-19  8:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 40097-done, sorganov

>>>>> On Thu, 19 Mar 2020 05:25:39 +0200, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: sorganov@gmail.com,  40097@debbugs.gnu.org
    >> Date: Wed, 18 Mar 2020 21:47:19 +0100
    >> 
    >> >> So correcting the name of the registry in ftfont.c fixes this.
    >> 
    Eli> You mean, fixing that makes DejaVu Sans Mono be used in this case?
    >> 
    >> Yes.
    >> 
    Eli> That's great, let's fix this in emacs-27 then.
    >> 
    >> If you want me to be conservative, I could *add* a microsoft-cp1251
    >> entry instead of replacing windows-1251, but as far as I know
    >> windows-1251 is not a valid registry name.

    Eli> I see no need to add it, this registry value cannot be used much, or
    Eli> we'd have many bug reports about this problem long ago.  Let's just
    Eli> replace the incorrect value with the correct one.

Done for emacs-27 in bed04c502c

Closing.

Robert









* bug#40097: 28.0.50; Preferred font ignored for specific charset
  2020-03-19  8:26                   ` Robert Pluim
@ 2020-03-19 11:12                     ` Sergey Organov
  0 siblings, 0 replies; 15+ messages in thread
From: Sergey Organov @ 2020-03-19 11:12 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 40097-done

Robert Pluim <rpluim@gmail.com> writes:

>>>>>> On Thu, 19 Mar 2020 05:25:39 +0200, Eli Zaretskii <eliz@gnu.org> said:
>
>     >> From: Robert Pluim <rpluim@gmail.com>
>     >> Cc: sorganov@gmail.com,  40097@debbugs.gnu.org
>     >> Date: Wed, 18 Mar 2020 21:47:19 +0100
>     >> 
>     >> >> So correcting the name of the registry in ftfont.c fixes this.
>     >> 
>     Eli> You mean, fixing that makes DejaVu Sans Mono be used in this case?
>     >> 
>     >> Yes.
>     >> 
>     Eli> That's great, let's fix this in emacs-27 then.
>     >> 
>     >> If you want me to be conservative, I could *add* a microsoft-cp1251
>     >> entry instead of replacing windows-1251, but as far as I know
>     >> windows-1251 is not a valid registry name.
>
>     Eli> I see no need to add it, this registry value cannot be used much, or
>     Eli> we'd have many bug reports about this problem long ago.  Let's just
>     Eli> replace the incorrect value with the correct one.
>
> Done for emacs-27 in bed04c502c
>
> Closing.

Good job! Thanks a lot to both of you for taking care of this!

-- Sergey






Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
