* Re: master 15afa72460b: Fix 'script-representative-chars' for the 'han' script
  [not found] ` <20240803073044.42052C1CAF7@vcs2.savannah.gnu.org>
@ 2024-08-03 9:27 ` Po Lu
  2024-08-03 15:23 ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Po Lu @ 2024-08-03 9:27 UTC (permalink / raw)
  To: emacs-devel; +Cc: Eli Zaretskii

Eli Zaretskii <eliz@gnu.org> writes:

> branch: master
> commit 15afa72460b4a0ec910749646cb9852b4c578f5e
> Author: Eli Zaretskii <eliz@gnu.org>
> Commit: Eli Zaretskii <eliz@gnu.org>
>
>     Fix 'script-representative-chars' for the 'han' script
>
>     * lisp/international/fontset.el (script-representative-chars):
>     Remove from 'han' codepoints that belong to 'cjk-misc'.
> ---
>  lisp/international/fontset.el | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/lisp/international/fontset.el b/lisp/international/fontset.el
> index f5b4b0b4aa4..695c313cb26 100644
> --- a/lisp/international/fontset.el
> +++ b/lisp/international/fontset.el
> @@ -208,8 +208,7 @@
>      (kana #x304B)
>      (bopomofo #x3105)
>      (kanbun #x319D)
> -    (han #x2e90 #x2f00 #x3010 #x3200 #x3300 #x3400 #x31c0 #x4e10
> -         #x5B57 #xfe30 #xf900)
> +    (han #x2e90 #x2f00 #x3200 #x3300 #x3400 #x4e10 #x5B57 #xfe30 #xf900)

Someone reports that this set of characters still does not enable the
detection of WenQuanYi Micro Hei, which is certainly complete enough to
display all Han text that will be encountered in practice. U+2E90,
U+2F00, U+3300 and U+3400 are absent from this font, and quite
reasonably so, since they are freestanding radicals, Kana, which belong
in the entry for kana rather than han, or obsolete.
* Re: master 15afa72460b: Fix 'script-representative-chars' for the 'han' script
  2024-08-03 9:27 ` master 15afa72460b: Fix 'script-representative-chars' for the 'han' script Po Lu
@ 2024-08-03 15:23 ` Eli Zaretskii
  2024-08-04 0:16 ` Po Lu
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2024-08-03 15:23 UTC (permalink / raw)
  To: Po Lu; +Cc: emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: Eli Zaretskii <eliz@gnu.org>
> Date: Sat, 03 Aug 2024 17:27:27 +0800
>
> > --- a/lisp/international/fontset.el
> > +++ b/lisp/international/fontset.el
> > @@ -208,8 +208,7 @@
> >      (kana #x304B)
> >      (bopomofo #x3105)
> >      (kanbun #x319D)
> > -    (han #x2e90 #x2f00 #x3010 #x3200 #x3300 #x3400 #x31c0 #x4e10
> > -         #x5B57 #xfe30 #xf900)
> > +    (han #x2e90 #x2f00 #x3200 #x3300 #x3400 #x4e10 #x5B57 #xfe30 #xf900)
>
> Someone reports that this set of characters still does not enable the
> detection of WenQuanYi Micro Hei, which is certainly complete enough to
> display all Han text that will be encountered in practice. U+2E90,
> U+2F00, U+3300 and U+3400 are absent from this font, and quite
> reasonably so, since they are freestanding radicals, Kana, which belong
> in the entry for kana rather than han, or obsolete.

On what system did that happen?

And I don't understand why you say these characters are Kana, this
page disagrees:

  https://en.wikipedia.org/wiki/Kangxi_radical
* Re: master 15afa72460b: Fix 'script-representative-chars' for the 'han' script
  2024-08-03 15:23 ` Eli Zaretskii
@ 2024-08-04 0:16 ` Po Lu
  2024-08-04 4:57 ` Eli Zaretskii
  2024-08-05 16:25 ` Eli Zaretskii
  0 siblings, 2 replies; 11+ messages in thread
From: Po Lu @ 2024-08-04 0:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Po Lu <luangruo@yahoo.com>
>> Cc: Eli Zaretskii <eliz@gnu.org>
>> Date: Sat, 03 Aug 2024 17:27:27 +0800
>>
>> > --- a/lisp/international/fontset.el
>> > +++ b/lisp/international/fontset.el
>> > @@ -208,8 +208,7 @@
>> >      (kana #x304B)
>> >      (bopomofo #x3105)
>> >      (kanbun #x319D)
>> > -    (han #x2e90 #x2f00 #x3010 #x3200 #x3300 #x3400 #x31c0 #x4e10
>> > -         #x5B57 #xfe30 #xf900)
>> > +    (han #x2e90 #x2f00 #x3200 #x3300 #x3400 #x4e10 #x5B57 #xfe30 #xf900)
>>
>> Someone reports that this set of characters still does not enable the
>> detection of WenQuanYi Micro Hei, which is certainly complete enough to
>> display all Han text that will be encountered in practice. U+2E90,
>> U+2F00, U+3300 and U+3400 are absent from this font, and quite
>> reasonably so, since they are freestanding radicals, Kana, which belong
>> in the entry for kana rather than han, or obsolete.
>
> On what system did that happen?

Not "what system", "which font": WenQuanYi Micro Hei, one of the better
free Han fonts.

> And I don't understand why you say these characters are Kana, this
> page disagrees:
>
>   https://en.wikipedia.org/wiki/Kangxi_radical

That's U+2E90. U+3300 is Kana, according to Scripts.txt:

  3300..3357    ; Katakana # So  [88] SQUARE APAATO..SQUARE WATTO
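The Scripts.txt entry quoted in the message above can be checked mechanically. The following is a minimal sketch in Python, not anything from Emacs itself; the helper names `parse_scripts_line` and `script_of` are made up for illustration, and only the single line quoted in the thread is parsed rather than the full Unicode data file.

```python
# Line quoted in the thread, taken from the Unicode Scripts.txt data file.
SCRIPTS_LINE = "3300..3357    ; Katakana # So  [88] SQUARE APAATO..SQUARE WATTO"


def parse_scripts_line(line):
    """Return (first, last, script) for one Scripts.txt data line.

    Hypothetical helper: handles "XXXX..YYYY ; Script # comment" as well
    as single code point entries of the form "XXXX ; Script # comment".
    """
    data = line.split("#", 1)[0]                  # drop the trailing comment
    code_range, script = (field.strip() for field in data.split(";"))
    first, _, last = code_range.partition("..")
    last = last or first                          # single code point entry
    return int(first, 16), int(last, 16), script


FIRST, LAST, SCRIPT = parse_scripts_line(SCRIPTS_LINE)


def script_of(codepoint):
    """Script name if CODEPOINT lies in the parsed range, else None."""
    return SCRIPT if FIRST <= codepoint <= LAST else None


if __name__ == "__main__":
    print(hex(0x3300), script_of(0x3300))
    print(hex(0x2E90), script_of(0x2E90))
```

With this one line as input, U+3300 falls inside the Katakana range, while U+2E90 is simply not covered by it; classifying U+2E90 would need the rest of Scripts.txt.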
* Re: master 15afa72460b: Fix 'script-representative-chars' for the 'han' script
  2024-08-04 0:16 ` Po Lu
@ 2024-08-04 4:57 ` Eli Zaretskii
  2024-08-04 7:58 ` Po Lu
  2024-08-05 16:25 ` Eli Zaretskii
  1 sibling, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2024-08-04 4:57 UTC (permalink / raw)
  To: Po Lu; +Cc: emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: emacs-devel@gnu.org
> Date: Sun, 04 Aug 2024 08:16:20 +0800
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> >> Someone reports that this set of characters still does not enable the
> >> detection of WenQuanYi Micro Hei, which is certainly complete enough to
> >> display all Han text that will be encountered in practice. U+2E90,
> >> U+2F00, U+3300 and U+3400 are absent from this font, and quite
> >> reasonably so, since they are freestanding radicals, Kana, which belong
> >> in the entry for kana rather than han, or obsolete.
> >
> > On what system did that happen?
>
> Not "what system", "which font": WenQuanYi Micro Hei, one of the better
> free Han fonts.

<Shrug> Then users should perhaps look for better fonts. I'm quite
astonished to hear that free fonts on free systems do so much worse a
job than MS-Windows. I have a hard time believing that.

> > And I don't understand why you say these characters are Kana, this
> > page disagrees:
> >
> >   https://en.wikipedia.org/wiki/Kangxi_radical
>
> That's U+2E90. U+3300 is Kana, according to Scripts.txt:
>
>   3300..3357    ; Katakana # So  [88] SQUARE APAATO..SQUARE WATTO

That's just one block out of 4 that you mentioned.

And if we want to treat that as Kana, we should change
admin/blocks.awk first.
* Re: master 15afa72460b: Fix 'script-representative-chars' for the 'han' script
  2024-08-04 4:57 ` Eli Zaretskii
@ 2024-08-04 7:58 ` Po Lu
  0 siblings, 0 replies; 11+ messages in thread
From: Po Lu @ 2024-08-04 7:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> <Shrug> Then users should perhaps look for better fonts. I'm quite
> astonished to hear that free fonts on free systems do so much worse a
> job than MS-Windows. I have a hard time believing that.

Since they are sufficient, Microsoft's excess is not the standard by
which to set Emacs's expectations. (Which expectations must not be set
by proprietary fonts in any event.)

>> > And I don't understand why you say these characters are Kana, this
>> > page disagrees:
>> >
>> >   https://en.wikipedia.org/wiki/Kangxi_radical
>>
>> That's U+2E90. U+3300 is Kana, according to Scripts.txt:
>>
>>   3300..3357    ; Katakana # So  [88] SQUARE APAATO..SQUARE WATTO
>
> That's just one block out of 4 that you mentioned.

The remainder are, as I said, radicals or obsolete, which are not to be
found in real documents and many perfectly serviceable fonts.

> And if we want to treat that as Kana, we should change
> admin/blocks.awk first.

Its not being treated as Kana is a bug in blocks.awk, so this is a
foregone conclusion. Regardless, this character should be deleted from
script-representative-chars, because on my system it is provided by:

  xfthb:-ADBO-Noto Sans CJK JP-regular-normal-normal-*-16-*-*-*-*-0-iso10646-1 (#x889)

which is not the proper regional variant of Noto Sans (or Serif) CJK for
Chinese text.
* Re: master 15afa72460b: Fix 'script-representative-chars' for the 'han' script
  2024-08-04 0:16 ` Po Lu
  2024-08-04 4:57 ` Eli Zaretskii
@ 2024-08-05 16:25 ` Eli Zaretskii
  2024-08-05 23:58 ` Po Lu
  1 sibling, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2024-08-05 16:25 UTC (permalink / raw)
  To: Po Lu; +Cc: emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: emacs-devel@gnu.org
> Date: Sun, 04 Aug 2024 08:16:20 +0800
>
> >> > --- a/lisp/international/fontset.el
> >> > +++ b/lisp/international/fontset.el
> >> > @@ -208,8 +208,7 @@
> >> >      (kana #x304B)
> >> >      (bopomofo #x3105)
> >> >      (kanbun #x319D)
> >> > -    (han #x2e90 #x2f00 #x3010 #x3200 #x3300 #x3400 #x31c0 #x4e10
> >> > -         #x5B57 #xfe30 #xf900)
> >> > +    (han #x2e90 #x2f00 #x3200 #x3300 #x3400 #x4e10 #x5B57 #xfe30 #xf900)
> >>
> >> Someone reports that this set of characters still does not enable the
> >> detection of WenQuanYi Micro Hei, which is certainly complete enough to
> >> display all Han text that will be encountered in practice. U+2E90,
> >> U+2F00, U+3300 and U+3400 are absent from this font, and quite
> >> reasonably so, since they are freestanding radicals, Kana, which belong
> >> in the entry for kana rather than han, or obsolete.
> >
> > On what system did that happen?
>
> Not "what system", "which font": WenQuanYi Micro Hei, one of the better
> free Han fonts.

If you remove U+2E90, U+2F00, U+3300 and U+3400 from the list and
rebuild Emacs, what happens if you insert U+2F75? Does Emacs succeed
in finding another font which supports that codepoint, or does it
appear as tofu? If the latter, what happens if you install some
additional font which does support U+2F75?

IOW, I'm interested to know what happens on GNU/Linux if more than one
font is available that together cover both the "usual" han characters
and those additional ones which you think we should remove from
script-representative-chars, but neither of these fonts supports all
of those characters. Can Emacs solve this by itself on GNU/Linux, or
does it need "help" from the user's customization of the fontset?
* Re: master 15afa72460b: Fix 'script-representative-chars' for the 'han' script
  2024-08-05 16:25 ` Eli Zaretskii
@ 2024-08-05 23:58 ` Po Lu
  2024-08-06 11:35 ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Po Lu @ 2024-08-05 23:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> If you remove U+2E90, U+2F00, U+3300 and U+3400 from the list and
> rebuild Emacs, what happens if you insert U+2F75? Does Emacs succeed
> in finding another font which supports that codepoint, or does it
> appear as tofu? If the latter, what happens if you install some
> additional font which does support U+2F75?

I'll ask, but my intuition is that no font will be discovered, since a
font must support all of any characters defined as lists in
script-representative-chars to be eligible.

> IOW, I'm interested to know what happens on GNU/Linux if more than one
> font is available that together cover both the "usual" han characters
> and those additional ones which you think we should remove from
> script-representative-chars, but neither of these fonts supports all
> of those characters. Can Emacs solve this by itself on GNU/Linux, or
> does it need "help" from the user's customization of the fontset?

Probably the latter, unless `han' is divided into scripts for
characters, obsolete characters, radicals, and the like.
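The eligibility rule stated above (a font must cover every character listed for the script) can be modeled in a few lines. This is a toy sketch in Python, not Emacs's implementation: the list is copied from the diff in this thread, the missing code points are the ones reported for WenQuanYi Micro Hei, and `toy_has_char` is a made-up stand-in for a real coverage query that simply assumes everything else is present.

```python
# Entry for 'han' after commit 15afa72460b, copied from the diff.
HAN_REPRESENTATIVE = [0x2E90, 0x2F00, 0x3200, 0x3300, 0x3400,
                      0x4E10, 0x5B57, 0xFE30, 0xF900]

# Code points the thread reports as absent from WenQuanYi Micro Hei.
REPORTED_MISSING = {0x2E90, 0x2F00, 0x3300, 0x3400}


def toy_has_char(codepoint):
    """Hypothetical coverage test: everything except the reported gaps."""
    return codepoint not in REPORTED_MISSING


def missing_representatives(representatives, has_char):
    """Representative characters the font cannot display, in list order."""
    return [c for c in representatives if not has_char(c)]


def is_eligible(representatives, has_char):
    """Model of the rule: eligible only if no representative is missing."""
    return not missing_representatives(representatives, has_char)


if __name__ == "__main__":
    gaps = missing_representatives(HAN_REPRESENTATIVE, toy_has_char)
    print("missing:", [f"U+{c:04X}" for c in gaps])
    print("eligible with current list:",
          is_eligible(HAN_REPRESENTATIVE, toy_has_char))
    trimmed = [c for c in HAN_REPRESENTATIVE if c not in REPORTED_MISSING]
    print("eligible with trimmed list:", is_eligible(trimmed, toy_has_char))
```

Under these assumptions a single missing representative is enough to disqualify the font for the whole script, which is why trimming the list changes the outcome.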
* Re: master 15afa72460b: Fix 'script-representative-chars' for the 'han' script
  2024-08-05 23:58 ` Po Lu
@ 2024-08-06 11:35 ` Eli Zaretskii
  2024-08-07 0:17 ` Po Lu
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2024-08-06 11:35 UTC (permalink / raw)
  To: Po Lu; +Cc: emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: emacs-devel@gnu.org
> Date: Tue, 06 Aug 2024 07:58:54 +0800
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > If you remove U+2E90, U+2F00, U+3300 and U+3400 from the list and
> > rebuild Emacs, what happens if you insert U+2F75? Does Emacs succeed
> > in finding another font which supports that codepoint, or does it
> > appear as tofu? If the latter, what happens if you install some
> > additional font which does support U+2F75?
>
> I'll ask, but my intuition is that no font will be discovered, since a
> font must support all of any characters defined as lists in
> script-representative-chars to be eligible.

Note that I said "if you remove those characters".

If you did note that, then does it mean when U+2F75 needs to be
installed and the current font for han doesn't support it, Emacs will
never try to look for _another_ font which supports han characters?
Or will it try, but always fail?

> > IOW, I'm interested to know what happens on GNU/Linux if more than one
> > font is available that together cover both the "usual" han characters
> > and those additional ones which you think we should remove from
> > script-representative-chars, but neither of these fonts supports all
> > of those characters. Can Emacs solve this by itself on GNU/Linux, or
> > does it need "help" from the user's customization of the fontset?
>
> Probably the latter, unless `han' is divided into scripts for
> characters, obsolete characters, radicals, and the like.

That is again quite disappointing, since I always thought font
backends based on Fontconfig can do a better job, because (AFAIR)
Fontconfig caches the font information and makes it available for
programs that search fonts covering specific characters.

What you describe happens on MS-Windows, but there we don't have a way
to test whether a font supports a character without actually loading
the font (the 'has_char' backend method always fails).
* Re: master 15afa72460b: Fix 'script-representative-chars' for the 'han' script
  2024-08-06 11:35 ` Eli Zaretskii
@ 2024-08-07 0:17 ` Po Lu
  2024-08-07 11:47 ` Eli Zaretskii
  0 siblings, 1 reply; 11+ messages in thread
From: Po Lu @ 2024-08-07 0:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> Note that I said "if you remove those characters".
>
> If you did note that, then does it mean when U+2F75 needs to be
> installed and the current font for han doesn't support it, Emacs will
> never try to look for _another_ font which supports han characters?
> Or will it try, but always fail?

How do you mean? During Emacs's search for a suitable font, it is yet
to decide what is the "current font for han."

>> > IOW, I'm interested to know what happens on GNU/Linux if more than one
>> > font is available that together cover both the "usual" han characters
>> > and those additional ones which you think we should remove from
>> > script-representative-chars, but neither of these fonts supports all
>> > of those characters. Can Emacs solve this by itself on GNU/Linux, or
>> > does it need "help" from the user's customization of the fontset?
>>
>> Probably the latter, unless `han' is divided into scripts for
>> characters, obsolete characters, radicals, and the like.
>
> That is again quite disappointing, since I always thought font
> backends based on Fontconfig can do a better job, because (AFAIR)
> Fontconfig caches the font information and makes it available for
> programs that search fonts covering specific characters.

Fontconfig is capable of this, but not telepathy. If Emacs submits
multiple requests for such and such a list of characters, ftfont cannot
telepathically deduce that in the one instance it should only consider
those characters which are in common usage, while in the other radicals
or obsolete characters.

> What you describe happens on MS-Windows, but there we don't have a way
> to test whether a font supports a character without actually loading
> the font (the 'has_char' backend method always fails).
* Re: master 15afa72460b: Fix 'script-representative-chars' for the 'han' script
  2024-08-07 0:17 ` Po Lu
@ 2024-08-07 11:47 ` Eli Zaretskii
  2024-08-07 12:12 ` Po Lu
  0 siblings, 1 reply; 11+ messages in thread
From: Eli Zaretskii @ 2024-08-07 11:47 UTC (permalink / raw)
  To: Po Lu; +Cc: emacs-devel

> From: Po Lu <luangruo@yahoo.com>
> Cc: emacs-devel@gnu.org
> Date: Wed, 07 Aug 2024 08:17:08 +0800
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > Note that I said "if you remove those characters".
> >
> > If you did note that, then does it mean when U+2F75 needs to be
> > installed and the current font for han doesn't support it, Emacs will
> > never try to look for _another_ font which supports han characters?
> > Or will it try, but always fail?
>
> How do you mean? During Emacs's search for a suitable font, it is yet
> to decide what is the "current font for han."

I mean the following scenario:

  . start Emacs
  . type some common han character, which will be displayed by a font
    that supports the common han characters
  . type some rare han character, such as U+2F75, not supported by the
    font chosen in the previous step

I'm asking whether Emacs will in step 3 search and find a font which
can display U+2F75, or will it show tofu because it already has a han
font, and that font doesn't support U+2F75?

> >> Probably the latter, unless `han' is divided into scripts for
> >> characters, obsolete characters, radicals, and the like.
> >
> > That is again quite disappointing, since I always thought font
> > backends based on Fontconfig can do a better job, because (AFAIR)
> > Fontconfig caches the font information and makes it available for
> > programs that search fonts covering specific characters.
>
> Fontconfig is capable of this, but not telepathy. If Emacs submits
> multiple requests for such and such a list of characters, ftfont cannot
> telepathically deduce that in the one instance it should only consider
> those characters which are in common usage, while in the other radicals
> or obsolete characters.

But when Emacs actually needs to display one of those rare characters,
will Emacs which uses Fontconfig then be able to find a suitable font,
if it is installed?
* Re: master 15afa72460b: Fix 'script-representative-chars' for the 'han' script
  2024-08-07 11:47 ` Eli Zaretskii
@ 2024-08-07 12:12 ` Po Lu
  0 siblings, 0 replies; 11+ messages in thread
From: Po Lu @ 2024-08-07 12:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Po Lu <luangruo@yahoo.com>
>> Cc: emacs-devel@gnu.org
>> Date: Wed, 07 Aug 2024 08:17:08 +0800
>>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>> > Note that I said "if you remove those characters".
>> >
>> > If you did note that, then does it mean when U+2F75 needs to be
>> > installed and the current font for han doesn't support it, Emacs will
>> > never try to look for _another_ font which supports han characters?
>> > Or will it try, but always fail?
>>
>> How do you mean? During Emacs's search for a suitable font, it is yet
>> to decide what is the "current font for han."
>
> I mean the following scenario:
>
>   . start Emacs
>   . type some common han character, which will be displayed by a font
>     that supports the common han characters
>   . type some rare han character, such as U+2F75, not supported by the
>     font chosen in the previous step
>
> I'm asking whether Emacs will in step 3 search and find a font which
> can display U+2F75, or will it show tofu because it already has a han
> font, and that font doesn't support U+2F75?

In principle, yes, but with the important exception that the font which
supports U+2F75 must also support all of the characters in the entry in
script-representative-chars for han.

> But when Emacs actually needs to display one of those rare characters,
> will Emacs which uses Fontconfig then be able to find a suitable font,
> if it is installed?

The answer is yes, at least subject to the above.
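The three-step scenario and the answer given to it can be played through with a small model. What follows is a sketch in Python of the behaviour as described in this thread, not of Emacs's actual font-selection code. The font names and their coverage sets are entirely hypothetical, and the representative list is the trimmed one proposed in the discussion (the committed list minus U+2E90, U+2F00, U+3300 and U+3400).

```python
# Trimmed 'han' list proposed in the thread (an assumption of this sketch).
TRIMMED_HAN = (0x3200, 0x4E10, 0x5B57, 0xFE30, 0xF900)

# Hypothetical fonts with made-up coverage, for illustration only.
COMMON_HAN = ("CommonHan", set(TRIMMED_HAN) | {0x4E2D})
RADICALS_ONLY = ("RadicalsOnly", {0x2F00, 0x2F75})
FULL_HAN = ("FullHan", set(TRIMMED_HAN) | {0x4E2D, 0x2F75})


def pick_font(char, fonts, representatives):
    """Return the first font name that covers CHAR and every representative.

    FONTS is an ordered list of (name, coverage_set) pairs. None stands
    for "no font found", which would be rendered as tofu.
    """
    for name, coverage in fonts:
        if char in coverage and all(r in coverage for r in representatives):
            return name
    return None


if __name__ == "__main__":
    installed = [COMMON_HAN, RADICALS_ONLY]
    # Step 2: a common han character.
    print("U+4E2D ->", pick_font(0x4E2D, installed, TRIMMED_HAN))
    # Step 3: a rare character; RadicalsOnly has it but is not eligible.
    print("U+2F75 ->", pick_font(0x2F75, installed, TRIMMED_HAN))
    # Step 3 again, after installing a font that covers both sets.
    print("U+2F75 ->", pick_font(0x2F75, installed + [FULL_HAN], TRIMMED_HAN))
```

In this model the rare character is found only once a font covering both it and the representative list is installed; a font that covers the rare character alone is skipped, matching the exception described in the reply.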