From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: etc/HELLO markup etc.
Date: Sun, 23 Dec 2018 17:42:43 +0200
Message-ID: <83zhsw8d1o.fsf@gnu.org>
References: <3fd27fe5-e650-b207-fdd4-36f805b89b4d@cs.ucla.edu> <83bm5hcroa.fsf@gnu.org> <9f33127d-f01b-b138-7a0c-ffeac7b77938@cs.ucla.edu> <835zvochdj.fsf@gnu.org> <5f113128-36c9-30c6-3413-8dc36051e058@cs.ucla.edu> <83va3nban3.fsf@gnu.org> <838t0iasju.fsf@gnu.org>
To: Yuri Khan
Cc: handa@gnu.org, eggert@cs.ucla.edu, monnier@iro.umontreal.ca, Emacs-devel@gnu.org
In-reply-to: (message from Yuri Khan on Sun, 23 Dec 2018 14:47:39 +0700)

> From: Yuri Khan
> Date: Sun, 23 Dec 2018 14:47:39 +0700
> Cc: Paul Eggert, handa@gnu.org,
>  Stefan Monnier, Emacs developers
>
> There is at least one more situation where different glyphs
> could/should be selected for the same Unicode code points, which
> charset markup does not solve.
>
> I’m talking about italic shapes of Cyrillic letters. For some of them,
> Russian and Bulgarian use one shape but Serbian and Macedonian use
> another shape[1]. There are no examples of Bulgarian, Serbian, or
> Macedonian in HELLO, but Russian, Ukrainian and Mongolian examples are
> all marked up as “cyrillic-iso8859-5”, which is an encoding that does
> not carry language information.
>
> So: charset markup is not the right solution to the problem of
> rendering the same Unicode code point with different glyphs.

You mean, it's not a perfect solution, right?  Because in the "good"
department, it's "good enough" to solve at least part of the problem.
No one says we need to reject a solution because it is only partial.

I would also like to point out that, as far as the 'charset' property
is concerned, HELLO is just an example of what _can_ be done; it
doesn't pretend to show _everything_ that you could do.  E.g., if it's
important to be able to display Ukrainian in a font different from
that used for Russian, we could use the koi8-u charset for the
Ukrainian greeting, and tweak our default fontset to use special fonts
for that.  We could even invent additional charsets (see
define-charset) and then use them for some greetings.

Of course, this machinery works best when a charset is unequivocally
determined by the prevalent encoding used for text that uses that
charset, and that isn't always the case.  But still, the feature is
there, and it can be extended if needed.

Finally, regarding the special handling of italics in Serbian: is
there _any_ application out there that solves this problem
satisfactorily in a multilingual environment?  I'm not sure how you
could go about that, since fonts generally cover scripts, and there's
no special Serbian Cyrillic script, there's just Cyrl to cover them
all.
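
To make the koi8-u suggestion above a bit more concrete, here is a
rough, untested sketch.  The font family is only a placeholder, the
greeting string is just sample text, and I'm assuming that naming a
charset as the target of set-fontset-font is enough for text carrying
the matching 'charset' property to pick up that font; whether that
holds on a given system would need checking.

```elisp
;; Sketch only, under the assumptions stated above.
(when (charsetp 'koi8-u)
  ;; Ask the default fontset (FONTSET = t) to use a particular font
  ;; for the characters covered by the koi8-u charset.
  ;; "DejaVu Sans" is a placeholder family, not a recommendation.
  (set-fontset-font t 'koi8-u (font-spec :family "DejaVu Sans")))

;; Tag a greeting with the `charset' text property, the same kind of
;; markup HELLO applies to its Cyrillic greetings (there with
;; cyrillic-iso8859-5).
(insert (propertize "Вітаю" 'charset 'koi8-u))
```

A newly invented charset would be used the same way once it has been
created with define-charset; that call is left out here because its
attributes depend on the charset being defined.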