From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: etc/HELLO markup etc.
Date: Sat, 22 Dec 2018 22:42:33 +0200
Message-ID: <83a7kx9tty.fsf@gnu.org>
References: <3fd27fe5-e650-b207-fdd4-36f805b89b4d@cs.ucla.edu>
 <83bm5hcroa.fsf@gnu.org>
 <9f33127d-f01b-b138-7a0c-ffeac7b77938@cs.ucla.edu>
 <835zvochdj.fsf@gnu.org>
 <5f113128-36c9-30c6-3413-8dc36051e058@cs.ucla.edu>
 <83va3nban3.fsf@gnu.org>
 <838t0iasju.fsf@gnu.org>
In-reply-to: (message from Paul Eggert on Sat, 22 Dec 2018 11:41:05 -0800)
To: Paul Eggert
Cc: handa@gnu.org, monnier@iro.umontreal.ca, Emacs-devel@gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:231960

> Cc: handa@gnu.org, monnier@iro.umontreal.ca, Emacs-devel@gnu.org
> From: Paul Eggert
> Date: Sat, 22 Dec 2018 11:41:05 -0800
>
> Eli Zaretskii wrote:
>
> > If Han unification is the only important user of the charset property,
> > then yes, we could remove the rest of the charset info from HELLO.
>
> Yes, that's the case.

Says you.  The issue at hand is precisely whether that is so, or just
your opinion and tendency.

> the non-Han markup is merely a relic of that file's old method of
> encoding

It could be both a relic and an important piece of information.

> one cannot see the markup's effect when visiting the file with
> either C-h h or find-file in the usual way.

Of course, one can: via the fonts used to display the various scripts.

> > In what way most of what you say is not applicable to etc/enriched.txt
> > in general?
>
> Other forms of enriched-text markup are typically easily visible.

Typically, but not exclusively.  There's read-only property, there's
the 'display' property, and to some extent even the "fixed" face.

> > What other facilities are you aware of or can suggest for showing
> > multilingual text with such level of detail and precision?
>
> In practice the most common and often the best way to deal with the situation is
> to do what the non-markup part of etc/HELLO is already doing: indicate within
> the text itself what language or script is being used, to help the reader who
> may be unacquainted with them, and with enough punctuation within the text so
> that the reader can easily see what's going on.

That's useless for preserving text properties, so won't fit the bill.

> One example of such an error is that "日本語" has no charset properties even
> though it's obviously intended to use a Japanese script (since it follows the
> word "Japanese").

Thanks, I fixed that.

> I'm sure there are others.

Please report them if you find them.