From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Encoding of etc/HELLO
Date: Sat, 19 May 2018 21:03:24 +0300
Message-ID: <83in7jh6ab.fsf@gnu.org>
References: <83sh7qxb5j.fsf@gnu.org> <87po2t6gdm.fsf@gmx.de>
 <83muxxyijl.fsf@gnu.org> <83lgdhyeqv.fsf@gnu.org>
 <83k1t1xcjp.fsf@gnu.org> <87fu3owqqa.fsf@md5i.com>
 <83k1rzhdoo.fsf@gnu.org>
 <69662556-2fb0-356f-dad8-5b94d6833ce5@cs.ucla.edu>
Reply-To: Eli Zaretskii
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: mwd@md5i.com, handa@gnu.org, michael.albinus@gmx.de, emacs-devel@gnu.org
To: Paul Eggert
In-reply-to: <69662556-2fb0-356f-dad8-5b94d6833ce5@cs.ucla.edu> (message
 from Paul Eggert on Sat, 19 May 2018 10:17:33 -0700)
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:225448

> Cc: emacs-devel@gnu.org
> From: Paul Eggert
> Date: Sat, 19 May 2018 10:17:33 -0700
>
> In looking at the new etc/HELLO, I see many uses of that seem
> to be unnecessary when Emacs is viewing the file. For example, the first few
> uses are:
>
> latin-iso8859-1¡Hola!, Grüß Gott, Hyvää päivää, Tere
> õhtust, Bonlatin-iso8859-3ġu
> Czelatin-iso8859-2ść!, Dobrý
> den,

Which parts seem unnecessary in this snippet? And why?

> Can't the abovementioned formatting commands be removed without affecting what
> any Emacs user sees, because the corresponding character sets are not unified in
> Unicode?

What do you mean by "unified" here? In modern Emacs, we don't need to
unify the charsets, because they no longer determine the codepoints.
The 'charset' property just tells Emacs to which "culture", so-called,
or, if you want, to which language the greeting belongs, and the
purpose is only one: selection of an appropriate font to display that
greeting. (In the future we might use that for other
language-dependent features.)

> Would it be OK to simplify /etc/HELLO to remove unnecessary formatting
> commands, and to keep only the formatting commands that are plausibly needed in
> a Unicode text file? And if so, what heuristic should be used to remove the
> unnecessary formatting commands?
>
> I assume that the formatting commands were done automatically, so perhaps I'm
> talking about potential changes to lisp/textmodes/enriched.el.

Yes, the annotations were produced automatically by enriched.el, but
they simply follow what was already there in the original HELLO. You
can see that by visiting HELLO on the emacs-26 branch, and then
invoking "M-x describe-text-properties" at various places in the file.
You will see that the annotations start and end where the 'charset'
properties started and ended in the ISO-2022 encoded file.

We could, of course, place the 'charset' properties only on the
greetings and the language names, leaving the rest of the text without
any 'charset' properties. If that's what you mean, then I'm okay with
doing that; one could use the new facemenu command I added for that
purpose.
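
Editorial sketch, not part of the original message: to see outside Emacs where the charset runs discussed above begin and end, one could scan the file for its annotations. The Python below is a minimal illustration under stated assumptions. It assumes the annotations have the text/enriched shape `<x-charset><param>NAME</param>...</x-charset>` (the archived copy of this message lost the markup, so this shape is an assumption here), that they do not nest, and it ignores the text/enriched `<<` escape. The function name `charset_runs` is hypothetical.

```python
import re

# Assumed annotation shape; the archived message does not show it.
OPEN = re.compile(r"<x-charset><param>([^<]*)</param>")
CLOSE = "</x-charset>"


def charset_runs(text):
    """Strip x-charset annotations from TEXT.

    Returns (plain_text, runs), where runs is a list of
    (start, end, charset) tuples with 0-based, end-exclusive
    offsets into plain_text. Nested annotations are not handled.
    """
    pieces = []   # fragments of the annotation-free text
    runs = []     # (start, end, charset) for each annotated span
    pos = 0       # read position in the annotated input
    out_len = 0   # length of the plain text emitted so far
    while True:
        m = OPEN.search(text, pos)
        if m is None:
            pieces.append(text[pos:])
            break
        before = text[pos:m.start()]
        pieces.append(before)
        out_len += len(before)
        end = text.find(CLOSE, m.end())
        if end == -1:
            raise ValueError("unterminated x-charset annotation")
        body = text[m.end():end]
        runs.append((out_len, out_len + len(body), m.group(1)))
        pieces.append(body)
        out_len += len(body)
        pos = end + len(CLOSE)
    return "".join(pieces), runs


if __name__ == "__main__":
    sample = (
        "Spanish <x-charset><param>latin-iso8859-1</param>"
        "\u00a1Hola!</x-charset>"
    )
    plain, runs = charset_runs(sample)
    for start, end, name in runs:
        print(name, repr(plain[start:end]))
```

The offsets are 0-based, whereas Emacs buffer positions start at 1, so add 1 when comparing against what "M-x describe-text-properties" reports.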