From: Paul Eggert <eggert@cs.ucla.edu>
To: Eli Zaretskii <eliz@gnu.org>
Cc: handa@gnu.org, monnier@iro.umontreal.ca, Emacs-devel@gnu.org
Subject: Re: etc/HELLO markup etc.
Date: Sat, 22 Dec 2018 11:41:05 -0800
Message-ID: <d8999440-2406-e0df-7fed-948d8b8a3f8d@cs.ucla.edu>
In-Reply-To: <838t0iasju.fsf@gnu.org>

Eli Zaretskii wrote:

> If Han unification is the only important user of the charset property,
> then yes, we could remove the rest of the charset info from HELLO.

Yes, that's the case.

> the current HELLO just keeps the information
> that was there before recoding it in UTF-8, nothing was added.

Sure, but the non-Han markup is merely a relic of that file's old encoding
method, which avoided Unicode and instead used ISO 2022 escape sequences to
switch among various 8- and 16-bit encodings; that was the only way to show
text in (say) Russian under the constraints of the old method. The non-Han
markup is completely unnecessary now that the file uses UTF-8. (The Han markup
probably isn't needed either, though I also would like Handa's opinion on that.)
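
To illustrate the difference between the two encoding methods, here is a small
Python sketch. It is only an approximation: Python's iso2022_jp codec stands in
for the kind of ISO 2022 encoding the old file used, and is an assumption made
for illustration, not the exact coding system of the old etc/HELLO.

```python
# Illustrative only: iso2022_jp is a stand-in for an ISO 2022 style encoding;
# it is not claimed to be the coding system the old etc/HELLO actually used.
text = "日本語"

iso2022 = text.encode("iso2022_jp")
utf8 = text.encode("utf-8")

# ISO 2022 selects a character set with an escape sequence (ESC is 0x1b),
# then switches back to ASCII with another escape sequence at the end.
print(iso2022)
# UTF-8 needs no escape sequences: each character is just its own byte run.
print(utf8)

# The ISO 2022 bytes decode back to the same text as the UTF-8 bytes.
assert iso2022.decode("iso2022_jp") == utf8.decode("utf-8") == text
```

With an ISO 2022 encoding the character-set switch is part of the byte stream
itself, so charset information naturally came along with the text. With UTF-8
there is nothing to switch, which is why separate markup for non-Han scripts
carries no extra information.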

>> Although the etc/HELLO markup might be of interest to those who care about
>> annotating languages in the text, it's irrelevant to the ordinary purpose of
>> that file, which is to show textual translations of "Hello"
> 
> That's not the original purpose of that file.  The purpose is to show
> scripts, not languages, and to show how we display different scripts
> in the same buffer.

OK, but either way the non-Han markup is irrelevant to the ordinary purpose of 
the file.

>> It's still not a good user interface, though, as it is difficult to see the
>> markup's effect when visiting etc/HELLO in the usual way
> 
> If the usual way is via find-file and its ilk, then you should see the
> same results as with "C-h h", so I'm not sure I understand what you
> mean here.

I meant that one cannot see the markup's effect when visiting the file with 
either C-h h or find-file in the usual way. It's useless markup.

> In what way most of what you say is not applicable to etc/enriched.txt
> in general?

Other forms of enriched-text markup are typically easily visible. If I visit 
etc/enriched.txt I can easily see which parts are marked white on blue 
background, which parts are marked italic, etc. Invisible enriched-text markup 
is much harder to deal with when editing an enriched-text file.

>> the file is not a good showroom for how to maintain multilingual
>> text.
> 
> What other facilities are you aware of or can suggest for showing
> multilingual text with such level of detail and precision?

In practice the most common and often the best way to deal with the situation is 
to do what the non-markup part of etc/HELLO is already doing: indicate within 
the text itself what language or script is being used, to help the reader who 
may be unacquainted with them, and with enough punctuation within the text so 
that the reader can easily see what's going on. This technique has been used
for centuries and is by far the most popular technique in common practice
today, and it suffices for this particular application (with the possible
exception of its Chinese and Japanese text).

>> It's not a good sign that there seem to be errors in the
>> possibly-useful (i.e., CJ) markup that nobody has noticed since the
>> markup was introduced in May, and that I noticed these errors now
>> only because I was visiting the file literally.
> 
> Which errors?  I don't think we discovered any errors.

Yes, and that's the point! The approach we're taking is not good for dealing 
with the situation.

One example of such an error is that "日本語" has no charset properties even 
though it's obviously intended to use a Japanese script (since it follows the 
word "Japanese"). I'm sure there are others.
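
As a rough illustration of why the unmarked text is ambiguous, the following
Python sketch lists the code points of that string. It uses only the standard
unicodedata module; the point it makes about Han unification is general and is
not specific to how Emacs stores or applies the charset property.

```python
import unicodedata

text = "日本語"

# Each character is a unified CJK ideograph: the code point itself says
# nothing about whether a Japanese or a Chinese glyph variant is intended.
names = [unicodedata.name(ch) for ch in text]
for ch, name in zip(text, names):
    print(f"U+{ord(ch):04X}  {name}")
```

Since the code points alone do not record the intended script variant, that
information has to come from somewhere else, such as markup or the surrounding
text.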


