From: Paul Eggert <eggert@cs.ucla.edu>
To: Eli Zaretskii <eliz@gnu.org>
Cc: handa@gnu.org, monnier@iro.umontreal.ca, Emacs-devel@gnu.org
Subject: Re: etc/HELLO markup etc.
Date: Sat, 22 Dec 2018 11:41:05 -0800
Message-ID: <d8999440-2406-e0df-7fed-948d8b8a3f8d@cs.ucla.edu>
In-Reply-To: <838t0iasju.fsf@gnu.org>
Eli Zaretskii wrote:
> If Han unification is the only important user of the charset property,
> then yes, we could remove the rest of the charset info from HELLO.
Yes, that's the case.
> the current HELLO just keeps the information
> that was there before recoding it in UTF-8, nothing was added.
Sure, but the non-Han markup is merely a relic of that file's old method of
encoding, which avoided Unicode and instead used ISO 2022 escape sequences to
switch among various 8- and 16-bit encodings, as that was the only way to show
text in (say) Russian under the constraints of the old method. The non-Han
markup is completely unnecessary now that the file uses UTF-8. (The Han markup
probably isn't needed either, though I also would like Handa's opinion on that.)
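As a rough illustration of the difference, here is a minimal sketch in Python.
It is not taken from etc/HELLO: the sample string is made up, and Python's
iso2022_jp codec merely stands in for the ISO 2022-based encoding the file used
to have. The point is only that an ISO 2022 byte stream carries escape sequences
to switch character sets, whereas the UTF-8 stream needs no switching at all.

```python
# Illustrative only: this sample text is an assumption, not the content of etc/HELLO.
text = "Hello 日本語"

# ISO-2022-JP is stateful: an escape sequence switches from ASCII to
# JIS X 0208 before the Japanese characters, and another switches back.
iso2022 = text.encode("iso2022_jp")

# UTF-8 encodes every character directly, with no mode switching.
utf8 = text.encode("utf-8")

ESC = b"\x1b"
has_escapes_iso2022 = ESC in iso2022
has_escapes_utf8 = ESC in utf8

# Both byte streams decode back to the same text.
round_trips = iso2022.decode("iso2022_jp") == text and utf8.decode("utf-8") == text

print(iso2022)
print(utf8)
print(has_escapes_iso2022, has_escapes_utf8, round_trips)
```

Under the old scheme the escape sequences themselves identified the character
set of each run of text, which is roughly the information the non-Han charset
markup preserves today.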
>> Although the etc/HELLO markup might be of interest to those who care about
>> annotating languages in the text, it's irrelevant to the ordinary purpose of
>> that file, which is to show textual translations of "Hello"
>
> That's not the original purpose of that file. The purpose is to show
> scripts, not languages, and to show how we display different scripts
> in the same buffer.
OK, but either way the non-Han markup is irrelevant to the ordinary purpose of
the file.
>> It's still not a good user interface, though, as it is difficult to see the
>> markup's effect when visiting etc/HELLO in the usual way
>
> If the usual way is via find-file and its ilk, then you should see the
> same results as with "C-h h", so I'm not sure I understand what you
> mean here.
I meant that one cannot see the markup's effect when visiting the file with
either C-h h or find-file in the usual way. It's useless markup.
> In what way most of what you say is not applicable to etc/enriched.txt
> in general?
Other forms of enriched-text markup are typically easily visible. If I visit
etc/enriched.txt I can easily see which parts are marked white on blue
background, which parts are marked italic, etc. Invisible enriched-text markup
is much harder to deal with when editing an enriched-text file.
>> the file is not a good showroom for how to maintain multilingual
>> text.
>
> What other facilities are you aware of or can suggest for showing
> multilingual text with such level of detail and precision?
In practice the most common and often the best way to deal with the situation is
to do what the non-markup part of etc/HELLO is already doing: indicate within
the text itself what language or script is being used, to help the reader who
may be unacquainted with them, and with enough punctuation within the text so
that the reader can easily see what's going on. This technique has been used for
centuries; it's by far the most popular technique in common practice today, and
it suffices for this particular application (with the possible exception of its
Chinese and Japanese text).
>> It's not a good sign that there seem to be errors in the
>> possibly-useful (i.e., CJ) markup that nobody has noticed since the
>> markup was introduced in May, and that I noticed these errors now
>> only because I was visiting the file literally.
>
> Which errors? I don't think we discovered any errors.
Yes, and that's the point! The approach we're taking is not good for dealing
with the situation.
One example of such an error is that "日本語" has no charset properties even
though it's obviously intended to use a Japanese script (since it follows the
word "Japanese"). I'm sure there are others.