unofficial mirror of emacs-devel@gnu.org 
From: Daniel Brooks <db48x@db48x.net>
To: Stefan Monnier <monnier@iro.umontreal.ca>, Eli Zaretskii <eliz@gnu.org>
Cc: emacs-devel@gnu.org
Subject: Re: character sets as they relate to “Raw” string literals for elisp
Date: Mon, 04 Oct 2021 13:49:53 -0700
Message-ID: <87a6jotszy.fsf@db48x.net>
In-Reply-To: <jwvpmskkb4s.fsf-monnier+emacs@gnu.org> (Stefan Monnier's message of "Mon, 04 Oct 2021 12:34:19 -0400")

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Daniel Brooks <db48x@db48x.net>
>> Cc: emacs-devel@gnu.org,  rms@gnu.org,  anna@crossproduct.net
>> Date: Mon, 04 Oct 2021 08:36:40 -0700
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> > We can only do this much.  We don't develop any terminal emulators
>> > here, except the two built into Emacs.
>> 
>> I was referring broadly to the whole GNU project, not trying to assign
>> the work specifically to the Emacs project. :)
>
> Then this is not necessarily the best place to raise these issues.

I was replying directly to RMS concerning his statement about non–ASCII
characters. RMS is known to have opinions with a wider scope than will
fit in any single mailing list, and I was responding in kind. I
apologize for using “we” so broadly without thinking; that kind of
usage is confusing, and I should have been much more explicit.

>> Suppose our hypothetical contributor wanted to contribute a new mode
>> with this type of code in it:
>> 
>>     (defun 日本 () (message "日本"))
>
> It would be very inconvenient to have such code.

Absolutely! Possibly almost as inconvenient as having to learn some
English in order to develop the thing. But it doesn’t answer my
question.

I see that prolog-mode only gets a few commits per year (9 last year and
5 so far this year; the high water mark is 10 in a single year). It
imposes a pretty minimal support burden and if it has bugs you can
simply ignore them until a Prolog user brings you a patch, because those
bugs can only affect Prolog users. There is a lot of code in Emacs which
fits this description.

Suppose this hypothetical contribution were a language mode for a
Japanese programming language, and thus had the same support profile?
Suppose also that all messages to the user have already been localized
into English, and that there is an English alias for the mode name (that
is, `日本-mode' toggles the mode, but there’s an alias like `ja-mode' or
something), while the rest of the identifiers are in Japanese.
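
A minimal sketch of what such a contribution might look like, assuming
it is written as a major mode derived from `prog-mode'. The helper
`日本-挨拶' and its message text are hypothetical names chosen only for
illustration; only `日本-mode' and `ja-mode' come from the description
above.

```elisp
;;; -*- lexical-binding: t; coding: utf-8 -*-

;; Hypothetical major mode whose canonical name is Japanese.
(define-derived-mode 日本-mode prog-mode "日本"
  "Major mode for a hypothetical Japanese programming language.")

;; ASCII-only alias, so the mode can be enabled without a Japanese
;; input method.
(defalias 'ja-mode #'日本-mode
  "ASCII alias for `日本-mode'.")

;; Internal identifier in Japanese; the user-visible message is
;; already localized into English.
(defun 日本-挨拶 ()
  "Show a greeting in the echo area."
  (interactive)
  (message "Hello from %s" "日本-mode"))
```

With this in place, `M-x ja-mode' and `M-x 日本-mode' run the same
command.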

Would there be any reason to turn away that contribution, or to make the
contributor rewrite it?

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> FWIW, I consider this case quite different from your raw-string case,
> because here the main issue for me is whether the code is maintainable
> and reviewable by someone else.  So, in the context of Emacs, GNU
> ELPA, and NonGNU ELPA, I find such uses problematic.  If I could count
> on having someone else I trust do the reviewing, I might reconsider.

If I read between the lines, I think you are saying that the Emacs
project _could_ grow to become multi–lingual at all levels, given a
sufficient number of invested contributors who could each review and
maintain different parts of the code, and also that, like Eli, you would
find it inconvenient or problematic in the short term. Is that a fair
reading?

> We have that where it's inevitable (like in some packages that define
> features specific to some languages), but even there we prefer to use
> the likes of \u672c instead of the literal characters.  At the very
> least, that avoids the problem with not having a suitable font to
> display them.

As an aside, I think that this is a sensible enough choice, though I
would prefer a more automatic solution. That is, I would rather rely on
individual viewers of the source code to tweak their Emacs settings to
present the source differently than rely on contributors to type the
codepoint numbers directly. As you suggested in bug#50865, changing
the encoding will automatically render those characters as their
codepoint numbers, which is nicer than forcing a human to type them in
before committing. This has the advantage of working on identifiers as
well as string literals.
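
To illustrate the difference, here is a small sketch using only standard
reader syntax. It assumes the file is read as UTF-8. The escaped and
literal spellings denote the same string and the same character, but
the `\u' escape is only available inside string and character
literals; a symbol name has to be written literally or built with
`intern'.

```elisp
;;; -*- lexical-binding: t; coding: utf-8 -*-

;; Escaped and literal spellings read as the same string.
(string= "\u65e5\u672c" "日本")

;; The same holds for character literals: all three are U+672C.
(= ?\u672c ?本 #x672c)

;; For an identifier, the escape has to go through a string first.
(eq (intern "\u65e5\u672c") '日本)
```

All three forms evaluate to `t'.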

>> If we could see our way to accepting such code, then I don’t see why we
>> couldn’t accept code that uses Unicode in much smaller ways, such as
>> this:
>> 
>>     (defvar variable-containing-html #r「<a href="foo.html">click here</a>」)
>
> If we avoid non-ASCII characters, we avoid some problems, so all else
> being equal, it's better.

Hmm. If we (speaking as broadly as possible!) avoid a problem forever,
how will the problem ever get fixed?

Personally, I think that the problems are now mostly fixed. Emacs has
very complete support for character sets, better than virtually all
other applications. Outside of Emacs, support for Unicode is practically
omnipresent as well. There are still notable gaps, like the Linux
console, but they are the exception rather than the rule. I don’t think
that there is much of a problem left to avoid!

>> PS: it occurs to me to wonder if my use of Unicode in the prose of this
>> message, outside of the examples, detracted from its readability in any
>> way?
>
> If someone is reading this on a text-mode terminal, it could.

I am asking if anyone reading my messages, either this one or any of the
last dozen I have sent to the list, has noticed any specific
problems. I have used non–ASCII characters in all of them. I’m wondering
if anyone even noticed. If nobody noticed, or if they didn’t detract
from readability, then it is unlikely that Unicode is a problem in
general.

Yuri Khan <yuri.v.khan@gmail.com> writes:

> On Tue, 5 Oct 2021 at 01:58, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> If someone is reading this on a text-mode terminal, it could.
>
> We should probably invent a term more accurate than “text-mode
> terminal” for things that fail to display text.

True! :D

I prefer to say “Linux console” in reference to the one terminal
emulator that we know has severe problems with Unicode. There are many
terminal emulators out there, and I’m sure a few of them have problems,
but for the most part I think all of them can handle Unicode pretty well,
primarily because they all rely on OS libraries to do the heavy
lifting. The Linux console is handicapped in this area primarily because
it is inside the kernel, and thus cannot dynamically load libharfbuzz
and libfreetype. (But I can imagine a hypothetical future kernel module
which statically links against them in order to provide a full–featured
terminal in the console.)

db48x


