From: Daniel Brooks
Newsgroups: gmane.emacs.devel
Subject: character sets as they relate to “Raw” string literals for elisp
Date: Mon, 04 Oct 2021 08:36:40 -0700
Message-ID: <87k0isu7hz.fsf_-_@db48x.net>
References: <4209edd83cfee7c84b2d75ebfcd38784fa21b23c.camel@crossproduct.net> <87v92ft9z6.fsf@db48x.net> <87o885tyle.fsf@db48x.net> <83k0it6lu5.fsf@gnu.org>
In-Reply-To: <83k0it6lu5.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 04 Oct 2021 15:00:50 +0300")
To: Eli Zaretskii
Cc: anna@crossproduct.net, rms@gnu.org, emacs-devel@gnu.org

Eli Zaretskii writes:

> We can only do this much. We don't develop any terminal emulators
> here, except the two built into Emacs.

I was referring broadly to the whole GNU project, not trying to assign
the work specifically to the Emacs project. :) I was even pondering what
it would take to do the work myself, now that Rust is allowed in kernel
modules…

> Given that even the Linux console turns out to have staggering gaps in
> its support for Unicode, I see no reason for us to pretend Unicode is
> supported well enough on the terminals to ignore this issue.

The Linux console is not representative of most terminal emulators. It
is neglected and rarely used, since it is intended only as a fall–back
in case X Windows (or sshd) fails to start. Ideally we should fix it
(again speaking broadly), but we (Emacs) shouldn’t limit ourselves to
only what it can support.

>> For example, if someone contributes a mode it will normally be accepted
>> as–is. But if they write that mode using Japanese characters, would we
>> turn them away? I think that we should not.
>
> Why is Japanese different from any other script in this context?

It isn’t; I simply picked one at random.

> I think unnecessary use of non-ASCII characters, any non-ASCII
> characters, should be avoided, for the reasons mentioned above. See
> bug#50865 for a recent example that left me astonished.

I think that your suggestion to set the terminal-coding-system to
latin-1 or us-ascii on the Linux console is the right one.
Perhaps that ought to be the default behavior when Emacs detects that it
is running in the Linux console, even if the LANG variable indicates
that we should be using utf-8. Or perhaps Emacs should instead issue a
warning in that case, since for all we know the Linux console could be
fixed next week.

But in any case, back to my question: Suppose our hypothetical
contributor wanted to contribute a new mode with this type of code in
it:

  (defun 日本 ()
    (message "日本"))

That is, all of the identifiers in the source code for this mode are
named in some horrible foreign script that you cannot read. Is it so
much more unreadable if it sometimes has to be displayed like this?

  (defun \u65E5\u672C ()
    (message "\u65E5\u672C"))

More to the point, do we turn away this contributor or ask them to
rewrite their code? My preference is that we simply accept the
contribution as–is.

If we could see our way to accepting such code, then I don’t see why we
couldn’t accept code that uses Unicode in much smaller ways, such as
this:

  (defvar variable-containing-html #r｢click here｣)

db48x

PS: it occurs to me to wonder if my use of Unicode in the prose of this
message, outside of the examples, detracted from its readability in any
way?
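
A minimal sketch of the Linux console fallback discussed above, written
as an init-file fragment rather than as a change to Emacs itself. It
assumes the console reports its terminal type as "linux"; the function
name my/linux-console-use-latin-1 is hypothetical, while tty-type,
set-terminal-coding-system and tty-setup-hook are existing Emacs
facilities:

  ;; Sketch only: assumes the Linux console reports its type as "linux".
  (defun my/linux-console-use-latin-1 ()
    "On the Linux console, encode terminal output as Latin-1."
    (when (equal (tty-type) "linux")
      (set-terminal-coding-system 'latin-1)))

  ;; `tty-setup-hook' runs after a text terminal has been initialized.
  (add-hook 'tty-setup-hook #'my/linux-console-use-latin-1)

Replacing the set-terminal-coding-system call with a call to
display-warning would give the warn-only variant instead.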