From: Daniel Brooks
Newsgroups: gmane.emacs.devel
Subject: Re: character sets as they relate to “Raw” string literals for elisp
Date: Tue, 05 Oct 2021 15:13:20 -0700
Message-ID: <87mtnnrugv.fsf_-_@db48x.net>
In-Reply-To: (Richard Stallman's message of "Tue, 05 Oct 2021 17:20:40 -0400")
To: Richard Stallman
Cc: eliz@gnu.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org

Richard Stallman writes:

> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
> > Suppose this hypothetical contribution were a language mode for a
> > Japanese programming language, and thus had the same support profile?
>
> I have to guess what a "Japanese programming language" would mean, but
> I think you're talking about a mode for editing programs written in a
> language whose symbols are meaningful in Japanese and perhaps written
> in kana and kanji.

Correct. The idea is that this hypothetical Emacs feature would be
useful primarily to people who could already read and write Japanese,
and who thus would not be inconvenienced because the software was also
written in Japanese.

> We could conceivably add such a program to Emacs, but should we?  I
> think it is not worth the trouble; I'd say, let's not.
>
> You can write and distribute the program, and people could run it.
> But we should not distribute programs we can't read.

Fair enough; thanks for answering the question!

> > I think that if I read between the lines, you are saying that the Emacs
> > project _could_ grow to become multi-lingual at all levels, with a
> > sufficient number of invested contributors who could each review and
> > maintain different parts of the code.
>
> It would be an enormous effort -- just consider translating the
> manuals.
> And updating the translations for each Emacs version.  It
> would be a big burden.

Yes, that’s certainly true; the cost of getting complete parity between
English and a second language would be significant. However, I don’t
think that the ongoing costs would be insurmountable, assuming the
project attracted additional trusted and proven maintainers along with
each additional language. A few docstrings and manual pages change in
most versions, but not enough to make it impossible to keep up.

Eli Zaretskii writes:

>> From: Daniel Brooks
>> Cc: emacs-devel@gnu.org
>> Date: Mon, 04 Oct 2021 13:49:53 -0700
>>
>> Would there be any reason to turn away that contribution, or to make the
>> contributor rewrite it?
>
> I'm sorry, this is too abstract and theoretical an issue, with many
> important details missing.  So I don't think it will be useful to
> seriously consider such a theoretical example.

That, however, is not a useful answer. :) What assumptions would you
need to make before you could answer yes?

Note that this is a purely hypothetical situation; aside from a
smattering of Latin and Greek that are useful for English etymology, I
cannot read or write any other languages. I don’t have a pile of code
written in Japanese that I’m going to spring on you if you find a way to
say yes. Instead I am looking ahead and wondering what the conditions
would have to be like 20 years from now for non-English code to start
showing up.

> It turns out there are more exceptions than we imagine.  We just now
> had another bug report, this time about Kitty terminal emulator, which
> has yet another set of issues with displaying non-ASCII characters
> from Emacs.  So much so that I was prompted to add an entry in
> etc/PROBLEMS with some workarounds for users of Kitty.  Granted, their
> problems are not that they don't support recently added Unicode
> characters, it's that they support them "too well".
> But still, it
> doesn't help when the result is a messed-up display.
>
> Unicode is not a static target, it's a moving one.  They issue a new
> version of the standard twice a year, and each new version adds new
> codepoints with new attributes.  If a new version of Unicode adds
> double-width characters, and some terminal emulator doesn't keep up,
> you will have problems displaying those new codepoints.  (AFAIK,
> that's in essence the problem with the Linux console: they last
> updated when Unicode 5.0 was released.)

That’s an interesting point. On the one hand, the fact that the Linux
console is still using Unicode 5.0 shows just how unmaintained it is
(Unicode 5.0 was released in July 2006; the next Emacs release was 22.1
in 2007). On the other hand, perhaps if problems like this keep cropping
up we will have to add encodings for older Unicode versions. People
using the Linux console could set their terminal encoding to something
like 'utf-8-unicode5.0. Characters added after that version would show
up escaped, and Emacs would know what width the terminal was going to
use for each character.

> So it might be possible to say that many terminals support substantial
> portions of Unicode, but it definitely is NOT right to say that we can
> freely use any character we want and think they will work everywhere.

So one assumption that you might make is that new source code being
added to Emacs must use characters from a version of Unicode which is
known to have wide compatibility, rather than immediately jumping to the
bleeding-edge version? That would be perfectly reasonable.

db48x
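
The version-pinned encoding idea above ('utf-8-unicode5.0) can be
illustrated with a rough sketch. This is not Emacs code and not an
existing coding system; it is a Python stand-in under stated
assumptions. Python ships only one frozen older database,
unicodedata.ucd_3_2_0 (Unicode 3.2.0), so that version plays the role
of "the version the terminal knows" in place of 5.0. The function names
are hypothetical, and the width rule (two columns for East Asian Wide
and Fullwidth, one otherwise) is a simplification that ignores
combining characters.

```python
import unicodedata

# Assumption: Unicode 3.2.0 stands in for the "old" version (the email
# talks about 5.0, which Python does not ship as a separate database).
OLD_UCD = unicodedata.ucd_3_2_0


def known_to_terminal(ch: str) -> bool:
    """True if ch was already an assigned character in the old database."""
    # "Cn" is the general category of unassigned code points.
    return OLD_UCD.category(ch) != "Cn"


def escape_for_old_terminal(text: str) -> str:
    """Replace characters unknown to the old database with ASCII escapes."""
    out = []
    for ch in text:
        if known_to_terminal(ch):
            out.append(ch)
        else:
            cp = ord(ch)
            # Hypothetical escape syntax: \uXXXX or \UXXXXXXXX.
            out.append(f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}")
    return "".join(out)


def display_width(text: str) -> int:
    """Columns the escaped text would occupy, per the old database's widths."""
    width = 0
    for ch in escape_for_old_terminal(text):
        # Simplification: Wide/Fullwidth take two columns, all else one.
        width += 2 if OLD_UCD.east_asian_width(ch) in ("W", "F") else 1
    return width
```

Because every character that reaches the width calculation is one the
old database already knows, the editor and the terminal would agree on
column counts, which is the property the paragraph above is after.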