From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: character sets as they relate to “Raw” string literals for elisp
Date: Tue, 05 Oct 2021 15:04:15 +0300
Message-ID: <83pmsj4r0g.fsf@gnu.org>
In-Reply-To: <87a6jotszy.fsf@db48x.net> (message from Daniel Brooks on Mon, 04 Oct 2021 13:49:53 -0700)
To: Daniel Brooks
Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org

> From: Daniel Brooks
> Cc: emacs-devel@gnu.org
> Date: Mon, 04 Oct 2021 13:49:53 -0700
>
> I see that prolog-mode only gets a few commits per year (9 last year and
> 5 so far this year; the high water mark is 10 in a single year). It
> imposes a pretty minimal support burden and if it has bugs you can
> simply ignore them until a Prolog user brings you a patch, because those
> bugs can only affect Prolog users. There is a lot of code in Emacs which
> fits this description.
>
> Suppose this hypothetical contribution were a language mode for a
> Japanese programming language, and thus had the same support profile?
> Suppose also that all messages to the user have already been localized
> into English, and that there is an English alias for the mode name (that
> is, `日本-mode' toggles the mode, but there’s an alias like `ja-mode' or
> something), while the rest of the identifiers are in Japanese.
>
> Would there be any reason to turn away that contribution, or to make the
> contributor rewrite it?

I'm sorry, this is too abstract and theoretical an issue, with many
important details missing. So I don't think it will be useful to
seriously consider such a theoretical example.

> >> (defvar variable-containing-html #r「click here」)
> >
> > If we avoid non-ASCII characters, we avoid some problems, so all else
> > being equal, it's better.
>
> Hmm. If we (speaking as broadly as possible!)
> avoid a problem forever,
> how will the problem ever get fixed?

I don't think it needs fixing.

> Personally, I think that the problems are now mostly fixed. Emacs has
> very complete support for character sets, better than virtually all
> other applications. Outside of Emacs, support for Unicode is practically
> omnipresent as well. There are still notable gaps, like the Linux
> console, but they are the exception rather than the rule. I don’t think
> that there is much of a problem left to avoid!

It turns out there are more exceptions than we imagine. We just now had
another bug report, this time about the Kitty terminal emulator, which
has yet another set of issues with displaying non-ASCII characters from
Emacs. So much so that I was prompted to add an entry in etc/PROBLEMS
with some workarounds for users of Kitty. Granted, their problems are
not that they don't support recently added Unicode characters, it's that
they support them "too well". But still, it doesn't help when the
result is a messed-up display.

> I prefer to say “Linux console” in reference to the one terminal
> emulator that we know has severe problems with Unicode. There are many
> terminal emulators out there, and I’m sure a few of them have problems,
> but for the most part I think all of them can handle Unicode pretty well
> primarily because they all rely on OS libraries to do the heavy
> lifting.

Unicode is not a static target, it's a moving one. They issue a new
version of the standard twice a year, and each new version adds new
codepoints with new attributes. If a new version of Unicode adds
double-width characters, and some terminal emulator doesn't keep up, you
will have problems displaying those new codepoints. (AFAIK, that's in
essence the problem with the Linux console: they last updated when
Unicode 5.0 was released.)

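As a rough illustration of the width problem described above (a sketch
only, not how Emacs or any particular terminal computes widths): the
number of columns a character occupies is looked up in tables that ship
with one specific Unicode version, so two programs built against
different versions can disagree about the same codepoint. The helper
name `display_width` is made up for this example, and Python's
`unicodedata` module is an assumption of the sketch, not something
discussed in this thread. Control characters and emoji sequences are
deliberately ignored.

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate terminal columns from East Asian Width data alone."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks attach to the previous character and
            # take no column of their own.
            continue
        # 'W' (wide) and 'F' (fullwidth) characters occupy two columns;
        # everything else is counted as one column in this sketch.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


# The answer depends on which Unicode tables this interpreter ships with.
print("Unicode data version:", unicodedata.unidata_version)
print(display_width("ja-mode"))    # 7
print(display_width("日本-mode"))  # 9
```

A codepoint assigned after the reported data version would be treated as
narrow here, which is the same kind of mismatch that produces a
messed-up display when the application and the terminal use different
tables.
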
So it might be possible to say that many terminals support substantial portions of Unicode, but it definitely is NOT right to say that we can freely use any character we want and think they will work everywhere.