From: Ulrich Mueller
Newsgroups: gmane.emacs.devel
Subject: Re: Disambiguate modeline character for UTF-8?
Date: Wed, 05 Jul 2023 15:04:08 +0200
To: Eli Zaretskii
Cc: Ulrich Mueller, emacs-devel@gnu.org, drew.adams@oracle.com, eggert@cs.ucla.edu, monnier@iro.umontreal.ca
References: <83wo1p73d2.fsf@gnu.org> <6ccde339-2bf1-3a4d-61bb-734046bf02d5@cs.ucla.edu> <83r1rx6vgv.fsf@gnu.org> <83lfi56te9.fsf@gnu.org> <83cz16k2kx.fsf@gnu.org>
In-Reply-To: <83cz16k2kx.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 05 Jul 2023 14:41:34 +0300")

>>>>> On Wed, 05 Jul 2023, Eli Zaretskii wrote:

>> Coming back to this thread (which at the time ended in bikeshedding).
>> The goal I had in mind was to disambiguate UTF-8, i.e. a unique modeline
>> character would be used for it. Currently this is not the case:
>>
>>    U -- utf-8* (all variants)
>>    U -- utf-16* (all variants)
>>    U -- utf-7
>>    U -- koi8-u
>>
>> So, I propose to change this to either:
>>
>>    + -- utf-8* (all variants)
>>    (everything else unchanged)
>>
>> or:
>>
>>    U -- utf-8* (all variants)
>>    u -- utf-16* (all variants)
>>    u -- utf-7
>>    K -- koi8-u

> TBH, I don't like to change such long-time features.

> The only real problem is between UTF-8 and UTF-16, since the others
> are hardly ever used these days. UTF-16 is also quite rarely used,
> basically only on MS-Windows for system-level files. So is this
> really a problem that we need to solve, at the risk of breaking
> people's "muscle" memory? If I see the lower-case "u" on the
> modeline when I expect to see "U" instead, I'd be surprised. Is it
> worth it?

UTF-8 is one of the most common encodings, and it is strange that it
shares its modeline indicator with anything else.
And the "U" is really ambiguous, because context won't help (or how would you decide if a buffer's file encoding is e.g. koi8-u or utf-8?). As you say, the others in the above list are rarely used nowadays. So, maybe users should see the "u" or the "K" to indicate that the file has an unusual encoding?
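
To make the ambiguity concrete, here is a small Python sketch that models the three mnemonic tables from the quoted proposal and reports which coding systems share a modeline character with a given one. The table contents are copied from the lists in this thread, not read from Emacs; the names CURRENT, PROPOSED_PLUS, PROPOSED_CASE and shares_with are hypothetical and exist only for this illustration.

```python
# Modeline mnemonics as listed in the quoted proposal.
# Assumption: each "utf-8*" / "utf-16*" family is collapsed to one entry.
CURRENT = {"utf-8": "U", "utf-16": "U", "utf-7": "U", "koi8-u": "U"}

# First alternative: only utf-8 changes, everything else stays "U".
PROPOSED_PLUS = {**CURRENT, "utf-8": "+"}

# Second alternative: utf-8 keeps "U", the others move to "u" or "K".
PROPOSED_CASE = {"utf-8": "U", "utf-16": "u", "utf-7": "u", "koi8-u": "K"}


def shares_with(table, coding_system):
    """Return the other coding systems that show the same mnemonic."""
    mnemonic = table[coding_system]
    return sorted(
        name
        for name, char in table.items()
        if char == mnemonic and name != coding_system
    )


if __name__ == "__main__":
    for label, table in [
        ("current", CURRENT),
        ("proposed (+)", PROPOSED_PLUS),
        ("proposed (case)", PROPOSED_CASE),
    ]:
        print(label, "utf-8 shares its mnemonic with:", shares_with(table, "utf-8"))
```

Under either proposed table, utf-8 no longer shares its character with any other entry, while in the second alternative utf-16 and utf-7 still share "u" with each other.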