From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Disambiguate modeline character for UTF-8?
Date: Wed, 05 Jul 2023 14:41:34 +0300
Message-ID: <83cz16k2kx.fsf@gnu.org>
References: <83wo1p73d2.fsf@gnu.org> <6ccde339-2bf1-3a4d-61bb-734046bf02d5@cs.ucla.edu> <83r1rx6vgv.fsf@gnu.org> <83lfi56te9.fsf@gnu.org>
In-Reply-To: (message from Ulrich Mueller on Wed, 05 Jul 2023 12:08:59 +0200)
To: Ulrich Mueller
Cc: emacs-devel@gnu.org, drew.adams@oracle.com, eggert@cs.ucla.edu, monnier@iro.umontreal.ca
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:307456

> From: Ulrich Mueller
> Cc: Drew Adams, Eli Zaretskii,
>  eggert@cs.ucla.edu, Stefan Monnier
> Date: Wed, 05 Jul 2023 12:08:59 +0200
>
> >>>>> On Mon, 24 Aug 2020, Ulrich Mueller wrote:
>
> >>>>> On Mon, 24 Aug 2020, Drew Adams wrote:
> >> I'll just say this, as some have suggested that
> >> one main thing they want is to be able to easily
> >> and quickly tell whether the encoding is NOT
> >> utf-8 (and not ASCII, presumably):
>
> >> The characters "u" and "U" are not so easily
> >> distinguished. You might want to pick some
> >> other, quite different looking, character for
> >> the non-UTF-8 (i.e., UTF-16 etc.).
>
> > Another idea: Since "-" is used for ASCII, maybe use "+" for UTF-8?
> > This would be visually unobtrusive, so any uncommon coding system would
> > stand out against it.
>
> Coming back to this thread (which at the time ended in bikeshedding).
> The goal I had in mind was to disambiguate UTF-8, i.e.
> a unique modeline character would be used for it. Currently this is not
> the case:
>
>   U -- utf-8* (all variants)
>   U -- utf-16* (all variants)
>   U -- utf-7
>   U -- koi8-u
>
> So, I propose to change this to either:
>
>   + -- utf-8* (all variants)
>   (everything else unchanged)
>
> or:
>
>   U -- utf-8* (all variants)
>   u -- utf-16* (all variants)
>   u -- utf-7
>   K -- koi8-u

TBH, I don't like to change such long-time features. The only real
problem is between UTF-8 and UTF-16, since the others are hardly ever
used these days. UTF-16 is also quite rarely used, basically only on
MS-Windows for system-level files.

So is this really a problem that we need to solve, at the risk of
breaking people's "muscle" memory? If I see the lower-case "u" on the
modeline when I expect to see "U" instead, I'd be surprised. Is it
worth it?