From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Unicode confusables and reordering characters considered harmful, a simple solution
Date: Wed, 03 Nov 2021 21:09:49 +0200
Message-ID: <83ee7xgio2.fsf@gnu.org>
To: Yuri Khan
Cc: db48x@db48x.net, cpitclaudel@gmail.com, stefan@marxist.se, monnier@iro.umontreal.ca, emacs-devel@gnu.org

> From: Yuri Khan
> Date: Thu, 4 Nov 2021 01:45:17 +0700
> Cc: Daniel Brooks, Clément Pit-Claudel,
>  Stefan Kangas, Stefan Monnier,
>  Emacs developers
>
> On Thu, 4 Nov 2021 at 00:56, Eli Zaretskii wrote:
>
> > The problem with these remappings is that you then get to somehow
> > discern between the remapped characters and the real characters which
> > look identically on display.
>
> Real characters are fontified as whichever syntax unit they belong to.
> Remapped characters are fontified as whitespace-space-face or
> whitespace-hspace-face depending on whether you add them to
> whitespace-space-regexp or whitespace-hspace-regexp.

I just used what Daniel posted, and that doesn't display the remapped
characters in any distinct face.  Gotta tinker?

> > Also, this will disrupt alignment
>
> We already have this issue with TABs -- when a tab would expand to a
> single space, a remapped tab expands to its replacement glyph and a
> whole tab-width’s worth of spaces. Yes, it’s slightly annoying.

Yes, it's a general problem with remapping.

> > and make text using these controls
> > much harder to read.  E.g., the few places in TUTORIAL.he which use
> > those controls are barely readable after turning the above on.
>
> I tried that and I find it okay.

Do you read Hebrew?  Those characters look like line noise there,
whereas the text with the default display is perfectly readable, and
most people won't even know these controls are there (as intended).

> > Anyway, if one wants to be able to highlight certain characters on
> > display, one could also use highlight-regexp, I think.
>
> One does not only want to highlight, but also to actually see and
> distinguish certain characters

What for?  The absolute majority of people won't have any idea what
the effect of each of these controls is, or how it differs from the
others.  Even I often need to talk myself through their effect on
display.  The UBA spec weighs in at more than 30 pages of highly
technical text, and I don't expect people to memorize it.
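
For anyone who does want to see which controls a piece of text contains
without changing how Emacs displays it, here is a rough sketch in Python,
outside Emacs.  It is not something proposed in this thread.  The function
name find_bidi_controls is made up for illustration, and the set of
characters is my own selection of the directional formatting characters
(marks, embeddings, overrides and isolates); adjust it to taste.

```python
import unicodedata

# Directional formatting characters (an assumed selection, not an
# authoritative list): ALM, LRM, RLM, LRE, RLE, PDF, LRO, RLO,
# LRI, RLI, FSI, PDI.
BIDI_CONTROLS = frozenset(
    "\u061c\u200e\u200f"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)


def find_bidi_controls(text):
    """Return (index, code point, Unicode name) for each control in text."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]


if __name__ == "__main__":
    # Hypothetical sample: an override hidden inside otherwise plain text.
    sample = "abc\u202edef\u202c"
    for index, codepoint, name in find_bidi_controls(sample):
        print(index, codepoint, name)
```

This only reports where the controls are and what they are called; it says
nothing about their effect on reordering, which is the hard part.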