From: Yuri Khan
To: Eli Zaretskii
Cc: Daniel Brooks, Clément Pit-Claudel, Stefan Kangas, Stefan Monnier, Emacs developers
Newsgroups: gmane.emacs.devel
Subject: Re: Unicode confusables and reordering characters considered harmful, a simple solution
Date: Thu, 4 Nov 2021 02:35:04 +0700

On Thu, 4 Nov 2021 at 02:09, Eli Zaretskii wrote:

> Do you read Hebrew?

No. I just imagine how I’d perceive the text if I could.

> Those characters look like line noise there,
> whereas the text with the default display is perfectly readable, and
> most people won't even know these controls are there (as intended).

TUTORIAL.he is slightly special, in that both an editor and a reader[^1] use the same mode (because once in a while the user is instructed to edit some part of their copy). In most other cases, I prefer remaps turned on when I’m an editor or reviewer, and off when I’m a reader.

[^1]: Here, by “editor” and “reader” I mean the human roles, not software.

> > One does not only want to highlight, but also to actually see and
> > distinguish certain characters
>
> What for? The absolute majority of people won't have any idea what is
> the effect of each of these controls, and how it differs from others.
> Even I many times need to talk myself through their effect on display.
> The UBA spec weighs in at more than 30 pages of highly technical text,
> and I don't expect people to memorize it by heart.

Most people, when in the reader role, probably won’t and shouldn’t have to.
If I’m editing a text in a bidi language, though, I am expected to use format control characters, and so I must know where they are or are not.

In the same vein, when I edit a program expected to conform to a coding style, I must know where spaces and tabs are, so I do not introduce whitespace-only changes or trailing blanks and keep indentation consistent. Or when I edit anything that will end up as a web page I want to know which spaces and hyphens are non-breaking, so the page will wrap correctly no matter how the user resizes their window and/or zooms the page.

(No, I do not trust tools to do these things right; if they could, we would not need format control characters at all. I like tools to let me check what they did and correct if necessary.)
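
To make the "let me check" point concrete, here is a minimal Python sketch of a checker that reports where such characters sit in a piece of text. It is an illustration only, not Emacs code and not something proposed in this thread: the function name `find_invisibles` and the two character sets are assumptions chosen for the example, and the sets are deliberately not exhaustive.

```python
import unicodedata

# Illustrative (not exhaustive) sets of characters an editor may want to see.
BIDI_CONTROLS = {
    "\u200e", "\u200f",                                # LRM, RLM
    "\u061c",                                          # ALM
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}
NON_BREAKING = {"\u00a0", "\u2011", "\u202f"}  # NBSP, non-breaking hyphen, narrow NBSP


def find_invisibles(text):
    """Return (line, column, label) tuples, 1-based, for each character of interest."""
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        # Length of the line once trailing spaces and tabs are removed.
        content_len = len(line.rstrip(" \t"))
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS or ch in NON_BREAKING:
                hits.append((lineno, col, unicodedata.name(ch)))
            elif ch == "\t":
                hits.append((lineno, col, "TAB"))
            elif ch == " " and col > content_len:
                hits.append((lineno, col, "TRAILING SPACE"))
    return hits


if __name__ == "__main__":
    sample = "a\u00a0b \nx\u202ey\tz"
    for lineno, col, label in find_invisibles(sample):
        print(f"{lineno}:{col}: {label}")
```

A report like this is the reviewer-side counterpart of a display remap: it does not change how the text renders, it only tells the person editing where the controls, tabs, and non-breaking characters are so they can decide whether each one belongs there.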