From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Unicode confusables and reordering characters considered harmful
Date: Wed, 03 Nov 2021 08:20:01 -0400
To: Gregory Heytings
Cc: Stefan Kangas, Eli Zaretskii, Clément Pit-Claudel, emacs-devel@gnu.org
In-Reply-To: <11d5fecb443892de13b1@heytings.org> (Gregory Heytings's message of "Wed, 03 Nov 2021 11:31:37 +0000")

> AFAIK, these specific characters are not necessary to write comments and
> strings in these languages.  Here are two random file which use RTL strings
> and comments, and in which these characters are not used:

I was more worried about the fact that, while highlighting those chars
might be helpful to warn about accidental uses of them, if attackers
want to trick the reader, I'm pretty sure they can get similar results
without having to use those special LTR/RTL override chars:

    int hi = 5;
    int שָׁלוֹם = hi;
    int hello = 10;
    int السّلامعليك = hello;
    myfun(שָׁלוֹם ,السّلامعليكم)

There's no override here, but did I call `myfun` with args 5 and 10 or
did I call it with args 10 and 5?

[ OK, admittedly, for a bidi-idiot like me, it looks like neither since
  the Arabic shaping of the two occurrences of the identifier actually
  looks different (and I truly have no clue why that is here), so I'm led
  to believe that the second is a reference to a non-existing variable ;-) ]


        Stefan
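
The point above can be checked mechanically. What follows is a minimal
Python sketch, not anything Emacs does: it decodes the identifiers from
the quoted-printable bytes of this message as archived, confirms that the
call contains no explicit bidi control characters, and lists the
characters with a strong right-to-left class, which are enough on their
own for the display order to differ from the logical order. The helper
names (`decode_qp`, `has_bidi_controls`, `strong_rtl_chars`,
`codepoints`) and the set chosen for `BIDI_CONTROLS` are assumptions made
for illustration. Comparing code points also suggests that, in the
archived bytes, the identifier in the call carries one more trailing
letter (U+0645) than the declared one, which may be why the two
occurrences render differently.

```python
import quopri
import unicodedata

# Explicit bidi formatting characters (an illustrative choice): ALM, LRM, RLM,
# the embedding/override controls U+202A..U+202E and the isolates U+2066..U+2069.
BIDI_CONTROLS = {
    chr(c)
    for c in (0x061C, 0x200E, 0x200F, *range(0x202A, 0x202F), *range(0x2066, 0x206A))
}


def decode_qp(text):
    """Decode a quoted-printable ASCII string into UTF-8 text."""
    return quopri.decodestring(text.encode("ascii")).decode("utf-8")


def has_bidi_controls(s):
    """True if s contains any explicit bidi formatting character."""
    return any(ch in BIDI_CONTROLS for ch in s)


def strong_rtl_chars(s):
    """Characters whose bidi class is strongly right-to-left (R or AL)."""
    return [ch for ch in s if unicodedata.bidirectional(ch) in ("R", "AL")]


def codepoints(s):
    """Render s as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)


# Identifiers taken from the quoted-printable bytes of the archived message.
HEBREW = decode_qp("=D7=A9=D6=B8=D7=81=D7=9C=D7=95=D6=B9=D7=9D")
DECLARED = decode_qp(
    "=D8=A7=D9=84=D8=B3=D9=91=D9=84=D8=A7=D9=85=D8=B9=D9=84=D9=8A=D9=83"
)
USED = decode_qp(
    "=D8=A7=D9=84=D8=B3=D9=91=D9=84=D8=A7=D9=85=D8=B9=D9=84=D9=8A=D9=83=D9=85"
)

# The call in logical (memory) order, as a compiler would read it.
CALL = f"myfun({HEBREW} ,{USED})"

if __name__ == "__main__":
    print("bidi controls in call:", has_bidi_controls(CALL))
    print("strong RTL characters:", len(strong_rtl_chars(CALL)))
    print("declared:", codepoints(DECLARED))
    print("used:    ", codepoints(USED))
    print("same identifier:", DECLARED == USED)
```

A check like `has_bidi_controls` only catches explicit override or
embedding characters; `strong_rtl_chars` is the part that would flag a
line like the `myfun` call, at the cost of also flagging every legitimate
use of Hebrew or Arabic identifiers.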