From: Gregory Heytings
To: Eli Zaretskii
Cc: cpitclaudel@gmail.com, stefan@marxist.se, yuri.v.khan@gmail.com, db48x@db48x.net, monnier@iro.umontreal.ca, emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: Unicode confusables and reordering characters considered harmful, a simple solution
Date: Thu, 04 Nov 2021 14:10:01 +0000
Message-ID: <7699dbfaffce8f3a1f41@heytings.org>
In-Reply-To: <838ry4f3xf.fsf@gnu.org>

>>> Since when is it OK to flag characters that are used very rarely? What
>>> would be the sense of doing that? Should we perhaps flag all the
>>> Egyptian hieroglyphs for the same reason?
>>
>> The answer is above: "given that these controls can have a dangerous
>> effect".
>
> But they don't. Not more than just using RTL characters within LTR text
> would. Just revisit the example posted by Stefan (which I slightly
> modified to be more realistic):
>
>   myfun("שָׁלוֹם" ,"السّلامعليكم");
>
> Which string does this function call pass as the first argument, and
> which as the second one?
>

There is no danger in that example, and in particular nothing invisible.
The programmer must just be aware that compilers read source code files in
byte order, which might be different from the order in which the string is
displayed on screen, but is identical to the order in which one
forward-char's through the string.

There is a danger when, because the source code contains invisible control
characters, the programmer sees something on their screen, and the
compiler sees something completely different.
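
To make the distinction above concrete, here is a minimal Python sketch. It is not part of the original message and is not how Emacs or any compiler implements this. The names BIDI_CONTROLS, logical_order and find_bidi_controls are invented for the illustration, and the set of characters treated as "invisible controls" (the explicit embedding, override and isolate characters) is an assumption about which controls are meant. The first function lists characters in the order a compiler reads them, which is also the order forward-char moves through them; the second reports invisible directional controls.

```python
import unicodedata

# Assumed set of invisible directional controls: explicit embeddings,
# overrides and isolates (U+202A..U+202E, U+2066..U+2069).
# LRM/RLM/ALM are deliberately left out of this illustrative set.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def logical_order(text):
    """Describe each character in storage (logical) order."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


def find_bidi_controls(text):
    """Return (index, code point) pairs for invisible directional controls."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]


if __name__ == "__main__":
    # The escapes spell the first argument of the quoted myfun() call.
    first_argument = "\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd"
    for description in logical_order(first_argument):
        print(description)

    # Plain RTL text contains nothing invisible, so nothing is reported.
    print(find_bidi_controls('myfun("' + first_argument + '");'))

    # A hypothetical line with a hidden override and isolates is reported.
    suspicious = 'check("user\u202e \u2066// admin\u2069 \u2066");'
    print(find_bidi_controls(suspicious))
```

Under these assumptions, the myfun line produces an empty report even though its display order differs from its byte order, while the second line is flagged because it contains characters the programmer cannot see.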