From: Gregory Heytings
Newsgroups: gmane.emacs.devel
Subject: Re: Unicode confusables and reordering characters considered harmful, a simple solution
Date: Fri, 05 Nov 2021 23:32:30 +0000
Message-ID: <5ad1d47cbd16faa03b2a@heytings.org>
To: Stefan Monnier
Cc: cpitclaudel@gmail.com, stefan@marxist.se, yuri.v.khan@gmail.com, db48x@db48x.net, Eli Zaretskii, emacs-devel@gnu.org

>> There is a danger when, because the source code contains invisible
>> control characters, the programmer sees something on their screen, and
>> the compiler sees something completely different.
>
> You mean there is a special kind of danger coming from the invisible
> control characters because they can make code render unexpectedly even
> though all the rendered chars are "familiar" (e.g. all-ASCII)?
>
> That's a good point.
>

Indeed, that's what I mean.
Or rather, that's what the authors of the "Trojan Source" paper mean. And given that the legitimate uses of these invisible control characters in source code are exceedingly rare (I still haven't seen a single real-life case), making them visible by default makes sense. Just like we make no-break spaces visible by default.
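
To make the idea concrete, here is a minimal sketch of what "making them visible" could look like. It is written in Python purely for illustration and is not how Emacs does or would do it. The function names are hypothetical, and the set of characters is an assumption: it covers only the explicit embedding, override and isolate controls (U+202A to U+202E and U+2066 to U+2069).

```python
import unicodedata

# Assumed set of "invisible reordering" characters: the explicit bidi
# embedding, override and isolate controls. Other invisible characters
# (e.g. U+200E/U+200F marks) are deliberately left out of this sketch.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def find_bidi_controls(text):
    """Return (line, column, code point) for each bidi control in text.

    Lines and columns are 1-based; columns count characters, not bytes.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((lineno, col, "U+%04X" % ord(ch)))
    return hits


def reveal_bidi_controls(text):
    """Return text with each bidi control replaced by a visible tag."""
    out = []
    for ch in text:
        if ch in BIDI_CONTROLS:
            # Show the code point and its Unicode name instead of the
            # invisible character itself.
            out.append("<U+%04X %s>" % (ord(ch), unicodedata.name(ch)))
        else:
            out.append(ch)
    return "".join(out)
```

With something like this, a line that looks like plain ASCII on screen but contains a U+202E would show up with an explicit marker at the place where the reordering starts, which is the kind of default visibility I'm arguing for.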