From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Unicode confusables and reordering characters considered harmful
Date: Tue, 02 Nov 2021 21:36:22 +0200
Message-ID: <831r3yjqo9.fsf@gnu.org>
To: Stefan Monnier
Cc: cpitclaudel@gmail.com, stefan@marxist.se, emacs-devel@gnu.org
In-Reply-To: (message from Stefan Monnier on Tue, 02 Nov 2021 15:12:56 -0400)

> From: Stefan Monnier
> Cc: Stefan Kangas, cpitclaudel@gmail.com,
>  emacs-devel@gnu.org
> Date: Tue, 02 Nov 2021 15:12:56 -0400
>
> > You cannot see those characters on a screenshot, for the same reason
> > you cannot see any whitespace characters on a screenshot: they are
> > only discernible when you move cursor through them.  Which is why I
> > asked how are you looking for them.  If you are looking for them in a
> > screenshot, you will never see them.
>
> But that's the core of the vulnerability: if you just look at the screen
> (and just scroll through it) you will have an incorrect understanding of
> what the code does.

If you want a more prominent display, customize
glyphless-char-display-control to show format-control characters as
acronyms, say, or as hex-code.
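
For instance, something along these lines in an init file should do it.
This is only a sketch: it assumes the alist has its usual format-control
entry, and it goes through customize-set-variable rather than setq
because the option's setter has to run for the change to take effect.

```elisp
;; Sketch: show format-control characters as acronyms instead of the
;; default thin space.
(let ((control (copy-alist glyphless-char-display-control)))
  ;; Use 'hex-code here instead to see the code point in hex.
  (setf (alist-get 'format-control control) 'acronym)
  ;; customize-set-variable runs the option's :set function, which
  ;; refreshes the glyphless display settings; a plain setq would not.
  (customize-set-variable 'glyphless-char-display-control control))
```

The same change can be made interactively with M-x customize-variable.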

And anyway, my point was that Emacs deviates from Unicode here, which
says not to show these controls at all, and by deviating it gives the
user some defense against these problems.  I did say originally the
defense was "weak", so if you want to point out that this is a weak
defense, you are preaching to the choir.

> It's good that such bidi override chars are displayed as a thin space,
> but it's mostly useful to make it possible to edit them (or to `C-x =`
> on them), but I don't think it makes a significant difference in terms of
> the security issues introduced by the presence of those chars in the code.

In most cases, there's no need to make these controls stand out,
because situations where they present security risks are extremely
rare, to put it mildly.  OTOH, having them stand out more by default
will make it harder to read text with completely legitimate uses of
these controls (example: TUTORIAL.he).  Therefore, IMNSHO it's okay to
have this off by default (and have a way of turning it on in case of
increased paranoia).

Moreover, I think adding features that detect suspicious uses of
these controls will serve our users better than just showing the
controls more prominently, because detection will have a much lower
probability of false positives, and will avoid getting in the way of
reading legitimate text which uses these controls for valid reasons.