From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Unicode confusables and reordering characters considered harmful, a simple solution
Date: Thu, 04 Nov 2021 11:45:26 +0200
Message-ID: <83ee7wfe4p.fsf@gnu.org>
References: <875ytag0hb.fsf@yahoo.com> <87zgqmd5np.fsf@mat.ucm.es>
 <83wnlqk3rn.fsf@gnu.org> <72dd5c2a-42c7-b12e-05ed-e93adbd89727@gmail.com>
 <83ilxajyhw.fsf@gnu.org> <83fssejxf8.fsf@gnu.org> <835ytajsv2.fsf@gnu.org>
 <831r3yjqo9.fsf@gnu.org> <83v91aibe7.fsf@gnu.org> <87o872s0wf.fsf_-_@db48x.net>
 <83lf25gm1j.fsf@gnu.org> <83ee7xgio2.fsf@gnu.org> <87fssdrp54.fsf@db48x.net>
 <831r3xgfz3.fsf@gnu.org> <87v918qx37.fsf@db48x.net> <83o870fjqg.fsf@gnu.org>
 <7699dbfaffc44df293f3@heytings.org>
In-Reply-To: <7699dbfaffc44df293f3@heytings.org> (message from Gregory Heytings on Thu, 04 Nov 2021 09:14:42 +0000)
To: Gregory Heytings
Cc: cpitclaudel@gmail.com, stefan@marxist.se, emacs-devel@gnu.org,
 db48x@db48x.net, monnier@iro.umontreal.ca, yuri.v.khan@gmail.com
Xref: news.gmane.io gmane.emacs.devel:278669

> Date: Thu, 04 Nov 2021 09:14:42 +0000
> From: Gregory Heytings
> cc: Daniel Brooks, cpitclaudel@gmail.com,
>     yuri.v.khan@gmail.com, stefan@marxist.se, monnier@iro.umontreal.ca,
>     emacs-devel@gnu.org
>
> > The mere presence of these characters is NOT the root cause.  These
> > characters are legitimate and helpful when used as intended.  See
> > TUTORIAL.he for a pertinent example.
>
> But TUTORIAL.he is not a pertinent example, because it's not a file with
> source code.
> It's a pertinent example to show that these characters do
> have legitimate uses, which is obvious.

It's a pertinent example, because it shows that these characters have
their use in human-readable text of a technical nature (which
frequently mixes RTL characters with LTR letters and punctuation).
That is exactly what happens in comments and strings which use RTL
scripts within source code.

> If you could find an actual source code file in an actual project in
> which these characters are used with their intended purpose, it
> would be a pertinent example.

Why do you need me to find actual source code which uses those
controls?  Isn't it clear that any human-readable text in comments and
strings in a program's source code can and will use these controls?
How does the tutorial text that explains technical stuff related to a
computer program differ from what a programmer could wish to write in
a comment or a string in his/her program?

Would it be enough if I wrote such source code myself and showed it to
you?  That would be an invented example, but so are the examples in
the paper that brought up this subject, so how is that different?

> Otherwise it is safe and reasonable to assume (as the Rust
> developers did) that the mere presence of these characters in source
> code files is a potential problem and must be flagged as such.

It's easy, that's for sure.  Reasonable it isn't.  Nor is it safe,
because any user who does want these characters used legitimately will
quickly turn off that warning for good.  So it works for the Rust
developers to tick a checkbox, but it isn't a solution for the problem.
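
To make the "mere presence" approach concrete, here is a minimal Python
sketch of such a check.  The set of code points (the explicit
embedding, override and isolate controls) and the function name
find_bidi_controls are assumptions made for illustration; this is not
claimed to be what Rust or any other tool actually implements.

```python
import unicodedata

# Explicit bidi formatting controls: embeddings, overrides and isolates.
# Which characters to include is an assumption made for this sketch.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE RLE PDF LRO RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI RLI FSI PDI
}


def find_bidi_controls(text):
    """Return a (line, column, name) tuple for every bidi control in text.

    Lines and columns are 1-based.  The check looks only at presence;
    it does not ask whether the controls are paired or used as intended.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((lineno, col, unicodedata.name(ch)))
    return hits


if __name__ == "__main__":
    # A comment that wraps a word in a properly closed RLI ... PDI pair.
    sample = "x = 1\n# \u2067abc\u2069 ok\n"
    for lineno, col, name in find_bidi_controls(sample):
        print(f"{lineno}:{col}: {name}")
```

A check like this reports a properly paired isolate around an RTL
phrase in a comment exactly as it would report a stray override, which
is why a user who has legitimate uses for these characters is likely
to disable it.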