From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: master ce63f91025: Add textsec functions for verifying email addresses
Date: Tue, 18 Jan 2022 20:42:35 +0200
Message-ID: <8335lkvqas.fsf@gnu.org>
References: <164250841214.433.17670666873471731764@vcs2.savannah.gnu.org>
 <20220118122012.7A3A4C0DA1B@vcs2.savannah.gnu.org>
 <87ee555fy8.fsf@yahoo.com>
In-Reply-To: <87ee555fy8.fsf@yahoo.com> (message from Po Lu on Tue, 18 Jan 2022 21:30:39 +0800)
To: Po Lu
Cc: larsi@gnus.org, emacs-devel@gnu.org

> From: Po Lu
> Cc: Lars Ingebrigtsen
> Date: Tue, 18 Jan 2022 21:30:39 +0800
>
> Lars Ingebrigtsen writes:
>
> > +(defun textsec-name-suspicious-p (name)
> > +  "Say whether NAME looks suspicious.
> > +NAME is (for instance) the free-text name from an email address.
> > +
> > +If it suspicious, nil is returned. If it is, a string explaining
> > +the problem is returned."
> > +  (cond
> > +   ((not (equal name (ucs-normalize-NFC-string name)))
> > +    (format "`%s' is not in normalized format `%s'"
> > +            name (ucs-normalize-NFC-string name)))
> > +   ((seq-find (lambda (char)
> > +                (and (member char bidi-control-characters)
> > +                     (not (member char
> > +                                  '( ?\N{left-to-right mark}
> > +                                     ?\N{right-to-left mark}
> > +                                     ?\N{arabic letter mark})))))
> > +              name)
> > +    (format "The string contains bidirectional control characters"))
> > +   ((textsec-suspicious-nonspacing-p name))))
>
> I thought the consensus from the last discussion about this subject was
> to use `bidi-find-overridden-directionality' for this kind of thing, to
> avoid false positives with legitimate use of bidirectional control
> characters.

Yes, using the Unicode security guidelines would produce unnecessary
false positives. Which could be OK for paranoid minds, I guess, who
are afraid of any bidi controls, even if they don't actually affect
the display order. Like in this example:

  "אבגד ⁧שונה⁩ מרגיל"

I do hope we will eventually offer separate functions to do that with
fewer false positives (or a way of customizing these textsec functions
to do that).
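
To make the two approaches concrete, here is a rough sketch, not the
textsec code itself. The `sketch-' names and the hard-coded list of
controls are made up for illustration; only
`bidi-find-overridden-directionality' and `seq-find' are existing
functions. The first predicate flags any bidi control at all, which is
the strict reading of the guidelines; the second asks the display
engine whether any strong character actually had its directionality
overridden.

```elisp
(require 'seq)

;; Hypothetical list for this sketch only; the quoted patch uses the
;; variable `bidi-control-characters' instead.
(defconst sketch-bidi-controls
  '(#x061c #x200e #x200f                 ; ALM, LRM, RLM
    #x202a #x202b #x202c #x202d #x202e   ; LRE, RLE, PDF, LRO, RLO
    #x2066 #x2067 #x2068 #x2069)         ; LRI, RLI, FSI, PDI
  "Bidi control characters considered by the sketch below.")

(defun sketch-any-bidi-control-p (string)
  "Return the first bidi control character found in STRING, or nil.
This is the strict check: every control counts, whether or not it
changes the display order."
  (seq-find (lambda (char) (memql char sketch-bidi-controls)) string))

(defun sketch-overridden-p (string)
  "Return the index of the first overridden character in STRING, or nil.
This defers to the display engine, which looks for strong directional
characters whose directionality was forced by override controls."
  (bidi-find-overridden-directionality 0 (length string) string))
```

One could then compare the two on the example above and on a string
with a real override, for instance:

  (sketch-any-bidi-control-p "אבגד \u2067שונה\u2069 מרגיל")
  (sketch-overridden-p "abc \u202edef\u202c ghi")

The first call flags the isolate controls merely for being present,
which is the kind of false positive under discussion.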