From: Lars Ingebrigtsen
Newsgroups: gmane.emacs.bugs
Subject: bug#51733: 27.1; Detect impossible email addresses better
Date: Wed, 19 Jan 2022 14:55:35 +0100
Message-ID: <87zgnrakyw.fsf@gnus.org>
In-Reply-To: <83zgnuuucu.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 17 Jan 2022 19:48:01 +0200")
To: Eli Zaretskii
Cc: 51733@debbugs.gnu.org, jidanni@jidanni.org

Eli Zaretskii writes:

> I think we should first determine what kinds of applications may need
> this, and take it from there.  The initial number of "confusability
> with" classes can be very small, and we can add more as we discover
> interesting use cases.
> The full number is pretty much infinite, I think, but I'm not sure
> Emacs needs to support all of them OOTB.  We could support some of
> the popular ones, and provide infrastructure for developing more.

Yes.  I was thinking about this bit, which isn't implemented yet
(although the utility functions for it basically are).

----
The process of determining suspect usage of whole-script confusables is
more complicated than simply looking at the scripts of the labels in a
domain name. For example, it can be perfectly legitimate to have scripts
in a SLD (second level domain) not be the same as scripts in a TLD
(top-level domain), such as:

    Cyrillic labels in a domain name with a TLD of .ru or .рф
    Chinese labels in a domain name with a TLD of .com.au or .com
    Cyrillic labels that aren’t confusable with Latin with a TLD of .com.au or .com

The following high-level algorithm can be used to determine all scripts
that contain a whole-script confusable with a string X:

    Consider Q, the set of all strings confusable with X.
    Remove all strings from Q whose resolved script set is ∅ or ALL (that is, keep only single-script strings plus those with characters only in Common).
    Take the union of the resolved script sets of all strings remaining in Q.

As usual, this algorithm is intended only as a definition;
implementations should use an optimized routine that produces the same
result.
----

I'm not sure I understand the algorithm they're proposing.  I think this
shouldn't be suspicious?  But I may be wrong:

(textsec-domain-suspicious-p "Сгсе.рф")
=> nil

But this should be, but isn't currently:

(textsec-domain-suspicious-p "Сгсе.ru")
=> nil

Now,

(textsec-ascii-confusable-p "Сгсе.ru")
=> t

and

(textsec-ascii-confusable-p "Сгсе.рф")
=> nil

Is that what they mean here?  I'm not finding the logic overly clear
here.
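One possible reading of the quoted definition, as a minimal sketch.  It
is written in Python rather than Emacs Lisp so that it does not depend
on the unfinished textsec code.  The script table and the confusable
classes are tiny hand-written assumptions, not the real Unicode data,
and every function name is hypothetical:

```python
from itertools import product

ALL = None  # stands in for the "ALL" script set (only Common characters seen)

# Toy tables, assumed for illustration only; NOT the real Unicode data.
SCRIPT = {
    "\u0421": "Cyrillic",  # CYRILLIC CAPITAL LETTER ES (looks like C)
    "\u0433": "Cyrillic",  # CYRILLIC SMALL LETTER GHE (looks like r)
    "\u0441": "Cyrillic",  # CYRILLIC SMALL LETTER ES  (looks like c)
    "\u0435": "Cyrillic",  # CYRILLIC SMALL LETTER IE  (looks like e)
    "C": "Latin", "r": "Latin", "c": "Latin", "e": "Latin",
    "\u4e2d": "Han",
    ".": "Common",
}
CONFUSABLE_CLASSES = [
    {"\u0421", "C"}, {"\u0433", "r"}, {"\u0441", "c"}, {"\u0435", "e"},
]


def resolved_script_set(string):
    """Intersect the scripts of all characters; Common matches every script."""
    result = ALL
    for ch in string:
        script = SCRIPT[ch]
        if script == "Common":
            continue
        result = {script} if result is ALL else result & {script}
    return result


def confusable_strings(x):
    """Q: every string made by swapping characters for their look-alikes.

    In this sketch Q includes X itself, so X's own script shows up in
    the final result.
    """
    choices = []
    for ch in x:
        cls = next((c for c in CONFUSABLE_CLASSES if ch in c), {ch})
        choices.append(sorted(cls))
    return {"".join(combo) for combo in product(*choices)}


def whole_script_confusable_scripts(x):
    """Union of resolved script sets over Q, skipping the empty set and ALL."""
    scripts = set()
    for candidate in confusable_strings(x):
        rss = resolved_script_set(candidate)
        if rss is ALL or not rss:
            continue  # mixed-script (empty set) or Common-only (ALL)
        scripts |= rss
    return scripts


label = "\u0421\u0433\u0441\u0435"  # the Cyrillic label from the examples
print(whole_script_confusable_scripts(label))
```

With these toy tables the Cyrillic label resolves to both Cyrillic and
Latin, while a label with no look-alikes in the table resolves only to
its own script.  How that result should be combined with the script of
the TLD is a separate policy question that this sketch does not answer.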
-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no