From: Lars Ingebrigtsen
Newsgroups: gmane.emacs.bugs
Subject: bug#51733: 27.1; Detect impossible email addresses better
Date: Wed, 19 Jan 2022 16:45:29 +0100
Message-ID: <87k0ev91ba.fsf@gnus.org>
To: Eli Zaretskii
Cc: 51733@debbugs.gnu.org, jidanni@jidanni.org
In-Reply-To: <83bl07srh9.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 19 Jan 2022 16:57:38 +0200")

Eli Zaretskii writes:

> OK, but why do you think "Сгсе.ru" is confusable?
> The SLD part is
> entirely made of single-script characters, and UTS#39 explicitly
> allows that:
>
>   [...] it can be perfectly legitimate to have scripts in a SLD
>   (second level domain) not be the same as scripts in a TLD (top-level
>   domain), such as:
>
>     Cyrillic labels in a domain name with a TLD of .ru or .рф
>
> That's your case, isn't it?

Yes, indeed.  But:

---
For some applications, it is useful to determine if a given input string
has any whole-script confusable. For example, the identifier "ѕсоре"
using Cyrillic characters would pass the single-script test described in
Section 5.2, Restriction-Level Detection, even though it is likely to be
a spoof attempt.
---

So "Сгсе.ru" is suspicious in most contexts.

> Regardless of what they are saying, I don't think the above is
> suitable for production.  I think it should be enough to see whether
> there could be confusion with the corresponding ASCII characters from
> confusables.txt.

Yes, so that's what I've done now, but...  I'd feel slightly better if I
knew what they were actually getting at.

I think they're saying that if "foo" is confusable with anything in any
other scripts, then it's suspicious?  But that sounds unworkable.  For
instance, "circle.ru" is confusable with "СігсӀе.ru", and perhaps it's
suspicious to a Russian, but I don't see how to make a workable function
from that.  Unless we start bringing in locales, and meh.

So perhaps what I've implemented now is sufficient for domains.

Anyway, I've implemented the user option and implemented this in shr, so
we'll see how that goes.  If no problems crop up, I'll announce all this
in NEWS and document it in the lispref manual tomorrow.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
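
For illustration, here is a rough sketch of the "confusable with ASCII" check discussed above. It is written in Python rather than Emacs Lisp and is not the code that went into Emacs. The table is a tiny hand-picked excerpt of confusables.txt, the function names are hypothetical, and applying the test per domain label is an assumption.

```python
# Tiny hand-picked excerpt of Unicode's confusables.txt, restricted to
# Cyrillic letters that map to a single ASCII letter.  Assumption: a real
# implementation would load the full data file instead.
ASCII_CONFUSABLES = {
    "\u0421": "C",  # CYRILLIC CAPITAL LETTER ES
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "\u0433": "r",  # CYRILLIC SMALL LETTER GHE
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u04c0": "l",  # CYRILLIC LETTER PALOCHKA
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u0455": "s",  # CYRILLIC SMALL LETTER DZE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
}


def ascii_skeleton(label):
    """Return the ASCII look-alike of LABEL, or None if some non-ASCII
    character in it has no ASCII confusable in the table."""
    out = []
    for ch in label:
        if ch.isascii():
            out.append(ch)
        elif ch in ASCII_CONFUSABLES:
            out.append(ASCII_CONFUSABLES[ch])
        else:
            return None
    return "".join(out)


def label_suspicious(label):
    """True if LABEL contains non-ASCII characters but could be read
    entirely as an ASCII string."""
    return (not label.isascii()) and ascii_skeleton(label) is not None


def domain_suspicious(domain):
    """True if any dot-separated label of DOMAIN is suspicious."""
    return any(label_suspicious(label) for label in domain.split("."))
```

With this table, "Сгсе.ru" and "СігсӀе.ru" are flagged because every Cyrillic letter in the first label has an ASCII look-alike, while plain "circle.ru" is not flagged, and neither is a label containing a letter such as "ф" that has no ASCII confusable in the table. Going in only one direction (non-ASCII toward ASCII) sidesteps the "confusable with anything in any other script" problem, at the cost of not catching spoofs of non-ASCII names.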