From: Lars Ingebrigtsen
Newsgroups: gmane.emacs.bugs
Subject: bug#51733: 27.1; Detect impossible email addresses better
Date: Mon, 17 Jan 2022 21:22:58 +0100
Message-ID: <87sftm3ye5.fsf@gnus.org>
In-Reply-To: <83r196uqni.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 17 Jan 2022 21:08:01 +0200")
To: Eli Zaretskii
Cc: 51733@debbugs.gnu.org

I'm not quite sure I understand this bit here

https://www.unicode.org/reports/tr39/#Confusable_Detection

---
For an input string
X, define skeleton(X) to be the following transformation on the string:

    Convert X to NFD format, as described in [UAX15].
    Concatenate the prototypes for each character in X according to the
    specified data, producing a string of exemplar characters.
    Reapply NFD.
---

I mean, that sounds OK in and of itself, but then:

---
X and Y are single-script confusables if and only if they are
confusable, and their resolved script sets have at least one element in
common.

Examples: “ǉeto” and “ljeto” in Latin (the Croatian word for “summer”),
where the first word uses only four codepoints, the first of which is
U+01C9 (ǉ) LATIN SMALL LETTER LJ.
---

But:

(ucs-normalize-NFD-string "ǉeto")
=> "ǉeto"

So according to that algo "ǉeto" and "ljeto" are not confusable.  But if
we use NFKD instead, they are:

(ucs-normalize-NFKD-string "ǉeto")
=> "ljeto"

It seems unlikely to be a typo in this document, surely?  But NFKD seems
to make a whole lot more sense than NFD for this usage.  I must be
missing or misreading something.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
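
For anyone who wants to poke at this outside Emacs, here is a rough Python sketch of the three quoted steps, next to the plain NFD/NFKD comparison from above. It is written in Python rather than Emacs Lisp only so it can be run standalone. The `skeleton` function and the one-entry `PROTOTYPES` table are hypothetical names made up for illustration; in particular, the entry mapping U+01C9 to "lj" is an assumption about what the "specified data" might contain, not something verified against the actual data file.

```python
import unicodedata

# Hypothetical one-entry stand-in for the "specified data" in the quoted text.
# ASSUMPTION: that the real prototype data maps U+01C9 to "lj"; not verified.
PROTOTYPES = {"\u01c9": "lj"}


def skeleton(s, prototypes=PROTOTYPES):
    """Sketch of the quoted steps: NFD, map to prototypes, NFD again."""
    decomposed = unicodedata.normalize("NFD", s)
    mapped = "".join(prototypes.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)


word = "\u01c9eto"  # the four-code-point spelling from the quoted example

# Same comparison as the two ucs-normalize calls in the message.
nfd = unicodedata.normalize("NFD", word)    # canonical decomposition only
nfkd = unicodedata.normalize("NFKD", word)  # also applies compatibility mappings

print(len(nfd), len(nfkd))
print(skeleton(word) == skeleton("ljeto"))      # with the assumed table entry
print(skeleton(word, {}) == skeleton("ljeto"))  # with no prototype data at all
```

With an empty table the function reduces to NFD alone and the two words stay distinct, which matches the observation above. If the data does carry an entry like the assumed one, it would be the middle step, not the choice of normalization form, that makes the two words compare equal.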