From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#51733: 27.1; Detect impossible email addresses better
Date: Mon, 17 Jan 2022 19:48:01 +0200
Message-ID: <83zgnuuucu.fsf@gnu.org>
In-Reply-To: <875yqi5kk7.fsf@gnus.org> (message from Lars Ingebrigtsen on Mon, 17 Jan 2022 18:38:48 +0100)
To: Lars Ingebrigtsen
Cc: 51733@debbugs.gnu.org, jidanni@jidanni.org

> From: Lars Ingebrigtsen
> Cc: 51733@debbugs.gnu.org, jidanni@jidanni.org
> Date: Mon, 17 Jan 2022 18:38:48 +0100
>
> I'm looking at the Confusable section now.
>
> https://www.unicode.org/reports/tr39/#Confusable_Detection
>
> Looks easy enough to implement (and the ELPA package already does the
> parsing, so I'll be reusing bits from that).
>
> But... I'm wondering what the higher level interface would be? I mean,
> quite a lot of strings are confusable with something else, but which
> ones are interesting? The only thing that seems immediately interesting
> to check for is whether a string is confusable with ASCII?
>
> That is,
>
> (textsec-confusable-with-ascii-p "C𝗂𝗋𝖼𝗅𝖾")
> => t
>
> Because the ASCII characters are the ones that people rely on when doing
> ... things, like email and browsing the web.
>
> But I mean, "C𝗂𝗋𝖼𝗅𝖾" is confusable with "БігсӀС" (the latter is
> Cyrillic), and if you're writing Russian, that might also be
> interesting.
> So perhaps a
>
> (textsec-confusable-with-script-p "C𝗂𝗋𝖼𝗅𝖾" 'cyrillic)
> => t
>
> ? But... I'm not sure in which contexts that would actually be vital
> to know. Hm.

I think we should first determine what kinds of applications may need
this, and take it from there. The initial number of "confusability
with" classes can be very small, and we can add more as we discover
interesting use cases. The full number is pretty much infinite, I
think, but I'm not sure Emacs needs to support all of them OOTB. We
could support some of the popular ones, and provide infrastructure for
developing more.
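
To make the "small set of classes plus infrastructure" idea concrete, here is a rough sketch, written in Python rather than Emacs Lisp only so that it is easy to run standalone. Everything in it is an assumption for illustration: the names `PROTOTYPES`, `register_class` and `confusable_with_p` are hypothetical and not part of textsec, the prototype table is a tiny hand-picked sample rather than the UTS #39 confusables data, and the Cyrillic test uses character names as a crude stand-in for a real script lookup.

```python
import unicodedata

# Hypothetical, hand-picked prototype table for illustration only.
# It is NOT the UTS #39 confusables data, which a real implementation
# would load from the Unicode data files.
PROTOTYPES = {
    "\U0001D5C2": "i",  # MATHEMATICAL SANS-SERIF SMALL I
    "\U0001D5CB": "r",  # MATHEMATICAL SANS-SERIF SMALL R
    "\U0001D5BC": "c",  # MATHEMATICAL SANS-SERIF SMALL C
    "\U0001D5C5": "l",  # MATHEMATICAL SANS-SERIF SMALL L
    "\U0001D5BE": "e",  # MATHEMATICAL SANS-SERIF SMALL E
    "\u0421": "C",      # CYRILLIC CAPITAL LETTER ES
    "\u0456": "i",      # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "\u0433": "r",      # CYRILLIC SMALL LETTER GHE
    "\u0441": "c",      # CYRILLIC SMALL LETTER ES
    "\u04c0": "l",      # CYRILLIC LETTER PALOCHKA
    "\u0435": "e",      # CYRILLIC SMALL LETTER IE
}


def prototype(ch):
    """Return the prototype of CH; unlisted characters are their own prototype."""
    return PROTOTYPES.get(ch, ch)


def is_cyrillic(ch):
    """Crude stand-in for a script lookup: inspect the Unicode character name."""
    return unicodedata.name(ch, "").startswith("CYRILLIC")


# Registry of "confusability with" classes: name -> membership predicate.
# Only ASCII is built in; other classes are added on demand.
CLASSES = {"ascii": str.isascii}


def register_class(name, member_p):
    """Add a new target class, identified by a per-character predicate."""
    CLASSES[name] = member_p


def confusable_with_p(text, class_name):
    """Return True if TEXT is not written in the class but could pass for it."""
    member_p = CLASSES[class_name]
    if all(member_p(ch) for ch in text):
        return False  # already written in the target class, nothing to flag

    def coverable(ch):
        # CH can pass for the class if it belongs to it, if its prototype
        # does, or if some member of the class shares its prototype.
        proto = prototype(ch)
        return (member_p(ch) or member_p(proto)
                or any(member_p(k) and p == proto
                       for k, p in PROTOTYPES.items()))

    return all(coverable(ch) for ch in text)


register_class("cyrillic", is_cyrillic)

# "C" followed by mathematical sans-serif "ircle".
sample = "C\U0001D5C2\U0001D5CB\U0001D5BC\U0001D5C5\U0001D5BE"
print(confusable_with_p(sample, "ascii"))     # True
print(confusable_with_p(sample, "cyrillic"))  # True
print(confusable_with_p("Circle", "ascii"))   # False
```

The point of the registry is that the built-in set can start with ASCII alone, while an application that cares about another class registers its own predicate instead of waiting for Emacs to ship it.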