From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#51733: 27.1; Detect impossible email addresses better
Date: Sun, 16 Jan 2022 20:14:08 +0200
Message-ID: <838rvfy2dr.fsf@gnu.org>
References: <87czn8etuz.7.fsf@jidanni.org> <87tugkkfid.fsf@gnus.org>
 <83fss44rbn.fsf@gnu.org> <875yt0ipre.fsf@gnus.org>
 <834k8k3za3.fsf@gnu.org> <874k8jfloo.fsf@gnus.org>
 <83zgqb18hq.fsf@gnu.org> <87zgqb6tcm.fsf@gnus.org>
 <87tue3y96e.fsf@gnus.org> <83fspny8fp.fsf@gnu.org>
 <87pmory859.fsf@gnus.org> <83ee57y7xz.fsf@gnu.org>
 <87lezfy70y.fsf@gnus.org> <83czkry6jw.fsf@gnu.org>
 <87h7a3y5no.fsf@gnus.org>
Cc: 51733@debbugs.gnu.org, jidanni@jidanni.org
To: Lars Ingebrigtsen
In-Reply-To: <87h7a3y5no.fsf@gnus.org> (message from Lars Ingebrigtsen on
 Sun, 16 Jan 2022 18:03:23 +0100)

> From: Lars Ingebrigtsen
> Cc: 51733@debbugs.gnu.org, jidanni@jidanni.org
> Date: Sun, 16 Jan 2022 18:03:23 +0100
>
> https://www.unicode.org/reports/tr24/tr24-32.html#Scripts_and_Blocks
>
>   As a result, using the block names as simplistic substitute for
>   script identity generally leads to poor results.
>
> It looks like we're doing that, though?

No, not really.  We collect various blocks of the same scripts
together.

> And indeed:
>
> (elt char-script-table #xAB65)
> => latin
>
> which is wrong, because that's
>
> GREEK LETTER SMALL CAPITAL OMEGA
>
> So we should be populating char-script-table from
> http://www.unicode.org/Public/UCD/latest/ucd/Scripts.txt instead of
> Blocks.txt.  So I'll be doing that, too.

Beware: the Unicode Script property is not identical to ours!  Before
throwing away what we have, please consider how many deviations we
have in practice, and if they are just a few, let's fix only them
individually.  It's easy.

You will have to add some manual heuristics even if you do use the
Unicode Scripts.txt as the basis.
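
One way to count the deviations in practice is sketched below in Python
rather than Emacs Lisp, so it runs standalone.  It is only an
illustration under stated assumptions: SAMPLE_SCRIPTS_TXT is a
two-line excerpt written in the Scripts.txt format (check it against
the real file), LOCAL_TABLE is a hypothetical stand-in for
char-script-table, and parse_scripts and deviations are made-up helper
names.  Lowercasing the script name is a simplification; mapping
Unicode script names onto ours would probably need the manual
heuristics mentioned above.

```python
# Illustrative excerpt in Scripts.txt format (assumed; consult the real file).
SAMPLE_SCRIPTS_TXT = """\
# Sample data, not the full file.
AB60..AB64    ; Latin # L&   [5] LATIN SMALL LETTER SAKHA YAT..LATIN SMALL LETTER INVERTED ALPHA
AB65          ; Greek # L&       GREEK LETTER SMALL CAPITAL OMEGA
"""

# Hypothetical stand-in for char-script-table: a block-derived table that
# labels the whole range as latin, matching the U+AB65 result reported above.
LOCAL_TABLE = {cp: "latin" for cp in range(0xAB60, 0xAB66)}


def parse_scripts(text):
    """Return a sorted list of (start, end, script) from Scripts.txt-style text."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not data:
            continue
        codepoints, script = (field.strip() for field in data.split(";"))
        start, _, end = codepoints.partition("..")
        # Lowercasing is a simplification of the real name mapping.
        ranges.append((int(start, 16), int(end or start, 16), script.lower()))
    return sorted(ranges)


def deviations(ranges, local_table):
    """Return (codepoint, local_script, unicode_script) for each disagreement."""
    found = []
    for start, end, script in ranges:
        for cp in range(start, end + 1):
            local = local_table.get(cp)
            if local is not None and local != script:
                found.append((cp, local, script))
    return found


if __name__ == "__main__":
    for cp, local, uni in deviations(parse_scripts(SAMPLE_SCRIPTS_TXT), LOCAL_TABLE):
        print(f"U+{cp:04X}: local={local} unicode={uni}")
```

If the resulting list is short, the entries can be fixed one by one
instead of regenerating the whole table.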