From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#45660: 28.0.50; Changed word/whitespace syntax
Date: Fri, 08 Jan 2021 14:06:11 +0200
Message-ID: <83czyfk4zw.fsf@gnu.org>
References: <87im8cr5t8.fsf@mail.linkov.net> <838s98bkyy.fsf@gnu.org> <87zh1nkzyb.fsf@mail.linkov.net>
In-Reply-To: <87zh1nkzyb.fsf@mail.linkov.net> (message from Juri Linkov on Tue, 05 Jan 2021 20:20:44 +0200)
To: Juri Linkov
Cc: 45660-done@debbugs.gnu.org

> From: Juri Linkov
> Cc: 45660@debbugs.gnu.org
> Date: Tue, 05 Jan 2021 20:20:44 +0200
>
> > Previously, many characters, including u+202F, had the punctuation
> > ('.') syntax.  I modified that to be closer to the Unicode
> > Character Database (UCD), and u+202F is not a punctuation character
> > according to the UCD.  It has the Zs general category, which means
> > "space separator", the same as SPC, NBSP, EN SPACE, and others.
>
> So according to the Unicode standard it should have whitespace syntax?
>
> And indeed, I see no reason for similar characters to have different syntax:
>
>   name: NO-BREAK SPACE
>   general-category: Zs (Separator, Space)
>   syntax:    which means: whitespace
>
>   name: NARROW NO-BREAK SPACE
>   general-category: Zs (Separator, Space)
>   syntax: w  which means: word
>
> > Removing u+202F and other similar characters from the "punctuation"
> > group had the side effect of leaving it at the default 'w' syntax.
> >
> > Should we make all Zs characters have the ' ' (whitespace) syntax?
> > That should be easy, but we should try being consistent in this
> > regard.
>
> Should the word characters separated by NO-BREAK SPACE be treated as one word?
> If there is no reason to treat space characters as part of words, then all
> characters with the Zs general category could have the same whitespace syntax.

No further comments, so I've now made the change on master whereby all
characters with Zs general category are given the whitespace syntax.

I'm therefore closing this bug; please reopen if there are any left-overs
or undesired effects.
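
The UCD classification discussed above can be checked outside Emacs. The following is a minimal sketch, assuming Python's standard unicodedata module; it reports only the Unicode general category and says nothing about Emacs syntax tables. The helper name zs_report and the list CODEPOINTS are illustrative, not part of the original discussion.

```python
import unicodedata

# Characters named in the thread: SPC, NBSP, EN SPACE, NARROW NO-BREAK SPACE.
CODEPOINTS = [0x0020, 0x00A0, 0x2002, 0x202F]

def zs_report(codepoints):
    """Return (code point, character name, general category) per code point."""
    return [
        (cp, unicodedata.name(chr(cp)), unicodedata.category(chr(cp)))
        for cp in codepoints
    ]

# Print one line per character so the categories can be compared by eye.
for cp, name, category in zs_report(CODEPOINTS):
    print(f"U+{cp:04X}  {category}  {name}")
```

All four characters should report the same general category, which is the consistency argument made in the thread for giving them the same syntax class.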