From: Eli Zaretskii <eliz@gnu.org>
To: Philipp Stephani <p.stephani2@gmail.com>
Cc: 25366@debbugs.gnu.org
Subject: bug#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace
Date: Fri, 06 Jan 2017 17:11:48 +0200
Message-ID: <83bmvkcjez.fsf@gnu.org>
In-Reply-To: <CAArVCkSD1Q+m3hkyCOevpFki--1syoMTCZefJ=53t-PLu-n74w@mail.gmail.com> (message from Philipp Stephani on Fri, 06 Jan 2017 15:00:22 +0000)
> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Fri, 06 Jan 2017 15:00:22 +0000
> Cc: 25366@debbugs.gnu.org
>
> http://www.unicode.org/reports/tr18/tr18-19.html#Compatibility_Properties
>
> Patches to that effect are welcome.
>
> Here's a patch.
Thanks. A few minor comments below.
> +/* Return true if C is a horizontal whitespace character, as defined
> +   by http://www.unicode.org/reports/tr18/tr18-19.html#blank. */
> +bool
> +blankp (int c)
> +{
> +  if (c == '\t')
> +    return true;
Why does this test explicitly for TAB only? What about SPC, for
example?
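For context on the question above: UTS #18 defines "blank" as the
general category Space_Separator (Zs) plus CHARACTER TABULATION. SPC
(U+0020) is itself in Zs, whereas TAB is a control character (Cc), so a
category-based check would cover SPC but would need TAB as a special
case. The rest of the quoted function is not shown here, so the sketch
below only illustrates that rule and is not the patch's code. The names
is_space_separator and blank_sketch are hypothetical, and the
hard-coded Zs list (Unicode 6.3 and later) stands in for a real
character-property lookup.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a general-category lookup: true if C is
   in category Zs (Space_Separator).  The code points are hard-coded
   for illustration only; a real implementation would consult the
   character property tables instead.  */
static bool
is_space_separator (int c)
{
  return c == 0x0020                      /* SPACE */
         || c == 0x00A0                   /* NO-BREAK SPACE */
         || c == 0x1680                   /* OGHAM SPACE MARK */
         || (c >= 0x2000 && c <= 0x200A)  /* EN QUAD .. HAIR SPACE */
         || c == 0x202F                   /* NARROW NO-BREAK SPACE */
         || c == 0x205F                   /* MEDIUM MATHEMATICAL SPACE */
         || c == 0x3000;                  /* IDEOGRAPHIC SPACE */
}

/* Sketch of the UTS #18 "blank" rule: TAB, or any Zs character.
   TAB needs the explicit test because its category is Cc, not Zs.  */
static bool
blank_sketch (int c)
{
  return c == '\t' || is_space_separator (c);
}
```

Under these assumptions, blank_sketch accepts both TAB and SPC and
rejects newline, which is vertical rather than horizontal whitespace.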
> --- a/doc/lispref/searching.texi
> +++ b/doc/lispref/searching.texi
> @@ -553,7 +553,10 @@ Char Classes
> (@pxref{Character Properties}) indicates they are alphabetic
> characters.
> @item [:blank:]
> -This matches space and tab only.
> +This matches horizontal whitespace, as defined by Unicode Technical
> +Standard #18. In particular, it matches tabs and characters whose
> +Unicode @samp{general-category} property (@pxref{Character
> +Properties}) indicates they are spacing separators.
Similarly here: I find the lack of reference to a space potentially
confusing.
> +** The regular expression character class [:blank:] now matches
> +Unicode horizontal whitespace as defined in
> +http://www.unicode.org/reports/tr18/tr18-19.html#blank.
The reference to a particular version of UTS#18 might become obsolete
when a new version is released. So I suggest providing a general
reference to the report and its section rather than an exact URL.