all messages for Emacs-related lists mirrored at yhetil.org
* bug#34391: 26.1; [[:cntrl:]] does not match DEL contrary to documentation
@ 2019-02-08 20:49 Mattias Engdegård
       [not found] ` <handler.34391.B.154965904215015.ack@debbugs.gnu.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Mattias Engdegård @ 2019-02-08 20:49 UTC (permalink / raw)
  To: 34391

Unlike every other regexp engine, including POSIX regexps, and contrary to the documentation, [[:cntrl:]] does not match DEL (\177) in Emacs. (It does not match the C1 controls in U+0080..U+009F either, but at least the documentation makes no such claim.)
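
A minimal sketch for reproducing the report, which can be evaluated in a scratch buffer. The helper name `bug34391-cntrl-p` is hypothetical and only for illustration; `string-match-p` and `string` are standard Emacs Lisp functions.

```elisp
;; Sketch: probe which characters the [:cntrl:] class matches.
;; `bug34391-cntrl-p' is a hypothetical helper, not part of Emacs.
(defun bug34391-cntrl-p (char)
  "Return t if CHAR, as a one-character string, matches [[:cntrl:]]."
  (and (string-match-p "[[:cntrl:]]" (string char)) t))

(list (bug34391-cntrl-p ?\t)     ; TAB, code 9
      (bug34391-cntrl-p ?\177)   ; DEL, code 127
      (bug34391-cntrl-p #x85))   ; U+0085, one of the C1 controls
;; Per the behaviour reported above for Emacs 26.1, only the first
;; element should be non-nil: TAB matches, DEL and U+0085 do not.
```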

Assuming that it is not worth breaking existing code by changing the behaviour, let us at least fix the manual which says:

‘[:cntrl:]’
     This matches any ASCII control character.

which is inaccurate. The error also made it into the doc string of `rx'.
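
The `rx' doc string is affected because the `control' (or `cntrl') form translates to the same bracket class, so it behaves the same way. A short sketch, assuming only the stock `rx' macro:

```elisp
(require 'rx)

;; The `control' form expands to the POSIX-style class string.
(rx control)                          ; "[[:cntrl:]]"

;; Consequently DEL is not matched through `rx' either,
;; per the behaviour described in this report.
(string-match-p (rx control) "\177")  ; expected to return nil
```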







* bug#34391: Acknowledgement (26.1; [[:cntrl:]] does not match DEL contrary to documentation)
       [not found] ` <handler.34391.B.154965904215015.ack@debbugs.gnu.org>
@ 2019-02-08 21:04   ` Mattias Engdegård
  2019-02-08 21:58     ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Mattias Engdegård @ 2019-02-08 21:04 UTC (permalink / raw)
  To: 34391

[-- Attachment #1: Type: text/plain, Size: 229 bytes --]

Proposed patch, again assuming that the [:cntrl:] behaviour cannot be modified. Is that true?
It may have come up before since a comment in a test makes explicit reference to it (see patch), but I cannot find any discussion.


[-- Attachment #2: 0001-Document-that-cntrl-does-not-match-DEL-Bug-34391.patch --]
[-- Type: application/octet-stream, Size: 2428 bytes --]

From 3835d9fb86821d14425cf888bff61ccaba0c0a36 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Fri, 8 Feb 2019 21:51:13 +0100
Subject: [PATCH] Document that [:cntrl:] does not match DEL (Bug#34391)

* doc/lispref/searching.texi (Character Classes):
* lisp/emacs-lisp/rx.el (rx):
Document that [:cntrl:] excludes DEL.
* test/src/regex-emacs-tests.el (regex-tests-PTESTS-whitelist):
Swap misplaced comments and fix wrong code for DEL.
---
 doc/lispref/searching.texi    | 2 +-
 lisp/emacs-lisp/rx.el         | 2 +-
 test/src/regex-emacs-tests.el | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 05fc328205..eeaef8a49a 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -559,7 +559,7 @@ tabs, and other characters whose Unicode @samp{general-category}
 property (@pxref{Character Properties}) indicates they are spacing
 separators.
 @item [:cntrl:]
-This matches any @acronym{ASCII} control character.
+This matches any @acronym{ASCII} control character except DEL.
 @item [:digit:]
 This matches @samp{0} through @samp{9}.  Thus, @samp{[-+[:digit:]]}
 matches any digit, as well as @samp{+} and @samp{-}.
diff --git a/lisp/emacs-lisp/rx.el b/lisp/emacs-lisp/rx.el
index 8b4551d0d3..a3d8c1eb10 100644
--- a/lisp/emacs-lisp/rx.el
+++ b/lisp/emacs-lisp/rx.el
@@ -964,7 +964,7 @@ CHAR
      matches 0 through 9.
 
 `control', `cntrl'
-     matches ASCII control characters.
+     matches ASCII control characters except DEL.
 
 `hex-digit', `hex', `xdigit'
      matches 0 through 9, a through f and A through F.
diff --git a/test/src/regex-emacs-tests.el b/test/src/regex-emacs-tests.el
index e84af6b131..9a40316573 100644
--- a/test/src/regex-emacs-tests.el
+++ b/test/src/regex-emacs-tests.el
@@ -555,11 +555,11 @@ differences in behavior.")
 
 (defconst regex-tests-PTESTS-whitelist
   [
-   ;; emacs doesn't barf on weird ranges such as [b-a], but simply
-   ;; fails to match
+   ;; emacs doesn't see DEL (0x7f) as a [:cntrl:] character
    138
 
-   ;; emacs doesn't see DEL (0x78) as a [:cntrl:] character
+   ;; emacs doesn't barf on weird ranges such as [b-a], but simply
+   ;; fails to match
    168
   ]
   "Line numbers in the PTESTS test that should be skipped.  These
-- 
2.17.2 (Apple Git-113)



* bug#34391: Acknowledgement (26.1; [[:cntrl:]] does not match DEL contrary to documentation)
  2019-02-08 21:04   ` bug#34391: Acknowledgement (26.1; [[:cntrl:]] does not match DEL contrary to documentation) Mattias Engdegård
@ 2019-02-08 21:58     ` Eli Zaretskii
  2019-02-10  9:56       ` Mattias Engdegård
  0 siblings, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2019-02-08 21:58 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 34391

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Fri, 8 Feb 2019 22:04:32 +0100
> 
> Proposed patch, again assuming that the [:cntrl:] behaviour cannot be modified. Is that true?
> It may have come up before since a comment in a test makes explicit reference to it (see patch), but I cannot find any discussion.

I think you don't find any discussion because in some quarters this
behavior is the only one that makes sense: "ASCII control characters"
is interpreted as "ASCII characters whose codepoints are below 32
decimal".  Which is why I prefer to amend the documentation to say
that, instead of excluding DEL explicitly.

Thanks.
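
The "codepoints below 32" reading can be checked against the regexp engine directly. The following is a sketch under that assumption; the function name `bug34391-cntrl-codes` is hypothetical, while `cl-loop` and `number-sequence` are standard.

```elisp
(require 'cl-lib)

;; `bug34391-cntrl-codes' is a hypothetical helper for illustration.
(defun bug34391-cntrl-codes ()
  "Return the codes in 0..127 whose character matches [[:cntrl:]]."
  (cl-loop for code from 0 to 127
           when (string-match-p "[[:cntrl:]]" (string code))
           collect code))

;; If the class covers exactly the codes 0..31, this comparison is
;; non-nil; DEL (127) would have to appear in the list to break it.
(equal (bug34391-cntrl-codes) (number-sequence 0 31))
```

If the comparison holds, describing the class as "characters whose code is in the range 0 to 31" states the actual behaviour without relying on any particular definition of "ASCII control character".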






* bug#34391: Acknowledgement (26.1; [[:cntrl:]] does not match DEL contrary to documentation)
  2019-02-08 21:58     ` Eli Zaretskii
@ 2019-02-10  9:56       ` Mattias Engdegård
  2019-02-10 15:19         ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Mattias Engdegård @ 2019-02-10  9:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 34391

[-- Attachment #1: Type: text/plain, Size: 623 bytes --]

8 feb. 2019 kl. 22.58 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> I think you don't find any discussion because in some quarters this
> behavior is the only one that makes sense: "ASCII control characters"
> is interpreted as "ASCII characters whose codepoints are below 32
> decimal".  Which is why I prefer to amend the documentation to say
> that, instead of excluding DEL explicitly.

Well, DEL has been a control character for more than half a century now, but you are right: documentation should be maximally clear to everyone, not just to those who think the way we think they ought to.
Would this patch do?

[-- Attachment #2: 0001-PATCH-Document-that-cntrl-does-not-match-DEL-Bug-343.patch --]
[-- Type: application/octet-stream, Size: 2447 bytes --]

From a7db422fa867d65cb2800ad8cd0f9acb18e10f64 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Sun, 10 Feb 2019 10:39:00 +0100
Subject: [PATCH] [PATCH] Document that [:cntrl:] does not match DEL
 (Bug#34391)

* doc/lispref/searching.texi (Character Classes):
* lisp/emacs-lisp/rx.el (rx):
Document that [:cntrl:] excludes DEL.
* test/src/regex-emacs-tests.el (regex-tests-PTESTS-whitelist):
Swap misplaced comments and fix wrong code for DEL.
---
 doc/lispref/searching.texi    | 2 +-
 lisp/emacs-lisp/rx.el         | 2 +-
 test/src/regex-emacs-tests.el | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 05fc328205..cfbd2449b1 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -559,7 +559,7 @@ tabs, and other characters whose Unicode @samp{general-category}
 property (@pxref{Character Properties}) indicates they are spacing
 separators.
 @item [:cntrl:]
-This matches any @acronym{ASCII} control character.
+This matches any character whose code is in the range 0--31.
 @item [:digit:]
 This matches @samp{0} through @samp{9}.  Thus, @samp{[-+[:digit:]]}
 matches any digit, as well as @samp{+} and @samp{-}.
diff --git a/lisp/emacs-lisp/rx.el b/lisp/emacs-lisp/rx.el
index 8b4551d0d3..3fa0204a1a 100644
--- a/lisp/emacs-lisp/rx.el
+++ b/lisp/emacs-lisp/rx.el
@@ -964,7 +964,7 @@ CHAR
      matches 0 through 9.
 
 `control', `cntrl'
-     matches ASCII control characters.
+     matches any character whose code is in the range 0-31.
 
 `hex-digit', `hex', `xdigit'
      matches 0 through 9, a through f and A through F.
diff --git a/test/src/regex-emacs-tests.el b/test/src/regex-emacs-tests.el
index e84af6b131..9a40316573 100644
--- a/test/src/regex-emacs-tests.el
+++ b/test/src/regex-emacs-tests.el
@@ -555,11 +555,11 @@ differences in behavior.")
 
 (defconst regex-tests-PTESTS-whitelist
   [
-   ;; emacs doesn't barf on weird ranges such as [b-a], but simply
-   ;; fails to match
+   ;; emacs doesn't see DEL (0x7f) as a [:cntrl:] character
    138
 
-   ;; emacs doesn't see DEL (0x78) as a [:cntrl:] character
+   ;; emacs doesn't barf on weird ranges such as [b-a], but simply
+   ;; fails to match
    168
   ]
   "Line numbers in the PTESTS test that should be skipped.  These
-- 
2.17.2 (Apple Git-113)



* bug#34391: Acknowledgement (26.1; [[:cntrl:]] does not match DEL contrary to documentation)
  2019-02-10  9:56       ` Mattias Engdegård
@ 2019-02-10 15:19         ` Eli Zaretskii
  2019-02-10 22:42           ` Mattias Engdegård
  0 siblings, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2019-02-10 15:19 UTC (permalink / raw)
  To: Mattias Engdegård; +Cc: 34391

> From: Mattias Engdegård <mattiase@acm.org>
> Date: Sun, 10 Feb 2019 10:56:07 +0100
> Cc: 34391@debbugs.gnu.org
> 
> > I think you don't find any discussion because in some quarters this
> > behavior is the only one that makes sense: "ASCII control characters"
> > is interpreted as "ASCII characters whose codepoints are below 32
> > decimal".  Which is why I prefer to amend the documentation to say
> > that, instead of excluding DEL explicitly.
> 
> Well, DEL has been a control character for more than half a century now, but you are right: documentation should be maximally clear to everyone, not just to those who think the way we think they ought to.
> Would this patch do?

Yes, LGTM.  Thanks.  (Be sure to mention the bug number in the log
message.)






* bug#34391: Acknowledgement (26.1; [[:cntrl:]] does not match DEL contrary to documentation)
  2019-02-10 15:19         ` Eli Zaretskii
@ 2019-02-10 22:42           ` Mattias Engdegård
  0 siblings, 0 replies; 6+ messages in thread
From: Mattias Engdegård @ 2019-02-10 22:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 34391

10 feb. 2019 kl. 16.19 skrev Eli Zaretskii <eliz@gnu.org>:
> 
> Yes, LGTM.  Thanks.  (Be sure to mention the bug number in the log
> message.)

Yes, done.






