unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#58992: 28.2; "lax space matching" no longer works
@ 2022-11-03 16:53 Vincent Lefevre
  2022-11-03 17:04 ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-03 16:53 UTC (permalink / raw)
  To: 58992


The Emacs manual says:

15.9 Lax Matching During Searching
==================================
[...]
   By default, search commands perform “lax space matching”: each space,
or sequence of spaces, matches any sequence of one or more whitespace
characters in the text.  (Incremental regexp search has a separate
default; see *note Regexp Search::.)  Hence, ‘foo bar’ matches
‘foo bar’, ‘foo  bar’, ‘foo   bar’, and so on (but not ‘foobar’).  More
[...]

This works with GNU Emacs 27, but not with GNU Emacs 28.2
(tested under Debian/unstable).

To reproduce with emacs -Q, consider a file with

ab
cd
ab cd

and search for "b c" with

  C-s b c

Only the occurrence on the 3rd line is found; the "b", newline, "c"
sequence spanning the first two lines is not.


In GNU Emacs 28.2 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.34, cairo version 1.16.0)
 of 2022-10-29, modified by Debian built on x86-conova-01
Windowing system distributor 'The X.Org Foundation', version 11.0.12101004
System Description: Debian GNU/Linux bookworm/sid

Configured using:
 'configure --build x86_64-linux-gnu --prefix=/usr
 --sharedstatedir=/var/lib --libexecdir=/usr/libexec
 --localstatedir=/var/lib --infodir=/usr/share/info
 --mandir=/usr/share/man --with-libsystemd --with-pop=yes
 --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/28.2/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/28.2/site-lisp:/usr/share/emacs/site-lisp
 --with-sound=alsa --without-gconf --with-mailutils
 --with-native-compilation --build x86_64-linux-gnu --prefix=/usr
 --sharedstatedir=/var/lib --libexecdir=/usr/libexec
 --localstatedir=/var/lib --infodir=/usr/share/info
 --mandir=/usr/share/man --with-libsystemd --with-pop=yes
 --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/28.2/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/28.2/site-lisp:/usr/share/emacs/site-lisp
 --with-sound=alsa --without-gconf --with-mailutils
 --with-native-compilation --with-cairo --with-x=yes
 --with-x-toolkit=gtk3 --with-toolkit-scroll-bars 'CFLAGS=-g -O2
 -ffile-prefix-map=/build/emacs-SkJhIb/emacs-28.2+1=.
 -fstack-protector-strong -Wformat -Werror=format-security -Wall'
 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' LDFLAGS=-Wl,-z,relro'

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES
NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND THREADS TIFF
TOOLKIT_SCROLL_BARS X11 XDBE XIM XPM GTK3 ZLIB

Important settings:
  value of $LC_COLLATE: POSIX
  value of $LC_CTYPE: C.UTF-8
  value of $LC_TIME: en_DK.utf8
  value of $LANG: POSIX
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  display-time-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
/usr/share/emacs/site-lisp/llvm-10/llvm-mode hides /usr/share/emacs/site-lisp/llvm-11/llvm-mode
/usr/share/emacs/site-lisp/llvm-10/emacs hides /usr/share/emacs/site-lisp/llvm-11/emacs
/usr/share/emacs/site-lisp/llvm-10/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-11/tablegen-mode
/usr/share/emacs/site-lisp/llvm-10/llvm-mode hides /usr/share/emacs/site-lisp/llvm-12/llvm-mode
/usr/share/emacs/site-lisp/llvm-10/emacs hides /usr/share/emacs/site-lisp/llvm-12/emacs
/usr/share/emacs/site-lisp/llvm-10/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-12/tablegen-mode
/usr/share/emacs/site-lisp/llvm-10/llvm-mode hides /usr/share/emacs/site-lisp/llvm-13/llvm-mode
/usr/share/emacs/site-lisp/llvm-10/emacs hides /usr/share/emacs/site-lisp/llvm-13/emacs
/usr/share/emacs/site-lisp/llvm-10/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-13/tablegen-mode
/usr/share/emacs/site-lisp/llvm-10/llvm-mode hides /usr/share/emacs/site-lisp/llvm-14/llvm-mode
/usr/share/emacs/site-lisp/llvm-10/emacs hides /usr/share/emacs/site-lisp/llvm-14/emacs
/usr/share/emacs/site-lisp/llvm-10/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-14/tablegen-mode
/usr/share/emacs/site-lisp/llvm-10/llvm-mode hides /usr/share/emacs/site-lisp/llvm-15/llvm-mode
/usr/share/emacs/site-lisp/llvm-10/emacs hides /usr/share/emacs/site-lisp/llvm-15/emacs
/usr/share/emacs/site-lisp/llvm-10/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-15/tablegen-mode
/usr/share/emacs/site-lisp/llvm-10/llvm-mode hides /usr/share/emacs/site-lisp/llvm-3.5/llvm-mode
/usr/share/emacs/site-lisp/llvm-10/emacs hides /usr/share/emacs/site-lisp/llvm-3.5/emacs
/usr/share/emacs/site-lisp/llvm-10/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-3.5/tablegen-mode
/usr/share/emacs/site-lisp/llvm-10/llvm-mode hides /usr/share/emacs/site-lisp/llvm-3.6/llvm-mode
/usr/share/emacs/site-lisp/llvm-10/emacs hides /usr/share/emacs/site-lisp/llvm-3.6/emacs
/usr/share/emacs/site-lisp/llvm-10/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-3.6/tablegen-mode
/usr/share/emacs/site-lisp/llvm-10/llvm-mode hides /usr/share/emacs/site-lisp/llvm-3.7/llvm-mode
/usr/share/emacs/site-lisp/llvm-10/emacs hides /usr/share/emacs/site-lisp/llvm-3.7/emacs
/usr/share/emacs/site-lisp/llvm-10/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-3.7/tablegen-mode
/usr/share/emacs/site-lisp/llvm-10/llvm-mode hides /usr/share/emacs/site-lisp/llvm-7/llvm-mode
/usr/share/emacs/site-lisp/llvm-10/emacs hides /usr/share/emacs/site-lisp/llvm-7/emacs
/usr/share/emacs/site-lisp/llvm-10/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-7/tablegen-mode
/usr/share/emacs/site-lisp/llvm-10/llvm-mode hides /usr/share/emacs/site-lisp/llvm-8/llvm-mode
/usr/share/emacs/site-lisp/llvm-10/emacs hides /usr/share/emacs/site-lisp/llvm-8/emacs
/usr/share/emacs/site-lisp/llvm-10/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-8/tablegen-mode
/usr/share/emacs/site-lisp/llvm-10/llvm-mode hides /usr/share/emacs/site-lisp/llvm-9/llvm-mode
/usr/share/emacs/site-lisp/llvm-10/emacs hides /usr/share/emacs/site-lisp/llvm-9/emacs
/usr/share/emacs/site-lisp/llvm-10/tablegen-mode hides /usr/share/emacs/site-lisp/llvm-9/tablegen-mode
/usr/share/emacs/site-lisp/elpa/cmake-mode-3.24.2/cmake-mode hides /usr/share/emacs/site-lisp/elpa-src/cmake-mode-3.24.2/cmake-mode
/usr/share/emacs/site-lisp/elpa/cmake-mode-3.24.2/cmake-mode-autoloads hides /usr/share/emacs/site-lisp/elpa-src/cmake-mode-3.24.2/cmake-mode-autoloads
/usr/share/emacs/site-lisp/elpa/cmake-mode-3.24.2/cmake-mode-pkg hides /usr/share/emacs/site-lisp/elpa-src/cmake-mode-3.24.2/cmake-mode-pkg
/usr/share/emacs/site-lisp/elpa/po-mode-0.21/po-mode hides /usr/share/emacs/site-lisp/elpa-src/po-mode-0.21/po-mode
/usr/share/emacs/site-lisp/elpa/po-mode-0.21/po-mode-pkg hides /usr/share/emacs/site-lisp/elpa-src/po-mode-0.21/po-mode-pkg
/usr/share/emacs/site-lisp/elpa/po-mode-0.21/po-mode-autoloads hides /usr/share/emacs/site-lisp/elpa-src/po-mode-0.21/po-mode-autoloads
/usr/share/emacs/site-lisp/latex-cjk-thai/thai-word hides /usr/share/emacs/28.2/lisp/language/thai-word

Features:
(shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs
rfc822 mml mml-sec epa derived epg rfc6068 epg-config gnus-util rmail
rmail-loaddefs text-property-search time-date mm-decode mm-bodies
mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail
rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils cus-edit pp
cus-start wid-edit time cus-load cc-styles cc-align cc-engine cc-vars
cc-defs edmacro kmacro mmm-auto mmm-vars mmm-utils mmm-compat package
browse-url url url-proxy url-privacy url-expand url-methods url-history
url-cookie url-domsuf url-util mailcap url-handlers url-parse
auth-source eieio eieio-core eieio-loaddefs password-cache json map
url-vars comp comp-cstr warnings subr-x rx cl-seq cl-macs cl-extra
help-mode seq byte-opt gv bytecomp byte-compile cconv cl-loaddefs cl-lib
iso-transl tooltip eldoc paren electric uniquify ediff-hook vc-hooks
lisp-float-type elisp-mode mwheel term/x-win x-win term/common-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar
rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock
font-lock syntax font-core term/tty-colors frame minibuffer cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite emoji-zwj charscript
charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray
cl-preloaded nadvice button loaddefs faces cus-face macroexp files
window text-properties overlay sha1 md5 base64 format env code-pages
mule custom widget hashtable-print-readable backquote threads dbusbind
inotify lcms2 dynamic-setting system-font-setting font-render-setting
cairo move-toolbar gtk x-toolkit x multi-tty make-network-process
native-compile emacs)

Memory information:
((conses 16 121759 7537)
 (symbols 48 11984 1)
 (strings 32 34388 5478)
 (string-bytes 1 1117991)
 (vectors 16 19182)
 (vector-slots 8 306771 8771)
 (floats 8 40 15)
 (intervals 56 282 0)
 (buffers 992 12))





^ permalink raw reply	[flat|nested] 53+ messages in thread

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 16:53 bug#58992: 28.2; "lax space matching" no longer works Vincent Lefevre
@ 2022-11-03 17:04 ` Eli Zaretskii
  2022-11-03 17:21   ` Vincent Lefevre
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-03 17:04 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> From: Vincent Lefevre <vincent@vinc17.net>
> Date: Thu, 03 Nov 2022 17:53:16 +0100
> 
> 
> The Emacs manual says:
> 
> 15.9 Lax Matching During Searching
> ==================================
> [...]
>    By default, search commands perform “lax space matching”: each space,
> or sequence of spaces, matches any sequence of one or more whitespace
> characters in the text.  (Incremental regexp search has a separate
> default; see *note Regexp Search::.)  Hence, ‘foo bar’ matches
> ‘foo bar’, ‘foo  bar’, ‘foo   bar’, and so on (but not ‘foobar’).  More
> [...]
> 
> This is working with GNU Emacs 27, but not with GNU Emacs 28.2
> (tested under Debian/unstable).

If it works for you by default in Emacs 27, then you either didn't
test with "emacs -Q" there or your Emacs 27 is customized wrt the
upstream.  For me, Emacs 26, Emacs 27, and all the later versions
behave the same.

> To reproduce with emacs -Q, consider a file with
> 
> ab
> cd
> ab cd
> 
> and search for "b c" with
> 
>   C-s b c
> 
> Only the one in the 3rd line is found.

Yes.  Because the default value of search-whitespace-regexp is
"[ \t]+".  And the Emacs manual which comes with Emacs 28 says:

     By default, search commands perform “lax space matching”: each space,
  or sequence of spaces, matches any sequence of one or more whitespace
  characters in the text.  (Incremental regexp search has a separate
  default; see *note Regexp Search::.)  Hence, ‘foo bar’ matches
  ‘foo bar’, ‘foo  bar’, ‘foo   bar’, and so on (but not ‘foobar’).  More
  precisely, Emacs matches each sequence of space characters in the search
  string to a regular expression specified by the variable
  ‘search-whitespace-regexp’.  For example, to make spaces match sequences
  of newlines as well as spaces, set it to the regular expression
  ‘[[:space:]\n]+’.  The default value of this variable considers any
  sequence of spaces and tab characters as whitespace.

So I see no bug here.
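
To illustrate the manual text quoted above: a user who wants spaces in
the search string to match newlines as well could put something like the
following in an init file.  This is only a sketch and not part of the
original message; the value is the one the manual itself suggests.

```elisp
;; Sketch for an init file: make lax space matching also match newlines,
;; using the value suggested in the manual text quoted above.
(setq search-whitespace-regexp "[[:space:]\n]+")
```

Lax space matching itself can be toggled during an incremental search
with M-s SPC.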






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 17:04 ` Eli Zaretskii
@ 2022-11-03 17:21   ` Vincent Lefevre
  2022-11-03 17:34     ` Vincent Lefevre
                       ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-03 17:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-03 19:04:31 +0200, Eli Zaretskii wrote:
> > From: Vincent Lefevre <vincent@vinc17.net>
> > Date: Thu, 03 Nov 2022 17:53:16 +0100
> > 
> > The Emacs manual says:
> > 
> > 15.9 Lax Matching During Searching
> > ==================================
> > [...]
> >    By default, search commands perform “lax space matching”: each space,
> > or sequence of spaces, matches any sequence of one or more whitespace
> > characters in the text.  (Incremental regexp search has a separate
> > default; see *note Regexp Search::.)  Hence, ‘foo bar’ matches
> > ‘foo bar’, ‘foo  bar’, ‘foo   bar’, and so on (but not ‘foobar’).  More
> > [...]
> > 
> > This is working with GNU Emacs 27, but not with GNU Emacs 28.2
> > (tested under Debian/unstable).
> 
> If it works for you by default in Emacs 27, then you either didn't
> test with "emacs -Q" there or your Emacs 27 is customized wrt the
> upstream.

I tested with "emacs -Q", and I've just tested again. I confirm the
behavior I could see: a newline character is matched. That's Debian's
package emacs-gtk 1:27.1+1-3.1+b1. So perhaps Debian has changed the
default (but no changes were announced in Debian for Emacs 28, whose
behavior is different).

BTW, for users who do not read the full documentation, I'd suggest
clarifying the doc by saying "one or more user-configurable
whitespace characters", because AFAIK, a newline character is often
regarded as a whitespace character, in particular by Unicode:

  https://en.wikipedia.org/wiki/Whitespace_character

So the default could be very surprising.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 17:21   ` Vincent Lefevre
@ 2022-11-03 17:34     ` Vincent Lefevre
  2022-11-03 18:05       ` Eli Zaretskii
  2022-11-03 17:49     ` Gregory Heytings
  2022-11-03 18:03     ` Eli Zaretskii
  2 siblings, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-03 17:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-03 18:21:57 +0100, Vincent Lefevre wrote:
> On 2022-11-03 19:04:31 +0200, Eli Zaretskii wrote:
> > > From: Vincent Lefevre <vincent@vinc17.net>
> > > Date: Thu, 03 Nov 2022 17:53:16 +0100
> > > 
> > > The Emacs manual says:
> > > 
> > > 15.9 Lax Matching During Searching
> > > ==================================
> > > [...]
> > >    By default, search commands perform “lax space matching”: each space,
> > > or sequence of spaces, matches any sequence of one or more whitespace
> > > characters in the text.  (Incremental regexp search has a separate
> > > default; see *note Regexp Search::.)  Hence, ‘foo bar’ matches
> > > ‘foo bar’, ‘foo  bar’, ‘foo   bar’, and so on (but not ‘foobar’).  More
> > > [...]
> > > 
> > > This is working with GNU Emacs 27, but not with GNU Emacs 28.2
> > > (tested under Debian/unstable).
> > 
> > If it works for you by default in Emacs 27, then you either didn't
> > test with "emacs -Q" there or your Emacs 27 is customized wrt the
> > upstream.
> 
> I tested with "emacs -Q", and I've just tested again. I confirm the
> behavior I could see: a newline character is matched. That's Debian's
> package emacs-gtk 1:27.1+1-3.1+b1. So perhaps Debian has changed the
> default (but no changes were announced in Debian for Emacs 28, whose
> behavior is different).

In the official lisp/isearch.el file from the Debian/stable package
(i.e. *not* in the debian subdirectory):

(defcustom search-whitespace-regexp (purecopy "\\s-+")

So it doesn't seem to be a Debian customization.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 17:21   ` Vincent Lefevre
  2022-11-03 17:34     ` Vincent Lefevre
@ 2022-11-03 17:49     ` Gregory Heytings
  2022-11-03 17:56       ` Vincent Lefevre
  2022-11-03 18:03     ` Eli Zaretskii
  2 siblings, 1 reply; 53+ messages in thread
From: Gregory Heytings @ 2022-11-03 17:49 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Eli Zaretskii, 58992


>
> I tested with "emacs -Q", and I've just tested again. I confirm the 
> behavior I could see: a newline character is matched.  That's Debian's 
> package emacs-gtk 1:27.1+1-3.1+b1.
>

Are you really sure?  I just tested this with that same version of Emacs 
(Debian package emacs-gtk 1:27.1+1-3.1+b1), and

emacs -Q
ab RET bc RET ab SPC cd
M-<
C-s b SPC c

only matches the third line.






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 17:49     ` Gregory Heytings
@ 2022-11-03 17:56       ` Vincent Lefevre
  2022-11-03 18:02         ` Vincent Lefevre
  2022-11-03 18:02         ` Gregory Heytings
  0 siblings, 2 replies; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-03 17:56 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: Eli Zaretskii, 58992

On 2022-11-03 17:49:37 +0000, Gregory Heytings wrote:
> > I tested with "emacs -Q", and I've just tested again. I confirm the
> > behavior I could see: a newline character is matched.  That's Debian's
> > package emacs-gtk 1:27.1+1-3.1+b1.
> 
> Are you really sure?  I just tested this with that same version of Emacs
> (Debian package emacs-gtk 1:27.1+1-3.1+b1), and
> 
> emacs -Q
> ab RET bc RET ab SPC cd
> M-<
> C-s b SPC c
> 
> only matches the third line.

With this test, only the third line matches, but with a pre-existing
file the newline is matched too. Why such a difference?

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 17:56       ` Vincent Lefevre
@ 2022-11-03 18:02         ` Vincent Lefevre
  2022-11-03 18:04           ` Gregory Heytings
  2022-11-03 18:04           ` Vincent Lefevre
  2022-11-03 18:02         ` Gregory Heytings
  1 sibling, 2 replies; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-03 18:02 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: Eli Zaretskii, 58992

On 2022-11-03 18:56:41 +0100, Vincent Lefevre wrote:
> On 2022-11-03 17:49:37 +0000, Gregory Heytings wrote:
> > > I tested with "emacs -Q", and I've just tested again. I confirm the
> > > behavior I could see: a newline character is matched.  That's Debian's
> > > package emacs-gtk 1:27.1+1-3.1+b1.
> > 
> > Are you really sure?  I just tested this with that same version of Emacs
> > (Debian package emacs-gtk 1:27.1+1-3.1+b1), and
> > 
> > emacs -Q
> > ab RET bc RET ab SPC cd
> > M-<
> > C-s b SPC c
> > 
> > only matches the third line.
> 
> With this test, it only matches the third line, but not with a
> pre-existing file. Why such a difference?

Hmm... Your example is wrong!

Instead of

  ab RET bc RET ab SPC cd

it should be

  ab RET cd RET ab SPC cd

Unfortunately, this doesn't explain the problem!!!

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 17:56       ` Vincent Lefevre
  2022-11-03 18:02         ` Vincent Lefevre
@ 2022-11-03 18:02         ` Gregory Heytings
  1 sibling, 0 replies; 53+ messages in thread
From: Gregory Heytings @ 2022-11-03 18:02 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Eli Zaretskii, 58992


>
> With this test, it only matches the third line, but not with a 
> pre-existing file. Why such a difference?
>

What's the major mode when you open that pre-existing file?






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 17:21   ` Vincent Lefevre
  2022-11-03 17:34     ` Vincent Lefevre
  2022-11-03 17:49     ` Gregory Heytings
@ 2022-11-03 18:03     ` Eli Zaretskii
  2022-11-03 18:33       ` Vincent Lefevre
  2022-11-04  3:30       ` Vincent Lefevre
  2 siblings, 2 replies; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-03 18:03 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992-done

> Date: Thu, 3 Nov 2022 18:21:57 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> BTW, for users who do not spend their time in reading the full doc,
> I'd suggest to clarify the doc by saying "one or more user-configurable
> whitespace characters", because AFAIK, a newline character is often
> regarded as a whitespace character, in particular by Unicode:
> 
>   https://en.wikipedia.org/wiki/Whitespace_character

Thanks, I moved the detailed description of what is considered
"whitespace" closer to the first sentence.

> So the default could be very surprising.

It's a very long-time Emacs behavior.  Too late to change the default.






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:02         ` Vincent Lefevre
@ 2022-11-03 18:04           ` Gregory Heytings
  2022-11-03 18:04           ` Vincent Lefevre
  1 sibling, 0 replies; 53+ messages in thread
From: Gregory Heytings @ 2022-11-03 18:04 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Eli Zaretskii, 58992


>
> Hmm... Your example is wrong!
>

Yes, sorry, that was a typo in my post.






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:02         ` Vincent Lefevre
  2022-11-03 18:04           ` Gregory Heytings
@ 2022-11-03 18:04           ` Vincent Lefevre
  2022-11-03 18:11             ` Gregory Heytings
  1 sibling, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-03 18:04 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: Eli Zaretskii, 58992

On 2022-11-03 19:02:17 +0100, Vincent Lefevre wrote:
> Instead of
> 
>   ab RET bc RET ab SPC cd
> 
> it should be
> 
>   ab RET cd RET ab SPC cd
> 
> Unfortunately, this doesn't explain the problem!!!

Well, the issue can be reproduced only in fundamental mode.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 17:34     ` Vincent Lefevre
@ 2022-11-03 18:05       ` Eli Zaretskii
  0 siblings, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-03 18:05 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> Date: Thu, 3 Nov 2022 18:34:18 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> > I tested with "emacs -Q", and I've just tested again. I confirm the
> > behavior I could see: a newline character is matched. That's Debian's
> > package emacs-gtk 1:27.1+1-3.1+b1. So perhaps Debian has changed the
> > default (but no changes were announced in Debian for Emacs 28, whose
> > behavior is different).
> 
> In the official lisp/isearch.el file from the Debian/stable package
> (i.e. *not* in the debian subdirectory):
> 
> (defcustom search-whitespace-regexp (purecopy "\\s-+")
> 
> So it doesn't seem to be a Debian customization.

OK, thanks for solving this mystery.






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:04           ` Vincent Lefevre
@ 2022-11-03 18:11             ` Gregory Heytings
  2022-11-03 18:18               ` Gregory Heytings
  2022-11-03 18:18               ` Eli Zaretskii
  0 siblings, 2 replies; 53+ messages in thread
From: Gregory Heytings @ 2022-11-03 18:11 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Eli Zaretskii, 58992


>
> Well, the issue can be reproduced only in fundamental mode.
>

Okay, now I see it:

emacs -Q
ab RET cd RET ab SPC cd
M-<
C-s b SPC c ;; only one match
C-x C-s foo RET
M-<
C-s b SPC c ;; two matches in Emacs 27, one match in Emacs 28

Hmm...
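
The recipe above can also be approximated non-interactively.  The
following is a sketch, not part of the original message:
`my-count-lax-matches' is a hypothetical helper, and it assumes that
binding `search-spaces-regexp' around `re-search-forward' is a fair
stand-in for what incremental search does with
`search-whitespace-regexp'.

```elisp
;; Hypothetical helper (not from the thread): count matches of "b c" in the
;; three-line sample text when each run of spaces in the pattern is replaced
;; by WS-REGEXP, which approximates lax space matching.
(defun my-count-lax-matches (ws-regexp &optional mode)
  "Count matches of \"b c\" in the sample text with WS-REGEXP for spaces.
MODE, if non-nil, is a major mode function called in the temporary buffer."
  (with-temp-buffer
    (when mode (funcall mode))
    (insert "ab\ncd\nab cd\n")
    (goto-char (point-min))
    (let ((search-spaces-regexp ws-regexp)
          (count 0))
      (while (re-search-forward "b c" nil t)
        (setq count (1+ count)))
      count)))

(my-count-lax-matches "[ \t]+")                   ; spaces and tabs only
(my-count-lax-matches "\\s-+")                    ; whitespace syntax, Fundamental mode
(my-count-lax-matches "\\s-+" #'emacs-lisp-mode)  ; whitespace syntax, Lisp syntax table
```

Under these assumptions the first and third calls should find one
match and the second should find two, which is the mode-dependent
difference seen in the recipe.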






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:11             ` Gregory Heytings
@ 2022-11-03 18:18               ` Gregory Heytings
  2022-11-03 18:28                 ` Vincent Lefevre
  2022-11-03 18:36                 ` Juri Linkov
  2022-11-03 18:18               ` Eli Zaretskii
  1 sibling, 2 replies; 53+ messages in thread
From: Gregory Heytings @ 2022-11-03 18:18 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Eli Zaretskii, 58992


>
> Hmm...
>

Okay, this is because of:

commit 74d091a0a665da5dc01989d1b06a61ee21b975b2
Author: Lars Ingebrigtsen <larsi@gnus.org>
Date:   Fri Sep 10 12:27:28 2021 +0200

     Change the default value of search-whitespace-regexp

     * lisp/isearch.el (search-whitespace-regexp): Change the default
     to always exclude newlines from the set (bug#21278).  It used to
     be mode-dependent whether newlines were included or not, and this
     was confusing as a user interface.

which changed the default value of search-whitespace-regexp from "\\s-+" 
to "[ \t]+".






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:11             ` Gregory Heytings
  2022-11-03 18:18               ` Gregory Heytings
@ 2022-11-03 18:18               ` Eli Zaretskii
  1 sibling, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-03 18:18 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: 58992, vincent

> Date: Thu, 03 Nov 2022 18:11:43 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Eli Zaretskii <eliz@gnu.org>, 58992@debbugs.gnu.org
> 
> emacs -Q
> ab RET cd RET ab SPC cd
> M-<
> C-s b SPC c ;; only one match
> C-x C-s foo RET
> M-<
> C-s b SPC c ;; two matches in Emacs 27, one match in Emacs 28
> 
> Hmm...

Did you look at the value of search-whitespace-regexp in both cases?






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:18               ` Gregory Heytings
@ 2022-11-03 18:28                 ` Vincent Lefevre
  2022-11-03 18:39                   ` Eli Zaretskii
  2022-11-03 18:43                   ` Gregory Heytings
  2022-11-03 18:36                 ` Juri Linkov
  1 sibling, 2 replies; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-03 18:28 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: Eli Zaretskii, 58992

On 2022-11-03 18:18:53 +0000, Gregory Heytings wrote:
> Okay, this is because of:
> 
> commit 74d091a0a665da5dc01989d1b06a61ee21b975b2
> Author: Lars Ingebrigtsen <larsi@gnus.org>
> Date:   Fri Sep 10 12:27:28 2021 +0200
> 
>     Change the default value of search-whitespace-regexp
> 
>     * lisp/isearch.el (search-whitespace-regexp): Change the default
>     to always exclude newlines from the set (bug#21278).  It used to
>     be mode-dependent whether newlines were included or not, and this
>     was confusing as a user interface.
> 
> which changed the default value of search-whitespace-regexp from "\\s-+" to
> "[ \t]+".

This is still buggy in Emacs 28.2 if I change the value:

search-whitespace-regexp is a variable defined in ‘isearch.el’.

Its value is "\\s-+"
Original value was "[ 	]+"

This works in Fundamental mode, but not in Lisp mode.

BTW, I don't understand what "\\s-+" means. I thought it was a
whitespace followed by a minus character repeated at least once.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:03     ` Eli Zaretskii
@ 2022-11-03 18:33       ` Vincent Lefevre
  2022-11-03 18:42         ` Eli Zaretskii
  2022-11-04  3:30       ` Vincent Lefevre
  1 sibling, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-03 18:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-03 20:03:52 +0200, Eli Zaretskii wrote:
> > Date: Thu, 3 Nov 2022 18:21:57 +0100
> > From: Vincent Lefevre <vincent@vinc17.net>
> > Cc: 58992@debbugs.gnu.org
> > 
> > BTW, for users who do not spend their time in reading the full doc,
> > I'd suggest to clarify the doc by saying "one or more user-configurable
> > whitespace characters", because AFAIK, a newline character is often
> > regarded as a whitespace character, in particular by Unicode:
> > 
> >   https://en.wikipedia.org/wiki/Whitespace_character
> 
> Thanks, I moved the detailed description of what is considered
> "whitespace" closer to the first sentence.

OK, thanks.

> > So the default could be very surprising.
> 
> It's a very long-time Emacs behavior.  Too late to change the default.

Well, since the default changed in September 2021, this is not a very
long-time Emacs behavior. This depends on what modes users use the
most. This change affects at least the Fundamental and Texinfo modes.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:18               ` Gregory Heytings
  2022-11-03 18:28                 ` Vincent Lefevre
@ 2022-11-03 18:36                 ` Juri Linkov
  1 sibling, 0 replies; 53+ messages in thread
From: Juri Linkov @ 2022-11-03 18:36 UTC (permalink / raw)
  To: Gregory Heytings; +Cc: Eli Zaretskii, Vincent Lefevre, 58992

> Okay, this is because of:
>
> commit 74d091a0a665da5dc01989d1b06a61ee21b975b2
> Author: Lars Ingebrigtsen <larsi@gnus.org>
> Date:   Fri Sep 10 12:27:28 2021 +0200
>
>     Change the default value of search-whitespace-regexp
>
>     * lisp/isearch.el (search-whitespace-regexp): Change the default
>     to always exclude newlines from the set (bug#21278).  It used to
>     be mode-dependent whether newlines were included or not, and this
>     was confusing as a user interface.
>
> which changed the default value of search-whitespace-regexp from "\\s-+" to
> "[ \t]+".

"\\s-+" in many modes includes newlines, so the default could include
explicit newlines as well, since it matched them in some modes until Emacs 28.
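
For users who want newlines matched only in text-like buffers, one
possible init-file sketch follows.  It is not part of the original
message, and it assumes that a buffer-local value of
`search-whitespace-regexp' is honored by incremental search.  The
regexp lists the newline explicitly, so it does not depend on the
mode's syntax table.

```elisp
;; Sketch for an init file (an assumption, not a recommendation from the
;; thread): let spaces also match newlines in text-like buffers only.
(add-hook 'text-mode-hook
          (lambda ()
            (setq-local search-whitespace-regexp "[ \t\n]+")))
```

This covers modes derived from Text mode, such as Texinfo mode, but not
Fundamental mode, which runs no mode hook.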






* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:28                 ` Vincent Lefevre
@ 2022-11-03 18:39                   ` Eli Zaretskii
  2022-11-03 18:43                   ` Gregory Heytings
  1 sibling, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-03 18:39 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: gregory, 58992

> Date: Thu, 3 Nov 2022 19:28:09 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, 58992@debbugs.gnu.org
> 
> This is still buggy in Emacs 28.2 if I change the value:

It isn't a bug.

> search-whitespace-regexp is a variable defined in ‘isearch.el’.
> 
> Its value is "\\s-+"
> Original value was "[ 	]+"
> 
> This works in Fundamental mode, but not in Lisp mode.

Because the meaning of \\s- differs between the two modes: it depends on
the major mode's syntax table.  That's why we changed the default value
not to use \\s-.

> BTW, I don't understand what "\\s-+" means. I thought it was a
> whitespace followed by a minus character repeated at least once.

No, it means a sequence of 1 or more characters whose syntax class is
'-' (whitespace).  See the documentation of regular expressions in the
manual.
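
A minimal sketch of that mode dependence, assuming a throwaway syntax
table in which the newline is given comment-end syntax (as some
programming modes do).  The names demo-lax-space-match-p and
demo-newline-comment-end-table are hypothetical and exist only for this
illustration; they are not part of Emacs.

```elisp
;; Hypothetical helper: does "b\\s-+c" match STRING when TABLE is the
;; syntax table in effect?  Returns t or nil.
(defun demo-lax-space-match-p (table string)
  (with-temp-buffer
    (with-syntax-table table
      (and (string-match-p "b\\s-+c" string) t))))

;; Hypothetical syntax table, derived from the standard one, in which
;; the newline has comment-end syntax instead of whitespace syntax.
(defvar demo-newline-comment-end-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?\n ">" st)
    st))

;; Standard syntax table: newline has whitespace syntax, so this matches.
(demo-lax-space-match-p (standard-syntax-table) "ab\ncd")
;; Newline is comment-end in this table, so \s- does not match it.
(demo-lax-space-match-p demo-newline-comment-end-table "ab\ncd")
;; A plain space keeps its whitespace syntax, so this still matches.
(demo-lax-space-match-p demo-newline-comment-end-table "ab cd")
```

The same regexp gives different answers on the same text, which is the
behavior the old default exposed through Isearch.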





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:33       ` Vincent Lefevre
@ 2022-11-03 18:42         ` Eli Zaretskii
  2022-11-03 18:52           ` Vincent Lefevre
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-03 18:42 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> Date: Thu, 3 Nov 2022 19:33:16 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> > It's a very long-time Emacs behavior.  Too late to change the default.
> 
> Well, since the default changed in September 2021, this is not a very
> long-time Emacs behavior. This depends on what modes users use the
> most. This change affects at least the Fundamental and Texinfo modes.

Yes.  But in programming modes the newline didn't match.





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:28                 ` Vincent Lefevre
  2022-11-03 18:39                   ` Eli Zaretskii
@ 2022-11-03 18:43                   ` Gregory Heytings
  1 sibling, 0 replies; 53+ messages in thread
From: Gregory Heytings @ 2022-11-03 18:43 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Eli Zaretskii, 58992


>> commit 74d091a0a665da5dc01989d1b06a61ee21b975b2
>> Author: Lars Ingebrigtsen <larsi@gnus.org>
>> Date:   Fri Sep 10 12:27:28 2021 +0200
>>
>>     Change the default value of search-whitespace-regexp
>>
>>     * lisp/isearch.el (search-whitespace-regexp): Change the default
>>     to always exclude newlines from the set (bug#21278).  It used to
>>     be mode-dependent whether newlines were included or not, and this
>>     was confusing as a user interface.
>>
>> which changed the default value of search-whitespace-regexp from 
>> "\\s-+" to "[ \t]+".
>
> This is still buggy in Emacs 28.2 if I change the value:
>
> This works in Fundamental mode, but not in Lisp mode.
>

As the commit message explains, "It used to be mode-dependent whether 
newlines were included or not, and this was confusing as a user 
interface."  Hence your confusion.  You will see the same in Emacs 27 and 
28: RET is space in fundamental mode and not space in text mode.

>
> BTW, I don't understand what "\\s-+" means.
>

It means "match one or more characters whose syntax is 'space'".





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:42         ` Eli Zaretskii
@ 2022-11-03 18:52           ` Vincent Lefevre
  2022-11-03 19:22             ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-03 18:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-03 20:42:41 +0200, Eli Zaretskii wrote:
> > Date: Thu, 3 Nov 2022 19:33:16 +0100
> > From: Vincent Lefevre <vincent@vinc17.net>
> > Cc: 58992@debbugs.gnu.org
> > 
> > > It's a very long-time Emacs behavior.  Too late to change the default.
> > 
> > Well, since the default changed in September 2021, this is not a very
> > long-time Emacs behavior. This depends on what modes users use the
> > most. This change affects at least the Fundamental and Texinfo modes.
> 
> Yes.  But in programming modes the newline didn't match.

Any reason why (perhaps except for shell modes)?

In C, except for the preprocessor, a newline is similar to a space
character.

BTW, it actually doesn't match either for the Texinfo mode, and
I don't see any reason why.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:52           ` Vincent Lefevre
@ 2022-11-03 19:22             ` Eli Zaretskii
  2022-11-04  2:29               ` Vincent Lefevre
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-03 19:22 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> Date: Thu, 3 Nov 2022 19:52:07 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> > Yes.  But in programming modes the newline didn't match.
> 
> Any reason why (perhaps except for shell modes)?

Because the newline's syntax is not "whitespace" in those modes.

> In C, except for the preprocessor, a newline is similar to a space
> character.

The syntax we give to each character in a major mode depends on what
the mode needs to do with that character.  For example, a mode might
have a good reason to give the newline the '>' syntax, because the
newline ends a comment in those modes.
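
One way to check which syntax class a mode gives the newline is to ask
char-syntax in a buffer using that mode.  This is only a sketch: the
helper name demo-newline-syntax-class is hypothetical, and the two modes
listed are just examples.

```elisp
;; Hypothetical helper: return the syntax class designator of the
;; newline, as a one-character string, in major mode MODE.
(defun demo-newline-syntax-class (mode)
  (with-temp-buffer
    (funcall mode)
    (string (char-syntax ?\n))))

;; " " designates whitespace syntax, ">" designates comment-end syntax.
(mapcar (lambda (mode) (cons mode (demo-newline-syntax-class mode)))
        '(fundamental-mode emacs-lisp-mode))
```

Any other major mode symbol can be added to the list to see how that
mode classifies the newline.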

> BTW, it actually doesn't match either for the Texinfo mode, and
> I don't see any reason why.

In which version of Emacs, and with what value of
search-whitespace-regexp?





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 19:22             ` Eli Zaretskii
@ 2022-11-04  2:29               ` Vincent Lefevre
  2022-11-04  3:38                 ` Vincent Lefevre
  2022-11-04  7:14                 ` Eli Zaretskii
  0 siblings, 2 replies; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04  2:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-03 21:22:12 +0200, Eli Zaretskii wrote:
> > Date: Thu, 3 Nov 2022 19:52:07 +0100
> > From: Vincent Lefevre <vincent@vinc17.net>
> > Cc: 58992@debbugs.gnu.org
> > 
> > > Yes.  But in programming modes the newline didn't match.
> > 
> > Any reason why (perhaps except for shell modes)?
> 
> Because the newline's syntax is not "whitespace" in those modes.

OK, but then, the question is why the newline's syntax is not
"whitespace" in those modes...

> > In C, except for the preprocessor, a newline is similar to a space
> > character.
> 
> The syntax we give to each character in a major mode depends on what
> the mode needs to do with that character.  For example, a mode might
> have a good reason to give the newline the '>' syntax, because the
> newline ends a comment in those modes.

In C, the conventional comment is /* ... */ and the newline does not
end a comment. In any case, /* ... */ is more practical to write
multi-line comments in C (no need to repeat comment starters at the
beginning of every line), and if one wants to search in comments,
the newline should be regarded as a whitespace.

> > BTW, it actually doesn't match either for the Texinfo mode, and
> > I don't see any reason why.
> 
> In which version of Emacs, and with what value of
> search-whitespace-regexp?

Both 27.1 and 28.2 (Debian for both), with search-whitespace-regexp
set to "\\s-+". For Texinfo, it is more common to search in the
normal text (rather than comments), since this is the significant
content, so the newline character should be regarded as whitespace.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-03 18:03     ` Eli Zaretskii
  2022-11-03 18:33       ` Vincent Lefevre
@ 2022-11-04  3:30       ` Vincent Lefevre
  2022-11-04  7:21         ` Eli Zaretskii
  1 sibling, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04  3:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-03 20:03:52 +0200, Eli Zaretskii wrote:
> > Date: Thu, 3 Nov 2022 18:21:57 +0100
> > From: Vincent Lefevre <vincent@vinc17.net>
> > Cc: 58992@debbugs.gnu.org
> > 
> > BTW, for users who do not spend their time reading the full doc,
> > I'd suggest clarifying the doc by saying "one or more user-configurable
> > whitespace characters", because AFAIK, a newline character is often
> > regarded as a whitespace character, in particular by Unicode:
> > 
> >   https://en.wikipedia.org/wiki/Whitespace_character
> 
> Thanks, I moved the detailed description of what is considered
> "whitespace" closer to the first sentence.

The [[:space:]\n]+ in the doc is misleading. Since a newline character
is a whitespace, [[:space:]]+ is sufficient.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04  2:29               ` Vincent Lefevre
@ 2022-11-04  3:38                 ` Vincent Lefevre
  2022-11-04  7:25                   ` Eli Zaretskii
  2022-11-04  7:14                 ` Eli Zaretskii
  1 sibling, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04  3:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-04 03:29:50 +0100, Vincent Lefevre wrote:
> > In which version of Emacs, and with what value of
> > search-whitespace-regexp?
> 
> Both 27.1 and 28.2 (Debian for both), with search-whitespace-regexp
> set to "\\s-+". For Texinfo, it is more common to search in the
> normal text (rather than comments), since this is the significant
> content, so the newline character should be regarded as whitespace.

This is worse: even when changing search-whitespace-regexp to
include the newline character, this doesn't work in Texinfo!

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04  2:29               ` Vincent Lefevre
  2022-11-04  3:38                 ` Vincent Lefevre
@ 2022-11-04  7:14                 ` Eli Zaretskii
  2022-11-04 10:41                   ` Vincent Lefevre
  1 sibling, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-04  7:14 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> Date: Fri, 4 Nov 2022 03:29:50 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> > Because the newline's syntax is not "whitespace" in those modes.
> 
> OK, but then, the question is why the newline's syntax is not
> "whitespace" in those modes...

Because the mode sets up its syntax tables for various needs, none of
which is Isearch.

> > > In C, except for the preprocessor, a newline is similar to a space
> > > character.
> > 
> > The syntax we give to each character in a major mode depends on what
> > the mode needs to do with that character.  For example, a mode might
> > have a good reason to give the newline the '>' syntax, because the
> > newline ends a comment in those modes.
> 
> In C, the conventional comment is /* ... */ and the newline does not
> end a comment. In any case, /* ... */ is more practical to write
> multi-line comments in C (no need to repeat comment starters at the
> beginning of every line), and if one wants to search in comments,
> the newline should be regarded as a whitespace.

This is not really relevant.  Major modes set up their syntax tables
as they consider relevant, and we won't change that for the benefit of
search-whitespace-regexp.  The lesson to learn here is not to base
Isearch-related regexps on character syntax, because that changes its
meaning with the major mode, something many users will not expect.

> > > BTW, it actually doesn't match either for the Texinfo mode, and
> > > I don't see any reason why.
> > 
> > In which version of Emacs, and with what value of
> > search-whitespace-regexp?
> 
> Both 27.1 and 28.2 (Debian for both), with search-whitespace-regexp
> set to "\\s-+".

Then don't use "\\s-+".  The manual suggests a different regexp for
your preference, and it does so for a good reason.  Why are you using
a regexp that we already concluded to be problematic and stopped
using?  You will get yourself in the same problems we decided to
avoid.





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04  3:30       ` Vincent Lefevre
@ 2022-11-04  7:21         ` Eli Zaretskii
  2022-11-04 10:15           ` Vincent Lefevre
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-04  7:21 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> Date: Fri, 4 Nov 2022 04:30:38 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> The [[:space:]\n]+ in the doc is misleading. Since a newline character
> is a whitespace, [[:space:]]+ is sufficient.

That is only true in major modes where the newline has the
"whitespace" syntax, see the description of [:space:] in the ELisp
manual.  I guess you tried this in Fundamental mode or somesuch?
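
A small sketch of why the explicit \n is not redundant, assuming a
syntax table in which the newline has comment-end syntax.  The helper
name demo-space-class-matches is hypothetical and used only here.

```elisp
;; Hypothetical helper: return t if REGEXP matches the string
;; "SOME\nTHING" while the newline has comment-end syntax.
(defun demo-space-class-matches (regexp)
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?\n ">" st)
    (with-temp-buffer
      (with-syntax-table st
        (and (string-match-p regexp "SOME\nTHING") t)))))

;; [:space:] follows the syntax table, so the newline is not covered.
(demo-space-class-matches "SOME[[:space:]]+THING")
;; Listing \n explicitly makes the match independent of the table.
(demo-space-class-matches "SOME[[:space:]\n]+THING")
```

With the standard syntax table both forms match, which is why the
difference is easy to miss in Fundamental mode.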





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04  3:38                 ` Vincent Lefevre
@ 2022-11-04  7:25                   ` Eli Zaretskii
  2022-11-04 10:46                     ` Vincent Lefevre
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-04  7:25 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> Date: Fri, 4 Nov 2022 04:38:05 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> This is worse: even when changing search-whitespace-regexp to
> include the newline character, this doesn't work in Texinfo!

I cannot reproduce this.  Please tell me what value of
search-whitespace-regexp you used, how exactly you changed the
value, and what you tried in a Texinfo buffer that didn't work.

I changed search-whitespace-regexp via customize-variable, selected
the option labeled "Tabs, spaces and line breaks" there, clicked
Apply, then tried to search for "SOME THING" where I saw SOME\nTHING
in the buffer, and C-s did find that.
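
For an init file, a rough equivalent can be sketched as follows.  The
exact string that the Customize choice stores is an assumption here;
C-h v search-whitespace-regexp shows the value actually in effect.

```elisp
;; Init-file sketch: let a space typed in Isearch match any run of
;; spaces, tabs and newlines.  The character set lists the characters
;; explicitly, so it does not depend on the major mode's syntax table.
;; Assumption: this mirrors the "Tabs, spaces and line breaks" choice.
(setq search-whitespace-regexp "[ \t\n]+")
```

During an incremental search, M-s SPC toggles lax whitespace matching
on and off.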





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04  7:21         ` Eli Zaretskii
@ 2022-11-04 10:15           ` Vincent Lefevre
  2022-11-04 11:38             ` Andreas Schwab
  2022-11-04 11:45             ` Eli Zaretskii
  0 siblings, 2 replies; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04 10:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-04 09:21:04 +0200, Eli Zaretskii wrote:
> > Date: Fri, 4 Nov 2022 04:30:38 +0100
> > From: Vincent Lefevre <vincent@vinc17.net>
> > Cc: 58992@debbugs.gnu.org
> > 
> > The [[:space:]\n]+ in the doc is misleading. Since a newline character
> > is a whitespace, [[:space:]]+ is sufficient.
> 
> That is only true in major modes where the newline has the
> "whitespace" syntax, see the description of [:space:] in the ELisp
> manual.  I guess you tried this in Fundamental mode or somesuch?

Wow! This is really confusing! [:space:] is defined by POSIX, and the
manual even refers to it:

    A character alternative can also specify named character classes
    (*note Char Classes::).  This is a POSIX feature.  [...]

You must not change its behavior! Making it depend on the major mode
is even worse.

If the intent is to have a different meaning, a different name should
be chosen, such as [:whitespace:] (though I'm not sure whether some
names are reserved by POSIX).

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04  7:14                 ` Eli Zaretskii
@ 2022-11-04 10:41                   ` Vincent Lefevre
  2022-11-04 10:56                     ` Vincent Lefevre
  2022-11-04 11:48                     ` Eli Zaretskii
  0 siblings, 2 replies; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04 10:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-04 09:14:57 +0200, Eli Zaretskii wrote:
> > Date: Fri, 4 Nov 2022 03:29:50 +0100
> > From: Vincent Lefevre <vincent@vinc17.net>
> > Cc: 58992@debbugs.gnu.org
> > 
> > > Because the newline's syntax is not "whitespace" in those modes.
> > 
> > OK, but then, the question is why the newline's syntax is not
> > "whitespace" in those modes...
> 
> Because the mode sets up its syntax tables for various needs, none of
> which is Isearch.

But the manual does not discourage the use of syntax classes
for searching, and it is not clear from the description that
they won't do what the user expects.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04  7:25                   ` Eli Zaretskii
@ 2022-11-04 10:46                     ` Vincent Lefevre
  2022-11-04 11:50                       ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04 10:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-04 09:25:39 +0200, Eli Zaretskii wrote:
> > Date: Fri, 4 Nov 2022 04:38:05 +0100
> > From: Vincent Lefevre <vincent@vinc17.net>
> > Cc: 58992@debbugs.gnu.org
> > 
> > This is worse: even when changing search-whitespace-regexp to
> > include the newline character, this doesn't work in Texinfo!
> 
> I cannot reproduce this.  Please tell me what value of
> search-whitespace-regexp you used, how exactly you changed the
> value, and what you tried in a Texinfo buffer that didn't work.

I was using [[:space:]]+, based on the behavior of the C library and
the one I get with any other application. See my other message about
that.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 10:41                   ` Vincent Lefevre
@ 2022-11-04 10:56                     ` Vincent Lefevre
  2022-11-04 11:52                       ` Eli Zaretskii
  2022-11-04 11:48                     ` Eli Zaretskii
  1 sibling, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04 10:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-04 11:41:15 +0100, Vincent Lefevre wrote:
> On 2022-11-04 09:14:57 +0200, Eli Zaretskii wrote:
> > > Date: Fri, 4 Nov 2022 03:29:50 +0100
> > > From: Vincent Lefevre <vincent@vinc17.net>
> > > Cc: 58992@debbugs.gnu.org
> > > 
> > > > Because the newline's syntax is not "whitespace" in those modes.
> > > 
> > > OK, but then, the question is why the newline's syntax is not
> > > "whitespace" in those modes...
> > 
> > Because the mode sets up its syntax tables for various needs, none of
> > which is Isearch.
> 
> But the manual does not discourage the use of syntax classes
> for searching, and it is not clear from the description that
> they won't do what the user expects.

I've searched on the web and found the Emacs wiki

  https://www.emacswiki.org/emacs/RegularExpression

It even encourages the use of "\s-" for searching: this is a page
mainly for searching (see the first line of the text), and in the
"Regexp Syntax Basics" (note: *basics*):

  \sc      character with c syntax (e.g. \s- for whitespace char)

So users are very likely to use it.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 10:15           ` Vincent Lefevre
@ 2022-11-04 11:38             ` Andreas Schwab
  2022-11-04 12:47               ` Vincent Lefevre
  2022-11-04 11:45             ` Eli Zaretskii
  1 sibling, 1 reply; 53+ messages in thread
From: Andreas Schwab @ 2022-11-04 11:38 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Eli Zaretskii, 58992

On Nov 04 2022, Vincent Lefevre wrote:

> Wow! This is really confusing! [:space:] is defined by POSIX,

Emacs regexps are _not_ defined by POSIX.

>     A character alternative can also specify named character classes
>     (*note Char Classes::).  This is a POSIX feature.  [...]

Did you read the referenced node?

‘[:space:]’
     This matches any character that has whitespace syntax (*note Syntax
     Class Table::).

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 10:15           ` Vincent Lefevre
  2022-11-04 11:38             ` Andreas Schwab
@ 2022-11-04 11:45             ` Eli Zaretskii
  2022-11-04 13:56               ` Robert Pluim
  1 sibling, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-04 11:45 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> Date: Fri, 4 Nov 2022 11:15:13 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> > > The [[:space:]\n]+ in the doc is misleading. Since a newline character
> > > is a whitespace, [[:space:]]+ is sufficient.
> > 
> > That is only true in major modes where the newline has the
> > "whitespace" syntax, see the description of [:space:] in the ELisp
> > manual.  I guess you tried this in Fundamental mode or somesuch?
> 
> Wow! This is really confusing! [:space:] is defined by POSIX, and the
> manual even refers to it:
> 
>     A character alternative can also specify named character classes
>     (*note Char Classes::).  This is a POSIX feature.  [...]
> 
> You must not change its behavior! Making it depend on the major mode
> is even worse.

Too late for such changes, sorry.  Emacs interprets [:space:] like
that since at least Emacs 22, if not before.





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 10:41                   ` Vincent Lefevre
  2022-11-04 10:56                     ` Vincent Lefevre
@ 2022-11-04 11:48                     ` Eli Zaretskii
  1 sibling, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-04 11:48 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> Date: Fri, 4 Nov 2022 11:41:15 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> On 2022-11-04 09:14:57 +0200, Eli Zaretskii wrote:
> > > Date: Fri, 4 Nov 2022 03:29:50 +0100
> > > From: Vincent Lefevre <vincent@vinc17.net>
> > > Cc: 58992@debbugs.gnu.org
> > > 
> > > > Because the newline's syntax is not "whitespace" in those modes.
> > > 
> > > OK, but then, the question is why the newline's syntax is not
> > > "whitespace" in those modes...
> > 
> > Because the mode sets up its syntax tables for various needs, none of
> > which is Isearch.
> 
> But the manual does not discourage the use of syntax classes
> for searching, and it is not clear from the description that
> they won't do what the user expects.

It depends on the expectations of the user.  The manual does document
which classes mean what, and the user then should decide whether it
fits the particular job.





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 10:46                     ` Vincent Lefevre
@ 2022-11-04 11:50                       ` Eli Zaretskii
  0 siblings, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-04 11:50 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> Date: Fri, 4 Nov 2022 11:46:42 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> On 2022-11-04 09:25:39 +0200, Eli Zaretskii wrote:
> > > Date: Fri, 4 Nov 2022 04:38:05 +0100
> > > From: Vincent Lefevre <vincent@vinc17.net>
> > > Cc: 58992@debbugs.gnu.org
> > > 
> > > This is worse: even when changing search-whitespace-regexp to
> > > include the newline character, this doesn't work in Texinfo!
> > 
> > I cannot reproduce this.  Please tell me what value of
> > search-whitespace-regexp you used, how exactly you changed the
> > value, and what you tried in a Texinfo buffer that didn't work.
> 
> I was using [[:space:]]+, based on the behavior of the C library and
> the one I get with any other application. See my other message about
> that.

Then I suggest using the value mentioned in the manual, or just
customizing the variable via Customize.  Then you'll get what you want,
I think.





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 10:56                     ` Vincent Lefevre
@ 2022-11-04 11:52                       ` Eli Zaretskii
  2022-11-04 13:04                         ` Vincent Lefevre
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-04 11:52 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> Date: Fri, 4 Nov 2022 11:56:37 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> I've searched on the web and found the Emacs wiki
> 
>   https://www.emacswiki.org/emacs/RegularExpression
> 
> It even encourages the use of "\s-" for searching: this is a page
> mainly for searching (see the first line of the text), and in the
> "Regexp Syntax Basics" (note: *basics*):
> 
>   \sc      character with c syntax (e.g. \s- for whitespace char)
> 
> So users are very likely to use it.

The Emacs project is not responsible for what the wiki says, and at
least I take every opportunity to tell users to treat what it says
with a grain of salt.  Emacs comes with very detailed manuals, and
they and the doc strings should be used as the definitive
documentation, not the wiki or SE or any other place.





* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 11:38             ` Andreas Schwab
@ 2022-11-04 12:47               ` Vincent Lefevre
  2022-11-04 13:25                 ` Andreas Schwab
  0 siblings, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04 12:47 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Eli Zaretskii, 58992

On 2022-11-04 12:38:14 +0100, Andreas Schwab wrote:
> On Nov 04 2022, Vincent Lefevre wrote:
> 
> > Wow! This is really confusing! [:space:] is defined by POSIX,
> 
> Emacs regexps are _not_ defined by POSIX.
> 
> >     A character alternative can also specify named character classes
> >     (*note Char Classes::).  This is a POSIX feature.  [...]
                                 ^^^^^^^^^^^^^^^^^^^^^^^

> Did you read the referenced node?

Did you read what the manual says?

It is not up to the user to search for contradictory information.

Instead of saying that this is a POSIX feature, the manual should say
that even though they look like POSIX character classes, the Emacs ones
are different. Moreover, since this is surprising[*], this section should
also say that the character classes depend on the major mode (the
referenced node is there to give details, but surprising behavior
should be emphasized).

[*] Regexps (in particular, character classes) conventionally depend
on locales, but on nothing else. Emacs is the exception to the general
rule.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 11:52                       ` Eli Zaretskii
@ 2022-11-04 13:04                         ` Vincent Lefevre
  2022-11-04 13:40                           ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04 13:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992

On 2022-11-04 13:52:52 +0200, Eli Zaretskii wrote:
> > Date: Fri, 4 Nov 2022 11:56:37 +0100
> > From: Vincent Lefevre <vincent@vinc17.net>
> > Cc: 58992@debbugs.gnu.org
> > 
> > I've searched on the web and found the Emacs wiki
> > 
> >   https://www.emacswiki.org/emacs/RegularExpression
> > 
> > It even encourages the use of "\s-" for searching: this is a page
> > mainly for searching (see the first line of the text), and in the
> > "Regexp Syntax Basics" (note: *basics*):
> > 
> >   \sc      character with c syntax (e.g. \s- for whitespace char)
> > 
> > So users are very likely to use it.
> 
> The Emacs project is not responsible for what the wiki says, and at
> least I take every opportunity to tell users to treat what it says
> with a grain of salt.  Emacs comes with very detailed manuals, and
> they and the doc strings should be used as the definitive
> documentation, not the wiki or SE or any other place.

The Emacs manuals are very detailed, but this is also an issue: it
may be difficult for the user to distinguish between details and
important things (in particular, when they are unintuitive, as for
behavior different from standards and other applications). The wiki
is a good example: even experts (who contributed to the wiki) can
be wrong or give bad advice. So when there is something misleading,
the manual should give big warnings and make sure that the user
will see them.

Also note that regexps are used everywhere, in many applications,
and users should not be expected to read again and again *whole*
parts of the manuals about common things like regexps.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 12:47               ` Vincent Lefevre
@ 2022-11-04 13:25                 ` Andreas Schwab
  2022-11-04 14:32                   ` Vincent Lefevre
  0 siblings, 1 reply; 53+ messages in thread
From: Andreas Schwab @ 2022-11-04 13:25 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Eli Zaretskii, 58992

On Nov 04 2022, Vincent Lefevre wrote:

> On 2022-11-04 12:38:14 +0100, Andreas Schwab wrote:
>> On Nov 04 2022, Vincent Lefevre wrote:
>> 
>> > Wow! This is really confusing! [:space:] is defined by POSIX,
>> 
>> Emacs regexps are _not_ defined by POSIX.
>> 
>> >     A character alternative can also specify named character classes
>> >     (*note Char Classes::).  This is a POSIX feature.  [...]
>                                  ^^^^^^^^^^^^^^^^^^^^^^^
>
>> Did you read the referenced node?
>
> Did you read what the manual says?

Yes, I did.

‘[:space:]’
     This matches any character that has whitespace syntax (*note Syntax
     Class Table::).
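
As an illustration of that definition (not taken from the manual), here
is a minimal sketch showing that the same regexp can match or fail
depending on the syntax table in effect.  The helper name
`my/space-class-matches-p' is hypothetical; the other functions and
macros are standard Emacs Lisp.

```elisp
;; Hypothetical helper (not part of Emacs): return t if "[[:space:]]"
;; matches CHAR while SYNTAX-TABLE is the current syntax table.
(defun my/space-class-matches-p (char syntax-table)
  (with-temp-buffer
    (insert char)
    (with-syntax-table syntax-table
      (goto-char (point-min))
      (and (re-search-forward "[[:space:]]" nil t) t))))

(let ((table (make-syntax-table)))   ; inherits from the standard table
  ;; Reclassify the space character as punctuation in this table only,
  ;; as a major mode could do in its own syntax table.
  (modify-syntax-entry ?\s "." table)
  (list (my/space-class-matches-p ?\s (standard-syntax-table))
        (my/space-class-matches-p ?\s table)))
```

With the standard syntax table the space character matches; with the
modified table the second call is expected to return nil, because the
character no longer has whitespace syntax.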

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 13:04                         ` Vincent Lefevre
@ 2022-11-04 13:40                           ` Eli Zaretskii
  0 siblings, 0 replies; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-04 13:40 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: 58992

> Date: Fri, 4 Nov 2022 14:04:14 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: 58992@debbugs.gnu.org
> 
> The Emacs manuals are very detailed, but this is also an issue: it
> may be difficult for the user to distinguish between details and
> important things (in particular, when they are unintuitive, as for
> behavior different from standards and other applications). The wiki
> is a good example: even experts (who contributed to the wiki) can
> be wrong or give bad advice. So when there is something misleading,
> the manual should give big warnings and make sure that the user
> will see them.
> 
> Also note that regexps are used everywhere, in many applications,
> and users should not be expected to read again and again *whole*
> parts of the manuals about common things like regexps.

I added notes to the manual about syntax and case tables that affect
regexps, but your expectations are in general impractical: there's no
way we could guarantee that reading some short excerpt from a manual
will capture all the possible caveats.  This is simply too complex an
issue to allow that, sorry.  So yes, users are expected to read whole
parts of the manual, including the cross-referenced nodes.  There's no
way around that.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 11:45             ` Eli Zaretskii
@ 2022-11-04 13:56               ` Robert Pluim
  2022-11-04 14:04                 ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Robert Pluim @ 2022-11-04 13:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 58992, Vincent Lefevre

>>>>> On Fri, 04 Nov 2022 13:45:35 +0200, Eli Zaretskii <eliz@gnu.org> said:
    >> A character alternative can also specify named character classes
    >> (*note Char Classes::).  This is a POSIX feature.  [...]
    >> 
    >> You must not change its behavior! Making it depend on the major mode
    >> is even worse.

    Eli> Too late for such changes, sorry.  Emacs interprets [:space:] like
    Eli> that since at least Emacs 22, if not before.

Would it help if it said "This is based on a POSIX feature, but not
100% identical"?

Robert

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 13:56               ` Robert Pluim
@ 2022-11-04 14:04                 ` Eli Zaretskii
  2022-11-04 15:00                   ` Vincent Lefevre
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-04 14:04 UTC (permalink / raw)
  To: Robert Pluim; +Cc: 58992, vincent

> From: Robert Pluim <rpluim@gmail.com>
> Cc: Vincent Lefevre <vincent@vinc17.net>,  58992@debbugs.gnu.org
> Date: Fri, 04 Nov 2022 14:56:36 +0100
> 
> >>>>> On Fri, 04 Nov 2022 13:45:35 +0200, Eli Zaretskii <eliz@gnu.org> said:
>     >> A character alternative can also specify named character classes
>     >> (*note Char Classes::).  This is a POSIX feature.  [...]
>     >> 
>     >> You must not change its behavior! Making it depend on the major mode
>     >> is even worse.
> 
>     Eli> Too late for such changes, sorry.  Emacs interprets [:space:] like
>     Eli> that since at least Emacs 22, if not before.
> 
> Would it help if it said "This is based on a POSIX feature, but not
> 100% identical"?

No.  But we can remove that sentence, since it doesn't add anything to
the text.  Done.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 13:25                 ` Andreas Schwab
@ 2022-11-04 14:32                   ` Vincent Lefevre
  2022-11-04 14:35                     ` Andreas Schwab
  0 siblings, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04 14:32 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Eli Zaretskii, 58992

On 2022-11-04 14:25:20 +0100, Andreas Schwab wrote:
> On Nov 04 2022, Vincent Lefevre wrote:
> 
> > On 2022-11-04 12:38:14 +0100, Andreas Schwab wrote:
> >> On Nov 04 2022, Vincent Lefevre wrote:
> >> 
> >> > Wow! This is really confusing! [:space:] is defined by POSIX,
> >> 
> >> Emacs regexps are _not_ defined by POSIX.
> >> 
> >> >     A character alternative can also specify named character classes
> >> >     (*note Char Classes::).  This is a POSIX feature.  [...]
> >                                  ^^^^^^^^^^^^^^^^^^^^^^^
> >
> >> Did you read the referenced node?
> >
> > Did you read what the manual says?
> 
> Yes, I did.

No, you didn't.

> ‘[:space:]’
>      This matches any character that has whitespace syntax (*note Syntax
>      Class Table::).

This is NOT a POSIX feature!!!

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 14:32                   ` Vincent Lefevre
@ 2022-11-04 14:35                     ` Andreas Schwab
  2022-11-04 15:02                       ` Vincent Lefevre
  0 siblings, 1 reply; 53+ messages in thread
From: Andreas Schwab @ 2022-11-04 14:35 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Eli Zaretskii, 58992

On Nov 04 2022, Vincent Lefevre wrote:

> On 2022-11-04 14:25:20 +0100, Andreas Schwab wrote:
>> On Nov 04 2022, Vincent Lefevre wrote:
>> 
>> > On 2022-11-04 12:38:14 +0100, Andreas Schwab wrote:
>> >> On Nov 04 2022, Vincent Lefevre wrote:
>> >> 
>> >> > Wow! This is really confusing! [:space:] is defined by POSIX,
>> >> 
>> >> Emacs regexps are _not_ defined by POSIX.
>> >> 
>> >> >     A character alternative can also specify named character classes
>> >> >     (*note Char Classes::).  This is a POSIX feature.  [...]
>> >                                  ^^^^^^^^^^^^^^^^^^^^^^^
>> >
>> >> Did you read the referenced node?
>> >
>> > Did you read what the manual says?
>> 
>> Yes, I did.
>
> No, you didn't.
>
>> ‘[:space:]’
>>      This matches any character that has whitespace syntax (*note Syntax
>>      Class Table::).
>
> This is NOT a POSIX feature!!!

Yes, it is.  POSIX has introduced that syntax, and Emacs copied it.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 14:04                 ` Eli Zaretskii
@ 2022-11-04 15:00                   ` Vincent Lefevre
  2022-11-04 15:23                     ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04 15:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Robert Pluim, 58992

On 2022-11-04 16:04:07 +0200, Eli Zaretskii wrote:
> > From: Robert Pluim <rpluim@gmail.com>
> > Cc: Vincent Lefevre <vincent@vinc17.net>,  58992@debbugs.gnu.org
> > Date: Fri, 04 Nov 2022 14:56:36 +0100
> > 
> > >>>>> On Fri, 04 Nov 2022 13:45:35 +0200, Eli Zaretskii <eliz@gnu.org> said:
> >     >> A character alternative can also specify named character classes
> >     >> (*note Char Classes::).  This is a POSIX feature.  [...]
> >     >> 
> >     >> You must not change its behavior! Making it depend on the major mode
> >     >> is even worse.
> > 
> >     Eli> Too late for such changes, sorry.  Emacs interprets [:space:] like
> >     Eli> that since at least Emacs 22, if not before.
> > 
> > Would it help if it said "This is based on a POSIX feature, but not
> > 100% identical"?
> 
> No.  But we can remove that sentence, since it doesn't add anything to
> the text.  Done.

IMHO, Section "Syntax of Regular Expressions" (for both Emacs and Elisp)
should warn that the meaning of a regular expression may depend on the
major mode.

Moreover, the [[:space:]\n]+ suggestion for search-whitespace-regexp
should be changed to something that does not use [:space:], as there
is no guarantee that the usual whitespace characters (e.g. space and
tab characters) have whitespace syntax. The manual says:

  @item Whitespace characters: @samp{@ } or @samp{-}
  Characters that separate symbols and words from each other.
  Typically, whitespace characters have no other syntactic significance,
  and multiple whitespace characters are syntactically equivalent to a
  single one.  Space, tab, and formfeed are classified as whitespace in
  almost all major modes.

But for Python, multiple whitespace characters are not syntactically
equivalent to a single one. So in a Python major mode, the user would
not want to use [:space:] for searching.
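
A minimal sketch of such a setting, assuming an explicit character set
is acceptable.  The value below is an assumption chosen for
illustration; the wording actually adopted in the manual is not shown
in this thread.

```elisp
;; Assumed value for illustration: list the characters explicitly
;; instead of relying on [:space:], so that the match does not depend
;; on the syntax table of the current major mode.
(setq search-whitespace-regexp "[ \t\n]+")
```

With this value, a space typed in an incremental search would match
any run of spaces, tabs, and newlines, whatever syntax class the major
mode assigns to those characters.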

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 14:35                     ` Andreas Schwab
@ 2022-11-04 15:02                       ` Vincent Lefevre
  2022-11-04 15:24                         ` Andreas Schwab
  0 siblings, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-04 15:02 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Eli Zaretskii, 58992

On 2022-11-04 15:35:28 +0100, Andreas Schwab wrote:
> On Nov 04 2022, Vincent Lefevre wrote:
> 
> > On 2022-11-04 14:25:20 +0100, Andreas Schwab wrote:
> >> On Nov 04 2022, Vincent Lefevre wrote:
> >> 
> >> > On 2022-11-04 12:38:14 +0100, Andreas Schwab wrote:
> >> >> On Nov 04 2022, Vincent Lefevre wrote:
> >> >> 
> >> >> > Wow! This is really confusing! [:space:] is defined by POSIX,
> >> >> 
> >> >> Emacs regexps are _not_ defined by POSIX.
> >> >> 
> >> >> >     A character alternative can also specify named character classes
> >> >> >     (*note Char Classes::).  This is a POSIX feature.  [...]
> >> >                                  ^^^^^^^^^^^^^^^^^^^^^^^
> >> >
> >> >> Did you read the referenced node?
> >> >
> >> > Did you read what the manual says?
> >> 
> >> Yes, I did.
> >
> > No, you didn't.
> >
> >> ‘[:space:]’
> >>      This matches any character that has whitespace syntax (*note Syntax
> >>      Class Table::).
> >
> > This is NOT a POSIX feature!!!
> 
> Yes, it is.  POSIX has introduced that syntax, and Emacs copied it.

Since the meaning in Emacs is different, Emacs didn't copy it.
It was merely inspired by POSIX.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 15:00                   ` Vincent Lefevre
@ 2022-11-04 15:23                     ` Eli Zaretskii
  2022-11-05  1:55                       ` Vincent Lefevre
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-04 15:23 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: rpluim, 58992

> Date: Fri, 4 Nov 2022 16:00:02 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: Robert Pluim <rpluim@gmail.com>, 58992@debbugs.gnu.org
> 
> IMHO, Section "Syntax of Regular Expressions" (for both Emacs and Elisp)
> should warn that the meaning of a regular expression may depend on the
> major mode.

Already done.  Did you look at the latest version in Git?

> Moreover, the [[:space:]\n]+ suggestion for search-whitespace-regexp
> should be changed to something that does not use [:space:]

Done.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 15:02                       ` Vincent Lefevre
@ 2022-11-04 15:24                         ` Andreas Schwab
  0 siblings, 0 replies; 53+ messages in thread
From: Andreas Schwab @ 2022-11-04 15:24 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Eli Zaretskii, 58992

On Nov 04 2022, Vincent Lefevre wrote:

> Since the meaning in Emacs is different, Emacs didn't copy it.

It copied the syntax, as I said.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-04 15:23                     ` Eli Zaretskii
@ 2022-11-05  1:55                       ` Vincent Lefevre
  2022-11-05  6:47                         ` Eli Zaretskii
  0 siblings, 1 reply; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-05  1:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 58992

On 2022-11-04 17:23:11 +0200, Eli Zaretskii wrote:
> > Date: Fri, 4 Nov 2022 16:00:02 +0100
> > From: Vincent Lefevre <vincent@vinc17.net>
> > Cc: Robert Pluim <rpluim@gmail.com>, 58992@debbugs.gnu.org
> > 
> > IMHO, Section "Syntax of Regular Expressions" (for both Emacs and Elisp)
> > should warn that the meaning of a regular expression may depend on the
> > major mode.
> 
> Already done.  Did you look at the latest version in Git?

If I'm not mistaken, it is said only in Section "Character Classes",
not at a higher level.

> > Moreover, the [[:space:]\n]+ suggestion for search-whitespace-regexp
> > should be changed to something that does not use [:space:]
> 
> Done.

OK, thanks.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-05  1:55                       ` Vincent Lefevre
@ 2022-11-05  6:47                         ` Eli Zaretskii
  2022-11-05 11:20                           ` Vincent Lefevre
  0 siblings, 1 reply; 53+ messages in thread
From: Eli Zaretskii @ 2022-11-05  6:47 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: rpluim, 58992

> Date: Sat, 5 Nov 2022 02:55:54 +0100
> From: Vincent Lefevre <vincent@vinc17.net>
> Cc: rpluim@gmail.com, 58992@debbugs.gnu.org
> 
> On 2022-11-04 17:23:11 +0200, Eli Zaretskii wrote:
> > > Date: Fri, 4 Nov 2022 16:00:02 +0100
> > > From: Vincent Lefevre <vincent@vinc17.net>
> > > Cc: Robert Pluim <rpluim@gmail.com>, 58992@debbugs.gnu.org
> > > 
> > > IMHO, Section "Syntax of Regular Expressions" (for both Emacs and Elisp)
> > > should warn that the meaning of a regular expression may depend on the
> > > major mode.
> > 
> > Already done.  Did you look at the latest version in Git?
> 
> If I'm not mistaken, it is said only in Section "Character Classes",
> not at a higher level.

Yes, because that's where [:space:] is described.  Saying it in other
places would be a didactic mistake, because those other places are not
directly related to use of character classes in regular expressions.

In general, when one looks for details of some Emacs feature, one has
to read the parts of the manual which actually describe that feature
in all its details.

* bug#58992: 28.2; "lax space matching" no longer works
  2022-11-05  6:47                         ` Eli Zaretskii
@ 2022-11-05 11:20                           ` Vincent Lefevre
  0 siblings, 0 replies; 53+ messages in thread
From: Vincent Lefevre @ 2022-11-05 11:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rpluim, 58992

On 2022-11-05 08:47:27 +0200, Eli Zaretskii wrote:
> > If I'm not mistaken, it is said only in Section "Character Classes",
> > not at a higher level.
> 
> Yes, because that's where [:space:] is described.  Saying it in other
> places would be a didactic mistake, because those other places are not
> directly related to use of character classes in regular expressions.
> 
> In general, when one looks for details of some Emacs feature, one has
> to read the parts of the manual which actually describe that feature
> in all its details.

Not necessarily. The regexp may come from somewhere else (the Emacs
manual used to suggest one for search-whitespace-regexp; this is no
longer the case, but it could appear in various other places), and
the user may test it without looking at the details. Not knowing
that the behavior depends on the major mode, the user may then draw
wrong conclusions from these tests.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
