unofficial mirror of help-gnu-emacs@gnu.org
* Match empty string at begin/end of symbol
@ 2018-07-04 18:43 Joe Riel
  2018-07-04 19:21 ` Noam Postavsky
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Joe Riel @ 2018-07-04 18:43 UTC (permalink / raw)
  To: Help GNU Emacs

The regular expressions '\_<' and '\_>'
seem to be broken in Emacs 25.1.1.  Consider

(let ((str "3+ab"))
  (and (string-match "\\<[a-zA-Z][a-zA-Z0-9]*" str)
       (match-string 0 str)))
 
That returns "ab", as expected.  Change the "\\<" to "\\_<"
and it no longer matches.  Why not?

(let ((str "3+ab"))
  (and (string-match "\\_<[a-zA-Z][a-zA-Z0-9]*" str)
       (match-string 0 str)))

-- 
Joe Riel





* Re: Match empty string at begin/end of symbol
  2018-07-04 18:43 Match empty string at begin/end of symbol Joe Riel
@ 2018-07-04 19:21 ` Noam Postavsky
  2018-07-04 19:22 ` Eli Zaretskii
  2018-07-04 19:25 ` Teemu Likonen
  2 siblings, 0 replies; 5+ messages in thread
From: Noam Postavsky @ 2018-07-04 19:21 UTC (permalink / raw)
  To: Joe Riel; +Cc: Help GNU Emacs

On 4 July 2018 at 14:43, Joe Riel <joer@san.rr.com> wrote:

> That returns "ab", as expected.  Change the "\\<" to "\\_<"
> and it no longer matches.  Why not?
>
> (let ((str "3+ab"))
>   (and (string-match "\\_<[a-zA-Z][a-zA-Z0-9]*" str)
>        (match-string 0 str)))

"+ab" all have symbol syntax in lisp-mode, try evaluating it from a
c-mode buffer and you will get "ab".
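
One way to check this is to ask each mode's syntax table how it classifies
"+".  A minimal sketch, assuming the stock tables
`emacs-lisp-mode-syntax-table' and `c-mode-syntax-table' (the latter needs
cc-mode loaded); the helper name `my/plus-syntax-in' is made up for
illustration:

```elisp
(require 'cc-mode)  ; defines `c-mode-syntax-table'

(defun my/plus-syntax-in (table)
  "Return the syntax class designator of ?+ in TABLE as a one-char string."
  (with-syntax-table table
    (char-to-string (char-syntax ?+))))

;; "_" designates a symbol constituent, "." designates punctuation.
(list (my/plus-syntax-in emacs-lisp-mode-syntax-table)
      (my/plus-syntax-in c-mode-syntax-table))
```

If the Lisp table reports "_" and the C table reports ".", then "ab" can
only begin a symbol under the C table.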




* Re: Match empty string at begin/end of symbol
  2018-07-04 18:43 Match empty string at begin/end of symbol Joe Riel
  2018-07-04 19:21 ` Noam Postavsky
@ 2018-07-04 19:22 ` Eli Zaretskii
  2018-07-04 19:37   ` Joe Riel
  2018-07-04 19:25 ` Teemu Likonen
  2 siblings, 1 reply; 5+ messages in thread
From: Eli Zaretskii @ 2018-07-04 19:22 UTC (permalink / raw)
  To: help-gnu-emacs

> Date: Wed, 4 Jul 2018 11:43:46 -0700
> From: Joe Riel <joer@san.rr.com>
> 
> The regular expressions '\_<' and '\_>'
> seem to be broken in Emacs 25.1.1.  Consider
> 
> (let ((str "3+ab"))
>   (and (string-match "\\<[a-zA-Z][a-zA-Z0-9]*" str)
>        (match-string 0 str)))
>  
> That returns "ab", as expected.  Change the "\\<" to "\\_<"
> and it no longer matches.  Why not?
> 
> (let ((str "3+ab"))
>   (and (string-match "\\_<[a-zA-Z][a-zA-Z0-9]*" str)
>        (match-string 0 str)))

The result of the last form depends on the major mode of the buffer
where (or in whose minibuffer) you evaluate it.  If it's Lisp or its
derivatives, it indeed should not match because a Lisp symbol can
legitimately be named "3+ab", and so "ab" is not at a symbol
boundary.  But if you try the same in a buffer whose major mode is C
Mode, you surely get a match, because '+' is not a symbol-constituent
character in C.

In other words, I don't think there's a bug here; it's behaving as intended.
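
The dependence on the major mode can be reproduced without switching
buffers by hand.  The following is only a sketch: `my/match-symbol-start'
is a hypothetical helper, and it assumes that `emacs-lisp-mode' and
`c-mode' are available in the running Emacs.

```elisp
(defun my/match-symbol-start (mode)
  "Run the symbol-start regexp on the string 3+ab in a temp buffer in MODE.
MODE is a major-mode function such as `c-mode'."
  (with-temp-buffer
    (funcall mode)  ; installs that mode's syntax table in the temp buffer
    (let ((str "3+ab"))
      (and (string-match "\\_<[a-zA-Z][a-zA-Z0-9]*" str)
           (match-string 0 str)))))

(list (my/match-symbol-start #'emacs-lisp-mode)  ; no match expected here
      (my/match-symbol-start #'c-mode))          ; "ab" expected here
```

`string-match' consults the syntax table of the current buffer even when
it is matching against a string, which is why the temporary buffer's mode
matters.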




* Re: Match empty string at begin/end of symbol
  2018-07-04 18:43 Match empty string at begin/end of symbol Joe Riel
  2018-07-04 19:21 ` Noam Postavsky
  2018-07-04 19:22 ` Eli Zaretskii
@ 2018-07-04 19:25 ` Teemu Likonen
  2 siblings, 0 replies; 5+ messages in thread
From: Teemu Likonen @ 2018-07-04 19:25 UTC (permalink / raw)
  To: Joe Riel; +Cc: Help GNU Emacs


Joe Riel [2018-07-04 11:43:46-07] wrote:

> The regular expressions '\_<' and '\_>'
> seem to be broken in Emacs 25.1.1.  Consider
>
> (let ((str "3+ab"))
>   (and (string-match "\\<[a-zA-Z][a-zA-Z0-9]*" str)
>        (match-string 0 str)))
>  
> That returns "ab", as expected.  Change the "\\<" to "\\_<"
> and it no longer matches.  Why not?
>
> (let ((str "3+ab"))
>   (and (string-match "\\_<[a-zA-Z][a-zA-Z0-9]*" str)
>        (match-string 0 str)))

Because "+" is a symbol-constituent character too, so the whole string
is a single symbol. See this:


    ELISP> (let ((str "3+ab"))
             (and (string-match "\\_<.+\\_>" str)
                  (match-string 0 str)))

    "3+ab"



-- 
/// Teemu Likonen   - .-..   <https://keybase.io/tlikonen> //
// PGP: 4E10 55DC 84E9 DFF6 13D7 8557 719D 69D3 2453 9450 ///



* Re: Match empty string at begin/end of symbol
  2018-07-04 19:22 ` Eli Zaretskii
@ 2018-07-04 19:37   ` Joe Riel
  0 siblings, 0 replies; 5+ messages in thread
From: Joe Riel @ 2018-07-04 19:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: help-gnu-emacs

On Wed, 04 Jul 2018 22:22:15 +0300
Eli Zaretskii <eliz@gnu.org> wrote:

> > Date: Wed, 4 Jul 2018 11:43:46 -0700
> > From: Joe Riel <joer@san.rr.com>
> > 
> > The regular expressions '\_<' and '\_>'
> > seem to be broken in Emacs 25.1.1.  Consider
> > 
> > (let ((str "3+ab"))
> >   (and (string-match "\\<[a-zA-Z][a-zA-Z0-9]*" str)
> >        (match-string 0 str)))
> >  
> > That returns "ab", as expected.  Change the "\\<" to "\\_<"
> > and it no longer matches.  Why not?
> > 
> > (let ((str "3+ab"))
> >   (and (string-match "\\_<[a-zA-Z][a-zA-Z0-9]*" str)
> >        (match-string 0 str)))  
> 
> The result of the last form depends on the major mode of the buffer
> where (or in whose minibuffer) you evaluate it.  If it's Lisp or its
> derivatives, it indeed should not match because a Lisp symbol can
> legitimately be named "3+ab", and so "ab" is not at a symbol
> boundary.  But if you try the same in a buffer whose major mode is C
> Mode, you surely get a match, because '+' is not a symbol-constituent
> character in C.
> 
> IOW, I don't think there's a bug here.  It's behaving as intended.
> 

Thanks, Eli.  I verified that by wrapping the call in a with-syntax-table
environment set to the appropriate syntax table; all is well.  
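
A check along those lines might look like the sketch below.  It is not the
exact code used in this thread: the function name is hypothetical, and
instead of borrowing a mode's table it builds a small table in which "+"
is punctuation, as it is in C-like modes.

```elisp
(defun my/match-with-plus-as-punctuation (str)
  "Match an identifier at a symbol start in STR, treating + as punctuation."
  (let ((table (make-syntax-table)))  ; inherits from the standard table
    ;; Assumption: "+" should separate symbols rather than be part of one.
    (modify-syntax-entry ?+ "." table)
    (with-syntax-table table
      (and (string-match "\\_<[a-zA-Z][a-zA-Z0-9]*" str)
           (match-string 0 str)))))

(my/match-with-plus-as-punctuation "3+ab")  ; should return "ab"
```

Binding the table explicitly keeps the result independent of whichever
buffer the form happens to be evaluated in.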


-- 
Joe Riel




