unofficial mirror of emacs-devel@gnu.org 
* [BUG] Regexp compiler, problem with character classes
From: Johan Bockgård @ 2006-06-03  1:14 UTC



[I'm resending this because I think it's a serious bug. It makes
character classes totally unreliable.]

Character classes such as [[:space:]] are translated to fixed character
alternatives during the regexp compile phase. This is wrong, since the
syntax table in effect during the actual matching is the one that should
be taken into account. This may be non-trivial to fix.


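    (with-temp-buffer
      (list
       ;; Give `a' whitespace syntax: the class should match it.
       (progn (modify-syntax-entry ?a " ")
              (string-match "x[[:space:]]" "xa"))
       ;; Restore word syntax: the class should no longer match.
       (progn (modify-syntax-entry ?a "w")
              (string-match "x[[:space:]]" "xa"))))
    => (0 0)    ; expected: (0 nil)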
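
To make the distinction concrete, here is a minimal sketch in C of the
two schemes: expanding the class into a fixed set at compile time versus
consulting the syntax table at match time. It is a toy model, not the
code in regex.c; every name in it (toy_syntax_table, toy_charset,
toy_compile_space and so on) is hypothetical, and it assumes a
single-byte table with only two syntax classes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for a syntax table (hypothetical, not Emacs's type):
   one class code per byte, ' ' = whitespace, 'w' = word constituent.  */
typedef struct { char class_of[256]; } toy_syntax_table;

/* Toy compiled form of "[[:space:]]": the members frozen at compile time.  */
typedef struct { bool snapshot[256]; } toy_charset;

/* Start with every byte as a word constituent, except the space character.  */
void toy_init(toy_syntax_table *t)
{
  memset(t->class_of, 'w', sizeof t->class_of);
  t->class_of[' '] = ' ';
}

/* Rough analogue of modify-syntax-entry for the two classes modelled here.  */
void toy_set_syntax(toy_syntax_table *t, unsigned char c, char cls)
{
  assert(cls == ' ' || cls == 'w');
  t->class_of[c] = cls;
}

/* Compile-time scheme: expand the class into a fixed set of characters.  */
toy_charset toy_compile_space(const toy_syntax_table *t)
{
  toy_charset cs;
  for (int c = 0; c < 256; c++)
    cs.snapshot[c] = (t->class_of[c] == ' ');
  return cs;
}

/* Matching against the frozen set ignores later syntax-table changes.  */
bool toy_match_snapshot(const toy_charset *cs, unsigned char c)
{
  return cs->snapshot[c];
}

/* Match-time scheme: consult the table that is current when matching.  */
bool toy_match_live(const toy_syntax_table *t, unsigned char c)
{
  return t->class_of[c] == ' ';
}
```

In this model, if `a' is given whitespace syntax, the class is compiled,
and `a' is then switched back to word syntax, toy_match_snapshot still
accepts `a' while toy_match_live rejects it. That mirrors the (0 0)
result above, on the assumption that the compiled form is reused for the
second match.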



The debug dump of the compiled pattern shows `a' baked into the
charset:

Compiling pattern: x[[:space:]]

Compiled pattern: 
38 bytes used/174 bytes allocated.
fastmap: x
re_nsub: 0      regs_alloc: 0   can_be_null: 0  no_sub: 0       not_bol: 0      not_eol: 0      syntax: 340204
0:      /exactn/1/x
3:      /charset [\t\f a\302\200-\303\277]
37:     /succeed
38:     end of pattern.



As a side effect you get the behavior below, since the compiler takes no
care to set up the syntax table in the first place:


1)

    emacs -Q

    (with-temp-buffer
      (string-match "x[[:space:]]" "x\n"))

    => nil

(exit Emacs)


2)
    emacs -Q

    (with-temp-buffer
      (char-syntax ?\n)
      (string-match "x[[:space:]]" "x\n"))

    => 0


(The call to char-syntax matters because Fchar_syntax does
    gl_state.current_syntax_table = current_buffer->syntax_table;
as a side effect.)
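
The dependence on that assignment can be modelled in a few lines of C.
This is only a sketch under stated assumptions: toy_table,
toy_current_table, toy_char_is_space and toy_expand_space_class are
hypothetical stand-ins for the syntax table, the global in gl_state,
Fchar_syntax and the class expansion in the compiler; what the real
global holds before anything refreshes it is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy syntax table (hypothetical): only records whitespace membership.  */
typedef struct { bool is_space[256]; } toy_table;

/* Hypothetical stand-in for gl_state.current_syntax_table.  */
const toy_table *toy_current_table;

/* Stand-in for Fchar_syntax: refreshes the global from the buffer's
   table as a side effect, then answers the query.  */
bool toy_char_is_space(const toy_table *buffer_table, unsigned char c)
{
  toy_current_table = buffer_table;
  return toy_current_table->is_space[c];
}

/* Stand-in for the compiler expanding [[:space:]]: it reads whatever
   the global points at and never sets it up itself.  */
void toy_expand_space_class(bool out[256])
{
  assert(toy_current_table != NULL);
  for (int c = 0; c < 256; c++)
    out[c] = toy_current_table->is_space[c];
}
```

If the global still points at some other table in which newline is not
whitespace, the expansion leaves newline out, as in case 1. Calling
toy_char_is_space with the buffer's table first makes the next expansion
include it, as in case 2. A compiler that assigned the global itself
before expanding would not depend on which function happened to run
beforehand.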

-- 
This is bad.
