unofficial mirror of emacs-devel@gnu.org 
* regular expressions that match nothing
@ 2019-05-14  7:25 philippe schnoebelen
From: philippe schnoebelen @ 2019-05-14  7:25 UTC
  To: emacs-devel



I was very happy to see that in v27.0.50 (regexp-opt nil) now properly
returns a regular expression that matches nothing, namely a\`. Thanks to
whoever fixed that old bug.

I was wondering why (regexp-opt nil) uses a\` rather than \'a or another
option such as \=a\=, so I did some profiling (see attached code).

The options that I tried have more or less the same response time when one
checks, via looking-at, whether the regexp matches at point. But when one
searches for a match across a whole buffer, some options are notably faster
than others. And a\` is not the best: \=a\=, for example, is much faster.
Other solutions may be faster still.

Of course this may depend on the internals of the specific regexp library
at hand. I do not know enough to judge. In fact, I believe that a solid
regular expression library should provide a dedicated regular expression
that matches nothing, with special but simple treatment that guarantees the
best response time.
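
At the Lisp level, the idea of a dedicated match-nothing regexp can be sketched as a named constant. This is only an illustration under stated assumptions: the names phs-sketch-regexp-match-nothing and phs-sketch-keywords-regexp are hypothetical, and the value a\` is simply the one that (regexp-opt nil) is reported to return above, not necessarily the fastest choice.

```elisp
;; Hypothetical names, for illustration only; not an existing Emacs API.
(defconst phs-sketch-regexp-match-nothing "a\\`"
  "Regexp that can never match: an `a' followed by beginning of text.")

(defun phs-sketch-keywords-regexp (keywords)
  "Return a regexp matching any string in KEYWORDS.
If KEYWORDS is empty, return a regexp that matches nothing."
  (if keywords
      (regexp-opt keywords)
    phs-sketch-regexp-match-nothing))
```

Keeping the value in one place would make it easy to swap in whichever candidate profiles best for buffer searches.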

--phs


[-- Attachment #2: profile-empty-regexp.el --]

(defun profile-empty-regexps (&optional buffer)
  "Report some matching times for several regular expressions."
  (interactive)
  (unless buffer
    (setq buffer (find-file-noselect (locate-library "regexp-opt"))))
  (with-output-to-temp-buffer "Profiling"
    (princ (profile-one-regexp "a\\`" buffer))        (princ "\n")
    (princ (profile-one-regexp "\\'a" buffer))        (princ "\n")
    (princ (profile-one-regexp "\\=.\\=" buffer))     (princ "\n")
    (princ (profile-one-regexp "\\=a\\=" buffer))     (princ "\n")
    ;; Note: "\\." matches a literal dot, so this entry is expected to
    ;; trigger the "a match was found" warning.
    (princ (profile-one-regexp "\\." buffer))         (princ "\n")))

(defun profile-one-regexp (regexp buffer &optional nbrepeats)
  "Time REGEXP against BUFFER and return a report string.
REGEXP is tried NBREPEATS times (default 50000), first with
`looking-at' at `point-min', then with `re-search-forward' over
the whole buffer."
  (setq nbrepeats (or nbrepeats 50000))
  (let (start-time duration1 duration2 found)
    (with-current-buffer buffer
      (save-excursion
        ;; Phase 1: match attempts at the start of the buffer.
        (setq start-time (current-time))
        (goto-char (point-min))
        (dotimes (_ nbrepeats)
          (looking-at regexp))
        (setq duration1 (time-subtract (current-time) start-time))
        ;; Phase 2: searches across the whole buffer.
        (setq start-time (current-time))
        (goto-char (point-min))
        (dotimes (_ nbrepeats)
          (when (re-search-forward regexp nil t)
            (setq found t)
            (goto-char (point-min))))   ; return to test position
        (setq duration2 (time-subtract (current-time) start-time))))
    (format (concat "Testing regexp %s %d times\n"
                    "\tmatch at point-min: %.4fs\n"
                    "\tsearch in buffer %s (size %d): %.4fs\n%s")
            regexp nbrepeats (float-time duration1)
            (buffer-name buffer) (buffer-size buffer) (float-time duration2)
            (if found "\t*** WARNING *** a match was found\n" ""))))
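
The profiler above only reports a match as a side effect of timing. A separate sanity check that a candidate finds nothing could be sketched as follows. The function name phs-sketch-matches-nothing-p and the sample text are hypothetical, and passing this check on one small buffer is evidence, not proof, that a regexp can never match.

```elisp
;; Hypothetical helper, for illustration only.
(defun phs-sketch-matches-nothing-p (regexp)
  "Return non-nil if REGEXP finds no match in a small scratch buffer.
The search starts at the beginning of the buffer."
  (with-temp-buffer
    (insert "a.a\nsome sample text.\n")
    (goto-char (point-min))
    (not (re-search-forward regexp nil t))))
```

With this sketch, (phs-sketch-matches-nothing-p "\\=a\\=") evaluates to t, while (phs-sketch-matches-nothing-p "\\.") evaluates to nil because the sample text contains a dot.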

