unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#30568: 27.0.50; `rx' doesn't create optimal regex for (group (or ...))
@ 2018-02-21 14:31 p.stephani2
  2019-12-13 19:18 ` Mattias Engdegård
  0 siblings, 1 reply; 2+ messages in thread
From: p.stephani2 @ 2018-02-21 14:31 UTC
  To: 30568


emacs -Q -batch --eval='(progn (princ (rx (group (or "aaa" "bbb")))) (terpri))'
==> \(\(?:aaa\|bbb\)\)

This should generate \(aaa\|bbb\) instead.  Of course, these regexes are
equivalent, but the second one is easier to read (and maybe faster).
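
As a quick check of the equivalence claimed above, the following sketch matches both regexps against the same text and compares what group 1 captures. The helper name `my/group-1` and the sample string are hypothetical and not part of the report.

```elisp
;; Hypothetical helper (not from the report): return the text captured
;; by group 1 when REGEXP matches STRING, or nil if there is no match.
(defun my/group-1 (regexp string)
  (when (string-match regexp string)
    (match-string 1 string)))

(let ((generated "\\(\\(?:aaa\\|bbb\\)\\)") ; what rx currently emits
      (preferred "\\(aaa\\|bbb\\)"))        ; what the report asks for
  ;; Both elements of the list should be the same captured substring.
  (list (my/group-1 generated "xxbbbyy")
        (my/group-1 preferred "xxbbbyy")))
```

Both regexps have exactly one capturing group, so code that reads group 1 behaves the same with either form.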


In GNU Emacs 27.0.50 (build 16, x86_64-pc-linux-gnu, GTK+ Version 3.22.24)
  of 2018-02-21 built on localhost
Repository revision: d599dce1353ce59d134fcff21cde02c70025253d
Windowing system distributor 'The X.Org Foundation', version 11.0.11903000
System Description: Debian GNU/Linux buster/sid

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Configured using:
  'configure --without-threads --enable-gcc-warnings=warn-only
  --enable-gtk-deprecation-warnings --without-pop --with-mailutils
  --enable-checking --enable-check-lisp-object-type --with-modules
  'CFLAGS=-O0 -ggdb3''

Configured features:
XPM JPEG TIFF GIF PNG SOUND DBUS GSETTINGS NOTIFY GNUTLS FREETYPE XFT
ZLIB TOOLKIT_SCROLL_BARS GTK3 X11 MODULES JSON

Important settings:
   value of $LANG: en_US.UTF-8
   locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
   tooltip-mode: t
   global-eldoc-mode: t
   eldoc-mode: t
   electric-indent-mode: t
   mouse-wheel-mode: t
   tool-bar-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   auto-composition-mode: t
   auto-encryption-mode: t
   auto-compression-mode: t
   line-number-mode: t
   transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rmc puny seq byte-opt gv
bytecomp byte-compile cconv cl-loaddefs cl-lib dired dired-loaddefs
format-spec rfc822 mml easymenu mml-sec password-cache epa derived epg
epg-config gnus-util rmail rmail-loaddefs mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mail-utils elec-pair time-date
mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel term/x-win x-win term/common-win x-dnd tool-bar
dnd fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode elisp-mode lisp-mode prog-mode register page menu-bar
rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock
syntax facemenu font-core term/tty-colors frame cl-generic cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese composite charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote dbusbind inotify
dynamic-setting system-font-setting font-render-setting move-toolbar gtk
x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 95399 9075)
  (symbols 48 20247 1)
  (miscs 40 40 121)
  (strings 32 28348 1815)
  (string-bytes 1 757412)
  (vectors 16 14141)
  (vector-slots 8 499378 12892)
  (floats 8 50 67)
  (intervals 56 223 0)
  (buffers 992 12))

-- 
Google Germany GmbH
Erika-Mann-Straße 33
80636 München

Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

If you received this communication by mistake, please don’t forward it to
anyone else (it may contain confidential or privileged information), please
erase all copies of it, including all attachments, and please let the sender
know it went to the wrong person.  Thanks.


* bug#30568: 27.0.50; `rx' doesn't create optimal regex for (group (or ...))
  2018-02-21 14:31 bug#30568: 27.0.50; `rx' doesn't create optimal regex for (group (or ...)) p.stephani2
@ 2019-12-13 19:18 ` Mattias Engdegård
  0 siblings, 0 replies; 2+ messages in thread
From: Mattias Engdegård @ 2019-12-13 19:18 UTC
  To: 30568; +Cc: Philipp

> (rx (group (or "aaa" "bbb")))
> ==> \(\(?:aaa\|bbb\)\)
>
> This should generate \(aaa\|bbb\) instead.  Of course, these regexes are
> equivalent, but the second one is easier to read (and maybe faster).

This remains unchanged, I'm afraid, despite rx being completely rewritten. Not that it matters much: brackets do not generate any regexp bytecode, thus matching performance isn't affected once the regexp has been compiled. When the brackets are required, there is no waste:

(rx (+ (or "aaa" "bbb"))) 
=> "\\(?:aaa\\|bbb\\)+"

Still, it's a bit untidy, and I like that you reported it. We could add a special value for the PAREN argument to regexp-opt to prevent bracketing altogether, I suppose. It isn't immediately on my to-do list, however.
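
For context on the PAREN argument mentioned above, here is a minimal sketch of the two existing `regexp-opt` modes that are relevant: the default (nil), which adds a shy group only when one is needed, and `t`, which makes `regexp-opt` emit the capturing group itself. The variable names are hypothetical, and the "no bracketing at all" value proposed in the message is not shown because it does not exist in this form.

```elisp
;; PAREN nil: a shy group \(?:...\) is added only when it is needed to
;; keep the alternation together.
(defvar my/shy (regexp-opt '("aaa" "bbb")))

;; PAREN t: regexp-opt wraps the alternation in a capturing group.
(defvar my/capturing (regexp-opt '("aaa" "bbb") t))

;; Inspect both results side by side.
(list my/shy my/capturing)
```

The untidy output in the report appears to come from `group` adding its own brackets around the already-bracketed shy form.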
