unofficial mirror of emacs-devel@gnu.org 
* Regexps and strings once again
@ 2014-09-14 23:27 Lars Magne Ingebrigtsen
From: Lars Magne Ingebrigtsen @ 2014-09-14 23:27 UTC
  To: emacs-devel

(Skip to "1)" if you're not interested in why I started thinking about
this now.)

I was just fiddling around with a DOM traversal library (i.e., "document
object model", or something -- HTML traversal, like), and it has
functions for finding nodes by various criteria, like IDs.  So there are
functions like `dom-by-id' that take a DOM fragment and an ID and
return the matching nodes.

I wrote the function as taking a regexp.  And I find that what I get
wrong 90% of the time when using it is that I expect an exact match, but
instead I get every node whose ID the regexp matches anywhere.
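
To illustrate the mismatch, here is a minimal sketch (not the actual
`dom-by-id' code; the ID strings and the `my-exact-regexp' helper are
made up for illustration).  An unanchored regexp matches any ID that
contains it, so getting an exact match out of a regexp-only interface
means quoting and anchoring by hand:

```elisp
;; Unanchored regexp: matches any ID that merely contains "main".
(string-match-p "main" "main-nav")      ; non-nil (match at index 0)

;; What the caller usually meant: an exact comparison.
(string= "main" "main-nav")             ; nil

;; Hypothetical helper: build a regexp that matches exactly STRING.
(defun my-exact-regexp (string)
  "Return a regexp that matches STRING and nothing else."
  ;; `regexp-quote' escapes special characters; \\` and \\' anchor the
  ;; match to the start and end of the string being searched.
  (concat "\\`" (regexp-quote string) "\\'"))

(string-match-p (my-exact-regexp "main") "main-nav") ; nil
(string-match-p (my-exact-regexp "main") "main")     ; non-nil
```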

This reminded me of this pretty general problem once again.  We have
oodles of functions in Emacs that do matching either on exact(ish)
strings or on regexps, and then we have an optional parameter that says
whether we want to interpret the string as an exact string or a
regexp.

It's kinda annoying, especially when the function defaults to the
interpretation you don't want.  And you have to remember which optional
parameter you're supposed to set.

So:  Here's yet another suggestion for how to deal with regexps in a
more general way in Emacs.  Or rather two.


1) New Special Syntax

A while ago, there was some suggestion about introducing a special
syntax for string literals, and it didn't really go anywhere, because
introducing a new syntax to Emacs is kinda a big deal.  But let's just
suggest it anyway:

(dom-by-id dom #/I (can)?haz new syntax/)

And see!  Perl Regexp syntax as well!  No more backslashitis!

Anyway, I assume that everybody would want this, but that it's too much
work for anybody to actually commit to.

2) Cheat; i.e., introduce a convention

What if we just mark a string as a regexp?

(dom-by-id dom (regexp "I \\(couldn't\\)?haz new syntax"))

It would basically just put a text property on the string, and functions
like `dom-by-id' would just do

(if (regexp-p match)
    (string-match match id)
  (string= match id))
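
A minimal sketch of how that convention could look, assuming a text
property named `regexp' is used as the marker.  The names `my-regexp',
`my-regexp-p' and `my-match-p' are hypothetical stand-ins for the
proposed `regexp' and `regexp-p'; nothing like this exists in Emacs:

```elisp
;; Hypothetical sketch of the "mark a string as a regexp" convention.
(defun my-regexp (string)
  "Return a copy of STRING marked as a regexp via a text property."
  ;; `propertize' copies STRING, so the caller's string is untouched.
  (propertize string 'regexp t))

(defun my-regexp-p (object)
  "Return non-nil if OBJECT is a string marked by `my-regexp'."
  ;; Limitation of this sketch: an empty string has no characters to
  ;; carry the property, so an empty regexp cannot be marked.
  (and (stringp object)
       (> (length object) 0)
       (get-text-property 0 'regexp object)))

(defun my-match-p (match id)
  "Return non-nil if MATCH matches ID.
A marked MATCH is treated as a regexp, anything else as an exact string."
  (if (my-regexp-p match)
      (string-match-p match id)
    ;; `string=' ignores text properties, so plain strings compare
    ;; exactly as before.
    (string= match id)))

(my-match-p "main" "main-nav")                       ; nil, exact by default
(my-match-p (my-regexp "\\`main") "main-nav")        ; non-nil, regexp on request
```

One consequence of this design is that the marker is lost whenever the
string is rebuilt by something that drops text properties (for example
`substring-no-properties'), in which case the string silently falls
back to exact matching.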

Of course, both `regexp' and the proposed new syntax could compile the
regexp and return a regexp object and stuff if we wanted to be more
efficient...  But the regexp cache is already quite efficient, isn't it?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




