From: Lars Magne Ingebrigtsen <larsi@gnus.org>
To: emacs-devel@gnu.org
Subject: Regexps and strings once again
Date: Mon, 15 Sep 2014 01:27:51 +0200
Message-ID: <m3lhplixzc.fsf@stories.gnus.org>

(Skip to section 1 if you're not interested in why I started thinking
about this now.)

I was just fiddling around with a DOM traversal library (i.e., "document
object model", or something -- HTML traversal, like), and it has
functions for finding nodes by various criteria, like IDs.  So there are
functions like `dom-by-id' that take a DOM fragment and an ID and
return the matching nodes.

I wrote the function as taking a regexp.  And I find that what I get
wrong 90% of the time when using it is that I expect an exact match, but
instead I get every node whose ID merely contains a match.
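
To make the mismatch concrete, here is a small sketch using only
standard string functions.  The ID values are made up for illustration:

```elisp
;; A regexp handed to `string-match-p' matches anywhere in the string,
;; so a lookup meant for the ID "main" also hits "main-menu".
(string-match-p "main" "main-menu")   ; non-nil: substring match
(string= "main" "main-menu")          ; nil: exact comparison

;; Getting exact-match behaviour out of a regexp API means quoting and
;; anchoring the string by hand.
(string-match-p (concat "\\`" (regexp-quote "main") "\\'") "main-menu") ; nil
(string-match-p (concat "\\`" (regexp-quote "main") "\\'") "main")      ; non-nil
```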

This reminded me of this pretty general problem once again.  We have
oodles of functions in Emacs that do matching either on exact(ish)
strings or on regexps, and then we have an optional parameter that says
whether we want to interpret the string as an exact string or a
regexp.

It's kinda annoying, especially when the function defaults to the
interpretation you don't want.  And you have to remember which optional
parameter you're supposed to set.

So:  Here's yet another suggestion for how to deal with regexps in a
more general way in Emacs.  Or rather two.


1) New Special Syntax

A while ago, there was some suggestion about introducing a special
syntax for string literals, and it didn't really go anywhere, because
introducing a new syntax to Emacs is kinda a big deal.  But let's just
suggest it anyway:

(dom-by-id dom #/I (can)?haz new syntax/)

And see!  Perl Regexp syntax as well!  No more backslashitis!

Anyway, I assume that everybody would want this, but that it's too much
work for anybody to actually commit to.

2) Cheat; i.e., introduce a convention

What if we just mark a string as a regexp?

(dom-by-id dom (regexp "I \\(couldn't\\)?haz new syntax"))

It would basically just put a text property on the string, and functions
like `dom-by-id' would just do

(if (regexp-p match)
    (string-match match id)
  (string= match id))
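
For what it's worth, a minimal sketch of how the marking could work,
assuming a text property is enough.  The names `sketch-regexp',
`sketch-regexp-p' and `sketch-id-matches-p' are hypothetical (prefixed
so they don't clobber anything), and so is the property name:

```elisp
;; Hypothetical helpers; nothing with these names exists in Emacs.
(defun sketch-regexp (string)
  "Return a copy of STRING tagged as a regexp via a text property."
  (propertize string 'sketch-regexp t))

(defun sketch-regexp-p (object)
  "Return non-nil if OBJECT is a string tagged by `sketch-regexp'."
  (and (stringp object)
       (get-text-property 0 'sketch-regexp object)))

(defun sketch-id-matches-p (match id)
  "Compare ID against MATCH: as a regexp if tagged, else exactly."
  (if (sketch-regexp-p match)
      (string-match-p match id)
    (string= match id)))

;; Plain strings compare exactly; tagged strings match as regexps.
(sketch-id-matches-p "main" "main-menu")                  ; nil
(sketch-id-matches-p (sketch-regexp "main") "main-menu")  ; non-nil
```

One wrinkle with this approach: an empty string has no characters to
hang a text property on, so the empty regexp can't be marked this way.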

Of course, both `regexp' and the proposed new syntax could compile the
regexp and return a regexp object and stuff if we wanted to be more
efficient...  But the regexp cache is already quite efficient, isn't it?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no