* `compare-strings' style question
@ 2009-11-19 10:50 tomas
2009-11-20 3:08 ` Kevin Rodgers
0 siblings, 1 reply; 14+ messages in thread
From: tomas @ 2009-11-19 10:50 UTC (permalink / raw)
To: help-gnu-emacs
Hi,
In Elisp, I'm trying to test whether a string is a prefix of another.
Poking around the documentation, I stumbled upon `compare-strings',
which seems to do the job fairly well. The interface is a bit weird
(at least as seen from Lisp). It feels more like C's strcmp.
It returns t on exact match, and some numbers on mismatch. I understand
that the result might be useful in some cases (it tells one by how many
chars we miss a match), but then I can't just do
  (when (compare-strings foo 0 5 bar 0 5)
    ...)
but must do
  (when (eq (compare-strings foo 0 5 bar 0 5) t)
    ...)
which looks rather funny. My question: are there better idioms? Am I
barking up the wrong function?
Thanks for any insight
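For reference, a small sketch of what `compare-strings' returns, going by its docstring: t when the compared portions match, otherwise an integer whose absolute value minus one is the number of leading characters that matched (negative when the first string sorts lower). The sample strings are only illustrations.

```elisp
;; Compared portions match: returns t.
(compare-strings "Carmen" 0 4 "Carmichael" 0 4)          ; => t

;; Mismatch at index 3 ("l" vs "m"): three characters matched, and
;; "Carl" sorts before "Carm", so the result is -(3 + 1).
(compare-strings "Carl" nil nil "Carm" nil nil)          ; => -4

;; Any non-nil value is true in Lisp, so a bare `when' cannot tell a
;; match (t) from a mismatch (an integer); test for t explicitly.
(eq t (compare-strings "Carl" nil nil "Carm" nil nil))   ; => nil
```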
-- tomás
* Re: `compare-strings' style question
2009-11-19 10:50 `compare-strings' style question tomas
@ 2009-11-20 3:08 ` Kevin Rodgers
2009-11-20 7:03 ` tomas
0 siblings, 1 reply; 14+ messages in thread
From: Kevin Rodgers @ 2009-11-20 3:08 UTC (permalink / raw)
To: help-gnu-emacs
tomas@tuxteam.de wrote:
> In Elisp, I'm trying to test whether a string is a prefix of another.
> Poking around the documentation, I stumbled upon `compare-strings',
> which seems to do the job fairly well. The interface is a bit weird
> (at least as seen from Lisp) It feels more like C's strcmp.
>
> It returns t on exact match, and some numbers on mismatch. I understand
> that the result might be useful in some cases (it tells one by how many
> chars we miss a match), but then I can't just do
>
> (when (compare-strings foo 0 5 bar 0 5)
> ...)
>
> but must do
>
>
> (when (eq (compare-strings foo 0 5 bar 0 5) t)
> ...)
>
> which looks rather funny. My question: are there better idioms? Am I
> barking up the wrong function?
(when (string-match (concat "^" (regexp-quote foo)) bar)
...)
--
Kevin Rodgers
Denver, Colorado, USA
* Re: `compare-strings' style question
2009-11-20 3:08 ` Kevin Rodgers
@ 2009-11-20 7:03 ` tomas
2009-11-20 10:34 ` Kevin Rodgers
0 siblings, 1 reply; 14+ messages in thread
From: tomas @ 2009-11-20 7:03 UTC (permalink / raw)
To: help-gnu-emacs
On Thu, Nov 19, 2009 at 08:08:59PM -0700, Kevin Rodgers wrote:
> (when (string-match (concat "^" (regexp-quote foo)) bar)
> ...)
Thanks -- but I was trying to avoid conjuring up the whole regexp
machinery for this task. I admit that this looks less confusing, though.
Regards
-- tomás
* Re: `compare-strings' style question
2009-11-20 7:03 ` tomas
@ 2009-11-20 10:34 ` Kevin Rodgers
2009-11-24 9:32 ` tomas
0 siblings, 1 reply; 14+ messages in thread
From: Kevin Rodgers @ 2009-11-20 10:34 UTC (permalink / raw)
To: help-gnu-emacs
tomas@tuxteam.de wrote:
> On Thu, Nov 19, 2009 at 08:08:59PM -0700, Kevin Rodgers wrote:
>
>> (when (string-match (concat "^" (regexp-quote foo)) bar)
>> ...)
Oops, that should be:
  (when (let ((case-fold-search nil))
          (string-match (concat "^" (regexp-quote foo)) bar))
    ...)
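One caveat with the `^' anchor, sketched here as an aside: in Emacs regexps `^' also matches right after a newline inside the string, so for a strict prefix test the beginning-of-string anchor (backslash-backquote) is arguably the safer choice. The helper name `my-regexp-prefix-p' is hypothetical.

```elisp
(defun my-regexp-prefix-p (prefix string)
  "Return non-nil if STRING starts with PREFIX, compared case-sensitively.
Hypothetical helper that anchors at the beginning of the string."
  (let ((case-fold-search nil))
    ;; "\\`" matches only at the very start of STRING, unlike "^".
    (string-match-p (concat "\\`" (regexp-quote prefix)) string)))

(let ((case-fold-search nil))
  ;; `^' matches after the embedded newline, so this returns 5, not nil.
  (string-match "^Carm" "Carl\nCarmen"))

(my-regexp-prefix-p "Carm" "Carl\nCarmen")   ; => nil
(my-regexp-prefix-p "Carm" "Carmen")         ; => 0 (non-nil)
```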
> Thanks -- but I was trying to avoid conjuring up the whole regexp
> machinery for this task. I admit that this looks less confusing, though.
Perhaps you could enlighten us with some performance measurements?
Another contender:
  (let ((foo-len (length foo))
        (bar-len (length bar)))
    (cond ((> bar-len foo-len)
           ;; END is exclusive, so take exactly foo-len characters.
           (equal foo (substring bar 0 foo-len)))
          ((= bar-len foo-len)
           (equal foo bar))
          (t nil)))
--
Kevin Rodgers
Denver, Colorado, USA
* Re: `compare-strings' style question
2009-11-20 10:34 ` Kevin Rodgers
@ 2009-11-24 9:32 ` tomas
0 siblings, 0 replies; 14+ messages in thread
From: tomas @ 2009-11-24 9:32 UTC (permalink / raw)
To: help-gnu-emacs
On Fri, Nov 20, 2009 at 03:34:51AM -0700, Kevin Rodgers wrote:
[...]
> (when (let ((case-fold-search nil))
> (string-match (concat "^" (regexp-quote foo)) bar))
> ...)
>
>> Thanks -- but I was trying to avoid conjuring up the whole regexp
>> machinery for this task. I admit that this looks less confusing, though.
>
> Perhaps you could enlighten us with some performance measurements?
Originally it was more of an "economy of the tools" principle and not
a real concern about (computer) performance. But your question piqued my
curiosity, so here is a first shot at that:
  (let ((words
         '("Carl" "Carl's" "Carla" "Carla's" "Carlene" "Carlene's" "Carlin"
           "Carlin's" "Carlo" "Carlo's" "Carlos" "Carlsbad" "Carlsbad's"
           "Carlson" "Carlson's" "Carlton" "Carlton's" "Carly" "Carly's"
           "Carlyle" "Carlyle's" "Carmela" "Carmela's" "Carmella" "Carmella's"
           "Carmelo" "Carmelo's" "Carmen" "Carmen's" "Carmichael" "Carmichael's"
           "Carmine" "Carmine's" "Carnap" "Carnap's" "Carnation" "Carnation's"
           "Carnegie" "Carnegie's" "Carney" "Carney's" "Carnot" "Carnot's"
           "Carol" "Carol's" "Carole" "Carole's" "Carolina")))
    (insert (format "compare-strings: %S\n"
                    (benchmark-run-compiled 10000
                      (mapc (lambda (w)
                              (compare-strings "Carm" 0 3 w 0 3))
                            words))))
    (insert (format "string-match : %S\n"
                    (benchmark-run-compiled 10000
                      (mapc (lambda (w)
                              (string-match "^Carm" w))
                            words)))))
compare-strings: (0.399947 0 0.0)
string-match : (0.885371 0 0.0)
compare-strings: (0.387 0 0.0)
string-match : (0.870512 0 0.0)
compare-strings: (0.35596 0 0.0)
string-match : (0.88489 0 0.0)
This is with "benchmark-run" instead of "benchmark-run-compiled":
compare-strings: (0.61102 1 0.038892999999999955)
string-match : (0.980834 1 0.038853999999999944)
compare-strings: (0.6046680000000001 1 0.03880600000000001)
string-match : (1.002827 1 0.03884599999999999)
compare-strings: (0.608943 1 0.039271)
string-match : (0.979522 1 0.03894399999999998)
Thus, compare-strings seems a tad faster, although I don't believe it
matters very much (bear in mind *I* rigged the benchmark, tho ;-)
I don't know how Emacs handles its regular expressions (whether it
caches the compiled regexp and on which occasions it invalidates its
cache), but possibly your idiom above (string-match (concat "^" ...))
will kill another bunch of CPU cycles. But as I said, performance
wasn't my primary concern.
Heck. Let's try, using (concat "^" "Carm") instead of the literal "^Carm".
Compiled:
compare-strings: (0.400415 0 0.0)
string-match : (0.891014 0 0.0)
non-compiled:
compare-strings: (0.6066699999999999 1 0.04038800000000009)
string-match : (2.790288 35 1.410207000000001)
Is it the concat? Is it the re-compiling of the regexp? Dunno.
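A note on the benchmark above: END in `compare-strings' is exclusive, so `0 3' compares only "Car", while the regexp "^Carm" checks four characters. A like-for-like variant might look as follows. The names `my-bench-words' and `my-bench-prefix' and the shortened word list are made up for this sketch, and no timings are claimed for it.

```elisp
(require 'benchmark)

;; Hypothetical, shortened word list; `defvar' keeps it visible inside
;; the benchmarked forms regardless of the binding mode.
(defvar my-bench-words
  '("Carl" "Carla" "Carlos" "Carmela" "Carmen" "Carmichael" "Carol"))

(defun my-bench-prefix ()
  "Time both prefix tests over `my-bench-words', 10000 repetitions each.
Return an alist of (LABEL ELAPSED GC-COUNT GC-TIME) entries."
  (list
   (cons 'compare-strings
         (benchmark-run 10000
           (dolist (w my-bench-words)
             ;; 0..4 covers all of "Carm", matching the regexp below.
             (eq t (compare-strings "Carm" 0 4 w 0 4)))))
   (cons 'string-match
         (benchmark-run 10000
           (let ((case-fold-search nil))
             (dolist (w my-bench-words)
               (string-match "^Carm" w)))))))
```

Evaluating (my-bench-prefix) returns the two timing triples side by side, in the same shape as the figures quoted in this message.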
> Another contender:
>
> (let ((foo-len (length foo))
> (bar-len (length bar)))
> (cond ((> bar-len foo-len)
> (equal foo (substring bar 0 foo-len)))
> ((= bar-len foo-len)
> (equal foo bar))
> (t nil)))
This doesn't make the code much more readable, I fear.
Thanks
-- tomás
[parent not found: <mailman.11037.1258628274.2239.help-gnu-emacs@gnu.org>]
* Re: `compare-strings' style question
[not found] <mailman.11037.1258628274.2239.help-gnu-emacs@gnu.org>
@ 2009-11-19 11:39 ` David Kastrup
2009-11-19 15:27 ` tomas
[not found] ` <mailman.11057.1258644880.2239.help-gnu-emacs@gnu.org>
0 siblings, 2 replies; 14+ messages in thread
From: David Kastrup @ 2009-11-19 11:39 UTC (permalink / raw)
To: help-gnu-emacs
tomas@tuxteam.de writes:
> Hi,
>
> In Elisp, I'm trying to test whether a string is a prefix of another.
> Poking around the documentation, I stumbled upon `compare-strings',
> which seems to do the job fairly well. The interface is a bit weird
> (at least as seen from Lisp) It feels more like C's strcmp.
>
> It returns t on exact match, and some numbers on mismatch. I understand
> that the result might be useful in some cases (it tells one by how many
> chars we miss a match), but then I can't just do
>
> (when (compare-strings foo 0 5 bar 0 5)
> ...)
>
> but must do
>
>
> (when (eq (compare-strings foo 0 5 bar 0 5) t)
> ...)
>
> which looks rather funny. My question: are there better idioms? Am I
> barking up the wrong function?
(unless (numberp ...
or
(if (symbolp ...
--
David Kastrup
* Re: `compare-strings' style question
2009-11-19 11:39 ` David Kastrup
@ 2009-11-19 15:27 ` tomas
2009-11-19 19:55 ` Andreas Politz
[not found] ` <mailman.11057.1258644880.2239.help-gnu-emacs@gnu.org>
1 sibling, 1 reply; 14+ messages in thread
From: tomas @ 2009-11-19 15:27 UTC (permalink / raw)
To: help-gnu-emacs
On Thu, Nov 19, 2009 at 12:39:33PM +0100, David Kastrup wrote:
> tomas@tuxteam.de writes:
>
> > Hi,
> >
> > In Elisp, I'm trying to test whether a string is a prefix of another.
[...]
> > (when (eq (compare-strings foo 0 5 bar 0 5) t)
> > ...)
> >
> > which looks rather funny. My question: are there better idioms? Am I
> > barking up the wrong function?
>
> (unless (numberp ...
>
> or
>
> (if (symbolp ...
Thanks. Still looks a bit funna, though :-)
-- tomás
* Re: `compare-strings' style question
2009-11-19 15:27 ` tomas
@ 2009-11-19 19:55 ` Andreas Politz
2009-11-20 6:47 ` tomas
0 siblings, 1 reply; 14+ messages in thread
From: Andreas Politz @ 2009-11-19 19:55 UTC (permalink / raw)
To: help-gnu-emacs
tomas@tuxteam.de writes:
> On Thu, Nov 19, 2009 at 12:39:33PM +0100, David Kastrup wrote:
>> tomas@tuxteam.de writes:
>>
>> > Hi,
>> >
>> > In Elisp, I'm trying to test whether a string is a prefix of another.
>
> [...]
>
>> > (when (eq (compare-strings foo 0 5 bar 0 5) t)
>> > ...)
>> >
>> > which looks rather funny. My question: are there better idioms? Am I
>> > barking up the wrong function?
>>
>> (unless (numberp ...
>>
>> or
>>
>> (if (symbolp ...
>
> Thanks. Still looks a bit funna, though :-)
>
> -- tomás
  (defun string-prefixp (string prefix &optional ignore-case)
    "Return t if PREFIX is a prefix of STRING."
    (eq t
        (compare-strings string 0 (length prefix)
                         prefix 0 (length prefix)
                         ignore-case)))
I defined this function in one of my elisp files. Why don't you just do
the same? Where is the problem?
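A sketch of the same idea with a length guard, on the assumption (worth checking against your Emacs version) that an END index past the end of STRING may signal an error. The name `my-string-prefix-p' is made up. Newer Emacs releases also ship a built-in `string-prefix-p', which takes PREFIX as its first argument.

```elisp
(defun my-string-prefix-p (prefix string &optional ignore-case)
  "Return t if PREFIX is a prefix of STRING, nil otherwise.
With non-nil IGNORE-CASE, compare without regard to case."
  (let ((len (length prefix)))
    (and (<= len (length string))
         ;; Only reached when STRING is long enough, so both END
         ;; arguments stay within bounds.
         (eq t (compare-strings prefix 0 len string 0 len ignore-case)))))

(my-string-prefix-p "Carm" "Carmen")      ; => t
(my-string-prefix-p "Carm" "Car")         ; => nil
(my-string-prefix-p "carm" "Carmen" t)    ; => t
```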
-ap
* Re: `compare-strings' style question
2009-11-19 19:55 ` Andreas Politz
@ 2009-11-20 6:47 ` tomas
0 siblings, 0 replies; 14+ messages in thread
From: tomas @ 2009-11-20 6:47 UTC (permalink / raw)
To: Andreas Politz; +Cc: help-gnu-emacs
On Thu, Nov 19, 2009 at 08:55:27PM +0100, Andreas Politz wrote:
[...]
> (defun string-prefixp (string prefix &optional ignore-case)
> "Return t if PREFIX is a prefix of STRING."
> (eq t
> (compare-strings string 0 (length prefix)
> prefix 0 (length prefix)
> ignore-case)))
>
> I defined this function in one of my elisp files. Why don't you do just
> the same ? Where is the problem.
Thanks. I'll do something along these lines when I have more than one
use. The problem is mainly with my surprise. I had the feeling I was
doing something wrong.
Regards
-- tomás
[parent not found: <mailman.11057.1258644880.2239.help-gnu-emacs@gnu.org>]
* Re: `compare-strings' style question
[not found] ` <mailman.11057.1258644880.2239.help-gnu-emacs@gnu.org>
@ 2009-11-19 16:17 ` David Kastrup
2009-11-19 20:54 ` Barry Margolin
2009-11-20 6:53 ` tomas
0 siblings, 2 replies; 14+ messages in thread
From: David Kastrup @ 2009-11-19 16:17 UTC (permalink / raw)
To: help-gnu-emacs
tomas@tuxteam.de writes:
> On Thu, Nov 19, 2009 at 12:39:33PM +0100, David Kastrup wrote:
>> tomas@tuxteam.de writes:
>>
>> > Hi,
>> >
>> > In Elisp, I'm trying to test whether a string is a prefix of another.
>
> [...]
>
>> > (when (eq (compare-strings foo 0 5 bar 0 5) t)
>> > ...)
>> >
>> > which looks rather funny. My question: are there better idioms? Am I
>> > barking up the wrong function?
>>
>> (unless (numberp ...
>>
>> or
>>
>> (if (symbolp ...
>
> Thanks. Still looks a bit funna, though :-)
In my opinion, t was the wrong choice for a match. nil would have been
much better because you can't use the result of compare-strings as a
condition.
But I suppose there is not much one can do now because of compatibility.
--
David Kastrup
* Re: `compare-strings' style question
2009-11-19 16:17 ` David Kastrup
@ 2009-11-19 20:54 ` Barry Margolin
2009-11-20 7:00 ` tomas
2009-11-20 6:53 ` tomas
1 sibling, 1 reply; 14+ messages in thread
From: Barry Margolin @ 2009-11-19 20:54 UTC (permalink / raw)
To: help-gnu-emacs
In article <87einuij59.fsf@lola.goethe.zz>, David Kastrup <dak@gnu.org>
wrote:
> tomas@tuxteam.de writes:
>
> > On Thu, Nov 19, 2009 at 12:39:33PM +0100, David Kastrup wrote:
> >> tomas@tuxteam.de writes:
> >>
> >> > Hi,
> >> >
> >> > In Elisp, I'm trying to test whether a string is a prefix of another.
> >
> > [...]
> >
> >> > (when (eq (compare-strings foo 0 5 bar 0 5) t)
> >> > ...)
> >> >
> >> > which looks rather funny. My question: are there better idioms? Am I
> >> > barking up the wrong function?
> >>
> >> (unless (numberp ...
> >>
> >> or
> >>
> >> (if (symbolp ...
> >
> > Thanks. Still looks a bit funna, though :-)
>
> In my opinion, t was the wrong choice for a match. nil would have been
> much better because you can't use the result of compare-strings as a
> condition.
>
> But I suppose there is not much one can do now because of compatibility.
That would still be weird, because
(not (compare-strings ...))
would be the way to tell if they're equivalent. C has the same problem
with its strcmp() function, which returns negative, 0, or positive,
where 0 is C's falsehood.
The basic problem is that IF is designed to work with binary predicates,
and this operation is trinary.
Maybe compare-strings should have been defined like strcmp, returning 0
for the middle case. Then you wouldn't be tempted to think of it as a
predicate. (zerop (compare-strings ...)) doesn't seem as weird as (not
(compare-strings ...)).
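To illustrate the three-way nature of the result, here is a minimal sketch that maps it onto symbols. `my-string-order' is a hypothetical name, not an existing function.

```elisp
(defun my-string-order (a b)
  "Return `less', `equal' or `greater' according to how A compares to B."
  (let ((result (compare-strings a nil nil b nil nil)))
    (cond ((eq result t) 'equal)     ; t means the strings match
          ((< result 0) 'less)       ; negative: A sorts before B
          (t 'greater))))            ; positive: A sorts after B

(my-string-order "Carl" "Carm")     ; => less
(my-string-order "Carm" "Carm")     ; => equal
(my-string-order "Carmen" "Carm")   ; => greater
```

Dispatching on the result with `cond' sidesteps the temptation to treat it as a plain predicate.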
--
Barry Margolin, barmar@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
* Re: `compare-strings' style question
2009-11-19 20:54 ` Barry Margolin
@ 2009-11-20 7:00 ` tomas
2009-11-20 8:13 ` Andreas Politz
0 siblings, 1 reply; 14+ messages in thread
From: tomas @ 2009-11-20 7:00 UTC (permalink / raw)
To: help-gnu-emacs
On Thu, Nov 19, 2009 at 03:54:20PM -0500, Barry Margolin wrote:
> In article <87einuij59.fsf@lola.goethe.zz>, David Kastrup <dak@gnu.org>
> wrote:
[...]
> > In my opinion, t was the wrong choice for a match. nil would have been
> > much better because you can't use the result of compare-strings as a
> > condition.
> >
> > But I suppose there is not much one can do now because of compatibility.
>
> That would still be weird, because
>
> (not (compare-strings ...))
>
> would be the way to tell if they're equivalent. C has the same problem
> with its strcmp() function, which returns negative, 0, or positive,
> where 0 is C's falsehood.
Yes, that would be a similar problem as C, where zero's alter ego is
false. It still looks a bit funny to say
if(!strcmp(foo, bar)) ...
...but at least, it's just a problem of name choice (more appropriate
would have been something along the lines of strdiff).
> The basic problem is that IF is designed to work with binary predicates,
> and this operation is trinary.
>
> Maybe compare-strings should have been defined like strcmp, returning 0
> for the middle case. Then you wouldn't be tempted to think of it as a
> predicate. (zerop (compare-strings ...)) doesn't seem as weird as (not
> (compare-strings ...)).
Yes, I would have preferred this choice (but nil would have been fine
too).
Thanks
-- tomás
* Re: `compare-strings' style question
2009-11-20 7:00 ` tomas
@ 2009-11-20 8:13 ` Andreas Politz
0 siblings, 0 replies; 14+ messages in thread
From: Andreas Politz @ 2009-11-20 8:13 UTC (permalink / raw)
To: help-gnu-emacs
tomas@tuxteam.de writes:
>
> Yes, that would be a similar problem as C, where zero's alter ego is
> false. It still looks a bit funny to say
>
> if(!strcmp(foo, bar)) ...
>
> ...but at least, it's just a problem of name choice (more appropriate
> would have been something along the lines of strdiff).
>
I don't know. It just follows the standard for compare functions in C,
which is different from lisp predicates, as well as lisp compare
functions.
-ap
* Re: `compare-strings' style question
2009-11-19 16:17 ` David Kastrup
2009-11-19 20:54 ` Barry Margolin
@ 2009-11-20 6:53 ` tomas
1 sibling, 0 replies; 14+ messages in thread
From: tomas @ 2009-11-20 6:53 UTC (permalink / raw)
To: help-gnu-emacs
On Thu, Nov 19, 2009 at 05:17:22PM +0100, David Kastrup wrote:
> tomas@tuxteam.de writes:
[...]
> > Thanks. Still looks a bit funna, though :-)
^^^ (small keyboard syndrome :)
> In my opinion, t was the wrong choice for a match. nil would have been
> much better because you can't use the result of compare-strings as a
> condition.
Yes. Nil, or zero, better mimicking C's strcmp.
> But I suppose there is not much one can do now because of compatibility.
Right. Funny though: one of the things I did before whining on this list
was to grep through all the Emacs-provided .el files (23.1.50.1), to get
some idea about its usage and I got no hits.
But one never knows about other users "out there".
Thanks
-- tomás