* Re: bug in elisp... or in elisper???
From: David Kastrup @ 2011-03-22 23:50 UTC (permalink / raw)
To: help-gnu-emacs
ken <gebser@mousecar.com> writes:
> Fellow elispers,
>
> Something seems to be amiss in the search syntax here:
>
> (setq aname-re-str
> "<a\\([\s-\\|\n]+?\\)name=\"\\(.*?\\)\"\\([\s-\\|\n]*?\\)>\\(\\(.\\|\n\\)*?\\)</a\\(\\(
> \\|\t\\|\n\\)*?\\)>" )
>
> ;;Here's a function to use the above RE and return diagnostics:
>
> (defun test-aname-search ()
> (interactive)
> (re-search-forward aname-re-str)
> (message "1: \"%s\" 2: \"%s\" 3: \"%s\" 4: \"%s\" 5: \"%s\" 6: \"%s\"
> 7: \"%s\" 8: \"%s\""
> (match-string 1)
> (match-string 2)
> (match-string 3)
> (match-string 4)
> (match-string 5)
> (match-string 6)
> (match-string 7)
> (match-string 8)))
>
>
> The problem is that the 5th match-string should be either empty or
> whitespace.
Uh what?
\\(.\\|\n\\)*?
Matches _any_ character.
> But it consistently contains the last character of the 4th
> match-string.
That is because it _is_ the last matched character of the 4th
match-string.
> And these two matches are separated by the literal
> character string, "</a"!! What's up with this?
Your ability to count \\( strings? They are assigned match numbers from
left to right, regardless of whether they are nested or not.
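A minimal sketch of the numbering rule described above (not part of the original message; the function name `my-group-demo` and the sample input are made up for illustration). Groups are numbered by the position of their opening `\\(`, and a group that sits inside a repeated construct keeps only the text from its last iteration, which is why group 5 in the posted regexp holds the final character of group 4:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-group-demo (text)
  "Return groups 1 and 2 after matching a small nested regexp against TEXT.
Group 1 is the outer group; group 2 is the inner, repeated group."
  (when (string-match "<\\(\\(.\\|\n\\)*?\\)>" text)
    (list (match-string 1 text)     ; everything between < and >
          (match-string 2 text))))  ; only the last repetition of the inner group

(my-group-demo "<abc>")  ; ⇒ ("abc" "c")
```

Here the outer group opens first and is therefore number 1; the inner group opens second and is number 2, even though it is nested.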
--
David Kastrup
* Re: bug in elisp... or in elisper???
From: Tim X @ 2011-03-23 7:01 UTC (permalink / raw)
To: help-gnu-emacs
ken <gebser@mousecar.com> writes:
> Fellow elispers,
>
> Something seems to be amiss in the search syntax here:
>
> (setq aname-re-str
> "<a\\([\s-\\|\n]+?\\)name=\"\\(.*?\\)\"\\([\s-\\|\n]*?\\)>\\(\\(.\\|\n\\)*?\\)</a\\(\\(
> \\|\t\\|\n\\)*?\\)>" )
>
> ;;Here's a function to use the above RE and return diagnostics:
>
> (defun test-aname-search ()
> (interactive)
> (re-search-forward aname-re-str)
> (message "1: \"%s\" 2: \"%s\" 3: \"%s\" 4: \"%s\" 5: \"%s\" 6: \"%s\"
> 7: \"%s\" 8: \"%s\""
> (match-string 1)
> (match-string 2)
> (match-string 3)
> (match-string 4)
> (match-string 5)
> (match-string 6)
> (match-string 7)
> (match-string 8)))
>
>
> Here are some strings to search on:
>
> <h3><a name="thisname">Any Text--
> Hot Stuff</a></h3>
>
> <h1
> class="title"
> ><a
> name="heres-a-name"
> >
> the</a
> ></h1
> >
>
> <h3><a name="duplicate">Any Text--
> Hot Crud</a></h3>
>
>
> The problem is that the 5th match-string should be either empty or
> whitespace. But it consistently contains the last character of the
> 4th match-string. And these two matches are separated by the literal
> character string, "</a"!! What's up with this?
>
>
> Wishing I hadn't quit beer,
> ken
I don't think your re is matching what you think it is. I strongly recommend
you try using re-builder as this will give you a visual representation
of what your re is matching (with different colours representing the
various match groups).
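One possible way to follow this suggestion, sketched here as an assumption rather than as Tim's exact workflow: re-builder ships with Emacs and is started with `M-x re-builder` from the buffer that holds the text to search. The variable `reb-re-syntax` controls how the regexp is typed; the value `read` accepts the regexp as it would appear inside a Lisp string, doubled backslashes included, so a string from Lisp code can be pasted in as written.

```elisp
;; Configuration sketch; adjust to taste.
(require 're-builder)

;; Type the regexp exactly as it appears in Lisp source,
;; e.g. "\\(a\\(b\\)\\)" with the surrounding quotes.
(setq reb-re-syntax 'read)

;; Then switch to the buffer containing the HTML and run:
;;   M-x re-builder
```

Each numbered group is highlighted in its own face, which makes nested groups visible at a glance.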
Tim
--
tcross (at) rapttech dot com dot au
* Re: bug in elisp... or in elisper???
From: ken @ 2011-03-23 15:21 UTC (permalink / raw)
To: David Kastrup; +Cc: help-gnu-emacs
On 03/22/2011 07:50 PM David Kastrup wrote:
> ken <gebser@mousecar.com> writes:
>
>> Fellow elispers,
>>
>> Something seems to be amiss in the search syntax here:
>>
>> (setq aname-re-str
>> "<a\\([\s-\\|\n]+?\\)name=\"\\(.*?\\)\"\\([\s-\\|\n]*?\\)>\\(\\(.\\|\n\\)*?\\)</a\\(\\(
>> \\|\t\\|\n\\)*?\\)>" )
>>
>> ....
>
> Uh what?
>
> \\(.\\|\n\\)*?
>
> Matches _any_ character.
Yes. Why not? Users' texts can and do contain any sort of character,
multiple instances of them in fact... and, moreover, in any languages'
character sets they might want. They're allowed to do this.
Perhaps you're perplexed because you're not noting the RE immediately
following: "</a". IOW, elisp should keep reading chars until the first
instance of "</a". Seems to me to be a perfectly rational request. In
the small bit of testing I've done, it seems also to work just fine.
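The "read until the first `</a`" behaviour described above can be checked in isolation. The following is a simplified sketch, not the regexp from the original post: the function name `my-anchor-body` and the test string are invented, and the opening tag is matched loosely with `<a[^>]*>`. The inner alternation uses a shy group, `\\(?:...\\)`, which groups without taking a match number, so the only numbered group is the anchor text.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-anchor-body (html)
  "Return the text between the first <a ...> in HTML and the nearest `</a'.
The non-greedy `*?' stops at the first `</a' that lets the match succeed."
  (when (string-match "<a[^>]*>\\(\\(?:.\\|\n\\)*?\\)</a" html)
    (match-string 1 html)))

(my-anchor-body "<a name=\"x\">one</a> <a name=\"y\">two</a>")  ; ⇒ "one"
```

With a greedy `*` in place of `*?`, the same group would run on to the last `</a` in the string.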
>
>> But it consistently contains the last character of the 4th
>> match-string.
>
> That is because it _is_ the last matched character of the 4th
> match-string.
>
>> And these two matches are separated by the literal
>> character string, "</a"!! What's up with this?
>
> Your ability to count \\( strings? They are assigned match numbers from
> left to right, regardless of whether they are nested or not.
An inability to count would be the most derogatory interpretation. But
the function I wrote (here elided) actually did the counting for me, so
that would not be a cogent interpretation. A mere mortal, I wasn't born
knowing that REs could be nested (documentation I read in fact stated
they couldn't), of course then also not that in such cases both inner
and outer REs are counted separately by match-string. So once again,
the more charitable interpretation is the more perspicacious... and vice
versa.
--
One is not superior merely because one
sees the world as odious.
-- Chateaubriand (1768-1848)
* Re: bug in elisp... or in elisper???
From: David Kastrup @ 2011-03-23 15:38 UTC (permalink / raw)
To: gebser; +Cc: help-gnu-emacs
ken <gebser@mousecar.com> writes:
> An inability to count would be the most derogatory interpretation.
> But the function I wrote (here elided) actually did the counting for
> me, so that would not be a cogent interpretation. A mere mortal, I
> wasn't born knowing that REs could be nested (documentation I read in
> fact stated they couldn't),
Emacs comes with its own hyperlinked, up to date, maintained, indexed
fast documentation accessible via Help menu and keybindings.
There is no reason to promote random garbage found somewhere on the
internet to "documentation". In particular not concerning software that
has a history of 30 years, where consequently most documentation in
existence that might at one point even have been accurate is no longer
so due to being prehistoric.
Still I have my doubts that the documentation you are alluding to even
was ever part of Emacs.
> of course then also not that in such cases both inner and outer REs
> are counted separately by match-string. So once again, the more
> charitable interpretation is the more perspicacious... and vice versa.
Care to provide a pointer to the "documentation" you are referring to?
While I have my doubts it will lead to a much more charitable
interpretation, I certainly am willing to let myself be surprised.
--
David Kastrup
* Re: bug in elisp... or in elisper???
From: ken @ 2011-03-23 15:56 UTC (permalink / raw)
To: tcross; +Cc: help-gnu-emacs
Anything is easy if you know how to do it.
On 03/23/2011 03:01 AM Tim X wrote:
> ken <gebser@mousecar.com> writes:
>
>> Fellow elispers,
>>
>> Something seems to be amiss in the search syntax here:
>>
>> (setq aname-re-str
>> "<a\\([\s-\\|\n]+?\\)name=\"\\(.*?\\)\"\\([\s-\\|\n]*?\\)>\\(\\(.\\|\n\\)*?\\)</a\\(\\(
>> \\|\t\\|\n\\)*?\\)>" )
>>
>> ;;Here's a function to use the above RE and return diagnostics:
>>
>> (defun test-aname-search ()
>> (interactive)
>> (re-search-forward aname-re-str)
>> (message "1: \"%s\" 2: \"%s\" 3: \"%s\" 4: \"%s\" 5: \"%s\" 6: \"%s\"
>> 7: \"%s\" 8: \"%s\""
>> (match-string 1)
>> (match-string 2)
>> (match-string 3)
>> (match-string 4)
>> (match-string 5)
>> (match-string 6)
>> (match-string 7)
>> (match-string 8)))
>>
>>
>> Here are some strings to search on:
>>
>> <h3><a name="thisname">Any Text--
>> Hot Stuff</a></h3>
>>
>> <h1
>> class="title"
>> ><a
>> name="heres-a-name"
>> >
>> the</a
>> ></h1
>> >
>> <h3><a name="duplicate">Any Text--
>> Hot Crud</a></h3>
>>
>>
>> The problem is that the 5th match-string should be either empty or
>> whitespace. But it consistently contains the last character of the
>> 4th match-string. And these two matches are separated by the literal
>> character string, "</a"!! What's up with this?
>>
>>
>> Wishing I hadn't quit beer,
>> ken
>
> I don't think your re is matching what you think it is. I strongly recommend
> you try using re-builder as this will give you a visual representation
> of what your re is matching (with different colours representing the
> various match groups).
>
> Tim
Well, I was missing a crucial bit of knowledge about REs (explained in
two previous posts here) and that was causing me to misinterpret
results. PJ's reply pointed me in the direction I needed to go to
figure out what the problem was. And I think it was a mistake for me to
post such a complex example, but I couldn't think of how else to do it.
I read mention of re-builder, but must admit I haven't tried it yet.
With your recommendation, I'm sure I'll be giving it a try on some
future RE puzzle. The mere fact that this tool exists is comforting...
tells me that I'm not the only one who's occasionally perplexed by REs.
Thanks for the suggestion.
* Irrelevant digression [was: Re: bug in elisp... or in elisper???]
From: ken @ 2011-03-23 16:40 UTC (permalink / raw)
To: David Kastrup; +Cc: help-gnu-emacs
On 03/23/2011 11:38 AM David Kastrup wrote:
> ken <gebser@mousecar.com> writes:
>
>> An inability to count would be the most derogatory interpretation.
>> But the function I wrote (here elided) actually did the counting for
>> me, so that would not be a cogent interpretation. A mere mortal, I
>> wasn't born knowing that REs could be nested (documentation I read in
>> fact stated they couldn't),
>
> Emacs comes with its own hyperlinked, up to date, maintained, indexed
> fast documentation accessible via Help menu and keybindings.
Thanks, David, but I knew that already. Though I've read quite a bit of
it, admittedly, I didn't read the entirety of the emacs and elisp
documentation. I'm sure you're not suggesting that as requisite
preparation for writing a few elisp functions as that would preclude
most all of us from ever attempting it.
>
> There is no reason to promote random garbage found somewhere on the
> internet to "documentation". In particular not concerning software that
> has a history of 30 years, where consequently most documentation in
> existence that might at one point even have been accurate is no longer
> so due to being prehistoric.
And I certainly didn't "promote" it. The web is what it is. Haven't
you ever googled for something?
>
> Still I have my doubts that the documentation you are alluding to even
> was ever part of Emacs.
Someone had a webpage with information on it, much, perhaps most, of it
good information. I never said that webpage was "part of Emacs". It's
the web. Somebody made a mistake. Humans do that occasionally.
>
>> of course then also not that in such cases both inner and outer REs
>> are counted separately by match-string. So once again, the more
>> charitable interpretation is the more perspicacious... and vice versa.
>
> Care to provide a pointer to the "documentation" you are referring to?
> While I have my doubts it will lead to a much more charitable
> interpretation, I certainly am willing to let myself be surprised.
I read dozens of pages and see no gain or merit in reading back through
all of them to verify what I read... unless I had some neurotic desire
to win points in an irrelevant and fruitless discussion-- which I don't.
Nor would I want to obligate anyone to be charitable. That doesn't
work. Either they got it or they don't.
* Re: Irrelevant digression [was: Re: bug in elisp... or in elisper???]
From: Le Wang @ 2011-03-23 16:52 UTC (permalink / raw)
To: gebser; +Cc: help-gnu-emacs, David Kastrup
On Thu, Mar 24, 2011 at 12:40 AM, ken <gebser@mousecar.com> wrote:
> I read dozens of pages and see no gain or merit in reading back through
> all of them to verify what I read
You don't see any merit to removing misinformation so that others who
do the same search as you aren't led down the wrong path? Isn't that
why we are here, to help each other better use Emacs?
Provide the link, so we can go about trying to get the author to
change his documentation. At a minimum, when people search for that
link in particular, they'll see this thread on gmane.
--
Le
* Re: Irrelevant digression [was: Re: bug in elisp... or in elisper???]
From: ken @ 2011-03-23 17:46 UTC (permalink / raw)
To: Le Wang; +Cc: help-gnu-emacs, David Kastrup
On 03/23/2011 12:52 PM Le Wang wrote:
> On Thu, Mar 24, 2011 at 12:40 AM, ken <gebser@mousecar.com> wrote:
>> I read dozens of pages and see no gain or merit in reading back through
>> all of them to verify what I read
>
> You don't see any merit to removing misinformation so that others who
> do the same search as you aren't led down the wrong path? Isn't that
> why we are here, to help each other better use Emacs?
Le, you bring up a good point. However, it's not often possible to contact
the author of a webpage and if you do somehow do that, who knows if that
author reads email and if so would bother to make the change. More
importantly, as said, I don't have the time or the inclination to search
out that page. I understand the web has bad information and accept that
fact. If I wanted to fix inaccuracies on the web, there are many more
of vastly greater import. If I do happen to run across the page again,
however, I'll post it back here and then anyone who wants to can have at
the author. Also, you could search for yourself; I googled for things
like: emacs, elisp, regular expression(s), and other things, all of
which I can't recall now.
In addition, these list discussions are archived, right? So people
will-- or should-- find them and hopefully won't succumb to the same
inaccuracy as I did.
If you feel the situation demands more than that, then you have your
website with quite a bit of information on elisp. (I've read quite a
bit there. It's a pretty good site... and gets respectable search
rankings.) Post the information there. Heck, the article is virtually
written for you already.