* Re: bug in elisp... or in elisper???
From: David Kastrup @ 2011-03-22 23:50 UTC
To: help-gnu-emacs

ken <gebser@mousecar.com> writes:

> Fellow elispers,
>
> Something seems to be amiss in the search syntax here:
>
> (setq aname-re-str
> "<a\\([\s-\\|\n]+?\\)name=\"\\(.*?\\)\"\\([\s-\\|\n]*?\\)>\\(\\(.\\|\n\\)*?\\)</a\\(\\(
> \\|\t\\|\n\\)*?\\)>" )
>
> ;;Here's a function to use the above RE and return diagnostics:
>
> (defun test-aname-search ()
>   (interactive)
>   (re-search-forward aname-re-str)
>   (message "1: \"%s\" 2: \"%s\" 3: \"%s\" 4: \"%s\" 5: \"%s\" 6: \"%s\"
> 7: \"%s\" 8: \"%s\""
>            (match-string 1)
>            (match-string 2)
>            (match-string 3)
>            (match-string 4)
>            (match-string 5)
>            (match-string 6)
>            (match-string 7)
>            (match-string 8)))
>
>
> The problem is that the 5th match-string should be either empty or
> whitespace.

Uh what?

\\(.\\|\n\\)*?

Matches _any_ character.

> But it consistently contains the last character of the 4th
> match-string.

That is because it _is_ the last matched character of the 4th
match-string.

> And these two matches are separated by the literal
> character string, "</a"!! What's up with this?

Your ability to count \\( strings? They are assigned match numbers from
left to right, regardless of whether they are nested or not.

--
David Kastrup
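The numbering rule David describes can be checked with a short sketch. The helper name `demo-nested-groups` and the shortened regexp are illustrative assumptions, not code from the thread; the regexp keeps only the nested part of ken's pattern.

```elisp
;; Hypothetical helper, not from the thread.  Groups are numbered by the
;; position of their opening \\( from left to right, so the outer group
;; is 1 and the group nested inside it is 2.
(defun demo-nested-groups (text)
  "Match TEXT against a regexp with one group nested inside another.
Return a list of the strings captured by groups 1 and 2, or nil."
  (when (string-match ">\\(\\(.\\|\n\\)*?\\)</a" text)
    (list (match-string 1 text)     ; outer group: everything before </a
          (match-string 2 text))))  ; inner group: its last repetition only

(demo-nested-groups "<a name=\"x\">Hot Stuff</a>")
;; => ("Hot Stuff" "f")
```

Group 2 holds only the character consumed by the final pass of the repeated inner group, which is the behaviour ken saw in his 5th match-string.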
* Re: bug in elisp... or in elisper???
From: ken @ 2011-03-23 15:21 UTC
To: David Kastrup; +Cc: help-gnu-emacs

On 03/22/2011 07:50 PM David Kastrup wrote:
> ken <gebser@mousecar.com> writes:
>
>> Fellow elispers,
>>
>> Something seems to be amiss in the search syntax here:
>>
>> (setq aname-re-str
>> "<a\\([\s-\\|\n]+?\\)name=\"\\(.*?\\)\"\\([\s-\\|\n]*?\\)>\\(\\(.\\|\n\\)*?\\)</a\\(\\(
>> \\|\t\\|\n\\)*?\\)>" )
>>
>> ....
>
> Uh what?
>
> \\(.\\|\n\\)*?
>
> Matches _any_ character.

Yes. Why not? Users' texts can and do contain any sort of character,
multiple instances of them in fact... and, moreover, in any languages'
character sets they might want. They're allowed to do this.

Perhaps you're perplexed because you're not noting the RE immediately
following: "</a". IOW, elisp should keep reading chars until the first
instance of "</a". Seems to me to be a perfectly rational request. In
the small bit of testing I've done, it seems also to work just fine.

>
>> But it consistently contains the last character of the 4th
>> match-string.
>
> That is because it _is_ the last matched character of the 4th
> match-string.
>
>> And these two matches are separated by the literal
>> character string, "</a"!! What's up with this?
>
> Your ability to count \\( strings? They are assigned match numbers from
> left to right, regardless of whether they are nested or not.

An inability to count would be the most derogatory interpretation. But
the function I wrote (here elided) actually did the counting for me, so
that would not be a cogent interpretation. A mere mortal, I wasn't born
knowing that REs could be nested (documentation I read in fact stated
they couldn't), of course then also not that in such cases both inner
and outer REs are counted separately by match-string. So once again,
the more charitable interpretation is the more perspicacious... and vice
versa.

--
One is not superior merely because one sees the world as odious.
     -- Chateaubriand (1768-1848)
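Since nested groups are counted separately, one way to keep the numbering flat is a non-capturing ("shy") group, written \\(?: ... \\) in Emacs regexps. The sketch below is a hypothetical rewrite, not the pattern from the thread: the name `demo-shy-groups` is made up, and it swaps `[\s-\\|\n]` for an explicit `[ \t\n]` on the assumption that whitespace was what the class was meant to match.

```elisp
;; Hypothetical rewrite, not from the thread.  The repeated alternation
;; is wrapped in a shy group, so it takes no group number.
(defun demo-shy-groups (text)
  "Return (NAME BODY TRAILING) for the first <a name=...> element in TEXT.
Return nil if TEXT contains no such element."
  (when (string-match
         (concat "<a[ \t\n]+name=\"\\([^\"]*\\)\"[ \t\n]*>" ; 1: name value
                 "\\(\\(?:.\\|\n\\)*?\\)"                    ; 2: link text
                 "</a\\([ \t\n]*\\)>")                       ; 3: whitespace
         text)
    (list (match-string 1 text)
          (match-string 2 text)
          (match-string 3 text))))

(demo-shy-groups "<h3><a name=\"thisname\">Any Text--\nHot Stuff</a></h3>")
;; => ("thisname" "Any Text--\nHot Stuff" "")
```

With the shy group in place, the group after "</a" is number 3 and contains only the whitespace before the closing ">".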
* Re: bug in elisp... or in elisper???
From: David Kastrup @ 2011-03-23 15:38 UTC
To: gebser; +Cc: help-gnu-emacs

ken <gebser@mousecar.com> writes:

> An inability to count would be the most derogatory interpretation.
> But the function I wrote (here elided) actually did the counting for
> me, so that would not be a cogent interpretation. A mere mortal, I
> wasn't born knowing that REs could be nested (documentation I read in
> fact stated they couldn't),

Emacs comes with its own hyperlinked, up to date, maintained, indexed
fast documentation accessible via Help menu and keybindings.

There is no reason to promote random garbage found somewhere on the
internet to "documentation". In particular not concerning software that
has a history of 30 years, where consequently most documentation in
existence that might at one point even have been accurate is no longer
so due to being prehistoric.

Still I have my doubts that the documentation you are alluding to even
was ever part of Emacs.

> of course then also not that in such cases both inner and outer REs
> are counted separately by match-string. So once again, the more
> charitable interpretation is the more perspicacious... and vice versa.

Care to provide a pointer to the "documentation" you are referring to?
While I have my doubts it will lead to a much more charitable
interpretation, I certainly am willing to let myself be surprised.

--
David Kastrup
* Irrelevant digression [was: Re: bug in elisp... or in elisper???]
From: ken @ 2011-03-23 16:40 UTC
To: David Kastrup; +Cc: help-gnu-emacs

On 03/23/2011 11:38 AM David Kastrup wrote:
> ken <gebser@mousecar.com> writes:
>
>> An inability to count would be the most derogatory interpretation.
>> But the function I wrote (here elided) actually did the counting for
>> me, so that would not be a cogent interpretation. A mere mortal, I
>> wasn't born knowing that REs could be nested (documentation I read in
>> fact stated they couldn't),
>
> Emacs comes with its own hyperlinked, up to date, maintained, indexed
> fast documentation accessible via Help menu and keybindings.

Thanks, David, but I knew that already. Though I've read quite a bit of
it, admittedly, I didn't read the entirety of the emacs and elisp
documentation. I'm sure you're not suggesting that as requisite
preparation for writing a few elisp functions as that would preclude
most all of us from ever attempting it.

>
> There is no reason to promote random garbage found somewhere on the
> internet to "documentation". In particular not concerning software that
> has a history of 30 years, where consequently most documentation in
> existence that might at one point even have been accurate is no longer
> so due to being prehistoric.

And I certainly didn't "promote" it. The web is what it is. Haven't
you ever googled for something?

>
> Still I have my doubts that the documentation you are alluding to even
> was ever part of Emacs.

Someone had a webpage with information on it, much, perhaps most, of it
good information. I never said that webpage was "part of Emacs". It's
the web. Somebody made a mistake. Humans do that occasionally.

>
>> of course then also not that in such cases both inner and outer REs
>> are counted separately by match-string. So once again, the more
>> charitable interpretation is the more perspicacious... and vice versa.
>
> Care to provide a pointer to the "documentation" you are referring to?
> While I have my doubts it will lead to a much more charitable
> interpretation, I certainly am willing to let myself be surprised.

I read dozens of pages and see no gain or merit in reading back through
all of them to verify what I read... unless I had some neurotic desire
to win points in an irrelevant and fruitless discussion-- which I don't.

Nor would I want to obligate anyone to be charitable. That doesn't
work. Either they got it or they don't.
* Re: Irrelevant digression [was: Re: bug in elisp... or in elisper???]
From: Le Wang @ 2011-03-23 16:52 UTC
To: gebser; +Cc: help-gnu-emacs, David Kastrup

On Thu, Mar 24, 2011 at 12:40 AM, ken <gebser@mousecar.com> wrote:
> I read dozens of pages and see no gain or merit in reading back through
> all of them to verify what I read

You don't see any merit to removing misinformation so that others who
do the same search as you aren't led down the wrong path? Isn't that
why we are here, to help each other better use Emacs?

Provide the link, so we can go about trying to get the author to change
his documentation. At a minimum, when people search for that link in
particular, they'll see this thread on gmane.

--
Le
* Re: Irrelevant digression [was: Re: bug in elisp... or in elisper???]
From: ken @ 2011-03-23 17:46 UTC
To: Le Wang; +Cc: help-gnu-emacs, David Kastrup

On 03/23/2011 12:52 PM Le Wang wrote:
> On Thu, Mar 24, 2011 at 12:40 AM, ken <gebser@mousecar.com> wrote:
>> I read dozens of pages and see no gain or merit in reading back through
>> all of them to verify what I read
>
> You don't see any merit to removing misinformation so that others who
> do the same search as you aren't led down the wrong path? Isn't that
> why we are here, to help each other better use Emacs?

Le, you bring up a good point. However, it's not often possible to
contact the author of a webpage, and if you do somehow manage that, who
knows if that author reads email and, if so, would bother to make the
change. More importantly, as said, I don't have the time or the
inclination to search out that page. I understand the web has bad
information and accept that fact. If I wanted to fix inaccuracies on
the web, there are many more of vastly greater import. If I do happen
to run across the page again, however, I'll post it back here and then
anyone who wants to can have at the author. Also, you could search for
yourself; I googled for things like: emacs, elisp, regular
expression(s), and other things, all of which I can't recall now.

In addition, these list discussions are archived, right? So people
will-- or should-- find them and hopefully won't succumb to the same
inaccuracy as I did.

If you feel the situation demands more than that, then you have your
website with quite a bit of information on elisp. (I've read quite a
bit there. It's a pretty good site... and gets respectable search
rankings.) Post the information there. Heck, the article is virtually
written for you already.
* Re: bug in elisp... or in elisper???
From: Tim X @ 2011-03-23 7:01 UTC
To: help-gnu-emacs

ken <gebser@mousecar.com> writes:

> Fellow elispers,
>
> Something seems to be amiss in the search syntax here:
>
> (setq aname-re-str
> "<a\\([\s-\\|\n]+?\\)name=\"\\(.*?\\)\"\\([\s-\\|\n]*?\\)>\\(\\(.\\|\n\\)*?\\)</a\\(\\(
> \\|\t\\|\n\\)*?\\)>" )
>
> ;;Here's a function to use the above RE and return diagnostics:
>
> (defun test-aname-search ()
>   (interactive)
>   (re-search-forward aname-re-str)
>   (message "1: \"%s\" 2: \"%s\" 3: \"%s\" 4: \"%s\" 5: \"%s\" 6: \"%s\"
> 7: \"%s\" 8: \"%s\""
>            (match-string 1)
>            (match-string 2)
>            (match-string 3)
>            (match-string 4)
>            (match-string 5)
>            (match-string 6)
>            (match-string 7)
>            (match-string 8)))
>
>
> Here are some strings to search on:
>
> <h3><a name="thisname">Any Text--
> Hot Stuff</a></h3>
>
> <h1
> class="title"
>><a
> name="heres-a-name"
>>
> the</a
>></h1
>>
>
> <h3><a name="duplicate">Any Text--
> Hot Crud</a></h3>
>
>
> The problem is that the 5th match-string should be either empty or
> whitespace. But it consistently contains the last character of the
> 4th match-string. And these two matches are separated by the literal
> character string, "</a"!! What's up with this?
>
>
> Wishing I hadn't quit beer,
> ken

I don't think your re is matching what you think it is. Strongly
recommend you try using re-builder, as this will give you a visual
representation of what your re is matching (with different colours
representing the various match groups).

Tim

--
tcross (at) rapttech dot com dot au
* Re: bug in elisp... or in elisper???
From: ken @ 2011-03-23 15:56 UTC
To: tcross; +Cc: help-gnu-emacs

Anything is easy if you know how to do it.

On 03/23/2011 03:01 AM Tim X wrote:
> ken <gebser@mousecar.com> writes:
>
>> Fellow elispers,
>>
>> Something seems to be amiss in the search syntax here:
>>
>> (setq aname-re-str
>> "<a\\([\s-\\|\n]+?\\)name=\"\\(.*?\\)\"\\([\s-\\|\n]*?\\)>\\(\\(.\\|\n\\)*?\\)</a\\(\\(
>> \\|\t\\|\n\\)*?\\)>" )
>>
>> ;;Here's a function to use the above RE and return diagnostics:
>>
>> (defun test-aname-search ()
>>   (interactive)
>>   (re-search-forward aname-re-str)
>>   (message "1: \"%s\" 2: \"%s\" 3: \"%s\" 4: \"%s\" 5: \"%s\" 6: \"%s\"
>> 7: \"%s\" 8: \"%s\""
>>            (match-string 1)
>>            (match-string 2)
>>            (match-string 3)
>>            (match-string 4)
>>            (match-string 5)
>>            (match-string 6)
>>            (match-string 7)
>>            (match-string 8)))
>>
>>
>> Here are some strings to search on:
>>
>> <h3><a name="thisname">Any Text--
>> Hot Stuff</a></h3>
>>
>> <h1
>> class="title"
>>> <a
>> name="heres-a-name"
>> the</a
>>> </h1
>>>
>> <h3><a name="duplicate">Any Text--
>> Hot Crud</a></h3>
>>
>>
>> The problem is that the 5th match-string should be either empty or
>> whitespace. But it consistently contains the last character of the
>> 4th match-string. And these two matches are separated by the literal
>> character string, "</a"!! What's up with this?
>>
>>
>> Wishing I hadn't quit beer,
>> ken
>
> I don't think your re is matching what you think it is. Strong recommend
> you try using re-builder as this will give you a visual representation
> of what your re is matching (with different colours representing the
> various match groups).
>
> Tim

Well, I was missing a crucial bit of knowledge about REs (explained in
two previous posts here) and that was causing me to misinterpret
results. PJ's reply pointed me in the direction I needed to go to
figure out what the problem was. And I think it was a mistake for me to
post such a complex example, but I couldn't think of how else to do it.

I read mention of re-builder, but must admit I haven't tried it yet.
With your recommendation, I'm sure I'll be giving it a try on some
future RE puzzle. The mere fact that this tool exists is comforting...
tells me that I'm not the only one who's occasionally perplexed by REs.

Thanks for the suggestion.
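As a non-interactive complement to re-builder, the diagnostics function quoted above could list every group without hard-coding the numbers 1 to 8. This is only a sketch under assumptions: the name `demo-all-groups` is hypothetical, and it works on a string with `string-match` rather than searching a buffer as the original function does.

```elisp
;; Hypothetical helper, not from the thread.  The length of the list
;; returned by `match-data' gives the number of groups recorded for the
;; last match, so no group count has to be written by hand.
(defun demo-all-groups (regexp text)
  "Match REGEXP against TEXT and return an alist of (GROUP . STRING).
Group 0 is the whole match.  Trailing groups that took no part in the
match are not listed, because `match-data' omits them."
  (when (string-match regexp text)
    (let ((ngroups (/ (length (match-data)) 2))
          (result '()))
      (dotimes (i ngroups)
        (push (cons i (match-string i text)) result))
      (nreverse result))))

(demo-all-groups "\\(a\\)\\(\\(b\\)c\\)" "abc")
;; => ((0 . "abc") (1 . "a") (2 . "bc") (3 . "b"))
```

The output makes the left-to-right numbering visible: group 2 is the outer pair and group 3 is the pair nested inside it.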
Thread overview: 8+ messages
     [not found] <mailman.11.1300837050.13753.help-gnu-emacs@gnu.org>
2011-03-22 23:50 ` bug in elisp... or in elisper??? David Kastrup
2011-03-23 15:21   ` ken
2011-03-23 15:38     ` David Kastrup
2011-03-23 16:40       ` Irrelevant digression [was: Re: bug in elisp... or in elisper???] ken
2011-03-23 16:52         ` Le Wang
2011-03-23 17:46           ` ken
2011-03-23  7:01 ` bug in elisp... or in elisper??? Tim X
2011-03-23 15:56   ` ken