* Wildcard matching in debbugs-gnu-search - how?
@ 2019-09-24 10:38 Michael Heerdegen
  2019-09-24 11:32 ` Michael Albinus
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Heerdegen @ 2019-09-24 10:38 UTC (permalink / raw)
  To: Emacs mailing list; +Cc: Michael Albinus

Hi,

I've read the debbugs-ug manual now (ok, partly, and not far enough to
find out what "-ug" stands for).

My question: With the newest version of debbugs-gnu, why doesn't, for
example,

(debbugs-gnu-search "[RX] ^el-search-.*-sources$" nil nil nil nil)

give me matches but

(debbugs-gnu-search "el-search-emacs-elisp-sources" nil nil nil nil)

finds one - what's wrong with my given RX syntax?  I also fail trying to
use [BW] and [EW].  BTW, are these operators also allowed when
specifying the subject, or only for the body?

TIA,

Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-09-24 10:38 Wildcard matching in debbugs-gnu-search - how? Michael Heerdegen
@ 2019-09-24 11:32 ` Michael Albinus
  2019-09-24 11:51   ` Michael Heerdegen
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Albinus @ 2019-09-24 11:32 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Emacs mailing list

Michael Heerdegen <michael_heerdegen@web.de> writes:

> Hi,

Hi Michael,

> I've read the debbugs-ug manual now (ok, partly, and not far enough to
> find out what "-ug" stands for).

"User Guide" :-)

It is the very first line of the manual.

The debbugs manual itself is intended for the SOAP backend, in case somebody
wants to write an interface other than debbugs-gnu.el and debbugs-org.el.

> My question: With the newest version of debbugs-gnu, why doesn't, for
> example,
>
> (debbugs-gnu-search "[RX] ^el-search-.*-sources$" nil nil nil nil)
>
> give me matches but
>
> (debbugs-gnu-search "el-search-emacs-elisp-sources" nil nil nil nil)
>
> finds one - what's wrong with my given RX syntax?  I also fail trying to
> use [BW] and [EW].  BTW, are these operators also allowed when
> specifying the subject, or only for the body?

Well, personally I haven't used regular expressions yet. They don't work
for me either.

For simple tests, you might use <https://debbugs.gnu.org/cgi/search.cgi>.
debbugs-gnu-search should behave similarly.

In your case, knowing that the Hyper Estraier search engine is not really
full-text based but word based, I would try

(debbugs-gnu-search "el AND search AND sources" nil nil nil nil)

This gives you 120 hits. Some of them are likely false positives, but it
should be manageable.
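
A minimal sketch of building such a phrase programmatically. The helper
name `my-debbugs-and-phrase' is hypothetical and not part of debbugs; it
only joins words with an explicit "AND", on the assumption (reported in
this thread) that the operator cannot be left out.

```elisp
;; Hypothetical helper, not part of the debbugs package.
(defun my-debbugs-and-phrase (words)
  "Return a search phrase joining the strings in WORDS with \"AND\"."
  (mapconcat #'identity words " AND "))

(my-debbugs-and-phrase '("el" "search" "sources"))
;; => "el AND search AND sources"

;; With debbugs loaded, the phrase could be passed on using the
;; five-argument call form shown in this thread:
;; (debbugs-gnu-search (my-debbugs-and-phrase '("el" "search" "sources"))
;;                     nil nil nil nil)
```

The actual search call is left commented out because it needs the
debbugs package and network access.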

Btw, maybe I shall make all debbugs-gnu-search arguments optional ...

> TIA,
>
> Michael.

Best regards, Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-09-24 11:32 ` Michael Albinus
@ 2019-09-24 11:51   ` Michael Heerdegen
  2019-09-24 12:00     ` Michael Albinus
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Heerdegen @ 2019-09-24 11:51 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Emacs mailing list

Michael Albinus <michael.albinus@gmx.de> writes:

> "User Guide" :-)
>
> It is the very first line of the manual.

Oh - as I said, I did not read everything.

> (debbugs-gnu-search "el AND search AND sources" nil nil nil nil)
>
> This gives you 120 hits. Some of them are likely false positives, but
> it should be manageable.

Hmm, according to the manual it should be possible to skip the "AND"
operator but this doesn't seem to be the case...but why?

> Btw, maybe I shall make all debbugs-gnu-search arguments optional ...

Yes, good idea!


Thanks,

Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-09-24 11:51   ` Michael Heerdegen
@ 2019-09-24 12:00     ` Michael Albinus
  2019-09-24 12:29       ` Michael Heerdegen
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Albinus @ 2019-09-24 12:00 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Emacs mailing list

Michael Heerdegen <michael_heerdegen@web.de> writes:

Hi Michael,

>> (debbugs-gnu-search "el AND search AND sources" nil nil nil nil)
>>
>> This gives you 120 hits. Some of them are likely false positives, but
>> it should be manageable.
>
> Hmm, according to the manual it should be possible to skip the "AND"
> operator but this doesn't seem to be the case...but why?

The hyperestraier user guide says

--8<---------------cut here---------------start------------->8---
In case of simplified form, specify the following.

internet security
--8<---------------cut here---------------end--------------->8---

And the debbugs-ug manual says

--8<---------------cut here---------------start------------->8---
Simplified forms, as described in the Hyperestraier User Guide, are not
supported.
--8<---------------cut here---------------end--------------->8---

>> Btw, maybe I shall make all debbugs-gnu-search arguments optional ...
>
> Yes, good idea!

Will do.

> Thanks,
>
> Michael.

Best regards, Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-09-24 12:00     ` Michael Albinus
@ 2019-09-24 12:29       ` Michael Heerdegen
  2019-09-24 13:01         ` Michael Albinus
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Heerdegen @ 2019-09-24 12:29 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Emacs mailing list

Michael Albinus <michael.albinus@gmx.de> writes:

> And the debbugs-ug manual says
>
> Simplified forms, as described in the Hyperestraier User Guide, are not
> supported.

Hmm - OTOH the `debbugs-gnu-phrase-prompt' help echo says "If there is
no operator between the words, AND is used by default.".  Should that be
changed?

Regards,

Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-09-24 12:29       ` Michael Heerdegen
@ 2019-09-24 13:01         ` Michael Albinus
  2019-10-04 16:47           ` Michael Heerdegen
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Albinus @ 2019-09-24 13:01 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Emacs mailing list

Michael Heerdegen <michael_heerdegen@web.de> writes:

>> And the debbugs-ug manual says
>>
>> Simplified forms, as described in the Hyperestraier User Guide, are not
>> supported.
>
> Hmm - OTOH the `debbugs-gnu-phrase-prompt' help echo says "If there is
> no operator between the words, AND is used by default.".  Should that be
> changed?

I guess so, yes. Feel free to fix such errors.

> Regards,
>
> Michael.

Best regards, Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-09-24 13:01         ` Michael Albinus
@ 2019-10-04 16:47           ` Michael Heerdegen
  2019-10-05 10:27             ` Michael Albinus
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Heerdegen @ 2019-10-04 16:47 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Emacs mailing list

Michael Albinus <michael.albinus@gmx.de> writes:

> > Hmm - OTOH the `debbugs-gnu-phrase-prompt' help echo says "If there is
> > no operator between the words, AND is used by default.".  Should that be
> > changed?
>
> I guess so, yes. Feel free to fix such errors.

Ok, I fixed this one :)


Regards,

Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-10-04 16:47           ` Michael Heerdegen
@ 2019-10-05 10:27             ` Michael Albinus
  2019-10-05 11:50               ` Michael Heerdegen
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Albinus @ 2019-10-05 10:27 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Emacs mailing list

Michael Heerdegen <michael_heerdegen@web.de> writes:

Hi Michael,

>> > Hmm - OTOH the `debbugs-gnu-phrase-prompt' help echo says "If there is
>> > no operator between the words, AND is used by default.".  Should that be
>> > changed?
>>
>> I guess so, yes. Feel free to fix such errors.
>
> Ok, I fixed this one :)

Thanks! I've also tried to clarify this in the debbugs user guide. The text
now reads

--8<---------------cut here---------------start------------->8---
 -- Command: debbugs-gnu-search
 -- Command: debbugs-org-search

     These both commands are completely interactive.  They ask for a
     '"search phrase"' for the text search.  It is just a string which
     contains the words to be searched for followed by each other.
     There are also operators like "AND", "ANDNOT" and "OR", which
     allow to search for words at different positions in the text.
     Only complete words, contained in a message body, are searched
     for.

     Wildcard searches are also supported.  It can be used for forward
     match search and backward match search of words.  For example,
     "[BW] euro" matches words which begin with "euro".  "[EW] sphere"
     matches words which end with "sphere".  Moreover, regular
     expressions are also supported.  For example, "[RX] ^inter.*al$"
     matches words which begin with "inter" and end with "al".(2)
     Several wildcards could be separated by the operators.  If there
     is no operator between the wildcards, "AND" is used by default.

     While the words to be searched for are case insensitive, the
     operators must be specified case sensitive.
--8<---------------cut here---------------end--------------->8---
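
To make the three wildcard forms from the manual text concrete, here is a
purely local emulation. It is only a sketch: the function name
`my-wildcard-match-p' is hypothetical, the real matching happens in Hyper
Estraier on the server, and Emacs regexp syntax is assumed here although
the server's regexp dialect may differ.

```elisp
;; Local emulation only; it does not talk to debbugs or Hyper Estraier.
(defun my-wildcard-match-p (operator pattern word)
  "Return t if WORD matches PATTERN under OPERATOR, else nil.
OPERATOR is one of the strings \"[BW]\", \"[EW]\" or \"[RX]\".
Matching ignores case, as the manual says words are case insensitive."
  (let ((case-fold-search t))
    (pcase operator
      ("[BW]" (string-prefix-p pattern word t))   ; word begins with PATTERN
      ("[EW]" (string-suffix-p pattern word t))   ; word ends with PATTERN
      ("[RX]" (and (string-match-p pattern word) t))
      (_ (error "Unknown operator: %s" operator)))))

(my-wildcard-match-p "[BW]" "euro" "European")               ; => t
(my-wildcard-match-p "[EW]" "sphere" "atmosphere")           ; => t
(my-wildcard-match-p "[RX]" "^inter.*al$" "international")   ; => t
(my-wildcard-match-p "[RX]" "^inter.*al$" "internet")        ; => nil
```

Each call tests one single word, which mirrors the manual's statement
that the wildcards apply to words rather than to arbitrary text.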

Unfortunately, the hyperestraier documentation does not describe the syntax
of a word anywhere. Since it is written in Ruby, I guess it uses the syntax of a
Ruby identifier, see <https://ruby-doc.org/docs/ruby-doc-bundle/Manual/man-1.4/syntax.html#ident>:

--8<---------------cut here---------------start------------->8---
Ruby identifiers are consist of alphabets, decimal digits, and the
underscore character, and begin with a alphabets(including
underscore). There are no restrictions on the lengths of Ruby
identifiers.
--8<---------------cut here---------------end--------------->8---

This explains why your example "[RX] ^el-search-.*-sources$" does not
work. Dashes don't belong to word syntax.
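
A small sketch of what this guess would imply. It assumes, without
confirmation from the Hyper Estraier documentation, that a word consists
only of letters, digits and underscores; the function name
`my-split-into-words' is hypothetical.

```elisp
;; Assumption: word constituents are [A-Za-z0-9_]; everything else
;; separates words.  This is the guess discussed here, not a documented rule.
(defun my-split-into-words (text)
  "Split TEXT at every run of characters outside letters, digits and `_'."
  (split-string text "[^A-Za-z0-9_]+" t))

(my-split-into-words "el-search-emacs-elisp-sources")
;; => ("el" "search" "emacs" "elisp" "sources")
```

Under that assumption no single word contains a dash, so a pattern that
is anchored with ^ and $ and itself contains dashes could never match one
word.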

> Regards,
>
> Michael.

Best regards, Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-10-05 10:27             ` Michael Albinus
@ 2019-10-05 11:50               ` Michael Heerdegen
  2019-10-05 12:01                 ` Michael Albinus
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Heerdegen @ 2019-10-05 11:50 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Emacs mailing list

Michael Albinus <michael.albinus@gmx.de> writes:

> Thanks! I've also tried to clarify in the debbugs user guide.

Great, thanks for working on that.

> The text reads now
>
>  -- Command: debbugs-gnu-search
>  -- Command: debbugs-org-search
>
>      These both commands are completely interactive.  They ask for a
>      '"search phrase"' for the text search.  It is just a string which
>      contains the words to be searched for followed by each other.
>      There are also operators like "AND", "ANDNOT" and "OR", which

FWIW, to me that still sounds a bit as if multiple words could be given
without an operator.

>      If there is no operator between the wildcards, "AND" is used by
>      default.

And that even more.  Didn't we want to remove this?

> Unfortunately, hyperestraier does not speak about the syntax of a word,
> nowhere. Since it is written in Ruby, I guess it uses the syntax of a
> Ruby identifier, see
> <https://ruby-doc.org/docs/ruby-doc-bundle/Manual/man-1.4/syntax.html#ident>:
>
> Ruby identifiers are consist of alphabets, decimal digits, and the
> underscore character, and begin with a alphabets(including
> underscore). There are no restrictions on the lengths of Ruby
> identifiers.
>
> This explains, why your example "[RX] ^el-search-.*-sources$" does not
> work. Dashes don't belong to word syntax.

Ok.  Too bad you are not sure.  I guess many people wonder what a "word"
is in this context.


Regards,

Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-10-05 11:50               ` Michael Heerdegen
@ 2019-10-05 12:01                 ` Michael Albinus
  2019-10-05 17:20                   ` Michael Heerdegen
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Albinus @ 2019-10-05 12:01 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Emacs mailing list

Michael Heerdegen <michael_heerdegen@web.de> writes:

Hi Michael,

>> The text reads now
>>
>>  -- Command: debbugs-gnu-search
>>  -- Command: debbugs-org-search
>>
>>      These both commands are completely interactive.  They ask for a
>>      '"search phrase"' for the text search.  It is just a string which
>>      contains the words to be searched for followed by each other.
>>      There are also operators like "AND", "ANDNOT" and "OR", which
>
> FWIW; that still sounds a bit like multiple words could be given without
> an operator for me.
>
>>      If there is no operator between the wildcards, "AND" is used by
>>      default.
>
> And that even more.  Didn't we want to remove this?

Well, I haven't tested heavily. If you want, run your tests, and adapt
the text according to the results.

> Regards,
>
> Michael.

Best regards, Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-10-05 12:01                 ` Michael Albinus
@ 2019-10-05 17:20                   ` Michael Heerdegen
  2019-10-05 18:11                     ` Michael Albinus
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Heerdegen @ 2019-10-05 17:20 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Emacs mailing list

Michael Albinus <michael.albinus@gmx.de> writes:

> >>      If there is no operator between the wildcards, "AND" is used by
> >>      default.
> >
> > And that even more.  Didn't we want to remove this?
>
> Well, I haven't tested heavily. If you want, run your tests, and adapt
> the text according to the results.

What do you want to test - isn't it enough that some test cases don't
work?  If it should work, that should be fixed.  If it doesn't have to
work, what do we need to test?  If you don't know if it should work, I
don't know how to find out.

I would prefer it if you could take care of it.  I have gotten removed two
wisdom teeth (sorry if the verb is wrong), have a lot of other stuff to
care about at the moment, and know absolutely nothing about debbugs.
I guess you can do it more efficiently and better.  There is no hurry.

Thanks,

Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-10-05 17:20                   ` Michael Heerdegen
@ 2019-10-05 18:11                     ` Michael Albinus
  2019-10-06 10:08                       ` Michael Heerdegen
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Albinus @ 2019-10-05 18:11 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Emacs mailing list

Michael Heerdegen <michael_heerdegen@web.de> writes:

Hi Michael,

>> Well, I haven't tested heavily. If you want, run your tests, and adapt
>> the text according to the results.
>
> What do you want to test - isn't it enough that some test cases don't
> work?  If it should work, that should be fixed.  If it doesn't have to
> work, what do we need to test?  If you don't know if it should work, I
> don't know how to find out.

I'm also short on time, and debbugs is not my primary concern. But so
what ...

I've tested the HyperEstraier search engine using
<https://debbugs.gnu.org/cgi/search.cgi>, and it looks like you are
right. Compare the search for "[BW] gcc AND [EW] gforth" and "[BW] gcc
[EW] gforth". So I have adapted the debbugs-ug manual, again.
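
For illustration, a sketch that builds the first of the two phrases with
the operator always written out. The helper `my-wildcard-phrase' is
hypothetical and not part of debbugs; it only formats a string.

```elisp
;; Hypothetical helper: format wildcard terms with an explicit operator.
(defun my-wildcard-phrase (terms &optional operator)
  "Join TERMS with OPERATOR (default \"AND\").
Each element of TERMS is either a plain word or a cons cell
\(MARKER . WORD), where MARKER is a string such as \"[BW]\"."
  (mapconcat (lambda (term)
               (if (consp term)
                   (concat (car term) " " (cdr term))
                 term))
             terms
             (concat " " (or operator "AND") " ")))

(my-wildcard-phrase '(("[BW]" . "gcc") ("[EW]" . "gforth")))
;; => "[BW] gcc AND [EW] gforth"
```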

> I would prefer it if you could take care of it.  I have gotten removed two
> wisdom teeth (sorry if the verb is wrong), have a lot of other stuff to
> care about at the moment, and know absolutely nothing about debbugs.
> I guess you can do it more efficiently and better.  There is no hurry.

I hope it will get better with your wisdom teeth (Weisheitszähne)!
Wish you the best!

> Thanks,
>
> Michael.

Best regards, Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-10-05 18:11                     ` Michael Albinus
@ 2019-10-06 10:08                       ` Michael Heerdegen
  2019-10-06 10:31                         ` Michael Albinus
  0 siblings, 1 reply; 16+ messages in thread
From: Michael Heerdegen @ 2019-10-06 10:08 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Emacs mailing list

Michael Albinus <michael.albinus@gmx.de> writes:

> I've tested the HyperEstraier search engine using
> <https://debbugs.gnu.org/cgi/search.cgi>, and it looks like you are
> right. Compare the search for "[BW] gcc AND [EW] gforth" and "[BW] gcc
> [EW] gforth". So I have adapted the debbugs-ug manual, again.

Thank you very much.  Looks good!

Until I read your adaptation, I hadn't noticed that a sequence of
words matches the same sequence of words (in the same order).  Should we
spell that out more explicitly?  Though, I understood it.

BTW, it seems that non-(Ruby) word chars are not just ignored.  For
example, with the query "el-search-emacs-elisp-sources" I find Bug#37321
(expected) but with "el search emacs elisp sources" I don't get a match
(unexpected for me).  Do you know more?

> I hope it will get better with your wisdom teeth (Weisheitszähne)!

You mean without :-P

> Wish you the best!

Thank you!  I'm already quite well again.


Regards,

Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-10-06 10:08                       ` Michael Heerdegen
@ 2019-10-06 10:31                         ` Michael Albinus
  2019-10-06 10:38                           ` Michael Heerdegen
  2019-10-16 11:57                           ` Michael Heerdegen
  0 siblings, 2 replies; 16+ messages in thread
From: Michael Albinus @ 2019-10-06 10:31 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Emacs mailing list

Michael Heerdegen <michael_heerdegen@web.de> writes:

Hi Michael,

> Until I read your adaptation, I hadn't noticed that a sequence of
> words matches the same sequence of words (in the same order).  Should we
> spell that out more explicitly?  Though, I understood it.

I believe it is understandable now. Personally, I'm not a fan of overly
verbose descriptions, but I might be biased because I understand what I'm
writing. Almost. Often.

> BTW, it seems that non-(Ruby) word chars are not just ignored.  For
> example, with the query "el-search-emacs-elisp-sources" I find Bug#37321
> (expected) but with "el search emacs elisp sources" I don't get a match
> (unexpected for me).  Do you know more?

No. I spent some hours asking search engines for any hint about what "word"
means in the HyperEstraier environment, but found nothing. I've also read
all the Perl code in debbugs which does the HyperEstraier integration, with
no luck either.

I haven't started to read the HyperEstraier source code itself; that
sounds like too much to me.

> Regards,
>
> Michael.

Best regards, Michael.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-10-06 10:31                         ` Michael Albinus
@ 2019-10-06 10:38                           ` Michael Heerdegen
  2019-10-16 11:57                           ` Michael Heerdegen
  1 sibling, 0 replies; 16+ messages in thread
From: Michael Heerdegen @ 2019-10-06 10:38 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Emacs mailing list

Ok,

then it is good enough for me.

Thanks again,

Michael.

> > BTW, it seems that non-(Ruby) word chars are not just ignored.  For
> > example, with the query "el-search-emacs-elisp-sources" I find Bug#37321
> > (expected) but with "el search emacs elisp sources" I don't get a match
> > (unexpected for me).  Do you know more?
>
> No. I spent some hours asking search engines for any hint about what "word"
> means in the HyperEstraier environment, but found nothing. I've also read
> all the Perl code in debbugs which does the HyperEstraier integration, with
> no luck either.
>
> I haven't started to read the HyperEstraier source code itself; that
> sounds like too much to me.




* Re: Wildcard matching in debbugs-gnu-search - how?
  2019-10-06 10:31                         ` Michael Albinus
  2019-10-06 10:38                           ` Michael Heerdegen
@ 2019-10-16 11:57                           ` Michael Heerdegen
  1 sibling, 0 replies; 16+ messages in thread
From: Michael Heerdegen @ 2019-10-16 11:57 UTC (permalink / raw)
  To: mikio, Michael Albinus; +Cc: Emacs mailing list

[Trying to resend the message because the original one had a broken
header]

Michael Albinus <michael.albinus@gmx.de> writes:

> No. I spent some hours asking search engines for any hint about what "word"
> means in the HyperEstraier environment, but found nothing. I've also read
> all the Perl code in debbugs which does the HyperEstraier integration, with
> no luck either.

Maybe Mikio (author of Hyper Estraier) can help?  Dunno if the Email
address stated on https://fallabs.com/hyperestraier/ is still
working.

Mikio, we are wondering how the Hyper Estraier algorithm splits texts
into words.  We are assuming that word syntax is defined by Ruby
syntax.  Is that correct?

And what happens with non-word characters?  Seems they are not just
ignored: a search phrase "foo-bar" matches occurrences of "foo-bar", but
"foo bar" seemingly doesn't find it.  So, what exactly is a "word" in
the context of Hyper Estraier?  Help appreciated.


TIA,

Michael.



