bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1

unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
@ 2015-04-18 16:11 Oleh Krehel
  2015-04-18 17:41 ` Stefan Monnier
  0 siblings, 1 reply; 18+ messages in thread
From: Oleh Krehel @ 2015-04-18 16:11 UTC (permalink / raw)
  To: 20365

To reproduce, eval this in the top *info* directory:

(setq x (all-completions "(" 'Info-read-node-name-1))

x will contain many duplicates for each node, like "org" "org" "org"
"org" "org.info.gz" "org" "org.info.gz" "org" "org.info.gz".

Finally, if one of the elements of `all-completions' is passed, it still
doesn't work. I'm guessing that it expects "(org)" instead of "org", but
then why not offer these on the completion list?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-18 16:11 bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1 Oleh Krehel
@ 2015-04-18 17:41 ` Stefan Monnier
  2015-04-18 17:49   ` Oleh Krehel
  2015-04-18 23:40   ` Drew Adams
  0 siblings, 2 replies; 18+ messages in thread
From: Stefan Monnier @ 2015-04-18 17:41 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: 20365

Just a first comment: it's not considered incorrect for
`all-completions' to return a list with duplicate entries in it.
More specifically, it's considered the completion UI's job to remove
those duplicates.

>   (setq x (all-completions "(" 'Info-read-node-name-1))
> x will contain many duplicates for each node, like "org" "org" "org"
> "org" "org.info.gz" "org" "org.info.gz" "org" "org.info.gz".

Maybe the way these entries are generated could be reviewed to try and
reduce the number of duplicates.  And we could call `delete-dups' on the
result: while a completion-table shouldn't need to go out of its way to
reduce the number of duplicates (since the UI is supposed to handle it
anyway), it's probably good to avoid having such expected large number of
duplicates, indeed.

> Finally, if one of the elements of `all-completions' is passed, it still
> doesn't work.

What does "is passed" mean here?  What does "doesn't work" mean here?

> I'm guessing that it expects "(org)" instead of "org", but
> then why not offer these on the completion list?

What is "it"?

        Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-18 17:41 ` Stefan Monnier
@ 2015-04-18 17:49   ` Oleh Krehel
  2015-04-19  2:04     ` Stefan Monnier
  2015-04-18 23:40   ` Drew Adams
  1 sibling, 1 reply; 18+ messages in thread
From: Oleh Krehel @ 2015-04-18 17:49 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 20365

On Sat, Apr 18, 2015 at 7:41 PM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
> Just a first comment: it's not considered incorrect for
> `all-completions' to return a list with duplicate entries in it.
> More specifically, it's considered the completion UI's job to remove
> those duplicates.
>
>>   (setq x (all-completions "(" 'Info-read-node-name-1))
>> x will contain many duplicates for each node, like "org" "org" "org"
>> "org" "org.info.gz" "org" "org.info.gz" "org" "org.info.gz".
>
> Maybe the way these entries are generated could be reviewed to try and
> reduce the number of duplicates.  And we could call `delete-dups' on the
> result: while a completion-table shouldn't need to go out of its way to
> reduce the number of duplicates (since the UI is supposed to handle it
> anyway), it's probably good to avoid having such expected large number of
> duplicates, indeed.
>
>> Finally, if one of the elements of `all-completions' is passed, it still
>> doesn't work.
>
> What does "is passed" mean here?  What does "doesn't work" mean here?
>
>> I'm guessing that it expects "(org)" instead of "org", but
>> then why not offer these on the completion list?
>
> What is "it"?

What I think is happening there is that Info calls
`completing-read-function' and ultimately gives it (through
all-completions) a list like '("org" "emacs"...).  And then it expects
the answer to be in the form "(org)" or "(emacs)". Which I think is
strange. I've pushed recently this hack to `ivy-read', which makes
it work for Info:

((eq collection 'Info-read-node-name-1)
 (if (equal Info-current-file "dir")
     (setq collection
           (mapcar (lambda (x) (format "(%s)" x))
                   (cl-delete-duplicates
                    (all-completions "(" collection predicate)
                    :test 'equal)))
   (setq collection (all-completions "" collection predicate))))

This is quite ugly to go to such lengths to deal with just one
completion case.

Oleh





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-18 17:41 ` Stefan Monnier
  2015-04-18 17:49   ` Oleh Krehel
@ 2015-04-18 23:40   ` Drew Adams
  2015-04-19  1:50     ` Stefan Monnier
  1 sibling, 1 reply; 18+ messages in thread
From: Drew Adams @ 2015-04-18 23:40 UTC (permalink / raw)
  To: Stefan Monnier, Oleh Krehel; +Cc: 20365

> Just a first comment: it's not considered incorrect for
> `all-completions' to return a list with duplicate entries in it.
> More specifically, it's considered the completion UI's job to remove
> those duplicates.

Yes, absolutely.

> > (setq x (all-completions "(" 'Info-read-node-name-1))
> > x will contain many duplicates for each node, like "org" "org"
> > "org" "org" "org.info.gz" "org" "org.info.gz" "org" "org.info.gz".
> 
> Maybe the way these entries are generated could be reviewed to try
> and reduce the number of duplicates.  And we could call `delete-dups' on
> the result: while a completion-table shouldn't need to go out of its way
> to reduce the number of duplicates (since the UI is supposed to handle
> it anyway), it's probably good to avoid having such expected large
> number of duplicates, indeed.

Just to be sure -

I hope you mean to do this only in `Info-read-node-name-1' or
in the code that calls `completing-read' (or in a "completion UI").
I hope you don't mean to do any of that in `all-completions'.
I'm guessing that this is what you mean.

If my guess is correct, then you need not read any further,
and thanks for confirming the guess.

---

In case not, let me disagree that such things should be done in
`all-completions' and similar functions or in `completing-read'.
They should not delete duplicates.

In my context, for instance, the completion candidates are often
alist elements where the cdrs matter.  Or they are strings that
are `string=' but have different properties.  The cdrs or the
string properties matter to the code that calls `all-completions'
or `completing-read'.  The calling code reconstructs the alist
element from the chosen candidate(s) (it provides access to the
associated cdr information).

It is important in such contexts to allow "duplicates", as
defined from the point of view of comparing only the cars with
`string=' (which is what `all-completions' does).  Just
because two completion candidate strings are equal does not
mean that the alist elements of which they are the cars are
equal.

It should be entirely up to the caller to delete duplicates.
It is trivial for it to do so, and logical that this be the
place where that is done.  Only the caller (or the "completion
UI") knows whether duplicates make sense in the current context.

`all-completions' & compagnie are building blocks (and coded
in C, no less, so not tweakable in Lisp).  It is not their
job to either (a) filter out duplicates or (b) try not to
produce duplicates.  (a) is the job of their callers, which
are the consumers of the candidates.  (b) is the job of the
COLLECTION function or other producer of the candidates.

So please try to tackle any problem of duplicates in this
particular use case by fiddling with `Info-read-node-name-1',
and not by introducing a change in `all-completions' etc.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-18 23:40   ` Drew Adams
@ 2015-04-19  1:50     ` Stefan Monnier
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Monnier @ 2015-04-19  1:50 UTC (permalink / raw)
  To: Drew Adams; +Cc: Oleh Krehel, 20365

> I hope you mean to do this only in `Info-read-node-name-1' or
> in the code that calls `completing-read' (or in a "completion UI").
> I hope you don't mean to do any of that in `all-completions'.
> I'm guessing that this is what you mean.

Yes.


        Stefan





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-18 17:49   ` Oleh Krehel
@ 2015-04-19  2:04     ` Stefan Monnier
  2015-04-19 11:53       ` Oleh Krehel
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Monnier @ 2015-04-19  2:04 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: 20365

> What I think is happening there is that Info calls
> `completing-read-function' and ultimately gives it (through
> all-completions) a list like '("org" "emacs"...).  And then it expects
> the answer to be in the form "(org)" or "(emacs)".  Which I think is
> strange.

IIUC the problem is as follows: the completion-table used by "g" in
Info-mode lets you complete node names in the current file or lets you
complete file names (which are placed within parens), but the completion
table itself does not add those parens [ A long standing missing feature
here is that it doesn't know how to complete a node name after the
"(<file>)", even though it's a very much valid input to provide.  ]

Your ivy-mode generally wants to have a list of strings as candidates
and wants those strings to be valid return values, whereas the way
completion tables are defined there is no guarantee that you can get
that kind of info from the completion table.

We could change the Info-mode completion table so as to include the
closing paren in the output of `all-completions' (and probably include
the opening paren as well, in that case).  Note also that on my system
"g (ema TAB" includes things like "emacs23/" where "(emacs23/)" is not
a valid element (since emacs-23 is a directorym which is complete in
steps, just as in C-x C-f).

        Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-19  2:04     ` Stefan Monnier
@ 2015-04-19 11:53       ` Oleh Krehel
  2015-04-19 14:43         ` Drew Adams
                           ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Oleh Krehel @ 2015-04-19 11:53 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 20365

> We could change the Info-mode completion table so as to include the
> closing paren in the output of `all-completions' (and probably include
> the opening paren as well, in that case).  Note also that on my system
> "g (ema TAB" includes things like "emacs23/" where "(emacs23/)" is not
> a valid element (since emacs-23 is a directorym which is complete in
> steps, just as in C-x C-f).

That would be fine. It just makes sense for any element of what
`all-completions' returns to be a valid answer.

About the duplicate entries, I think it should be the responsibility
of the caller to remove the duplicates.  Here's my line of thought: a
completion function is expected to have an O(N) complexity, where N is
the amount of candidates. Removing duplicates is O(N^2) at worst, and
O(NlogN) at best.  So the completion function should not attempt to
remove the duplicates. It's doesn't affect the performance when I do
it for 1000 candidates, but when it's 20k (`describe-function') it can
have an impact.

The point is that the collection passed to the completion function can
be very large, and all but O(N) algorithms should be avoided.  On the
other hand, the caller knows exactly the type of data that it's
passing and may be able to remove the duplicates in an efficient way.

Oleh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-19 11:53       ` Oleh Krehel
@ 2015-04-19 14:43         ` Drew Adams
  2015-04-19 16:44         ` Dmitry Gutov
  2015-04-20  2:29         ` Stefan Monnier
  2 siblings, 0 replies; 18+ messages in thread
From: Drew Adams @ 2015-04-19 14:43 UTC (permalink / raw)
  To: Oleh Krehel, Stefan Monnier; +Cc: 20365

> It just makes sense for any element of what
> `all-completions' returns to be a valid answer.

Interesting point of view.  Sounds familiar... ;-)

http://debbugs.gnu.org/cgi/bugreport.cgi?bug=1085

> About the duplicate entries, I think it should be the responsibility
> of the caller to remove the duplicates.  Here's my line of thought:
> a completion function is expected to have an O(N) complexity, where N
> is the amount of candidates. Removing duplicates is O(N^2) at worst,
> and O(NlogN) at best.  So the completion function should not attempt to
> remove the duplicates. It's doesn't affect the performance when I do
> it for 1000 candidates, but when it's 20k (`describe-function') it
> can have an impact.
> 
> The point is that the collection passed to the completion function
> can be very large, and all but O(N) algorithms should be avoided.  On
> the other hand, the caller knows exactly the type of data that it's
> passing and may be able to remove the duplicates in an efficient
> way.

FWIW -

I agree that the calling code should control duplicate removal.
(Definitely, `all-completions' should not do that.)

But it can make sense to allow the calling program to optionally
"reach inside `completing-read'" to do its duplicate removal.

In Icicles, the caller can cause `completing-read' itself to remove
duplicates by binding global variable `icicle-transform-function'.

Because `completing-read' does some processing (e.g. sorting)
after computation of the candidates, this filter promotion into
`completing-read' can save some time.  It can also give users
dynamic control over the candidates to be matched (the completion
domain).

The value of variable `icicle-transform-function' is a function
used to transform the list of completion candidates.  Users can
toggle such transforming on/off using `C-$' during completion.

The most common use, by far, of `icicle-transform-function' is
to bind it to a remove-duplicates function.  Other uses include
switching to a particular subset of candidates, against which
user input is then matched.

For example, using `C-$' with `icicle-apropos-value' toggles
filtering the domain of initial candidates according to the
prefix argument, as follows:

 * no prefix arg: only user options (+ values)
 *           < 0: only commands (+ definitions)
 *           > 0: only faces (+ plists)
 *           = 0: only options (+ values), commands (+ defs),
                  faces (+ plists)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-19 11:53       ` Oleh Krehel
  2015-04-19 14:43         ` Drew Adams
@ 2015-04-19 16:44         ` Dmitry Gutov
  2015-04-19 17:00           ` Oleh Krehel
  2015-04-20  2:29         ` Stefan Monnier
  2 siblings, 1 reply; 18+ messages in thread
From: Dmitry Gutov @ 2015-04-19 16:44 UTC (permalink / raw)
  To: Oleh Krehel, Stefan Monnier; +Cc: 20365

On 04/19/2015 02:53 PM, Oleh Krehel wrote:

> About the duplicate entries, I think it should be the responsibility
> of the caller to remove the duplicates.

That could have been a decent argument if we're discussing a new API, 
and not an already widely-used one.

> Here's my line of thought: a
> completion function is expected to have an O(N) complexity, where N is
> the amount of candidates. Removing duplicates is O(N^2) at worst, and
> O(NlogN) at best.

O(NlogN) is closer to the truth:

First, you copy - O(N), then sort - O(NlogN), then call 
`delete-consecutive-dups' (linear time).

> So the completion function should not attempt to
> remove the duplicates. It's doesn't affect the performance when I do
> it for 1000 candidates, but when it's 20k (`describe-function') it can
> have an impact.

Even on a decently-sized collection (38K), this takes only 80ms:

(delete-consecutive-dups (sort (all-completions "" obarray) #'string<))

That's not terrible.





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-19 16:44         ` Dmitry Gutov
@ 2015-04-19 17:00           ` Oleh Krehel
  2015-04-19 17:12             ` Dmitry Gutov
  0 siblings, 1 reply; 18+ messages in thread
From: Oleh Krehel @ 2015-04-19 17:00 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 20365

On Sun, Apr 19, 2015 at 6:44 PM, Dmitry Gutov <dgutov@yandex.ru> wrote:

> That could have been a decent argument if we're discussing a new API, and
> not an already widely-used one.

No problem: no completion package will complain if you don't pass it duplicates.
Then it's just a matter of fixing the offening callers of completion one by one.

>> Here's my line of thought: a
>> completion function is expected to have an O(N) complexity, where N is
>> the amount of candidates. Removing duplicates is O(N^2) at worst, and
>> O(NlogN) at best.
>
>
> O(NlogN) is closer to the truth:
>
> First, you copy - O(N), then sort - O(NlogN), then call
> `delete-consecutive-dups' (linear time).
>
>> So the completion function should not attempt to
>> remove the duplicates. It's doesn't affect the performance when I do
>> it for 1000 candidates, but when it's 20k (`describe-function') it can
>> have an impact.
>
>
> Even on a decently-sized collection (38K), this takes only 80ms:
>
> (delete-consecutive-dups (sort (all-completions "" obarray) #'string<))
>
> That's not terrible.

Actually, 0.08 is a lot when you consider that I would call this after
each key press (in case when the collection is a function, not for the
static list).  The sluggishness would be perceptible even for a
relatively slow typist. And this is only the overhead, there's also
actual computing to be done. There's no reason not to avoid this
overhead cost.

Oleh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-19 17:00           ` Oleh Krehel
@ 2015-04-19 17:12             ` Dmitry Gutov
  0 siblings, 0 replies; 18+ messages in thread
From: Dmitry Gutov @ 2015-04-19 17:12 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: 20365

On 04/19/2015 08:00 PM, Oleh Krehel wrote:

> No problem: no completion package will complain if you don't pass it duplicates.
> Then it's just a matter of fixing the offening callers of completion one by one.

Good luck with that.

> Actually, 0.08 is a lot when you consider that I would call this after
> each key press (in case when the collection is a function, not for the
> static list).  The sluggishness would be perceptible even for a
> relatively slow typist. And this is only the overhead, there's also
> actual computing to be done. There's no reason not to avoid this
> overhead cost.

It's only 80ms as long as the list is unfiltered. Type one or two chars 
- and the already small delay disappears into nothing.





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-19 11:53       ` Oleh Krehel
  2015-04-19 14:43         ` Drew Adams
  2015-04-19 16:44         ` Dmitry Gutov
@ 2015-04-20  2:29         ` Stefan Monnier
  2015-04-20  8:38           ` Oleh Krehel
  2 siblings, 1 reply; 18+ messages in thread
From: Stefan Monnier @ 2015-04-20  2:29 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: 20365

> That would be fine. It just makes sense for any element of what
> `all-completions' returns to be a valid answer.

What's a valid answer?

all-completions will happily return "share/" when completing on
"/usr/s", even if you're looking for a .tex file rather than a directory
and even if there's no "share/" in the current directory either.

Yet, we don't want all-completions to return elements of the form
"/home/monnier/private/package/emacs/trunk/lisp/" since that would
result in a lot of redundancy that then needs to be found&removed.

So, no, all-completions just doesn't always return "valid answers".
Instead, it returns chunks of text that can added to some prefix (which
you can find via `completion-boundaries'), the result of which should be
a prefix of a valid answer.

> About the duplicate entries, I think it should be the responsibility
> of the caller to remove the duplicates.

That's the case currently.  The completion-table is called and the
caller is the UI, and currently it's the UI's responsibility to remove
the duplicates.

> Here's my line of thought: a completion function is expected to have
> an O(N) complexity, where N is the amount of candidates.  Removing
> duplicates is O(N^2) at worst, and O(NlogN) at best.

Actually, with a hash-table it's pretty much down to O(N).

        Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-20  2:29         ` Stefan Monnier
@ 2015-04-20  8:38           ` Oleh Krehel
  2015-04-20 12:30             ` Oleh Krehel
  2015-04-20 14:38             ` Stefan Monnier
  0 siblings, 2 replies; 18+ messages in thread
From: Oleh Krehel @ 2015-04-20  8:38 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 20365

On Mon, Apr 20, 2015 at 4:29 AM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>> That would be fine. It just makes sense for any element of what
>> `all-completions' returns to be a valid answer.
>
> What's a valid answer?

A valid answer is the one that doesn't lead to an error if
`completing-read-function' returns it.
Also, in my experience, having less callbacks leads to easier debugging.
I've looked at the callbacks in `incomplete-mode': the completion
function gets called like 3 times for a single input.

> all-completions will happily return "share/" when completing on
> "/usr/s", even if you're looking for a .tex file rather than a directory
> and even if there's no "share/" in the current directory either.

This would be good if there was a "share/" in the current directory.
I'm fine with the concept of `all-completions' being able to return
"infinite" candidates, but it would be nice if it obeyed some rules.
For instance, ivy-mode can handle the file names being "infinte",
because of the concept of directories being non-terminal completion
nodes. This concept can be adapted to all compltetion types with
"infinite" candidates. And there would be zero confusion:
`all-completions' would immediately return a list of strings, some of
them terminal nodes, some of them "directories".  Then it only remains
to provide a generic `file-directory-p' and we're done.

> Yet, we don't want all-completions to return elements of the form
> "/home/monnier/private/package/emacs/trunk/lisp/" since that would
> result in a lot of redundancy that then needs to be found&removed.

With a concept of "active directory" and "non-terminal" node this can be done.
"active directory" would be a generic `default-directory', and
"non-terminal" would
be a generic predicate `file-directory-p'.

> So, no, all-completions just doesn't always return "valid answers".
> Instead, it returns chunks of text that can added to some prefix (which
> you can find via `completion-boundaries'), the result of which should be
> a prefix of a valid answer.

We could have the cake and eat it too with my approach described above.

>> About the duplicate entries, I think it should be the responsibility
>> of the caller to remove the duplicates.
>
> That's the case currently.  The completion-table is called and the
> caller is the UI, and currently it's the UI's responsibility to remove
> the duplicates.

So Info returning duplicates is a bug that should be fixed?

>> Here's my line of thought: a completion function is expected to have
>> an O(N) complexity, where N is the amount of candidates.  Removing
>> duplicates is O(N^2) at worst, and O(NlogN) at best.
>
> Actually, with a hash-table it's pretty much down to O(N).

Yeah, but we're not using that. And having no assumptions on the data,
the hashing function would be the most basic one.

Oleh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-20  8:38           ` Oleh Krehel
@ 2015-04-20 12:30             ` Oleh Krehel
  2022-04-17 10:51               ` Lars Ingebrigtsen
  2015-04-20 14:38             ` Stefan Monnier
  1 sibling, 1 reply; 18+ messages in thread
From: Oleh Krehel @ 2015-04-20 12:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 20365

Please check the "fix-info-dups" brach with a fix. This patch removes
the duplicates and generally simplifies the completion of Info files.

Oleh





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-20  8:38           ` Oleh Krehel
  2015-04-20 12:30             ` Oleh Krehel
@ 2015-04-20 14:38             ` Stefan Monnier
  2015-04-20 14:52               ` Oleh Krehel
  1 sibling, 1 reply; 18+ messages in thread
From: Stefan Monnier @ 2015-04-20 14:38 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: 20365

> nodes. This concept can be adapted to all compltetion types with
> "infinite" candidates. And there would be zero confusion:
> `all-completions' would immediately return a list of strings, some of
> them terminal nodes, some of them "directories".  Then it only remains
> to provide a generic `file-directory-p' and we're done.

Yes, that's also what I was thinking.  Basically, if all-completions
returns something which requires further input, then this something
should include the "terminating char" which makes completion-boundaries
change (which is how to detect that your candidate list is out-of-date
because the input has "moved to another directory").

>> That's the case currently.  The completion-table is called and the
>> caller is the UI, and currently it's the UI's responsibility to remove
>> the duplicates.
> So Info returning duplicates is a bug that should be fixed?

Our UI already does remove duplicates (not in info.el, of course, since
our UI is in minibuffer.el).

>>> Here's my line of thought: a completion function is expected to have
>>> an O(N) complexity, where N is the amount of candidates.  Removing
>>> duplicates is O(N^2) at worst, and O(NlogN) at best.
>> Actually, with a hash-table it's pretty much down to O(N).
> Yeah, but we're not using that.

Not sure who's "we", here.  But the point is that if the performance of
delete-dups becomes a problem, it can be improved.

> And having no assumptions on the data, the hashing function would be
> the most basic one.

I don't think that should make much difference: (make-hash-table :test #'equal)
should work just fine.

        Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-20 14:38             ` Stefan Monnier
@ 2015-04-20 14:52               ` Oleh Krehel
  2015-04-20 19:14                 ` Stefan Monnier
  0 siblings, 1 reply; 18+ messages in thread
From: Oleh Krehel @ 2015-04-20 14:52 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 20365

On Mon, Apr 20, 2015 at 4:38 PM, Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>> nodes. This concept can be adapted to all compltetion types with
>> "infinite" candidates. And there would be zero confusion:
>> `all-completions' would immediately return a list of strings, some of
>> them terminal nodes, some of them "directories".  Then it only remains
>> to provide a generic `file-directory-p' and we're done.
>
> Yes, that's also what I was thinking.  Basically, if all-completions
> returns something which requires further input, then this something
> should include the "terminating char" which makes completion-boundaries
> change (which is how to detect that your candidate list is out-of-date
> because the input has "moved to another directory").

I'm not fully familiar with the concept of completion-boundaries, but I
have a feeling that I won't like it, specifically the multiple callback
passes for a response to a single completion input change.

Is the concept of completion-boundaries irreplaceable, could it maybe be
replaced by a notion of navigating a tree (just like navigating a file
system, with terminal and non-terminal nodes).  Which Emacs features use
the completion-boundaries concept?

>>> That's the case currently.  The completion-table is called and the
>>> caller is the UI, and currently it's the UI's responsibility to remove
>>> the duplicates.
>> So Info returning duplicates is a bug that should be fixed?
>
> Our UI already does remove duplicates (not in info.el, of course, since
> our UI is in minibuffer.el).

I've added a fix to Info in any case. Can you please look at it in the
fix-info-dups branch?

>>>> Here's my line of thought: a completion function is expected to have
>>>> an O(N) complexity, where N is the amount of candidates.  Removing
>>>> duplicates is O(N^2) at worst, and O(NlogN) at best.
>>> Actually, with a hash-table it's pretty much down to O(N).
>> Yeah, but we're not using that.
>
> Not sure who's "we", here.  But the point is that if the performance of
> delete-dups becomes a problem, it can be improved.

We are the happy Emacs users.

>> And having no assumptions on the data, the hashing function would be
>> the most basic one.
>
> I don't think that should make much difference: (make-hash-table :test #'equal)
> should work just fine.

OK, that's good to keep in mind. But even better is to avoid placing the
overhead on the UI, be it minibuffer.el or ivy.el or whatever.

Oleh





^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-20 14:52               ` Oleh Krehel
@ 2015-04-20 19:14                 ` Stefan Monnier
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Monnier @ 2015-04-20 19:14 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: 20365

> I'm not fully familiar with the concept of completion-boundaries, but I

I know, but if you intend to support a completion UI that works with
existing completion-tables, you're going to have to become at least
somewhat acquainted to it.

> Is the concept of completion-boundaries irreplaceable, could it maybe be
> replaced by a notion of navigating a tree (just like navigating a file
> system, with terminal and non-terminal nodes).  Which Emacs features use
> the completion-boundaries concept?

All the "special" examples of completion I showed (plus file name completion).

> OK, that's good to keep in mind. But even better is to avoid placing the
> overhead on the UI, be it minibuffer.el or ivy.el or whatever.

Many completion tables are made by combining other completion tables.
Doing the delete-dups at every step is going to be more code and more
runtime work than doing it once at the end in the UI.  So the UI will
have to do it.  Completion-tables can also do it if they want, but it
should not be necessary for correctness.

        Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1
  2015-04-20 12:30             ` Oleh Krehel
@ 2022-04-17 10:51               ` Lars Ingebrigtsen
  0 siblings, 0 replies; 18+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-17 10:51 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: Stefan Monnier, 20365

Oleh Krehel <ohwoeowho@gmail.com> writes:

> Please check the "fix-info-dups" brach with a fix. This patch removes
> the duplicates and generally simplifies the completion of Info files.

I've now applied that patch to Emacs 29 (with some changes), and I think
that fixes the reported problem here, so I'm closing the bug report.
(The discussion then went on to more general issues of duplicates in
completions.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-04-17 10:51 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-04-18 16:11 bug#20365: 24.5; all-completions returns duplicates for Info-read-node-name-1 Oleh Krehel
2015-04-18 17:41 ` Stefan Monnier
2015-04-18 17:49   ` Oleh Krehel
2015-04-19  2:04     ` Stefan Monnier
2015-04-19 11:53       ` Oleh Krehel
2015-04-19 14:43         ` Drew Adams
2015-04-19 16:44         ` Dmitry Gutov
2015-04-19 17:00           ` Oleh Krehel
2015-04-19 17:12             ` Dmitry Gutov
2015-04-20  2:29         ` Stefan Monnier
2015-04-20  8:38           ` Oleh Krehel
2015-04-20 12:30             ` Oleh Krehel
2022-04-17 10:51               ` Lars Ingebrigtsen
2015-04-20 14:38             ` Stefan Monnier
2015-04-20 14:52               ` Oleh Krehel
2015-04-20 19:14                 ` Stefan Monnier
2015-04-18 23:40   ` Drew Adams
2015-04-19  1:50     ` Stefan Monnier

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).