Re: Programmed completions are utterly confusing

From: Jean Louis @ 2022-11-17  9:31 UTC (permalink / raw)
  To: Ag; +Cc: help-gnu-emacs

* Ag <agzam.ibragimov@gmail.com> [2022-11-15 02:53]:
> Before someone tells me that asking for help should be posted in
> help-gnu-emacs@gnu.org, I apologize for the noise, but I'm hoping maybe
> this could open a discussion for documentation improvements for
> completions.

Documentation is here:

(info "(elisp) Completion")

If you have suggestions, write them up.

Which specific part is not clear?

> I've been trying to understand how completions work, and I needed a
> completion that searches not only within the displayed rows but also in
> annotations. Someone suggested concatenating the annotation info into
> the main string, but I don't like this for a few reasons. I've even been
> told that the annotations are not designed for that, and if I want them
> to be searched upon, they really can't be annotations.

I agree that you should be able to search within any kind of data.

As Emacs is extensible, what I do in my use case, which is analogous
to yours, is to work in two stages: the first is a search among too
many candidates, and the second is the completion.

As I do not know how you store your searchable data, I can't give
examples on that.

Try to make a function that first searches and then offers completion
over the candidates it found.

If you think about how the data will scale in the future, you will
understand why a two-stage process is better than a single completion
over all the items. I do not like waiting half a second, a second, or
two seconds for completion to appear. Emacs needs more time to
construct completion candidates when there are very many of them: with
70,000 or 250,000 candidates it takes perhaps one to three seconds,
though I did not measure.
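
A minimal sketch of that two-stage idea, assuming the data is a plain
Lisp list of (ID URL TITLE) entries. The variable `my-links' and all
`my-links-*' functions are hypothetical names made up for this
illustration, not part of any package:

```elisp
(require 'seq)

;; Hypothetical sample data: each entry is (ID URL TITLE).
(defvar my-links
  '((1 "https://www.gnu.org/software/emacs/" "GNU Emacs")
    (2 "https://www.postgresql.org/docs/" "PostgreSQL Documentation")
    (3 "https://elpa.gnu.org/" "GNU ELPA Packages")))

(defun my-links-search (term links)
  "Stage one: return entries of LINKS whose URL or title contains TERM.
Matching is literal and ignores case."
  (let ((case-fold-search t)
        (re (regexp-quote term)))
    (seq-filter (lambda (link)
                  (or (string-match-p re (nth 1 link))
                      (string-match-p re (nth 2 link))))
                links)))

(defun my-links-candidates (links)
  "Return completion strings such as \"Title [ID]\" for LINKS."
  (mapcar (lambda (link) (format "%s [%d]" (nth 2 link) (nth 0 link)))
          links))

(defun my-links-choose (term)
  "Stage two: complete only among the entries matching TERM."
  (interactive "sSearch: ")
  (completing-read "Choose: "
                   (my-links-candidates (my-links-search term my-links))
                   nil t))
```

Only the entries that survive the search are turned into completion
candidates, so the completion list stays short however large the full
data set grows.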

> - Use case I have right now: I want to search through URLs and their
>   page titles.

I do the same here, and I have made a function to search only in URLs
or only in titles.

I could easily do a search like "my search terms AND other terms AND
other terms", but I prefer to separate the terms, and I use the prefix
key C-u to tell how many words I want to find in the URL.

C-u C-u C-u SEARCH-KEY would ask me for 3 words, and then those 3
words must exist in the URL or in the TITLE.

And it is trivial to make a search for both title and URL at the same
time.
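
A rough sketch of how such a multi-word search could look. It is only
my reading of the description above (every word must occur in the URL
or in the title), and all `my-*' names are hypothetical:

```elisp
(require 'seq)

(defun my-count-prefix-presses (arg)
  "Return how many times C-u was pressed, given raw prefix ARG.
ARG is nil, or a list such as (4), (16) or (64)."
  (let ((n (if (consp arg) (car arg) 1))
        (count 0))
    ;; Each C-u multiplies the prefix value by 4, so divide it back down.
    (while (> n 1)
      (setq n (/ n 4)
            count (1+ count)))
    count))

(defun my-all-terms-match-p (terms url title)
  "Return non-nil if every string in TERMS occurs in URL or in TITLE."
  (let ((case-fold-search t))
    (seq-every-p (lambda (term)
                   (let ((re (regexp-quote term)))
                     (or (string-match-p re url)
                         (string-match-p re title))))
                 terms)))

(defun my-read-search-terms (arg)
  "Ask for as many words as C-u was pressed (at least one)."
  (interactive "P")
  (let ((terms '()))
    (dotimes (i (max 1 (my-count-prefix-presses arg)))
      (push (read-string (format "Word %d: " (1+ i))) terms))
    (nreverse terms)))
```

The list returned by `my-read-search-terms' can then be passed to
`my-all-terms-match-p' for each entry to decide whether it becomes a
completion candidate.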

> If I put each url into the main display string, and annotate them
> with titles, every row would be recognized as a "proper url", and
> packages like Embark could use that, e.g., I can dispatch
> embark-url-action and browse them. But if I concatenate the url and
> the page title, I would have to use tricks so Embark recognizes it
> as a url (and not a url with a string attached to it)

I suggest putting the URL in the URL place and the title in the title
place. I also suggest using ID numbers for completion. Search before
completing.

I use this function to extract the ID from the title:

(defun rcd-get-bracketed-id-end (s)
  "Return the ID number from the brackets at the end of string S.
For example, return 123 for `Some string [123]'.
Return nil when S does not end with a bracketed number."
  ;; Require at least one digit so that an empty `[]' does not yield 0.
  (when (string-match "\\[\\([[:digit:]]+\\)\\][[:space:]]*$" s)
    (string-to-number (match-string 1 s))))

(rcd-get-bracketed-id-end (completing-read "Choose: " '("One [1]" "Two [2]")))
  ⇒ 1   ; when "One [1]" is chosen

Then by using the ID number I know what was chosen.

That way I can get a reference to much more complex data in the
background. This way it is possible to use only the title to access
the URL, tags, author, description, and other related objects.

Get a reference, then use the reference to get more information about
the object.
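
To illustrate the "get a reference, then look up the object" idea
without a database, here is a small sketch that keeps the records in a
hash table keyed by ID. The table `my-records', its property names and
the `my-*' functions are all hypothetical and exist only for this
example:

```elisp
;; Hypothetical records keyed by ID; the property names are illustrative.
(defvar my-records (make-hash-table :test 'eql))
(puthash 1 '(:title "GNU Emacs" :url "https://www.gnu.org/software/emacs/"
             :tags ("editor" "lisp"))
         my-records)
(puthash 2 '(:title "PostgreSQL" :url "https://www.postgresql.org/"
             :tags ("database"))
         my-records)

(defun my-candidate-id (candidate)
  "Return the number inside the trailing brackets of CANDIDATE, or nil."
  (when (string-match "\\[\\([0-9]+\\)\\][ \t]*\\'" candidate)
    (string-to-number (match-string 1 candidate))))

(defun my-record-candidates ()
  "Return one \"Title [ID]\" string per record in `my-records'."
  (let (candidates)
    (maphash (lambda (id record)
               (push (format "%s [%d]" (plist-get record :title) id)
                     candidates))
             my-records)
    (nreverse candidates)))

(defun my-choose-record ()
  "Complete on titles, then return the full record behind the chosen one."
  (interactive)
  (let* ((choice (completing-read "Choose: " (my-record-candidates) nil t))
         (id (my-candidate-id choice)))
    (gethash id my-records)))
```

The completion string carries only the title and the ID; everything
else (URL, tags and so on) is fetched afterwards with the ID, so the
URL never has to be glued into the candidate text.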

> - Another use case comes to mind, let's say I want to sift through some
>   kind of log entries. Imagine I would want to provide a feature, where
>   if typed something like "last 5 mins", it would limit the rows to
>   include only log events with timestamps no older than five minutes
>   ago. Obviously, there's no practical reason or a sensible way to
>   encode dynamic time value into each row,

That calls for PostgreSQL, so I recommend learning about it.

I already use it to record what was done and at what time. My log
table is defined below.

                                                     Table "public.log"
┌──────────────────┬─────────────────────────────┬───────────┬──────────┬───────────────────────────────────────────────────┐
│      Column      │            Type             │ Collation │ Nullable │                      Default                      │
├──────────────────┼─────────────────────────────┼───────────┼──────────┼───────────────────────────────────────────────────┤
│ log_id           │ integer                     │           │ not null │ nextval('generallog_generallog_id_seq'::regclass) │
│ log_datecreated  │ timestamp without time zone │           │ not null │ now()                                             │
│ log_datemodified │ timestamp without time zone │           │          │                                                   │
│ log_usercreated  │ text                        │           │ not null │ "current_user"()                                  │
│ log_usermodified │ text                        │           │ not null │ "current_user"()                                  │
│ log_peoplelist   │ integer                     │           │          │                                                   │
│ log_people       │ integer                     │           │          │                                                   │
│ log_businesses   │ integer                     │           │          │                                                   │
│ log_assignedto   │ integer                     │           │          │                                                   │
│ log_timezones    │ integer                     │           │          │                                                   │
│ log_date         │ timestamp without time zone │           │          │ now()                                             │
│ log_time         │ time without time zone      │           │          │                                                   │
│ log_name         │ text                        │           │ not null │                                                   │
│ log_description  │ text                        │           │          │                                                   │
│ log_publish      │ boolean                     │           │          │ false                                             │
│ log_hyobjects    │ integer                     │           │          │                                                   │
│ log_key          │ text                        │           │          │                                                   │
│ log_logtypes     │ integer                     │           │ not null │ 1                                                 │
│ log_uuid         │ uuid                        │           │ not null │ gen_random_uuid()                                 │
└──────────────────┴─────────────────────────────┴───────────┴──────────┴───────────────────────────────────────────────────┘
Indexes:
    "generallog_pkey" PRIMARY KEY, btree (log_id)
    "log_log_uuid_key" UNIQUE CONSTRAINT, btree (log_uuid)
Foreign-key constraints:
    "generallog_generallog_assignedto_fkey" FOREIGN KEY (log_assignedto) REFERENCES people(people_id)
    "generallog_generallog_contacts_fkey" FOREIGN KEY (log_people) REFERENCES people(people_id)
    "generallog_generallog_hlinks_fkey" FOREIGN KEY (log_hyobjects) REFERENCES hyobjects(hyobjects_id)
    "generallog_generallog_logtypes_fkey" FOREIGN KEY (log_logtypes) REFERENCES logtypes(logtypes_id)
    "generallog_generallog_timezones_fkey" FOREIGN KEY (log_timezones) REFERENCES timezones(timezones_id)
    "log_log_businesses_fkey" FOREIGN KEY (log_businesses) REFERENCES people(people_id)
    "log_log_peoplelist_fkey" FOREIGN KEY (log_peoplelist) REFERENCES people(people_id)

And then I use SQL completion function:

rcd-completing-read-sql-hash

rcd-completing-read-sql-hash is a Lisp closure in ‘rcd-pg-basics.el’.

(rcd-completing-read-sql-hash PROMPT SQL PG &optional HISTORY
INITIAL-INPUT NOT-REQUIRE-MATCH AUTO-INITIAL-INPUT)

Complete selection by using SQL.

First column shall be unique id, followed by text
representation.  Example SQL query:

SELECT people_id, people_firstname || ' ' || people_lastname FROM people

PG is database handle.  HISTORY is supported with INITIAL-INPUT.
Argument PROMPT will be displayed to user.

(rcd-completing-read-sql-hash "Select log: "  "SELECT log_id, log_name || ', ' || log_datecreated FROM log LIMIT 10" cf-db)

Then I can see in completion something like:

Click on a completion to select it.
In this buffer, type RET to select the completion near point.

10 possible completions:
First log from command line, 2020-12-26 21:57:52.132102 [191]
mutt: message sent, 2020-12-26 22:01:15.840392 [192]
mutt: message sent, 2020-12-27 08:36:58.490307 [201]
mutt: message sent, 2020-12-27 08:38:16.822888 [202]
mutt: message sent, 2020-12-27 12:27:43.720052 [210]
mutt: message sent, 2020-12-27 17:31:50.430087 [212]
mutt: message sent, 2020-12-27 19:58:56.01232 [222]
mutt: message sent, 2020-12-28 06:04:21.229454 [227]
mutt: message sent, 2020-12-28 06:20:30.87355 [233]

So in that use case, to find anything from the last 5 minutes, I would do this:

(rcd-completing-read-sql-hash
 "Select log: "
 "SELECT log_id, log_name || ', ' || log_datecreated
    FROM log
   WHERE log_datecreated >= now() - interval '5 minutes'
   LIMIT 10"
 cf-db)

As a result I would get the ID number `log_id', by which I can then
access that specific log entry. The entry looks like this:

                             ID   33401
                   Date created   "2022-11-17 11:32:34.422482"
                  Date modified   nil
                   User created   "maddox"
                  User modified   "maddox"
                 List of people   nil
                        Contact   "Jean Louis"
                       Business   nil
                    Assigned to   nil
                      Time zone   nil
                           Date   "2022-11-17 11:32:34.422482"
                           Time   nil
                          Title   "Function `msmtp-count-remaining' invoked"
                    Description   nil
                        Publish   nil
                  Hyperdocument   nil
                            Key   nil
                       Log type   "Emacs Lisp Function"
                           UUID   "362b697a-5d28-434a-877a-d35850138467"

Thus, by searching through the entries in the first stage and using
completion to get a reference to the object ID, you will get clearer
results and be able to access any related information.
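
The "last 5 minutes" query above could also be built from a number
instead of being typed out each time. This is only a sketch:
`my-log-sql-since' is a hypothetical helper, and it assumes the `log'
table and column names shown earlier:

```elisp
(defun my-log-sql-since (minutes &optional limit)
  "Return SQL selecting log entries created in the last MINUTES minutes.
LIMIT defaults to 10.  Both arguments must be numbers."
  ;; `%d' signals an error for non-numeric arguments, so arbitrary
  ;; text cannot be spliced into the query.
  (format "SELECT log_id, log_name || ', ' || log_datecreated
  FROM log
 WHERE log_datecreated >= now() - interval '%d minutes'
 LIMIT %d"
          minutes (or limit 10)))

;; With my own library loaded, the call would then look like:
;; (rcd-completing-read-sql-hash "Select log: " (my-log-sql-since 5) cf-db)
```

The search stage is then just a matter of asking for the number of
minutes, and the completion stage stays the same.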

> Can someone please help me learn all this? And can you please tell me
> what's wrong with the snippet?

It is better to describe your data structures and what you wish to
achieve, instead of jumping to complex subjects because that is how
you think it should be done.

This e-mail belongs on a different mailing list, IMHO, so I am sending
the copy there instead of to emacs-devel.

-- 
Jean

Take action in Free Software Foundation campaigns:
https://www.fsf.org/campaigns

In support of Richard M. Stallman
https://stallmansupport.org/


