Re: Programmed completions are utterly confusing

From: Jean Louis <bugs@gnu.support>
To: Ag <agzam.ibragimov@gmail.com>
Cc: help-gnu-emacs@gnu.org
Subject: Re: Programmed completions are utterly confusing
Date: Thu, 17 Nov 2022 12:31:58 +0300
Message-ID: <Y3X/jj3Tkr7inbtY@protected.localdomain>
In-Reply-To: <m235al9ufz.fsf@gmail.com>

* Ag <agzam.ibragimov@gmail.com> [2022-11-15 02:53]:
> Before someone tells me that asking for help should be posted in
> help-gnu-emacs@gnu.org, I apologize for the noise, but I'm hoping maybe
> this could open a discussion for documentation improvements for
> completions.

Documentation is here:

(info "(elisp) Completion")

If you have suggestions, send them.

Which specific part is not clear?

> I've been trying to understand how completions work, and I needed a
> completion that searches not only within the displayed rows but also in
> annotations. Someone suggested concatenating the annotation info into
> the main string, but I don't like this for a few reasons. I've even been
> told that the annotations are not designed for that, and if I want them
> to be searched upon, they really can't be annotations.

I agree that you should be able to search within any kind of data.

As Emacs is extensible, what I do in my use case, which is analogous
to yours, is to work in two stages: the first is a search among the
many candidates, and the second is the completion.

As I do not know how you store your searchable data, I can't give
examples of that.

Try to make a function that first searches and then offers
completion over the candidates it found.

If you think about how the data will scale in the future, you will
understand why a two-stage process is better than a single completion
over all the items. I do not like waiting half a second, a second, or
two seconds for the completion to appear. Emacs needs more time to
construct completion candidates when there are many of them: with
70,000 or 250,000 candidates it takes maybe one, two, or three
seconds; I did not measure.
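
To illustrate the two stages, here is a minimal sketch. It is not my
actual code: the variable `my-bookmarks', its (ID URL TITLE) layout,
and both function names are hypothetical and made up for this example.
The first function only searches, the second one completes over what
was found.

```elisp
(require 'seq)

;; Hypothetical sample data: each entry is (ID URL TITLE).
(defvar my-bookmarks
  '((1 "https://www.gnu.org/software/emacs/" "GNU Emacs")
    (2 "https://www.postgresql.org/docs/" "PostgreSQL Documentation")
    (3 "https://elpa.gnu.org/" "GNU ELPA Packages")))

(defun my-bookmarks-search (query entries)
  "Return those ENTRIES whose URL or title contains QUERY, ignoring case."
  (let ((case-fold-search t)
        (re (regexp-quote query)))
    (seq-filter (lambda (entry)
                  (or (string-match-p re (nth 1 entry))
                      (string-match-p re (nth 2 entry))))
                entries)))

(defun my-bookmarks-choose (query)
  "Search `my-bookmarks' for QUERY, then complete over the matches only.
Return the chosen entry as (ID URL TITLE)."
  (interactive "sSearch: ")
  (let* ((found (my-bookmarks-search query my-bookmarks))
         ;; Alist of (DISPLAY-STRING . ENTRY); only titles are displayed.
         (table (mapcar (lambda (entry)
                          (cons (format "%s [%d]" (nth 2 entry) (car entry))
                                entry))
                        found))
         (choice (completing-read "Choose: " table nil t)))
    (cdr (assoc choice table))))
```

The completion table is built only from the entries that survived the
search, so its size does not grow with the whole data set.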

> - Use case I have right now: I want to search through URLs and their
>   page titles.

I do the same here, and I have made a function to search only in URLs or
only in titles.

I could easily do search like "my search terms AND other terms AND
other terms" -- but I prefer to separate terms and I use prefix key
C-u to tell how many words I want to find in the URL.

C-u C-u C-u SEARCH-KEY would ask me for 3 words, and then those 3
words must exist in the URL or in the TITLE.

And it is trivial to make a search in both the title and the URL at the
same time.
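
A rough sketch of how the prefix key could be turned into a number of
words, and how "all words must match" could be checked. The function
names are hypothetical and this is an illustration, not the code I
run; it assumes the URL and the title are simply joined into one text
before matching.

```elisp
(require 'seq)

(defun my-prefix-word-count (arg)
  "Translate raw prefix ARG into a number of search words.
Each plain C-u counts as one word; a numeric prefix is used as is."
  (cond ((null arg) 1)
        ;; C-u gives (4), C-u C-u gives (16), C-u C-u C-u gives (64).
        ((consp arg) (round (log (car arg) 4)))
        (t (max 1 (abs (prefix-numeric-value arg))))))

(defun my-all-words-match-p (words text)
  "Return non-nil if every string in WORDS occurs in TEXT, ignoring case."
  (let ((case-fold-search t))
    (seq-every-p (lambda (word)
                   (string-match-p (regexp-quote word) text))
                 words)))

(defun my-read-search-words (arg)
  "Read as many search words as the raw prefix ARG asks for."
  (let ((n (my-prefix-word-count arg)))
    (mapcar (lambda (i) (read-string (format "Word %d of %d: " i n)))
            (number-sequence 1 n))))
```

A command would call `my-read-search-words' with `current-prefix-arg'
and then keep only the entries for which `my-all-words-match-p'
returns non-nil on the URL and title.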

> If I put each url into the main display string, and annotate them
> with titles, every row would be recognized as a "proper url", and
> packages like Embark could use that, e.g., I can dispatch
> embark-url-action and browse them. But if I concatenate the url and
> the page title, I would have to use tricks so Embark recognizes it
> as a url (and not a url with a string attached to it)

I suggest putting the URL in the URL place and the title in the title
place. I also suggest using ID numbers for completion. Use search
before completion.

I use this function to cut the ID from the title:

(defun rcd-get-bracketed-id-end (s)
  "Return the ID number found in brackets at the end of string S.
For example, it would return 123 from `Some string [123]'."
  (let* ((match (string-match "\\[\\([[:digit:]]*\\)\\][[:space:]]*$" s)))
    (when match
      (string-to-number
       (substring-no-properties s (match-beginning 1) (match-end 1))))))

(rcd-get-bracketed-id-end (completing-read "Choose: " '("One [1]" "Two [2]"))) ⇒ 1

Then by using the ID number I know what was chosen.

That way I can get a reference to much more complex data in the
background. With only the title shown this way, it is possible to
access the URL, tags, author, description, and other related objects.

Get a reference, then use the reference to get more information about
the object.
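
As a small illustration of "get a reference, then look up the rest",
here is a sketch that keeps objects in a hash table keyed by ID. The
store `my-objects', its property names, and all function names are
hypothetical; in my case the data sits in a database, not in a hash
table.

```elisp
(defvar my-objects (make-hash-table :test 'eql)
  "Hypothetical store mapping ID numbers to property lists.")

(puthash 1 '(:title "GNU Emacs"
             :url "https://www.gnu.org/software/emacs/"
             :tags ("editor" "lisp"))
         my-objects)
(puthash 2 '(:title "PostgreSQL Documentation"
             :url "https://www.postgresql.org/docs/"
             :tags ("database"))
         my-objects)

(defun my-candidate-id (candidate)
  "Return the number inside the trailing [N] of CANDIDATE, or nil."
  (when (string-match "\\[\\([0-9]+\\)\\][[:space:]]*\\'" candidate)
    (string-to-number (match-string 1 candidate))))

(defun my-object-candidates ()
  "Return candidates of the form \"TITLE [ID]\" for every stored object."
  (let (result)
    (maphash (lambda (id plist)
               (push (format "%s [%d]" (plist-get plist :title) id) result))
             my-objects)
    result))

(defun my-object-choose ()
  "Complete over titles and return the whole object behind the choice."
  (interactive)
  (let ((choice (completing-read "Choose: " (my-object-candidates) nil t)))
    (gethash (my-candidate-id choice) my-objects)))
```

The user sees and completes only the title, while the returned
property list still carries the clean URL, so nothing has to be
concatenated into the display string.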

> - Another use case comes to mind, let's say I want to sift through some
>   kind of log entries. Imagine I would want to provide a feature, where
>   if typed something like "last 5 mins", it would limit the rows to
>   include only log events with timestamps no older than five minutes
>   ago. Obviously, there's no practical reason or a sensible way to
>   encode dynamic time value into each row,

That calls for PostgreSQL, so I recommend learning about it.

I use that already: I note what was done at what specific time and
keep it in my system. My log table is defined below.

                                                     Table "public.log"
┌──────────────────┬─────────────────────────────┬───────────┬──────────┬───────────────────────────────────────────────────┐
│      Column      │            Type             │ Collation │ Nullable │                      Default                      │
├──────────────────┼─────────────────────────────┼───────────┼──────────┼───────────────────────────────────────────────────┤
│ log_id           │ integer                     │           │ not null │ nextval('generallog_generallog_id_seq'::regclass) │
│ log_datecreated  │ timestamp without time zone │           │ not null │ now()                                             │
│ log_datemodified │ timestamp without time zone │           │          │                                                   │
│ log_usercreated  │ text                        │           │ not null │ "current_user"()                                  │
│ log_usermodified │ text                        │           │ not null │ "current_user"()                                  │
│ log_peoplelist   │ integer                     │           │          │                                                   │
│ log_people       │ integer                     │           │          │                                                   │
│ log_businesses   │ integer                     │           │          │                                                   │
│ log_assignedto   │ integer                     │           │          │                                                   │
│ log_timezones    │ integer                     │           │          │                                                   │
│ log_date         │ timestamp without time zone │           │          │ now()                                             │
│ log_time         │ time without time zone      │           │          │                                                   │
│ log_name         │ text                        │           │ not null │                                                   │
│ log_description  │ text                        │           │          │                                                   │
│ log_publish      │ boolean                     │           │          │ false                                             │
│ log_hyobjects    │ integer                     │           │          │                                                   │
│ log_key          │ text                        │           │          │                                                   │
│ log_logtypes     │ integer                     │           │ not null │ 1                                                 │
│ log_uuid         │ uuid                        │           │ not null │ gen_random_uuid()                                 │
└──────────────────┴─────────────────────────────┴───────────┴──────────┴───────────────────────────────────────────────────┘
Indexes:
    "generallog_pkey" PRIMARY KEY, btree (log_id)
    "log_log_uuid_key" UNIQUE CONSTRAINT, btree (log_uuid)
Foreign-key constraints:
    "generallog_generallog_assignedto_fkey" FOREIGN KEY (log_assignedto) REFERENCES people(people_id)
    "generallog_generallog_contacts_fkey" FOREIGN KEY (log_people) REFERENCES people(people_id)
    "generallog_generallog_hlinks_fkey" FOREIGN KEY (log_hyobjects) REFERENCES hyobjects(hyobjects_id)
    "generallog_generallog_logtypes_fkey" FOREIGN KEY (log_logtypes) REFERENCES logtypes(logtypes_id)
    "generallog_generallog_timezones_fkey" FOREIGN KEY (log_timezones) REFERENCES timezones(timezones_id)
    "log_log_businesses_fkey" FOREIGN KEY (log_businesses) REFERENCES people(people_id)
    "log_log_peoplelist_fkey" FOREIGN KEY (log_peoplelist) REFERENCES people(people_id)

And then I use SQL completion function:

rcd-completing-read-sql-hash

rcd-completing-read-sql-hash is a Lisp closure in ‘rcd-pg-basics.el’.

(rcd-completing-read-sql-hash PROMPT SQL PG &optional HISTORY
INITIAL-INPUT NOT-REQUIRE-MATCH AUTO-INITIAL-INPUT)

Complete selection by using SQL.

First column shall be unique id, followed by text
representation.  Example SQL query:

SELECT people_id, people_firstname || ' ' || people_lastname FROM people

PG is database handle.  HISTORY is supported with INITIAL-INPUT.
Argument PROMPT will be displayed to user.

(rcd-completing-read-sql-hash "Select log: "  "SELECT log_id, log_name || ', ' || log_datecreated FROM log LIMIT 10" cf-db)

Then I can see in completion something like:

Click on a completion to select it.
In this buffer, type RET to select the completion near point.

10 possible completions:
First log from command line, 2020-12-26 21:57:52.132102 [191]
mutt: message sent, 2020-12-26 22:01:15.840392 [192]
mutt: message sent, 2020-12-27 08:36:58.490307 [201]
mutt: message sent, 2020-12-27 08:38:16.822888 [202]
mutt: message sent, 2020-12-27 12:27:43.720052 [210]
mutt: message sent, 2020-12-27 17:31:50.430087 [212]
mutt: message sent, 2020-12-27 19:58:56.01232 [222]
mutt: message sent, 2020-12-28 06:04:21.229454 [227]
mutt: message sent, 2020-12-28 06:20:30.87355 [233]

So in that use case, to find anything from the last 5 minutes, I would do this:

(rcd-completing-read-sql-hash "Select log: "
                              "SELECT log_id, log_name || ', ' || log_datecreated
                                 FROM log
                                WHERE log_datecreated >= now() - interval '5 minutes'
                                LIMIT 10" cf-db)

and as the result I would get the ID number `log_id', by which I
could then access that specific log entry. The entry looks like
this:

                             ID   33401
                   Date created   "2022-11-17 11:32:34.422482"
                  Date modified   nil
                   User created   "maddox"
                  User modified   "maddox"
                 List of people   nil
                        Contact   "Jean Louis"
                       Business   nil
                    Assigned to   nil
                      Time zone   nil
                           Date   "2022-11-17 11:32:34.422482"
                           Time   nil
                          Title   "Function `msmtp-count-remaining' invoked"
                    Description   nil
                        Publish   nil
                  Hyperdocument   nil
                            Key   nil
                       Log type   "Emacs Lisp Function"
                           UUID   "362b697a-5d28-434a-877a-d35850138467"

Thus by using the principle of searching through entries in the
first stage, and using completion to get the reference to the
object ID in the second, you will get clearer results and be able
to access any related information.
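
For the "last 5 mins" idea quoted above, the typed phrase could be
translated into the WHERE clause before any completion takes place.
This is only a sketch under assumptions: the phrase syntax and both
function names are invented for the example, and it understands
nothing beyond "last N mins/minutes/hours/days". The table and column
names are the ones from my log table above.

```elisp
(defun my-interval-from-phrase (phrase)
  "Turn PHRASE such as \"last 5 mins\" into a PostgreSQL interval string.
Return nil when PHRASE is not understood."
  (let ((phrase (downcase phrase)))
    (when (string-match
           "\\`last \\([0-9]+\\) \\(minute\\|min\\|hour\\|day\\)s?\\'"
           phrase)
      (let ((n (match-string 1 phrase))
            (unit (cdr (assoc (match-string 2 phrase)
                              '(("minute" . "minutes")
                                ("min" . "minutes")
                                ("hour" . "hours")
                                ("day" . "days"))))))
        (format "%s %s" n unit)))))

(defun my-log-sql (phrase)
  "Return SQL selecting log entries no older than PHRASE describes."
  (let ((interval (my-interval-from-phrase phrase)))
    (unless interval
      (user-error "Cannot understand: %s" phrase))
    ;; INTERVAL consists only of digits and a fixed unit word, so it is
    ;; safe to place inside the quoted SQL literal.
    (format (concat "SELECT log_id, log_name || ', ' || log_datecreated"
                    " FROM log"
                    " WHERE log_datecreated >= now() - interval '%s'"
                    " LIMIT 10")
            interval)))
```

The resulting string can be handed to whatever SQL completion function
is in use, so the time limit is computed by the database at the moment
of the query and never has to be encoded into the rows.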

> Can someone please help me learn all this? And can you please tell me
> what's wrong with the snippet?

It is better to describe your data structures and what you wish to
achieve, instead of moving to complex subjects because that is how
you think it should be done.

IMHO this e-mail belongs on a different mailing list, so the copy
goes there instead of to emacs-devel.

-- 
Jean

Take action in Free Software Foundation campaigns:
https://www.fsf.org/campaigns

In support of Richard M. Stallman
https://stallmansupport.org/