unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Thoughts on Refactoring In-Buffer Completion In message.el
@ 2022-06-23 15:26 Alexander Adolf
  2022-06-25  4:35 ` Thomas Fitzsimmons
  2022-06-25  8:22 ` Stefan Monnier
  0 siblings, 2 replies; 6+ messages in thread
From: Alexander Adolf @ 2022-06-23 15:26 UTC (permalink / raw)
  To: emacs-devel

Hello,

<disclaimer>
  message.el is a massive file (some 8600 lines), and I can't claim
  having read or understood any substantial portion of it. All
  statements solely rely on the (small) portions of the code around
  completion in message.el, which I have looked at. Thus, as always in
  life, I shall be looking froward to standing corrected and
  enlightened wherever I deserve it. ;-)
</disclaimer>

Regarding in-buffer completion, message-mode is probably a rather
specific case as it has succinctly distinct in-buffer completion
contexts. Think for instance of a body or subject line, vs. a To header
line, vs. a References header line, vs. a Newsgroups header line.

Currently, message-mode employs two mechanisms for handling these
contextss:

• Its completion function message-completion-function is registered in
  completion-at-point-functions, and marshals the in-buffer completion
  contexts via the variable message-completion-alist.
  message-completion-alist in turn configures the (one and only)
  function to be called for completion candidates in a given context.

• The completion style to use for email addresses is added to the
  variable completion-category-defaults.

What made me start scratching my head about this, was the wish to
combine email address candidates from more than one source, and have
them be presented in a single list by completion-at-point. Since EUDC
had recently gained the ability to return combined search results from
several of its back-ends [1], I put together a new EUDC function that
can be added to completion-at-point-functions. My new function gets
added to the front of completion-at-point-functions in message-mode, and
email address completion candidates from the EUDC back-ends I have
configured show up when completion-at-point is triggered [2].

[1] commit 0470a4a939772c4bd25123b15f5eadab41f8bee5
[2] commit 620ac6735520aea97ce49059b0df38ed41930b6b


So far so good; this satisfies my immediate use-case.


Job done? Looking at [2], you'll notice that I need to check in my new
EUDC completion function whether point is actually on a To, Cc, etc.
line. Otherwise email addresses would happily be offered as completion
candidates in the middle of the subject, or in the body. Having this
extra check for an email message header line in EUDC code didn't feel
quite right, however. Think "separation of concerns".

Further looking at message.el, and how in-buffer completion is handled
there, you'll find that there is (more or less) close integration with
ecomplete, mailabbrev, EUDC, and bbdb in message.el. Consider, for
instance the defun starting at line 8378 of message.el:

┌────
│ 8377  (defun message--bbdb-query-with-words (words)
│ 8378    ;; FIXME: This (or something like this) should live on the BBDB side.
│ 8379    (when (fboundp 'bbdb-records)
└────
Listing 1: lisp/gnus/message.el "separation of concerns" example

What more should I add? I fully agree with the comment in line 8378.
Again, think "separation of concerns".


On this backdrop, where would I see room for improvement?


Overall, there are a couple of bits an pieces for in-buffer completion
in place in message.el already. But it seems they were developed/added
independently of each other, and a little more orchestration could
perhaps help to make things more flexible, but without reinventing the
wheel.

1) In message.el, combine both completion control mechanisms into a
   single one.

   At first glance from the end user's point of view,
   message-completion-alist and completion-category-defaults can be
   perceived as two distinct mechanisms, and it is not immediately
   obvious which completion style category applies for which regex in
   message-completion-alist. Only by inspecting line 8404 in message.el,
   one can discover which category is used:

   ┌────
   │ 8398  (defun message--name-table (orig-string)
   │ 8399    (let ((orig-words (split-string orig-string "[ \t]+"))
   │ 8400          eudc-responses
   │ 8401          bbdb-responses)
   │ 8402      (lambda (string pred action)
   │ 8403        (pcase action
   │ 8404          ('metadata '(metadata (category . email)))
   └────
   Listing 2: lisp/gnus/message.el completion styles handling

   Thus, I would propose to change message-completion-alist from being
   alist of (RE . FUN), to become an alist of (RE . PLIST). Two
   properties for the inner plist would initially defined; one for
   specifying the completion style, and another one for specifying the
   completion-at-point-functions. With this, the new default value of
   message-completion-alist could for example be along the lines of:

   ┌────
   │ (defcustom message-completion-alist
   │   `((,message-newgroups-header-regexp
   │      . '(:capf-style 'newsgroup
   │          :capf-funs  '(gnus-capf-complete)))    ;; hypothetical
   │     (,message-email-recipient-header-regexp
   │       . '(:capf-style 'email
   │           :capf-funs  '(eudc-capf-complete))))  ;; exists
   │   "docstring"
   │   )
   └────

   As an aside: considering the FIXME comment in the function
   message-expand-group, newsgroup completion would seem to be able to
   benefit from this change, too.

   If none of the regular expressions matches, the settings in
   completion-at-point-functions, and
   completion-category-defaults/overrides will apply as before, or in
   other modes.

   With such an approach, the end user would get a positive assertion as
   to which completion style is used in each part of a message buffer.


2) Refactor ecomplete, mailabbrev, and bbdb stuff out of message.el as
   much as possible.

   As the FIXME comment in listing 1 above suggests, any "query with
   words" functions, or other ecomplete, mailabbrev, or bbdb specific
   functions should be in the respective packages themselves
   ("separation of concerns"). Also, as EUDC performs search result
   aggregation across sources, these packages should implement EUDC
   back-ends to provide their results via EUDC. Message.el should thus
   not interact with any email address database directly, but instead
   provide a default configuration where completion-at-point queries
   EUDC for email addresses.


In all cases, current default behaviour should be retained as much as
reasonably possible, of course.


Many thanks and looking forward to your thoughts,

  -–alexander



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Thoughts on Refactoring In-Buffer Completion In message.el
  2022-06-23 15:26 Thoughts on Refactoring In-Buffer Completion In message.el Alexander Adolf
@ 2022-06-25  4:35 ` Thomas Fitzsimmons
  2022-06-27 15:48   ` Alexander Adolf
  2022-06-25  8:22 ` Stefan Monnier
  1 sibling, 1 reply; 6+ messages in thread
From: Thomas Fitzsimmons @ 2022-06-25  4:35 UTC (permalink / raw)
  To: Alexander Adolf; +Cc: emacs-devel

Hi Alexander,

Nice overview, thanks.  Do you think it will be possible to do this
refactoring as a series of small individual patches, each of which keeps
things working?  I think a series of proposed changes will be easier to
discuss.

Thomas



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Thoughts on Refactoring In-Buffer Completion In message.el
  2022-06-23 15:26 Thoughts on Refactoring In-Buffer Completion In message.el Alexander Adolf
  2022-06-25  4:35 ` Thomas Fitzsimmons
@ 2022-06-25  8:22 ` Stefan Monnier
  2022-06-27 16:37   ` Alexander Adolf
  1 sibling, 1 reply; 6+ messages in thread
From: Stefan Monnier @ 2022-06-25  8:22 UTC (permalink / raw)
  To: Alexander Adolf; +Cc: emacs-devel

>    ┌────
>    │ (defcustom message-completion-alist
>    │   `((,message-newgroups-header-regexp
>    │      . '(:capf-style 'newsgroup
>    │          :capf-funs  '(gnus-capf-complete)))    ;; hypothetical
>    │     (,message-email-recipient-header-regexp
>    │       . '(:capf-style 'email
>    │           :capf-funs  '(eudc-capf-complete))))  ;; exists
>    │   "docstring"
>    │   )
>    └────

Sounds fine, tho "capf-style" sounds wrong: these are the names of
categories, not a styles.

Also `capf-funs` sounds wrong: these should be completion tables
not CAPFs.  The CAPFs are the functions which decide which completion
table to use and what's the BEG...END that's being completed, and this
should not depend on the backend since they only depend on the format of
messages (and hence belong in `message.el`).

Currently we have `message-completion-alist` which decides on
a line-by-line basis what kind of completion we're doing, and then we
have a specialized function that decides what's the BEG..END to use and
which backend to use.

Maybe `message-completion-alist` should be beefed up so that it cuts
this middle man (specialized function): i.e. provide not just
a line-based regexp to decide which specialized function to use, but
also provide some way to specify the BEG..END and the backend*S*.

Maybe have it be a list of (FUNCTION CATEGORY . BACKENDS) ?
where FUNCTION should return (BEG . END) if this completion applies?

> 2) Refactor ecomplete, mailabbrev, and bbdb stuff out of message.el as
>    much as possible.

AFAIK this is already done for `ecomplete`.


        Stefan




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Thoughts on Refactoring In-Buffer Completion In message.el
  2022-06-25  4:35 ` Thomas Fitzsimmons
@ 2022-06-27 15:48   ` Alexander Adolf
  0 siblings, 0 replies; 6+ messages in thread
From: Alexander Adolf @ 2022-06-27 15:48 UTC (permalink / raw)
  To: Thomas Fitzsimmons; +Cc: emacs-devel

Hello Thomas,

Thomas Fitzsimmons <fitzsim@fitzsim.org> writes:

> [...]
> Nice overview, thanks.  Do you think it will be possible to do this
> refactoring as a series of small individual patches, each of which keeps
> things working?  I think a series of proposed changes will be easier to
> discuss.
> [...]

You have a good point here. I'll try to structure it that way.

Not sure whether all|some patches will meet everybody's definition of
"small" though... ;-) But I'll try to keep the as small as reasonably
possible.

  --alexander



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Thoughts on Refactoring In-Buffer Completion In message.el
  2022-06-25  8:22 ` Stefan Monnier
@ 2022-06-27 16:37   ` Alexander Adolf
  2022-06-28 15:49     ` Stefan Monnier
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Adolf @ 2022-06-27 16:37 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello Stefan,

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>    ┌────
>>    │ (defcustom message-completion-alist
>>    │   `((,message-newgroups-header-regexp
>>    │      . '(:capf-style 'newsgroup
>>    │          :capf-funs  '(gnus-capf-complete)))    ;; hypothetical
>>    │     (,message-email-recipient-header-regexp
>>    │       . '(:capf-style 'email
>>    │           :capf-funs  '(eudc-capf-complete))))  ;; exists
>>    │   "docstring"
>>    │   )
>>    └────
>
> Sounds fine, tho "capf-style" sounds wrong: these are the names of
> categories, not a styles.

These names were just for illustration, so happy to use better fitting
names.

Perhaps `completion-style-category`? A shorter version?

> Also `capf-funs` sounds wrong: [...]

They are either tables, or functions that return tables. As of today,
the entries in message-completion-alist are functions.

> Currently we have `message-completion-alist` which decides on
> a line-by-line basis what kind of completion we're doing, and then we
> have a specialized function that decides what's the BEG..END to use and
> which backend to use.
>
> Maybe `message-completion-alist` should be beefed up so that it cuts
> this middle man (specialized function): i.e. provide not just
> a line-based regexp to decide which specialized function to use, but
> also provide some way to specify the BEG..END and the backend*S*.

Good point, and agreed that determining (beg . end) should also be done
in message.Eli.

> Maybe have it be a list of (FUNCTION CATEGORY . BACKENDS) ?
> where FUNCTION should return (BEG . END) if this completion applies?

In some situations this will be called for every keystroke. Would the
extra overhead of a function call seem acceptable for this?

OTOH, the code of each of those functions would probably look extremely
similar. The format of header lines where specific completion is to be
attempted always seems to be:

    header-label ':' record | record-list

(I'm thinking of To/Cc/etc. and Newsgroups/Followup-To/etc.)

Thus, the only parameters for such functions would seem to be a regular
expression to match the header-label, and the character (or regex?) that
is used as the record separator. So we might just as well put the
header-label regex, and the record separator char in
`message-completion-alist`?

>> 2) Refactor ecomplete, mailabbrev, and bbdb stuff out of message.el as
>>    much as possible.
>
> AFAIK this is already done for `ecomplete`.
> [...]

Searching for `ecomplete` in message.el (tip of master as of this
writing):

message.el
1390:The default is `abbrev', which uses mailabbrev.  `ecomplete' uses
1396:		 (const :tag "Use ecomplete" ecomplete)
1400:  "List of `self-insert-command's used to trigger ecomplete.
1402:header, ecomplete will suggest the candidates of recipients (see also
1937:(defvar message-inhibit-ecomplete nil)
3097:  (when (and (message-mail-alias-type-p 'ecomplete)
3181:   ((message-mail-alias-type-p 'ecomplete)
3182:    (ecomplete-setup)))
4445:      ;; Do ecomplete address snarfing.
4446:      (when (and (message-mail-alias-type-p 'ecomplete)
4447:		 (not message-inhibit-ecomplete))
4448:	(message-put-addresses-in-ecomplete))
8021:	    (message-inhibit-ecomplete t)
8344:`ecomplete', `message-self-insert-commands' should probably be
8414:	   (ecomplete-setup)
8420:	                (when (and (bound-and-true-p ecomplete-database)
8421:	                           (fboundp 'ecomplete-completion-table))
8423:                                           (ecomplete-completion-table 'mail)
8628:(declare-function ecomplete-add-item "ecomplete" (type key text))
8629:(declare-function ecomplete-save "ecomplete" ())
8631:(defun message-put-addresses-in-ecomplete ()
8632:  (require 'ecomplete)
8640:	(ecomplete-add-item 'mail (car (mail-header-parse-address string))
8642:  (ecomplete-save))
8644:(autoload 'ecomplete-display-matches "ecomplete")
8666:		    (ecomplete-display-matches 'mail word choose))))
8671:(defun message-ecomplete-capf ()
8674:  (when (and (bound-and-true-p ecomplete-database)
8675:             (fboundp 'ecomplete-completion-table)
8683:      `(,start ,end ,(ecomplete-completion-table 'mail)))))

That's 40 hits, and looking at the matching lines, my first impression
would not frankly be that ecomplete integration has been reduced to a
bare minimum.


Looking forward to your thoughts,

  --alexander



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Thoughts on Refactoring In-Buffer Completion In message.el
  2022-06-27 16:37   ` Alexander Adolf
@ 2022-06-28 15:49     ` Stefan Monnier
  0 siblings, 0 replies; 6+ messages in thread
From: Stefan Monnier @ 2022-06-28 15:49 UTC (permalink / raw)
  To: Alexander Adolf; +Cc: emacs-devel

>> Sounds fine, tho "capf-style" sounds wrong: these are the names of
>> categories, not a styles.
> These names were just for illustration, so happy to use better fitting
> names.
> Perhaps `completion-style-category`? A shorter version?

It's not a "style category": it's more a "completion category" (it's not
a classification of styles but a classification of completion candidates).
To any given category the user can then associate some styles (as well
as control the cycling behavior).

>> Also `capf-funs` sounds wrong: [...]
> They are either tables, or functions that return tables.

I think they should be only completion tables (and most/all of those
tables will be functions: completion tables can take the shape of lists
of strings, alists, hash-tables, obarrays, or functions, but the only
think we care about is that they work with the corresponding methods).

That part that's wrong is not "funs" but "capf", because "capf" refers
to `completion-at-point-functions` which are not completion tables but
functions that return (BEG END TABLE).  They're related but
they're different.

>> Maybe `message-completion-alist` should be beefed up so that it cuts
>> this middle man (specialized function): i.e. provide not just
>> a line-based regexp to decide which specialized function to use, but
>> also provide some way to specify the BEG..END and the backend*S*.
> Good point, and agreed that determining (beg . end) should also be done
> in message.Eli.

Nice typo, thank you.  Indeed, maybe in order to mark the enormous
contribution of Eli to Emacs, maybe ELisp source files should end in
`.Eli` ;-)

>> Maybe have it be a list of (FUNCTION CATEGORY . BACKENDS) ?
>> where FUNCTION should return (BEG . END) if this completion applies?
> In some situations this will be called for every keystroke. Would the
> extra overhead of a function call seem acceptable for this?

An extra function call will be completely lost in the noise, yes.

> OTOH, the code of each of those functions would probably look extremely
> similar. The format of header lines where specific completion is to be
> attempted always seems to be:
>
>     header-label ':' record | record-list
>
> (I'm thinking of To/Cc/etc. and Newsgroups/Followup-To/etc.)
>
> Thus, the only parameters for such functions would seem to be a regular
> expression to match the header-label, and the character (or regex?) that
> is used as the record separator. So we might just as well put the
> header-label regex, and the record separator char in
> `message-completion-alist`?

You might be right, maybe we just need to add a record separator.

> 1390:The default is `abbrev', which uses mailabbrev.  `ecomplete' uses
> 1396:		 (const :tag "Use ecomplete" ecomplete)
> 1400:  "List of `self-insert-command's used to trigger ecomplete.
> 1402:header, ecomplete will suggest the candidates of recipients (see also
> 1937:(defvar message-inhibit-ecomplete nil)

These non-executable.

> 3097:  (when (and (message-mail-alias-type-p 'ecomplete)
> 3181:   ((message-mail-alias-type-p 'ecomplete)
> 3182:    (ecomplete-setup)))
> 4445:      ;; Do ecomplete address snarfing.
> 4446:      (when (and (message-mail-alias-type-p 'ecomplete)
> 4447:		 (not message-inhibit-ecomplete))
> 4448:	(message-put-addresses-in-ecomplete))
> 8021:	    (message-inhibit-ecomplete t)
> 8344:`ecomplete', `message-self-insert-commands' should probably be
> 8414:	   (ecomplete-setup)
> 8420:	                (when (and (bound-and-true-p ecomplete-database)
> 8421:	                           (fboundp 'ecomplete-completion-table))
> 8423:                                           (ecomplete-completion-table 'mail)
> 8628:(declare-function ecomplete-add-item "ecomplete" (type key text))
> 8629:(declare-function ecomplete-save "ecomplete" ())
> 8631:(defun message-put-addresses-in-ecomplete ()
> 8632:  (require 'ecomplete)
> 8640:	(ecomplete-add-item 'mail (car (mail-header-parse-address string))
> 8642:  (ecomplete-save))
> 8644:(autoload 'ecomplete-display-matches "ecomplete")
> 8666:		    (ecomplete-display-matches 'mail word choose))))
> 8671:(defun message-ecomplete-capf ()
> 8674:  (when (and (bound-and-true-p ecomplete-database)
> 8675:             (fboundp 'ecomplete-completion-table)
> 8683:      `(,start ,end ,(ecomplete-completion-table 'mail)))))
>
> That's 40 hits, and looking at the matching lines, my first impression
> would not frankly be that ecomplete integration has been reduced to a
> bare minimum.

Maybe I didn't express myself clearly enough: I meant that the
ecomplete-based *completion* code has been clearly separated.
It basically only uses `ecomplete-completion-table`.

`message-ecomplete-capf` was my first attempt at doing it cleanly: it
was clean alright, but it didn't allow combing this backend with
another one.  It's not used any more but is kept in case someone else
still uses it.  The place where `ecomplete` completion is used nowadays
is `message--name-table` which is supposed to combine EUDC, BBDB, and
Ecomplete completion tables into a single completion table.


        Stefan




^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-06-28 15:49 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-23 15:26 Thoughts on Refactoring In-Buffer Completion In message.el Alexander Adolf
2022-06-25  4:35 ` Thomas Fitzsimmons
2022-06-27 15:48   ` Alexander Adolf
2022-06-25  8:22 ` Stefan Monnier
2022-06-27 16:37   ` Alexander Adolf
2022-06-28 15:49     ` Stefan Monnier

Code repositories for project(s) associated with this inbox:

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).