unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Dario Gjorgjevski <dario.gjorgjevski@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: 42149@debbugs.gnu.org, "João Távora" <joaotavora@gmail.com>
Subject: bug#42149: Substring and flex completion ignore implicit trailing ‘any’
Date: Wed, 14 Oct 2020 10:22:09 +0200	[thread overview]
Message-ID: <fv2zojpn5lxln2.fsf@gmail.com> (raw)
In-Reply-To: <fv2zoj5z8lq3dk.fsf@gmail.com> (Dario Gjorgjevski's message of "Thu, 10 Sep 2020 13:26:31 +0200")

[-- Attachment #1: Type: text/plain, Size: 313 bytes --]

Hi Stefan, hi João,

I rebased my patch on top of the fix introduced for bug#41705 and
confirmed that it does not cause a regression.  Have you been able to
look into it?  Please let me know if you think there’s something missing
or if I should add additional tests.

I am attaching the patch below.


[-- Attachment #2: Fix (and optimize) scoring in PCM completion --]
[-- Type: text/x-diff, Size: 13013 bytes --]

From e1d07804aeb155a5ff3b6a1c410ec757269a43d3 Mon Sep 17 00:00:00 2001
From: Dario Gjorgjevski <dario.gjorgjevski@gmail.com>
Date: Wed, 14 Oct 2020 10:06:40 +0200
Subject: [PATCH] Fix (and optimize) scoring in PCM completion
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lisp/minibuffer.el (completion-pcm--hilit-commonality): Fix scoring
to also include the last part of the query string.  This was
especially evident for single-character query strings, e.g.,
‘(completion-flex-all-completions "1" '("1" "foo1") nil 1)’ would
match both "1" and "foo1" with a completion-score of 0.  This
adjustment makes the completion-score of "1" be 1 and of "foo1" by
0.25.  Furthermore, fix ‘completions-first-difference’ and
‘completions-common-part’ sometimes overlapping.  See also bug#42149.
Furthermore, some optimizations are done.

(completion-pcm--optimize-pattern): Turn multiple consecutive
occurrences of ‘any’ into just a single one.

(completion-pcm--count-leading-holes): New function.

(completion-pcm--match-size): New function.

* test/lisp/minibuffer-tests.el (completion-pcm-all-completions-test,
completion-substring-all-completions-test,
completion-flex-all-completions-test): Regression tests.
---
 lisp/minibuffer.el            |  99 +++++++++++++++-----------
 test/lisp/minibuffer-tests.el | 127 ++++++++++++++++++++++++++++++++++
 2 files changed, 184 insertions(+), 42 deletions(-)

diff --git a/lisp/minibuffer.el b/lisp/minibuffer.el
index 427636e866..38bb4d0785 100644
--- a/lisp/minibuffer.el
+++ b/lisp/minibuffer.el
@@ -3110,6 +3110,7 @@ or a symbol, see `completion-pcm--merge-completions'."
     (while p
       (pcase p
         (`(,(or 'any 'any-delim) point . ,rest) (setq p `(point . ,rest)))
+	(`(any any . ,rest) (setq p `(any . ,rest)))
         ;; This is not just a performance improvement: it turns a
         ;; terminating `point' into an implicit `any', which affects
         ;; the final position of point (because `point' gets turned
@@ -3193,21 +3194,32 @@ one large \"hole\" and a clumped-together \"oo\" match) higher
 than the latter (which has two \"holes\" and three
 one-letter-long matches).")
 
+(defun completion-pcm--count-leading-holes (pattern)
+  "Count the number of leading holes in PATTERN."
+  (length (seq-take-while #'symbolp pattern)))
+
+(defun completion-pcm--match-size (pattern)
+  "Return the match size of PATTERN."
+  (apply #'+
+         (mapcar
+          (lambda (part) (if (stringp part) (length part) 0))
+          pattern)))
+
 (defun completion-pcm--hilit-commonality (pattern completions)
   (when completions
     (let* ((re (completion-pcm--pattern->regex pattern 'group))
            (point-idx (completion-pcm--pattern-point-idx pattern))
-           (case-fold-search completion-ignore-case))
+           (case-fold-search completion-ignore-case)
+           (num-leading-holes (completion-pcm--count-leading-holes pattern))
+           (score-numerator (float (completion-pcm--match-size pattern))))
       (mapcar
        (lambda (str)
-	 ;; Don't modify the string itself.
+         ;; Don't modify the string itself.
          (setq str (copy-sequence str))
          (unless (string-match re str)
            (error "Internal error: %s does not match %s" re str))
          (let* ((pos (if point-idx (match-beginning point-idx) (match-end 0)))
                 (md (match-data))
-                (start (pop md))
-                (end (pop md))
                 (len (length str))
                 ;; To understand how this works, consider these bad
                 ;; ascii(tm) diagrams showing how the pattern "foo"
@@ -3243,47 +3255,50 @@ one-letter-long matches).")
                 ;;    (SUM_across_i(hole_i_contrib) + 1) * len
                 ;;
                 ;; , where "len" is the string's length.
-                (score-numerator 0)
+                (full-match-start (pop md))
+                (full-match-end (pop md))
+                (match-start)
                 (score-denominator 0)
-                (last-b 0)
-                (update-score
-                 (lambda (a b)
-                   "Update score variables given match range (A B)."
-                   (setq
-                    score-numerator   (+ score-numerator (- b a)))
-                   (unless (or (= a last-b)
-                               (zerop last-b)
-                               (= a (length str)))
-                     (setq
-                      score-denominator (+ score-denominator
-                                           1
-                                           (expt (- a last-b 1)
-                                                 (/ 1.0
-                                                    flex-score-match-tightness)))))
-                   (setq
-                    last-b              b))))
-           (funcall update-score start start)
+                (hilit (lambda (match-start match-end)
+                         (add-face-text-property
+                          match-start match-end
+                          'completions-common-part
+                          nil str)
+                         ;; Maybe move `pos' away so we don not end up
+                         ;; putting `completions-first-difference' over
+                         ;; text that actually matches.
+                         (when (and (>= pos match-start) (< pos match-end))
+                           (setq pos match-end)))))
+           ;; Make sure that leading holes are explicitly discarded.
+           ;; Trailing holes are taken care of by
+           ;; `completion-pcm--optimize-pattern'.
+           (if (zerop num-leading-holes)
+               (setq match-start full-match-start)
+             (dotimes (_ (1- (* 2 num-leading-holes)))
+               (pop md))
+             (setq match-start (pop md)))
            (while md
-             (funcall update-score start (car md))
+             (let ((hole-start (pop md))
+                   (hole-end (pop md)))
+               (funcall hilit match-start hole-start)
+               (unless (= hole-start hole-end)
+                 (setq
+                  score-denominator (+ score-denominator
+                                       1
+                                       (expt
+                                        (- hole-end hole-start 1)
+                                        (/ 1.0 flex-score-match-tightness)))))
+               (setq match-start hole-end)))
+           (funcall hilit match-start full-match-end)
+           (when (> len pos)
              (add-face-text-property
-              start (pop md)
-              'completions-common-part
-              nil str)
-             (setq start (pop md)))
-           (funcall update-score len len)
-           (add-face-text-property
-            start end
-            'completions-common-part
-            nil str)
-           (if (> (length str) pos)
-               (add-face-text-property
-                pos (1+ pos)
-                'completions-first-difference
-                nil str))
-           (unless (zerop (length str))
-             (put-text-property
-              0 1 'completion-score
-              (/ score-numerator (* len (1+ score-denominator)) 1.0) str)))
+              pos (1+ pos)
+              'completions-first-difference
+              nil str))
+           (put-text-property
+            0 1
+            'completion-score
+            (/ score-numerator (* len (1+ score-denominator))) str))
          str)
        completions))))
 
diff --git a/test/lisp/minibuffer-tests.el b/test/lisp/minibuffer-tests.el
index 5da86f3614..a473fec441 100644
--- a/test/lisp/minibuffer-tests.el
+++ b/test/lisp/minibuffer-tests.el
@@ -104,5 +104,132 @@
                                                 nil (length input))
                      (cons output (length output)))))))
 
+(ert-deftest completion-pcm-all-completions-test ()
+  ;; Point is at end, this does not match anything
+  (should (equal
+           (completion-pcm-all-completions
+            "foo" '("hello" "world" "barfoobar") nil 3)
+           nil))
+  ;; Point is at beginning, this matches "barfoobar"
+  (should (equal
+           (car (completion-pcm-all-completions
+                 "foo" '("hello" "world" "barfoobar") nil 0))
+           "barfoobar"))
+  ;; Full match!
+  (should (eql
+           (get-text-property
+            0 'completion-score
+            (car (completion-pcm-all-completions
+                  "R" '("R" "hello") nil 1)))
+           1.0))
+  ;; One fourth of a match and no match due to point being at the end
+  (should (eql
+           (get-text-property
+            0 'completion-score
+            (car (completion-pcm-all-completions
+                  "RO" '("RaOb") nil 1)))
+           (/ 1.0 4.0)))
+  (should (equal
+           (completion-pcm-all-completions
+            "RO" '("RaOb") nil 2)
+           nil))
+  ;; Point is at beginning, but `completions-first-difference' is
+  ;; moved after it
+  (should (equal
+           (get-text-property
+            1 'face
+            (car (completion-pcm-all-completions
+                  "f" '("few" "many") nil 0)))
+           'completions-first-difference))
+  ;; Wildcards and delimiters work
+  (should (equal
+           (car (completion-pcm-all-completions
+                 "li-pac*" '("list-packages") nil 7))
+           "list-packages"))
+  (should (equal
+           (car (completion-pcm-all-completions
+                 "li-pac*" '("do-not-list-packages") nil 7))
+           nil)))
+
+(ert-deftest completion-substring-all-completions-test ()
+  ;; One third of a match!
+  (should (equal
+           (car (completion-substring-all-completions
+                 "foo" '("hello" "world" "barfoobar") nil 3))
+           "barfoobar"))
+  (should (eql
+           (get-text-property
+            0 'completion-score
+            (car (completion-substring-all-completions
+                  "foo" '("hello" "world" "barfoobar") nil 3)))
+           (/ 1.0 3.0)))
+  ;; Full match!
+  (should (eql
+           (get-text-property
+            0 'completion-score
+            (car (completion-substring-all-completions
+                  "R" '("R" "hello") nil 1)))
+           1.0))
+  ;; Substring match
+  (should (equal
+           (car (completion-substring-all-completions
+                  "custgroup" '("customize-group") nil 4))
+           "customize-group"))
+  (should (equal
+           (car (completion-substring-all-completions
+                  "custgroup" '("customize-group") nil 5))
+           nil))
+  ;; `completions-first-difference' should be at the right place
+  (should (equal
+           (get-text-property
+            4 'face
+            (car (completion-substring-all-completions
+                  "jab" '("dabjobstabby" "many") nil 1)))
+           'completions-first-difference))
+  (should (equal
+           (get-text-property
+            6 'face
+            (car (completion-substring-all-completions
+                  "jab" '("dabjabstabby" "many") nil 1)))
+           'completions-first-difference))
+  (should (equal
+           (get-text-property
+            6 'face
+            (car (completion-substring-all-completions
+                  "jab" '("dabjabstabby" "many") nil 3)))
+           'completions-first-difference)))
+
+(ert-deftest completion-flex-all-completions-test ()
+  ;; Fuzzy match
+  (should (equal
+           (car (completion-flex-all-completions
+                 "foo" '("hello" "world" "fabrobazo") nil 3))
+           "fabrobazo"))
+  ;; Full match!
+  (should (eql
+           (get-text-property
+            0 'completion-score
+            (car (completion-flex-all-completions
+                  "R" '("R" "hello") nil 1)))
+           1.0))
+  ;; Another fuzzy match, but more of a "substring" one
+  (should (equal
+           (car (completion-flex-all-completions
+                  "custgroup" '("customize-group-other-window") nil 4))
+           "customize-group-other-window"))
+  ;; `completions-first-difference' should be at the right place
+  (should (equal
+           (get-text-property
+            4 'face
+            (car (completion-flex-all-completions
+                  "custgroup" '("customize-group-other-window") nil 4)))
+           'completions-first-difference))
+  (should (equal
+           (get-text-property
+            15 'face
+            (car (completion-flex-all-completions
+                  "custgroup" '("customize-group-other-window") nil 9)))
+           'completions-first-difference)))
+
 (provide 'completion-tests)
 ;;; completion-tests.el ends here
-- 
2.17.1


[-- Attachment #3: Type: text/plain, Size: 173 bytes --]


Best regards,
Dario

-- 
dario.gjorgjevski@gmail.com :: +49 1525 8666837
%   gpg --keyserver 'hkps://hkps.pool.sks-keyservers.net' \
\`>     --recv-keys '744A4F0B4F1C9371'

  reply	other threads:[~2020-10-14  8:22 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-01 10:40 bug#42149: Substring and flex completion ignore implicit trailing ‘any’ Dario Gjorgjevski
2020-07-01 10:58 ` João Távora
2020-07-01 11:03 ` João Távora
2020-07-01 11:10   ` Dario Gjorgjevski
2020-09-08  9:05     ` Dario Gjorgjevski
2020-09-08  9:30       ` João Távora
2020-09-08  9:44         ` Dario Gjorgjevski
2020-09-08 10:08           ` João Távora
2020-09-08 11:12             ` Dario Gjorgjevski
2020-09-08 11:22               ` João Távora
2020-09-08 11:30                 ` Dario Gjorgjevski
2020-09-08 11:32                   ` João Távora
2020-09-09 10:17                     ` Dario Gjorgjevski
2020-09-09 11:38                       ` Dario Gjorgjevski
2020-09-09 13:13                         ` Stefan Monnier
2020-09-10 11:26                           ` Dario Gjorgjevski
2020-10-14  8:22                             ` Dario Gjorgjevski [this message]
2020-10-14  8:39                               ` João Távora
2020-10-14  9:01                                 ` Dario Gjorgjevski
2020-10-15 14:25                                   ` Dario Gjorgjevski
2020-11-20 20:39                                     ` Dario Gjorgjevski
2020-11-20 21:27                                       ` João Távora
2020-11-25  0:01                                         ` João Távora
2020-11-25  8:22                                           ` Dario Gjorgjevski
2020-11-25 12:22                                             ` João Távora
2020-11-25 13:27                                               ` Dario Gjorgjevski
2020-12-23  9:41                                                 ` Dario Gjorgjevski
2020-12-27 20:08                                                   ` João Távora
2020-12-27 20:23                                                     ` João Távora
2020-12-27 21:20                                                     ` Stefan Monnier
2020-12-28  9:30                                                       ` João Távora
2020-12-28 16:03                                                         ` Stefan Monnier
2020-12-28 16:58                                                           ` João Távora
2020-12-28 16:07                                                         ` Stefan Monnier
2020-12-28 17:04                                                           ` João Távora
2020-12-27 21:45                                     ` Stefan Monnier
2020-12-28  9:38                                       ` João Távora
2020-12-28 10:22                                         ` Dario Gjorgjevski
2020-12-28 11:34                                           ` João Távora
2020-12-28 11:48                                             ` Dario Gjorgjevski
2020-12-28 12:57                                               ` João Távora
2020-12-28 10:17                                       ` Dario Gjorgjevski
2020-12-28 16:26                                         ` Stefan Monnier
2020-12-28 17:16                                           ` João Távora
2020-12-28 19:48                                             ` Dario Gjorgjevski
2020-12-28 20:00                                               ` Stefan Monnier
2020-12-28 23:20                                                 ` João Távora
2020-12-29 13:27                                                   ` João Távora
2021-05-13  9:24                                                   ` Lars Ingebrigtsen
2021-05-13 14:31                                                     ` João Távora
2021-05-13 15:41                                                       ` Dario Gjorgjevski
2021-05-13 16:04                                                         ` João Távora
2021-05-16 13:51                                                           ` Lars Ingebrigtsen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fv2zojpn5lxln2.fsf@gmail.com \
    --to=dario.gjorgjevski@gmail.com \
    --cc=42149@debbugs.gnu.org \
    --cc=joaotavora@gmail.com \
    --cc=monnier@iro.umontreal.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).