unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
@ 2023-11-22 22:18 Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-23 12:57 ` João Távora
  0 siblings, 1 reply; 27+ messages in thread
From: Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-22 22:18 UTC (permalink / raw)
  To: 67390; +Cc: Joseph Turner, Adam Porter, João Távora

Hello,

`shorthands-font-lock-shorthands' assumes that the shorthand uses the
same separator as the real symbol name, e.g., "s-" instead of "long-".
This causes an additional character to be highlighted when an
alternative separator is used, e.g., when using "s/" instead of "long-".

`shorthands--mismatch-from-end' returns the length of the common suffix,
of the shorthand and the long name.

  (shorthands--mismatch-from-end "s-tail" "long-tail") => 5

  "s-" is highlighted in this case.

Since `shorthands-font-lock-shorthands' also wants to highlight the
separator, "-" in this case, it corrects this off-by-one, but when
different separators are in use, this causes an additional character
to be highlighted

  (shorthands--mismatch-from-end "s/tail" "long-tail") => 4

  "s/t" is highlighted in this case.

Could we add support for using an alternative separator in shorthands?

IMO it is okay to use another separator for *shorthand* prefixes.  They
won't show up in M-x completion candidates, for example.  Shorthands are
confined to the file where they are being used, and I think authors
should be free to use whatever prefix they fancy.

     Cheers,
     Jonas





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-22 22:18 bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-23 12:57 ` João Távora
  2023-11-24 21:51   ` Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 27+ messages in thread
From: João Távora @ 2023-11-23 12:57 UTC (permalink / raw)
  To: Jonas Bernoulli; +Cc: joseph, adam, 67390

On Wed, Nov 22, 2023 at 10:18 PM Jonas Bernoulli <jonas@bernoul.li> wrote:

> Could we add support for using an alternative separator in shorthands?

I think so, if you can find a patch for it.  This only affects the font-locking
bits of shorthands right?  IOW all other shorthand-aware functionality like
eldoc, M-., etc, already works with different separators, right?

> IMO it is okay to use another separator for *shorthand* prefixes.  They
> won't show up in M-x completion candidates, for example.  Shorthands are
> confined to the file where they are being used, and I think authors
> should be free to use whatever prefix they fancy.

I agree in principle, and exactly for that reason (that shorthands are confined
to a file).

João





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-23 12:57 ` João Távora
@ 2023-11-24 21:51   ` Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-25  0:03     ` João Távora
  0 siblings, 1 reply; 27+ messages in thread
From: Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-24 21:51 UTC (permalink / raw)
  To: João Távora; +Cc: joseph, adam, 67390

João Távora <joaotavora@gmail.com> writes:

> On Wed, Nov 22, 2023 at 10:18 PM Jonas Bernoulli <jonas@bernoul.li> wrote:
>
>> Could we add support for using an alternative separator in shorthands?
>
> I think so, if you can find a patch for it.  This only affects the font-locking
> bits of shorthands right?  IOW all other shorthand-aware functionality like
> eldoc, M-., etc, already works with different separators, right?

I haven't found any issues beside this off-by-one font-lock issue.

So far I have used this beauty:

diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
@@ -57,7 +57,7 @@ shorthands--mismatch-from-end
            for i from 1
            for i1 = (- l1 i) for i2 = (- l2 i)
            while (and (>= i1 0) (>= i2 0) (eq (aref str1 i1) (aref str2 i2)))
-           finally (return (1- i))))
+           finally (return (if (eq (aref str2 (1+ i2)) ?-) (1- i) i))))

Is that good enough?  Depending on how you look at it, this changes what
is being returned, but IMO this function is a bit murky to begin with.

The function name is `shorthands--mismatch-from-end', but it returns the
length of the common suffix, minus one, to account for the separator.
This change ensures that the separator is accounted for, even if it
differs between the shorthand and real symbol.

Since this function returns the length of the *matching* suffix after
the prefixes (including separator), I find it weird that its name
contains *MISmatch*.

It might make more sense to return the length of the shorthand prefix.

Also, have you considered throwing in a
  (not (string-equal (match-string 1) sname))

to avoid having to call `shorthands--mismatch-from-end' at all?
Maybe you have, but concluded it is cheaper to do a bit too much work
for non-shorthands, than to effectively repeat some work for shorthands.

     Jonas





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-24 21:51   ` Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-25  0:03     ` João Távora
  2023-11-25  3:26       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 27+ messages in thread
From: João Távora @ 2023-11-25  0:03 UTC (permalink / raw)
  To: Jonas Bernoulli; +Cc: joseph, adam, 67390

On Fri, Nov 24, 2023 at 9:51 PM Jonas Bernoulli <jonas@bernoul.li> wrote:

> I haven't found any issues beside this off-by-one font-lock issue.

Good.

> So far I have used this beauty:
>
> diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
> @@ -57,7 +57,7 @@ shorthands--mismatch-from-end
>             for i from 1
>             for i1 = (- l1 i) for i2 = (- l2 i)
>             while (and (>= i1 0) (>= i2 0) (eq (aref str1 i1) (aref str2 i2)))
> -           finally (return (1- i))))
> +           finally (return (if (eq (aref str2 (1+ i2)) ?-) (1- i) i))))
>
> Is that good enough?  Depending on how you look at it, this changes what
> is being returned, but IMO this function is a bit murky to begin with.

A bit dodgy, no :-)  Maybe a docstring would shed some light
on what this function is supposed to do, so I'll propose one below.

> The function name is `shorthands--mismatch-from-end', but it returns the
> length of the common suffix, minus one, to account for the separator.
> This change ensures that the separator is accounted for, even if it
> differs between the shorthand and real symbol.
>
> Since this function returns the length of the *matching* suffix after
> the prefixes (including separator), I find it weird that its name
> contains *MISmatch*.

Probably I wanted to emulate what CL's MISMATCH does to some degree.
cl-mismatch exists in Emacs, but it's not available in shorthands.el,
and it seems to be buggy anyway.

You can rename the function shorthands--common-suffix-length if
you want.  I probably meant it to be separator-agnostic function,
and be replaceable by a cl-mismatch some time around 2084,

Now to your problem: I think what you want is to customize element
comparison (CL's MISMATCH allows that, btw).  Here's one way.

diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
index 82200ab88e9..36a862bc037 100644
--- a/lisp/emacs-lisp/shorthands.el
+++ b/lisp/emacs-lisp/shorthands.el
@@ -52,11 +52,18 @@ elisp-shorthand-font-lock-face
   :version "28.1"
   :group 'font-lock-faces)

-(defun shorthands--mismatch-from-end (str1 str2)
-  (cl-loop with l1 = (length str1) with l2 = (length str2)
+(defun shorthands--mismatch-from-end (str1 str2 &optional test)
+  "Compute position of first mismatching element of STR1 and STR2, from end.
+The return value is the reverse index of that element.  If 0, the
+last characters of STR1 and STR2 differ, if 1, the penultimate
+characters differ, and so on.  If optional TEST is passed,
+compare elements using that function, else compare with `eq'."
+  (cl-loop with test = (or test #'eq)
+           with l1 = (length str1) with l2 = (length str2)
            for i from 1
            for i1 = (- l1 i) for i2 = (- l2 i)
-           while (and (>= i1 0) (>= i2 0) (eq (aref str1 i1) (aref str2 i2)))
+           while (and (>= i1 0) (>= i2 0)
+                      (funcall test (aref str1 i1) (aref str2 i2)))
            finally (return (1- i))))

 (defun shorthands-font-lock-shorthands (limit)

And now using the following lambda for TEST yields what you want:

(shorthands--mismatch-from-end "s-tail" "long-tail") ;; => 5

(shorthands--mismatch-from-end "s/tail" "long-tail"
                               (lambda (c1 c2)
                                 (or (and (eq c1 ?/) (eq c2 ?-))
                                     (eq c1 c2)))) ;; => also 5

Of course, you can hardcode this test inside the function, no need
for a parameter, and rename the function to whatever you find more
appropriate.  This allows us to keep control over what things
are considered acceptable separators from a font-locking perspective

> It might make more sense to return the length of the shorthand prefix.

I've been away from this code for a couple of years, so feel free to
propose more extensive changes.  As long as they don't increase the
complexity and are suitably tested, I have nothing against.

> Also, have you considered throwing in a
>   (not (string-equal (match-string 1) sname))
>
> to avoid having to call `shorthands--mismatch-from-end' at all?

Where and how this be thrown in?  How would you know what to highlight
in the shorthand printed form?  There can be many shorthand prefixes
in a given file.  But do show some code.

> Maybe you have, but concluded it is cheaper to do a bit too much work
> for non-shorthands, than to effectively repeat some work for shorthands.

Maybe.  Not sure this is more work (string-equal must still compare
every character right?), but, in summary, I'm not married to this
implementation.  I somewhat appreciate that I could still read it
after not having looked  at it for a couple years, but feel free to
change it.

João





^ permalink raw reply related	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-25  0:03     ` João Távora
@ 2023-11-25  3:26       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-25 16:01         ` Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 27+ messages in thread
From: Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-25  3:26 UTC (permalink / raw)
  To: João Távora; +Cc: 67390, Adam Porter, Jonas Bernoulli

[-- Attachment #1: Type: text/plain, Size: 1120 bytes --]

João Távora <joaotavora@gmail.com> writes:

> On Fri, Nov 24, 2023 at 9:51 PM Jonas Bernoulli <jonas@bernoul.li> wrote:
>
>> Also, have you considered throwing in a
>>   (not (string-equal (match-string 1) sname))
>>
>> to avoid having to call `shorthands--mismatch-from-end' at all?
>
> Where and how this be thrown in?  How would you know what to highlight
> in the shorthand printed form?  There can be many shorthand prefixes
> in a given file.  But do show some code.

Please see the attached patch, inspired by Jonas's comment.

>> Maybe you have, but concluded it is cheaper to do a bit too much work
>> for non-shorthands, than to effectively repeat some work for shorthands.
>
> Maybe.  Not sure this is more work (string-equal must still compare
> every character right?), but, in summary, I'm not married to this
> implementation.  I somewhat appreciate that I could still read it
> after not having looked  at it for a couple years, but feel free to
> change it.

I haven't done any benchmarking - I'm curious to learn how to benchmark
font lock though!

Thanks!!

Joseph


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Support-shorthand-prefixes-besides.patch --]
[-- Type: text/x-diff, Size: 2991 bytes --]

From e469c9d621116857bd1d84720eb2eb43d4560074 Mon Sep 17 00:00:00 2001
From: Joseph Turner <joseph@breatheoutbreathe.in>
Date: Fri, 24 Nov 2023 19:17:13 -0800
Subject: [PATCH] Support shorthand prefixes besides "-"

Previously, shorthands-font-lock-shorthands added font locking to the
shorthand prefix by checking for a mismatch between the shorthand and
longhand symbols.  This broke font-locking when the shorthand prefix
separator was not "-" (bug#67390).

Now, shorthands-font-lock-shorthands adds font locking to the
shorthand prefix by checking if any of the shorthand prefixes in
read-symbol-shorthands are a prefix for the current symbol name.

Thanks to Jonas Bernoulli for the idea to use
(not (string-equal (match-string 1) sname))
---
 lisp/emacs-lisp/shorthands.el | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
index 82200ab88e9..141b6115a3d 100644
--- a/lisp/emacs-lisp/shorthands.el
+++ b/lisp/emacs-lisp/shorthands.el
@@ -52,12 +52,14 @@
   :version "28.1"
   :group 'font-lock-faces)
 
-(defun shorthands--mismatch-from-end (str1 str2)
-  (cl-loop with l1 = (length str1) with l2 = (length str2)
-           for i from 1
-           for i1 = (- l1 i) for i2 = (- l2 i)
-           while (and (>= i1 0) (>= i2 0) (eq (aref str1 i1) (aref str2 i2)))
-           finally (return (1- i))))
+(defun shorthands--find-if (predicate seq &optional key)
+  "Find the first item satisfying PREDICATE in SEQ.
+Return the matching item, or nil if not found. Optional argument
+KEY is used to filter SEQ, as in `cl-find-if'."
+  (catch 'found
+    (dolist (el seq)
+      (when (funcall predicate (funcall (or key #'identity) el))
+        (throw 'found el)))))
 
 (defun shorthands-font-lock-shorthands (limit)
   (when read-symbol-shorthands
@@ -69,10 +71,15 @@
                                                font-lock-string-face)))
                          (intern-soft (match-string 1))))
              (sname (and probe (symbol-name probe)))
-             (mm (and sname (shorthands--mismatch-from-end
-                             (match-string 1) sname))))
-        (unless (or (null mm) (= mm (length sname)))
-          (add-face-text-property (match-beginning 1) (1+ (- (match-end 1) mm))
+             (prefix (and sname
+                          (not (string-equal (match-string 1) sname))
+                          (car (shorthands--find-if
+                                (lambda (short)
+                                  (string-prefix-p short (match-string 1)))
+                                read-symbol-shorthands #'car)))))
+        (when prefix
+          (add-face-text-property (match-beginning 1)
+                                  (+ (match-beginning 1) (length prefix))
                                   'elisp-shorthand-font-lock-face))))))
 
 (font-lock-add-keywords 'emacs-lisp-mode '((shorthands-font-lock-shorthands)) t)
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-25  3:26       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-25 16:01         ` Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-25 22:42           ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-29 13:30           ` João Távora
  0 siblings, 2 replies; 27+ messages in thread
From: Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-25 16:01 UTC (permalink / raw)
  To: Joseph Turner, João Távora; +Cc: 67390, Adam Porter

Joseph Turner <joseph@ushin.org> writes:

> +                          (car (shorthands--find-if
> +                                (lambda (short)
> +                                  (string-prefix-p short (match-string 1)))
> +                                read-symbol-shorthands #'car)))))

Or simply:
  (car (assoc (match-string 1)
              read-symbol-shorthands
              #'string-prefix-p))





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-25 16:01         ` Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-25 22:42           ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-26 13:52             ` João Távora
  2023-11-29 13:30           ` João Távora
  1 sibling, 1 reply; 27+ messages in thread
From: Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-25 22:42 UTC (permalink / raw)
  To: Jonas Bernoulli; +Cc: 67390, Adam Porter, João Távora

[-- Attachment #1: Type: text/plain, Size: 512 bytes --]

Jonas Bernoulli <jonas@bernoul.li> writes:

> Joseph Turner <joseph@ushin.org> writes:
>
>> +                          (car (shorthands--find-if
>> +                                (lambda (short)
>> +                                  (string-prefix-p short (match-string 1)))
>> +                                read-symbol-shorthands #'car)))))
>
> Or simply:
>   (car (assoc (match-string 1)
>               read-symbol-shorthands
>               #'string-prefix-p))

Much nicer - see patch.  Thanks, Jonas!


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Support-shorthand-prefixes-besides.patch --]
[-- Type: text/x-diff, Size: 2341 bytes --]

From 5033a0e14950ed1622db78df1828c40de4b5a23a Mon Sep 17 00:00:00 2001
From: Joseph Turner <joseph@breatheoutbreathe.in>
Date: Fri, 24 Nov 2023 19:17:13 -0800
Subject: [PATCH] Support shorthand prefixes besides "-"

* lisp/emacs-lisp/shorthands.el (shorthands-font-lock-shorthands):
Add font locking to the shorthand prefix by checking if any of the
shorthand prefixes in read-symbol-shorthands are a prefix for the
current symbol name (bug#67390).

Co-authored-by: Jonas Bernoulli <jonas@bernoul.li>
---
 lisp/emacs-lisp/shorthands.el | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
index 82200ab88e9..6ce9984a353 100644
--- a/lisp/emacs-lisp/shorthands.el
+++ b/lisp/emacs-lisp/shorthands.el
@@ -52,13 +52,6 @@
   :version "28.1"
   :group 'font-lock-faces)
 
-(defun shorthands--mismatch-from-end (str1 str2)
-  (cl-loop with l1 = (length str1) with l2 = (length str2)
-           for i from 1
-           for i1 = (- l1 i) for i2 = (- l2 i)
-           while (and (>= i1 0) (>= i2 0) (eq (aref str1 i1) (aref str2 i2)))
-           finally (return (1- i))))
-
 (defun shorthands-font-lock-shorthands (limit)
   (when read-symbol-shorthands
     (while (re-search-forward
@@ -69,10 +62,14 @@
                                                font-lock-string-face)))
                          (intern-soft (match-string 1))))
              (sname (and probe (symbol-name probe)))
-             (mm (and sname (shorthands--mismatch-from-end
-                             (match-string 1) sname))))
-        (unless (or (null mm) (= mm (length sname)))
-          (add-face-text-property (match-beginning 1) (1+ (- (match-end 1) mm))
+             (prefix (and sname
+                          (not (string-equal (match-string 1) sname))
+                          (car (assoc (match-string 1)
+                                      read-symbol-shorthands
+                                      #'string-prefix-p)))))
+        (when prefix
+          (add-face-text-property (match-beginning 1)
+                                  (+ (match-beginning 1) (length prefix))
                                   'elisp-shorthand-font-lock-face))))))
 
 (font-lock-add-keywords 'emacs-lisp-mode '((shorthands-font-lock-shorthands)) t)
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-25 22:42           ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-26 13:52             ` João Távora
  2023-11-26 20:35               ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 27+ messages in thread
From: João Távora @ 2023-11-26 13:52 UTC (permalink / raw)
  To: Joseph Turner; +Cc: 67390, Adam Porter, Jonas Bernoulli

On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph@ushin.org> wrote:
>
> Jonas Bernoulli <jonas@bernoul.li> writes:
>
> > Joseph Turner <joseph@ushin.org> writes:
> >
> >> +                          (car (shorthands--find-if
> >> +                                (lambda (short)
> >> +                                  (string-prefix-p short (match-string 1)))
> >> +                                read-symbol-shorthands #'car)))))
> >
> > Or simply:
> >   (car (assoc (match-string 1)
> >               read-symbol-shorthands
> >               #'string-prefix-p))
>
> Much nicer - see patch.  Thanks, Jonas!

So, I had a look at this patch and I think we should compare it
with the patch after my sig, which keeps 'shorthands--mismatch-from-end'
and also fixes this bug.

The main difference I see is that my patch keeps doing one string
comparison, via the mismatch function (which btw is now perfectly
analogous to CL mismatch and thus correctly named).  In the worst case,
Josheph's patch does 1 + N where N is the number of shorthands.  So
this is a fundamental complexity change.

Normally, that would be the end of the story, but here, it isn't.
For two reasons.

My version keeps a behaviour that can be considered buggy.
If a shorthand prefix has a common suffix with the longhand prefix
then that suffix will not be highlighted. Like:

;; Local Variables:
;; read-symbol-shorthands: (("bcrumb-" . "breadcrumb-")
;; End:

Here only "b" would be highlighted, effectively showing the user
how much typing was saved.  Is this wrong?  Does it makes sense
to use shorthands like this?

The other reason why this isn't the end of the story is that even
if we take that bug for granted, the string comparison functions in
Joshep's patch delegate to built-in C  comparison operators, which are
often much, much faster than Elisp.  At least before the advent of native
compilation, it used to be like this. Of course for a large enough  N
number of shorthands, my version wins, but that is probably not very
common either (or is it?  Not very hard to imagine a file making use
of many libraries and shorthanding each of them?)

So, benchmarking it will have to be, I'm afraid, because AFAIK
font-locking is a very performance sensitive area of Emacs.

In the meantime I will push my patch, but keep the bug open to see
if it is worth pushing Joseph's version.

João


diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
index 82200ab88e9..b0665a55695 100644
--- a/lisp/emacs-lisp/shorthands.el
+++ b/lisp/emacs-lisp/shorthands.el
@@ -53,11 +53,16 @@ elisp-shorthand-font-lock-face
   :group 'font-lock-faces)

 (defun shorthands--mismatch-from-end (str1 str2)
+  "Tell index of first mismatch in STR1 and STR2, from end.
+The index is a valid 0-based index on STR1.  Returns nil if STR1
+equals STR2.  Return 0 if STR1 is a suffix of STR2."
   (cl-loop with l1 = (length str1) with l2 = (length str2)
            for i from 1
            for i1 = (- l1 i) for i2 = (- l2 i)
-           while (and (>= i1 0) (>= i2 0) (eq (aref str1 i1) (aref str2 i2)))
-           finally (return (1- i))))
+           while (eq (aref str1 i1) (aref str2 i2))
+           if (zerop i2) return (if (zerop i1) nil i1)
+           if (zerop i1) return 0
+           finally (return i1)))

 (defun shorthands-font-lock-shorthands (limit)
   (when read-symbol-shorthands
@@ -69,10 +74,16 @@ shorthands-font-lock-shorthands
                                                font-lock-string-face)))
                          (intern-soft (match-string 1))))
              (sname (and probe (symbol-name probe)))
-             (mm (and sname (shorthands--mismatch-from-end
-                             (match-string 1) sname))))
-        (unless (or (null mm) (= mm (length sname)))
-          (add-face-text-property (match-beginning 1) (1+ (- (match-end 1) mm))
+             (mismatch (and sname (shorthands--mismatch-from-end
+                                   (match-string 1) sname)))
+             (guess (and mismatch (1+ mismatch))))
+        (when guess
+          (when (and (< guess (1- (length (match-string 1))))
+                     ;; In bug#67390 we allow other separators
+                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
+            (setq guess (1+ guess)))
+          (add-face-text-property (match-beginning 1)
+                                  (+ (match-beginning 1) guess)
                                   'elisp-shorthand-font-lock-face))))))

 (font-lock-add-keywords 'emacs-lisp-mode
'((shorthands-font-lock-shorthands)) t)





^ permalink raw reply related	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-26 13:52             ` João Távora
@ 2023-11-26 20:35               ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-26 22:02                 ` João Távora
  0 siblings, 1 reply; 27+ messages in thread
From: Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-26 20:35 UTC (permalink / raw)
  To: João Távora; +Cc: 67390, Adam Porter, Jonas Bernoulli

João Távora <joaotavora@gmail.com> writes:

> On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph@ushin.org> wrote:
>>
>> Jonas Bernoulli <jonas@bernoul.li> writes:
>>
>> > Joseph Turner <joseph@ushin.org> writes:
>> >
>> >> +                          (car (shorthands--find-if
>> >> +                                (lambda (short)
>> >> +                                  (string-prefix-p short (match-string 1)))
>> >> +                                read-symbol-shorthands #'car)))))
>> >
>> > Or simply:
>> >   (car (assoc (match-string 1)
>> >               read-symbol-shorthands
>> >               #'string-prefix-p))
>>
>> Much nicer - see patch.  Thanks, Jonas!
>
> So, I had a look at this patch and I think we should compare it
> with the patch after my sig, which keeps 'shorthands--mismatch-from-end'
> and also fixes this bug.
>
> The main difference I see is that my patch keeps doing one string
> comparison, via the mismatch function (which btw is now perfectly
> analogous to CL mismatch and thus correctly named).  In the worst case,
> Josheph's patch does 1 + N where N is the number of shorthands.  So
> this is a fundamental complexity change.
>
> Normally, that would be the end of the story, but here, it isn't.
> For two reasons.
>
> My version keeps a behaviour that can be considered buggy.
> If a shorthand prefix has a common suffix with the longhand prefix
> then that suffix will not be highlighted. Like:
>
> ;; Local Variables:
> ;; read-symbol-shorthands: (("bcrumb-" . "breadcrumb-")
> ;; End:
>
> Here only "b" would be highlighted, effectively showing the user
> how much typing was saved.  Is this wrong?  Does it makes sense
> to use shorthands like this?

I would expect the entire the shorthand to be highlit, I don't feeling
strongly about this.

> The other reason why this isn't the end of the story is that even
> if we take that bug for granted, the string comparison functions in
> Joshep's patch delegate to built-in C  comparison operators, which are
> often much, much faster than Elisp.  At least before the advent of native
> compilation, it used to be like this. Of course for a large enough  N
> number of shorthands, my version wins, but that is probably not very
> common either (or is it?  Not very hard to imagine a file making use
> of many libraries and shorthanding each of them?)
>
> So, benchmarking it will have to be, I'm afraid, because AFAIK
> font-locking is a very performance sensitive area of Emacs.

Yes.  I would like to learn how to do this!

> In the meantime I will push my patch, but keep the bug open to see
> if it is worth pushing Joseph's version.

Thank you!!  I'm happy to discuss this further if others are interested.

Joseph





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-26 20:35               ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-26 22:02                 ` João Távora
  2023-11-27  3:48                   ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-27 12:10                   ` Eli Zaretskii
  0 siblings, 2 replies; 27+ messages in thread
From: João Távora @ 2023-11-26 22:02 UTC (permalink / raw)
  To: Joseph Turner, Eli Zaretskii; +Cc: 67390, Adam Porter, Jonas Bernoulli

On Sun, Nov 26, 2023 at 8:38 PM Joseph Turner <joseph@ushin.org> wrote:
>
> João Távora <joaotavora@gmail.com> writes:
>
> > On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph@ushin.org> wrote:

> > So, benchmarking it will have to be, I'm afraid, because AFAIK
> > font-locking is a very performance sensitive area of Emacs.
>
> Yes.  I would like to learn how to do this!

I'm CCing Eli.

In the past, ISTR, Eli suggested to benchmark such things by visiting a
very large file in its beginning, then scrolling down by holding
the down arrow or PgDn for some fixed time period, like 30 seconds.
The  Emacs that scrolls the farthest is the most performant.  Not
entirely fail-proof (other processes may interfere, etc), but not
bad either.

So here you could create very large fictitious Elisp file with 0, 1, 3 and
10 shorthands each and then run your version vs my version and record
the final line numbers.  Then show the files and the numbers.

> > In the meantime I will push my patch, but keep the bug open to see
> > if it is worth pushing Joseph's version.
>
> Thank you!!

Now done in 36941e9e6a (master).

> I'm happy to discuss this further if others are interested.

I'm keeping this open.

João





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-26 22:02                 ` João Távora
@ 2023-11-27  3:48                   ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-27 12:10                   ` Eli Zaretskii
  1 sibling, 0 replies; 27+ messages in thread
From: Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-27  3:48 UTC (permalink / raw)
  To: João Távora; +Cc: 67390, Adam Porter, Eli Zaretskii, Jonas Bernoulli

João Távora <joaotavora@gmail.com> writes:

> In the past, ISTR, Eli suggested to benchmark such things by visiting a
> very large file in its beginning, then scrolling down by holding
> the down arrow or PgDn for some fixed time period, like 30 seconds.
> The  Emacs that scrolls the farthest is the most performant.  Not
> entirely fail-proof (other processes may interfere, etc), but not
> bad either.
>
> So here you could create very large fictitious Elisp file with 0, 1, 3 and
> 10 shorthands each and then run your version vs my version and record
> the final line numbers.  Then show the files and the numbers.

In a 2.5M Elisp file with 0 shorthand prefixes, after 30s of pressing
C-v (scroll-up-command), point was on line 19238.

I then tried the same file with 1 and 4 shorthands, and I got basically
the same result:

| With 1 shorthand prefix |       |
|-------------------------+-------|
| No patch                | 19447 |
| mismatch                | 19238 |
| compare all prefixes    | 19024 |

| With 4 shorthand prefixes |       |
|---------------------------+-------|
| No patch                  | 19211 |
| mismatch                  | 19521 |
| compare all prefixes      | 19339 |

There is a big margin of error (probably around 500-1000 lines) since my
method wasn't at all exact. I stopped holding C-v when the stopwatch on
another device hit 30s, and I might have held for ±1s.

If this approach to benchmarking is valid, I think it indicates that
shorthands has no significant effect on performance in either case,
though there may be a greater difference with more shorthand prefixes.

> My version keeps a behaviour that can be considered buggy.
> If a shorthand prefix has a common suffix with the longhand prefix
> then that suffix will not be highlighted. Like:

> ;; Local Variables:
> ;; read-symbol-shorthands: (("bcrumb-" . "breadcrumb-")
> ;; End:

> Here only "b" would be highlighted, effectively showing the user
> how much typing was saved.  Is this wrong?  Does it makes sense
> to use shorthands like this?

Another example case:

;; Local Variables:
;; read-symbol-shorthands: (("aw-" . "ace-window-")
;; End:

Here only "a" would be highlighted.

Thanks,

Joseph





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-26 22:02                 ` João Távora
  2023-11-27  3:48                   ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-27 12:10                   ` Eli Zaretskii
  2023-11-29  8:21                     ` João Távora
  1 sibling, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2023-11-27 12:10 UTC (permalink / raw)
  To: João Távora; +Cc: joseph, 67390, jonas, adam

> From: João Távora <joaotavora@gmail.com>
> Date: Sun, 26 Nov 2023 22:02:01 +0000
> Cc: Jonas Bernoulli <jonas@bernoul.li>, 67390@debbugs.gnu.org, 
> 	Adam Porter <adam@alphapapa.net>
> 
> On Sun, Nov 26, 2023 at 8:38 PM Joseph Turner <joseph@ushin.org> wrote:
> >
> > João Távora <joaotavora@gmail.com> writes:
> >
> > > On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph@ushin.org> wrote:
> 
> > > So, benchmarking it will have to be, I'm afraid, because AFAIK
> > > font-locking is a very performance sensitive area of Emacs.
> >
> > Yes.  I would like to learn how to do this!
> 
> I'm CCing Eli.
> 
> In the past, ISTR, Eli suggested to benchmark such things by visiting a
> very large file in its beginning, then scrolling down by holding
> the down arrow or PgDn for some fixed time period, like 30 seconds.
> The  Emacs that scrolls the farthest is the most performant.  Not
> entirely fail-proof (other processes may interfere, etc), but not
> bad either.

I still recommend this method.  Something like the below:

  (defun scroll-up-benchmark ()
    (interactive)
    (let ((oldgc gcs-done)
	  (oldtime (float-time)))
      (condition-case nil (while t (scroll-up) (redisplay))
	(error (message "GCs: %d Elapsed time: %f seconds"
			(- gcs-done oldgc) (- (float-time) oldtime))))))

Evaluate the above, and the invoke it at the beginning of a large
file.  Then compare the timings with different font-lock arrangements.

A variant is to scroll by N lines, not by pages.  Just change the
above to call scroll-up with the argument of N, for example 1 (or any
other number, if you want).





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-27 12:10                   ` Eli Zaretskii
@ 2023-11-29  8:21                     ` João Távora
  2023-11-29  9:12                       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 27+ messages in thread
From: João Távora @ 2023-11-29  8:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Joseph Turner, 67390, Jonas Bernoulli, Adam Porter

[-- Attachment #1: Type: text/plain, Size: 2083 bytes --]

On Mon, Nov 27, 2023, 12:12 Eli Zaretskii <eliz@gnu.org> wrote:

> > From: João Távora <joaotavora@gmail.com>
> > Date: Sun, 26 Nov 2023 22:02:01 +0000
> > Cc: Jonas Bernoulli <jonas@bernoul.li>, 67390@debbugs.gnu.org,
> >       Adam Porter <adam@alphapapa.net>
> >
> > On Sun, Nov 26, 2023 at 8:38 PM Joseph Turner <joseph@ushin.org> wrote:
> > >
> > > João Távora <joaotavora@gmail.com> writes:
> > >
> > > > On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph@ushin.org>
> wrote:
> >
> > > > So, benchmarking it will have to be, I'm afraid, because AFAIK
> > > > font-locking is a very performance sensitive area of Emacs.
> > >
> > > Yes.  I would like to learn how to do this!
> >
> > I'm CCing Eli.
> >
> > In the past, ISTR, Eli suggested to benchmark such things by visiting a
> > very large file in its beginning, then scrolling down by holding
> > the down arrow or PgDn for some fixed time period, like 30 seconds.
> > The  Emacs that scrolls the farthest is the most performant.  Not
> > entirely fail-proof (other processes may interfere, etc), but not
> > bad either.
>
> I still recommend this method.  Something like the below:
>
>   (defun scroll-up-benchmark ()
>     (interactive)
>     (let ((oldgc gcs-done)
>           (oldtime (float-time)))
>       (condition-case nil (while t (scroll-up) (redisplay))
>         (error (message "GCs: %d Elapsed time: %f seconds"
>                         (- gcs-done oldgc) (- (float-time) oldtime))))))
>
> Evaluate the above, and the invoke it at the beginning of a large
> file.  Then compare the timings with different font-lock arrangements.
>
> A variant is to scroll by N lines, not by pages.  Just change the
> above to call scroll-up with the argument of N, for example 1 (or any
> other number, if you want).
>

Joseph can you try these variations? They're slightly more exact. Also show
at least one of the large lisp files or tell me how to generate one. If you
still don't find any significant slowdown, I think we can merge your patch.

João

>

[-- Attachment #2: Type: text/html, Size: 3396 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-29  8:21                     ` João Távora
@ 2023-11-29  9:12                       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-29 13:56                         ` João Távora
  0 siblings, 1 reply; 27+ messages in thread
From: Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-11-29  9:12 UTC (permalink / raw)
  To: João Távora; +Cc: 67390, Adam Porter, Eli Zaretskii, Jonas Bernoulli

João Távora <joaotavora@gmail.com> writes:

> On Mon, Nov 27, 2023, 12:12 Eli Zaretskii <eliz@gnu.org> wrote:
>
>  > From: João Távora <joaotavora@gmail.com>
>  > Date: Sun, 26 Nov 2023 22:02:01 +0000
>  > Cc: Jonas Bernoulli <jonas@bernoul.li>, 67390@debbugs.gnu.org,
>  >       Adam Porter <adam@alphapapa.net>
>  >
>  > On Sun, Nov 26, 2023 at 8:38 PM Joseph Turner <joseph@ushin.org> wrote:
>  > >
>  > > João Távora <joaotavora@gmail.com> writes:
>  > >
>  > > > On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph@ushin.org> wrote:
>  >
>  > > > So, benchmarking it will have to be, I'm afraid, because AFAIK
>  > > > font-locking is a very performance sensitive area of Emacs.
>  > >
>  > > Yes.  I would like to learn how to do this!
>  >
>  > I'm CCing Eli.
>  >
>  > In the past, ISTR, Eli suggested to benchmark such things by visiting a
>  > very large file in its beginning, then scrolling down by holding
>  > the down arrow or PgDn for some fixed time period, like 30 seconds.
>  > The  Emacs that scrolls the farthest is the most performant.  Not
>  > entirely fail-proof (other processes may interfere, etc), but not
>  > bad either.
>
>  I still recommend this method.  Something like the below:
>
>    (defun scroll-up-benchmark ()
>      (interactive)
>      (let ((oldgc gcs-done)
>            (oldtime (float-time)))
>        (condition-case nil (while t (scroll-up) (redisplay))
>          (error (message "GCs: %d Elapsed time: %f seconds"
>                          (- gcs-done oldgc) (- (float-time) oldtime))))))
>
>  Evaluate the above, and the invoke it at the beginning of a large
>  file.  Then compare the timings with different font-lock arrangements.
>
>  A variant is to scroll by N lines, not by pages.  Just change the
>  above to call scroll-up with the argument of N, for example 1 (or any
>  other number, if you want).
>
> Joseph can you try these variations? They're slightly more exact. Also show at least one of the large lisp files or tell me how to generate
> one. If you still don't find any significant slowdown, I think we can merge your patch.

I'm happy to try Eli's variation if you don't beat me to it ;)

My large lisp file consisted of copy-pasting with a kbd macro the body of
https://git.sr.ht/~ushin/hyperdrive.el/tree/master/item/hyperdrive-lib.el
until the file reached ~50K lines -- well over the limit I expected to
reach on my machine.

One potential issue with the patch is that multiple shorthand prefixes
might match, while assoc will return the first matching cons pair
read-symbol-shorthand.

For example in hyperdrive-lib.el, we use the following shorthands in
order to display "//" instead of "/-" as the prefix separator for
private symbols, like "h//format-entry" instead of "h/-format-entry":

;; read-symbol-shorthands: (
;;   ("he//" . "hyperdrive-entry--")
;;   ("he/"  . "hyperdrive-entry-")
;;   ("h//"  . "hyperdrive--")
;;   ("h/"   . "hyperdrive-"))

However, if we rearrange the values like:

;; read-symbol-shorthands: (
;;   ("he/"  . "hyperdrive-entry-")
;;   ("he//" . "hyperdrive-entry--")
;;   ("h/"   . "hyperdrive-")
;;   ("h//"  . "hyperdrive--"))

then shorthands doesn't add fontification for either "h//" or "he//".
(This surprised me - I was expecting to see just the "h/" and "he/"
highlit)

However, I'm starting to wonder whether this is a case of user error.
Should we avoid overlapping shorthand prefixes?

Thank you!!

Joseph






^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-25 16:01         ` Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-11-25 22:42           ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-29 13:30           ` João Távora
  2023-11-29 23:27             ` João Távora
  1 sibling, 1 reply; 27+ messages in thread
From: João Távora @ 2023-11-29 13:30 UTC (permalink / raw)
  To: Jonas Bernoulli; +Cc: Joseph Turner, 67390, Adam Porter

On Sat, Nov 25, 2023 at 4:03 PM Jonas Bernoulli via Bug reports for
GNU Emacs, the Swiss army knife of text editors
<bug-gnu-emacs@gnu.org> wrote:
>
> Joseph Turner <joseph@ushin.org> writes:
>
> > +                          (car (shorthands--find-if
> > +                                (lambda (short)
> > +                                  (string-prefix-p short (match-string 1)))
> > +                                read-symbol-shorthands #'car)))))
>
> Or simply:
>   (car (assoc (match-string 1)
>               read-symbol-shorthands
>               #'string-prefix-p))

I don't think it works, at least in my 'assoc', the order
of string-prefix-p arguments must be switched.  Pity
assoc or string-prefix-p decs didn't coordinate this.

(assoc probe read-symbol-shorthands (lambda (a b) (string-prefix-p b a)))

works





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-29  9:12                       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-11-29 13:56                         ` João Távora
  0 siblings, 0 replies; 27+ messages in thread
From: João Távora @ 2023-11-29 13:56 UTC (permalink / raw)
  To: Joseph Turner; +Cc: 67390, Adam Porter, Eli Zaretskii, Jonas Bernoulli

On Wed, Nov 29, 2023 at 9:29 AM Joseph Turner <joseph@ushin.org> wrote:

> > Joseph can you try these variations? They're slightly more exact. Also show at least one of the large lisp files or tell me how to generate
> > one. If you still don't find any significant slowdown, I think we can merge your patch.
>
> I'm happy to try Eli's variation if you don't beat me to it ;)

Great.  No I don't think I will beat you to it :-), you have the
machinery setup already.

> However, if we rearrange the values like:
>
> ;; read-symbol-shorthands: (
> ;;   ("he/"  . "hyperdrive-entry-")
> ;;   ("he//" . "hyperdrive-entry--")
> ;;   ("h/"   . "hyperdrive-")
> ;;   ("h//"  . "hyperdrive--"))
>
> then shorthands doesn't add fontification for either "h//" or "he//".
> (This surprised me - I was expecting to see just the "h/" and "he/"
> highlit)
>
> However, I'm starting to wonder whether this is a case of user error.
> Should we avoid overlapping shorthand prefixes?

No, but you should make sure the longer prefixes appear before.
It's not only a font-locking issue, if the shorter shorthand
appears before it could read to an unintended symbol.  If shorthands
are

  (("h/" . "hello-") ("h//" . "hyperdrive--"))

and you type

  "h//warp"

, I'm fairly sure this will be read to

    "hello-/warp"

, which is probably not what you intended

The documentation doesn't mention this, but it should, so patches
welcome for that.

In theory, lread.c could do this sorting for you.  I don't think
this would affect performance, so patches welcome for that,
too.

Anyway, here's another patch that more or less bridges our two patches
and seems to do the right thing every time, even making you aware of
that ordering pitfall.

It could be more performant (or less, or about the same), so if
you can, please include it in your benchmarks.  It does less
work when presented with a "/"-ending shorthand for a "-"-separated
symbol, so I'm starting to think that this "/" idea is a good
convention for shorthands (useful for distinguishing them in
github code, for example).

I've tested it briefly with those 'he/', 'he//', etc cases.

diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
index b0665a55695..69e9e252aee 100644
--- a/lisp/emacs-lisp/shorthands.el
+++ b/lisp/emacs-lisp/shorthands.el
@@ -76,14 +76,19 @@ shorthands-font-lock-shorthands
              (sname (and probe (symbol-name probe)))
              (mismatch (and sname (shorthands--mismatch-from-end
                                    (match-string 1) sname)))
-             (guess (and mismatch (1+ mismatch))))
-        (when guess
-          (when (and (< guess (1- (length (match-string 1))))
-                     ;; In bug#67390 we allow other separators
-                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
-            (setq guess (1+ guess)))
+             probe2)
+        (when mismatch
+          (unless (eq (char-syntax (aref (match-string 1) mismatch)) ?_)
+            (or probe2
+                (setq probe2
+                      (buffer-substring (match-beginning 1)
+                                        (+ (match-beginning 1) mismatch))))
+            (setq mismatch (1- (length
+                                (car (assoc probe2
+                                            read-symbol-shorthands
+                                            (lambda (a b)
(string-prefix-p b a))))))))
           (add-face-text-property (match-beginning 1)
-                                  (+ (match-beginning 1) guess)
+                                  (+ (match-beginning 1) (1+ mismatch))
                                   'elisp-shorthand-font-lock-face))))))

 (font-lock-add-keywords 'emacs-lisp-mode
'((shorthands-font-lock-shorthands)) t)





^ permalink raw reply related	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-29 13:30           ` João Távora
@ 2023-11-29 23:27             ` João Távora
  2023-11-30 14:16               ` João Távora
  0 siblings, 1 reply; 27+ messages in thread
From: João Távora @ 2023-11-29 23:27 UTC (permalink / raw)
  To: Jonas Bernoulli; +Cc: Joseph Turner, 67390, Adam Porter

On Wed, Nov 29, 2023 at 1:30 PM João Távora <joaotavora@gmail.com> wrote:
>
> On Sat, Nov 25, 2023 at 4:03 PM Jonas Bernoulli via Bug reports for
> GNU Emacs, the Swiss army knife of text editors
> <bug-gnu-emacs@gnu.org> wrote:
> >
> > Joseph Turner <joseph@ushin.org> writes:
> >
> > > +                          (car (shorthands--find-if
> > > +                                (lambda (short)
> > > +                                  (string-prefix-p short (match-string 1)))
> > > +                                read-symbol-shorthands #'car)))))
> >
> > Or simply:
> >   (car (assoc (match-string 1)
> >               read-symbol-shorthands
> >               #'string-prefix-p))
>
> I don't think it works, at least in my 'assoc', the order
> of string-prefix-p arguments must be switched.  Pity
> assoc or string-prefix-p decs didn't coordinate this.

nevermind, it does work if what you want is to see if the cars
of the alist are prefixes to the key, which is probably your
intention in this snippet.





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-29 23:27             ` João Távora
@ 2023-11-30 14:16               ` João Távora
  2023-11-30 15:23                 ` Eli Zaretskii
  2023-12-09 18:50                 ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 2 replies; 27+ messages in thread
From: João Távora @ 2023-11-30 14:16 UTC (permalink / raw)
  To: Jonas Bernoulli, Eli Zaretskii; +Cc: Joseph Turner, 67390, Adam Porter

Hi all,

I've been working on all these shorthand-related issues over the last
two days and I have reasonably short fixes for all of them.

For this particular issue (bug#67309), I've opted to
use Joseph's patch with very slight adjustments, as it's the
only one that guarantees correct behaviour and doesn't seem
to impact performance.

The other issues are:

bug#63480 (loaddefs-gen.el doesn't know about shorthands)
bug#67325 (prefix discovery i.e. register-definition-prefixes)
bug#67523 (check-declare.el doesn't know about shorthands)

I have all this in 6 commits in the bugfix/shorthand-fixes branch.

Here's the full patch minus whitespace changes.  If there are
no comments I'll push in a few days' time.

João

diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
index 1f3b677d7fb..18e80311177 100644
--- a/doc/lispref/symbols.texi
+++ b/doc/lispref/symbols.texi
@@ -761,6 +761,23 @@ Shorthands
 ;; End:
 @end example

+Note that if you have two shorthands in the same file where one is the
+prefix of the other, the longer shorthand will be attempted first.
+This happens regardless of the order you specify shorthands in the
+local variables section of your file.
+
+@example
+'(
+  t//foo ; reads to 'my-tricks--foo', not 'my-tricks-/foo'
+  t/foo  ; reads to 'my-tricks-foo'
+  )
+
+;; Local Variables:
+;; read-symbol-shorthands: (("t/" . "my-tricks-")
+;;                          ("t//" . "my-tricks--")
+;; End:
+@end example
+
 @subsection Exceptions

 There are two exceptions to rules governing Shorthand transformations:
diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el
index c887d95210c..b19aedf314d 100644
--- a/lisp/emacs-lisp/check-declare.el
+++ b/lisp/emacs-lisp/check-declare.el
@@ -145,21 +145,26 @@ check-declare-verify
     (if (file-regular-p fnfile)
         (with-temp-buffer
           (insert-file-contents fnfile)
+          (unless cflag
+            ;; If in Elisp, ensure syntax and shorthands available
+            (set-syntax-table emacs-lisp-mode-syntax-table)
+            (let (enable-local-variables) (hack-local-variables)))
           ;; defsubst's don't _have_ to be known at compile time.
-          (setq re (format (if cflag
-                               "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
+          (setq re (if cflag
+                       (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
+                               (regexp-opt (mapcar 'cadr fnlist) t))
                      "^[ \t]*(\\(fset[ \t]+'\\|\
 cl-def\\(?:generic\\|method\\|un\\)\\|\
 def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
 ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
 \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
 ine-overloadable-function\\)\\)\
-[ \t]*%s\\([ \t;]+\\|$\\)")
-                           (regexp-opt (mapcar 'cadr fnlist) t)))
+[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
           (while (re-search-forward re nil t)
             (skip-chars-forward " \t\n")
-            (setq fn (match-string 2)
-                  type (match-string 1)
+            (setq fn (symbol-name (car (read-from-string (match-string 2)))))
+            (when (member fn (mapcar 'cadr fnlist))
+              (setq type (match-string 1)
                     ;; (min . max) for a fixed number of arguments, or
                     ;; arglists with optional elements.
                     ;; (min) for arglists with &rest.
@@ -202,7 +207,7 @@ check-declare-verify
                               (t
                                'err))
                     ;; alist of functions and arglist signatures.
-                  siglist (cons (cons fn sig) siglist)))))
+                    siglist (cons (cons fn sig) siglist))))))
     (dolist (e fnlist)
       (setq arglist (nth 2 e)
             type
diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
index 04bea4723a2..e8093200bec 100644
--- a/lisp/emacs-lisp/loaddefs-gen.el
+++ b/lisp/emacs-lisp/loaddefs-gen.el
@@ -378,6 +378,7 @@ loaddefs-generate--parse-file
   (let ((defs nil)
         (load-name (loaddefs-generate--file-load-name file main-outfile))
         (compute-prefixes t)
+        read-symbol-shorthands
         local-outfile inhibit-autoloads)
     (with-temp-buffer
       (insert-file-contents file)
@@ -399,7 +400,19 @@ loaddefs-generate--parse-file
             (setq inhibit-autoloads (read (current-buffer)))))
         (save-excursion
           (when (re-search-forward "autoload-compute-prefixes: *" nil t)
-            (setq compute-prefixes (read (current-buffer))))))
+            (setq compute-prefixes (read (current-buffer)))))
+        (save-excursion
+          ;; since we're "open-coding" we have to repeat more
+          ;; complicated logic in `hack-local-variables'.
+          (when (re-search-forward "read-symbol-shorthands: *" nil t)
+            (let* ((commentless (replace-regexp-in-string
+                                 "\n\\s-*;+" ""
+                                 (buffer-substring (point) (point-max))))
+                   (unsorted-shorthands (car (read-from-string commentless))))
+              (setq read-symbol-shorthands
+                    (sort unsorted-shorthands
+                          (lambda (sh1 sh2)
+                            (> (length (car sh1)) (length (car sh2))))))))))

       ;; We always return the package version (even for pre-dumped
       ;; files).
@@ -486,7 +499,11 @@ loaddefs-generate--compute-prefixes
     (while (re-search-forward
             "^(\\(def[^ \t\n]+\\)[ \t\n]+['(]*\\([^' ()\"\n]+\\)[\n \t]" nil t)
       (unless (member (match-string 1) autoload-ignored-definitions)
-        (let ((name (match-string-no-properties 2)))
+        (let* ((name (match-string-no-properties 2))
+               ;; Consider `read-symbol-shorthands'.
+               (probe (let ((obarray (obarray-make)))
+                        (car (read-from-string name)))))
+          (setq name (symbol-name probe))
           (when (save-excursion
                   (goto-char (match-beginning 0))
                   (or (bobp)
diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
index b0665a55695..69b562e3c7e 100644
--- a/lisp/emacs-lisp/shorthands.el
+++ b/lisp/emacs-lisp/shorthands.el
@@ -52,38 +52,26 @@ elisp-shorthand-font-lock-face
   :version "28.1"
   :group 'font-lock-faces)

-(defun shorthands--mismatch-from-end (str1 str2)
-  "Tell index of first mismatch in STR1 and STR2, from end.
-The index is a valid 0-based index on STR1.  Returns nil if STR1
-equals STR2.  Return 0 if STR1 is a suffix of STR2."
-  (cl-loop with l1 = (length str1) with l2 = (length str2)
-           for i from 1
-           for i1 = (- l1 i) for i2 = (- l2 i)
-           while (eq (aref str1 i1) (aref str2 i2))
-           if (zerop i2) return (if (zerop i1) nil i1)
-           if (zerop i1) return 0
-           finally (return i1)))
-
 (defun shorthands-font-lock-shorthands (limit)
+  "Font lock until LIMIT considering `read-symbol-shorthands'."
   (when read-symbol-shorthands
     (while (re-search-forward
             (concat "\\_<\\(" (rx lisp-mode-symbol) "\\)\\_>")
             limit t)
       (let* ((existing (get-text-property (match-beginning 1) 'face))
+             (print-name (match-string 1))
              (probe (and (not (memq existing '(font-lock-comment-face
                                                font-lock-string-face)))
-                         (intern-soft (match-string 1))))
-             (sname (and probe (symbol-name probe)))
-             (mismatch (and sname (shorthands--mismatch-from-end
-                                   (match-string 1) sname)))
-             (guess (and mismatch (1+ mismatch))))
-        (when guess
-          (when (and (< guess (1- (length (match-string 1))))
-                     ;; In bug#67390 we allow other separators
-                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
-            (setq guess (1+ guess)))
+                         (intern-soft print-name)))
+             (symbol-name (and probe (symbol-name probe)))
+             (prefix (and symbol-name
+                          (not (string-equal print-name symbol-name))
+                          (car (assoc print-name
+                                      read-symbol-shorthands
+                                      #'string-prefix-p)))))
+        (when prefix
           (add-face-text-property (match-beginning 1)
-                                  (+ (match-beginning 1) guess)
+                                  (+ (match-beginning 1) (length prefix))
                                   'elisp-shorthand-font-lock-face))))))

 (font-lock-add-keywords 'emacs-lisp-mode
'((shorthands-font-lock-shorthands)) t)
diff --git a/lisp/files.el b/lisp/files.el
index 1cdcec23b11..b266d0727ec 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -3735,7 +3735,8 @@ before-hack-local-variables-hook
 This hook is called only if there is at least one file-local
 variable to set.")

-(defvar permanently-enabled-local-variables '(lexical-binding)
+(defvar permanently-enabled-local-variables
+  '(lexical-binding read-symbol-shorthands)
   "A list of file-local variables that are always enabled.
 This overrides any `enable-local-variables' setting.")

@@ -4171,6 +4172,13 @@ hack-local-variables--find-variables
                                 ;; to use 'thisbuf's name in the
                                 ;; warning message.
                                 (or (buffer-file-name thisbuf) ""))))))
+                          ((eq var 'read-symbol-shorthands)
+                           ;; Sort automatically by shorthand length
+                           ;; descending
+                           (setq val (sort val
+                                           (lambda (sh1 sh2) (>
(length (car sh1))
+
(length (car sh2))))))
+                           (push (cons 'read-symbol-shorthands val) result))
                           ((and (eq var 'mode) handle-mode))
    (t
     (ignore-errors





^ permalink raw reply related	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-30 14:16               ` João Távora
@ 2023-11-30 15:23                 ` Eli Zaretskii
  2023-11-30 15:29                   ` João Távora
  2023-12-09 18:50                 ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 27+ messages in thread
From: Eli Zaretskii @ 2023-11-30 15:23 UTC (permalink / raw)
  To: João Távora; +Cc: joseph, 67390, jonas, adam

> From: João Távora <joaotavora@gmail.com>
> Date: Thu, 30 Nov 2023 14:16:51 +0000
> Cc: Joseph Turner <joseph@ushin.org>, 67390@debbugs.gnu.org, 
> 	Adam Porter <adam@alphapapa.net>
> 
> Hi all,
> 
> I've been working on all these shorthand-related issues over the last
> two days and I have reasonably short fixes for all of them.
> 
> For this particular issue (bug#67309), I've opted to
> use Joseph's patch with very slight adjustments, as it's the
> only one that guarantees correct behaviour and doesn't seem
> to impact performance.
> 
> The other issues are:
> 
> bug#63480 (loaddefs-gen.el doesn't know about shorthands)
> bug#67325 (prefix discovery i.e. register-definition-prefixes)
> bug#67523 (check-declare.el doesn't know about shorthands)
> 
> I have all this in 6 commits in the bugfix/shorthand-fixes branch.

Thanks.

> Here's the full patch minus whitespace changes.  If there are
> no comments I'll push in a few days' time.

The plan is to merge these to the master branch, right?





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-30 15:23                 ` Eli Zaretskii
@ 2023-11-30 15:29                   ` João Távora
  0 siblings, 0 replies; 27+ messages in thread
From: João Távora @ 2023-11-30 15:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: joseph, 67390, jonas, adam

On Thu, Nov 30, 2023 at 3:23 PM Eli Zaretskii <eliz@gnu.org> wrote:

> > I have all this in 6 commits in the bugfix/shorthand-fixes branch.
>
> Thanks.
>
> > Here's the full patch minus whitespace changes.  If there are
> > no comments I'll push in a few days' time.
>
> The plan is to merge these to the master branch, right?

Yes





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-11-30 14:16               ` João Távora
  2023-11-30 15:23                 ` Eli Zaretskii
@ 2023-12-09 18:50                 ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-02-03  7:10                   ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 27+ messages in thread
From: Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-12-09 18:50 UTC (permalink / raw)
  To: João Távora; +Cc: 67390, Adam Porter, Eli Zaretskii, Jonas Bernoulli

Hi João!  Thanks for your patience - preparing for EmacsConf was a blast,
and now I'm on a plane to go visit my grandmother!

João Távora <joaotavora@gmail.com> writes:

> Hi all,
>
> I've been working on all these shorthand-related issues over the last
> two days and I have reasonably short fixes for all of them.
>
> For this particular issue (bug#67309), I've opted to
> use Joseph's patch with very slight adjustments, as it's the
> only one that guarantees correct behaviour and doesn't seem
> to impact performance.
>
> The other issues are:
>
> bug#63480 (loaddefs-gen.el doesn't know about shorthands)
> bug#67325 (prefix discovery i.e. register-definition-prefixes)
> bug#67523 (check-declare.el doesn't know about shorthands)
>
> I have all this in 6 commits in the bugfix/shorthand-fixes branch.
>
> Here's the full patch minus whitespace changes.  If there are
> no comments I'll push in a few days' time.
>
> João
>
> diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
> index 1f3b677d7fb..18e80311177 100644
> --- a/doc/lispref/symbols.texi
> +++ b/doc/lispref/symbols.texi
> @@ -761,6 +761,23 @@ Shorthands
>  ;; End:
>  @end example
>
> +Note that if you have two shorthands in the same file where one is the
> +prefix of the other, the longer shorthand will be attempted first.
> +This happens regardless of the order you specify shorthands in the
> +local variables section of your file.
> +
> +@example
> +'(
> +  t//foo ; reads to 'my-tricks--foo', not 'my-tricks-/foo'
> +  t/foo  ; reads to 'my-tricks-foo'
> +  )
> +
> +;; Local Variables:
> +;; read-symbol-shorthands: (("t/" . "my-tricks-")
> +;;                          ("t//" . "my-tricks--")
> +;; End:
> +@end example
> +
>  @subsection Exceptions

Clear and concise.

>  There are two exceptions to rules governing Shorthand transformations:
> diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el
> index c887d95210c..b19aedf314d 100644
> --- a/lisp/emacs-lisp/check-declare.el
> +++ b/lisp/emacs-lisp/check-declare.el
> @@ -145,21 +145,26 @@ check-declare-verify
>      (if (file-regular-p fnfile)
>          (with-temp-buffer
>            (insert-file-contents fnfile)
> +          (unless cflag
> +            ;; If in Elisp, ensure syntax and shorthands available
> +            (set-syntax-table emacs-lisp-mode-syntax-table)
> +            (let (enable-local-variables) (hack-local-variables)))
>            ;; defsubst's don't _have_ to be known at compile time.
> -          (setq re (format (if cflag
> -                               "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
> +          (setq re (if cflag
> +                       (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
> +                               (regexp-opt (mapcar 'cadr fnlist) t))
>                       "^[ \t]*(\\(fset[ \t]+'\\|\
>  cl-def\\(?:generic\\|method\\|un\\)\\|\
>  def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
>  ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
>  \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
>  ine-overloadable-function\\)\\)\
> -[ \t]*%s\\([ \t;]+\\|$\\)")
> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))

Would you explain what this regexp is intended to match?

>            (while (re-search-forward re nil t)
>              (skip-chars-forward " \t\n")
> -            (setq fn (match-string 2)
> -                  type (match-string 1)
> +            (setq fn (symbol-name (car (read-from-string (match-string 2)))))
> +            (when (member fn (mapcar 'cadr fnlist))
> +              (setq type (match-string 1)
>                      ;; (min . max) for a fixed number of arguments, or
>                      ;; arglists with optional elements.
>                      ;; (min) for arglists with &rest.
> @@ -202,7 +207,7 @@ check-declare-verify
>                                (t
>                                 'err))
>                      ;; alist of functions and arglist signatures.
> -                  siglist (cons (cons fn sig) siglist)))))
> +                    siglist (cons (cons fn sig) siglist))))))
>      (dolist (e fnlist)
>        (setq arglist (nth 2 e)
>              type

On my machine, this patch removes some of the check-declare "function
not found" errors, but not all.  For example, with hyperdrive-lib.el:

(check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")

Before this patch, the "*Check Declarations Warnings*" buffer shows:

--8<---------------cut here---------------start------------->8---
 ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
    ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
 ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
    in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
    found
 ■ hyperdrive-lib.el:1283:Warning (check-declare): said ‘h/org--link-goto’ was
    defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-org.el: function
    not found
 ■ hyperdrive-lib.el:45:Warning (check-declare): said ‘h/dir-mode’ was defined
    in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
 ■ hyperdrive-lib.el:1069:Warning (check-declare): said
    ‘h/dir--entry-at-point’ was defined in
    ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
 ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
    defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
    not found
--8<---------------cut here---------------end--------------->8---

and after your patch:

--8<---------------cut here---------------start------------->8---
 ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
    ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
 ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
    in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
    found
 ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
    defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
    not found
--8<---------------cut here---------------end--------------->8---

Are you able to reproduce this on your machine?

> diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
> index 04bea4723a2..e8093200bec 100644
> --- a/lisp/emacs-lisp/loaddefs-gen.el
> +++ b/lisp/emacs-lisp/loaddefs-gen.el
> @@ -378,6 +378,7 @@ loaddefs-generate--parse-file
>    (let ((defs nil)
>          (load-name (loaddefs-generate--file-load-name file main-outfile))
>          (compute-prefixes t)
> +        read-symbol-shorthands
>          local-outfile inhibit-autoloads)
>      (with-temp-buffer
>        (insert-file-contents file)
> @@ -399,7 +400,19 @@ loaddefs-generate--parse-file
>              (setq inhibit-autoloads (read (current-buffer)))))
>          (save-excursion
>            (when (re-search-forward "autoload-compute-prefixes: *" nil t)
> -            (setq compute-prefixes (read (current-buffer))))))
> +            (setq compute-prefixes (read (current-buffer)))))
> +        (save-excursion
> +          ;; since we're "open-coding" we have to repeat more
> +          ;; complicated logic in `hack-local-variables'.
> +          (when (re-search-forward "read-symbol-shorthands: *" nil t)
> +            (let* ((commentless (replace-regexp-in-string
> +                                 "\n\\s-*;+" ""
> +                                 (buffer-substring (point) (point-max))))
> +                   (unsorted-shorthands (car (read-from-string commentless))))
> +              (setq read-symbol-shorthands
> +                    (sort unsorted-shorthands
> +                          (lambda (sh1 sh2)
> +                            (> (length (car sh1)) (length (car sh2))))))))))

IIUC, the intention here is to jump to a final "Local Variables"
declaration at the end of the file, then remove ";;", then read in the
uncommented value of `read-symbol-shorthands'.

Since `read-from-string' just reads one expression, the above hunk works
when there are more local variables after read-symbol-shorthands:

;; Local Variables:
;; read-symbol-shorthands: (("bc-" . "breadcrumb-"))
;; autoload-compute-prefixes: nil
;; End:

But if the read-symbol-shorthands declaration comes at the top, as in...

-*- read-symbol-shorthands: (("bc-" . "breadcrumb-")); -*-

...then this form will allocate two strings almost as long as the file.

Here's an alternative hack attempting to uncomment and read the minimum:

diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
index e8093200bec..406e4b28f1f 100644
--- a/lisp/emacs-lisp/loaddefs-gen.el
+++ b/lisp/emacs-lisp/loaddefs-gen.el
@@ -404,10 +404,13 @@ don't include."
         (save-excursion
           ;; since we're "open-coding" we have to repeat more
           ;; complicated logic in `hack-local-variables'.
-          (when (re-search-forward "read-symbol-shorthands: *" nil t)
-            (let* ((commentless (replace-regexp-in-string
+          (when-let ((beg
+                      (re-search-forward "read-symbol-shorthands: *" nil t)))
+            ;; `read-symbol-shorthands' alist ends with two parens.
+            (let* ((end (re-search-forward ")[;\n\s]*)"))
+                   (commentless (replace-regexp-in-string
                                  "\n\\s-*;+" ""
-                                 (buffer-substring (point) (point-max))))
+                                 (buffer-substring beg end)))
                    (unsorted-shorthands (car (read-from-string commentless))))
               (setq read-symbol-shorthands
                     (sort unsorted-shorthands

>        ;; We always return the package version (even for pre-dumped
>        ;; files).
> @@ -486,7 +499,11 @@ loaddefs-generate--compute-prefixes
>      (while (re-search-forward
>              "^(\\(def[^ \t\n]+\\)[ \t\n]+['(]*\\([^' ()\"\n]+\\)[\n \t]" nil t)
>        (unless (member (match-string 1) autoload-ignored-definitions)
> -        (let ((name (match-string-no-properties 2)))
> +        (let* ((name (match-string-no-properties 2))
> +               ;; Consider `read-symbol-shorthands'.
> +               (probe (let ((obarray (obarray-make)))
> +                        (car (read-from-string name)))))
> +          (setq name (symbol-name probe))
>            (when (save-excursion
>                    (goto-char (match-beginning 0))
>                    (or (bobp)
> diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
> index b0665a55695..69b562e3c7e 100644
> --- a/lisp/emacs-lisp/shorthands.el
> +++ b/lisp/emacs-lisp/shorthands.el
> @@ -52,38 +52,26 @@ elisp-shorthand-font-lock-face
>    :version "28.1"
>    :group 'font-lock-faces)
>
> -(defun shorthands--mismatch-from-end (str1 str2)
> -  "Tell index of first mismatch in STR1 and STR2, from end.
> -The index is a valid 0-based index on STR1.  Returns nil if STR1
> -equals STR2.  Return 0 if STR1 is a suffix of STR2."
> -  (cl-loop with l1 = (length str1) with l2 = (length str2)
> -           for i from 1
> -           for i1 = (- l1 i) for i2 = (- l2 i)
> -           while (eq (aref str1 i1) (aref str2 i2))
> -           if (zerop i2) return (if (zerop i1) nil i1)
> -           if (zerop i1) return 0
> -           finally (return i1)))
> -
>  (defun shorthands-font-lock-shorthands (limit)
> +  "Font lock until LIMIT considering `read-symbol-shorthands'."
>    (when read-symbol-shorthands
>      (while (re-search-forward
>              (concat "\\_<\\(" (rx lisp-mode-symbol) "\\)\\_>")
>              limit t)
>        (let* ((existing (get-text-property (match-beginning 1) 'face))
> +             (print-name (match-string 1))
>               (probe (and (not (memq existing '(font-lock-comment-face
>                                                 font-lock-string-face)))
> -                         (intern-soft (match-string 1))))
> -             (sname (and probe (symbol-name probe)))
> -             (mismatch (and sname (shorthands--mismatch-from-end
> -                                   (match-string 1) sname)))
> -             (guess (and mismatch (1+ mismatch))))
> -        (when guess
> -          (when (and (< guess (1- (length (match-string 1))))
> -                     ;; In bug#67390 we allow other separators
> -                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
> -            (setq guess (1+ guess)))
> +                         (intern-soft print-name)))
> +             (symbol-name (and probe (symbol-name probe)))
> +             (prefix (and symbol-name
> +                          (not (string-equal print-name symbol-name))
> +                          (car (assoc print-name
> +                                      read-symbol-shorthands
> +                                      #'string-prefix-p)))))
> +        (when prefix
>            (add-face-text-property (match-beginning 1)
> -                                  (+ (match-beginning 1) guess)
> +                                  (+ (match-beginning 1) (length prefix))
>                                    'elisp-shorthand-font-lock-face))))))

Works well.  let-binding `symbol-name' and `print-name' is good improvement.

>  (font-lock-add-keywords 'emacs-lisp-mode
> '((shorthands-font-lock-shorthands)) t)
> diff --git a/lisp/files.el b/lisp/files.el
> index 1cdcec23b11..b266d0727ec 100644
> --- a/lisp/files.el
> +++ b/lisp/files.el
> @@ -3735,7 +3735,8 @@ before-hack-local-variables-hook
>  This hook is called only if there is at least one file-local
>  variable to set.")
>
> -(defvar permanently-enabled-local-variables '(lexical-binding)
> +(defvar permanently-enabled-local-variables
> +  '(lexical-binding read-symbol-shorthands)
>    "A list of file-local variables that are always enabled.
>  This overrides any `enable-local-variables' setting.")
>
> @@ -4171,6 +4172,13 @@ hack-local-variables--find-variables
>                                  ;; to use 'thisbuf's name in the
>                                  ;; warning message.
>                                  (or (buffer-file-name thisbuf) ""))))))
> +                          ((eq var 'read-symbol-shorthands)
> +                           ;; Sort automatically by shorthand length
> +                           ;; descending
> +                           (setq val (sort val
> +                                           (lambda (sh1 sh2) (>
> (length (car sh1))
> +
> (length (car sh2))))))
> +                           (push (cons 'read-symbol-shorthands val) result))
>                            ((and (eq var 'mode) handle-mode))
>     (t
>      (ignore-errors

Good catch.  I agree that longer shorthands should be applied first.

-----

A couple typo nits on the commit message of "Improve
shorthands-font-lock-shorthands (bug#67390)":

-  h//thingy     ; hilits "//"       reads to 'hyperdrive--thingy'
+  h//thingy     ; hilits "h//"       reads to 'hyperdrive--thingy'

-  Co-authored-by: João Távora <joaotavora@gmail.com>
+  Co-authored-by: Joseph Turner <joseph@breatheoutbreathe.in>


Thank you!

Joseph





^ permalink raw reply related	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2023-12-09 18:50                 ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-02-03  7:10                   ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-02-03 14:50                     ` João Távora
  0 siblings, 1 reply; 27+ messages in thread
From: Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-02-03  7:10 UTC (permalink / raw)
  To: João Távora, Jonas Bernoulli, Eli Zaretskii, 67390,
	Adam Porter

Joseph Turner <joseph@ushin.org> writes:

> Hi João!  Thanks for your patience - preparing for EmacsConf was a blast,
> and now I'm on a plane to go visit my grandmother!
>
> João Távora <joaotavora@gmail.com> writes:
>
>> Hi all,
>>
>> I've been working on all these shorthand-related issues over the last
>> two days and I have reasonably short fixes for all of them.
>>
>> For this particular issue (bug#67309), I've opted to
>> use Joseph's patch with very slight adjustments, as it's the
>> only one that guarantees correct behaviour and doesn't seem
>> to impact performance.
>>
>> The other issues are:
>>
>> bug#63480 (loaddefs-gen.el doesn't know about shorthands)
>> bug#67325 (prefix discovery i.e. register-definition-prefixes)
>> bug#67523 (check-declare.el doesn't know about shorthands)
>>
>> I have all this in 6 commits in the bugfix/shorthand-fixes branch.
>>
>> Here's the full patch minus whitespace changes.  If there are
>> no comments I'll push in a few days' time.
>>
>> João
>>
>> diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
>> index 1f3b677d7fb..18e80311177 100644
>> --- a/doc/lispref/symbols.texi
>> +++ b/doc/lispref/symbols.texi
>> @@ -761,6 +761,23 @@ Shorthands
>>  ;; End:
>>  @end example
>>
>> +Note that if you have two shorthands in the same file where one is the
>> +prefix of the other, the longer shorthand will be attempted first.
>> +This happens regardless of the order you specify shorthands in the
>> +local variables section of your file.
>> +
>> +@example
>> +'(
>> +  t//foo ; reads to 'my-tricks--foo', not 'my-tricks-/foo'
>> +  t/foo  ; reads to 'my-tricks-foo'
>> +  )
>> +
>> +;; Local Variables:
>> +;; read-symbol-shorthands: (("t/" . "my-tricks-")
>> +;;                          ("t//" . "my-tricks--")
>> +;; End:
>> +@end example
>> +
>>  @subsection Exceptions
>
> Clear and concise.
>
>>  There are two exceptions to rules governing Shorthand transformations:
>> diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el
>> index c887d95210c..b19aedf314d 100644
>> --- a/lisp/emacs-lisp/check-declare.el
>> +++ b/lisp/emacs-lisp/check-declare.el
>> @@ -145,21 +145,26 @@ check-declare-verify
>>      (if (file-regular-p fnfile)
>>          (with-temp-buffer
>>            (insert-file-contents fnfile)
>> +          (unless cflag
>> +            ;; If in Elisp, ensure syntax and shorthands available
>> +            (set-syntax-table emacs-lisp-mode-syntax-table)
>> +            (let (enable-local-variables) (hack-local-variables)))
>>            ;; defsubst's don't _have_ to be known at compile time.
>> -          (setq re (format (if cflag
>> -                               "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
>> +          (setq re (if cflag
>> +                       (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
>> +                               (regexp-opt (mapcar 'cadr fnlist) t))
>>                       "^[ \t]*(\\(fset[ \t]+'\\|\
>>  cl-def\\(?:generic\\|method\\|un\\)\\|\
>>  def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
>>  ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
>>  \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
>>  ine-overloadable-function\\)\\)\
>> -[ \t]*%s\\([ \t;]+\\|$\\)")
>> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
>> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
>
> Would you explain what this regexp is intended to match?
>
>>            (while (re-search-forward re nil t)
>>              (skip-chars-forward " \t\n")
>> -            (setq fn (match-string 2)
>> -                  type (match-string 1)
>> +            (setq fn (symbol-name (car (read-from-string (match-string 2)))))
>> +            (when (member fn (mapcar 'cadr fnlist))
>> +              (setq type (match-string 1)
>>                      ;; (min . max) for a fixed number of arguments, or
>>                      ;; arglists with optional elements.
>>                      ;; (min) for arglists with &rest.
>> @@ -202,7 +207,7 @@ check-declare-verify
>>                                (t
>>                                 'err))
>>                      ;; alist of functions and arglist signatures.
>> -                  siglist (cons (cons fn sig) siglist)))))
>> +                    siglist (cons (cons fn sig) siglist))))))
>>      (dolist (e fnlist)
>>        (setq arglist (nth 2 e)
>>              type
>
> On my machine, this patch removes some of the check-declare "function
> not found" errors, but not all.  For example, with hyperdrive-lib.el:
>
> (check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")
>
> Before this patch, the "*Check Declarations Warnings*" buffer shows:
>
> --8<---------------cut here---------------start------------->8---
>  ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
>     ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
>  ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
>     in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
>     found
>  ■ hyperdrive-lib.el:1283:Warning (check-declare): said ‘h/org--link-goto’ was
>     defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-org.el: function
>     not found
>  ■ hyperdrive-lib.el:45:Warning (check-declare): said ‘h/dir-mode’ was defined
>     in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
>  ■ hyperdrive-lib.el:1069:Warning (check-declare): said
>     ‘h/dir--entry-at-point’ was defined in
>     ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
>  ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
>     defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
>     not found
> --8<---------------cut here---------------end--------------->8---
>
>
> and after your patch:
>
> --8<---------------cut here---------------start------------->8---
>  ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
>     ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
>  ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
>     in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
>     found
>  ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
>     defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
>     not found
> --8<---------------cut here---------------end--------------->8---
>
> Are you able to reproduce this on your machine?
>
>> diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
>> index 04bea4723a2..e8093200bec 100644
>> --- a/lisp/emacs-lisp/loaddefs-gen.el
>> +++ b/lisp/emacs-lisp/loaddefs-gen.el
>> @@ -378,6 +378,7 @@ loaddefs-generate--parse-file
>>    (let ((defs nil)
>>          (load-name (loaddefs-generate--file-load-name file main-outfile))
>>          (compute-prefixes t)
>> +        read-symbol-shorthands
>>          local-outfile inhibit-autoloads)
>>      (with-temp-buffer
>>        (insert-file-contents file)
>> @@ -399,7 +400,19 @@ loaddefs-generate--parse-file
>>              (setq inhibit-autoloads (read (current-buffer)))))
>>          (save-excursion
>>            (when (re-search-forward "autoload-compute-prefixes: *" nil t)
>> -            (setq compute-prefixes (read (current-buffer))))))
>> +            (setq compute-prefixes (read (current-buffer)))))
>> +        (save-excursion
>> +          ;; since we're "open-coding" we have to repeat more
>> +          ;; complicated logic in `hack-local-variables'.
>> +          (when (re-search-forward "read-symbol-shorthands: *" nil t)
>> +            (let* ((commentless (replace-regexp-in-string
>> +                                 "\n\\s-*;+" ""
>> +                                 (buffer-substring (point) (point-max))))
>> +                   (unsorted-shorthands (car (read-from-string commentless))))
>> +              (setq read-symbol-shorthands
>> +                    (sort unsorted-shorthands
>> +                          (lambda (sh1 sh2)
>> +                            (> (length (car sh1)) (length (car sh2))))))))))
>
> IIUC, the intention here is to jump to a final "Local Variables"
> declaration at the end of the file, then remove ";;", then read in the
> uncommented value of `read-symbol-shorthands'.
>
> Since `read-from-string' just reads one expression, the above hunk works
> when there are more local variables after read-symbol-shorthands:
>
> ;; Local Variables:
> ;; read-symbol-shorthands: (("bc-" . "breadcrumb-"))
> ;; autoload-compute-prefixes: nil
> ;; End:
>
> But if the read-symbol-shorthands declaration comes at the top, as in...
>
> -*- read-symbol-shorthands: (("bc-" . "breadcrumb-")); -*-
>
> ...then this form will allocate two strings almost as long as the file.
>
> Here's an alternative hack attempting to uncomment and read the minimum:
>
> diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
> index e8093200bec..406e4b28f1f 100644
> --- a/lisp/emacs-lisp/loaddefs-gen.el
> +++ b/lisp/emacs-lisp/loaddefs-gen.el
> @@ -404,10 +404,13 @@ don't include."
>          (save-excursion
>            ;; since we're "open-coding" we have to repeat more
>            ;; complicated logic in `hack-local-variables'.
> -          (when (re-search-forward "read-symbol-shorthands: *" nil t)
> -            (let* ((commentless (replace-regexp-in-string
> +          (when-let ((beg
> +                      (re-search-forward "read-symbol-shorthands: *" nil t)))
> +            ;; `read-symbol-shorthands' alist ends with two parens.
> +            (let* ((end (re-search-forward ")[;\n\s]*)"))
> +                   (commentless (replace-regexp-in-string
>                                   "\n\\s-*;+" ""
> -                                 (buffer-substring (point) (point-max))))
> +                                 (buffer-substring beg end)))
>                     (unsorted-shorthands (car (read-from-string commentless))))
>                (setq read-symbol-shorthands
>                      (sort unsorted-shorthands
>
>>        ;; We always return the package version (even for pre-dumped
>>        ;; files).
>> @@ -486,7 +499,11 @@ loaddefs-generate--compute-prefixes
>>      (while (re-search-forward
>>              "^(\\(def[^ \t\n]+\\)[ \t\n]+['(]*\\([^' ()\"\n]+\\)[\n \t]" nil t)
>>        (unless (member (match-string 1) autoload-ignored-definitions)
>> -        (let ((name (match-string-no-properties 2)))
>> +        (let* ((name (match-string-no-properties 2))
>> +               ;; Consider `read-symbol-shorthands'.
>> +               (probe (let ((obarray (obarray-make)))
>> +                        (car (read-from-string name)))))
>> +          (setq name (symbol-name probe))
>>            (when (save-excursion
>>                    (goto-char (match-beginning 0))
>>                    (or (bobp)
>> diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
>> index b0665a55695..69b562e3c7e 100644
>> --- a/lisp/emacs-lisp/shorthands.el
>> +++ b/lisp/emacs-lisp/shorthands.el
>> @@ -52,38 +52,26 @@ elisp-shorthand-font-lock-face
>>    :version "28.1"
>>    :group 'font-lock-faces)
>>
>> -(defun shorthands--mismatch-from-end (str1 str2)
>> -  "Tell index of first mismatch in STR1 and STR2, from end.
>> -The index is a valid 0-based index on STR1.  Returns nil if STR1
>> -equals STR2.  Return 0 if STR1 is a suffix of STR2."
>> -  (cl-loop with l1 = (length str1) with l2 = (length str2)
>> -           for i from 1
>> -           for i1 = (- l1 i) for i2 = (- l2 i)
>> -           while (eq (aref str1 i1) (aref str2 i2))
>> -           if (zerop i2) return (if (zerop i1) nil i1)
>> -           if (zerop i1) return 0
>> -           finally (return i1)))
>> -
>>  (defun shorthands-font-lock-shorthands (limit)
>> +  "Font lock until LIMIT considering `read-symbol-shorthands'."
>>    (when read-symbol-shorthands
>>      (while (re-search-forward
>>              (concat "\\_<\\(" (rx lisp-mode-symbol) "\\)\\_>")
>>              limit t)
>>        (let* ((existing (get-text-property (match-beginning 1) 'face))
>> +             (print-name (match-string 1))
>>               (probe (and (not (memq existing '(font-lock-comment-face
>>                                                 font-lock-string-face)))
>> -                         (intern-soft (match-string 1))))
>> -             (sname (and probe (symbol-name probe)))
>> -             (mismatch (and sname (shorthands--mismatch-from-end
>> -                                   (match-string 1) sname)))
>> -             (guess (and mismatch (1+ mismatch))))
>> -        (when guess
>> -          (when (and (< guess (1- (length (match-string 1))))
>> -                     ;; In bug#67390 we allow other separators
>> -                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
>> -            (setq guess (1+ guess)))
>> +                         (intern-soft print-name)))
>> +             (symbol-name (and probe (symbol-name probe)))
>> +             (prefix (and symbol-name
>> +                          (not (string-equal print-name symbol-name))
>> +                          (car (assoc print-name
>> +                                      read-symbol-shorthands
>> +                                      #'string-prefix-p)))))
>> +        (when prefix
>>            (add-face-text-property (match-beginning 1)
>> -                                  (+ (match-beginning 1) guess)
>> +                                  (+ (match-beginning 1) (length prefix))
>>                                    'elisp-shorthand-font-lock-face))))))
>
> Works well.  let-binding `symbol-name' and `print-name' is good improvement.
>
>>  (font-lock-add-keywords 'emacs-lisp-mode
>> '((shorthands-font-lock-shorthands)) t)
>> diff --git a/lisp/files.el b/lisp/files.el
>> index 1cdcec23b11..b266d0727ec 100644
>> --- a/lisp/files.el
>> +++ b/lisp/files.el
>> @@ -3735,7 +3735,8 @@ before-hack-local-variables-hook
>>  This hook is called only if there is at least one file-local
>>  variable to set.")
>>
>> -(defvar permanently-enabled-local-variables '(lexical-binding)
>> +(defvar permanently-enabled-local-variables
>> +  '(lexical-binding read-symbol-shorthands)
>>    "A list of file-local variables that are always enabled.
>>  This overrides any `enable-local-variables' setting.")
>>
>> @@ -4171,6 +4172,13 @@ hack-local-variables--find-variables
>>                                  ;; to use 'thisbuf's name in the
>>                                  ;; warning message.
>>                                  (or (buffer-file-name thisbuf) ""))))))
>> +                          ((eq var 'read-symbol-shorthands)
>> +                           ;; Sort automatically by shorthand length
>> +                           ;; descending
>> +                           (setq val (sort val
>> +                                           (lambda (sh1 sh2) (>
>> (length (car sh1))
>> +
>> (length (car sh2))))))
>> +                           (push (cons 'read-symbol-shorthands val) result))
>>                            ((and (eq var 'mode) handle-mode))
>>     (t
>>      (ignore-errors
>
> Good catch.  I agree that longer shorthands should be applied first.
>
> -----
>
> A couple typo nits on the commit message of "Improve
> shorthands-font-lock-shorthands (bug#67390)":
>
> -  h//thingy     ; hilits "//"       reads to 'hyperdrive--thingy'
> +  h//thingy     ; hilits "h//"       reads to 'hyperdrive--thingy'
>
> -  Co-authored-by: João Távora <joaotavora@gmail.com>
> +  Co-authored-by: Joseph Turner <joseph@breatheoutbreathe.in>
>
>
> Thank you!
>
> Joseph

Ping!





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2024-02-03  7:10                   ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-02-03 14:50                     ` João Távora
  2024-02-03 19:43                       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 27+ messages in thread
From: João Távora @ 2024-02-03 14:50 UTC (permalink / raw)
  To: Joseph Turner; +Cc: 67390, Adam Porter, Eli Zaretskii, Jonas Bernoulli

Sorry, this flew under the radar.  I thought I had already pushed to master
but didn't.  So I went through the commits again, addressed your concerns, and
applied your suggestions.  Pushed to master now.



On Sat, Feb 3, 2024 at 7:10 AM Joseph Turner <joseph@ushin.org> wrote:


>> -[ \t]*%s\\([ \t;]+\\|$\\)")
>> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
>> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
>
> Would you explain what this regexp is intended to match?

A very complicated one, right?  Well ask the author, but I think it's intended
to find many definition-like forms.  No idea why this is  done with regexps and
not with 'read' as it is a classical parsing pitfall in the long run.
Maybe there
was a reason.

Anyway, I just added a bit of logic so that it considers
read-symbol-shorthands if
there are any.

> Are you able to reproduce this on your machine?

Yes, and I fixed it.

> ...then this form will allocate two strings almost as long as the file.
>
> Here's an alternative hack attempting to uncomment and read the minimum:

Thanks, I think that's a good idea and I added a commit in your name.

> A couple typo nits on the commit message of "Improve
> shorthands-font-lock-shorthands (bug#67390)":
>
> -  h//thingy     ; hilits "//"       reads to 'hyperdrive--thingy'
> +  h//thingy     ; hilits "h//"       reads to 'hyperdrive--thingy'
>
> -  Co-authored-by: João Távora <joaotavora@gmail.com>
> +  Co-authored-by: Joseph Turner <joseph@breatheoutbreathe.in>

I fixed these, too.  If you succesfully test this, I think we can close this bug
(and the other ones, too).

João





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2024-02-03 14:50                     ` João Távora
@ 2024-02-03 19:43                       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-02-03 22:25                         ` João Távora
  0 siblings, 1 reply; 27+ messages in thread
From: Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-02-03 19:43 UTC (permalink / raw)
  To: João Távora; +Cc: 67390, adam, eliz, jonas

[-- Attachment #1: Type: text/plain, Size: 2539 bytes --]

João Távora <joaotavora@gmail.com> writes:

> Sorry, this flew under the radar.  I thought I had already pushed to master
> but didn't.  So I went through the commits again, addressed your concerns, and
> applied your suggestions.  Pushed to master now.

Thank you!

> On Sat, Feb 3, 2024 at 7:10 AM Joseph Turner <joseph@ushin.org> wrote:
>
>
>>> -[ \t]*%s\\([ \t;]+\\|$\\)")
>>> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
>>> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
>>
>> Would you explain what this regexp is intended to match?
>
> A very complicated one, right?  Well ask the author, but I think it's intended
> to find many definition-like forms.  No idea why this is  done with regexps and
> not with 'read' as it is a classical parsing pitfall in the long run.
> Maybe there
> was a reason.
>
> Anyway, I just added a bit of logic so that it considers
> read-symbol-shorthands if
> there are any.

That makes sense.

>> Are you able to reproduce this on your machine?
>
> Yes, and I fixed it.
>
>> ...then this form will allocate two strings almost as long as the file.
>>
>> Here's an alternative hack attempting to uncomment and read the minimum:
>
> Thanks, I think that's a good idea and I added a commit in your name.

Thanks!

>> A couple typo nits on the commit message of "Improve
>> shorthands-font-lock-shorthands (bug#67390)":
>>
>> -  h//thingy     ; hilits "//"       reads to 'hyperdrive--thingy'
>> +  h//thingy     ; hilits "h//"       reads to 'hyperdrive--thingy'
>>
>> -  Co-authored-by: João Távora <joaotavora@gmail.com>
>> +  Co-authored-by: Joseph Turner <joseph@breatheoutbreathe.in>
>
> I fixed these, too.  If you succesfully test this, I think we can close this bug
> (and the other ones, too).

I'm still reproducing the check-declare bug on my machine. It appears
that binding `enable-local-variables' to nil around the call to
`hack-local-variables' means that `read-symbol-shorthands' is not set.
Can we bind `enable-local-variables' to `:safe' instead?

(let (
      ;; (enable-local-variables t)   ; works
      ;; (enable-local-variables)     ; doesn't work
      (enable-local-variables :safe)  ; works
      )
  (with-temp-buffer
    (insert-file-contents "~/.local/src/hyperdrive.el/hyperdrive-lib.el")
    (hack-local-variables)
    read-symbol-shorthands))

See attached patch.

There's no way to hack just a single file- or dir-local variable, right?

Thank you!

Joseph


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Bind-enable-local-variables-to-safe-for-shorthands-b.patch --]
[-- Type: text/x-diff, Size: 1667 bytes --]

From f301d839031f78b303d698d7bca1f0b27264d2d8 Mon Sep 17 00:00:00 2001
From: Joseph Turner <joseph@breatheoutbreathe.in>
Date: Sat, 3 Feb 2024 11:56:31 -0800
Subject: [PATCH] ;  Bind enable-local-variables to :safe for shorthands
 (bug#67523)

* lisp/emacs-lisp/check-declare.el (check-declare-scan):
(check-declare-verify):
---
 lisp/emacs-lisp/check-declare.el | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el
index a6d1a330d90..eff37828f65 100644
--- a/lisp/emacs-lisp/check-declare.el
+++ b/lisp/emacs-lisp/check-declare.el
@@ -87,7 +87,7 @@ don't know how to recognize (e.g. some macros)."
       (insert-file-contents file)
       ;; Ensure shorthands available, as we will be `read'ing Elisp
       ;; (bug#67523)
-      (let (enable-local-variables) (hack-local-variables))
+      (let ((enable-local-variables :safe)) (hack-local-variables))
       ;; FIXME we could theoretically be inside a string.
       (while (re-search-forward "^[ \t]*\\((declare-function\\)[ \t\n]" nil t)
         (let ((pos (match-beginning 1)))
@@ -152,7 +152,7 @@ is a string giving details of the error."
             ;; If in Elisp, ensure syntax and shorthands available
             ;; (bug#67523)
             (set-syntax-table emacs-lisp-mode-syntax-table)
-            (let (enable-local-variables) (hack-local-variables)))
+            (let ((enable-local-variables :safe)) (hack-local-variables)))
           ;; defsubst's don't _have_ to be known at compile time.
           (setq re (if cflag
                        (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2024-02-03 19:43                       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-02-03 22:25                         ` João Távora
  2024-02-03 23:48                           ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 27+ messages in thread
From: João Távora @ 2024-02-03 22:25 UTC (permalink / raw)
  To: Joseph Turner; +Cc: 67390, adam, eliz, jonas

On Sat, Feb 3, 2024 at 8:01 PM Joseph Turner
<joseph@breatheoutbreathe.in> wrote:

> I'm still reproducing the check-declare bug on my machine. It appears
> that binding `enable-local-variables' to nil around the call to
> `hack-local-variables' means that `read-symbol-shorthands' is not set.
> Can we bind `enable-local-variables' to `:safe' instead?

It could be some bootstrapping issue, since the safe spec of that particular
variable itself needs to be autoloaded.  I vaguely remember something like
this and I _think_ it was fixed.

Anyway, I can't reproduce this with this test:

   src/emacs -Q --batch --eval '(check-declare-file
"~/tmp/hyperdrive.el/hyperdrive-lib.el")'

where ~/tmp/hyperdrive.el is a checkout of your hyperdrive library.

This doesn't output anything, which I think is the expected result.

How are you testing?

João





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2024-02-03 22:25                         ` João Távora
@ 2024-02-03 23:48                           ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2024-05-21 22:05                             ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 27+ messages in thread
From: Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-02-03 23:48 UTC (permalink / raw)
  To: João Távora; +Cc: 67390, adam, eliz, jonas

João Távora <joaotavora@gmail.com> writes:

> On Sat, Feb 3, 2024 at 8:01 PM Joseph Turner
> <joseph@breatheoutbreathe.in> wrote:
>
>> I'm still reproducing the check-declare bug on my machine. It appears
>> that binding `enable-local-variables' to nil around the call to
>> `hack-local-variables' means that `read-symbol-shorthands' is not set.
>> Can we bind `enable-local-variables' to `:safe' instead?
>
> It could be some bootstrapping issue, since the safe spec of that particular
> variable itself needs to be autoloaded.  I vaguely remember something like
> this and I _think_ it was fixed.
>
> Anyway, I can't reproduce this with this test:
>
>    src/emacs -Q --batch --eval '(check-declare-file
> "~/tmp/hyperdrive.el/hyperdrive-lib.el")'
>
> where ~/tmp/hyperdrive.el is a checkout of your hyperdrive library.
>
> This doesn't output anything, which I think is the expected result.
>
> How are you testing?

Hmm... I just compiled from master with

./configure --with-x-toolkit=no --with-xpm=ifavailable --with-jpeg=ifavailable --with-gif=ifavailable --with-tiff=ifavailable --with-gnutls=ifavailable && make

then I ran

src/emacs -Q --batch --eval '(check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")'

which produced

uncompressing textsec-check.el.gz...
uncompressing textsec-check.el.gz...done
../hyperdrive.el/hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in ../hyperdrive.el/hyperdrive.el: function not found
../hyperdrive.el/hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined in ../hyperdrive.el/hyperdrive-history.el: function not found
../hyperdrive.el/hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was defined in ../hyperdrive.el/hyperdrive-dir.el: function not found

Would someone else kindly attempt to reproduce the issue?

Thanks!

Joseph





^ permalink raw reply	[flat|nested] 27+ messages in thread

* bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
  2024-02-03 23:48                           ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-05-21 22:05                             ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 27+ messages in thread
From: Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-05-21 22:05 UTC (permalink / raw)
  To: 67390; +Cc: adam, eliz, jonas, joaotavora

Joseph Turner via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org> writes:

> João Távora <joaotavora@gmail.com> writes:
>
>> On Sat, Feb 3, 2024 at 8:01 PM Joseph Turner
>> <joseph@breatheoutbreathe.in> wrote:
>>
>>> I'm still reproducing the check-declare bug on my machine. It appears
>>> that binding `enable-local-variables' to nil around the call to
>>> `hack-local-variables' means that `read-symbol-shorthands' is not set.
>>> Can we bind `enable-local-variables' to `:safe' instead?
>>
>> It could be some bootstrapping issue, since the safe spec of that particular
>> variable itself needs to be autoloaded.  I vaguely remember something like
>> this and I _think_ it was fixed.
>>
>> Anyway, I can't reproduce this with this test:
>>
>>    src/emacs -Q --batch --eval '(check-declare-file
>> "~/tmp/hyperdrive.el/hyperdrive-lib.el")'
>>
>> where ~/tmp/hyperdrive.el is a checkout of your hyperdrive library.
>>
>> This doesn't output anything, which I think is the expected result.
>>
>> How are you testing?
>
> Hmm... I just compiled from master with
>
> ./configure --with-x-toolkit=no --with-xpm=ifavailable
> --with-jpeg=ifavailable --with-gif=ifavailable --with-tiff=ifavailable
> --with-gnutls=ifavailable && make
>
> then I ran
>
> src/emacs -Q --batch --eval '(check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")'
>
> which produced
>
> uncompressing textsec-check.el.gz...
> uncompressing textsec-check.el.gz...done
> ../hyperdrive.el/hyperdrive-lib.el:44:Warning (check-declare): said
> ‘h/mode’ was defined in ../hyperdrive.el/hyperdrive.el: function not
> found
> ../hyperdrive.el/hyperdrive-lib.el:508:Warning (check-declare): said
> ‘h/history’ was defined in ../hyperdrive.el/hyperdrive-history.el:
> function not found
> ../hyperdrive.el/hyperdrive-lib.el:1332:Warning (check-declare): said
> ‘h/dir-handler’ was defined in ../hyperdrive.el/hyperdrive-dir.el:
> function not found
>
> Would someone else kindly attempt to reproduce the issue?

I just rebuilt from master ce8e292bca84f29cea540e3e23e88ec7a5d1674e

with the following settings

./configure --with-x-toolkit=no --with-xpm=ifavailable
--with-jpeg=ifavailable --with-gif=ifavailable --with-tiff=ifavailable
--with-gnutls=ifavailable && make

src/emacs -Q --batch --eval '(check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")'

returns the following:

uncompressing textsec-check.el.gz...
uncompressing textsec-check.el.gz...done
../hyperdrive.el/hyperdrive-lib.el:535:Warning (check-declare): said ‘h/history’
 was defined in ../hyperdrive.el/hyperdrive-history.el: function not found
../hyperdrive.el/hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was
 defined in ../hyperdrive.el/hyperdrive.el: function not found
../hyperdrive.el/hyperdrive-lib.el:1333:Warning (check-declare): said ‘h/blob-mo
de’ was defined in ../hyperdrive.el/hyperdrive.el: function not found
../hyperdrive.el/hyperdrive-lib.el:1383:Warning (check-declare): said ‘h/dir-han
dler’ was defined in ../hyperdrive.el/hyperdrive-dir.el: function not found

Can anyone else reproduce this error?

Thank you!

Joseph





^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2024-05-21 22:05 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-22 22:18 bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-23 12:57 ` João Távora
2023-11-24 21:51   ` Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-25  0:03     ` João Távora
2023-11-25  3:26       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-25 16:01         ` Jonas Bernoulli via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-25 22:42           ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-26 13:52             ` João Távora
2023-11-26 20:35               ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-26 22:02                 ` João Távora
2023-11-27  3:48                   ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-27 12:10                   ` Eli Zaretskii
2023-11-29  8:21                     ` João Távora
2023-11-29  9:12                       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-11-29 13:56                         ` João Távora
2023-11-29 13:30           ` João Távora
2023-11-29 23:27             ` João Távora
2023-11-30 14:16               ` João Távora
2023-11-30 15:23                 ` Eli Zaretskii
2023-11-30 15:29                   ` João Távora
2023-12-09 18:50                 ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-02-03  7:10                   ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-02-03 14:50                     ` João Távora
2024-02-03 19:43                       ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-02-03 22:25                         ` João Távora
2024-02-03 23:48                           ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-05-21 22:05                             ` Joseph Turner via Bug reports for GNU Emacs, the Swiss army knife of text editors

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).