From: Joseph Turner via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator
Date: Fri, 02 Feb 2024 23:10:00 -0800
Message-ID: <87sf2at44r.fsf@ushin.org>
References: <87a5r5ph3p.fsf@bernoul.li> <87msv2vmzf.fsf@bernoul.li> <878r6mzezo.fsf@ushin.org> <87sf4tg6ts.fsf@bernoul.li> <87ttoqnxci.fsf@ushin.org>
Reply-To: Joseph Turner
To: João Távora, Jonas Bernoulli, Eli Zaretskii, 67390@debbugs.gnu.org, Adam Porter
In-Reply-To: <87ttoqnxci.fsf@ushin.org>

Joseph Turner writes:

> Hi João! Thanks for your patience - preparing for EmacsConf was a blast,
> and now I'm on a plane to go visit my grandmother!
>
> João Távora writes:
>
>> Hi all,
>>
>> I've been working on all these shorthand-related issues over the last
>> two days and I have reasonably short fixes for all of them.
>>
>> For this particular issue (bug#67309), I've opted to
>> use Joseph's patch with very slight adjustments, as it's the
>> only one that guarantees correct behaviour and doesn't seem
>> to impact performance.
>>
>> The other issues are:
>>
>> bug#63480 (loaddefs-gen.el doesn't know about shorthands)
>> bug#67325 (prefix discovery i.e. register-definition-prefixes)
>> bug#67523 (check-declare.el doesn't know about shorthands)
>>
>> I have all this in 6 commits in the bugfix/shorthand-fixes branch.
>>
>> Here's the full patch minus whitespace changes. If there are
>> no comments I'll push in a few days' time.
>>
>> João
>>
>> diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
>> index 1f3b677d7fb..18e80311177 100644
>> --- a/doc/lispref/symbols.texi
>> +++ b/doc/lispref/symbols.texi
>> @@ -761,6 +761,23 @@ Shorthands
>>  ;; End:
>>  @end example
>>
>> +Note that if you have two shorthands in the same file where one is the
>> +prefix of the other, the longer shorthand will be attempted first.
>> +This happens regardless of the order you specify shorthands in the
>> +local variables section of your file.
>> +
>> +@example
>> +'(
>> +  t//foo ; reads to 'my-tricks--foo', not 'my-tricks-/foo'
>> +  t/foo  ; reads to 'my-tricks-foo'
>> +  )
>> +
>> +;; Local Variables:
>> +;; read-symbol-shorthands: (("t/" . "my-tricks-")
>> +;;                          ("t//" . "my-tricks--")
>> +;; End:
>> +@end example
>> +
>>  @subsection Exceptions
>
> Clear and concise.
>
>>  There are two exceptions to rules governing Shorthand transformations:
>> diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el
>> index c887d95210c..b19aedf314d 100644
>> --- a/lisp/emacs-lisp/check-declare.el
>> +++ b/lisp/emacs-lisp/check-declare.el
>> @@ -145,21 +145,26 @@ check-declare-verify
>>      (if (file-regular-p fnfile)
>>          (with-temp-buffer
>>            (insert-file-contents fnfile)
>> +          (unless cflag
>> +            ;; If in Elisp, ensure syntax and shorthands available
>> +            (set-syntax-table emacs-lisp-mode-syntax-table)
>> +            (let (enable-local-variables) (hack-local-variables)))
>>            ;; defsubst's don't _have_ to be known at compile time.
>> -          (setq re (format (if cflag
>> -                               "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
>> +          (setq re (if cflag
>> +                       (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
>> +                               (regexp-opt (mapcar 'cadr fnlist) t))
>>                       "^[ \t]*(\\(fset[ \t]+'\\|\
>>  cl-def\\(?:generic\\|method\\|un\\)\\|\
>>  def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
>>  ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
>>  \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
>>  ine-overloadable-function\\)\\)\
>> -[ \t]*%s\\([ \t;]+\\|$\\)")
>> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
>> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
>
> Would you explain what this regexp is intended to match?
>
>>            (while (re-search-forward re nil t)
>>              (skip-chars-forward " \t\n")
>> -            (setq fn (match-string 2)
>> -                  type (match-string 1)
>> +            (setq fn (symbol-name (car (read-from-string (match-string 2)))))
>> +            (when (member fn (mapcar 'cadr fnlist))
>> +              (setq type (match-string 1)
>>                    ;; (min . max) for a fixed number of arguments, or
>>                    ;; arglists with optional elements.
>>                    ;; (min) for arglists with &rest.
>> @@ -202,7 +207,7 @@ check-declare-verify
>>                            (t
>>                             'err))
>>                    ;; alist of functions and arglist signatures.
>> -                  siglist (cons (cons fn sig) siglist)))))
>> +                  siglist (cons (cons fn sig) siglist))))))
>>        (dolist (e fnlist)
>>          (setq arglist (nth 2 e)
>>                type
>
> On my machine, this patch removes some of the check-declare "function
> not found" errors, but not all. For example, with hyperdrive-lib.el:
>
> (check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")
>
> Before this patch, the "*Check Declarations Warnings*" buffer shows:
>
> --8<---------------cut here---------------start------------->8---
> ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
> ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
> ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
> in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
> found
> ■ hyperdrive-lib.el:1283:Warning (check-declare): said ‘h/org--link-goto’ was
> defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-org.el: function
> not found
> ■ hyperdrive-lib.el:45:Warning (check-declare): said ‘h/dir-mode’ was defined
> in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
> ■ hyperdrive-lib.el:1069:Warning (check-declare): said
> ‘h/dir--entry-at-point’ was defined in
> ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
> ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
> defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
> not found
> --8<---------------cut here---------------end--------------->8---
>
>
> and after your patch:
>
> --8<---------------cut here---------------start------------->8---
> ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
> ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
> ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
> in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
> found
> ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
> defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
> not found
> --8<---------------cut here---------------end--------------->8---
>
> Are you able to reproduce this on your machine?
>
>> diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
>> index 04bea4723a2..e8093200bec 100644
>> --- a/lisp/emacs-lisp/loaddefs-gen.el
>> +++ b/lisp/emacs-lisp/loaddefs-gen.el
>> @@ -378,6 +378,7 @@ loaddefs-generate--parse-file
>>    (let ((defs nil)
>>          (load-name (loaddefs-generate--file-load-name file main-outfile))
>>          (compute-prefixes t)
>> +        read-symbol-shorthands
>>          local-outfile inhibit-autoloads)
>>      (with-temp-buffer
>>        (insert-file-contents file)
>> @@ -399,7 +400,19 @@ loaddefs-generate--parse-file
>>              (setq inhibit-autoloads (read (current-buffer)))))
>>          (save-excursion
>>            (when (re-search-forward "autoload-compute-prefixes: *" nil t)
>> -            (setq compute-prefixes (read (current-buffer))))))
>> +            (setq compute-prefixes (read (current-buffer)))))
>> +        (save-excursion
>> +          ;; since we're "open-coding" we have to repeat more
>> +          ;; complicated logic in `hack-local-variables'.
>> +          (when (re-search-forward "read-symbol-shorthands: *" nil t)
>> +            (let* ((commentless (replace-regexp-in-string
>> +                                 "\n\\s-*;+" ""
>> +                                 (buffer-substring (point) (point-max))))
>> +                   (unsorted-shorthands (car (read-from-string commentless))))
>> +              (setq read-symbol-shorthands
>> +                    (sort unsorted-shorthands
>> +                          (lambda (sh1 sh2)
>> +                            (> (length (car sh1)) (length (car sh2))))))))))
>
> IIUC, the intention here is to jump to a final "Local Variables"
> declaration at the end of the file, then remove ";;", then read in the
> uncommented value of `read-symbol-shorthands'.
>
> Since `read-from-string' just reads one expression, the above hunk works
> when there are more local variables after read-symbol-shorthands:
>
> ;; Local Variables:
> ;; read-symbol-shorthands: (("bc-" . "breadcrumb-"))
> ;; autoload-compute-prefixes: nil
> ;; End:
>
> But if the read-symbol-shorthands declaration comes at the top, as in...
>
> -*- read-symbol-shorthands: (("bc-" . "breadcrumb-")); -*-
>
> ...then this form will allocate two strings almost as long as the file.
>
> Here's an alternative hack attempting to uncomment and read the minimum:
>
> diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
> index e8093200bec..406e4b28f1f 100644
> --- a/lisp/emacs-lisp/loaddefs-gen.el
> +++ b/lisp/emacs-lisp/loaddefs-gen.el
> @@ -404,10 +404,13 @@ don't include."
>          (save-excursion
>            ;; since we're "open-coding" we have to repeat more
>            ;; complicated logic in `hack-local-variables'.
> -          (when (re-search-forward "read-symbol-shorthands: *" nil t)
> -            (let* ((commentless (replace-regexp-in-string
> +          (when-let ((beg
> +                      (re-search-forward "read-symbol-shorthands: *" nil t)))
> +            ;; `read-symbol-shorthands' alist ends with two parens.
> +            (let* ((end (re-search-forward ")[;\n\s]*)"))
> +                   (commentless (replace-regexp-in-string
>                                  "\n\\s-*;+" ""
> -                                (buffer-substring (point) (point-max))))
> +                                (buffer-substring beg end)))
>                     (unsorted-shorthands (car (read-from-string commentless))))
>                (setq read-symbol-shorthands
>                      (sort unsorted-shorthands
>
>>      ;; We always return the package version (even for pre-dumped
>>      ;; files).
>> @@ -486,7 +499,11 @@ loaddefs-generate--compute-prefixes
>>      (while (re-search-forward
>>              "^(\\(def[^ \t\n]+\\)[ \t\n]+['(]*\\([^' ()\"\n]+\\)[\n \t]" nil t)
>>        (unless (member (match-string 1) autoload-ignored-definitions)
>> -        (let ((name (match-string-no-properties 2)))
>> +        (let* ((name (match-string-no-properties 2))
>> +               ;; Consider `read-symbol-shorthands'.
>> +               (probe (let ((obarray (obarray-make)))
>> +                        (car (read-from-string name)))))
>> +          (setq name (symbol-name probe))
>>            (when (save-excursion
>>                    (goto-char (match-beginning 0))
>>                    (or (bobp)
>> diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
>> index b0665a55695..69b562e3c7e 100644
>> --- a/lisp/emacs-lisp/shorthands.el
>> +++ b/lisp/emacs-lisp/shorthands.el
>> @@ -52,38 +52,26 @@ elisp-shorthand-font-lock-face
>>    :version "28.1"
>>    :group 'font-lock-faces)
>>
>> -(defun shorthands--mismatch-from-end (str1 str2)
>> -  "Tell index of first mismatch in STR1 and STR2, from end.
>> -The index is a valid 0-based index on STR1. Returns nil if STR1
>> -equals STR2. Return 0 if STR1 is a suffix of STR2."
>> -  (cl-loop with l1 = (length str1) with l2 = (length str2)
>> -           for i from 1
>> -           for i1 = (- l1 i) for i2 = (- l2 i)
>> -           while (eq (aref str1 i1) (aref str2 i2))
>> -           if (zerop i2) return (if (zerop i1) nil i1)
>> -           if (zerop i1) return 0
>> -           finally (return i1)))
>> -
>>  (defun shorthands-font-lock-shorthands (limit)
>> +  "Font lock until LIMIT considering `read-symbol-shorthands'."
>>    (when read-symbol-shorthands
>>      (while (re-search-forward
>>              (concat "\\_<\\(" (rx lisp-mode-symbol) "\\)\\_>")
>>              limit t)
>>        (let* ((existing (get-text-property (match-beginning 1) 'face))
>> +             (print-name (match-string 1))
>>               (probe (and (not (memq existing '(font-lock-comment-face
>>                                                 font-lock-string-face)))
>> -                         (intern-soft (match-string 1))))
>> -             (sname (and probe (symbol-name probe)))
>> -             (mismatch (and sname (shorthands--mismatch-from-end
>> -                                   (match-string 1) sname)))
>> -             (guess (and mismatch (1+ mismatch))))
>> -        (when guess
>> -          (when (and (< guess (1- (length (match-string 1))))
>> -                     ;; In bug#67390 we allow other separators
>> -                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
>> -            (setq guess (1+ guess)))
>> +                         (intern-soft print-name)))
>> +             (symbol-name (and probe (symbol-name probe)))
>> +             (prefix (and symbol-name
>> +                          (not (string-equal print-name symbol-name))
>> +                          (car (assoc print-name
>> +                                      read-symbol-shorthands
>> +                                      #'string-prefix-p)))))
>> +        (when prefix
>>            (add-face-text-property (match-beginning 1)
>> -                                  (+ (match-beginning 1) guess)
>> +                                  (+ (match-beginning 1) (length prefix))
>>                                    'elisp-shorthand-font-lock-face))))))
>
> Works well. let-binding `symbol-name' and `print-name' is good improvement.
>
>>  (font-lock-add-keywords 'emacs-lisp-mode
>>                          '((shorthands-font-lock-shorthands)) t)
>> diff --git a/lisp/files.el b/lisp/files.el
>> index 1cdcec23b11..b266d0727ec 100644
>> --- a/lisp/files.el
>> +++ b/lisp/files.el
>> @@ -3735,7 +3735,8 @@ before-hack-local-variables-hook
>>  This hook is called only if there is at least one file-local
>>  variable to set.")
>>
>> -(defvar permanently-enabled-local-variables '(lexical-binding)
>> +(defvar permanently-enabled-local-variables
>> +  '(lexical-binding read-symbol-shorthands)
>>    "A list of file-local variables that are always enabled.
>>  This overrides any `enable-local-variables' setting.")
>>
>> @@ -4171,6 +4172,13 @@ hack-local-variables--find-variables
>>                              ;; to use 'thisbuf's name in the
>>                              ;; warning message.
>>                              (or (buffer-file-name thisbuf) ""))))))
>> +                   ((eq var 'read-symbol-shorthands)
>> +                    ;; Sort automatically by shorthand length
>> +                    ;; descending
>> +                    (setq val (sort val
>> +                                    (lambda (sh1 sh2) (> (length (car sh1))
>> +                                                         (length (car sh2))))))
>> +                    (push (cons 'read-symbol-shorthands val) result))
>>                     ((and (eq var 'mode) handle-mode))
>>                     (t
>>                      (ignore-errors
>
> Good catch. I agree that longer shorthands should be applied first.
>
> -----
>
> A couple typo nits on the commit message of "Improve
> shorthands-font-lock-shorthands (bug#67390)":
>
> -    h//thingy ; hilits "//" reads to 'hyperdrive--thingy'
> +    h//thingy ; hilits "h//" reads to 'hyperdrive--thingy'
>
> -    Co-authored-by: João Távora
> +    Co-authored-by: Joseph Turner
>
>
> Thank you!
>
> Joseph

Ping!
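
For anyone skimming the thread, the longest-shorthand-first rule discussed above can be sketched in a few lines of Emacs Lisp. This is only an illustration and not code from the patch: the helper names `my-sort-shorthands' and `my-expand-shorthand' are hypothetical, and as far as I understand the real expansion happens inside the Lisp reader rather than in a function like this.

```elisp
(require 'seq)

;; Hypothetical helper, not part of Emacs: sort an alist shaped like
;; `read-symbol-shorthands' so that longer shorthands come first.
(defun my-sort-shorthands (shorthands)
  "Return a copy of SHORTHANDS sorted by shorthand length, descending."
  (sort (copy-sequence shorthands)
        (lambda (sh1 sh2)
          (> (length (car sh1)) (length (car sh2))))))

;; Hypothetical helper, not part of Emacs: expand NAME using the first
;; (that is, longest) shorthand that is a prefix of it.
(defun my-expand-shorthand (name shorthands)
  "Return NAME with its longest matching shorthand prefix expanded.
SHORTHANDS is an alist of (SHORTHAND . LONGHAND) string pairs.
Return NAME unchanged if no shorthand matches."
  (let ((match (seq-find (lambda (sh) (string-prefix-p (car sh) name))
                         (my-sort-shorthands shorthands))))
    (if match
        (concat (cdr match) (substring name (length (car match))))
      name)))
```

With the alist from the symbols.texi example, (("t/" . "my-tricks-") ("t//" . "my-tricks--")), the sketch expands "t//foo" to "my-tricks--foo" and "t/foo" to "my-tricks-foo", whichever order the two entries are listed in.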