From: "Kévin Le Gouguec" <kevin.legouguec@gmail.com>
To: 66902@debbugs.gnu.org
Subject: bug#66902: 30.0.50; Recognize env -S/--split-string in shebangs
Date: Sun, 12 Nov 2023 18:53:40 +0100
Message-ID: <871qcuuacb.fsf@gmail.com>
In-Reply-To: <87ttq3lvpm.fsf@gmail.com> ("Kévin Le Gouguec"'s message of "Thu, 02 Nov 2023 21:57:25 +0100")

[-- Attachment #1: Type: text/plain, Size: 1449 bytes --]

Kévin Le Gouguec <kevin.legouguec@gmail.com> writes:

> Questions before proceeding to ChangeLog entries & regression tests:

For better or worse, I ended up doing both of these, and then some.
Let me know if the attached patches make sense; I tested them with

  make -j8 bootstrap && make -C test files-tests


Tentative answers to my questions:

> 1. Is this something we would like Emacs to recognize out of the box, or
> is it too niche?

Assuming yes.

> 2. What about the more general forms shown in (info "(coreutils) env
> invocation")?
>
>   #!/usr/bin/env -[v]S[OPTION]... [NAME=VALUE]... COMMAND [ARGS]...

Didn't go as far as handling -v nor NAME=VALUE pairs, but that could be
added later if we ever feel like it.
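
As a rough illustration of what the -S support amounts to, here is a
standalone sketch of the pattern from patch 3/3.  The names
my/shebang-regexp and my/shebang-interpreter are made up for this
example and are not part of files.el.  The sketch spells the optional
groups with `opt' and `seq' rather than `?' and `:', which should be
equivalent in rx.  As far as I can tell, set-auto-mode matches with
looking-at at the start of the buffer and then strips the directory
part; the sketch approximates that by requiring the match to start at
position 0.

```elisp
(require 'rx)

;; Hypothetical standalone copy of the pattern proposed in patch 3/3.
(defconst my/shebang-regexp
  (rx-let ((ascii-blank (any " \t"))
           (non-blank (not (any " \t\n"))))
    (rx "#!"
        (* ascii-blank)
        ;; Group 1: optional env invocation, ignored by callers.
        (opt (group (* non-blank) "/bin/env"
                    (* ascii-blank)
                    (opt (or (seq "-S" (* ascii-blank))
                             (seq "--split-string"
                                  (or ?= (* ascii-blank)))))))
        ;; Group 2: the interpreter itself.
        (group (+ non-blank))))
  "Sketch of a regexp matching shebang lines, including env -S.")

(defun my/shebang-interpreter (line)
  "Return the base name of the interpreter on shebang LINE, or nil."
  (when (eql 0 (string-match my/shebang-regexp line))
    (file-name-nondirectory (match-string 2 line))))

(my/shebang-interpreter "#!/usr/bin/env -S make -f")
;; => "make"
(my/shebang-interpreter "#!/usr/bin/env --split-string=python3 -u")
;; => "python3"
(my/shebang-interpreter "#!/bin/bash")
;; => "bash"
```

Since -v and NAME=VALUE pairs are not handled, a line such as
"#!/usr/bin/env -vS make -f" would yield "-vS" here rather than "make".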

> 3. Assuming we do want to amend that regexp, would it be possible to use
> rx here?  OT1H guessing "no" because files.el is preloaded, whereas
> rx.el is not; OTOH I see that files.el requires easy-mmode at
> compile-time, and that package does not show up in loadup.el, so…
> settling for "maybe?"

Figured rx was similar to pcase in that regard:

* They need to be required explicitly despite their macros being
  "autoloaded", because files.el is loaded during bootstrap before
  autoloading is set up.

* Somehow that does not cause them to be preloaded?  At least going by
  emacs -Q,
  * featurep returns nil,
  * preloaded-file-list does not include them.
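
The two checks above can be reproduced with something like the
following, evaluated in emacs -Q.  The helper name my/preload-report is
hypothetical, and the sketch assumes that entries of
preloaded-file-list are file names whose base name matches the feature
name.  A likely explanation for the observation is that the requires
sit inside eval-when-compile, so they only run while files.el is being
byte-compiled, and the compiled file carries the already-expanded
macros.

```elisp
(defun my/preload-report (&rest features)
  "Return a list of (FEATURE LOADED-P PRELOADED-FILES) for FEATURES.
LOADED-P is the value of `featurep'; PRELOADED-FILES lists the
entries of `preloaded-file-list' whose base name equals FEATURE."
  (mapcar
   (lambda (feature)
     (let ((name (symbol-name feature))
           hits)
       (dolist (file preloaded-file-list)
         ;; Entries are assumed to be strings such as "emacs-lisp/rx".
         (when (and (stringp file)
                    (equal (file-name-base file) name))
           (push file hits)))
       (list feature (featurep feature) (nreverse hits))))
   features))

;; Deliberately avoids the `rx' and `pcase' macros so that evaluating
;; the report does not itself load the libraries being inspected.
(my/preload-report 'rx 'pcase 'easy-mmode)
```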


[-- Attachment #2: 0001-Add-basic-tests-for-interpreter-mode-alist.patch --]
[-- Type: text/x-patch, Size: 2584 bytes --]

From 8ee71e0c70fa5c16cb802722e8de15af0932773d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?K=C3=A9vin=20Le=20Gouguec?= <kevin.legouguec@gmail.com>
Date: Sun, 12 Nov 2023 10:55:24 +0100
Subject: [PATCH 1/3] Add basic tests for interpreter-mode-alist

* test/lisp/files-tests.el (files-tests--check-shebang): New helper to
generate a temporary file with a given interpreter line, and assert
that the mode picked by 'set-auto-mode' is derived from an expected
mode.  Write the 'should' form so that failure reports include useful
context; for example:

    (ert-test-failed
     ((should
       (equal (list shebang actual-mode) (list shebang expected-mode)))
      :form
      (equal ("#!/usr/bin/env -S make -f" fundamental-mode)
	     ("#!/usr/bin/env -S make -f" makefile-mode))
      :value nil :explanation
      (list-elt 1 (different-atoms fundamental-mode makefile-mode))))

(files-tests-auto-mode-interpreter): New test; exercise some aspects
of interpreter-mode-alist.
---
 test/lisp/files-tests.el | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/test/lisp/files-tests.el b/test/lisp/files-tests.el
index 3492bd701b2..233efded945 100644
--- a/test/lisp/files-tests.el
+++ b/test/lisp/files-tests.el
@@ -1656,6 +1656,29 @@ files-tests-file-name-base
   (should (equal (file-name-base "foo") "foo"))
   (should (equal (file-name-base "foo/bar") "bar")))
 
+(defun files-tests--check-shebang (shebang expected-mode)
+  "Assert that mode for SHEBANG derives from EXPECTED-MODE."
+  (let ((actual-mode
+         (ert-with-temp-file script-file
+           :text shebang
+           (find-file script-file)
+           (if (derived-mode-p expected-mode)
+               expected-mode
+             major-mode))))
+    ;; Tuck all the information we need in the `should' form: input
+    ;; shebang, expected mode vs actual.
+    (should
+     (equal (list shebang actual-mode)
+            (list shebang expected-mode)))))
+
+(ert-deftest files-tests-auto-mode-interpreter ()
+  "Test that `set-auto-mode' deduces correct modes from shebangs."
+  (files-tests--check-shebang "#!/bin/bash" 'sh-mode)
+  (files-tests--check-shebang "#!/usr/bin/env bash" 'sh-mode)
+  (files-tests--check-shebang "#!/usr/bin/env python" 'python-base-mode)
+  (files-tests--check-shebang "#!/usr/bin/env python3" 'python-base-mode)
+  (files-tests--check-shebang "#!/usr/bin/make -f" 'makefile-mode))
+
 (ert-deftest files-test-dir-locals-auto-mode-alist ()
   "Test an `auto-mode-alist' entry in `.dir-locals.el'"
   (find-file (ert-resource-file "whatever.quux"))
-- 
2.42.1


[-- Attachment #3: 0002-Convert-auto-mode-interpreter-regexp-to-an-rx-form.patch --]
[-- Type: text/x-patch, Size: 1702 bytes --]

From d730ee2108e3bd4d641bce2cb50f61e8fbdfcd09 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?K=C3=A9vin=20Le=20Gouguec?= <kevin.legouguec@gmail.com>
Date: Sun, 12 Nov 2023 16:51:04 +0100
Subject: [PATCH 2/3] Convert auto-mode-interpreter-regexp to an rx form

* lisp/files.el: Explicitly require rx even though the macros are
autoloaded, since files.el is loaded during bootstrap.
(auto-mode-interpreter-regexp): Rewrite using rx.  A subsequent patch
will add support for env's -S/--split-string argument, which will
complicate the pattern past my personal threshold for bare regexps.
Allow multiple spaces between #!, interpreter and first argument:
empirically, Linux's execve allows it.
---
 lisp/files.el | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/lisp/files.el b/lisp/files.el
index 3d838cd3b8c..dc301bea3c5 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -30,6 +30,7 @@
 
 (eval-when-compile
   (require 'pcase)
+  (require 'rx)
   (require 'easy-mmode)) ; For `define-minor-mode'.
 
 (defvar font-lock-keywords)
@@ -3245,8 +3246,14 @@ inhibit-local-variables-p
     temp))
 
 (defvar auto-mode-interpreter-regexp
-  (purecopy "#![ \t]?\\([^ \t\n]*\
-/bin/env[ \t]\\)?\\([^ \t\n]+\\)")
+  (purecopy
+   (rx-let ((ascii-blank (any " \t"))
+            (non-blank (not (any " \t\n"))))
+     (rx "#!"
+         (* ascii-blank)
+         (? (group (* non-blank) "/bin/env"
+                   (* ascii-blank)))
+         (group (+ non-blank)))))
   "Regexp matching interpreters, for file mode determination.
 This regular expression is matched against the first line of a file
 to determine the file's mode in `set-auto-mode'.  If it matches, the file
-- 
2.42.1


[-- Attachment #4: 0003-Recognize-shebang-lines-that-pass-S-split-string-to-.patch --]
[-- Type: text/x-patch, Size: 2049 bytes --]

From 0287f84a3ab6b767cc99b91356a96f2162c6a099 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?K=C3=A9vin=20Le=20Gouguec?= <kevin.legouguec@gmail.com>
Date: Sun, 12 Nov 2023 17:46:34 +0100
Subject: [PATCH 3/3] Recognize shebang lines that pass -S/--split-string to
 env

* lisp/files.el (auto-mode-interpreter-regexp): Add optional -S switch
to the ignored group capturing the env invocation.
* test/lisp/files-tests.el (files-tests-auto-mode-interpreter): Add a
couple of test cases; one from (info "(coreutils) env invocation"), the
other from a personal project.
---
 lisp/files.el            | 4 +++-
 test/lisp/files-tests.el | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/lisp/files.el b/lisp/files.el
index dc301bea3c5..56bdcf9d08b 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -3252,7 +3252,9 @@ auto-mode-interpreter-regexp
      (rx "#!"
          (* ascii-blank)
          (? (group (* non-blank) "/bin/env"
-                   (* ascii-blank)))
+                   (* ascii-blank)
+                   (? (or (: "-S" (* ascii-blank))
+                          (: "--split-string" (or ?= (* ascii-blank)))))))
          (group (+ non-blank)))))
   "Regexp matching interpreters, for file mode determination.
 This regular expression is matched against the first line of a file
diff --git a/test/lisp/files-tests.el b/test/lisp/files-tests.el
index 233efded945..3e499fff468 100644
--- a/test/lisp/files-tests.el
+++ b/test/lisp/files-tests.el
@@ -1677,6 +1677,8 @@ files-tests-auto-mode-interpreter
   (files-tests--check-shebang "#!/usr/bin/env bash" 'sh-mode)
   (files-tests--check-shebang "#!/usr/bin/env python" 'python-base-mode)
   (files-tests--check-shebang "#!/usr/bin/env python3" 'python-base-mode)
+  (files-tests--check-shebang "#!/usr/bin/env -S awk -v FS=\"\\t\" -v OFS=\"\\t\" -f" 'awk-mode)
+  (files-tests--check-shebang "#!/usr/bin/env -S make -f" 'makefile-mode)
   (files-tests--check-shebang "#!/usr/bin/make -f" 'makefile-mode))
 
 (ert-deftest files-test-dir-locals-auto-mode-alist ()
-- 
2.42.1

