From: "Kévin Le Gouguec" <kevin.legouguec@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: Wilhelm Kirschbaum <wkirschbaum@gmail.com>,
	Malcolm Cook <malcolm.cook@gmail.com>,
	64939@debbugs.gnu.org
Subject: bug#64939: 30.0.50; The default auto-mode-interpreter-regexp does not match env with flags
Date: Sat, 10 Feb 2024 11:23:01 +0100
Message-ID: <871q9kvcsa.fsf_-_@gmail.com>
In-Reply-To: <86a5o8vi5d.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 10 Feb 2024 10:27:10 +0200")

Thanks for the CC, this report had completely slipped past my notice
when I worked on bug#66902, and so did Malcolm's follow-ups.

Boldly adding Wilhelm as well, since I am not 100% sure Debbugs sends a
copy of every message in a report to their OP.

Comments below.

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Malcolm Cook <malcolm.cook@gmail.com>
>> Date: Thu, 1 Feb 2024 12:52:39 -0600
>> 
>> Regarding [1] allowing emacs to recognize shebang lines containing
>> calls to /bin/env with options (such as -S as allowed in new core
>> utils [2])...
>> 
>> I prefer allowing the proposed "shy" regexp to match zero or more
>> times (using a '*' instead of '?').
>> 
>> To wit, I have this now in my init.el:
>> 
>> (setq auto-mode-interpreter-regexp
>>       ;; Support shbang line calling `/bin/env` with `-S` (and/or other options).
>>       ;; c.f. https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64939
>>       (purecopy "#![ \t]?\\([^ \t\n]*/bin/env[ \t]\\)?\\(?:-\\{1,2\\}[a-zA-Z1-9=]+[ \t]+\\)*\\([^ \t\n]+\\)"))

IIUC this would be a laxer variant of what we installed for
bug#66902; can you confirm, Malcolm?  This is what the current regexp
looks like on the master branch:

  (purecopy
   (concat
    "#![ \t]*"
    ;; Optional group 1: env(1) invocation.
    "\\("
    "[^ \t\n]*/bin/env[ \t]*"
    "\\(?:-S[ \t]*\\|--split-string\\(?:=\\|[ \t]*\\)\\)?"
    "\\)?"
    ;; Group 2: interpreter.
    "\\([^ \t\n]+\\)"))

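To make the role of the two groups concrete, here is a minimal sketch
of how the interpreter name falls out of group 2.  The names
my/interpreter-regexp and my/shebang-interpreter are made up for this
illustration; the regexp is a copy of the one quoted above, so the
snippet does not depend on which Emacs version evaluates it.

```elisp
;; Copy of the regexp quoted above, bound to a hypothetical name.
(defconst my/interpreter-regexp
  (concat
   "#![ \t]*"
   ;; Optional group 1: env(1) invocation.
   "\\("
   "[^ \t\n]*/bin/env[ \t]*"
   "\\(?:-S[ \t]*\\|--split-string\\(?:=\\|[ \t]*\\)\\)?"
   "\\)?"
   ;; Group 2: interpreter.
   "\\([^ \t\n]+\\)"))

(defun my/shebang-interpreter (line)
  "Return the interpreter name `my/interpreter-regexp' finds in LINE.
Return nil if LINE does not look like a shebang."
  (when (string-match my/interpreter-regexp line)
    ;; Group 2 may hold a full path such as /bin/bash; keep the basename.
    (file-name-nondirectory (match-string 2 line))))

(my/shebang-interpreter "#!/bin/bash")               ; => "bash"
(my/shebang-interpreter "#!/usr/bin/env -S make -f") ; => "make"
;; A shebang that combines -S with another short option is not
;; recognized: the option cluster itself lands in group 2.
(my/shebang-interpreter "#!/usr/bin/env -vS -uFOOBAR bash -eux") ; => "-vS"
```

The last call shows the gap under discussion: only a bare -S or
--split-string is skipped, so any other option ends up being treated
as the interpreter.
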
And the corresponding test cases:

(ert-deftest files-tests-auto-mode-interpreter ()
  "Test that `set-auto-mode' deduces correct modes from shebangs."
  (files-tests--check-shebang "#!/bin/bash" 'sh-mode)
  (files-tests--check-shebang "#!/usr/bin/env bash" 'sh-mode)
  (files-tests--check-shebang "#!/usr/bin/env python" 'python-base-mode)
  (files-tests--check-shebang "#!/usr/bin/env python3" 'python-base-mode)
  (files-tests--check-shebang "#!/usr/bin/env -S awk -v FS=\"\\t\" -v OFS=\"\\t\" -f" 'awk-mode)
  (files-tests--check-shebang "#!/usr/bin/env -S make -f" 'makefile-mode)
  (files-tests--check-shebang "#!/usr/bin/make -f" 'makefile-mode))

Is this Good Enough™ for your purposes (Malcolm, Wilhelm), or should we
refine the regexp further?  FWIW, in no particular order:

(a) env(1) does seem to support mixing arbitrary options with -S¹, so
    in principle it would make sense to support that;

(b) Eli did not seem too fond of the regexp hammer², so I don't know
    which direction we'd want to go between maximally correct (accept
    all arguments, _as long as_ -S|--split-string is in there) or good
    enough (just skip over --everything --that --looks --like -a
    --switch).

(c) FWIW the "maximally correct" regexp might not be _that_ ugly, since
    "-[v]S[OPTION]" must be the *first* token after env; in other words
    no need to support --some-option --split-string --more-options.
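
For the sake of discussion, the "good enough" direction from (b)
could look roughly like the sketch below: after env, skip every token
that starts with a dash.  This is not what is installed on master, and
the names my/lax-interpreter-regexp and my/lax-shebang-interpreter are
hypothetical.  It assumes options never take a separate argument (so
"-u FOOBAR" would confuse it), and it does not handle an interpreter
glued to the option, as in --split-string=bash.

```elisp
;; Hypothetical "good enough" variant, for illustration only.
(defconst my/lax-interpreter-regexp
  (concat
   "#![ \t]*"
   ;; Optional group 1: env(1) invocation plus any dash-prefixed tokens.
   "\\("
   "[^ \t\n]*/bin/env[ \t]+"
   "\\(?:-[^ \t\n]*[ \t]+\\)*"
   "\\)?"
   ;; Group 2: interpreter (same group number as the current regexp).
   "\\([^ \t\n]+\\)"))

(defun my/lax-shebang-interpreter (line)
  "Return the interpreter name `my/lax-interpreter-regexp' finds in LINE."
  (when (string-match my/lax-interpreter-regexp line)
    (file-name-nondirectory (match-string 2 line))))

(my/lax-shebang-interpreter "#!/usr/bin/env -vS -uFOOBAR bash -eux") ; => "bash"
(my/lax-shebang-interpreter "#!/usr/bin/env -S awk -v FS=\"\\t\" -f") ; => "awk"
(my/lax-shebang-interpreter "#!/usr/bin/make -f")                     ; => "make"
```

Keeping the interpreter in group 2 means code that reads (match-string
2) would not need to change; the cost is that the regexp no longer
checks that -S or --split-string is actually present.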

>> [1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64939
>> [2] https://www.gnu.org/software/coreutils/manual/html_node/env-invocation.html#env-invocation
>> 
>> YMMV?
>
> Kevin, any comments about the proposals in this bug report?

Comments above; footnotes below.  Again, thanks for the heads up.

¹ $ cat demo.sh
  #!/usr/bin/env -vS -uFOOBAR bash -eux

  echo hi
  echo $FOOBAR
  echo bye
  $ FOOBAR=totally-set ./demo.sh
  split -S:  ‘ -uFOOBAR bash -eux’
   into:    ‘-uFOOBAR’
       &    ‘bash’
       &    ‘-eux’
  unset:    FOOBAR
  executing: bash
     arg[0]= ‘bash’
     arg[1]= ‘-eux’
     arg[2]= ‘./demo.sh’
  + echo hi
  hi
  ./demo.sh: line 4: FOOBAR: unbound variable

² https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64939#14




