From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: =?UTF-8?Q?K=C3=A9vin?= Le Gouguec <kevin.legouguec@gmail.com>
Newsgroups: gmane.emacs.bugs
Subject: bug#64939: 30.0.50;
 The default auto-mode-interpreter-regexp does not match env with flags
Date: Sat, 10 Feb 2024 11:23:01 +0100
Message-ID: <871q9kvcsa.fsf_-_@gmail.com>
References: <87mszebgwy.fsf@gmail.com>
 <CAAAQmVYs-Chasez9Ou3p+_TrDHa_U2D-QF-tGyK9TfWjE4ogXA@mail.gmail.com>
 <86a5o8vi5d.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
	logging-data="24613"; mail-complaints-to="usenet@ciao.gmane.io"
User-Agent: Gnus/5.13 (Gnus v5.13)
Cc: Wilhelm Kirschbaum <wkirschbaum@gmail.com>,
 Malcolm Cook <malcolm.cook@gmail.com>, 64939@debbugs.gnu.org
To: Eli Zaretskii <eliz@gnu.org>
Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Sat Feb 10 11:24:16 2024
Return-path: <bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org>
Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
	by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(Exim 4.92)
	(envelope-from <bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org>)
	id 1rYkWs-0006AP-A0
	for geb-bug-gnu-emacs@m.gmane-mx.org; Sat, 10 Feb 2024 11:24:14 +0100
Original-Received: from localhost ([::1] helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <bug-gnu-emacs-bounces@gnu.org>)
	id 1rYkWT-0006rG-3H; Sat, 10 Feb 2024 05:23:49 -0500
Original-Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <Debian-debbugs@debbugs.gnu.org>)
 id 1rYkWR-0006qk-55
 for bug-gnu-emacs@gnu.org; Sat, 10 Feb 2024 05:23:47 -0500
Original-Received: from debbugs.gnu.org ([2001:470:142:5::43])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from <Debian-debbugs@debbugs.gnu.org>)
 id 1rYkWQ-0002Qb-TC
 for bug-gnu-emacs@gnu.org; Sat, 10 Feb 2024 05:23:46 -0500
Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2)
 (envelope-from <Debian-debbugs@debbugs.gnu.org>) id 1rYkWg-00033x-CJ
 for bug-gnu-emacs@gnu.org; Sat, 10 Feb 2024 05:24:02 -0500
X-Loop: help-debbugs@gnu.org
Resent-From: =?UTF-8?Q?K=C3=A9vin?= Le Gouguec <kevin.legouguec@gmail.com>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
Resent-CC: bug-gnu-emacs@gnu.org
Resent-Date: Sat, 10 Feb 2024 10:24:02 +0000
Resent-Message-ID: <handler.64939.B64939.170756060811697@debbugs.gnu.org>
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 64939
X-GNU-PR-Package: emacs
Original-Received: via spool by 64939-submit@debbugs.gnu.org id=B64939.170756060811697
 (code B ref 64939); Sat, 10 Feb 2024 10:24:02 +0000
Original-Received: (at 64939) by debbugs.gnu.org; 10 Feb 2024 10:23:28 +0000
Original-Received: from localhost ([127.0.0.1]:58291 helo=debbugs.gnu.org)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
 id 1rYkW7-00032b-Eq
 for submit@debbugs.gnu.org; Sat, 10 Feb 2024 05:23:28 -0500
Original-Received: from mail-wr1-x42d.google.com ([2a00:1450:4864:20::42d]:56541)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <kevin.legouguec@gmail.com>) id 1rYkW5-00032G-4H
 for 64939@debbugs.gnu.org; Sat, 10 Feb 2024 05:23:25 -0500
Original-Received: by mail-wr1-x42d.google.com with SMTP id
 ffacd0b85a97d-33b66883de9so1000762f8f.0
 for <64939@debbugs.gnu.org>; Sat, 10 Feb 2024 02:23:09 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20230601; t=1707560583; x=1708165383; darn=debbugs.gnu.org;
 h=content-transfer-encoding:mime-version:user-agent:message-id:date
 :references:in-reply-to:subject:cc:to:from:from:to:cc:subject:date
 :message-id:reply-to;
 bh=8zbzcON2v52mFMzP/xpdqqqONRHMS3fTPiZXY8IeMGg=;
 b=UpLyWYwDdqJaIhxtNnI2cRZMcKj1F9eMcqTqhnrqfHHYD42+PvauI7X4vldU/cjACU
 TzAGHAqSftZGayWNdlsDDg+g1n2/RcrY3tuHXNP4H5dLqCswLYXNR2LMMswFd7NQVTuN
 xpNX1d5lOD2HTwZyRQxoY5BcnmBiuAd3hLyTnJQuQ1KnAT0zUgkkCANcpsYjwObKjkGm
 NKqccnh+LtFKbZ8ZvohlRwCthe2hLDIP1GMDHFUkWZU20F9PwKSH9uz2ON5iTTvMgklh
 bmer+n/sT3CXC1Hp+fc9LIFnKG7E3aczzO21PeiGkpRY2xzeAMcIqbJghH1Wiz5YQDmi
 Ya8Q==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1707560583; x=1708165383;
 h=content-transfer-encoding:mime-version:user-agent:message-id:date
 :references:in-reply-to:subject:cc:to:from:x-gm-message-state:from
 :to:cc:subject:date:message-id:reply-to;
 bh=8zbzcON2v52mFMzP/xpdqqqONRHMS3fTPiZXY8IeMGg=;
 b=TE6OIKPTrUOFpsqiLQA19S/12i/PgYDhAe2dhD78UPRCLEnRbQHfEJjoAzKnAowJ6U
 1zkVQayuGTRYKNnhFcdcogWDnNcqXVLb6vqaZx7U9GHJtRrelxZPQYTHls/r+LbQVCUD
 CHQca+kTqagiKUoxK++7TmGj+WpoZOQS1saW66UjgPTc//jrq38vHteBR2VPsJfta1kL
 DEtWJlZLC8UhtofZbeZQSNg0nKrooFNN9Cdec4UyKkqCkgPNdTqRfttDjLDS0rvcgG+O
 v2LizDrFFL3gsPNCuVF7TgjXh7+6jrt0kndORqRJN3MSgcYbzlEYbbOEvN5ZNY492EfI
 WKFQ==
X-Gm-Message-State: AOJu0Yyiq1QT80rSGeQ35KvcBtB0aZtDTSctqHWV05mu7IEeulZQ4VEc
 SFPwKr4PZZSTdzUi36K6mIIf36qUJ0CnAS1zbwrwwrAN82MsivpTiYKQTs+I
X-Google-Smtp-Source: AGHT+IGe6XRurc1hnXardzeoijsuupvNcTLZIMe70vnlXsFtK5qTdRom8OUw2MRtQ4UOayrEAFQ2VQ==
X-Received: by 2002:a5d:46c4:0:b0:33b:4d5e:ace4 with SMTP id
 g4-20020a5d46c4000000b0033b4d5eace4mr1136613wrs.36.1707560583167; 
 Sat, 10 Feb 2024 02:23:03 -0800 (PST)
X-Forwarded-Encrypted: i=1;
 AJvYcCWOYF0Hs/+L8VEPU8OeHJyv/yK6M6Hp8CxdhxC9+XbKNFi/MzBRqvZpx19fy2Ug2RwkBKG7n+9D7uvY3w/SdK3IT+3tYQe1hqMw2Z/aXWv9ABkaZhRPPa1d0z+pw5M=
Original-Received: from amdahl30 ([2a01:e0a:253:fe0:2ef0:5dff:fed2:7b49])
 by smtp.gmail.com with ESMTPSA id
 p11-20020a5d68cb000000b0033b66c2d61esm1488528wrw.48.2024.02.10.02.23.02
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Sat, 10 Feb 2024 02:23:02 -0800 (PST)
In-Reply-To: <86a5o8vi5d.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 10 Feb
 2024 10:27:10 +0200")
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-BeenThere: bug-gnu-emacs@gnu.org
List-Id: "Bug reports for GNU Emacs,
 the Swiss army knife of text editors" <bug-gnu-emacs.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/bug-gnu-emacs>,
 <mailto:bug-gnu-emacs-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/bug-gnu-emacs>
List-Post: <mailto:bug-gnu-emacs@gnu.org>
List-Help: <mailto:bug-gnu-emacs-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/bug-gnu-emacs>,
 <mailto:bug-gnu-emacs-request@gnu.org?subject=subscribe>
Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org
Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org
Xref: news.gmane.io gmane.emacs.bugs:279741
Archived-At: <http://permalink.gmane.org/gmane.emacs.bugs/279741>

Thanks for the CC; this report had completely slipped past my notice
when I worked on bug#66902, and so had Malcolm's follow-ups.

Boldly adding Wilhelm as well, since I am not 100% sure Debbugs sends a
copy of every message in a report to its OP.

Comments below.

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Malcolm Cook <malcolm.cook@gmail.com>
>> Date: Thu, 1 Feb 2024 12:52:39 -0600
>>
>> Regarding [1] allowing emacs to recognize shebang lines containing
>> calls to /bin/env with options (such as -S as allowed in new core
>> utils [2])...
>>
>> I prefer allowing the proposed "shy" regexp to match zero or more
>> times (using a '*' instead of '?').
>>
>> To wit, I have this now in my init.el:
>>
>> (setq auto-mode-interpreter-regexp
>>       ;; Support shebang line calling `/bin/env` with `-S` (and/or
>>       ;; other options).
>>       ;; c.f. https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64939
>>       (purecopy "#![ \t]?\\([^ \t\n]*\
>> /bin/env[ \t]\\)?\\(?:-\\{1,2\\}[a-zA-Z1-9=]+[ \t]+\\)*\\([^ \t\n]+\\)"))

IIUC this would be a laxer variant of what we installed for
bug#66902; can you confirm, Malcolm?  This is what the current regexp
looks like on the master branch:

  (purecopy
   (concat
    "#![ \t]*"
    ;; Optional group 1: env(1) invocation.
    "\\("
    "[^ \t\n]*/bin/env[ \t]*"
    "\\(?:-S[ \t]*\\|--split-string\\(?:=\\|[ \t]*\\)\\)?"
    "\\)?"
    ;; Group 2: interpreter.
    "\\([^ \t\n]+\\)"))

And the corresponding test cases:

(ert-deftest files-tests-auto-mode-interpreter ()
  "Test that `set-auto-mode' deduces correct modes from shebangs."
  (files-tests--check-shebang "#!/bin/bash" 'sh-mode)
  (files-tests--check-shebang "#!/usr/bin/env bash" 'sh-mode)
  (files-tests--check-shebang "#!/usr/bin/env python" 'python-base-mode)
  (files-tests--check-shebang "#!/usr/bin/env python3" 'python-base-mode)
  (files-tests--check-shebang "#!/usr/bin/env -S awk -v FS=\"\\t\" -v OFS=\"\\t\" -f" 'awk-mode)
  (files-tests--check-shebang "#!/usr/bin/env -S make -f" 'makefile-mode)
  (files-tests--check-shebang "#!/usr/bin/make -f" 'makefile-mode))
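
To make the gap concrete, here is a minimal sketch that evaluates a
local copy of the master-branch regexp quoted above against two
shebangs with `string-match'.  The `my/' names are hypothetical
helpers for illustration only; they are not part of files.el or of the
test suite.

```elisp
;; Local copy of the master-branch regexp quoted above.
;; `my/shebang-regexp' is a hypothetical name, not an Emacs variable.
(defvar my/shebang-regexp
  (concat
   "#![ \t]*"
   ;; Optional group 1: env(1) invocation.
   "\\("
   "[^ \t\n]*/bin/env[ \t]*"
   "\\(?:-S[ \t]*\\|--split-string\\(?:=\\|[ \t]*\\)\\)?"
   "\\)?"
   ;; Group 2: interpreter.
   "\\([^ \t\n]+\\)"))

(defun my/shebang-interpreter (line)
  "Return what group 2 of `my/shebang-regexp' captures in LINE, or nil."
  (when (string-match my/shebang-regexp line)
    (match-string 2 line)))

;; A bare -S is skipped, so group 2 is the interpreter.
(my/shebang-interpreter "#!/usr/bin/env -S make -f")
;; => "make"

;; A combined short-option cluster is not skipped, so group 2
;; captures the switch instead of the interpreter.
(my/shebang-interpreter "#!/usr/bin/env -vS -uFOOBAR bash -eux")
;; => "-vS"
```

The second call uses the same shebang as the demo script in the first
footnote, which is the kind of line the current regexp does not cover.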

Is this Good Enough™ for your purposes (Malcolm, Wilhelm), or should we
refine the regexp further?  FWIW, in no particular order:

(a) env(1) does seem to support mixing arbitrary options with -S¹, so
    in principle it would make sense to support that;

(b) Eli did not seem too fond of the regexp hammer², so I don't know
    which direction we'd want to go: maximally correct (accept
    all arguments, _as long as_ -S|--split-string is in there) or good
    enough (just skip over --everything --that --looks --like -a
    --switch);

(c) the "maximally correct" regexp might not be _that_ ugly, since
    "-[v]S[OPTION]" must be the *first* token after env; in other words,
    there is no need to support --some-option --split-string --more-options.
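
As a rough illustration of (c), and not a proposed patch, the sketch
below requires the split-string switch to be the first token after
env, accepts a short-option cluster ending in S, and only then skips
further tokens that look like switches.  The `my/' names are
hypothetical, and the sketch assumes no env option is followed by a
separate argument (e.g. "-u NAME") and no NAME=VALUE assignment
precedes the interpreter; both would need more work.

```elisp
;; Hypothetical regexp for illustration; not what is installed on master.
(defvar my/env-split-regexp
  (concat
   "#![ \t]*"
   ;; Optional group 1: env(1) invocation.
   "\\("
   "[^ \t\n]*/bin/env[ \t]*"
   "\\(?:"
   ;; The split-string switch must come first: "-S", a short cluster
   ;; ending in S such as "-vS", or "--split-string" with optional "=".
   "\\(?:-[a-zA-Z]*S\\|--split-string=?\\)[ \t]*"
   ;; Only after that switch, skip tokens that look like switches.
   "\\(?:-[^ \t\n]+[ \t]+\\)*"
   "\\)?"
   "\\)?"
   ;; Group 2: interpreter.
   "\\([^ \t\n]+\\)"))

(defun my/env-split-interpreter (line)
  "Return what group 2 of `my/env-split-regexp' captures in LINE, or nil."
  ;; Match case-sensitively so that only an uppercase S counts.
  (let ((case-fold-search nil))
    (when (string-match my/env-split-regexp line)
      (match-string 2 line))))

(my/env-split-interpreter "#!/usr/bin/env -vS -uFOOBAR bash -eux")
;; => "bash"
(my/env-split-interpreter "#!/usr/bin/env -S make -f")
;; => "make"
(my/env-split-interpreter "#!/usr/bin/env bash")
;; => "bash"
```

Without a leading -S cluster nothing is skipped, which matches the
"as long as -S|--split-string is in there" reading in (b).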

>> [1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64939
>> [2] https://www.gnu.org/software/coreutils/manual/html_node/env-invocation.html#env-invocation
>>
>> YMMV?
>
> Kevin, any comments about the proposals in this bug report?

Comments above; footnotes below.  Again, thanks for the heads up.

¹ $ cat demo.sh
  #!/usr/bin/env -vS -uFOOBAR bash -eux

  echo hi
  echo $FOOBAR
  echo bye
  $ FOOBAR=totally-set ./demo.sh
  split -S:  ‘ -uFOOBAR bash -eux’
   into:    ‘-uFOOBAR’
       &    ‘bash’
       &    ‘-eux’
  unset:    FOOBAR
  executing: bash
     arg[0]= ‘bash’
     arg[1]= ‘-eux’
     arg[2]= ‘./demo.sh’
  + echo hi
  hi
  ./demo.sh: line 4: FOOBAR: unbound variable

² https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64939#14