From: Kévin Le Gouguec
Newsgroups: gmane.emacs.bugs
Subject: bug#64939: 30.0.50; The default auto-mode-interpreter-regexp does not match env with flags
Date: Sat, 10 Feb 2024 18:08:18 +0100
Message-ID: <87v86wtfgd.fsf@gmail.com>
References: <87mszebgwy.fsf@gmail.com> <86a5o8vi5d.fsf@gnu.org> <871q9kvcsa.fsf_-_@gmail.com>
In-Reply-To: <871q9kvcsa.fsf_-_@gmail.com> ("Kévin Le Gouguec"'s message of "Sat, 10 Feb 2024 11:23:01 +0100")
To: Eli Zaretskii
Cc: Wilhelm Kirschbaum, Malcolm Cook, 64939@debbugs.gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13)
Xref: news.gmane.io gmane.emacs.bugs:279762
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain; charset=utf-8

Kévin Le Gouguec writes:

> Is this Good Enough™ for your purposes (Malcolm, Wilhelm), or should we
> sophisticate the regexp further?  FWIW, in no particular order:
>
> (a) env(1) does seem to support mixing up arbitrary options with -S¹, so
>     in principle it would make sense to support that;
>
> (b) Eli did not seem too fond of the regexp hammer², so I don't know
>     which direction we'd want to go between maximally correct (accept
>     all arguments, _as long as_ -S|--split-string is in there) or good
>     enough (just skip over --everything --that --looks --like -a
>     --switch).
>
> (c) FWIW the "maximally correct" regexp might not be _that_ ugly, since
>     "-[v]S[OPTION]" must be the *first* token after env; in other words
>     no need to support --some-option --split-string --more-options.

Well, sorry, couldn't resist.  How do the attached patches look?  The
new testcases should tell the whole story.

('make && make -C test files-tests' seems none the worse for wear)

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=0001-Refine-shebang-tests-bug-64939.patch

From 0a3dcfa9d5859c8a7ecd4679b748298b0f5d3597 Mon Sep 17 00:00:00 2001
From: Kévin Le Gouguec
Date: Sat, 10 Feb 2024 16:14:08 +0100
Subject: [PATCH 1/3] Refine shebang tests (bug#64939)

* test/lisp/files-tests.el (files-tests--check-shebang): For
shell-script modes, verify that the correct shell is set.
(files-tests-auto-mode-interpreter): Prefer sh-base-mode to sh-mode to
stay tree-sitter-agnostic; re-organize test cases to make future ones
easier to add.
---
 test/lisp/files-tests.el | 45 ++++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 18 deletions(-)

diff --git a/test/lisp/files-tests.el b/test/lisp/files-tests.el
index 718ecd51f8b..23516ff0d7d 100644
--- a/test/lisp/files-tests.el
+++ b/test/lisp/files-tests.el
@@ -1656,30 +1656,39 @@ files-tests-file-name-base
   (should (equal (file-name-base "foo") "foo"))
   (should (equal (file-name-base "foo/bar") "bar")))
 
-(defun files-tests--check-shebang (shebang expected-mode)
-  "Assert that mode for SHEBANG derives from EXPECTED-MODE."
-  (let ((actual-mode
-         (ert-with-temp-file script-file
-           :text shebang
-           (find-file script-file)
-           (if (derived-mode-p expected-mode)
-               expected-mode
-             major-mode))))
-    ;; Tuck all the information we need in the `should' form: input
-    ;; shebang, expected mode vs actual.
-    (should
-     (equal (list shebang actual-mode)
-            (list shebang expected-mode)))))
+(defvar sh-shell)
+
+(defun files-tests--check-shebang (shebang expected-mode &optional expected-dialect)
+  "Assert that mode for SHEBANG derives from EXPECTED-MODE.
+
+If EXPECTED-MODE is sh-base-mode, DIALECT says what `sh-shell' should be
+set to."
+  (ert-with-temp-file script-file
+    :text shebang
+    (find-file script-file)
+    (let ((actual-mode (if (derived-mode-p expected-mode)
+                           expected-mode
+                         major-mode)))
+      ;; Tuck all the information we need in the `should' form: input
+      ;; shebang, expected mode vs actual.
+      (should
+       (equal (list shebang actual-mode)
+              (list shebang expected-mode)))
+      (when (eq expected-mode 'sh-base-mode)
+        (should (eq sh-shell expected-dialect))))))
 
 (ert-deftest files-tests-auto-mode-interpreter ()
   "Test that `set-auto-mode' deduces correct modes from shebangs."
-  (files-tests--check-shebang "#!/bin/bash" 'sh-mode)
-  (files-tests--check-shebang "#!/usr/bin/env bash" 'sh-mode)
+  ;; Straightforward interpreter invocation.
+  (files-tests--check-shebang "#!/bin/bash" 'sh-base-mode 'bash)
+  (files-tests--check-shebang "#!/usr/bin/make -f" 'makefile-mode)
+  ;; Invocation through env.
+  (files-tests--check-shebang "#!/usr/bin/env bash" 'sh-base-mode 'bash)
   (files-tests--check-shebang "#!/usr/bin/env python" 'python-base-mode)
   (files-tests--check-shebang "#!/usr/bin/env python3" 'python-base-mode)
+  ;; Invocation through env, with supplementary arguments.
   (files-tests--check-shebang "#!/usr/bin/env -S awk -v FS=\"\\t\" -v OFS=\"\\t\" -f" 'awk-mode)
-  (files-tests--check-shebang "#!/usr/bin/env -S make -f" 'makefile-mode)
-  (files-tests--check-shebang "#!/usr/bin/make -f" 'makefile-mode))
+  (files-tests--check-shebang "#!/usr/bin/env -S make -f" 'makefile-mode))
 
 (ert-deftest files-test-dir-locals-auto-mode-alist ()
   "Test an `auto-mode-alist' entry in `.dir-locals.el'"
-- 
2.43.0


--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=0002-Support-more-complex-env-invocations-in-shebang-line.patch

From ec011e258fa3a7697dad631d82bbd53e5bc93f50 Mon Sep 17 00:00:00 2001
From: Kévin Le Gouguec
Date: Sat, 10 Feb 2024 17:37:35 +0100
Subject: [PATCH 2/3] Support more complex env invocations in shebang lines

This is not an exact re-implementation of what env accepts, but
hopefully it should be "good enough".

Example of known limitation: we assume that arguments for
--long-options will be set with '=', but that is not necessarily the
case.  '--unset' (mandatory argument) can be passed as '--unset=VAR'
or '--unset VAR', but '--default-signal' (optional argument) requires
an '=' sign.

For bug#64939.

* lisp/files.el (auto-mode-interpreter-regexp): Account for
supplementary arguments passed beside -S/--split-string.
* test/lisp/files-tests.el (files-tests-auto-mode-interpreter): Test
some of these combinations.
---
 lisp/files.el            | 8 +++++++-
 test/lisp/files-tests.el | 8 +++++++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/lisp/files.el b/lisp/files.el
index f67b650cb92..5098d49048e 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -3274,7 +3274,13 @@ auto-mode-interpreter-regexp
     ;; Optional group 1: env(1) invocation.
     "\\("
     "[^ \t\n]*/bin/env[ \t]*"
-    "\\(?:-S[ \t]*\\|--split-string\\(?:=\\|[ \t]*\\)\\)?"
+    ;; Within group 1: possible -S/--split-string.
+    "\\(?:"
+    ;; -S/--split-string
+    "\\(?:-[0a-z]*S[ \t]*\\|--split-string=\\)"
+    ;; More env arguments.
+    "\\(?:-[^ \t\n]+[ \t]+\\)*"
+    "\\)?"
     "\\)?"
     ;; Group 2: interpreter.
     "\\([^ \t\n]+\\)"))
diff --git a/test/lisp/files-tests.el b/test/lisp/files-tests.el
index 23516ff0d7d..0a5c3b897e4 100644
--- a/test/lisp/files-tests.el
+++ b/test/lisp/files-tests.el
@@ -1687,8 +1687,14 @@ files-tests-auto-mode-interpreter
   (files-tests--check-shebang "#!/usr/bin/env python" 'python-base-mode)
   (files-tests--check-shebang "#!/usr/bin/env python3" 'python-base-mode)
   ;; Invocation through env, with supplementary arguments.
+  (files-tests--check-shebang "#!/usr/bin/env --split-string=bash -eux" 'sh-base-mode 'bash)
+  (files-tests--check-shebang "#!/usr/bin/env --split-string=-iv --default-signal bash -eux" 'sh-base-mode 'bash)
   (files-tests--check-shebang "#!/usr/bin/env -S awk -v FS=\"\\t\" -v OFS=\"\\t\" -f" 'awk-mode)
-  (files-tests--check-shebang "#!/usr/bin/env -S make -f" 'makefile-mode))
+  (files-tests--check-shebang "#!/usr/bin/env -S make -f" 'makefile-mode)
+  (files-tests--check-shebang "#!/usr/bin/env -S-vi bash -eux" 'sh-base-mode 'bash)
+  (files-tests--check-shebang "#!/usr/bin/env -ivS --default-signal=INT bash -eux" 'sh-base-mode 'bash)
+  (files-tests--check-shebang "#!/usr/bin/env -ivS --default-signal bash -eux" 'sh-base-mode 'bash)
+  (files-tests--check-shebang "#!/usr/bin/env -vS -uFOOBAR bash -eux" 'sh-base-mode 'bash))
 
 (ert-deftest files-test-dir-locals-auto-mode-alist ()
   "Test an `auto-mode-alist' entry in `.dir-locals.el'"
-- 
2.43.0


--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=0003-Support-shebang-lines-with-amended-environment.patch

From 9e24e7f000011105a89e5ce81cd9f16eb9ef15b5 Mon Sep 17 00:00:00 2001
From: Kévin Le Gouguec
Date: Sat, 10 Feb 2024 17:56:57 +0100
Subject: [PATCH 3/3] Support shebang lines with amended environment

For bug#64939.

* lisp/files.el (auto-mode-interpreter-regexp): Account for possible
VARIABLE=[VALUE] operands.
* test/lisp/files-tests.el (files-tests-auto-mode-interpreter): Add an
example from the coreutils manual.
---
 lisp/files.el            | 5 ++++-
 test/lisp/files-tests.el | 4 +++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/lisp/files.el b/lisp/files.el
index 5098d49048e..524385edc84 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -3274,12 +3274,15 @@ auto-mode-interpreter-regexp
     ;; Optional group 1: env(1) invocation.
     "\\("
     "[^ \t\n]*/bin/env[ \t]*"
-    ;; Within group 1: possible -S/--split-string.
+    ;; Within group 1: possible -S/--split-string and environment
+    ;; adjustments.
     "\\(?:"
     ;; -S/--split-string
     "\\(?:-[0a-z]*S[ \t]*\\|--split-string=\\)"
     ;; More env arguments.
     "\\(?:-[^ \t\n]+[ \t]+\\)*"
+    ;; Interpreter environment modifications.
+    "\\(?:[^ \t\n]+=[^ \t\n]*[ \t]+\\)*"
     "\\)?"
     "\\)?"
     ;; Group 2: interpreter.
diff --git a/test/lisp/files-tests.el b/test/lisp/files-tests.el
index 0a5c3b897e4..d4c1ef3ba67 100644
--- a/test/lisp/files-tests.el
+++ b/test/lisp/files-tests.el
@@ -1694,7 +1694,9 @@ files-tests-auto-mode-interpreter
   (files-tests--check-shebang "#!/usr/bin/env -S-vi bash -eux" 'sh-base-mode 'bash)
   (files-tests--check-shebang "#!/usr/bin/env -ivS --default-signal=INT bash -eux" 'sh-base-mode 'bash)
   (files-tests--check-shebang "#!/usr/bin/env -ivS --default-signal bash -eux" 'sh-base-mode 'bash)
-  (files-tests--check-shebang "#!/usr/bin/env -vS -uFOOBAR bash -eux" 'sh-base-mode 'bash))
+  (files-tests--check-shebang "#!/usr/bin/env -vS -uFOOBAR bash -eux" 'sh-base-mode 'bash)
+  ;; Invocation through env, with modified environment.
+  (files-tests--check-shebang "#!/usr/bin/env -S PYTHONPATH=/...:${PYTHONPATH} python" 'python-base-mode))
 
 (ert-deftest files-test-dir-locals-auto-mode-alist ()
   "Test an `auto-mode-alist' entry in `.dir-locals.el'"
-- 
2.43.0


--=-=-=--
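
For anyone who wants to poke at the regexp outside of the test suite, here is a minimal sketch that reassembles the pieces from patches 2 and 3 and pulls out the interpreter (group 2).  The names my/shebang-regexp and my/shebang-interpreter are hypothetical, and the leading "#![ \t]*" is an assumption, since the hunks only show the tail of auto-mode-interpreter-regexp.  Stripping the directory with file-name-nondirectory is likewise just a convenience for this sketch.

```elisp
;; Sketch only: hypothetical names, not part of the patches.
(defvar my/shebang-regexp
  (concat
   ;; ASSUMPTION: shebang prefix; not visible in the hunks.
   "#![ \t]*"
   ;; Optional group 1: env(1) invocation.
   "\\("
   "[^ \t\n]*/bin/env[ \t]*"
   "\\(?:"
   ;; -S/--split-string, possibly bundled with other short flags.
   "\\(?:-[0a-z]*S[ \t]*\\|--split-string=\\)"
   ;; More env arguments: anything that looks like a switch.
   "\\(?:-[^ \t\n]+[ \t]+\\)*"
   ;; VARIABLE=[VALUE] operands.
   "\\(?:[^ \t\n]+=[^ \t\n]*[ \t]+\\)*"
   "\\)?"
   "\\)?"
   ;; Group 2: interpreter.
   "\\([^ \t\n]+\\)")
  "Copy of the regexp pieces from patches 2 and 3, for experimentation.")

(defun my/shebang-interpreter (line)
  "Return the interpreter that `my/shebang-regexp' finds in LINE, or nil."
  (let ((case-fold-search nil))
    (when (string-match my/shebang-regexp line)
      (file-name-nondirectory (match-string 2 line)))))

;; A few of the shebangs from the new test cases.
(my/shebang-interpreter "#!/bin/bash")
(my/shebang-interpreter "#!/usr/bin/env -vS -uFOOBAR bash -eux")
(my/shebang-interpreter "#!/usr/bin/env -S PYTHONPATH=/...:${PYTHONPATH} python")
```

The first two calls should return "bash" and the third "python".  Evaluating (my/shebang-interpreter "#!/usr/bin/env -S --unset VAR bash") should return "VAR" rather than "bash", which illustrates the '--unset VAR' limitation mentioned in the commit message of patch 2.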