unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#55838: 29.0.50; Eshell string-split subscript indexing splits too much
@ 2022-06-08  1:36 Jim Porter
  2022-06-08  1:41 ` bug#55838: 29.0.50; [PATCH] " Jim Porter
  0 siblings, 1 reply; 5+ messages in thread
From: Jim Porter @ 2022-06-08  1:36 UTC (permalink / raw)
  To: 55838

 From "emacs -Q -f eshell":

   M-: (setq foo "a\nb:c")

   ~ $ echo $foo
   a
   b:c
   ~ $ echo $foo[: 0]
   ("a" "b")

The first command is normal, and just shows that Eshell outputs the 
string with no manipulation. In the second command, we split the string 
on ":" and get the 0th element. However, that gets split *again* (on 
newlines) and returns a list.

I think this is overly aggressive. It's due to `eshell-apply-indices' 
calling `eshell-convert' on the split element(s) of the string. However, 
`eshell-convert' is primarily designed to turn output from external 
command line programs into a Lispy form (so it splits by line to make a 
list, among other things). This would normally happen when doing 
something like this:

   ~ $ echo ${cat some-file.txt}
   ("line 1" "line 2" ...)

In the original case above, I think the split-subscript operator [: 0] 
should only be doing the one thing the user requested: split on ":" and 
get the 0th element.
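
The double split described above can be reproduced in plain Emacs Lisp,
without Eshell. This is only a sketch: `my/double-split-demo' is a
hypothetical helper, and the second `split-string' call merely
approximates the line-splitting part of what `eshell-convert' does.

```elisp
;; Hypothetical helper, not part of Eshell.
(defun my/double-split-demo (string separator index)
  "Split STRING on SEPARATOR, take element INDEX, then split it by line.
Return a list (ELEMENT RESPLIT), where ELEMENT is what the user asked
for and RESPLIT models the extra, unrequested split on newlines."
  (let* ((element (nth index (split-string string separator)))
         (resplit (split-string element "\n")))
    (list element resplit)))

(my/double-split-demo "a\nb:c" ":" 0)
;; => ("a\nb" ("a" "b"))
```

The first element of the result is what `$foo[: 0]' should arguably
produce; the second is what it produced at the time of this report.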

Patch forthcoming momentarily. Just getting a bug number.






* bug#55838: 29.0.50; [PATCH] Eshell string-split subscript indexing splits too much
  2022-06-08  1:36 bug#55838: 29.0.50; Eshell string-split subscript indexing splits too much Jim Porter
@ 2022-06-08  1:41 ` Jim Porter
  2022-06-08 12:11   ` bug#55838: 29.0.50; " Lars Ingebrigtsen
  2022-06-08 13:38   ` bug#55838: 29.0.50; [PATCH] " Eli Zaretskii
  0 siblings, 2 replies; 5+ messages in thread
From: Jim Porter @ 2022-06-08  1:41 UTC (permalink / raw)
  To: 55838

[-- Attachment #1: Type: text/plain, Size: 1174 bytes --]

On 6/7/2022 6:36 PM, Jim Porter wrote:
>  From "emacs -Q -f eshell":
> 
>    M-: (setq foo "a\nb:c")
> 
>    ~ $ echo $foo
>    a
>    b:c
>    ~ $ echo $foo[: 0]
>    ("a" "b")
> 
> The first command is normal, and just shows that Eshell outputs the 
> string with no manipulation. In the second command, we split the string 
> on ":" and get the 0th element. However, that gets split *again* (on 
> newlines) and returns a list.

Here's a patch for this. It changes the behavior of 
`eshell-apply-indices' to use `eshell-convert-to-number' (when the 
expansion isn't wrapped in double-quotes) instead of the more-aggressive 
`eshell-convert'. I think `eshell-convert-to-number' is the right thing 
here, since Eshell already converts number-like strings to actual 
numbers in most cases.
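
As a rough model of the new behavior (not the patch itself, which is
attached below), the logic could be sketched like this. Both functions
are hypothetical, and the regexp is a simplified stand-in for however
Eshell actually recognizes number-like strings.

```elisp
;; Hypothetical helpers that only model the patched behavior.
(defun my/number-like-p (string)
  "Return non-nil if STRING looks like a plain decimal number."
  (string-match-p "\\`-?[0-9]+\\(?:\\.[0-9]+\\)?\\'" string))

(defun my/split-and-convert (value separator quoted)
  "Split VALUE on SEPARATOR.
Unless QUOTED is non-nil, turn number-like parts into numbers.
Other parts, including multiline ones, are left as strings."
  (let ((parts (split-string value separator)))
    (if quoted
        parts
      (mapcar (lambda (part)
                (if (my/number-like-p part)
                    (string-to-number part)
                  part))
              parts))))

(my/split-and-convert "000 010 020" " " nil) ; => (0 10 20)
(my/split-and-convert "000 010 020" " " t)   ; => ("000" "010" "020")
(my/split-and-convert "a\nb:c" ":" nil)      ; => ("a\nb" "c")
```

The last call shows the point of the change: a part containing a
newline is no longer split into a list.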

As a note, if you wanted the old behavior, you could do something like this:

   ~ $ echo $foo[: 0][0 1]
   ("a" "b")

There's also a suggestion in the "Bugs and ideas" section of the Eshell 
manual to add "*" as a subscript to mean "all indices", so you could do 
the above in a more generic fashion like:

   ~ $ echo $foo[: 0][*]
   ;; Doesn't currently work, but it could.

[-- Attachment #2: 0001-Don-t-split-Eshell-expansions-by-line-when-using-spl.patch --]
[-- Type: text/plain, Size: 3224 bytes --]

From 21cda1114d6269b47963c8835ee23a5bd31dcb39 Mon Sep 17 00:00:00 2001
From: Jim Porter <jporterbugs@gmail.com>
Date: Mon, 6 Jun 2022 19:53:39 -0700
Subject: [PATCH] Don't split Eshell expansions by line when using
 split-subscript operator

* lisp/eshell/esh-var.el (eshell-apply-indices): Use
'eshell-convert-to-number' instead of 'eshell-convert'.

* test/lisp/eshell/esh-var-tests.el
(esh-var-test/interp-convert-var-split-indices): Expand test
(bug#55838).
---
 lisp/eshell/esh-var.el            | 15 ++++++++-------
 test/lisp/eshell/esh-var-tests.el |  9 ++++++++-
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/lisp/eshell/esh-var.el b/lisp/eshell/esh-var.el
index 186f6358bc..27be6e1b1a 100644
--- a/lisp/eshell/esh-var.el
+++ b/lisp/eshell/esh-var.el
@@ -582,10 +582,11 @@ eshell-apply-indices
 Integers imply a direct index, and names, an associate lookup using
 `assoc'.
 
-If QUOTED is non-nil, this was invoked inside double-quotes.  This
-affects the behavior of splitting strings: without quoting, the
-split values are converted to Lisp forms via `eshell-convert'; with
-quoting, they're left as strings.
+If QUOTED is non-nil, this was invoked inside double-quotes.
+This affects the behavior of splitting strings: without quoting,
+the split values are converted to numbers via
+`eshell-convert-to-number' if possible; with quoting, they're
+left as strings.
 
 For example, to retrieve the second element of a user's record in
 '/etc/passwd', the variable reference would look like:
@@ -599,9 +600,9 @@ eshell-apply-indices
                      (not (get-text-property 0 'number index)))
             (setq separator index
                   refs (cdr refs)))
-	  (setq value
-		(mapcar (lambda (i) (eshell-convert i quoted))
-			(split-string value separator)))))
+	  (setq value (split-string value separator))
+          (unless quoted
+            (setq value (mapcar #'eshell-convert-to-number value)))))
       (cond
        ((< (length refs) 0)
 	(error "Invalid array variable index: %s"
diff --git a/test/lisp/eshell/esh-var-tests.el b/test/lisp/eshell/esh-var-tests.el
index 4e2a18861e..072cdb9b40 100644
--- a/test/lisp/eshell/esh-var-tests.el
+++ b/test/lisp/eshell/esh-var-tests.el
@@ -357,11 +357,18 @@ esh-var-test/interp-convert-var-number
 
 (ert-deftest esh-var-test/interp-convert-var-split-indices ()
   "Interpolate and convert string variable with indices"
+  ;; Check that numeric forms are converted to numbers.
   (let ((eshell-test-value "000 010 020 030 040"))
     (should (equal (eshell-test-command-result "echo $eshell-test-value[0]")
                    0))
     (should (equal (eshell-test-command-result "echo $eshell-test-value[0 2]")
-                   '(0 20)))))
+                   '(0 20))))
+  ;; Check that multiline forms are preserved as-is.
+  (let ((eshell-test-value "foo\nbar:baz\n"))
+    (should (equal (eshell-test-command-result "echo $eshell-test-value[: 0]")
+                   "foo\nbar"))
+    (should (equal (eshell-test-command-result "echo $eshell-test-value[: 1]")
+                   "baz\n"))))
 
 (ert-deftest esh-var-test/interp-convert-quoted-var-number ()
   "Interpolate numeric quoted numeric variable"
-- 
2.25.1



* bug#55838: 29.0.50; Eshell string-split subscript indexing splits too much
  2022-06-08  1:41 ` bug#55838: 29.0.50; [PATCH] " Jim Porter
@ 2022-06-08 12:11   ` Lars Ingebrigtsen
  2022-06-08 13:38   ` bug#55838: 29.0.50; [PATCH] " Eli Zaretskii
  1 sibling, 0 replies; 5+ messages in thread
From: Lars Ingebrigtsen @ 2022-06-08 12:11 UTC (permalink / raw)
  To: Jim Porter; +Cc: 55838

Jim Porter <jporterbugs@gmail.com> writes:

> Here's a patch for this. It changes the behavior of
> `eshell-apply-indices' to use `eshell-convert-to-number' (when the
> expansion isn't wrapped in double-quotes) instead of the
> more-aggressive `eshell-convert'. I think `eshell-convert-to-number'
> is the right thing here, since Eshell already converts number-like
> strings to actual numbers in most cases.

Sounds good to me; pushed to Emacs 29.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#55838: 29.0.50; [PATCH] Eshell string-split subscript indexing splits too much
  2022-06-08  1:41 ` bug#55838: 29.0.50; [PATCH] " Jim Porter
  2022-06-08 12:11   ` bug#55838: 29.0.50; " Lars Ingebrigtsen
@ 2022-06-08 13:38   ` Eli Zaretskii
  2022-06-08 23:06     ` Jim Porter
  1 sibling, 1 reply; 5+ messages in thread
From: Eli Zaretskii @ 2022-06-08 13:38 UTC (permalink / raw)
  To: Jim Porter; +Cc: 55838

> From: Jim Porter <jporterbugs@gmail.com>
> Date: Tue, 7 Jun 2022 18:41:30 -0700
> 
> > The first command is normal, and just shows that Eshell outputs the 
> > string with no manipulation. In the second command, we split the string 
> > on ":" and get the 0th element. However, that gets split *again* (on 
> > newlines) and returns a list.
> 
> Here's a patch for this. It changes the behavior of 
> `eshell-apply-indices' to use `eshell-convert-to-number' (when the 
> expansion isn't wrapped in double-quotes) instead of the more-aggressive 
> `eshell-convert'. I think `eshell-convert-to-number' is the right thing 
> here, since Eshell already converts number-like strings to actual 
> numbers in most cases.
> 
> As a note, if you wanted the old behavior, you could do something like this:
> 
>    ~ $ echo $foo[: 0][0 1]
>    ("a" "b")
> 
> There's also a suggestion in the "Bugs and ideas" section of the Eshell 
> manual to add "*" as a subscript to mean "all indices", so you could do 
> the above in a more generic fashion like:
> 
>    ~ $ echo $foo[: 0][*]
>    ;; Doesn't currently work, but it could.

I don't have any objections based on actual experience, and I don't
know what the original design goals of this feature in Eshell were.
However, please note that you are changing the behavior significantly,
and the only reason is that it doesn't make much sense to you.  I
wonder whether this is a strong enough motivation to make such
incompatible behavior changes.  Eshell is not a "normal" shell, in
that it attempts to make sense even if Lisp expressions are mixed with
Posix-ish shell features, so what may not make sense in Bash, Zsh, and
their ilk is not necessarily nonsensical in Eshell.

So maybe we should raise the bar for considering reasons for behavior
changes as valid?






* bug#55838: 29.0.50; [PATCH] Eshell string-split subscript indexing splits too much
  2022-06-08 13:38   ` bug#55838: 29.0.50; [PATCH] " Eli Zaretskii
@ 2022-06-08 23:06     ` Jim Porter
  0 siblings, 0 replies; 5+ messages in thread
From: Jim Porter @ 2022-06-08 23:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 55838

On 6/8/2022 6:38 AM, Eli Zaretskii wrote:
> I don't have any objections based on actual experience, and I don't
> know what the original design goals of this feature in Eshell were.
> However, please note that you are changing the behavior significantly,
> and the only reason is that it doesn't make much sense to you.

I probably should have elaborated a bit more on my reasoning in the 
original report. My goal with this (and other Eshell patches in this 
area) is mainly to add tests for some of the more-advanced Eshell syntax 
and also to ensure that it works as documented. There are a few cases 
where it's tricky to decide whether the code is right and the 
documentation is wrong, or vice-versa. This is one of those cases.

Here's what the Emacs 27/28 manuals have to say about this syntax (I've 
already changed/expanded this section in 29, so I'm going back to 28 to 
show what the docs said before I changed them):

   $var[i]

       Expands to the ith element of the value bound to var. If the value
       is a string, it will be split at whitespace to make it a list.
       Again, raises an error if the value is not a sequence.

   $var[: i]

       As above, but now splitting occurs at the colon character.

   $var[: i j]

       As above, but instead of returning just a string, it now returns a
       list of two strings. If the result is being interpolated into a
       larger string, this list will be flattened into one big string,
       with each element separated by a space.

I would interpret the above to mean that the only splitting that should 
happen for `$var[: i]' is with the ":". The last section says that 
`$var[: i]' returns "just a string", and `$var[: i j]' returns a list of 
two strings. However, in my example in the original message, `$foo[: 0 
1]' would return a list containing a list and a string. That's 
inconsistent with what the manual says, and in this case I think it's 
the manual that was right, and the code that wasn't.
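
To make that reading of the manual concrete, here is a small sketch in
plain Emacs Lisp. `my/index-split' is a hypothetical function that
follows the manual's wording; it is not how `eshell-apply-indices' is
implemented.

```elisp
;; Hypothetical model of the documented semantics, not Eshell code.
(defun my/index-split (value separator &rest indices)
  "Split VALUE on SEPARATOR and select INDICES.
With one index, return a single string; with several, return a
list of strings.  No further splitting is applied."
  (let ((parts (split-string value separator)))
    (if (cdr indices)
        (mapcar (lambda (i) (nth i parts)) indices)
      (nth (car indices) parts))))

(my/index-split "a\nb:c" ":" 0)   ; => "a\nb"
(my/index-split "a\nb:c" ":" 0 1) ; => ("a\nb" "c")
```

Under the old code, the second case would instead have produced a list
holding a list and a string, i.e. (("a" "b") "c").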

Note: the last sentence in the manual excerpt above is also incorrect. 
When the list is "flattened into one big string", it will look like 
'("first" "second")', not 'first second'. Unlike the original bug here, 
which people probably don't encounter very often in practice, changing 
how the list is flattened would probably cause problems for users. It's 
a really common occurrence. Something as simple as `echo a b' will 
return '("a" "b")'. This problem is also discussed in bug#12689.
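
The difference between the two flattened forms can be shown in plain
Emacs Lisp. This assumes that `format' with "%S" is a fair
approximation of the printed form Eshell ends up with; the exact
function Eshell uses internally is not shown here.

```elisp
(let ((items '("first" "second")))
  (list (format "%S" items)                 ; printed Lisp form
        (mapconcat #'identity items " ")))  ; what the manual describes
;; => ("(\"first\" \"second\")" "first second")
```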





