On 3/20/2022 12:05 AM, Eli Zaretskii wrote:
> Thank you for working on this.  See some minor comments below.

Thanks for the thorough review.

>> +Eshell's globbing syntax is very similar to that of Zsh
>> +(@pxref{Filename Generation, , , zsh, The Z Shell Manual}).  Users
>> +coming from Bash can still use Bash-style globbing, as there are no
>> +incompatibilities.  To customize the syntax and behavior of globbing
>> +in Eshell see the Customize@footnote{@xref{Easy Customization, , ,
>> +emacs, The GNU Emacs Manual}.} group ``eshell-glob''.
> 
> Let's not have hyper-links in footnotes, it makes little sense to me.
> (Yes, I know this was in the original text.)

Ok, fixed.

>> +@table @code
>> +
>> +@item *
>> +Matches any string (including the empty string).  For example,
>> +@samp{*.el} matches any file with the @file{.el} extension.
> 
> You use @code in the @table, but @samp in the body, which will look
> inconsistent in the printed version of the manual.  Please use one of
> them (I think @samp is better).

Done. I only did this for the glob section though. Should I change the 
items in the predicates/modifiers to use @samp too? They're different 
enough that I'm not quite sure.

Or would @kbd be better to use here? These are things meant to be typed 
by the user into an interactive prompt, after all...

>> +@item **
>> +Matches any number of subdirectories in a file name, including zero.
> 
> The "including zero" part is confusing: it isn't immediately clear
> whether it refers to "file name" or "any number".  I'd use "Matches
> zero or more subdirectories..." instead.

Fixed.

>> +For example, @samp{**/foo.el} matches @file{foo.el},
>> +@file{bar/foo.el}, @file{bar/baz/foo.el}, etc.
> 
> The fact that the "zero" case removes the slash as well (if it does)
> should be mentioned explicitly in the text, I think.  It is importand
> in cases like foo/**/bar (which perhaps should have an example to make
> the feature more clear).

I've updated the item to be '**/', which is more accurate, since the 
trailing '/' is really a part of the token (the Zsh manual also 
documents it this way). I also added a short note that the '**/' must be 
the entirety of a file name segment (so something like 'foo**/bar.el' is 
nonsense).

>> +@item ***
>> +As @code{**}, but follows symlinks as well.
>     ^^
> "Like" is better, I think.

Fixed.

>> +@item [ @dots{} ]
>> +Defines a @dfn{character set} (@pxref{Regexps, , , emacs, The GNU
>> +Emacs Manual}).
> 
> Every @dfn should have a @cindex entry for the term.  In this case,
> the term should be qualified, since "character set" is used in many
> different contexts.  Something like
> 
>    @cindex character set, in Eshell glob patterns

Fixed.

>>                      A character set matches any character between
>> +@samp{[} and @samp{]}
> 
> This is ambiguous: "between [ and ]" could be interpreted as
> characters that are between those in the alphabetical order.  I'd
> follow the description in the Emacs manual, which says "characters
> between the two brackets".

Fixed.

>>              You can also include ranges of characters in the set by
>> +separating the start and end with @samp{-}.  Thus, @samp{[a-z]}
>> +matches any lower-case @acronym{ASCII} letter.
> 
> It might be a good idea to mention here that, unlike with Zsh,
> character ranges are interpreted in the Unicode codepoint order, not
> in the locale-dependent collation order.  This affects stuff like
> [a-z].  Also, does case-fold-search have any effect on these matches?

Done. case-fold-search doesn't have any effect, but 
eshell-glob-case-insensitive does. I've added a bit of documentation 
about that to the manual.

>> +Additionally, you can include @dfn{character classes} in a character
> 
> Another @dfn without an index entry..

Fixed.

>> +@item [^ @dots{} ]
>> +Defines a @dfn{complemented character set}.  This behaves just like a
> 
> And another.

Fixed here (and for @dfn{group} just below this).

>> +(ert-deftest em-glob-test/match-recursive ()
>> +  "Test that \"**\" recursively matches directories."
>> +  (with-fake-files '("a.el" "b.el" "ccc.el" "d.txt" "dir/a.el" "dir/sub/a.el"
>> +                     "dir/symlink/a.el" "symlink/a.el" "symlink/sub/a.el")
>> +    (should (equal (eshell-extended-glob "**/a.el")
>> +                   '("a.el" "dir/a.el" "dir/sub/a.el")))))
>> +
>> +(ert-deftest em-glob-test/match-recursive-follow-symlinks ()
>> +  "Test that \"***\" recursively matches directories, following symlinks."
>> +  (with-fake-files '("a.el" "b.el" "ccc.el" "d.txt" "dir/a.el" "dir/sub/a.el"
>> +                     "dir/symlink/a.el" "symlink/a.el" "symlink/sub/a.el")
>> +    (should (equal (eshell-extended-glob "***/a.el")
>> +                   '("a.el" "dir/a.el" "dir/sub/a.el" "dir/symlink/a.el"
>> +                     "symlink/a.el" "symlink/sub/a.el")))))
> 
> I think symlink tests should be skipped on MS-Windows, at least by
> default (with perhaps some Make variable to activate them?).  Creating
> symlinks on Windows requires elevation of privileges, and causes the
> system to stop and pop up the elevation confirmation dialog; for some
> users it will fail even when enabled.

These tests (and the ones in em-pred-tests.el) don't actually touch the 
filesystem. I just provided mock implementations of the relevant Elisp 
functions so that the tests can pretend that the files corresponding to 
these names really exist. That's a bit less realistic, but it should 
work on all systems without errors. It should work fine on MS-Windows, 
unless there are other bugs present.

That said, if you'd prefer a different implementation (e.g. one that 
examines real files) or just more documentation about what I'm doing 
here, let me know.

>> +@node Argument Predication
>> +@section Argument Predication and Modification
>> +Eshell supports @dfn{argument predication}, to filter elements of a
>> +glob, and @dfn{argument modification}, to manipulate argument values.
>> +These are similar to glob qualifiers in Zsh (@pxref{Glob Qualifiers, ,
>> +, zsh, The Z Shell Manual}).
> 
> Another place where index entries are needed.

Done.

>> +
>> +Predicates and modifiers are introduced with @code{( @dots{} )} after
>> +any list argument, where @code{@dots{}} is a list of predicates or
>> +modifiers.
> 
> Instead of using @dots{], which lacks any semantics here, I would
> suggest to use @code{(@var{filters})}.

Done.

>> +To customize the syntax and behavior of predicates and modifiers in
>> +Eshell see the Customize@footnote{@xref{Easy Customization, , , emacs,
>> +The GNU Emacs Manual}.} group ``eshell-pred''.
> 
> Again, please move this cross-reference from the footnote to the body.

Done.

>> +@subsection Argument Predicates
> 
> I'd prefer not to have @subsections without nodes.  If you think a
> @node is not appropriate for some reason, please use @subheading
> instead.

Done, added nodes for these.

>> +The @code{^} and @code{-} operators are not argument predicates
>> +themselves, but they modify the behavior of all subsequent predicates.
>> +@code{^} inverts the meaning of subsequent predicates, so
>> +@samp{*(^RWX)} expands to all files whose permissions disallow the
>> +world from accessing them in any way (i.e., reading, writing to, or
>> +modifying them).  When examining a symbolic link, @code{-} applies the
>> +subsequent predicates to the link's target instead of the link itself.
> 
> This is better moved to after the table of predicates.

Done.

>> +@table @asis
> 
> All the @items use @code, so "@table @code" is better, and then you
> can drop @code in the @items.

This is because there's a single item that doesn't just use @code:

   @item @code{.} (Period)

I just lifted that convention from the "Syntax of Regular Expressions" 
section in the Emacs manual. I think it helps disambiguate what that 
character is, since a lone "." on a line is a bit confusing.

Note: I changed this slightly in the latest patch to add '@r{}' around 
'(Period)', matching the regexp section.

>> +If @var{file} is specified instead, compare against the modification
>> +time of @file{file}.  Thus, @samp{a-'hello.txt'} matches all files
>> +accessed after @file{hello.txt} was last accessed.
> 
> The use of quotes 'like this', here and elsewhere in a similar
> context, begs the question: how to specify names that have embedded
> single-quote characters in them?

"Very carefully." :)

Seriously though, this is an area I don't fully understand yet, but in 
which I've found several bugs (or at least I think they're bugs). As 
such, I intentionally avoided documenting this since it's pretty 
confusing. To answer your specific question, you could type:

   *(a-"hello'there.txt")

However, the quoting rules are inconsistent. For example, this is how 
you normally insert a single quote inside a single-quoted string in Eshell:

   ~ $ echo 'hi''there'
   hi'there

That doesn't work inside predicates though, and it would treat the file 
name as "hi''there".

Relatedly, the allowed delimiters aren't consistent. a/m/c/u/g all allow 
*any* non-digit character as a delimiter, so 'a-XfooX' compares against 
the file 'foo'. They also use "balanced" bracket pairs, so 'a-<foo>' 
means the same thing.

:s/:gs/:i/:x allow any character as the delimiter (including digits), 
but don't use "balanced" bracket pairs, so you could type 
':s<pattern<repl<' or ':i<pattern<'.

:j/:S only allow ' and / as the delimiters.

I think this should be fixed, but it'll take a fair bit of work, and 
part of the reason for my filing this bug was to lay the foundation of 
unit tests so that I could adjust the escaping logic without regressing 
anything.

>> +@item e
>> +Treating the value as a file name, gets the file extension.
> 
> What is considered the extension in 'foo.bar.baz'?

It's 'baz'. I've expanded on this, explaining that it returns the final 
extension sans dot. I also added examples for this and all the other 
file name modifiers ('h'/head, 't'/tail, and 'r'/root).

>> +@item q
>> +Marks that the value should be interpreted by Eshell literally.
> 
> What does "literally" mean here?

Added an explanation for this (it means that any special characters like 
'$' lose their meaning and are treated literally). That said, I'm not 
100% sure when this would be useful, since I couldn't figure out a time 
where this would change how a command is executed. I'm probably just 
missing something though.

>> +@item S
>> +@item S/@var{pattern}/
>> +Splits the value by the regular expression @var{pattern}.  If
>> +@var{pattern} is omitted, split on spaces.
> 
> "Split the value by regexp" doesn't explain itself.  How about "split
> the value using the regular expression @var{pattern} as delimiters"?

Fixed.

>> +@item j
>> +@item j/@var{delim}/
>> +Joins a list of values, separated by the string @var{delim}.  If
>> +@var{delim} is omitted, use a single space as the delimiter.
> 
> And if DELIM is not omitted, what should it be? a regexp?

A string. I mentioned this in the first sentence, but I think it was a 
bit too brief. I've reworded this to be more explicit.

>> +@item o
>> +Sorts a list of strings in ascending lexicographic order.
>> +
>> +@item O
>> +Sorts a list of strings in descending lexicographic order.
> 
> This should clarify what is considered the lexicographic order here.
> Given the usual dependence on the locale, this is not self-evident.

Fixed, and added a cross-reference to the corresponding section of the 
Elisp manual. It would have been nice to link to the function itself: 
'string<', though I'm not sure how to do that (or if that's something I 
shouldn't do).