Eli Zaretskii <eliz@gnu.org> writes:

>> From: taylanbayirli@gmail.com (Taylan Ulrich Bayırlı/Kammer)
>> Cc: 21702@debbugs.gnu.org
>> Date: Mon, 19 Oct 2015 09:34:15 +0200
>> 
>> > Item 1 was this:
>> >
>> >> >> The function should clearly document
>> >> >> 
>> >> >>     1) for which shells will the quoting work absolutely, i.e. lead to
>> >> >>     the given string to appear *verbatim* in an element of the ARGV of
>> >> >>     the called command,
>> >
>> > There's nothing about safety here, only about correctness.  That is
>> > the aspect that I think is now covered, as the doc string now says for
>> > which shells one can have correct results.
>> 
>> Usually it's indeed correctness that protects against injection attacks.
>> A quoting mechanism that's correct is automatically safe.
>
> And that is the current situation, AFAIU.
>
>> Another way to make it safe would be to error when the given string
>> contains characters outside of a limited character set.
>
> What limited set would you suggest that will not make the function
> useless in real-life scenarios?
>
> In any case, I think quoting is better than rejecting, as it supports
> more use cases.
>
>> Either way, the safeness should be documented clearly, either implicitly
>> through a clear documentation of the correctness, or explicitly.
>
> Like I said, this convention should be adopted project-wide.  Doing so
> only in a few doc strings, let alone one, will only confuse, because
> the user will not know whether the lack of such documentation means
> the API is safe or unsafe.

Yes, it should be done for every function for which the concerns I've
explained apply.  So let's start from this one.

>> I would propose something along the lines of:
>> 
>>     It is guaranteed that ARGUMENT will be parsed as a single token by
>>     shells X, Y, and Z, as long as it is separated from other text via a
>>     delimiter in the syntax of the respective shell.
>
> I don't think we want to mention specific shells explicitly, because
> maintaining such a list would be a burden.  The standard shell of each
> OS is well defined and known to the users of the respective systems.
> Moreover, Emacs by default uses that shell automatically.

For instance: POSIX sh, MS-DOS, and Windows NT, is not a long list.  (I
don't really know what shells MS-DOS and Windows NT use; a more precise
naming would be good.)

The payoff of the small burden is having clear safety guarantees.

>> >> Does that make sense?
>> >
>> > Maybe it does, but only if we start documenting these aspects
>> > project-wide.  It makes little sense to me to do that for a single
>> > API, and not an important one at that.  But that's me.
>> 
>> This is an API which if its implementation is imperfect will result in
>> programs prone to code injection attacks when these programs face
>> untrusted input sources.  Why do you say it's not an important one?
>
> Because there are many much more important ones that can do much more
> harm more easily.  In particular, a shell command doesn't need to be
> quoted to be harmful or malicious.

There being other important cases, does not make this a less important
case.  It is exactly as important as I've already said.

I don't understand what "a shell command doesn't need to be quoted to be
harmful" is supposed to mean; quoting is what *makes* the arguments
harmless, by ensuring they cleanly end up in the ARGV of a called
command instead of causing arbitrary behavior of the shell.

Here's a patch doing an improvement to the documentation like the one I
proposed.  Of course, if you have verified that shells other than POSIX
sh are fully safe, feel free to improve the docstring accordingly.

Taylan