unofficial mirror of notmuch@notmuchmail.org
From: Tomi Ollila <tomi.ollila@iki.fi>
To: Felipe Contreras <felipe.contreras@gmail.com>
Cc: "notmuch@notmuchmail.org" <notmuch@notmuchmail.org>
Subject: Re: [PATCH v3] test: replace notmuch_passwd_sanitize() with _libconfig_sanitize()
Date: Wed, 19 May 2021 11:44:13 +0300
Message-ID: <m2r1i3glk2.fsf@guru.guru-group.fi>
In-Reply-To: <CAMP44s1OzA-uxLqmxgtWrpPbROPc-g4c4SE5Otoep4N0CMWN-Q@mail.gmail.com>

On Wed, May 19 2021, Felipe Contreras wrote:

> On Tue, May 18, 2021 at 12:55 AM Tomi Ollila <tomi.ollila@iki.fi> wrote:
>>
>> notmuch_passwd_sanitize() in test-lib.sh is too generic, it cannot
>> work in many cases...
>>
>> The more specific version _libconfig_sanitize() replaces it in
>> T590-libconfig.sh and the code that uses it is modified to output
>> the keys (ascending numbers printed in hex) so the sanitizer knows
>> what to sanitize in which lines...
>>
>> "@" + fqdn -> "@FQDN" replacement is used as fqdn could --
>> in theory -- be substring of 'USERNAME'.
>>
>> 'user -> 'USER_FULL_NAME replacement to work in cases where user
>> is empty -- as only first ' is replaced that works as expected.
>>
>> In addition to ".(none)" now also ".localdomain" is filtered from
>> USERNAME@FQDN.
>> ---
>>
>> Changes to [v2]:
>>
>> * work in cases of empty user (e.g. in passwd gecos field)
>> * replace only 1st match; e.g. fqdn could contain substring of user
>>
>> v2: id:20210517193315.11343-1-tomi.ollila@iki.fi
>> v1: id:20210502181535.31292-1-tomi.ollila@iki.fi
>>
>> When tried w/ one replacement and w/o sq usage and emptied gecos, got
>>
>> .  @@ -9,5 +9,5 @@
>> .  7: 'true'
>> .  8: 'USERNAME@FQDN'
>> .  9: 'NULL'
>> . -a: 'USER_FULL_NAME'
>> . +USER_FULL_NAMEa: ''
>>
>>  test/T590-libconfig.sh | 97 +++++++++++++++++++++++++-----------------
>>  test/test-lib.sh       | 20 ---------
>>  2 files changed, 59 insertions(+), 58 deletions(-)
>>
>> diff --git a/test/T590-libconfig.sh b/test/T590-libconfig.sh
>> index 745e1bb4..42cbe6e0 100755
>> --- a/test/T590-libconfig.sh
>> +++ b/test/T590-libconfig.sh
>> @@ -5,6 +5,26 @@ test_description="library config API"
>>
>>  add_email_corpus
>>
>> +_libconfig_sanitize() {
>> +    ${NOTMUCH_PYTHON} -c '
>> +import os, sys, pwd, socket
>
> Why not use a heredoc?
>
>   python <<-EOF
>   ..
>   EOF

tl;dr: I'll post a change to use a heredoc.

Probably just my bias against heredocs when there are alternatives
-- although this is much more tolerable than using cat <<EOF ... EOF
to write stuff to stdout ;)

Also, I did not recall that it is this simple to have python read
the code to execute from stdin (right, that loses stdin for user
input -- I was once bitten by that, which contributes to
my bias).

Here the ability to use "'" (for clarity) is a compelling reason
to use a heredoc.

(alternatives would have been:

 * l.replace("'\''" + name ...

 * l.replace("'"'"'" + name ... ;D

 * use a delimiter other than ' (but not unicode quotes >;)
)
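
As a rough illustration of the quoting point (a sketch, not the patch
itself): once the script sits in a heredoc, the plain "'" form can be
written directly. The helper name sanitize_full_name and the sample
lines are made up for this example; the last call shows what happens
with an empty gecos name when the match is not anchored on the quote.

```python
def sanitize_full_name(line, name):
    """Replace only the first quote+name match with quote+placeholder.

    Hypothetical helper mirroring the "a: " branch of the sanitizer.
    """
    return line.replace("'" + name, "'USER_FULL_NAME", 1)


# Normal case: the gecos name is non-empty.
print(sanitize_full_name("a: 'Jane Doe'", "Jane Doe"))  # a: 'USER_FULL_NAME'

# Empty gecos name: the leading quote still anchors the replacement.
print(sanitize_full_name("a: ''", ""))                  # a: 'USER_FULL_NAME'

# Without the quote anchor, an empty name matches at position 0.
print("a: ''".replace("", "USER_FULL_NAME", 1))         # USER_FULL_NAMEa: ''
```

The unanchored result is the same breakage as in the diff quoted in
the patch notes.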

While testing this option I looked (once again) at how dash and
bash implement heredocs on Linux (just to refresh my knowledge):

dash creates a pipe and dup2()'s fd[0] to 0
     (that makes stdin not seekable)

bash clones a subprocess; in the subprocess it creates a temporary file,
     writes the data there, closes it, opens it for reading,
     unlinks it from the fs (could be problematic on windows),
     dup2()'s it to stdin, closes the dupped fd and finally
     execve()'s python

zsh works like bash (i.e. these two provide a seekable stdin)

$ (strace -f -ofile zsh heredoc-test.sh)
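
The pipe vs. temporary file difference can also be checked from inside
the script, without strace. A minimal sketch: the function name
describe_fd is hypothetical, and it relies only on os.fstat() and
os.lseek() from the standard library.

```python
import os
import stat


def describe_fd(fd):
    """Return (kind, seekable) for an open file descriptor."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISFIFO(mode):
        kind = "pipe"
    elif stat.S_ISREG(mode):
        kind = "regular file"
    else:
        kind = "other"
    try:
        # Querying the current offset fails with ESPIPE on a pipe.
        os.lseek(fd, 0, os.SEEK_CUR)
        seekable = True
    except OSError:
        seekable = False
    return kind, seekable


if __name__ == "__main__":
    # fd 0 is whatever the shell set up as stdin.
    print(describe_fd(0))
```

Run with its stdin supplied by a heredoc, this should report a pipe
under a shell that behaves like dash above and a regular file under
one that behaves like bash or zsh above; the outcome may vary with the
shell and its version.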

Tomi

>
>> +pw = pwd.getpwuid(os.getuid())
>> +user = pw.pw_name
>> +name = pw.pw_gecos.partition(",")[0]
>> +fqdn = socket.getaddrinfo(socket.gethostname(), 0, 0,
>> +                          socket.SOCK_STREAM, 0, socket.AI_CANONNAME)[0][3]
>> +for l in sys.stdin:
>> +    if l[:3] == "8: ":
>> +        l = l.replace(user, "USERNAME", 1).replace("@" + fqdn, "@FQDN", 1)
>> +        l = l.replace(".(none)", "", 1).replace(".localdomain", "", 1)
>> +    elif l[:3] == "a: ":
>> +        sq = chr(39) # single quote
>> +        l = l.replace(sq + name, sq + "USER_FULL_NAME", 1)
>
> Then we can simply do:
>
> l.replace("'" + name, "'USER_FULL_NAME", 1)
>
> The rest looks fine to me.
>
> -- 
> Felipe Contreras
