unofficial mirror of meta@public-inbox.org
From: Junio C Hamano <gitster@pobox.com>
To: "Eric Wong" <e@80x24.org>, "René Scharfe" <l.s.r@web.de>
Cc: git@vger.kernel.org,  meta@public-inbox.org
Subject: Re: `git patch-id --stable' vs quoted-printable
Date: Sun, 21 Aug 2022 21:06:52 -0700
Message-ID: <xmqqczcsgbvn.fsf@gitster.g>
In-Reply-To: <20220822022503.M873583@dcvr> (Eric Wong's message of "Mon, 22 Aug 2022 02:25:03 +0000")

Eric Wong <e@80x24.org> writes:

> While poking around at the newish patchid indexing support in
> public-inbox[1], I noticed an inconsistency in how it seems to
> mishandle quoted-printable messages.
> ...
> So, I'm wondering if the search indexing code of public-inbox
> should s/^$/ /mgs before feeding stuff to `git patch-id'; and/or
> if `git patch-id' should be assuming empty lines and lines with a
> single SP are the same...

I suspect that QP is a red herring.  I haven't looked at relevant
code at all for a while, but what I think is going on is:

 * The patch-id algorithm was written back when the "unified" format
   of "diff" did not have the extension of GNU origin that allows an
   empty context line to be expressed as a truly empty line, instead
   of a single whitespace that signals it is a context line, followed
   by the (empty) contents of the line.

 * "git apply", hence "git am", was taught to grok the empty context
   line extension.  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/diff.html
   has this:

      It is implementation-defined whether an empty unaffected line is
      written as an empty line or a line containing a single <space> character.

   IIRC, this was added after GNU diff started emitting such an
   output (--suppress-blank-empty) and people complained that such a
   patch is not understood by us.

 * "git diff" was updated to allow this with the diff.suppressBlankEmpty
   configuration, but that is never turned on by default.

So, if a patch producer runs "git diff" with diff.suppressBlankEmpty
turned on, "git am" accepts it, and then you run "git show" without
the configuration, then the "shape" of the patch text would be
slightly different.  I do not offhand know if we added configuration
support to "patch-id", but even a configuration knob would not be of
much help, because once you turn incoming e-mail into a commit, the
single bit (i.e. whether suppressBlankEmpty was in use or not) is
forever lost.  After all, the incoming patch can be hand munged to
use both "a single whitespace followed by the end of line" and
"a completely empty line" to record an empty context line, and "am"
has to take such a patch happily.

I *think* the right thing to do is for patch-id that takes text
input to normalize the empty context line into one form or the other
(to be conservative, I would say we should probably pretend as if an
empty context line is always expressed as a single whitespace on a
line by itself) before computing the ID.
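
To illustrate what I mean by normalizing, here is a rough sketch in
Python of a filter that could sit in front of "git patch-id".  It is
not what patch-id does today, and the function name and the regexp are
made up for illustration.  It assumes ordinary unified-diff hunk
headers, counts lines per hunk so that blank lines outside hunks (e.g.
in the log message) are left alone, and rewrites a truly empty line
inside a hunk as a single SP:

```python
import re

# Hypothetical helper, not part of Git.  Matches "@@ -a[,b] +c[,d] @@".
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def normalize_empty_context(patch_text: str) -> str:
    """Rewrite truly empty context lines inside hunks as a single SP."""
    out = []
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in patch_text.split("\n"):
        m = HUNK_RE.match(line)
        if m:
            # An omitted count in a hunk header means 1.
            old_left = int(m.group(1)) if m.group(1) is not None else 1
            new_left = int(m.group(2)) if m.group(2) is not None else 1
        elif old_left > 0 or new_left > 0:
            if line == "":
                line = " "  # empty context line -> " " form
            if line.startswith(" "):
                old_left -= 1
                new_left -= 1
            elif line.startswith("-"):
                old_left -= 1
            elif line.startswith("+"):
                new_left -= 1
            elif not line.startswith("\\"):
                # Anything other than "\ No newline at end of file"
                # is unexpected here; stop treating it as hunk body.
                old_left = new_left = 0
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    import sys

    # Usage sketch: python3 normalize.py <patch | git patch-id --stable
    sys.stdout.write(normalize_empty_context(sys.stdin.read()))
```

With such a filter in front, both spellings of an empty context line
would reach patch-id in the same form.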

René, do you remember if you used diff.suppressBlankEmpty
configuration when generating the patch in question at:

    https://public-inbox.org/git/6727daf1-f077-7319-187e-ab4e55de3b2d@web.de/raw

by the way?

Thanks.
