* `git patch-id --stable' vs quoted-printable @ 2022-08-22 2:25 Eric Wong 2022-08-22 4:06 ` Junio C Hamano 0 siblings, 1 reply; 8+ messages in thread From: Eric Wong @ 2022-08-22 2:25 UTC (permalink / raw) To: git; +Cc: meta While poking around at the newish patchid indexing support in public-inbox[1], I noticed an inconsistency in how it seems to mishandle quoted-printable messages. For instance, René's message can apply fine with `git am': https://public-inbox.org/git/6727daf1-f077-7319-187e-ab4e55de3b2d@web.de/raw However, note the blank context lines are truly blank, as in /^$/, not /^ $/. Running `git patch-id --stable' on the decoded content of that message gives me: a245e99dbd2ce6e319955569eb8a111cb044f474 And that's the value public-inbox indexes. That patch ends up being commit fc0f8bcd64eb0a03a7098f72da9c4008bd48cf11 in git.git. However, `git show fc0f8bcd64eb0a03a7098f72da9c4008bd48cf11 |git patch-id --stable' gives me: fcac4cf581e11b703c229a129072c95c79b68bf So, I'm wondering if the search indexing code of public-inbox should s/^$/ /mgs before feeding stuff to `git patch-id'; and/or if `git patch-id' should be assuming empty lines and lines with a single SP are the same... [1] https://public-inbox.org/meta/20220620192730.550803-3-e@80x24.org/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: `git patch-id --stable' vs quoted-printable 2022-08-22 2:25 `git patch-id --stable' vs quoted-printable Eric Wong @ 2022-08-22 4:06 ` Junio C Hamano 2022-08-22 4:18 ` Junio C Hamano 2022-08-22 15:58 ` René Scharfe 0 siblings, 2 replies; 8+ messages in thread From: Junio C Hamano @ 2022-08-22 4:06 UTC (permalink / raw) To: Eric Wong, René Scharfe; +Cc: git, meta Eric Wong <e@80x24.org> writes: > While poking around at the newish patchid indexing support in > public-inbox[1], I noticed an inconsistency in how it seems to > mishandle quoted-printable messages. > ... > So, I'm wondering if the search indexing code of public-inbox > should s/^$/ /mgs before feeding stuff to `git patch-id'; and/or > if `git patch-id' should be assuming empty lines and lines with a > single SP are the same... I suspect that QP is a red herring. I haven't looked at relevant code at all for a while, but what I think is going on is: * patch-id algorithm was written back when "unified" format of "diff" did not have the extension of GNU origin to allow an empty context line to be expressed as a truely empty line, not a single whitespace that signals it is a context line, followed by the contents of the line that is empty * "git apply" hence "git am" was taught to grok the empty context line extention, https://pubs.opengroup.org/onlinepubs/9699919799/utilities/diff.html has this: It is implementation-defined whether an empty unaffected line is written as an empty line or a line containing a single <space> character. IIRC, this was added after GNU diff started emitting such an output (--suppress-blank-empty) and people complained that such a patch is not understood by us. * "git diff" was updated to allow this with diff.suppressBlankEmpty configuration , but that is never turned on by default. So, if a patch producer runs "git diff" with diff.suppressBlankEmpty turned on, "git am" accepts it, and then you run "git show" without the configuration, then the "shape" of the patch text would be slightly different. I do not offhand know if we added configuration support to "patch-id", but even with a configuration knob, because once you turn incoming e-mail into a commit, the single bit (i.e. whether suppressBlankEmpty was in use or not) is forever lost, it would not be of much help. After all, the incoming patch can be hand munged to use both "single whitespace and the end of line" and "a completely empty line" to record an empty context line, and "am" has to take such a patch happily. I *think* the right thing to do is for patch-id that takes text input to normalize the empty context line into one form or the other (as a conservatist, I would say we should probably pretend as if an empty context line is always expressed as a single whitespace on a line by itself) before computing the ID. René, do you remember if you used diff.suppressBlankEmpty configuration when generating the patch in question at: https://public-inbox.org/git/6727daf1-f077-7319-187e-ab4e55de3b2d@web.de/raw by the way? Thanks. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: `git patch-id --stable' vs quoted-printable 2022-08-22 4:06 ` Junio C Hamano @ 2022-08-22 4:18 ` Junio C Hamano 2022-08-22 4:57 ` Eric Wong 2022-08-22 15:58 ` René Scharfe 1 sibling, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2022-08-22 4:18 UTC (permalink / raw) To: Eric Wong; +Cc: René Scharfe, git, meta Junio C Hamano <gitster@pobox.com> writes: >> So, I'm wondering if the search indexing code of public-inbox >> should s/^$/ /mgs before feeding stuff to `git patch-id'; and/or >> if `git patch-id' should be assuming empty lines and lines with a >> single SP are the same... Another potential source of issues (not for the patch from René that was used as an example) is that a patch producer can use different diff algorithm from the setting you would use to index resulting commits via "git show | git patch-id". $ git show -U5 | git patch-id $ git show | git patch-id likely result in different patch IDs. The --patience and the --histogram options affect how common lines are matched up, again affecting the shape of the patches you compute patch-ids over. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: `git patch-id --stable' vs quoted-printable 2022-08-22 4:18 ` Junio C Hamano @ 2022-08-22 4:57 ` Eric Wong 0 siblings, 0 replies; 8+ messages in thread From: Eric Wong @ 2022-08-22 4:57 UTC (permalink / raw) To: Junio C Hamano; +Cc: René Scharfe, git, meta Junio C Hamano <gitster@pobox.com> wrote: > Junio C Hamano <gitster@pobox.com> writes: > > >> So, I'm wondering if the search indexing code of public-inbox > >> should s/^$/ /mgs before feeding stuff to `git patch-id'; and/or > >> if `git patch-id' should be assuming empty lines and lines with a > >> single SP are the same... > > Another potential source of issues (not for the patch from René that > was used as an example) is that a patch producer can use different > diff algorithm from the setting you would use to index resulting > commits via "git show | git patch-id". > > $ git show -U5 | git patch-id > $ git show | git patch-id > > likely result in different patch IDs. > > The --patience and the --histogram options affect how common lines > are matched up, again affecting the shape of the patches you compute > patch-ids over. Yes, you're right; though I suppose most users use git defaults. I'm not sure how useful patchid ends up being, actually... The old use of dfpre/dfpost blob OIDs seems to have been working well for years, already. Possibly OR-ing them with patchid in Xapian will end up getting good enough search engine coverage. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: `git patch-id --stable' vs quoted-printable 2022-08-22 4:06 ` Junio C Hamano 2022-08-22 4:18 ` Junio C Hamano @ 2022-08-22 15:58 ` René Scharfe 2022-08-22 16:21 ` Junio C Hamano 1 sibling, 1 reply; 8+ messages in thread From: René Scharfe @ 2022-08-22 15:58 UTC (permalink / raw) To: Junio C Hamano, Eric Wong; +Cc: git, meta [-- Attachment #1: Type: text/plain, Size: 425 bytes --] Am 22.08.22 um 06:06 schrieb Junio C Hamano: > René, do you remember if you used diff.suppressBlankEmpty > configuration when generating the patch in question at: > > https://public-inbox.org/git/6727daf1-f077-7319-187e-ab4e55de3b2d@web.de/raw > > by the way? I did not use that option. Attached the copy from my Sent folder, which has spaces at the start of the blank lines in the config.txt hunk. René [-- Attachment #2: [PATCH] revert: optionally refer to commit in the "reference" format.eml --] [-- Type: message/rfc822, Size: 2823 bytes --] From: "René Scharfe" <l.s.r@web.de> To: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org Subject: Re: [PATCH] revert: optionally refer to commit in the "reference" format Date: Sun, 26 Jun 2022 11:29:35 +0200 Message-ID: <6727daf1-f077-7319-187e-ab4e55de3b2d@web.de> Am 22.05.22 um 06:32 schrieb Junio C Hamano: > --- > Documentation/config/revert.txt | 3 +++ > Documentation/git-revert.txt | 9 +++++++++ > builtin/revert.c | 2 ++ > sequencer.c | 32 +++++++++++++++++++++++++++----- > sequencer.h | 1 + > t/t3501-revert-cherry-pick.sh | 31 +++++++++++++++++++++++++++++++ > 6 files changed, 73 insertions(+), 5 deletions(-) > create mode 100644 Documentation/config/revert.txt > > diff --git a/Documentation/config/revert.txt b/Documentation/config/revert.txt > new file mode 100644 > index 0000000000..797bfb6d62 > --- /dev/null > +++ b/Documentation/config/revert.txt > @@ -0,0 +1,3 @@ > +revert.reference:: > + Setting this variable to true makes `git revert` to behave > + as if the `--reference` option is given. Shouldn't this be "were" instead of "is"? Not fully sure. Anyway: --- >8 --- Subject: [PATCH] revert: config documentation fixes 43966ab315 (revert: optionally refer to commit in the "reference" format, 2022-05-26) added the documentation file config/revert.txt. Actually include it in config.txt. Make is used with a bare infinitive after the object; remove the "to". Signed-off-by: René Scharfe <l.s.r@web.de> --- Documentation/config.txt | 2 ++ Documentation/config/revert.txt | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index e284b042f2..e376d547ce 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -495,6 +495,8 @@ include::config/repack.txt[] include::config/rerere.txt[] +include::config/revert.txt[] + include::config/safe.txt[] include::config/sendemail.txt[] diff --git a/Documentation/config/revert.txt b/Documentation/config/revert.txt index 797bfb6d62..802d6faca2 100644 --- a/Documentation/config/revert.txt +++ b/Documentation/config/revert.txt @@ -1,3 +1,3 @@ revert.reference:: - Setting this variable to true makes `git revert` to behave + Setting this variable to true makes `git revert` behave as if the `--reference` option is given. -- 2.36.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: `git patch-id --stable' vs quoted-printable 2022-08-22 15:58 ` René Scharfe @ 2022-08-22 16:21 ` Junio C Hamano 2022-08-22 17:01 ` Eric Wong 0 siblings, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2022-08-22 16:21 UTC (permalink / raw) To: René Scharfe; +Cc: Eric Wong, git, meta René Scharfe <l.s.r@web.de> writes: > Am 22.08.22 um 06:06 schrieb Junio C Hamano: >> René, do you remember if you used diff.suppressBlankEmpty >> configuration when generating the patch in question at: >> >> https://public-inbox.org/git/6727daf1-f077-7319-187e-ab4e55de3b2d@web.de/raw >> >> by the way? > > I did not use that option. Attached the copy from my Sent folder, which > has spaces at the start of the blank lines in the config.txt hunk. Hmph, so I cannot quite explain what removed the leading spaces from the copy public-inbox archived. Thanks. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: `git patch-id --stable' vs quoted-printable 2022-08-22 16:21 ` Junio C Hamano @ 2022-08-22 17:01 ` Eric Wong 2022-08-22 18:25 ` Junio C Hamano 0 siblings, 1 reply; 8+ messages in thread From: Eric Wong @ 2022-08-22 17:01 UTC (permalink / raw) To: Junio C Hamano; +Cc: René Scharfe, git, meta Junio C Hamano <gitster@pobox.com> wrote: > René Scharfe <l.s.r@web.de> writes: > > > Am 22.08.22 um 06:06 schrieb Junio C Hamano: > >> René, do you remember if you used diff.suppressBlankEmpty > >> configuration when generating the patch in question at: > >> > >> https://public-inbox.org/git/6727daf1-f077-7319-187e-ab4e55de3b2d@web.de/raw > >> > >> by the way? > > > > I did not use that option. Attached the copy from my Sent folder, which > > has spaces at the start of the blank lines in the config.txt hunk. > > Hmph, so I cannot quite explain what removed the leading spaces from > the copy public-inbox archived. Both the June 26 message and the latest which arrived directly in my IMAP inbox (w/o public-inbox) are missing the space in blank lines, matching what public-inbox.org and lore have. I don't think postfix, SpamAssassin, or dovecot strip that.. I would expect the blank context lines in the QP version to have "=20". I sent this to test@public-inbox.org using `git send-email --transfer-encoding=quoted-printable' and it confirms the blank context lines having "=20" which are missing in René's messages: https://try.public-inbox.org/test/20220822165736.23246-1-e@80x24.org/raw ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: `git patch-id --stable' vs quoted-printable 2022-08-22 17:01 ` Eric Wong @ 2022-08-22 18:25 ` Junio C Hamano 0 siblings, 0 replies; 8+ messages in thread From: Junio C Hamano @ 2022-08-22 18:25 UTC (permalink / raw) To: Eric Wong; +Cc: René Scharfe, git, meta Eric Wong <e@80x24.org> writes: > I don't think postfix, SpamAssassin, or dovecot strip that.. > ... > I would expect the blank context lines in the QP version to have > "=20". I sent this to test@public-inbox.org using > `git send-email --transfer-encoding=quoted-printable' and it > confirms the blank context lines having "=20" which are missing > in René's messages: > https://try.public-inbox.org/test/20220822165736.23246-1-e@80x24.org/raw Yeah, that is a mystery. Unlike format=flawed, I think CTE-QP was designed to be reversible without information loss. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2022-08-22 18:25 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-08-22 2:25 `git patch-id --stable' vs quoted-printable Eric Wong 2022-08-22 4:06 ` Junio C Hamano 2022-08-22 4:18 ` Junio C Hamano 2022-08-22 4:57 ` Eric Wong 2022-08-22 15:58 ` René Scharfe 2022-08-22 16:21 ` Junio C Hamano 2022-08-22 17:01 ` Eric Wong 2022-08-22 18:25 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).