* [PATCH] TODO: add item for searching based on git-patch-id(1)
From: Eric Wong @ 2019-10-01  3:37 UTC
To: meta

I forgot about this feature when I was implementing
blob-ID-based searches :x
---
 TODO | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/TODO b/TODO
index 2c525615..93054bb3 100644
--- a/TODO
+++ b/TODO
@@ -112,3 +112,6 @@ all need to be considered for everything we introduce)
 
 * make "git cat-file --batch" detect unlinked packfiles so we don't
   have to restart processes (very long-term)
+
+* support searching based on `git-patch-id --stable` to improve
+  bidirectional mapping of commits <=> emails
-- 
EW

* Re: [PATCH] TODO: add item for searching based on git-patch-id(1)
From: Konstantin Ryabitsev @ 2019-10-01 21:00 UTC
To: Eric Wong; +Cc: meta

On Tue, Oct 01, 2019 at 03:37:47AM +0000, Eric Wong wrote:
> I forgot about this feature when I was implementing
> blob-ID-based searches :x
> ---
>  TODO | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/TODO b/TODO
> index 2c525615..93054bb3 100644
> --- a/TODO
> +++ b/TODO
> @@ -112,3 +112,6 @@ all need to be considered for everything we introduce)
>
>  * make "git cat-file --batch" detect unlinked packfiles so we don't
>    have to restart processes (very long-term)
> +
> +* support searching based on `git-patch-id --stable` to improve
> +  bidirectional mapping of commits <=> emails

It would be handy, but a word of caution -- because it strips
whitespace, git-patch-id is not great for languages with syntactic
indentation, like Python. For example, the following two patches
generate the same patch-id, but one is actually malicious:

diff --git a/file1.py b/file1.py
index e574c49..6aa1937 100644
--- a/file1.py
+++ b/file1.py
@@ -1,3 +1,13 @@
 #!/usr/bin/python
 
+def is_logged_in(cookie):
+    if cookie:
+        print('User is logged in')
+        return True
+
+    return False
+
+if is_logged_in(True):
+    print('You are logged in')
+
 print('Hello!')

This one below is malicious, because is_logged_in() will always return
True:

diff --git a/file1.py b/file1.py
index e574c49..6aa1937 100644
--- a/file1.py
+++ b/file1.py
@@ -1,3 +1,13 @@
 #!/usr/bin/python
 
+def is_logged_in(cookie):
+    if cookie:
+        print('User is logged in')
+    return True
+
+    return False
+
+if is_logged_in(True):
+    print('You are logged in')
+
 print('Hello!')

So, I wouldn't use git-patch-id as a mechanism to look up patches,
except as an auxiliary one.

-K
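
The collision described above can be reproduced without git by a small
Python sketch. It is only an approximation: toy_patch_id() is a
hypothetical helper that hashes each diff line with all whitespace
removed, which mimics the whitespace-insensitivity of git-patch-id but
is not git's actual algorithm and will not produce the same IDs.

```python
import hashlib


def toy_patch_id(diff_text: str) -> str:
    """Hash a diff with all whitespace removed (simplified stand-in,
    NOT the real git-patch-id algorithm)."""
    h = hashlib.sha1()
    for line in diff_text.splitlines():
        # Skip metadata lines in this sketch so only content is hashed.
        if line.startswith(("index ", "@@")):
            continue
        # Drop every whitespace character, including indentation.
        h.update("".join(line.split()).encode())
    return h.hexdigest()


GOOD = """\
+def is_logged_in(cookie):
+    if cookie:
+        print('User is logged in')
+        return True
+
+    return False
"""

# Dedent "return True" by one level so it always runs.
BAD = GOOD.replace("+        return True", "+    return True")

print(GOOD != BAD)                                # True
print(toy_patch_id(GOOD) == toy_patch_id(BAD))    # True
```

The two inputs differ only in indentation, so a plain SHA-1 of the text
tells them apart while the whitespace-stripped hash does not.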

* Re: [PATCH] TODO: add item for searching based on git-patch-id(1)
From: Eric Wong @ 2019-10-01 22:00 UTC
To: Konstantin Ryabitsev; +Cc: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Tue, Oct 01, 2019 at 03:37:47AM +0000, Eric Wong wrote:
> > +* support searching based on `git-patch-id --stable` to improve
> > +  bidirectional mapping of commits <=> emails
>
> It would be handy, but a word of caution -- because it strips
> whitespace, git-patch-id is not great for languages with syntactic
> indentation, like Python. For example, the following two patches
> generate the same patch-id, but one is actually malicious:

Good point.  Makefiles also fall into that category, I wonder
what other languages are whitespace sensitive?

<snip>

> So, I wouldn't use git-patch-id as a mechanism to look up patches,
> except as an auxiliary one.

It's usable for 99% of patches for the kernel, though.
But right, dfpost:$BLOB_ID matches should take precedence,
and we can use a lower weight for the patch-id in Xapian.

The bigger question is the cost in time to reindex...

And ultimately, I wonder if dfpost:$BLOB_ID + s:$COMMIT_TITLE
is good enough, too...

I think I need to dig out something I abandoned years ago
for indexing coderepos and refactor that to be less
space-intensive now.
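
The precedence idea above (exact blob-ID matches first, patch-id matches
at a lower weight) might look roughly like the following Python sketch.
Everything in it is an assumption made for illustration: the Indexed
record, the weight values, and the score()/rank() helpers are
hypothetical, and none of it reflects public-inbox's code or the Xapian
API.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen only so that an exact blob-ID hit
# outranks a whitespace-insensitive patch-id hit.
W_BLOB = 10.0
W_PATCH_ID = 1.0


@dataclass(frozen=True)
class Indexed:
    msgid: str             # hypothetical Message-ID
    post_blobs: frozenset  # post-image blob IDs (what dfpost: would match)
    patch_id: str          # patch-id stored for the message


def score(msg, blob_ids, patch_id):
    # Each matching post-image blob counts at the higher weight.
    s = W_BLOB * len(msg.post_blobs & blob_ids)
    # A patch-id match only adds a small, auxiliary boost.
    if msg.patch_id == patch_id:
        s += W_PATCH_ID
    return s


def rank(messages, blob_ids, patch_id):
    scored = [(score(m, blob_ids, patch_id), m) for m in messages]
    hits = [p for p in scored if p[0] > 0]
    hits.sort(key=lambda p: p[0], reverse=True)
    return [m for _, m in hits]


msgs = [
    Indexed("lookalike@example", frozenset({"0000000"}), "pid-1"),
    Indexed("real@example", frozenset({"6aa1937"}), "pid-1"),
]
ranked = rank(msgs, frozenset({"6aa1937"}), "pid-1")
print([m.msgid for m in ranked])
```

Both messages share a patch-id, but only the one whose post-image blob
matches is ranked first; the lookalike still shows up as a lower-ranked
auxiliary hit.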