* [PATCH] TODO: add item for searching based on git-patch-id(1)
From: Eric Wong @ 2019-10-01 3:37 UTC
To: meta
I forgot about this feature when I was implementing
blob-ID-based searches :x
---
 TODO | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/TODO b/TODO
index 2c525615..93054bb3 100644
--- a/TODO
+++ b/TODO
@@ -112,3 +112,6 @@ all need to be considered for everything we introduce)
 
 * make "git cat-file --batch" detect unlinked packfiles so we don't
   have to restart processes (very long-term)
+
+* support searching based on `git-patch-id --stable` to improve
+  bidirectional mapping of commits <=> emails
-- 
EW
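
For illustration only, here is a minimal sketch of how the commit side of
such a mapping might be collected. It pipes `git log -p` into
`git patch-id --stable` and parses the "<patch-id> <commit-id>" lines that
command prints. The function names and the idea of holding the mapping in
a dict are assumptions made for this example; nothing here is
public-inbox code.

```python
import subprocess
from typing import Dict


def parse_patch_id_output(output: str) -> Dict[str, str]:
    """Map commit ID -> patch ID from `git patch-id` output.

    Each output line is expected to hold two hex strings:
    the patch ID first, then the commit ID.
    """
    mapping = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            patch_id, commit_id = parts
            mapping[commit_id] = patch_id
    return mapping


def stable_patch_ids(repo: str, rev_range: str) -> Dict[str, str]:
    """Hypothetical helper: stable patch IDs for every commit in rev_range."""
    # Produce the patches; merge commits emit no diff by default.
    log = subprocess.run(
        ["git", "-C", repo, "log", "-p", rev_range],
        check=True, capture_output=True,
    ).stdout
    # Feed the patches to git-patch-id on stdin.
    out = subprocess.run(
        ["git", "-C", repo, "patch-id", "--stable"],
        input=log, check=True, capture_output=True,
    ).stdout
    return parse_patch_id_output(out.decode())
```

The same parser would apply to the email side if each message were fed to
`git patch-id --stable` in turn, though how the result would be indexed is
left open here.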
* Re: [PATCH] TODO: add item for searching based on git-patch-id(1)
From: Konstantin Ryabitsev @ 2019-10-01 21:00 UTC
To: Eric Wong; +Cc: meta
On Tue, Oct 01, 2019 at 03:37:47AM +0000, Eric Wong wrote:
> I forgot about this feature when I was implementing
> blob-ID-based searches :x
> ---
> TODO | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/TODO b/TODO
> index 2c525615..93054bb3 100644
> --- a/TODO
> +++ b/TODO
> @@ -112,3 +112,6 @@ all need to be considered for everything we introduce)
>
> * make "git cat-file --batch" detect unlinked packfiles so we don't
> have to restart processes (very long-term)
> +
> +* support searching based on `git-patch-id --stable` to improve
> + bidirectional mapping of commits <=> emails
It would be handy, but a word of caution -- because it strips
whitespace, git-patch-id is not great for languages with syntactic
indentation, like Python. For example, the following two patches
generate the same patch-id, but one is actually malicious:
diff --git a/file1.py b/file1.py
index e574c49..6aa1937 100644
--- a/file1.py
+++ b/file1.py
@@ -1,3 +1,13 @@
 #!/usr/bin/python
+def is_logged_in(cookie):
+    if cookie:
+        print('User is logged in')
+        return True
+
+    return False
+
+if is_logged_in(True):
+    print('You are logged in')
+
 print('Hello!')
This one below is malicious, because is_logged_in() will always return
True:
diff --git a/file1.py b/file1.py
index e574c49..6aa1937 100644
--- a/file1.py
+++ b/file1.py
@@ -1,3 +1,13 @@
 #!/usr/bin/python
+def is_logged_in(cookie):
+    if cookie:
+        print('User is logged in')
+    return True
+
+    return False
+
+if is_logged_in(True):
+    print('You are logged in')
+
 print('Hello!')
So, I wouldn't use git-patch-id as a mechanism to look up patches,
except as an auxiliary one.
-K
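
The collision described above can be reproduced with a small sketch. The
hash below is a simplified stand-in that drops all whitespace from changed
lines before hashing; it is NOT git's actual patch-id algorithm, and the
name whitespace_blind_id() is made up for this example. The indentation of
the two variants is an assumption reconstructed from the description (the
only difference is how far "return True" is indented).

```python
import hashlib


def whitespace_blind_id(patch_text: str) -> str:
    """Hash the added/removed lines of a patch with all whitespace removed.

    Simplified illustration only; git-patch-id works differently in detail.
    """
    digest = hashlib.sha1()
    for line in patch_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # skip file headers
        if line.startswith(("+", "-")):
            digest.update("".join(line.split()).encode())
    return digest.hexdigest()


def load(patch_text: str):
    """Strip the leading '+' from each line and return is_logged_in()."""
    source = "\n".join(line[1:] for line in patch_text.splitlines())
    namespace = {}
    exec(source, namespace)
    return namespace["is_logged_in"]


SAFE = """\
+def is_logged_in(cookie):
+    if cookie:
+        print('User is logged in')
+        return True
+
+    return False
"""

# Assumed malicious variant: "return True" dedented out of the if block.
MALICIOUS = """\
+def is_logged_in(cookie):
+    if cookie:
+        print('User is logged in')
+    return True
+
+    return False
"""

print(whitespace_blind_id(SAFE) == whitespace_blind_id(MALICIOUS))  # True
print(load(SAFE)(""), load(MALICIOUS)(""))  # False True
```

The two texts hash identically under a whitespace-blind scheme, yet only
the second one reports an empty cookie as logged in.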
* Re: [PATCH] TODO: add item for searching based on git-patch-id(1)
From: Eric Wong @ 2019-10-01 22:00 UTC
To: Konstantin Ryabitsev; +Cc: meta
Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Tue, Oct 01, 2019 at 03:37:47AM +0000, Eric Wong wrote:
> > +* support searching based on `git-patch-id --stable` to improve
> > + bidirectional mapping of commits <=> emails
>
> It would be handy, but a word of caution -- because it strips
> whitespace, git-patch-id is not great for languages with syntactic
> indentation, like Python. For example, the following two patches
> generate the same patch-id, but one is actually malicious:
Good point. Makefiles also fall into that category; I wonder
what other languages are whitespace-sensitive.
<snip>
> So, I wouldn't use git-patch-id as a mechanism to look up patches,
> except as an auxiliary one.
It's usable for 99% of patches for the kernel, though. But
right, dfpost:$BLOB_ID matches should take precedence, and we
can use a lower weight for the patch-id in Xapian.
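
As a rough sketch of the "lower weight" idea, the snippet below ranks
hypothetical search hits so that a dfpost match always outranks a
patch-id-only match. It is plain Python, not the Xapian API, and the
"patchid" field name and the weight values are assumptions chosen only for
illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Hypothetical weights: an exact post-image blob match counts for far
# more than a (whitespace-insensitive) patch-id match.
WEIGHTS = {"dfpost": 1.0, "patchid": 0.25}


@dataclass(frozen=True)
class Hit:
    msgid: str
    matched: FrozenSet[str]  # names of the fields that matched


def score(hit: Hit) -> float:
    """Sum the weights of every field that matched this hit."""
    return sum(WEIGHTS.get(field, 0.0) for field in hit.matched)


def rank(hits: List[Hit]) -> List[Hit]:
    """Return hits ordered from strongest to weakest evidence."""
    return sorted(hits, key=score, reverse=True)
```

With these weights a message matched only by patch-id still shows up, but
below anything confirmed by a blob ID.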
The bigger question is the cost in time to reindex...
And ultimately, I wonder if dfpost:$BLOB_ID + s:$COMMIT_TITLE
is good enough, too... I think I need to dig out something
I abandoned years ago for indexing coderepos and refactor that
to be less space-intensive now.
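
A minimal sketch of the dfpost:$BLOB_ID + s:$COMMIT_TITLE lookup, assuming
the two terms are joined with an explicit AND and the title is passed as a
quoted phrase. build_query() is a hypothetical helper written for this
example, and stripping double quotes from the title is a simplification
rather than proper escaping.

```python
def build_query(blob_id: str, commit_title: str) -> str:
    """Combine a post-image blob ID and a commit title into one query string."""
    # Double quotes would end the phrase early, so replace them with spaces.
    title = commit_title.replace('"', " ").strip()
    return f'dfpost:{blob_id} AND s:"{title}"'


print(build_query("93054bb3", "TODO: add item for searching based on git-patch-id(1)"))
```

The blob ID used above is the post-image abbreviation from the "index"
line of the TODO patch that started this thread.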