* Re: [Outreachy] Feedback on 'guix git log' subcommand
2021-01-28 3:53 [Outreachy] Feedback on 'guix git log' subcommand Magali
@ 2021-01-28 23:07 ` zimoun
2021-01-30 1:30 ` Magali
2021-02-01 15:18 ` Ludovic Courtès
2021-01-29 0:04 ` [Outreachy] 'guix git log' slowness? zimoun
1 sibling, 2 replies; 6+ messages in thread
From: zimoun @ 2021-01-28 23:07 UTC (permalink / raw)
To: Magali, Gábor Boskovits, Guix Devel
[-- Attachment #1: Type: text/plain, Size: 2466 bytes --]
Hi Magali,
On Thu, 28 Jan 2021 at 00:53, Magali <magalilemes00@gmail.com> wrote:
> On a side note, I've been blogging about it at
> <https://magalilemes.gitlab.io/blog>.
Nice readings! :-)
> Below are a few examples of the options currently available:
>
> ./pre-inst-env guix git log --format=medium
> ./pre-inst-env guix git log --oneline
> ./pre-inst-env guix git log --channel-cache-path
> ./pre-inst-env guix git log --channel-cache-path=guix
Cool!
An example, the recent update of setuptools breaks python2-setuptools.
--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix git log --oneline \
> | grep 'python-setuptools:' | grep Update
b3ca2b4 gnu: python-setuptools: Update to 40.8.0.
d0205df gnu: python-setuptools: Update to 40.0.0.
62a9a23 gnu: python-setuptools: Update to 18.3.1.
d3d656c gnu: python-setuptools: Update to 12.1.
d660f7b gnu: python-setuptools: Update to 31.0.0.
e39d493 gnu: python-setuptools: Update to 41.0.1.
3142dac gnu: python-setuptools: Update to 52.0.0.
--8<---------------cut here---------------end--------------->8---
Ah, the order seems weird, “git log” says:
--8<---------------cut here---------------start------------->8---
3142daccbe gnu: python-setuptools: Update to 52.0.0.
e39d4933d1 gnu: python-setuptools: Update to 41.0.1.
b3ca2b45d1 gnu: python-setuptools: Update to 40.8.0.
d0205dfd92 gnu: python-setuptools: Update to 40.0.0.
d660f7be6d gnu: python-setuptools: Update to 31.0.0.
62a9a23bf9 gnu: python-setuptools: Update to 18.3.1.
d3d656c5fb gnu: python-setuptools: Update to 12.1.
--8<---------------cut here---------------end--------------->8---
Well, somehow it is your question about which order. :-)
Doing that, I am thinking of two nice features:
1. get the parent commits
2. show a specific commit
For example,
guix git log --parents=3142daccbe # => 631e1f33
guix git log --show=3142daccbe
so then I can feed “guix time-machine” with 631e1f33 to have
’python2-setuptools’ before it breaks. I am not sure to what “--show”
would do.
WDYT?
Hum, ’--grep’ seems missing. I do not remember, is it not working?
Even slowly?
As discussed, “guix-git-log” is not defined as ’define-command’ because
it should not appear on “guix help”. However, it should appear with
“guix git --help”.
Well, my idea was something along this attached patch,
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: small.patch --]
[-- Type: text/x-diff, Size: 1821 bytes --]
diff --git a/guix/scripts/git.scm b/guix/scripts/git.scm
index 8fcd0ccca8..3b141a622d 100644
--- a/guix/scripts/git.scm
+++ b/guix/scripts/git.scm
@@ -18,18 +18,26 @@
(define-module (guix scripts git)
#:use-module (ice-9 match)
+ #:use-module (ice-9 format)
#:use-module (guix ui)
#:use-module (guix scripts)
+ #:use-module (srfi srfi-1)
#:export (guix-git))
+(define %sub-commands
+ `(("authenticate" . "verify commit signatures and authorizations")
+ ("log". "show Git commit history")))
+
(define (show-help)
(display (G_ "Usage: guix git COMMAND ARGS...
Operate on Git repositories.\n"))
(newline)
(display (G_ "The valid values for ACTION are:\n"))
(newline)
- (display (G_ "\
- authenticate verify commit signatures and authorizations\n"))
+ (for-each (match-lambda
+ ((name . help)
+ (format #t "~13a ~a\n" name help)))
+ %sub-commands)
(newline)
(display (G_ "
-h, --help display this help and exit"))
@@ -38,8 +46,6 @@ Operate on Git repositories.\n"))
(newline)
(show-bug-report-information))
-(define %sub-commands '("authenticate" "log"))
-
(define (resolve-sub-command name)
(let ((module (resolve-interface
`(guix scripts git ,(string->symbol name))))
@@ -61,7 +67,6 @@ Operate on Git repositories.\n"))
((or ("-V") ("--version"))
(show-version-and-exit "guix git"))
((sub-command args ...)
- (if (member sub-command %sub-commands)
+ (if (find (lambda (s) (string=? (first s) sub-command)) %sub-commands)
(apply (resolve-sub-command sub-command) args)
- (format (current-error-port)
- (G_ "guix git: invalid sub-command~%")))))))
+ (leave (G_ "~a: invalid sub-command~%") sub-command))))))
[-- Attachment #3: Type: text/plain, Size: 441 bytes --]
but “guix system” hard code all the subcommands. So, maybe hardcoding
the “log” subcommand is the way to do; with translation in mind, I
guess.
> Another thing is that the command is a bit slower than 'git log' itself.
> Thoughts on how that could be improved?
I will provide more on a separate email. And review more in details the
code. :-)
Thanks for working on that. That’s cool! :-)
Cheers,
simon
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [Outreachy] 'guix git log' slowness?
2021-01-28 3:53 [Outreachy] Feedback on 'guix git log' subcommand Magali
2021-01-28 23:07 ` zimoun
@ 2021-01-29 0:04 ` zimoun
1 sibling, 0 replies; 6+ messages in thread
From: zimoun @ 2021-01-29 0:04 UTC (permalink / raw)
To: Magali, Gábor Boskovits, Guix Devel
[-- Attachment #1: Type: text/plain, Size: 3844 bytes --]
Hi Magali,
(It is a slightly edited version I sent you. Since the aim of Outreachy
is also to interact with Community, let enjoy the French proverb: «more
crazy people we are, more fun we have». :-))
On Thu, 28 Jan 2021 at 00:53, Magali <magalilemes00@gmail.com> wrote:
> Another thing is that the command is a bit slower than 'git log' itself.
> Thoughts on how that could be improved?
The command is “slow”. A first quick analysis about the meaning of “slow”.
Basically, I have run twice:
--8<---------------cut here---------------start------------->8---
for n in 60000 10000 5000 1000 500 100 50 10 5 1;
do
time ./pre-inst-env guix git log --oneline \
| head -n $n > /dev/null ;
done
--8<---------------cut here---------------end--------------->8---
to have kind of warm cache. And again for the equivalent Git command.
Then, bit of Emacs edit processing to transform the output in the buffer
of ’M-x shell’ to something in the Python file:
tguix = np.array([2.871, …
tgit = np.array([0.013, …
(Well, to be correct, it is not twice but a couple of times to have an
average.)
Let normalize by removing the additive constant and run a classic
linear regression:
t ~ B n ^ a => log(t) ~ a log(n) + b
where ’a’ and ’b’ have to be estimated.
Well, I am surprised by ’a ~ 0.5’ which means that “guix git log” runs
with a sublinear time complexity. However, “git log” is linear. Maybe
I am doing something wrong.
Initially I thought the tree was badly walked but a quick:
--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix git log --channel-cache-path
guix
/home/simon/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq
$ git -C /home/simon/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq \
log --oneline | wc -l
72791
$ ./pre-inst-env guix git log --oneline | wc -l
72791
--8<---------------cut here---------------end--------------->8---
shows it is correct. Hum?! Well, I suspect noise on the data and the
normalization is bad here. Running the experiment in a batch of 10
times then averaging them should give an analysis more meaningful. Hey,
that’s a quick one. :-)
The conclusion here, it scales well enough… for now.
Therefore, the real the question is about the «additive constant». On
my machine, it is ~2.9s and this is where there is room of improvement,
I guess.
I exported ’get-commits’ to have it in the REPL and I did:
--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix repl
GNU Guile 3.0.5
Copyright (C) 1995-2021 Free Software Foundation, Inc.
Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.
Enter `,help' for help.
scheme@(guix-user)> ,use(guix scripts git log)
scheme@(guix-user)> (define (compute) (begin (get-commits) 'ok))
scheme@(guix-user)> ,time (compute)
$1 = ok
;; 2.533936s real time, 3.099156s run time. 0.901027s spent in GC.
scheme@(guix-user)>
--8<---------------cut here---------------end--------------->8---
And I let you run “,profile (compute)”. Well, this function should be
optimized. IMHO.
Initially, I thought about “stream” but I do no think it is the issue
here. Well, I think that the ’repo’ is open at each commit when
folding. Instead, it should be open before, keep alive and close at the
end of the ’fold’. Somehow. WDYT?
I will give a closer look to ’commit-closure’ because I am not convinced
it is useful here, neither.
Bonus, attached the Python script and the plot. :-)
Cheers,
simon
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: quick-analysis.py --]
[-- Type: text/x-python, Size: 951 bytes --]
# coding: utf-8
#
import numpy as np
import matplotlib.pyplot as plt
n = np.array([1., 5., 10., 50., 100., 500., 1000., 5000., 10000., 60000.])
tguix = np.array([2.871, 3.006, 2.987, 2.892, 2.991, 3.038, 3.088, 3.315, 3.565, 5.634])
tgit = np.array([0.013, 0.006, 0.013, 0.016, 0.014, 0.029, 0.047, 0.12, 0.233, 1.283])
normalize = lambda x: x - x[0]
logify = lambda x: np.log10(x[3:])
x = logify(n)
y = logify(normalize(tguix))
z = logify(normalize(tgit))
X = np.concatenate((x.reshape((len(x), 1)), np.ones((len(x), 1))), axis=1)
guix, res, rank, s = np.linalg.lstsq(X, y, rcond=None)
git, res, rank, s = np.linalg.lstsq(X, z, rcond=None)
predict = lambda sol: sol[0] * x + sol[1]
plt.plot(x, y, 'bo', label='guix')
plt.plot(x, z, 'ro', label='git')
plt.plot(x, predict(guix), 'bx-')
plt.plot(x, predict(git), 'rx-')
plt.legend(loc='lower right')
plt.xlabel("log(N) where N in `| head -n N`")
plt.ylabel("log(real time)")
plt.show()
[-- Attachment #3: scale.png --]
[-- Type: image/png, Size: 45025 bytes --]
^ permalink raw reply [flat|nested] 6+ messages in thread