From: Paul Eggert <eggert@cs.ucla.edu>
To: "Óscar Fuentes" <ofv@wanadoo.es>, emacs-devel@gnu.org
Subject: Re: Recommend these .gitconfig settings for git integrity.
Date: Tue, 2 Feb 2016 23:31:00 -0800
Message-ID: <56B1ACB4.7090705@cs.ucla.edu>
In-Reply-To: <878u33gd99.fsf@wanadoo.es>
On 02/02/2016 10:48 AM, Óscar Fuentes wrote:
> Emacs is no small project.
Emacs is considerably smaller than other projects that Git regularly deals with.
The Emacs master branch has 125,000 commits; the Linux kernel has 574,000
commits in its master branch (this omits history before 2005). I've seen reports
of git repositories at Facebook where the .git subdirectory contains 50 GB. By
comparison, my repository for Emacs master has 0.25 GB and for the Linux
kernel has 1.7 GB. These numbers can vary quite a bit depending on packing and so
forth; still, the point remains that Emacs is not nearly as large as other
projects that use git.
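For anyone who wants to compare their own clones, figures like these can be gathered roughly as follows. This is only a sketch: the helper names count_commits and git_dir_kb are made up for illustration, the repository path in the usage comment is hypothetical, and the size reported will depend on how the repository is packed.

```shell
# Print the number of commits reachable from a branch (default: master).
# $1 = path to a git working tree, $2 = branch or ref name (assumed to exist)
count_commits() {
  git -C "$1" rev-list --count "${2:-master}"
}

# Print the disk usage of the .git subdirectory in kilobytes.
# $1 = path to a git working tree
git_dir_kb() {
  # du -sk reports usage in 1024-byte units; packing changes the result
  du -sk "$1/.git" | awk '{ print $1 }'
}

# Example use (hypothetical path):
#   count_commits ~/src/emacs master
#   git_dir_kb ~/src/emacs
```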
> If the slow down affects large transfers ("large" meaning either many objects
> or big objects) what happens if an Emacs hacker pauses his activity for
> several months and then pulls? (after monitoring emacs-diffs for several
> years, I can attest that this scenario is quite frequent.)
I doubt whether it will slow down such pulls significantly for typical Emacs
development. I just now did a simple benchmark that cloned Emacs master from
savannah to my desktop at UCLA, and the overhead from the transfer.fsckObjects
setting was swamped by noise due to the network being slower or faster. Here are
a few details:
    average times
   real   user+system   fsckObjects value
  212.4         202.6   true
  217.1         195.7   false
The command used was:
  git clone --config transfer.fsckObjects=VALUE git://git.sv.gnu.org/emacs.git
I warmed up with a clone that I discarded. I then tried three of
each command, interleaved, and took the averages.
This is just one benchmark of course, but it's suggestive.
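A benchmark of this shape could be scripted along the following lines. This is a rough sketch rather than the exact procedure used above: the function names time_clone, run_benchmark and average are invented for illustration, it records only wall-clock time (not user+system), and it assumes a date command that supports +%s.

```shell
# Clone URL with transfer.fsckObjects=VALUE into a scratch directory and
# print the elapsed wall-clock time in whole seconds.
# $1 = true or false, $2 = repository URL
time_clone() {
  value=$1 url=$2
  dir=$(mktemp -d) || return 1
  start=$(date +%s)
  if ! git clone --quiet --config transfer.fsckObjects="$value" "$url" "$dir/repo"; then
    rm -rf "$dir"
    return 1
  fi
  end=$(date +%s)
  rm -rf "$dir"
  echo $((end - start))
}

# Run interleaved true/false clones and print one "VALUE SECONDS" line each.
# $1 = repository URL, $2 = number of rounds (default 3)
run_benchmark() {
  url=$1 runs=${2:-3}
  i=0
  while [ "$i" -lt "$runs" ]; do
    for value in true false; do
      secs=$(time_clone "$value" "$url") || return 1
      echo "$value $secs"
    done
    i=$((i + 1))
  done
}

# Read one number per line on stdin and print the mean with one decimal.
average() {
  awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# Example use, after one discarded warm-up clone:
#   run_benchmark git://git.sv.gnu.org/emacs.git 3 > times.txt
#   awk '$1 == "true"  { print $2 }' times.txt | average
#   awk '$1 == "false" { print $2 }' times.txt | average
```

Interleaving the two settings, rather than running all of one and then all of the other, spreads slow and fast network periods across both.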
> If we wish to avoid tainted objects created by whatever cause, the check can
> be enabled on the Savannah's repo, hence limiting the problem to the
> "infected" user.
The problem would not be limited to the infected user, if that user pushes
commits. Git on the client could remove the taint without actually fixing the
problem. Although the pushed commits would have checksums that match their data,
the pushed data would be corrupt.
A possible source of problems in this area is an attack on the integrity of the
Emacs repository by a determined outsider. Any such attacks cannot be discounted
by appealing to estimates based purely on counts of random hardware or software
or configuration errors, as the attacks would not be random.
> The problem seems to be so rare that a single Emacs hacker experiencing it
> every decade or so
Perhaps you're right, but perhaps not. Really, we do not know how common this
particular problem is. In my experience strange problems with git occur more
often than once per decade.
> doesn't warrant the risks and inconveniences associated with using the
> setting (see my previous messages.)
Although I recall concerns based on hypothetical scenarios, I don't recall
descriptions of real inconveniences. Perhaps I missed an email or two.
> Are you a representative sample of the overwhelming majority of the Emacs
> devel population? (No, you aren't).
True. I expect I use git more than the average Emacs developer does. Stefan
too. So we will probably be more affected than usual by this change.
> Thanks, although it is now a bit late. It is already installed on
> several repos, for sure. And anybody that stumbles on the revisions that
> contained your change (by bisecting some bug, for example) will have his
> .git/config file modified. I'm pretty sure that you didn't think of these
> issues when you made the change (neither did I at first.)
No, I anticipated these issues. There's no evidence that they are significant
problems. git bisect will still work.
> I appreciate your concern but that has an easy solution: enable it on the
> server (and on your own machine, to be extra sure.) See, problem solved
That would not solve the problem. True, it would help against some random
errors, but it won't work in general even there, much less against a determined
attack.
I take your point that there's no rush in making this change, and it will be
helpful to gain experience on it among developers who prefer a more
bleeding-edge environment, so I have reverted the change on emacs-25 and
installed a considerably more conservative approach on master to help us get
started. The new approach does not alter git configuration if a developer
invokes plain './autogen.sh'. Instead a developer can invoke the script with a
newly-introduced extension: either './autogen.sh git' to configure just git, or
'./autogen.sh all' to configure both autoconf and git. I hope this helps address
the concerns raised in this thread.
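To see whether a given clone has picked up the setting, something like the following could be used. It is a sketch under assumptions: the helper name fsck_setting is made up, and this message does not list exactly which keys './autogen.sh git' sets, so transfer.fsckObjects is checked only because it is the setting discussed in this thread.

```shell
# Print the repository-local value of transfer.fsckObjects, or "unset"
# when the key is absent from that repository's .git/config.
# $1 = path to a git working tree
fsck_setting() {
  git -C "$1" config --local --get transfer.fsckObjects || echo unset
}

# Example use (run from the top of an Emacs checkout):
#   fsck_setting .       # before
#   ./autogen.sh git     # configure just git, per the new approach
#   fsck_setting .       # after
```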