all messages for Emacs-related lists mirrored at yhetil.org
From: Paul Eggert <eggert@cs.ucla.edu>
To: "Óscar Fuentes" <ofv@wanadoo.es>, emacs-devel@gnu.org
Subject: Re: Recommend these .gitconfig settings for git integrity.
Date: Tue, 2 Feb 2016 23:31:00 -0800
Message-ID: <56B1ACB4.7090705@cs.ucla.edu>
In-Reply-To: <878u33gd99.fsf@wanadoo.es>

On 02/02/2016 10:48 AM, Óscar Fuentes wrote:
> Emacs is no small project.

Emacs is considerably smaller than other projects that Git regularly deals with. 
The Emacs master branch has 125,000 commits; the Linux kernel has 574,000 
commits in its master branch (this omits history before 2005). I've seen reports 
of git repositories at Facebook where the .git subdirectory contains 50 GB. By 
comparison, my repository for Emacs master takes 0.25 GB and my repository for 
the Linux kernel takes 1.7 GB. These numbers can vary quite a bit depending on packing and so 
forth; still, the point remains that Emacs is not nearly as large as other 
projects that use git.
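
Figures like the ones above can be reproduced on a local checkout. The following is a minimal sketch, assuming a non-bare clone whose metadata lives in a .git subdirectory; the function names count_commits and git_dir_kib are hypothetical helpers, not part of git.

```shell
#!/bin/sh
# Count the commits reachable from a ref (default: HEAD) in the
# repository at $1.
count_commits() {
    git -C "$1" rev-list --count "${2:-HEAD}"
}

# Report the on-disk size, in KiB, of the .git subdirectory of the
# checkout at $1. Assumes a non-bare repository.
git_dir_kib() {
    du -sk "$1/.git" | awk '{print $1}'
}
```

For example, `count_commits ~/src/emacs master` and `git_dir_kib ~/src/emacs`, where ~/src/emacs stands for wherever the clone happens to live. As noted above, the size depends heavily on how the repository is packed.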

> If the slow down affects large transfers ("large" meaning either many objects 
> or big objects) what happens if an Emacs hacker pauses his activity for 
> several months and then pulls? (after monitoring emacs-diffs for several 
> years, I can attest that this scenario is quite frequent.) 

I doubt whether it will slow down such pulls significantly for typical Emacs 
development. I just now did a simple benchmark that cloned Emacs master from 
savannah to my desktop at UCLA, and the overhead from the transfer.fsckObjects 
setting was swamped by noise due to the network being slower or faster. Here are 
a few details:

     average times
     real      user+system   fsckObjects value
     212.4       202.6           true
     217.1       195.7           false

     The command used was:
     git clone --config transfer.fsckObjects=VALUE git://git.sv.gnu.org/emacs.git

     I warmed up with a clone that I discarded. I then tried three of
     each command, interleaved, and took the averages.

This is just one benchmark of course, but it's suggestive.
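
The procedure described above (one discarded warm-up clone, then interleaved runs of each setting) could be scripted roughly as follows. This is a sketch, not the script behind the numbers above: the function name bench_clone is hypothetical, it records only wall-clock time in whole seconds via `date +%s`, and it does not capture user+system time.

```shell
#!/bin/sh
# Clone $1 repeatedly with transfer.fsckObjects=true and =false,
# interleaved, $2 times each (default 3). Prints one "VALUE SECONDS"
# line per timed clone. Every clone is deleted after it is timed.
bench_clone() {
    url=$1
    runs=${2:-3}
    work=$(mktemp -d) || return 1

    # Warm-up clone, discarded and not timed.
    git clone --quiet "$url" "$work/warmup" || return 1
    rm -rf "$work/warmup"

    i=1
    while [ "$i" -le "$runs" ]; do
        for value in true false; do
            start=$(date +%s)
            git clone --quiet --config transfer.fsckObjects="$value" \
                "$url" "$work/clone-$value-$i" || return 1
            end=$(date +%s)
            echo "$value $((end - start))"
            rm -rf "$work/clone-$value-$i"
        done
        i=$((i + 1))
    done
    rm -rf "$work"
}

# To average per setting, the output can be piped through awk, e.g.:
#   bench_clone URL 3 |
#     awk '{s[$1] += $2; n[$1]++} END {for (v in s) print v, s[v] / n[v]}'
```

With only a handful of runs, network variation can easily exceed any difference between the two settings, so the averages should be read with that in mind.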

> If we wish to avoid tainted objects created by whatever cause, the check can 
> be enabled on the Savannah's repo, hence limiting the problem to the 
> "infected" user.

The problem would not be limited to the infected user, if that user pushes 
commits. Git on the client could remove the taint without actually fixing the 
problem. Although the pushed commits would have checksums that match their data, 
the pushed data would be corrupt.

A possible source of problems in this area is an attack on the integrity of the 
Emacs repository by a determined outsider. Any such attacks cannot be discounted 
by appealing to estimates based purely on counts of random hardware or software 
or configuration errors, as the attacks would not be random.

> The problem seems to be so rare that a single Emacs hacker experiencing it 
> every decade or so

Perhaps you're right, but perhaps not. Really, we do not know how common this 
particular problem is. In my experience strange problems with git occur more 
often than once per decade.

> doesn't warrant the risks and inconveniences associated with using the 
> setting (see my previous messages.)

Although I recall concerns based on hypothetical scenarios, I don't recall 
descriptions of real inconveniences. Perhaps I missed an email or two.

> Are you a representative sample of the overwhelming majority of the Emacs 
> devel population? (No, you aren't).

True. I expect I use git more than the average Emacs developer does. Stefan 
too. So we will probably be more affected than usual by this change.

> Thanks, although it is now a bit late. It is already installed on
> several repos, for sure. And anybody that stumbles on the revisions that
> contained your change (by bisecting some bug, for example) will have his
> .git/config file modified. I'm pretty sure that you didn't think of these
> issues when you made the change (neither did I at first.)

No, I anticipated these issues. There's no evidence that they are significant 
problems. git bisect will still work.

> I appreciate your concern but that has an easy solution: enable it on the 
> server (and on your own machine, to be extra sure.) See, problem solved

That would not solve the problem. True, it would help against some random 
errors, but it won't work in general even there, much less against a determined 
attack.
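
For reference, enabling the check on one's own machine, as suggested in the quoted text, might look like the sketch below. It assumes the setting under discussion is transfer.fsckObjects and applies it to a single repository only; the function names are hypothetical. Note that the setting checks objects as they are transferred, whereas `git fsck --full` checks objects already in the repository.

```shell
#!/bin/sh
# Enable object checking on transfer for the repository at $1 only.
# This writes to that repository's own configuration; global
# configuration is not touched.
enable_fsck_on_transfer() {
    git -C "$1" config --local transfer.fsckObjects true
}

# Run a full integrity check of the objects already stored in the
# repository at $1. Exits non-zero if git fsck reports errors.
check_repo() {
    git -C "$1" fsck --full
}
```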

I take your point that there's no rush in making this change, and it will be 
helpful to gain experience on it among developers who prefer a more 
bleeding-edge environment, so I have reverted the change on emacs-25 and 
installed a considerably more conservative approach on master to help us get 
started. The new approach does not alter git configuration if a developer 
invokes plain './autogen.sh'. Instead a developer can invoke the script with a 
newly-introduced extension: either './autogen.sh git' to configure just git, or 
'./autogen.sh all' to configure both autoconf and git. I hope this helps address 
the concerns raised in this thread.
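
A developer who wants to see exactly what './autogen.sh git' or './autogen.sh all' changes can compare the repository-local git configuration before and after the run. The helper below is a sketch with a hypothetical name, show_config_changes; it makes no assumption about which settings the script writes.

```shell
#!/bin/sh
# Run a command inside the repository at $1 and print the lines of the
# repository-local git configuration that the command added (">") or
# removed ("<"), in ordinary diff format.
show_config_changes() {
    repo=$1
    shift
    before=$(mktemp) || return 1
    after=$(mktemp) || return 1
    git -C "$repo" config --local --list | sort >"$before"
    (cd "$repo" && "$@")
    git -C "$repo" config --local --list | sort >"$after"
    diff "$before" "$after"
    rm -f "$before" "$after"
}
```

For example, `show_config_changes ~/src/emacs ./autogen.sh git`, where ~/src/emacs stands for the local checkout. Running it with plain `./autogen.sh` should, per the description above, print no configuration changes.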


