From: Bob Proulx <bob@proulx.com>
To: "H. Dieter Wilhelm" <dieter@duenenhof-wilhelm.de>
Cc: help-gnu-emacs@gnu.org
Subject: Re: [git] puplic repo url incorrect or server overloaded?
Date: Thu, 13 Nov 2014 11:15:28 -0700
Message-ID: <20141113103047306372297@bob.proulx.com>
In-Reply-To: <87mw7vnud5.fsf@vsl28t2g.ww011>

H. Dieter Wilhelm wrote:
>     git://git.sv.gnu.org/emacs
> ...
>       fatal: read error: Connection reset by peer
> ...
> Is this just a sign that the server is overloaded or am I'm doing
> something wrong?

There has been some miscommunication all around.  Let me bend
everyone's ear for a moment and fill in some behind the scenes
background information.

Emacs has made another attempt at converting from bzr to git.  I am
one of the volunteers who help administer the Savannah systems and
have been helping the Emacs developers with this process on the admin
side.

Recently esr made another conversion and uploaded 13G of source to the
git repository.  

  vcs:~# du -sh /srv/git/emacs.git
  13G     /srv/git/emacs.git
    
I had set up the initial empty repository and didn't realize that the
recent upload was 13G.  I also didn't know it had been announced as
the new source location.  But I did notice that vcs browned out and
needed help.  I rebooted it twice to rescue it, and on two other
occasions I had to stop all of the git daemons.

Think what happens.  Everyone and their dog tries to download a fresh
copy of the repository.  At 13G each!  The concurrency limit was 40
concurrent processes.  Each of those was trying to download 13G.
That would take hours each.  It starved out all other projects from
being able to access their git repositories.
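
As a rough sketch of that arithmetic: the 40 slots and the 13G size are
the figures given above, while the 100 MB/s aggregate send rate is a
made-up number used only to show the order of magnitude, not a
measurement of vcs.  The function names are illustrative.

```shell
#!/bin/sh
# total_gb CLONES SIZE_GB
# Data in flight if every slot is busy serving a full clone.
total_gb() {
    echo $(( $1 * $2 ))
}

# hours_to_drain TOTAL_GB RATE_MB_PER_S
# Whole hours (rounded up) needed to send TOTAL_GB at the given rate.
hours_to_drain() {
    secs=$(( $1 * 1024 / $2 ))
    echo $(( (secs + 3599) / 3600 ))
}

total=$(total_gb 40 13)    # 40 slots x 13G each
echo "in flight: ${total}G"

# ASSUMPTION: 100 MB/s is hypothetical; disk and CPU contention on a
# loaded server would make the real figure worse.
echo "hours to drain at 100 MB/s: $(hours_to_drain "$total" 100)"
```

Even under that optimistic assumption every slot stays occupied for
hours, which is why the other projects were starved.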

Worse is that it overloaded the system to the point that it was
inaccessible.  Turns out that it can't handle 40 simultaneous git
downloads of 13G each.  It pushed the system over a tipping point to
where other processes couldn't finish faster than new processes were
started.  The last system load logged was 350 before it stopped
responding at all.

Alerts notified me of the problem.  The server was unresponsive.  The
FSF admins and I conferred about it.  We rebooted the server to rescue
it.  Things seemed okay for a bit.  Then the load average crept up
again.  We lost it again.  I needed to reboot it again.  Figured out
that it was git that was doing it.  I was forced to disable git in
order to keep the system alive: shut down git and killed all of the
git downloads.

Of course initially it wasn't known that it was the new 13G emacs
repository that was the problem.  That became apparent only after
digging into the problem.  The git daemons don't log what they are
doing and all of the projects share the pool.  All I could see was
that git was overloading the system.  I reduced the limits on git
resources.  I added more virtual memory.  I turned memory overcommit
off to keep the OOM killer from killing critical processes.  Turned
git on again and watched the process list closely.  Figured out that
everyone was downloading emacs.
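
One possible way to get that picture from the process list on Linux,
offered as a sketch and not necessarily what was done on vcs.  It
assumes each clone is served by a process with "upload-pack" in its
command line whose working directory is the repository being served.
The function names are illustrative.

```shell
#!/bin/sh
# count_by_repo
# Reads one repository path per line on stdin and prints "COUNT PATH",
# busiest repository first.  Paths containing spaces are not handled.
count_by_repo() {
    sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# list_upload_pack_cwds
# ASSUMPTION: Linux /proc is available and each serving process runs
# with the repository as its current directory.
list_upload_pack_cwds() {
    for pid in $(pgrep -f 'upload-pack'); do
        readlink "/proc/$pid/cwd" 2>/dev/null
    done
}

# Usage, as root on the server:
#   list_upload_pack_cwds | count_by_repo
```

If one repository accounts for nearly every line of the output, that
is the one filling the shared pool.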

During this time your emacs git clone and others were probably getting
reset more than once.  What you were seeing was fallout from this.

Once we figured out that the emacs repository was new, 13G in size,
and the cause of the problem, I asked esr to repack it and upload it
again.  I moved the 13G repository out of the way to prevent the
continuing problem, killed all of the troubled git downloads, and
restarted git so that other projects could function again.  esr
repacked the emacs archive and uploaded it again.  The new repacked
repository is now only 200M in size.  As you can imagine that makes a
world of difference!

  vcs:/srv/git/emacs.git# du -sh
  200M    .
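
For anyone wanting to check their own repository before uploading, a
minimal sketch of a repack with before and after sizes.  The exact
command used for the emacs repository is not stated here, so the
`git repack -a -d -f` invocation is an assumption about one reasonable
way to do it, and the function name is illustrative.

```shell
#!/bin/sh
# repack_and_measure REPO
# Prints the on-disk size of REPO, repacks everything into a single
# pack with deltas recomputed, then prints the size again.
repack_and_measure() {
    repo=$1
    echo "before: $(du -sh "$repo" | cut -f1)"
    # -a: put all objects in one pack   -d: delete the redundant packs
    # -f: recompute deltas instead of reusing existing ones
    git -C "$repo" repack -a -d -f -q
    echo "after:  $(du -sh "$repo" | cut -f1)"
}

# Usage:
#   repack_and_measure /path/to/project.git
```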

At this point everything is back to operating normally.  I still have
the reduced process limits for git in place, though, because at any
time some other project might do the same thing.

This is all discussed at length in the firehose that is emacs-devel.
Unfortunately I don't have the time to read it right now, so as a note
to others: I am only reacting and helping when people CC me on tasks
from there.

And that is the behind-the-scenes story of yesterday.

Bob



