From: Bob Proulx
Newsgroups: gmane.emacs.help
Subject: Re: [git] puplic repo url incorrect or server overloaded?
Date: Thu, 13 Nov 2014 11:15:28 -0700
Message-ID: <20141113103047306372297@bob.proulx.com>
References: <87mw7vnud5.fsf@vsl28t2g.ww011>
To: "H. Dieter Wilhelm"
Cc: help-gnu-emacs@gnu.org

H. Dieter Wilhelm wrote:
> git://git.sv.gnu.org/emacs
> ...
> fatal: read error: Connection reset by peer
> ...
> Is this just a sign that the server is overloaded or am I'm doing
> something wrong?

There has been some miscommunication all around.  Let me bend
everyone's ear for a moment and fill in some behind-the-scenes
background information.

Emacs has made another attempt at converting from bzr to git.  I am
one of the volunteers who help administer the Savannah systems and
have been helping them with this process on the admin side.  Recently
esr made another conversion and uploaded 13G of source to the git
repository.

  vcs:~# du -sh /srv/git/emacs.git
  13G     /srv/git/emacs.git

I had set up the initial empty repository and didn't realize that the
recent upload was 13G.  Then it was announced as the new repository.
I didn't know about that announcement.  But I did notice that vcs
browned out and needed help.  I rebooted it twice to rescue it.  I had
to stop all of the git daemons another two times.

Think what happens.  Everyone and their dog tries to download a fresh
copy of the repository.  At 13G each!

The concurrency limit was 40 concurrent processes.  Each of those was
trying to download 13G.  That would take hours each.  It starved out
all other projects from being able to access their git repositories.
Worse is that it overloaded the system to the point that it was
inaccessible.  Turns out that it can't handle 40 simultaneous git
downloads of 13G each.  It pushed the system over a tipping point to
where running processes couldn't finish faster than new processes were
started.  The last system load logged was 350 before it stopped
responding at all.

Alerts notified me of the problem.  The server was unresponsive.  The
FSF admins and I were in conference about the problem.  We rebooted
the server to rescue it.  Things seemed okay for a bit.  Then the load
average would creep up again.  We lost it again.  I needed to reboot
it again.  Figured out that it was git that was doing it.  I was
forced to disable git in order to keep the system alive.  Shut down
git and killed all of the git downloads.

Of course initially it wasn't known that it was the new 13G emacs
repository that was the problem.  That became apparent only after
digging into the problem.  The git daemons don't log what they are
doing and all of the projects share the pool.  All that was visible
was that git was overloading the system.

I reduced the limits on git resources.  I added more virtual memory.
I turned overcommit off to keep the OOM killer from killing critical
processes.  Turned git on again and watched the process list closely.
Figured out that everyone was downloading emacs.

During this time your emacs git clone and others were probably getting
reset more than once.  What you were seeing was fallout from this.

Then we figured out that the emacs repository was new and 13G in size
and that was the problem!  I asked esr to repack it and upload it
again.  I moved the 13G repository out of the way to prevent the
continuing problem.  Killed all of the troubled git downloads and
restarted git so that other projects could function again.

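To put rough numbers on the overload described above, here is a
back-of-the-envelope sketch.  The 40 clients and 13G size come from
the message; the 1 Gbit/s link speed is purely a hypothetical
assumption for illustration, and the function names are made up.  It
models only network transfer and ignores the CPU and memory pressure
that the load average of 350 suggests were also in play.

```python
# Back-of-the-envelope model of the overload; not a measurement.
GIB = 1024 ** 3


def aggregate_bytes(clients: int, repo_bytes: int) -> int:
    """Total bytes the server must send if every client clones in full."""
    return clients * repo_bytes


def hours_per_clone(clients: int, repo_bytes: int,
                    link_bits_per_sec: float) -> float:
    """Hours for one clone when the link is split evenly among clients."""
    per_client_bits_per_sec = link_bits_per_sec / clients
    return repo_bytes * 8 / per_client_bits_per_sec / 3600


CLIENTS = 40        # concurrency limit from the message
REPO = 13 * GIB     # repository size from the message
LINK = 1e9          # ASSUMPTION: hypothetical 1 Gbit/s uplink

if __name__ == "__main__":
    print(aggregate_bytes(CLIENTS, REPO) // GIB, "GiB requested at once")
    print(round(hours_per_clone(CLIENTS, REPO, LINK), 2), "hours per clone")
```

Under these assumptions the 40 slots together ask for 520 GiB, and
every slot stays occupied for the whole transfer, which is why no
other project could get a connection in the meantime.
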
esr repacked the emacs archive and uploaded it again.  The new
repacked repository is now only 200M in size.  As you can imagine that
makes a world of difference!

  vcs:/srv/git/emacs.git# du -sh
  200M    .

Basically at this time everything is back to operating normally.  I
still have the reduced process limits for git in place though, because
at any time some other project might do the same thing.

This is all discussed at length in the firehose that is emacs-devel.
Unfortunately I don't have the time to read it right now, so as a note
to others: I am only reacting and helping when people CC me on tasks
from there.

And that is the behind the scenes of yesterday.

Bob