From: Mark H Weaver
Subject: bug#26201: hydra.gnu.org uses ‘guix publish’ for nars and narinfos
Date: Fri, 24 Mar 2017 04:12:50 -0400
Message-ID: <87y3vvozy5.fsf@netris.org>
In-Reply-To: <25b2472a-c705-53fe-f94f-04de9a2d484e@tobias.gr> (Tobias Geerinckx-Rice's message of "Thu, 23 Mar 2017 19:52:30 +0100")
To: Tobias Geerinckx-Rice
Cc: 26201@debbugs.gnu.org, guix-sysadmin@gnu.org

Hi,

Tobias Geerinckx-Rice writes:

> On 23/03/17 19:36, Mark H Weaver wrote:
>> One question: what will happen in the case of multiple concurrent
>> requests for the same nar? Will multiple nar-pack-and-bzip2
>> processes be run on-demand?
>
> I think this used to be the case with the previous nginx
> configuration, but the recent changes pushed by Ludo' were aimed in
> part at preventing that.
>
>> Recall that the nginx proxy will pass all of those requests through,
>
> Are you sure? I was under the impression¹ that this is exactly what
> ‘proxy_cache_lock on;’ prevents. I'm no nginx guru, obviously, so
> please -- anyone! -- correct me if I'm misguided.

I agree that "proxy_cache_lock on" should prevent multiple concurrent
requests for the same URL, but unfortunately its behavior is quite
undesirable, and arguably worse than leaving it off in our case. See:

  https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_lock

Specifically:

  Other requests of the same cache element will either wait for a
  response to appear in the cache or the cache lock for this element
  to be released, up to the time set by the proxy_cache_lock_timeout
  directive.

In our problem case, it takes more than an hour for Hydra to finish
sending a response for the 'texlive-texmf' nar. During that time, the
nar will be slowly sent to the first client while it's being packed
and bzipped on-demand.
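For reference, the directives in question look roughly like this (a
sketch only; the timeout values here are illustrative, not what is
actually deployed on hydra.gnu.org):

  # Sketch of the relevant directives in the nginx proxy config;
  # values are for illustration only.
  proxy_cache_lock on;
  proxy_cache_lock_timeout 5s;    # the default; see choice (2) below
  # proxy_cache_lock_timeout 2h;  # a "huge" value; see choice (1) below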
IIUC, with "proxy_cache_lock on", we have two choices of how other client requests will be treated: (1) If we increase "proxy_cache_lock_timeout" to a huge value, then there will *no* data sent to the other clients until the first client has received the entire nar, which means they wait over an hour before receiving the first byte. I guess this will result in timeouts on the client side. (2) If "proxy_cache_lock_timeout" is *not* huge, then all other clients will get failure responses until the first client has received the entire nar. Either way, this would cause users to see the same download failures (requiring user work-arounds like --fallback) that this fix is intended to prevent for 'texlive-texmf', but instead of happening only for that one nar, it will now happen for *all* large nars. Or at least that's what I'd expect based on my reading of the nginx docs linked above. I haven't tried it. IMO, the best solution is to *never* generate nars on Hydra in response to client requests, but rather to have the build slaves pack and compress the nars, copy them to Hydra, and then serve them as static files using nginx. A far inferior solution, but possibly acceptable and closer to the current approach, would be to arrange for all concurrent responses for the same nar to be sent incrementally from a single nar-packing process. More concretely, while packing and sending a nar response to the first client, the data would also be written to a file. Subsequent requests for the same nar would be serviced using the equivalent of: tail --bytes=3D+0 --follow FILENAME This way, no one would have to wait an hour to receive the first byte. What do you think? Mark