From: Tobias Geerinckx-Rice via Bug reports for GNU Guix
Reply-To: Tobias Geerinckx-Rice
To: Mark H Weaver
Cc: 52338@debbugs.gnu.org, leo@famulari.name
Subject: bug#52338: Crawler bots are downloading substitutes
Date: Fri, 10 Dec 2021 23:52:51 +0100
Message-ID: <875yrw3vvk.fsf@nckx>
In-Reply-To: <87r1ak2m1p.fsf@netris.org>
References: <87r1ak2m1p.fsf@netris.org>

All,

Mark H Weaver writes:
> For what it's worth: during the years that I administered Hydra, I found
> that many bots disregarded the robots.txt file that was in place there.
> In practice, I found that I needed to periodically scan the access logs
> for bots and forcefully block their requests in order to keep Hydra from
> becoming overloaded with expensive queries from bots.

Very good point.

IME (which is a few years old at this point) at least the
highlighted BingBot & SemrushThing always respected my robots.txt,
but it's definitely a concern.  I'll leave this bug open to remind
us of that in a few weeks or so…

If it does become a problem, we (I) might add some basic
User-Agent sniffing to either slow down or outright block
non-Guile downloaders.  Whitelisting any legitimate ones, of
course.  I think that's less hassle than dealing with dynamic IP
blocks whilst being equally effective here.

Thanks (again) for taking care of Hydra, Mark, and thank you Leo
for keeping an eye on Cuirass :-)

T G-R
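
For illustration only, the User-Agent sniffing idea floated above could
look roughly like the sketch below.  It is not what runs on any Guix
server: the function name, the three-way allow/throttle/block outcome,
and both token lists are assumptions made up for this example (in
particular, that legitimate downloaders identify themselves with a
string containing "GNU Guile", and that the crawlers mentioned in this
thread send "bingbot" and "SemrushBot").

```python
# Hypothetical policy sketch.  The token lists are assumptions for
# illustration, not taken from any real server configuration.
ALLOWED_TOKENS = ("gnu guile",)              # assumed UA of legitimate downloaders
BLOCKED_TOKENS = ("bingbot", "semrushbot")   # assumed UA spellings of known crawlers


def classify_user_agent(user_agent):
    """Return 'allow', 'block' or 'throttle' for a User-Agent header value."""
    ua = (user_agent or "").lower()
    # Whitelisted clients are checked first so they are never penalised.
    if any(token in ua for token in ALLOWED_TOKENS):
        return "allow"
    # Known crawlers are rejected outright.
    if any(token in ua for token in BLOCKED_TOKENS):
        return "block"
    # Anything else (including a missing header) is slowed down, not rejected.
    return "throttle"
```

A front-end would call classify_user_agent() on each request and map
the result onto whatever rate-limiting or rejection mechanism it has.
Since the User-Agent header is trivially spoofed, this only helps
against crawlers that identify themselves honestly.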