From: Ludovic Courtès
Subject: Re: guix gc takes long in "deleting unused links ..."
Date: Mon, 04 Feb 2019 22:11:34 +0100
Message-ID: <874l9jqmwp.fsf@gnu.org>
In-Reply-To: <87womjcfoi.fsf@cune.org> (Caleb Ristvedt's message of "Fri, 01 Feb 2019 16:22:21 -0600")
To: Caleb Ristvedt
Cc: guix-devel@gnu.org

Hi!

Caleb Ristvedt writes:

[...]

> Ideally, the reference-counting approach to removing files would work
> the same as in programming languages: as soon as a reference is removed,
> check whether the reference count is now 0 (in our case 1, since an
> entry would still exist in .links). In our case, we'd actually have to
> check prior to losing the reference whether the count *would become* 1,
> that is, whether it is currently 2. But unlike in programming languages,
> we can't just "free a file" (more specifically, an inode). We have to
> delete the last existing reference, in .links.
> The only way to find that
> is by hashing the file prior to deleting it, which could be quite
> expensive, but for any garbage collection targeting a small subset of
> store items it would likely still be much faster. A potential fix there
> would be to augment the store database with a table mapping store paths
> to hashes (hashes already get computed when store items are
> registered). Or we could switch between the full-pass and incremental
> approaches based on characteristics of the request.

Note that the database would need to contain hashes of individual files,
not just store items (it already contains hashes of store item nars).

This issue was discussed a while back at . Back then we couldn’t agree
on a solution, but it’d be good to have your opinion with your fresh
mind!

>> Or better: Is it safe here to just hit CTRL-C (and let the daemon work
>> in background, or whatever)?
>
> I expect that CTRL-C at that point would cause the guix process to
> terminate, closing its connection to the daemon. I don't believe the
> daemon uses asynchronous I/O, so it wouldn't be affected until it tried
> reading or writing from/to that socket.

The guix-daemon child that handles the session would immediately get
SIGHUP and terminate (I think), but that’s fine: it’s just that files
that could have been removed from .links will still be there.

Ludo’.
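
For illustration only, the incremental approach quoted above (check
whether the link count is currently 2, then hash the file to locate its
entry in .links) might be sketched roughly as below. This is not the
daemon's code. The function names `file_hash` and `delete_deduplicated`
are hypothetical, and naming the .links entry after the hex SHA-256 of
the file contents is an assumption made to keep the sketch short; the
real naming scheme may differ. Races with concurrent store operations
are ignored.

```python
import hashlib
import os
import stat


def file_hash(path):
    """Return the hex SHA-256 of a file's contents (assumed naming scheme)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def delete_deduplicated(path, links_dir):
    """Delete `path`; if it was the last user of its .links entry, delete that too.

    Returns True when the entry in `links_dir` was removed as well.
    """
    st = os.lstat(path)
    # A count of 2 means: this name plus the entry in .links, nothing else.
    last_user = stat.S_ISREG(st.st_mode) and st.st_nlink == 2
    removed_entry = False
    if last_user:
        # The expensive step: hashing is the only way to find the entry.
        entry = os.path.join(links_dir, file_hash(path))
        try:
            # Only unlink the entry if it really is the same inode.
            if os.path.samestat(os.lstat(entry), st):
                os.unlink(entry)
                removed_entry = True
        except FileNotFoundError:
            pass  # No entry under that name: nothing to clean up.
    os.unlink(path)
    return removed_entry
```

The hashing cost is paid only for files whose link count is exactly 2,
which is why this could beat a full pass over .links when a collection
targets a small subset of store items. A table of per-file hashes in the
database would remove that cost altogether.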