From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mark H Weaver Subject: Re: Latest guile-daemon changes and bewilderment Date: Tue, 25 Jul 2017 16:05:03 -0400 Message-ID: <877eywb7v4.fsf@netris.org> References: <8760eicf4m.fsf@cune.org> Mime-Version: 1.0 Content-Type: text/plain Return-path: Received: from eggs.gnu.org ([2001:4830:134:3::10]:57564) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1da651-0005zB-CO for guix-devel@gnu.org; Tue, 25 Jul 2017 16:05:20 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1da64y-00036k-5K for guix-devel@gnu.org; Tue, 25 Jul 2017 16:05:19 -0400 Received: from world.peace.net ([50.252.239.5]:44839) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1da64y-0002q9-1m for guix-devel@gnu.org; Tue, 25 Jul 2017 16:05:16 -0400 In-Reply-To: <8760eicf4m.fsf@cune.org> (Caleb Ristvedt's message of "Mon, 24 Jul 2017 05:18:17 -0500") List-Id: "Development of GNU Guix and the GNU System distribution." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org Sender: "Guix-devel" To: Caleb Ristvedt Cc: guix-devel@gnu.org Caleb Ristvedt writes: > This includes reference-scanning, which I think pretty much works, but > which I regret to say I spent too much time thinking about. Premature > optimization and all that. It takes a rather different approach to the > way the C++ code does it - it uses a trie to recognize references out of > a list of possibilities (namely, anything thrown in the build > environment's store). The idea is that additional pre-calculation can be > performed on each node and it can become a sort of branching skip table > of the sort used in that one string-matching algorithm whose name I > can't remember. Maybe you're thinking of the Boyer-Moore string search algorithm: https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm You might also want to take a look at the simple algorithm I cooked up to make grafting faster, in (guix build graft). In theory it might be improved by building some kind of lookup table based on the set of expected hashes, as you suggest, but I wasn't sure if it would be a net win, given the added code complexity and larger lookup tables. We can skip quite a lot simply based on the fact that Nix hashes include only 1/8 of the possible byte values, and exclude the most common letters from English text. More investigation is needed! Regards, Mark