From: Ludovic Courtès
To: Brice Waegeneire
Cc: Guix Devel, Maxim Cournoyer
Subject: Re: Getting rid of the mandb profile hook?
Date: Fri, 02 Apr 2021 18:56:19 +0200
Message-ID: <87lfa0zkuk.fsf@gnu.org>
In-Reply-To: <0170f58ece0b1bfd193f1566c37eddb8@waegenei.re>

Hello!

Brice Waegeneire wrote:

> On 2021-03-03 15:13, Ludovic Courtès wrote:
>>>> I’m thinking we could get rid of the mandb hook.  However, the

[...]

> What about using mandoc¹, the manpage compiler from OpenBSD, instead of
> man-db? According to its manual, it supports specifying the database
> location:
>
>   “makewhatis -d dir [file ...]”²

I recently packaged it, but I’m not impressed; I’m not even sure how
that’s supposed to work:

  $ guix environment --ad-hoc mandoc -- makewhatis -d /tmp/foo $(find -L ~/.guix-profile/share/man -name \*.[0-9].gz)

It exits successfully but does nothing.

At this point my preference would be to build a custom tool (I’m not
aware of any existing tool like that, but if you know of one, please
share) that would lazily build a database, ideally full-text, and search
through it. Attached is a super rough example that uses Guile-Xapian and
inserts man-db synopses into a Xapian database.

The tool would index man pages and Info pages. It would be smart enough
to index only info/man files that have actually changed (it could look
at the inode number to determine that, in a way that avoids unnecessary
cache invalidation). I’m not sure how to implement this part though.

It sounds like a good hack for our Xapian experts: I’m looking at you,
Arun, Ricardo, zimoun. :-)

I’d really like to have a rough solution so we can remove the
‘manual-database’ hook in time for the release.

Thoughts?

Ludo’.

[Attachment: doc.scm (Xapian as a man-db replacement)]

(use-modules (xapian wrap)
             (xapian xapian)
             (ice-9 match)
             (guix man-db)
             (srfi srfi-1)
             (srfi srfi-26))

;; eval: (put 'call-with-writable-database 'scheme-indent-function 1)

;; Index ENTRY, a mandb entry, in DB.  Its name and section are indexed
;; under the "A" and "B" prefixes; its synopsis is indexed as free text.
(define (index-mandb-entry db entry)
  (define (mandb-entry-id-term entry)
    ;; Unique term identifying ENTRY, e.g. "Qman:ls.1".
    (string-append "Q" "man:" (mandb-entry-name entry) "."
                   (number->string (mandb-entry-section entry))))

  (when (mandb-entry-name entry)
    (let* ((idterm (mandb-entry-id-term entry))
           (doc (make-document
                 #:data (object->string
                         `((name . ,(mandb-entry-name entry))
                           (section . ,(number->string
                                        (mandb-entry-section entry)))
                           (file . ,(canonicalize-path
                                     (mandb-entry-file-name entry)))))
                 #:terms `((,idterm . 0))))
           (term-generator (make-term-generator #:stem (make-stem "en")
                                                #:document doc)))
      (index-text! term-generator (mandb-entry-name entry) #:prefix "A")
      (index-text! term-generator
                   (number->string (mandb-entry-section entry))
                   #:prefix "B")
      (index-text! term-generator (mandb-entry-synopsis entry))
      ;; Replace any document already stored under IDTERM.
      (replace-document! db idterm doc))))

;; Index the man pages found in every directory listed in MANPATH into
;; the database at /tmp/db.
(define (index-mandb-entries)
  (call-with-writable-database "/tmp/db"
    (lambda (db)
      (for-each (cut index-mandb-entry db <>)
                ;; (mandb-entries "/run/current-system/profile/share/man")
                (append-map mandb-entries
                            (string-split (getenv "MANPATH") #\:))))))

;; Query parser built directly on the low-level QueryParser bindings.
;; Note that 'search' below calls 'parse-query', not this variant.
(define* (parse-query* querystring
                       #:key stemmer stemming-strategy
                       (prefixes '()) (boolean-prefixes '()))
  (let ((queryparser (new-QueryParser)))
    (QueryParser-set-stemmer queryparser stemmer)
    (when stemming-strategy
      (QueryParser-set-stemming-strategy queryparser stemming-strategy))
    (for-each (match-lambda
                ((field . prefix)
                 (QueryParser-add-prefix queryparser field prefix)))
              prefixes)
    (for-each (match-lambda
                ((field . prefix)
                 (QueryParser-add-boolean-prefix queryparser field prefix)))
              boolean-prefixes)
    (let ((query (QueryParser-parse-query queryparser querystring)))
      (delete-QueryParser queryparser)
      query)))

;; Return up to PAGESIZE matches for QUERYSTRING, each one being the
;; alist that 'index-mandb-entry' stored as the document data.
(define* (search querystring #:key (pagesize 100))
  (call-with-database "/tmp/db"
    (lambda (db)
      (let* ((query (parse-query querystring
                                 #:stemmer (make-stem "en")
                                 #:prefixes '(("name" . "A")
                                              ("section" . "B"))))
             (enq (enquire db query)))
        ;; (Enquire-set-sort-by-value enq 0 #f)
        (reverse
         (mset-fold (lambda (item acc)
                      (cons
                       (call-with-input-string
                           (document-data (mset-item-document item))
                         read)
                       acc))
                    '()
                    (enquire-mset enq #:maximum-items pagesize)))))))
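
On the "index only files that have actually changed" part: here is a minimal sketch of how the inode check could look. It uses plain Guile with no Xapian involved and is not part of doc.scm. The names file-fingerprint, changed-files and remember-files! are made up for illustration. The sketch assumes an in-memory hash table as the cache, so a real tool would have to persist it somewhere. It also records the device number and file size next to the inode number, since inode numbers are only unique within one file system and can be reused.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical helper: return a fingerprint of the file that FILE
;; resolves to.  'stat' follows symlinks, so a symlink in a profile and
;; its target yield the same fingerprint.
(define (file-fingerprint file)
  (let ((st (stat file)))
    (list (stat:dev st) (stat:ino st) (stat:size st))))

;; Return the elements of FILES whose fingerprint differs from the one
;; recorded in CACHE, a hash table mapping file names to fingerprints.
;; Files that are not in CACHE yet count as changed.
(define (changed-files cache files)
  (filter (lambda (file)
            (not (equal? (hash-ref cache file)
                         (file-fingerprint file))))
          files))

;; Record the current fingerprint of each of FILES in CACHE.
(define (remember-files! cache files)
  (for-each (lambda (file)
              (hash-set! cache file (file-fingerprint file)))
            files))
```

An indexer could then call changed-files on the list of man and Info files, index only the files it returns, and pass those same files to remember-files! once indexing succeeded.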