Date: Thu, 26 Nov 2020 19:45:43 +0000
From: Eric Wong
To: workflows@vger.kernel.org
Cc: meta@public-inbox.org
Subject: WIP: searching all of lore
Message-ID: <20201126194543.GA30337@dcvr>

Requires Tor, for now:

http://rskvuqcfnfizkjg6h5jvovwb3wkikzcwskf54lfpymus6mxrzw67b5ad.onion/all/
http://lore.czquwvybam4bgbro.onion/all/

It seems v3 onions with longer URLs are more secure, these days;
but they require a newer Tor.

On Debian or RH-based systems, it's as easy as:

	<apt-get|yum> install tor torsocks

	# assuming w3m is installed:
	torsocks w3m http://rskvuqcfnfizkjg6h5jvovwb3wkikzcwskf54lfpymus6mxrzw67b5ad.onion/all/

Other browsers can configure SOCKS5 proxies to 127.0.0.1:9050
or be wrapped via torsocks (which uses LD_PRELOAD)

Disclaimers:

I don't know much about Tor security (or security in general).
I see Tor as an alternative to paying corrupt organizations
(ICANN), and it lets me self-host without a static IP address.

I've also had numerous ISP and power outages this year, probably
because more neighbors are home due to the pandemic, so don't
expect 99.999% uptime, either. And I'm a klutz who always trips
over cables on (relatively) good days, and I've only had bad
days since March :<

How to replicate
----------------

	# I'm using the following to update my mirror from lore.kernel.org
	# (see grokmirror docs for more).
	# Old command:
	grok-pull -v -c repos.conf
	public-inbox-index --all # update per-inbox indices

	# add "-L basic" or "-L medium" to reduce space requirements
	# to either -extindex or -index commands

	# The new command, not finalized yet:
	public-inbox-extindex --all -v /path/to/ALL

The following changes go in an otherwise boring ~/.public-inbox/config
(or whatever $PI_CONFIG is set to)

	; not stable or finalized, yet:
	; this section allows, "all" is a special case, currently
	[extindex "all"]
		topdir = /path/to/ALL

	; these are already documented in public-inbox-config(5)
	[publicinbox]
		; 'all' ignores domain name matching,
		; useful for inboxes served via multiple domains
		wwwlisting = all
		grokManifest = all

		; users with larger machines may want to bump this,
		; the default is for machines w/ 256MB RAM
		; (which I still use, sometimes)
		indexBatchSize = 100m

	# I'm using the following to update from lore.kernel.org
	# (see grokmirror docs for more)
	grok-pull -v -c repos.conf
	public-inbox-index --all # update per-inbox indices
	public-inbox-extindex --all -v /path/to/ALL # index [extindex "all"]

This is running commit 95cb3e48fc5c4e847cdc111c2c8c9f0b70bdea56

	git clone https://public-inbox.org/public-inbox.git

More changes coming (JMAP, speedups), and there's probably still
lots of stuff broken and in need of fixing (including my brain :<)
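
For anyone who would rather script the config changes from "How to
replicate" than edit the file by hand, here is a rough sketch.  It
relies on the public-inbox config being in git-config syntax, so
git-config(1) can write the keys.  The scratch file and the "cfg"
variable are my own assumptions for illustration; nothing here touches
the real ~/.public-inbox/config, and the [extindex] keys are not
finalized, so treat this as a sketch rather than a recipe.

```shell
#!/bin/sh
# Sketch only: write the keys from the message into a scratch file
# instead of the real ~/.public-inbox/config (or $PI_CONFIG).
set -e

# hypothetical scratch location, not something public-inbox knows about
cfg=$(mktemp)

# [extindex "all"] section; not yet stable, may change
git config -f "$cfg" extindex.all.topdir /path/to/ALL

# [publicinbox] keys, documented in public-inbox-config(5)
git config -f "$cfg" publicinbox.wwwlisting all
git config -f "$cfg" publicinbox.grokManifest all
git config -f "$cfg" publicinbox.indexBatchSize 100m

# show what was written, for review before copying anything over
cat "$cfg"
```

Once the output looks right, the same git config invocations can be
pointed at the real config file instead of the scratch one.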