Date: Mon, 14 Sep 2020 12:01:14 -0400
From: Konstantin Ryabitsev
To: Eric Wong
Cc: meta@public-inbox.org
Subject: Re: brain dump detached/external index so far...
Message-ID: <20200914160114.kdwed3rkb52ibal4@chatter.i7.local>
References: <20200913065550.GA2337@dcvr>
In-Reply-To: <20200913065550.GA2337@dcvr>

On Sun, Sep 13, 2020 at 06:55:50AM +0000, Eric Wong wrote:
> Currently (and since the earliest days of this project
> supporting Xapian), indices were per-inbox.  This allowed
> inboxes to be isolated, making it easy to add and remove
> inboxes.
>
> The detached/external indices will allow a merging of
> several existing inboxes into a single (or several)
> virtual inboxes.
>
> Why?
>
>	We want a cross-inbox search in the WWW UI,
>	and perhaps an "All Mail" IMAP/JMAP inbox.

FYI, this is the most frequently requested feature for lore.kernel.org,
along with "give me an mbox.gz with all follow-ups, regardless of which
list they were sent to."

I think several virtual inboxes make more sense than always one global
search, as people may want to search something like "all Linux kernel
discussions" or "all gcc/compiler discussions". There could be different
frontends to indicate which search is running -- e.g.
"kernel.lore.kernel.org" vs. "gcc.lore.kernel.org".

> Advantages:
>
>	Deduplication built-in.  Cross-posted messages only get
>	expensive Xapian data indexed once (multiple List-Id can
>	get attached to each message).

As a side grumble, we found out that AWS SES (their email processing
service) will force-rewrite all Message-IDs without any option to
prevent this. So, a single message with multiple recipients will arrive
with a unique Message-ID for each one of them. This is so broken, it
blows my mind, but AWS doesn't care to fix it.

> Disclaimer:
>
>	My brain hasn't been working quite right...
>	Heat + pandemic + power outages + insects + poor air quality
>	have all been taking their toll :<

I hope things improve on the West Coast soon. The imagery alone was
frightening to watch.

Best regards,
-K
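
For illustration, the deduplication idea quoted above (index a
cross-posted message once and attach every List-Id to it) can be
sketched in Python as follows. This is a minimal sketch under stated
assumptions, not public-inbox's actual code: `SharedIndex`,
`message_id_key`, `content_key`, and `make_copy` are hypothetical names,
the `indexed` counter merely stands in for the expensive Xapian indexing
step, and the content-hash fallback is only one assumed way to cope with
rewritten Message-IDs. It also assumes single-part text messages.

```python
import hashlib
from collections import defaultdict
from email import message_from_string
from email.message import Message


def message_id_key(msg: Message) -> str:
    """Deduplicate on the Message-ID header alone."""
    return msg.get("Message-ID", "")


def content_key(msg: Message) -> str:
    """Hypothetical fallback: hash a few stable headers plus the body.

    Assumes a single-part text message, so get_payload() returns a str.
    """
    h = hashlib.sha256()
    for name in ("From", "Subject", "Date"):
        h.update((msg.get(name, "") + "\n").encode("utf-8"))
    h.update(str(msg.get_payload()).encode("utf-8"))
    return h.hexdigest()


class SharedIndex:
    """Toy cross-inbox index: one entry per message, many List-Ids."""

    def __init__(self, key_func):
        self.key_func = key_func
        self.list_ids = defaultdict(set)  # dedup key -> set of List-Id values
        self.indexed = 0  # how often the "expensive" indexing step ran

    def add(self, raw: str) -> None:
        msg = message_from_string(raw)
        key = self.key_func(msg)
        if key not in self.list_ids:
            self.indexed += 1  # stand-in for the expensive Xapian work
        self.list_ids[key].add(msg.get("List-Id", ""))


def make_copy(message_id: str, list_id: str) -> str:
    """Build one delivered copy of the same (made-up) cross-posted message."""
    return (
        "From: a@example.org\n"
        "Subject: test\n"
        "Date: Mon, 14 Sep 2020 12:01:14 -0400\n"
        f"Message-ID: {message_id}\n"
        f"List-Id: {list_id}\n"
        "\n"
        "Hello\n"
    )


if __name__ == "__main__":
    # Same Message-ID on both lists: indexed once, two List-Ids attached.
    idx = SharedIndex(message_id_key)
    idx.add(make_copy("<1@example.org>", "<a.example.org>"))
    idx.add(make_copy("<1@example.org>", "<b.example.org>"))
    print(idx.indexed, sorted(idx.list_ids["<1@example.org>"]))

    # Message-ID rewritten per recipient: Message-ID dedup no longer helps.
    rewritten = SharedIndex(message_id_key)
    rewritten.add(make_copy("<r1@example.net>", "<a.example.org>"))
    rewritten.add(make_copy("<r2@example.net>", "<b.example.org>"))
    print(rewritten.indexed)
```

With a key based only on Message-ID, the two rewritten copies are
treated as distinct messages and indexed twice. Passing `content_key` to
`SharedIndex` instead would collapse them again, at the cost of relying
on the hashed headers and body arriving unmodified.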