Date: Sat, 14 Aug 2021 18:09:53 -0400
From: Konstantin Ryabitsev
To: Eric Wong
Cc: meta@public-inbox.org
Subject: Re: Boosts still not quite working
Message-ID: <20210814220953.e2qgzhax2mslmkhv@nitro.local>
References: <20210814134609.jcgq7je3yq4agzn6@nitro.local> <20210814204633.GA1020@dcvr>
In-Reply-To: <20210814204633.GA1020@dcvr>

On Sat, Aug 14, 2021 at 08:46:33PM +0000, Eric Wong wrote:
> > It was sent to iommu@lists.linux-foundation.org (mailman) and to a bunch of
> > vger lists, all of which have higher boosts in the configuration, e.g. netdev:
>
> boost doesn't come into effect due to the Mailman footer from
> the iommu list. extindex and v2 both account for content
> differences despite sharing the same Message-ID
> (deduplicating purely on Message-ID would be open to abuse
> (and many old MUA-side bugs))

Ah-ha, okay. I clearly misunderstood its purpose. No worries, I see now that
when we retrieve t.mbox.gz, the mailbox contains both the iommu and the vger
sources, which should allow me to pick the preferred version based on the
criteria I need (e.g. DKIM validation).

> I'm planning on having a "diff view" to more easily distinguish
> between different messages having the same Message-ID. It would
> make it easier to highlight buggy clients, Mailman misconfigurations,
> and malicious attempts to obscure/confuse readers.

Indeed, I agree that it's best to give access to both instead of always
returning the boosted-list version, since always returning the boosted copy
would let someone carry out malicious Message-ID stuffing. Now I just need to
hack b4 to define better criteria for when results contain multiple messages
with identical Message-IDs.

Thanks,
-K
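
For illustration, here is a minimal sketch of the selection step described above: group the messages from a thread mailbox by Message-ID and keep one copy per ID according to a caller-supplied check. This is not b4's actual code. The function names, the sample Message-ID, and the footer heuristic are all hypothetical. The `lacks_list_footer` check is only a crude stand-in for real DKIM validation, which would need a DKIM library and DNS lookups.

```python
import email
from email.message import Message
from typing import Callable, Dict, Iterable, List

# Assumption: a list-appended footer starts with a long rule of underscores.
FOOTER_MARKER = "_" * 20


def group_by_message_id(msgs: Iterable[Message]) -> Dict[str, List[Message]]:
    """Group messages that share a Message-ID, keeping first-seen order."""
    groups: Dict[str, List[Message]] = {}
    for msg in msgs:
        msgid = (msg.get("Message-ID") or "").strip()
        groups.setdefault(msgid, []).append(msg)
    return groups


def pick_preferred(
    msgs: Iterable[Message], is_trusted: Callable[[Message], bool]
) -> List[Message]:
    """Return one message per Message-ID.

    Prefers the first copy that passes is_trusted; falls back to the
    first copy seen when none passes.
    """
    chosen = []
    for copies in group_by_message_id(msgs).values():
        trusted = [m for m in copies if is_trusted(m)]
        chosen.append(trusted[0] if trusted else copies[0])
    return chosen


def lacks_list_footer(msg: Message) -> bool:
    """Hypothetical stand-in for DKIM validation: reject footer-modified bodies."""
    payload = msg.get_payload()
    return isinstance(payload, str) and FOOTER_MARKER not in payload


# Two copies of the same (made-up) message, one with a list footer appended.
vger_copy = email.message_from_string(
    "Message-ID: <example@nitro.local>\n"
    "List-Id: <netdev.vger.kernel.org>\n"
    "\n"
    "Hello\n"
)
iommu_copy = email.message_from_string(
    "Message-ID: <example@nitro.local>\n"
    "List-Id: <iommu.lists.linux-foundation.org>\n"
    "\n"
    "Hello\n" + FOOTER_MARKER + "\niommu mailing list\n"
)

picked = pick_preferred([iommu_copy, vger_copy], lacks_list_footer)
```

Passing the check in as a callable keeps the grouping logic independent of whichever criterion ends up being used, so a DKIM-based check could replace the footer heuristic without touching `pick_preferred`.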