Date: Wed, 14 Jun 2023 23:50:15 +0000
From: Eric Wong
To: Konstantin Ryabitsev
Cc: meta@public-inbox.org, a.fatoum@pengutronix.de, u.kleine-koenig@pengutronix.de
Subject: Re: Indicating the mirror's origin
Message-ID: <20230614235015.M82055@dcvr>
References: <20230614-icons-siren-usual-f1a72b@meerkat>
In-Reply-To: <20230614-icons-siren-usual-f1a72b@meerkat>

Konstantin Ryabitsev wrote:
> Good day:
>
> We've had a few requests to mirror public-inbox archives that originate on
> other systems so they can also be searchable and viewable via lore.kernel.org.
> I've been dragging my feet on these requests, because they are a potential
> liability in terms of GDPR compliance.

I just tried using `git replace' for the first time:

	git replace --edit $BLOB_OID

And all the `git cat-file --batch' invocations appear to work as
if the original blob contents never existed. Of course,
reindexing could be necessary, as would changing the git config
to ensure `git fetch' doesn't destroy elements in the
refs/replace/ namespace.
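
A minimal sketch of that experiment, assuming a throwaway repository
and made-up blob contents (`secret' and `redacted' are placeholders,
not real data). It uses the two-argument form of `git replace' rather
than `--edit' so it runs without an editor, which is roughly what
`--edit' leaves behind once the editor exits:

```shell
#!/bin/sh
set -e

# Throwaway repository so nothing real is touched (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Write an "original" blob and a redacted stand-in to the object store.
OLD=$(printf 'secret\n' | git hash-object -w --stdin)
NEW=$(printf 'redacted\n' | git hash-object -w --stdin)

# Create refs/replace/$OLD pointing at $NEW.
git replace "$OLD" "$NEW"

# Ordinary reads of the old OID now yield the replacement contents.
seen=$(git cat-file -p "$OLD")

# The original object is still stored; it shows up once replacement
# is disabled for the command.
orig=$(git --no-replace-objects cat-file -p "$OLD")

echo "default view: $seen"
echo "raw object:   $orig"

# List the replace refs that a fetch configuration would need to preserve.
git for-each-ref refs/replace/
```

The second read shows why this only hides the blob rather than
removing it: the original object stays in the repository until it is
actually purged.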

git clones/fetch still include both the original and replacement
blob, though (favoring the replacement), so perhaps
`git replace' isn't a fit...

Then, the worst case would be to temporarily remove the mirror, or
fork it (via -edit/-purge + subscribe) until upstream cleans
it up. lei's v2 inbox output could be used as a subscription
mechanism.

> If we are merely mirroring the archive from some other location, then there
> should be a clear indication of the origin of the data and contact information
> of the maintainer of the remote archive where someone could send requests for
> any data removal. It's best if this is visible both via the web view and in
> raw messages retrieved via our service, e.g. via an "X-Archive-Origin:" header
> or something similar.

I sometimes use the $INBOX_DIR/description file for that and it
affects WWW and NNTP, but not IMAP/POP3. I'm not sure if I want
to reintroduce header injection in case there's some conflict
with DKIM or other signature mechanisms[1].

> Any thoughts on this issue?

IANAL, obviously...

[1] https://public-inbox.org/meta/20201210214329.do66z6gzvepxc5w3@chatter.i7.local/