From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS6315 166.70.0.0/16 X-Spam-Status: No, score=-3.7 required=3.0 tests=BAYES_00,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from out02.mta.xmission.com (out02.mta.xmission.com [166.70.13.232]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by dcvr.yhbt.net (Postfix) with ESMTPS id C9BA31F461; Sat, 18 May 2019 15:23:25 +0000 (UTC) Received: from in02.mta.xmission.com ([166.70.13.52]) by out02.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.87) (envelope-from ) id 1hS1B1-0000pL-Hi; Sat, 18 May 2019 09:23:18 -0600 Received: from ip72-206-97-68.om.om.cox.net ([72.206.97.68] helo=x220.xmission.com) by in02.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.87) (envelope-from ) id 1hS1Az-0003SN-Sj; Sat, 18 May 2019 09:23:11 -0600 From: ebiederm@xmission.com (Eric W. Biederman) To: Eric Wong Cc: meta@public-inbox.org References: <877eap26rx.fsf@xmission.com> <20190518071518.fqcwveehh726wxdu@dcvr> Date: Sat, 18 May 2019 10:22:45 -0500 In-Reply-To: <20190518071518.fqcwveehh726wxdu@dcvr> (Eric Wong's message of "Sat, 18 May 2019 07:15:18 +0000") Message-ID: <87imu7ss6y.fsf@xmission.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-XM-SPF: eid=1hS1Az-0003SN-Sj;;;mid=<87imu7ss6y.fsf@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=72.206.97.68;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX18hkQXwWqi/bfNDe4r4JTCCT5M01uXPggw= X-SA-Exim-Connect-IP: 72.206.97.68 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: Re: [RFC][PATCH] Config.pm: Add support for mailing list information X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) List-Id: Eric Wong writes: > "Eric W. Biederman" wrote: >> >> The world has turned since I first started following mailing lists and >> to my surprise every mailling list that I am subscribed to properly >> sets the "List-ID:" mailing list header. So instead of doing >> something clever and flexible I am adding support for looking up >> public inbox mailing lists by their mailing list name. > > Agreed, and good to know List-Id is getting standardized on. > >> That makes the work needed for each email trivial and easy to understand. >> - Parse the "List-ID:" header. >> - Lookup in the configuration which mailbox is connected to that >> "List-ID:" >> - Deliver the mail to that mailbox. > > This is a great idea. One great side-effect would be avoiding > the the need to expose the mirror delivery address for server > admins who use -mda for mirroring lists: > > https://public-inbox.org/meta/20180612170546.GA5945@chatter/ Yes. I see. I can easily have it do that. For email mirrors this works very well. >> To that end this change enhances PublicInbox to have an additional >> mailbox configuration parameter "list" that holds the mailling list >> name. > > I think using "listId" in configs would be preferable, as I > think it's clear that we're matching the RFC2919 header and > not any other thing like "X-Mailing-List". > > We'll use camelCase for user docs, and only alllowercase for > parsing (consistent with git config naming). I actually hate > that naming convention of git config keys, so "list_id" for > internal variables. > > Spelling: s/mailling/mailing/g (you seem to spell it both ways > throughout the email) Got it. I think I am suffering from a bit of new Daddy syndrome. I will clean that up before I resubmit. > Also, as a Perl user, the word "list" alone is also a bit > ambiguous because it tends to get used interchangeably with > "array"; and it could be confused in my brain as a throwaway > variable. Makes perfect sense. >> A method is added to the PublicInbox config object called >> lookup_list that given a mailing list name will return >> the PublicInbox in the configarion that is configured to >> handle that mailing list. >> >> Signed-off-by: "Eric W. Biederman" >> --- >> >> Some context. I have my mailing list email coming in via imap, and I >> have a script that looks at List-ID and delivers them to the appropriate >> public-inbox. I am hoping to get my script at least into the PublicInbox >> scripts directory. >> >> I have looked and see that you a watchheader directive that effectively >> does the same thing that I am doing. Even so, I think a mailing list >> having one or more mailling list names is more fundamental. A mailing >> name is a well supported concept and it differs from an address >> in that a mailing list name does not have an "@". > > Yes, for -watch, watchheader is a more flexible variant of this. > listId could be used as a shortcut. Yes. I will se what I can arrange for updating -watch and -mda. >> So as part of getting my small changes to your PublicInbox code >> to support my email fetching script. Do you mind the addition >> of a mailing list name configuration parameter? >> If you don't mind can we apply the patch below? > > Don't mind at all, but I would like some comments in this > email addressed, along with some tests for -mda, at least. Noted I will aim at getting that done. >> Do I need to tweak >> public-inbox-watch to be able to use the list parameter? > > I think translating the listId directive into a watchheader > regexp will be sufficient. Yes. That should be straight forward. I will do that. >> I can't >> really test it as my setup is quite different but I am willing to >> work things through so that the code is harmonized and people >> can benefit from each others improvements. > > Looks like watchheader is also missing tests :x > >> lib/PublicInbox/Config.pm | 33 ++++++++++++++++++++++++++++++++- >> 1 file changed, 32 insertions(+), 1 deletion(-) >> >> diff --git a/lib/PublicInbox/Config.pm b/lib/PublicInbox/Config.pm >> index 09f9179b085a..70bd88f51556 100644 >> --- a/lib/PublicInbox/Config.pm >> +++ b/lib/PublicInbox/Config.pm >> @@ -25,6 +25,7 @@ sub new { >> >> # caches >> $self->{-by_addr} ||= {}; >> + $self->{-by_list} ||= {}; > > -by_list_id ack >> $self->{-by_name} ||= {}; >> $self->{-by_newsgroup} ||= {}; >> $self->{-no_obfuscate} ||= {}; >> @@ -84,6 +85,32 @@ sub lookup { >> _fill($self, $pfx); >> } >> >> +sub lookup_list { > > lookup_list_id ack >> + my ($self, $list) = @_; >> + my $inbox = $self->{-by_list}->{$list}; >> + return $inbox if $inbox; > > "my $ibx" > > I've been starting to standardize on $ibx as the identifier for > PublicInbox::Inbox references (and maybe leave $inbox for > human-readable names). > ack. >> + foreach my $k (keys %$self) { >> + $k =~ /\A(publicinbox\.[\w-]+)\.list\z/ or next; > > \.listid\z/ ack >> + my $v = $self->{$k}; >> + if (ref($v) eq "ARRAY") { >> + foreach my $alias (@$v) { >> + ($alias eq $list) or next; >> + $pfx = $1; >> + last; >> + } >> + } else { >> + ($v eq $list) or next; >> + $pfx = $1; >> + last; >> + } > > It looks these "eq" comparisons should be case-insensitive: > > https://tools.ietf.org/html/rfc2919#section-6 Interesting. I have yet to see it matter. But it isn't hard to change the code to do: (lc($v) eq lc($list). >> + } >> + defined $pfx or return; >> + _fill($self, $pfx); >> +} >> + >> sub lookup_name ($$) { >> my ($self, $name) = @_; >> $self->{-by_name}->{$name} || _fill($self, "publicinbox.$name"); >> @@ -389,7 +416,7 @@ sub _fill { >> } >> # TODO: more arrays, we should support multi-value for >> # more things to encourage decentralization >> - foreach my $k (qw(address altid nntpmirror coderepo hide)) { >> + foreach my $k (qw(address altid nntpmirror coderepo hide list)) { >> if (defined(my $v = $self->{"$pfx.$k"})) { >> $ibx->{$k} = _array($v); >> } >> @@ -412,6 +439,10 @@ sub _fill { >> $self->{-by_addr}->{$lc_addr} = $ibx; >> $self->{-no_obfuscate}->{$lc_addr} = 1; >> } >> + foreach (@{$ibx->{list}}) { >> + my $list = $_; >> + $self->{-by_list}->{$list} = $ibx; >> + } > > No sense in "my $list = $_"; either "foreach my $list_id" for > larger loops, or use $_ for tiny loops. > > Anyways, great idea, needs some minor tweaks. Thanks! Will make them and get this posted again. Eric