* Help setting up a public-inbox instance: importing from maildir archive isn't working
From: Rebecca Cran @ 2023-06-13 14:49 UTC
To: meta
I'm trying to set up a public-inbox instance to mirror the edk2-devel
mailing list. I'm using version 1.9.0.
I'm having problems importing a mirror of 105,000 messages from 2016
onward - when I configured it to pull them from an IMAP server it
appeared to go through all of them, but then the web interface only
showed the last 10 or so, with no more pages and the edk2-devel.git
directory is only a few MB.
When imap didn't work, I tried downloading them into a maildir and tried
importing them via that instead, but that isn't working either.
I was wondering if someone on this list could help point out what I'm
doing wrong.
The maildir is:
~$ du -h Maildump/
4.0K Maildump/Mail/cur
4.0K Maildump/Mail/tmp
2.5G Maildump/Mail/new
2.5G Maildump/Mail
2.5G Maildump/
I'm using the following configuration in ~/.public-inbox/config:
[publicinbox]
	wwwlisting = all
[publicinbox "edk2-devel"]
	address = devel@edk2.groups.io
	inboxdir = /home/public-inbox/edk2-devel.git
	watchheader = List-Id:<devel.edk2.groups.io>
	url = https://openfw.io/edk2-devel
	watch = maildir:/home/public-inbox/Maildump/Mail/
	watch = imaps://imap.example.net/INBOX ; redacted
	indexlevel = full
	replyto = :list
	spamcheck = none
	obfuscate = true
Authentication for the imap server is handled via git-credential.
I'm running the following commands:
public-inbox-init -V2 -L full edk2-devel \
  /home/public-inbox/edk2-devel.git https://openfw.io/edk2-devel \
  devel@edk2.groups.io
public-inbox-watch # I run this in a screen session
public-inbox-httpd # I run this in a separate screen session, once
                   # public-inbox-watch appears to have processed the existing messages
--
Rebecca Cran
* Re: Help setting up a public-inbox instance: importing from maildir archive isn't working
From: Eric Wong @ 2023-06-14 9:44 UTC
To: Rebecca Cran; +Cc: meta
Rebecca Cran <rebecca@bsdio.com> wrote:
> I'm trying to set up a public-inbox instance to mirror the edk2-devel
> mailing list. I'm using version 1.9.0.
>
> I'm having problems importing a mirror of 105,000 messages from 2016 onward
> - when I configured it to pull them from an IMAP server it appeared to go
> through all of them, but then the web interface only showed the last 10 or
> so, with no more pages and the edk2-devel.git directory is only a few MB.
>
> When imap didn't work, I tried downloading them into a maildir and tried
> importing them via that instead, but that isn't working either.
OK, so I suppose it's a matching problem because the matching
logic is shared between IMAP and Maildir.
> [publicinbox "edk2-devel"]
> address = devel@edk2.groups.io
> inboxdir = /home/public-inbox/edk2-devel.git
> watchheader = List-Id:<devel.edk2.groups.io>
Any chance that List-Id doesn't match the older messages?
You are allowed multiple `watchheader' directives for an inbox
to account for address/name changes and such (and older headers
such as `X-BeenThere')
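A rough way to check this, assuming the messages sit one per file under the Maildir shown earlier, is to tally the distinct List-Id values across all files. The function name `list_ids` is made up for illustration, and the sketch does not unfold continuation lines, so a List-Id whose value is wrapped onto the next line shows up as an empty value.

```sh
#!/bin/sh
# list_ids DIR: tally the distinct List-Id header values found in DIR.
# Hypothetical helper, not part of public-inbox.
# Example call: list_ids /home/public-inbox/Maildump/Mail/new
list_ids() {
    find "$1" -type f -exec awk '
        FNR == 1 { in_hdr = 1 }               # new file: back in the header block
        in_hdr && /^\r?$/ { in_hdr = 0 }      # first blank line ends the headers
        in_hdr && tolower($0) ~ /^list-id:/ {
            sub(/^[^:]*:[ \t]*/, "")          # drop the header name
            sub(/\r$/, "")                    # tolerate CRLF line endings
            print
        }
    ' {} + | sort | uniq -c | sort -rn
}
```

If more than one value shows up, each one would need its own `watchheader' line.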
I haven't tried it, but this should work as long as you want
every message in a watched Maildir (or IMAP folder):
watchheader = From:.
...which matches all messages with a literal `.' in the From: header;
so practically every valid message. Likewise for Received:, Date:
or any practically-always-present header:value-substring combo.
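To see how many messages a candidate header:value-substring would pick up before putting it in the config, something like the following may help. It only approximates the matching: it assumes a case-sensitive literal substring test against any occurrence of the header, does not unfold continuation lines, and `count_header_match` is a hypothetical helper, not part of public-inbox.

```sh
#!/bin/sh
# count_header_match DIR HEADER SUBSTRING
# Print the number of files in DIR whose HEADER value contains SUBSTRING.
# Note: awk -v interprets backslash escapes in the values passed in.
# Example call: count_header_match Maildump/Mail/new From .
count_header_match() {
    find "$1" -type f -exec awk -v hdr="$2" -v want="$3" '
        FNR == 1 { in_hdr = 1; seen = 0 }     # new file: reset per-message state
        in_hdr && /^\r?$/ { in_hdr = 0 }      # first blank line ends the headers
        in_hdr && !seen &&
            index(tolower($0), tolower(hdr) ":") == 1 &&
            index(substr($0, length(hdr) + 2), want) > 0 { seen = 1; n++ }
        END { print n + 0 }
    ' {} + | awk '{ s += $1 } END { print s + 0 }'   # find may run awk more than once
}
```

Comparing the result with the total file count (for instance from `find DIR -type f | wc -l`) shows how many messages the value would leave out.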
I think everything else in your configs+commands looked fine;
but I'm still struggling with lack-of-sleep and could've missed
things :<
I designed the `watchheader' directives to handle multiple lists
funneled into one Maildir; but I suppose it's less intuitive for
users with a 1:1 list => Maildir mapping :x
* Re: Help setting up a public-inbox instance: importing from maildir archive isn't working
From: Rebecca Cran @ 2023-06-19 19:12 UTC
To: Eric Wong; +Cc: meta
On 6/14/23 03:44, Eric Wong wrote:
> Any chance that List-Id doesn't match the older messages?
>
> You are allowed multiple `watchheader' directives for an inbox
> to account for address/name changes and such (and older headers
> such as `X-BeenThere')
>
> I haven't tried it, but this should work as long as you want
> every message in a watched Maildir (or IMAP folder):
>
> watchheader = From:.
>
> ...which matches all messages with a literal `.' in the From: header;
> so practically every valid message. Likewise for Received:, Date:
> or any practically-always-present header:value-substring combo.
>
> I think everything else in your configs+commands looked fine;
> but I'm still struggling with lack-of-sleep and could've missed
> things :<
>
>
> I designed the `watchheader' directives to handle multiple lists
> funneled into one Maildir; but I suppose it's less intuitive for
> users with a 1:1 list => Maildir mapping :x
Unfortunately I have several lists in the same Maildir, so I need to use
watchheader.
The List-Id hasn't changed for several years: for example this is a
message from November 2021:
List-Unsubscribe: <mailto:devel+unsubscribe@edk2.groups.io>
List-Subscribe: <mailto:devel+subscribe@edk2.groups.io>
List-Help: <mailto:devel+help@edk2.groups.io>
Sender: devel@edk2.groups.io
List-Id: <devel.edk2.groups.io>
Mailing-List: list devel@edk2.groups.io; contact devel+owner@edk2.groups.io
X-Remote-Delivered-To: mailing list devel@edk2.groups.io
Reply-To: devel@edk2.groups.io,abdattar@amd.com
X-Gm-Message-State: WX5Eq2TqR2PVB2bR8lJXcv0Zx3953573AA=
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain
I added a line to match against To: as well, and that's working.
--
Rebecca Cran
* Re: Help setting up a public-inbox instance: importing from maildir archive isn't working
From: Eric Wong @ 2023-06-20 2:56 UTC
To: Rebecca Cran; +Cc: meta
Rebecca Cran <rebecca@bsdio.com> wrote:
> On 6/14/23 03:44, Eric Wong wrote:
>
> > Any chance that List-Id doesn't match the older messages?
> >
> > You are allowed multiple `watchheader' directives for an inbox
> > to account for address/name changes and such (and older headers
> > such as `X-BeenThere')
> >
> > I haven't tried it, but this should work as long as you want
> > every message in a watched Maildir (or IMAP folder):
> >
> > watchheader = From:.
> >
> > ...which matches all messages with a literal `.' in the From: header;
> > so practically every valid message. Likewise for Received:, Date:
> > or any practically-always-present header:value-substring combo.
> >
> > I think everything else in your configs+commands looked fine;
> > but I'm still struggling with lack-of-sleep and could've missed
> > things :<
> >
> >
> > I designed the `watchheader' directives to handle multiple lists
> > funneled into one Maildir; but I suppose it's less intuitive for
> > users with a 1:1 list => Maildir mapping :x
>
> Unfortunately I have several lists in the same Maildir, so I need to use
> watchheader.
>
> The List-Id hasn't changed for several years: for example this is a message
> from November 2021:
>
> List-Unsubscribe: <mailto:devel+unsubscribe@edk2.groups.io>
> List-Subscribe: <mailto:devel+subscribe@edk2.groups.io>
> List-Help: <mailto:devel+help@edk2.groups.io>
> Sender: devel@edk2.groups.io
> List-Id: <devel.edk2.groups.io>
> Mailing-List: list devel@edk2.groups.io; contact devel+owner@edk2.groups.io
> X-Remote-Delivered-To: mailing list devel@edk2.groups.io
> Reply-To: devel@edk2.groups.io,abdattar@amd.com
> X-Gm-Message-State: WX5Eq2TqR2PVB2bR8lJXcv0Zx3953573AA=
> Content-Transfer-Encoding: quoted-printable
> Content-Type: text/plain
>
> I added a line to match against To: as well, and that's working.
OK, it's good that To: works for you; but it's still worrying to
me that List-Id didn't work...
If you have time to help diagnose this, can you try:
listid = devel.edk2.groups.io
in the config file and omit all `watchheader' directives?
public-inbox-watch will auto-translate listid to the appropriate
watchheader directive, but be case-insensitive in accordance
with RFC 2919 section 6.
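Since the config file is in git-config format, the change could also be scripted with `git config` rather than edited by hand. This is only a sketch: `switch_to_listid` is a made-up helper name, and the config path and inbox name in the example call are taken from the earlier message.

```sh
#!/bin/sh
# switch_to_listid CONFIG INBOX LISTID
# Drop every watchheader entry for INBOX and set listid instead.
# Hypothetical helper, not part of public-inbox.
# Example call:
#   switch_to_listid ~/.public-inbox/config edk2-devel devel.edk2.groups.io
switch_to_listid() {
    cfg=$1 inbox=$2 listid=$3
    # --unset-all exits non-zero when there is nothing to remove; ignore that.
    git config --file "$cfg" --unset-all "publicinbox.$inbox.watchheader" || true
    git config --file "$cfg" "publicinbox.$inbox.listid" "$listid"
}
```

Other keys in the section (address, inboxdir, watch and so on) are left untouched.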
Or are you able to share a dump of the messages for me to try?
(getting a 502 error on <https://openfw.io/edk2-devel>)
Thanks.