* Add "generator" information to HTML pages
From: Thomas Weißschuh @ 2023-01-08 19:04 UTC
To: meta

Hi,

it would be nice if public-inbox could extend the HTML pages it
generates with the "generator" meta tag [0].
Especially the version would be useful.

This would help users during debugging to see the specific version of
public-inbox they are looking at.

For example:

<head>
  <title>Some page</title>
  <meta name="generator" content="public-inbox 1.9.0" />
</head>

[0] https://html.spec.whatwg.org/multipage/semantics.html#meta-generator

Thanks,
Thomas

* Re: Add "generator" information to HTML pages
From: Eric Wong @ 2023-01-08 19:47 UTC
To: Thomas Weißschuh; +Cc: meta

Thomas Weißschuh <thomas@t-8ch.de> wrote:
> Hi,
>
> it would be nice if public-inbox could extend the HTML pages it
> generates with the "generator" meta tag [0].
> Especially the version would be useful.
>
> This would help users during debugging to see the specific version of
> public-inbox they are looking at.

What would users be debugging?
Admins would be the only ones who care, I think...

Version info becomes worthless if an admin blocks/alters certain
endpoints via nginx/varnish or just editing the code.

> For example:
>
> <head>
>   <title>Some page</title>
>   <meta name="generator" content="public-inbox 1.9.0" />
> </head>

I prefer to disclose as little information as possible in case
vulnerabilities are found.  Alone, security by obscurity doesn't work,
but obscurity does make things more difficult for attackers
(same reason camouflage exists).

I also don't like wasting memory+bandwidth on things most users
won't see or care about.  This is especially true for stuff at
the beginning of the output since that's most likely to succeed
in being transferred.

> [0] https://html.spec.whatwg.org/multipage/semantics.html#meta-generator
>
> Thanks,
> Thomas

* Re: Add "generator" information to HTML pages
From: Thomas Weißschuh @ 2023-01-08 20:02 UTC
To: Eric Wong; +Cc: meta

Hi Eric,

On Sun, Jan 08, 2023 at 07:47:38PM +0000, Eric Wong wrote:
> Thomas Weißschuh <thomas@t-8ch.de> wrote:
> > Hi,
> >
> > it would be nice if public-inbox could extend the HTML pages it
> > generates with the "generator" meta tag [0].
> > Especially the version would be useful.
> >
> > This would help users during debugging to see the specific version of
> > public-inbox they are looking at.
>
> What would users be debugging?
> Admins would be the only ones who care, I think...

Since recently my mails to linux-kernel@vger.kernel.org that should end
up on public-inbox on https://lore.kernel.org/lkml/ don't do so.
They are accepted by the mail server on vger.kernel.org but never end up
in the archives.
I suspect some interactions between b4 which is used to generate the
mails, the unicode characters in my name and public-inbox to be the
culprit.

This is what I wanted to reproduce locally, for which exact versions
would have been nice.

> Version info becomes worthless if an admin blocks/alters certain
> endpoints via nginx/varnish or just editing the code.
>
> > For example:
> >
> > <head>
> >   <title>Some page</title>
> >   <meta name="generator" content="public-inbox 1.9.0" />
> > </head>
>
> I prefer to disclose as little information as possible in case
> vulnerabilities are found.  Alone, security by obscurity doesn't work,
> but obscurity does make things more difficult for attackers
> (same reason camouflage exists).
>
> I also don't like wasting memory+bandwidth on things most users
> won't see or care about.  This is especially true for stuff at
> the beginning of the output since that's most likely to succeed
> in being transferred.

Fair enough.
The loading speed of public-inbox is really great, let's keep it that
way.

> > [0] https://html.spec.whatwg.org/multipage/semantics.html#meta-generator

@Konstantin, if you read this:
I'll send a proper bugreport to tools@linux.kernel.org soonish.

Thanks,
Thomas

* Re: Add "generator" information to HTML pages
From: Eric Wong @ 2023-01-08 20:58 UTC
To: Thomas Weißschuh; +Cc: meta

Thomas Weißschuh <thomas@t-8ch.de> wrote:
> On Sun, Jan 08, 2023 at 07:47:38PM +0000, Eric Wong wrote:
> > Thomas Weißschuh <thomas@t-8ch.de> wrote:
> > > Hi,
> > >
> > > it would be nice if public-inbox could extend the HTML pages it
> > > generates with the "generator" meta tag [0].
> > > Especially the version would be useful.
> > >
> > > This would help users during debugging to see the specific version of
> > > public-inbox they are looking at.
> >
> > What would users be debugging?
> > Admins would be the only ones who care, I think...
>
> Since recently my mails to linux-kernel@vger.kernel.org that should end
> up on public-inbox on https://lore.kernel.org/lkml/ don't do so.
> They are accepted by the mail server on vger.kernel.org but never end up
> in the archives.
> I suspect some interactions between b4 which is used to generate the
> mails, the unicode characters in my name and public-inbox to be the
> culprit.

Your mail seems fine to my server, but coming from an IPv6
address has caused problems with some other servers in the past.
Another potential thing might be your use of utf-8 in the From:
header, while your Content-Type: is iso-8859-1 for the body.

> This is what I wanted to reproduce locally, for which exact versions
> would have been nice.

I remember Konstantin has cherry-picked some commits from
public-inbox.git in the past, and I suspect he already
has https://public-inbox.org/meta/20221124213155.M736847@dcvr/
("eml: header_raw converts octets to Perl UTF-8") for SMTPUTF8

One thing I wouldn't be opposed to doing is adding a way to
download all loaded files in a tarball as a means for AGPL
enforcement.  The tricky thing is those files may change on disk
after loading (and often do in my case :x), so they'd need to
be copied into stable storage at startup (and updated if there's
lazy-loading).  Same security caveats apply, though.

> > I also don't like wasting memory+bandwidth on things most users
> > won't see or care about.  This is especially true for stuff at
> > the beginning of the output since that's most likely to succeed
> > in being transferred.
>
> Fair enough.
> The loading speed of public-inbox is really great, let's keep it that
> way.

Good to know it's great for you.  It's still too slow for me,
but I'm anti-consumerist and refuse to follow Moore's law :x

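A minimal sketch of how the header/body charset difference mentioned
in this message could be inspected with Python's standard email
package. The sample message, the example.org addresses and the
function name charsets_used are all made up for illustration; nothing
here is taken from public-inbox or b4. As far as RFC 2047 goes, each
encoded word in a header names its own charset, so a utf-8 From:
header next to an iso-8859-1 body should be legal in itself, though
individual tools may still handle it badly.

```python
from email import message_from_string
from email.header import decode_header

# Hypothetical message: utf-8 encoded word in From:, iso-8859-1 body charset.
RAW = (
    "From: =?utf-8?q?Thomas_Wei=C3=9Fschuh?= <thomas@example.org>\n"
    "To: meta@example.org\n"
    "Subject: charset check\n"
    "Content-Type: text/plain; charset=iso-8859-1\n"
    "\n"
    "Hi\n"
)


def charsets_used(raw: str):
    """Return (charsets named in From: encoded words, body charset)."""
    msg = message_from_string(raw)
    # decode_header() yields (bytes_or_str, charset) pairs; charset is
    # None for parts that were not RFC 2047 encoded words.
    header_charsets = {cs.lower() for _, cs in decode_header(msg["From"]) if cs}
    return header_charsets, msg.get_content_charset()


if __name__ == "__main__":
    print(charsets_used(RAW))
```

Comparing the two return values is only a diagnostic aid: a difference
does not by itself show that a message is malformed.
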
* Re: Add "generator" information to HTML pages
From: Thomas Weißschuh @ 2023-01-08 21:54 UTC
To: Eric Wong; +Cc: meta

On Sun, Jan 08, 2023 at 08:58:04PM +0000, Eric Wong wrote:
> Thomas Weißschuh <thomas@t-8ch.de> wrote:
> > On Sun, Jan 08, 2023 at 07:47:38PM +0000, Eric Wong wrote:
> > > Thomas Weißschuh <thomas@t-8ch.de> wrote:
> > > > it would be nice if public-inbox could extend the HTML pages it
> > > > generates with the "generator" meta tag [0].
> > > > Especially the version would be useful.
> > > >
> > > > This would help users during debugging to see the specific version of
> > > > public-inbox they are looking at.
> > >
> > > What would users be debugging?
> > > Admins would be the only ones who care, I think...
> >
> > Since recently my mails to linux-kernel@vger.kernel.org that should end
> > up on public-inbox on https://lore.kernel.org/lkml/ don't do so.
> > They are accepted by the mail server on vger.kernel.org but never end up
> > in the archives.
> > I suspect some interactions between b4 which is used to generate the
> > mails, the unicode characters in my name and public-inbox to be the
> > culprit.
>
> Your mail seems fine to my server, but coming from an IPv6
> address has caused problems with some other servers in the past.
> Another potential thing might be your use of utf-8 in the From:
> header, while your Content-Type: is iso-8859-1 for the body.

I think I found the culprit.
And it is indeed the b4 tool, or rather the Python email library it is
using.

Posting it here because you might know if this is standards-conformant
or if it would be reasonable to carry a workaround inside public-inbox.

When b4 passes the message to Python's email.message.EmailMessage the
'To' header is just a long, unencoded string containing all recipients
and their unicode names.
EmailMessage then makes sure that this string conforms to legal email
header values.
It performs linewrapping and the special header utf-8
encoding/escaping.

However IFF a header line contains a unicode character and IFF the
first character of a linewrapped line is a comma (,) then that comma
will also be utf-8 escaped.

Example input:

01234567890123456789012345678901234567890123456789012345678901234567890123, ä

Example output:

01234567890123456789012345678901234567890123456789012345678901234567890123
 =?utf-8?q?=2C?= =?utf-8?q?=C3=A4?=

I expect this to be a bug in the Python library but maybe it is correct.

> > This is what I wanted to reproduce locally, for which exact versions
> > would have been nice.
>
> I remember Konstantin has cherry-picked some commits from
> public-inbox.git in the past, and I suspect he already
> has https://public-inbox.org/meta/20221124213155.M736847@dcvr/
> ("eml: header_raw converts octets to Perl UTF-8") for SMTPUTF8
>
> One thing I wouldn't be opposed to doing is adding a way to
> download all loaded files in a tarball as a means for AGPL
> enforcement.  The tricky thing is those files may change on disk
> after loading (and often do in my case :x), so they'd need to
> be copied into stable storage at startup (and updated if there's
> lazy-loading).  Same security caveats apply, though.
>
> > > I also don't like wasting memory+bandwidth on things most users
> > > won't see or care about.  This is especially true for stuff at
> > > the beginning of the output since that's most likely to succeed
> > > in being transferred.
> >
> > Fair enough.
> > The loading speed of public-inbox is really great, let's keep it that
> > way.
>
> Good to know it's great for you.  It's still too slow for me,
> but I'm anti-consumerist and refuse to follow Moore's law :x

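A minimal sketch for trying the header folding described in this
message, assuming only Python's standard email.message.EmailMessage
with its default policy. The recipient list, the example.org addresses
and the helper names serialize_to_header and has_encoded_comma are
made up for illustration and are not b4's actual code. Whether an
encoded comma shows up is likely to depend on the Python version and
on where the line happens to wrap, so the script reports what it finds
without assuming a result.

```python
import re
from email.message import EmailMessage

# Hypothetical recipients: long enough to force wrapping, with non-ASCII names.
SAMPLE = (
    "Alice Example <alice@example.org>, Bob Example <bob@example.org>, "
    "Thomas Weißschuh <thomas@example.org>, Zoë Example <zoe@example.org>"
)


def serialize_to_header(recipients: str) -> str:
    """Return the folded To: header as EmailMessage serializes it."""
    msg = EmailMessage()        # default policy: folds long lines, utf8=False
    msg["To"] = recipients      # one long unencoded string, as in the report
    # Headers come first; the first blank line ends the header block.
    return msg.as_string().split("\n\n", 1)[0]


def has_encoded_comma(header: str) -> bool:
    """True if a comma appears as '=2C' inside a q-encoded utf-8 word."""
    # Only the quoted-printable ("q") form is detected, not base64 ("b").
    return re.search(r"=\?utf-8\?q\?[^?\s]*=2C", header, re.IGNORECASE) is not None


if __name__ == "__main__":
    header = serialize_to_header(SAMPLE)
    print(header)
    print("encoded comma present:", has_encoded_comma(header))
```

Replacing SAMPLE with the example input from the message above should
exercise the same code path as the report.
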