* Windows-style linebreaks (\r\n) and the web-renderer
@ 2022-01-14 20:48 Thomas Weißschuh
2022-02-11 20:22 ` [PATCH] view: remove all CR before LF Eric Wong
0 siblings, 1 reply; 2+ messages in thread
From: Thomas Weißschuh @ 2022-01-14 20:48 UTC
To: meta
Hi,
it seems the rendering of \r\n (Windows-style) line breaks is a bit suboptimal
on the website.
The \r characters are rendered literally. Mutt, for example, does not show them.
Example: https://lore.kernel.org/lkml/20210914093515.260031-1-maxime@cerno.tech/
Raw message:
...
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
...
Hi,=0D
=0D
....
Rendered:
....
Hi,\r
\r
...
The fix is probably obvious to you; if not, I can try to come up with one.
Thanks,
Thomas
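The quoted-printable excerpt in the message above can be reproduced with a short sketch. It assumes Python's standard `quopri` module as a stand-in decoder (not what the web renderer uses), and the sample bytes are only the two lines quoted in the report:

```python
import quopri

# The two body lines quoted in the report, as they appear in the raw message.
raw = b"Hi,=0D\n=0D\n"

# Quoted-printable decoding turns each =0D into a literal CR byte,
# which then sits directly in front of the line's LF.
decoded = quopri.decodestring(raw)
print(repr(decoded))
```

Any renderer that prints the decoded body without stripping those CR bytes will show them, which matches the "Rendered" excerpt in the report.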
* [PATCH] view: remove all CR before LF
2022-01-14 20:48 Windows-style linebreaks (\r\n) and the web-renderer Thomas Weißschuh
@ 2022-02-11 20:22 ` Eric Wong
0 siblings, 0 replies; 2+ messages in thread
From: Eric Wong @ 2022-02-11 20:22 UTC
To: Thomas Weißschuh; +Cc: meta
Thomas Weißschuh <thomas@t-8ch.de> wrote:
> Hi,
>
> it seems the rendering of \r\n (Windows-style) line breaks is a bit suboptimal
> on the website.
>
> The \r characters are rendered literally. Mutt, for example, does not show them.
>
> Example: https://lore.kernel.org/lkml/20210914093515.260031-1-maxime@cerno.tech/
Thanks for the example.
> Raw message:
> ...
> Content-Type: text/plain; charset="utf-8"
> Content-Transfer-Encoding: quoted-printable
> ...
>
>
> Hi,=0D
> =0D
> ....
>
> Rendered:
>
> ....
> Hi,\r
> \r
> ...
>
>
> The fix is probably obvious to you; if not, I can try to come up with one.
Yes, except I remember adding support for CR-LF long ago...
The problem here is that some messages use CR-CR-LF for some odd reason.
Oh well, it's a one-character fix on our end for the HTML.
Not sure if ContentHash (deduplication) and SolverGit (blob
regeneration) ought to strip redundant CR, yet...
-------8<-------
Subject: [PATCH] view: remove all CR before LF
While we've rendered CR-LF as LF-only in HTML for many years,
some messages end up as CR-CR-LF. So strip all CR bytes
preceding LF bytes, while preserving odd CR in the middle of
lines.
Reported-by: Thomas Weißschuh <thomas@t-8ch.de>
Link: https://public-inbox.org/meta/8d13668f-cac7-4984-bb4e-ad90502dc46d@t-8ch.de/
---
lib/PublicInbox/View.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 2e9cf705..ca02ae05 100644
--- a/lib/PublicInbox/View.pm
+++ b/lib/PublicInbox/View.pm
@@ -586,7 +586,7 @@ sub add_text_body { # callback for each_part
# makes no difference to browsers, and don't screw up filename
# link generation in diffs with the extra '%0D'
- $s =~ s/\r\n/\n/sg;
+ $s =~ s/\r+\n/\n/sg;
# will be escaped to `&#8226;' in HTML
obfuscate_addrs($ibx, $s, "\x{2022}") if $ibx->{obfuscate};
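The effect of the one-character change can be illustrated outside of Perl. The following is a minimal Python sketch, not the View.pm code itself: the function names and the sample string are hypothetical, and `re.sub` stands in for the Perl `s///sg` substitution.

```python
import re

def strip_one_cr_before_lf(s: str) -> str:
    """Stand-in for the old pattern s/\\r\\n/\\n/sg (hypothetical name)."""
    return re.sub(r"\r\n", "\n", s)

def strip_all_cr_before_lf(s: str) -> str:
    """Stand-in for the new pattern s/\\r+\\n/\\n/sg (hypothetical name)."""
    return re.sub(r"\r+\n", "\n", s)

# Made-up sample: a CR-CR-LF line ending, then a CR in the middle of a line.
sample = "Hi,\r\r\nmid\rline\r\n"

print(repr(strip_one_cr_before_lf(sample)))  # 'Hi,\r\nmid\rline\n'
print(repr(strip_all_cr_before_lf(sample)))  # 'Hi,\nmid\rline\n'
```

With the old pattern, one CR survives in front of the first LF. The new pattern removes the whole run of CRs before an LF and still leaves the CR in the middle of the line untouched, as the commit message describes.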