From: JD Smith <jdtsmith@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 71685@debbugs.gnu.org
Subject: bug#71685: [PATCH] fix shr rendering in tables without tbody
Date: Sat, 6 Jul 2024 14:13:30 -0400
Message-ID: <E5BC31C9-7C14-4E6A-AAB1-11B44FC6C6E1@gmail.com>
In-Reply-To: <86h6d355vk.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 1172 bytes --]



> On Jul 6, 2024, at 3:36 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: JD Smith <jdtsmith@gmail.com>
>> Date: Thu, 20 Jun 2024 15:15:32 -0400
>> 
>> It is very common for HTML tables to include a header (<thead>) and/or footer (<tfoot>) without using <tbody>.  Modern browsers simply supply an implicit <tbody>..</tbody> around all the unparented rows in a table.  `shr' does not handle this common case correctly.  Below is an example with <thead> but not <tbody>.  It prints the header, but then subsumes it again inside the derived body, printing the header again in a single cell.  
>> 
>> The relevant function which should handle this is `shr--fix-tbody'.   The included patch to this function simply avoids including `thead` and `tfoot` children in the implicit body.
> 
> Thanks.  I don't see any experts chiming in, so if you are confident
> in the patch, and it doesn't fail the existing tests, please install
> on the emacs-30 branch, and thanks.  Bonus points for adding a test
> for this case.

Thanks.  I'm afraid I don't have write access on savannah.  I've added a test and formatted the patch, below.  All shr tests succeed.


[-- Attachment #2: 0001-Fix-formatting-of-tables-with-thead-tfoot-but-no-tbo.patch --]
[-- Type: application/octet-stream, Size: 2349 bytes --]

From 623ecf07dc1b215cbc98f5804d58b571a649e9ba Mon Sep 17 00:00:00 2001
From: JD Smith <93749+jdtsmith@users.noreply.github.com>
Date: Sat, 6 Jul 2024 09:22:33 -0400
Subject: [PATCH] Fix formatting of tables with thead/tfoot but no tbody

Correctly handle formatting of tables containing thead and/or tfoot, but
without any tbody, to prevent including thead/tfoot content twice within
the table's derived body (Bug#71685).
* lisp/net/shr.el (shr--fix-tbody): Omit thead/tfoot from implicit body.
* test/lisp/net/shr-resources/table.html:
* test/lisp/net/shr-resources/table.txt:
Add table rendering test.
---
 lisp/net/shr.el                        | 5 +++--
 test/lisp/net/shr-resources/table.html | 7 +++++++
 test/lisp/net/shr-resources/table.txt  | 5 +++++
 3 files changed, 15 insertions(+), 2 deletions(-)
 create mode 100644 test/lisp/net/shr-resources/table.html
 create mode 100644 test/lisp/net/shr-resources/table.txt

diff --git a/lisp/net/shr.el b/lisp/net/shr.el
index 3dadcb9a09b..fb72ea6aa67 100644
--- a/lisp/net/shr.el
+++ b/lisp/net/shr.el
@@ -2261,8 +2261,9 @@ shr-table-body
 (defun shr--fix-tbody (tbody)
   (nconc (list 'tbody (dom-attributes tbody))
          (cl-loop for child in (dom-children tbody)
-                  collect (if (or (stringp child)
-                                  (not (eq (dom-tag child) 'tr)))
+		  for tag = (and (not (stringp child)) (dom-tag child))
+		  unless (or (eq tag 'thead) (eq tag 'tfoot))
+		  collect (if (not (eq tag 'tr))
                               (list 'tr nil (list 'td nil child))
                             child))))
 
diff --git a/test/lisp/net/shr-resources/table.html b/test/lisp/net/shr-resources/table.html
new file mode 100644
index 00000000000..c5e8875ac91
--- /dev/null
+++ b/test/lisp/net/shr-resources/table.html
@@ -0,0 +1,7 @@
+<table>
+<thead><tr><th>A</th><th>B</th></tr></thead>
+<tr><td>1</td><td>2</td></tr>
+<tr><td>3</td><td>4</td></tr>
+5678
+<tfoot><tr><th>A</th><th>B</th></tr></tfoot>
+</table>
diff --git a/test/lisp/net/shr-resources/table.txt b/test/lisp/net/shr-resources/table.txt
new file mode 100644
index 00000000000..70939effb63
--- /dev/null
+++ b/test/lisp/net/shr-resources/table.txt
@@ -0,0 +1,5 @@
+  A  B    
+  1  2    
+  3  4    
+  5678     
+  A  B    
-- 
2.43.0
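
For anyone who wants to see the effect without applying the patch, here is a minimal sketch of the patched logic run against a hand-written DOM.  The names `my/fix-tbody' and `my/table-dom' are hypothetical, used only for this illustration; the function body simply mirrors the patched `shr--fix-tbody' from the diff above.  It assumes that for a table with no <tbody> the <table> node itself is what gets passed in, which is my reading of how shr calls this function.

```elisp
(require 'cl-lib)
(require 'dom)

;; Hypothetical stand-in that mirrors the patched `shr--fix-tbody'.
(defun my/fix-tbody (tbody)
  "Return an implicit tbody built from the children of TBODY.
thead and tfoot children are skipped; any child that is not a tr
\(including bare text) is wrapped in its own single-cell row."
  (nconc (list 'tbody (dom-attributes tbody))
         (cl-loop for child in (dom-children tbody)
                  ;; Text nodes are strings and have no tag.
                  for tag = (and (not (stringp child)) (dom-tag child))
                  unless (or (eq tag 'thead) (eq tag 'tfoot))
                  collect (if (not (eq tag 'tr))
                              (list 'tr nil (list 'td nil child))
                            child))))

;; Hand-written DOM for a table with thead and tfoot but no tbody
;; (assumption: this is roughly the shape the parser would produce).
(defvar my/table-dom
  '(table nil
          (thead nil (tr nil (th nil "A") (th nil "B")))
          (tr nil (td nil "1") (td nil "2"))
          "5678"
          (tfoot nil (tr nil (th nil "A") (th nil "B")))))

(my/fix-tbody my/table-dom)
;; => (tbody nil (tr nil (td nil "1") (td nil "2"))
;;               (tr nil (td nil "5678")))
```

The header and footer no longer appear in the derived body, and the stray text "5678" still gets a row of its own, which matches the fourth line of table.txt.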


Thread overview: 4+ messages
2024-06-20 19:15 bug#71685: [PATCH] fix shr rendering in tables without tbody JD Smith
2024-07-06  7:36 ` Eli Zaretskii
2024-07-06 18:13   ` JD Smith [this message]
2024-07-06 19:11     ` Stefan Kangas