unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#59549: EWW ordered list display irregularity
@ 2022-11-24 18:29 Nicholas Drozd
  2024-09-23 19:07 ` Sebastián Monía
  0 siblings, 1 reply; 2+ messages in thread
From: Nicholas Drozd @ 2022-11-24 18:29 UTC (permalink / raw)
  To: 59549

[-- Attachment #1: Type: text/plain, Size: 710 bytes --]

Here is the Wiktionary definition of the word "locus":
https://en.wiktionary.org/wiki/locus#Noun

When I open that page in EWW, I see five definition entries. But the second
entry is blank. I think: what is that missing definition, and why is it
missing?

When I open it in Firefox, I see that there are only four definitions, the
same ones as displayed in EWW. So EWW is not failing to display anything;
instead, it is inserting something extra. That extra something comes from
this piece of HTML:

  <li class="mw-empty-elt"></li>

I don't know why that's in there or how Firefox knows not to display it. It
would be cool if EWW also knew not to display it.

Bug report or feature request? You be the judge.



* bug#59549: EWW ordered list display irregularity
  2022-11-24 18:29 bug#59549: EWW ordered list display irregularity Nicholas Drozd
@ 2024-09-23 19:07 ` Sebastián Monía
  0 siblings, 0 replies; 2+ messages in thread
From: Sebastián Monía @ 2024-09-23 19:07 UTC (permalink / raw)
  To: Nicholas Drozd; +Cc: 59549

[-- Attachment #1: Type: text/plain, Size: 931 bytes --]

Hi everyone,


>  Here is the Wiktionary definition of the word "locus":
>  https://en.wiktionary.org/wiki/locus#Noun When I open that page in
>  EWW, I see five definition entries. But the second entry is blank.

I was able to reproduce this.

>  So EWW is not failing to display anything; instead, it is inserting
>  something extra. That extra something comes from this piece of HTML:
>
>    <li class="mw-empty-elt"></li>

I wouldn't say for sure that EWW is in the wrong here. Apparently inserting
empty li elements for styling purposes is a somewhat common practice.
I couldn't confirm how "correct" it is, but it is accepted.
(Side note: stuff like this makes me glad I haven't worked on web stuff
in many, many years.)

>  I don't know why that's in there or how Firefox knows not to display
>  it. It would be cool if EWW also knew not to display it.

The attached patch does exactly that: skip any li elements that don't
have content.
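
As a rough illustration of what "don't have content" means here, the
sketch below evaluates the predicate from the patch on a few
hand-written DOM nodes. The predicate is copied from the patch; the
sample nodes are made up for this example and are not taken from the
Wiktionary page. It assumes dom-text behaves as in dom.el, where it
concatenates only the strings that are direct children of the node.

```elisp
;; Sketch only: the predicate is copied from the patch below; the
;; sample DOM nodes are hypothetical.
(require 'dom)
(require 'subr-x)                       ; string-trim, string-empty-p

(defun shr-tag-empty-content-p (dom)
  "Return t if DOM has no content.
By \"content\" we mean \"text between the tags\"."
  (string-empty-p (string-trim (dom-text dom))))

;; <li class="mw-empty-elt"></li>: no children at all, so it is skipped.
(shr-tag-empty-content-p '(li ((class . "mw-empty-elt"))))

;; <li>  </li>: whitespace only, also treated as empty.
(shr-tag-empty-content-p '(li nil "  \n "))

;; <li>A place or locality</li>: has text, so it is rendered.
(shr-tag-empty-content-p '(li nil "A place or locality"))

;; <li><a href="x">link</a></li>: the text sits in a child element,
;; not directly under the li, so dom-text does not see it.
(shr-tag-empty-content-p '(li nil (a ((href . "x")) "link")))
```

The first two forms and the last one should return non-nil, and the
third nil. The last case is worth keeping in mind: if dom-text works
as assumed above, an li whose text lives entirely inside child
elements would be treated as empty as well.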


[-- Attachment #2: shr.el: don't render empty li tags --]
[-- Type: text/x-patch, Size: 2789 bytes --]

From afa3cccda43ea17933d0e782243cf2adc9ee51c6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Sebasti=C3=A1n=20Mon=C3=ADa?=
 <sebastian.monia@sebasmonia.com>
Date: Mon, 23 Sep 2024 15:00:44 -0400
Subject: [PATCH] shr: don't render empty li tags (bug#59549)

---
 lisp/net/shr.el | 49 +++++++++++++++++++++++++++++--------------------
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/lisp/net/shr.el b/lisp/net/shr.el
index cd0e482aee7..2a72621fec4 100644
--- a/lisp/net/shr.el
+++ b/lisp/net/shr.el
@@ -1656,6 +1656,11 @@ Based on https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-infore
   (shr-generic dom)
   (shr-ensure-paragraph))
 
+(defun shr-tag-empty-content-p (dom)
+  "Return t if DOM has no content.
+By \"content\" we mean \"text between the tags\"."
+  (string-empty-p (string-trim (dom-text dom))))
+
 (defun shr-tag-div (dom)
   (let ((display (cdr (assq 'display shr-stylesheet))))
     (if (or (equal display "inline")
@@ -2163,26 +2168,30 @@ BASE is the URL of the HTML being rendered."
   (shr-ensure-paragraph))
 
 (defun shr-tag-li (dom)
-  (shr-ensure-newline)
-  (let ((start (point)))
-    (let* ((bullet
-	    (if (numberp shr-list-mode)
-		(prog1
-		    (format "%d " shr-list-mode)
-		  (setq shr-list-mode (1+ shr-list-mode)))
-	      (car shr-internal-bullet)))
-	   (width (if (numberp shr-list-mode)
-		      (shr-string-pixel-width bullet)
-		    (cdr shr-internal-bullet))))
-      (insert bullet)
-      (shr-mark-fill start)
-      (let ((shr-indentation (+ shr-indentation width)))
-	(put-text-property start (1+ start)
-			   'shr-continuation-indentation shr-indentation)
-	(put-text-property start (1+ start) 'shr-prefix-length (length bullet))
-	(shr-generic dom))))
-  (unless (bolp)
-    (insert "\n")))
+  ;; bug#59549: EWW ordered list display irregularity
+  ;; empty li tags are used sometimes for styling purposes: do not
+  ;; render such tags
+  (unless (shr-tag-empty-content-p dom)
+    (shr-ensure-newline)
+    (let ((start (point)))
+      (let* ((bullet
+	      (if (numberp shr-list-mode)
+		  (prog1
+		      (format "%d " shr-list-mode)
+		    (setq shr-list-mode (1+ shr-list-mode)))
+	        (car shr-internal-bullet)))
+	     (width (if (numberp shr-list-mode)
+		        (shr-string-pixel-width bullet)
+		      (cdr shr-internal-bullet))))
+        (insert bullet)
+        (shr-mark-fill start)
+        (let ((shr-indentation (+ shr-indentation width)))
+	  (put-text-property start (1+ start)
+			     'shr-continuation-indentation shr-indentation)
+	  (put-text-property start (1+ start) 'shr-prefix-length (length bullet))
+	  (shr-generic dom))))
+    (unless (bolp)
+      (insert "\n"))))
 
 (defun shr-mark-fill (start)
   ;; We may not have inserted any text to fill.
-- 
2.45.2.windows.1


[-- Attachment #3: Type: text/plain, Size: 166 bytes --]


I tested with the Wiktionary page and a few offline test cases, and it
worked as intended.

Regards,
Seb

-- 
Sebastián Monía
https://site.sebasmonia.com/

