* bug#73133: 29.2; EWW fails to render some webpages
@ 2024-09-08 20:52 Ganimard via Bug reports for GNU Emacs, the Swiss army knife of text editors
2024-09-10 6:06 ` Jim Porter
0 siblings, 1 reply; 10+ messages in thread
From: Ganimard via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2024-09-08 20:52 UTC (permalink / raw)
To: 73133
[-- Attachment #1: Type: text/plain, Size: 7951 bytes --]
To Whom it may concern,
I have recently discovered the website gastonle.ru; however, it does not
render with Emacs Web Wowser. It appears to be a relatively simple
website and I cannot see what would prohibit it from rendering.
I have also tried it on an Ubuntu 22.04.4 LTS distro running Emacs 28.1
but it also fails to render. This therefore appears to be a bug in EWW.
---
In GNU Emacs 29.2 (build 1, aarch64-apple-darwin21.6.0, NS
appkit-2113.60 Version 12.6.6 (Build 21G646)) of 2024-01-19 built on
armbob.lan
Windowing system distributor 'Apple', version 10.3.2487
System Description: macOS 14.2.1
Configured using:
'configure --with-ns '--enable-locallisppath=/Library/Application
Support/Emacs/${version}/site-lisp:/Library/Application
Support/Emacs/site-lisp' --with-modules 'CFLAGS=-DFD_SETSIZE=10000
-DDARWIN_UNLIMITED_SELECT' --with-x-toolkit=no'
Configured features:
ACL GLIB GMP GNUTLS JPEG JSON LIBXML2 MODULES NOTIFY KQUEUE NS PDUMPER
PNG RSVG SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER ZLIB
Important settings:
value of $LANG: en_NZ.UTF-8
locale-coding-system: utf-8-unix
Major mode: Markdown
Minor modes in effect:
yas-global-mode: t
yas-minor-mode: t
global-git-commit-mode: t
magit-auto-revert-mode: t
shell-dirtrack-mode: t
server-mode: t
TeX-PDF-mode: t
TeX-source-correlate-mode: t
global-display-line-numbers-mode: t
display-line-numbers-mode: t
whitespace-mode: t
global-page-break-lines-mode: t
override-global-mode: t
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
line-number-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
/Users/ganimard/.emacs.d/elpa/transient-20230919.2146/transient hides /Applications/Emacs.app/Contents/Resources/lisp/transient
Features:
(shadow sort mail-extr emacsbug files-x vc-hg vc-bzr vc-src vc-sccs
vc-svn vc-cvs vc-rcs log-view vc bug-reference help-fns radix-tree
magit-patch magit-subtree magit-gitignore magit-ediff ediff ediff-merg
ediff-mult ediff-wind ediff-diff ediff-help ediff-init ediff-util
magit-extras face-remap misearch multi-isearch vc-git vc-dispatcher
markdown-mode color dired-aux disp-table hl-todo flycheck forth-mode
forth-spec forth-smie smie forth-syntax llvm-mode splunk-mode ess
lisp-mnt ess-utils ess-custom go-mode find-file ffap etags fileloop xref
rust-utils rust-mode rust-rustfmt rust-playpen rust-compile rust-cargo
yasnippet magit-submodule magit-blame magit-stash magit-reflog
magit-bisect magit-push magit-pull magit-fetch magit-clone magit-remote
magit-commit magit-sequence magit-notes magit-worktree magit-tag
magit-merge magit-branch magit-reset magit-files magit-refs magit-status
magit magit-repos magit-apply magit-wip magit-log which-func imenu
magit-diff smerge-mode diff diff-mode git-commit log-edit pcvs-util
add-log magit-core magit-autorevert autorevert magit-margin
magit-transient magit-process with-editor shell server magit-mode
transient magit-git magit-base magit-section cursor-sensor dash
auctex-latexmk latex latex-flymake flymake-proc flymake project compile
warnings tex-ispell tex-style tex texmathp latex-preview-pane doc-view
filenotify jka-compr image-mode exif auctex ebib ebib-reading-list
ebib-notes org-element org-persist xdg org-id org-refile org ob
ob-tangle ob-ref ob-lob ob-table ob-exp org-macro org-src ob-comint
org-pcomplete pcomplete comint ansi-osc ansi-color org-list org-footnote
org-faces org-entities noutline outline icons ob-emacs-lisp ob-core
ob-eval org-cycle org-table org-keys oc org-loaddefs find-func cal-menu
calendar cal-loaddefs ol org-fold org-fold-core org-compat ring avl-tree
generator org-version org-macs ebib-filters ebib-keywords ebib-utils
ebib-db message sendmail yank-media puny dired dired-loaddefs rfc822 mml
mml-sec epa derived epg rfc6068 epg-config gnus-util
text-property-search mm-decode mm-bodies mm-encode mail-parse rfc2231
rfc2047 rfc2045 mm-util ietf-drums mail-prsvr mailabbrev mail-utils
gmm-utils mailheader format-spec parsebib rx hl-line pp crm bibtex
iso8601 time-date writeroom-mode visual-fill-column olivetti
multiple-cursors mc-separate-operations rectangular-region-mode
mc-mark-pop mc-edit-lines mc-hide-unmatched-lines-mode mc-mark-more
thingatpt mc-cycle-cursors multiple-cursors-core advice rect move-text
no-littering compat paredit edmacro kmacro display-line-numbers
whitespace page-break-lines smart-mode-line-atom-one-dark-theme cl-extra
help-mode atom-one-dark-theme use-package use-package-ensure
use-package-delight use-package-diminish use-package-bind-key bind-key
easy-mmode use-package-core finder-inf atom-one-dark-theme-autoloads
auctex-latexmk-autoloads auctex-autoloads tex-site company-autoloads
dracula-theme-autoloads ebib-autoloads ess-autoloads flycheck-autoloads
forth-mode-autoloads gdscript-mode-autoloads go-mode-autoloads
hl-todo-autoloads impatient-mode-autoloads htmlize-autoloads
julia-formatter-autoloads just-mode-autoloads
latex-preview-pane-autoloads llvm-ts-mode-autoloads lsp-docker-autoloads
lsp-julia-autoloads julia-mode-autoloads lsp-ui-autoloads
lsp-mode-autoloads ht-autoloads lv-autoloads magit-autoloads pcase
git-commit-autoloads magit-section-autoloads move-text-autoloads
multiple-cursors-autoloads no-littering-autoloads olivetti-autoloads
package-lint-autoloads page-break-lines-autoloads paredit-autoloads
parsebib-autoloads pkg-info-autoloads epl-autoloads
quelpa-use-package-autoloads quelpa-autoloads rustic-autoloads
markdown-mode-autoloads f-autoloads dash-autoloads rust-mode-autoloads
s-autoloads session-async-autoloads simple-httpd-autoloads
smart-mode-line-atom-one-dark-theme-autoloads smart-mode-line-autoloads
rich-minority-autoloads spinner-autoloads splunk-mode-autoloads
transient-autoloads with-editor-autoloads compat-autoloads info
writeroom-mode-autoloads visual-fill-column-autoloads
xterm-color-autoloads yaml-autoloads yaml-mode-autoloads
yasnippet-autoloads package browse-url url url-proxy url-privacy
url-expand url-methods url-history url-cookie generate-lisp-file
url-domsuf url-util mailcap url-handlers url-parse auth-source cl-seq
eieio eieio-core cl-macs password-cache json subr-x map byte-opt gv
bytecomp byte-compile url-vars cl-loaddefs cl-lib rmc iso-transl tooltip
cconv eldoc paren electric uniquify ediff-hook vc-hooks lisp-float-type
elisp-mode mwheel term/ns-win ns-win ucs-normalize mule-util
term/common-win tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode lisp-mode prog-mode register
page tab-bar menu-bar rfn-eshadow isearch easymenu timer select
scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors
frame minibuffer nadvice seq simple cl-generic indonesian philippine
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite emoji-zwj charscript
charprop case-table epa-hook jka-cmpr-hook help abbrev obarray oclosure
cl-preloaded button loaddefs theme-loaddefs faces cus-face macroexp
files window text-properties overlay sha1 md5 base64 format env
code-pages mule custom widget keymap hashtable-print-readable backquote
threads kqueue cocoa ns multi-tty make-network-process emacs)
Memory information:
((conses 16 412027 70117)
(symbols 48 34112 0)
(strings 32 128155 6447)
(string-bytes 1 4038566)
(vectors 16 67754)
(vector-slots 8 739746 70880)
(floats 8 294 368)
(intervals 56 6200 53)
(buffers 984 43))
[-- Attachment #2: Type: text/html, Size: 12370 bytes --]
* bug#73133: 29.2; EWW fails to render some webpages
2024-09-08 20:52 bug#73133: 29.2; EWW fails to render some webpages Ganimard via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2024-09-10 6:06 ` Jim Porter
2024-09-21 9:13 ` Eli Zaretskii
0 siblings, 1 reply; 10+ messages in thread
From: Jim Porter @ 2024-09-10 6:06 UTC (permalink / raw)
To: Ganimard, 73133
On 9/8/2024 1:52 PM, Ganimard via Bug reports for GNU Emacs, the Swiss
army knife of text editors wrote:
> I have recently discovered the website gastonle.ru; however, it does not
> render with Emacs Web Wowser. It appears to be a relatively simple
> website and I cannot see what would prohibit it from rendering.
Checking that page via curl, it appears that it doesn't return a
Content-Type header. In the absence of that header, EWW assumes that the
page is plain text.
> I have also tried it on an Ubuntu 22.04.4 LTS distro running Emacs 28.1
> but it also fails to render. This therefore appears to be a bug in EWW.
From my reading of RFC9110[1], this is *technically* a bug (we should
assume application/octet-stream, not text/plain), but that wouldn't fix
the rendering here; it would probably make things worse. However, per
the RFC, EWW would be within its rights to guess that the page is HTML,
e.g. by checking for "<!doctype html>". It also recommends having that
be an option that can be disabled, which is reasonable (and in keeping
with Emacs's design principles anyway).
[1] https://www.rfc-editor.org/rfc/rfc9110#section-8.3-5
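As a rough illustration of the content-sniffing idea described above, the check could look something like the following. This is only a sketch and not EWW's actual code: the function name `my/looks-like-html-p` is hypothetical, and it takes the response body as a string for simplicity.

```elisp
;; Hypothetical helper, not part of eww.el: guess whether a response
;; body is HTML by looking for a leading doctype declaration.
(defun my/looks-like-html-p (body)
  "Return t if the string BODY starts with an HTML doctype declaration.
Leading whitespace is ignored and the match is case-insensitive."
  (let ((case-fold-search t))
    (and (string-match-p
          "\\`[ \t\r\n]*<!doctype[ \t\r\n]+html[ \t\r\n>]" body)
         t)))
```

A caller would consult a helper like this only when the response carries no Content-Type header, and would ideally let the user turn the guessing off.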
* bug#73133: 29.2; EWW fails to render some webpages
2024-09-10 6:06 ` Jim Porter
@ 2024-09-21 9:13 ` Eli Zaretskii
2024-09-21 17:12 ` Jim Porter
0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2024-09-21 9:13 UTC (permalink / raw)
To: Jim Porter; +Cc: 73133, ganimard
> Date: Mon, 9 Sep 2024 23:06:56 -0700
> From: Jim Porter <jporterbugs@gmail.com>
>
> On 9/8/2024 1:52 PM, Ganimard via Bug reports for GNU Emacs, the Swiss
> army knife of text editors wrote:
> > I have recently discovered the website gastonle.ru; however, it does not
> > render with Emacs Web Wowser. It appears to be a relatively simple
> > website and I cannot see what would prohibit it from rendering.
>
> Checking that page via curl, it appears that it doesn't return a
> Content-Type header. In the absence of that header, EWW assumes that the
> page is plain text.
>
> > I have also tried it on an Ubuntu 22.04.4 LTS distro running Emacs 28.1
> > but it also fails to render. This therefore appears to be a bug in EWW.
>
> From my reading of RFC9110[1], this is *technically* a bug (we should
> assume application/octet-stream, not text/plain), but that wouldn't fix
> the rendering here; it would probably make things worse. However, per
> the RFC, EWW would be within its rights to guess that the page is HTML,
> e.g. by checking for "<!doctype html>". It also recommends having that
> be an option that can be disabled, which is reasonable (and in keeping
> with Emacs's design principles anyway).
>
> [1] https://www.rfc-editor.org/rfc/rfc9110#section-8.3-5
Thanks. Would someone like to submit a patch along these lines?
* bug#73133: 29.2; EWW fails to render some webpages
2024-09-21 9:13 ` Eli Zaretskii
@ 2024-09-21 17:12 ` Jim Porter
2024-09-23 15:43 ` Sebastián Monía
2024-09-23 15:56 ` Sebastián Monía
0 siblings, 2 replies; 10+ messages in thread
From: Jim Porter @ 2024-09-21 17:12 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 73133, ganimard
On 9/21/2024 2:13 AM, Eli Zaretskii wrote:
>> Date: Mon, 9 Sep 2024 23:06:56 -0700
>> From: Jim Porter <jporterbugs@gmail.com>
>>
>> From my reading of RFC9110[1], this is *technically* a bug (we should
>> assume application/octet-stream, not text/plain), but that wouldn't fix
>> the rendering here; it would probably make things worse. However, per
>> the RFC, EWW would be within its rights to guess that the page is HTML,
>> e.g. by checking for "<!doctype html>". It also recommends having that
>> be an option that can be disabled, which is reasonable (and in keeping
>> with Emacs's design principles anyway).
>>
>> [1] https://www.rfc-editor.org/rfc/rfc9110#section-8.3-5
>
> Thanks. Would someone like to submit a patch along these lines?
It'll probably be a couple weeks until I have time to write a patch, but
if no one has done so by then, I'll look into it.
* bug#73133: 29.2; EWW fails to render some webpages
2024-09-21 17:12 ` Jim Porter
@ 2024-09-23 15:43 ` Sebastián Monía
2024-09-28 10:58 ` Eli Zaretskii
2024-09-23 15:56 ` Sebastián Monía
1 sibling, 1 reply; 10+ messages in thread
From: Sebastián Monía @ 2024-09-23 15:43 UTC (permalink / raw)
To: Jim Porter; +Cc: Eli Zaretskii, 73133, ganimard
[-- Attachment #1: Type: text/plain, Size: 970 bytes --]
Jim Porter <jporterbugs@gmail.com> writes:
> On 9/21/2024 2:13 AM, Eli Zaretskii wrote:
>>> Date: Mon, 9 Sep 2024 23:06:56 -0700
>>> From: Jim Porter <jporterbugs@gmail.com>
>>>
>>> From my reading of RFC9110[1], this is *technically* a bug (we should
>>> assume application/octet-stream, not text/plain), but that wouldn't fix
>>> the rendering here; it would probably make things worse. However, per
>>> the RFC, EWW would be within its rights to guess that the page is HTML,
>>> e.g. by checking for "<!doctype html>". It also recommends having that
>>> be an option that can be disabled, which is reasonable (and in keeping
>>> with Emacs's design principles anyway).
>>>
>>> [1] https://www.rfc-editor.org/rfc/rfc9110#section-8.3-5
>> Thanks. Would someone like to submit a patch along these lines?
>
> It'll probably be a couple weeks until I have time to write a patch,
> but if no one has done so by then, I'll look into it.
Would the patch attached work?
[-- Attachment #2: eww-use-doctype-fallback --]
[-- Type: text/x-patch, Size: 2863 bytes --]
From 499abe197e6d245228be853731314e19148bb658 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Sebasti=C3=A1n=20Mon=C3=ADa?=
<sebastian.monia@sebasmonia.com>
Date: Mon, 23 Sep 2024 11:40:18 -0400
Subject: [PATCH] Add option eww-use-doctype-fallback, code to detect if a page
has a valid doctype tag, and use it as alternative to a content-type header
---
lisp/net/eww.el | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)
diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index a651d9d5020..59a146c8392 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -170,6 +170,14 @@ the first item is the program, and the rest are the arguments."
:type '(choice (const :tag "Never" nil)
regexp))
+(defcustom eww-use-doctype-fallback t
+ "Accept a DOCTYPE tag as evidence that page content is HTML.
+This is used only when the page does not have a valid Content-Type
+header."
+ :version "30.1"
+ :group 'eww
+ :type 'boolean)
+
(defcustom eww-browse-url-new-window-is-tab 'tab-bar
"Whether to open up new windows in a tab or a new buffer.
If t, then open the URL in a new tab rather than a new buffer if
@@ -630,6 +638,18 @@ Currently this means either text/html or application/xhtml+xml."
(member content-type '("text/html"
"application/xhtml+xml")))
+(defun eww--doctype-html-p (data-buffer)
+ "Return non-nil if DATA-BUFFER contains a doctype declaration."
+ ;; https://html.spec.whatwg.org/multipage/syntax.html#the-doctype
+ (let ((case-fold-search t)
+ (target
+ "<!doctype +html *\\(>\\|system +\\(\\\"\\|'\\)+about:legacy-compat\\)"))
+ (with-current-buffer data-buffer
+ (goto-char (point-min))
+ ;; match basic <!doctype html> and also legacy variants as
+ ;; specified in link above
+ (re-search-forward target nil t))))
+
(defun eww--rename-buffer ()
"Rename the current EWW buffer.
The renaming scheme is performed in accordance with
@@ -695,7 +715,9 @@ The renaming scheme is performed in accordance with
url))
(goto-char (point-min))
(eww-display-html (or encode charset) url nil point buffer))
- ((eww-html-p (car content-type))
+ ((or (eww-html-p (car content-type))
+ (and eww-use-doctype-fallback
+ (eww--doctype-html-p data-buffer)))
(eww-display-html (or encode charset) url nil point buffer))
((equal (car content-type) "application/pdf")
(eww-display-pdf))
@@ -717,7 +739,7 @@ The renaming scheme is performed in accordance with
(setq buffer-undo-list nil)))
(kill-buffer data-buffer)))
(unless (buffer-live-p buffer)
- (kill-buffer data-buffer))))
+ (kill-buffer data-buffer)))
(defun eww-parse-headers ()
(let ((headers nil))
--
2.45.2.windows.1
[-- Attachment #3: Type: text/plain, Size: 54 bytes --]
--
Sebastián Monía
https://site.sebasmonia.com/
* bug#73133: 29.2; EWW fails to render some webpages
2024-09-21 17:12 ` Jim Porter
2024-09-23 15:43 ` Sebastián Monía
@ 2024-09-23 15:56 ` Sebastián Monía
2024-09-24 18:31 ` Jim Porter
1 sibling, 1 reply; 10+ messages in thread
From: Sebastián Monía @ 2024-09-23 15:56 UTC (permalink / raw)
To: Jim Porter; +Cc: Eli Zaretskii, 73133, ganimard
[-- Attachment #1: Type: text/plain, Size: 158 bytes --]
Hi all,
Would something like the attached patch work?
Thanks,
Seb
PS: I think I sent this to just one person by mistake instead of a wide
reply, my bad.
[-- Attachment #2: eww-use-doctype-fallback --]
[-- Type: text/x-patch, Size: 2863 bytes --]
[-- Attachment #3: Type: text/plain, Size: 54 bytes --]
--
Sebastián Monía
https://site.sebasmonia.com/
* bug#73133: 29.2; EWW fails to render some webpages
2024-09-23 15:56 ` Sebastián Monía
@ 2024-09-24 18:31 ` Jim Porter
2024-09-25 20:46 ` Sebastián Monía
0 siblings, 1 reply; 10+ messages in thread
From: Jim Porter @ 2024-09-24 18:31 UTC (permalink / raw)
To: Sebastián Monía; +Cc: Eli Zaretskii, 73133, ganimard
On 9/23/2024 8:56 AM, Sebastián Monía wrote:
> Would something like the attached patch work?
I was actually thinking something more general, like a defcustom named
'eww-guess-content-type-functions', which would be a list of functions
where the first non-nil result is the guessed Content-Type. That way, we
could extend this to other content types (for example, maybe we'd want
to look for the magic headers for various image formats too; we don't
have to do that in this bug).
I think your 'eww--doctype-html-p' function would work nicely with a
couple small tweaks as one of the functions in
'eww-guess-content-type-functions' though.
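The "list of functions, first non-nil result wins" design described above maps onto an abnormal hook. The following is a minimal sketch under stated assumptions: every name prefixed `my/` is hypothetical (none are eww.el symbols), and each function is assumed to take a single argument, the buffer holding the response body.

```elisp
;; Sketch only: names prefixed `my/' are hypothetical, not eww.el symbols.
(defvar my/guess-content-type-functions '(my/html-if-doctype)
  "Functions called in order to guess a Content-Type.
Each is called with one argument, the buffer holding the response
body, and returns a MIME type string or nil.  The first non-nil
result is used.")

(defun my/html-if-doctype (data-buffer)
  "Return \"text/html\" if DATA-BUFFER contains an HTML doctype."
  (with-current-buffer data-buffer
    ;; Bind `case-fold-search' inside the target buffer, because the
    ;; variable can be buffer-local.
    (let ((case-fold-search t))
      (save-excursion
        (goto-char (point-min))
        (when (re-search-forward "<!doctype +html[ >]" nil t)
          "text/html")))))

(defun my/guess-content-type (data-buffer)
  "Return the first guess from `my/guess-content-type-functions', or nil."
  (run-hook-with-args-until-success 'my/guess-content-type-functions
                                    data-buffer))
```

Further guessers (for example, ones that check image magic numbers) could be added to the list without touching the dispatch code.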
* bug#73133: 29.2; EWW fails to render some webpages
2024-09-24 18:31 ` Jim Porter
@ 2024-09-25 20:46 ` Sebastián Monía
2024-09-26 1:59 ` Jim Porter
0 siblings, 1 reply; 10+ messages in thread
From: Sebastián Monía @ 2024-09-25 20:46 UTC (permalink / raw)
To: Jim Porter; +Cc: Eli Zaretskii, 73133, ganimard
Hi Jim,
Jim Porter <jporterbugs@gmail.com> writes:
> I was actually thinking something more general, like a defcustom named
> 'eww-guess-content-type-functions', which would be a list of functions
> where the first non-nil result is the guessed Content-Type. That way,
> we could extend this to other content types (for example, maybe we'd
> want to look for the magic headers for various image formats too; we
> don't have to do that in this bug).
I think the functions for the new defcustom should accept the
content-type, headers (since both are already parsed by that time), and
the entire buffer. If you agree, I can give your suggestion a shot; if
not, let me know what you think would work.
> I think your 'eww--doctype-html-p' function would work nicely with a
> couple small tweaks as one of the functions in
> 'eww-guess-content-type-functions' though.
Thanks!
I would also have the current '(eww-html-p (car content-type))' wrapped
in a function `eww--content-type-html-p` and put both functions in the
defcustom, first content type then doctype.
--
Sebastián Monía
https://site.sebasmonia.com/
* bug#73133: 29.2; EWW fails to render some webpages
2024-09-25 20:46 ` Sebastián Monía
@ 2024-09-26 1:59 ` Jim Porter
0 siblings, 0 replies; 10+ messages in thread
From: Jim Porter @ 2024-09-26 1:59 UTC (permalink / raw)
To: Sebastián Monía; +Cc: Eli Zaretskii, 73133, ganimard
On 9/25/2024 1:46 PM, Sebastián Monía wrote:
> Jim Porter <jporterbugs@gmail.com> writes:
>> I was actually thinking something more general, like a defcustom named
>> 'eww-guess-content-type-functions', which would be a list of functions
>> where the first non-nil result is the guessed Content-Type. That way,
>> we could extend this to other content types (for example, maybe we'd
>> want to look for the magic headers for various image formats too; we
>> don't have to do that in this bug).
>
> I think the functions for the new defcustom should accept the
> content-type, headers (since both are already parsed by that time), and
> the entire buffer. If you agree, I can give your suggestion a shot; if
> not, let me know what you think would work.
I think we'd only want to run this hook if the Content-Type is absent
from the headers (its job is to *guess* a content type, after all), so
I'd expect the signature to be the list of headers + the buffer.
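The gating described above (guess only when the server declared no Content-Type, with guessers receiving the headers and the buffer) could be sketched as follows. The `my/` names are hypothetical, and HEADERS is assumed here to be an alist of (NAME . VALUE) pairs with lowercase string names; the real data structure in eww.el may differ.

```elisp
;; Sketch only: `my/' names are hypothetical.  HEADERS is assumed to be
;; an alist of (NAME . VALUE) pairs with lowercase string names.
(defvar my/guess-functions nil
  "Functions called with (HEADERS DATA-BUFFER) to guess a MIME type.
Each returns a MIME type string or nil; the first non-nil result wins.")

(defun my/effective-content-type (headers data-buffer)
  "Return the Content-Type declared in HEADERS, else a guess, else a default."
  (or (cdr (assoc "content-type" headers))
      ;; Only guess when the server declared nothing.
      (run-hook-with-args-until-success 'my/guess-functions
                                        headers data-buffer)
      ;; RFC 9110 section 8.3 permits assuming this when no type is given.
      "application/octet-stream"))
```

With this shape, a declared Content-Type always takes precedence and the guessers never see responses that already state their type.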
* bug#73133: 29.2; EWW fails to render some webpages
2024-09-23 15:43 ` Sebastián Monía
@ 2024-09-28 10:58 ` Eli Zaretskii
0 siblings, 0 replies; 10+ messages in thread
From: Eli Zaretskii @ 2024-09-28 10:58 UTC (permalink / raw)
To: Sebastián Monía; +Cc: jporterbugs, 73133, ganimard
> From: Sebastián Monía <sebastian@sebasmonia.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, 73133@debbugs.gnu.org, ganimard@tuta.io
> Date: Mon, 23 Sep 2024 11:43:36 -0400
>
> +(defcustom eww-use-doctype-fallback t
> + "Accept a DOCTYPE tag as evidence that page content is HTML.
This should say
"Whether to accept the DOCTYPE tag as evidence that page content is HTML."
> +This is used only when the page does not have a valid Content-Type
> +header."
> + :version "30.1"
^^^^
This should be "31.1"
> +(defun eww--doctype-html-p (data-buffer)
> + "Return non-nil if DATA-BUFFER contains a doctype declaration."
Not just "doctype declaration", but "HTML doctype declaration", right?
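With the review comments above applied, the option from the patch might read as follows. This is a sketch of the suggested wording only; the form that is eventually committed may differ.

```elisp
;; Sketch of the option with the review comments applied (docstring
;; first line reworded, :version bumped); the committed form may differ.
(defcustom eww-use-doctype-fallback t
  "Whether to accept the DOCTYPE tag as evidence that page content is HTML.
This is used only when the page does not have a valid Content-Type
header."
  :version "31.1"
  :group 'eww
  :type 'boolean)
```

Likewise, the docstring of `eww--doctype-html-p' would become "Return non-nil if DATA-BUFFER contains an HTML doctype declaration."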