unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#44348: 28.0.50; eww renders xml processing element as is
@ 2020-10-31 15:47 Pankaj Jangid
  2020-10-31 19:47 ` Stephen Berman
  0 siblings, 1 reply; 6+ messages in thread
From: Pankaj Jangid @ 2020-10-31 15:47 UTC (permalink / raw)
  To: 44348


I published a web page using Org. The output has this XML declaration
at the top:

<?xml version="1.0" encoding="utf-8"?>

But eww renders it literally, as text, when I fetch the page from the
hosted website.

When I view the source in eww, the declaration appears as:

&lt;?xml version="1.0" encoding="utf-8"?>

Note that the opening angle bracket has been converted to an HTML entity.

The hosted page is https://codeisgreat.org/ and the source is at
https://github.com/jangid/codeisgreat/blob/master/docs/index.html, if
that helps.


In GNU Emacs 28.0.50 (build 1, x86_64-apple-darwin19.6.0, NS
appkit-1894.60 Version 10.15.7 (Build 19H2))
 of 2020-10-31 built on mb2.local
Repository revision: 74c45a62e1e48d7c52dc513b6911e65dcc38aa23
Repository branch: master
Windowing system distributor 'Apple', version 10.3.1894
System Description: Mac OS X 10.15.7

Configured using:
 'configure LDFLAGS=-L/usr/local/opt/ruby/lib
 CPPFLAGS=-I/usr/local/opt/ruby/include
 PKG_CONFIG_PATH=:/usr/local/opt/sqlite/lib/pkgconfig:/usr/local/opt/libxml2/lib/pkgconfig:/usr/local/opt/openssl/lib/pkgconfig:/usr/local/opt/libffi/lib/pkgconfig:/usr/local/opt/ruby/lib/pkgconfig'

Configured features: JPEG TIFF GIF PNG RSVG DBUS GLIB NOTIFY KQUEUE ACL
GNUTLS LIBXML2 ZLIB TOOLKIT_SCROLL_BARS NS MODULES THREADS JSON PDUMPER
LCMS2

Important settings:
  value of $LC_CTYPE: UTF-8
  value of $LANG: en_IN.UTF-8
  locale-coding-system: utf-8-unix

Major mode: eww

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows: None found.

Features: (shadow sort mail-extr emacsbug message dired dired-loaddefs
rfc822 mml mml-sec epa derived epg epg-config mm-decode mm-bodies
mm-encode mailabbrev gmm-utils mailheader sendmail view mhtml-mode
css-mode smie color js imenu cc-mode cc-fonts cc-guess cc-menus cc-cmds
cc-styles cc-align cc-engine cc-vars cc-defs sgml-mode cl-extra
help-mode gnutls network-stream url-http mail-parse rfc2231 url-gw nsm
rmc url-cache url-auth format-spec eww easymenu xdg url-queue thingatpt
shr kinsoku svg xml dom browse-url url url-proxy url-privacy url-expand
url-methods url-history url-cookie url-domsuf url-util url-parse
url-vars mailcap puny mm-url gnus nnheader gnus-util rmail
rmail-loaddefs auth-source cl-seq eieio eieio-core cl-macs
eieio-loaddefs password-cache json map rfc2047 rfc2045 ietf-drums
text-property-search time-date subr-x seq byte-opt gv bytecomp
byte-compile cconv mail-utils wid-edit mm-util mail-prsvr cl-loaddefs
cl-lib tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel term/ns-win ns-win ucs-normalize mule-util
term/common-win tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite charscript charprop case-table epa-hook
jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice button
loaddefs faces cus-face macroexp files window text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote threads dbusbind kqueue cocoa ns
lcms2 multi-tty make-network-process emacs)

Memory information: ((conses 16 116182 8858)
 (symbols 48 13342 1) (strings 32 39602 2558) (string-bytes 1 1369339)
 (vectors 16 20825) (vector-slots 8 262285 11163) (floats 8 141 51)
 (intervals 56 695 0) (buffers 992 14))

-- 
Pankaj






* bug#44348: 28.0.50; eww renders xml processing element as is
  2020-10-31 15:47 bug#44348: 28.0.50; eww renders xml processing element as is Pankaj Jangid
@ 2020-10-31 19:47 ` Stephen Berman
  2020-11-01 13:28   ` Lars Ingebrigtsen
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Berman @ 2020-10-31 19:47 UTC (permalink / raw)
  To: Pankaj Jangid; +Cc: 44348

On Sat, 31 Oct 2020 21:17:35 +0530 Pankaj Jangid <pankaj@codeisgreat.org> wrote:

> I published a web page using Org. The output has this XML declaration
> at the top:
>
> <?xml version="1.0" encoding="utf-8"?>
>
> But eww renders it literally, as text, when I fetch the page from the
> hosted website.
>
> When I view the source in eww, the declaration appears as:
>
> &lt;?xml version="1.0" encoding="utf-8"?>
>
> Note that the opening angle bracket has been converted to an HTML entity.

The simplest fix would seem to be this:

diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index fd9fe98439..051698d6d6 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -420,7 +420,7 @@ eww--preprocess-html
       (narrow-to-region start end)
       (goto-char start)
       (let ((case-fold-search t))
-        (while (re-search-forward "<[^0-9a-z!/]" nil t)
+        (while (re-search-forward "<[^0-9a-z!?/]" nil t)
           (goto-char (match-beginning 0))
           (delete-region (point) (1+ (point)))
           (insert "&lt;"))))))
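
For anyone who wants to check what the one-character change does, here
is a minimal sketch that runs the same escaping loop over a string.
The helper name my/escape-stray-lt and the sample input are
hypothetical; only the two regexps are taken from the patch above.

```elisp
;; Hypothetical helper, not part of eww.el: mimics the escaping loop in
;; `eww--preprocess-html' on a string, with the regexp as a parameter.
(defun my/escape-stray-lt (regexp html)
  "Return HTML with each \"<\" that starts a match of REGEXP escaped."
  (with-temp-buffer
    (insert html)
    (goto-char (point-min))
    (let ((case-fold-search t))
      (while (re-search-forward regexp nil t)
        (goto-char (match-beginning 0))
        (delete-region (point) (1+ (point)))
        (insert "&lt;")))
    (buffer-string)))

;; Made-up sample: an XML declaration plus one genuinely stray "<".
(defconst my/sample
  "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<p>1 < 2</p>")

;; Original character class: "?" is not excluded, so "<?" matches.
(my/escape-stray-lt "<[^0-9a-z!/]" my/sample)
;; => "&lt;?xml version=\"1.0\" encoding=\"utf-8\"?>\n<p>1 &lt; 2</p>"

;; Patched character class: "<?" no longer matches.
(my/escape-stray-lt "<[^0-9a-z!?/]" my/sample)
;; => "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<p>1 &lt; 2</p>"
```

With the original class the "<" of "<?xml" is escaped along with the
stray one in "1 < 2"; with "?" added, only the stray one is.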

But if that's too permissive, then a more specific fix is this:

diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index fd9fe98439..bc795df256 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -421,9 +421,11 @@ eww--preprocess-html
       (goto-char start)
       (let ((case-fold-search t))
         (while (re-search-forward "<[^0-9a-z!/]" nil t)
-          (goto-char (match-beginning 0))
-          (delete-region (point) (1+ (point)))
-          (insert "&lt;"))))))
+          (unless (and (looking-back "\\?" (line-beginning-position))
+                       (looking-at "xml"))
+            (goto-char (match-beginning 0))
+            (delete-region (point) (1+ (point)))
+            (insert "&lt;")))))))

 ;;;###autoload
 (defalias 'browse-web 'eww)
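
To illustrate how the narrower check differs from simply allowing any
"<?", here is a rough sketch of the second variant applied to a string.
The helper name my/escape-lt-keep-xml and the inputs are hypothetical.
Unlike the patch, the sketch records the match start before the
lookaround calls, on the assumption that looking-back can replace the
match data.

```elisp
;; Hypothetical helper, not part of eww.el: runs the loop from the more
;; specific patch over a string instead of a buffer region.
(defun my/escape-lt-keep-xml (html)
  "Escape stray \"<\" in HTML, but leave \"<?xml\" untouched."
  (with-temp-buffer
    (insert html)
    (goto-char (point-min))
    (let ((case-fold-search t))
      (while (re-search-forward "<[^0-9a-z!/]" nil t)
        ;; Point is now just after the two-character match.  Record the
        ;; match start first, since `looking-back' below may change the
        ;; match data.
        (let ((lt (match-beginning 0)))
          (unless (and (looking-back "\\?" (line-beginning-position))
                       (looking-at "xml"))
            (goto-char lt)
            (delete-region (point) (1+ (point)))
            (insert "&lt;")))))
    (buffer-string)))

;; The XML declaration is left alone.
(my/escape-lt-keep-xml "<?xml version=\"1.0\"?>")
;; => "<?xml version=\"1.0\"?>"

;; Any other "<?" is still escaped (made-up input).
(my/escape-lt-keep-xml "<?php echo 1; ?>")
;; => "&lt;?php echo 1; ?>"
```

The simpler fix would leave both inputs unchanged, which is the sense
in which it is more permissive.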

Steve Berman






* bug#44348: 28.0.50; eww renders xml processing element as is
  2020-10-31 19:47 ` Stephen Berman
@ 2020-11-01 13:28   ` Lars Ingebrigtsen
  2020-11-01 23:08     ` Stephen Berman
  0 siblings, 1 reply; 6+ messages in thread
From: Lars Ingebrigtsen @ 2020-11-01 13:28 UTC (permalink / raw)
  To: Stephen Berman; +Cc: 44348, Pankaj Jangid

Stephen Berman <stephen.berman@gmx.net> writes:

> The simplest fix would seem to be this:

[...]

> @@ -420,7 +420,7 @@ eww--preprocess-html
>        (narrow-to-region start end)
>        (goto-char start)
>        (let ((case-fold-search t))
> -        (while (re-search-forward "<[^0-9a-z!/]" nil t)
> +        (while (re-search-forward "<[^0-9a-z!?/]" nil t)

Looks good to me; go ahead and push.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#44348: 28.0.50; eww renders xml processing element as is
  2020-11-01 13:28   ` Lars Ingebrigtsen
@ 2020-11-01 23:08     ` Stephen Berman
  2020-11-02 15:16       ` Lars Ingebrigtsen
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Berman @ 2020-11-01 23:08 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 44348, Pankaj Jangid

On Sun, 01 Nov 2020 14:28:46 +0100 Lars Ingebrigtsen <larsi@gnus.org> wrote:

> Stephen Berman <stephen.berman@gmx.net> writes:
>
>> The simplest fix would seem to be this:
>
> [...]
>
>> @@ -420,7 +420,7 @@ eww--preprocess-html
>>        (narrow-to-region start end)
>>        (goto-char start)
>>        (let ((case-fold-search t))
>> -        (while (re-search-forward "<[^0-9a-z!/]" nil t)
>> +        (while (re-search-forward "<[^0-9a-z!?/]" nil t)
>
> Looks good to me; go ahead and push.

Thanks.  I checked and saw that eww--preprocess-html is a new function
in emacs-27 (commit 568f1488), and the bug does not happen in emacs-26,
so should the fix go into the emacs-27 branch?

Steve Berman






* bug#44348: 28.0.50; eww renders xml processing element as is
  2020-11-01 23:08     ` Stephen Berman
@ 2020-11-02 15:16       ` Lars Ingebrigtsen
  2020-11-02 22:28         ` Stephen Berman
  0 siblings, 1 reply; 6+ messages in thread
From: Lars Ingebrigtsen @ 2020-11-02 15:16 UTC (permalink / raw)
  To: Stephen Berman; +Cc: 44348, Pankaj Jangid

Stephen Berman <stephen.berman@gmx.net> writes:

> Thanks.  I checked and saw that eww--preprocess-html is a new function
> in emacs-27 (commit 568f1488), and the bug does not happen in emacs-26,
> so should the fix go into the emacs-27 branch?

Yes, this should be a safe enough fix to go to emacs-27.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#44348: 28.0.50; eww renders xml processing element as is
  2020-11-02 15:16       ` Lars Ingebrigtsen
@ 2020-11-02 22:28         ` Stephen Berman
  0 siblings, 0 replies; 6+ messages in thread
From: Stephen Berman @ 2020-11-02 22:28 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 44348-done, Pankaj Jangid

On Mon, 02 Nov 2020 16:16:40 +0100 Lars Ingebrigtsen <larsi@gnus.org> wrote:

> Stephen Berman <stephen.berman@gmx.net> writes:
>
>> Thanks.  I checked and saw that eww--preprocess-html is a new function
>> in emacs-27 (commit 568f1488), and the bug does not happen in emacs-26,
>> so should the fix go into the emacs-27 branch?
>
> Yes, this should be a safe enough fix to go to emacs-27.

Done in commit 1b7ab9d0ac and closing the bug.

Steve Berman






Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
