unofficial mirror of emacs-devel@gnu.org 
* eww.el: Patch to cache the parse tree
@ 2013-11-27 17:09 T.V. Raman
  2013-11-30  1:08 ` T.V. Raman
  2013-12-01 13:12 ` Lars Magne Ingebrigtsen
  0 siblings, 2 replies; 13+ messages in thread
From: T.V. Raman @ 2013-11-27 17:09 UTC (permalink / raw)
  To: emacs-devel; +Cc: tv.raman.tv

Hi,

I'd like to add some code to eww.el so that the parsed document
is cached. This will enable functionality such as document
filtering; see
http://emacspeak.googlecode.com/svn/trunk/lisp/shr-url.el for
similar functionality that I originally built using bare shr.
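
As a rough illustration of what a cached parse tree makes possible, the
sketch below collects every element with a given tag from a tree in the
(TAG ATTRIBUTES . CHILDREN) shape that libxml-parse-html-region returns.
It is not code from the patch or from Emacspeak; the names
`my-dom-collect' and `my-example-dom' are invented for this example.

```elisp
;; Hypothetical example tree, written by hand in the list shape that
;; `libxml-parse-html-region' produces: (TAG ATTRIBUTES . CHILDREN).
(defvar my-example-dom
  '(html nil
         (body nil
               (h1 nil "Title")
               (p nil "First " (a ((href . "a.html")) "link"))
               (p nil "Second"))))

(defun my-dom-collect (node tag)
  "Return the elements in NODE whose tag is TAG, in document order.
NODE is a parse tree of the form (TAG ATTRIBUTES . CHILDREN).
Text children are strings and are skipped."
  (when (consp node)
    (let ((found (and (eq (car node) tag) (list node))))
      ;; Children start after the tag and the attribute alist.
      (dolist (child (cddr node))
        (setq found (append found (my-dom-collect child tag))))
      found)))

;; Keep only the paragraphs of the example document.
(my-dom-collect my-example-dom 'p)
```

With the tree cached in a buffer-local variable such as
`eww-current-dom', a filter like this could be applied without fetching
or parsing the page again, and the selected nodes could then be handed
to shr for rendering.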

Here is a patch against master:

git diff master
diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index 86e0977..a446a01 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -89,6 +89,9 @@
   :group 'eww)

 (defvar eww-current-url nil)
+(defvar eww-current-dom nil)
+(make-variable-buffer-local 'eww-current-dom)
+
 (defvar eww-current-title ""
   "Title of current page.")
 (defvar eww-history nil)
@@ -208,6 +211,7 @@ word(s) will be searched for via `eww-search-prefix'."
 		  (start end &optional base-url))

 (defun eww-display-html (charset url)
+  (declare (special eww-current-dom))
   (or (fboundp 'libxml-parse-html-region)
       (error "This function requires Emacs to be compiled with libxml2"))
   (unless (eq charset 'utf8)
@@ -219,6 +223,7 @@ word(s) will be searched for via `eww-search-prefix'."
 	  'base (list (cons 'href url))
 	  (libxml-parse-html-region (point) (point-max)))))
     (eww-setup-buffer)
+    (setq eww-current-dom document)
     (let ((inhibit-read-only t)
 	  (after-change-functions nil)
 	  (shr-width nil)
@@ -387,9 +392,11 @@ word(s) will be searched for via `eww-search-prefix'."
   )

 (defun eww-save-history ()
+  (declare (special eww-current-dom))
   (push (list :url eww-current-url
 	      :title eww-current-title
 	      :point (point)
+              :dom eww-current-dom
 	      :text (buffer-string))
 	eww-history))

@@ -427,6 +434,7 @@ word(s) will be searched for via `eww-search-prefix'."
   (let ((inhibit-read-only t))
     (erase-buffer)
     (insert (plist-get elem :text))
+    (setq eww-current-dom (plist-get elem :dom))
     (goto-char (plist-get elem :point))
     (setq eww-current-url (plist-get elem :url)
 	  eww-current-title (plist-get elem :title))))

* eww.el: Patch to cache the parse tree
  2013-11-27 17:09 eww.el: Patch to cache the parse tree T.V. Raman
@ 2013-11-30  1:08 ` T.V. Raman
  2013-12-01 13:12 ` Lars Magne Ingebrigtsen
  1 sibling, 0 replies; 13+ messages in thread
From: T.V. Raman @ 2013-11-30  1:08 UTC (permalink / raw)
  To: emacs-devel, tv.raman.tv

Following up to myself:

To see how I am leveraging this patch, see the sections on DOM
filtering in
http://emacspeak.googlecode.com/svn/trunk/lisp/emacspeak-eww.el



* Re: eww.el: Patch to cache the parse tree
  2013-11-27 17:09 eww.el: Patch to cache the parse tree T.V. Raman
  2013-11-30  1:08 ` T.V. Raman
@ 2013-12-01 13:12 ` Lars Magne Ingebrigtsen
  2013-12-03  3:09   ` T.V. Raman
                     ` (2 more replies)
  1 sibling, 3 replies; 13+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-12-01 13:12 UTC (permalink / raw)
  To: T.V. Raman; +Cc: emacs-devel

"T.V. Raman" <tv.raman.tv@gmail.com> writes:

> I'd like  to add some code to eww.el so that the parsed document
> is cached ( -- this will enable  functionality such as document
> filtering etc (see
> http://emacspeak.googlecode.com/svn/trunk/lisp/shr-url.el) for
> similar functionality that I originally built using bare shr.

I've now applied parts of your patch and rewritten other bits.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* eww.el: Patch to cache the parse tree
  2013-12-01 13:12 ` Lars Magne Ingebrigtsen
@ 2013-12-03  3:09   ` T.V. Raman
  2013-12-03  3:34   ` T.V. Raman
  2013-12-04 16:46   ` eww: display page source (was: eww.el: Patch to cache the parse tree) Ted Zlatanov
  2 siblings, 0 replies; 13+ messages in thread
From: T.V. Raman @ 2013-12-03  3:09 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen, emacs-devel

Thanks Lars,

-- 
Best Regards,
--raman



* eww.el: Patch to cache the parse tree
  2013-12-01 13:12 ` Lars Magne Ingebrigtsen
  2013-12-03  3:09   ` T.V. Raman
@ 2013-12-03  3:34   ` T.V. Raman
  2013-12-14 16:27     ` Lars Magne Ingebrigtsen
  2013-12-04 16:46   ` eww: display page source (was: eww.el: Patch to cache the parse tree) Ted Zlatanov
  2 siblings, 1 reply; 13+ messages in thread
From: T.V. Raman @ 2013-12-03  3:34 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen, emacs-devel

Hi Lars,

One more request: in the call
(libxml-parse-html-region (point) (point-max))

could you please pass in the current url as a third argument so
that libxml knows to handle it as a base URL?
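
For reference, a call with the extra argument could look like the
sketch below. The helper name `my-parse-html-string' is made up for
this example, and the guard simply returns nil when Emacs has no usable
libxml2 support. What libxml2 does with the base URL is not
demonstrated here; check the documentation of the function for your
Emacs version.

```elisp
(defun my-parse-html-string (html &optional base-url)
  "Parse the string HTML and return the libxml parse tree, or nil.
BASE-URL, if non-nil, is passed as the third argument to
`libxml-parse-html-region'.  Return nil when libxml2 support is
missing from this Emacs."
  (when (and (fboundp 'libxml-parse-html-region)
             ;; `libxml-available-p' only exists in newer Emacs
             ;; versions, so call it only when it is defined.
             (or (not (fboundp 'libxml-available-p))
                 (libxml-available-p)))
    (with-temp-buffer
      (insert html)
      (libxml-parse-html-region (point-min) (point-max) base-url))))

;; Hypothetical page and URL, for illustration only.
(my-parse-html-string "<html><body><a href=\"next.html\">next</a></body></html>"
                      "http://example.com/dir/index.html")
```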

-- 
Best Regards,
--raman



* eww: display page source (was: eww.el: Patch to cache the parse tree)
  2013-12-01 13:12 ` Lars Magne Ingebrigtsen
  2013-12-03  3:09   ` T.V. Raman
  2013-12-03  3:34   ` T.V. Raman
@ 2013-12-04 16:46   ` Ted Zlatanov
  2013-12-05  1:37     ` eww: display page source Lars Magne Ingebrigtsen
  2 siblings, 1 reply; 13+ messages in thread
From: Ted Zlatanov @ 2013-12-04 16:46 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 227 bytes --]

Lars, since you're in a generous mood ;)

Here's a patch to display the current page's HTML source.  Can you see
if it's acceptable?  I used the `eww-current-dom' code as a guide to
where to put the stateful data.

Thanks
Ted


[-- Attachment #2: eww-display-source.patch --]
[-- Type: text/x-diff, Size: 2605 bytes --]

=== modified file 'lisp/net/eww.el'
--- lisp/net/eww.el	2013-12-03 04:54:17 +0000
+++ lisp/net/eww.el	2013-12-04 16:43:02 +0000
@@ -117,6 +117,7 @@
 
 (defvar eww-current-url nil)
 (defvar eww-current-dom nil)
+(defvar eww-current-source nil)
 (defvar eww-current-title ""
   "Title of current page.")
 (defvar eww-history nil)
@@ -247,6 +248,7 @@
 	     (list
 	      'base (list (cons 'href url))
 	      (libxml-parse-html-region (point) (point-max))))))
+    (setq eww-current-source (buffer-substring (point) (point-max)))
     (eww-setup-buffer)
     (setq eww-current-dom document)
     (let ((inhibit-read-only t)
@@ -375,6 +377,14 @@
   (unless (eq major-mode 'eww-mode)
     (eww-mode)))
 
+(defun eww-view-source ()
+  (let ((buf (get-buffer-create "*eww-source*"))
+        (source eww-current-source))
+    (with-current-buffer buf
+      (delete-region (point-min) (point-max))
+      (insert (or source "no source")))
+    (view-buffer buf)))
+
 (defvar eww-mode-map
   (let ((map (make-sparse-keymap)))
     (suppress-keymap map)
@@ -395,6 +405,7 @@
     (define-key map "d" 'eww-download)
     (define-key map "w" 'eww-copy-page-url)
     (define-key map "C" 'url-cookie-list)
+    (define-key map "v" 'eww-view-source)
 
     (define-key map "b" 'eww-add-bookmark)
     (define-key map "B" 'eww-list-bookmarks)
@@ -411,6 +422,7 @@
 	 :active (not (zerop eww-history-position))]
 	["Browse with external browser" eww-browse-with-external-browser t]
 	["Download" eww-download t]
+	["View page source" eww-view-source]
 	["Copy page URL" eww-copy-page-url t]
 	["Add bookmark" eww-add-bookmark t]
 	["List bookmarks" eww-copy-page-url t]
@@ -424,6 +436,7 @@
   ;; FIXME?  This seems a strange default.
   (set (make-local-variable 'eww-current-url) 'author)
   (set (make-local-variable 'eww-current-dom) nil)
+  (set (make-local-variable 'eww-current-source) nil)
   (set (make-local-variable 'browse-url-browser-function) 'eww-browse-url)
   (set (make-local-variable 'after-change-functions) 'eww-process-text-input)
   (set (make-local-variable 'eww-history) nil)
@@ -437,6 +450,7 @@
 	      :title eww-current-title
 	      :point (point)
               :dom eww-current-dom
+              :source eww-current-source
 	      :text (buffer-string))
 	eww-history))
 
@@ -468,6 +482,7 @@
   (let ((inhibit-read-only t))
     (erase-buffer)
     (insert (plist-get elem :text))
+    (setq eww-current-source (plist-get elem :source))
     (setq eww-current-dom (plist-get elem :dom))
     (goto-char (plist-get elem :point))
     (setq eww-current-url (plist-get elem :url)
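
One detail to keep in mind with buffer-local state like this: a
variable made local in the eww buffer is not visible from another
buffer, so its value has to be captured before switching buffers. The
sketch below demonstrates the effect with a made-up variable,
`my-page-source', and a made-up function, `my-capture-demo'; it is an
illustration only, not part of the patch.

```elisp
(defvar my-page-source nil
  "Hypothetical stand-in for a per-buffer page source variable.")

(defun my-capture-demo ()
  "Return (SEEN-IN-OTHER-BUFFER CAPTURED-VALUE) for `my-page-source'."
  (with-temp-buffer
    ;; Give the variable a value that is local to this buffer only.
    (set (make-local-variable 'my-page-source) "<html>")
    ;; Capture the local value before leaving the buffer.
    (let ((captured my-page-source))
      (with-temp-buffer
        ;; In this second buffer the variable has its default value,
        ;; while the captured copy still holds the page source.
        (list my-page-source captured)))))

(my-capture-demo)
```

The same reasoning applies to any function that reads per-page state
and then writes into a different buffer. Separately, a function that is
bound to a key, as `eww-view-source' is bound to "v" above, needs an
`(interactive)' form to be callable as a command.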



* Re: eww: display page source
  2013-12-04 16:46   ` eww: display page source (was: eww.el: Patch to cache the parse tree) Ted Zlatanov
@ 2013-12-05  1:37     ` Lars Magne Ingebrigtsen
  2013-12-05 16:06       ` Ted Zlatanov
  0 siblings, 1 reply; 13+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-12-05  1:37 UTC (permalink / raw)
  To: emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:

> Lars, since you're in a generous mood ;)
>
> Here's a patch to display the current page's HTML source.  Can you see
> if it's acceptable?  I used the `eww-current-dom' code as a guide to
> where to put the stateful data.

Looks good to me, please apply.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: eww: display page source
  2013-12-05  1:37     ` eww: display page source Lars Magne Ingebrigtsen
@ 2013-12-05 16:06       ` Ted Zlatanov
  2013-12-14 16:26         ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 13+ messages in thread
From: Ted Zlatanov @ 2013-12-05 16:06 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: emacs-devel

On Thu, 05 Dec 2013 02:37:09 +0100 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote: 

LMI> Ted Zlatanov <tzz@lifelogs.com> writes:
>> Lars, since you're in a generous mood ;)
>> 
>> Here's a patch to display the current page's HTML source.  Can you see
>> if it's acceptable?  I used the `eww-current-dom' code as a guide to
>> where to put the stateful data.

LMI> Looks good to me, please apply.

OK; done.  I call `html-mode' if it's loaded to do nice highlighting; I
hope that's OK.
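
A minimal sketch of that idea follows; it is not the committed code.
The names `my-fill-source-buffer' and `my-view-source' and the buffer
name "*my-source*" are invented for this example, and using `fboundp'
as the availability test is an assumption (it is also true for an
autoloaded `html-mode', in which case calling it loads the mode).

```elisp
(defun my-fill-source-buffer (source)
  "Put SOURCE into the buffer *my-source* and return that buffer.
Turn on `html-mode' there when that function is available."
  (let ((buf (get-buffer-create "*my-source*")))
    (with-current-buffer buf
      ;; The buffer may be read-only from an earlier `view-buffer'.
      (let ((inhibit-read-only t))
        (erase-buffer)
        (insert (or source "no source"))
        (goto-char (point-min)))
      (when (fboundp 'html-mode)
        (html-mode)))
    buf))

(defun my-view-source (source)
  "Show SOURCE, a string of HTML, in a read-only view buffer."
  (interactive "sHTML source: ")
  (view-buffer (my-fill-source-buffer source)))
```

Splitting the buffer setup from the viewing command keeps the first
function usable from other code without popping up a window.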

Ted




* Re: eww: display page source
  2013-12-05 16:06       ` Ted Zlatanov
@ 2013-12-14 16:26         ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 13+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-12-14 16:26 UTC (permalink / raw)
  To: emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:

> OK; done.  I call `html-mode' if it's loaded to do nice highlighting; I
> hope that's OK.

Sure; that's nice.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: eww.el: Patch to cache the parse tree
  2013-12-03  3:34   ` T.V. Raman
@ 2013-12-14 16:27     ` Lars Magne Ingebrigtsen
  2013-12-16  0:22       ` T.V. Raman
  2013-12-16  0:24       ` T.V. Raman
  0 siblings, 2 replies; 13+ messages in thread
From: Lars Magne Ingebrigtsen @ 2013-12-14 16:27 UTC (permalink / raw)
  To: T.V. Raman; +Cc: emacs-devel

"T.V. Raman" <tv.raman.tv@gmail.com> writes:

> One more request: in the call
> (libxml-parse-html-region (point) (point-max))
>
> could you please pass in the current url as a third argument so
> that libxml knows to handle it as a base URL?

Hm...  doesn't eww do the base URL expansion stuff itself?  I don't
recall why, though.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* eww.el: Patch to cache the parse tree
  2013-12-14 16:27     ` Lars Magne Ingebrigtsen
@ 2013-12-16  0:22       ` T.V. Raman
  2013-12-16  0:24       ` T.V. Raman
  1 sibling, 0 replies; 13+ messages in thread
From: T.V. Raman @ 2013-12-16  0:22 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen, emacs-devel

I thought I ran into a case where the base URL was unset (I don't
remember where). I looked at the code and felt that maybe we would
do better by passing in the base URL. I see that you cons a base
href onto the DOM being passed to shr.
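
The wrapping referred to here is the (list 'base (list (cons 'href url)) ...)
form visible in the diff context. The sketch below shows that shape and
how a relative link could be resolved against it with
`url-expand-file-name'. The helper names (`my-wrap-with-base',
`my-dom-base-href', `my-expand-link') are made up for this example, and
this is not how shr itself is implemented.

```elisp
(require 'url-expand)

(defun my-wrap-with-base (url dom)
  "Return DOM wrapped in a `base' node whose href attribute is URL."
  (list 'base (list (cons 'href url)) dom))

(defun my-dom-base-href (dom)
  "Return the href of DOM's top-level `base' node, or nil if none."
  (and (eq (car-safe dom) 'base)
       (cdr (assq 'href (cadr dom)))))

(defun my-expand-link (href dom)
  "Expand HREF against the base href recorded in DOM.
Return HREF unchanged when DOM has no top-level `base' node."
  (let ((base (my-dom-base-href dom)))
    (if base
        (url-expand-file-name href base)
      href)))

;; Hypothetical page URL and document, for illustration only.
(let ((dom (my-wrap-with-base "http://example.com/dir/page.html"
                              '(html nil (body nil "text")))))
  (my-expand-link "img/a.png" dom))
```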

* eww.el: Patch to cache the parse tree
  2013-12-14 16:27     ` Lars Magne Ingebrigtsen
  2013-12-16  0:22       ` T.V. Raman
@ 2013-12-16  0:24       ` T.V. Raman
  2013-12-24 20:36         ` Lars Ingebrigtsen
  1 sibling, 1 reply; 13+ messages in thread
From: T.V. Raman @ 2013-12-16  0:24 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen, emacs-devel

Also, Lars, do you have any interest in incorporating the DOM
filtering code I wrote for Emacspeak users? It may be useful for
mainstream users, though that is hard for me to tell. You can see
the code here:
http://emacspeak.googlecode.com/svn/trunk/lisp/emacspeak-eww.el



* Re: eww.el: Patch to cache the parse tree
  2013-12-16  0:24       ` T.V. Raman
@ 2013-12-24 20:36         ` Lars Ingebrigtsen
  0 siblings, 0 replies; 13+ messages in thread
From: Lars Ingebrigtsen @ 2013-12-24 20:36 UTC (permalink / raw)
  To: T.V. Raman; +Cc: emacs-devel

"T.V. Raman" <tv.raman.tv@gmail.com> writes:

> Also, Lars, do you have any interest in incorporating the DOM
> filtering code I wrote for Emacspeak users; may be useful for the
> mainstream user -- hard for me to tell. You can see the code
> here:
> http://emacspeak.googlecode.com/svn/trunk/lisp/emacspeak-eww.el

This looks useful, but it does look like it would slow down HTML
rendering.

eww is plenty slow already, and is the sort of system where a lot of
good, small improvements may conspire collectively to make it totally
unusable, while each improvement is a good thing individually.

So I think this would be better as a part of Emacspeak.

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/



Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
