* [PATCH 0/4] org-feed: Improve feed parsing
2010-06-19 11:29 ` zwz
@ 2010-06-19 14:25 ` David Maus
2010-07-01 13:23 ` Carsten Dominik
2010-06-19 14:25 ` [PATCH 1/4] Respect possible XML namespace of rss:item elements David Maus
` (3 subsequent siblings)
4 siblings, 1 reply; 21+ messages in thread
From: David Maus @ 2010-06-19 14:25 UTC (permalink / raw)
To: emacs-orgmode
Four patches to improve feed parsing:
- respect XML namespace of rss:item elements (fixes reported problem
with codeproject.com's feed)
- ignore case of rss element names (ditto; codeproject uses upper
case letters for the rss:guid element)
- provide and use a function to unescape XML entities. Some
characters must be escaped for XML transport (e.g. &); they are
properly unescaped using `xml-entity-alist'
David Maus (4):
Respect possible XML namespace of rss:item elements.
Ignore case of rss element names.
Unescape protected entities defined in `xml-entity-alist'.
Unescape rss element content.
lisp/org-feed.el | 26 +++++++++++++++++++-------
1 files changed, 19 insertions(+), 7 deletions(-)
* Re: [PATCH 0/4] org-feed: Improve feed parsing
2010-06-19 14:25 ` [PATCH 0/4] org-feed: Improve feed parsing David Maus
@ 2010-07-01 13:23 ` Carsten Dominik
2010-07-01 18:02 ` [PATCH] Resubmit: Remove superfluous lambda David Maus
2010-07-01 18:02 ` [PATCH] " David Maus
0 siblings, 2 replies; 21+ messages in thread
From: Carsten Dominik @ 2010-07-01 13:23 UTC (permalink / raw)
To: David Maus; +Cc: emacs-orgmode
Hi David,
I have applied these patches, with the exception of patch 20
(patchwork reference number, about removing a lambda), which
for some reason does not apply.
Can you please check and resubmit an updated patch?
Also, your commit messages are almost in the new format, but
please don't indent the second and further lines
of the ChangeLog entry.
Thanks!
- Carsten
On Jun 19, 2010, at 4:25 PM, David Maus wrote:
> Four patches to improve feed parsing:
>
> - respect XML namespace of rss:item elements (fixes reported problem
> with codeproject.com's feed)
>
> - ignore case of rss element names (dto., codeproject uses upper
> case letters for rss:guid element)
>
> - provide and use function to unescape XML entities. Some
> characters must be escaped for XML transport (e.g. &), they are
> properly unescaped using `xml-entity-alist'
>
> David Maus (4):
> Respect possible XML namespace of rss:item elements.
> Ignore case of rss element names.
> Unescape protected entities defined in `xml-entity-alist'.
> Unescape rss element content.
>
> lisp/org-feed.el | 26 +++++++++++++++++++-------
> 1 files changed, 19 insertions(+), 7 deletions(-)
>
>
> _______________________________________________
> Emacs-orgmode mailing list
> Please use `Reply All' to send replies to the list.
> Emacs-orgmode@gnu.org
> http://lists.gnu.org/mailman/listinfo/emacs-orgmode
- Carsten
* [PATCH] Resubmit: Remove superfluous lambda.
2010-07-01 13:23 ` Carsten Dominik
@ 2010-07-01 18:02 ` David Maus
2010-07-01 18:02 ` [PATCH] " David Maus
1 sibling, 0 replies; 21+ messages in thread
From: David Maus @ 2010-07-01 18:02 UTC (permalink / raw)
To: emacs-orgmode
Resubmit patch #60 as requested (against current HEAD)
David Maus (1):
Remove superfluous lambda.
lisp/org-feed.el | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
* [PATCH] Remove superfluous lambda.
2010-07-01 13:23 ` Carsten Dominik
2010-07-01 18:02 ` [PATCH] Resubmit: Remove superfluous lambda David Maus
@ 2010-07-01 18:02 ` David Maus
2010-07-02 4:40 ` Carsten Dominik
1 sibling, 1 reply; 21+ messages in thread
From: David Maus @ 2010-07-01 18:02 UTC (permalink / raw)
To: emacs-orgmode
* org-feed.el (org-feed-unescape): Remove superfluous lambda.
---
lisp/org-feed.el | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 3edcf1a..cda7368 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -271,8 +271,7 @@ have been saved."
(defun org-feed-unescape (s)
"Unescape protected entities in S."
(let ((re (concat "&\\("
- (mapconcat (lambda (e)
- (car e)) xml-entity-alist "\\|")
+ (mapconcat 'car xml-entity-alist "\\|")
"\\);")))
(while (string-match re s)
(setq s (replace-match
--
1.7.1
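A quick way to see that the simplification above preserves behavior: `mapconcat' only needs a function that turns each alist entry into a string, so `car' can be passed directly instead of being wrapped in a lambda. The sketch below is illustrative only. `demo-entity-alist' is a hypothetical stand-in for `xml-entity-alist' (whose actual contents depend on the Emacs version), and both helper names are made up for this example.

```elisp
;; Hypothetical stand-in for `xml-entity-alist'; not the real variable.
(defvar demo-entity-alist
  '(("lt" . "<") ("gt" . ">") ("apos" . "'") ("quot" . "\"") ("amp" . "&")))

(defun demo-entity-regexp-lambda ()
  "Build the entity regexp with an explicit lambda, as before the patch."
  (concat "&\\("
          (mapconcat (lambda (e) (car e)) demo-entity-alist "\\|")
          "\\);"))

(defun demo-entity-regexp-car ()
  "Build the same regexp by passing `car' directly, as after the patch."
  (concat "&\\("
          (mapconcat #'car demo-entity-alist "\\|")
          "\\);"))
```

Both helpers should return the same string, so the patch reads as a pure simplification.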
* Re: [PATCH] Remove superfluous lambda.
2010-07-01 18:02 ` [PATCH] " David Maus
@ 2010-07-02 4:40 ` Carsten Dominik
2010-07-02 6:04 ` David Maus
0 siblings, 1 reply; 21+ messages in thread
From: Carsten Dominik @ 2010-07-02 4:40 UTC (permalink / raw)
To: David Maus, John Wiegley; +Cc: emacs-orgmode Mode
Hmm, the catcher did not see this. Why?
- Carsten
On Jul 1, 2010, at 8:02 PM, David Maus wrote:
> * org-feed.el (org-feed-unescape): Remove superfluous lambda.
> ---
> lisp/org-feed.el | 3 +--
> 1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/lisp/org-feed.el b/lisp/org-feed.el
> index 3edcf1a..cda7368 100644
> --- a/lisp/org-feed.el
> +++ b/lisp/org-feed.el
> @@ -271,8 +271,7 @@ have been saved."
> (defun org-feed-unescape (s)
> "Unescape protected entities in S."
> (let ((re (concat "&\\("
> - (mapconcat (lambda (e)
> - (car e)) xml-entity-alist "\\|")
> + (mapconcat 'car xml-entity-alist "\\|")
> "\\);")))
> (while (string-match re s)
> (setq s (replace-match
> --
> 1.7.1
>
>
> _______________________________________________
> Emacs-orgmode mailing list
> Please use `Reply All' to send replies to the list.
> Emacs-orgmode@gnu.org
> http://lists.gnu.org/mailman/listinfo/emacs-orgmode
- Carsten
* Re: [PATCH] Remove superfluous lambda.
2010-07-02 4:40 ` Carsten Dominik
@ 2010-07-02 6:04 ` David Maus
2010-07-02 13:21 ` Sebastian Rose
0 siblings, 1 reply; 21+ messages in thread
From: David Maus @ 2010-07-02 6:04 UTC (permalink / raw)
To: Carsten Dominik; +Cc: emacs-orgmode Mode, John Wiegley
Carsten Dominik wrote:
>Hmm, the catcher did not see this. Why?
He did: http://patchwork.newartisans.com/patch/105/ -- patch is
already applied.
-- David
--
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de
* Re: [PATCH] Remove superfluous lambda.
2010-07-02 6:04 ` David Maus
@ 2010-07-02 13:21 ` Sebastian Rose
2010-07-02 13:23 ` John Wiegley
2010-07-06 7:44 ` Carsten Dominik
0 siblings, 2 replies; 21+ messages in thread
From: Sebastian Rose @ 2010-07-02 13:21 UTC (permalink / raw)
To: David Maus; +Cc: John Wiegley, emacs-orgmode Mode, Carsten Dominik
David Maus <dmaus@ictsoc.de> writes:
> Carsten Dominik wrote:
>>Hmm, the catcher did not see this. Why?
>
> He did: http://patchwork.newartisans.com/patch/105/ -- patch is
> already applied.
Where can I see that?
I read "Accepted" which is not "Applied", is it?
And there are "accepted" patches, that are not "applied".
E.g. http://patchwork.newartisans.com/patch/73/
The diff against the current head (8da31057eb0952889858c):
[-- Attachment #2: org-capture-file-templates.patch --]
[-- Type: text/x-diff, Size: 631 bytes --]
diff --git a/lisp/org-capture.el b/lisp/org-capture.el
index 8c887ce..f38a78c 100644
--- a/lisp/org-capture.el
+++ b/lisp/org-capture.el
@@ -924,6 +924,8 @@ Point will remain at the first line after the inserted text."
(org-capture-put :key (car entry) :description (nth 1 entry)
:target (nth 3 entry))
(let ((txt (nth 4 entry)) (type (or (nth 2 entry) 'entry)))
+ (when (file-exists-p txt)
+ (setq txt (org-file-contents txt)))
(when (or (not txt) (not (string-match "\\S-" txt)))
;; The template may be empty or omitted for special types.
;; Here we insert the default templates for such cases.
[-- Attachment #3: Type: text/plain, Size: 77 bytes --]
Is it possible to link to the commit a patch was applied?
Sebastian
* Re: [PATCH] Remove superfluous lambda.
2010-07-02 13:21 ` Sebastian Rose
@ 2010-07-02 13:23 ` John Wiegley
2010-07-02 13:49 ` Sebastian Rose
2010-07-06 7:44 ` Carsten Dominik
1 sibling, 1 reply; 21+ messages in thread
From: John Wiegley @ 2010-07-02 13:23 UTC (permalink / raw)
To: Sebastian Rose; +Cc: emacs-orgmode Mode, Carsten Dominik
On Jul 2, 2010, at 9:21 AM, Sebastian Rose wrote:
> I read "Accepted" which is not "Applied", is it?
Accepted = Applied.
> Is it possible to link to the commit a patch was applied?
After today, patches which are merged in using "pw" will note the resulting commit.
John
* Re: [PATCH] Remove superfluous lambda.
2010-07-02 13:21 ` Sebastian Rose
2010-07-02 13:23 ` John Wiegley
@ 2010-07-06 7:44 ` Carsten Dominik
1 sibling, 0 replies; 21+ messages in thread
From: Carsten Dominik @ 2010-07-06 7:44 UTC (permalink / raw)
To: Sebastian Rose; +Cc: emacs-orgmode Mode, John Wiegley
On Jul 2, 2010, at 3:21 PM, Sebastian Rose wrote:
> David Maus <dmaus@ictsoc.de> writes:
>> Carsten Dominik wrote:
>>> Hmm, the catcher did not see this. Why?
>>
>> He did: http://patchwork.newartisans.com/patch/105/ -- patch is
>> already applied.
>
>
> Where can I see that?
>
>
> I read "Accepted" which is not "Applied", is it?
>
>
> And there are "accepted" patches, that are not "applied".
>
> E.g. http://patchwork.newartisans.com/patch/73/
It is possible that I made a mistake here.
>
>
> The diff against the current head (8da31057eb0952889858c):
>
> diff --git a/lisp/org-capture.el b/lisp/org-capture.el
> index 8c887ce..f38a78c 100644
> --- a/lisp/org-capture.el
> +++ b/lisp/org-capture.el
> @@ -924,6 +924,8 @@ Point will remain at the first line after the
> inserted text."
> (org-capture-put :key (car entry) :description (nth 1 entry)
> :target (nth 3 entry))
> (let ((txt (nth 4 entry)) (type (or (nth 2 entry) 'entry)))
> + (when (file-exists-p txt)
> + (setq txt (org-file-contents txt)))
> (when (or (not txt) (not (string-match "\\S-" txt)))
> ;; The template may be empty or omitted for special types.
> ;; Here we insert the default templates for such cases.
I am rejecting this patch in the current form, because the ambiguity
between file name and template is not good.
I did check in a different patch, where the template may be a string, or
(file "/path-to-file")
or
(function function-to-make-template)
This should provide more stable and more flexible ways to do this.
- Carsten
>
>
>
> Is it possible to link to the commit a patch was applied?
>
>
>
>
> Sebastian
- Carsten
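The three template forms Carsten describes above (a plain string, a `(file ...)' form, and a `(function ...)' form) avoid the file-name-versus-template ambiguity because the form itself says how to obtain the text. A minimal sketch of such a dispatch might look like the following. `demo-resolve-template' is a hypothetical helper written for illustration and is not the code that was checked into org-capture.el.

```elisp
(defun demo-resolve-template (template)
  "Return the template text described by TEMPLATE.
TEMPLATE is a string, (file \"/path\"), or (function SYMBOL).
Hypothetical helper, not part of Org."
  (cond
   ;; A string is used as the template text itself.
   ((stringp template) template)
   ;; (file "/path-to-file"): read the template from that file.
   ((and (consp template) (eq (car template) 'file))
    (with-temp-buffer
      (insert-file-contents (expand-file-name (nth 1 template)))
      (buffer-string)))
   ;; (function function-to-make-template): call it to get the text.
   ((and (consp template) (eq (car template) 'function))
    (funcall (nth 1 template)))
   (t (error "Unsupported template form: %S" template))))
```

With this shape, a string that happens to name an existing file is still treated as template text, which is the ambiguity the rejected patch could not rule out.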
* [PATCH 1/4] Respect possible XML namespace of rss:item elements.
2010-06-19 11:29 ` zwz
2010-06-19 14:25 ` [PATCH 0/4] org-feed: Improve feed parsing David Maus
@ 2010-06-19 14:25 ` David Maus
2010-06-19 14:25 ` [PATCH 2/4] Ignore case of rss element names David Maus
` (2 subsequent siblings)
4 siblings, 0 replies; 21+ messages in thread
From: David Maus @ 2010-06-19 14:25 UTC (permalink / raw)
To: emacs-orgmode
* org-feed.el (org-feed-parse-rss-feed): Respect possible XML
namespace of rss:item elements.
---
lisp/org-feed.el | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 37b2327..c86ca90 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -583,7 +583,7 @@ containing the properties `:guid' and `:item-full-text'."
(with-current-buffer buffer
(widen)
(goto-char (point-min))
- (while (re-search-forward "<item>" nil t)
+ (while (re-search-forward "<item\\>.*?>" nil t)
(setq beg (point)
end (and (re-search-forward "</item>" nil t)
(match-beginning 0)))
--
1.7.1
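To illustrate what the new regexp in the patch above accepts: `\>' anchors at the end of the word "item", and `.*?>' then skips any attributes up to the closing bracket, so an opening tag that carries attributes is found while a longer element name is not. The sketch below is only a demonstration. `demo-count-matches' and the sample input are made up for this example, and the search runs in a temporary buffer with the default syntax table.

```elisp
(defun demo-count-matches (regexp xml)
  "Count matches of REGEXP in the string XML.
Hypothetical helper for trying out the regexps from the patch."
  (with-temp-buffer
    (insert xml)
    (goto-char (point-min))
    (let ((case-fold-search nil)
          (n 0))
      (while (re-search-forward regexp nil t)
        (setq n (1+ n)))
      n)))

;; Sample input: one plain item, one item with an attribute, and one
;; element whose name merely starts with "item".
(defvar demo-sample-feed
  (concat "<item>a</item>"
          "<item rdf:about=\"http://example.org/1\">b</item>"
          "<itemize>c</itemize>"))
```

Since `.' does not match a newline, an opening tag whose attributes span several lines would presumably still be missed by this pattern.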
* [PATCH 2/4] Ignore case of rss element names.
2010-06-19 11:29 ` zwz
2010-06-19 14:25 ` [PATCH 0/4] org-feed: Improve feed parsing David Maus
2010-06-19 14:25 ` [PATCH 1/4] Respect possible XML namespace of rss:item elements David Maus
@ 2010-06-19 14:25 ` David Maus
2010-06-19 14:25 ` [PATCH 3/4] Unescape protected entities defined in `xml-entity-alist' David Maus
2010-06-19 14:25 ` [PATCH 4/4] Unescape rss element content David Maus
4 siblings, 0 replies; 21+ messages in thread
From: David Maus @ 2010-06-19 14:25 UTC (permalink / raw)
To: emacs-orgmode
* org-feed.el (org-feed-parse-rss-feed): Ignore case of rss
element names.
---
lisp/org-feed.el | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index c86ca90..b0373e5 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -579,7 +579,8 @@ Assumes headers are indeed present!"
"Parse BUFFER for RSS feed entries.
Returns a list of entries, with each entry a property list,
containing the properties `:guid' and `:item-full-text'."
- (let (entries beg end item guid entry)
+ (let ((case-fold-search t)
+ entries beg end item guid entry)
(with-current-buffer buffer
(widen)
(goto-char (point-min))
--
1.7.1
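The effect of binding `case-fold-search' as in the patch above can be tried out in isolation. In this sketch the variable is bound inside a temporary buffer, and the same lower-case pattern is searched against upper-case element names. `demo-find-guid' and its regexp are assumptions made for illustration, not the code org-feed.el uses.

```elisp
(defun demo-find-guid (xml fold)
  "Return the text of the first guid element in XML, or nil.
FOLD is the value bound to `case-fold-search' during the search.
Hypothetical helper, not part of org-feed.el."
  (with-temp-buffer
    (insert xml)
    (goto-char (point-min))
    (let ((case-fold-search fold))
      (when (re-search-forward "<guid[^>]*>\\([^<]*\\)</guid>" nil t)
        (match-string 1)))))
```

For example, `(demo-find-guid "<GUID>abc</GUID>" t)' finds the element, while passing nil as the second argument does not.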
* [PATCH 3/4] Unescape protected entities defined in `xml-entity-alist'.
2010-06-19 11:29 ` zwz
` (2 preceding siblings ...)
2010-06-19 14:25 ` [PATCH 2/4] Ignore case of rss element names David Maus
@ 2010-06-19 14:25 ` David Maus
2010-06-19 14:25 ` [PATCH 4/4] Unescape rss element content David Maus
4 siblings, 0 replies; 21+ messages in thread
From: David Maus @ 2010-06-19 14:25 UTC (permalink / raw)
To: emacs-orgmode
* org-feed.el (org-feed-unescape): New function. Unescape
protected entities.
(org-feed-parse-atom-entry): Use function for atom:content
type text and html.
---
lisp/org-feed.el | 15 +++++++++++++--
1 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index b0373e5..2621008 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -267,6 +267,17 @@ have been saved."
(defvar org-feed-buffer "*Org feed*"
"The buffer used to retrieve a feed.")
+(defun org-feed-unescape (s)
+ "Unescape protected entities in S."
+ (let ((re (concat "&\\("
+ (mapconcat (lambda (e)
+ (car e)) xml-entity-alist "\\|")
+ "\\);")))
+ (while (string-match re s)
+ (setq s (replace-match
+ (cdr (assoc (match-string 1 s) xml-entity-alist)) nil nil s)))
+ s))
+
;;;###autoload
(defun org-feed-update-all ()
"Get inbox items from all feeds in `org-feed-alist'."
@@ -647,10 +658,10 @@ formatted as a string, not the original XML data."
(cond
((string= type "text")
;; We like plain text.
- (setq entry (plist-put entry :description (car (xml-node-children content)))))
+ (setq entry (plist-put entry :description (org-feed-unescape (car (xml-node-children content))))))
((string= type "html")
;; TODO: convert HTML to Org markup.
- (setq entry (plist-put entry :description (car (xml-node-children content)))))
+ (setq entry (plist-put entry :description (org-feed-unescape (car (xml-node-children content))))))
((string= type "xhtml")
;; TODO: convert XHTML to Org markup.
(setq entry (plist-put entry :description (prin1-to-string (xml-node-children content)))))
--
1.7.1
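One property of the loop in `org-feed-unescape' above is worth noting: each iteration calls `string-match' from the start of the string again, so text that was just produced by a replacement can be matched a second time. With an input such as "&amp;lt;" this yields "<" rather than "&lt;". The sketch below contrasts a loop in the style of the patch with a single-pass variant built on `replace-regexp-in-string'. All names are hypothetical, `demo-entity-alist' stands in for `xml-entity-alist' (whose contents depend on the Emacs version), and `case-fold-search' is bound to nil here, which the patch does not do.

```elisp
;; Hypothetical stand-in for `xml-entity-alist'; not the real variable.
(defvar demo-entity-alist
  '(("lt" . "<") ("gt" . ">") ("apos" . "'") ("quot" . "\"") ("amp" . "&")))

(defun demo-entity-regexp ()
  "Return a regexp matching any entity named in `demo-entity-alist'."
  (concat "&\\(" (mapconcat #'car demo-entity-alist "\\|") "\\);"))

(defun demo-unescape-loop (s)
  "Unescape entities in S, rescanning from the start after each replacement."
  (let ((case-fold-search nil)
        (re (demo-entity-regexp)))
    (while (string-match re s)
      (setq s (replace-match
               (cdr (assoc (match-string 1 s) demo-entity-alist))
               nil nil s)))
    s))

(defun demo-unescape-once (s)
  "Unescape entities in S in a single left-to-right pass."
  (let ((case-fold-search nil))
    (replace-regexp-in-string
     (demo-entity-regexp)
     ;; The match data here refers to M, the matched substring.
     (lambda (m) (cdr (assoc (match-string 1 m) demo-entity-alist)))
     s t t)))
```

Whether the double unescaping matters in practice depends on the feeds involved; the single-pass form is just one way to rule it out.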
* [PATCH 4/4] Unescape rss element content.
2010-06-19 11:29 ` zwz
` (3 preceding siblings ...)
2010-06-19 14:25 ` [PATCH 3/4] Unescape protected entities defined in `xml-entity-alist' David Maus
@ 2010-06-19 14:25 ` David Maus
2010-06-19 14:28 ` [PATCH] Declare variable `xml-entity-alist' for byte compiler David Maus
4 siblings, 1 reply; 21+ messages in thread
From: David Maus @ 2010-06-19 14:25 UTC (permalink / raw)
To: emacs-orgmode
* org-feed.el (org-feed-parse-rss-entry): Unescape rss element
content.
---
lisp/org-feed.el | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 2621008..e7fd0f2 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -617,7 +617,7 @@ containing the properties `:guid' and `:item-full-text'."
nil t)
(setq entry (plist-put entry
(intern (concat ":" (match-string 1)))
- (match-string 2))))
+ (org-feed-unescape (match-string 2)))))
(goto-char (point-min))
(unless (re-search-forward "isPermaLink[ \t]*=[ \t]*\"false\"" nil t)
(setq entry (plist-put entry :guid-permalink t))))
@@ -650,8 +650,8 @@ formatted as a string, not the original XML data."
'href)))
;; Add <title/> as :title.
(setq entry (plist-put entry :title
- (car (xml-node-children
- (car (xml-get-children xml 'title))))))
+ (org-feed-unescape (car (xml-node-children
+ (car (xml-get-children xml 'title)))))))
(let* ((content (car (xml-get-children xml 'content)))
(type (xml-get-attribute-or-nil content 'type)))
(when content
--
1.7.1