unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#34685: 26.1; function nnrss-get-namespace-prefix always returns nil
       [not found] <CAF6JJ+AzMnvHrh8bPnjyoK+TyTaO3AWO3ZBLAfeTPTUqVWZF1w@mail.gmail.com>
@ 2019-04-05  0:50 ` Noam Postavsky
  2019-04-12 10:21 ` Noam Postavsky
  1 sibling, 0 replies; 7+ messages in thread
From: Noam Postavsky @ 2019-04-05  0:50 UTC (permalink / raw)
  To: wenbushi; +Cc: 34685

tags 34685 + moreinfo
quit

wenbushi <wenbushi@gmail.com> writes:

> The function "nnrss-get-namespace-prefix" in gnus/nnrss.el always
> returns nil, which causes the contents in "<content:encoded>" tag of
> an RSS XML not showing in the gnus article buffer.
>
> Here is a fix:
>
> --- nnrss.el    2019-02-28 20:02:29.224675750 +0800
> +++ nnrss-fixed.el      2019-02-28 20:02:04.534267796 +0800
> @@ -1031,7 +1031,7 @@
>    "Given EL (containing a parsed element) and URI (containing a string
>  that gives the URI for which you want to retrieve the namespace
>  prefix), return the prefix."
> -  (let* ((prefix (car (rassoc uri (cadar el))))
> +  (let* ((prefix (car (rassoc uri (cadar (nthcdr 2 (car el))))))

> the argument "el" in the function is a list of the parsed XML, like(some
> fields are ignored)
>
> ((rss ((version . "2.0") (xmlns:atom . "http://www.w3.org/2005/Atom"))
>       (channel ((xmlns:content . "http://purl.org/rss/1.0/modules/content/"))
>                (title nil "RSS title")
>                (item nil
>                      (title nil "article title")
>                      (content:encoded nil "article content")))))
>
> The function "nnrss-get-namespace-prefix" should extract tag
> "xmlns:content". But it only returns nil because "(cadar el)" matches
> nothing.

That's only due to the particular encoding of your RSS feed though,
isn't it?  I believe xmlns prefixes can technically go on any element in
a document; I expect the current code works for some feeds, and your fix
would break things for them.

We should gather some test cases to be able to fix this properly.





^ permalink raw reply	[flat|nested] 7+ messages in thread

* bug#34685: 26.1; function nnrss-get-namespace-prefix always returns nil
       [not found] <CAF6JJ+AzMnvHrh8bPnjyoK+TyTaO3AWO3ZBLAfeTPTUqVWZF1w@mail.gmail.com>
  2019-04-05  0:50 ` bug#34685: 26.1; function nnrss-get-namespace-prefix always returns nil Noam Postavsky
@ 2019-04-12 10:21 ` Noam Postavsky
  2019-04-12 11:55   ` Noam Postavsky
  1 sibling, 1 reply; 7+ messages in thread
From: Noam Postavsky @ 2019-04-12 10:21 UTC (permalink / raw)
  To: 34685; +Cc: wenbushi

[-- Attachment #1: Type: text/plain, Size: 82 bytes --]

[forwarding to list, please use "Reply All" to keep 34685@debbugs.gnu.org on Cc]


[-- Attachment #2: Type: message/rfc822, Size: 1117 bytes --]

[-- Attachment #2.1.1: Type: text/plain, Size: 519 bytes --]

Yeah. You're right. Sorry the inconvenience of this mistake.

I've found the tag of my rss feed wasn't in the right place after several
days and forgot to close this issue. After that I didn't check the format
requirements of rss xml carefully though and just made a patch to fix it.
The original codes works well for all other feeds except the outlier I
tested at first(so unlucky!). Maybe it's quite rare for an rss xml to put
its tags in the wrong places.

Thanks for the reply. I should be more careful at testing.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* bug#34685: 26.1; function nnrss-get-namespace-prefix always returns nil
  2019-04-12 10:21 ` Noam Postavsky
@ 2019-04-12 11:55   ` Noam Postavsky
  2019-09-26 22:34     ` Lars Ingebrigtsen
  0 siblings, 1 reply; 7+ messages in thread
From: Noam Postavsky @ 2019-04-12 11:55 UTC (permalink / raw)
  To: 34685; +Cc: wenbushi

severity 34685 minor
quit

Noam Postavsky <npostavs@gmail.com> writes:

> From: wenbushi <wenbushi@gmail.com>

> I've found the tag of my rss feed wasn't in the right place after several
> days and forgot to close this issue. After that I didn't check the format
> requirements of rss xml carefully though and just made a patch to fix it.
> The original codes works well for all other feeds except the outlier I
> tested at first(so unlucky!). Maybe it's quite rare for an rss xml to put
> its tags in the wrong places.

Well, if it's only the one feed, I guess this is not too urgent a bug,
but I don't think the feed is technically wrong.  As I said before, XML
allows xmlns attributes anywhere in the document, so nnrss should be
fixed to handle it.  But we need to have test cases both of the
"outlier" case and "typical" cases to get it right.






^ permalink raw reply	[flat|nested] 7+ messages in thread

* bug#34685: 26.1; function nnrss-get-namespace-prefix always returns nil
  2019-04-12 11:55   ` Noam Postavsky
@ 2019-09-26 22:34     ` Lars Ingebrigtsen
       [not found]       ` <877dgx76x0.fsf@arrian.i-did-not-set--mail-host-address--so-tickle-me>
  0 siblings, 1 reply; 7+ messages in thread
From: Lars Ingebrigtsen @ 2019-09-26 22:34 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: 34685, wenbushi

Noam Postavsky <npostavs@gmail.com> writes:

> Well, if it's only the one feed, I guess this is not too urgent a bug,
> but I don't think the feed is technically wrong.  As I said before, XML
> allows xmlns attributes anywhere in the document, so nnrss should be
> fixed to handle it.  But we need to have test cases both of the
> "outlier" case and "typical" cases to get it right.

I've now fixed this more generally by searching through the DOM for the
URI.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 7+ messages in thread

* bug#34685: 26.1; function nnrss-get-namespace-prefix always returns nil
       [not found]       ` <877dgx76x0.fsf@arrian.i-did-not-set--mail-host-address--so-tickle-me>
@ 2021-08-09 12:13         ` Lars Ingebrigtsen
  2021-08-09 16:30           ` Benjamin Riefenstahl
  0 siblings, 1 reply; 7+ messages in thread
From: Lars Ingebrigtsen @ 2021-08-09 12:13 UTC (permalink / raw)
  To: Benjamin Riefenstahl; +Cc: 34685, wenbushi, Noam Postavsky

Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net> writes:

> LS09LT0tPQpDb250ZW50LVR5cGU6IHRleHQvcGxhaW4KCkhpIExhcnMsCgpMYXJzIEluZ2Vicmln
> dHNlbiB3cml0ZXM6Cj4gSSd2ZSBub3cgZml4ZWQgdGhpcyBtb3JlIGdlbmVyYWxseSBieSBzZWFy

Your message arrived kinda destroyed, but I used my decoder ring:

> This doesn't seem to work.  I encountered the problem just now on Emacs
> master with https://sql-ledger.com/userforum/index.php?mode=rss .  The
> attached patch fixes it for me.

[...]

> -  (let* ((prefix (car (rassoc uri (dom-attributes
> +  (let* ((prefix (car (or (rassoc uri (dom-attributes el))
> +                          (rassoc uri
> +                                  (dom-attributes

[...]

> +(defconst test-nnrss-xml
> +  '((rss
> +     ((version . "2.0")
> +      (xmlns:dc . "http://purl.org/dc/elements/1.1/"))
> +     (channel
> +      ((xmlns:content . "http://purl.org/rss/1.0/modules/content/"))))))

The problem here is that this test XML isn't a valid DOM -- it's a
list of DOMs.  A valid DOM would be

(defconst test-nnrss-xml
  '(rss
    ((version . "2.0")
     (xmlns:dc . "http://purl.org/dc/elements/1.1/"))
    (channel
     ((xmlns:content . "http://purl.org/rss/1.0/modules/content/")))))

and in this case the test works fine.  So if EL is a list of nodes, then
it's the caller of nnrss-get-namespace-prefix here that should be
adjusted to pass in the car of the list instead?  Could you try that and
see whether that works?  (And send a new patch; I've applied your test
(adjusted).)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 7+ messages in thread

* bug#34685: 26.1; function nnrss-get-namespace-prefix always returns nil
  2021-08-09 12:13         ` Lars Ingebrigtsen
@ 2021-08-09 16:30           ` Benjamin Riefenstahl
  2021-08-10 13:48             ` Lars Ingebrigtsen
  0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Riefenstahl @ 2021-08-09 16:30 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 34685, wenbushi, Noam Postavsky

[-- Attachment #1: Type: text/plain, Size: 2042 bytes --]

Hi Lars,

> Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net> writes:
>> LS09LT0tPQpDb250ZW50LVR5cGU6IHRleHQvcGxhaW4KCkhpIExhcnMsCgpMYXJzIEluZ2Vicmln
>> dHNlbiB3cml0ZXM6Cj4gSSd2ZSBub3cgZml4ZWQgdGhpcyBtb3JlIGdlbmVyYWxseSBieSBzZWFy

Lars Ingebrigtsen writes:
> Your message arrived kinda destroyed, but I used my decoder ring:

Sorry about that.  My laptop does not have email setup correctly, that
made things complicated in this case.

>> +(defconst test-nnrss-xml
>> +  '((rss
>> +     ((version . "2.0")
>> +      (xmlns:dc . "http://purl.org/dc/elements/1.1/"))
>> +     (channel
>> +      ((xmlns:content . "http://purl.org/rss/1.0/modules/content/"))))))

> The problem here is that this test XML isn't a valid DOM -- it's a
> list of DOMs.  A valid DOM would be

I originally just copied this from the output of the debugger, so this
is what nnrss-get-namespace-prefix gets passed.  I just checked, the
structure is the result of xml-parse-region, is that supposed to create
something that matches the use of dom-search?  When I try it on some
test XML it does create this same structure again.

Test code:

    (with-temp-buffer
      (insert "<rss xmlns:content='http://purl.org/rss/1.0/modules/content/'>
                 <channel />
               </rss>")
      (xml-parse-region (point-min) (point-max)))

> So if EL is a list of nodes, then it's the caller of
> nnrss-get-namespace-prefix here that should be adjusted to pass in the
> car of the list instead?  Could you try that and see whether that
> works?

What is passed to nnrss-get-namespace-prefix is also used in other
places and the calling function is a bit complicated so I'd rather not
change it.  I attach another change in nnrss-get-namespace-prefix
function instead.

I note that your previous work on this bug is the first time that
nnrss.el uses functions from dom.el.  It maybe should use it more, but
that is a project for another day, I would say.

> (And send a new patch; I've applied your test (adjusted).)

See below.

Thanks for your time,
benny


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Fix-nnrss-get-namespace-prefix-bug-34685.patch --]
[-- Type: text/x-diff, Size: 2114 bytes --]

From f7f9bc841aa9a4b6fd5a3d8dcffd259f01b10591 Mon Sep 17 00:00:00 2001
From: Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net>
Date: Mon, 9 Aug 2021 18:06:31 +0200
Subject: [PATCH] Fix nnrss-get-namespace-prefix (bug#34685)

nnrss-get-namespace-prefix gets passed the result of xml-parse-region
and that is not quite what dom-search expects.

* lisp/gnus/nnrss.el (nnrss-get-namespace-prefix): Use the car of
parameter el to match what dom-search expects.
* test/lisp/gnus/nnrss-tests.el (test-nnrss-xml): Adjust to what
xml-parse-region produces.
---
 lisp/gnus/nnrss.el            |  7 ++++---
 test/lisp/gnus/nnrss-tests.el | 10 +++++-----
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/lisp/gnus/nnrss.el b/lisp/gnus/nnrss.el
index 0f12ee0e9d..97c9f18a60 100644
--- a/lisp/gnus/nnrss.el
+++ b/lisp/gnus/nnrss.el
@@ -954,9 +954,10 @@ nnrss-get-namespace-prefix
   "Given EL (containing a parsed element) and URI (containing a string
 that gives the URI for which you want to retrieve the namespace
 prefix), return the prefix."
-  (let* ((prefix (car (rassoc uri (dom-attributes
-				   (dom-search
-				    el
+  (let* ((dom (car el))
+         (prefix (car (rassoc uri (dom-attributes
+			           (dom-search
+				    dom
 				    (lambda (node)
 				      (rassoc uri (dom-attributes node))))))))
 	 (nslist (if prefix
diff --git a/test/lisp/gnus/nnrss-tests.el b/test/lisp/gnus/nnrss-tests.el
index 01b374a2f6..92b7dacf18 100644
--- a/test/lisp/gnus/nnrss-tests.el
+++ b/test/lisp/gnus/nnrss-tests.el
@@ -27,11 +27,11 @@ test-nnrss-normalize
                  "Fri, 17 Sep 2004 05:09:49 +0000")))
 
 (defconst test-nnrss-xml
-  '(rss
-    ((version . "2.0")
-     (xmlns:dc . "http://purl.org/dc/elements/1.1/"))
-    (channel
-     ((xmlns:content . "http://purl.org/rss/1.0/modules/content/")))))
+  '((rss
+     ((version . "2.0")
+      (xmlns:dc . "http://purl.org/dc/elements/1.1/"))
+     (channel
+      ((xmlns:content . "http://purl.org/rss/1.0/modules/content/"))))))
 
 (ert-deftest test-nnrss-namespace-top ()
   (should (equal (nnrss-get-namespace-prefix
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* bug#34685: 26.1; function nnrss-get-namespace-prefix always returns nil
  2021-08-09 16:30           ` Benjamin Riefenstahl
@ 2021-08-10 13:48             ` Lars Ingebrigtsen
  0 siblings, 0 replies; 7+ messages in thread
From: Lars Ingebrigtsen @ 2021-08-10 13:48 UTC (permalink / raw)
  To: Benjamin Riefenstahl; +Cc: 34685, wenbushi, Noam Postavsky

Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net> writes:

> I originally just copied this from the output of the debugger, so this
> is what nnrss-get-namespace-prefix gets passed.  I just checked, the
> structure is the result of xml-parse-region, is that supposed to create
> something that matches the use of dom-search?  When I try it on some
> test XML it does create this same structure again.
>
> Test code:
>
>     (with-temp-buffer
>       (insert "<rss xmlns:content='http://purl.org/rss/1.0/modules/content/'>
>                  <channel />
>                </rss>")
>       (xml-parse-region (point-min) (point-max)))

Ah, I see.  xml-parse-region returns a list of dom objects, while
libxml-parse-xml-region just returns a single object, which is probably
where the confusion here originates from.  (dom.el was written after
libxml2 support was added, and written with the output from that in
mind.)

>> (And send a new patch; I've applied your test (adjusted).)
>
> See below.

Thanks; applied to Emacs 28.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-08-10 13:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAF6JJ+AzMnvHrh8bPnjyoK+TyTaO3AWO3ZBLAfeTPTUqVWZF1w@mail.gmail.com>
2019-04-05  0:50 ` bug#34685: 26.1; function nnrss-get-namespace-prefix always returns nil Noam Postavsky
2019-04-12 10:21 ` Noam Postavsky
2019-04-12 11:55   ` Noam Postavsky
2019-09-26 22:34     ` Lars Ingebrigtsen
     [not found]       ` <877dgx76x0.fsf@arrian.i-did-not-set--mail-host-address--so-tickle-me>
2021-08-09 12:13         ` Lars Ingebrigtsen
2021-08-09 16:30           ` Benjamin Riefenstahl
2021-08-10 13:48             ` Lars Ingebrigtsen

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).