From: Tim Landscheidt <tim@tim-landscheidt.de>
To: Stefan Huchler <stefan.huchler@mail.de>
Cc: help-gnu-emacs@gnu.org
Subject: Re: elisps dom library doesn't work as I expect
Date: Wed, 10 May 2023 20:18:40 +0000
Message-ID: <873544gctb.fsf@vagabond.tim-landscheidt.de>
In-Reply-To: <87mt2chma4.fsf@mail.de> (Stefan Huchler's message of "Wed, 10 May 2023 05:56:35 +0200")

Stefan Huchler <stefan.huchler@mail.de> wrote:

>> dom-by-tag returns a list of DOM elements; however,
>> dom-elements expects a single DOM element as its second ar-
>> gument.  So you need to iterate over the list of DOM ele-
>> ments returned by dom-by-tag and call dom-elements on each,
>> or use dom-search, etc.

> Interesting, I find the documentation of dom-elements confusing:
>> Find elements matching MATCH
> What are the "elements" then? Attributes, not tags/DOM entries? But yes,
> that would be a bug report about the documentation; also, that
> dom-elements is not even listed in the GNU documentation seems strange
> to me.

"Elements" in the context of dom-elements means the node
passed as DOM and all nodes below it (AFAICT); the ones
whose ATTRIBUTE matches the regexp MATCH are returned.
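
A minimal sketch of what dom-elements returns, assuming a
small hand-written DOM instead of a parsed web page.  The
names my-demo-dom and my-demo-note-ids are made up for this
illustration; only the dom-* calls come from dom.el:

```elisp
(require 'dom)

;; A hand-written DOM in the list form used by dom.el:
;; (TAG ((ATTRIBUTE . "value") ...) CHILD ...).
;; Hypothetical data, for illustration only.
(defvar my-demo-dom
  '(div ((class . "outer"))
        (p ((id . "a") (class . "note")) "first")
        (section ((id . "s") (class . "inner"))
                 (p ((id . "b") (class . "note important")) "second"))))

(defun my-demo-note-ids ()
  "Return the ids of all nodes in `my-demo-dom' whose class matches \"note\"."
  ;; MATCH is a regexp that is tested against the value of ATTRIBUTE.
  (mapcar (lambda (node) (dom-attr node 'id))
          (dom-elements my-demo-dom 'class "note")))

(my-demo-note-ids) ; → ("a" "b")
```

The nested paragraph is found as well, so the search is not
limited to the direct children of the node.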

> But maybe I am just not good enough in xml lingo.

That's not really a problem here, as it is not /the/ problem
:-); the dom-* functions are only loosely related to XML or
to DOM concepts in JavaScript & Co., so one has to refer to
the Emacs "model".  As almost everything is a list in Emacs,
one does not get a meaningful error when using the DOM
functions incorrectly; instead, the code just does not work.
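
To make the list-versus-node distinction concrete, here is a
small sketch, again on hand-written data (my-demo-body and
my-demo-hrefs are hypothetical names).  It treats the result
of dom-by-tag as a list of nodes and looks at each node in
turn:

```elisp
(require 'dom)

;; Hypothetical DOM with two links, one of them nested in a paragraph.
(defvar my-demo-body
  '(body nil
         (a ((href . "https://example.org/one")) "one")
         (p nil
            (a ((href . "https://example.org/two")) "two"))))

(defun my-demo-hrefs ()
  "Return the href attribute of every <a> node in `my-demo-body'."
  ;; dom-by-tag returns a LIST of nodes, so map over it rather
  ;; than passing the whole list where a single node is expected.
  (mapcar (lambda (node) (dom-attr node 'href))
          (dom-by-tag my-demo-body 'a)))

(my-demo-hrefs) ; → ("https://example.org/one" "https://example.org/two")
```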

> Could you explain a bit or show an example of the dom-search function or
> explain its parameters? I have no idea what "predicate" means in the
> docstring and in what format it's expected. Is "predicate" a known
> term for something specific?

A predicate in Emacs Lisp is typically a (possibly
anonymous) function that looks at something and then returns
non-nil (often t) for some values and nil for others.  So
for your use case, you could write something à la:

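| (require 'dom)                 ; dom-search, dom-tag, dom-attr
| (let ((dom (with-temp-buffer
|              ;; Fetch the page and parse it into a DOM tree.
|              (url-insert-file-contents
|               "https://www.ebay.com/itm/185887279856")
|              (libxml-parse-html-region (point-min) (point-max)))))
|   ;; Take the first <meta itemprop="name"> and return its content.
|   (dom-attr (car
|              (dom-search
|               dom
|               (lambda (d)
|                 (and (eq (dom-tag d) 'meta)
|                      (equal (dom-attr d 'itemprop) "name")))))
|             'content))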

This will iterate over all DOM elements in the document,
return those that have a tag "meta" and an attribute
"itemprop" with the value "name", take the first (and
probably only) one, and return the value of this element's
"content" attribute.
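
If you want to try the predicate without network access, a
sketch along the following lines should do.  The DOM is
hand-written and my-demo-page and my-demo-item-name are
made-up names; the real page will of course look different:

```elisp
(require 'dom)

;; Hypothetical stand-in for the parsed page.
(defvar my-demo-page
  '(html nil
         (head nil
               (meta ((itemprop . "name") (content . "Example item")))
               (meta ((itemprop . "price") (content . "9.99"))))
         (body nil (p nil "text"))))

(defun my-demo-item-name (dom)
  "Return the content of the first <meta itemprop=\"name\"> node in DOM."
  (dom-attr (car (dom-search
                  dom
                  ;; The predicate is called with one node at a time
                  ;; and returns non-nil for the nodes to keep.
                  (lambda (d)
                    (and (eq (dom-tag d) 'meta)
                         (equal (dom-attr d 'itemprop) "name")))))
            'content))

(my-demo-item-name my-demo-page) ; → "Example item"
```

If no node satisfies the predicate, dom-search returns nil
and the function returns nil as well rather than signalling
an error.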

Tim


