From: John Kitchin <jkitchin@andrew.cmu.edu>
To: "emacs-orgmode@gnu.org" <emacs-orgmode@gnu.org>
Subject: html to org-mode
Date: Fri, 3 Jan 2014 21:40:14 -0500
Message-ID: <CAJ51ETrsyuAwpYOvJ2yqYsVirJdX4qcmkmaRVROj-mm4C3LF_g@mail.gmail.com>
Hi everyone,
I was playing around with org-rss today, and it is pretty cool. I would
like to customize how the subheading bodies look, though: primarily to
unescape HTML entities like &lt;, strip all the HTML tags, convert
<a ...> to org-mode links, download <img ...> so the images can be
displayed, etc.
For example, the body of an RSS entry looks like:
<title>Philip Herron: Cython Book</title> <guid>
http://redbrain.co.uk/?p=147</guid> <link>
http://redbrain.co.uk/cython-book/</link> <description><p>Hey all i
thought i should really share that i actually wrote a book on Cython. The
book has detailed examples and even shows you how you can extend native
C/C++ applications in python by doing it for Tmux. <a href="
http://bit.ly/195ahQs">http://bit.ly/195ahQs</a></p> <p><a href="
http://redbrain.co.uk/wp-content/uploads/2013/12/photo.jpg"><img
class="aligncenter size-full wp-image-148" alt="photo" src="
http://redbrain.co.uk/wp-content/uploads/2013/12/photo.jpg" width="640"
height="480" /></a>The code can be found: <a href="
https://github.com/redbrain/cython-book">
https://github.com/redbrain/cython-book</a></p></description>
<pubDate>Tue, 10 Dec 2013 14:45:08 +0000</pubDate>
I would like this simplified to something like:
Philip Herron: Cython Book
http://redbrain.co.uk/?p=147
http://redbrain.co.uk/cython-book/
Hey all i thought i should really share that i actually wrote a book on
Cython. The book has detailed examples and even shows you how you can
extend native C/C++ applications in python by doing it for Tmux.
http://bit.ly/195ahQs
[[feed-images/photo.jpg]]
The code can be found: https://github.com/redbrain/cython-book
Basically, I want to get the HTML as close to Org as is reasonable. I found
a way to get an HTML parse tree, (libxml-parse-html-region start end), but
I can't figure out how to convert that into the text I want.
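One possible starting point, as a minimal sketch rather than a tested
solution: a recursive walk over the parse tree. It assumes the tree has the
(TAG ATTRIBUTES . CHILDREN) shape documented for libxml-parse-html-region,
with text nodes as plain strings. The function name my/html-tree-to-org and
the feed-images/ directory are hypothetical. The sketch handles only <p>,
<a> and <img>, and it only rewrites the image link; it does not download
anything.

```emacs-lisp
;; Sketch only. Assumes NODE is either a string (a text node) or a list
;; of the form (TAG ATTRIBUTES . CHILDREN), where ATTRIBUTES is an alist
;; such as ((href . "http://...")).
;; `my/html-tree-to-org' and "feed-images/" are hypothetical names.
(defun my/html-tree-to-org (node)
  "Return NODE, a parsed HTML tree or a text string, as Org-style text."
  (cond
   ((stringp node) node)                ; text node: keep as-is
   ((not (consp node)) "")              ; anything unexpected: drop it
   (t
    (let* ((tag (car node))
           (attrs (cadr node))
           ;; Convert the children first, then wrap them per tag.
           (body (mapconcat #'my/html-tree-to-org (cddr node) "")))
      (cond
       ((eq tag 'comment) "")           ; drop HTML comments
       ((eq tag 'p) (concat body "\n\n"))
       ((eq tag 'img)
        ;; Point at a local copy; fetching the file is left out here.
        (let ((src (cdr (assq 'src attrs))))
          (if src
              (format "[[feed-images/%s]]" (file-name-nondirectory src))
            "")))
       ((eq tag 'a)
        (let ((href (cdr (assq 'href attrs))))
          (cond
           ((null href) body)
           ;; An <img> wrapped in <a> is already an Org link.
           ((string-prefix-p "[[" body) body)
           ;; Link text equal to the URL: emit the bare URL.
           ((member body (list "" href)) href)
           (t (format "[[%s][%s]]" href body)))))
       ;; Any other tag: keep the text, discard the markup.
       (t body))))))
```

For example, evaluating

(my/html-tree-to-org
 '(p nil "The code can be found: "
     (a ((href . "https://github.com/redbrain/cython-book"))
        "https://github.com/redbrain/cython-book")))

should return the sentence with a bare URL, followed by a blank line.
Entity unescaping is not handled here, on the assumption that the parser
has already decoded entities in text nodes.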
Has anyone done anything like this?
John
-----------------------------------
John Kitchin
Associate Professor
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
http://kitchingroup.cheme.cmu.edu