emacs-orgmode@gnu.org archives
* capture, attach, link files from web
@ 2013-10-07 11:49 Myles English
  2013-10-07 13:08 ` Oleh
  2013-10-07 15:55 ` Eric Abrahamsen
  0 siblings, 2 replies; 9+ messages in thread
From: Myles English @ 2013-10-07 11:49 UTC (permalink / raw)
  To: emacs-orgmode@gnu.org


Hello,

Just thought I would share something I find useful.  What the code below
does is:

1) prompts for a link to a file on the internet
2) downloads the file
3) attaches the file to the current subtree
4) inserts at the current point a link to the attachment

This is useful if (e.g.) you are scouring Google images for ideas and
want to save lots of image files.

Requirements: wget on your PATH, and $TMPDIR set in your environment.
TODO: integrate properly with capture template

#+here_is_some elisp
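;; Make [[att:FILE]] links expand to FILE in the heading's attachment directory.
(setq org-link-abbrev-alist '(("att" . org-attach-expand-link)))

(defun my-attach-and-link-web-file (lnk)
  "Download the file at LNK, attach it to the current heading, insert a link."
  (interactive "*sAttach and link to url: \n")
  ;; Fall back to Emacs's own temporary directory when $TMPDIR is unset.
  (let* ((tmpdir (expand-file-name
                  (or (getenv "TMPDIR") temporary-file-directory)))
         (fname (file-name-nondirectory lnk))
         (target (expand-file-name fname tmpdir)))
    ;; Pass the names as arguments so a "%" in the URL is not read as a format spec.
    (message "Downloading %s to %s" lnk target)
    ;; -P sets the download directory, -d turns on wget's debug output.
    (call-process "wget" nil '("*Messages*" t) nil "-P" tmpdir "-d" lnk)
    ;; Move the downloaded file into the heading's attachment directory.
    (org-attach-attach target nil 'mv)
    (insert (concat "[[att:" fname "]]"))))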

(define-key global-map "\C-cs" 'my-attach-and-link-web-file)
#+that_was_elisp

Myles


* Re: capture, attach, link files from web
  2013-10-07 11:49 capture, attach, link files from web Myles English
@ 2013-10-07 13:08 ` Oleh
  2013-10-08 10:43   ` Myles English
  2013-10-07 15:55 ` Eric Abrahamsen
  1 sibling, 1 reply; 9+ messages in thread
From: Oleh @ 2013-10-07 13:08 UTC (permalink / raw)
  To: Myles English; +Cc: emacs-orgmode@gnu.org

Hi Myles,

I counter your tip with my own on capturing pdfs.
Maybe you'll find some of this stuff useful for your case.

My capture template captures a pdf file that I have to read.
It works for:
1. A pdf file in doc-view mode.
2. Any dired buffer with point on a pdf file.

What it does:
1. Create a new TODO item under gtd.org/Projects/Scientific Articles
2. The item title is "Read blah-blah by Foo", if the pdf name has
    proper format, otherwise it's just "Read blah-blah".
3. The pdf is attached to the TODO item.
4. A note is added with the capture time.

Here's the code:

(setq org.d "~/Dropbox/org/")
(require 'org-attach)
(require 'org-capture)
(defun org-process-current-pdf ()
  "Attach the PDF being captured and return a title for the TODO item.
Works from a buffer visiting a PDF or from dired with point on a PDF."
  (let ((filename (org-capture-get :original-file)))
    ;; In dired, :original-file is the directory, so take the file at point.
    (when (file-directory-p filename)
      (with-current-buffer (org-capture-get :original-buffer)
        (setq filename (dired-get-filename))))
    (when (string= (file-name-extension filename) "pdf")
      (let ((org-attach-directory (concat org.d "data/"))
            (name (file-name-sans-extension
                   (file-name-nondirectory filename))))
        (org-attach-attach filename nil 'cp)
        ;; A name like "[Authors] Title(Year)" becomes "\"Title\" by Authors".
        (if (string-match "\\[\\(.*\\)\\] \\(.*\\)(\\(.*\\))" name)
            (format "\"%s\" by %s"
                    (match-string 2 name)
                    (match-string 1 name))
          name)))))

(add-to-list 'org-capture-templates
             '("p" "Pdf article" entry
               (file+olp (concat org.d "gtd.org") "Projects" "Scientific Articles")
               "* TODO Read %(org-process-current-pdf)\nAdded: %U %i\n  %?\n"))

regards,
Oleh

* Re: capture, attach, link files from web
  2013-10-07 11:49 capture, attach, link files from web Myles English
  2013-10-07 13:08 ` Oleh
@ 2013-10-07 15:55 ` Eric Abrahamsen
  2013-10-07 17:49   ` Myles English
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Abrahamsen @ 2013-10-07 15:55 UTC (permalink / raw)
  To: emacs-orgmode

Myles English <mylesenglish@gmail.com> writes:

> Hello,
>
> Just thought I would share something I find useful.  What the code below
> does is:
>
> 1) prompts for a link to a file on the internet
> 2) downloads the file
> 3) attaches the file to the current subtree
> 4) inserts at the current point a link to the attachment
>
> This is useful if (e.g.) you are scouring Google images for ideas and
> want to save lots of image files.

Interesting! I've done a fair amount of this, and wanted this exact sort
of function, and have been too lazy to implement it myself.

A couple of thoughts:

Rather than sending downloaded files to $TMPDIR, it might be nice to
have them just use whatever dir org-attach would have used. I use
org-attach from time to time, and notice that everything ends up under
~/org/data/. I haven't actually investigated why that happens (I've got
org-directory set to ~/org/), mostly because it strikes me as a fine
default. When we've got that directory, setting a different TMPDIR seems
unnecessary. I'll admit part of my hesitation comes from the fact that
"TMPDIR" sounds like it's going to get automatically deleted at some
point.

I've often thought it would be nice to link to images in an org file
with http: links, then at some arbitrary point in time call a
hypothetical org-localize-external-resources command. That command would
wget all the external resources, put them somewhere local, and switch
the links to the file: type. Just a thought.

Regardless, thanks for posting this. It's fun to see other people
thinking in familiar directions.

E


* Re: capture, attach, link files from web
  2013-10-07 15:55 ` Eric Abrahamsen
@ 2013-10-07 17:49   ` Myles English
  2013-10-08  1:39     ` Eric Abrahamsen
  0 siblings, 1 reply; 9+ messages in thread
From: Myles English @ 2013-10-07 17:49 UTC (permalink / raw)
  To: Eric Abrahamsen; +Cc: emacs-orgmode


Hi Eric,

I am glad you like it.

eric@ericabrahamsen.net writes:

[..]

> Rather than sending downloaded files to $TMPDIR, it might be nice to
> have them just use whatever dir org-attach would have used. I use
> org-attach from time to time, and notice that everything ends up under
> ~/org/data/. I haven't actually investigated why that happens (I've got
> org-directory set to ~/org/), mostly because it strikes me as a fine
> default. When we've got that directory, setting a different TMPDIR seems
> unnecessary. I'll admit part of my hesitation comes from the fact that
> "TMPDIR" sounds like it's going to get automatically deleted at some
> point.

The $TMPDIR was just an environment variable I had set already, so I
assumed it was semi-standard (doesn't everyone have a $TMPDIR?).  When
my function calls:

(org-attach-attach (concat tmpdir "/" fname) nil 'mv)

it moves the file from $TMPDIR to the attachment directory, amongst
other things no doubt.

The attachment directory is decided by the (org-attach-dir) function, and
I presume the new file could be downloaded straight there; the
task/heading would then have to be synchronised with its attachments for
the new file to show up in the heading's properties.
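
A minimal sketch of that direct-download idea, under some assumptions:
it swaps wget for Emacs's built-in `url-copy-file' (not what the function
posted earlier uses), relies on the "att" link abbreviation from the first
message, and the names my-attachment-name-from-url and
my-attach-web-file-directly are made up for illustration.

```elisp
(require 'org-attach)
(require 'url-handlers)   ; provides `url-copy-file'

(defun my-attachment-name-from-url (url)
  "Return the file name part of URL, ignoring any query string or fragment."
  (file-name-nondirectory (car (split-string url "[?#]"))))

(defun my-attach-web-file-directly (url)
  "Download URL straight into the current heading's attachment directory.
Afterwards resync the heading with that directory and insert an att: link."
  (interactive "*sAttach and link to url: ")
  (let* ((fname (my-attachment-name-from-url url))
         ;; A non-nil argument asks `org-attach-dir' to create the directory.
         (dir (org-attach-dir t))
         (target (expand-file-name fname dir)))
    ;; The final t allows overwriting an existing file of the same name.
    (url-copy-file url target t)
    ;; Bring the heading in line with what is now in the directory.
    (org-attach-sync)
    (insert (concat "[[att:" fname "]]"))))
```

With this arrangement nothing passes through $TMPDIR at all; whether
`org-attach-sync' records everything that `org-attach-attach' would is
something I have not checked.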

> I've often thought it would be nice to link to images in an org file
> with http: links, then at some arbitrary point in time call a
> hypothetical org-localize-external-resources command. That command would
> wget all the external resources, put them somewhere local, and switch
> the links to the file: type. Just a thought.

Good idea.  I look forward to your clever implementation with proper
indenting and informative comments.

> Regardless, thanks for posting this. It's fun to see other people
> thinking in familiar directions.

I agree, it is nice to supplement the daily diet of bug reports, help
requests, "have you tried emacs -Q" etc.

Myles


* Re: capture, attach, link files from web
  2013-10-07 17:49   ` Myles English
@ 2013-10-08  1:39     ` Eric Abrahamsen
  2013-10-08 10:22       ` Myles English
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Abrahamsen @ 2013-10-08  1:39 UTC (permalink / raw)
  To: emacs-orgmode

Myles English <mylesenglish@gmail.com> writes:

> Hi Eric,
>
> I am glad you like it.
>
> eric@ericabrahamsen.net writes:
>
> [..]
>
>> Rather than sending downloaded files to $TMPDIR, it might be nice to
>> have them just use whatever dir org-attach would have used. I use
>> org-attach from time to time, and notice that everything ends up under
>> ~/org/data/. I haven't actually investigated why that happens (I've got
>> org-directory set to ~/org/), mostly because it strikes me as a fine
>> default. When we've got that directory, setting a different TMPDIR seems
>> unnecessary. I'll admit part of my hesitation comes from the fact that
>> "TMPDIR" sounds like it's going to get automatically deleted at some
>> point.
>
> The $TMPDIR was just an environment variable I had set already, so I
> assumed it was semi-standard (doesn't everyone have a $TMPDIR?).  When
> my function calls:
>
> (org-attach-attach (concat tmpdir "/" fname) nil 'mv)
>
> it moves the file from $TMPDIR to the attachment directory, amongst
> other things no doubt.

Whoops, should have looked at the signature of `org-attach-attach' more
closely...

> The attachment directory is decided by the (org-attach-dir) function, and
> I presume the new file could be downloaded straight there; the
> task/heading would then have to be synchronised with its attachments for
> the new file to show up in the heading's properties.
>
>> I've often thought it would be nice to link to images in an org file
>> with http: links, then at some arbitrary point in time call a
>> hypothetical org-localize-external-resources command. That command would
>> wget all the external resources, put them somewhere local, and switch
>> the links to the file: type. Just a thought.
>
> Good idea.  I look forward to your clever implementation with proper
> indenting and informative comments.

I'll get right on it :)


* Re: capture, attach, link files from web
  2013-10-08  1:39     ` Eric Abrahamsen
@ 2013-10-08 10:22       ` Myles English
  2013-10-08 13:31         ` Eric Abrahamsen
  0 siblings, 1 reply; 9+ messages in thread
From: Myles English @ 2013-10-08 10:22 UTC (permalink / raw)
  To: Eric Abrahamsen; +Cc: emacs-orgmode


eric@...net writes:

>>> I've often thought it would be nice to link to images in an org file
>>> with http: links, then at some arbitrary point in time call a
>>> hypothetical org-localize-external-resources command. That command would
>>> wget all the external resources, put them somewhere local, and switch
>>> the links to the file: type. Just a thought.

How about a derived export backend with a filter that does a wget and
rewrites the links?

One problem could be what if a wget fails?  As I am finding with my
implementation, some websites only allow browserlike access.
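
One way to at least notice a failed download is to check wget's exit
status, and a browser-like User-Agent may get past some of the pickier
sites. A rough sketch, assuming wget is installed; my-wget-args and
my-wget-to-dir are hypothetical names, and the User-Agent string is only
a placeholder.

```elisp
(defun my-wget-args (url dir &optional user-agent)
  "Return the wget argument list for downloading URL into DIR.
USER-AGENT, if non-nil, replaces wget's default identification."
  (append (list "-P" (expand-file-name dir))
          (when user-agent
            (list (concat "--user-agent=" user-agent)))
          (list url)))

(defun my-wget-to-dir (url dir &optional user-agent)
  "Download URL into DIR with wget; return non-nil on success."
  ;; `call-process' returns wget's exit status; 0 means success.
  (let ((status (apply #'call-process "wget" nil nil nil
                       (my-wget-args url dir user-agent))))
    (if (eq status 0)
        t
      (message "wget failed for %s (exit status %s)" url status)
      nil)))
```

For example, (my-wget-to-dir "http://example.com/pic.jpg" "~/tmp" "Mozilla/5.0")
returns nil and leaves a message when the site refuses the request, so the
caller can skip attaching and linking.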

Myles


* Re: capture, attach, link files from web
  2013-10-07 13:08 ` Oleh
@ 2013-10-08 10:43   ` Myles English
  2013-10-08 15:28     ` Oleh
  0 siblings, 1 reply; 9+ messages in thread
From: Myles English @ 2013-10-08 10:43 UTC (permalink / raw)
  To: Oleh; +Cc: emacs-orgmode@gnu.org


Hi Oleh,

ohwoeowho writes:

> I counter your tip with my own on capturing pdfs.
> Maybe you'll find some of this stuff useful for your case.

My use case is slightly different: I am looking for pictures and want to
"insert a picture right here to show up in the exported document", and I
have a different solution to the use case you describe, but thanks, it
is useful to see how to attach files via capture templates.

I get almost the same result as you but get there in a different way: I
use Zotero as the capture mechanism, export from Zotero to a BibTeX
file[1], and hack RefTeX[2] to supply a file path, ending up with a
nice TODO with a clickable link in the properties to open the pdf.

* \cite{gawin_simulation_2009} - Simulation of Cavitation in Water Saturated Porous Media Considering Effects of Dissolved Air
:PROPERTIES:
:Created: <2013-03-05 Tue 14:13>
:Custom_ID: gawin_simulation_2009
:file: [[library:/home/myles/.mozilla/firefox/5p1jxjph.default/zotero/storage/4RTT5M3F/Gawin%20and%20Sanavia%20-%20Simulation%20of%20Cavitation%20in%20Water%20Saturated%20Porous.pdf][file]]
:bib: [[bib:gawin_simulation_2009][bib]]
:END:

The only pain is that I have to re-apply my patch to reftex-cite.el every
time it is overwritten, since the project seems to be dead and I can't get
the patch accepted.

Myles


Footnotes: 
[1]  http://lists.gnu.org/archive/html/emacs-orgmode/2012-06/msg00503.html

[2]  http://lists.gnu.org/archive/html/auctex-devel/2012-06/msg00002.html


* Re: capture, attach, link files from web
  2013-10-08 10:22       ` Myles English
@ 2013-10-08 13:31         ` Eric Abrahamsen
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Abrahamsen @ 2013-10-08 13:31 UTC (permalink / raw)
  To: emacs-orgmode

Myles English <mylesenglish@gmail.com> writes:

> eric@...net writes:
>
>>>> I've often thought it would be nice to link to images in an org file
>>>> with http: links, then at some arbitrary point in time call a
>>>> hypothetical org-localize-external-resources command. That command would
>>>> wget all the external resources, put them somewhere local, and switch
>>>> the links to the file: type. Just a thought.
>
> How about a derived export backend with a filter that does a wget and
> rewrites the links?
>
> One problem could be what if a wget fails?  As I am finding with my
> implementation, some websites only allow browserlike access.

That's kind of why I think it would be better as a standalone
interactive function, rather than an export preprocessing thingummy.
You'd run it first, with timeouts and error messages, etc, and then
export when you're done.
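
To make the standalone version concrete, here is a rough sketch, not a
finished implementation. Assumptions: it uses Emacs's built-in
`url-copy-file' instead of wget, only handles plain [[URL]] links ending
in a common image extension (no descriptions, no query strings), and the
command name and the 30-second timeout are invented for illustration.

```elisp
(require 'url-handlers)   ; provides `url-copy-file'

(defun my-org-localize-external-images (dir)
  "Download http(s) image links in the current buffer into DIR.
Each link that downloads successfully is rewritten as a file: link.
Links that fail are left untouched and returned as (URL . REASON) pairs."
  (interactive "DSave images to directory: ")
  (let ((failed '())
        (case-fold-search t))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward
              "\\[\\[\\(https?://[^]\n]+\\.\\(?:png\\|jpe?g\\|gif\\)\\)\\]\\]"
              nil t)
        ;; Save the match bounds now; the download may clobber match data.
        (let* ((url (match-string 1))
               (beg (match-beginning 0))
               (end (match-end 0))
               (target (expand-file-name (file-name-nondirectory url) dir)))
          (condition-case err
              (progn
                (with-timeout (30 (error "Timed out after 30 seconds"))
                  (url-copy-file url target t))
                ;; Only rewrite the link once the file is really there.
                (goto-char beg)
                (delete-region beg end)
                (insert "[[file:" target "]]"))
            (error (push (cons url (error-message-string err)) failed))))))
    (if failed
        (message "Could not fetch: %s" (mapconcat #'car failed ", "))
      (message "All image links localized"))
    failed))
```

Because failed links keep their http: form, the command can simply be run
again later, and the export step never has to deal with download errors.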


* Re: capture, attach, link files from web
  2013-10-08 10:43   ` Myles English
@ 2013-10-08 15:28     ` Oleh
  0 siblings, 0 replies; 9+ messages in thread
From: Oleh @ 2013-10-08 15:28 UTC (permalink / raw)
  To: Myles English; +Cc: emacs-orgmode@gnu.org

Hi Myles,

Just a note: I think there's an advantage to keeping the attachments
in sync with the org files. This way they're never lost and are available
wherever the org file is available.
My system is a Firefox plugin that copies to clipboard the file name

[D Gawin, L Sanavia] Simulation of cavitation in water saturated porous
media considering effects of dissolved air(2010).pdf

when I right click in google scholar. Then I can save the file with this
name where I like.

When it's necessary, I can capture it from a dired buffer in Emacs.

Then the todo item will be:

* TODO Read "Simulation of cavitation in water saturated porous media
considering effects of dissolved air" by D Gawin, L Sanavia

Since this file is an org attachment, it's synchronized along with
my whole org folder to my laptop.
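
As a small worked example of how the file name turns into that title,
here is the regexp from my capture function applied to the name above
(with the .pdf extension already removed). The helper name
my-pdf-title-from-name is made up for this illustration.

```elisp
(defun my-pdf-title-from-name (name)
  "Return a TODO title for NAME, a PDF file name without its extension.
A name of the form \"[Authors] Title(Year)\" gives \"Title\" by Authors;
any other name is returned unchanged."
  (if (string-match "\\[\\(.*\\)\\] \\(.*\\)(\\(.*\\))" name)
      (format "\"%s\" by %s"
              (match-string 2 name)    ; the title
              (match-string 1 name))   ; the authors
    name))

(my-pdf-title-from-name
 "[D Gawin, L Sanavia] Simulation of cavitation in water saturated porous media considering effects of dissolved air(2010)")
;; => "\"Simulation of cavitation in water saturated porous media considering effects of dissolved air\" by D Gawin, L Sanavia"
```

The year in parentheses is matched but not used in the title.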

regards,
Oleh
