From: Ramon Diaz-Uriarte <rdiaz02@gmail.com>
To: Matt Lundin <mdl@imapmail.org>
Cc: Ramon Diaz-Uriarte <rdiaz02@gmail.com>, Org Mode <emacs-orgmode@gnu.org>
Subject: Re: Org Mode and PDF Notes!
Date: Fri, 13 Nov 2015 00:55:14 +0100
Message-ID: <8737wag4d9.fsf@gmail.com>
In-Reply-To: <87k2pn70mb.fsf@fastmail.fm>
On Thu, 12-11-2015, at 15:28, Matt Lundin <mdl@imapmail.org> wrote:
> Ramon Diaz-Uriarte <rdiaz02@gmail.com> writes:
>
>>
>> so we get the location of the highlight (and its properties), but not the
>> textual contents. And this is the case whether I make the annotation with
>> EzPDF or Okular or, for that matter, with pdf-tools itself.
>>
>> So it seems RepliGO is actually giving you a lot more by default :-)
>>
>>>
>>> Politza and I are discussing this here:
>>> https://github.com/politza/pdf-tools/issues/137
>>>
>>> That might be a good place to continue the conversation.
>>>
>>
>> Will do. In the meantime, I think this is a limitation coming from
>> poppler. Other people have mentioned similar things (e.g.,
>> http://coda.caseykuhlman.com/entries/2014/pdf-extract.html) and using other
>> tools that depend on poppler (such as Leela:
>> https://github.com/TrilbyWhite/Leela) also will not give us the text
>> itself.
>
> I don't think this is a limitation of poppler so much as the way that
> pdf annotations work. Typically, the subject/text field is not populated
> by the text of the highlighted region. Rather, a highlight annotation
> specifies bounds, color, style, etc. Basically what RepliGO does (I
> wouldn't recommend using it, as it is closed source and severely out of
> date) is to grab the text *at the time of highlighting* and add it to
> the notes field. I don't know of any other annotation tool that does the
> same thing. Applications built on poppler could do it, though they
> currently do not.
I stand corrected. You are right; sorry for the sloppiness in the wording
and ideas.
>
> For extracting the text of highlighted regions *after the fact*, I've
> had good luck with this script that relies on the pdf-reader gem for
> ruby:
>
> https://gist.github.com/danlucraft/5277732
That is also what I use for extracting the text from the highlighted
regions.
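
To make the "after the fact" idea concrete, here is a minimal sketch in
Python. It is not the Ruby script linked above and it does not use any
real PDF library. It assumes that some text extractor has already
produced word bounding boxes for the page, and that the highlight
annotation carries only geometry (a QuadPoints array, eight numbers per
quadrilateral) and an empty contents field. The names Word,
quads_to_rects and highlighted_text, and all coordinates, are
hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A word on the page with its bounding box (hypothetical extractor output)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def quads_to_rects(quad_points):
    """Turn a flat QuadPoints list (8 numbers per quad) into bounding rectangles."""
    rects = []
    for i in range(0, len(quad_points) - 7, 8):
        xs = quad_points[i:i + 8:2]       # x coordinates of the four corners
        ys = quad_points[i + 1:i + 8:2]   # y coordinates of the four corners
        rects.append((min(xs), min(ys), max(xs), max(ys)))
    return rects


def highlighted_text(quad_points, words):
    """Join, in reading order as given, the words whose centre lies in a highlight."""
    rects = quads_to_rects(quad_points)
    picked = []
    for w in words:
        cx = (w.x0 + w.x1) / 2
        cy = (w.y0 + w.y1) / 2
        if any(x0 <= cx <= x1 and y0 <= cy <= y1 for x0, y0, x1, y1 in rects):
            picked.append(w.text)
    return " ".join(picked)


# A highlight as it is typically stored: geometry and style, no text.
annotation = {
    "Subtype": "Highlight",
    "QuadPoints": [100, 712, 220, 712, 100, 700, 220, 700],
    "C": [1, 1, 0],   # colour
    "Contents": "",   # not populated by most annotation tools
}

# Word boxes assumed to come from a separate text-extraction pass.
words = [
    Word("Some", 60, 700, 95, 712),
    Word("highlighted", 102, 700, 160, 712),
    Word("text", 165, 700, 190, 712),
    Word("here", 230, 700, 260, 712),
]

print(highlighted_text(annotation["QuadPoints"], words))
```

Testing the word centre rather than the full box is a design choice
that tolerates highlights drawn slightly too tight or too loose; a real
tool would also have to deal with rotated pages and multi-column
layouts.
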
R.
>
> Matt
--
Ramon Diaz-Uriarte
Department of Biochemistry, Lab B-25
Facultad de Medicina
Universidad Autónoma de Madrid
Arzobispo Morcillo, 4
28029 Madrid
Spain
Phone: +34-91-497-2412
Email: rdiaz02@gmail.com
ramon.diaz@iib.uam.es
http://ligarto.org/rdiaz