* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-18 11:23 ` Jean Louis
@ 2022-08-18 13:48 ` Stefan Monnier via Users list for the GNU Emacs text editor
2022-08-18 14:38 ` Emanuel Berg
0 siblings, 1 reply; 15+ messages in thread
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2022-08-18 13:48 UTC (permalink / raw)
To: help-gnu-emacs
Jean Louis [2022-08-18 14:23:05] wrote:
> * Alessandro Bertulli <alessandro.bertulli96@gmail.com> [2022-08-18 01:27]:
>> I'm currently writing my MS's thesis. Searching for the state of the art
>> of my assigned technology, I am struggling to read and reason about some
>> old papers from ACM and IEEE (pre-2000, scanned, with no index). I am
>> currently switching back and forth between Sioyek and Evince to read my
>> pdfs, while taking notes in Org mode.
>
> Images with text you may process with OCR program to get some
> meanings, and then text may be connected to pages as well and PDF
> packed again.
Some of the old articles in ACM are already processed this way for
you, actually.
Stefan
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-18 13:48 ` [OFFTOPIC] " Stefan Monnier via Users list for the GNU Emacs text editor
@ 2022-08-18 14:38 ` Emanuel Berg
2022-08-18 14:41 ` Stefan Monnier via Users list for the GNU Emacs text editor
0 siblings, 1 reply; 15+ messages in thread
From: Emanuel Berg @ 2022-08-18 14:38 UTC (permalink / raw)
To: help-gnu-emacs
Stefan Monnier via Users list for the GNU Emacs text editor wrote:
>>> I'm currently writing my MS's thesis. Searching for the
>>> state of the art of my assigned technology, I am
>>> struggling to read and reason about some old papers from
>>> ACM and IEEE (pre-2000, scanned, with no index). I am
>>> currently switching back and forth between Sioyek and
>>> Evince to read my pdfs, while taking notes in Org mode.
>>
>> Images with text you may process with OCR program to get
>> some meanings, and then text may be connected to pages as
>> well and PDF packed again.
>
> Some of the old articles in ACM are already processed this
> way for you, actually.
Are there some articles that are really good?
--
underground experts united
https://dataswamp.org/~incal
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-18 14:38 ` Emanuel Berg
@ 2022-08-18 14:41 ` Stefan Monnier via Users list for the GNU Emacs text editor
2022-08-18 20:45 ` Emanuel Berg
0 siblings, 1 reply; 15+ messages in thread
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2022-08-18 14:41 UTC (permalink / raw)
To: help-gnu-emacs
> Are there some articles that are really good?
Only those that I wrote, of course,
Stefan
* Re: [OFFTOPIC] Academic workflow with old PDFs
@ 2022-08-18 20:22 Alessandro Bertulli
2022-08-18 22:14 ` Stefan Monnier
2022-08-19 4:24 ` tomas
0 siblings, 2 replies; 15+ messages in thread
From: Alessandro Bertulli @ 2022-08-18 20:22 UTC (permalink / raw)
To: monnier; +Cc: help-gnu-emacs
> Some of the old articles in ACM are already processed this way for
> you, actually.
I dunno, the ones I read were actually simple scans (as far as I can
tell). The "search in text" function of Evince/Sioyek worked, tho. Are
you referring to that?
Alessandro
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-18 14:41 ` Stefan Monnier via Users list for the GNU Emacs text editor
@ 2022-08-18 20:45 ` Emanuel Berg
0 siblings, 0 replies; 15+ messages in thread
From: Emanuel Berg @ 2022-08-18 20:45 UTC (permalink / raw)
To: help-gnu-emacs
Stefan Monnier via Users list for the GNU Emacs text editor wrote:
>> Are there some articles that are really good?
>
> Only those that I wrote, of course,
Examples please :D
--
underground experts united
https://dataswamp.org/~incal
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-18 20:22 [OFFTOPIC] Academic workflow with old PDFs Alessandro Bertulli
@ 2022-08-18 22:14 ` Stefan Monnier
2022-08-19 4:24 ` tomas
1 sibling, 0 replies; 15+ messages in thread
From: Stefan Monnier @ 2022-08-18 22:14 UTC (permalink / raw)
To: Alessandro Bertulli; +Cc: help-gnu-emacs
>> Some of the old articles in ACM are already processed this way for
>> you, actually.
> I dunno, the ones I read were actually simple scans (as far as I can
> tell). The "search in text" function of Evince/Sioyek worked, tho. Are
> you referring to that?
Yes.
Stefan
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-18 20:45 ` Emanuel Berg
@ 2022-08-18 22:24 ` Stefan Monnier via Users list for the GNU Emacs text editor
2022-08-18 23:22 ` Emanuel Berg
0 siblings, 1 reply; 15+ messages in thread
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2022-08-18 22:24 UTC (permalink / raw)
To: help-gnu-emacs
Emanuel Berg [2022-08-18 22:45:14] wrote:
> Guys, no one uses the word "academia" any more.
In academia, we do.
Stefan
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-18 22:24 ` [OFFTOPIC] " Stefan Monnier via Users list for the GNU Emacs text editor
@ 2022-08-18 23:22 ` Emanuel Berg
2022-08-18 23:34 ` Emanuel Berg
2022-08-19 9:47 ` Marcin Borkowski
0 siblings, 2 replies; 15+ messages in thread
From: Emanuel Berg @ 2022-08-18 23:22 UTC (permalink / raw)
To: help-gnu-emacs
Stefan Monnier via Users list for the GNU Emacs text editor wrote:
>> Guys, no one uses the word "academia" any more.
>
> In academia, we do.
Examples? :)
--
underground experts united
https://dataswamp.org/~incal
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-18 23:22 ` Emanuel Berg
@ 2022-08-18 23:34 ` Emanuel Berg
2022-08-20 20:37 ` Alessandro Bertulli
2022-08-19 9:47 ` Marcin Borkowski
1 sibling, 1 reply; 15+ messages in thread
From: Emanuel Berg @ 2022-08-18 23:34 UTC (permalink / raw)
To: help-gnu-emacs
>>> Guys, no one uses the word "academia" any more.
>>
>> In academia, we do.
>
> Examples?
I reckon this message may start a flame but that's not my
intention, I'm looking to hear your advice (especially, but
not limited, if you work in/with academia)
I meant a _good_ example ... and the list isn't "academia",
wherever that's supposed to be.
This - "if you work in/with academia" - should be "if you are
a researcher/scientist" if the OP is from the
technology/engineering/science world.
In "academia" there are "intellectual's" and "scholars" (AAAAH!
it gets worse!)
Stefan, you gonna be a Shakespearian scholar now? LOL Actually
I can't even envision you as one, and I mean that as
a compliment, of course ...
[Or a Civil War buff? Bonus fact/question: have more books
been written on the American Civil War than on WW2? On the
American civil war, "[t]here are over 60 000 books"
<https://en.wikipedia.org/wiki/Bibliography_of_the_American_Civil_War>
so ... how many on WW2? Ask the academics at the History
Department - but don't assume they can count or maintain
a Bibtex file...]
--
underground experts united
https://dataswamp.org/~incal
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-18 20:22 [OFFTOPIC] Academic workflow with old PDFs Alessandro Bertulli
2022-08-18 22:14 ` Stefan Monnier
@ 2022-08-19 4:24 ` tomas
2022-08-20 20:57 ` Alessandro Bertulli
1 sibling, 1 reply; 15+ messages in thread
From: tomas @ 2022-08-19 4:24 UTC (permalink / raw)
To: help-gnu-emacs
On Thu, Aug 18, 2022 at 10:22:09PM +0200, Alessandro Bertulli wrote:
> > Some of the old articles in ACM are already processed this way for
> > you, actually.
>
> I dunno, the ones I read were actually simple scans (as far as I can
> tell). The "search in text" function of Evince/Sioyek worked, tho. Are
> you referring to that?
If that works (and assuming Evince hasn't acquired OCR powers stealthily),
the pre-scanned text must be somewhere in the document, yes.
Does selecting a region of text and copying that elsewhere work, too?
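
One way to check this outside the viewer, as a rough sketch only: dump
the embedded text layer with poppler's pdftotext and see which pages
come back (nearly) empty. The function names and the min_chars
threshold are made up for illustration; the sketch assumes pdftotext is
installed and relies on it ending each page with a form feed.

```python
import shutil
import subprocess


def extract_text(pdf_path: str) -> str:
    """Return the embedded text layer of a PDF via poppler's pdftotext."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext (poppler-utils) not found on PATH")
    # "-" as the output file makes pdftotext write to stdout.
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def pages_without_text(text: str, min_chars: int = 20) -> list[int]:
    """Return 1-based numbers of pages whose text layer looks empty.

    min_chars is an arbitrary threshold chosen for this sketch.
    """
    pages = text.split("\f")
    # A trailing form feed leaves one empty element after the last page.
    if pages and pages[-1] == "":
        pages.pop()
    return [
        number
        for number, page in enumerate(pages, start=1)
        if len(page.strip()) < min_chars
    ]
```

Calling pages_without_text(extract_text("paper.pdf")) on a scan (the
file name is a placeholder) lists the pages that probably have no
usable text layer; those are the ones where search and copy would fail.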
Cheers
--
t
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-18 23:22 ` Emanuel Berg
2022-08-18 23:34 ` Emanuel Berg
@ 2022-08-19 9:47 ` Marcin Borkowski
2022-08-19 13:51 ` Emanuel Berg
1 sibling, 1 reply; 15+ messages in thread
From: Marcin Borkowski @ 2022-08-19 9:47 UTC (permalink / raw)
To: Emanuel Berg; +Cc: help-gnu-emacs
On 2022-08-19, at 01:22, Emanuel Berg <incal@dataswamp.org> wrote:
> Stefan Monnier via Users list for the GNU Emacs text editor wrote:
>
>>> Guys, no one uses the word "academia" any more.
>>
>> In academia, we do.
>
> Examples? :)
https://academia.stackexchange.com/
Good enough?
--
Marcin Borkowski
http://mbork.pl
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-19 9:47 ` Marcin Borkowski
@ 2022-08-19 13:51 ` Emanuel Berg
0 siblings, 0 replies; 15+ messages in thread
From: Emanuel Berg @ 2022-08-19 13:51 UTC (permalink / raw)
To: help-gnu-emacs
Marcin Borkowski wrote:
>>>> Guys, no one uses the word "academia" any more.
>>>
>>> In academia, we do.
>>
>> Examples? :)
>
> https://academia.stackexchange.com/
>
> Good enough?
-1
--
underground experts united
https://dataswamp.org/~incal
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-18 23:34 ` Emanuel Berg
@ 2022-08-20 20:37 ` Alessandro Bertulli
0 siblings, 0 replies; 15+ messages in thread
From: Alessandro Bertulli @ 2022-08-20 20:37 UTC (permalink / raw)
To: incal; +Cc: help-gnu-emacs
> I meant a _good_ example ... and the list isn't "academia",
> wherever that's supposed to be.
Again, I'm sorry this bothers you so much. Anyway, that was the reason I
specified "especially, but not limited to".
> This - "if you work in/with academia" - should be "if you are
> a researcher/scientist" if the OP is from the
> technology/engineering/science world.
You're right, I am, but actually in my country we never made that
distinction.
> In "academia" there are "intellectual's" and "scholars" (AAAAH!
> it gets worse!)
Here, on the other hand, you're completely right. Those two words are
not used even here, unless you're (IMHO) being stuck-up :-)
Anyway, as I said, I didn't mean to turn this thread into a linguistic
flame, so I don't want to clutter the mailing list any further, if you
are OK with that.
Alessandro
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-19 4:24 ` tomas
@ 2022-08-20 20:57 ` Alessandro Bertulli
2022-08-20 21:14 ` Alessandro Bertulli
0 siblings, 1 reply; 15+ messages in thread
From: Alessandro Bertulli @ 2022-08-20 20:57 UTC (permalink / raw)
To: tomas; +Cc: help-gnu-emacs
> Does selecting a region of text and copying that elsewhere work, too?
Yes, but not everywhere. Again, thank you very much, I suppose PDF
readers cannot do miracles. If the point is the quality of the paper,
that's fine, it means I can stop searching for a magical, non-existent
PDF software.
> If that works (and assuming Evince hasn't acquired OCR powers stealthily),
> the pre-scanned text must be somewhere in the document, yes.
You're right.
Alessandro
* Re: [OFFTOPIC] Academic workflow with old PDFs
2022-08-20 20:57 ` Alessandro Bertulli
@ 2022-08-20 21:14 ` Alessandro Bertulli
0 siblings, 0 replies; 15+ messages in thread
From: Alessandro Bertulli @ 2022-08-20 21:14 UTC (permalink / raw)
To: help-gnu-emacs
Following last message, I'd like to thank Jean and Stefan:
> Images with text you may process with OCR program to get some
> meanings, and then text may be connected to pages as well and PDF
> packed again.
>
> --
> Jean
> > The "search in text" function of Evince/Sioyek worked, tho. Are
> > you referring to that?
>
> Yes.
>
>
> Stefan
As I was saying, it seems like I'll struggle with the quality of the
PDFs. No big deal, having them at least OCRed by the publisher is a good
thing. Thanks again!
Alessandro