unofficial mirror of help-gnu-emacs@gnu.org
* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-18 11:23 ` Jean Louis
@ 2022-08-18 13:48   ` Stefan Monnier via Users list for the GNU Emacs text editor
  2022-08-18 14:38     ` Emanuel Berg
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2022-08-18 13:48 UTC (permalink / raw)
  To: help-gnu-emacs

Jean Louis [2022-08-18 14:23:05] wrote:
> * Alessandro Bertulli <alessandro.bertulli96@gmail.com> [2022-08-18 01:27]:
>> I'm currently writing my MS's thesis. Searching for the state of the art
>> of my assigned technology, I am struggling to read and reason about some
>> old papers from ACM and IEEE (pre-2000, scanned, with no index). I am
>> currently switching back and forth between Sioyek and Evince to read my
>> pdfs, while taking notes in Org mode.
>
> Images with text can be processed with an OCR program to extract the
> text, which can then be attached to the pages and the PDF packed
> again.

Some of the old articles in ACM are already processed this way for
you, actually.


        Stefan
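
For scans that lack such a text layer, the OCR-and-repack step Jean
describes could be sketched roughly as below. This assumes the OCRmyPDF
command-line tool (ocrmypdf) is installed; the thread itself names no
particular OCR program, and the helper names build_ocr_command and
ocr_pdf are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src, dst, language="eng"):
    """Return the argv for adding a text layer to a scanned PDF.

    Assumption: OCRmyPDF is the OCR tool; the thread does not name one.
    """
    return [
        "ocrmypdf",
        "--skip-text",   # leave pages that already carry text untouched
        "-l", language,  # Tesseract language code, e.g. "eng"
        str(src),
        str(dst),
    ]


def ocr_pdf(src, dst, language="eng"):
    """Run the OCR step; raise if the tool is missing or exits with an error."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf not found on PATH")
    subprocess.run(build_ocr_command(src, dst, language), check=True)
    return Path(dst)
```

Calling ocr_pdf("scan.pdf", "scan-ocr.pdf") would write a searchable copy
and leave the original scan untouched; recognition quality still depends
on the quality of the scan.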





* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-18 13:48   ` [OFFTOPIC] " Stefan Monnier via Users list for the GNU Emacs text editor
@ 2022-08-18 14:38     ` Emanuel Berg
  2022-08-18 14:41       ` Stefan Monnier via Users list for the GNU Emacs text editor
  0 siblings, 1 reply; 15+ messages in thread
From: Emanuel Berg @ 2022-08-18 14:38 UTC (permalink / raw)
  To: help-gnu-emacs

Stefan Monnier via Users list for the GNU Emacs text editor wrote:

>>> I'm currently writing my MS's thesis. Searching for the
>>> state of the art of my assigned technology, I am
>>> struggling to read and reason about some old papers from
>>> ACM and IEEE (pre-2000, scanned, with no index). I am
>>> currently switching back and forth between Sioyek and
>>> Evince to read my pdfs, while taking notes in Org mode.
>>
>> Images with text can be processed with an OCR program to
>> extract the text, which can then be attached to the pages
>> and the PDF packed again.
>
> Some of the old articles in ACM are already processed this
> way for you, actually.

Are there some articles that are really good?

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-18 14:38     ` Emanuel Berg
@ 2022-08-18 14:41       ` Stefan Monnier via Users list for the GNU Emacs text editor
  2022-08-18 20:45         ` Emanuel Berg
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2022-08-18 14:41 UTC (permalink / raw)
  To: help-gnu-emacs

> Are there some articles that are really good?

Only those that I wrote, of course,


        Stefan





* Re: [OFFTOPIC] Academic workflow with old PDFs
@ 2022-08-18 20:22 Alessandro Bertulli
  2022-08-18 22:14 ` Stefan Monnier
  2022-08-19  4:24 ` tomas
  0 siblings, 2 replies; 15+ messages in thread
From: Alessandro Bertulli @ 2022-08-18 20:22 UTC (permalink / raw)
  To: monnier; +Cc: help-gnu-emacs

> Some of the old articles in ACM are already processed this way for
> you, actually.

I dunno, the ones I read were actually simple scans (as far as I can
tell). The "search in text" function of Evince/Sioyek worked, tho. Are
you referring to that?

Alessandro




* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-18 14:41       ` Stefan Monnier via Users list for the GNU Emacs text editor
@ 2022-08-18 20:45         ` Emanuel Berg
  0 siblings, 0 replies; 15+ messages in thread
From: Emanuel Berg @ 2022-08-18 20:45 UTC (permalink / raw)
  To: help-gnu-emacs

Stefan Monnier via Users list for the GNU Emacs text editor wrote:

>> Are there some articles that are really good?
>
> Only those that I wrote, of course,

Examples please :D

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-18 20:22 [OFFTOPIC] Academic workflow with old PDFs Alessandro Bertulli
@ 2022-08-18 22:14 ` Stefan Monnier
  2022-08-19  4:24 ` tomas
  1 sibling, 0 replies; 15+ messages in thread
From: Stefan Monnier @ 2022-08-18 22:14 UTC (permalink / raw)
  To: Alessandro Bertulli; +Cc: help-gnu-emacs

>> Some of the old articles in ACM are already processed this way for
>> you, actually.
> I dunno, the ones I read were actually simple scans (as far as I can
> tell). The "search in text" function of Evince/Sioyek worked, tho. Are
> you referring to that?

Yes.


        Stefan





* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-18 20:45 ` Emanuel Berg
@ 2022-08-18 22:24   ` Stefan Monnier via Users list for the GNU Emacs text editor
  2022-08-18 23:22     ` Emanuel Berg
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2022-08-18 22:24 UTC (permalink / raw)
  To: help-gnu-emacs

Emanuel Berg [2022-08-18 22:45:14] wrote:
> Guys, no one uses the word "academia" any more.

In academia, we do.


        Stefan





* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-18 22:24   ` [OFFTOPIC] " Stefan Monnier via Users list for the GNU Emacs text editor
@ 2022-08-18 23:22     ` Emanuel Berg
  2022-08-18 23:34       ` Emanuel Berg
  2022-08-19  9:47       ` Marcin Borkowski
  0 siblings, 2 replies; 15+ messages in thread
From: Emanuel Berg @ 2022-08-18 23:22 UTC (permalink / raw)
  To: help-gnu-emacs

Stefan Monnier via Users list for the GNU Emacs text editor wrote:

>> Guys, no one uses the word "academia" any more.
>
> In academia, we do.

Examples? :)

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-18 23:22     ` Emanuel Berg
@ 2022-08-18 23:34       ` Emanuel Berg
  2022-08-20 20:37         ` Alessandro Bertulli
  2022-08-19  9:47       ` Marcin Borkowski
  1 sibling, 1 reply; 15+ messages in thread
From: Emanuel Berg @ 2022-08-18 23:34 UTC (permalink / raw)
  To: help-gnu-emacs

>>> Guys, no one uses the word "academia" any more.
>>
>> In academia, we do.
>
> Examples?

  I reckon this message may start a flame but that's not my
  intention, I'm looking to hear your advice (especially, but
  not limited to, if you work in/with academia)

I meant a _good_ example ... and a list isn't "academia",
wherever that's supposed to be.

This - "if you work in/with academia" - should be "if you are
a researcher/scientist" if the OP is from the
technology/engineering/science world.

In "academia" there are "intellectuals" and "scholars" (AAAAH!
it gets worse!)

Stefan, you gonna be a Shakespearian scholar now? LOL Actually
I can't even envision you as one, and I mean that as
a compliment, of course ...

[Or a Civil War buff? Bonus fact/question: have more books
 been written on the American Civil War than on WW2? On the
 American civil war, "[t]here are over 60 000 books"
 <https://en.wikipedia.org/wiki/Bibliography_of_the_American_Civil_War>
 so ... how many on WW2? Ask the academics at the History
 Department - but don't assume they can count or maintain
 a Bibtex file...]

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-18 20:22 [OFFTOPIC] Academic workflow with old PDFs Alessandro Bertulli
  2022-08-18 22:14 ` Stefan Monnier
@ 2022-08-19  4:24 ` tomas
  2022-08-20 20:57   ` Alessandro Bertulli
  1 sibling, 1 reply; 15+ messages in thread
From: tomas @ 2022-08-19  4:24 UTC (permalink / raw)
  To: help-gnu-emacs

On Thu, Aug 18, 2022 at 10:22:09PM +0200, Alessandro Bertulli wrote:
> > Some of the old articles in ACM are already processed this way for
> > you, actually.
> 
> I dunno, the ones I read were actually simple scans (as far as I can
> tell). The "search in text" function of Evince/Sioyek worked, tho. Are
> you referring to that?

If that works (and assuming Evince hasn't acquired OCR powers stealthily),
the pre-scanned text must be somewhere in the document, yes.

Does selecting a region of text and copying that elsewhere work, too?

Cheers
-- 
t
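
The check tomas describes (is there extractable text behind the page
images?) can also be done outside the viewer. A minimal sketch, assuming
Poppler's pdftotext is installed; it relies on pdftotext ending each page
with a form feed character. The helper names extract_text and
pages_with_text are hypothetical.

```python
import subprocess


def extract_text(pdf_path):
    """Return the text pdftotext finds in the PDF ("-" means write to stdout)."""
    result = subprocess.run(
        ["pdftotext", str(pdf_path), "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def pages_with_text(raw):
    """Given pdftotext output, return the 1-based numbers of pages with text."""
    pages = raw.split("\f")
    # Each page ends with a form feed, so the final piece after the
    # last form feed is empty and does not correspond to a page.
    if pages and pages[-1].strip() == "":
        pages = pages[:-1]
    return [n for n, page in enumerate(pages, start=1) if page.strip()]
```

If pages_with_text(extract_text("paper.pdf")) comes back empty, the file
is a plain scan with no text layer; if only some page numbers are listed,
searching and copying will work on those pages only.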


* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-18 23:22     ` Emanuel Berg
  2022-08-18 23:34       ` Emanuel Berg
@ 2022-08-19  9:47       ` Marcin Borkowski
  2022-08-19 13:51         ` Emanuel Berg
  1 sibling, 1 reply; 15+ messages in thread
From: Marcin Borkowski @ 2022-08-19  9:47 UTC (permalink / raw)
  To: Emanuel Berg; +Cc: help-gnu-emacs


On 2022-08-19, at 01:22, Emanuel Berg <incal@dataswamp.org> wrote:

> Stefan Monnier via Users list for the GNU Emacs text editor wrote:
>
>>> Guys, no one uses the word "academia" any more.
>>
>> In academia, we do.
>
> Examples? :)

https://academia.stackexchange.com/

Good enough?

-- 
Marcin Borkowski
http://mbork.pl




* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-19  9:47       ` Marcin Borkowski
@ 2022-08-19 13:51         ` Emanuel Berg
  0 siblings, 0 replies; 15+ messages in thread
From: Emanuel Berg @ 2022-08-19 13:51 UTC (permalink / raw)
  To: help-gnu-emacs

Marcin Borkowski wrote:

>>>> Guys, no one uses the word "academia" any more.
>>>
>>> In academia, we do.
>>
>> Examples? :)
>
> https://academia.stackexchange.com/
>
> Good enough?

-1

-- 
underground experts united
https://dataswamp.org/~incal





* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-18 23:34       ` Emanuel Berg
@ 2022-08-20 20:37         ` Alessandro Bertulli
  0 siblings, 0 replies; 15+ messages in thread
From: Alessandro Bertulli @ 2022-08-20 20:37 UTC (permalink / raw)
  To: incal; +Cc: help-gnu-emacs

> I meant a _good_ example ... and a list isn't "academia",
> wherever that's supposed to be.

Again, I'm sorry this bothers you so much. Anyway, that was the reason I
specified "especially, but not limited to".

> This - "if you work in/with academia" - should be "if you are
> a researcher/scientist" if the OP is from the
> technology/engineering/science world.

You're right, I am, but actually in my country we never made that
distinction.

> In "academia" there are "intellectuals" and "scholars" (AAAAH!
> it gets worse!)

Here, on the other hand, you're completely right. Those two words are
not used here either, unless you're (IMHO) being stuck-up :-)

Anyway, as I said, I didn't mean to turn this thread into a linguistic
flame, so I don't want to further clutter the mailing list, if you are
ok with it.

Alessandro




* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-19  4:24 ` tomas
@ 2022-08-20 20:57   ` Alessandro Bertulli
  2022-08-20 21:14     ` Alessandro Bertulli
  0 siblings, 1 reply; 15+ messages in thread
From: Alessandro Bertulli @ 2022-08-20 20:57 UTC (permalink / raw)
  To: tomas; +Cc: help-gnu-emacs

> Does selecting a region of text and copying that elsewhere work, too?

Yes, but not everywhere. Again, thank you very much, I suppose PDF
readers cannot do miracles. If the point is the quality of the paper,
that's fine, it means I can stop searching for a magical, non-existent
PDF software.

> If that works (and assuming Evince hasn't acquired OCR powers stealthily),
> the pre-scanned text must be somewhere in the document, yes.

You're right.

Alessandro




* Re: [OFFTOPIC] Academic workflow with old PDFs
  2022-08-20 20:57   ` Alessandro Bertulli
@ 2022-08-20 21:14     ` Alessandro Bertulli
  0 siblings, 0 replies; 15+ messages in thread
From: Alessandro Bertulli @ 2022-08-20 21:14 UTC (permalink / raw)
  To: help-gnu-emacs

Following my last message, I'd like to thank Jean and Stefan:

> Images with text can be processed with an OCR program to extract the
> text, which can then be attached to the pages and the PDF packed
> again.
> 
> -- 
> Jean

> > The "search in text" function of Evince/Sioyek worked, tho. Are
> > you referring to that?
> 
> Yes.
> 
> 
>         Stefan

As I was saying, it seems like I'll struggle with the quality of the
PDFs. No big deal, having them at least OCRed by the publisher is a good
thing. Thanks again!

Alessandro



