unofficial mirror of help-gnu-emacs@gnu.org
* How I edit PDF files
From: Frederick Bartlett @ 2023-03-22 19:46 UTC (permalink / raw)
  To: help-gnu-emacs

On rare occasions (that is, when the source of the PDF is unavailable), I
have edited and/or transformed PDF files. Note that I started by editing
Type 1 PostScript files in the late 80s before PDF was released; that was a
much easier starting point than the full panoply of modern PDF. A general
solution for PDF editing, while perhaps possible, would require roughly the
same amount of work as creating the PDF format in the first place.

While I’ve used Emacs as part of this process for 25 or 30 years, I have
not attempted to use Emacs Lisp (I don’t like parentheses).

Although a general solution is impractical, each application produces its
own dialect of PDF, which can be reverse engineered; that takes several
hours at best.

For instance, I was able to change the information on a PDF address label
when the original app wouldn’t allow the required changes – we wanted 4
lines of information where the app would allow only 3.
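
To give a sense of what such reverse engineering looks at: page content is
a stream of operators, and text is commonly shown with the Tj operator. The
Python sketch below lists the literal strings shown with Tj in a content
stream that has already been decompressed. The sample stream, the function
name, and the assumption that the label uses plain literal strings are all
hypothetical; real files often compress their streams and may use TJ arrays
or hex strings instead, so this is an illustration, not the method used for
the label.

```python
import re

# Matches a PDF literal string "(...)" followed by the Tj (show text) operator.
# Escaped characters such as \( and \) are allowed inside the string;
# unescaped nested parentheses are NOT handled in this sketch.
TJ_PATTERN = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")


def list_shown_strings(content_stream):
    """Return the raw literal strings shown with Tj, in stream order.

    content_stream: bytes of an already-decompressed page content stream.
    Escape sequences are left as written; no font encoding is applied.
    """
    return [m.group(1).decode("latin-1")
            for m in TJ_PATTERN.finditer(content_stream)]


if __name__ == "__main__":
    # Hypothetical three-line address label.
    sample = (b"BT /F1 12 Tf 72 700 Td (Jane Doe) Tj "
              b"0 -14 Td (1 Main St) Tj "
              b"0 -14 Td (Springfield) Tj ET")
    for text in list_shown_strings(sample):
        print(text)
```

Once the strings and their positioning operators are located, adding a
line means inserting another move-and-show pair and then repairing the
stream length and cross-reference table, which is where most of the hours
go.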

Also, back in the 90s at Springer-Verlag, I routinely swapped out bitmapped
fonts for TFM-equivalent Type 1 fonts in author-produced TeX PDFs for
publication. That was the easiest PDF editing I’ve ever done.

Finally, just recently, I parsed the PDF of a Microsoft Excel spreadsheet
and converted it to a Python array (from which it could be imported into
database or spreadsheet apps) after the original xlsx file had been lost.
Most of that work was done by pdftotext, which is part of poppler; a
simple Python script did the rest.
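
As a rough illustration of that pipeline (not the script actually used),
the Python sketch below runs pdftotext with its -layout option, which keeps
the columns visually aligned, and then splits each line on runs of two or
more spaces. The function names and the two-space column rule are
assumptions; a real spreadsheet with empty cells or single-space gaps
between columns would need a more careful rule.

```python
import re
import subprocess


def pdf_to_layout_text(pdf_path):
    """Run poppler's pdftotext and return the text with layout preserved.

    "-" as the output file name sends the text to standard output.
    """
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def layout_text_to_rows(text):
    """Turn layout-preserved text into a list of rows (lists of strings).

    Assumption: columns are separated by at least two spaces, and cell
    contents never contain two spaces in a row.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()  # also drops the form feed between pages
        if not line:
            continue
        rows.append(re.split(r"\s{2,}", line))
    return rows


if __name__ == "__main__":
    import sys
    for row in layout_text_to_rows(pdf_to_layout_text(sys.argv[1])):
        print(row)
```

The resulting list of lists can be written out with the csv module for
import into a database or spreadsheet app.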

But it is always better and quicker to get to the original source of the
PDF and make the changes there.



On Wed, Mar 22, 2023 at 12:03 PM <help-gnu-emacs-request@gnu.org> wrote:

> Today's Topics:
>
>    1. Re: superfluous(?) quotes in mail alias expansion (Emanuel Berg)
>    2. Re: editing a PDF [Re: emacs 30.5.0 editing epub] (Emanuel Berg)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 21 Mar 2023 15:24:41 +0100
> From: Emanuel Berg <incal@dataswamp.org>
> To: help-gnu-emacs@gnu.org
> Subject: Re: superfluous(?) quotes in mail alias expansion
> Message-ID: <87h6ueqj12.fsf@dataswamp.org>
> Content-Type: text/plain; charset=utf-8
>
> Gregor Zattler wrote:
>
> > the fine manual has this to say regarding mail aliases:
> >
> >    If an address contains a space, quote the whole
> >    address with a pair of double quotes, like this:
> >
> >         alias jsmith "John Q. Smith <none@example.com>"
> >
> >    Note that you need not include double quotes around
> >    individual parts of the address, such as the
> >    person’s full name.  Emacs puts them in if they are
> >    needed.  For instance, it inserts the above address
> >    as ‘"John Q. Smith" <none@example.com>’.
> >
> > This describes correctly how this mechanism works,
> > regarding its outcome, but not why.
>
> In the data file, it's just so it can be parsed. When it gets
> inserted, quotes are added if they are needed.
>
> Try putting it like this in ~/.mailrc:
>
> alias jsmith-1 "John Q. Smith <none@example.com>"
> alias jsmith-2 "John Q Smith <none@example.com>"
> alias jsmith jsmith-1 jsmith-2
>
> then expand both with "jsmith", it will look like this:
>
> To: "John Q. Smith" <none@example.com>,
>  John Q Smith <none@example.com>
>
> Quotes inserted (because of the dot) in the first case, not in
> the second as not needed.
>
> --
> underground experts united
> https://dataswamp.org/~incal
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Tue, 21 Mar 2023 07:38:42 +0100
> From: Emanuel Berg <incal@dataswamp.org>
> To: help-gnu-emacs@gnu.org
> Subject: Re: editing a PDF [Re: emacs 30.5.0 editing epub]
> Message-ID: <87v8iur4lp.fsf@dataswamp.org>
> Content-Type: text/plain; charset=utf-8
>
> Yuri Khan wrote:
>
> >> It is kinda weird that, with all the many things that emacs
> >> can do, it can't take the info from doc-view (which
> >> obviously understands all the pieces of a pdf-- down to its
> >> bits-- and how they all go together to make a document) and
> >> edit it... it's pretty much implausible to believe emacs
> >> *can't* do that. But then, I've always thought reality was
> >> completely implausible. :^/
> >
> > PDF is not really meant for editing. It’s not even a data
> > format. Rather, it’s an executable program that has
> > instructions like “select this font” and “display this word
> > in the selected font at this position on the page” and “make
> > a new page”.
> >
> > You don’t normally edit executable programs, you compile
> > them from source. In the same vein, to get a modified PDF,
> > you find the source document from which it was produced,
> > modify that, and re-export.
>
> Still, they are editable at/from the PDF level as well, for
> example with xournal. This is used for signing documents,
> for example. It should not be confused or compared with
> editing the source, that's another thing.
>
> --
> underground experts united
> https://dataswamp.org/~incal
>
>
>
>

