From: dalanicolai <dalanicolai@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: Emacs Devel <emacs-devel@gnu.org>
Subject: Re: [PATCH] add epub support to doc-view
Date: Fri, 14 Jan 2022 21:02:08 +0100
Message-ID: <CACJP=3mmKOy4zVmZmDBRMSft7d3PTUYaTkucjMwTVH2TKWg8uw@mail.gmail.com>
In-Reply-To: <CACJP=3nDuQzZ5thPjcMYy7R3n-Zq8LV8oJHUbwczq0rcEdQQzQ@mail.gmail.com>
[-- Attachment #1.1: Type: text/plain, Size: 829 bytes --]
Here is another updated patch, which adds an option to configure the EPUB
font size.
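As a rough sketch of what the new option does: when the document type is
`epub' and `doc-view-epub-font-size' is non-nil, the patch appends
"-S<size>" to the options passed to the MuPDF converter. The helper name
`my/epub-draw-options' below is hypothetical; it only mirrors the
option-building logic in `doc-view-pdf->png-converter-mupdf' and does not
call mutool itself.

```elisp
;; Hypothetical helper (not part of the patch): builds the same option
;; list that the patched `doc-view-pdf->png-converter-mupdf' passes on.
(defun my/epub-draw-options (png resolution doc-type font-size &optional password)
  "Return converter options for writing PNG at RESOLUTION.
Append \"-S<FONT-SIZE>\" only when DOC-TYPE is `epub' and FONT-SIZE
is non-nil.  PASSWORD, if given, is passed with \"-p\"."
  (append (list (concat "-o" png)
                (format "-r%d" (round resolution)))
          (when password (list "-p" password))
          (when (and (eq doc-type 'epub) font-size)
            (list (format "-S%s" font-size)))))

;; For an EPUB with a font size of 14 points:
(my/epub-draw-options "page-%d.png" 100 'epub 14)
;; => ("-opage-%d.png" "-r100" "-S14")
```

With the patch applied, the option itself would presumably be set through
Customize or `customize-set-variable', so that its `:set' function runs
and open EPUB buffers are re-rendered; a plain `setq' would not trigger
that.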
On Fri, 14 Jan 2022 at 17:15, dalanicolai <dalanicolai@gmail.com> wrote:
> So here is a second version of the patch. It also adds a few more
> extensions to the list.
> I had forgotten about them, as I neither use them nor have experimented
> with them, but I figured that this is a good opportunity to add support
> for those extensions as well.
> A small comment for those who are interested: CBZ files (and CBR files,
> which mupdf does not support) seem to be just zipped/rarred collections
> of image files (I guess usually png/jpg). So supporting those extensions
> would not really require the `mutool` command if Emacs just uncompressed
> the collections itself.
>
> I guess there is not much more to comment on beyond the comments
> within the patch/files.
>
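The observation about CBZ files can be illustrated with a minimal sketch.
It assumes an `unzip' executable is on the PATH and that the page images
sort correctly by file name. The function names `my/cbz-page-files' and
`my/cbz-extract-pages', and the choice of `unzip', are assumptions made
for illustration; none of this is part of the patch.

```elisp
(require 'seq)

;; Hypothetical helper: pick the page images out of a list of file names.
(defun my/cbz-page-files (files)
  "Return the PNG/JPEG files among FILES, sorted by name."
  (sort (seq-filter (lambda (file)
                      (string-match-p "\\.\\(?:png\\|jpe?g\\)\\'" file))
                    files)
        #'string<))

;; Hypothetical helper: unpack a CBZ archive with the external `unzip'
;; program (an assumption) and return its page images in order.
(defun my/cbz-extract-pages (cbz-file target-dir)
  "Extract CBZ-FILE into TARGET-DIR and return the sorted page images."
  (make-directory target-dir t)
  (let ((status (call-process "unzip" nil nil nil
                              "-o" "-q" (expand-file-name cbz-file)
                              "-d" (expand-file-name target-dir))))
    (unless (eq status 0)
      (error "unzip failed with status %s" status)))
  (my/cbz-page-files (directory-files-recursively target-dir "")))
```

The returned list could then be displayed page by page without going
through `mutool' at all, which is the point made in the quoted message.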
[-- Attachment #1.2: Type: text/html, Size: 1233 bytes --]
[-- Attachment #2: 0001-Add-support-for-EPUB-CBZ-FB2-and-O-XPS-extension-to-.patch --]
[-- Type: text/x-patch, Size: 17588 bytes --]
From 53913d3b2c667fa8fda1df671212a3c64a3b21a2 Mon Sep 17 00:00:00 2001
From: Daniel Nicolai <dalanicolai@gmail.com>
Date: Tue, 11 Jan 2022 20:37:36 +0100
Subject: [PATCH] Add support for EPUB, CBZ, FB2 and (O)XPS extension to doc view
* doc/emacs/misc.texi (Document View): Add requirements for new
extensions (i.e. mutool).
* lisp/doc-view.el (doc-view): Additionally update preliminary comment.
(doc-view-custom-set-epub-font-size): Redraw image after setting.
(doc-view-unoconv-program): Put code all on one line.
(doc-view-doc-type): Update docstring.
(doc-view-kill-proc): Fix comment indentation.
(doc-view-mode-p): Add check for new extensions and alternative check
for PDF.
(doc-view-pdf/ps->png): Associate new extensions with PNG converter.
(doc-view-convert-current-doc): Handle new extensions like PDFs.
(doc-view-set-doc-type): Set correct doc-type for new extensions.
* lisp/files.el (auto-mode-alist): Associate new extension types with
doc-view.
---
doc/emacs/misc.texi | 23 ++++---
lisp/doc-view.el | 153 ++++++++++++++++++++++++++++----------------
lisp/files.el | 2 +-
3 files changed, 114 insertions(+), 64 deletions(-)
diff --git a/doc/emacs/misc.texi b/doc/emacs/misc.texi
index df1e5ef238..365c079e89 100644
--- a/doc/emacs/misc.texi
+++ b/doc/emacs/misc.texi
@@ -455,20 +455,27 @@ Document View
@cindex PostScript file
@cindex OpenDocument file
@cindex Microsoft Office file
+@cindex EPUB file
+@cindex CBZ file
+@cindex FB2 file
+@cindex XPS file
+@cindex OXPS file
@cindex DocView mode
@cindex mode, DocView
@cindex document viewer (DocView)
@findex doc-view-mode
DocView mode is a major mode for viewing DVI, PostScript (PS), PDF,
-OpenDocument, and Microsoft Office documents. It provides features
-such as slicing, zooming, and searching inside documents. It works by
-converting the document to a set of images using the @command{gs}
-(GhostScript) or @command{mudraw}/@command{pdfdraw} (MuPDF) commands
-and other external tools @footnote{For PostScript files, GhostScript
-is a hard requirement. For DVI files, @code{dvipdf} or @code{dvipdfm}
-is needed. For OpenDocument and Microsoft Office documents, the
-@code{unoconv} tool is needed.}, and displaying those images.
+OpenDocument, Microsoft Office, EPUB, CBZ, FB2, XPS and OXPS
+documents. It provides features such as slicing, zooming, and
+searching inside documents. It works by converting the document to a
+set of images using the @command{gs} (GhostScript) or
+@command{pdfdraw}/@command{mutool draw} (MuPDF) commands and other
+external tools @footnote{PostScript files require GhostScript, DVI
+files require @code{dvipdf} or @code{dvipdfm}, OpenDocument and
+Microsoft Office documents require the @code{unoconv} tool, and EPUB,
+CBZ, FB2, XPS and OXPS files require @code{mutool} to be available.},
+and displaying those images.
@findex doc-view-toggle-display
@findex doc-view-minor-mode
diff --git a/lisp/doc-view.el b/lisp/doc-view.el
index 5b462b24f5..57144ece1c 100644
--- a/lisp/doc-view.el
+++ b/lisp/doc-view.el
@@ -3,7 +3,7 @@
;; Copyright (C) 2007-2022 Free Software Foundation, Inc.
;;
;; Author: Tassilo Horn <tsdh@gnu.org>
-;; Keywords: files, pdf, ps, dvi
+;; Keywords: files, pdf, ps, dvi, djvu, epub, cbz, fb2, xps, openxps
;; This file is part of GNU Emacs.
@@ -25,17 +25,19 @@
;; Viewing PS/PDF/DVI files requires Ghostscript, `dvipdf' (comes with
;; Ghostscript) or `dvipdfm' (comes with teTeX or TeXLive) and
;; `pdftotext', which comes with xpdf (https://www.foolabs.com/xpdf/)
-;; or poppler (https://poppler.freedesktop.org/).
-;; Djvu documents require `ddjvu' (from DjVuLibre).
-;; ODF files require `soffice' (from LibreOffice).
+;; or poppler (https://poppler.freedesktop.org/). EPUB, CBZ, FB2, XPS
+;; and OXPS documents require `mutool' which comes with mupdf
+;; (https://mupdf.com/index.html). Djvu documents require `ddjvu'
+;; (from DjVuLibre). ODF files require `soffice' (from LibreOffice).
;;; Commentary:
;; DocView is a document viewer for Emacs. It converts a number of
-;; document formats (including PDF, PS, DVI, Djvu and ODF files) to a
-;; set of PNG files, one PNG for each page, and displays the PNG
-;; images inside an Emacs buffer. This buffer uses `doc-view-mode'
-;; which provides convenient key bindings for browsing the document.
+;; document formats (including PDF, PS, DVI, Djvu, ODF, EPUB, CBZ,
+;; FB2, XPS and OXPS files) to a set of PNG (or TIFF for djvu) files,
+;; one image for each page, and displays the images inside an Emacs
+;; buffer. This buffer uses `doc-view-mode' which provides convenient
+;; key bindings for browsing the document.
;;
;; To use it simply open a document file with
;;
@@ -147,7 +149,10 @@
;;;; Customization Options
(defgroup doc-view nil
- "In-buffer viewer for PDF, PostScript, DVI, and DJVU files."
+ "In-buffer document viewer.
+The viewer handles PDF, PostScript, DVI, DJVU, ODF, EPUB, CBZ,
+FB2, XPS and OXPS files, if the appropriate converter programs
+are available (see Info node `(emacs)Document View')."
:link '(function-link doc-view)
:version "22.2"
:group 'applications
@@ -221,6 +226,20 @@ doc-view-resolution
Higher values result in larger images."
:type 'number)
+(defun doc-view-custom-set-epub-font-size (option-name new-value)
+ (set-default option-name new-value)
+ (dolist (x (buffer-list))
+ (with-current-buffer x
+ (when (eq doc-view-doc-type 'epub)
+ (delete-directory doc-view--current-cache-dir t)
+ (doc-view-initiate-display)
+ (doc-view-goto-page (doc-view-current-page))))))
+
+(defcustom doc-view-epub-font-size nil
+ "Font size in points for EPUB layout."
+ :type '(choice (const nil) integer)
+ :set #'doc-view-custom-set-epub-font-size)
+
(defcustom doc-view-scale-internally t
"Whether we should try to rescale images ourselves.
If nil, the document is re-rendered every time the scaling factor is modified.
@@ -256,9 +275,7 @@ doc-view-dvipdf-program
`doc-view-dvipdf-program' will be preferred."
:type 'file)
-(define-obsolete-variable-alias 'doc-view-unoconv-program
- 'doc-view-odf->pdf-converter-program
- "24.4")
+(define-obsolete-variable-alias 'doc-view-unoconv-program 'doc-view-odf->pdf-converter-program "24.4")
(defcustom doc-view-odf->pdf-converter-program
(cond
@@ -382,7 +399,8 @@ doc-view--buffer-file-name
(defvar doc-view-doc-type nil
"The type of document in the current buffer.
-Can be `dvi', `pdf', `ps', `djvu' or `odf'.")
+Can be `dvi', `pdf', `ps', `djvu', `odf', `epub', `cbz', `fb2',
+`xps' or `oxps'.")
(defvar doc-view-single-page-converter-function nil
"Function to call to convert a single page of the document to a bitmap file.
@@ -464,17 +482,17 @@ doc-view--revert-buffer
;; It's normal for this operation to result in a very large undo entry.
(setq-local undo-outer-limit (* 2 (buffer-size))))
(cl-labels ((revert ()
- (let ((revert-buffer-preserve-modes t))
- (apply orig-fun args)
- ;; Update the cached version of the pdf file,
- ;; too. This is the one that's used when
- ;; rendering (bug#26996).
- (unless (equal buffer-file-name
- doc-view--buffer-file-name)
- ;; FIXME: Lars says he needed to recreate
- ;; the dir, we should figure out why.
- (doc-view-make-safe-dir doc-view-cache-directory)
- (write-region nil nil doc-view--buffer-file-name)))))
+ (let ((revert-buffer-preserve-modes t))
+ (apply orig-fun args)
+ ;; Update the cached version of the pdf file,
+ ;; too. This is the one that's used when
+ ;; rendering (bug#26996).
+ (unless (equal buffer-file-name
+ doc-view--buffer-file-name)
+ ;; FIXME: Lars says he needed to recreate
+ ;; the dir, we should figure out why.
+ (doc-view-make-safe-dir doc-view-cache-directory)
+ (write-region nil nil doc-view--buffer-file-name)))))
(if (and (eq 'pdf doc-view-doc-type)
(executable-find "pdfinfo"))
;; We don't want to revert if the PDF file is corrupted which
@@ -738,7 +756,7 @@ doc-view-kill-proc
(interactive)
(while (consp doc-view--current-converter-processes)
(ignore-errors ;; Some entries might not be processes, and maybe
- ;; some are dead already?
+ ; some are dead already?
(kill-process (pop doc-view--current-converter-processes))))
(when doc-view--current-timer
(cancel-timer doc-view--current-timer)
@@ -799,8 +817,8 @@ doc-view--current-cache-dir
;;;###autoload
(defun doc-view-mode-p (type)
"Return non-nil if document type TYPE is available for `doc-view'.
-Document types are symbols like `dvi', `ps', `pdf', or `odf' (any
-OpenDocument format)."
+Document types are symbols like `dvi', `ps', `pdf', `epub',
+`cbz', `fb2', `xps', `oxps', or `odf' (any OpenDocument format)."
(and (display-graphic-p)
(image-type-available-p 'png)
(cond
@@ -811,16 +829,22 @@ doc-view-mode-p
(and doc-view-dvipdfm-program
(executable-find doc-view-dvipdfm-program)))))
((memq type '(postscript ps eps pdf))
- (or (and doc-view-ghostscript-program
+ (or (and doc-view-ghostscript-program
(executable-find doc-view-ghostscript-program))
- (and doc-view-pdfdraw-program
- (executable-find doc-view-pdfdraw-program))))
+ ;; For pdf, also check for `doc-view-pdfdraw-program'.
+ (when (eq type 'pdf)
+ (and doc-view-pdfdraw-program
+ (executable-find doc-view-pdfdraw-program)))))
((eq type 'odf)
(and doc-view-odf->pdf-converter-program
(executable-find doc-view-odf->pdf-converter-program)
(doc-view-mode-p 'pdf)))
((eq type 'djvu)
(executable-find "ddjvu"))
+ ((memq type '(epub cbz fb2 xps oxps))
+ ;; Check that `doc-view-pdfdraw-program' is set to mutool.
+ (and (string= doc-view-pdfdraw-program "mutool")
+ (executable-find "mutool")))
(t ;; unknown image type
nil))))
@@ -1053,7 +1077,7 @@ doc-view-start-process
;; some file-name-handler-managed dir, for example).
(let* ((default-directory (or (unhandled-file-name-directory
default-directory)
- (expand-file-name "~/")))
+ (expand-file-name "~/")))
(proc (apply #'start-process name doc-view-conversion-buffer
program args)))
(push proc doc-view--current-converter-processes)
@@ -1139,14 +1163,17 @@ doc-view-pdf-password-protected-pdfdraw-p
(search-forward "error: cannot authenticate password" nil t)))
(defun doc-view-pdf->png-converter-mupdf (pdf png page callback)
- (let ((pdf-passwd (if (doc-view-pdf-password-protected-pdfdraw-p pdf)
- (read-passwd "Enter password for PDF file: "))))
+ (let* ((pdf-passwd (if (doc-view-pdf-password-protected-pdfdraw-p pdf)
+ (read-passwd "Enter password for PDF file: ")))
+ (options `(,(concat "-o" png)
+ ,(format "-r%d" (round doc-view-resolution))
+ ,@(if pdf-passwd `("-p" ,pdf-passwd)))))
+ (when (and (eq doc-view-doc-type 'epub) doc-view-epub-font-size)
+ (setq options (append options (list (format "-S%s" doc-view-epub-font-size)))))
(doc-view-start-process
"pdf->png" doc-view-pdfdraw-program
`(,@(doc-view-pdfdraw-program-subcommand)
- ,(concat "-o" png)
- ,(format "-r%d" (round doc-view-resolution))
- ,@(if pdf-passwd `("-p" ,pdf-passwd))
+ ,@options
,pdf
,@(if page `(,(format "%d" page))))
callback)))
@@ -1189,7 +1216,7 @@ doc-view-pdf/ps->png
"Convert PDF-PS to PNG asynchronously."
(funcall
(pcase doc-view-doc-type
- ('pdf doc-view-pdf->png-converter-function)
+ ((or 'pdf 'epub 'cbz 'fb2 'xps 'oxps) doc-view-pdf->png-converter-function)
('djvu #'doc-view-djvu->tiff-converter-ddjvu)
(_ #'doc-view-ps->png-converter-ghostscript))
pdf-ps png nil
@@ -1227,20 +1254,20 @@ doc-view-document->bitmap
(let ((rest (cdr pages)))
(funcall doc-view-single-page-converter-function
pdf (format png (car pages)) (car pages)
- (lambda ()
- (if rest
- (doc-view-document->bitmap pdf png rest)
- ;; Yippie, the important pages are done, update the display.
- (clear-image-cache)
- ;; For the windows that have a message (like "Welcome to
- ;; DocView") display property, clearing the image cache is
- ;; not sufficient.
- (dolist (win (get-buffer-window-list (current-buffer) nil 'visible))
- (with-selected-window win
- (when (stringp (overlay-get (doc-view-current-overlay) 'display))
- (doc-view-goto-page (doc-view-current-page)))))
- ;; Convert the rest of the pages.
- (doc-view-pdf/ps->png pdf png)))))))
+ (lambda ()
+ (if rest
+ (doc-view-document->bitmap pdf png rest)
+ ;; Yippie, the important pages are done, update the display.
+ (clear-image-cache)
+ ;; For the windows that have a message (like "Welcome to
+ ;; DocView") display property, clearing the image cache is
+ ;; not sufficient.
+ (dolist (win (get-buffer-window-list (current-buffer) nil 'visible))
+ (with-selected-window win
+ (when (stringp (overlay-get (doc-view-current-overlay) 'display))
+ (doc-view-goto-page (doc-view-current-page)))))
+ ;; Convert the rest of the pages.
+ (doc-view-pdf/ps->png pdf png)))))))
(defun doc-view-pdf->txt (pdf txt callback)
"Convert PDF to TXT asynchronously and call CALLBACK when finished."
@@ -1337,7 +1364,9 @@ doc-view-convert-current-doc
;; Rename to doc.pdf
(rename-file opdf pdf)
(doc-view-pdf/ps->png pdf png-file)))))
- ((or 'pdf 'djvu)
+ ;; The doc-view-mode-p check ensures that epub, cbz, fb2 and
+ ;; (o)xps are handled with mutool
+ ((or 'pdf 'djvu 'epub 'cbz 'fb2 'xps 'oxps)
(let ((pages (doc-view-active-pages)))
;; Convert doc to bitmap images starting with the active pages.
(doc-view-document->bitmap doc-view--buffer-file-name png-file pages)))
@@ -1432,7 +1461,7 @@ doc-view-paper-sizes
(defun doc-view-guess-paper-size (iw ih)
"Guess the paper size according to the aspect ratio."
(cl-labels ((div (x y)
- (round (/ (* 100.0 x) y))))
+ (round (/ (* 100.0 x) y))))
(let ((ar (div iw ih))
(al (mapcar (lambda (l)
(list (div (nth 1 l) (nth 2 l)) (car l)))
@@ -1869,6 +1898,8 @@ doc-view-set-doc-type
("dvi" dvi)
;; PDF
("pdf" pdf) ("epdf" pdf)
+ ;; EPUB
+ ("epub" epub)
;; PostScript
("ps" ps) ("eps" ps)
;; DjVu
@@ -1880,7 +1911,13 @@ doc-view-set-doc-type
;; Microsoft Office formats (also handled by the odf
;; conversion chain).
("doc" odf) ("docx" odf) ("xls" odf) ("xlsx" odf)
- ("ppt" odf) ("pps" odf) ("pptx" odf) ("rtf" odf))
+ ("ppt" odf) ("pps" odf) ("pptx" odf) ("rtf" odf)
+ ;; CBZ
+ ("cbz" cbz)
+ ;; FB2
+ ("fb2" fb2)
+ ;; (Open)XPS
+ ("xps" xps) ("oxps" oxps))
t))))
(content-types
(save-excursion
@@ -1889,7 +1926,13 @@ doc-view-set-doc-type
((looking-at "%!") '(ps))
((looking-at "%PDF") '(pdf))
((looking-at "\367\002") '(dvi))
- ((looking-at "AT&TFORM") '(djvu))))))
+ ((looking-at "AT&TFORM") '(djvu))
+ ;; The following pattern actually matches any zip
+ ;; archive, so the same association is also used for
+ ;; cbz files. This is fine, as cbz files should be handled
+ ;; like epub anyway.
+ ((looking-at "PK") '(epub))
+ ))))
(setq-local
doc-view-doc-type
(car (or (nreverse (seq-intersection name-types content-types #'eq))
diff --git a/lisp/files.el b/lisp/files.el
index a11786fca2..f2c656bfde 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -2925,7 +2925,7 @@ auto-mode-alist
("\\.\\(diffs?\\|patch\\|rej\\)\\'" . diff-mode)
("\\.\\(dif\\|pat\\)\\'" . diff-mode) ; for MS-DOS
("\\.[eE]?[pP][sS]\\'" . ps-mode)
- ("\\.\\(?:PDF\\|DVI\\|OD[FGPST]\\|DOCX\\|XLSX?\\|PPTX?\\|pdf\\|djvu\\|dvi\\|od[fgpst]\\|docx\\|xlsx?\\|pptx?\\)\\'" . doc-view-mode-maybe)
+ ("\\.\\(?:PDF\\|EPUB\\|CBZ\\|FB2\\|O?XPS\\|DVI\\|OD[FGPST]\\|DOCX\\|XLSX?\\|PPTX?\\|pdf\\|epub\\|cbz\\|fb2\\|o?xps\\|djvu\\|dvi\\|od[fgpst]\\|docx\\|xlsx?\\|pptx?\\)\\'" . doc-view-mode-maybe)
("configure\\.\\(ac\\|in\\)\\'" . autoconf-mode)
("\\.s\\(v\\|iv\\|ieve\\)\\'" . sieve-mode)
("BROWSE\\'" . ebrowse-tree-mode)
--
2.33.1