unofficial mirror of bug-gnu-emacs@gnu.org 
From: Visuwesh <visuweshm@gmail.com>
To: Tassilo Horn <tsdh@gnu.org>
Cc: Eli Zaretskii <eliz@gnu.org>, "Jose A. Ortega Ruiz" <jao@gnu.org>,
	73530@debbugs.gnu.org
Subject: bug#73530: [PATCH] Add imenu index function for Djvu files in doc-view
Date: Wed, 02 Oct 2024 13:49:55 +0530
Message-ID: <87y136dihg.fsf@gmail.com>
In-Reply-To: <875xqb2efq.fsf@gnu.org> (Tassilo Horn's message of "Wed, 02 Oct 2024 08:42:49 +0200")

[-- Attachment #1: Type: text/plain, Size: 6924 bytes --]

[Wednesday, October 02, 2024] Tassilo Horn wrote:

> Visuwesh <visuweshm@gmail.com> writes:
>
> Hi Visuwesh,
>
> [Sorry if this message appears twice but it seems to have bounced
> yesterday.]

[ I did not get the previous mail FYI.  ]

>> Please review the attached.
>
> First of all, the patch doesn't apply on master's NEWS and misc.texi
> here.  If I exclude those, the changes to doc-view.el can be applied.

Oops, I suppose I can no longer be lazy about pulling from the remote.

> Unfortunately, I didn't find a PDF nor DjVu document on my computer
> where an index can be built.  I have the relevant tools installed but
> get the message that no index can be built for that document and
> doc-view--outline becomes 'unavailable.
>
> I've tried various PDFs generated by LaTeX with many sections,
> subsections, etc.

A PDF generated by LaTeX can have an outline that looks wildly different
from what doc-view's regexp matches:

    % mutool show test.pdf outline
    |	"Text"	#nameddest=section.1
    |	"Annotations"	#nameddest=section.2
    |	"Links"	#nameddest=section.3
    |	"Attachments"	#nameddest=section.4
    +	"Outline"	#nameddest=section.5
    +		"subsection"	#nameddest=subsection.5.1
    |			"subsubsection"	#nameddest=subsubsection.5.1.1

Compare it with:

    % mutool show atkins_physical_chemistry.pdf outline
    |	"Cover"	#page=1&view=Fit
    |	"PREFACE"	#page=7&view=Fit
    |	"USING THE BOOK"	#page=8&view=Fit
    |	"ABOUT THE AUTHORS"	#page=12&view=Fit
    |	"ACKNOWLEDGEMENTS"	#page=13&view=Fit
    |	"BRIEF CONTENTS"	#page=15&view=Fit
    |	"FULL CONTENTS"	#page=17&view=Fit
    |	"CONVENTIONS"	#page=27&view=Fit
    |	"LIST OF TABLES"	#page=28&view=Fit
    ...
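
To make the mismatch concrete, here is a minimal sketch that applies the
regexp to one line from each listing.  The regexp string is copied from
`doc-view--outline-rx' as it appears in the patch context; the names
`my-outline-rx' and `my-outline-line-page' are made up for this
illustration and are not part of doc-view.

```elisp
;; Copy of the regexp shown for `doc-view--outline-rx' in the patch context.
(defconst my-outline-rx
  "[^\t]+\\(\t+\\)\"\\(.+\\)\"\t#\\(?:page=\\)?\\([0-9]+\\)")

;; Hypothetical helper: return the page number found in LINE, or nil
;; when LINE is not matched by the regexp at all.
(defun my-outline-line-page (line)
  (when (string-match my-outline-rx line)
    (string-to-number (match-string 3 line))))

;; An entry in the "#page=N" style is matched.
(my-outline-line-page "|\t\"Cover\"\t#page=1&view=Fit")    ; => 1

;; An entry in the "#nameddest=..." style is not.
(my-outline-line-page "|\t\"Text\"\t#nameddest=section.1") ; => nil
```

The LaTeX-style entry fails because the regexp expects digits (optionally
preceded by "page=") right after the "#", whereas "#nameddest=..." starts
with a letter.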


> For DjVu, my sample size is 1, and that's a presentation, so at least
> here I'm not sure if there should be an index available...

I will send the link to the DjVu file that I wrote the feature for
off-list.  I will send a link to a PDF file too.

> That said, I haven't used the imenu feature before so I can't say if it
> ever worked for me...
>
>> diff --git a/doc/emacs/misc.texi b/doc/emacs/misc.texi
>> index e19e554fb26..332d5b1468f 100644
>> --- a/doc/emacs/misc.texi
>> +++ b/doc/emacs/misc.texi
>> @@ -581,17 +581,14 @@ DocView Navigation
>>  default size for DocView, customize the variable
>>  @code{doc-view-resolution}.
>>  
>> -@vindex doc-view-imenu-enabled
>>  @vindex doc-view-imenu-flatten
>>  @vindex doc-view-imenu-format
>> -  When the @command{mutool} program is available, DocView will use it
>> -to generate entries for an outline menu, making it accessible via the
>> -@code{imenu} facility (@pxref{Imenu}).  To disable this functionality
>> -even when @command{mutool} can be found on your system, customize the
>> -variable @code{doc-view-imenu-enabled} to the @code{nil} value.  You
>> -can further customize how @code{imenu} items are formatted and
>> -displayed using the variables @code{doc-view-imenu-format} and
>> -@code{doc-view-imenu-flatten}.
>> +  DocView can generate an outline menu for PDF and Djvu documents using
>
> Didn't Eli say the official spelling was DjVu?  That's at least the
> spelling that the djvused man page also uses and they should know.

Fixed.

>> +the @command{mutool} and the @command{djvused} programs respectively
>> +when they are available.  This is made accessible via the @code{imenu}
>> +facility (@pxref{Imenu}).  You can customize how @code{imenu} items are
>> +formatted and displayed using the variables @code{doc-view-imenu-format}
>> +and @code{doc-view-imenu-flatten}.
>
> I guess you should mention the new defcustom doc-view-djvused-program
> here, too.

Done.

On this note, should we use doc-view-pdfdraw-program in place of mutool
in doc-view--pdf-outline?

>> +(defcustom doc-view-imenu-enabled (and (or (executable-find "mutool")
>> +                                           (executable-find "djvused"))
>> +                                       t)
>> +  "Whether to generate imenu outline for PDF and Djvu files.
>> +This uses \"mutool\" for PDF files and \"djvused\" for Djvu files."
>>    :type 'boolean
>> -  :version "29.1")
>> +  :version "31.1")
>> +(make-obsolete-variable 'doc-view-imenu-enabled
>> +   "Imenu index is generated unconditionally, when available"
>> +   "31.1")
>
> Ah, I thought our last agreement was that we keep that variable (as
> suggested by Jose) as it is used right now but make it possible to have
> a value that tells to index only PDF or DjVu documents.

Ahh, I misunderstood the suggestion.
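
For reference, that alternative could look roughly like the sketch
below.  Everything in it is hypothetical: `my-doc-view-imenu-enabled'
and `my-doc-view-imenu-enabled-p' are illustrative names, and neither
the attached patch nor doc-view defines them.

```elisp
;; Hypothetical option: t indexes every supported type, nil disables
;; indexing, and a list such as (pdf) or (djvu) restricts it.
(defcustom my-doc-view-imenu-enabled t
  "Whether, and for which document types, to generate an imenu outline."
  :type '(choice (const :tag "All supported document types" t)
                 (const :tag "Never" nil)
                 (set :tag "Only these document types"
                      (const pdf) (const djvu)))
  :group 'doc-view)

;; Hypothetical predicate a caller could consult before building the index.
(defun my-doc-view-imenu-enabled-p (doc-type)
  "Return non-nil if an outline should be generated for DOC-TYPE."
  (or (eq my-doc-view-imenu-enabled t)
      (and (listp my-doc-view-imenu-enabled)
           (memq doc-type my-doc-view-imenu-enabled)
           t)))

;; With the value (djvu), only DjVu documents would be indexed.
(let ((my-doc-view-imenu-enabled '(djvu)))
  (list (my-doc-view-imenu-enabled-p 'pdf)
        (my-doc-view-imenu-enabled-p 'djvu))) ; => (nil t)
```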

> Well, I actually have no strong opinion here.  Technically, I like your
> approach better because of its simplicity.  I would like to test with
> some larger documents to see how long index building takes, though.

I tried the function with a large PDF file:

    % time mutool show atkins_physical_chemistry.pdf outline >/dev/null
        0m00.32s real     0m00.30s user     0m00.02s system
    % time mutool show atkins_physical_chemistry.pdf outline >/dev/null
        0m00.30s real     0m00.26s user     0m00.03s system
    % mutool show atkins_physical_chemistry.pdf outline |wc -l
    925
    % du -h atkins_physical_chemistry.pdf 
    97M	atkins_physical_chemistry.pdf

    (benchmark-run 10
      (doc-view--pdf-outline "~/doc/uni/refb/atkins_physical_chemistry.pdf"))
      ;; => (3.0118861719999996 0 0.0)

    (benchmark-run 1
      (doc-view--pdf-outline "~/doc/uni/refb/atkins_physical_chemistry.pdf"))
      ;; => (0.306343039 0 0.0)

which honestly isn't long to wait the first time you type M-g i.

Now for the DjVu file that I was testing on:

    % time djvused -e print-outline Solid_State_Physics_Ashcroft.djvu >/dev/null
        0m00.24s real     0m00.23s user     0m00.01s system
    % djvused -e print-outline Solid_State_Physics_Ashcroft.djvu |wc -l
    115
    % du -sh Solid_State_Physics_Ashcroft.djvu 
    83M	Solid_State_Physics_Ashcroft.djvu

    (benchmark-run 10
      (doc-view--djvu-outline "~/tmp/Solid_State_Physics_Ashcroft.djvu"))
      ;; => (2.2234427809999997 0 0.0)

    (benchmark-run 1
      (doc-view--djvu-outline "~/tmp/Solid_State_Physics_Ashcroft.djvu"))
      ;; => (0.239040117 0 0.0)

IIRC, there's a DjVu file stashed somewhere in my home directory that
had an index.  I can benchmark making the index for that file too if
you want.

For comparison, in my init.el, where (length imenu--index-alist) is 852:

    (benchmark-run 10
      (setq imenu--index-alist nil)
      (imenu--make-index-alist)) ;; => (7.113529254 0 0.0)

with REPETITIONS=1, I get (0.854962398 0 0.0).

In conclusion, the waiting time is barely an inconvenience.
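
The per-call figures behind that conclusion can be read off the
ten-repetition runs.  A small sketch, relying only on the fact that the
first element of the `benchmark-run' result is the total elapsed time
in seconds; `my-per-call-seconds' is a name invented for this
illustration.

```elisp
;; Hypothetical helper: average elapsed seconds per call, given a
;; `benchmark-run' RESULT of the form (ELAPSED GC-COUNT GC-TIME).
(defun my-per-call-seconds (result repetitions)
  (/ (car result) (float repetitions)))

;; Results quoted above, each from 10 repetitions.
(my-per-call-seconds '(3.0118861719999996 0 0.0) 10) ; about 0.30 s, PDF via mutool
(my-per-call-seconds '(2.2234427809999997 0 0.0) 10) ; about 0.22 s, DjVu via djvused
(my-per-call-seconds '(7.113529254 0 0.0) 10)        ; about 0.71 s, init.el imenu index
```

So both outline builders take well under half the time of the imenu
index for the init.el file measured here.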

> Anyhow, please write a complete sentence in the deprecation, so a dot at
> the end.  And remove the comma.

Done.


[-- Attachment #2: 0001-Add-imenu-index-function-for-DjVu-files-in-doc-view.patch --]
[-- Type: text/x-diff, Size: 9829 bytes --]

From 54c6050fa054dfb23b9f73c46661bfb8c69cc931 Mon Sep 17 00:00:00 2001
From: Visuwesh <visuweshm@gmail.com>
Date: Wed, 2 Oct 2024 13:48:25 +0530
Subject: [PATCH] Add imenu index function for DjVu files in doc-view

* lisp/doc-view.el (doc-view-imenu-enabled): Tweak the default
value to check for 'djvused', and make it obsolete.
(doc-view--djvu-outline, doc-view--parse-djvu-outline): Add new
functions to return imenu index for a DjVu file.
(doc-view--outline): Add new function to create the imenu index
depending on the file type.
(doc-view--outline): Document new possible variable value.
(doc-view-imenu-index): Use the above function instead.
(doc-view-imenu-setup): Try to create the imenu index
unconditionally.
* doc/emacs/misc.texi (DocView Navigation): Mention index
creation using 'djvused' too.
* etc/NEWS: Announce the change.  (Bug#73530)
---
 doc/emacs/misc.texi |  18 ++++----
 etc/NEWS            |   7 +++
 lisp/doc-view.el    | 101 +++++++++++++++++++++++++++++++++++++-------
 3 files changed, 102 insertions(+), 24 deletions(-)

diff --git a/doc/emacs/misc.texi b/doc/emacs/misc.texi
index b074eb034b2..7b11a829b0b 100644
--- a/doc/emacs/misc.texi
+++ b/doc/emacs/misc.texi
@@ -581,17 +581,17 @@ DocView Navigation
 default size for DocView, customize the variable
 @code{doc-view-resolution}.
 
-@vindex doc-view-imenu-enabled
 @vindex doc-view-imenu-flatten
 @vindex doc-view-imenu-format
-  When the @command{mutool} program is available, DocView will use it
-to generate entries for an outline menu, making it accessible via the
-@code{imenu} facility (@pxref{Imenu}).  To disable this functionality
-even when @command{mutool} can be found on your system, customize the
-variable @code{doc-view-imenu-enabled} to the @code{nil} value.  You
-can further customize how @code{imenu} items are formatted and
-displayed using the variables @code{doc-view-imenu-format} and
-@code{doc-view-imenu-flatten}.
+@vindex doc-view-djvused-program
+  DocView can generate an outline menu for PDF and DjVu documents using
+the @command{mutool} and the @command{djvused} programs respectively
+when they are available.  This is made accessible via the @code{imenu}
+facility (@pxref{Imenu}).  You can customize how @code{imenu} items are
+formatted and displayed using the variables @code{doc-view-imenu-format}
+and @code{doc-view-imenu-flatten}.  The filename of the
+@command{djvused} program can be customized by changing the
+@code{doc-view-djvused-program} user option.
 
 @cindex registers, in DocView mode
 @findex doc-view-page-to-register
diff --git a/etc/NEWS b/etc/NEWS
index abe316547aa..bbcef80b762 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -351,6 +351,13 @@ Docview can store current page to buffer-local registers with the new
 command 'doc-view-page-to-register' (bound to 'm'), and later the stored
 page can be restored with 'doc-view-jump-to-register' (bound to ''').
 
++++
+*** Docview can generate imenu index for DjVu files.
+When the 'djvused' program is available, Docview can now generate imenu
+index for DjVu files from its outline.
+The name of the 'djvused' program can be customized by changing the user
+option 'doc-view-djvused-program'.
+
 ** Tramp
 
 +++
diff --git a/lisp/doc-view.el b/lisp/doc-view.el
index e79295a8b01..6aa90926465 100644
--- a/lisp/doc-view.el
+++ b/lisp/doc-view.el
@@ -27,8 +27,10 @@
 ;; `pdftotext', which comes with xpdf (https://www.foolabs.com/xpdf/)
 ;; or poppler (https://poppler.freedesktop.org/). EPUB, CBZ, FB2, XPS
 ;; and OXPS documents require `mutool' which comes with mupdf
-;; (https://mupdf.com/index.html). Djvu documents require `ddjvu'
+;; (https://mupdf.com/index.html). DjVu documents require `ddjvu'
 ;; (from DjVuLibre).  ODF files require `soffice' (from LibreOffice).
+;; `djvused' (from DjVuLibre) can be optionally used to generate imenu
+;; outline for DjVu documents when available.
 
 ;;; Commentary:
 
@@ -216,10 +218,23 @@ doc-view-mupdf-use-svg
   :type 'boolean
   :version "30.1")
 
-(defcustom doc-view-imenu-enabled (and (executable-find "mutool") t)
-  "Whether to generate an imenu outline when \"mutool\" is available."
+(defcustom doc-view-djvused-program (and (executable-find "djvused")
+                                         "djvused")
+  "Name of \"djvused\" program to generate imenu outline for DjVu files.
+This is part of DjVuLibre."
+  :type 'file
+  :version "31.1")
+
+(defcustom doc-view-imenu-enabled (and (or (executable-find "mutool")
+                                           (executable-find "djvused"))
+                                       t)
+  "Whether to generate imenu outline for PDF and DjVu files.
+This uses \"mutool\" for PDF files and \"djvused\" for DjVu files."
   :type 'boolean
-  :version "29.1")
+  :version "31.1")
+(make-obsolete-variable 'doc-view-imenu-enabled
+   "Imenu index is generated unconditionally when available."
+   "31.1")
 
 (defcustom doc-view-imenu-title-format "%t (%p)"
   "Format spec for imenu's display of section titles from docview documents.
@@ -1958,7 +1973,9 @@ doc-view--outline-rx
   "[^\t]+\\(\t+\\)\"\\(.+\\)\"\t#\\(?:page=\\)?\\([0-9]+\\)")
 
 (defvar-local doc-view--outline nil
-  "Cached PDF outline, so that it is only computed once per document.")
+  "Cached document outline, so that it is only computed once per document.
+It can be the symbol `unavailable' to indicate that outline is
+unavailable for the document.")
 
 (defun doc-view--pdf-outline (&optional file-name)
   "Return a list describing the outline of FILE-NAME.
@@ -1973,6 +1990,7 @@ doc-view--pdf-outline
             (fn (expand-file-name fn)))
         (with-temp-buffer
           (unless (eql 0 (call-process "mutool" nil (current-buffer) nil "show" fn "outline"))
+            (setq doc-view--outline 'unavailable)
             (imenu-unavailable-error "Unable to create imenu index using `mutool'"))
           (goto-char (point-min))
           (while (re-search-forward doc-view--outline-rx nil t)
@@ -1983,6 +2001,42 @@ doc-view--pdf-outline
                   outline)))
         (nreverse outline)))))
 
+(defun doc-view--djvu-outline (&optional file-name)
+  "Return a list describing the outline of FILE-NAME.
+If FILE-NAME is nil or omitted, it defaults to the current buffer's file
+name.
+
+For the format, see `doc-view--pdf-outline'."
+  (unless file-name (setq file-name (buffer-file-name)))
+  (with-temp-buffer
+    (call-process doc-view-djvused-program nil (current-buffer) nil
+                  "-e" "print-outline" file-name)
+    (goto-char (point-min))
+    (when (eobp)
+      (setq doc-view--outline 'unavailable)
+      (imenu-unavailable-error "Unable to create imenu index using `djvused'"))
+    (nreverse (doc-view--parse-djvu-outline (read (current-buffer))))))
+
+(defun doc-view--parse-djvu-outline (bookmark &optional level)
+  "Return a list describing the DjVu outline from BOOKMARK.
+Optional argument LEVEL is the current heading level, which defaults to 1."
+  (unless level (setq level 1))
+  (let ((res))
+    (unless (eq (car bookmark) 'bookmarks)
+      (user-error "Unknown outline type: %S" (car bookmark)))
+    (pcase-dolist (`(,title ,page . ,rest) (cdr bookmark))
+      (push `((level . ,level)
+              (title . ,title)
+              (page . ,(string-to-number (string-remove-prefix "#" page))))
+            res)
+      (when (and rest (listp (car rest)))
+        (setq res (append
+                   (doc-view--parse-djvu-outline
+                    (cons 'bookmarks rest)
+                    (+ level 1))
+                   res))))
+    res))
+
 (defun doc-view--imenu-subtree (outline act)
   "Construct a tree of imenu items for the given outline list and action.
 
@@ -2015,19 +2069,36 @@ doc-view-imenu-index
 For extensibility, callers can specify a FILE-NAME to indicate
 the buffer other than the current buffer, and a jumping function
 GOTO-PAGE-FN other than `doc-view-goto-page'."
-  (let* ((goto (or goto-page-fn 'doc-view-goto-page))
-         (act (lambda (_name _pos page) (funcall goto page)))
-         (outline (or doc-view--outline (doc-view--pdf-outline file-name))))
-    (car (doc-view--imenu-subtree outline act))))
+  (unless doc-view--outline
+    (setq doc-view--outline (doc-view--outline file-name)))
+  (unless (eq doc-view--outline 'unavailable)
+    (let* ((goto (or goto-page-fn #'doc-view-goto-page))
+           (act (lambda (_name _pos page) (funcall goto page)))
+           (outline doc-view--outline))
+      (car (doc-view--imenu-subtree outline act)))))
+
+(defun doc-view--outline (&optional file-name)
+  "Return the outline for the file FILE-NAME.
+If FILE-NAME is nil, use the current file instead."
+  (unless file-name (setq file-name (buffer-file-name)))
+  (let ((outline
+         (pcase doc-view-doc-type
+           ('djvu
+            (when doc-view-djvused-program
+              (doc-view--djvu-outline file-name)))
+           (_
+            (doc-view--pdf-outline file-name)))))
+    (when outline (imenu-add-to-menubar "Outline"))
+    ;; When the outline could not be made because the required program
+    ;; is unavailable or the document has no outline, return
+    ;; 'unavailable'.
+    (or outline 'unavailable)))
 
 (defun doc-view-imenu-setup ()
   "Set up local state in the current buffer for imenu, if needed."
-  (when doc-view-imenu-enabled
-    (setq-local imenu-create-index-function #'doc-view-imenu-index
-                imenu-submenus-on-top nil
-                imenu-sort-function nil
-                doc-view--outline (doc-view--pdf-outline))
-    (when doc-view--outline (imenu-add-to-menubar "Outline"))))
+  (setq-local imenu-create-index-function #'doc-view-imenu-index
+              imenu-submenus-on-top nil
+              imenu-sort-function nil))
 
 ;;;; User interface commands and the mode
 
-- 
2.45.2
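
For readers who want to try the flattening step without a DjVu file at
hand, here is a standalone sketch.  It is not the patch's code:
`my-djvu-outline-flatten' is a made-up name, the sample string is
invented, and it assumes `djvused -e print-outline' prints a
`(bookmarks ("TITLE" "#PAGE" CHILD...) ...)' form with numeric page
references, which is the shape `doc-view--parse-djvu-outline' above
expects.

```elisp
;; Hypothetical standalone variant of the flattening step.  Returns the
;; entries in document order as alists with `level', `title' and `page'.
(defun my-djvu-outline-flatten (bookmarks &optional level)
  "Flatten BOOKMARKS, a parsed (bookmarks ...) form, into a list of alists.
LEVEL is the heading depth and defaults to 1."
  (let ((level (or level 1))
        (result '()))
    (dolist (entry (cdr bookmarks))
      (let ((title (nth 0 entry))
            (page (nth 1 entry))
            (children (nthcdr 2 entry)))
        (push `((level . ,level)
                (title . ,title)
                ;; Assumes a "#N" page reference; strip the "#" if present.
                (page . ,(string-to-number
                          (if (string-prefix-p "#" page)
                              (substring page 1)
                            page))))
              result)
        ;; Child entries follow the title and page in the same list.
        (dolist (child (my-djvu-outline-flatten (cons 'bookmarks children)
                                                (1+ level)))
          (push child result))))
    (nreverse result)))

;; Invented sample in the assumed output format.
(my-djvu-outline-flatten
 (car (read-from-string
       "(bookmarks (\"Cover\" \"#1\")
                   (\"Chapter 1\" \"#5\" (\"Section 1.1\" \"#6\")))")))
;; => (((level . 1) (title . "Cover") (page . 1))
;;     ((level . 1) (title . "Chapter 1") (page . 5))
;;     ((level . 2) (title . "Section 1.1") (page . 6)))
```

A page reference that is not of the form "#N" would come out as page 0
in this sketch, since `string-to-number' returns 0 for non-numeric
input.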

