From: Visuwesh
Newsgroups: gmane.emacs.bugs
Subject: bug#73638: 31.0.50; doc-view: imenu index cannot be made for LaTeX PDFs
Date: Mon, 07 Oct 2024 18:23:39 +0530
Message-ID: <87v7y43wh8.fsf@gmail.com>
References: <87ploebyhc.fsf@gmail.com> <87ed4upbmf.fsf@gnu.org>
 <86plodvlcm.fsf@gnu.org> <87ttdphhir.fsf@gmail.com>
 <87jzel3aus.fsf@gnu.org> <87plodh68g.fsf@gmail.com>
 <87plod8ob3.fsf@gnu.org> <8734l9h0nj.fsf@gmail.com>
 <87y130z988.fsf@gnu.org> <87bjzw5knb.fsf@gmail.com>
 <87zfng44px.fsf@gmail.com> <87set8yy21.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
User-Agent: Gnus/5.13 (Gnus v5.13)
Cc: Eli Zaretskii, 73638@debbugs.gnu.org
To: Tassilo Horn
In-Reply-To: <87set8yy21.fsf@gnu.org>
 (Tassilo Horn's message of "Mon, 07 Oct 2024 13:03:50 +0200")
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"

--=-=-=
Content-Type: text/plain; charset=utf-8

[Monday October 07, 2024] Tassilo Horn wrote:

> Visuwesh writes:
>
>>>> Maybe it would also a good idea to use a :stderr buffer with
>>>> make-process and put its contents into the imenu-unavailable-error.
>>>> That way, chances are better we get the reason for failure delivered
>>>> in bug reports.
>>>
>>> I do not think it is worth the trouble since only syntax errors are
>>> likely to surface up in stderr which would be very unlikely.  If the
>>> PDF file does not have an outline, there would be nothing printed by
>>> our script so end-of-file error should catch that case.
>>
>> Actually, this wasn't quite correct I think.  We would have stray > in
>> the buffer and read would return the symbol >.  I corrected that in
>> the attached.
>
> The patch looks good.  But during testing, it seems that the index is
> always off by one page, i.e., the index for some section brings me to
> page 117 but the section heading is actually on page 118.
>
> I have that both with the Peter Atkins et al. book you suggested as well
> as with own papers which didn't work at all previously due to #nameddest
> references.

Ugghhh, looks like the page number returned by the JS function is
zero-indexed.
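
For illustration only, here is a minimal sketch of the zero-indexed to
one-indexed conversion when the page number is spliced into a printed
string.  It is not the script from the patch: `formatPage' and
`formatPageUnparenthesized' are hypothetical names, and the plain
number argument stands in for whatever a zero-indexed page resolver
returns.  The thing to watch is that JS evaluates `+' left to right, so
the addition needs its own parentheses once a string is involved.

```javascript
// Hypothetical helper: formats a page number as a Lisp-style pair.
// `zeroBasedPage` stands in for the value returned by a zero-indexed
// page resolver (an assumption made for this sketch).
function formatPage(zeroBasedPage) {
  // The parentheses force numeric addition before string concatenation.
  return "(page . " + (zeroBasedPage + 1) + ")";
}

// Same expression without the parentheses: `+` is evaluated left to
// right, so the 1 is appended as text instead of being added.
function formatPageUnparenthesized(zeroBasedPage) {
  return "(page . " + zeroBasedPage + 1 + ")";
}

console.log(formatPage(117));                // (page . 118)
console.log(formatPageUnparenthesized(117)); // (page . 1171)
```

With the numbers from the report above, a zero-indexed 117 has to come
out as page 118, which only the parenthesized form produces.
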
Thanks for the catch (and sorry for the many mistakes and hence the
back-and-forth).  Should be corrected in the attached patch.

--=-=-=
Content-Type: text/x-diff
Content-Disposition: attachment; filename=0001-Make-imenu-index-generation-for-PDFs-more-reliable.patch

>From 84563a74cc2fba7279153f08d442b69c2977f2b4 Mon Sep 17 00:00:00 2001
From: Visuwesh
Date: Sun, 6 Oct 2024 18:02:06 +0530
Subject: [PATCH] Make imenu index generation for PDFs more reliable

Do away with parsing the output of "mutool show FILE outline" since the
URI reported in its output may not include the page number of the
heading, and instead may contain "nameddest" elements which cannot be
resolved using "mutool".  Instead, use a MuPDF JS script to generate the
PDF outline, so that such URIs can be resolved.

* lisp/doc-view.el (doc-view--outline-rx): Remove as no longer needed.
(doc-view--outline): Reflect that outline can be generated for
non-PDF files too.
(doc-view--mutool-pdf-outline-script): Add new variable to hold the JS
script used to generate the outline.
(doc-view--pdf-outline): Use the script.  (bug#73638)
---
 lisp/doc-view.el | 48 ++++++++++++++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 16 deletions(-)

diff --git a/lisp/doc-view.el b/lisp/doc-view.el
index 446beeafd9f..a49cbc69717 100644
--- a/lisp/doc-view.el
+++ b/lisp/doc-view.el
@@ -1969,14 +1969,26 @@ doc-view-search-previous-match
 	(doc-view-goto-page (caar (last doc-view--current-search-matches)))))))
 
 ;;;; Imenu support
-(defconst doc-view--outline-rx
-  "[^\t]+\\(\t+\\)\"\\(.+\\)\"\t#\\(?:page=\\)?\\([0-9]+\\)")
-
 (defvar-local doc-view--outline nil
-  "Cached PDF outline, so that it is only computed once per document.
+  "Cached document outline, so that it is only computed once per document.
 It can be the symbol `unavailable' to indicate that outline is
 unavailable for the document.")
 
+(defvar doc-view--mutool-pdf-outline-script
+  "var document = new Document.openDocument(\"%s\", \"application/pdf\");
+var outline = document.loadOutline();
+if(!outline) quit();
+function pp(outl, level){print(\"((level . \" + level + \")\");\
+print(\"(title . \" + repr(outl.title) + \")\");\
+print(\"(page . \" + (document.resolveLink(outl.uri)+1) + \"))\");\
+if(outl.down){for(var i=0; i