From: John Kitchin <jkitchin@andrew.cmu.edu>
To: "Bruce D'Arcus" <bdarcus@gmail.com>
Cc: org-mode-email <emacs-orgmode@gnu.org>
Subject: Re: [PATCH] oc-basic: Detect malformed bibtex bibliographies
Date: Wed, 20 Apr 2022 08:44:32 -0400 [thread overview]
Message-ID: <CAJ51ETp67jayCmpK2nYjD3t5a8OhJP8dKCFu9HW_bvqZB7Dz3A@mail.gmail.com> (raw)
In-Reply-To: <CAF-FPGMyU70+NaaLCkFsfL_r3Bmy=M+HqH=t4JNoyt1UVgNhuw@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 5562 bytes --]
I would see if you can cache the result and not do it more than needed; it
can add a performance issue on large files.
John
-----------------------------------
Professor John Kitchin (he/him/his)
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu
On Wed, Apr 20, 2022 at 8:39 AM Bruce D'Arcus <bdarcus@gmail.com> wrote:
> On Wed, Apr 20, 2022 at 8:28 AM Ihor Radchenko <yantar92@gmail.com> wrote:
> >
> > There is an edge case triggering infinite loop in oc-basic.
> >
> > It is caused by bibtex-map-entries (used in
> > org-cite-basic--parse-bibtex) when ran on a malformed bibtex buffer [1].
> >
> > The proposed patch validates the bibtex buffer before processing and
> > throws an error if issues are found. `bibtex-validate` also
> > conveniently displays a list of errors with clickable links to
> > problematic lines.
> >
> > I believe that it is useful for the users to see such issues instead of,
> > say, failing silently on malformed bibliographies.
> >
> > WDYT?
>
> I haven't tested it, but this is an excellent idea!
>
> Bruce
>
> > Best,
> > Ihor
> >
> > [1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=55036
> >
> > * lisp/oc-basic.el (org-cite-basic--parse-bibtex): Validate the
> > bibliography before parsing. Display list of issues if any (via
> > `bibtex-validate`).
> > (org-cite-basic--parse-bibliography): Set buffer file name needed by
> > `bibtex-validate`. Empty the cache in case of error.
> > ---
> > lisp/oc-basic.el | 38 ++++++++++++++++++++++++--------------
> > 1 file changed, 24 insertions(+), 14 deletions(-)
> >
> > diff --git a/lisp/oc-basic.el b/lisp/oc-basic.el
> > index 873986d07..79f7a4844 100644
> > --- a/lisp/oc-basic.el
> > +++ b/lisp/oc-basic.el
> > @@ -214,6 +214,10 @@ (defun org-cite-basic--parse-bibtex (dialect)
> > (let ((entries (make-hash-table :test #'equal))
> > (bibtex-sort-ignore-string-entries t))
> > (bibtex-set-dialect dialect t)
> > + ;; Throw an error if bibliography is malformed.
> > + (unless (bibtex-validate)
> > + (user-error "Malformed bibliography at %S"
> > + (or (buffer-file-name) (current-buffer))))
> > (bibtex-map-entries
> > (lambda (key &rest _)
> > ;; Normalize entries: field names are turned into symbols
> > @@ -258,21 +262,27 @@ (defun org-cite-basic--parse-bibliography
> (&optional info)
> > (when (or (org-file-has-changed-p file)
> > (not (gethash file
> org-cite-basic--file-id-cache)))
> > (insert-file-contents file)
> > + (setf (buffer-file-name) file)
> > (puthash file (org-buffer-hash)
> org-cite-basic--file-id-cache))
> > - (let* ((file-id (cons file (gethash file
> org-cite-basic--file-id-cache)))
> > - (entries
> > - (or (cdr (assoc file-id
> org-cite-basic--bibliography-cache))
> > - (let ((table
> > - (pcase (file-name-extension file)
> > - ("json" (org-cite-basic--parse-json))
> > - ("bib" (org-cite-basic--parse-bibtex
> 'biblatex))
> > - ("bibtex"
> (org-cite-basic--parse-bibtex 'BibTeX))
> > - (ext
> > - (user-error "Unknown bibliography
> extension: %S"
> > - ext)))))
> > - (push (cons file-id table)
> org-cite-basic--bibliography-cache)
> > - table))))
> > - (push (cons file entries) results)))))
> > + (unwind-protect
> > + (condition-case nil
> > + (unwind-protect
> > + (let* ((file-id (cons file (gethash file
> org-cite-basic--file-id-cache)))
> > + (entries
> > + (or (cdr (assoc file-id
> org-cite-basic--bibliography-cache))
> > + (let ((table
> > + (pcase (file-name-extension
> file)
> > + ("json"
> (org-cite-basic--parse-json))
> > + ("bib"
> (org-cite-basic--parse-bibtex 'biblatex))
> > + ("bibtex"
> (org-cite-basic--parse-bibtex 'BibTeX))
> > + (ext
> > + (user-error "Unknown
> bibliography extension: %S"
> > + ext)))))
> > + (push (cons file-id table)
> org-cite-basic--bibliography-cache)
> > + table))))
> > + (push (cons file entries) results))
> > + (setf (buffer-file-name) nil))
> > + (error (setq org-cite-basic--file-id-cache nil)))))))
> > (when info (plist-put info :cite-basic/bibliography results))
> > results)))
> >
> > --
> > 2.35.1
> >
> >
> >
> > --
> > Ihor Radchenko,
> > PhD,
> > Center for Advancing Materials Performance from the Nanoscale (CAMP-nano)
> > State Key Laboratory for Mechanical Behavior of Materials, Xi'an
> Jiaotong University, Xi'an, China
> > Email: yantar92@gmail.com, ihor_radchenko@alumni.sutd.edu.sg
> >
>
>
[-- Attachment #2: Type: text/html, Size: 7735 bytes --]
next prev parent reply other threads:[~2022-04-20 12:47 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-20 12:28 [PATCH] oc-basic: Detect malformed bibtex bibliographies Ihor Radchenko
2022-04-20 12:38 ` Bruce D'Arcus
2022-04-20 12:44 ` John Kitchin [this message]
2022-04-20 13:02 ` Ihor Radchenko
2022-07-09 9:18 ` Ihor Radchenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.orgmode.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAJ51ETp67jayCmpK2nYjD3t5a8OhJP8dKCFu9HW_bvqZB7Dz3A@mail.gmail.com \
--to=jkitchin@andrew.cmu.edu \
--cc=bdarcus@gmail.com \
--cc=emacs-orgmode@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).