From: Richard Lawrence
Subject: Citation processing via Zotero + zotxt
Date: Sat, 28 Nov 2015 12:16:07 -0800
Message-ID: <87wpt1yj5k.fsf@berkeley.edu>
To: emacs-orgmode@gnu.org

Hi everyone,

For the past few days, I've been looking more closely at using the
combination of Zotero [1] with Erik Hetzner's zotxt plugin [2] as a
means of processing citations when exporting to non-LaTeX backends. I
am now thinking that this is probably our best option, but I'd like to
know what other people think before I sink a lot of work into it.

Here are the reasons I think this is the best option:

1) It is really easy for users.

For those unfamiliar, Zotero is a reference manager, and zotxt is a
Zotero plugin that makes it easier to work with Zotero from plain text
documents. Both are Firefox plugins, which means they can be installed
by a non-technical user with a couple of clicks. It also means that
users get updates automatically.

I think this is *really* important. Pretty much all the other options
we have talked about seem like they will require multi-step,
non-trivial installation procedures ("First install
{Node.js/Haskell/JVM ...}, then install
{citeproc-node/pandoc-citeproc/citeproc-java...}, then install our
wrapper script..."). Updating could require other manual operations of
similar complexity. Avoiding that kind of procedure will make citations
a lot more usable from Org for everyone.

Also, unlike the other options, Zotero is a full-featured reference
manager, not just a batch processor. So we as users get a useful piece
of software with a simple installation procedure; the other options
require a complex installation procedure for a less-useful program.

2) It is quite complete.

Previously, I thought that communicating with Zotero from Emacs would
take about as much work as communicating with any of the other CSL
implementations out there. However, after looking at zotxt a bit more
closely, I discovered that it has an (undocumented) API endpoint [3]
that does pretty much exactly what we need: it accepts a list of
citation objects, and returns a list of formatted citations and a
formatted bibliography, which can be inserted into the exported
document.

This endpoint still needs a little bit of work, to generalize it and
make it easier to get the data in the format we need. (That is probably
why it is undocumented in the README.) But it requires much less work
than I thought it would, and much less work than it would take to get a
full-featured setup with something like citeproc-node.

Erik has also written a package for communicating with zotxt from
Emacs, zotxt-emacs [4], which is available on MELPA. This package
already contains a lot of useful functions for querying the Zotero
database and inserting reference data into documents, including links
in Org documents. I think it would be pretty straightforward to extend
this package to provide a nice UI for writers who are inserting
citations into Org documents, including search-based lookups of keys,
etc. Perhaps org-ref could also be taught to communicate with zotxt
(with or without zotxt-emacs) without too much work.

3) It uses citeproc-js.

In previous discussions, I think we agreed that it would be best for us
to use citeproc-js as a CSL processor, since it is the `canonical' CSL
implementation, as opposed to pandoc-citeproc or citeproc-java. Zotero
just uses citeproc-js internally to process citations, so it meets this
requirement.

I know that many people (perhaps especially the `power users' who have
been active in the citations discussion so far) prefer to maintain
their reference database without the aid of a GUI reference manager
like Zotero. I still think Zotero + zotxt is the best option for
non-LaTeX citation processing, even for these folks. The ease of
installation (and removal) of the required programs alone makes it
worth it, even if you never actually populate a Zotero database.

So given what I know at the moment, I think our efforts would best be
directed at making the in-progress org-cite library communicate with
Zotero via zotxt.

What do you think?

Best,
Richard

[1] https://www.zotero.org/
[2] https://gitlab.com/egh/zotxt/
[3] See the bibliographyEndpoint function in:
    https://gitlab.com/egh/zotxt/blob/master/extension/bootstrap.js
[4] https://gitlab.com/egh/zotxt-emacs