From: JD Smith <jdsmith@as.arizona.edu>
Subject: Serialize list to disk
Date: Tue, 20 Dec 2005 17:42:45 -0700
Message-ID: <pan.2005.12.21.00.42.43.507037@as.arizona.edu>


I have a large (1MB) XML routine definition file which can be parsed
with xml-parse-file into a fairly large list.  I need to walk over
this list and transform it in place to a more useful internal format.
All of this takes 5-7 seconds on reasonably speedy machines (about 4
seconds of that in xml-parse-file alone).  This list must be read in
during each Emacs session in which my IDLWAVE programming mode is
used, but the list is constant, so it would seem natural to perform
the XML conversion just once, and then serialize the list to disk in
some compact binary (but hopefully portable) format for future
sessions.  Before it was reborn as XML, this data structure was
auto-generated in Lisp and byte-compiled, and it loaded in a fraction
of a second.  That is no longer an option, so I'm stuck with the XML.

Is there a simple input/output mechanism in Emacs for static list
structures that would allow me to write a compact representation to
disk and quickly recover it in future sessions, without going through
the lengthy XML conversion process?  One wrinkle: much of this data
structure is repetitive, so to save memory and time, many of the
strings inside it are "sinterned" into custom hash variables.
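
For illustration, here is a minimal sketch of the kind of round trip
being asked about, using only the built-in printer and reader (prin1
and read).  The function names my-cache-write and my-cache-read are
hypothetical, not part of IDLWAVE or Emacs.  Binding print-circle is
an assumption about what the "sinterned" strings need: it is intended
to make the printer label shared objects so that they can come back
shared, but whether that is enough for the custom hash variables has
not been checked here.

```elisp
;; Hypothetical helpers; only prin1, read, and the print-* variables
;; are Emacs built-ins.
(defun my-cache-write (data file)
  "Write DATA to FILE in a form that `read' can parse back."
  (with-temp-file file
    (let ((print-length nil)   ; do not truncate long lists
          (print-level nil)    ; do not truncate deep nesting
          (print-circle t))    ; label shared objects with #N= / #N#
      (prin1 data (current-buffer)))))

(defun my-cache-read (file)
  "Return the Lisp object stored in FILE by `my-cache-write'."
  (with-temp-buffer
    (insert-file-contents file)
    (goto-char (point-min))
    (read (current-buffer))))
```

A session would then call my-cache-read on the cache file when it
exists, and fall back to xml-parse-file followed by my-cache-write
when it does not.  The result is a text file rather than a binary
one, so it trades some compactness for portability.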

The list is basically many, many cells similar to:

<ROUTINE name="WV_CWT" link="WV_CWT.html">
  <SYNTAX name="Result = WV_CWT(Array, Family, Order )" type="func" />
  <ARGUMENT name="Array" link="WV_CWT.html#wp1009345" />
  <ARGUMENT name="Family" link="WV_CWT.html#wp1009347" />
  <ARGUMENT name="Order" link="WV_CWT.html#wp1009349" />
  <KEYWORD name="DOUBLE" link="WV_CWT.html#wp1009356" />
  <KEYWORD name="DSCALE" link="WV_CWT.html#wp1009358" />
  <KEYWORD name="NSCALE" link="WV_CWT.html#wp1009484" />
  <KEYWORD name="PAD" link="WV_CWT.html#wp1009489" />
  <KEYWORD name="SCALE" link="WV_CWT.html#wp1009494" />
  <KEYWORD name="START_SCALE" link="WV_CWT.html#wp1009568" />
</ROUTINE>

which gets lispified as:

(ROUTINE ((name . "WV_CWT") (link . "WV_CWT.html"))
 (SYNTAX ((name . "Result = WV_CWT(Array, Family, Order )")
          (type . "func")) "")
 (ARGUMENT ((name . "Array") (link . "WV_CWT.html#wp1009345")) "")
 (ARGUMENT ((name . "Family") (link . "WV_CWT.html#wp1009347")) "")
 (ARGUMENT ((name . "Order") (link . "WV_CWT.html#wp1009349")) "")
 (KEYWORD ((name . "DOUBLE") (link . "WV_CWT.html#wp1009356")) "")
 (KEYWORD ((name . "DSCALE") (link . "WV_CWT.html#wp1009358")) "")
 (KEYWORD ((name . "NSCALE") (link . "WV_CWT.html#wp1009484")) "")
 (KEYWORD ((name . "PAD") (link . "WV_CWT.html#wp1009489")) "")
 (KEYWORD ((name . "SCALE") (link . "WV_CWT.html#wp1009494")) "")
 (KEYWORD ((name . "START_SCALE") (link . "WV_CWT.html#wp1009568")) ""))

I'm not sure why that trailing empty string is added to each element
(possibly the space before "/>"?).
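
Whatever the cause, the empty strings could be dropped while walking
the tree.  The following is only a sketch: my-xml-strip-empty is a
hypothetical name, it assumes every non-string node has the shape
(TAG ATTRIBUTES CHILD...) shown above, and it builds a copy rather
than modifying the parsed list in place.

```elisp
(defun my-xml-strip-empty (node)
  "Return a copy of NODE without empty-string children, recursively.
NODE is assumed to look like (TAG ATTRIBUTES CHILD...)."
  (if (stringp node)
      node                               ; text nodes are kept as they are
    (cons (car node)                     ; tag symbol
          (cons (cadr node)              ; attribute alist, left untouched
                (delq nil
                      (mapcar (lambda (child)
                                ;; An empty string maps to nil, which
                                ;; `delq' then removes from the fresh list.
                                (unless (equal child "")
                                  (my-xml-strip-empty child)))
                              (cddr node)))))))
```

Applied to the ROUTINE cell above, this would leave each SYNTAX,
ARGUMENT and KEYWORD entry as just a tag plus its attribute alist.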

Thanks,

JD

