unofficial mirror of emacs-devel@gnu.org 
* Serialize list to disk
From: JD Smith @ 2005-12-21  0:42 UTC



I have a large (1MB) XML routine definition file which can be parsed
with xml-parse-file into a fairly large list.  I need to walk over
this list and transform it in place to a more useful internal format.
All of this takes 5-7 seconds on reasonably speedy machines (of which
about 4 seconds are spent in xml-parse-file alone).  This list must be read in
during each Emacs session in which my IDLWAVE programming mode is
used, but the list is constant, so it would seem natural to perform
the XML conversion just once, and then serialize the list to disk in
some compact binary (but hopefully portable) format for future
sessions.  Before it was reborn as XML, this data structure was
auto-generated in lisp and byte-compiled, and it loaded in a fraction
of a second.  That is no longer an option, so I'm stuck with the XML.

Is there a simple input/output mechanism in Emacs for static list
structures, that would allow me to write a compact representation on
disk, and quickly recover it in future sessions, without going through
the lengthy XML conversion process?  One wrinkle: much of this data
structure is repetitive, so to save on memory and speed, many of the
strings inside it are "sinterned" into custom hash variables.
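
For readers unfamiliar with the technique, the kind of string interning
described here can be sketched as follows.  This is an illustration only,
not the IDLWAVE code: the table `my-string-table' and the function
`my-sintern' are hypothetical names.

```elisp
;; Hypothetical names; a sketch of interning strings through a hash table.
(defvar my-string-table (make-hash-table :test 'equal)
  "Maps each string's contents to the single canonical copy of it.")

(defun my-sintern (string)
  "Return the canonical copy of STRING, registering STRING if it is new."
  (or (gethash string my-string-table)
      (puthash string string my-string-table)))
```

Two strings with the same contents come back from `my-sintern' as the same
object, so each distinct string is stored once and can be compared with `eq'.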

The list is basically many, many cells similar to:

<ROUTINE name="WV_CWT" link="WV_CWT.html">
  <SYNTAX name="Result = WV_CWT(Array, Family, Order )" type="func" />
  <ARGUMENT name="Array" link="WV_CWT.html#wp1009345" />
  <ARGUMENT name="Family" link="WV_CWT.html#wp1009347" />
  <ARGUMENT name="Order" link="WV_CWT.html#wp1009349" />
  <KEYWORD name="DOUBLE" link="WV_CWT.html#wp1009356" />
  <KEYWORD name="DSCALE" link="WV_CWT.html#wp1009358" />
  <KEYWORD name="NSCALE" link="WV_CWT.html#wp1009484" />
  <KEYWORD name="PAD" link="WV_CWT.html#wp1009489" />
  <KEYWORD name="SCALE" link="WV_CWT.html#wp1009494" />
  <KEYWORD name="START_SCALE" link="WV_CWT.html#wp1009568" />
</ROUTINE>

which gets lispified as:

(ROUTINE ((name . "WV_CWT") (link . "WV_CWT.html")) (SYNTAX ((name
. "Result = WV_CWT(Array, Family, Order )") (type . "func")) "")
(ARGUMENT ((name . "Array") (link . "WV_CWT.html#wp1009345")) "")
(ARGUMENT ((name . "Family") (link . "WV_CWT.html#wp1009347")) "")
(ARGUMENT ((name . "Order") (link . "WV_CWT.html#wp1009349")) "")
(KEYWORD ((name . "DOUBLE") (link . "WV_CWT.html#wp1009356")) "")
(KEYWORD ((name . "DSCALE") (link . "WV_CWT.html#wp1009358")) "")
(KEYWORD ((name . "NSCALE") (link . "WV_CWT.html#wp1009484")) "")
(KEYWORD ((name . "PAD") (link . "WV_CWT.html#wp1009489")) "")
(KEYWORD ((name . "SCALE") (link . "WV_CWT.html#wp1009494")) "")
(KEYWORD ((name . "START_SCALE") (link . "WV_CWT.html#wp1009568"))
""))

Not sure why that final empty string is added to each list (possibly
because of the space before "/>"?).

Thanks,

JD

* Re: Serialize list to disk
From: Kevin Rodgers @ 2005-12-21 17:46 UTC


JD Smith wrote:
> I have a large (1MB) XML routine definition file which can be parsed
> with xml-parse-file into a fairly large list.  I need to walk over
> this list and transform it in place to a more useful internal format.
> All of this takes 5-7 seconds on reasonably speedy machines (of which
> about 4 seconds are spent in xml-parse-file alone).  This list must be read in
> during each Emacs session in which my IDLWAVE programming mode is
> used, but the list is constant, so it would seem natural to perform
> the XML conversion just once, and then serialize the list to disk in
> some compact binary (but hopefully portable) format for future
> sessions.  Before it was reborn as XML, this data structure was
> auto-generated in lisp and byte-compiled, and it loaded in a fraction
> of a second.  That is no longer an option, so I'm stuck with the XML.
> 
> Is there a simple input/output mechanism in Emacs for static list
> structures, that would allow me to write a compact representation on
> disk, and quickly recover it in future sessions, without going through
> the lengthy XML conversion process?  One wrinkle: much of this data
> structure is repetitive, so to save on memory and speed, many of the
> strings inside it are "sinterned" into custom hash variables.

Of course!

To write it out:

(let ((routine (xml-parse-file XML-FILE)))
   ;; XML-FILE and LISP-FILE stand for the actual file names.
   (with-temp-file LISP-FILE
     (pp routine (current-buffer))))

Then to read it in:

(let ((routine (read (find-file-noselect LISP-FILE))))
   ...)
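
A somewhat more defensive variant of the same idea, as a minimal sketch.
It prints with `prin1' rather than `pp' (pretty-printing a list of this
size may itself be slow), binds `print-circle' so that a string shared
within the list is written once and comes back shared, and reads from a
temporary buffer so that no file-visiting buffer is left behind.  The
function names `my-save-sexp' and `my-load-sexp' are hypothetical.

```elisp
;; Hypothetical helpers; the names are illustrative, not part of Emacs.
(defun my-save-sexp (object file)
  "Write OBJECT to FILE in a form that `read' can parse back."
  (with-temp-file file
    (let ((print-length nil)   ; never abbreviate long lists with "..."
          (print-level nil)    ; never abbreviate deep nesting
          (print-circle t))    ; label shared structure with #N= and #N#
      (prin1 object (current-buffer)))))

(defun my-load-sexp (file)
  "Return the Lisp object stored in FILE by `my-save-sexp'."
  (with-temp-buffer
    (insert-file-contents file)
    (goto-char (point-min))
    (read (current-buffer))))
```

Note that sharing is preserved only inside the object that was printed.
Strings read back this way are new objects, so they would presumably have
to be interned again to be `eq' to entries in any separate hash tables.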

> The list is basically many, many cells similar to:
> 
> <ROUTINE name="WV_CWT" link="WV_CWT.html">
>   <SYNTAX name="Result = WV_CWT(Array, Family, Order )" type="func" />
>   <ARGUMENT name="Array" link="WV_CWT.html#wp1009345" />
>   <ARGUMENT name="Family" link="WV_CWT.html#wp1009347" />
>   <ARGUMENT name="Order" link="WV_CWT.html#wp1009349" />
>   <KEYWORD name="DOUBLE" link="WV_CWT.html#wp1009356" />
>   <KEYWORD name="DSCALE" link="WV_CWT.html#wp1009358" />
>   <KEYWORD name="NSCALE" link="WV_CWT.html#wp1009484" />
>   <KEYWORD name="PAD" link="WV_CWT.html#wp1009489" />
>   <KEYWORD name="SCALE" link="WV_CWT.html#wp1009494" />
>   <KEYWORD name="START_SCALE" link="WV_CWT.html#wp1009568" />
> </ROUTINE>
> 
> which gets lispified as:
> 
> (ROUTINE ((name . "WV_CWT") (link . "WV_CWT.html")) (SYNTAX ((name
> . "Result = WV_CWT(Array, Family, Order )") (type . "func")) "")
> (ARGUMENT ((name . "Array") (link . "WV_CWT.html#wp1009345")) "")
> (ARGUMENT ((name . "Family") (link . "WV_CWT.html#wp1009347")) "")
> (ARGUMENT ((name . "Order") (link . "WV_CWT.html#wp1009349")) "")
> (KEYWORD ((name . "DOUBLE") (link . "WV_CWT.html#wp1009356")) "")
> (KEYWORD ((name . "DSCALE") (link . "WV_CWT.html#wp1009358")) "")
> (KEYWORD ((name . "NSCALE") (link . "WV_CWT.html#wp1009484")) "")
> (KEYWORD ((name . "PAD") (link . "WV_CWT.html#wp1009489")) "")
> (KEYWORD ((name . "SCALE") (link . "WV_CWT.html#wp1009494")) "")
> (KEYWORD ((name . "START_SCALE") (link . "WV_CWT.html#wp1009568"))
> ""))
> 
> Not sure why that final null string is added to the list (possibly the
> space before "/>"?).

No, that is normal XML syntax:  "<SYNTAX ... />" is an empty element
just like "<SYNTAX ...></SYNTAX>", and the "" is how its empty content
shows up in the parsed list.
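
If the empty strings are unwanted, they can be dropped while walking the
parse tree.  A minimal sketch, assuming nodes have the shape shown above,
(TAG ATTRIBUTES CHILD...), where each child is either a string or another
node; the function name `my-xml-drop-empty-strings' is hypothetical:

```elisp
;; Hypothetical helper; assumes nodes look like (TAG ATTRIBUTES CHILD...).
(defun my-xml-drop-empty-strings (node)
  "Return a copy of NODE without \"\" children, recursively."
  (if (stringp node)
      node
    (let (children)
      (dolist (child (cddr node))
        (unless (equal child "")
          (push (my-xml-drop-empty-strings child) children)))
      ;; Rebuild the node as (TAG ATTRIBUTES . CLEANED-CHILDREN).
      (cons (car node)
            (cons (cadr node) (nreverse children))))))
```

This builds a new list rather than modifying the parse tree in place, and
the attribute alist is reused as-is.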

-- 
Kevin Rodgers
