unofficial mirror of help-gnu-emacs@gnu.org
* Package for data serialization?
From: spamfilteraccount @ 2006-06-13 10:39 UTC


Is there a package for de/serializing an arbitrary elisp data
structure, so that it can be read/written in binary format from/to
disk?

I know about prin1 and co., but they create a printed representation
and I want binary for speed and size.

* Re: Package for data serialization?
From: Pascal Bourguignon @ 2006-06-13 11:13 UTC


spamfilteraccount@gmail.com writes:

> Is there a package for de/serializing an arbitrary elisp data
> structure, so that it can be read/written in binary format from/to
> disk?
>
> I know about prin1 and co., but they create a printed representation
> and I want binary for speed and size.


There is no emacs lisp function to get the binary representation of a
cons cell, or a number or a string or anything, AFAIK.

So you won't be able to write in emacs lisp any function converting
values to binary any faster than prin1.
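
For reference, the prin1 baseline could be sketched roughly as
follows.  This is only a minimal sketch: the names my-save-data and
my-load-data are hypothetical, and it assumes the data holds only
objects that have a read syntax (numbers, strings, symbols, conses,
vectors).

```elisp
;; Hypothetical helper names; only `prin1', `read', `with-temp-file'
;; and `insert-file-contents' are standard Emacs Lisp.
(defun my-save-data (data file)
  "Write DATA to FILE in its printed representation."
  (with-temp-file file
    (let ((print-length nil)   ; do not abbreviate long lists
          (print-level nil))   ; do not abbreviate deep nesting
      (prin1 data (current-buffer)))))

(defun my-load-data (file)
  "Read back the first Lisp object stored in FILE."
  (with-temp-buffer
    (insert-file-contents file)
    (goto-char (point-min))
    (read (current-buffer))))
```

Binding print-length and print-level to nil guards against a user
setting that would otherwise truncate the output with "...".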


You may want to try it in C, adding primitive functions.  You will
still have to do a lot of work, like converting pointers into OID,
etc.


IIRC, there is a means to save an emacs lisp image, so you could save
the whole data structure, along with the whole emacs in a new emacs
image, and then you'd run this image instead of a virgin emacs.


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

NEW GRAND UNIFIED THEORY DISCLAIMER: The manufacturer may
technically be entitled to claim that this product is
ten-dimensional. However, the consumer is reminded that this
confers no legal rights above and beyond those applicable to
three-dimensional objects, since the seven new dimensions are
"rolled up" into such a small "area" that they cannot be
detected.

* Re: Package for data serialization?
From: Ted Zlatanov @ 2006-06-13 13:04 UTC


On 13 Jun 2006, spamfilteraccount@gmail.com wrote:

> Is there a package for de/serializing an arbitrary elisp data
> structure, so that it can be read/written in binary format from/to
> disk?
>
> I know about prin1 and co., but they create a printed representation
> and I want binary for speed and size.

To save space, just compress the resulting file.  It will probably do
better for *arbitrary* data (if you know the data, for instance
images, you can obviously do a better job because you know what can be
thrown away).

I don't think the speed improvements will be worth the time you will
spend on this :)  Perhaps the fact that there isn't such a system
already indicates it's not worth the effort to build it.

Ted

* Re: Package for data serialization?
From: Pascal Bourguignon @ 2006-06-13 13:37 UTC


Ted Zlatanov <tzz@lifelogs.com> writes:

> On 13 Jun 2006, spamfilteraccount@gmail.com wrote:
>
>> Is there a package for de/serializing an arbitrary elisp data
>> structure, so that it can be read/written in binary format from/to
>> disk?
>>
>> I know about prin1 and co., but they create a printed representation
>> and I want binary for speed and size.
>
> To save space, just compress the resulting file.  It will probably do
> better for *arbitrary* data (if you know the data, for instance
> images, you can obviously do a better job because you know what can be
> thrown away).

As for space, the ASCII form will already be more compact than the
binary one.

For example, with numbers, you have a lot of numbers that have 1, 2 or
3 digits.  That's at most 3 bytes, already one less than the 4 bytes
(32 bits) you'd use for every number.  For cons cells, you'd need two
pointers (or two OIDs): 64 bits.  With "(" " . " and ")" you use only
5 bytes per cons cell instead of 8, and even less when you use the
compacted form for lists: "(" " " ")", using only 1 + (number of
elements) bytes, etc.
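
The arithmetic above can be checked with a small sketch.  The function
names are hypothetical, and the binary estimate assumes the naive
layout described here: 8 bytes (two 32-bit slots) per cons cell, with
small integers stored directly in a slot.

```elisp
(defun my-printed-size (data)
  "Number of characters in the printed representation of DATA."
  (length (prin1-to-string data)))

(defun my-count-conses (data)
  "Count the cons cells in the tree DATA.
Assumes no shared or circular structure."
  (if (consp data)
      (+ 1 (my-count-conses (car data)) (my-count-conses (cdr data)))
    0))

(defun my-naive-binary-size (data)
  "Estimate a naive binary size: 8 bytes per cons cell."
  (* 8 (my-count-conses data)))

(my-printed-size '(1 2 3))        ; => 7, the text is "(1 2 3)"
(my-naive-binary-size '(1 2 3))   ; => 24, three cons cells
```

Long strings would shift the comparison, since their contents take the
same space in either format.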


> I don't think the speed improvements will be worth the time you will
> spend on this :)  Perhaps the fact that there isn't such a system
> already indicates it's not worth the effort to build it.

The speed bottleneck will be the hard disk.  If you've got shorter
data, then you'll be faster.  We've seen that an ASCII representation
is shorter.  If you compress it, it will be even shorter than a binary
one.  So it will be faster.

Moreover, the ASCII representation will be portable, and is less work.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

ATTENTION: Despite any other listing of product contents found
herein, the consumer is advised that, in actuality, this product
consists of 99.9999999999% empty space.

* Re: Package for data serialization?
From: Phillip Lord @ 2006-06-13 16:02 UTC


>>>>> "anon" == spamfilteraccount  <spamfilteraccount@gmail.com> writes:

  anon> Is there a package for de/serializing an arbitrary elisp data
  anon> structure, so that it can be read/written in binary format
  anon> from/to disk?

  anon> I know about prin1 and co., but they create a printed
  anon> representation and I want binary for speed and size.


As far as I know, this doesn't work in general. Not all lisp objects
have an output representation that can then be eval'd back in; the
most obvious one for me is the hash-table which is a pain. I can
understand why objects such as a frame or a window can't be
serialised, but not a hash. 
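
One possible workaround, sketched here with hypothetical function
names, is to flatten the hash table into an alist before printing and
rebuild it after reading.  It assumes the keys and values themselves
have a read syntax.  (Newer Emacs releases can also print hash tables
in a readable #s(hash-table ...) form.)

```elisp
(defun my-hash-to-alist (table)
  "Return the entries of hash table TABLE as an alist of (KEY . VALUE)."
  (let (alist)
    (maphash (lambda (key value) (push (cons key value) alist)) table)
    alist))

(defun my-alist-to-hash (alist &optional test)
  "Rebuild a hash table from ALIST.
Keys are compared with TEST, which defaults to `equal'."
  (let ((table (make-hash-table :test (or test 'equal))))
    (dolist (pair alist)
      (puthash (car pair) (cdr pair) table))
    table))
```

The alist can then go through prin1 and read like any other list; the
table's test function has to be recorded separately if it is not the
default.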

However, as far as I can tell Emacs does have the ability to store
arbitrary elisp data items -- this is what `dump-emacs' does for
example. I wonder how hard it would be to extend this system so that
it could cope with arbitrary bits of the current state. It would be
quite useful. 

Incidentally, the printed representation is not bad. Emacs is pretty
quick at parsing lisp code, even large data structures, for purposes
of an eval.
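
A quick way to check this on a given machine is to time `read' on a
large printed structure.  The sketch below uses `benchmark-run' from
benchmark.el; the function name my-time-read and the list size are
arbitrary choices for illustration.

```elisp
(require 'benchmark)

(defun my-time-read (n)
  "Return the seconds taken to `read' the printed form of N integers."
  (let ((text (prin1-to-string (number-sequence 1 n))))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
    (car (benchmark-run 1 (read text)))))

(my-time-read 100000)   ; elapsed seconds; the value depends on the machine
```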

Cheers

Phil

* Re: Package for data serialization?
From: Eli Zaretskii @ 2006-06-13 17:57 UTC


> From: spamfilteraccount@gmail.com
> Date: 13 Jun 2006 03:39:30 -0700
> 
> Is there a package for de/serializing an arbitrary elisp data
> structure, so that it can be read/written in binary format from/to
> disk?
> 
> I know about prin1 and co., but they create a printed representation
> and I want binary for speed and size.

Do you actually have a real-life case where the speed and size of the
text representation is an issue?  If so, please describe the details.

* Re: Package for data serialization?
From: spamfilteraccount @ 2006-06-13 18:38 UTC



Eli Zaretskii wrote:
>
> Do you actually have a real-life case where the speed and size of the
> text representation is an issue?  If so, please describe the details.

Actually, I only assumed (a bit hastily maybe) that a binary
interpretation can be read significantly faster and that's why I asked
for it. I need to store long strings and fairly small numbers, so the
text format will probably be adequate for my purposes.

* Re: Package for data serialization?
From: Eli Zaretskii @ 2006-06-13 18:48 UTC


> From: spamfilteraccount@gmail.com
> Date: 13 Jun 2006 11:38:12 -0700
> 
> Eli Zaretskii wrote:
> >
> > Do you actually have a real-life case where the speed and size of the
> > text representation is an issue?  If so, please describe the details.
> 
> Actually, I only assumed (a bit hastily maybe) that a binary
> interpretation can be read significantly faster and that's why I asked
> for it.

I think you will find that this assumption is false, except in _very_
rare and marginal cases (like humongously long data structures).

For example, one of the Emacs features writes to a file the last place
in each visited file as a Lisp data structure.  On my machine, this
file is 31KB large, but Emacs still starts in a snap, even though it
reads this file at startup.

* Re: Package for data serialization?
From: Phillip Lord @ 2006-06-14 10:27 UTC



>>>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:

  >> Actually, I only assumed (a bit hastily maybe) that a binary
  >> interpretation can be read significantly faster and that's why I
  >> asked for it.

  Eli> I think you will find that this assumption is false, except in
  Eli> _very_ rare and marginal cases (like humongously long data
  Eli> structures).

  Eli> For example, one of the Emacs features writes to a file the
  Eli> last place in each visited file as a Lisp data structure.  On
  Eli> my machine, this file is 31KB large, but Emacs still starts in
  Eli> a snap, even though it reads this file at startup.



My package -- variant.el -- includes a file with all the English words
that Americans spell incorrectly. Of course, there are a lot of them,
so it's nearly 1M in size. Again Emacs reads this in too short a time
to measure. 

Phil

Thread overview: 9+ messages
2006-06-13 10:39 Package for data serialization? spamfilteraccount
2006-06-13 11:13 ` Pascal Bourguignon
2006-06-13 13:04 ` Ted Zlatanov
2006-06-13 13:37   ` Pascal Bourguignon
2006-06-13 16:02 ` Phillip Lord
2006-06-13 17:57 ` Eli Zaretskii
     [not found] ` <mailman.2851.1150221485.9609.help-gnu-emacs@gnu.org>
2006-06-13 18:38   ` spamfilteraccount
2006-06-13 18:48     ` Eli Zaretskii
     [not found]     ` <mailman.2854.1150224536.9609.help-gnu-emacs@gnu.org>
2006-06-14 10:27       ` Phillip Lord