unofficial mirror of help-gnu-emacs@gnu.org
 help / color / mirror / Atom feed
From: "Pascal J. Bourguignon" <pjb@informatimago.com>
To: help-gnu-emacs@gnu.org
Subject: Re: why are there [v e c t o r s] in Lisp?
Date: Fri, 16 Oct 2015 05:57:29 +0200	[thread overview]
Message-ID: <87a8rjea8m.fsf@kuiper.lan.informatimago.com> (raw)
In-Reply-To: mailman.432.1444964227.7904.help-gnu-emacs@gnu.org

Random832 <random832@fastmail.com> writes:

> Emanuel Berg <embe8573@student.uu.se> writes:
>> If a list just contains say a bunch of known integers
>> that don't need to be computed, is there anything that
>> stops this list from being stored in "contiguous
>> memory" with the list functions as well as the
>> constant access time "vector" functions available?

Emanuel, notice that in some implementations of lisp in the 1980s, there
was an optimization for lists called cdr-coding which stored the list
when it was copied as a vector of consecutive elements (eliminating thus
the cdr, which basically was reduced to a bit in tag of the slot).  This
still allowed to call cdr (instead of derefencing that slot, we'd just
increment the address in the vector).  But if you used rplacd, then it
had to "split" the cdr-coded vector, creating a new cons cell where to
store the new cdr pointer.

    (copy-list '(1 2 3 4 5))

would create a vector of CARs and point to the first one.  The slot
would have a bit set indicating that the CDR has been eliminated, and
the CAR of the next cell is stored in the next address:

+-----------------------------------------------------------+
| list = (1 2 3 4 5)   +----------------------------f       |
|   |                  |                                    |
|   v                  v                                    |
| +-----+-----+-----+-----+-----+-----+                     |                          
| |  * '|  * '|  * '|  * '|  *  | NIL |                     |                         
| +-----+-----+-----+-----+-----+-----+                     |                         
|    |     |     |     |     |                              |                      
|    v     v     v     v     v                              |
|  +---+ +---+ +---+ +---+ +---+                            |
|  | 1 | | 2 | | 3 | | 4 | | 5 |                            |
|  +---+ +---+ +---+ +---+ +---+                            |
+-----------------------------------------------------------+

The last cell would be a CDR (so the preceeding slot doesn't have the
bit in the tag set), pointing to a following cons cell or cdr vector, or
the atom in the dotted list (here, NIL, to indicate the end of the
list).

Now the problem occurs when you use rplacd in the middle of the list:
                                                             
(let* ((list (list 1 2 3 4 5))
       (f    (nthcdr 3 list)))
  (rplacd (cddr list) (list* 4.0 4.2 (cddddr list)))         
  (list list f))
--> ((1 2 3 4.0 4.2 . #1=(5))
     (4 . #1#))

To do that in the presence of cdr-coding, we have to introduce a new
cons cell for 4, and more problematic, we have to scan the whole memory
for references to this cell, and update them (in this case, the variable
f must now point to a new cons cell, instead of in the middle of the
cdr-vector):


+-----------------------------------------------------------+
| list = (1 2 3 4 5)                                        |
|   |                                                       |
|   |                      +-----+-----+-----+              |
|   |                  +-->|  * '|  *  |  *  |              |
|   |                  |   +-----+-----+-----+              |
|   |                  |      |     |     |                 |
|   |                  |      v     v     |                 |
|   |                  |    +---+ +---+   |                 |
|   |                  |    |4.0| |4.2|   |                 |
|   |                  |    +---+ +---+   |                 |
|   |                  |                  |                 |
|   |                  |     +------------+<-----------+    |
|   |                  |     |        f-------+        |    |
|   |                  |     |                |        |    |
|   v                  v     v                v        |    |
| +-----+-----+-----+-----+-----+-----+     +---+---+  |    |                          
| |  * '|  * '|  *  |  *  |  * '| NIL |     | * | * |--+    |                         
| +-----+-----+-----+-----+-----+-----+     +---+---+       |                         
|    |     |     |           |                |             |                      
|    v     v     v           v                v             |
|  +---+ +---+ +---+       +---+            +---+           |
|  | 1 | | 2 | | 3 |       | 5 |            | 4 |           |
|  +---+ +---+ +---+       +---+            +---+           |
+-----------------------------------------------------------+

> Well, what happens if you cdr that list? Does it make a
> copy? 

cdr'ing is no problem you just increment your pointer.

> Wasted memory, wasted time, won't track if you change
> the original. Yeah, yeah, in platonic ideal lisp you *can't*
> change lists, but this is the real world, where even scheme
> lets you do it. Does it keep a reference to the original
> list? Then what if you take the cd^{4000}r of a list of
> 5000, then throw away the original?  Wasted memory
> again. Java used to have this issue with strings. I suppose
> you could have the underlying array somehow know what the
> earliest index that a reference actually exists for is, and
> magically throw away everything before it during garbage
> collection, if there's too much wasted space. But that seems
> rather a lot of work.
>
>> By the way: in my previous post strings were mentioned
>> and it sounded like they were sugar for lists of chars
>> - this isn't the case (you can't `car' a string) but
>> it could have been and it isn't harmful (I think) to
>> think of strings that way.
>
> The thing is, if you can car a string people will wonder why
> you can't cdr it. And with mutable objects it's hard to make
> cdr work right. (fsvo "right")

It's rplacd which becomes much more complex.

Also, when you program with such a system, you might be afraid that
rplacd leads to fragmented chains, and will have a tendency to call
copy-list to compact them.  But indeed, copy-list means that less
structure is shared, and therefore would waste time and memory.  This is
what they do in C++ and why they believe lists are costly.


-- 
__Pascal Bourguignon__                 http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk


  parent reply	other threads:[~2015-10-16  3:57 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <mailman.428.1444957396.7904.help-gnu-emacs@gnu.org>
2015-10-16  1:51 ` why are there [v e c t o r s] in Lisp? Pascal J. Bourguignon
2015-10-16  2:31   ` Emanuel Berg
2015-10-16  2:29     ` Random832
2015-10-16  2:51       ` Emanuel Berg
2015-10-16  2:56         ` Random832
2015-10-16 23:30           ` Emanuel Berg
     [not found]           ` <mailman.482.1445037713.7904.help-gnu-emacs@gnu.org>
2015-10-17  1:55             ` Pascal J. Bourguignon
2015-10-17  4:47               ` Emanuel Berg
     [not found]               ` <mailman.494.1445057637.7904.help-gnu-emacs@gnu.org>
2015-10-17 15:25                 ` Pascal J. Bourguignon
2015-10-17 21:12                   ` Emanuel Berg
     [not found]                   ` <mailman.519.1445115776.7904.help-gnu-emacs@gnu.org>
2015-10-18  1:08                     ` Pascal J. Bourguignon
     [not found]         ` <mailman.432.1444964227.7904.help-gnu-emacs@gnu.org>
2015-10-16  3:57           ` Pascal J. Bourguignon [this message]
2015-10-16  4:17             ` Random832
     [not found]             ` <mailman.434.1444970033.7904.help-gnu-emacs@gnu.org>
2015-10-16  5:16               ` Pascal J. Bourguignon
     [not found]   ` <mailman.429.1444962165.7904.help-gnu-emacs@gnu.org>
2015-10-16  3:31     ` Pascal J. Bourguignon
2015-10-16 23:46       ` Emanuel Berg
     [not found]       ` <mailman.483.1445038647.7904.help-gnu-emacs@gnu.org>
2015-10-17  2:04         ` Pascal J. Bourguignon
2015-10-17  4:40           ` Random832
2015-10-17  5:00             ` Emanuel Berg
2015-10-17  4:40           ` Emanuel Berg
     [not found]           ` <mailman.491.1445056308.7904.help-gnu-emacs@gnu.org>
2015-10-17  5:53             ` Barry Margolin
2015-10-17 15:16             ` Pascal J. Bourguignon
2015-10-17 21:06               ` Emanuel Berg
     [not found]               ` <mailman.518.1445115463.7904.help-gnu-emacs@gnu.org>
2015-10-18  1:07                 ` Pascal J. Bourguignon
2015-10-18 12:32                   ` Emanuel Berg
     [not found]                   ` <mailman.551.1445171034.7904.help-gnu-emacs@gnu.org>
2015-10-18 12:55                     ` Pascal J. Bourguignon
2015-10-18 14:28                       ` Emanuel Berg
2015-10-18 21:17                         ` Robert Thorpe
     [not found]                       ` <mailman.557.1445177952.7904.help-gnu-emacs@gnu.org>
2015-10-18 19:48                         ` Barry Margolin
2015-10-18 21:25                           ` Emanuel Berg
2015-10-18 21:39                             ` Random832
2015-10-19  0:46                         ` Pascal J. Bourguignon
2015-10-17  5:56           ` Barry Margolin
2015-10-17 15:06             ` Emanuel Berg
2015-10-16 13:32 ` Barry Margolin
2015-10-16 23:47   ` Emanuel Berg
     [not found] <mailman.581.1445203060.7904.help-gnu-emacs@gnu.org>
2015-10-19  0:45 ` Pascal J. Bourguignon
2015-10-16  1:12 Emanuel Berg
2015-10-17  1:11 ` Aurélien Aptel
2015-10-17  4:22   ` Emanuel Berg
2015-10-17  7:58   ` Jude DaShiell
2015-10-19 16:33     ` Nick Dokos
     [not found]   ` <mailman.490.1445055179.7904.help-gnu-emacs@gnu.org>
2015-10-17 15:09     ` Pascal J. Bourguignon
     [not found] ` <mailman.488.1445044303.7904.help-gnu-emacs@gnu.org>
2015-10-17  2:22   ` Pascal J. Bourguignon
2015-10-17  4:56     ` Emanuel Berg
2015-10-17  5:49       ` Barry Margolin
2015-10-17 15:04         ` Emanuel Berg
     [not found]         ` <mailman.506.1445093727.7904.help-gnu-emacs@gnu.org>
2015-10-17 15:20           ` Pascal J. Bourguignon
2015-10-17 15:38             ` Emanuel Berg
     [not found]             ` <mailman.511.1445095810.7904.help-gnu-emacs@gnu.org>
2015-10-17 16:01               ` Javier
2015-10-17 16:03                 ` Javier
2015-10-17 16:34                 ` Pascal J. Bourguignon
2015-10-17 21:18                   ` Emanuel Berg
2015-10-17 16:15               ` Pascal J. Bourguignon
2015-10-17 21:37                 ` Emanuel Berg
2015-10-17 15:18       ` Pascal J. Bourguignon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a8rjea8m.fsf@kuiper.lan.informatimago.com \
    --to=pjb@informatimago.com \
    --cc=help-gnu-emacs@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).