unofficial mirror of help-gnu-emacs@gnu.org
 help / color / mirror / Atom feed
From: "Pascal J. Bourguignon" <pjb@informatimago.com>
To: help-gnu-emacs@gnu.org
Subject: Re: why are there [v e c t o r s] in Lisp?
Date: Fri, 16 Oct 2015 03:51:42 +0200	[thread overview]
Message-ID: <87mvvjeg29.fsf@kuiper.lan.informatimago.com> (raw)
In-Reply-To: mailman.428.1444957396.7904.help-gnu-emacs@gnu.org

Emanuel Berg <embe8573@student.uu.se> writes:

> Why is there a special syntax for vectors? 

To make it easy to introduce literal vectors in your programs.

Without this syntax, you would only have run-time constructors, and you
would have to write more complex code.

For example, the equivalent of:

    (defun permut-0213 (x)
       (aref [0 2 1 3] x))

would have to be written as:

   (defun permut-0213 (x)
      (aref (load-time-value (vector 0 2 1 3)) x))

or:
    
   (let ((permut (vector 0 2 1 3)))
     (defun permut-0213 (x)
        (aref permut x)))


> In linear algebra, an n-dimensional vector is a sequence of n numbers,
> and collectively they make for something that has direction and
> magnitude (in particular, it doesn't have a position). But whatever
> the math, isn't that (a sequence of numbers) something that the lists
> of Lisp can handle just as well, or actually better, as it will be
> more generic (a lot of stuff that doesn't work on "real" Lisp vectors
> will work on vectors that are also lists). And using lists doesn't
> mean nobody cannot write hundreds of "math vector" specific stuff to
> modify those list vectors! Right?

Right.  And why have strings too!


> Have a look:
>
>     (vectorp [1 2 3]) ; t
>
>     (vectorp '(1 2 3)) ; nil - really?

    (vectorp "123")       ; --> nil - why? :-( 

    (sequencep [1 2 3])  ; --> t
    (sequencep '(1 2 3)) ; --> t
    (sequencep "123")    ; --> t

In Common Lisp, strings are vectors are sequence.
And in CL, vectors are arrays, which may be multi-dimensional, but
there's no such thing in emacs lisp.


The reason why there is a vector data type distinct from the list data
type is purely an optimization reason.  In ACL2, for example, there are
only "lists". https://en.wikipedia.org/wiki/ACL2



In lisp, there are no lists.  There's no list abstract data type.  There
are only cons cells, which are used to build linked lists of cons cells.

Here, a cons cell is represented with a box split in two parts, the car
and the cdr:

    +---+---+
    | * | * |-->
    +---+---+
      |
      v


Therefore a list such as (1 2 3 4 5) will be actually a cons cell:
 (1 . (2 . (3 . (4 . (5 . nil))))), which can be drawn as:

+-----------------------------------------------------------+
| (1 2 3 4 5)                                               |
|                                                           |
| +---+---+   +---+---+   +---+---+   +---+---+   +---+---+ |
| | * | * |-->| * | * |-->| * | * |-->| * | * |-->| * |NIL| |
| +---+---+   +---+---+   +---+---+   +---+---+   +---+---+ |
|   |           |           |           |           |       |
|   v           v           v           v           v       |
| +---+       +---+       +---+       +---+       +---+     |
| | 1 |       | 2 |       | 3 |       | 4 |       | 5 |     |
| +---+       +---+       +---+       +---+       +---+     |
+-----------------------------------------------------------+

The problem is that to find the ith element in the list, we have to walk
i cons cells:

    (defun elt (s i)
       (etypecase s
         (null nil)
         (cons (if (zerop i)
                  (car s)
                  (elt (cdr s) (1- i))))))

elt is therefore O(n) for lists of length n, which is rather slow for
such a basic access.



The solution is to use internally an indexed data structure: contiguous
memory blocks, for which we can compute the address in O(1):


+-----------------------------------------------------------+
| [1 2 3 4 5]                                               |
|                                                           |
| +-----+-----+-----+-----+-----+                           |                          
| |  *  |  *  |  *  |  *  |  *  |                           |                         
| +-----+-----+-----+-----+-----+                           |                         
|    |     |     |     |     |                              |                      
|    v     v     v     v     v                              |
|  +---+ +---+ +---+ +---+ +---+                            |
|  | 1 | | 2 | | 3 | | 4 | | 5 |                            |
|  +---+ +---+ +---+ +---+ +---+                            |
+-----------------------------------------------------------+


To find the address of the ith slot, we only have to multiply i by the
size of the slots, and add the address of the start of the vector:

(defun elt (s i)
  (etypecase s
    (null nil)
    (cons (if (zerop i)
              (car s)
              (elt (cdr s) (1- i))))
    (vector (%deref (%address (+ (%address-of s)
                                 (* i (%size-of-element-of s))))))))

All the operations performed for the vector case are O(1), therefore elt
for vectors is O(1), which is much faster than for list, notably when
the index is not small.

Furthermore, notice that this kind of vector addressing often benefit
from addressing modes implemented in the processor, so you don't really
have to compile it from lisp; the implementation will have a native
vector indexing operator that will translate to a single inline
processor instruction:

;; in CCL:
cl-user> (disassemble (lambda (s i)
                        (declare (optimize (speed 3) (space 3) (safety 0) (debug 0)))
                        (declare (type simple-vector s) (type fixnum i))
                        (svref s i)))
L0
    (leaq (@ (:^ L0) (% rip)) (% fn))       ;     [0]
    (pushq (% rbp))                         ;     [7]
    (movq (% rsp) (% rbp))                  ;     [8]
    (pushq (% arg_y))                       ;    [11]
    (pushq (% arg_z))                       ;    [12]
    (movq (@ -5 (% arg_y) (% arg_z)) (% arg_z)) ;    [13]
    (leaveq)                                ;    [18]
    (retq)                                  ;    [19]
nil
cl-user> 

Notice how the access to the vector slot is reduced to the single ix86
instruction:

    (movq   (@ -5 (% arg_y) (% arg_z))    (% arg_z))

(The offset -5 is introduced to remove the type tag in the address of
the vector).




Now, of course, (car l) will be faster than (aref v 0)
and (cadr l) might also be faster than (aref v 1), but already, (cadr l)
requires two memory accesses (read the cdr of l, read the car of that
cdr), while (aref v 1) only needs one memory access (the movq above), 
so even if movq with it's implied multiply and add will probably be
faster than a memory access, more so nowadays when processor (register)
speed have become much faster than memory accesses.



Therefore the lesson is that if you have to use elt (nth) on a list, in
a loop, you definitely do not want to do that if the list is more than
one or two elements.


Instead, either process the elements in the list in order using car/cdr,
or use a vector for random access.




In the case of emacs lisp, the difference is hidden in the virtual
machine, you would have to look at the C code:

(disassemble (lambda (s i) (aref s i)))
byte code:
  args: (s i)
0       varref    s
1       varref    i
2       aref      
3       return    

(disassemble (lambda (l i) (nth i l)))
byte code:
  args: (l i)
0       varref    i
1       varref    l
2       nth       
3       return    

-- 
__Pascal Bourguignon__                 http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk


       reply	other threads:[~2015-10-16  1:51 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <mailman.428.1444957396.7904.help-gnu-emacs@gnu.org>
2015-10-16  1:51 ` Pascal J. Bourguignon [this message]
2015-10-16  2:31   ` why are there [v e c t o r s] in Lisp? Emanuel Berg
2015-10-16  2:29     ` Random832
2015-10-16  2:51       ` Emanuel Berg
2015-10-16  2:56         ` Random832
2015-10-16 23:30           ` Emanuel Berg
     [not found]           ` <mailman.482.1445037713.7904.help-gnu-emacs@gnu.org>
2015-10-17  1:55             ` Pascal J. Bourguignon
2015-10-17  4:47               ` Emanuel Berg
     [not found]               ` <mailman.494.1445057637.7904.help-gnu-emacs@gnu.org>
2015-10-17 15:25                 ` Pascal J. Bourguignon
2015-10-17 21:12                   ` Emanuel Berg
     [not found]                   ` <mailman.519.1445115776.7904.help-gnu-emacs@gnu.org>
2015-10-18  1:08                     ` Pascal J. Bourguignon
     [not found]         ` <mailman.432.1444964227.7904.help-gnu-emacs@gnu.org>
2015-10-16  3:57           ` Pascal J. Bourguignon
2015-10-16  4:17             ` Random832
     [not found]             ` <mailman.434.1444970033.7904.help-gnu-emacs@gnu.org>
2015-10-16  5:16               ` Pascal J. Bourguignon
     [not found]   ` <mailman.429.1444962165.7904.help-gnu-emacs@gnu.org>
2015-10-16  3:31     ` Pascal J. Bourguignon
2015-10-16 23:46       ` Emanuel Berg
     [not found]       ` <mailman.483.1445038647.7904.help-gnu-emacs@gnu.org>
2015-10-17  2:04         ` Pascal J. Bourguignon
2015-10-17  4:40           ` Random832
2015-10-17  5:00             ` Emanuel Berg
2015-10-17  4:40           ` Emanuel Berg
     [not found]           ` <mailman.491.1445056308.7904.help-gnu-emacs@gnu.org>
2015-10-17  5:53             ` Barry Margolin
2015-10-17 15:16             ` Pascal J. Bourguignon
2015-10-17 21:06               ` Emanuel Berg
     [not found]               ` <mailman.518.1445115463.7904.help-gnu-emacs@gnu.org>
2015-10-18  1:07                 ` Pascal J. Bourguignon
2015-10-18 12:32                   ` Emanuel Berg
     [not found]                   ` <mailman.551.1445171034.7904.help-gnu-emacs@gnu.org>
2015-10-18 12:55                     ` Pascal J. Bourguignon
2015-10-18 14:28                       ` Emanuel Berg
2015-10-18 21:17                         ` Robert Thorpe
     [not found]                       ` <mailman.557.1445177952.7904.help-gnu-emacs@gnu.org>
2015-10-18 19:48                         ` Barry Margolin
2015-10-18 21:25                           ` Emanuel Berg
2015-10-18 21:39                             ` Random832
2015-10-19  0:46                         ` Pascal J. Bourguignon
2015-10-17  5:56           ` Barry Margolin
2015-10-17 15:06             ` Emanuel Berg
2015-10-16 13:32 ` Barry Margolin
2015-10-16 23:47   ` Emanuel Berg
     [not found] <mailman.581.1445203060.7904.help-gnu-emacs@gnu.org>
2015-10-19  0:45 ` Pascal J. Bourguignon
2015-10-16  1:12 Emanuel Berg
2015-10-17  1:11 ` Aurélien Aptel
2015-10-17  4:22   ` Emanuel Berg
2015-10-17  7:58   ` Jude DaShiell
2015-10-19 16:33     ` Nick Dokos
     [not found]   ` <mailman.490.1445055179.7904.help-gnu-emacs@gnu.org>
2015-10-17 15:09     ` Pascal J. Bourguignon
     [not found] ` <mailman.488.1445044303.7904.help-gnu-emacs@gnu.org>
2015-10-17  2:22   ` Pascal J. Bourguignon
2015-10-17  4:56     ` Emanuel Berg
2015-10-17  5:49       ` Barry Margolin
2015-10-17 15:04         ` Emanuel Berg
     [not found]         ` <mailman.506.1445093727.7904.help-gnu-emacs@gnu.org>
2015-10-17 15:20           ` Pascal J. Bourguignon
2015-10-17 15:38             ` Emanuel Berg
     [not found]             ` <mailman.511.1445095810.7904.help-gnu-emacs@gnu.org>
2015-10-17 16:01               ` Javier
2015-10-17 16:03                 ` Javier
2015-10-17 16:34                 ` Pascal J. Bourguignon
2015-10-17 21:18                   ` Emanuel Berg
2015-10-17 16:15               ` Pascal J. Bourguignon
2015-10-17 21:37                 ` Emanuel Berg
2015-10-17 15:18       ` Pascal J. Bourguignon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87mvvjeg29.fsf@kuiper.lan.informatimago.com \
    --to=pjb@informatimago.com \
    --cc=help-gnu-emacs@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).