From: "Pascal J. Bourguignon" <pjb@informatimago.com>
To: help-gnu-emacs@gnu.org
Subject: Re: plists, alists, and hashtables
Date: Wed, 05 Aug 2015 19:24:31 +0200 [thread overview]
Message-ID: <878u9pps1c.fsf@kuiper.lan.informatimago.com> (raw)
In-Reply-To: 87d1z2ukw1.fsf@lifelogs.com
Ted Zlatanov <tzz@lifelogs.com> writes:
> On Wed, 05 Aug 2015 08:12:22 +0200 "Pascal J. Bourguignon" <pjb@informatimago.com> wrote:
>
> PJB> What you are losing from sight is the fact that:
>
> PJB> - a-lists are lists
> PJB> - p-lists are lists
> PJB> - lists are sequences
> PJB> - lists are cons cells or nil
>
> PJB> therefore any operator working on cons cells, on sequences, on lists,
> PJB> can also work on p-lists and on a-lists.
>
> Yes, I think the implicit advantage of everything being a list is well
> understood amongst us. But so is the disadvantage of treating
> everything as a list. The question is whether hashtables, an existing
> ELisp map data type, could become more popular.
Why?
How would that be good?
Seriously, hash-tables have a lot of drawbacks.
They use much more memory,
they are much slower (on small dictionaries),
they are much restrictive on the possible key equivalence function.
The only advantage they have, is on speed of access in big dictionaries.
But even when you need a O(1) access on a big dictionary, you will find
you keep converting between hash-table and lists or vectors, of only to
sort the entries out to present them to the user!
If you achieved your goal of having unwashed masses use hash-tables
instead of a-list/p-lists, the only result you'd attain would be to slow
down emacs. Run your own benchmark. On my computer, I notice that
until the size of the dictionary reaches 20-30, a-lists are performing
better than hash-table. (And even, I call assoc on set, while for
a-list it is customary to just do acons, so inserting or reseting
entries would be even much faster).
------------------------------------------------------------------------
;;; -*- mode:emacs-lisp;lexical-binding:t;coding:utf-8 -*-
(setf lexical-binding t)
(defun make-hash-table-dictionary ()
(let ((table (make-hash-table)))
(lambda (m &optional k v)
(ecase m
(get (gethash k table))
(del (remhash k table))
(set (setf (gethash k table) v))
(key (let ((keys '()))
(maphash (lambda (k v)
(declare (ignore v))
(push k keys))
table)
keys))))))
(defun make-a-list-dictionary ()
(let ((table '()))
(lambda (m &optional k v)
(ecase m
(get (cdr (assoc k table)))
(del (let ((prev-entry (loop
for cell on (cons nil table)
until (eql k (cadr cell))
finally (return cell))))
(when prev-entry
(setf (cdr prev-entry) (cddr prev-entry)))
(setf table (cdr prev-entry))))
(set (let ((entry (assoc k table)))
(if entry
(setf (cdr entry) v)
(setf table (acons k v table)))))
(key (mapcar (function car) table))))))
(defun make-null-dictionary ()
(lambda (m &optional k v)
(declare (ignore k v))
(ecase m
((get del set key) nil))))
(defun dict-get (dict k) (funcall dict 'get k))
(defun dict-del (dict k) (funcall dict 'del k))
(defun dict-set (dict k v) (funcall dict 'set k v))
(defun dict-key (dict) (funcall dict 'key))
(defun get-internal-real-time ()
(destructuring-bind (high low microsec &rest ignored) (current-time)
(declare (ignore ignored))
(+ (* high 65536.0) low (* 1e-6 microsec))))
(defmacro chrono (&rest body)
"Returns the time it took to evaluate the body expressions (in an implict progn)."
(let ((start (gensym))
(result (gensym)))
`(let* ((,start (get-internal-real-time))
(,result (progn ,@body)) )
(- (get-internal-real-time) ,start))))
(defun generate-keys (size)
(loop repeat size
collect (gensym)))
(defun generate-values (size)
(loop repeat size
with max = (* size 1000)
collect (random size)))
(defun benchmark (constructor size repeat)
(let* ((dicts '())
(keys (generate-keys size))
(values (generate-values size))
(constructor-time (chrono (setf dicts (loop repeat repeat
collect (funcall constructor)))))
(fill-time (chrono (mapc (lambda (dict)
(loop for k in keys for v in values
do (dict-set dict k v)))
dicts)))
(key-time-full (chrono (mapc (function dict-key) dicts)))
(get-time-full (chrono (mapc (lambda (dict)
(loop for k in keys for v in values
collect (eql v (dict-get dict k))))
dicts)))
(del-time (chrono (mapc (lambda (dict)
(loop for k in keys for v in values
for i from 0
when (oddp i)
do (dict-del dict k)))
dicts)))
(key-time-part (chrono (mapc (function dict-key) dicts)))
(get-time-part (chrono (mapc (lambda (dict)
(loop for k in keys for v in values
for i from 0
if (oddp i)
collect (null (dict-get dict k))
else
collect (eql v (dict-get dict k))))
dicts))))
(let ((count (float repeat)))
(list :constructor-time (/ constructor-time count)
:fill-time (/ fill-time count)
:key-time-full (/ key-time-full count)
:get-time-full (/ get-time-full count)
:del-time (/ del-time count)
:key-time-part (/ key-time-part count)
:get-time-part (/ get-time-part count)))))
(defun benchmark-all ()
(loop with repeat = 100
for size in '(1 2 3 4 5 6 7 8 9 10 15 20 25 30 35 40 50 60 70 80 90 100)
collect (loop
with null-case = (benchmark 'make-null-dictionary size repeat)
for constructor in '(make-a-list-dictionary make-hash-table-dictionary)
collect (list* constructor size
(mapcar* (lambda (v b)
(if (numberp v)
(- v b)
v))
(benchmark constructor size repeat)
null-case)))))
;; (benchmark-all)
------------------------------------------------------------------------
> PJB> Unfortunately there are many more than 8 datastructures for which you
> PJB> could want a specific syntax.
>
> You have to be specific about why we'd discuss or want these 8 data
> structures, since the discussion was only about hashtables.
Nonetheless, this argument is the reason why the print/read syntax for
hash-tables is:
> #s(hash-table size 65 test eql rehash-size 1.5 rehash-threshold 0.8 data ())
This extends easily to any other data structure for which you'd want a
literal representation.
--
__Pascal Bourguignon__ http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk
next prev parent reply other threads:[~2015-08-05 17:24 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-31 21:42 How to iterate over properties in a plist? Marcin Borkowski
2015-07-31 22:18 ` Stefan Monnier
2015-07-31 22:29 ` Marcin Borkowski
2015-07-31 22:42 ` Dmitry Gutov
2015-08-01 13:34 ` Michael Heerdegen
[not found] ` <mailman.7705.1438381807.904.help-gnu-emacs@gnu.org>
2015-07-31 23:33 ` Pascal J. Bourguignon
2015-08-01 22:49 ` Stefan Monnier
[not found] ` <mailman.7750.1438469396.904.help-gnu-emacs@gnu.org>
2015-08-04 10:15 ` plists, alists, and hashtables (was: How to iterate over properties in a plist?) Ted Zlatanov
2015-08-04 10:29 ` Nicolas Petton
[not found] ` <mailman.7809.1438684158.904.help-gnu-emacs@gnu.org>
2015-08-04 11:23 ` plists, alists, and hashtables Ted Zlatanov
2015-08-05 4:36 ` plists, alists, and hashtables (was: How to iterate over properties in a plist?) Rusi
2015-08-05 6:12 ` plists, alists, and hashtables Pascal J. Bourguignon
2015-08-05 9:47 ` Ted Zlatanov
2015-08-05 12:20 ` Rusi
2015-08-06 19:16 ` Stefan Monnier
[not found] ` <mailman.7892.1438888819.904.help-gnu-emacs@gnu.org>
2015-08-07 16:33 ` Rusi
2015-08-05 17:24 ` Pascal J. Bourguignon [this message]
2015-08-05 18:31 ` Ted Zlatanov
2015-08-05 19:30 ` Barry Margolin
2015-08-05 19:40 ` Robert Thorpe
2015-08-05 21:11 ` Pascal J. Bourguignon
2015-08-06 15:17 ` Ted Zlatanov
2015-08-06 18:46 ` Pascal J. Bourguignon
2015-08-06 20:19 ` Drew Adams
2015-08-06 21:08 ` Ted Zlatanov
2015-08-07 0:23 ` Pascal J. Bourguignon
2015-08-06 19:35 ` Stefan Monnier
2015-08-05 13:48 ` Drew Adams
2015-08-06 19:12 ` Stefan Monnier
[not found] ` <mailman.7890.1438888393.904.help-gnu-emacs@gnu.org>
2015-08-06 20:00 ` Pascal J. Bourguignon
2015-08-06 20:57 ` Ted Zlatanov
2015-08-06 21:10 ` Drew Adams
[not found] ` <mailman.7902.1438895429.904.help-gnu-emacs@gnu.org>
2015-08-06 21:15 ` Ted Zlatanov
2015-08-06 21:31 ` Stefan Monnier
2015-08-07 1:53 ` Ted Zlatanov
2015-08-07 7:34 ` Pascal J. Bourguignon
2015-08-07 16:32 ` Stefan Monnier
[not found] ` <mailman.7941.1438965165.904.help-gnu-emacs@gnu.org>
2015-08-08 3:48 ` Pascal J. Bourguignon
2015-08-08 13:42 ` Stefan Monnier
2015-08-08 14:51 ` Rusi
2015-08-07 0:08 ` Pascal J. Bourguignon
2015-08-07 2:14 ` Ted Zlatanov
2015-08-07 7:53 ` Pascal J. Bourguignon
2015-08-07 11:21 ` Ted Zlatanov
2015-08-07 11:47 ` Pascal J. Bourguignon
2015-08-07 17:21 ` Ted Zlatanov
2015-08-07 19:21 ` Stefan Monnier
[not found] ` <mailman.7952.1438975314.904.help-gnu-emacs@gnu.org>
2015-08-08 3:52 ` Pascal J. Bourguignon
2015-08-07 16:35 ` Stefan Monnier
[not found] <mailman.7856.1438803631.904.help-gnu-emacs@gnu.org>
2015-08-05 20:08 ` Ted Zlatanov
2015-08-05 20:45 ` Stefan Monnier
2015-08-05 21:36 ` Drew Adams
2015-08-05 21:41 ` Pascal J. Bourguignon
[not found] ` <mailman.7860.1438807554.904.help-gnu-emacs@gnu.org>
2015-08-06 1:32 ` Ted Zlatanov
[not found] ` <mailman.7862.1438810623.904.help-gnu-emacs@gnu.org>
2015-08-06 1:36 ` Ted Zlatanov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=878u9pps1c.fsf@kuiper.lan.informatimago.com \
--to=pjb@informatimago.com \
--cc=help-gnu-emacs@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.