From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: "Pascal J. Bourguignon" Newsgroups: gmane.emacs.help Subject: Re: plists, alists, and hashtables Date: Wed, 05 Aug 2015 19:24:31 +0200 Organization: Informatimago Message-ID: <878u9pps1c.fsf@kuiper.lan.informatimago.com> References: <876150vwaa.fsf@mbork.pl> <873803x5q4.fsf@kuiper.lan.informatimago.com> <87a8u7we9s.fsf_-_@lifelogs.com> <02f81836-554f-4bb4-873b-85c24e080e3d@googlegroups.com> <87614uqn5l.fsf@kuiper.lan.informatimago.com> <87d1z2ukw1.fsf@lifelogs.com> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Trace: ger.gmane.org 1438795530 7112 80.91.229.3 (5 Aug 2015 17:25:30 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Wed, 5 Aug 2015 17:25:30 +0000 (UTC) To: help-gnu-emacs@gnu.org Original-X-From: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Wed Aug 05 19:25:24 2015 Return-path: Envelope-to: geh-help-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1ZN2RP-00085m-T6 for geh-help-gnu-emacs@m.gmane.org; Wed, 05 Aug 2015 19:25:24 +0200 Original-Received: from localhost ([::1]:41396 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZN2RP-0001UC-73 for geh-help-gnu-emacs@m.gmane.org; Wed, 05 Aug 2015 13:25:23 -0400 X-Received: by 10.112.218.10 with SMTP id pc10mr3043323lbc.8.1438795472778; Wed, 05 Aug 2015 10:24:32 -0700 (PDT) Original-Path: usenet.stanford.edu!ue2no45128lbc.1!news-out.google.com!e9ni121976lbc.1!nntp.google.com!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail Original-Newsgroups: gnu.emacs.help Original-Lines: 191 Original-X-Trace: individual.net ZOFMka5MoXekG2TDK6vr8wmQdeJVrZ0vFXLOETIFPn+7yfHRWT Cancel-Lock: sha1:ZWJhOWRhMDRiOWU4NzI0Yjg0OGNmM2Y4YWY3ZjAyNWM0OGUwODExNQ== sha1:9QSbBl6U6Qxifxz8rvXATBWQigY= Face: iVBORw0KGgoAAAANSUhEUgAAADAAAAAwAQMAAABtzGvEAAAABlBMVEUAAAD///+l2Z/dAAAA oElEQVR4nK3OsRHCMAwF0O8YQufUNIQRGIAja9CxSA55AxZgFO4coMgYrEDDQZWPIlNAjwq9 033pbOBPtbXuB6PKNBn5gZkhGa86Z4x2wE67O+06WxGD/HCOGR0deY3f9Ijwwt7rNGNf6Oac l/GuZTF1wFGKiYYHKSFAkjIo1b6sCYS1sVmFhhhahKQssRjRT90ITWUk6vvK3RsPGs+M1RuR mV+hO/VvFAAAAABJRU5ErkJggg== X-Accept-Language: fr, es, en User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) Original-Xref: usenet.stanford.edu gnu.emacs.help:213976 X-BeenThere: help-gnu-emacs@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Users list for the GNU Emacs text editor List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Original-Sender: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.help:106261 Archived-At: Ted Zlatanov writes: > On Wed, 05 Aug 2015 08:12:22 +0200 "Pascal J. Bourguignon" wrote: > > PJB> What you are losing from sight is the fact that: > > PJB> - a-lists are lists > PJB> - p-lists are lists > PJB> - lists are sequences > PJB> - lists are cons cells or nil > > PJB> therefore any operator working on cons cells, on sequences, on lists, > PJB> can also work on p-lists and on a-lists. > > Yes, I think the implicit advantage of everything being a list is well > understood amongst us. But so is the disadvantage of treating > everything as a list. The question is whether hashtables, an existing > ELisp map data type, could become more popular. Why? How would that be good? Seriously, hash-tables have a lot of drawbacks. They use much more memory, they are much slower (on small dictionaries), they are much restrictive on the possible key equivalence function. The only advantage they have, is on speed of access in big dictionaries. But even when you need a O(1) access on a big dictionary, you will find you keep converting between hash-table and lists or vectors, of only to sort the entries out to present them to the user! If you achieved your goal of having unwashed masses use hash-tables instead of a-list/p-lists, the only result you'd attain would be to slow down emacs. Run your own benchmark. On my computer, I notice that until the size of the dictionary reaches 20-30, a-lists are performing better than hash-table. (And even, I call assoc on set, while for a-list it is customary to just do acons, so inserting or reseting entries would be even much faster). ------------------------------------------------------------------------ ;;; -*- mode:emacs-lisp;lexical-binding:t;coding:utf-8 -*- (setf lexical-binding t) (defun make-hash-table-dictionary () (let ((table (make-hash-table))) (lambda (m &optional k v) (ecase m (get (gethash k table)) (del (remhash k table)) (set (setf (gethash k table) v)) (key (let ((keys '())) (maphash (lambda (k v) (declare (ignore v)) (push k keys)) table) keys)))))) (defun make-a-list-dictionary () (let ((table '())) (lambda (m &optional k v) (ecase m (get (cdr (assoc k table))) (del (let ((prev-entry (loop for cell on (cons nil table) until (eql k (cadr cell)) finally (return cell)))) (when prev-entry (setf (cdr prev-entry) (cddr prev-entry))) (setf table (cdr prev-entry)))) (set (let ((entry (assoc k table))) (if entry (setf (cdr entry) v) (setf table (acons k v table))))) (key (mapcar (function car) table)))))) (defun make-null-dictionary () (lambda (m &optional k v) (declare (ignore k v)) (ecase m ((get del set key) nil)))) (defun dict-get (dict k) (funcall dict 'get k)) (defun dict-del (dict k) (funcall dict 'del k)) (defun dict-set (dict k v) (funcall dict 'set k v)) (defun dict-key (dict) (funcall dict 'key)) (defun get-internal-real-time () (destructuring-bind (high low microsec &rest ignored) (current-time) (declare (ignore ignored)) (+ (* high 65536.0) low (* 1e-6 microsec)))) (defmacro chrono (&rest body) "Returns the time it took to evaluate the body expressions (in an implict progn)." (let ((start (gensym)) (result (gensym))) `(let* ((,start (get-internal-real-time)) (,result (progn ,@body)) ) (- (get-internal-real-time) ,start)))) (defun generate-keys (size) (loop repeat size collect (gensym))) (defun generate-values (size) (loop repeat size with max = (* size 1000) collect (random size))) (defun benchmark (constructor size repeat) (let* ((dicts '()) (keys (generate-keys size)) (values (generate-values size)) (constructor-time (chrono (setf dicts (loop repeat repeat collect (funcall constructor))))) (fill-time (chrono (mapc (lambda (dict) (loop for k in keys for v in values do (dict-set dict k v))) dicts))) (key-time-full (chrono (mapc (function dict-key) dicts))) (get-time-full (chrono (mapc (lambda (dict) (loop for k in keys for v in values collect (eql v (dict-get dict k)))) dicts))) (del-time (chrono (mapc (lambda (dict) (loop for k in keys for v in values for i from 0 when (oddp i) do (dict-del dict k))) dicts))) (key-time-part (chrono (mapc (function dict-key) dicts))) (get-time-part (chrono (mapc (lambda (dict) (loop for k in keys for v in values for i from 0 if (oddp i) collect (null (dict-get dict k)) else collect (eql v (dict-get dict k)))) dicts)))) (let ((count (float repeat))) (list :constructor-time (/ constructor-time count) :fill-time (/ fill-time count) :key-time-full (/ key-time-full count) :get-time-full (/ get-time-full count) :del-time (/ del-time count) :key-time-part (/ key-time-part count) :get-time-part (/ get-time-part count))))) (defun benchmark-all () (loop with repeat = 100 for size in '(1 2 3 4 5 6 7 8 9 10 15 20 25 30 35 40 50 60 70 80 90 100) collect (loop with null-case = (benchmark 'make-null-dictionary size repeat) for constructor in '(make-a-list-dictionary make-hash-table-dictionary) collect (list* constructor size (mapcar* (lambda (v b) (if (numberp v) (- v b) v)) (benchmark constructor size repeat) null-case))))) ;; (benchmark-all) ------------------------------------------------------------------------ > PJB> Unfortunately there are many more than 8 datastructures for which you > PJB> could want a specific syntax. > > You have to be specific about why we'd discuss or want these 8 data > structures, since the discussion was only about hashtables. Nonetheless, this argument is the reason why the print/read syntax for hash-tables is: > #s(hash-table size 65 test eql rehash-size 1.5 rehash-threshold 0.8 data ()) This extends easily to any other data structure for which you'd want a literal representation. -- __Pascal Bourguignon__ http://www.informatimago.com/ “The factory of the future will have only two employees, a man and a dog. The man will be there to feed the dog. The dog will be there to keep the man from touching the equipment.” -- Carl Bass CEO Autodesk