From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Pascal J. Bourguignon"
Newsgroups: gmane.emacs.help
Subject: Re: plists, alists, and hashtables
Date: Wed, 05 Aug 2015 23:11:41 +0200
Organization: Informatimago
Message-ID: <87vbcto2ya.fsf@kuiper.lan.informatimago.com>
References: <876150vwaa.fsf@mbork.pl>
 <873803x5q4.fsf@kuiper.lan.informatimago.com>
 <87a8u7we9s.fsf_-_@lifelogs.com>
 <02f81836-554f-4bb4-873b-85c24e080e3d@googlegroups.com>
 <87614uqn5l.fsf@kuiper.lan.informatimago.com>
 <87d1z2ukw1.fsf@lifelogs.com>
 <878u9pps1c.fsf@kuiper.lan.informatimago.com>
 <87oailbn8t.fsf@lifelogs.com>
To: help-gnu-emacs@gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Ted Zlatanov writes:

> I think these details are easily optimized at the C level.  Clearly an
> alist is better as the *backend* hashtable implementation for up to 10,
> possibly up to 100 entries (depending on caching, pipelining, hashing
> function, and other factors).  But the frontend presentation is what I'm
> concerned about.

DUH!  What did I just do to write the benchmark?

So you're not discussing hash-tables, but how to provide high-level
abstractions.

Well DUH, just program them!  With functions and macros.

> I think a better reader syntax for hashtables would
> make them easier to write and read in code and would error out if they
> are malformed.  That's an improvement over alists and plists I think.

And yes, if you had readtables and reader macros, you could also use a
reader macro.

Don't ask for a dictionary abstraction.  Ask for readtables and reader
macros, so that you may implement your own syntax for your own
abstractions!

>
> PJB> The only advantage they have, is on speed of access in big dictionaries.
>
> PJB> But even when you need a O(1) access on a big dictionary, you will find
> PJB> you keep converting between hash-table and lists or vectors, if only to
> PJB> sort the entries out to present them to the user!
>
> That's no different than looping over alists and plists to collect and
> sort the entries, is it?

Well, for small dictionaries (about 8 entries), it's still faster to
copy the keys from an a-list than from a hash-table.

> PJB> instead of a-list/p-lists, the only result you'd attain would be to
> PJB> slow down emacs.  Run your own benchmark.  On my computer, I notice
> PJB> that until the size of the dictionary reaches 20-30, a-lists are
> PJB> performing better than hash-table.  (And even, I call assoc on set,
> PJB> while for a-list it is customary to just do acons, so inserting or
> PJB> resetting entries would be even much faster).
>
> Absolutely, but I mentioned already that this is easily fixed on the
> back end.  I think it's clear that while hashtables *can* scale, alists
> and plists *can't* because their backend and frontend are the same.
> Hashtables are only accessible through an API, their backend is
> hidden.

Yes, you're asking for an abstraction and a nice API and syntax for it.
I'm saying OK, implement it; Lisp is good for that.  (It could be better
with reader macros.)

I'm also saying: beware anyway, because even with adaptive data
structures abstracted away, you will always have some (usage) complexity
coming up, since your abstract operations will have some overhead, and a
time and space complexity that may not be the best in some specific
cases.
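
To make the break-even point concrete, a benchmark along the following lines can be run locally.  This is only a minimal sketch, not the benchmark discussed earlier in the thread: it assumes integer keys compared with eq, times lookups only (no insertion), and the my- prefixed names are hypothetical helpers introduced for illustration.

```elisp
(require 'benchmark)

(defun my-make-alist (n)
  "Return an a-list mapping the integers 0..N-1 to their squares."
  (let ((alist '()))
    (dotimes (i n)
      (push (cons i (* i i)) alist))
    alist))

(defun my-make-table (n)
  "Return a hash-table mapping the integers 0..N-1 to their squares."
  (let ((table (make-hash-table :test 'eq :size n)))
    (dotimes (i n)
      (puthash i (* i i) table))
    table))

(defun my-compare-lookups (n repetitions)
  "Look up every key of two dictionaries of size N, REPETITIONS times.
Return a list (N ALIST-SECONDS TABLE-SECONDS)."
  (let ((alist (my-make-alist n))
        (table (my-make-table n)))
    (list n
          ;; Summing the values keeps each lookup result in use.
          (car (benchmark-run repetitions
                 (let ((sum 0))
                   (dotimes (k n)
                     (setq sum (+ sum (cdr (assq k alist)))))
                   sum)))
          (car (benchmark-run repetitions
                 (let ((sum 0))
                   (dotimes (k n)
                     (setq sum (+ sum (gethash k table))))
                   sum))))))

;; Try a few sizes on both sides of the 20-30 range quoted above.
(mapcar (lambda (n) (my-compare-lookups n 1000))
        '(4 8 16 32 64 128))
```

The timings depend on the machine, the Emacs version and whether the code is byte-compiled, so the size at which the two columns cross should be read as an indication rather than a constant.
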

Consider the sizes of the hash-tables currently bound to variables:

    (let ((sizes '()))
      (mapatoms (lambda (s)
                  (when (boundp s)
                    (let ((table (symbol-value s)))
                      (when (hash-table-p table)
                        (push (hash-table-count table) sizes))))))
      (sort sizes '<))

Here are the results on three different emacs instances doing different
things: CL development with slime, erc, gnus:

    --> (0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 3 4 4 5 15 18 19 23 23 28 36 54 108 252 253 366 709 978 1013)
    --> (0 0 0 0 0 0 0 0 0 0 0 0 1 1 2 4 4 5 8 19 19 22 23 23 28 54 108 252 253 366 604 752 871 894 978)
    --> (0 0 0 0 0 0 0 0 0 0 1 2 4 4 4 5 9 19 19 23 23 26 28 30 54 108 252 253 366 562 752 894 978 1638)

    (length '(0 0 0 0 0 0 0 0 0 0 0)) --> 11
    (length '(1 2 4 4 4 5 9)) --> 7
    (length '(19 19 23 23 26 28 30)) --> 7
    (length '(54 108 252 253 366 562 752 894 978 1638)) --> 10

So about half of those hash-tables are too small and should have been
implemented as a-lists, one quarter are around the break-even point, and
only one quarter should definitely be hash-tables.

You could improve this by implementing a better data walker; here we
only look at the hash-tables directly bound to variables.

-- 
__Pascal Bourguignon__                 http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog.  The man will be there to feed the dog.  The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk