From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: "Pascal J. Bourguignon" <pjb@informatimago.com>
Newsgroups: gmane.emacs.help
Subject: Re: plists, alists, and hashtables
Date: Wed, 05 Aug 2015 23:11:41 +0200
Organization: Informatimago
Message-ID: <87vbcto2ya.fsf@kuiper.lan.informatimago.com>
References: <876150vwaa.fsf@mbork.pl>
	<jwvoais9dk5.fsf-monnier+gmane.emacs.help@gnu.org>
	<mailman.7705.1438381807.904.help-gnu-emacs@gnu.org>
	<873803x5q4.fsf@kuiper.lan.informatimago.com>
	<mailman.7750.1438469396.904.help-gnu-emacs@gnu.org>
	<87a8u7we9s.fsf_-_@lifelogs.com>
	<02f81836-554f-4bb4-873b-85c24e080e3d@googlegroups.com>
	<87614uqn5l.fsf@kuiper.lan.informatimago.com>
	<87d1z2ukw1.fsf@lifelogs.com>
	<878u9pps1c.fsf@kuiper.lan.informatimago.com>
	<87oailbn8t.fsf@lifelogs.com>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: ger.gmane.org 1438809339 9587 80.91.229.3 (5 Aug 2015 21:15:39 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Wed, 5 Aug 2015 21:15:39 +0000 (UTC)
To: help-gnu-emacs@gnu.org
Original-X-From: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Wed Aug 05 23:15:21 2015
Return-path: <help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org>
Envelope-to: geh-help-gnu-emacs@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org>)
	id 1ZN61v-0003XX-Ee
	for geh-help-gnu-emacs@m.gmane.org; Wed, 05 Aug 2015 23:15:19 +0200
Original-Received: from localhost ([::1]:42177 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org>)
	id 1ZN61u-0008P1-Ki
	for geh-help-gnu-emacs@m.gmane.org; Wed, 05 Aug 2015 17:15:18 -0400
Original-Path: usenet.stanford.edu!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
Original-Newsgroups: gnu.emacs.help
Original-Lines: 100
Original-X-Trace: individual.net le2R1L2oTGwdJvwkUCXKxQBA/BtMXBvfjkAd4bTegKT15X9qQX
Cancel-Lock: sha1:M2NhZmM5ZGU5OWE5ODBlNjQ3NmMxMTJhODZmNGZkYjJiZGRiODI0MA==
	sha1:vzZIZ/6xYFMOvwNsiwSKPAIi8Ss=
X-Accept-Language: fr, es, en
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
Original-Xref: usenet.stanford.edu gnu.emacs.help:213984
X-BeenThere: help-gnu-emacs@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Users list for the GNU Emacs text editor <help-gnu-emacs.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/help-gnu-emacs>,
	<mailto:help-gnu-emacs-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/help-gnu-emacs>
List-Post: <mailto:help-gnu-emacs@gnu.org>
List-Help: <mailto:help-gnu-emacs-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/help-gnu-emacs>,
	<mailto:help-gnu-emacs-request@gnu.org?subject=subscribe>
Errors-To: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org
Original-Sender: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.help:106269
Archived-At: <http://permalink.gmane.org/gmane.emacs.help/106269>

Ted Zlatanov <tzz@lifelogs.com> writes:

> I think these details are easily optimized at the C level. Clearly an
> alist is better as the *backend* hashtable implementation for up to 10,
> possibly up to 100 entries (depending on caching, pipelining, hashing
> function, and other factors). But the frontend presentation is what I'm
> concerned about. 

DUH!  What did I just do to write the benchmark?

So you're not discussing hash-tables, but how to provide high level
abstractions.  Well DUH, just program them, with functions and
macros!

> I think a better reader syntax for hashtables would
> make them easier to write and read in code and would error out if they
> are malformed. That's an improvement over alists and plists I think.

And yes, if you had readtables and reader macros, also a reader macro.

Don't ask for a dictionary abstraction.  Ask for readtables and reader
macros!  So that you may implement your own syntax for your own
abstractions!
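
Even without reader macros, an ordinary macro already gets you part of
the way.  Here is a minimal sketch (the name `dict' is made up for the
illustration, it is not an Emacs facility) that builds a hash-table from
alternating keys and values, and signals an error at macro-expansion
time when the form is malformed:

```elisp
;; Hypothetical `dict' macro: an illustration, not an existing Emacs API.
(defmacro dict (&rest kvs)
  "Expand into code building an `equal' hash-table from KVS.
KVS is an alternating sequence of key and value forms.
Signal an error at macro-expansion time if a key has no value."
  (when (= 1 (% (length kvs) 2))
    (error "dict: odd number of arguments: %S" kvs))
  (let ((table (make-symbol "table"))
        (forms '()))
    ;; Turn each key/value pair into one `puthash' form.
    (while kvs
      (let* ((key (pop kvs))
             (value (pop kvs)))
        (push `(puthash ,key ,value ,table) forms)))
    `(let ((,table (make-hash-table :test 'equal)))
       ,@(nreverse forms)
       ,table)))
```

With this, (gethash "b" (dict "a" 1 "b" 2)) returns 2, while
(dict "a" 1 "b") is rejected as soon as it is macroexpanded.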


>
> PJB> The only advantage they have, is on speed of access in big dictionaries.
>
> PJB> But even when you need a O(1) access on a big dictionary, you will find
> PJB> you keep converting between hash-table and lists or vectors, of only to
> PJB> sort the entries out to present them to the user!
>
> That's no different than looping over alists and plists to collect and
> sort the entries, is it?

Well, for small dictionaries (about 8 entries), it's still faster to
copy the keys from an a-list than from a hash-table.
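
To make the comparison concrete, here is a sketch of the two key
extractions.  The function names are mine, chosen for the example; time
them on your own machine (for instance with `benchmark-run') rather than
trusting my numbers:

```elisp
;; Hypothetical helper names, for illustration only.
(defun my-alist-keys (alist)
  "Return a fresh list of the keys of ALIST, in a-list order."
  (mapcar #'car alist))

(defun my-hash-table-keys (table)
  "Return a fresh list of the keys of the hash-table TABLE.
The order of the keys is whatever `maphash' produces."
  (let ((keys '()))
    (maphash (lambda (key _value) (push key keys)) table)
    keys))

;; Possible timing, results depend on the machine and Emacs version:
;; (require 'benchmark)
;; (benchmark-run 100000 (my-alist-keys some-alist))
;; (benchmark-run 100000 (my-hash-table-keys some-table))
```

In both cases you end up with a plain list that you still have to sort
before showing it to the user.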


> PJB> instead of a-list/p-lists, the only result you'd attain would be to
> PJB> slow down emacs. Run your own benchmark. On my computer, I notice
> PJB> that until the size of the dictionary reaches 20-30, a-lists are
> PJB> performing better than hash-table. (And even, I call assoc on set,
> PJB> while for a-list it is customary to just do acons, so inserting or
> PJB> reseting entries would be even much faster).
>
> Absolutely, but I mentioned already that this is easily fixed on the
> back end. I think it's clear that while hashtables *can* scale, alists
> and plists *can't* because their backend and frontend are the same.
> Hashtables are only accessible through an API, their backend is
> hidden.

Yes, you're asking for an abstraction, and a nice API and syntax for
it.

I'm saying ok, implement it, Lisp is good for that.  (It could be better
with reader macros).

I'm also saying: beware anyway, because even with adaptive data
structures abstracted away, you will always have some (usage) complexity
coming up, from the fact that your abstract operations will have some
overhead, and some time and space complexity that may not be the best
in some specific cases.
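
For example, such an adaptive dictionary could be sketched as follows.
Everything named `adict' here is hypothetical, and the threshold of 30
is only an assumption taken from the break-even I observed on my
computer; it starts as an a-list and switches to a hash-table once it
grows:

```elisp
(require 'cl-lib)

;; Assumed break-even size; measure it on your own machine.
(defconst adict-threshold 30
  "Number of entries above which an adict switches to a hash-table.")

(cl-defstruct (adict (:constructor adict-create ()))
  (alist '())   ; entries, while the dictionary is small
  (table nil))  ; a hash-table, once the dictionary has grown

(defun adict-get (dict key &optional default)
  "Return the value of KEY in DICT, or DEFAULT if KEY is absent."
  (if (adict-table dict)
      (gethash key (adict-table dict) default)
    (let ((entry (assoc key (adict-alist dict))))
      (if entry (cdr entry) default))))

(defun adict-put (dict key value)
  "Associate KEY with VALUE in DICT, and return VALUE."
  (if (adict-table dict)
      (puthash key value (adict-table dict))
    (let ((entry (assoc key (adict-alist dict))))
      (if entry
          (setcdr entry value)
        (push (cons key value) (adict-alist dict))))
    ;; This size check is itself overhead paid on every small insertion.
    (when (> (length (adict-alist dict)) adict-threshold)
      (let ((table (make-hash-table :test 'equal)))
        (dolist (entry (adict-alist dict))
          (puthash (car entry) (cdr entry) table))
        (setf (adict-table dict) table
              (adict-alist dict) '()))))
  value)
```

Note that the `assoc' in adict-put and the `length' check are exactly
the kind of hidden cost I mean: a caller who knows the key is new would
have just consed onto the a-list.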


;; Collect the sizes of all the hash-tables directly bound to variables.
(let ((sizes '()))
  (mapatoms (lambda (s)
              (when (boundp s)
                (let ((table (symbol-value s)))
                  (when (hash-table-p table)
                    (push (hash-table-count table) sizes))))))
  (sort sizes #'<))

Here are the results on three different Emacs instances doing different
things: CL development with slime, erc, gnus:

--> (0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 3 4 4 5 15 18 19 23 23 28 36 54 108 252 253 366 709 978 1013)
--> (0 0 0 0 0 0 0 0 0 0 0 0 1 1 2 4 4 5 8 19 19 22 23 23 28 54 108 252 253 366 604 752 871 894 978)
--> (0 0 0 0 0 0 0 0 0 0 1 2 4 4 4 5 9 19 19 23 23 26 28 30 54 108 252 253 366 562 752 894 978 1638)

(length '(0 0 0 0 0 0 0 0 0 0 0))                    --> 11
(length '(1 2 4 4 4 5 9))                            -->  7
(length '(19 19 23 23 26 28 30))                     -->  7             
(length '(54 108 252 253 366 562 752 894 978 1638))  --> 10

So about half of those hash-tables are too small, and should have been
implemented as a-lists, one quarter is around the break-even point, and
only one quarter should definitely be hash-tables.

You could improve this by implementing a better data walker; here we
only look at the hash-tables directly bound to variables.
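
A rough sketch of such a walker follows.  The function name and the
default depth of 2 are my own choices for the example; it only descends
into conses, vectors and hash-table values (not into records, closures
or symbol properties), and on a big Emacs session it may take a while:

```elisp
;; Hypothetical walker, for illustration only.
(defun my-nested-hash-table-sizes (&optional max-depth)
  "Return the sorted sizes of hash-tables reachable from bound variables.
Descend breadth-first through conses, vectors and hash-table values,
at most MAX-DEPTH levels deep (default 2).  Each object is visited once."
  (let ((max-depth (or max-depth 2))
        (roots '()))
    ;; Level 0: the values of all the bound variables.
    (mapatoms (lambda (s)
                (when (boundp s)
                  (push (symbol-value s) roots))))
    (let ((seen (make-hash-table :test 'eq))
          (sizes '())
          (level roots)
          (depth 0))
      (while (and level (<= depth max-depth))
        (let ((next '()))
          (while level
            (let ((object (pop level)))
              (when (and (or (consp object)
                             (vectorp object)
                             (hash-table-p object))
                         (not (gethash object seen)))
                (puthash object t seen)
                (cond
                 ((consp object)
                  ;; The rest of a list stays on the current level.
                  (push (cdr object) level)
                  (push (car object) next))
                 ((vectorp object)
                  (mapc (lambda (element) (push element next)) object))
                 (t
                  (push (hash-table-count object) sizes)
                  (maphash (lambda (_key value) (push value next))
                           object))))))
          (setq level next
                depth (1+ depth))))
      (sort sizes #'<))))
```

Called without argument, it also counts a hash-table stored for example
in a vector inside a list bound to a variable, which the simple version
above misses.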


-- 
__Pascal Bourguignon__                 http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk