From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Philipp Stephani Newsgroups: gmane.emacs.bugs Subject: bug#28753: 25.3; Functions to get alist from hash table and vice versa Date: Sun, 04 Mar 2018 19:17:17 +0000 Message-ID: References: <54ecd1bb-0c84-4b0a-b19e-3a89cbe832bc@default> <94d02ad7-d069-4195-b84a-13229dfa3e9a@default> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="94eb2c1cf776709b8605669b0fbe" X-Trace: blaine.gmane.org 1520190974 28027 195.159.176.226 (4 Mar 2018 19:16:14 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Sun, 4 Mar 2018 19:16:14 +0000 (UTC) Cc: 28753@debbugs.gnu.org To: Drew Adams Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Sun Mar 04 20:16:10 2018 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1esZ7A-0006QN-4X for geb-bug-gnu-emacs@m.gmane.org; Sun, 04 Mar 2018 20:16:08 +0100 Original-Received: from localhost ([::1]:45843 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1esZ9C-0006Lp-BK for geb-bug-gnu-emacs@m.gmane.org; Sun, 04 Mar 2018 14:18:14 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:42330) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1esZ94-0006LW-LY for bug-gnu-emacs@gnu.org; Sun, 04 Mar 2018 14:18:08 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1esZ90-0001Wm-JC for bug-gnu-emacs@gnu.org; Sun, 04 Mar 2018 14:18:06 -0500 Original-Received: from debbugs.gnu.org ([208.118.235.43]:36809) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1esZ90-0001Wi-DS for bug-gnu-emacs@gnu.org; Sun, 04 Mar 2018 14:18:02 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1esZ90-0004SH-77 for bug-gnu-emacs@gnu.org; Sun, 04 Mar 2018 14:18:02 -0500 X-Loop: help-debbugs@gnu.org Resent-From: Philipp Stephani Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 04 Mar 2018 19:18:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 28753 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: Original-Received: via spool by 28753-submit@debbugs.gnu.org id=B28753.152019105717089 (code B ref 28753); Sun, 04 Mar 2018 19:18:02 +0000 Original-Received: (at 28753) by debbugs.gnu.org; 4 Mar 2018 19:17:37 +0000 Original-Received: from localhost ([127.0.0.1]:44706 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1esZ8a-0004RZ-IM for submit@debbugs.gnu.org; Sun, 04 Mar 2018 14:17:36 -0500 Original-Received: from mail-lf0-f53.google.com ([209.85.215.53]:45565) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1esZ8Y-0004RH-Ng for 28753@debbugs.gnu.org; Sun, 04 Mar 2018 14:17:35 -0500 Original-Received: by mail-lf0-f53.google.com with SMTP id h127so19030206lfg.12 for <28753@debbugs.gnu.org>; Sun, 04 Mar 2018 11:17:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=i8QHCl7KpbIGbbp6RI+AZsqMZuWmEVgJSOJm1RuBweg=; b=JzGiq8U400CYgmqgnwJPVw/caG6hwwoAR93VYoqNS6n2GOSOjHpQbzxyUrf8i6BJI2 BJx3oZClQ6JswO+qk3PSogkdYuMXRbTKyonJIfgKM0sBV5rWB2jCB3pLiHBJ7HD1Q2uN QiIPDzIdmQcLXYPbTcpOUZuzvJ12cJdYLSEENZvMRw5fOfMr1qAx6oAUIOcaiH5owSJO ugzlGgrt/1ldtrSUO8Mnhc5Lowno5Evt91lKiiL1NQ7cruVKlCHb04dwzROvMIRE4OyO Y25LrzmWmG9bqyl1XK1DMNlSnsZC7Mqigj2qIt6WqOIl/peeMeu8xybzMYr3O758QhtU Cbvw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=i8QHCl7KpbIGbbp6RI+AZsqMZuWmEVgJSOJm1RuBweg=; b=kquYjhEColmEGk7rGWSzqfTb8hMEPHAMrJlKm/nGf3ho6ZZB8Fc43bDD8CYGLUPX8K befzJw1OTY0YR1QSCuGxPOWPxuSIKT30IAPI5tzyeKUjgcofRbF258MdkWb29fu4ooNC BoDox71CoOH3ExHroizhMQQugVeYWZPUCz6SAyIPnqI0pr5kiPfG72zLN6+xTv6r1/z/ P5qJCzmgPzRh85JI79zVaQCZyqkMBbhEpGtbliwHZDlZKWAsItHX1YCQbys3LB6xY5xK g782h1443CEEoSb+/KJ05nIzqVxZJjcA5sf3NevcgNPQ1lqw72+AuYwTciLw54fnoPso ibvw== X-Gm-Message-State: AElRT7HJWXrm9n30+wHYAdB3dE0EnCyjFUGV/YgdfRloDdagYvDplJ6U q1TMqF6maClVUEj1MasNY5wf35WJ/qGUeROeYWg= X-Google-Smtp-Source: AG47ELuQV3EZYBs9k4G8FD1KY5+T93t1NDC29Rwba1QauK1DulG6KSK27amHjazxhj6e7Ly5NQ3nmaCy2S6FtzB+hYg= X-Received: by 10.46.83.67 with SMTP id t3mr7898720ljd.63.1520191048553; Sun, 04 Mar 2018 11:17:28 -0800 (PST) In-Reply-To: <94d02ad7-d069-4195-b84a-13229dfa3e9a@default> X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.org gmane.emacs.bugs:143902 Archived-At: --94eb2c1cf776709b8605669b0fbe Content-Type: text/plain; charset="UTF-8" Drew Adams schrieb am So., 31. Dez. 2017 um 01:01 Uhr: > > > (when (or use-last-p (not (gethash key ht))) > > > > This doesn't work if the value is nil. > > I guess you mean that it doesn't DTRT if the cdr of an alist > entry is `nil' - e.g. (foo) aka (foo . nil). > Yes. > > > You need to use an uninterned symbol or some other unique object, e.g. > > (eq (gethash key ht #1='#:void) #1#) > > OK. Dunno which is clearer or whether there is some > performance difference, but I would probably just bind > a var to a unique cons, e.g. (cons 1 1), outside the > `dolist'. E.g.: > > (let ((ht (make-hash-table :test test :weakness weakness > :size size :rehash-size rehash-size > :rehash-threshold rehash-threshold)) > (uniq-cons (cons 1 1)) > key val) > (dolist (key.val alist) > (setq key (car key.val) > val (cdr key.val)) > (when (or use-last-p (not (eq (gethash key ht uniq-cons)))) > (puthash key val ht))) > ht)) > > (With your approach of using an uninterned symbol, wouldn't > you want to do the same thing: bind a var to it outside the > `dolist', or would that make no real difference?) > It's no real difference. I just proposed the shortest way that works. > > > I'd personally make use-last-p another keyword argument, though. > > I don't have a strong objection, but why? > Especially for booleans it's much clearer if the parameter name is repeated on the call site. > > > > (defun hash-table-to-alist (hash-table) > > > "Create and return an alist created from HASH-TABLE. > > > The order of alist entries is the same as the order of hash-table > > > entries (which normally is the order in which the entries were added > > > to the table)." > > > (let ((al ())) > > > (maphash (lambda (key val) (push (cons key val) al)) hash-table) > > > (nreverse al))) > > > > Hmm, is the order guaranteed? I haven't found anything in the > > Emacs Lisp manual about this, so maybe just leave out the > > parenthetical remark or say that the order is unspecified? > > Good question! But it's not just the parenthetical part. > If we couldn't say anything about the traversal order of > `maphash' then the whole sentence would need to go. > > FWIW, I think it's pretty important. Order is quite > important for alists, generally. > > Is there some reason we should not be able to count on > this `maphash' behavior? > Order in hashtables is not guaranteed. If anything relies on the order, it's buggy. > > That's the behavior I saw, AFAICT, but I didn't test much. > > I don't know whether it is guaranteed, but this use case > involving conversion to alist looks like a good argument for > some guarantee wrt order. > No. Ordering guarantees require additional complexity and overhead, and are almost never needed. > > I see that in Common Lisp nothing is said about the traversal > by `maphash' over the hash table. This is all that is said: > > "For each entry in hash-table, maphash calls function on two > arguments: the key of the entry and the value of the entry; > maphash then returns nil. If entries are added to or deleted > from the hash table while a maphash is in progress, the results > are unpredictable, with one exception: if the function calls > remhash to remove the entry currently being processed by the > function, or performs a setf of gethash on that entry to change > the associated value, then those operations will have the > intended effect." > > - https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node155.html > > (Emacs doc should also probably say at least something about > what happens when entries are added/removed during a `maphash' > application.) > Yes. > > If Emacs decides not to guarantee the order seen now, which I > think is the order I described in the doc string above, then we > would need to remove that sentence. That would be too bad for > users of function `hash-table-to-alist', but at least they > would, at least so far (and AFAICT), get that behavior, which > is likely to be expected by them even if it is not documented. > > Another possibility (?): Allow _optional_ creation of a hash > table that does preserve such ordering (even if just by > recording order of entry and making that available to `maphash'). > E.g., add a keyword arg for `make-hash-table' that does just that. > > Even if it turned out that this consistently or occasionally > detracted from performance, it could be a useful option. > Conversion to an alist would be a case in point. > I don't think it would be worth the additional complexity. Hash tables have been available for ages in most programming languages, and almost never does anybody ask for ordering guarantees. (For example, I have never seen somebody using LinkedHashMap in Java.) Even for alists, most of the time maintaining insertion order is an irrelevant detail, most users care only about get/put/remove. --94eb2c1cf776709b8605669b0fbe Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


Drew A= dams <drew.adams@oracle.com= > schrieb am So., 31. Dez. 2017 um 01:01=C2=A0Uhr:
> > (when (or use-last-p=C2=A0 (not (gethash key h= t)))
>
> This doesn't work if the value is nil.

I guess you mean that it doesn't DTRT if the cdr of an alist
entry is `nil' - e.g. (foo) aka (foo . nil).

<= /div>
Yes.
=C2=A0

> You need to use an uninterned symbol or some other unique object, e.g.=
> (eq (gethash key ht #1=3D'#:void) #1#)

OK.=C2=A0 Dunno which is clearer or whether there is some
performance difference, but I would probably just bind
a var to a unique cons, e.g. (cons 1 1), outside the
`dolist'.=C2=A0 E.g.:

(let ((ht (make-hash-table :test test :weakness weakness
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0:size size :rehash-size rehash-size
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0:rehash-threshold rehash-threshold))
=C2=A0 =C2=A0 =C2=A0 (uniq-cons (cons 1 1))
=C2=A0 =C2=A0 =C2=A0 key val)
=C2=A0 (dolist (key.val=C2=A0 alist)
=C2=A0 =C2=A0 (setq key=C2=A0 (car key.val)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 val=C2=A0 (cdr key.val))
=C2=A0 =C2=A0 (when (or use-last-p=C2=A0 (not (eq (gethash key ht uniq-cons= ))))
=C2=A0 =C2=A0 =C2=A0 (puthash key val ht)))
=C2=A0 ht))

(With your approach of using an uninterned symbol, wouldn't
you want to do the same thing: bind a var to it outside the
`dolist', or would that make no real difference?)
=
It's no real difference. I just proposed the shortest wa= y that works.
=C2=A0
=C2=A0
> I'd personally make use-last-p another keyword argument, though.
I don't have a strong objection, but why?

Especially for booleans it's much clearer if the parameter name = is repeated on the call site.
=C2=A0

> > (defun hash-table-to-alist (hash-table)
> >=C2=A0 "Create and return an alist created from HASH-TABLE. > > The order of alist entries is the same as the order of hash-table=
> > entries (which normally is the order in which the entries were ad= ded
> > to the table)."
> >=C2=A0 (let ((al=C2=A0 ()))
> >=C2=A0 =C2=A0 (maphash (lambda (key val) (push (cons key val) al))= hash-table)
> >=C2=A0 =C2=A0 (nreverse al)))
>
> Hmm, is the order guaranteed? I haven't found anything in the
> Emacs Lisp manual about this, so maybe just leave out the
> parenthetical remark or say that the order is unspecified?

Good question!=C2=A0 But it's not just the parenthetical part.
If we couldn't say anything about the traversal order of
`maphash' then the whole sentence would need to go.

FWIW, I think it's pretty important.=C2=A0 Order is quite
important for alists, generally.

Is there some reason we should not be able to count on
this `maphash' behavior?

Order in h= ashtables is not guaranteed. If anything relies on the order, it's bugg= y.
=C2=A0

That's the behavior I saw, AFAICT, but I didn't test much.

I don't know whether it is guaranteed, but this use case
involving conversion to alist looks like a good argument for
some guarantee wrt order.

No. Ordering = guarantees require additional complexity and overhead, and are almost never= needed.
=C2=A0

I see that in Common Lisp nothing is said about the traversal
by `maphash' over the hash table.=C2=A0 This is all that is said:

=C2=A0"For each entry in hash-table, maphash calls function on two
=C2=A0 arguments: the key of the entry and the value of the entry;
=C2=A0 maphash then returns nil. If entries are added to or deleted
=C2=A0 from the hash table while a maphash is in progress, the results
=C2=A0 are unpredictable, with one exception: if the function calls
=C2=A0 remhash to remove the entry currently being processed by the
=C2=A0 function, or performs a setf of gethash on that entry to change
=C2=A0 the associated value, then those operations will have the
=C2=A0 intended effect."

=C2=A0 - https://www.cs.cmu.edu/Groups/AI= /html/cltl/clm/node155.html

(Emacs doc should also probably say at least something about
what happens when entries are added/removed during a `maphash'
application.)

Yes.
=C2=A0

If Emacs decides not to guarantee the order seen now, which I
think is the order I described in the doc string above, then we
would need to remove that sentence.=C2=A0 That would be too bad for
users of function `hash-table-to-alist', but at least they
would, at least so far (and AFAICT), get that behavior, which
is likely to be expected by them even if it is not documented.

Another possibility (?): Allow _optional_ creation of a hash
table that does preserve such ordering (even if just by
recording order of entry and making that available to `maphash').
E.g., add a keyword arg for `make-hash-table' that does just that.

Even if it turned out that this consistently or occasionally
detracted from performance, it could be a useful option.
Conversion to an alist would be a case in point.

<= /div>
I don't think it would be worth the additional complexity. Ha= sh tables have been available for ages in most programming languages, and a= lmost never does anybody ask for ordering guarantees. (For example, I have = never seen somebody using LinkedHashMap in Java.)
Even for alists= , most of the time maintaining insertion order is an irrelevant detail, mos= t users care only about get/put/remove.=C2=A0
--94eb2c1cf776709b8605669b0fbe--