From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: David Kastrup Newsgroups: gmane.lisp.guile.bugs Subject: bug#17296: Uh, wrong? Date: Tue, 03 Jun 2014 19:30:23 +0200 Message-ID: <87wqcx99zk.fsf@fencepost.gnu.org> References: <878ur25thj.fsf@yeeloong.lan> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Trace: ger.gmane.org 1401816684 27583 80.91.229.3 (3 Jun 2014 17:31:24 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Tue, 3 Jun 2014 17:31:24 +0000 (UTC) To: 17296@debbugs.gnu.org Original-X-From: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org Tue Jun 03 19:31:18 2014 Return-path: Envelope-to: guile-bugs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1WrsYK-0000e6-SR for guile-bugs@m.gmane.org; Tue, 03 Jun 2014 19:31:13 +0200 Original-Received: from localhost ([::1]:54978 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WrsYK-0008M9-HV for guile-bugs@m.gmane.org; Tue, 03 Jun 2014 13:31:12 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:57968) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WrsYC-0008Hd-ML for bug-guile@gnu.org; Tue, 03 Jun 2014 13:31:06 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WrsYA-0002iu-TH for bug-guile@gnu.org; Tue, 03 Jun 2014 13:31:04 -0400 Original-Received: from debbugs.gnu.org ([140.186.70.43]:43595) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WrsYA-0002ig-Q4 for bug-guile@gnu.org; Tue, 03 Jun 2014 13:31:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.80) (envelope-from ) id 1WrsYA-0004FD-6y for bug-guile@gnu.org; Tue, 03 Jun 2014 13:31:02 -0400 X-Loop: help-debbugs@gnu.org In-Reply-To: <878ur25thj.fsf@yeeloong.lan> Resent-From: David Kastrup Original-Sender: "Debbugs-submit" Resent-CC: bug-guile@gnu.org Resent-Date: Tue, 03 Jun 2014 17:31:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17296 X-GNU-PR-Package: guile X-GNU-PR-Keywords: patch Original-Received: via spool by 17296-submit@debbugs.gnu.org id=B17296.140181664916290 (code B ref 17296); Tue, 03 Jun 2014 17:31:02 +0000 Original-Received: (at 17296) by debbugs.gnu.org; 3 Jun 2014 17:30:49 +0000 Original-Received: from localhost ([127.0.0.1]:42472 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WrsXt-0004Ea-Am for submit@debbugs.gnu.org; Tue, 03 Jun 2014 13:30:49 -0400 Original-Received: from fencepost.gnu.org ([208.118.235.10]:52969 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WrsXn-0004EM-3n for 17296@debbugs.gnu.org; Tue, 03 Jun 2014 13:30:43 -0400 Original-Received: from localhost ([127.0.0.1]:60275 helo=lola) by fencepost.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WrsXm-0002gq-Jx for 17296@debbugs.gnu.org; Tue, 03 Jun 2014 13:30:38 -0400 Original-Received: by lola (Postfix, from userid 1000) id 41F79E0885; Tue, 3 Jun 2014 19:30:23 +0200 (CEST) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 140.186.70.43 X-BeenThere: bug-guile@gnu.org List-Id: "Bug reports for GUILE, GNU's Ubiquitous Extension Language" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org Original-Sender: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.bugs:7479 Archived-At: --=-=-= Content-Type: text/plain I see that a patch has been committed. It is conflicting with another patch of mine and it does not follow from the srfi-1 specification. To wit: Most procedures are defined only on proper lists -- that is, finite, nil-terminated lists. The procedures that will also handle circular or dotted lists are specifically marked. While this design decision restricts the domain of possible arguments one can pass to these procedures, it has the benefit of allowing the procedures to catch the error cases where programmers inadvertently pass scalar values to a list procedure by accident, e.g., by switching the arguments to a procedure call. Note "allowing to catch". Then we have Errors Note that statements of the form "it is an error" merely mean "don't do that." They are not a guarantee that a conforming implementation will "catch" such improper use by, for example, raising some kind of exception. Regrettably, R5RS Scheme requires no firmer guarantee even for basic operators such as car and cdr, so there's little point in requiring these procedures to do more. Here is the relevant section of the R5RS: When speaking of an error situation, this report uses the phrase "an error is signalled" to indicate that implementations must detect and report the error. If such wording does not appear in the discussion of an error, then implementations are not required to detect or report the error, though they are encouraged to do so. An error situation that implementations are not required to detect is usually referred to simply as "an error." For example, it is an error for a procedure to be passed an argument that the procedure is not explicitly specified to handle, even though such domain errors are seldom mentioned in this report. Implementations may extend a procedure's domain of definition to include such arguments. So let's see how we have defined stuff here: list A proper (finite, nil-terminated) list clist A proper or circular list flist A finite (proper or dotted) list For length+, we have length list -> integer length+ clist -> integer or #f Both length and length+ return the length of the argument. It is an error to pass a value to length which is not a proper list (finite and nil-terminated). In particular, this means an implementation may diverge or signal an error when length is applied to a circular list. length+, on the other hand, returns #F when applied to a circular list. The length of a proper list is a non-negative integer n such that cdr applied n times to the list produces the empty list. So the behavior for length+ on a dotted list is strictly unspecified. It is not even stated "it is an error". Functions like fold state: fold kons knil clist1 clist2 ... -> value The fold operation terminates when the shortest list runs out of values: (fold cons* '() '(a b c) '(1 2 3 4 5)) => (c 3 b 2 a 1) At least one of the list arguments must be finite. Note the wording: "at least one of the list arguments must be finite", not "at least one of the list arguments must be proper". The definition of the recursion for the single-list case again leaves the case of a dotted list unspecified. If we take a look at the reference implementation at , we get (define (length+ x) ; Returns #f if X is circular. (let lp ((x x) (lag x) (len 0)) (if (pair? x) (let ((x (cdr x)) (len (+ len 1))) (if (pair? x) (let ((x (cdr x)) (lag (cdr lag)) (len (+ len 1))) (and (not (eq? x lag)) (lp x lag len))) len)) len))) which _clearly_ treats dotted lists like regular lists with regard to giving the length of the spine. Incidentally, the reference implementation also contains ;;; R4RS, so commented out. ;(define (length x) ; LENGTH may diverge or ; (let lp ((x x) (len 0)) ; raise an error if X is ; (if (pair? x) ; a circular list. This version ; (lp (cdr x) (+ len 1)) ; diverges. ; len))) which again would work on dotted lists just fine. But I am not all that interested in meddling with that. At any rate, what I am getting at is that I was going to submit the following patch as a part of a series fixing other bugs, bugs that I need a working "get the length of a dotted list" operator for. We don't have any such operator in GUILE, and that's awkward. --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=bap commit 9539272d26f2954a253ed1365a6704ed197a79be Author: David Kastrup Date: Mon Jun 2 15:05:55 2014 +0200 Let length+ return the length of dotted lists rather than #f * libguile/srfi-1.c (scm_srfi1_length_plus): Previously, length+ returned #f for dotted lists. This leaves the user with no efficient means for determining the length of dotted lists. While the Scheme standard does not prescribe a behavior here, the reference implementation at indeed returns the spine length (number of successive pairs in the cdr-chain) of dotted lists rather than #f, providing a good endorsement of this behavior. As one consequence, the multi-list implementations for map, fold, and for-each will happen to accept dotted lists as the shortest list. Previously, this caused an error late during processing. diff --git a/libguile/srfi-1.c b/libguile/srfi-1.c index aaa3efe..0db6388 100644 --- a/libguile/srfi-1.c +++ b/libguile/srfi-1.c @@ -614,8 +614,32 @@ SCM_DEFINE (scm_srfi1_length_plus, "length+", 1, 0, 0, "circular.") #define FUNC_NAME s_scm_srfi1_length_plus { - long len = scm_ilength (lst); - return (len >= 0 ? SCM_I_MAKINUM (len) : SCM_BOOL_F); + /* This uses the "tortoise and hare" algorithm to detect "infinitely + long" lists (i.e. lists with cycles in their cdrs), and returns #f + if it does find one. + + Dotted lists are treated just like regular lists, returning the + length of the spine. This is in conformance with the reference + implementation though not explicitly defined in the standard. */ + long i = 0; + SCM tortoise = lst; + SCM hare = lst; + + do { + if (!scm_is_pair (hare)) return scm_from_long (i); + hare = SCM_CDR(hare); + i++; + if (!scm_is_pair (hare)) return scm_from_long (i); + hare = SCM_CDR(hare); + i++; + /* For every two steps the hare takes, the tortoise takes one. */ + tortoise = SCM_CDR(tortoise); + } + while (!scm_is_eq (hare, tortoise)); + + /* If the tortoise ever catches the hare, then the list must contain + a cycle. */ + return SCM_BOOL_F; } #undef FUNC_NAME diff --git a/module/srfi/srfi-1.scm b/module/srfi/srfi-1.scm index 0806e73..bc72048 100644 --- a/module/srfi/srfi-1.scm +++ b/module/srfi/srfi-1.scm @@ -474,7 +474,7 @@ that result. See the manual for details." (or len1 len2)))) (unless len (scm-error 'wrong-type-arg "fold" - "Args do not contain a proper (finite) list: ~S" + "Args do not contain a finite list: ~S" (list (list list1 list2)) #f)) (let fold2 ((knil knil) (list1 list1) (list2 list2) (len len)) (if (zero? len) @@ -601,7 +601,7 @@ has just one element then that's the return value." (or len1 len2)))) (unless len (scm-error 'wrong-type-arg "map" - "Args do not contain a proper (finite) list: ~S" + "Args do not contain a finite list: ~S" (list (list l1 l2)) #f)) (let map2 ((l1 l1) (l2 l2) (len len)) (if (zero? len) @@ -620,7 +620,7 @@ has just one element then that's the return value." rest))) (if (not len) (scm-error 'wrong-type-arg "map" - "Args do not contain a proper (finite) list: ~S" + "Args do not contain a finite list: ~S" (list (cons l1 rest)) #f)) (let mapn ((l1 l1) (rest rest) (len len)) (if (zero? len) @@ -649,7 +649,7 @@ has just one element then that's the return value." (or len1 len2)))) (unless len (scm-error 'wrong-type-arg "for-each" - "Args do not contain a proper (finite) list: ~S" + "Args do not contain a finite list: ~S" (list (list l1 l2)) #f)) (let for-each2 ((l1 l1) (l2 l2) (len len)) (unless (zero? len) @@ -667,7 +667,7 @@ has just one element then that's the return value." rest))) (if (not len) (scm-error 'wrong-type-arg "for-each" - "Args do not contain a proper (finite) list: ~S" + "Args do not contain a finite list: ~S" (list (cons l1 rest)) #f)) (let for-eachn ((l1 l1) (rest rest) (len len)) (if (> len 0) diff --git a/test-suite/tests/srfi-1.test b/test-suite/tests/srfi-1.test index d40f8e1..9364ea2 100644 --- a/test-suite/tests/srfi-1.test +++ b/test-suite/tests/srfi-1.test @@ -1187,19 +1187,21 @@ (pass-if-exception "proc arg count 4" exception:wrong-num-args (fold (lambda (x y z prev) x) 1 '(1 2 3) '(1 2 3))) - (pass-if-exception "improper first 1" exception:wrong-type-arg - (fold + 1 1 '(1 2 3))) - (pass-if-exception "improper first 2" exception:wrong-type-arg - (fold + 1 '(1 . 2) '(1 2 3))) - (pass-if-exception "improper first 3" exception:wrong-type-arg - (fold + 1 '(1 2 . 3) '(1 2 3))) - - (pass-if-exception "improper second 1" exception:wrong-type-arg - (fold + 1 '(1 2 3) 1)) - (pass-if-exception "improper second 2" exception:wrong-type-arg - (fold + 1 '(1 2 3) '(1 . 2))) - (pass-if-exception "improper second 3" exception:wrong-type-arg - (fold + 1 '(1 2 3) '(1 2 . 3))) + ;; For multiple list arguments, dotted lists are permitted by this + ;; implementation and a non-list is a zero-length dotted list + (pass-if "improper first 1" + (= 1 (fold + 1 1 '(1 2 3)))) + (pass-if "improper first 2" + (= 3 (fold + 1 '(1 . 2) '(1 2 3)))) + (pass-if "improper first 3" + (= 7 (fold + 1 '(1 2 . 3) '(1 2 3)))) + + (pass-if "improper second 1" + (= 1 (fold + 1 '(1 2 3) 1))) + (pass-if "improper second 2" + (= 3 (fold + 1 '(1 2 3) '(1 . 2)))) + (pass-if "improper second 3" + (= 7 (fold + 1 '(1 2 3) '(1 2 . 3)))) (pass-if (= 6 (fold + 1 '(2) '(3)))) (pass-if (= 15 (fold + 1 '(2 3) '(4 5)))) --=-=-= Content-Type: text/plain I am not saying that your implementation of count+ is wrong. It meets the specs of srfi-1. But so did the previous implementation. And so does mine. And since there is otherwise no actual length operator of any kind working for dotted lists, I want one. The previous implementation was actually trivially the same as (define (length* lst) (catch 'wrong-type-arg (lambda () (length lst)) (lambda _ #f))) and consequently was trivial to write using existing length functions in the C API. And I need my variant of length+ to fix things like take-right which currently bomb out on large lists with a VM error and are _required_ to deal with _both_ proper and dotted lists. Which is awkward to do without a length operator working on both proper and dotted lists. -- David Kastrup --=-=-=--