unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
@ 2018-02-27  9:22 Michael Heerdegen
  2018-02-27 11:21 ` Eli Zaretskii
  0 siblings, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2018-02-27  9:22 UTC (permalink / raw)
  To: 30626


Hello,

Traversing a `stream-of-directory-files' over a huge directory hierarchy
crashes my Emacs.

Here is a recipe: I have a file
"/home/micha/Treasure/today/stream-crash.el" with these contents:

#+begin_src emacs-lisp
;; -*- lexical-binding: t -*-

(require 'stream)

(seq-doseq (_file (stream-of-directory-files
                   "/home/micha" t nil t nil
                   (lambda (file) (and (file-readable-p file) (file-regular-p file)))))
  nil)
#+end_src

Then I

micha> emacs -Q -L /home/micha/software/elpa/packages/stream\
                -l /home/micha/Treasure/today/stream-crash.el

and I get a segfault after ~ 1 minute.

This kind of crash only seems to occur when the traversed directory is
sufficiently large.


Thanks in advance,

Michael.


In GNU Emacs 26.0.91 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.22.28)
 of 2018-02-26 built on drachen
Repository revision: ea8f9fd7be138708a52aad418d09d5d4ca6b2a7e
Windowing system distributor 'The X.Org Foundation', version 11.0.11906000
System Description:	Debian GNU/Linux testing (buster)






^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-27  9:22 bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' Michael Heerdegen
@ 2018-02-27 11:21 ` Eli Zaretskii
  2018-02-27 11:39   ` Michael Heerdegen
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2018-02-27 11:21 UTC (permalink / raw)
  To: 30626, michael_heerdegen

On February 27, 2018 11:22:26 AM GMT+02:00, Michael Heerdegen <michael_heerdegen@web.de> wrote:
> 
> Hello,
> 
> Traversing a `stream-of-directory-files' over a huge directory
> hierarchy
> crashes my Emacs.
> 
> Here is a recipe: I have a file
> "/home/micha/Treasure/today/stream-crash.el" with these contents:
> 
> #+begin_src emacs-lisp
> ;; -*- lexical-binding: t -*-
> 
> (require 'stream)
> 
> (seq-doseq (_file (stream-of-directory-files
>                    "/home/micha" t nil t nil
>   (lambda (file) (and (file-readable-p file) (file-regular-p file)))))
>   nil)
> #+end_src
> 
> Then I
> 
> micha> emacs -Q -L /home/micha/software/elpa/packages/stream\
>                 -l /home/micha/Treasure/today/stream-crash.el
> 
> and I get a segfault after ~ 1 minute.
> 
> This kind of crash only seems to occur when the traversed directory is
> sufficiently large.


I guess it's a stack overflow: that function recurses into subdirectories.

To avoid such problems, the function should be rewritten to work by BFS, not DFS.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-27 11:21 ` Eli Zaretskii
@ 2018-02-27 11:39   ` Michael Heerdegen
  2018-02-27 12:08     ` Michael Heerdegen
  2018-02-27 18:00     ` Eli Zaretskii
  0 siblings, 2 replies; 57+ messages in thread
From: Michael Heerdegen @ 2018-02-27 11:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 30626

Eli Zaretskii <eliz@gnu.org> writes:

> I guess it's a stack overflow: that function recurses into
> subdirectories.
>
> To avoid such problems, the function should be rewritten to work by
> BFS, not DFS.

Here is the backtrace I'm now able to produce.  Looks like the crash
happens in gc:

| [lots of more lines like these cut]
| #104019 0x00000000005e0526 in mark_object (arg=XIL(0x5a15413)) at alloc.c:6624
| #104020 0x00000000005dfab5 in mark_vectorlike (ptr=0x5a584e0) at alloc.c:6227
| #104021 0x00000000005e0526 in mark_object (arg=XIL(0x5bd7cf3)) at alloc.c:6624
| #104022 0x00000000005dfab5 in mark_vectorlike (ptr=0x5873a80) at alloc.c:6227
| #104023 0x00000000005e0526 in mark_object (arg=XIL(0x5bd7cb3)) at alloc.c:6624
| #104024 0x00000000006070ea in mark_specpdl (first=0x58a1348, ptr=0x58a1b90) at eval.c:3868
| #104025 0x00000000006856b2 in mark_one_thread (thread=0xcd5340 <main_thread>) at thread.c:614
| #104026 0x00000000006857c7 in mark_threads_callback (ignore=0x0) at thread.c:649
| #104027 0x00000000005dda03 in flush_stack_call_func (func=0x685781 <mark_threads_callback>, arg=0x0) at alloc.c:5220
| #104028 0x00000000006857f5 in mark_threads () at thread.c:656
| #104029 0x00000000005df2d2 in garbage_collect_1 (end=0x7fffffff63f0) at alloc.c:5997
| #104030 0x00000000005df929 in Fgarbage_collect () at alloc.c:6168
| #104031 0x000000000055746c in maybe_gc () at lisp.h:4749
| #104032 0x00000000006047cb in Ffuncall (nargs=4, args=0x7fffffff64d0) at eval.c:2751
| #104033 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f83c4), vector=XIL(0x3255bb5), maxdepth=make_number(13), args_template=make_number(128), nargs=2, args=0x7fffffff6b38) at bytecode.c:629
| #104034 0x000000000060529c in funcall_lambda (fun=XIL(0x3255c15), nargs=2, arg_vector=0x7fffffff6b38) at eval.c:2970
| #104035 0x00000000006048d4 in Ffuncall (nargs=3, args=0x7fffffff6b30) at eval.c:2771
| #104036 0x000000000060395b in Fapply (nargs=3, args=0x7fffffff6b30) at eval.c:2346
| #104037 0x0000000000604b87 in funcall_subr (subr=0xc3d7c0 <Sapply>, numargs=3, args=0x7fffffff6b30) at eval.c:2824
| #104038 0x0000000000604890 in Ffuncall (nargs=4, args=0x7fffffff6b28) at eval.c:2769
| #104039 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f83e4), vector=XIL(0x15372775), maxdepth=make_number(8), args_template=make_number(256), nargs=0, args=0x7fffffff6ff8) at bytecode.c:629
| #104040 0x000000000060529c in funcall_lambda (fun=XIL(0x153727c5), nargs=0, arg_vector=0x7fffffff6ff8) at eval.c:2970
| #104041 0x00000000006048d4 in Ffuncall (nargs=1, args=0x7fffffff6ff0) at eval.c:2771
| #104042 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42b7ad4), vector=XIL(0x930825), maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffff7460) at bytecode.c:629
| #104043 0x000000000060529c in funcall_lambda (fun=XIL(0x3b73d75), nargs=1, arg_vector=0x7fffffff7458) at eval.c:2970
| #104044 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff7450) at eval.c:2771
| #104045 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f82f4), vector=XIL(0x3255aa5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffff78e0) at bytecode.c:629
| #104046 0x000000000060529c in funcall_lambda (fun=XIL(0x3255ab5), nargs=1, arg_vector=0x7fffffff78d8) at eval.c:2970
| #104047 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff78d0) at eval.c:2771
| #104048 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f83e4), vector=XIL(0x153727f5), maxdepth=make_number(8), args_template=make_number(256), nargs=0, args=0x7fffffff7da8) at bytecode.c:629
| #104049 0x000000000060529c in funcall_lambda (fun=XIL(0x15372845), nargs=0, arg_vector=0x7fffffff7da8) at eval.c:2970
| #104050 0x00000000006048d4 in Ffuncall (nargs=1, args=0x7fffffff7da0) at eval.c:2771
| #104051 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42b7ad4), vector=XIL(0x930825), maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffff8210) at bytecode.c:629
| #104052 0x000000000060529c in funcall_lambda (fun=XIL(0x3b73d75), nargs=1, arg_vector=0x7fffffff8208) at eval.c:2970
| #104053 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff8200) at eval.c:2771
| #104054 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f82f4), vector=XIL(0x3255aa5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffff8690) at bytecode.c:629
| #104055 0x000000000060529c in funcall_lambda (fun=XIL(0x3255ab5), nargs=1, arg_vector=0x7fffffff8688) at eval.c:2970
| #104056 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff8680) at eval.c:2771
| #104057 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f83e4), vector=XIL(0x15372875), maxdepth=make_number(8), args_template=make_number(256), nargs=0, args=0x7fffffff8b58) at bytecode.c:629
| #104058 0x000000000060529c in funcall_lambda (fun=XIL(0x153728c5), nargs=0, arg_vector=0x7fffffff8b58) at eval.c:2970
| #104059 0x00000000006048d4 in Ffuncall (nargs=1, args=0x7fffffff8b50) at eval.c:2771
| #104060 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42b7ad4), vector=XIL(0x930825), maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffff8fc0) at bytecode.c:629
| #104061 0x000000000060529c in funcall_lambda (fun=XIL(0x3b73d75), nargs=1, arg_vector=0x7fffffff8fb8) at eval.c:2970
| #104062 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff8fb0) at eval.c:2771
| #104063 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f82f4), vector=XIL(0x3255aa5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffff9440) at bytecode.c:629
| #104064 0x000000000060529c in funcall_lambda (fun=XIL(0x3255ab5), nargs=1, arg_vector=0x7fffffff9438) at eval.c:2970
| #104065 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff9430) at eval.c:2771
| #104066 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f83e4), vector=XIL(0x153728f5), maxdepth=make_number(8), args_template=make_number(256), nargs=0, args=0x7fffffff9908) at bytecode.c:629
| #104067 0x000000000060529c in funcall_lambda (fun=XIL(0x15372945), nargs=0, arg_vector=0x7fffffff9908) at eval.c:2970
| #104068 0x00000000006048d4 in Ffuncall (nargs=1, args=0x7fffffff9900) at eval.c:2771
| #104069 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42b7ad4), vector=XIL(0x930825), maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffff9d70) at bytecode.c:629
| #104070 0x000000000060529c in funcall_lambda (fun=XIL(0x3b73d75), nargs=1, arg_vector=0x7fffffff9d68) at eval.c:2970
| #104071 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff9d60) at eval.c:2771
| #104072 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f82f4), vector=XIL(0x3255aa5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffffa1e8) at bytecode.c:629
| #104073 0x000000000060529c in funcall_lambda (fun=XIL(0x3255ab5), nargs=1, arg_vector=0x7fffffffa1e0) at eval.c:2970
| #104074 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffffa1d8) at eval.c:2771
| #104075 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f8c64), vector=XIL(0x153729a5), maxdepth=make_number(7), args_template=make_number(256), nargs=0, args=0x7fffffffa6a8) at bytecode.c:629
| #104076 0x000000000060529c in funcall_lambda (fun=XIL(0x153729f5), nargs=0, arg_vector=0x7fffffffa6a8) at eval.c:2970
| #104077 0x00000000006048d4 in Ffuncall (nargs=1, args=0x7fffffffa6a0) at eval.c:2771
| #104078 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42b7ad4), vector=XIL(0x930825), maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffffab10) at bytecode.c:629
| #104079 0x000000000060529c in funcall_lambda (fun=XIL(0x3b73d75), nargs=1, arg_vector=0x7fffffffab08) at eval.c:2970
| #104080 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffffab00) at eval.c:2771
| #104081 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f82f4), vector=XIL(0x3255aa5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffffaf88) at bytecode.c:629
| #104082 0x000000000060529c in funcall_lambda (fun=XIL(0x3255ab5), nargs=1, arg_vector=0x7fffffffaf80) at eval.c:2970
| #104083 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffffaf78) at eval.c:2771
| #104084 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f8c04), vector=XIL(0x331ede5), maxdepth=make_number(5), args_template=make_number(514), nargs=2, args=0x7fffffffb5d0) at bytecode.c:629
| #104085 0x000000000060529c in funcall_lambda (fun=XIL(0x331ee05), nargs=2, arg_vector=0x7fffffffb5c0) at eval.c:2970
| #104086 0x00000000006048d4 in Ffuncall (nargs=3, args=0x7fffffffb5b8) at eval.c:2771
| #104087 0x000000000060390d in Fapply (nargs=4, args=0x7fffffffb5b8) at eval.c:2342
| #104088 0x0000000000604b87 in funcall_subr (subr=0xc3d7c0 <Sapply>, numargs=4, args=0x7fffffffb5b8) at eval.c:2824
| #104089 0x0000000000604890 in Ffuncall (nargs=5, args=0x7fffffffb5b0) at eval.c:2769
| #104090 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42fb574), vector=XIL(0x425f285), maxdepth=make_number(15), args_template=make_number(642), nargs=2, args=0x7fffffffba40) at bytecode.c:629
| #104091 0x000000000060529c in funcall_lambda (fun=XIL(0x425f305), nargs=2, arg_vector=0x7fffffffba30) at eval.c:2970
| #104092 0x000000000060500d in apply_lambda (fun=XIL(0x425f305), args=XIL(0x5bd89c3), count=30) at eval.c:2906
| #104093 0x0000000000603602 in eval_sub (form=XIL(0x5bd89d3)) at eval.c:2279
| #104094 0x0000000000632c69 in readevalloop_eager_expand_eval (val=XIL(0x5bd8a73), macroexpand=XIL(0xda5d0)) at lread.c:1850
| #104095 0x00000000006335f0 in readevalloop (readcharfun=XIL(0x5b75ce5), infile0=0x0, sourcename=XIL(0x55c2e14), printflag=false, unibyte=XIL(0), readfun=XIL(0), start=XIL(0), end=XIL(0)) at lread.c:2036
| #104096 0x0000000000633a0a in Feval_buffer (buffer=XIL(0), printflag=XIL(0), filename=XIL(0x56ea3c4), unibyte=XIL(0), do_allow_print=XIL(0)) at lread.c:2103
| #104097 0x0000000000604d25 in funcall_subr (subr=0xc40240 <Seval_buffer>, numargs=0, args=0x7fffffffbfa0) at eval.c:2856
| #104098 0x0000000000604890 in Ffuncall (nargs=1, args=0x7fffffffbf98) at eval.c:2769
| #104099 0x00000000005fc8f7 in Ffuncall_interactively (nargs=1, args=0x7fffffffbf98) at callint.c:252
| #104100 0x0000000000604b87 in funcall_subr (subr=0xc3cfc0 <Sfuncall_interactively>, numargs=1, args=0x7fffffffbf98) at eval.c:2824
| #104101 0x0000000000604890 in Ffuncall (nargs=2, args=0x7fffffffbf90) at eval.c:2769
| #104102 0x00000000005fec50 in Fcall_interactively (function=XIL(0xad320), record_flag=XIL(0xaad0), keys=XIL(0x5a55745)) at callint.c:854
| #104103 0x0000000000604cac in funcall_subr (subr=0xc3d000 <Scall_interactively>, numargs=3, args=0x7fffffffc300) at eval.c:2849
| #104104 0x0000000000604890 in Ffuncall (nargs=4, args=0x7fffffffc2f8) at eval.c:2769
| #104105 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x9f4bec), vector=XIL(0x9f4c0d), maxdepth=make_number(13), args_template=make_number(1025), nargs=2, args=0x7fffffffc868) at bytecode.c:629
| #104106 0x000000000060529c in funcall_lambda (fun=XIL(0x9f4bbd), nargs=2, arg_vector=0x7fffffffc858) at eval.c:2970
| #104107 0x00000000006048d4 in Ffuncall (nargs=3, args=0x7fffffffc850) at eval.c:2771
| #104108 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x9f489c), vector=XIL(0x9f48bd), maxdepth=make_number(15), args_template=make_number(769), nargs=3, args=0x7fffffffce00) at bytecode.c:629
| #104109 0x000000000060529c in funcall_lambda (fun=XIL(0x9f485d), nargs=3, arg_vector=0x7fffffffcde8) at eval.c:2970
| #104110 0x00000000006048d4 in Ffuncall (nargs=4, args=0x7fffffffcde0) at eval.c:2771
| #104111 0x0000000000603cdb in Fapply (nargs=2, args=0x7fffffffcfc0) at eval.c:2389
| #104112 0x0000000000604b87 in funcall_subr (subr=0xc3d7c0 <Sapply>, numargs=2, args=0x7fffffffcfc0) at eval.c:2824
| #104113 0x0000000000604890 in Ffuncall (nargs=3, args=0x7fffffffcfb8) at eval.c:2769
| #104114 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x505da94), vector=XIL(0x5044095), maxdepth=make_number(3), args_template=XIL(0), nargs=0, args=0x0) at bytecode.c:629
| #104115 0x0000000000605615 in funcall_lambda (fun=XIL(0x50c3185), nargs=5, arg_vector=0x5044095) at eval.c:3052
| #104116 0x00000000006048d4 in Ffuncall (nargs=6, args=0x7fffffffd450) at eval.c:2771
| #104117 0x0000000000603cdb in Fapply (nargs=3, args=0x7fffffffd648) at eval.c:2389
| #104118 0x0000000000604b87 in funcall_subr (subr=0xc3d7c0 <Sapply>, numargs=3, args=0x7fffffffd648) at eval.c:2824
| #104119 0x0000000000604890 in Ffuncall (nargs=4, args=0x7fffffffd640) at eval.c:2769
| #104120 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x14402c4), vector=XIL(0x50441c5), maxdepth=make_number(5), args_template=make_number(128), nargs=4, args=0x7fffffffdbf0) at bytecode.c:629
| #104121 0x000000000060529c in funcall_lambda (fun=XIL(0x50441f5), nargs=4, arg_vector=0x7fffffffdbf0) at eval.c:2970
| #104122 0x00000000006048d4 in Ffuncall (nargs=5, args=0x7fffffffdbe8) at eval.c:2771
| #104123 0x00000000005fc8f7 in Ffuncall_interactively (nargs=5, args=0x7fffffffdbe8) at callint.c:252
| #104124 0x0000000000604b87 in funcall_subr (subr=0xc3cfc0 <Sfuncall_interactively>, numargs=5, args=0x7fffffffdbe8) at eval.c:2824
| #104125 0x0000000000604890 in Ffuncall (nargs=6, args=0x7fffffffdbe0) at eval.c:2769
| #104126 0x0000000000603cdb in Fapply (nargs=3, args=0x7fffffffdd90) at eval.c:2389
| #104127 0x00000000005fcd7d in Fcall_interactively (function=XIL(0xda3c0), record_flag=XIL(0), keys=XIL(0x5a3e0a5)) at callint.c:389
| #104128 0x0000000000604cac in funcall_subr (subr=0xc3d000 <Scall_interactively>, numargs=3, args=0x7fffffffdfe0) at eval.c:2849
| #104129 0x0000000000604890 in Ffuncall (nargs=4, args=0x7fffffffdfd8) at eval.c:2769
| #104130 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x9f4bec), vector=XIL(0x9f4c0d), maxdepth=make_number(13), args_template=make_number(1025), nargs=1, args=0x7fffffffe520) at bytecode.c:629
| #104131 0x000000000060529c in funcall_lambda (fun=XIL(0x9f4bbd), nargs=1, arg_vector=0x7fffffffe518) at eval.c:2970
| #104132 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffffe510) at eval.c:2771
| #104133 0x00000000006042a0 in call1 (fn=XIL(0x3f00), arg1=XIL(0xda3c0)) at eval.c:2620
| #104134 0x000000000055ec02 in command_loop_1 () at keyboard.c:1482
| #104135 0x00000000006013b2 in internal_condition_case (bfun=0x55e45e <command_loop_1>, handlers=XIL(0x5250), hfun=0x55dc1d <cmd_error>) at eval.c:1332
| #104136 0x000000000055e151 in command_loop_2 (ignore=XIL(0)) at keyboard.c:1110
| #104137 0x0000000000600c94 in internal_catch (tag=XIL(0xc6f0), func=0x55e124 <command_loop_2>, arg=XIL(0)) at eval.c:1097
| #104138 0x000000000055e0ef in command_loop () at keyboard.c:1089
| #104139 0x000000000055d7ef in recursive_edit_1 () at keyboard.c:695
| #104140 0x000000000055d970 in Frecursive_edit () at keyboard.c:766
| #104141 0x000000000055b5f0 in main (argc=1, argv=0x7fffffffe9f8) at emacs.c:1713
| Cannot access memory at address 0x7fffff66ff4f



Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-27 11:39   ` Michael Heerdegen
@ 2018-02-27 12:08     ` Michael Heerdegen
  2018-02-27 18:08       ` Eli Zaretskii
  2018-02-27 18:00     ` Eli Zaretskii
  1 sibling, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2018-02-27 12:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 30626

Michael Heerdegen <michael_heerdegen@web.de> writes:

> Here is the backtrace I'm now able to produce.  Looks like the crash
> happens in gc.

Here is a much simpler example that crashes as well:

#+begin_src emacs-lisp
(seq-doseq (_ (stream-range 1 1000000)) nil)
#+end_src

Note that this is executed as a loop due how to streams are implemented,
although the definition of `seq-doseq' looks recursive.  But it seems
that gc has a problem with the large number of conses created when
processing that.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-27 11:39   ` Michael Heerdegen
  2018-02-27 12:08     ` Michael Heerdegen
@ 2018-02-27 18:00     ` Eli Zaretskii
  1 sibling, 0 replies; 57+ messages in thread
From: Eli Zaretskii @ 2018-02-27 18:00 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 30626

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Cc: bug-gnu-emacs@gnu.org, 30626@debbugs.gnu.org
> Date: Tue, 27 Feb 2018 12:39:47 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > I guess it's a stack overflow: that function recurses into
> > subdirectories.
> >
> > To avoid such problems, the function should be rewritten to work by
> > BFS, not DFS.
> 
> Here is the backtrace I'm now able to produce.  Looks like the crash
> happens in gc:

GC is deeply-recursive, and you have exacerbated that by using up a
lot of stack.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-27 12:08     ` Michael Heerdegen
@ 2018-02-27 18:08       ` Eli Zaretskii
  2018-02-28  1:29         ` Noam Postavsky
                           ` (2 more replies)
  0 siblings, 3 replies; 57+ messages in thread
From: Eli Zaretskii @ 2018-02-27 18:08 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 30626

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Cc: bug-gnu-emacs@gnu.org,  30626@debbugs.gnu.org
> Date: Tue, 27 Feb 2018 13:08:59 +0100
> 
> #+begin_src emacs-lisp
> (seq-doseq (_ (stream-range 1 1000000)) nil)
> #+end_src
> 
> Note that this is executed as a loop due how to streams are implemented,
> although the definition of `seq-doseq' looks recursive.  But it seems
> that gc has a problem with the large number of conses created when
> processing that.

What can we do instead in such cases?  Stack-overflow protection
cannot work in GC, so you are shooting yourself in the foot by
creating such large recursive structures.  By the time we get to GC,
where the problem will happen, it's too late, because the memory was
already allocated.

Does anyone has a reasonable idea for avoiding the crash in such
programs?





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-27 18:08       ` Eli Zaretskii
@ 2018-02-28  1:29         ` Noam Postavsky
  2018-02-28 10:58           ` Michael Heerdegen
  2018-02-28 11:05         ` Michael Heerdegen
  2018-03-01 10:44         ` Daniel Colascione
  2 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-02-28  1:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Michael Heerdegen, 30626

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Michael Heerdegen <michael_heerdegen@web.de>
>> Cc: bug-gnu-emacs@gnu.org,  30626@debbugs.gnu.org
>> Date: Tue, 27 Feb 2018 13:08:59 +0100
>> 
>> #+begin_src emacs-lisp
>> (seq-doseq (_ (stream-range 1 1000000)) nil)
>> #+end_src
>> 
>> Note that this is executed as a loop due how to streams are implemented,
>> although the definition of `seq-doseq' looks recursive.

Doesn't look recursive to me, it expands to a call to seq-do, which uses
a simple loop.

>> But it seems that gc has a problem with the large number of conses
>> created when processing that.
>
> What can we do instead in such cases?  Stack-overflow protection
> cannot work in GC, so you are shooting yourself in the foot by
> creating such large recursive structures.  By the time we get to GC,
> where the problem will happen, it's too late, because the memory was
> already allocated.
>
> Does anyone has a reasonable idea for avoiding the crash in such
> programs?

I don't have a quick answer for the general case, but I think it's a bug
in stream.el that it's creating such large structures in the first
place.  As far as I understand it, the point of streams is to handle
long lists by encoding them as

    (FIRST-VALUE . FUNCTION-TO-PRODUCE-REST-OF-LIST)
 
so as to avoid allocating large amounts of memory.  Is there an easy way
to find out what the large structures are, and where they are coming
from?





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-28  1:29         ` Noam Postavsky
@ 2018-02-28 10:58           ` Michael Heerdegen
  2018-02-28 16:00             ` Eli Zaretskii
  2019-04-25  3:20             ` Noam Postavsky
  0 siblings, 2 replies; 57+ messages in thread
From: Michael Heerdegen @ 2018-02-28 10:58 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: 30626

Noam Postavsky <npostavs@gmail.com> writes:

> >> #+begin_src emacs-lisp
> >> (seq-doseq (_ (stream-range 1 1000000)) nil)
> >> #+end_src
> >> 
> >> Note that this is executed as a loop due how to streams are
> >> implemented, although the definition of `seq-doseq' looks
> >> recursive.
>
> Doesn't look recursive to me, it expands to a call to seq-do, which uses
> a simple loop.

I was imprecise, I meant that the streams are defined recursively (most
of the time).  Though, it's a delayed type of recursion.  Anyway, I
think that this doesn't matter here.

> > Does anyone has a reasonable idea for avoiding the crash in such
> > programs?
>
> I don't have a quick answer for the general case, but I think it's a bug
> in stream.el that it's creating such large structures in the first
> place.  As far as I understand it, the point of streams is to handle
> long lists by encoding them as
>
>     (FIRST-VALUE . FUNCTION-TO-PRODUCE-REST-OF-LIST)

Yes, that's exactly how it's implemented.  When requesting more elements
from the stream, that becomes

      (FIRST-VALUE .
        (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST))

When we loop over the string, the cons whose car is the FIRST-VALUE,
let's call it cons1, is immediately thrown away, and we continue with

      (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST)

etc.

AFAIU the problem is that the cons1 still exists in memory until garbage
collection kicks in.  When that happens, the cons1 points to a largely
recursive cons structure, though this structure is never referenced from
Lisp in this form.
  
> so as to avoid allocating large amounts of memory.  Is there an easy way
> to find out what the large structures are, and where they are coming
> from?

I think I've answered that.  At least, I think so.  What I don't
understand is that when I force the `seq-doseq' to call
`garbace-collect' explicitly every 1000 cycles, or so, it doesn't help:
the crash still happens after generating ~ 70 000 elements, or some
more, but I can't avoid the crash, no matter how often I call gc.  So
I'm not sure whether these long lists are the problem or something else.
The FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST looks harmless when I print
it, even after thousands of iterations, so I would not understand why
that could be problematic.  streams.el implements things in a way that
these rest functions are not deeply nested lambdas.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-27 18:08       ` Eli Zaretskii
  2018-02-28  1:29         ` Noam Postavsky
@ 2018-02-28 11:05         ` Michael Heerdegen
  2018-02-28 13:20           ` Nicolas Petton
  2018-03-01 10:44         ` Daniel Colascione
  2 siblings, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2018-02-28 11:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: nicolas, 30626

Hello,

I had written that

> > (seq-doseq (_ (stream-range 1 1000000)) nil)

crashes.  CC'ing Nicolas, the author of stream.el.

> What can we do instead in such cases?  Stack-overflow protection
> cannot work in GC, so you are shooting yourself in the foot by
> creating such large recursive structures.  By the time we get to GC,
> where the problem will happen, it's too late, because the memory was
> already allocated.
>
> Does anyone has a reasonable idea for avoiding the crash in such
> programs?

I would appreciate any effort to fix that, because it seems that
currently streams are broken by design, and there is no way to fix that
from the Lisp implementation.

Yes, we could implement iterators instead of streams - that's what we
get when we avoid the consing.  But it's something different and not
always an alternative, depending on what you want to do.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-28 11:05         ` Michael Heerdegen
@ 2018-02-28 13:20           ` Nicolas Petton
  0 siblings, 0 replies; 57+ messages in thread
From: Nicolas Petton @ 2018-02-28 13:20 UTC (permalink / raw)
  To: Michael Heerdegen, Eli Zaretskii; +Cc: 30626

[-- Attachment #1: Type: text/plain, Size: 239 bytes --]

Michael Heerdegen <michael_heerdegen@web.de> writes:

> Hello,

Hi,

> I had written that
>
>> > (seq-doseq (_ (stream-range 1 1000000)) nil)
>
> crashes.  CC'ing Nicolas, the author of stream.el.

Thanks, I'll look into it.

Cheers,
Nico

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-28 10:58           ` Michael Heerdegen
@ 2018-02-28 16:00             ` Eli Zaretskii
  2018-02-28 16:20               ` Michael Heerdegen
  2019-04-25  3:20             ` Noam Postavsky
  1 sibling, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2018-02-28 16:00 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: npostavs, 30626

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Cc: Eli Zaretskii <eliz@gnu.org>,  30626@debbugs.gnu.org
> Date: Wed, 28 Feb 2018 11:58:49 +0100
> 
> >     (FIRST-VALUE . FUNCTION-TO-PRODUCE-REST-OF-LIST)
> 
> Yes, that's exactly how it's implemented.  When requesting more elements
> from the stream, that becomes
> 
>       (FIRST-VALUE .
>         (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST))
> 
> When we loop over the string, the cons whose car is the FIRST-VALUE,
> let's call it cons1, is immediately thrown away, and we continue with
> 
>       (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST)
> 
> etc.

How do you see that the car is immediately thrown away?

> AFAIU the problem is that the cons1 still exists in memory until garbage
> collection kicks in.  When that happens, the cons1 points to a largely
> recursive cons structure, though this structure is never referenced from
> Lisp in this form.

What is "cons1" in this context?





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-28 16:00             ` Eli Zaretskii
@ 2018-02-28 16:20               ` Michael Heerdegen
  2018-02-28 17:22                 ` Eli Zaretskii
  0 siblings, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2018-02-28 16:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, 30626

Eli Zaretskii <eliz@gnu.org> writes:

> How do you see that the car is immediately thrown away?

After expansion and dispatch, this is what gets executed:

#+begin_src emacs-lisp
(let ((stream (stream-range 1 1000000)))
  (while (not (stream-empty-p stream))
    (funcall #'ignore (stream-first stream))
    (setq stream (stream-rest stream))))
#+end_src

Creating an element and throwing away the outermost cons alternate.

> > AFAIU the problem is that the cons1 still exists in memory until
> > garbage collection kicks in.  When that happens, the cons1 points to
> > a largely recursive cons structure, though this structure is never
> > referenced from Lisp in this form.

> What is "cons1" in this context?

I said it some lines above: I defined it as name of the "first cons",
i.e. the original stream.  The return value of `stream-range' in the
above example.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-28 16:20               ` Michael Heerdegen
@ 2018-02-28 17:22                 ` Eli Zaretskii
  2018-02-28 18:25                   ` Michael Heerdegen
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2018-02-28 17:22 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: npostavs, 30626

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Cc: npostavs@gmail.com,  30626@debbugs.gnu.org
> Date: Wed, 28 Feb 2018 17:20:53 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > How do you see that the car is immediately thrown away?
> 
> After expansion and dispatch, this is what gets executed:
> 
> #+begin_src emacs-lisp
> (let ((stream (stream-range 1 1000000)))
>   (while (not (stream-empty-p stream))
>     (funcall #'ignore (stream-first stream))
>     (setq stream (stream-rest stream))))
> #+end_src
> 
> Creating an element and throwing away the outermost cons alternate.

I don't think it's thrown away from the POV of GC.  But you can easily
see what is going on if you trace the GC on the C level.  You should
be able to see which object causes recursion in mark_object.  I didn't
look long enough, but what I did see looks very much like the entire
unwound stream.






^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-28 17:22                 ` Eli Zaretskii
@ 2018-02-28 18:25                   ` Michael Heerdegen
  2018-03-01 11:25                     ` Michael Heerdegen
  0 siblings, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2018-02-28 18:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, 30626

Eli Zaretskii <eliz@gnu.org> writes:

> I don't think it's thrown away from the POV of GC.  But you can easily
> see what is going on if you trace the GC on the C level.  You should
> be able to see which object causes recursion in mark_object.  I didn't
> look long enough, but what I did see looks very much like the entire
> unwound stream.

I have an idea what could be going on.  In the stream-range example,
this is how the stream is build:

#+begin_src emacs-lisp
(list stream--identifier
      (let
          (forced val)
        (lambda
          (&optional check)
          (if
              check
              forced
            (unless
                forced
              (setf
               val
               (progn
                 (cons
                  start
                  (stream-range (+ start step) end step))))
              (setf forced t))
            val))))
#+end_src

The inner `stream-range' call results in a closure, and I guess that
this closure includes a reference to the outside VAL, which is the
stream from one step back (though there isn't a lexical reference to the
variable...does that make sense?)

So there could be a chain of references via closure variables back to
the first cons.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-27 18:08       ` Eli Zaretskii
  2018-02-28  1:29         ` Noam Postavsky
  2018-02-28 11:05         ` Michael Heerdegen
@ 2018-03-01 10:44         ` Daniel Colascione
  2018-03-01 15:51           ` Noam Postavsky
  2 siblings, 1 reply; 57+ messages in thread
From: Daniel Colascione @ 2018-03-01 10:44 UTC (permalink / raw)
  To: Eli Zaretskii, Michael Heerdegen; +Cc: 30626

On 02/27/2018 10:08 AM, Eli Zaretskii wrote:
>> From: Michael Heerdegen <michael_heerdegen@web.de>
>> Cc: bug-gnu-emacs@gnu.org,  30626@debbugs.gnu.org
>> Date: Tue, 27 Feb 2018 13:08:59 +0100
>>
>> #+begin_src emacs-lisp
>> (seq-doseq (_ (stream-range 1 1000000)) nil)
>> #+end_src
>>
>> Note that this is executed as a loop due how to streams are implemented,
>> although the definition of `seq-doseq' looks recursive.  But it seems
>> that gc has a problem with the large number of conses created when
>> processing that.
> 
> What can we do instead in such cases?  Stack-overflow protection
> cannot work in GC, so you are shooting yourself in the foot by
> creating such large recursive structures.  By the time we get to GC,
> where the problem will happen, it's too late, because the memory was
> already allocated.
> 
> Does anyone has a reasonable idea for avoiding the crash in such
> programs?

We need to fix GC being deeply recursive once and for all. Tweaking 
stack sizes on various platforms and trying to spot-fix GC for the 
occasional deeply recursive structure is annoying. Here's my proposal:

Turn garbage_collect_1 into a queue-draining loop, initializing the 
object queue with the GC roots before draining it. We'll make 
mark_object put an object on this queue, turning the existing 
mark_object code into a mark_queued_object function.

garbage_collect_1 will just call mark_queued_object in a loop; 
mark_queued_object can call mark_object, but since mark_object just 
enqueues an object and doesn't recurse, we can't exhaust the stack with 
deep object graphs. (We'll repurpose the mark bit to mean that the 
object is on the to-mark queue; by the time we fully drain the queue, 
just before we sweep, the mark bit will have the same meaning it does now.)

We can't allocate memory to hold the queue during GC, so we'll have to 
pre-allocate it. We can implement the queue as a list of queue blocks, 
where each queue block is an array of 16k or so Lisp_Objects. During 
allocation, we'll just make sure we have one Lisp_Object queue-block 
slot for every non-self-representing Lisp object we allocate.

Since we know that we'll have enough queue blocks for the worst GC case, 
we can have mark_object pull queue blocks from a free list, aborting if 
for some reason it ever runs out of queue blocks. (The previous 
paragraph guarantees we won't.) garbage_collect_1 will churn through 
these heap blocks and place each back on the free list after it's called 
mark_queued_object on every Lisp_Object in the queue block.

In this way, in non-pathological cases of GC, we'll end up using the 
same few queue blocks over and over. That's a nice optimization, because 
we can MADV_DONTNEED unused queue blocks so the OS doesn't actually have 
to remember their contents.

In this way, I think we can make the current GC model recursion-proof 
without drastically changing how we allocate Lisp objects. The 
additional memory requirements should be modest: it's basically one 
Lisp_Object per Lisp object allocated.

The naive version of this scheme needs about 4.6MB of overhead on my 
current 20MB Emacs heap, but it should be possible to reduce the 
overhead significantly by taking advantage of the block allocation we do 
for conses and other types --- we can put whole blocks on the queue 
instead of pointers to individual block parts, so we can get away with a 
much smaller queue. Under this approach, the reserved-queue-block scheme 
would impose an overhead of somewhere around 1MB on the same heap. This 
amount of overhead seems reasonable. We may end up actually using less 
memory that we would for recursive mark_object stack invocation.






^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-28 18:25                   ` Michael Heerdegen
@ 2018-03-01 11:25                     ` Michael Heerdegen
  2018-03-01 15:00                       ` Eli Zaretskii
  2018-03-02 14:11                       ` Noam Postavsky
  0 siblings, 2 replies; 57+ messages in thread
From: Michael Heerdegen @ 2018-03-01 11:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, 30626

Michael Heerdegen <michael_heerdegen@web.de> writes:

> The inner `stream-range' call results in a closure, and I guess that
> this closure includes a reference to the outside VAL, which is the
> stream from one step back (though there isn't a lexical reference to the
> variable...does that make sense?)

This is probably only a part of the puzzle.  Some examples:

test1.el: This is more or less the `stream-range' example reimplemented
without dependencies so that it works in emacs -Q:

#+begin_src emacs-lisp
;; -*- lexical-binding: t -*-

(defun my-range (start)
  (let (forced val)
    (lambda ()
      (unless forced
        (setq val (cons start (my-range (1+ start)))
              forced t))
      val)))

(defun my-test ()
  (let ((range-object (my-range 1))
        current-cons)
    (while (< (car (setq current-cons (funcall range-object))) (* 1000 1000))
      (message "%d" (car current-cons))
      (setq range-object (cdr current-cons)))))

(my-test)
#+end_src

Loading this file test1.el crashes Emacs - but if you compile it,
loading the compiled file doesn't crash.  This is what I expected from
my previous thoughts, because only the uncompiled closures include a
reference to the outer VALs.


test2.el:

#+begin_src emacs-lisp
;; -*- lexical-binding: t -*-

(require 'stream)

(let ((stream (stream-range 1 1000000)))
  (stream-flush stream))
#+end_src

This traverses the whole stream ignoring the elements.  Loading this
file crashes Emacs, no matter if compiled or not.  I'm surprised it
doesn't work even when compiled.


test3.el:

#+begin_src emacs-lisp
;; -*- lexical-binding: t -*-

(require 'stream)

(let ((stream (stream-range 1 1000000)))
  (while (not (stream-empty-p stream))
    (cl-callf stream-rest stream)))
#+end_src

This is semantically exactly like test2.el, only the call to
`stream-flush' has been replaced by literally writing out the
definition.  Nonetheless, the compiled file suddenly doesn't crash Emacs
when loaded.  Loading the uncompiled file test3.el still crashes.

Seems that whether we get a crash or not depends on details in the
implementation of lexical-binding.

Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-01 11:25                     ` Michael Heerdegen
@ 2018-03-01 15:00                       ` Eli Zaretskii
  2018-03-02 14:11                       ` Noam Postavsky
  1 sibling, 0 replies; 57+ messages in thread
From: Eli Zaretskii @ 2018-03-01 15:00 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: npostavs, 30626

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Cc: npostavs@gmail.com,  30626@debbugs.gnu.org
> Date: Thu, 01 Mar 2018 12:25:43 +0100
> 
> Seems that whether we get a crash or not depends on details in the
> implementation of lexical-binding.

Byte compilation doesn't just produce byte code, it also changes the
code into an equivalent one, but "equivalence" in this context doesn't
include side effects like stack usage.  As an extreme example,
consider a tail-recursive program that the byte compiler converts (or
at least might convert theoretically) into a loop.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-01 10:44         ` Daniel Colascione
@ 2018-03-01 15:51           ` Noam Postavsky
  2018-03-01 16:54             ` Michael Heerdegen
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-03-01 15:51 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Michael Heerdegen, 30626

On Thu, Mar 1, 2018 at 5:44 AM, Daniel Colascione <dancol@dancol.org> wrote:

> We need to fix GC being deeply recursive once and for all. Tweaking stack
> sizes on various platforms and trying to spot-fix GC for the occasional
> deeply recursive structure is annoying.

Agreed, but could you please open a new thread for it? AFAICT, this
bug thread is about streams.el functions producing structures
proportional in size to the length of the entire stream, which is a
bug in itself, regardless of whether or not the GC can handle the
result.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-01 15:51           ` Noam Postavsky
@ 2018-03-01 16:54             ` Michael Heerdegen
  2018-03-01 17:15               ` Noam Postavsky
  0 siblings, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2018-03-01 16:54 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: 30626

Noam Postavsky <npostavs@gmail.com> writes:

> Agreed, but could you please open a new thread for it? AFAICT, this
> bug thread is about streams.el functions producing structures
> proportional in size to the length of the entire stream, which is a
> bug in itself, regardless of whether or not the GC can handle the
> result.

No, it is a feature: we want streams to produce these structures,
streams are delayed lists, not just iterators.  Lists are also
potentially unlimited nested conses, and that's not a bug.  Streams are
the same realized with delayed conses.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-01 16:54             ` Michael Heerdegen
@ 2018-03-01 17:15               ` Noam Postavsky
  2018-03-02  7:08                 ` Michael Heerdegen
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-03-01 17:15 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 30626

On Thu, Mar 1, 2018 at 11:54 AM, Michael Heerdegen
<michael_heerdegen@web.de> wrote:
> Noam Postavsky <npostavs@gmail.com> writes:
>
>> Agreed, but could you please open a new thread for it? AFAICT, this
>> bug thread is about streams.el functions producing structures
>> proportional in size to the length of the entire stream, which is a
>> bug in itself, regardless of whether or not the GC can handle the
>> result.
>
> No, it is a feature: we want streams to produce these structures,
> streams are delayed lists, not just iterators.  Lists are also
> potentially unlimited nested conses, and that's not a bug.  Streams are
> the same realized with delayed conses.

Maybe I've misunderstood, but is it not the case that iterating over
(stream-range 1 n) should require only a constant amount of memory,
regardless of the value of n?





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-01 17:15               ` Noam Postavsky
@ 2018-03-02  7:08                 ` Michael Heerdegen
  2018-03-02 13:01                   ` Noam Postavsky
  2018-03-02 13:04                   ` Michael Heerdegen
  0 siblings, 2 replies; 57+ messages in thread
From: Michael Heerdegen @ 2018-03-02  7:08 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: 30626

Noam Postavsky <npostavs@gmail.com> writes:

> Maybe I've misunderstood, but is it not the case that iterating over
> (stream-range 1 n) should require only a constant amount of memory,
> regardless of the value of n?

Depends on your definition of `require'.  Like, for example,

  (dolist (i 1000000) (message "%S" (cons i (1+ i))))

each iteration step creates and discards a new cons (or a constant
number of conses).  Not a exceptional thing in Lisp.  When iterating
over (stream-range 1 n), the garbage collector seems to have problems
with how the garbage is structured.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-02  7:08                 ` Michael Heerdegen
@ 2018-03-02 13:01                   ` Noam Postavsky
  2018-03-02 13:13                     ` Michael Heerdegen
  2018-03-02 13:04                   ` Michael Heerdegen
  1 sibling, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-03-02 13:01 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 30626

Michael Heerdegen <michael_heerdegen@web.de> writes:

> Noam Postavsky <npostavs@gmail.com> writes:
>
>> Maybe I've misunderstood, but is it not the case that iterating over
>> (stream-range 1 n) should require only a constant amount of memory,
>> regardless of the value of n?
>
> Depends on your definition of `require'.  Like, for example,
>
>   (dolist (i 1000000) (message "%S" (cons i (1+ i))))
>
> each iteration step creates and discards a new cons (or a constant
> number of conses).  Not a exceptional thing in Lisp.  When iterating
> over (stream-range 1 n), the garbage collector seems to have problems
> with how the garbage is structured.

Ah, so let me be more precise.  Iterating over (stream-range 1 n) should
require only a constant amount of *reachable* memory at any particular
instant.  So your example above is okay, but the following one would be
not acceptable (I mean, it's fine if some random lisp code does that,
but stream-range should not be creating such long lists, or equivalently
large structures):

    (let ((list nil))
      (dotimes (i 1000000)
        (push i list)
        (message "%S" (car list))))





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-02  7:08                 ` Michael Heerdegen
  2018-03-02 13:01                   ` Noam Postavsky
@ 2018-03-02 13:04                   ` Michael Heerdegen
  1 sibling, 0 replies; 57+ messages in thread
From: Michael Heerdegen @ 2018-03-02 13:04 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: 30626

Michael Heerdegen <michael_heerdegen@web.de> writes:

> Noam Postavsky <npostavs@gmail.com> writes:
>
> > Maybe I've misunderstood, but is it not the case that iterating over
> > (stream-range 1 n) should require only a constant amount of memory,
> > regardless of the value of n?

But in this regard, we have a problem with how lexical-binding is
implemented for interpreted code.  Nested thunks (as implemented in
"thunk.el") accumulate useless variable bindings - e.g.

(defun test ()
  (thunk-force
   (thunk-delay
    (thunk-force
     (thunk-delay
      (thunk-force
       (thunk-delay
        (lambda () 1))))))))
(test)
==>
#1=(closure
((check)
(#:val . #1#)
(#:forced . t)
(check)
(#:val . #1#)
(#:forced . t)
(check)
(#:val . #1#)
(#:forced . t)
t)
nil 1)

The length of the variable list is equivalent to the number of thunk
wrappers.  I believe that these useless variable lists are responsible
for the crashes of the uncompiled versions of the test files I had
posted.  I think this problem is different from the gc issue.

Streams use nested thunks.  Of course does thunk.el not explicitly add
such variable lists to the result - this is how closures are built in
interpreted code.  For nested thunks these just add up.

BTW, if you byte-compile the above `test' function, then

(disassemble (test))
==>
byte code:
  args: nil
0       constant  1
1       return

and this problem is gone.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-02 13:01                   ` Noam Postavsky
@ 2018-03-02 13:13                     ` Michael Heerdegen
  0 siblings, 0 replies; 57+ messages in thread
From: Michael Heerdegen @ 2018-03-02 13:13 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: 30626

Noam Postavsky <npostavs@gmail.com> writes:

> Ah, so let me be more precise.  Iterating over (stream-range 1 n) should
> require only a constant amount of *reachable* memory at any particular
> instant.  So your example above is okay, but the following one would be
> not acceptable (I mean, it's fine if some random lisp code does that,
> but stream-range should not be creating such long lists, or equivalently
> large structures):
>
>     (let ((list nil))
>       (dotimes (i 1000000)
>         (push i list)
>         (message "%S" (car list))))

Yes, absolutely agreed.

Using streams, you can logically refer to the complete list of elements
as one object, but the programmer must ensure that the referable list of
generated elements doesn't get too large.

Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-01 11:25                     ` Michael Heerdegen
  2018-03-01 15:00                       ` Eli Zaretskii
@ 2018-03-02 14:11                       ` Noam Postavsky
  2018-03-02 15:06                         ` Michael Heerdegen
                                           ` (2 more replies)
  1 sibling, 3 replies; 57+ messages in thread
From: Noam Postavsky @ 2018-03-02 14:11 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Nicolas Petton, 30626

Michael Heerdegen <michael_heerdegen@web.de> writes:

> #+begin_src emacs-lisp
> ;; -*- lexical-binding: t -*-
>
> (require 'stream)
>
> (let ((stream (stream-range 1 1000000)))
>   (while (not (stream-empty-p stream))
>     (cl-callf stream-rest stream)))
> #+end_src
>
> This is semantically exactly like test2.el, only the call to
> `stream-flush' has been replaced by literally writing out the
> definition.  Nonetheless, the compiled file suddenly doesn't crash Emacs
> when loaded.  Loading the uncompiled file test3.el still crashes.

Aha, but the following also crashes, whether compiled or not:

;; -*- lexical-binding: t -*-

(require 'stream)

(let* ((stream0 (stream-range 1 1000000))
       (stream stream0))
  (while (not (stream-empty-p stream))
    (cl-callf stream-rest stream)))

So the problem is that the initial stream0 object can reach the entire
unfolding stream as it goes, and just holding on to that reference is
enough to keep the whole thing in memory.

Now, I can see that letting stream0 automagically get access to the
unfolded result can be an optimization in some cases, although in this
case it's a pessimization.  It could also affect the semantics if
unfolding the stream has side effects, not sure if stream.el makes
guarantees about that though.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-02 14:11                       ` Noam Postavsky
@ 2018-03-02 15:06                         ` Michael Heerdegen
  2018-03-02 15:43                           ` Eli Zaretskii
  2018-03-02 20:16                         ` Nicolas Petton
  2018-03-02 21:48                         ` John Mastro
  2 siblings, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2018-03-02 15:06 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Nicolas Petton, 30626

Noam Postavsky <npostavs@gmail.com> writes:

> Aha, but the following also crashes, whether compiled or not:
>
> ;; -*- lexical-binding: t -*-
>
> (require 'stream)
>
> (let* ((stream0 (stream-range 1 1000000))
>        (stream stream0))
>   (while (not (stream-empty-p stream))
>     (cl-callf stream-rest stream)))

I guess you should have some awe if you write something like
(stream-range 1 1000000) and keep in mind what happens with this
gigantic thing.  Though, could the object `stream0' references not
already be garbage-collected when the loop has been entered, since there
is no lexical reference to that variable there?

BTW, it gets even worse if you append streams, since the original
streams are not copied and magically unfolded when you generate elements
from the concatenation.

> Now, I can see that letting stream0 automagically get access to the
> unfolded result can be an optimization in some cases, although in this
> case it's a pessimization.  It could also affect the semantics if
> unfolding the stream has side effects, not sure if stream.el makes
> guarantees about that though.

AFAICT we make no such guarantees at all.  When generating stream
elements has side effects (which is not ideal, but it's not forbidden),
then you must know what you are doing.  In my experience, side effects
often directly correlate with element generation - e.g. for a stream of
search matches, a side effect is to set a variable to the position where
to continue searching.  These kind of side effects are not problematic.

For non-trivial side-effects, dunno, never wanted something like that.
But this problem also concerns other forms of delayed evaluation,
including thunks, and generally everything you can do with closures.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-02 15:06                         ` Michael Heerdegen
@ 2018-03-02 15:43                           ` Eli Zaretskii
  0 siblings, 0 replies; 57+ messages in thread
From: Eli Zaretskii @ 2018-03-02 15:43 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: nicolas, npostavs, 30626

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Date: Fri, 02 Mar 2018 16:06:49 +0100
> Cc: Nicolas Petton <nicolas@petton.fr>, 30626@debbugs.gnu.org
> 
> > (let* ((stream0 (stream-range 1 1000000))
> >        (stream stream0))
> >   (while (not (stream-empty-p stream))
> >     (cl-callf stream-rest stream)))
> 
> I guess you should have some awe if you write something like
> (stream-range 1 1000000) and keep in mind what happens with this
> gigantic thing.  Though, could the object `stream0' references not
> already be garbage-collected when the loop has been entered, since there
> is no lexical reference to that variable there?

It's still referenced by the let* form, I think.

But even if it didn't, you cannot rely on it being GC'ed, because
Emacs triggers GC at times which are hard or even impossible to
predict.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-02 14:11                       ` Noam Postavsky
  2018-03-02 15:06                         ` Michael Heerdegen
@ 2018-03-02 20:16                         ` Nicolas Petton
  2018-03-02 20:58                           ` Nicolas Petton
  2018-03-03  7:54                           ` Michael Heerdegen
  2018-03-02 21:48                         ` John Mastro
  2 siblings, 2 replies; 57+ messages in thread
From: Nicolas Petton @ 2018-03-02 20:16 UTC (permalink / raw)
  To: Noam Postavsky, Michael Heerdegen; +Cc: 30626

[-- Attachment #1: Type: text/plain, Size: 894 bytes --]

Noam Postavsky <npostavs@gmail.com> writes:

> Now, I can see that letting stream0 automagically get access to the
> unfolded result can be an optimization in some cases, although in this
> case it's a pessimization.

stream.el is at its core just an implementation of lazy-cons cells, so
not letting stream0 get access to the result would mean changing the
core implementation (I'm not necessarily against it).

If we accept that `seq-elt', and other positional functions of seq.el
should not work on streams, then I could rewrite stream.el to make it a
positioned stream where previous elements are discarded after each
element generation.  However the list of supported functions from seq.el
API would be significantly reduced.

> It could also affect the semantics if unfolding the stream has side
> effects, not sure if stream.el makes guarantees about that though.

No, it doesn't.

Nico

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-02 20:16                         ` Nicolas Petton
@ 2018-03-02 20:58                           ` Nicolas Petton
  2018-03-03  7:56                             ` Michael Heerdegen
  2018-03-03  7:54                           ` Michael Heerdegen
  1 sibling, 1 reply; 57+ messages in thread
From: Nicolas Petton @ 2018-03-02 20:58 UTC (permalink / raw)
  To: Noam Postavsky, Michael Heerdegen; +Cc: 30626

[-- Attachment #1: Type: text/plain, Size: 1031 bytes --]

Nicolas Petton <nicolas@petton.fr> writes:

> If we accept that `seq-elt', and other positional functions of seq.el
> should not work on streams, then I could rewrite stream.el to make it a
> positioned stream where previous elements are discarded after each
> element generation.  However the list of supported functions from seq.el
> API would be significantly reduced.

I had something like the following in mind:

  (cl-defstruct nstream current next-function)
  
  (cl-defmethod nstream-next ((stream nstream))
    (setf (nstream-current stream) (funcall (nstream-next-function stream)
                                            (nstream-current stream))))
  
  (defun nstream-range (&optional start end step)
    (unless start (setq start 0))
    (unless step (setq step 1))
    (make-nstream :current start
                  :next-function (lambda (cur)
                           (if (equal cur end)
                               nil
                             (+ cur step)))))

Cheers,
Nico

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-02 14:11                       ` Noam Postavsky
  2018-03-02 15:06                         ` Michael Heerdegen
  2018-03-02 20:16                         ` Nicolas Petton
@ 2018-03-02 21:48                         ` John Mastro
  2018-03-03 23:00                           ` Noam Postavsky
  2 siblings, 1 reply; 57+ messages in thread
From: John Mastro @ 2018-03-02 21:48 UTC (permalink / raw)
  To: 30626; +Cc: Michael Heerdegen, Nicolas Petton, Noam Postavsky

Noam Postavsky <npostavs@gmail.com> wrote:
> Aha, but the following also crashes, whether compiled or not:
>
> ;; -*- lexical-binding: t -*-
>
> (require 'stream)
>
> (let* ((stream0 (stream-range 1 1000000))
>        (stream stream0))
>   (while (not (stream-empty-p stream))
>     (cl-callf stream-rest stream)))
>
> So the problem is that the initial stream0 object can reach the entire
> unfolding stream as it goes, and just holding on to that reference is
> enough to keep the whole thing in memory.
>
> Now, I can see that letting stream0 automagically get access to the
> unfolded result can be an optimization in some cases, although in this
> case it's a pessimization.  It could also affect the semantics if
> unfolding the stream has side effects, not sure if stream.el makes
> guarantees about that though.

Clojure has/had a similar issue. They describe this scenario (having a
live reference to the beginning of the stream, preventing GC from
collecting it) as "holding onto the head" of the stream (in Clojure
they're called lazy seqs).

Their solution was what they call "locals clearing". The compiler tracks
the lifetimes of local bindings and "clears" them (by setting them to
nil/null) after their point of last use, e.g.:

(let* ((stream0 (stream-range 1 1000000))
       (stream stream0))
  (setq stream0 nil) ;; <<< Inserted by compiler
  (while (not (stream-empty-p stream))
    (cl-callf stream-rest stream)))

If the code does reference stream0 later, locals clearing can't help
you, but that's considered a "if it hurts, don't do it" situation.

This probably isn't practical for Emacs, especially since it could only
work for byte-compiled code, but thought the prior art may be of
interest.

        John





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-02 20:16                         ` Nicolas Petton
  2018-03-02 20:58                           ` Nicolas Petton
@ 2018-03-03  7:54                           ` Michael Heerdegen
  2018-03-03  8:47                             ` Nicolas Petton
  1 sibling, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2018-03-03  7:54 UTC (permalink / raw)
  To: Nicolas Petton; +Cc: Noam Postavsky, 30626

Nicolas Petton <nicolas@petton.fr> writes:

> stream.el is at its core just an implementation of lazy-cons cells, so
> not letting stream0 get access to the result would mean changing the
> core implementation (I'm not necessarily against it).

Please don't do this.  The semantics of streams is worth keeping.  I
make use of it in el-search and other stuff.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-02 20:58                           ` Nicolas Petton
@ 2018-03-03  7:56                             ` Michael Heerdegen
  0 siblings, 0 replies; 57+ messages in thread
From: Michael Heerdegen @ 2018-03-03  7:56 UTC (permalink / raw)
  To: Nicolas Petton; +Cc: Noam Postavsky, 30626

Nicolas Petton <nicolas@petton.fr> writes:

> I had something like the following in mind:
>
>   (cl-defstruct nstream current next-function)
>   
>   (cl-defmethod nstream-next ((stream nstream))
>     (setf (nstream-current stream) (funcall (nstream-next-function stream)
>                                             (nstream-current stream))))
>   
>   (defun nstream-range (&optional start end step)
>     (unless start (setq start 0))
>     (unless step (setq step 1))
>     (make-nstream :current start
>                   :next-function (lambda (cur)
>                            (if (equal cur end)
>                                nil
>                              (+ cur step)))))

This is an implementation of iterators, not streams.  We already have an
implementation of iterators in Emacs.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-03  7:54                           ` Michael Heerdegen
@ 2018-03-03  8:47                             ` Nicolas Petton
  0 siblings, 0 replies; 57+ messages in thread
From: Nicolas Petton @ 2018-03-03  8:47 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Noam Postavsky, 30626

[-- Attachment #1: Type: text/plain, Size: 214 bytes --]

Michael Heerdegen <michael_heerdegen@web.de> writes:

> Please don't do this.  The semantics of streams is worth keeping.  I
> make use of it in el-search and other stuff.

I'm not saying that I want this :)

Nico

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-02 21:48                         ` John Mastro
@ 2018-03-03 23:00                           ` Noam Postavsky
  2018-03-04 15:56                             ` Noam Postavsky
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-03-03 23:00 UTC (permalink / raw)
  To: John Mastro; +Cc: Michael Heerdegen, Nicolas Petton, 30626

John Mastro <john.b.mastro@gmail.com> writes:

> (let* ((stream0 (stream-range 1 1000000))
>        (stream stream0))
>   (setq stream0 nil) ;; <<< Inserted by compiler
>   (while (not (stream-empty-p stream))
>     (cl-callf stream-rest stream)))
>
> If the code does reference stream0 later, locals clearing can't help
> you, but that's considered a "if it hurts, don't do it" situation.
>
> This probably isn't practical for Emacs, especially since it could only
> work for byte-compiled code, but thought the prior art may be of
> interest.

Not sure how doable this solution is, but the fact that it works only
for byte-compiled code seems fine to me.  The interpreted case is doomed
to fail anyway since the interpreter doesn't prune redundant variables
from closures.






^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-03 23:00                           ` Noam Postavsky
@ 2018-03-04 15:56                             ` Noam Postavsky
  2018-03-04 17:02                               ` Eli Zaretskii
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-03-04 15:56 UTC (permalink / raw)
  To: John Mastro; +Cc: Michael Heerdegen, Nicolas Petton, 30626

Noam Postavsky <npostavs@gmail.com> writes:

> John Mastro <john.b.mastro@gmail.com> writes:
>
>> (let* ((stream0 (stream-range 1 1000000))
>>        (stream stream0))
>>   (setq stream0 nil) ;; <<< Inserted by compiler
>>   (while (not (stream-empty-p stream))
>>     (cl-callf stream-rest stream)))
>>
>> If the code does reference stream0 later, locals clearing can't help
>> you, but that's considered a "if it hurts, don't do it" situation.
>>
>> This probably isn't practical for Emacs, especially since it could only
>> work for byte-compiled code, but thought the prior art may be of
>> interest.
>
> Not sure how doable this solution is, but the fact that it works only
> for byte-compiled code seems fine to me.  The interpreted case is doomed
> to fail anyway since the interpreter doesn't prune redundant variables
> from closures.

Hmm, I think it won't work by itself though, just doing

    (stream-flush (stream-range 1 1000000))

also crashes, due to the head of the stream being referenced from the C
stack somewhere (I can get the address from gdb, but I can't figure out
how to get to the corresponding C variable from there).





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-04 15:56                             ` Noam Postavsky
@ 2018-03-04 17:02                               ` Eli Zaretskii
  2018-03-11 18:52                                 ` Noam Postavsky
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2018-03-04 17:02 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

> From: Noam Postavsky <npostavs@gmail.com>
> Date: Sun, 04 Mar 2018 10:56:36 -0500
> Cc: Michael Heerdegen <michael_heerdegen@web.de>,
> 	Nicolas Petton <nicolas@petton.fr>, 30626@debbugs.gnu.org
> 
> Hmm, I think it won't work by itself though, just doing
> 
>     (stream-flush (stream-range 1 1000000))
> 
> also crashes, due to the head of the stream being referenced from the C
> stack somewhere (I can get the address from gdb, but I can't figure out
> how to get to the corresponding C variable from there).

Did you try "info symbol ADDRESS"?  (I'm not sure this will work for
automatic variables, though.)

You could also try "info locals" after "set print address on" and/or
"set print symbol on".






^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-04 17:02                               ` Eli Zaretskii
@ 2018-03-11 18:52                                 ` Noam Postavsky
  2018-03-11 20:31                                   ` Eli Zaretskii
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-03-11 18:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

[-- Attachment #1: Type: text/plain, Size: 1319 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

>> also crashes, due to the head of the stream being referenced from the C
>> stack somewhere (I can get the address from gdb, but I can't figure out
>> how to get to the corresponding C variable from there).
>
> Did you try "info symbol ADDRESS"?  (I'm not sure this will work for
> automatic variables, though.)

Doesn't seem to work.  I guess it wouldn't work if the address was in
the middle of an array either.

> You could also try "info locals" after "set print address on" and/or
> "set print symbol on".

Those settings don't seem to help.

I first guessed that the problem is due to saving function arguments
during funcall, so I tried the following to check it:

--- i/src/bytecode.c
+++ w/src/bytecode.c
@@ -387,7 +387,10 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 			make_number (nargs)));
       ptrdiff_t pushedargs = min (nonrest, nargs);
       for (ptrdiff_t i = 0; i < pushedargs; i++, args++)
-	PUSH (*args);
+        {
+          PUSH (*args);
+          *args = Qnil;
+        }
       if (nonrest < nargs)
 	PUSH (Flist (nargs - nonrest, args));
       else

This did change the backtrace (from starting with mark_specpdl to
mark_stack), meaning I did find one reference, but it still crashes, so
there must be more.


[-- Attachment #2: gdb log excerpts --]
[-- Type: text/plain, Size: 7995 bytes --]

No stack.
Starting program: /home/npostavs/src/emacs/emacs-26/lucid/src/emacs -Q -batch -l elpa/packages/stream/stream.elc -l bug-30626-stream-crash/test5.elc
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x000000000060e9b5 in live_cons_holding (m=0x2f6e290, p=0x2f77620) at ../../src/alloc.c:4613
4613		    return make_lisp_ptr (s, Lisp_Cons);
#0  0x000000000060e9b5 in live_cons_holding (m=0x2f6e290, p=0x2f77620) at ../../src/alloc.c:4613
#1  0x000000000060e9ec in live_cons_p (m=0x2f6e290, p=0x2f77620) at ../../src/alloc.c:4622
#2  0x0000000000612f94 in mark_object (arg=XIL(0x2f77623)) at ../../src/alloc.c:6717
#3  0x0000000000611d4f in mark_vectorlike (ptr=0x2f78b50) at ../../src/alloc.c:6227
[...]
#4850 0x0000000000612b42 in mark_object (arg=XIL(0x2efcb83)) at ../../src/alloc.c:6624
#4851 0x0000000000611d4f in mark_vectorlike (ptr=0x2e64c90) at ../../src/alloc.c:6227
#4852 0x0000000000612b42 in mark_object (arg=XIL(0x2e64c95)) at ../../src/alloc.c:6624
#4853 0x000000000060f3ce in mark_maybe_pointer (p=0x2e64c90) at ../../src/alloc.c:4936
#4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868)
    at ../../src/alloc.c:4985
#4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", 
    end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193
#4856 0x00000000006cdd75 in mark_one_thread (thread=0xe103e0 <main_thread>) at ../../src/thread.c:616
#4857 0x00000000006cdf0a in mark_threads_callback (ignore=0x0) at ../../src/thread.c:649
#4858 0x000000000060f4da in flush_stack_call_func (func=0x6cde77 <mark_threads_callback>, arg=0x0)
    at ../../src/alloc.c:5220
#4859 0x00000000006cdf3c in mark_threads () at ../../src/thread.c:656
#4860 0x00000000006114a7 in garbage_collect_1 (end=0x7fffffffa700) at ../../src/alloc.c:5997
#4861 0x0000000000611b94 in Fgarbage_collect () at ../../src/alloc.c:6168
#4862 0x000000000057999d in maybe_gc () at ../../src/lisp.h:4749
#4863 0x000000000063c2cb in Ffuncall (nargs=6, args=0x7fffffffa7f8) at ../../src/eval.c:2751
#4864 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e7aad4), vector=XIL(0x2e72715), 
    maxdepth=make_number(18), args_template=make_number(768), nargs=3, args=0x7fffffffad20)
    at ../../src/bytecode.c:632
#4865 0x000000000063cfef in funcall_lambda (fun=XIL(0x1450e45), nargs=3, arg_vector=0x7fffffffad08)
    at ../../src/eval.c:2970
#4866 0x000000000063c402 in Ffuncall (nargs=4, args=0x7fffffffad00) at ../../src/eval.c:2771
#4867 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e7b2d4), vector=XIL(0x2fe2945), 
    maxdepth=make_number(7), args_template=make_number(256), nargs=0, args=0x7fffffffb1d8)
    at ../../src/bytecode.c:632
#4868 0x000000000063cfef in funcall_lambda (fun=XIL(0x2fe2985), nargs=0, arg_vector=0x7fffffffb1d8)
    at ../../src/eval.c:2970
#4869 0x000000000063c402 in Ffuncall (nargs=1, args=0x7fffffffb1d0) at ../../src/eval.c:2771
#4870 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e769d4), vector=XIL(0x9d00a5), 
    maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffffb670)
    at ../../src/bytecode.c:632
#4871 0x000000000063cfef in funcall_lambda (fun=XIL(0x1626f05), nargs=1, arg_vector=0x7fffffffb668)
    at ../../src/eval.c:2970
#4872 0x000000000063c402 in Ffuncall (nargs=2, args=0x7fffffffb660) at ../../src/eval.c:2771
#4873 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e7f7b4), vector=XIL(0x1622fe5), 
    maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffffbb10)
    at ../../src/bytecode.c:632
#4874 0x000000000063cfef in funcall_lambda (fun=XIL(0x1450f35), nargs=1, arg_vector=0x7fffffffbb08)
    at ../../src/eval.c:2970
#4875 0x000000000063c402 in Ffuncall (nargs=2, args=0x7fffffffbb00) at ../../src/eval.c:2771
#4876 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2ee3c94), vector=XIL(0x2e664f5), 
    maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffffbfa8)
    at ../../src/bytecode.c:632
#4877 0x000000000063cfef in funcall_lambda (fun=XIL(0x2e66515), nargs=1, arg_vector=0x7fffffffbfa0)
    at ../../src/eval.c:2970
#4878 0x000000000063c402 in Ffuncall (nargs=2, args=0x7fffffffbf98) at ../../src/eval.c:2771
#4879 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e91054), vector=XIL(0x1699ef5), 
    maxdepth=make_number(4), args_template=XIL(0), nargs=0, args=0x0) at ../../src/bytecode.c:632
#4880 0x000000000068cb20 in Fbyte_code (bytestr=XIL(0x2e91054), vector=XIL(0x1699ef5), 
    maxdepth=make_number(4)) at ../../src/bytecode.c:321
#4881 0x000000000063ac83 in eval_sub (form=XIL(0x2eeb413)) at ../../src/eval.c:2240
#4882 0x0000000000672b34 in readevalloop (readcharfun=XIL(0x68a0), infile0=0x7fffffffc6b0, 
    sourcename=XIL(0x2e90f94), printflag=false, unibyte=XIL(0), readfun=XIL(0), start=XIL(0), 
    end=XIL(0)) at ../../src/lread.c:2038
#4883 0x0000000000670e85 in Fload (file=XIL(0x2e8d884), noerror=XIL(0), nomessage=XIL(0xc090), 
    nosuffix=XIL(0), must_suffix=XIL(0)) at ../../src/lread.c:1425
#4884 0x000000000063c945 in funcall_subr (subr=0xd7a420 <Sload>, numargs=3, args=0x7fffffffc8e8)
    at ../../src/eval.c:2856
#4885 0x000000000063c3be in Ffuncall (nargs=4, args=0x7fffffffc8e0) at ../../src/eval.c:2769
#4886 0x000000000068d950 in exec_byte_code (bytestr=XIL(0xadcc2c), vector=XIL(0xadcc4d), 
    maxdepth=make_number(23), args_template=make_number(257), nargs=1, args=0x7fffffffd248)
    at ../../src/bytecode.c:632
#4887 0x000000000063cfef in funcall_lambda (fun=XIL(0xadcbfd), nargs=1, arg_vector=0x7fffffffd240)
    at ../../src/eval.c:2970
#4888 0x000000000063c402 in Ffuncall (nargs=2, args=0x7fffffffd238) at ../../src/eval.c:2771
#4889 0x000000000068d950 in exec_byte_code (bytestr=XIL(0xad7424), vector=XIL(0xad7445), 
    maxdepth=make_number(21), args_template=make_number(0), nargs=0, args=0x7fffffffde58)
    at ../../src/bytecode.c:632
#4890 0x000000000063cfef in funcall_lambda (fun=XIL(0xad73f5), nargs=0, arg_vector=0x7fffffffde58)
    at ../../src/eval.c:2970
#4891 0x000000000063c402 in Ffuncall (nargs=1, args=0x7fffffffde50) at ../../src/eval.c:2771
#4892 0x000000000068d950 in exec_byte_code (bytestr=XIL(0xad6614), vector=XIL(0xad6635), 
    maxdepth=make_number(12), args_template=make_number(0), nargs=0, args=0x7fffffffe480)
    at ../../src/bytecode.c:632
#4893 0x000000000063cfef in funcall_lambda (fun=XIL(0xad65e5), nargs=0, arg_vector=0x7fffffffe480)
    at ../../src/eval.c:2970
#4894 0x000000000063cc29 in apply_lambda (fun=XIL(0xad65e5), args=XIL(0), count=4)
    at ../../src/eval.c:2906
#4895 0x000000000063ae2a in eval_sub (form=XIL(0x149a8e3)) at ../../src/eval.c:2279
#4896 0x000000000063a19f in Feval (form=XIL(0x149a8e3), lexical=XIL(0)) at ../../src/eval.c:2054
#4897 0x000000000058150f in top_level_2 () at ../../src/keyboard.c:1119
#4898 0x00000000006384ce in internal_condition_case (bfun=0x5814ec <top_level_2>, 
    handlers=XIL(0x5250), hfun=0x580ef1 <cmd_error>) at ../../src/eval.c:1332
#4899 0x0000000000581554 in top_level_1 (ignore=XIL(0)) at ../../src/keyboard.c:1127
#4900 0x00000000006379d7 in internal_catch (tag=XIL(0xc6f0), func=0x581511 <top_level_1>, arg=XIL(0))
    at ../../src/eval.c:1097
#4901 0x000000000058143e in command_loop () at ../../src/keyboard.c:1088
#4902 0x00000000005809db in recursive_edit_1 () at ../../src/keyboard.c:695
#4903 0x0000000000580bd0 in Frecursive_edit () at ../../src/keyboard.c:766
#4904 0x000000000057e76b in main (argc=7, argv=0x7fffffffe9e8) at ../../src/emacs.c:1713
Cannot access memory at address 0x7ffffff7ef7f

(gdb) frame 4854
#4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868)
    at ../../src/alloc.c:4985
4985	      mark_maybe_pointer (*(void **) pp);
(gdb) info locals
pp = 0x7fffffffa968 "\220L\346\002"
(gdb) info symbol 0x7fffffffa968
No symbol matches 0x7fffffffa968.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-11 18:52                                 ` Noam Postavsky
@ 2018-03-11 20:31                                   ` Eli Zaretskii
  2018-03-11 21:51                                     ` Noam Postavsky
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2018-03-11 20:31 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

> From: Noam Postavsky <npostavs@gmail.com>
> Cc: michael_heerdegen@web.de,  john.b.mastro@gmail.com,  nicolas@petton.fr,  30626@debbugs.gnu.org
> Date: Sun, 11 Mar 2018 14:52:22 -0400
> 
> This did change the backtrace (from starting with mark_specpdl to
> mark_stack), meaning I did find one reference, but it still crashes, so
> there must be more.

If you have the address, you could first find the stack frame to which
it belongs, right?  Then go to that stack frame and type "info
locals", which should give you the locals in that frame.  One of them
is your variable.  It could also be a temporary slot, in which case
disassembly of that frame's function should show it.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-11 20:31                                   ` Eli Zaretskii
@ 2018-03-11 21:51                                     ` Noam Postavsky
  2018-03-12  3:28                                       ` Eli Zaretskii
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-03-11 21:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Noam Postavsky <npostavs@gmail.com>
>> Cc: michael_heerdegen@web.de,  john.b.mastro@gmail.com,  nicolas@petton.fr,  30626@debbugs.gnu.org
>> Date: Sun, 11 Mar 2018 14:52:22 -0400
>> 
>> This did change the backtrace (from starting with mark_specpdl to
>> mark_stack), meaning I did find one reference, but it still crashes, so
>> there must be more.
>
> If you have the address, you could first find the stack frame to which
> it belongs, right?

Um, how do I do that part?





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-11 21:51                                     ` Noam Postavsky
@ 2018-03-12  3:28                                       ` Eli Zaretskii
  2018-03-13  1:59                                         ` Noam Postavsky
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2018-03-12  3:28 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

> From: Noam Postavsky <npostavs@gmail.com>
> Cc: michael_heerdegen@web.de,  john.b.mastro@gmail.com,  nicolas@petton.fr,  30626@debbugs.gnu.org
> Date: Sun, 11 Mar 2018 17:51:19 -0400
> 
> > If you have the address, you could first find the stack frame to which
> > it belongs, right?
> 
> Um, how do I do that part?

By comparing the address with the value of $bp in each frame, I'd say.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-12  3:28                                       ` Eli Zaretskii
@ 2018-03-13  1:59                                         ` Noam Postavsky
  2018-03-13 16:06                                           ` Eli Zaretskii
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-03-13  1:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Noam Postavsky <npostavs@gmail.com>
>> Cc: michael_heerdegen@web.de,  john.b.mastro@gmail.com,  nicolas@petton.fr,  30626@debbugs.gnu.org
>> Date: Sun, 11 Mar 2018 17:51:19 -0400
>> 
>> > If you have the address, you could first find the stack frame to which
>> > it belongs, right?
>> 
>> Um, how do I do that part?
>
> By comparing the address with the value of $bp in each frame, I'd say.

Hmm, I found a match, but it doesn't make any sense.

#4851 0x0000000000611d4f in mark_vectorlike (ptr=0x2e64c90) at ../../src/alloc.c:6227
#4852 0x0000000000612b42 in mark_object (arg=XIL(0x2e64c95)) at ../../src/alloc.c:6624
#4853 0x000000000060f3ce in mark_maybe_pointer (p=0x2e64c90) at ../../src/alloc.c:4936
#4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868)
    at ../../src/alloc.c:4985
#4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", 
    end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193

(gdb) frame 4854
#4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868)
    at ../../src/alloc.c:4985
4985	      mark_maybe_pointer (*(void **) pp);
(gdb) p pp
$28 = 0x7fffffffa968 "\220L\346\002"

(gdb) frame 4864
#4864 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e7aad4), vector=XIL(0x2e72715), 
    maxdepth=make_number(18), args_template=make_number(768), nargs=3, args=0x7fffffffad20)
    at ../../src/bytecode.c:632
632		    TOP = Ffuncall (op + 1, &TOP);
(gdb) p $rbp
$29 = (void *) 0x7fffffffabd0

(gdb) p/x $rbp - $28
$32 = 0x268

(gdb) disas /s
[...]
1180		CASE (Bbuffer_substring):
1181		  {
1182		    Lisp_Object v1 = POP;
   0x000000000068fea4 <+13154>:	mov    -0x40(%rbp),%rax
   0x000000000068fea8 <+13158>:	lea    -0x8(%rax),%rdx
   0x000000000068feac <+13162>:	mov    %rdx,-0x40(%rbp)
   0x000000000068feb0 <+13166>:	mov    (%rax),%rax
   0x000000000068feb3 <+13169>:	mov    %rax,-0x268(%rbp)

1183		    TOP = Fbuffer_substring (TOP, v1);
   0x000000000068feba <+13176>:	mov    -0x268(%rbp),%rdx
   0x000000000068fec1 <+13183>:	mov    -0x40(%rbp),%rax
   0x000000000068fec5 <+13187>:	mov    %rdx,%rsi
   0x000000000068fec8 <+13190>:	mov    (%rax),%rdi
   0x000000000068fecb <+13193>:	callq  0x627e0a <Fbuffer_substring>

It can't be a buffer-substring arg, but that's the only reference to
-0x268(%rbp) in that function.






^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-13  1:59                                         ` Noam Postavsky
@ 2018-03-13 16:06                                           ` Eli Zaretskii
  2018-03-14  0:09                                             ` Noam Postavsky
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2018-03-13 16:06 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

> From: Noam Postavsky <npostavs@gmail.com>
> Cc: michael_heerdegen@web.de,  john.b.mastro@gmail.com,  nicolas@petton.fr,  30626@debbugs.gnu.org
> Date: Mon, 12 Mar 2018 21:59:57 -0400
> 
> 4985	      mark_maybe_pointer (*(void **) pp);
> (gdb) p pp
> $28 = 0x7fffffffa968 "\220L\346\002"

Should you look at pp or at *pp?

Also note that for Lisp objects that are marked you need to reset
their mark bit before trying to determine their type and value.

If none of the above helps, please walk me through the steps that led
you to look at -0x268(%rbp), because I'm not sure I follow.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-13 16:06                                           ` Eli Zaretskii
@ 2018-03-14  0:09                                             ` Noam Postavsky
  2018-03-15 16:34                                               ` Eli Zaretskii
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-03-14  0:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

Eli Zaretskii <eliz@gnu.org> writes:

> Should you look at pp or at *pp?

I think it should be pp, but I'm not sure.  The context:

#4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868)
    at ../../src/alloc.c:4985
#4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", 
    end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193

mark_memory (void *start, void *end)
{
  ...
  for (pp = start; (void *) pp < end; pp += GC_POINTER_ALIGNMENT)
    {
      mark_maybe_pointer (*(void **) pp);
      mark_maybe_object (*(Lisp_Object *) pp);
    }

So the value of pp ranges over stack addresses and *pp would be the
contents of the stack location.

> Also note that for Lisp objects that are marked you need to reset
> their mark bit before trying to determine their type and value.

I think I'm looking for a C variable, and not a Lisp object (although
the C variable presumably contains/points to a Lisp object).

> If none of the above helps, please walk me through the steps that led
> you to look at -0x268(%rbp), because I'm not sure I follow.

Starting with the value of pp, I then go up looking for a close value of
$rbp:

(gdb) p pp
$39 = 0x7fffffffa968 "\220L\346\002"

(gdb) up
#4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", 
    end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193
5193	  mark_memory (bottom, end);
(gdb) p $rbp
$40 = (void *) 0x7fffffffa420
(gdb) up
#4856 0x00000000006cdd75 in mark_one_thread (thread=0xe103e0 <main_thread>) at ../../src/thread.c:616
616	  mark_stack (thread->m_stack_bottom, stack_top);
(gdb) p $rbp
$41 = (void *) 0x7fffffffa470
[...]
(gdb) up 
#4863 0x000000000063c2cb in Ffuncall (nargs=6, args=0x7fffffffa7f8) at ../../src/eval.c:2751
2751	  maybe_gc ();
(gdb) p $rbp
$48 = (void *) 0x7fffffffa780
(gdb) up 
#4864 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e7aad4), vector=XIL(0x2e72715), 
    maxdepth=make_number(18), args_template=make_number(768), nargs=3, args=0x7fffffffad20)
    at ../../src/bytecode.c:632
632		    TOP = Ffuncall (op + 1, &TOP);
(gdb) p $rbp
$49 = (void *) 0x7fffffffabd0

Now I see that $rbp is higher than the target address, and the
difference is 0x268, so the target location should be -0x268(%rbp).

(gdb) p $rbp - 0x7fffffffa968
$52 = (void *) 0x268

Except something must be wrong in my reasoning, since the only
ocurrences of -0x268(%rbp) are the buffer-string args, which could only
hold integers or markers (neither of which could further point to long
chains of objects).





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-14  0:09                                             ` Noam Postavsky
@ 2018-03-15 16:34                                               ` Eli Zaretskii
  2018-03-17 15:53                                                 ` Noam Postavsky
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2018-03-15 16:34 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

> From: Noam Postavsky <npostavs@gmail.com>
> Cc: michael_heerdegen@web.de,  john.b.mastro@gmail.com,  nicolas@petton.fr,  30626@debbugs.gnu.org
> Date: Tue, 13 Mar 2018 20:09:17 -0400
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Should you look at pp or at *pp?
> 
> I think it should be pp, but I'm not sure.  The context:
> 
> #4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868)
>     at ../../src/alloc.c:4985
> #4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", 
>     end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193
> 
> mark_memory (void *start, void *end)
> {
>   ...
>   for (pp = start; (void *) pp < end; pp += GC_POINTER_ALIGNMENT)
>     {
>       mark_maybe_pointer (*(void **) pp);
>       mark_maybe_object (*(Lisp_Object *) pp);
>     }
> 
> So the value of pp ranges over stack addresses and *pp would be the
> contents of the stack location.

But the call to mark_maybe_pointer means that we consider pp to be a
pointer (in)to a Lisp object.

Anyway, wouldn't it be easier to look one frame lower?  We have this:

  #4850 0x0000000000612b42 in mark_object (arg=XIL(0x2efcb83)) at ../../src/alloc.c:6624
  #4851 0x0000000000611d4f in mark_vectorlike (ptr=0x2e64c90) at ../../src/alloc.c:6227
  #4852 0x0000000000612b42 in mark_object (arg=XIL(0x2e64c95)) at ../../src/alloc.c:6624
  #4853 0x000000000060f3ce in mark_maybe_pointer (p=0x2e64c90) at ../../src/alloc.c:4936
  #4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868)
      at ../../src/alloc.c:4985
  #4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", 
      end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193

In frame #4852, we have found an object, and we are marking it.  Did
you try looking at that object?  With these caveats:

> > Also note that for Lisp objects that are marked you need to reset
> > their mark bit before trying to determine their type and value.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-15 16:34                                               ` Eli Zaretskii
@ 2018-03-17 15:53                                                 ` Noam Postavsky
  2018-03-17 16:10                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-03-17 15:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

Eli Zaretskii <eliz@gnu.org> writes:

> In frame #4852, we have found an object, and we are marking it.  Did
> you try looking at that object?  With these caveats:
>
>> > Also note that for Lisp objects that are marked you need to reset
>> > their mark bit before trying to determine their type and value.

Okay, I think xpr takes care of that, right?  (I've restarted the debug
sessions a few times, so numbers may not match exactly)

#4853 0x000000000060f429 in mark_maybe_pointer (p=0x2e64c90) at ../../src/alloc.c:4936
4936		mark_object (obj);
(gdb) p obj
$60 = XIL(0x2e64c95)
(gdb) xpr
Lisp_Vectorlike
PVEC_NORMAL_VECTOR
$61 = (struct Lisp_Vector *) 0x2e64c90
{XIL(0x2efcb63), make_number(1000000), XIL(0x2efcb53), XIL(0x2efcb73), XIL(0x2efcb83), 
  XIL(0x20ab5b0), XIL(0xc090)}
(gdb) p $61->contents[0]
$62 = XIL(0x2efcb63)
(gdb) xpr
Lisp_Cons
$63 = (struct Lisp_Cons *) 0x2efcb60
{
  [...]
      car = make_number(2369), 
      [...]
        chain = 0x0
[...]
(gdb) p $61->contents[1]
$64 = make_number(1000000)
(gdb) p $61->contents[2]
$65 = XIL(0x2efcb53)
(gdb) xpr
Lisp_Cons
[...] car = make_number(1), [...] chain = 0x0 [...]
(gdb) p $61->contents[3]
$67 = XIL(0x2efcb73)
(gdb) xpr
[...] car = XIL(0xc090), [...] chain = 0x0 [...]
(gdb) p $61->contents[4]
[...] car = XIL(0x2efc443), [...] chain = 0x0 [...]
(gdb) p $61->contents[5]
[...]
$72 = (struct Lisp_Symbol *) 0x2e97ef0
"stream-range"
(gdb) p $61->contents[6]
[...]
$74 = (struct Lisp_Symbol *) 0xdf89d0 <lispsym+49296>
"t"


It looks like a the lexical environment of a bytecode function, probably
the initial stream, e.g., (stream-range 1 1000000) gives:

(--stream-- #[256 "\211\203\303\242\207\303\242\204 \304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" 
              [(1) 1000000 (1) (nil) (nil) stream-range t] 7 "
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

(fn &optional CHECK)"])

Not sure where to go next with this.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-17 15:53                                                 ` Noam Postavsky
@ 2018-03-17 16:10                                                   ` Eli Zaretskii
  2018-03-17 16:27                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2018-03-17 16:10 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

> From: Noam Postavsky <npostavs@gmail.com>
> Cc: michael_heerdegen@web.de,  john.b.mastro@gmail.com,  nicolas@petton.fr,  30626@debbugs.gnu.org
> Date: Sat, 17 Mar 2018 11:53:36 -0400
> 
> It looks like a the lexical environment of a bytecode function, probably
> the initial stream, e.g., (stream-range 1 1000000) gives:
> 
> (--stream-- #[256 "\211\203\303\242\207\303\242\204 \304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" 
>               [(1) 1000000 (1) (nil) (nil) stream-range t] 7 "
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> (fn &optional CHECK)"])
> 
> Not sure where to go next with this.

The goal was to find out which variable holds a reference to the
entire long stream, right?  So it sounds like a pointer to it is kept
in an automatic variable on the stack of exec_byte_code, right?  Which
kinda makes sense, since the stream is still being processed, I think.

Or am I confused?





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-17 16:10                                                   ` Eli Zaretskii
@ 2018-03-17 16:27                                                     ` Eli Zaretskii
  2018-03-17 17:28                                                       ` Noam Postavsky
  0 siblings, 1 reply; 57+ messages in thread
From: Eli Zaretskii @ 2018-03-17 16:27 UTC (permalink / raw)
  To: npostavs; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

> Date: Sat, 17 Mar 2018 18:10:24 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: michael_heerdegen@web.de, john.b.mastro@gmail.com, nicolas@petton.fr,
> 	30626@debbugs.gnu.org
> 
> > (--stream-- #[256 "\211\203\303\242\207\303\242\204 \304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" 
> >               [(1) 1000000 (1) (nil) (nil) stream-range t] 7 "
> >               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > 
> > (fn &optional CHECK)"])
> > 
> > Not sure where to go next with this.
> 
> The goal was to find out which variable holds a reference to the
> entire long stream, right?  So it sounds like a pointer to it is kept
> in an automatic variable on the stack of exec_byte_code, right?  Which
> kinda makes sense, since the stream is still being processed, I think.
> 
> Or am I confused?

Actually, there's still some mystery: if this object is a 7-element
vector, where do all the other GC frame come from?  Hmm... how
long/deep is each of the cons cells in elements 1 through 4 of the
vector?  If they are deeply nested, then that's the answer we've been
looking for, I think.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-17 16:27                                                     ` Eli Zaretskii
@ 2018-03-17 17:28                                                       ` Noam Postavsky
  2018-03-19 20:05                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2018-03-17 17:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

Eli Zaretskii <eliz@gnu.org> writes:

>> The goal was to find out which variable holds a reference to the
>> entire long stream, right?  So it sounds like a pointer to it is kept
>> in an automatic variable on the stack of exec_byte_code, right?  Which
>> kinda makes sense, since the stream is still being processed, I
>> think.

Yeah, but which variable exactly?  I'd like to find it and add 'X =
Qnil;' to confirm we've found where it is.

> Actually, there's still some mystery: if this object is a 7-element
> vector, where do all the other GC frame come from?  Hmm... how
> long/deep is each of the cons cells in elements 1 through 4 of the
> vector?  If they are deeply nested, then that's the answer we've been
> looking for, I think.

It's a bit confusing because of the indirection: stream-range uses the
stream-cons macro, which uses the stream-make macro, which uses the
thunk-delay macro.  I believe the end result is that the lexical
environment of the resulting closure has access to the next
stream-element in the chain, so the nesting depth is the length of the
stream (i.e., 100000 in the example).  Perhaps this example makes it
clearer:

    (setq print-circle t)

    (let* ((s0 (stream-range 1 2))
           (s1 (stream-rest s0)))
      (list s0 s1))
    ;=>
    ((--stream--
      #[256 "\211\203..."
        [(1) 2 (1) (t)
         ((1 . #1=(--stream--
               #[256 "\211\203..."
                 [(nil) (nil) nil t]
                 3 "\n\n(fn &optional CHECK)"])))
         stream-range t]
        7 "\n\n(fn &optional CHECK)"])
     #1#)





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-03-17 17:28                                                       ` Noam Postavsky
@ 2018-03-19 20:05                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 57+ messages in thread
From: Eli Zaretskii @ 2018-03-19 20:05 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626

> From: Noam Postavsky <npostavs@gmail.com>
> Cc: michael_heerdegen@web.de,  john.b.mastro@gmail.com,  nicolas@petton.fr,  30626@debbugs.gnu.org
> Date: Sat, 17 Mar 2018 13:28:22 -0400
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> The goal was to find out which variable holds a reference to the
> >> entire long stream, right?  So it sounds like a pointer to it is kept
> >> in an automatic variable on the stack of exec_byte_code, right?  Which
> >> kinda makes sense, since the stream is still being processed, I
> >> think.
> 
> Yeah, but which variable exactly?  I'd like to find it and add 'X =
> Qnil;' to confirm we've found where it is.

I don't think you will be able to tell without stepping through the
byte-code interpreter code, keeping track of what it stores where.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2018-02-28 10:58           ` Michael Heerdegen
  2018-02-28 16:00             ` Eli Zaretskii
@ 2019-04-25  3:20             ` Noam Postavsky
  2019-04-25  5:19               ` Michael Heerdegen
  1 sibling, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2019-04-25  3:20 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Nicolas Petton, 30626

[-- Attachment #1: Type: text/plain, Size: 1382 bytes --]

Michael Heerdegen <michael_heerdegen@web.de> writes:

>> I don't have a quick answer for the general case, but I think it's a bug
>> in stream.el that it's creating such large structures in the first
>> place.  As far as I understand it, the point of streams is to handle
>> long lists by encoding them as
>>
>>     (FIRST-VALUE . FUNCTION-TO-PRODUCE-REST-OF-LIST)
>
> Yes, that's exactly how it's implemented.  When requesting more elements
> from the stream, that becomes
>
>       (FIRST-VALUE .
>         (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST))
>
> When we loop over the string, the cons whose car is the FIRST-VALUE,
> let's call it cons1, is immediately thrown away, and we continue with
>
>       (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST)

Coming back to this again.  I think I got lost in the weeds of the
byte-code function objects before.  The core problem is that streams are
not exactly encoded like the above, because even after forcing it you
don't have just a plain SECOND-VALUE stored in the stream.  The original
FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST stays around and keeps referencing
all the code and all the following elements.  So a possible solution is
change the stream to get rid of the lambda part and just leave the
computed value after it's forced.  With the following patch
(stream-flush (stream-range 1 1000000)) succeeds:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: patch --]
[-- Type: text/x-diff, Size: 2937 bytes --]

From 5e7139618f1b4cebbe9785ea5a9303b80d1b2c92 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@users.sourceforge.net>
Date: Wed, 24 Apr 2019 22:51:19 -0400
Subject: [PATCH] [WIP] Drop forced lambda's from stream (Bug#30626)

Let the stream id distinguish between forced and unforced stream
values, replacing (think-delay BODY) with just (lambda () BODY).  When
the value is forced, replace the lambda with its result.
---
 packages/stream/stream.el | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/packages/stream/stream.el b/packages/stream/stream.el
index 3f6bc4b5b..fa7a3b520 100644
--- a/packages/stream/stream.el
+++ b/packages/stream/stream.el
@@ -65,18 +65,28 @@
 
 (eval-when-compile (require 'cl-lib))
 (require 'seq)
-(require 'thunk)
 
 (eval-and-compile
-  (defconst stream--identifier '--stream--
-    "Symbol internally used to identify streams."))
+  (defconst stream--fresh-identifier '--stream-fresh--
+    "Symbol internally used to streams whose head was not evaluated.")
+  (defconst stream--evald-identifier '--stream-evald--
+    "Symbol internally used to streams whose head was evaluated."))
 
 (defmacro stream-make (&rest body)
   "Return a stream built from BODY.
 BODY must return nil or a cons cell whose cdr is itself a
 stream."
   (declare (debug t))
-  `(list ',stream--identifier (thunk-delay ,@body)))
+  `(list ',stream--fresh-identifier (lambda () ,@body)))
+
+(defun stream-force (stream)
+  (cond
+   ((eq (car-safe stream) stream--evald-identifier)
+    (cadr stream))
+   ((eq (car-safe stream) stream--fresh-identifier)
+    (setf (car stream) stream--evald-identifier)
+    (setf (cadr stream) (funcall (cadr stream))))
+   (t (signal 'wrong-type-argument (list 'streamp stream)))))
 
 (defmacro stream-cons (first rest)
   "Return a stream built from the cons of FIRST and REST.
@@ -159,24 +169,26 @@ (defun stream-range (&optional start end step)
 
 (defun streamp (stream)
   "Return non-nil if STREAM is a stream, nil otherwise."
-  (eq (car-safe stream) stream--identifier))
+  (let ((car (car-safe stream)))
+    (or (eq car stream--fresh-identifier)
+        (eq car stream--evald-identifier))))
 
 (defun stream-empty ()
   "Return a new empty stream."
-  (list stream--identifier (thunk-delay nil)))
+  (list stream--evald-identifier nil))
 
 (defun stream-empty-p (stream)
   "Return non-nil if STREAM is empty, nil otherwise."
-  (null (thunk-force (cadr stream))))
+  (null (cadr (stream-force stream))))
 
 (defun stream-first (stream)
   "Return the first element of STREAM.
 Return nil if STREAM is empty."
-  (car (thunk-force (cadr stream))))
+  (car (stream-force stream)))
 
 (defun stream-rest (stream)
   "Return a stream of all but the first element of STREAM."
-  (or (cdr (thunk-force (cadr stream)))
+  (or (cdr (stream-force stream))
       (stream-empty)))
 
 (defun stream-append (&rest streams)
-- 
2.11.0


[-- Attachment #3: Type: text/plain, Size: 494 bytes --]


The reason I didn't do this in thunk.el is that thunks are just bare
lambdas, so there is no way to get rid it "from the inside", as it were.
Whereas for streams, it's just a matter of list editing.

Some additional things that I thought of changing, but I didn't yet:

- stream--identifier vs just using '--stream-- directly, I don't see
  what's the benefit of indirection here.

- stream-make should use cons instead of list (or maybe a struct?).

- stream-empty should just be a constant.

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2019-04-25  3:20             ` Noam Postavsky
@ 2019-04-25  5:19               ` Michael Heerdegen
  2019-05-10 13:20                 ` Michael Heerdegen
  0 siblings, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2019-04-25  5:19 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Nicolas Petton, 30626

Noam Postavsky <npostavs@gmail.com> writes:

> Coming back to this again.  I think I got lost in the weeds of the
> byte-code function objects before.  The core problem is that streams
> are not exactly encoded like the above, because even after forcing it
> you don't have just a plain SECOND-VALUE stored in the stream.  The
> original FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST stays around and keeps
> referencing all the code and all the following elements.  So a
> possible solution is change the stream to get rid of the lambda part
> and just leave the computed value after it's forced.  With the
> following patch (stream-flush (stream-range 1 1000000)) succeeds:
>
> [Patch...]

Works for me, and it makes sense.  As a test case I recompiled
el-search.el (it uses streams for several things) with your patch
applied to stream.el, and it worked well.

> Some additional things that I thought of changing, but I didn't yet:

> - stream--identifier vs just using '--stream-- directly, I don't see
>   what's the benefit of indirection here.

A matter of taste I guess.

> - stream-make should use cons instead of list (or maybe a struct?).

I think cons would be ok.  Would a struct make things slower?

> - stream-empty should just be a constant.

Dunno if there are cases where this would be problematic, but I guess we
could do this as well.

Anyway, thanks for looking into this again, I like your solution.  Maybe
Nicolas can also chime in.


Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2019-04-25  5:19               ` Michael Heerdegen
@ 2019-05-10 13:20                 ` Michael Heerdegen
  2019-05-25 20:29                   ` Noam Postavsky
  0 siblings, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2019-05-10 13:20 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Nicolas Petton, 30626

Michael Heerdegen <michael_heerdegen@web.de> writes:

> > [Patch...]
>
> Works for me, and it makes sense.  As a test case I recompiled
> el-search.el (it uses streams for several things) with your patch
> applied to stream.el, and it worked well.

@Nicolas: If you are short on time, can we just install this patch?
I've tested it for a while and it works well, and I'm quite sure it is
harmless (doesn't change any semantics apart from fixing the crashes).

> > - stream-make should use cons instead of list (or maybe a struct?).
>
> I think cons would be ok.  Would a struct make things slower?
>
> > - stream-empty should just be a constant.
>
> Dunno if there are cases where this would be problematic, but I guess we
> could do this as well.

@Nicolas: Do you want us to care about this or do you want to have a
look yourself?  I don't want to hurry, I just don't want this to be
forgotten.  If you say you have time in four months, it's still ok.


Thanks,

Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2019-05-10 13:20                 ` Michael Heerdegen
@ 2019-05-25 20:29                   ` Noam Postavsky
  2019-05-26  0:32                     ` Michael Heerdegen
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2019-05-25 20:29 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Nicolas Petton, 30626

[-- Attachment #1: Type: text/plain, Size: 2457 bytes --]

tags 30626 + patch
quit

Michael Heerdegen <michael_heerdegen@web.de> writes:

>> > - stream-make should use cons instead of list (or maybe a struct?).
>>
>> I think cons would be ok.  Would a struct make things slower?

A struct might be slower, and cons has the advantage that the print
output is more readable for humans too.  E.g., with this code:

(let ((s (stream-range 1 5)))
  (stream-flush s)
  s)

;; Using cons (patch in this message):
(--stream-evald-- 1 --stream-evald-- 2 --stream-evald-- 3 --stream-evald-- 4 --stream-evald--)

;; Using list (previous patch):
(--stream-evald-- (1 --stream-evald-- (2 --stream-evald-- (3 --stream-evald-- (4 --stream-evald-- nil)))))

;; I guess using a struct would look something like this:
#(--stream-evald-- (1 . #(--stream-evald-- (2 . #(--stream-evald-- (3 . #(--stream-evald-- (4 . #(--stream-evald-- nil)))))))))

;; Using list with thunk (current, v2.2.4)
(--stream-- #[256 "\211\203\007\0\303\242\207\303\242\204 \0\304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" [(1) 5 (1) (t) ((1 --stream-- #[256 "\211\203\007\0\303\242\207\303\242\204 \0\304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" [(2) 5 (1) (t) ((2 --stream-- #[256 "\211\203\007\0\303\242\207\303\242\204 \0\304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" [(3) 5 (1) (t) ((3 --stream-- #[256 "\211\203\007\0\303\242\207\303\242\204 \0\304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" [(4) 5 (1) (t) ((4 --stream-- #[256 "\211\203\007\0\300\242\207\300\242\204\024\0\301\302\240\210\300\303\240\210\301\242\207" [(t) (nil) nil t] 3 "

(fn &optional CHECK)"])) stream-range t] 7 "

(fn &optional CHECK)"])) stream-range t] 7 "

(fn &optional CHECK)"])) stream-range t] 7 "

(fn &optional CHECK)"])) stream-range t] 7 "

(fn &optional CHECK)"])

>> > - stream-empty should just be a constant.
>>
>> Dunno if there are cases where this would be problematic, but I guess we
>> could do this as well.

I've done this in the patch below.  Passes all the tests, and I can't
see why it would be problematic.

> @Nicolas: Do you want us to care about this or do you want to have a
> look yourself?  I don't want to hurry, I just don't want this to be
> forgotten.  If you say you have time in four months, it's still ok.

Not getting any response; I'll wait another week for comments and then
push.


[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 5752 bytes --]

From 7b126616c87bf034c933de711befcd80a7ada3bb Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@users.sourceforge.net>
Date: Wed, 24 Apr 2019 22:51:19 -0400
Subject: [PATCH] Drop forced lambda's from stream (Bug#30626)

Let the stream id distinguish between forced and unforced stream
values.  When the value is forced, replace the lambda with its result.
This lets the lambda and anything it references be garbage collected.

Change the representation of a stream from (--stream-- THUNK)
to (--stream-fresh-- . (lambda () VALUE)) or (--stream-evald .
VALUE).
* packages/stream/stream.el (stream--identifier): Remove.
(stream--fresh-identifier, stream--evald-identifier): New constants to
replace it.
(streamp): Check for new constants.
(stream-make): Use cons and lambda instead of list and thunk-delay.
(stream--force): New function.
(stream-empty-p, stream-first, stream-rest): Use it.
(stream-empty): New constant, return it from the function instead of
creating a new one each time.
* packages/stream/tests/stream-tests.el (stream-to-list): Remove.
(stream-list-test): Use seq-into instead.
---
 packages/stream/stream.el             | 41 +++++++++++++++++++++++++----------
 packages/stream/tests/stream-tests.el | 12 ++--------
 2 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/packages/stream/stream.el b/packages/stream/stream.el
index 3f6bc4b5b..9f73e8b86 100644
--- a/packages/stream/stream.el
+++ b/packages/stream/stream.el
@@ -1,6 +1,6 @@
 ;;; stream.el --- Implementation of streams  -*- lexical-binding: t -*-
 
-;; Copyright (C) 2016 Free Software Foundation, Inc.
+;; Copyright (C) 2016-2019 Free Software Foundation, Inc.
 
 ;; Author: Nicolas Petton <nicolas@petton.fr>
 ;; Keywords: stream, laziness, sequences
@@ -41,7 +41,7 @@
 ;; - ...
 ;;
 ;; All functions are prefixed with "stream-".
-;; All functions are tested in test/automated/stream-tests.el
+;; All functions are tested in tests/stream-tests.el
 ;;
 ;; Here is an example implementation of the Fibonacci numbers
 ;; implemented as in infinite stream:
@@ -65,18 +65,30 @@
 
 (eval-when-compile (require 'cl-lib))
 (require 'seq)
-(require 'thunk)
 
 (eval-and-compile
-  (defconst stream--identifier '--stream--
-    "Symbol internally used to identify streams."))
+  (defconst stream--fresh-identifier '--stream-fresh--
+    "Symbol internally used to streams whose head was not evaluated.")
+  (defconst stream--evald-identifier '--stream-evald--
+    "Symbol internally used to streams whose head was evaluated."))
 
 (defmacro stream-make (&rest body)
   "Return a stream built from BODY.
 BODY must return nil or a cons cell whose cdr is itself a
 stream."
   (declare (debug t))
-  `(list ',stream--identifier (thunk-delay ,@body)))
+  `(cons ',stream--fresh-identifier (lambda () ,@body)))
+
+(defun stream--force (stream)
+  "Evaluate and return the first cons cell of STREAM.
+That value is the one passed to `stream-make'."
+  (cond
+   ((eq (car-safe stream) stream--evald-identifier)
+    (cdr stream))
+   ((eq (car-safe stream) stream--fresh-identifier)
+    (setf (car stream) stream--evald-identifier)
+    (setf (cdr stream) (funcall (cdr stream))))
+   (t (signal 'wrong-type-argument (list 'streamp stream)))))
 
 (defmacro stream-cons (first rest)
   "Return a stream built from the cons of FIRST and REST.
@@ -159,24 +171,29 @@ (defun stream-range (&optional start end step)
 
 (defun streamp (stream)
   "Return non-nil if STREAM is a stream, nil otherwise."
-  (eq (car-safe stream) stream--identifier))
+  (let ((car (car-safe stream)))
+    (or (eq car stream--fresh-identifier)
+        (eq car stream--evald-identifier))))
+
+(defconst stream-empty (cons stream--evald-identifier nil)
+  "The empty stream.")
 
 (defun stream-empty ()
-  "Return a new empty stream."
-  (list stream--identifier (thunk-delay nil)))
+  "Return the empty stream."
+  stream-empty)
 
 (defun stream-empty-p (stream)
   "Return non-nil if STREAM is empty, nil otherwise."
-  (null (thunk-force (cadr stream))))
+  (null (cdr (stream--force stream))))
 
 (defun stream-first (stream)
   "Return the first element of STREAM.
 Return nil if STREAM is empty."
-  (car (thunk-force (cadr stream))))
+  (car (stream--force stream)))
 
 (defun stream-rest (stream)
   "Return a stream of all but the first element of STREAM."
-  (or (cdr (thunk-force (cadr stream)))
+  (or (cdr (stream--force stream))
       (stream-empty)))
 
 (defun stream-append (&rest streams)
diff --git a/packages/stream/tests/stream-tests.el b/packages/stream/tests/stream-tests.el
index 021ed65cf..7487ef69b 100644
--- a/packages/stream/tests/stream-tests.el
+++ b/packages/stream/tests/stream-tests.el
@@ -1,6 +1,6 @@
 ;;; stream-tests.el --- Unit tests for stream.el  -*- lexical-binding: t -*-
 
-;; Copyright (C) 2015, 2017 Free Software Foundation, Inc.
+;; Copyright (C) 2015, 2017-2019 Free Software Foundation, Inc.
 
 ;; Author: Nicolas Petton <nicolas@petton.fr>
 
@@ -29,14 +29,6 @@ (require 'stream)
 (require 'generator)
 (require 'cl-lib)
 
-(defun stream-to-list (stream)
-  "Eagerly traverse STREAM and return a list of its elements."
-  (let (result)
-    (seq-do (lambda (elt)
-                 (push elt result))
-               stream)
-    (reverse result)))
-
 (ert-deftest stream-empty-test ()
   (should (streamp (stream-empty)))
   (should (stream-empty-p (stream-empty))))
@@ -240,7 +232,7 @@ (ert-deftest stream-range-test ()
 
 (ert-deftest stream-list-test ()
   (dolist (list '(nil '(1 2 3) '(a . b)))
-    (should (equal list (stream-to-list (stream list))))))
+    (should (equal list (seq-into (stream list) 'list)))))
 
 (ert-deftest stream-seq-subseq-test ()
   (should (stream-empty-p (seq-subseq (stream-range 2 10) 0 0)))
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2019-05-25 20:29                   ` Noam Postavsky
@ 2019-05-26  0:32                     ` Michael Heerdegen
  2019-05-26  0:40                       ` Noam Postavsky
  0 siblings, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2019-05-26  0:32 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Nicolas Petton, 30626

Noam Postavsky <npostavs@gmail.com> writes:

> >> > - stream-make should use cons instead of list (or maybe a struct?).
> >>
> >> I think cons would be ok.  Would a struct make things slower?
>
> A struct might be slower, and cons has the advantage that the print
> output is more readable for humans too.  E.g., with this code: [...]

I see no reason not to switch to cons.  Better readable, and less cons
garbage than with `list'.

Would you like to do this?

> >> > - stream-empty should just be a constant.
> >>
> >> Dunno if there are cases where this would be problematic, but I
> >> guess we
> >> could do this as well.
>
> I've done this in the patch below.  Passes all the tests, and I can't
> see why it would be problematic.

Looks good to me.  Also passes my tests (which are all my stream.el
uses, including el-search.el).

> > @Nicolas: Do you want us to care about this or do you want to have a
> > look yourself?  I don't want to hurry, I just don't want this to be
> > forgotten.  If you say you have time in four months, it's still ok.
>
> Not getting any response; I'll wait another week for comments and then
> push.

Ok, let's do this.


Thanks,

Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2019-05-26  0:32                     ` Michael Heerdegen
@ 2019-05-26  0:40                       ` Noam Postavsky
  2019-05-26  1:15                         ` Michael Heerdegen
  0 siblings, 1 reply; 57+ messages in thread
From: Noam Postavsky @ 2019-05-26  0:40 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Nicolas Petton, 30626

Michael Heerdegen <michael_heerdegen@web.de> writes:

> I see no reason not to switch to cons.  Better readable, and less cons
> garbage than with `list'.
>
> Would you like to do this?

Yes, just to be clear, the patch in my previous message did already
include the switch to cons.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2019-05-26  0:40                       ` Noam Postavsky
@ 2019-05-26  1:15                         ` Michael Heerdegen
  2019-06-04  0:26                           ` Noam Postavsky
  0 siblings, 1 reply; 57+ messages in thread
From: Michael Heerdegen @ 2019-05-26  1:15 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Nicolas Petton, 30626

Noam Postavsky <npostavs@gmail.com> writes:

> Yes, just to be clear, the patch in my previous message did already
> include the switch to cons.

Ah - of course - sorry, I was already a bit tired, I read it but wasn't
aware of it.  Sorry, I had spent the time finding a bug in your patch
that turned out was just my oversight of some file to recompile.

Michael.





^ permalink raw reply	[flat|nested] 57+ messages in thread

* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files'
  2019-05-26  1:15                         ` Michael Heerdegen
@ 2019-06-04  0:26                           ` Noam Postavsky
  0 siblings, 0 replies; 57+ messages in thread
From: Noam Postavsky @ 2019-06-04  0:26 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Nicolas Petton, 30626

tags 30626 fixed
close 30626 
quit

Pushed to elpa.

b5f4061db 2019-06-03T20:23:03-04:00 "Drop forced lambda's from stream (Bug#30626)"
https://git.savannah.gnu.org/cgit/emacs/elpa.git/commit/?id=b5f4061db4226d1d49fcb0ff53db0424f359e344

> I had spent the time finding a bug in your patch that turned out was
> just my oversight of some file to recompile.

Oh, I hate it when that happens.






^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2019-06-04  0:26 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-27  9:22 bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' Michael Heerdegen
2018-02-27 11:21 ` Eli Zaretskii
2018-02-27 11:39   ` Michael Heerdegen
2018-02-27 12:08     ` Michael Heerdegen
2018-02-27 18:08       ` Eli Zaretskii
2018-02-28  1:29         ` Noam Postavsky
2018-02-28 10:58           ` Michael Heerdegen
2018-02-28 16:00             ` Eli Zaretskii
2018-02-28 16:20               ` Michael Heerdegen
2018-02-28 17:22                 ` Eli Zaretskii
2018-02-28 18:25                   ` Michael Heerdegen
2018-03-01 11:25                     ` Michael Heerdegen
2018-03-01 15:00                       ` Eli Zaretskii
2018-03-02 14:11                       ` Noam Postavsky
2018-03-02 15:06                         ` Michael Heerdegen
2018-03-02 15:43                           ` Eli Zaretskii
2018-03-02 20:16                         ` Nicolas Petton
2018-03-02 20:58                           ` Nicolas Petton
2018-03-03  7:56                             ` Michael Heerdegen
2018-03-03  7:54                           ` Michael Heerdegen
2018-03-03  8:47                             ` Nicolas Petton
2018-03-02 21:48                         ` John Mastro
2018-03-03 23:00                           ` Noam Postavsky
2018-03-04 15:56                             ` Noam Postavsky
2018-03-04 17:02                               ` Eli Zaretskii
2018-03-11 18:52                                 ` Noam Postavsky
2018-03-11 20:31                                   ` Eli Zaretskii
2018-03-11 21:51                                     ` Noam Postavsky
2018-03-12  3:28                                       ` Eli Zaretskii
2018-03-13  1:59                                         ` Noam Postavsky
2018-03-13 16:06                                           ` Eli Zaretskii
2018-03-14  0:09                                             ` Noam Postavsky
2018-03-15 16:34                                               ` Eli Zaretskii
2018-03-17 15:53                                                 ` Noam Postavsky
2018-03-17 16:10                                                   ` Eli Zaretskii
2018-03-17 16:27                                                     ` Eli Zaretskii
2018-03-17 17:28                                                       ` Noam Postavsky
2018-03-19 20:05                                                         ` Eli Zaretskii
2019-04-25  3:20             ` Noam Postavsky
2019-04-25  5:19               ` Michael Heerdegen
2019-05-10 13:20                 ` Michael Heerdegen
2019-05-25 20:29                   ` Noam Postavsky
2019-05-26  0:32                     ` Michael Heerdegen
2019-05-26  0:40                       ` Noam Postavsky
2019-05-26  1:15                         ` Michael Heerdegen
2019-06-04  0:26                           ` Noam Postavsky
2018-02-28 11:05         ` Michael Heerdegen
2018-02-28 13:20           ` Nicolas Petton
2018-03-01 10:44         ` Daniel Colascione
2018-03-01 15:51           ` Noam Postavsky
2018-03-01 16:54             ` Michael Heerdegen
2018-03-01 17:15               ` Noam Postavsky
2018-03-02  7:08                 ` Michael Heerdegen
2018-03-02 13:01                   ` Noam Postavsky
2018-03-02 13:13                     ` Michael Heerdegen
2018-03-02 13:04                   ` Michael Heerdegen
2018-02-27 18:00     ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).