* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' @ 2018-02-27 9:22 Michael Heerdegen 2018-02-27 11:21 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2018-02-27 9:22 UTC (permalink / raw) To: 30626 Hello, Traversing a `stream-of-directory-files' over a huge directory hierarchy crashes my Emacs. Here is a recipe: I have a file "/home/micha/Treasure/today/stream-crash.el" with these contents: #+begin_src emacs-lisp ;; -*- lexical-binding: t -*- (require 'stream) (seq-doseq (_file (stream-of-directory-files "/home/micha" t nil t nil (lambda (file) (and (file-readable-p file) (file-regular-p file))))) nil) #+end_src Then I micha> emacs -Q -L /home/micha/software/elpa/packages/stream\ -l /home/micha/Treasure/today/stream-crash.el and I get a segfault after ~ 1 minute. This kind of crash only seems to occur when the traversed directory is sufficiently large. Thanks in advance, Michael. In GNU Emacs 26.0.91 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.22.28) of 2018-02-26 built on drachen Repository revision: ea8f9fd7be138708a52aad418d09d5d4ca6b2a7e Windowing system distributor 'The X.Org Foundation', version 11.0.11906000 System Description: Debian GNU/Linux testing (buster) ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-27 9:22 bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' Michael Heerdegen @ 2018-02-27 11:21 ` Eli Zaretskii 2018-02-27 11:39 ` Michael Heerdegen 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2018-02-27 11:21 UTC (permalink / raw) To: 30626, michael_heerdegen On February 27, 2018 11:22:26 AM GMT+02:00, Michael Heerdegen <michael_heerdegen@web.de> wrote: > > Hello, > > Traversing a `stream-of-directory-files' over a huge directory > hierarchy > crashes my Emacs. > > Here is a recipe: I have a file > "/home/micha/Treasure/today/stream-crash.el" with these contents: > > #+begin_src emacs-lisp > ;; -*- lexical-binding: t -*- > > (require 'stream) > > (seq-doseq (_file (stream-of-directory-files > "/home/micha" t nil t nil > (lambda (file) (and (file-readable-p file) (file-regular-p file))))) > nil) > #+end_src > > Then I > > micha> emacs -Q -L /home/micha/software/elpa/packages/stream\ > -l /home/micha/Treasure/today/stream-crash.el > > and I get a segfault after ~ 1 minute. > > This kind of crash only seems to occur when the traversed directory is > sufficiently large. I guess it's a stack overflow: that function recurses into subdirectories. To avoid such problems, the function should be rewritten to work by BFS, not DFS. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-27 11:21 ` Eli Zaretskii @ 2018-02-27 11:39 ` Michael Heerdegen 2018-02-27 12:08 ` Michael Heerdegen 2018-02-27 18:00 ` Eli Zaretskii 0 siblings, 2 replies; 57+ messages in thread From: Michael Heerdegen @ 2018-02-27 11:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 30626 Eli Zaretskii <eliz@gnu.org> writes: > I guess it's a stack overflow: that function recurses into > subdirectories. > > To avoid such problems, the function should be rewritten to work by > BFS, not DFS. Here is the backtrace I'm now able to produce. Looks like the crash happens in gc: | [lots of more lines like these cut] | #104019 0x00000000005e0526 in mark_object (arg=XIL(0x5a15413)) at alloc.c:6624 | #104020 0x00000000005dfab5 in mark_vectorlike (ptr=0x5a584e0) at alloc.c:6227 | #104021 0x00000000005e0526 in mark_object (arg=XIL(0x5bd7cf3)) at alloc.c:6624 | #104022 0x00000000005dfab5 in mark_vectorlike (ptr=0x5873a80) at alloc.c:6227 | #104023 0x00000000005e0526 in mark_object (arg=XIL(0x5bd7cb3)) at alloc.c:6624 | #104024 0x00000000006070ea in mark_specpdl (first=0x58a1348, ptr=0x58a1b90) at eval.c:3868 | #104025 0x00000000006856b2 in mark_one_thread (thread=0xcd5340 <main_thread>) at thread.c:614 | #104026 0x00000000006857c7 in mark_threads_callback (ignore=0x0) at thread.c:649 | #104027 0x00000000005dda03 in flush_stack_call_func (func=0x685781 <mark_threads_callback>, arg=0x0) at alloc.c:5220 | #104028 0x00000000006857f5 in mark_threads () at thread.c:656 | #104029 0x00000000005df2d2 in garbage_collect_1 (end=0x7fffffff63f0) at alloc.c:5997 | #104030 0x00000000005df929 in Fgarbage_collect () at alloc.c:6168 | #104031 0x000000000055746c in maybe_gc () at lisp.h:4749 | #104032 0x00000000006047cb in Ffuncall (nargs=4, args=0x7fffffff64d0) at eval.c:2751 | #104033 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f83c4), vector=XIL(0x3255bb5), maxdepth=make_number(13), args_template=make_number(128), nargs=2, args=0x7fffffff6b38) at bytecode.c:629 | #104034 0x000000000060529c in funcall_lambda (fun=XIL(0x3255c15), nargs=2, arg_vector=0x7fffffff6b38) at eval.c:2970 | #104035 0x00000000006048d4 in Ffuncall (nargs=3, args=0x7fffffff6b30) at eval.c:2771 | #104036 0x000000000060395b in Fapply (nargs=3, args=0x7fffffff6b30) at eval.c:2346 | #104037 0x0000000000604b87 in funcall_subr (subr=0xc3d7c0 <Sapply>, numargs=3, args=0x7fffffff6b30) at eval.c:2824 | #104038 0x0000000000604890 in Ffuncall (nargs=4, args=0x7fffffff6b28) at eval.c:2769 | #104039 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f83e4), vector=XIL(0x15372775), maxdepth=make_number(8), args_template=make_number(256), nargs=0, args=0x7fffffff6ff8) at bytecode.c:629 | #104040 0x000000000060529c in funcall_lambda (fun=XIL(0x153727c5), nargs=0, arg_vector=0x7fffffff6ff8) at eval.c:2970 | #104041 0x00000000006048d4 in Ffuncall (nargs=1, args=0x7fffffff6ff0) at eval.c:2771 | #104042 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42b7ad4), vector=XIL(0x930825), maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffff7460) at bytecode.c:629 | #104043 0x000000000060529c in funcall_lambda (fun=XIL(0x3b73d75), nargs=1, arg_vector=0x7fffffff7458) at eval.c:2970 | #104044 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff7450) at eval.c:2771 | #104045 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f82f4), vector=XIL(0x3255aa5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffff78e0) at bytecode.c:629 | #104046 0x000000000060529c in funcall_lambda (fun=XIL(0x3255ab5), nargs=1, arg_vector=0x7fffffff78d8) at eval.c:2970 | #104047 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff78d0) at eval.c:2771 | #104048 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f83e4), vector=XIL(0x153727f5), maxdepth=make_number(8), args_template=make_number(256), nargs=0, args=0x7fffffff7da8) at bytecode.c:629 | #104049 0x000000000060529c in funcall_lambda (fun=XIL(0x15372845), nargs=0, arg_vector=0x7fffffff7da8) at eval.c:2970 | #104050 0x00000000006048d4 in Ffuncall (nargs=1, args=0x7fffffff7da0) at eval.c:2771 | #104051 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42b7ad4), vector=XIL(0x930825), maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffff8210) at bytecode.c:629 | #104052 0x000000000060529c in funcall_lambda (fun=XIL(0x3b73d75), nargs=1, arg_vector=0x7fffffff8208) at eval.c:2970 | #104053 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff8200) at eval.c:2771 | #104054 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f82f4), vector=XIL(0x3255aa5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffff8690) at bytecode.c:629 | #104055 0x000000000060529c in funcall_lambda (fun=XIL(0x3255ab5), nargs=1, arg_vector=0x7fffffff8688) at eval.c:2970 | #104056 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff8680) at eval.c:2771 | #104057 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f83e4), vector=XIL(0x15372875), maxdepth=make_number(8), args_template=make_number(256), nargs=0, args=0x7fffffff8b58) at bytecode.c:629 | #104058 0x000000000060529c in funcall_lambda (fun=XIL(0x153728c5), nargs=0, arg_vector=0x7fffffff8b58) at eval.c:2970 | #104059 0x00000000006048d4 in Ffuncall (nargs=1, args=0x7fffffff8b50) at eval.c:2771 | #104060 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42b7ad4), vector=XIL(0x930825), maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffff8fc0) at bytecode.c:629 | #104061 0x000000000060529c in funcall_lambda (fun=XIL(0x3b73d75), nargs=1, arg_vector=0x7fffffff8fb8) at eval.c:2970 | #104062 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff8fb0) at eval.c:2771 | #104063 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f82f4), vector=XIL(0x3255aa5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffff9440) at bytecode.c:629 | #104064 0x000000000060529c in funcall_lambda (fun=XIL(0x3255ab5), nargs=1, arg_vector=0x7fffffff9438) at eval.c:2970 | #104065 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff9430) at eval.c:2771 | #104066 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f83e4), vector=XIL(0x153728f5), maxdepth=make_number(8), args_template=make_number(256), nargs=0, args=0x7fffffff9908) at bytecode.c:629 | #104067 0x000000000060529c in funcall_lambda (fun=XIL(0x15372945), nargs=0, arg_vector=0x7fffffff9908) at eval.c:2970 | #104068 0x00000000006048d4 in Ffuncall (nargs=1, args=0x7fffffff9900) at eval.c:2771 | #104069 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42b7ad4), vector=XIL(0x930825), maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffff9d70) at bytecode.c:629 | #104070 0x000000000060529c in funcall_lambda (fun=XIL(0x3b73d75), nargs=1, arg_vector=0x7fffffff9d68) at eval.c:2970 | #104071 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffff9d60) at eval.c:2771 | #104072 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f82f4), vector=XIL(0x3255aa5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffffa1e8) at bytecode.c:629 | #104073 0x000000000060529c in funcall_lambda (fun=XIL(0x3255ab5), nargs=1, arg_vector=0x7fffffffa1e0) at eval.c:2970 | #104074 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffffa1d8) at eval.c:2771 | #104075 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f8c64), vector=XIL(0x153729a5), maxdepth=make_number(7), args_template=make_number(256), nargs=0, args=0x7fffffffa6a8) at bytecode.c:629 | #104076 0x000000000060529c in funcall_lambda (fun=XIL(0x153729f5), nargs=0, arg_vector=0x7fffffffa6a8) at eval.c:2970 | #104077 0x00000000006048d4 in Ffuncall (nargs=1, args=0x7fffffffa6a0) at eval.c:2771 | #104078 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42b7ad4), vector=XIL(0x930825), maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffffab10) at bytecode.c:629 | #104079 0x000000000060529c in funcall_lambda (fun=XIL(0x3b73d75), nargs=1, arg_vector=0x7fffffffab08) at eval.c:2970 | #104080 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffffab00) at eval.c:2771 | #104081 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f82f4), vector=XIL(0x3255aa5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffffaf88) at bytecode.c:629 | #104082 0x000000000060529c in funcall_lambda (fun=XIL(0x3255ab5), nargs=1, arg_vector=0x7fffffffaf80) at eval.c:2970 | #104083 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffffaf78) at eval.c:2771 | #104084 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42f8c04), vector=XIL(0x331ede5), maxdepth=make_number(5), args_template=make_number(514), nargs=2, args=0x7fffffffb5d0) at bytecode.c:629 | #104085 0x000000000060529c in funcall_lambda (fun=XIL(0x331ee05), nargs=2, arg_vector=0x7fffffffb5c0) at eval.c:2970 | #104086 0x00000000006048d4 in Ffuncall (nargs=3, args=0x7fffffffb5b8) at eval.c:2771 | #104087 0x000000000060390d in Fapply (nargs=4, args=0x7fffffffb5b8) at eval.c:2342 | #104088 0x0000000000604b87 in funcall_subr (subr=0xc3d7c0 <Sapply>, numargs=4, args=0x7fffffffb5b8) at eval.c:2824 | #104089 0x0000000000604890 in Ffuncall (nargs=5, args=0x7fffffffb5b0) at eval.c:2769 | #104090 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x42fb574), vector=XIL(0x425f285), maxdepth=make_number(15), args_template=make_number(642), nargs=2, args=0x7fffffffba40) at bytecode.c:629 | #104091 0x000000000060529c in funcall_lambda (fun=XIL(0x425f305), nargs=2, arg_vector=0x7fffffffba30) at eval.c:2970 | #104092 0x000000000060500d in apply_lambda (fun=XIL(0x425f305), args=XIL(0x5bd89c3), count=30) at eval.c:2906 | #104093 0x0000000000603602 in eval_sub (form=XIL(0x5bd89d3)) at eval.c:2279 | #104094 0x0000000000632c69 in readevalloop_eager_expand_eval (val=XIL(0x5bd8a73), macroexpand=XIL(0xda5d0)) at lread.c:1850 | #104095 0x00000000006335f0 in readevalloop (readcharfun=XIL(0x5b75ce5), infile0=0x0, sourcename=XIL(0x55c2e14), printflag=false, unibyte=XIL(0), readfun=XIL(0), start=XIL(0), end=XIL(0)) at lread.c:2036 | #104096 0x0000000000633a0a in Feval_buffer (buffer=XIL(0), printflag=XIL(0), filename=XIL(0x56ea3c4), unibyte=XIL(0), do_allow_print=XIL(0)) at lread.c:2103 | #104097 0x0000000000604d25 in funcall_subr (subr=0xc40240 <Seval_buffer>, numargs=0, args=0x7fffffffbfa0) at eval.c:2856 | #104098 0x0000000000604890 in Ffuncall (nargs=1, args=0x7fffffffbf98) at eval.c:2769 | #104099 0x00000000005fc8f7 in Ffuncall_interactively (nargs=1, args=0x7fffffffbf98) at callint.c:252 | #104100 0x0000000000604b87 in funcall_subr (subr=0xc3cfc0 <Sfuncall_interactively>, numargs=1, args=0x7fffffffbf98) at eval.c:2824 | #104101 0x0000000000604890 in Ffuncall (nargs=2, args=0x7fffffffbf90) at eval.c:2769 | #104102 0x00000000005fec50 in Fcall_interactively (function=XIL(0xad320), record_flag=XIL(0xaad0), keys=XIL(0x5a55745)) at callint.c:854 | #104103 0x0000000000604cac in funcall_subr (subr=0xc3d000 <Scall_interactively>, numargs=3, args=0x7fffffffc300) at eval.c:2849 | #104104 0x0000000000604890 in Ffuncall (nargs=4, args=0x7fffffffc2f8) at eval.c:2769 | #104105 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x9f4bec), vector=XIL(0x9f4c0d), maxdepth=make_number(13), args_template=make_number(1025), nargs=2, args=0x7fffffffc868) at bytecode.c:629 | #104106 0x000000000060529c in funcall_lambda (fun=XIL(0x9f4bbd), nargs=2, arg_vector=0x7fffffffc858) at eval.c:2970 | #104107 0x00000000006048d4 in Ffuncall (nargs=3, args=0x7fffffffc850) at eval.c:2771 | #104108 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x9f489c), vector=XIL(0x9f48bd), maxdepth=make_number(15), args_template=make_number(769), nargs=3, args=0x7fffffffce00) at bytecode.c:629 | #104109 0x000000000060529c in funcall_lambda (fun=XIL(0x9f485d), nargs=3, arg_vector=0x7fffffffcde8) at eval.c:2970 | #104110 0x00000000006048d4 in Ffuncall (nargs=4, args=0x7fffffffcde0) at eval.c:2771 | #104111 0x0000000000603cdb in Fapply (nargs=2, args=0x7fffffffcfc0) at eval.c:2389 | #104112 0x0000000000604b87 in funcall_subr (subr=0xc3d7c0 <Sapply>, numargs=2, args=0x7fffffffcfc0) at eval.c:2824 | #104113 0x0000000000604890 in Ffuncall (nargs=3, args=0x7fffffffcfb8) at eval.c:2769 | #104114 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x505da94), vector=XIL(0x5044095), maxdepth=make_number(3), args_template=XIL(0), nargs=0, args=0x0) at bytecode.c:629 | #104115 0x0000000000605615 in funcall_lambda (fun=XIL(0x50c3185), nargs=5, arg_vector=0x5044095) at eval.c:3052 | #104116 0x00000000006048d4 in Ffuncall (nargs=6, args=0x7fffffffd450) at eval.c:2771 | #104117 0x0000000000603cdb in Fapply (nargs=3, args=0x7fffffffd648) at eval.c:2389 | #104118 0x0000000000604b87 in funcall_subr (subr=0xc3d7c0 <Sapply>, numargs=3, args=0x7fffffffd648) at eval.c:2824 | #104119 0x0000000000604890 in Ffuncall (nargs=4, args=0x7fffffffd640) at eval.c:2769 | #104120 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x14402c4), vector=XIL(0x50441c5), maxdepth=make_number(5), args_template=make_number(128), nargs=4, args=0x7fffffffdbf0) at bytecode.c:629 | #104121 0x000000000060529c in funcall_lambda (fun=XIL(0x50441f5), nargs=4, arg_vector=0x7fffffffdbf0) at eval.c:2970 | #104122 0x00000000006048d4 in Ffuncall (nargs=5, args=0x7fffffffdbe8) at eval.c:2771 | #104123 0x00000000005fc8f7 in Ffuncall_interactively (nargs=5, args=0x7fffffffdbe8) at callint.c:252 | #104124 0x0000000000604b87 in funcall_subr (subr=0xc3cfc0 <Sfuncall_interactively>, numargs=5, args=0x7fffffffdbe8) at eval.c:2824 | #104125 0x0000000000604890 in Ffuncall (nargs=6, args=0x7fffffffdbe0) at eval.c:2769 | #104126 0x0000000000603cdb in Fapply (nargs=3, args=0x7fffffffdd90) at eval.c:2389 | #104127 0x00000000005fcd7d in Fcall_interactively (function=XIL(0xda3c0), record_flag=XIL(0), keys=XIL(0x5a3e0a5)) at callint.c:389 | #104128 0x0000000000604cac in funcall_subr (subr=0xc3d000 <Scall_interactively>, numargs=3, args=0x7fffffffdfe0) at eval.c:2849 | #104129 0x0000000000604890 in Ffuncall (nargs=4, args=0x7fffffffdfd8) at eval.c:2769 | #104130 0x000000000064cc41 in exec_byte_code (bytestr=XIL(0x9f4bec), vector=XIL(0x9f4c0d), maxdepth=make_number(13), args_template=make_number(1025), nargs=1, args=0x7fffffffe520) at bytecode.c:629 | #104131 0x000000000060529c in funcall_lambda (fun=XIL(0x9f4bbd), nargs=1, arg_vector=0x7fffffffe518) at eval.c:2970 | #104132 0x00000000006048d4 in Ffuncall (nargs=2, args=0x7fffffffe510) at eval.c:2771 | #104133 0x00000000006042a0 in call1 (fn=XIL(0x3f00), arg1=XIL(0xda3c0)) at eval.c:2620 | #104134 0x000000000055ec02 in command_loop_1 () at keyboard.c:1482 | #104135 0x00000000006013b2 in internal_condition_case (bfun=0x55e45e <command_loop_1>, handlers=XIL(0x5250), hfun=0x55dc1d <cmd_error>) at eval.c:1332 | #104136 0x000000000055e151 in command_loop_2 (ignore=XIL(0)) at keyboard.c:1110 | #104137 0x0000000000600c94 in internal_catch (tag=XIL(0xc6f0), func=0x55e124 <command_loop_2>, arg=XIL(0)) at eval.c:1097 | #104138 0x000000000055e0ef in command_loop () at keyboard.c:1089 | #104139 0x000000000055d7ef in recursive_edit_1 () at keyboard.c:695 | #104140 0x000000000055d970 in Frecursive_edit () at keyboard.c:766 | #104141 0x000000000055b5f0 in main (argc=1, argv=0x7fffffffe9f8) at emacs.c:1713 | Cannot access memory at address 0x7fffff66ff4f Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-27 11:39 ` Michael Heerdegen @ 2018-02-27 12:08 ` Michael Heerdegen 2018-02-27 18:08 ` Eli Zaretskii 2018-02-27 18:00 ` Eli Zaretskii 1 sibling, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2018-02-27 12:08 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 30626 Michael Heerdegen <michael_heerdegen@web.de> writes: > Here is the backtrace I'm now able to produce. Looks like the crash > happens in gc. Here is a much simpler example that crashes as well: #+begin_src emacs-lisp (seq-doseq (_ (stream-range 1 1000000)) nil) #+end_src Note that this is executed as a loop due how to streams are implemented, although the definition of `seq-doseq' looks recursive. But it seems that gc has a problem with the large number of conses created when processing that. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-27 12:08 ` Michael Heerdegen @ 2018-02-27 18:08 ` Eli Zaretskii 2018-02-28 1:29 ` Noam Postavsky ` (2 more replies) 0 siblings, 3 replies; 57+ messages in thread From: Eli Zaretskii @ 2018-02-27 18:08 UTC (permalink / raw) To: Michael Heerdegen; +Cc: 30626 > From: Michael Heerdegen <michael_heerdegen@web.de> > Cc: bug-gnu-emacs@gnu.org, 30626@debbugs.gnu.org > Date: Tue, 27 Feb 2018 13:08:59 +0100 > > #+begin_src emacs-lisp > (seq-doseq (_ (stream-range 1 1000000)) nil) > #+end_src > > Note that this is executed as a loop due how to streams are implemented, > although the definition of `seq-doseq' looks recursive. But it seems > that gc has a problem with the large number of conses created when > processing that. What can we do instead in such cases? Stack-overflow protection cannot work in GC, so you are shooting yourself in the foot by creating such large recursive structures. By the time we get to GC, where the problem will happen, it's too late, because the memory was already allocated. Does anyone has a reasonable idea for avoiding the crash in such programs? ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-27 18:08 ` Eli Zaretskii @ 2018-02-28 1:29 ` Noam Postavsky 2018-02-28 10:58 ` Michael Heerdegen 2018-02-28 11:05 ` Michael Heerdegen 2018-03-01 10:44 ` Daniel Colascione 2 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-02-28 1:29 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Michael Heerdegen, 30626 Eli Zaretskii <eliz@gnu.org> writes: >> From: Michael Heerdegen <michael_heerdegen@web.de> >> Cc: bug-gnu-emacs@gnu.org, 30626@debbugs.gnu.org >> Date: Tue, 27 Feb 2018 13:08:59 +0100 >> >> #+begin_src emacs-lisp >> (seq-doseq (_ (stream-range 1 1000000)) nil) >> #+end_src >> >> Note that this is executed as a loop due how to streams are implemented, >> although the definition of `seq-doseq' looks recursive. Doesn't look recursive to me, it expands to a call to seq-do, which uses a simple loop. >> But it seems that gc has a problem with the large number of conses >> created when processing that. > > What can we do instead in such cases? Stack-overflow protection > cannot work in GC, so you are shooting yourself in the foot by > creating such large recursive structures. By the time we get to GC, > where the problem will happen, it's too late, because the memory was > already allocated. > > Does anyone has a reasonable idea for avoiding the crash in such > programs? I don't have a quick answer for the general case, but I think it's a bug in stream.el that it's creating such large structures in the first place. As far as I understand it, the point of streams is to handle long lists by encoding them as (FIRST-VALUE . FUNCTION-TO-PRODUCE-REST-OF-LIST) so as to avoid allocating large amounts of memory. Is there an easy way to find out what the large structures are, and where they are coming from? ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-28 1:29 ` Noam Postavsky @ 2018-02-28 10:58 ` Michael Heerdegen 2018-02-28 16:00 ` Eli Zaretskii 2019-04-25 3:20 ` Noam Postavsky 0 siblings, 2 replies; 57+ messages in thread From: Michael Heerdegen @ 2018-02-28 10:58 UTC (permalink / raw) To: Noam Postavsky; +Cc: 30626 Noam Postavsky <npostavs@gmail.com> writes: > >> #+begin_src emacs-lisp > >> (seq-doseq (_ (stream-range 1 1000000)) nil) > >> #+end_src > >> > >> Note that this is executed as a loop due how to streams are > >> implemented, although the definition of `seq-doseq' looks > >> recursive. > > Doesn't look recursive to me, it expands to a call to seq-do, which uses > a simple loop. I was imprecise, I meant that the streams are defined recursively (most of the time). Though, it's a delayed type of recursion. Anyway, I think that this doesn't matter here. > > Does anyone has a reasonable idea for avoiding the crash in such > > programs? > > I don't have a quick answer for the general case, but I think it's a bug > in stream.el that it's creating such large structures in the first > place. As far as I understand it, the point of streams is to handle > long lists by encoding them as > > (FIRST-VALUE . FUNCTION-TO-PRODUCE-REST-OF-LIST) Yes, that's exactly how it's implemented. When requesting more elements from the stream, that becomes (FIRST-VALUE . (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST)) When we loop over the string, the cons whose car is the FIRST-VALUE, let's call it cons1, is immediately thrown away, and we continue with (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST) etc. AFAIU the problem is that the cons1 still exists in memory until garbage collection kicks in. When that happens, the cons1 points to a largely recursive cons structure, though this structure is never referenced from Lisp in this form. > so as to avoid allocating large amounts of memory. Is there an easy way > to find out what the large structures are, and where they are coming > from? I think I've answered that. At least, I think so. What I don't understand is that when I force the `seq-doseq' to call `garbace-collect' explicitly every 1000 cycles, or so, it doesn't help: the crash still happens after generating ~ 70 000 elements, or some more, but I can't avoid the crash, no matter how often I call gc. So I'm not sure whether these long lists are the problem or something else. The FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST looks harmless when I print it, even after thousands of iterations, so I would not understand why that could be problematic. streams.el implements things in a way that these rest functions are not deeply nested lambdas. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-28 10:58 ` Michael Heerdegen @ 2018-02-28 16:00 ` Eli Zaretskii 2018-02-28 16:20 ` Michael Heerdegen 2019-04-25 3:20 ` Noam Postavsky 1 sibling, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2018-02-28 16:00 UTC (permalink / raw) To: Michael Heerdegen; +Cc: npostavs, 30626 > From: Michael Heerdegen <michael_heerdegen@web.de> > Cc: Eli Zaretskii <eliz@gnu.org>, 30626@debbugs.gnu.org > Date: Wed, 28 Feb 2018 11:58:49 +0100 > > > (FIRST-VALUE . FUNCTION-TO-PRODUCE-REST-OF-LIST) > > Yes, that's exactly how it's implemented. When requesting more elements > from the stream, that becomes > > (FIRST-VALUE . > (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST)) > > When we loop over the string, the cons whose car is the FIRST-VALUE, > let's call it cons1, is immediately thrown away, and we continue with > > (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST) > > etc. How do you see that the car is immediately thrown away? > AFAIU the problem is that the cons1 still exists in memory until garbage > collection kicks in. When that happens, the cons1 points to a largely > recursive cons structure, though this structure is never referenced from > Lisp in this form. What is "cons1" in this context? ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-28 16:00 ` Eli Zaretskii @ 2018-02-28 16:20 ` Michael Heerdegen 2018-02-28 17:22 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2018-02-28 16:20 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, 30626 Eli Zaretskii <eliz@gnu.org> writes: > How do you see that the car is immediately thrown away? After expansion and dispatch, this is what gets executed: #+begin_src emacs-lisp (let ((stream (stream-range 1 1000000))) (while (not (stream-empty-p stream)) (funcall #'ignore (stream-first stream)) (setq stream (stream-rest stream)))) #+end_src Creating an element and throwing away the outermost cons alternate. > > AFAIU the problem is that the cons1 still exists in memory until > > garbage collection kicks in. When that happens, the cons1 points to > > a largely recursive cons structure, though this structure is never > > referenced from Lisp in this form. > What is "cons1" in this context? I said it some lines above: I defined it as name of the "first cons", i.e. the original stream. The return value of `stream-range' in the above example. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-28 16:20 ` Michael Heerdegen @ 2018-02-28 17:22 ` Eli Zaretskii 2018-02-28 18:25 ` Michael Heerdegen 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2018-02-28 17:22 UTC (permalink / raw) To: Michael Heerdegen; +Cc: npostavs, 30626 > From: Michael Heerdegen <michael_heerdegen@web.de> > Cc: npostavs@gmail.com, 30626@debbugs.gnu.org > Date: Wed, 28 Feb 2018 17:20:53 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > How do you see that the car is immediately thrown away? > > After expansion and dispatch, this is what gets executed: > > #+begin_src emacs-lisp > (let ((stream (stream-range 1 1000000))) > (while (not (stream-empty-p stream)) > (funcall #'ignore (stream-first stream)) > (setq stream (stream-rest stream)))) > #+end_src > > Creating an element and throwing away the outermost cons alternate. I don't think it's thrown away from the POV of GC. But you can easily see what is going on if you trace the GC on the C level. You should be able to see which object causes recursion in mark_object. I didn't look long enough, but what I did see looks very much like the entire unwound stream. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-28 17:22 ` Eli Zaretskii @ 2018-02-28 18:25 ` Michael Heerdegen 2018-03-01 11:25 ` Michael Heerdegen 0 siblings, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2018-02-28 18:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, 30626 Eli Zaretskii <eliz@gnu.org> writes: > I don't think it's thrown away from the POV of GC. But you can easily > see what is going on if you trace the GC on the C level. You should > be able to see which object causes recursion in mark_object. I didn't > look long enough, but what I did see looks very much like the entire > unwound stream. I have an idea what could be going on. In the stream-range example, this is how the stream is build: #+begin_src emacs-lisp (list stream--identifier (let (forced val) (lambda (&optional check) (if check forced (unless forced (setf val (progn (cons start (stream-range (+ start step) end step)))) (setf forced t)) val)))) #+end_src The inner `stream-range' call results in a closure, and I guess that this closure includes a reference to the outside VAL, which is the stream from one step back (though there isn't a lexical reference to the variable...does that make sense?) So there could be a chain of references via closure variables back to the first cons. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-28 18:25 ` Michael Heerdegen @ 2018-03-01 11:25 ` Michael Heerdegen 2018-03-01 15:00 ` Eli Zaretskii 2018-03-02 14:11 ` Noam Postavsky 0 siblings, 2 replies; 57+ messages in thread From: Michael Heerdegen @ 2018-03-01 11:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, 30626 Michael Heerdegen <michael_heerdegen@web.de> writes: > The inner `stream-range' call results in a closure, and I guess that > this closure includes a reference to the outside VAL, which is the > stream from one step back (though there isn't a lexical reference to the > variable...does that make sense?) This is probably only a part of the puzzle. Some examples: test1.el: This is more or less the `stream-range' example reimplemented without dependencies so that it works in emacs -Q: #+begin_src emacs-lisp ;; -*- lexical-binding: t -*- (defun my-range (start) (let (forced val) (lambda () (unless forced (setq val (cons start (my-range (1+ start))) forced t)) val))) (defun my-test () (let ((range-object (my-range 1)) current-cons) (while (< (car (setq current-cons (funcall range-object))) (* 1000 1000)) (message "%d" (car current-cons)) (setq range-object (cdr current-cons))))) (my-test) #+end_src Loading this file test1.el crashes Emacs - but if you compile it, loading the compiled file doesn't crash. This is what I expected from my previous thoughts, because only the uncompiled closures include a reference to the outer VALs. test2.el: #+begin_src emacs-lisp ;; -*- lexical-binding: t -*- (require 'stream) (let ((stream (stream-range 1 1000000))) (stream-flush stream)) #+end_src This traverses the whole stream ignoring the elements. Loading this file crashes Emacs, no matter if compiled or not. I'm surprised it doesn't work even when compiled. test3.el: #+begin_src emacs-lisp ;; -*- lexical-binding: t -*- (require 'stream) (let ((stream (stream-range 1 1000000))) (while (not (stream-empty-p stream)) (cl-callf stream-rest stream))) #+end_src This is semantically exactly like test2.el, only the call to `stream-flush' has been replaced by literally writing out the definition. Nonetheless, the compiled file suddenly doesn't crash Emacs when loaded. Loading the uncompiled file test3.el still crashes. Seems that whether we get a crash or not depends on details in the implementation of lexical-binding. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-01 11:25 ` Michael Heerdegen @ 2018-03-01 15:00 ` Eli Zaretskii 2018-03-02 14:11 ` Noam Postavsky 1 sibling, 0 replies; 57+ messages in thread From: Eli Zaretskii @ 2018-03-01 15:00 UTC (permalink / raw) To: Michael Heerdegen; +Cc: npostavs, 30626 > From: Michael Heerdegen <michael_heerdegen@web.de> > Cc: npostavs@gmail.com, 30626@debbugs.gnu.org > Date: Thu, 01 Mar 2018 12:25:43 +0100 > > Seems that whether we get a crash or not depends on details in the > implementation of lexical-binding. Byte compilation doesn't just produce byte code, it also changes the code into an equivalent one, but "equivalence" in this context doesn't include side effects like stack usage. As an extreme example, consider a tail-recursive program that the byte compiler converts (or at least might convert theoretically) into a loop. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-01 11:25 ` Michael Heerdegen 2018-03-01 15:00 ` Eli Zaretskii @ 2018-03-02 14:11 ` Noam Postavsky 2018-03-02 15:06 ` Michael Heerdegen ` (2 more replies) 1 sibling, 3 replies; 57+ messages in thread From: Noam Postavsky @ 2018-03-02 14:11 UTC (permalink / raw) To: Michael Heerdegen; +Cc: Nicolas Petton, 30626 Michael Heerdegen <michael_heerdegen@web.de> writes: > #+begin_src emacs-lisp > ;; -*- lexical-binding: t -*- > > (require 'stream) > > (let ((stream (stream-range 1 1000000))) > (while (not (stream-empty-p stream)) > (cl-callf stream-rest stream))) > #+end_src > > This is semantically exactly like test2.el, only the call to > `stream-flush' has been replaced by literally writing out the > definition. Nonetheless, the compiled file suddenly doesn't crash Emacs > when loaded. Loading the uncompiled file test3.el still crashes. Aha, but the following also crashes, whether compiled or not: ;; -*- lexical-binding: t -*- (require 'stream) (let* ((stream0 (stream-range 1 1000000)) (stream stream0)) (while (not (stream-empty-p stream)) (cl-callf stream-rest stream))) So the problem is that the initial stream0 object can reach the entire unfolding stream as it goes, and just holding on to that reference is enough to keep the whole thing in memory. Now, I can see that letting stream0 automagically get access to the unfolded result can be an optimization in some cases, although in this case it's a pessimization. It could also affect the semantics if unfolding the stream has side effects, not sure if stream.el makes guarantees about that though. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-02 14:11 ` Noam Postavsky @ 2018-03-02 15:06 ` Michael Heerdegen 2018-03-02 15:43 ` Eli Zaretskii 2018-03-02 20:16 ` Nicolas Petton 2018-03-02 21:48 ` John Mastro 2 siblings, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2018-03-02 15:06 UTC (permalink / raw) To: Noam Postavsky; +Cc: Nicolas Petton, 30626 Noam Postavsky <npostavs@gmail.com> writes: > Aha, but the following also crashes, whether compiled or not: > > ;; -*- lexical-binding: t -*- > > (require 'stream) > > (let* ((stream0 (stream-range 1 1000000)) > (stream stream0)) > (while (not (stream-empty-p stream)) > (cl-callf stream-rest stream))) I guess you should have some awe if you write something like (stream-range 1 1000000) and keep in mind what happens with this gigantic thing. Though, could the object `stream0' references not already be garbage-collected when the loop has been entered, since there is no lexical reference to that variable there? BTW, it gets even worse if you append streams, since the original streams are not copied and magically unfolded when you generate elements from the concatenation. > Now, I can see that letting stream0 automagically get access to the > unfolded result can be an optimization in some cases, although in this > case it's a pessimization. It could also affect the semantics if > unfolding the stream has side effects, not sure if stream.el makes > guarantees about that though. AFAICT we make no such guarantees at all. When generating stream elements has side effects (which is not ideal, but it's not forbidden), then you must know what you are doing. In my experience, side effects often directly correlate with element generation - e.g. for a stream of search matches, a side effect is to set a variable to the position where to continue searching. These kind of side effects are not problematic. For non-trivial side-effects, dunno, never wanted something like that. But this problem also concerns other forms of delayed evaluation, including thunks, and generally everything you can do with closures. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-02 15:06 ` Michael Heerdegen @ 2018-03-02 15:43 ` Eli Zaretskii 0 siblings, 0 replies; 57+ messages in thread From: Eli Zaretskii @ 2018-03-02 15:43 UTC (permalink / raw) To: Michael Heerdegen; +Cc: nicolas, npostavs, 30626 > From: Michael Heerdegen <michael_heerdegen@web.de> > Date: Fri, 02 Mar 2018 16:06:49 +0100 > Cc: Nicolas Petton <nicolas@petton.fr>, 30626@debbugs.gnu.org > > > (let* ((stream0 (stream-range 1 1000000)) > > (stream stream0)) > > (while (not (stream-empty-p stream)) > > (cl-callf stream-rest stream))) > > I guess you should have some awe if you write something like > (stream-range 1 1000000) and keep in mind what happens with this > gigantic thing. Though, could the object `stream0' references not > already be garbage-collected when the loop has been entered, since there > is no lexical reference to that variable there? It's still referenced by the let* form, I think. But even if it didn't, you cannot rely on it being GC'ed, because Emacs triggers GC at times which are hard or even impossible to predict. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-02 14:11 ` Noam Postavsky 2018-03-02 15:06 ` Michael Heerdegen @ 2018-03-02 20:16 ` Nicolas Petton 2018-03-02 20:58 ` Nicolas Petton 2018-03-03 7:54 ` Michael Heerdegen 2018-03-02 21:48 ` John Mastro 2 siblings, 2 replies; 57+ messages in thread From: Nicolas Petton @ 2018-03-02 20:16 UTC (permalink / raw) To: Noam Postavsky, Michael Heerdegen; +Cc: 30626 [-- Attachment #1: Type: text/plain, Size: 894 bytes --] Noam Postavsky <npostavs@gmail.com> writes: > Now, I can see that letting stream0 automagically get access to the > unfolded result can be an optimization in some cases, although in this > case it's a pessimization. stream.el is at its core just an implementation of lazy-cons cells, so not letting stream0 get access to the result would mean changing the core implementation (I'm not necessarily against it). If we accept that `seq-elt', and other positional functions of seq.el should not work on streams, then I could rewrite stream.el to make it a positioned stream where previous elements are discarded after each element generation. However the list of supported functions from seq.el API would be significantly reduced. > It could also affect the semantics if unfolding the stream has side > effects, not sure if stream.el makes guarantees about that though. No, it doesn't. Nico [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 487 bytes --] ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-02 20:16 ` Nicolas Petton @ 2018-03-02 20:58 ` Nicolas Petton 2018-03-03 7:56 ` Michael Heerdegen 2018-03-03 7:54 ` Michael Heerdegen 1 sibling, 1 reply; 57+ messages in thread From: Nicolas Petton @ 2018-03-02 20:58 UTC (permalink / raw) To: Noam Postavsky, Michael Heerdegen; +Cc: 30626 [-- Attachment #1: Type: text/plain, Size: 1031 bytes --] Nicolas Petton <nicolas@petton.fr> writes: > If we accept that `seq-elt', and other positional functions of seq.el > should not work on streams, then I could rewrite stream.el to make it a > positioned stream where previous elements are discarded after each > element generation. However the list of supported functions from seq.el > API would be significantly reduced. I had something like the following in mind: (cl-defstruct nstream current next-function) (cl-defmethod nstream-next ((stream nstream)) (setf (nstream-current stream) (funcall (nstream-next-function stream) (nstream-current stream)))) (defun nstream-range (&optional start end step) (unless start (setq start 0)) (unless step (setq step 1)) (make-nstream :current start :next-function (lambda (cur) (if (equal cur end) nil (+ cur step))))) Cheers, Nico [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 487 bytes --] ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-02 20:58 ` Nicolas Petton @ 2018-03-03 7:56 ` Michael Heerdegen 0 siblings, 0 replies; 57+ messages in thread From: Michael Heerdegen @ 2018-03-03 7:56 UTC (permalink / raw) To: Nicolas Petton; +Cc: Noam Postavsky, 30626 Nicolas Petton <nicolas@petton.fr> writes: > I had something like the following in mind: > > (cl-defstruct nstream current next-function) > > (cl-defmethod nstream-next ((stream nstream)) > (setf (nstream-current stream) (funcall (nstream-next-function stream) > (nstream-current stream)))) > > (defun nstream-range (&optional start end step) > (unless start (setq start 0)) > (unless step (setq step 1)) > (make-nstream :current start > :next-function (lambda (cur) > (if (equal cur end) > nil > (+ cur step))))) This is an implementation of iterators, not streams. We already have an implementation of iterators in Emacs. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-02 20:16 ` Nicolas Petton 2018-03-02 20:58 ` Nicolas Petton @ 2018-03-03 7:54 ` Michael Heerdegen 2018-03-03 8:47 ` Nicolas Petton 1 sibling, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2018-03-03 7:54 UTC (permalink / raw) To: Nicolas Petton; +Cc: Noam Postavsky, 30626 Nicolas Petton <nicolas@petton.fr> writes: > stream.el is at its core just an implementation of lazy-cons cells, so > not letting stream0 get access to the result would mean changing the > core implementation (I'm not necessarily against it). Please don't do this. The semantics of streams is worth keeping. I make use of it in el-search and other stuff. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-03 7:54 ` Michael Heerdegen @ 2018-03-03 8:47 ` Nicolas Petton 0 siblings, 0 replies; 57+ messages in thread From: Nicolas Petton @ 2018-03-03 8:47 UTC (permalink / raw) To: Michael Heerdegen; +Cc: Noam Postavsky, 30626 [-- Attachment #1: Type: text/plain, Size: 214 bytes --] Michael Heerdegen <michael_heerdegen@web.de> writes: > Please don't do this. The semantics of streams is worth keeping. I > make use of it in el-search and other stuff. I'm not saying that I want this :) Nico [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 487 bytes --] ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-02 14:11 ` Noam Postavsky 2018-03-02 15:06 ` Michael Heerdegen 2018-03-02 20:16 ` Nicolas Petton @ 2018-03-02 21:48 ` John Mastro 2018-03-03 23:00 ` Noam Postavsky 2 siblings, 1 reply; 57+ messages in thread From: John Mastro @ 2018-03-02 21:48 UTC (permalink / raw) To: 30626; +Cc: Michael Heerdegen, Nicolas Petton, Noam Postavsky Noam Postavsky <npostavs@gmail.com> wrote: > Aha, but the following also crashes, whether compiled or not: > > ;; -*- lexical-binding: t -*- > > (require 'stream) > > (let* ((stream0 (stream-range 1 1000000)) > (stream stream0)) > (while (not (stream-empty-p stream)) > (cl-callf stream-rest stream))) > > So the problem is that the initial stream0 object can reach the entire > unfolding stream as it goes, and just holding on to that reference is > enough to keep the whole thing in memory. > > Now, I can see that letting stream0 automagically get access to the > unfolded result can be an optimization in some cases, although in this > case it's a pessimization. It could also affect the semantics if > unfolding the stream has side effects, not sure if stream.el makes > guarantees about that though. Clojure has/had a similar issue. They describe this scenario (having a live reference to the beginning of the stream, preventing GC from collecting it) as "holding onto the head" of the stream (in Clojure they're called lazy seqs). Their solution was what they call "locals clearing". The compiler tracks the lifetimes of local bindings and "clears" them (by setting them to nil/null) after their point of last use, e.g.: (let* ((stream0 (stream-range 1 1000000)) (stream stream0)) (setq stream0 nil) ;; <<< Inserted by compiler (while (not (stream-empty-p stream)) (cl-callf stream-rest stream))) If the code does reference stream0 later, locals clearing can't help you, but that's considered a "if it hurts, don't do it" situation. This probably isn't practical for Emacs, especially since it could only work for byte-compiled code, but thought the prior art may be of interest. John ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-02 21:48 ` John Mastro @ 2018-03-03 23:00 ` Noam Postavsky 2018-03-04 15:56 ` Noam Postavsky 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-03-03 23:00 UTC (permalink / raw) To: John Mastro; +Cc: Michael Heerdegen, Nicolas Petton, 30626 John Mastro <john.b.mastro@gmail.com> writes: > (let* ((stream0 (stream-range 1 1000000)) > (stream stream0)) > (setq stream0 nil) ;; <<< Inserted by compiler > (while (not (stream-empty-p stream)) > (cl-callf stream-rest stream))) > > If the code does reference stream0 later, locals clearing can't help > you, but that's considered a "if it hurts, don't do it" situation. > > This probably isn't practical for Emacs, especially since it could only > work for byte-compiled code, but thought the prior art may be of > interest. Not sure how doable this solution is, but the fact that it works only for byte-compiled code seems fine to me. The interpreted case is doomed to fail anyway since the interpreter doesn't prune redundant variables from closures. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-03 23:00 ` Noam Postavsky @ 2018-03-04 15:56 ` Noam Postavsky 2018-03-04 17:02 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-03-04 15:56 UTC (permalink / raw) To: John Mastro; +Cc: Michael Heerdegen, Nicolas Petton, 30626 Noam Postavsky <npostavs@gmail.com> writes: > John Mastro <john.b.mastro@gmail.com> writes: > >> (let* ((stream0 (stream-range 1 1000000)) >> (stream stream0)) >> (setq stream0 nil) ;; <<< Inserted by compiler >> (while (not (stream-empty-p stream)) >> (cl-callf stream-rest stream))) >> >> If the code does reference stream0 later, locals clearing can't help >> you, but that's considered a "if it hurts, don't do it" situation. >> >> This probably isn't practical for Emacs, especially since it could only >> work for byte-compiled code, but thought the prior art may be of >> interest. > > Not sure how doable this solution is, but the fact that it works only > for byte-compiled code seems fine to me. The interpreted case is doomed > to fail anyway since the interpreter doesn't prune redundant variables > from closures. Hmm, I think it won't work by itself though, just doing (stream-flush (stream-range 1 1000000)) also crashes, due to the head of the stream being referenced from the C stack somewhere (I can get the address from gdb, but I can't figure out how to get to the corresponding C variable from there). ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-04 15:56 ` Noam Postavsky @ 2018-03-04 17:02 ` Eli Zaretskii 2018-03-11 18:52 ` Noam Postavsky 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2018-03-04 17:02 UTC (permalink / raw) To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 > From: Noam Postavsky <npostavs@gmail.com> > Date: Sun, 04 Mar 2018 10:56:36 -0500 > Cc: Michael Heerdegen <michael_heerdegen@web.de>, > Nicolas Petton <nicolas@petton.fr>, 30626@debbugs.gnu.org > > Hmm, I think it won't work by itself though, just doing > > (stream-flush (stream-range 1 1000000)) > > also crashes, due to the head of the stream being referenced from the C > stack somewhere (I can get the address from gdb, but I can't figure out > how to get to the corresponding C variable from there). Did you try "info symbol ADDRESS"? (I'm not sure this will work for automatic variables, though.) You could also try "info locals" after "set print address on" and/or "set print symbol on". ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-04 17:02 ` Eli Zaretskii @ 2018-03-11 18:52 ` Noam Postavsky 2018-03-11 20:31 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-03-11 18:52 UTC (permalink / raw) To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 [-- Attachment #1: Type: text/plain, Size: 1319 bytes --] Eli Zaretskii <eliz@gnu.org> writes: >> also crashes, due to the head of the stream being referenced from the C >> stack somewhere (I can get the address from gdb, but I can't figure out >> how to get to the corresponding C variable from there). > > Did you try "info symbol ADDRESS"? (I'm not sure this will work for > automatic variables, though.) Doesn't seem to work. I guess it wouldn't work if the address was in the middle of an array either. > You could also try "info locals" after "set print address on" and/or > "set print symbol on". Those settings don't seem to help. I first guessed that the problem is due to saving function arguments during funcall, so I tried the following to check it: --- i/src/bytecode.c +++ w/src/bytecode.c @@ -387,7 +387,10 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth, make_number (nargs))); ptrdiff_t pushedargs = min (nonrest, nargs); for (ptrdiff_t i = 0; i < pushedargs; i++, args++) - PUSH (*args); + { + PUSH (*args); + *args = Qnil; + } if (nonrest < nargs) PUSH (Flist (nargs - nonrest, args)); else This did change the backtrace (from starting with mark_specpdl to mark_stack), meaning I did find one reference, but it still crashes, so there must be more. [-- Attachment #2: gdb log excerpts --] [-- Type: text/plain, Size: 7995 bytes --] No stack. Starting program: /home/npostavs/src/emacs/emacs-26/lucid/src/emacs -Q -batch -l elpa/packages/stream/stream.elc -l bug-30626-stream-crash/test5.elc [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 0x000000000060e9b5 in live_cons_holding (m=0x2f6e290, p=0x2f77620) at ../../src/alloc.c:4613 4613 return make_lisp_ptr (s, Lisp_Cons); #0 0x000000000060e9b5 in live_cons_holding (m=0x2f6e290, p=0x2f77620) at ../../src/alloc.c:4613 #1 0x000000000060e9ec in live_cons_p (m=0x2f6e290, p=0x2f77620) at ../../src/alloc.c:4622 #2 0x0000000000612f94 in mark_object (arg=XIL(0x2f77623)) at ../../src/alloc.c:6717 #3 0x0000000000611d4f in mark_vectorlike (ptr=0x2f78b50) at ../../src/alloc.c:6227 [...] #4850 0x0000000000612b42 in mark_object (arg=XIL(0x2efcb83)) at ../../src/alloc.c:6624 #4851 0x0000000000611d4f in mark_vectorlike (ptr=0x2e64c90) at ../../src/alloc.c:6227 #4852 0x0000000000612b42 in mark_object (arg=XIL(0x2e64c95)) at ../../src/alloc.c:6624 #4853 0x000000000060f3ce in mark_maybe_pointer (p=0x2e64c90) at ../../src/alloc.c:4936 #4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868) at ../../src/alloc.c:4985 #4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193 #4856 0x00000000006cdd75 in mark_one_thread (thread=0xe103e0 <main_thread>) at ../../src/thread.c:616 #4857 0x00000000006cdf0a in mark_threads_callback (ignore=0x0) at ../../src/thread.c:649 #4858 0x000000000060f4da in flush_stack_call_func (func=0x6cde77 <mark_threads_callback>, arg=0x0) at ../../src/alloc.c:5220 #4859 0x00000000006cdf3c in mark_threads () at ../../src/thread.c:656 #4860 0x00000000006114a7 in garbage_collect_1 (end=0x7fffffffa700) at ../../src/alloc.c:5997 #4861 0x0000000000611b94 in Fgarbage_collect () at ../../src/alloc.c:6168 #4862 0x000000000057999d in maybe_gc () at ../../src/lisp.h:4749 #4863 0x000000000063c2cb in Ffuncall (nargs=6, args=0x7fffffffa7f8) at ../../src/eval.c:2751 #4864 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e7aad4), vector=XIL(0x2e72715), maxdepth=make_number(18), args_template=make_number(768), nargs=3, args=0x7fffffffad20) at ../../src/bytecode.c:632 #4865 0x000000000063cfef in funcall_lambda (fun=XIL(0x1450e45), nargs=3, arg_vector=0x7fffffffad08) at ../../src/eval.c:2970 #4866 0x000000000063c402 in Ffuncall (nargs=4, args=0x7fffffffad00) at ../../src/eval.c:2771 #4867 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e7b2d4), vector=XIL(0x2fe2945), maxdepth=make_number(7), args_template=make_number(256), nargs=0, args=0x7fffffffb1d8) at ../../src/bytecode.c:632 #4868 0x000000000063cfef in funcall_lambda (fun=XIL(0x2fe2985), nargs=0, arg_vector=0x7fffffffb1d8) at ../../src/eval.c:2970 #4869 0x000000000063c402 in Ffuncall (nargs=1, args=0x7fffffffb1d0) at ../../src/eval.c:2771 #4870 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e769d4), vector=XIL(0x9d00a5), maxdepth=make_number(2), args_template=make_number(257), nargs=1, args=0x7fffffffb670) at ../../src/bytecode.c:632 #4871 0x000000000063cfef in funcall_lambda (fun=XIL(0x1626f05), nargs=1, arg_vector=0x7fffffffb668) at ../../src/eval.c:2970 #4872 0x000000000063c402 in Ffuncall (nargs=2, args=0x7fffffffb660) at ../../src/eval.c:2771 #4873 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e7f7b4), vector=XIL(0x1622fe5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffffbb10) at ../../src/bytecode.c:632 #4874 0x000000000063cfef in funcall_lambda (fun=XIL(0x1450f35), nargs=1, arg_vector=0x7fffffffbb08) at ../../src/eval.c:2970 #4875 0x000000000063c402 in Ffuncall (nargs=2, args=0x7fffffffbb00) at ../../src/eval.c:2771 #4876 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2ee3c94), vector=XIL(0x2e664f5), maxdepth=make_number(3), args_template=make_number(257), nargs=1, args=0x7fffffffbfa8) at ../../src/bytecode.c:632 #4877 0x000000000063cfef in funcall_lambda (fun=XIL(0x2e66515), nargs=1, arg_vector=0x7fffffffbfa0) at ../../src/eval.c:2970 #4878 0x000000000063c402 in Ffuncall (nargs=2, args=0x7fffffffbf98) at ../../src/eval.c:2771 #4879 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e91054), vector=XIL(0x1699ef5), maxdepth=make_number(4), args_template=XIL(0), nargs=0, args=0x0) at ../../src/bytecode.c:632 #4880 0x000000000068cb20 in Fbyte_code (bytestr=XIL(0x2e91054), vector=XIL(0x1699ef5), maxdepth=make_number(4)) at ../../src/bytecode.c:321 #4881 0x000000000063ac83 in eval_sub (form=XIL(0x2eeb413)) at ../../src/eval.c:2240 #4882 0x0000000000672b34 in readevalloop (readcharfun=XIL(0x68a0), infile0=0x7fffffffc6b0, sourcename=XIL(0x2e90f94), printflag=false, unibyte=XIL(0), readfun=XIL(0), start=XIL(0), end=XIL(0)) at ../../src/lread.c:2038 #4883 0x0000000000670e85 in Fload (file=XIL(0x2e8d884), noerror=XIL(0), nomessage=XIL(0xc090), nosuffix=XIL(0), must_suffix=XIL(0)) at ../../src/lread.c:1425 #4884 0x000000000063c945 in funcall_subr (subr=0xd7a420 <Sload>, numargs=3, args=0x7fffffffc8e8) at ../../src/eval.c:2856 #4885 0x000000000063c3be in Ffuncall (nargs=4, args=0x7fffffffc8e0) at ../../src/eval.c:2769 #4886 0x000000000068d950 in exec_byte_code (bytestr=XIL(0xadcc2c), vector=XIL(0xadcc4d), maxdepth=make_number(23), args_template=make_number(257), nargs=1, args=0x7fffffffd248) at ../../src/bytecode.c:632 #4887 0x000000000063cfef in funcall_lambda (fun=XIL(0xadcbfd), nargs=1, arg_vector=0x7fffffffd240) at ../../src/eval.c:2970 #4888 0x000000000063c402 in Ffuncall (nargs=2, args=0x7fffffffd238) at ../../src/eval.c:2771 #4889 0x000000000068d950 in exec_byte_code (bytestr=XIL(0xad7424), vector=XIL(0xad7445), maxdepth=make_number(21), args_template=make_number(0), nargs=0, args=0x7fffffffde58) at ../../src/bytecode.c:632 #4890 0x000000000063cfef in funcall_lambda (fun=XIL(0xad73f5), nargs=0, arg_vector=0x7fffffffde58) at ../../src/eval.c:2970 #4891 0x000000000063c402 in Ffuncall (nargs=1, args=0x7fffffffde50) at ../../src/eval.c:2771 #4892 0x000000000068d950 in exec_byte_code (bytestr=XIL(0xad6614), vector=XIL(0xad6635), maxdepth=make_number(12), args_template=make_number(0), nargs=0, args=0x7fffffffe480) at ../../src/bytecode.c:632 #4893 0x000000000063cfef in funcall_lambda (fun=XIL(0xad65e5), nargs=0, arg_vector=0x7fffffffe480) at ../../src/eval.c:2970 #4894 0x000000000063cc29 in apply_lambda (fun=XIL(0xad65e5), args=XIL(0), count=4) at ../../src/eval.c:2906 #4895 0x000000000063ae2a in eval_sub (form=XIL(0x149a8e3)) at ../../src/eval.c:2279 #4896 0x000000000063a19f in Feval (form=XIL(0x149a8e3), lexical=XIL(0)) at ../../src/eval.c:2054 #4897 0x000000000058150f in top_level_2 () at ../../src/keyboard.c:1119 #4898 0x00000000006384ce in internal_condition_case (bfun=0x5814ec <top_level_2>, handlers=XIL(0x5250), hfun=0x580ef1 <cmd_error>) at ../../src/eval.c:1332 #4899 0x0000000000581554 in top_level_1 (ignore=XIL(0)) at ../../src/keyboard.c:1127 #4900 0x00000000006379d7 in internal_catch (tag=XIL(0xc6f0), func=0x581511 <top_level_1>, arg=XIL(0)) at ../../src/eval.c:1097 #4901 0x000000000058143e in command_loop () at ../../src/keyboard.c:1088 #4902 0x00000000005809db in recursive_edit_1 () at ../../src/keyboard.c:695 #4903 0x0000000000580bd0 in Frecursive_edit () at ../../src/keyboard.c:766 #4904 0x000000000057e76b in main (argc=7, argv=0x7fffffffe9e8) at ../../src/emacs.c:1713 Cannot access memory at address 0x7ffffff7ef7f (gdb) frame 4854 #4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868) at ../../src/alloc.c:4985 4985 mark_maybe_pointer (*(void **) pp); (gdb) info locals pp = 0x7fffffffa968 "\220L\346\002" (gdb) info symbol 0x7fffffffa968 No symbol matches 0x7fffffffa968. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-11 18:52 ` Noam Postavsky @ 2018-03-11 20:31 ` Eli Zaretskii 2018-03-11 21:51 ` Noam Postavsky 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2018-03-11 20:31 UTC (permalink / raw) To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 > From: Noam Postavsky <npostavs@gmail.com> > Cc: michael_heerdegen@web.de, john.b.mastro@gmail.com, nicolas@petton.fr, 30626@debbugs.gnu.org > Date: Sun, 11 Mar 2018 14:52:22 -0400 > > This did change the backtrace (from starting with mark_specpdl to > mark_stack), meaning I did find one reference, but it still crashes, so > there must be more. If you have the address, you could first find the stack frame to which it belongs, right? Then go to that stack frame and type "info locals", which should give you the locals in that frame. One of them is your variable. It could also be a temporary slot, in which case disassembly of that frame's function should show it. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-11 20:31 ` Eli Zaretskii @ 2018-03-11 21:51 ` Noam Postavsky 2018-03-12 3:28 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-03-11 21:51 UTC (permalink / raw) To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 Eli Zaretskii <eliz@gnu.org> writes: >> From: Noam Postavsky <npostavs@gmail.com> >> Cc: michael_heerdegen@web.de, john.b.mastro@gmail.com, nicolas@petton.fr, 30626@debbugs.gnu.org >> Date: Sun, 11 Mar 2018 14:52:22 -0400 >> >> This did change the backtrace (from starting with mark_specpdl to >> mark_stack), meaning I did find one reference, but it still crashes, so >> there must be more. > > If you have the address, you could first find the stack frame to which > it belongs, right? Um, how do I do that part? ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-11 21:51 ` Noam Postavsky @ 2018-03-12 3:28 ` Eli Zaretskii 2018-03-13 1:59 ` Noam Postavsky 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2018-03-12 3:28 UTC (permalink / raw) To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 > From: Noam Postavsky <npostavs@gmail.com> > Cc: michael_heerdegen@web.de, john.b.mastro@gmail.com, nicolas@petton.fr, 30626@debbugs.gnu.org > Date: Sun, 11 Mar 2018 17:51:19 -0400 > > > If you have the address, you could first find the stack frame to which > > it belongs, right? > > Um, how do I do that part? By comparing the address with the value of $bp in each frame, I'd say. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-12 3:28 ` Eli Zaretskii @ 2018-03-13 1:59 ` Noam Postavsky 2018-03-13 16:06 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-03-13 1:59 UTC (permalink / raw) To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 Eli Zaretskii <eliz@gnu.org> writes: >> From: Noam Postavsky <npostavs@gmail.com> >> Cc: michael_heerdegen@web.de, john.b.mastro@gmail.com, nicolas@petton.fr, 30626@debbugs.gnu.org >> Date: Sun, 11 Mar 2018 17:51:19 -0400 >> >> > If you have the address, you could first find the stack frame to which >> > it belongs, right? >> >> Um, how do I do that part? > > By comparing the address with the value of $bp in each frame, I'd say. Hmm, I found a match, but it doesn't make any sense. #4851 0x0000000000611d4f in mark_vectorlike (ptr=0x2e64c90) at ../../src/alloc.c:6227 #4852 0x0000000000612b42 in mark_object (arg=XIL(0x2e64c95)) at ../../src/alloc.c:6624 #4853 0x000000000060f3ce in mark_maybe_pointer (p=0x2e64c90) at ../../src/alloc.c:4936 #4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868) at ../../src/alloc.c:4985 #4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193 (gdb) frame 4854 #4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868) at ../../src/alloc.c:4985 4985 mark_maybe_pointer (*(void **) pp); (gdb) p pp $28 = 0x7fffffffa968 "\220L\346\002" (gdb) frame 4864 #4864 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e7aad4), vector=XIL(0x2e72715), maxdepth=make_number(18), args_template=make_number(768), nargs=3, args=0x7fffffffad20) at ../../src/bytecode.c:632 632 TOP = Ffuncall (op + 1, &TOP); (gdb) p $rbp $29 = (void *) 0x7fffffffabd0 (gdb) p/x $rbp - $28 $32 = 0x268 (gdb) disas /s [...] 1180 CASE (Bbuffer_substring): 1181 { 1182 Lisp_Object v1 = POP; 0x000000000068fea4 <+13154>: mov -0x40(%rbp),%rax 0x000000000068fea8 <+13158>: lea -0x8(%rax),%rdx 0x000000000068feac <+13162>: mov %rdx,-0x40(%rbp) 0x000000000068feb0 <+13166>: mov (%rax),%rax 0x000000000068feb3 <+13169>: mov %rax,-0x268(%rbp) 1183 TOP = Fbuffer_substring (TOP, v1); 0x000000000068feba <+13176>: mov -0x268(%rbp),%rdx 0x000000000068fec1 <+13183>: mov -0x40(%rbp),%rax 0x000000000068fec5 <+13187>: mov %rdx,%rsi 0x000000000068fec8 <+13190>: mov (%rax),%rdi 0x000000000068fecb <+13193>: callq 0x627e0a <Fbuffer_substring> It can't be a buffer-substring arg, but that's the only reference to -0x268(%rbp) in that function. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-13 1:59 ` Noam Postavsky @ 2018-03-13 16:06 ` Eli Zaretskii 2018-03-14 0:09 ` Noam Postavsky 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2018-03-13 16:06 UTC (permalink / raw) To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 > From: Noam Postavsky <npostavs@gmail.com> > Cc: michael_heerdegen@web.de, john.b.mastro@gmail.com, nicolas@petton.fr, 30626@debbugs.gnu.org > Date: Mon, 12 Mar 2018 21:59:57 -0400 > > 4985 mark_maybe_pointer (*(void **) pp); > (gdb) p pp > $28 = 0x7fffffffa968 "\220L\346\002" Should you look at pp or at *pp? Also note that for Lisp objects that are marked you need to reset their mark bit before trying to determine their type and value. If none of the above helps, please walk me through the steps that led you to look at -0x268(%rbp), because I'm not sure I follow. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-13 16:06 ` Eli Zaretskii @ 2018-03-14 0:09 ` Noam Postavsky 2018-03-15 16:34 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-03-14 0:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 Eli Zaretskii <eliz@gnu.org> writes: > Should you look at pp or at *pp? I think it should be pp, but I'm not sure. The context: #4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868) at ../../src/alloc.c:4985 #4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193 mark_memory (void *start, void *end) { ... for (pp = start; (void *) pp < end; pp += GC_POINTER_ALIGNMENT) { mark_maybe_pointer (*(void **) pp); mark_maybe_object (*(Lisp_Object *) pp); } So the value of pp ranges over stack addresses and *pp would be the contents of the stack location. > Also note that for Lisp objects that are marked you need to reset > their mark bit before trying to determine their type and value. I think I'm looking for a C variable, and not a Lisp object (although the C variable presumably contains/points to a Lisp object). > If none of the above helps, please walk me through the steps that led > you to look at -0x268(%rbp), because I'm not sure I follow. Starting with the value of pp, I then go up looking for a close value of $rbp: (gdb) p pp $39 = 0x7fffffffa968 "\220L\346\002" (gdb) up #4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193 5193 mark_memory (bottom, end); (gdb) p $rbp $40 = (void *) 0x7fffffffa420 (gdb) up #4856 0x00000000006cdd75 in mark_one_thread (thread=0xe103e0 <main_thread>) at ../../src/thread.c:616 616 mark_stack (thread->m_stack_bottom, stack_top); (gdb) p $rbp $41 = (void *) 0x7fffffffa470 [...] (gdb) up #4863 0x000000000063c2cb in Ffuncall (nargs=6, args=0x7fffffffa7f8) at ../../src/eval.c:2751 2751 maybe_gc (); (gdb) p $rbp $48 = (void *) 0x7fffffffa780 (gdb) up #4864 0x000000000068d950 in exec_byte_code (bytestr=XIL(0x2e7aad4), vector=XIL(0x2e72715), maxdepth=make_number(18), args_template=make_number(768), nargs=3, args=0x7fffffffad20) at ../../src/bytecode.c:632 632 TOP = Ffuncall (op + 1, &TOP); (gdb) p $rbp $49 = (void *) 0x7fffffffabd0 Now I see that $rbp is higher than the target address, and the difference is 0x268, so the target location should be -0x268(%rbp). (gdb) p $rbp - 0x7fffffffa968 $52 = (void *) 0x268 Except something must be wrong in my reasoning, since the only ocurrences of -0x268(%rbp) are the buffer-string args, which could only hold integers or markers (neither of which could further point to long chains of objects). ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-14 0:09 ` Noam Postavsky @ 2018-03-15 16:34 ` Eli Zaretskii 2018-03-17 15:53 ` Noam Postavsky 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2018-03-15 16:34 UTC (permalink / raw) To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 > From: Noam Postavsky <npostavs@gmail.com> > Cc: michael_heerdegen@web.de, john.b.mastro@gmail.com, nicolas@petton.fr, 30626@debbugs.gnu.org > Date: Tue, 13 Mar 2018 20:09:17 -0400 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Should you look at pp or at *pp? > > I think it should be pp, but I'm not sure. The context: > > #4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868) > at ../../src/alloc.c:4985 > #4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", > end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193 > > mark_memory (void *start, void *end) > { > ... > for (pp = start; (void *) pp < end; pp += GC_POINTER_ALIGNMENT) > { > mark_maybe_pointer (*(void **) pp); > mark_maybe_object (*(Lisp_Object *) pp); > } > > So the value of pp ranges over stack addresses and *pp would be the > contents of the stack location. But the call to mark_maybe_pointer means that we consider pp to be a pointer (in)to a Lisp object. Anyway, wouldn't it be easier to look one frame lower? We have this: #4850 0x0000000000612b42 in mark_object (arg=XIL(0x2efcb83)) at ../../src/alloc.c:6624 #4851 0x0000000000611d4f in mark_vectorlike (ptr=0x2e64c90) at ../../src/alloc.c:6227 #4852 0x0000000000612b42 in mark_object (arg=XIL(0x2e64c95)) at ../../src/alloc.c:6624 #4853 0x000000000060f3ce in mark_maybe_pointer (p=0x2e64c90) at ../../src/alloc.c:4936 #4854 0x000000000060f452 in mark_memory (start=0x7fffffffa520, end=0x7fffffffe868) at ../../src/alloc.c:4985 #4855 0x000000000060f493 in mark_stack (bottom=0x7fffffffe868 "a\036h\364\377\177", end=0x7fffffffa520 "0\245\377\377\377\177") at ../../src/alloc.c:5193 In frame #4852, we have found an object, and we are marking it. Did you try looking at that object? With these caveats: > > Also note that for Lisp objects that are marked you need to reset > > their mark bit before trying to determine their type and value. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-15 16:34 ` Eli Zaretskii @ 2018-03-17 15:53 ` Noam Postavsky 2018-03-17 16:10 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-03-17 15:53 UTC (permalink / raw) To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 Eli Zaretskii <eliz@gnu.org> writes: > In frame #4852, we have found an object, and we are marking it. Did > you try looking at that object? With these caveats: > >> > Also note that for Lisp objects that are marked you need to reset >> > their mark bit before trying to determine their type and value. Okay, I think xpr takes care of that, right? (I've restarted the debug sessions a few times, so numbers may not match exactly) #4853 0x000000000060f429 in mark_maybe_pointer (p=0x2e64c90) at ../../src/alloc.c:4936 4936 mark_object (obj); (gdb) p obj $60 = XIL(0x2e64c95) (gdb) xpr Lisp_Vectorlike PVEC_NORMAL_VECTOR $61 = (struct Lisp_Vector *) 0x2e64c90 {XIL(0x2efcb63), make_number(1000000), XIL(0x2efcb53), XIL(0x2efcb73), XIL(0x2efcb83), XIL(0x20ab5b0), XIL(0xc090)} (gdb) p $61->contents[0] $62 = XIL(0x2efcb63) (gdb) xpr Lisp_Cons $63 = (struct Lisp_Cons *) 0x2efcb60 { [...] car = make_number(2369), [...] chain = 0x0 [...] (gdb) p $61->contents[1] $64 = make_number(1000000) (gdb) p $61->contents[2] $65 = XIL(0x2efcb53) (gdb) xpr Lisp_Cons [...] car = make_number(1), [...] chain = 0x0 [...] (gdb) p $61->contents[3] $67 = XIL(0x2efcb73) (gdb) xpr [...] car = XIL(0xc090), [...] chain = 0x0 [...] (gdb) p $61->contents[4] [...] car = XIL(0x2efc443), [...] chain = 0x0 [...] (gdb) p $61->contents[5] [...] $72 = (struct Lisp_Symbol *) 0x2e97ef0 "stream-range" (gdb) p $61->contents[6] [...] $74 = (struct Lisp_Symbol *) 0xdf89d0 <lispsym+49296> "t" It looks like a the lexical environment of a bytecode function, probably the initial stream, e.g., (stream-range 1 1000000) gives: (--stream-- #[256 "\211\203\303\242\207\303\242\204 \304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" [(1) 1000000 (1) (nil) (nil) stream-range t] 7 " ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ (fn &optional CHECK)"]) Not sure where to go next with this. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-17 15:53 ` Noam Postavsky @ 2018-03-17 16:10 ` Eli Zaretskii 2018-03-17 16:27 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2018-03-17 16:10 UTC (permalink / raw) To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 > From: Noam Postavsky <npostavs@gmail.com> > Cc: michael_heerdegen@web.de, john.b.mastro@gmail.com, nicolas@petton.fr, 30626@debbugs.gnu.org > Date: Sat, 17 Mar 2018 11:53:36 -0400 > > It looks like a the lexical environment of a bytecode function, probably > the initial stream, e.g., (stream-range 1 1000000) gives: > > (--stream-- #[256 "\211\203\303\242\207\303\242\204 \304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" > [(1) 1000000 (1) (nil) (nil) stream-range t] 7 " > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > (fn &optional CHECK)"]) > > Not sure where to go next with this. The goal was to find out which variable holds a reference to the entire long stream, right? So it sounds like a pointer to it is kept in an automatic variable on the stack of exec_byte_code, right? Which kinda makes sense, since the stream is still being processed, I think. Or am I confused? ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-17 16:10 ` Eli Zaretskii @ 2018-03-17 16:27 ` Eli Zaretskii 2018-03-17 17:28 ` Noam Postavsky 0 siblings, 1 reply; 57+ messages in thread From: Eli Zaretskii @ 2018-03-17 16:27 UTC (permalink / raw) To: npostavs; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 > Date: Sat, 17 Mar 2018 18:10:24 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: michael_heerdegen@web.de, john.b.mastro@gmail.com, nicolas@petton.fr, > 30626@debbugs.gnu.org > > > (--stream-- #[256 "\211\203\303\242\207\303\242\204 \304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" > > [(1) 1000000 (1) (nil) (nil) stream-range t] 7 " > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > (fn &optional CHECK)"]) > > > > Not sure where to go next with this. > > The goal was to find out which variable holds a reference to the > entire long stream, right? So it sounds like a pointer to it is kept > in an automatic variable on the stack of exec_byte_code, right? Which > kinda makes sense, since the stream is still being processed, I think. > > Or am I confused? Actually, there's still some mystery: if this object is a 7-element vector, where do all the other GC frame come from? Hmm... how long/deep is each of the cons cells in elements 1 through 4 of the vector? If they are deeply nested, then that's the answer we've been looking for, I think. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-17 16:27 ` Eli Zaretskii @ 2018-03-17 17:28 ` Noam Postavsky 2018-03-19 20:05 ` Eli Zaretskii 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-03-17 17:28 UTC (permalink / raw) To: Eli Zaretskii; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 Eli Zaretskii <eliz@gnu.org> writes: >> The goal was to find out which variable holds a reference to the >> entire long stream, right? So it sounds like a pointer to it is kept >> in an automatic variable on the stack of exec_byte_code, right? Which >> kinda makes sense, since the stream is still being processed, I >> think. Yeah, but which variable exactly? I'd like to find it and add 'X = Qnil;' to confirm we've found where it is. > Actually, there's still some mystery: if this object is a 7-element > vector, where do all the other GC frame come from? Hmm... how > long/deep is each of the cons cells in elements 1 through 4 of the > vector? If they are deeply nested, then that's the answer we've been > looking for, I think. It's a bit confusing because of the indirection: stream-range uses the stream-cons macro, which uses the stream-make macro, which uses the thunk-delay macro. I believe the end result is that the lexical environment of the resulting closure has access to the next stream-element in the chain, so the nesting depth is the length of the stream (i.e., 100000 in the example). Perhaps this example makes it clearer: (setq print-circle t) (let* ((s0 (stream-range 1 2)) (s1 (stream-rest s0))) (list s0 s1)) ;=> ((--stream-- #[256 "\211\203..." [(1) 2 (1) (t) ((1 . #1=(--stream-- #[256 "\211\203..." [(nil) (nil) nil t] 3 "\n\n(fn &optional CHECK)"]))) stream-range t] 7 "\n\n(fn &optional CHECK)"]) #1#) ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-17 17:28 ` Noam Postavsky @ 2018-03-19 20:05 ` Eli Zaretskii 0 siblings, 0 replies; 57+ messages in thread From: Eli Zaretskii @ 2018-03-19 20:05 UTC (permalink / raw) To: Noam Postavsky; +Cc: michael_heerdegen, john.b.mastro, nicolas, 30626 > From: Noam Postavsky <npostavs@gmail.com> > Cc: michael_heerdegen@web.de, john.b.mastro@gmail.com, nicolas@petton.fr, 30626@debbugs.gnu.org > Date: Sat, 17 Mar 2018 13:28:22 -0400 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> The goal was to find out which variable holds a reference to the > >> entire long stream, right? So it sounds like a pointer to it is kept > >> in an automatic variable on the stack of exec_byte_code, right? Which > >> kinda makes sense, since the stream is still being processed, I > >> think. > > Yeah, but which variable exactly? I'd like to find it and add 'X = > Qnil;' to confirm we've found where it is. I don't think you will be able to tell without stepping through the byte-code interpreter code, keeping track of what it stores where. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-28 10:58 ` Michael Heerdegen 2018-02-28 16:00 ` Eli Zaretskii @ 2019-04-25 3:20 ` Noam Postavsky 2019-04-25 5:19 ` Michael Heerdegen 1 sibling, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2019-04-25 3:20 UTC (permalink / raw) To: Michael Heerdegen; +Cc: Nicolas Petton, 30626 [-- Attachment #1: Type: text/plain, Size: 1382 bytes --] Michael Heerdegen <michael_heerdegen@web.de> writes: >> I don't have a quick answer for the general case, but I think it's a bug >> in stream.el that it's creating such large structures in the first >> place. As far as I understand it, the point of streams is to handle >> long lists by encoding them as >> >> (FIRST-VALUE . FUNCTION-TO-PRODUCE-REST-OF-LIST) > > Yes, that's exactly how it's implemented. When requesting more elements > from the stream, that becomes > > (FIRST-VALUE . > (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST)) > > When we loop over the string, the cons whose car is the FIRST-VALUE, > let's call it cons1, is immediately thrown away, and we continue with > > (SECOND-VALUE . FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST) Coming back to this again. I think I got lost in the weeds of the byte-code function objects before. The core problem is that streams are not exactly encoded like the above, because even after forcing it you don't have just a plain SECOND-VALUE stored in the stream. The original FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST stays around and keeps referencing all the code and all the following elements. So a possible solution is change the stream to get rid of the lambda part and just leave the computed value after it's forced. With the following patch (stream-flush (stream-range 1 1000000)) succeeds: [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: patch --] [-- Type: text/x-diff, Size: 2937 bytes --] From 5e7139618f1b4cebbe9785ea5a9303b80d1b2c92 Mon Sep 17 00:00:00 2001 From: Noam Postavsky <npostavs@users.sourceforge.net> Date: Wed, 24 Apr 2019 22:51:19 -0400 Subject: [PATCH] [WIP] Drop forced lambda's from stream (Bug#30626) Let the stream id distinguish between forced and unforced stream values, replacing (think-delay BODY) with just (lambda () BODY). When the value is forced, replace the lambda with its result. --- packages/stream/stream.el | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/packages/stream/stream.el b/packages/stream/stream.el index 3f6bc4b5b..fa7a3b520 100644 --- a/packages/stream/stream.el +++ b/packages/stream/stream.el @@ -65,18 +65,28 @@ (eval-when-compile (require 'cl-lib)) (require 'seq) -(require 'thunk) (eval-and-compile - (defconst stream--identifier '--stream-- - "Symbol internally used to identify streams.")) + (defconst stream--fresh-identifier '--stream-fresh-- + "Symbol internally used to streams whose head was not evaluated.") + (defconst stream--evald-identifier '--stream-evald-- + "Symbol internally used to streams whose head was evaluated.")) (defmacro stream-make (&rest body) "Return a stream built from BODY. BODY must return nil or a cons cell whose cdr is itself a stream." (declare (debug t)) - `(list ',stream--identifier (thunk-delay ,@body))) + `(list ',stream--fresh-identifier (lambda () ,@body))) + +(defun stream-force (stream) + (cond + ((eq (car-safe stream) stream--evald-identifier) + (cadr stream)) + ((eq (car-safe stream) stream--fresh-identifier) + (setf (car stream) stream--evald-identifier) + (setf (cadr stream) (funcall (cadr stream)))) + (t (signal 'wrong-type-argument (list 'streamp stream))))) (defmacro stream-cons (first rest) "Return a stream built from the cons of FIRST and REST. @@ -159,24 +169,26 @@ (defun stream-range (&optional start end step) (defun streamp (stream) "Return non-nil if STREAM is a stream, nil otherwise." - (eq (car-safe stream) stream--identifier)) + (let ((car (car-safe stream))) + (or (eq car stream--fresh-identifier) + (eq car stream--evald-identifier)))) (defun stream-empty () "Return a new empty stream." - (list stream--identifier (thunk-delay nil))) + (list stream--evald-identifier nil)) (defun stream-empty-p (stream) "Return non-nil if STREAM is empty, nil otherwise." - (null (thunk-force (cadr stream)))) + (null (cadr (stream-force stream)))) (defun stream-first (stream) "Return the first element of STREAM. Return nil if STREAM is empty." - (car (thunk-force (cadr stream)))) + (car (stream-force stream))) (defun stream-rest (stream) "Return a stream of all but the first element of STREAM." - (or (cdr (thunk-force (cadr stream))) + (or (cdr (stream-force stream)) (stream-empty))) (defun stream-append (&rest streams) -- 2.11.0 [-- Attachment #3: Type: text/plain, Size: 494 bytes --] The reason I didn't do this in thunk.el is that thunks are just bare lambdas, so there is no way to get rid it "from the inside", as it were. Whereas for streams, it's just a matter of list editing. Some additional things that I thought of changing, but I didn't yet: - stream--identifier vs just using '--stream-- directly, I don't see what's the benefit of indirection here. - stream-make should use cons instead of list (or maybe a struct?). - stream-empty should just be a constant. ^ permalink raw reply related [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2019-04-25 3:20 ` Noam Postavsky @ 2019-04-25 5:19 ` Michael Heerdegen 2019-05-10 13:20 ` Michael Heerdegen 0 siblings, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2019-04-25 5:19 UTC (permalink / raw) To: Noam Postavsky; +Cc: Nicolas Petton, 30626 Noam Postavsky <npostavs@gmail.com> writes: > Coming back to this again. I think I got lost in the weeds of the > byte-code function objects before. The core problem is that streams > are not exactly encoded like the above, because even after forcing it > you don't have just a plain SECOND-VALUE stored in the stream. The > original FUNCTION-TO-PRODUCE-MORE-REST-OF-LIST stays around and keeps > referencing all the code and all the following elements. So a > possible solution is change the stream to get rid of the lambda part > and just leave the computed value after it's forced. With the > following patch (stream-flush (stream-range 1 1000000)) succeeds: > > [Patch...] Works for me, and it makes sense. As a test case I recompiled el-search.el (it uses streams for several things) with your patch applied to stream.el, and it worked well. > Some additional things that I thought of changing, but I didn't yet: > - stream--identifier vs just using '--stream-- directly, I don't see > what's the benefit of indirection here. A matter of taste I guess. > - stream-make should use cons instead of list (or maybe a struct?). I think cons would be ok. Would a struct make things slower? > - stream-empty should just be a constant. Dunno if there are cases where this would be problematic, but I guess we could do this as well. Anyway, thanks for looking into this again, I like your solution. Maybe Nicolas can also chime in. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2019-04-25 5:19 ` Michael Heerdegen @ 2019-05-10 13:20 ` Michael Heerdegen 2019-05-25 20:29 ` Noam Postavsky 0 siblings, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2019-05-10 13:20 UTC (permalink / raw) To: Noam Postavsky; +Cc: Nicolas Petton, 30626 Michael Heerdegen <michael_heerdegen@web.de> writes: > > [Patch...] > > Works for me, and it makes sense. As a test case I recompiled > el-search.el (it uses streams for several things) with your patch > applied to stream.el, and it worked well. @Nicolas: If you are short on time, can we just install this patch? I've tested it for a while and it works well, and I'm quite sure it is harmless (doesn't change any semantics apart from fixing the crashes). > > - stream-make should use cons instead of list (or maybe a struct?). > > I think cons would be ok. Would a struct make things slower? > > > - stream-empty should just be a constant. > > Dunno if there are cases where this would be problematic, but I guess we > could do this as well. @Nicolas: Do you want us to care about this or do you want to have a look yourself? I don't want to hurry, I just don't want this to be forgotten. If you say you have time in four months, it's still ok. Thanks, Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2019-05-10 13:20 ` Michael Heerdegen @ 2019-05-25 20:29 ` Noam Postavsky 2019-05-26 0:32 ` Michael Heerdegen 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2019-05-25 20:29 UTC (permalink / raw) To: Michael Heerdegen; +Cc: Nicolas Petton, 30626 [-- Attachment #1: Type: text/plain, Size: 2457 bytes --] tags 30626 + patch quit Michael Heerdegen <michael_heerdegen@web.de> writes: >> > - stream-make should use cons instead of list (or maybe a struct?). >> >> I think cons would be ok. Would a struct make things slower? A struct might be slower, and cons has the advantage that the print output is more readable for humans too. E.g., with this code: (let ((s (stream-range 1 5))) (stream-flush s) s) ;; Using cons (patch in this message): (--stream-evald-- 1 --stream-evald-- 2 --stream-evald-- 3 --stream-evald-- 4 --stream-evald--) ;; Using list (previous patch): (--stream-evald-- (1 --stream-evald-- (2 --stream-evald-- (3 --stream-evald-- (4 --stream-evald-- nil))))) ;; I guess using a struct would look something like this: #(--stream-evald-- (1 . #(--stream-evald-- (2 . #(--stream-evald-- (3 . #(--stream-evald-- (4 . #(--stream-evald-- nil))))))))) ;; Using list with thunk (current, v2.2.4) (--stream-- #[256 "\211\203\007\0\303\242\207\303\242\204 \0\304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" [(1) 5 (1) (t) ((1 --stream-- #[256 "\211\203\007\0\303\242\207\303\242\204 \0\304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" [(2) 5 (1) (t) ((2 --stream-- #[256 "\211\203\007\0\303\242\207\303\242\204 \0\304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" [(3) 5 (1) (t) ((3 --stream-- #[256 "\211\203\007\0\303\242\207\303\242\204 \0\304\300\242\305\300\242\302\242\\\301\302\242#B\240\210\303\306\240\210\304\242\207" [(4) 5 (1) (t) ((4 --stream-- #[256 "\211\203\007\0\300\242\207\300\242\204\024\0\301\302\240\210\300\303\240\210\301\242\207" [(t) (nil) nil t] 3 " (fn &optional CHECK)"])) stream-range t] 7 " (fn &optional CHECK)"])) stream-range t] 7 " (fn &optional CHECK)"])) stream-range t] 7 " (fn &optional CHECK)"])) stream-range t] 7 " (fn &optional CHECK)"]) >> > - stream-empty should just be a constant. >> >> Dunno if there are cases where this would be problematic, but I guess we >> could do this as well. I've done this in the patch below. Passes all the tests, and I can't see why it would be problematic. > @Nicolas: Do you want us to care about this or do you want to have a > look yourself? I don't want to hurry, I just don't want this to be > forgotten. If you say you have time in four months, it's still ok. Not getting any response; I'll wait another week for comments and then push. [-- Attachment #2: patch --] [-- Type: text/plain, Size: 5752 bytes --] From 7b126616c87bf034c933de711befcd80a7ada3bb Mon Sep 17 00:00:00 2001 From: Noam Postavsky <npostavs@users.sourceforge.net> Date: Wed, 24 Apr 2019 22:51:19 -0400 Subject: [PATCH] Drop forced lambda's from stream (Bug#30626) Let the stream id distinguish between forced and unforced stream values. When the value is forced, replace the lambda with its result. This lets the lambda and anything it references be garbage collected. Change the representation of a stream from (--stream-- THUNK) to (--stream-fresh-- . (lambda () VALUE)) or (--stream-evald . VALUE). * packages/stream/stream.el (stream--identifier): Remove. (stream--fresh-identifier, stream--evald-identifier): New constants to replace it. (streamp): Check for new constants. (stream-make): Use cons and lambda instead of list and thunk-delay. (stream--force): New function. (stream-empty-p, stream-first, stream-rest): Use it. (stream-empty): New constant, return it from the function instead of creating a new one each time. * packages/stream/tests/stream-tests.el (stream-to-list): Remove. (stream-list-test): Use seq-into instead. --- packages/stream/stream.el | 41 +++++++++++++++++++++++++---------- packages/stream/tests/stream-tests.el | 12 ++-------- 2 files changed, 31 insertions(+), 22 deletions(-) diff --git a/packages/stream/stream.el b/packages/stream/stream.el index 3f6bc4b5b..9f73e8b86 100644 --- a/packages/stream/stream.el +++ b/packages/stream/stream.el @@ -1,6 +1,6 @@ ;;; stream.el --- Implementation of streams -*- lexical-binding: t -*- -;; Copyright (C) 2016 Free Software Foundation, Inc. +;; Copyright (C) 2016-2019 Free Software Foundation, Inc. ;; Author: Nicolas Petton <nicolas@petton.fr> ;; Keywords: stream, laziness, sequences @@ -41,7 +41,7 @@ ;; - ... ;; ;; All functions are prefixed with "stream-". -;; All functions are tested in test/automated/stream-tests.el +;; All functions are tested in tests/stream-tests.el ;; ;; Here is an example implementation of the Fibonacci numbers ;; implemented as in infinite stream: @@ -65,18 +65,30 @@ (eval-when-compile (require 'cl-lib)) (require 'seq) -(require 'thunk) (eval-and-compile - (defconst stream--identifier '--stream-- - "Symbol internally used to identify streams.")) + (defconst stream--fresh-identifier '--stream-fresh-- + "Symbol internally used to streams whose head was not evaluated.") + (defconst stream--evald-identifier '--stream-evald-- + "Symbol internally used to streams whose head was evaluated.")) (defmacro stream-make (&rest body) "Return a stream built from BODY. BODY must return nil or a cons cell whose cdr is itself a stream." (declare (debug t)) - `(list ',stream--identifier (thunk-delay ,@body))) + `(cons ',stream--fresh-identifier (lambda () ,@body))) + +(defun stream--force (stream) + "Evaluate and return the first cons cell of STREAM. +That value is the one passed to `stream-make'." + (cond + ((eq (car-safe stream) stream--evald-identifier) + (cdr stream)) + ((eq (car-safe stream) stream--fresh-identifier) + (setf (car stream) stream--evald-identifier) + (setf (cdr stream) (funcall (cdr stream)))) + (t (signal 'wrong-type-argument (list 'streamp stream))))) (defmacro stream-cons (first rest) "Return a stream built from the cons of FIRST and REST. @@ -159,24 +171,29 @@ (defun stream-range (&optional start end step) (defun streamp (stream) "Return non-nil if STREAM is a stream, nil otherwise." - (eq (car-safe stream) stream--identifier)) + (let ((car (car-safe stream))) + (or (eq car stream--fresh-identifier) + (eq car stream--evald-identifier)))) + +(defconst stream-empty (cons stream--evald-identifier nil) + "The empty stream.") (defun stream-empty () - "Return a new empty stream." - (list stream--identifier (thunk-delay nil))) + "Return the empty stream." + stream-empty) (defun stream-empty-p (stream) "Return non-nil if STREAM is empty, nil otherwise." - (null (thunk-force (cadr stream)))) + (null (cdr (stream--force stream)))) (defun stream-first (stream) "Return the first element of STREAM. Return nil if STREAM is empty." - (car (thunk-force (cadr stream)))) + (car (stream--force stream))) (defun stream-rest (stream) "Return a stream of all but the first element of STREAM." - (or (cdr (thunk-force (cadr stream))) + (or (cdr (stream--force stream)) (stream-empty))) (defun stream-append (&rest streams) diff --git a/packages/stream/tests/stream-tests.el b/packages/stream/tests/stream-tests.el index 021ed65cf..7487ef69b 100644 --- a/packages/stream/tests/stream-tests.el +++ b/packages/stream/tests/stream-tests.el @@ -1,6 +1,6 @@ ;;; stream-tests.el --- Unit tests for stream.el -*- lexical-binding: t -*- -;; Copyright (C) 2015, 2017 Free Software Foundation, Inc. +;; Copyright (C) 2015, 2017-2019 Free Software Foundation, Inc. ;; Author: Nicolas Petton <nicolas@petton.fr> @@ -29,14 +29,6 @@ (require 'stream) (require 'generator) (require 'cl-lib) -(defun stream-to-list (stream) - "Eagerly traverse STREAM and return a list of its elements." - (let (result) - (seq-do (lambda (elt) - (push elt result)) - stream) - (reverse result))) - (ert-deftest stream-empty-test () (should (streamp (stream-empty))) (should (stream-empty-p (stream-empty)))) @@ -240,7 +232,7 @@ (ert-deftest stream-range-test () (ert-deftest stream-list-test () (dolist (list '(nil '(1 2 3) '(a . b))) - (should (equal list (stream-to-list (stream list)))))) + (should (equal list (seq-into (stream list) 'list))))) (ert-deftest stream-seq-subseq-test () (should (stream-empty-p (seq-subseq (stream-range 2 10) 0 0))) -- 2.11.0 ^ permalink raw reply related [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2019-05-25 20:29 ` Noam Postavsky @ 2019-05-26 0:32 ` Michael Heerdegen 2019-05-26 0:40 ` Noam Postavsky 0 siblings, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2019-05-26 0:32 UTC (permalink / raw) To: Noam Postavsky; +Cc: Nicolas Petton, 30626 Noam Postavsky <npostavs@gmail.com> writes: > >> > - stream-make should use cons instead of list (or maybe a struct?). > >> > >> I think cons would be ok. Would a struct make things slower? > > A struct might be slower, and cons has the advantage that the print > output is more readable for humans too. E.g., with this code: [...] I see no reason not to switch to cons. Better readable, and less cons garbage than with `list'. Would you like to do this? > >> > - stream-empty should just be a constant. > >> > >> Dunno if there are cases where this would be problematic, but I > >> guess we > >> could do this as well. > > I've done this in the patch below. Passes all the tests, and I can't > see why it would be problematic. Looks good to me. Also passes my tests (which are all my stream.el uses, including el-search.el). > > @Nicolas: Do you want us to care about this or do you want to have a > > look yourself? I don't want to hurry, I just don't want this to be > > forgotten. If you say you have time in four months, it's still ok. > > Not getting any response; I'll wait another week for comments and then > push. Ok, let's do this. Thanks, Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2019-05-26 0:32 ` Michael Heerdegen @ 2019-05-26 0:40 ` Noam Postavsky 2019-05-26 1:15 ` Michael Heerdegen 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2019-05-26 0:40 UTC (permalink / raw) To: Michael Heerdegen; +Cc: Nicolas Petton, 30626 Michael Heerdegen <michael_heerdegen@web.de> writes: > I see no reason not to switch to cons. Better readable, and less cons > garbage than with `list'. > > Would you like to do this? Yes, just to be clear, the patch in my previous message did already include the switch to cons. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2019-05-26 0:40 ` Noam Postavsky @ 2019-05-26 1:15 ` Michael Heerdegen 2019-06-04 0:26 ` Noam Postavsky 0 siblings, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2019-05-26 1:15 UTC (permalink / raw) To: Noam Postavsky; +Cc: Nicolas Petton, 30626 Noam Postavsky <npostavs@gmail.com> writes: > Yes, just to be clear, the patch in my previous message did already > include the switch to cons. Ah - of course - sorry, I was already a bit tired, I read it but wasn't aware of it. Sorry, I had spent the time finding a bug in your patch that turned out was just my oversight of some file to recompile. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2019-05-26 1:15 ` Michael Heerdegen @ 2019-06-04 0:26 ` Noam Postavsky 0 siblings, 0 replies; 57+ messages in thread From: Noam Postavsky @ 2019-06-04 0:26 UTC (permalink / raw) To: Michael Heerdegen; +Cc: Nicolas Petton, 30626 tags 30626 fixed close 30626 quit Pushed to elpa. b5f4061db 2019-06-03T20:23:03-04:00 "Drop forced lambda's from stream (Bug#30626)" https://git.savannah.gnu.org/cgit/emacs/elpa.git/commit/?id=b5f4061db4226d1d49fcb0ff53db0424f359e344 > I had spent the time finding a bug in your patch that turned out was > just my oversight of some file to recompile. Oh, I hate it when that happens. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-27 18:08 ` Eli Zaretskii 2018-02-28 1:29 ` Noam Postavsky @ 2018-02-28 11:05 ` Michael Heerdegen 2018-02-28 13:20 ` Nicolas Petton 2018-03-01 10:44 ` Daniel Colascione 2 siblings, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2018-02-28 11:05 UTC (permalink / raw) To: Eli Zaretskii; +Cc: nicolas, 30626 Hello, I had written that > > (seq-doseq (_ (stream-range 1 1000000)) nil) crashes. CC'ing Nicolas, the author of stream.el. > What can we do instead in such cases? Stack-overflow protection > cannot work in GC, so you are shooting yourself in the foot by > creating such large recursive structures. By the time we get to GC, > where the problem will happen, it's too late, because the memory was > already allocated. > > Does anyone has a reasonable idea for avoiding the crash in such > programs? I would appreciate any effort to fix that, because it seems that currently streams are broken by design, and there is no way to fix that from the Lisp implementation. Yes, we could implement iterators instead of streams - that's what we get when we avoid the consing. But it's something different and not always an alternative, depending on what you want to do. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-28 11:05 ` Michael Heerdegen @ 2018-02-28 13:20 ` Nicolas Petton 0 siblings, 0 replies; 57+ messages in thread From: Nicolas Petton @ 2018-02-28 13:20 UTC (permalink / raw) To: Michael Heerdegen, Eli Zaretskii; +Cc: 30626 [-- Attachment #1: Type: text/plain, Size: 239 bytes --] Michael Heerdegen <michael_heerdegen@web.de> writes: > Hello, Hi, > I had written that > >> > (seq-doseq (_ (stream-range 1 1000000)) nil) > > crashes. CC'ing Nicolas, the author of stream.el. Thanks, I'll look into it. Cheers, Nico [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 487 bytes --] ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-27 18:08 ` Eli Zaretskii 2018-02-28 1:29 ` Noam Postavsky 2018-02-28 11:05 ` Michael Heerdegen @ 2018-03-01 10:44 ` Daniel Colascione 2018-03-01 15:51 ` Noam Postavsky 2 siblings, 1 reply; 57+ messages in thread From: Daniel Colascione @ 2018-03-01 10:44 UTC (permalink / raw) To: Eli Zaretskii, Michael Heerdegen; +Cc: 30626 On 02/27/2018 10:08 AM, Eli Zaretskii wrote: >> From: Michael Heerdegen <michael_heerdegen@web.de> >> Cc: bug-gnu-emacs@gnu.org, 30626@debbugs.gnu.org >> Date: Tue, 27 Feb 2018 13:08:59 +0100 >> >> #+begin_src emacs-lisp >> (seq-doseq (_ (stream-range 1 1000000)) nil) >> #+end_src >> >> Note that this is executed as a loop due how to streams are implemented, >> although the definition of `seq-doseq' looks recursive. But it seems >> that gc has a problem with the large number of conses created when >> processing that. > > What can we do instead in such cases? Stack-overflow protection > cannot work in GC, so you are shooting yourself in the foot by > creating such large recursive structures. By the time we get to GC, > where the problem will happen, it's too late, because the memory was > already allocated. > > Does anyone has a reasonable idea for avoiding the crash in such > programs? We need to fix GC being deeply recursive once and for all. Tweaking stack sizes on various platforms and trying to spot-fix GC for the occasional deeply recursive structure is annoying. Here's my proposal: Turn garbage_collect_1 into a queue-draining loop, initializing the object queue with the GC roots before draining it. We'll make mark_object put an object on this queue, turning the existing mark_object code into a mark_queued_object function. garbage_collect_1 will just call mark_queued_object in a loop; mark_queued_object can call mark_object, but since mark_object just enqueues an object and doesn't recurse, we can't exhaust the stack with deep object graphs. (We'll repurpose the mark bit to mean that the object is on the to-mark queue; by the time we fully drain the queue, just before we sweep, the mark bit will have the same meaning it does now.) We can't allocate memory to hold the queue during GC, so we'll have to pre-allocate it. We can implement the queue as a list of queue blocks, where each queue block is an array of 16k or so Lisp_Objects. During allocation, we'll just make sure we have one Lisp_Object queue-block slot for every non-self-representing Lisp object we allocate. Since we know that we'll have enough queue blocks for the worst GC case, we can have mark_object pull queue blocks from a free list, aborting if for some reason it ever runs out of queue blocks. (The previous paragraph guarantees we won't.) garbage_collect_1 will churn through these heap blocks and place each back on the free list after it's called mark_queued_object on every Lisp_Object in the queue block. In this way, in non-pathological cases of GC, we'll end up using the same few queue blocks over and over. That's a nice optimization, because we can MADV_DONTNEED unused queue blocks so the OS doesn't actually have to remember their contents. In this way, I think we can make the current GC model recursion-proof without drastically changing how we allocate Lisp objects. The additional memory requirements should be modest: it's basically one Lisp_Object per Lisp object allocated. The naive version of this scheme needs about 4.6MB of overhead on my current 20MB Emacs heap, but it should be possible to reduce the overhead significantly by taking advantage of the block allocation we do for conses and other types --- we can put whole blocks on the queue instead of pointers to individual block parts, so we can get away with a much smaller queue. Under this approach, the reserved-queue-block scheme would impose an overhead of somewhere around 1MB on the same heap. This amount of overhead seems reasonable. We may end up actually using less memory that we would for recursive mark_object stack invocation. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-01 10:44 ` Daniel Colascione @ 2018-03-01 15:51 ` Noam Postavsky 2018-03-01 16:54 ` Michael Heerdegen 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-03-01 15:51 UTC (permalink / raw) To: Daniel Colascione; +Cc: Michael Heerdegen, 30626 On Thu, Mar 1, 2018 at 5:44 AM, Daniel Colascione <dancol@dancol.org> wrote: > We need to fix GC being deeply recursive once and for all. Tweaking stack > sizes on various platforms and trying to spot-fix GC for the occasional > deeply recursive structure is annoying. Agreed, but could you please open a new thread for it? AFAICT, this bug thread is about streams.el functions producing structures proportional in size to the length of the entire stream, which is a bug in itself, regardless of whether or not the GC can handle the result. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-01 15:51 ` Noam Postavsky @ 2018-03-01 16:54 ` Michael Heerdegen 2018-03-01 17:15 ` Noam Postavsky 0 siblings, 1 reply; 57+ messages in thread From: Michael Heerdegen @ 2018-03-01 16:54 UTC (permalink / raw) To: Noam Postavsky; +Cc: 30626 Noam Postavsky <npostavs@gmail.com> writes: > Agreed, but could you please open a new thread for it? AFAICT, this > bug thread is about streams.el functions producing structures > proportional in size to the length of the entire stream, which is a > bug in itself, regardless of whether or not the GC can handle the > result. No, it is a feature: we want streams to produce these structures, streams are delayed lists, not just iterators. Lists are also potentially unlimited nested conses, and that's not a bug. Streams are the same realized with delayed conses. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-01 16:54 ` Michael Heerdegen @ 2018-03-01 17:15 ` Noam Postavsky 2018-03-02 7:08 ` Michael Heerdegen 0 siblings, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-03-01 17:15 UTC (permalink / raw) To: Michael Heerdegen; +Cc: 30626 On Thu, Mar 1, 2018 at 11:54 AM, Michael Heerdegen <michael_heerdegen@web.de> wrote: > Noam Postavsky <npostavs@gmail.com> writes: > >> Agreed, but could you please open a new thread for it? AFAICT, this >> bug thread is about streams.el functions producing structures >> proportional in size to the length of the entire stream, which is a >> bug in itself, regardless of whether or not the GC can handle the >> result. > > No, it is a feature: we want streams to produce these structures, > streams are delayed lists, not just iterators. Lists are also > potentially unlimited nested conses, and that's not a bug. Streams are > the same realized with delayed conses. Maybe I've misunderstood, but is it not the case that iterating over (stream-range 1 n) should require only a constant amount of memory, regardless of the value of n? ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-01 17:15 ` Noam Postavsky @ 2018-03-02 7:08 ` Michael Heerdegen 2018-03-02 13:01 ` Noam Postavsky 2018-03-02 13:04 ` Michael Heerdegen 0 siblings, 2 replies; 57+ messages in thread From: Michael Heerdegen @ 2018-03-02 7:08 UTC (permalink / raw) To: Noam Postavsky; +Cc: 30626 Noam Postavsky <npostavs@gmail.com> writes: > Maybe I've misunderstood, but is it not the case that iterating over > (stream-range 1 n) should require only a constant amount of memory, > regardless of the value of n? Depends on your definition of `require'. Like, for example, (dolist (i 1000000) (message "%S" (cons i (1+ i)))) each iteration step creates and discards a new cons (or a constant number of conses). Not a exceptional thing in Lisp. When iterating over (stream-range 1 n), the garbage collector seems to have problems with how the garbage is structured. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-02 7:08 ` Michael Heerdegen @ 2018-03-02 13:01 ` Noam Postavsky 2018-03-02 13:13 ` Michael Heerdegen 2018-03-02 13:04 ` Michael Heerdegen 1 sibling, 1 reply; 57+ messages in thread From: Noam Postavsky @ 2018-03-02 13:01 UTC (permalink / raw) To: Michael Heerdegen; +Cc: 30626 Michael Heerdegen <michael_heerdegen@web.de> writes: > Noam Postavsky <npostavs@gmail.com> writes: > >> Maybe I've misunderstood, but is it not the case that iterating over >> (stream-range 1 n) should require only a constant amount of memory, >> regardless of the value of n? > > Depends on your definition of `require'. Like, for example, > > (dolist (i 1000000) (message "%S" (cons i (1+ i)))) > > each iteration step creates and discards a new cons (or a constant > number of conses). Not a exceptional thing in Lisp. When iterating > over (stream-range 1 n), the garbage collector seems to have problems > with how the garbage is structured. Ah, so let me be more precise. Iterating over (stream-range 1 n) should require only a constant amount of *reachable* memory at any particular instant. So your example above is okay, but the following one would be not acceptable (I mean, it's fine if some random lisp code does that, but stream-range should not be creating such long lists, or equivalently large structures): (let ((list nil)) (dotimes (i 1000000) (push i list) (message "%S" (car list)))) ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-02 13:01 ` Noam Postavsky @ 2018-03-02 13:13 ` Michael Heerdegen 0 siblings, 0 replies; 57+ messages in thread From: Michael Heerdegen @ 2018-03-02 13:13 UTC (permalink / raw) To: Noam Postavsky; +Cc: 30626 Noam Postavsky <npostavs@gmail.com> writes: > Ah, so let me be more precise. Iterating over (stream-range 1 n) should > require only a constant amount of *reachable* memory at any particular > instant. So your example above is okay, but the following one would be > not acceptable (I mean, it's fine if some random lisp code does that, > but stream-range should not be creating such long lists, or equivalently > large structures): > > (let ((list nil)) > (dotimes (i 1000000) > (push i list) > (message "%S" (car list)))) Yes, absolutely agreed. Using streams, you can logically refer to the complete list of elements as one object, but the programmer must ensure that the referable list of generated elements doesn't get too large. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-03-02 7:08 ` Michael Heerdegen 2018-03-02 13:01 ` Noam Postavsky @ 2018-03-02 13:04 ` Michael Heerdegen 1 sibling, 0 replies; 57+ messages in thread From: Michael Heerdegen @ 2018-03-02 13:04 UTC (permalink / raw) To: Noam Postavsky; +Cc: 30626 Michael Heerdegen <michael_heerdegen@web.de> writes: > Noam Postavsky <npostavs@gmail.com> writes: > > > Maybe I've misunderstood, but is it not the case that iterating over > > (stream-range 1 n) should require only a constant amount of memory, > > regardless of the value of n? But in this regard, we have a problem with how lexical-binding is implemented for interpreted code. Nested thunks (as implemented in "thunk.el") accumulate useless variable bindings - e.g. (defun test () (thunk-force (thunk-delay (thunk-force (thunk-delay (thunk-force (thunk-delay (lambda () 1)))))))) (test) ==> #1=(closure ((check) (#:val . #1#) (#:forced . t) (check) (#:val . #1#) (#:forced . t) (check) (#:val . #1#) (#:forced . t) t) nil 1) The length of the variable list is equivalent to the number of thunk wrappers. I believe that these useless variable lists are responsible for the crashes of the uncompiled versions of the test files I had posted. I think this problem is different from the gc issue. Streams use nested thunks. Of course does thunk.el not explicitly add such variable lists to the result - this is how closures are built in interpreted code. For nested thunks these just add up. BTW, if you byte-compile the above `test' function, then (disassemble (test)) ==> byte code: args: nil 0 constant 1 1 return and this problem is gone. Michael. ^ permalink raw reply [flat|nested] 57+ messages in thread
* bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' 2018-02-27 11:39 ` Michael Heerdegen 2018-02-27 12:08 ` Michael Heerdegen @ 2018-02-27 18:00 ` Eli Zaretskii 1 sibling, 0 replies; 57+ messages in thread From: Eli Zaretskii @ 2018-02-27 18:00 UTC (permalink / raw) To: Michael Heerdegen; +Cc: 30626 > From: Michael Heerdegen <michael_heerdegen@web.de> > Cc: bug-gnu-emacs@gnu.org, 30626@debbugs.gnu.org > Date: Tue, 27 Feb 2018 12:39:47 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > I guess it's a stack overflow: that function recurses into > > subdirectories. > > > > To avoid such problems, the function should be rewritten to work by > > BFS, not DFS. > > Here is the backtrace I'm now able to produce. Looks like the crash > happens in gc: GC is deeply-recursive, and you have exacerbated that by using up a lot of stack. ^ permalink raw reply [flat|nested] 57+ messages in thread
end of thread, other threads:[~2019-06-04 0:26 UTC | newest] Thread overview: 57+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-02-27 9:22 bug#30626: 26.0.91; Crash when traversing a `stream-of-directory-files' Michael Heerdegen 2018-02-27 11:21 ` Eli Zaretskii 2018-02-27 11:39 ` Michael Heerdegen 2018-02-27 12:08 ` Michael Heerdegen 2018-02-27 18:08 ` Eli Zaretskii 2018-02-28 1:29 ` Noam Postavsky 2018-02-28 10:58 ` Michael Heerdegen 2018-02-28 16:00 ` Eli Zaretskii 2018-02-28 16:20 ` Michael Heerdegen 2018-02-28 17:22 ` Eli Zaretskii 2018-02-28 18:25 ` Michael Heerdegen 2018-03-01 11:25 ` Michael Heerdegen 2018-03-01 15:00 ` Eli Zaretskii 2018-03-02 14:11 ` Noam Postavsky 2018-03-02 15:06 ` Michael Heerdegen 2018-03-02 15:43 ` Eli Zaretskii 2018-03-02 20:16 ` Nicolas Petton 2018-03-02 20:58 ` Nicolas Petton 2018-03-03 7:56 ` Michael Heerdegen 2018-03-03 7:54 ` Michael Heerdegen 2018-03-03 8:47 ` Nicolas Petton 2018-03-02 21:48 ` John Mastro 2018-03-03 23:00 ` Noam Postavsky 2018-03-04 15:56 ` Noam Postavsky 2018-03-04 17:02 ` Eli Zaretskii 2018-03-11 18:52 ` Noam Postavsky 2018-03-11 20:31 ` Eli Zaretskii 2018-03-11 21:51 ` Noam Postavsky 2018-03-12 3:28 ` Eli Zaretskii 2018-03-13 1:59 ` Noam Postavsky 2018-03-13 16:06 ` Eli Zaretskii 2018-03-14 0:09 ` Noam Postavsky 2018-03-15 16:34 ` Eli Zaretskii 2018-03-17 15:53 ` Noam Postavsky 2018-03-17 16:10 ` Eli Zaretskii 2018-03-17 16:27 ` Eli Zaretskii 2018-03-17 17:28 ` Noam Postavsky 2018-03-19 20:05 ` Eli Zaretskii 2019-04-25 3:20 ` Noam Postavsky 2019-04-25 5:19 ` Michael Heerdegen 2019-05-10 13:20 ` Michael Heerdegen 2019-05-25 20:29 ` Noam Postavsky 2019-05-26 0:32 ` Michael Heerdegen 2019-05-26 0:40 ` Noam Postavsky 2019-05-26 1:15 ` Michael Heerdegen 2019-06-04 0:26 ` Noam Postavsky 2018-02-28 11:05 ` Michael Heerdegen 2018-02-28 13:20 ` Nicolas Petton 2018-03-01 10:44 ` Daniel Colascione 2018-03-01 15:51 ` Noam Postavsky 2018-03-01 16:54 ` Michael Heerdegen 2018-03-01 17:15 ` Noam Postavsky 2018-03-02 7:08 ` Michael Heerdegen 2018-03-02 13:01 ` Noam Postavsky 2018-03-02 13:13 ` Michael Heerdegen 2018-03-02 13:04 ` Michael Heerdegen 2018-02-27 18:00 ` Eli Zaretskii
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).