* bug#65858: mumi crashes
From: Maxim Cournoyer @ 2023-09-11  1:28 UTC
To: 65858; +Cc: guix-sysadmin

Hello,

Today, Mumi was found crashing, with the following errors in
/var/log/mumi.log:

--8<---------------cut here---------------start------------->8---
2023-09-10 11:49:01 GET GC Warning: Repeated allocation of very large block (appr. size 4194304):
2023-09-10 11:49:01 May lead to memory leak and poor performance
2023-09-10 12:02:37 Throw to key `match-error' with args `("match" "no matching pattern" (HEAD "63641"))'.
2023-09-10 12:06:00 Uncaught exception in fiber ##f:
2023-09-10 12:06:00 In ice-9/boot-9.scm:
2023-09-10 12:06:00   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2023-09-10 12:06:00 In fibers/web/server.scm:
2023-09-10 12:06:00    211:21  4 (_)
2023-09-10 12:06:00 In ice-9/suspendable-ports.scm:
2023-09-10 12:06:00      83:4  3 (write-bytes #<closed: file 7f4efcabc8c0> #vu8(60 33 …) …)
2023-09-10 12:06:00 In unknown file:
2023-09-10 12:06:00            2 (port-write #<closed: file 7f4efcabc8c0> #vu8(60 33 # …) …)
2023-09-10 12:06:00 In ice-9/boot-9.scm:
2023-09-10 12:06:00   1685:16  1 (raise-exception _ #:continuable? _)
2023-09-10 12:06:00   1685:16  0 (raise-exception _ #:continuable? _)
2023-09-10 12:06:00 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2023-09-10 12:06:00 In procedure fport_write: Broken pipe
2023-09-10 12:15:36 /graphql
--8<---------------cut here---------------end--------------->8---

And also:

--8<---------------cut here---------------start------------->8---
[...]
2023-09-10 22:23:07 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2023-09-10 22:23:07 In procedure fport_write: Broken pipe
2023-09-10 22:23:07 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2023-09-10 22:23:07 In procedure fport_write: Broken pipe
2023-09-10 22:23:07 Uncaught exception in fiber ##f:
2023-09-10 22:23:07 In ice-9/boot-9.scm:
2023-09-10 22:23:07   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2023-09-10 22:23:07 In fibers/web/server.scm:
2023-09-10 22:23:07    211:21  4 (_)
2023-09-10 22:23:07 In ice-9/suspendable-ports.scm:
2023-09-10 22:23:07      83:4  3 (write-bytes #<closed: file 7f1caec72d90> #vu8(60 33 …) …)
2023-09-10 22:23:07 In unknown file:
2023-09-10 22:23:07            2 (port-write #<closed: file 7f1caec72d90> #vu8(60 33 # …) …)
2023-09-10 22:23:07 In ice-9/boot-9.scm:
2023-09-10 22:23:07   1685:16  1 (raise-exception _ #:continuable? _)
2023-09-10 22:23:07 Uncaught exception in fiber ##f:
2023-09-10 22:23:07   1685:16  0 (raise-exception _ #:continuable? _)
2023-09-10 22:23:07 In ice-9/boot-9.scm:
2023-09-10 22:23:07   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2023-09-10 22:23:07 In fibers/web/server.scm:
2023-09-10 22:23:07 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2023-09-10 22:23:07 In procedure fport_write: Broken pipe
2023-09-10 22:23:07    211:21  4 (_)
2023-09-10 22:23:07 In ice-9/suspendable-ports.scm:
2023-09-10 22:23:07      83:4  3 (write-bytes #<closed: file 7f1cb6303e00> #vu8(60 33 …) …)
2023-09-10 22:23:07 In unknown file:
2023-09-10 22:23:07            2 (port-write #<closed: file 7f1cb6303e00> #vu8(60 33 # …) …)
2023-09-10 22:23:07 In ice-9/boot-9.scm:
2023-09-10 22:23:07   1685:16  1 (raise-exception _ #:continuable? _)
2023-09-10 22:23:07   1685:16  0 (raise-exception _ #:continuable? _)
2023-09-10 22:23:07 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2023-09-10 22:23:07 In procedure fport_write: Broken pipe
2023-09-10 22:23:07 Uncaught exception in fiber ##f:
2023-09-10 22:23:07 In ice-9/boot-9.scm:
2023-09-10 22:23:07   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2023-09-10 22:23:07 In fibers/web/server.scm:
2023-09-10 22:23:07    211:21  4 (_)
2023-09-10 22:23:07 In ice-9/suspendable-ports.scm:
2023-09-10 22:23:07      83:4  3 (write-bytes #<closed: file 7f1cac7101c0> #vu8(60 33 …) …)
2023-09-10 22:23:07 In unknown file:
2023-09-10 22:23:07            2 (port-write #<closed: file 7f1cac7101c0> #vu8(60 33 # …) …)
2023-09-10 22:23:07 In ice-9/boot-9.scm:
2023-09-10 22:23:07   1685:16  1 (raise-exception _ #:continuable? _)
2023-09-10 22:23:07   1685:16  0 (raise-exception _ #:continuable? _)
2023-09-10 22:23:07 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2023-09-10 22:23:07 In procedure fport_write: Broken pipe
--8<---------------cut here---------------end--------------->8---

I've restarted the 'mumi' service and it's up and running again, for
now.  Attached is the complete mumi log for reference.

[-- Attachment: mumi.log.gz, application/octet-stream, 123757 bytes --]

--
Thanks,
Maxim
* bug#65858: mumi crashes
From: Maxim Cournoyer @ 2023-10-08  3:13 UTC
To: 65858; +Cc: guix-sysadmin

Hello,

Today, Mumi (version 0.0.5-4.2453a5a) was crashing again, with the
website returning 502 Bad Gateway errors.  It appears to be the same
error:

--8<---------------cut here---------------start------------->8---
2023-10-07 11:48:52 In procedure fport_write: Broken pipe
2023-10-07 11:48:52 Uncaught exception in fiber ##f:
2023-10-07 11:48:52 In ice-9/boot-9.scm:
2023-10-07 11:48:52   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2023-10-07 11:48:52 In fibers/web/server.scm:
2023-10-07 11:48:52    211:21  4 (_)
2023-10-07 11:48:52 In ice-9/suspendable-ports.scm:
2023-10-07 11:48:52      83:4  3 (write-bytes #<closed: file 7f55f4a47070> #vu8(60 33 …) …)
2023-10-07 11:48:52 In unknown file:
2023-10-07 11:48:52            2 (port-write #<closed: file 7f55f4a47070> #vu8(60 33 # …) …)
2023-10-07 11:48:52 In ice-9/boot-9.scm:
2023-10-07 11:48:52   1685:16  1 (raise-exception _ #:continuable? _)
2023-10-07 11:48:52   1685:16  0 (raise-exception _ #:continuable? _)
2023-10-07 11:48:52 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2023-10-07 11:48:52 In procedure fport_write: Broken pipe
--8<---------------cut here---------------end--------------->8---

That's from /var/log/mumi.log.1.gz.  The worker was peaking at 1 to 4
CPUs of usage; here's what strace looked like at some point:

--8<---------------cut here---------------start------------->8---
rt_sigprocmask(SIG_BLOCK, ~[], [], 8) = 0
clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7f831b49a910, parent_tid=0x7f831b49a910, exit_signal=0, stack=0x7f831ac9a000, stack_size=0x7fff00, tls=0x7f831b49a640} => {parent_tid=[87808]}, 88) = 87808
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0x7f8328b87f74, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = 0
futex(0x7f831e3177c0, FUTEX_WAKE_PRIVATE, 1) = 0
rt_sigprocmask(SIG_BLOCK, ~[], [], 8) = 0
clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7f8319c97910, parent_tid=0x7f8319c97910, exit_signal=0, stack=0x7f8319497000, stack_size=0x7fff00, tls=0x7f8319c97640} => {parent_tid=[87809]}, 88) = 87809
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0x7f8328b87f70, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = 0
futex(0x7f831e317780, FUTEX_WAKE_PRIVATE, 1) = 0
rt_sigprocmask(SIG_BLOCK, ~[], [], 8) = 0
clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7f831ac99910, parent_tid=0x7f831ac99910, exit_signal=0, stack=0x7f831a499000, stack_size=0x7fff00, tls=0x7f831ac99640} => {parent_tid=[87810]}, 88) = 87810
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0x7f8328b87f74, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f831e317740, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f8329190c80, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f8329190c80, FUTEX_WAKE_PRIVATE, 1) = 0
rt_sigprocmask(SIG_BLOCK, ~[], [], 8) = 0
clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7f831a498910, parent_tid=0x7f831a498910, exit_signal=0, stack=0x7f8319c98000, stack_size=0x7fff00, tls=0x7f831a498640} => {parent_tid=[87811]}, 88) = 87811
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0x7f8328b87f70, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f831e317700, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f8328b87f74, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGPWR {si_signo=SIGPWR, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigsuspend(~[INT QUIT ABRT BUS SEGV TERM XCPU RTMIN RT_1], 8) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGXCPU {si_signo=SIGXCPU, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
rt_sigreturn({mask=~[INT QUIT ABRT BUS KILL SEGV TERM STOP RTMIN RT_1]}) = -1 EINTR (Interrupted system call)
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigreturn({mask=[]}) = 202
futex(0x7f8328b87f74, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGPWR {si_signo=SIGPWR, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigsuspend(~[INT QUIT ABRT BUS SEGV TERM XCPU RTMIN RT_1], 8) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGXCPU {si_signo=SIGXCPU, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
rt_sigreturn({mask=~[INT QUIT ABRT BUS KILL SEGV TERM STOP RTMIN RT_1]}) = -1 EINTR (Interrupted system call)
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigreturn({mask=[]}) = 202
futex(0x7f8328b87f74, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGPWR {si_signo=SIGPWR, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigsuspend(~[INT QUIT ABRT BUS SEGV TERM XCPU RTMIN RT_1], 8) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGXCPU {si_signo=SIGXCPU, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
rt_sigreturn({mask=~[INT QUIT ABRT BUS KILL SEGV TERM STOP RTMIN RT_1]}) = -1 EINTR (Interrupted system call)
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigreturn({mask=[]}) = 202
futex(0x7f8328b87f74, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGPWR {si_signo=SIGPWR, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigsuspend(~[INT QUIT ABRT BUS SEGV TERM XCPU RTMIN RT_1], 8) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGXCPU {si_signo=SIGXCPU, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
rt_sigreturn({mask=~[INT QUIT ABRT BUS KILL SEGV TERM STOP RTMIN RT_1]}) = -1 EINTR (Interrupted system call)
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigreturn({mask=[]}) = 202
futex(0x7f8328b87f74, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGPWR {si_signo=SIGPWR, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigsuspend(~[INT QUIT ABRT BUS SEGV TERM XCPU RTMIN RT_1], 8) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGXCPU {si_signo=SIGXCPU, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
rt_sigreturn({mask=~[INT QUIT ABRT BUS KILL SEGV TERM STOP RTMIN RT_1]}) = -1 EINTR (Interrupted system call)
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigreturn({mask=[]}) = 202
futex(0x7f8328b87f74, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGPWR {si_signo=SIGPWR, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigsuspend(~[INT QUIT ABRT BUS SEGV TERM XCPU RTMIN RT_1], 8) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGXCPU {si_signo=SIGXCPU, si_code=SI_TKILL, si_pid=75049, si_uid=940} ---
rt_sigreturn({mask=~[INT QUIT ABRT BUS KILL SEGV TERM STOP RTMIN RT_1]}) = -1 EINTR (Interrupted system call)
futex(0x7f832917fce0, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigreturn({mask=[]}) = 202
futex(0x7f8328b87f74, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = 0
futex(0x7f831e317700, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f8328b87f70, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = 0
futex(0x7f831e317780, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7f8328b87f74, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = 0
futex(0x7f831e3177c0, FUTEX_WAKE_PRIVATE, 1) = 0
newfstatat(AT_FDCWD, "/var/mumi/db/mumi.xapian", {st_mode=S_IFDIR|0755, st_size=144, ...}, 0) = 0
newfstatat(AT_FDCWD, "/var/mumi/db/mumi.xapian/iamchert", 0x7ffcd7e8ec60, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/var/mumi/db/mumi.xapian/iamglass", {st_mode=S_IFREG|0644, st_size=155, ...}, 0) = 0
newfstatat(AT_FDCWD, "/var/mumi/db/mumi.xapian/postlist.glass", {st_mode=S_IFREG|0644, st_size=389939200, ...}, 0) = 0
openat(AT_FDCWD, "/var/mumi/db/mumi.xapian/flintlock", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 10
fcntl(10, F_OFD_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}) = 0
--8<---------------cut here---------------end--------------->8---

Restarting the service resolved it, for now.

--
Thanks,
Maxim
* bug#65858: mumi crashes
From: Arun Isaac @ 2023-10-09 23:02 UTC
To: Maxim Cournoyer, 65858; +Cc: guix-sysadmin

Hi Maxim,

Thanks for monitoring mumi.  I'm not sure why these mumi crashes are
happening.  I can think of two leads to start working on.

1. I see many lines such as those below that warn of garbage-collection
issues with "repeated allocation of very large blocks".  I am not sure,
but this sounds like some limitation of the Guile garbage collector.  I
suspect these warnings arise because the guile-email parser thrashes
the garbage collector too much.  I could look into rewriting parts of
guile-email to be more careful with memory use.

--8<---------------cut here---------------start------------->8---
2023-09-09 16:21:00 GC Warning: Repeated allocation of very large block (appr. size 16777216):
2023-09-09 16:21:00 May lead to memory leak and poor performance
2023-09-09 16:30:24 GC Warning: Repeated allocation of very large block (appr. size 16138240):
2023-09-09 16:30:24 May lead to memory leak and poor performance
2023-09-09 16:39:11 GC Warning: Repeated allocation of very large block (appr. size 16781312):
2023-09-09 16:39:11 May lead to memory leak and poor performance
--8<---------------cut here---------------end--------------->8---

2. If there is a specific query that causes mumi to crash, I'd like to
find out.  This could be a GraphQL query.  Unfortunately, the actual
GraphQL document lives in the query part of the URI, and we don't log
the query part.  I can patch mumi to log the query part.

WDYT?

Regards,
Arun
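To make point 1 above concrete: below is a minimal, hypothetical
sketch of what "being more careful with memory use" could look like in
Guile.  It is not guile-email's actual code; the procedure name and
the strategy of one shared, geometrically grown buffer are assumptions
for illustration.

--8<---------------cut here---------------start------------->8---
(use-modules (rnrs bytevectors)
             (ice-9 binary-ports))

;; One shared buffer, reused across calls.  Growing it geometrically
;; means the "very large block" allocations that libgc warns about
;; happen a handful of times per process, not once per parsed message.
;; (Not safe for concurrent fibers as written; a real version would
;; need a per-fiber or pooled buffer.)
(define %buffer (make-bytevector (* 1024 1024)))

(define (read-whole-body port)          ;hypothetical helper
  (let loop ((filled 0))
    ;; Double the shared buffer when it is full.
    (when (= filled (bytevector-length %buffer))
      (let ((bigger (make-bytevector (* 2 (bytevector-length %buffer)))))
        (bytevector-copy! %buffer 0 bigger 0 filled)
        (set! %buffer bigger)))
    ;; Read directly into the shared buffer, past the bytes read so far.
    (let ((n (get-bytevector-n! port %buffer filled
                                (- (bytevector-length %buffer) filled))))
      (if (eof-object? n)
          (let ((result (make-bytevector filled)))
            ;; Copy out only the bytes actually read.
            (bytevector-copy! %buffer 0 result 0 filled)
            result)
          (loop (+ filled n))))))
--8<---------------cut here---------------end--------------->8---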
* bug#65858: mumi crashes
From: Maxim Cournoyer @ 2023-10-10  3:51 UTC
To: Arun Isaac; +Cc: 65858, guix-sysadmin

Hi Arun,

Arun Isaac <arunisaac@systemreboot.net> writes:

> Hi Maxim,
>
> Thanks for monitoring mumi.  I'm not sure why these mumi crashes are
> happening.  I can think of two leads to start working on.
>
> 1. I see many lines such as those below that warn of
> garbage-collection issues with "repeated allocation of very large
> blocks".  I am not sure, but this sounds like some limitation of the
> Guile garbage collector.  I suspect these warnings arise because the
> guile-email parser thrashes the garbage collector too much.  I could
> look into rewriting parts of guile-email to be more careful with
> memory use.
>
> 2023-09-09 16:21:00 GC Warning: Repeated allocation of very large block (appr. size 16777216):
> 2023-09-09 16:21:00 May lead to memory leak and poor performance
> 2023-09-09 16:30:24 GC Warning: Repeated allocation of very large block (appr. size 16138240):
> 2023-09-09 16:30:24 May lead to memory leak and poor performance
> 2023-09-09 16:39:11 GC Warning: Repeated allocation of very large block (appr. size 16781312):
> 2023-09-09 16:39:11 May lead to memory leak and poor performance

Perhaps that could be related, but I have no experience of my own with
this to corroborate it, unfortunately.

> 2. If there is a specific query that causes mumi to crash, I'd like to
> find out.  This could be a GraphQL query.  Unfortunately, the actual
> GraphQL document lives in the query part of the URI, and we don't log
> the query part.  I can patch mumi to log the query part.
>
> WDYT?

I think adding more logging to narrow down where the problem is
happening would be a fine action to take here!

--
Thanks,
Maxim
* bug#65858: mumi crashes
From: Arun Isaac @ 2023-10-10 22:57 UTC
To: Maxim Cournoyer; +Cc: 65858, guix-sysadmin

Hi Maxim,

I have made a number of changes to mumi and reconfigured berlin with
the latest mumi.  Here is a quick summary of the main changes:

- We now log the complete URI and the response code for every request
  to mumi.
- We now handle HEAD requests correctly.  This should eliminate some
  of the crashes we saw in the mumi log.

Regards,
Arun
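For context, request logging of this shape can be done with a small
wrapper around the handler.  The sketch below uses Guile's stock (web
...) modules and only illustrates the idea; it is not necessarily how
the mumi commits implement it.

--8<---------------cut here---------------start------------->8---
(use-modules (srfi srfi-11)             ;let-values
             (web request)
             (web response)
             (web uri))

;; Wrap a (request body) -> (values response response-body) handler so
;; that the method, the full URI (path *and* query, where the GraphQL
;; document lives) and the response code are logged for every request.
(define (with-request-logging handler)
  (lambda (request body)
    (let-values (((response response-body) (handler request body)))
      (let ((uri (request-uri request)))
        (format (current-error-port) "~a ~a~a ~a~%"
                (request-method request)
                (uri-path uri)
                (if (uri-query uri)     ;keep the query part, if any
                    (string-append "?" (uri-query uri))
                    "")
                (response-code response)))
      (values response response-body))))
--8<---------------cut here---------------end--------------->8---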
* bug#65858: mumi crashes
From: Maxim Cournoyer @ 2023-10-11  1:29 UTC
To: Arun Isaac; +Cc: 65858, guix-sysadmin

Hi Arun,

Arun Isaac <arunisaac@systemreboot.net> writes:

> Hi Maxim,
>
> I have made a number of changes to mumi and reconfigured berlin with
> the latest mumi.  Here is a quick summary of the main changes:
>
> - We now log the complete URI and the response code for every request
>   to mumi.
> - We now handle HEAD requests correctly.  This should eliminate some
>   of the crashes we saw in the mumi log.

Thanks!  Let's keep an eye on things.

--
Thanks,
Maxim
* bug#65858: mumi crashes
From: Maxim Cournoyer @ 2023-10-24  4:26 UTC
To: Arun Isaac; +Cc: 65858, guix-sysadmin

Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Hi Arun,
>
> Arun Isaac <arunisaac@systemreboot.net> writes:
>
>> Hi Maxim,
>>
>> I have made a number of changes to mumi and reconfigured berlin with
>> the latest mumi.  Here is a quick summary of the main changes:
>>
>> - We now log the complete URI and the response code for every request
>>   to mumi.
>> - We now handle HEAD requests correctly.  This should eliminate some
>>   of the crashes we saw in the mumi log.
>
> Thanks!  Let's keep an eye on things.

Here's a fresh crash (on berlin):

--8<---------------cut here---------------start------------->8---
2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065806%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
2023-10-24 06:22:58 GET /issue/29433/attachment/1/ 200
2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065853%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065869%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
2023-10-24 06:23:15 GET Uncaught exception in task:
2023-10-24 06:23:15 In procedure port-auxiliary-write-buffer: Wrong type argument in position 1 (expecting open port): #<closed: file 7f2ac30077e0>
2023-10-24 06:23:15 In fibers.scm:
2023-10-24 06:23:15     172:8  6 (_)
2023-10-24 06:23:15 In ice-9/boot-9.scm:
2023-10-24 06:23:15   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
2023-10-24 06:23:15 In fibers/web/server.scm:
2023-10-24 06:23:15    214:25  4 (_)
2023-10-24 06:23:15 In ice-9/suspendable-ports.scm:
2023-10-24 06:23:15      83:4  3 (write-bytes #<closed: file 7f2abfe0e770> #vu8(47 42 …) …)
2023-10-24 06:23:15 In unknown file:
2023-10-24 06:23:15            2 (port-write #<closed: file 7f2abfe0e770> #vu8(47 42 # …) …)
2023-10-24 06:23:15 In ice-9/boot-9.scm:
2023-10-24 06:23:15   1685:16  1 (raise-exception _ #:continuable? _)
2023-10-24 06:23:15   1685:16  0 (raise-exception _ #:continuable? _)
2023-10-24 06:23:15 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
2023-10-24 06:23:15 In procedure fport_write: Broken pipe
--8<---------------cut here---------------end--------------->8---

--
Thanks,
Maxim
* bug#65858: mumi crashes
From: Christopher Baines @ 2023-10-24  7:13 UTC
To: Maxim Cournoyer; +Cc: 65858, arunisaac, guix-sysadmin

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Hi,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
>> Hi Arun,
>>
>> Arun Isaac <arunisaac@systemreboot.net> writes:
>>
>>> Hi Maxim,
>>>
>>> I have made a number of changes to mumi and reconfigured berlin with
>>> the latest mumi.  Here is a quick summary of the main changes:
>>>
>>> - We now log the complete URI and the response code for every
>>>   request to mumi.
>>> - We now handle HEAD requests correctly.  This should eliminate some
>>>   of the crashes we saw in the mumi log.
>>
>> Thanks!  Let's keep an eye on things.
>
> Here's a fresh crash (on berlin):
>
> 2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065806%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
> 2023-10-24 06:22:58 GET /issue/29433/attachment/1/ 200
> 2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065853%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
> 2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065869%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
> 2023-10-24 06:23:15 GET Uncaught exception in task:
> 2023-10-24 06:23:15 In procedure port-auxiliary-write-buffer: Wrong type argument in position 1 (expecting open port): #<closed: file 7f2ac30077e0>
> 2023-10-24 06:23:15 In fibers.scm:
> 2023-10-24 06:23:15     172:8  6 (_)
> 2023-10-24 06:23:15 In ice-9/boot-9.scm:
> 2023-10-24 06:23:15   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
> 2023-10-24 06:23:15 In fibers/web/server.scm:
> 2023-10-24 06:23:15    214:25  4 (_)
> 2023-10-24 06:23:15 In ice-9/suspendable-ports.scm:
> 2023-10-24 06:23:15      83:4  3 (write-bytes #<closed: file 7f2abfe0e770> #vu8(47 42 …) …)
> 2023-10-24 06:23:15 In unknown file:
> 2023-10-24 06:23:15            2 (port-write #<closed: file 7f2abfe0e770> #vu8(47 42 # …) …)
> 2023-10-24 06:23:15 In ice-9/boot-9.scm:
> 2023-10-24 06:23:15   1685:16  1 (raise-exception _ #:continuable? _)
> 2023-10-24 06:23:15   1685:16  0 (raise-exception _ #:continuable? _)
> 2023-10-24 06:23:15 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> 2023-10-24 06:23:15 In procedure fport_write: Broken pipe

I think this is kind of expected.  If nginx hits its
proxy_read_timeout, it will return a 504 to the client and, I think,
close the connection to Mumi.  So what you're seeing here is Mumi
trying to respond on a connection that nginx has already closed.
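For reference, nginx's proxy_read_timeout defaults to 60 seconds.  If
slow responses rather than hung ones turn out to be the cause, the
timeout could be raised in the mumi location block; on a Guix system
that might look roughly like the sketch below.  The URI and upstream
port here are assumptions, not berlin's actual configuration.

--8<---------------cut here---------------start------------->8---
(use-modules (gnu services web))

;; Hypothetical reverse-proxy location for mumi.
(nginx-location-configuration
 (uri "/")
 (body (list "proxy_pass http://127.0.0.1:1234;" ;assumed mumi port
             ;; Allow slow responses more time before nginx answers
             ;; 504 and closes the upstream connection:
             "proxy_read_timeout 120s;")))
--8<---------------cut here---------------end--------------->8---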
* bug#65858: mumi crashes
From: Maxim Cournoyer @ 2023-10-24 12:47 UTC
To: Christopher Baines; +Cc: 65858, Arun Isaac, guix-sysadmin

Hi Christopher,

Christopher Baines <mail@cbaines.net> writes:

[...]

>> Here's a fresh crash (on berlin):
>>
>> 2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065806%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
>> 2023-10-24 06:22:58 GET /issue/29433/attachment/1/ 200
>> 2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065853%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
>> 2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065869%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
>> 2023-10-24 06:23:15 GET Uncaught exception in task:
>> 2023-10-24 06:23:15 In procedure port-auxiliary-write-buffer: Wrong type argument in position 1 (expecting open port): #<closed: file 7f2ac30077e0>
>> 2023-10-24 06:23:15 In fibers.scm:
>> 2023-10-24 06:23:15     172:8  6 (_)
>> 2023-10-24 06:23:15 In ice-9/boot-9.scm:
>> 2023-10-24 06:23:15   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
>> 2023-10-24 06:23:15 In fibers/web/server.scm:
>> 2023-10-24 06:23:15    214:25  4 (_)
>> 2023-10-24 06:23:15 In ice-9/suspendable-ports.scm:
>> 2023-10-24 06:23:15      83:4  3 (write-bytes #<closed: file 7f2abfe0e770> #vu8(47 42 …) …)
>> 2023-10-24 06:23:15 In unknown file:
>> 2023-10-24 06:23:15            2 (port-write #<closed: file 7f2abfe0e770> #vu8(47 42 # …) …)
>> 2023-10-24 06:23:15 In ice-9/boot-9.scm:
>> 2023-10-24 06:23:15   1685:16  1 (raise-exception _ #:continuable? _)
>> 2023-10-24 06:23:15   1685:16  0 (raise-exception _ #:continuable? _)
>> 2023-10-24 06:23:15 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
>> 2023-10-24 06:23:15 In procedure fport_write: Broken pipe
>
> I think this is kind of expected.  If nginx hits its
> proxy_read_timeout, it will return a 504 to the client and, I think,
> close the connection to Mumi.  So what you're seeing here is Mumi
> trying to respond on a connection that nginx has already closed.

If it's expected, we should handle it and produce a useful warning
instead of crashing, right?

Thanks for chiming in!

--
Thanks,
Maxim
* bug#65858: mumi crashes
From: Christopher Baines @ 2023-10-24 13:15 UTC
To: Maxim Cournoyer; +Cc: 65858, Arun Isaac, guix-sysadmin

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Hi Christopher,
>
> Christopher Baines <mail@cbaines.net> writes:
>
> [...]
>
>>> Here's a fresh crash (on berlin):
>>>
>>> 2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065806%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
>>> 2023-10-24 06:22:58 GET /issue/29433/attachment/1/ 200
>>> 2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065853%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
>>> 2023-10-24 06:22:58 GET /graphql?query=query%20%7B%0A%20%20issue%28number%3A%2065869%29%20%7B%0A%20%20%20%20open%0A%20%20%7D%0A%7D%0A&variables=%7B%7D 200
>>> 2023-10-24 06:23:15 GET Uncaught exception in task:
>>> 2023-10-24 06:23:15 In procedure port-auxiliary-write-buffer: Wrong type argument in position 1 (expecting open port): #<closed: file 7f2ac30077e0>
>>> 2023-10-24 06:23:15 In fibers.scm:
>>> 2023-10-24 06:23:15     172:8  6 (_)
>>> 2023-10-24 06:23:15 In ice-9/boot-9.scm:
>>> 2023-10-24 06:23:15   1752:10  5 (with-exception-handler _ _ #:unwind? _ # _)
>>> 2023-10-24 06:23:15 In fibers/web/server.scm:
>>> 2023-10-24 06:23:15    214:25  4 (_)
>>> 2023-10-24 06:23:15 In ice-9/suspendable-ports.scm:
>>> 2023-10-24 06:23:15      83:4  3 (write-bytes #<closed: file 7f2abfe0e770> #vu8(47 42 …) …)
>>> 2023-10-24 06:23:15 In unknown file:
>>> 2023-10-24 06:23:15            2 (port-write #<closed: file 7f2abfe0e770> #vu8(47 42 # …) …)
>>> 2023-10-24 06:23:15 In ice-9/boot-9.scm:
>>> 2023-10-24 06:23:15   1685:16  1 (raise-exception _ #:continuable? _)
>>> 2023-10-24 06:23:15   1685:16  0 (raise-exception _ #:continuable? _)
>>> 2023-10-24 06:23:15 ice-9/boot-9.scm:1685:16: In procedure raise-exception:
>>> 2023-10-24 06:23:15 In procedure fport_write: Broken pipe
>>
>> I think this is kind of expected.  If nginx hits its
>> proxy_read_timeout, it will return a 504 to the client and, I think,
>> close the connection to Mumi.  So what you're seeing here is Mumi
>> trying to respond on a connection that nginx has already closed.
>
> If it's expected, we should handle it and produce a useful warning
> instead of crashing, right?

As you can see from the above stack trace, this doesn't involve Mumi;
it's more of an issue for the guile-fibers web server.  But yes, it
would be nice not to have this clutter in the logs.

That being said, if there are ways of having the application
internally time out before nginx times out, that can help to avoid
this kind of issue.  Maybe that's not so easy for Mumi, though.

I remember Ricardo investigating and experimenting with switching to
non-suspendable ports when writing responses, and I think that might
have exhibited different behaviour.  It's probably worth trying to
reproduce this in isolation, and double-checking whether the ports
implementation has any effect (obviously suspendable ports are better,
but this might be informative).
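Concretely, handling that in Guile could look like the sketch below,
whether it lives in guile-fibers' web server or around mumi's response
writing.  The wrapper and its integration point are assumptions, not
existing code.

--8<---------------cut here---------------start------------->8---
;; Guile surfaces EPIPE from fport_write as a 'system-error throw, so
;; a peer (here nginx, after its 504) having already closed the
;; connection can be reported as a one-line notice rather than an
;; uncaught-exception backtrace.
(define (call-with-epipe-handling thunk)
  (catch 'system-error
    thunk
    (lambda args
      (let ((errno (system-error-errno args)))
        (if (memv errno (list EPIPE ECONNRESET))
            (format (current-error-port)
                    "client closed connection; response dropped~%")
            (apply throw args))))))     ;re-raise anything else
--8<---------------cut here---------------end--------------->8---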
* bug#65858: mumi crashes
From: Arun Isaac @ 2023-10-26  0:16 UTC
To: Maxim Cournoyer, Christopher Baines; +Cc: 65858, guix-sysadmin

Hi Chris and Maxim,

>> I think this is kind of expected.  If nginx hits its
>> proxy_read_timeout, it will return a 504 to the client and, I think,
>> close the connection to Mumi.  So what you're seeing here is Mumi
>> trying to respond on a connection that nginx has already closed.

Thanks for the tip.  This indeed makes a lot of sense!  But I don't
understand why a proxy read timeout happens: the requests in question
seem to be very simple ones that should be processed quickly.  Should
we increase nginx's proxy_read_timeout?

> If it's expected, we should handle it and produce a useful warning
> instead of crashing, right?

A couple of weeks ago, I put in a mechanism to better catch and report
errors:

https://git.savannah.gnu.org/cgit/guix/mumi.git/commit/?id=7a71c12a1f7f033ae93f26fb09387f6c04c143ba

I'm not sure this is working correctly, though.  I have used `guard'
in the commit above.  Does `guard' catch all exceptions correctly, or
should I instead use `catch'?

Also, I believe the logs are written in parallel by fibers.  This may
mangle the logs a bit, but I'm not sure we actually see requests
rapidly enough for this to happen.

Regards,
Arun
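On the `guard' versus `catch' question, the practical difference can
be sketched as follows (illustrative snippets, not the mumi code):

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 exceptions))

;; `guard' only handles exceptions that some clause matches; anything
;; else is re-raised to the next handler.  A catch-all clause (a test
;; that is always true, like the bare #t below) is needed to intercept
;; every exception:
(guard (condition (#t (format #t "guard caught: ~s~%" condition)))
  (raise-exception 'oops))

;; `catch' with the #t tag likewise intercepts every `throw' (and, in
;; Guile 3, objects raised with `raise-exception' as well):
(catch #t
  (lambda () (throw 'oops))
  (lambda (key . args)
    (format #t "catch caught: ~s ~s~%" key args)))
--8<---------------cut here---------------end--------------->8---

So a `guard' behaves like `catch #t' only if some clause is guaranteed
to match; a clause set that can fall through silently re-raises, which
would make the error reporting look unreliable.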