* bug#61350: Eglot over Tramp freezes with large project @ 2023-02-07 16:33 Thomas Koch 2023-02-17 9:54 ` Michael Albinus ` (2 more replies) 0 siblings, 3 replies; 98+ messages in thread From: Thomas Koch @ 2023-02-07 16:33 UTC (permalink / raw) To: 61350 This problem has already been reported to eglot, however it is probably a problem with Tramp: https://github.com/joaotavora/eglot/discussions/883 I've reproduced the problem with Emacs 27.1 from Debian and a minimal Emacs 28.2. installed via nixpkg. The follwing describes a hopefully reliable reproduction. 1. Have a client system with Emacs + eglot and an SSH server. 2. On the server: git clone --depth 1 --no-tags --single-branch \ -b eglot-tramp-freeze-repro \ https://github.com/thkoch2001/yacy_search_server 3. On the server download and untar https://download.eclipse.org/jdtls/milestones/1.19.0/jdt-language-server-1.19.0-202301171536.tar.gz 4. Still on server make a symlink from ~/bin/jdtls to bin/jdtls from the tarball. 5. Install eglot and use this emacs config on client: (custom-set-variables '(tramp-remote-path '("/run/current-system/sw/bin" "/bin" "/usr/bin" "/sbin" "/usr/sbin" "/usr/local/bin" "/usr/local/sbin" "/local/bin" tramp-own-remote-path)) '(tramp-use-ssh-controlmaster-options nil) ;; problem exists also without previous line ) 6. Open source/net/yacy/yacy.java over Tramp with ssh: 7. M-x eglot Freeze should happen after a few seconds. The project works fine with eglot with a local Emacs. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-07 16:33 bug#61350: Eglot over Tramp freezes with large project Thomas Koch @ 2023-02-17 9:54 ` Michael Albinus 2023-02-17 10:33 ` Thomas Koch 2023-02-23 12:17 ` João Távora 2023-04-24 1:44 ` Aaron Madlon-Kay 2 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-02-17 9:54 UTC (permalink / raw) To: Thomas Koch; +Cc: 61350 Thomas Koch <thomas@koch.ro> writes: Hi Thomas, > I've reproduced the problem with Emacs 27.1 from Debian and a minimal > Emacs 28.2. installed via nixpkg. The follwing describes a hopefully > reliable reproduction. > > 1. Have a client system with Emacs + eglot and an SSH server. > > 2. On the server: > > git clone --depth 1 --no-tags --single-branch \ > -b eglot-tramp-freeze-repro \ > https://github.com/thkoch2001/yacy_search_server > > 3. On the server download and untar > https://download.eclipse.org/jdtls/milestones/1.19.0/jdt-language-server-1.19.0-202301171536.tar.gz > > 4. Still on server make a symlink from ~/bin/jdtls to bin/jdtls from the > tarball. > > 5. Install eglot and use this emacs config on client: > > (custom-set-variables > '(tramp-remote-path > '("/run/current-system/sw/bin" "/bin" "/usr/bin" "/sbin" "/usr/sbin" "/usr/local/bin" "/usr/local/sbin" "/local/bin" tramp-own-remote-path)) > '(tramp-use-ssh-controlmaster-options nil) > ;; problem exists also without previous line > ) > > 6. Open source/net/yacy/yacy.java over Tramp with ssh: > > 7. M-x eglot Thanks for the concise recipe. Since I have used a virgin Ubuntu VM as test target, one additional step is needed: Install Java. > Freeze should happen after a few seconds. Yep, I could reproduce it with both Emacs 29.0.60 and 30.0.50. And the backtrace is --8<---------------cut here---------------start------------->8--- backtrace() tramp-error((tramp-file-name "ssh" nil nil "ubuntu-2204" nil "/home/albinus/src/yacy_search_server/examples/Simp..." nil) quit "") tramp-signal-hook-function(quit nil) accept-process-output(#<process *tramp/ssh ubuntu-2204*> nil nil t) tramp-accept-process-output(#<process *tramp/ssh ubuntu-2204*>) tramp-wait-for-regexp(#<process *tramp/ssh ubuntu-2204*> nil "\\(?:^\\|\0\\)\\(?:[^\n#$]*///3f62f1d55bd34cba7fe730c633...") tramp-wait-for-output(#<process *tramp/ssh ubuntu-2204*>) tramp-send-command((tramp-file-name "ssh" nil nil "ubuntu-2204" nil "/home/albinus/src/yacy_search_server/examples/Simp..." nil) "\\readlink --canonicalize-missing /home/albinus/src...") tramp-send-command-and-check((tramp-file-name "ssh" nil nil "ubuntu-2204" nil "/home/albinus/src/yacy_search_server/examples/Simp..." nil) "\\readlink --canonicalize-missing /home/albinus/src...") tramp-sh-handle-file-truename("/ssh:ubuntu-2204:/home/albinus/src/yacy_search_ser...") apply(tramp-sh-handle-file-truename "/ssh:ubuntu-2204:/home/albinus/src/yacy_search_ser...") tramp-sh-file-name-handler(file-truename "/ssh:ubuntu-2204:/home/albinus/src/yacy_search_ser...") apply(tramp-sh-file-name-handler file-truename "/ssh:ubuntu-2204:/home/albinus/src/yacy_search_ser...") tramp-file-name-handler(file-truename "/ssh:ubuntu-2204:/home/albinus/src/yacy_search_ser...") file-truename("/ssh:ubuntu-2204:/home/albinus/src/yacy_search_ser...") find-buffer-visiting("/ssh:ubuntu-2204:/home/albinus/src/yacy_search_ser...") #f(compiled-function (arg1 arg2 &rest rest) "Handle notification publishDiagnostics." #<bytecode 0x5bbf29480580234>)(#<eglot-lsp-server eglot-lsp-server-bfebda> textDocument/publishDiagnostics :uri "file:///home/albinus/src/yacy_search_server/exampl..." :diagnostics []) apply(#f(compiled-function (arg1 arg2 &rest rest) "Handle notification publishDiagnostics." #<bytecode 0x5bbf29480580234>) #<eglot-lsp-server eglot-lsp-server-bfebda> textDocument/publishDiagnostics (:uri "file:///home/albinus/src/yacy_search_server/exampl..." :diagnostics [])) eglot-handle-notification(#<eglot-lsp-server eglot-lsp-server-bfebda> textDocument/publishDiagnostics :uri "file:///home/albinus/src/yacy_search_server/exampl..." :diagnostics []) apply(eglot-handle-notification #<eglot-lsp-server eglot-lsp-server-bfebda> textDocument/publishDiagnostics (:uri "file:///home/albinus/src/yacy_search_server/exampl..." :diagnostics [])) #f(compiled-function (server method params) #<bytecode -0xa39f6ed8241f361>)(#<eglot-lsp-server eglot-lsp-server-bfebda> textDocument/publishDiagnostics (:uri "file:///home/albinus/src/yacy_search_server/exampl..." :diagnostics [])) jsonrpc-connection-receive(#<eglot-lsp-server eglot-lsp-server-bfebda> (:jsonrpc "2.0" :method "textDocument/publishDiagnostics" :params (:uri "file:///home/albinus/src/yacy_search_server/exampl..." :diagnostics []))) jsonrpc--process-filter(#<process EGLOT (yacy_search_server/(java-mode java-ts-mode))> "pc\":\"2.0\",\"method\":\"$/progress\",\"params\":{\"token\":...") --8<---------------cut here---------------end--------------->8--- The problem is clear. We are in jsonrpc--process-filter (accepting process output). This runs eglot-handle-notification, and that runs file-truename. The latter needs to check remote, so Tramp sends a remote command, and waits for that output (calling accept-process-output again). Since jsonrpc always accepts output from *all* running processes, there could be (and is!) the constellation, that process output has been read already, and Tramp didn't get it, waiting for the output forever. I could mitigate the problem partly by changing the arguments of accept-process-output in jsonrpc-request, and eglot could work successfully, sometimes. And sometimes not. And even then, I ran into the famous "Forbidden reentrant call" error, see bug#60534. I will continue to investigate, but after playing with this for some days, I feel we must add threads to Tramp, when there are several concurrent processes. I've tried this already some years agao, and I failed. Maybe it is time now for a new attempt. Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-17 9:54 ` Michael Albinus @ 2023-02-17 10:33 ` Thomas Koch 2023-02-18 11:10 ` Thomas Koch 0 siblings, 1 reply; 98+ messages in thread From: Thomas Koch @ 2023-02-17 10:33 UTC (permalink / raw) To: Michael Albinus; +Cc: 61350 Thanks a lot Michael for the analysis. I don't know anything about Emacs internals, but I have a healthy fear of threads and could understand that it might be a can of worms. Maybe there are ideas that could be taken from JavaScript, which also has only a single thread. As I understand, any threading problem can be translated into event driven code. Additionally, browsers now support web workers which (in theory?) should avoid locks by not sharing context with the main thread: https://developer.mozilla.org/en-US/docs/Web/API/Worker I forwarded your analysis to the GitHub issue: https://github.com/joaotavora/eglot/discussions/883#discussioncomment-5008509 ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-17 10:33 ` Thomas Koch @ 2023-02-18 11:10 ` Thomas Koch 2023-02-18 12:07 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: Thomas Koch @ 2023-02-18 11:10 UTC (permalink / raw) To: Michael Albinus; +Cc: joaotavora@gmail.com, 61350 CC eglot maintainer as requested by him. Please bare with me Michael, if I ask a stupid question although you're a long time Emacs contributor. I saw now that concurrency in Emacs is an age old problem: https://www.emacswiki.org/emacs/ConcurrentEmacs There are emacs-async and emacs-deferred. Couldn't one of them be used to write a non-blocking wrapper for file-io? NodeJS made this programming model popular when it appeared in 2009. I can't believe that nobody discussed this before for Emacs? (Of course I otherwise despise JavaScript and NodeJS in particular with passion.) Meanwhile I could hack eglot to not block so far by replacing find-buffer-visiting with get-file-buffer in eglot-handle-notification. Of course there will probably be other places that cause problems and I could also only use this hack because my project files are not opened via symlinks. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-18 11:10 ` Thomas Koch @ 2023-02-18 12:07 ` Michael Albinus 2023-02-23 11:55 ` Thomas Koch 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-02-18 12:07 UTC (permalink / raw) To: Thomas Koch; +Cc: joaotavora@gmail.com, 61350 Thomas Koch <thomas@koch.ro> writes: Hi Thomas, > Please bare with me Michael, if I ask a stupid question although > you're a long time Emacs contributor. I saw now that concurrency in > Emacs is an age old problem: > https://www.emacswiki.org/emacs/ConcurrentEmacs Don't worry. You don't ask stupid questions, often it is useful to think differently. > There are emacs-async and emacs-deferred. Couldn't one of them be used > to write a non-blocking wrapper for file-io? NodeJS made this > programming model popular when it appeared in 2009. I can't believe > that nobody discussed this before for Emacs? emacs-async uses a second Emacs instance, AFAIU. I doubt that this approach is helpful for Tramp, which needs to read files into buffers and alike. emacs-deferred I don't know. > Meanwhile I could hack eglot to not block so far by replacing > find-buffer-visiting with get-file-buffer in > eglot-handle-notification. Of course there will probably be other > places that cause problems and I could also only use this hack because > my project files are not opened via symlinks. Yes, that's similar to what I've done while trying to mitigate the problem. But then, later, there was still the famous "Forbidden reentrant call of Tramp", which we must fix anyway. In fact, the threaded Tramp some years ago wasn't that bad. It worked in many cases. The major problem with threads in Emacs is user interaction. When there is a dialogue with a user, for example asking for a password in the minibuffer, it happened that the user input didn't return to the correct thread. Something like a UI-thread is needed in Emacs, I believe. That is, all threads in Emacs delegate user interaction to that UI-thread, and then wait for the result via a mutex. Something like this. For the new attempt to bring threads to Tramp, I plan another approach. Not a whole operation (like find-file) shall be located into an own thread. Instead, just the process communication, process-send-string and accept-process-output, shall be delegated to another thread. This are the places we have problems. Let's see how far I could go. But pls don't hold your breath, it will take time until I could show results. Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-18 12:07 ` Michael Albinus @ 2023-02-23 11:55 ` Thomas Koch 2023-02-25 14:36 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: Thomas Koch @ 2023-02-23 11:55 UTC (permalink / raw) To: Michael Albinus; +Cc: joaotavora@gmail.com, 61350 > Michael Albinus <michael.albinus@gmx.de> hat am 18.02.2023 14:07 EET geschrieben: > Don't worry. You don't ask stupid questions, often it is useful to think > differently. Another [...] question: Would it be possible for Tramp to open and close a new SSH connection for every new command? Would this mitigate the problem as an immediate workaround? I use tramp to connect to a local VM or a machine in my home network. Thus opening a new SSH connection with ControlMaster is fast. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-23 11:55 ` Thomas Koch @ 2023-02-25 14:36 ` Michael Albinus 0 siblings, 0 replies; 98+ messages in thread From: Michael Albinus @ 2023-02-25 14:36 UTC (permalink / raw) To: Thomas Koch; +Cc: joaotavora@gmail.com, 61350 Thomas Koch <thomas@koch.ro> writes: Hi Thomas, > Another [...] question: > > Would it be possible for Tramp to open and close a new SSH connection for every new command? > Would this mitigate the problem as an immediate workaround? It might help, don't know. But the concurrent accessing of process output by different processes still exist, and I don't know exactly (yet) why it doesn't work as expected. > I use tramp to connect to a local VM or a machine in my home network. Thus opening a new SSH connection with ControlMaster is fast. Yes, but I expect performance problems. We have already the Tramp "sudo" method, which works like an ssh connection: open a process to the target, and send commands via this process. Since it is a local connection, this is even faster than ssh. But there are security aware users, who dislike a permanent Tramp sudo process, which would allow attacks with root permissions. For these users, Tramp offers the "sudoedit" method. The difference is, that every single command Tramp sends, is a new process, similar to what you propose with ssh. The performance of Tramp "sudoedit", compared with "sudo", is much worse. Likely to be expected for a similar "ssh" approach. Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-07 16:33 bug#61350: Eglot over Tramp freezes with large project Thomas Koch 2023-02-17 9:54 ` Michael Albinus @ 2023-02-23 12:17 ` João Távora 2023-02-23 14:18 ` João Távora 2023-04-24 1:44 ` Aaron Madlon-Kay 2 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-02-23 12:17 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> wrote: > The problem is clear. We are in jsonrpc--process-filter (accepting process > output). This runs eglot-handle-notification, and that runs > file-truename. The latter needs to check remote, so Tramp sends a remote > command, and waits for that output (calling accept-process-output again). > > Since jsonrpc always accepts output from *all* running processes, there > could be (and is!) the constellation, that process output has been read > already, and Tramp didn't get it, waiting for the output forever. I > could mitigate the problem partly by changing the arguments of > accept-process-output in jsonrpc-request, and eglot could work > successfully, sometimes. And sometimes not. And even then, I ran into > the famous "Forbidden reentrant call" error, see bug#60534. I see the problem more or less, indeed, thanks for reproducing it and clarifying it. Can you try your experiment again after evaluating these two lines? (cl-defmethod eglot-handle-notification :around (_ _ &key &allow-other-keys) (run-at-time 0 nil (lambda () (cl-call-next-method)))) This makes the notification handlers execute "outside" the LSP server's process filter, so maybe it fix it. I've given it very little testing and it might have other monumentally severe consequences in non-TRAMP cases, so take it with a grain of salt. If the fix works, maybe it could be done upstream in jsonrpc.el itself. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-23 12:17 ` João Távora @ 2023-02-23 14:18 ` João Távora 2023-02-23 14:47 ` Thomas Koch 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-02-23 14:18 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 > Can you try your experiment again after evaluating these two lines? > (cl-defmethod eglot-handle-notification :around (_ _ &key &allow-other-keys) > (run-at-time 0 nil (lambda () (cl-call-next-method)))) I forgot to mention that one should evaluate that in a buffer were lexical-binding is t. Better try this jsonrpc.el patch instead which does basically the same, but won't have this pitfall. diff --git a/lisp/jsonrpc.el b/lisp/jsonrpc.el index f583d116d20..45e3d97ca41 100644 --- a/lisp/jsonrpc.el +++ b/lisp/jsonrpc.el @@ -184,8 +184,8 @@ jsonrpc-connection-receive (apply #'jsonrpc--reply connection id reply))) (;; A remote notification method - (funcall (jsonrpc--notification-dispatcher connection) - connection (intern method) params)) + (run-at-time 0 nil (jsonrpc--notification-dispatcher connection) + connection (intern method) params)) (;; A remote response (setq continuations (and id (gethash id (jsonrpc--request-continuations connection)))) João ^ permalink raw reply related [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-23 14:18 ` João Távora @ 2023-02-23 14:47 ` Thomas Koch 2023-02-23 15:22 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Thomas Koch @ 2023-02-23 14:47 UTC (permalink / raw) To: João Távora, Michael Albinus; +Cc: 61350 Thanks João, but with the jsonrpc patch it is the same as with the previous patch: Emacs still hangs. This time I could unfreeze it with C-g, but it hangs again a moment later. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-23 14:47 ` Thomas Koch @ 2023-02-23 15:22 ` João Távora 2023-02-24 17:19 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-02-23 15:22 UTC (permalink / raw) To: Thomas Koch; +Cc: Michael Albinus, 61350 On Thu, Feb 23, 2023 at 2:47 PM Thomas Koch <thomas@koch.ro> wrote: > > Thanks João, but with the jsonrpc patch it is the same as with the previous patch: Emacs still hangs. This time I could unfreeze it with C-g, but it hangs again a moment later. I'll need you to describe the conditions that lead Emacs to hang. An Emacs -Q run, a server, an example file, all in all, a MRE. See how this user informally but very effectively reported a bug in Eglot: https://lists.gnu.org/archive/html/emacs-devel/2023-02/msg00851.html Here, I don't even understand whether you're trying this with TRAMP or locally. If I can't see the error you're facing in my machine, it's likely I won't be of any help. I can report I've been working for a few hours with that patch with the Clangd server with no problems using a local project (no TRAMP). We can also wait for Michael to try out the patch, as I'm sure he can report any problems just as clearly as he last did in this thread. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-23 15:22 ` João Távora @ 2023-02-24 17:19 ` Michael Albinus 2023-02-24 17:45 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-02-24 17:19 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi, sorry for the delay, I have been out of order the last days :-( >> Thanks João, but with the jsonrpc patch it is the same as with the previous patch: Emacs still hangs. This time I could unfreeze it with C-g, but it hangs again a moment later. I confirm, that with the jsonrpc patch the problem still happens to me. I'm using the emacs-29 branch for test. > I'll need you to describe the conditions that lead Emacs to hang. > An Emacs -Q run, a server, an example file, all in all, a MRE. See > how this user informally but very effectively reported a bug in Eglot: > > https://lists.gnu.org/archive/html/emacs-devel/2023-02/msg00851.html > > Here, I don't even understand whether you're trying this with TRAMP > or locally. I have prepared the test as described by Thomas in <https://debbugs.gnu.org/cgi/bugreport.cgi?bug=61350#5>. After setting up everything on the remote side, I have called --8<---------------cut here---------------start------------->8--- # ~/src/emacs-29/src/emacs -Q /ssh:ubuntu-2204:/home/albinus/src/yacy_search_server/source/net/yacy/yacy.java -f eglot --8<---------------cut here---------------end--------------->8--- The remote machine is a virgin Ubuntu VM. > If I can't see the error you're facing in my machine, it's likely > I won't be of any help. I can report I've been working for a few > hours with that patch with the Clangd server with no problems using > a local project (no TRAMP). > > We can also wait for Michael to try out the patch, as I'm sure he > can report any problems just as clearly as he last did in this > thread. As said above, same result: eglot hangs. Your patch doesn't change the game. You don't call jsonrpc--notification-dispatcher anymore in the process filter directly. But calling it by a timer has the same effect: The process filter (accepting output) is still active, and the called function in the timer still uses file-truename, which tries to read output from the Tramp process. Again, I still believe we need a general solution in Tramp, using threads. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-24 17:19 ` Michael Albinus @ 2023-02-24 17:45 ` João Távora 2023-02-25 14:27 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-02-24 17:45 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 On Fri, Feb 24, 2023 at 5:19 PM Michael Albinus <michael.albinus@gmx.de> wrote: > sorry for the delay, I have been out of order the last days :-( No problem. Hope you're feeling better! > As said above, same result: eglot hangs. Your patch doesn't change the > game. Sorry to hear about that, I thought it was problem related to processes inside processes. > You don't call jsonrpc--notification-dispatcher anymore in the > process filter directly. But calling it by a timer has the same effect: > The process filter (accepting output) is still active, What exactly do you mean by "still active" and what process are you referring to? Jsonrpc's or Tramp? > and the called > function in the timer still uses file-truename, which tries to read > output from the Tramp process. This part I can understand. But I don't understand this that you wrote in the original message: > > Since jsonrpc always accepts output from *all* running processes, there > > could be (and is!) the constellation, that process output has been read > > already, and Tramp didn't get it, waiting for the output forever. I could understand if we were talking C and read() here, but the process output of other processes read by jsonrpc's call to accept-process-output surely must have run through some filter function, it wasn't just lost AFAIK. You've probably seen this trick a mililon times, but markers in the process buffer is what is used commonly. > Again, I still believe we need a general solution in Tramp, using threads. I don't understand what is conceptually impossible (or very hard) to achieve with evented IO like we have in Emacs. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-24 17:45 ` João Távora @ 2023-02-25 14:27 ` Michael Albinus 2023-02-25 23:09 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-02-25 14:27 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >> You don't call jsonrpc--notification-dispatcher anymore in the >> process filter directly. But calling it by a timer has the same effect: >> The process filter (accepting output) is still active, > > What exactly do you mean by "still active" and what process are you > referring to? Jsonrpc's or Tramp? Your timer is called immediately, while you're still in jsonrpc--process-filter I believe. >> > Since jsonrpc always accepts output from *all* running processes, there >> > could be (and is!) the constellation, that process output has been read >> > already, and Tramp didn't get it, waiting for the output forever. > > I could understand if we were talking C and read() here, but the process > output of other processes read by jsonrpc's call to accept-process-output > surely must have run through some filter function, it wasn't just lost > AFAIK. You've probably seen this trick a mililon times, but markers > in the process buffer is what is used commonly. It wasn't lost. The process output was retrieved and placed into the Tramp buffer, w/o Tramp's interaction. Tramp doesn't use a process filter for its own connections. The problem is rather, that Tramp must know where the output to be parsed starts in the buffer. If another process has consumed the output, even if it is pushed into the correct buffer, Tramp doesn't know. >> Again, I still believe we need a general solution in Tramp, using threads. > > I don't understand what is conceptually impossible (or very hard) to > achieve with evented IO like we have in Emacs. I've tried different patches, mainly in tramp-accept-process-output. It improves the situation a little bit, but not reliably. Sometimes the test works, sometimes it blocks. And even if it doesn't block, a while later we run into the "Forbidden reentrant call of Tramp" error. Honestly, I still don't understand really what's up. Let's see whether adding threads to Tramp helps. Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-25 14:27 ` Michael Albinus @ 2023-02-25 23:09 ` João Távora 2023-02-26 10:24 ` Thomas Koch 2023-02-26 17:23 ` Michael Albinus 0 siblings, 2 replies; 98+ messages in thread From: João Távora @ 2023-02-25 23:09 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: > Hi João, > >>> You don't call jsonrpc--notification-dispatcher anymore in the >>> process filter directly. But calling it by a timer has the same effect: >>> The process filter (accepting output) is still active, >> >> What exactly do you mean by "still active" and what process are you >> referring to? Jsonrpc's or Tramp? > > Your timer is called immediately, while you're still in > jsonrpc--process-filter I believe. I don't think so. Any timer function runs in its stack, after the stack where jsonrpc--process-filter happened has unwound. But you've answered my question: by "active" you mean "under the stack frame of jsonrpc--process-filter". >>> > Since jsonrpc always accepts output from *all* running processes, there >>> > could be (and is!) the constellation, that process output has been read >>> > already, and Tramp didn't get it, waiting for the output forever. >> >> I could understand if we were talking C and read() here, but the process >> output of other processes read by jsonrpc's call to accept-process-output >> surely must have run through some filter function, it wasn't just lost >> AFAIK. You've probably seen this trick a mililon times, but markers >> in the process buffer is what is used commonly. > > It wasn't lost. The process output was retrieved and placed into the > Tramp buffer, w/o Tramp's interaction. That's great and quite normal. > Tramp doesn't use a process filter for its own connections. Every process in Emacs has a process filter, though it might not be a custom process filter. If you don't give it one, it is assigned internal-default-process-filter, which as the docstring explains, simply inserts process output into the buffer (of course you probably know this, I'm just stating for other readers). > is rather, that Tramp must know where the output to be parsed starts in > the buffer. Right, and this is where 'point' and 'process-mark' come in. Process mark is where the internal filter last left the insertion, and point is where you the programmer last left your parsing. > If another process has consumed the output, even if it is pushed into > the correct buffer, Tramp doesn't know. Why? May I ask -- perhaps naively -- why can't you can't just (with-current-buffer proc (= (point) (process-mark))) to "know"? In my experience, something like this is usually sufficient. One parses the buffer for incoming messages looking for regexps, magic byte sequences, etc. One always leaves point after a successful search (search-forward-regexp has this behaviour and it's very convenient). Eventually, point may be left exactly at process-mark, or not, depending on how much data came in, a full message, multiple full messages, or some fractional message. Regardless, next time you want to get messages from the process, you perform a check before you go on a potentially blocking call to fetch more output. The check is usually "has process-mark advanced behind my back from other libraries I don't control?" Here, jsonrpc.el's a-p-o calls are the "behing your back". After checking, you know if you have enough data to form a full message, or if you need to go on a potentially blocking a-p-o call. But somehow I suspect you know all this by heart already!! In fact, from the backtrace you posted Fri, 17 Feb 2023, it's clear the hang happend in 'tramp-wait-for-regexp' whic starts (defun tramp-wait-for-regexp (proc timeout regexp) ... (let ((found (tramp-check-for-regexp proc regexp))) (cond (timeout (with-timeout (timeout) (while (not found) (tramp-accept-process-output proc) Here, exactly how I imaged, you first check for "found" before venturing into the blocking tramp-accept-process-output call. So it looks to me that you're doing conceptually the right thing, but it's tramp-check-for-regexp who is missing the fact that there is a perfectly good message in process buffer already. At least according to what I understood from your account of the problem. So my suspicion is in tramp-check-for-regexp. I found it a bit hard to read to find an obvious culprit, and I still haven't a working setup to reproduce this... >>> Again, I still believe we need a general solution in Tramp, using threads. >> >> I don't understand what is conceptually impossible (or very hard) to >> achieve with evented IO like we have in Emacs. > > I've tried different patches, mainly in tramp-accept-process-output. It > improves the situation a little bit, but not reliably. Sometimes the > test works, sometimes it blocks. And even if it doesn't block, a while > later we run into the "Forbidden reentrant call of Tramp" error. I had recently a problem with reentrant calls in jsonrpc--process-filter. This is why you find a run-with-timer call there, right at the beginning. This removes the reentrancy because as written before, timers run in their own stack. This was the fix to a nasty Eglot bug#60088 with similar "hanging" behaviour. > Honestly, I still don't understand really what's up. Let's see whether > adding threads to Tramp helps. I'll try to setup a VM myself with the reproduction recipe that Thomas used. I'm reasonably confident that two process-based extensions such as Jsonrpc.el and TRAMP can coeexist if each follows process etiquette. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-25 23:09 ` João Távora @ 2023-02-26 10:24 ` Thomas Koch 2023-02-26 15:58 ` Michael Albinus 2023-02-26 17:23 ` Michael Albinus 1 sibling, 1 reply; 98+ messages in thread From: Thomas Koch @ 2023-02-26 10:24 UTC (permalink / raw) To: João Távora, Michael Albinus; +Cc: 61350 Michael and I met yesterday for a debugging session. Eventually we tried without SSH ControlMaster (via tramp-use-ssh-controlmaster-options). Eglot did not freeze anymore. I actually found the hint to ControlMaster already early on but probably made a mistake when trying it: - Either I set tramp-ssh-controlmaster-options to nil, which is the default and does not have any effect. - Or SSH still used ControlMaster due to my ssh config: Host * ControlMaster auto I still see one error in process filter: tramp-error: peculiar error: "Forbidden reentrant call of Tramp" I don't consider this a solution but just a workaround. After all I had tramp freeze my emacs also before I used eglot. After reading jsonrpc and tramp source code yesterday I also believe that Joãos proposal might help to make Tramp more stable. Unfortunately I've not done any elisp programming yet and thus can't help with coding. I was thinking whether there could be an abstract (and reliable) process interaction library to be used by tramp, jsonrpc and others. In my init.el I found at least magit (magit-process.el) and maybe org-export (ox.el) which also call external processes and need to reliably parse process output. jsonrpc is special since it needs to match process output by message id to previous process input. Otherwise it seems to me that there is a lot of shared logic? ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-26 10:24 ` Thomas Koch @ 2023-02-26 15:58 ` Michael Albinus 0 siblings, 0 replies; 98+ messages in thread From: Michael Albinus @ 2023-02-26 15:58 UTC (permalink / raw) To: Thomas Koch; +Cc: João Távora, 61350 Thomas Koch <thomas@koch.ro> writes: Hi Thomas, > Michael and I met yesterday for a debugging session. > > Eventually we tried without SSH ControlMaster (via tramp-use-ssh-controlmaster-options). Eglot did not freeze anymore. Yep. It looks, like neither Eglot nor Tramp do it wrong. > I actually found the hint to ControlMaster already early on but probably made a mistake when trying it: > > - Either I set tramp-ssh-controlmaster-options to nil, which is the default and does not have any effect. Do not use this variable, it is Tramp internal. Instead, set the user option tramp-use-ssh-controlmaster-options to nil *prior* establishing a connection, as explained in the Tramp manual. > - Or SSH still used ControlMaster due to my ssh config: > > Host * > ControlMaster auto This shouldn't be set for hosts you're communicating via Eglot. > I still see one > > error in process filter: tramp-error: peculiar error: "Forbidden reentrant call of Tramp" This is *another* error, it doesn't block Emacs as described in this report. And since we have other bug reports about (bug#49954, bug#60534), I prefer to not discuss it in *this* bug report. > I don't consider this a solution but just a workaround. After all I > had tramp freeze my emacs also before I used eglot. I don't believe it is a workaround only. Using ssh's ControlMaster seems to be problematic for large data to be transferred, for whatever reason. It doesn't look like it is an error in Emacs. I might be wrong, but as I said already, after reading the code again and again (and also after extensive debugging) I have no clue what could be wrong in Eglot or Tramp. Therefore, I recommend that Eglot and Tramp manuals warn about using ssh ControlMaster option when large data are on the wire. > After reading jsonrpc and tramp source code yesterday I also believe > that Joãos proposal might help to make Tramp more stable. I'm not against changes in Eglot or jsonrpc. But I believe this won't help with the "Forbidden reentrant call of Tramp" problem. > I was thinking whether there could be an abstract (and reliable) > process interaction library to be used by tramp, jsonrpc and > others. In my init.el I found at least magit (magit-process.el) and > maybe org-export (ox.el) which also call external processes and need > to reliably parse process output. magit-process.el offers some wrappers for process-file, start-file-process, and even for tramp-sh-handle-process-file and tramp-sh-handle-start-file-process. So the real work is still done in Tramp. In ox.el, start-process is used with a sentinel in org-export-async-start. However, I don't see where accept-process-output is called here, it might be distributed over the Org source tree (which is much too large for me to read). > jsonrpc is special since it needs to match process output by message > id to previous process input. > > Otherwise it seems to me that there is a lot of shared logic? What do you mean by this? Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-25 23:09 ` João Távora 2023-02-26 10:24 ` Thomas Koch @ 2023-02-26 17:23 ` Michael Albinus 2023-02-26 21:13 ` João Távora 1 sibling, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-02-26 17:23 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >> is rather, that Tramp must know where the output to be parsed starts in >> the buffer. > > Right, and this is where 'point' and 'process-mark' come in. Process > mark is where the internal filter last left the insertion, and point is > where you the programmer last left your parsing. Yes, but Tramp doesn't get the whole output with one accept-process-output call. It must use a while loop, until everything it expects has been arrived. So the process-mark also moves again and again, and cannot be used by Tramp directly. >> If another process has consumed the output, even if it is pushed into >> the correct buffer, Tramp doesn't know. > > Why? May I ask -- perhaps naively -- why can't you can't just > > (with-current-buffer proc (= (point) (process-mark))) > > to "know"? See above. > In my experience, something like this is usually sufficient. One parses > the buffer for incoming messages looking for regexps, magic byte > sequences, etc. One always leaves point after a successful search > (search-forward-regexp has this behaviour and it's very convenient). > Eventually, point may be left exactly at process-mark, or not, depending > on how much data came in, a full message, multiple full messages, or > some fractional message. > > Regardless, next time you want to get messages from the process, you > perform a check before you go on a potentially blocking call to fetch > more output. The check is usually "has process-mark advanced behind my > back from other libraries I don't control?" Here, jsonrpc.el's a-p-o > calls are the "behing your back". After checking, you know if you have > enough data to form a full message, or if you need to go on a > potentially blocking a-p-o call. Everything correct. But the problem was that not the whole expected output hasn't arrived the buffer. I thought it was "stolen" by another process. But yesterday's debugging has shown, that ssh ControlMaster seems to be guilty; it cannot handle large amount of data reliably. > But somehow I suspect you know all this by heart already!! In fact, > from the backtrace you posted Fri, 17 Feb 2023, it's clear the hang > happend in 'tramp-wait-for-regexp' whic starts > > (defun tramp-wait-for-regexp (proc timeout regexp) > ... > (let ((found (tramp-check-for-regexp proc regexp))) > (cond (timeout > (with-timeout (timeout) > (while (not found) > (tramp-accept-process-output proc) > > Here, exactly how I imaged, you first check for "found" before venturing > into the blocking tramp-accept-process-output call. So it looks to me > that you're doing conceptually the right thing, but it's > tramp-check-for-regexp who is missing the fact that there is a perfectly > good message in process buffer already. At least according to what I > understood from your account of the problem. Yes, that's the place. Tramp waits for a trailing shell prompt that it tells that everything has been sent from remote. But this shell prompt didn't arrive, and Tramp hangs in waiting for this. >> Honestly, I still don't understand really what's up. Let's see whether >> adding threads to Tramp helps. > > I'll try to setup a VM myself with the reproduction recipe that Thomas > used. I'm reasonably confident that two process-based extensions such > as Jsonrpc.el and TRAMP can coeexist if each follows process etiquette. They will cooperate, definitely! > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-26 17:23 ` Michael Albinus @ 2023-02-26 21:13 ` João Távora 2023-02-26 21:45 ` João Távora 2023-02-27 7:47 ` Michael Albinus 0 siblings, 2 replies; 98+ messages in thread From: João Távora @ 2023-02-26 21:13 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: > Yes, but Tramp doesn't get the whole output with one > accept-process-output call. It must use a while loop, until everything > it expects has been arrived. So the process-mark also moves again and > again, and cannot be used by Tramp directly. Sure, but you wrote: "Tramp must know where the output to be parsed starts in the buffer". And I suggested that you arrange for that _starting point_ to be given by the Lisp function 'point', not 'process-mark'. > Everything correct. But the problem was that not the whole expected > output hasn't arrived the buffer. I thought it was "stolen" by another > process. OK, but earlier you said that the process _had_ arrived in Tramp's buffer (though this arrival was triggered by jsonrpc.el) but that Tramp couldn't be made aware of it. You wrote yesterday: >>> It wasn't lost. The process output was retrieved and placed into the >>> Tramp buffer, w/o Tramp's interaction. Can you confirm that you no longer believe this to be the case? > But yesterday's debugging has shown, that ssh ControlMaster > seems to be guilty; it cannot handle large amount of data reliably. So maybe one should disable it by default. How does one disable it? I have trouble understanding from tramp-use-ssh-controlmaster-options and tramp-ssh-controlmaster-options. > Yes, that's the place. Tramp waits for a trailing shell prompt that it > tells that everything has been sent from remote. But this shell prompt > didn't arrive, and Tramp hangs in waiting for this. So the problem is now that the shell prompt didn't arrive after all, because of controlmaster, not because of jsonrpc.el. Can you confirm this? João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-26 21:13 ` João Távora @ 2023-02-26 21:45 ` João Távora 2023-02-27 7:53 ` Michael Albinus 2023-02-27 7:47 ` Michael Albinus 1 sibling, 1 reply; 98+ messages in thread From: João Távora @ 2023-02-26 21:45 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 [-- Attachment #1: Type: text/plain, Size: 1738 bytes --] João Távora <joaotavora@gmail.com> writes: > So maybe one should disable it by default. How does one disable it? I > have trouble understanding from tramp-use-ssh-controlmaster-options and > tramp-ssh-controlmaster-options. In the meantime I have finally, with great Docker-relearning hardship, setup a reproduction environment. You'll see it follows exactly Thomas's reproduction recipe almost until the end. It sets up a server on port 2022 I attach the dockerfile. $ docker build -t sshubuntu . $ docker run -d -p 2022:22 sshubuntu $ /path/to/emacs -Q /ssh:sshuser@localhost#2022:/home/sshuser/yacy_search_server/source/net/yacy/yacy.java -f eglot Now the odd bit. Sometimes this works, sometimes it doesn't!!! Sometimes it finds '~/bin/jtdls' in the remote's PATH and sometimes it doesn't! I have no idea why. Maybe someone can help me debug: how is PATH to work? In the server I can see that on simple ssh login, ~/bin/jdtls _is_ found. Anyway, when I _do_ manage to get it to find sshuser's '~/bin/jtdls': * I do experience the hang regularly, but not always at the same point in time. * I don't know how to obtain the pretty Backtrace that you Michael got. Michael how do you get that?? Maybe you're using some tramp flag, because debug-on-error and debug-on-quit don't work. * If I set (setq tramp-ssh-controlmaster-options nil tramp-use-ssh-controlmaster-options nil) I _no longer_ experience the hang. This is after visiting 5 or 6 files, trying some completion, M-. xref-find-definitions, etc. Michael can you confirm that these are per-buffer values? Would it make sense for Eglot to set them like so in its minor mode hook? João [-- Attachment #2: dockerfile --] [-- Type: text/plain, Size: 1215 bytes --] FROM ubuntu:latest RUN apt update && apt install openssh-server sudo openjdk-17-jdk openjdk-17-jre git wget -y # Create a user “sshuser” and group “sshgroup” RUN groupadd sshgroup && useradd -ms /bin/bash -g sshgroup sshuser # Create sshuser directory in home RUN mkdir -p /home/sshuser/.ssh # Copy the ssh public key in the authorized_keys file. The idkey.pub below is a public key file you get from ssh-keygen. They are under ~/.ssh directory by default. COPY id_rsa.pub /home/sshuser/.ssh/authorized_keys # change ownership of the key file. RUN chown sshuser:sshgroup /home/sshuser/.ssh/authorized_keys && chmod 600 /home/sshuser/.ssh/authorized_keys # Start SSH service RUN service ssh start # Expose docker port 22 EXPOSE 22 USER sshuser WORKDIR /home/sshuser RUN git clone --depth 1 --no-tags --single-branch -b eglot-tramp-freeze-repro https://github.com/thkoch2001/yacy_search_server RUN wget https://download.eclipse.org/jdtls/milestones/1.19.0/jdt-language-server-1.19.0-202301171536.tar.gz RUN mkdir jdtls RUN tar xvfz jdt-language-server-1.19.0-202301171536.tar.gz -C jdtls RUN mkdir bin RUN ln -sf ~/jdtls/bin/jdtls ~/bin/jdtls USER root CMD ["/usr/sbin/sshd","-D"] ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-26 21:45 ` João Távora @ 2023-02-27 7:53 ` Michael Albinus 2023-02-27 9:42 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-02-27 7:53 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, > * I don't know how to obtain the pretty Backtrace that you Michael got. > Michael how do you get that?? Maybe you're using some tramp flag, > because debug-on-error and debug-on-quit don't work. There's a trick with tramp-verbose. If you set it to 10, you get a *lot* of traces in the debug buffer, plus a backtrace in case of errors or quit. > * If I set > > (setq tramp-ssh-controlmaster-options nil > tramp-use-ssh-controlmaster-options nil) > > I _no longer_ experience the hang. This is after visiting 5 or 6 > files, trying some completion, M-. xref-find-definitions, etc. Don't touch tramp-ssh-controlmaster-options, it is meant for internal purposes. > Michael can you confirm that these are per-buffer values? Would it > make sense for Eglot to set them like so in its minor mode hook? No, they are global IIRC. I haven't checked. Maybe we could use them as connection-local variables, that needs further checks. And it won't help if a user has enabled ControlMaster in her ~/.ssh/config. So we must speak about in the doc anyway. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-27 7:53 ` Michael Albinus @ 2023-02-27 9:42 ` João Távora 2023-02-27 20:11 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-02-27 9:42 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: > And it won't help if a user has enabled ControlMaster in her > ~/.ssh/config. Can't Tramp override this config it it wants to? João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-27 9:42 ` João Távora @ 2023-02-27 20:11 ` Michael Albinus 0 siblings, 0 replies; 98+ messages in thread From: Michael Albinus @ 2023-02-27 20:11 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: >> And it won't help if a user has enabled ControlMaster in her >> ~/.ssh/config. > > Can't Tramp override this config it it wants to? See my other message. I'll try to do so if instructed by the user. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-26 21:13 ` João Távora 2023-02-26 21:45 ` João Távora @ 2023-02-27 7:47 ` Michael Albinus 2023-02-27 9:35 ` João Távora 1 sibling, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-02-27 7:47 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >>>> It wasn't lost. The process output was retrieved and placed into the >>>> Tramp buffer, w/o Tramp's interaction. > > Can you confirm that you no longer believe this to be the case? I don't know whether it was Tramp's accept-process-output call, or another one. But yes, this doesn't seem to be a problem. >> But yesterday's debugging has shown, that ssh ControlMaster >> seems to be guilty; it cannot handle large amount of data reliably. > > So maybe one should disable it by default. No, not by default. It improves performance. > How does one disable it? I have trouble understanding from > tramp-use-ssh-controlmaster-options and > tramp-ssh-controlmaster-options. RTFM. The Tramp manual says --8<---------------cut here---------------start------------->8--- If the ‘~/.ssh/config’ file is configured appropriately for the above behavior, then any changes to ‘ssh’ can be suppressed with this ‘nil’ setting: (customize-set-variable 'tramp-use-ssh-controlmaster-options nil) This should also be set to ‘nil’ if you use the ‘ProxyCommand’ or ‘ProxyJump’ options in your ‘ssh’ configuration. --8<---------------cut here---------------end--------------->8--- I will extend this, descibing possible problems with a large amount of data. >> Yes, that's the place. Tramp waits for a trailing shell prompt that it >> tells that everything has been sent from remote. But this shell prompt >> didn't arrive, and Tramp hangs in waiting for this. > > So the problem is now that the shell prompt didn't arrive after all, > because of controlmaster, not because of jsonrpc.el. Can you confirm > this? Yes. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-27 7:47 ` Michael Albinus @ 2023-02-27 9:35 ` João Távora 2023-02-27 20:10 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-02-27 9:35 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: > João Távora <joaotavora@gmail.com> writes: > > Hi João, > >>>>> It wasn't lost. The process output was retrieved and placed into the >>>>> Tramp buffer, w/o Tramp's interaction. >> >> Can you confirm that you no longer believe this to be the case? > > I don't know whether it was Tramp's accept-process-output call, or > another one. But yes, this doesn't seem to be a problem. > >>> But yesterday's debugging has shown, that ssh ControlMaster >>> seems to be guilty; it cannot handle large amount of data reliably. >> >> So maybe one should disable it by default. > > No, not by default. It improves performance. So would you buy a very fast car that only explodes 10% of the time :-) I think that these kinds of features should be opt-in, at least until one is more sure of their stability. Here, this looks like an unstable feature. >> How does one disable it? I have trouble understanding from >> tramp-use-ssh-controlmaster-options and >> tramp-ssh-controlmaster-options. > > RTFM. The Tramp manual says > > If the ‘~/.ssh/config’ file is configured appropriately for the above > behavior, then any changes to ‘ssh’ can be suppressed with this ‘nil’ > setting: > > (customize-set-variable 'tramp-use-ssh-controlmaster-options nil) > > This should also be set to ‘nil’ if you use the ‘ProxyCommand’ or > ‘ProxyJump’ options in your ‘ssh’ configuration. I had read the docstrings. Neither the docstrings nor this manual entry unequivocally states how to _guarante_ the SSH "controlmaster" are _not_ used for TRAMP. It is something like this? (setq tramp-use-ssh-controlmaster-options t tramp-ssh-controlmaster-options "") I understand you appreciate this feature and it has advantages elsewhere. But I we can recognize that, as things stand, it is a non-essential performance optimization that wrecks Eglot functionally. Until the problem is fixed, it's important for Eglot users to know how to turn it off guaranteedly, at least for the duration Eglot sesssions. This will helps us clear a big slice of doubt/uncertainty in all the Eglot TRAMP-related bugs (of which there are many). João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-27 9:35 ` João Távora @ 2023-02-27 20:10 ` Michael Albinus 2023-02-28 0:10 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-02-27 20:10 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >>> So maybe one should disable it by default. >> >> No, not by default. It improves performance. > > So would you buy a very fast car that only explodes 10% of the time :-) > I think that these kinds of features should be opt-in, at least until > one is more sure of their stability. Here, this looks like an unstable > feature. It is stable. Except in the case of large amount of data, which is exceptional for Tramp usage. > I had read the docstrings. Neither the docstrings nor this manual entry > unequivocally states how to _guarante_ the SSH "controlmaster" are _not_ > used for TRAMP. It is something like this? > > (setq tramp-use-ssh-controlmaster-options t > tramp-ssh-controlmaster-options "") > > I understand you appreciate this feature and it has advantages > elsewhere. But I we can recognize that, as things stand, it is a > non-essential performance optimization that wrecks Eglot functionally. It is essential. I have been beaten by many Tramp users to support it. > Until the problem is fixed, it's important for Eglot users to know how > to turn it off guaranteedly, at least for the duration Eglot sesssions. > > This will helps us clear a big slice of doubt/uncertainty in all the > Eglot TRAMP-related bugs (of which there are many). Yes. I believe we must extend the meaning of tramp-use-ssh-controlmaster-options. `t' shall mean that Tramp shall apply its own settings. `nil' shall mean Tramp doesn't do anything, and settings in ~/.ssh/config shall apply. And a new value, say `suppress', shall suppress this feature even if enabled in ~/.ssh/config. I'll try to implement something along this line. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-27 20:10 ` Michael Albinus @ 2023-02-28 0:10 ` João Távora 2023-02-28 10:38 ` Michael Albinus 2023-02-28 14:18 ` Michael Albinus 0 siblings, 2 replies; 98+ messages in thread From: João Távora @ 2023-02-28 0:10 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: >> So would you buy a very fast car that only explodes 10% of the time :-) >> I think that these kinds of features should be opt-in, at least until >> one is more sure of their stability. Here, this looks like an unstable >> feature. > > It is stable. Except in the case of large amount of data, which is > exceptional for Tramp usage. ...is it though? :-) Can a feature that hangs when presented with more data than usual (however one defines that) be considered stable? >> I understand you appreciate this feature and it has advantages >> elsewhere. But I we can recognize that, as things stand, it is a >> non-essential performance optimization that wrecks Eglot functionally. > > It is essential. I have been beaten by many Tramp users to support it. I'd say something "essential" is something you can't live without. But that doesn't seem to be the case here. I could use Eglot over Tramp with decent performance (although this is a local-to-local in disguise). I wonder what Thomas has to say about performance with a real remote server when he turns off ControlMaster. Of course I do understand that it's a very sought after performance optimization. Doesn't seem to be quite finished, but that's just MHO. > settings in ~/.ssh/config shall apply. And a new value, say `suppress', > shall suppress this feature even if enabled in ~/.ssh/config. > > I'll try to implement something along this line. Thanks. I think that is a good start. But can Eglot or the user somehow set 'suppress' for connections "motivated" by Eglot, i.e. M-x eglot in some remote file? How? João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 0:10 ` João Távora @ 2023-02-28 10:38 ` Michael Albinus 2023-02-28 11:33 ` João Távora 2023-02-28 14:18 ` Michael Albinus 1 sibling, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-02-28 10:38 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >> It is stable. Except in the case of large amount of data, which is >> exceptional for Tramp usage. > > ...is it though? :-) Can a feature that hangs when presented with more > data than usual (however one defines that) be considered stable? Tramp supports ControlMaster since Emacs 24. Eglot is the first case I've heard of problems. I don't deny it, but it seems to be premature to me to disable the feature at all, falling back to an opt-in config. >> It is essential. I have been beaten by many Tramp users to support it. > > I'd say something "essential" is something you can't live without. But > that doesn't seem to be the case here. Ohh. You haven't seen how much Tramp has been blamed because it doesn't support it out of the box. > I could use Eglot over Tramp with decent performance (although this is > a local-to-local in disguise). Of course. Tramp allows you to opt-out, which is an accepted configuration possibility. I don't say the way to opt-out is perfect (it's currently an all-or-nothing), but this can be improved. >> settings in ~/.ssh/config shall apply. And a new value, say `suppress', >> shall suppress this feature even if enabled in ~/.ssh/config. >> >> I'll try to implement something along this line. > > Thanks. I think that is a good start. But can Eglot or the user > somehow set 'suppress' for connections "motivated" by Eglot, i.e. M-x > eglot in some remote file? How? Currently it's not possible. I'll investigate a way to disable ControlMaster per process. By this, the main Tramp process and other processes on the remote host could still profit from the performance boost, and other processes (like Eglot) could opt-out. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 10:38 ` Michael Albinus @ 2023-02-28 11:33 ` João Távora 2023-02-28 12:59 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-02-28 11:33 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, eliz, 61350 Michael Albinus <michael.albinus@gmx.de> writes: > João Távora <joaotavora@gmail.com> writes: > > Hi João, > >>> It is stable. Except in the case of large amount of data, which is >>> exceptional for Tramp usage. >> ...is it though? :-) Can a feature that hangs when presented with more >> data than usual (however one defines that) be considered stable? > Tramp supports ControlMaster since Emacs 24. Eglot is the first case > I've heard of problems. I don't deny it That could be because, as far as I understand, Eglot is the first "major" package to use the transparent (make-process ... :file-handler t) which, AFAIU, means that Eglot's process, which doesn't really transmit _that_ much data, is piped transparently through Tramp's process managing the remote file whose buffer M-x eglot is executed in. >>> It is essential. I have been beaten by many Tramp users to support it. >> I'd say something "essential" is something you can't live without. But >> that doesn't seem to be the case here. > Ohh. You haven't seen how much Tramp has been blamed because it doesn't > support it out of the box. I understand, but it's the occasionally-exploding racecar all over again. I understand your reasoning: you're betting that it's acceptable for the ocassionally-exploded user to learn opt-out for a slightly slower car. The thing is they will probably blame the steering wheel, Eglot, because -- Tramp being "transparent" -- that's what they see. Ideally, we would just fix this. It would be nice to understand what actually happen and what information is lost, presumably to ControlMaster's gremlins. I think bringing in Eli's knowledge of process.c internals could be beneficial here. > Currently it's not possible. I'll investigate a way to disable > ControlMaster per process. By this, the main Tramp process and other > processes on the remote host could still profit from the performance > boost, and other processes (like Eglot) could opt-out. One thing I don't understand, here and in other bugs, is the "process" model of Tramp. Here are some questions. Apologies in advance if some of them are in the "FM". 1. When I open a single remote file on a remote host, I am creating Tramp process, I presume. 2. If I kill this buffer and re-visit it, is the Tramp process re-used somehow or is a new one created? 3. If I open another nearby file on the same host, I am sharing that Tramp process? 4. What about in a different host? 5. Is there any way to get an overview of which Tramp processes are "responsible" for each set of remote files? 6. Now if I M-x eglot in a Tramp remote file that I had been working on Eglot-less for a while, is the Eglot process tunneled through the existing Tramp process, reusing it, or is a new one started? 7. If an existing process is reused, is there any way to force Tramp to open a new one, perhaps with slightly different configuration options? João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 11:33 ` João Távora @ 2023-02-28 12:59 ` Michael Albinus 2023-02-28 14:41 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-02-28 12:59 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, eliz, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >>> ...is it though? :-) Can a feature that hangs when presented with more >>> data than usual (however one defines that) be considered stable? >> Tramp supports ControlMaster since Emacs 24. Eglot is the first case >> I've heard of problems. I don't deny it > > That could be because, as far as I understand, Eglot is the first > "major" package to use the transparent (make-process ... :file-handler > t) which, AFAIU, means that Eglot's process, which doesn't really > transmit _that_ much data, is piped transparently through Tramp's > process managing the remote file whose buffer M-x eglot is executed in. Eglot is not the first user of this. But Eglot sends a huge amount of data over the wire, which seems to trigger the problem. For example, lsp-mode uses start-file-process-shell-command, which calls under the hood make-process similarly. >>>> It is essential. I have been beaten by many Tramp users to support it. >>> I'd say something "essential" is something you can't live without. But >>> that doesn't seem to be the case here. >> Ohh. You haven't seen how much Tramp has been blamed because it doesn't >> support it out of the box. > > I understand, but it's the occasionally-exploding racecar all over > again. I understand your reasoning: you're betting that it's acceptable > for the ocassionally-exploded user to learn opt-out for a slightly > slower car. > > The thing is they will probably blame the steering wheel, Eglot, > because -- Tramp being "transparent" -- that's what they see. > > Ideally, we would just fix this. It would be nice to understand what > actually happen and what information is lost, presumably to > ControlMaster's gremlins. I think bringing in Eli's knowledge of > process.c internals could be beneficial here. I'm not against changing Tramp, really. But first I must understand what's the problem, and why a change will solve it. Without other collateral damages. > Here are some questions. Apologies in advance if some of them are in > the "FM". The Tramp manual misses detailed implementation notes. On my TODO for years, but it didn't happen until now. > 1. When I open a single remote file on a remote host, I am creating > Tramp process, I presume. Yes. In Tramp slang, this is the (main) connection process. > 2. If I kill this buffer and re-visit it, is the Tramp process re-used > somehow or is a new one created? The same connection process is used. > 3. If I open another nearby file on the same host, I am sharing that > Tramp process? Yes, as long as the remote identification (method, user, host) are the same. > 4. What about in a different host? A new connection process. Also if just the method or user changes. "/ssh::" and "/scp::" use different connection processes. > 5. Is there any way to get an overview of which Tramp processes are > "responsible" for each set of remote files? The so called connection buffers. This is always a buffer with the name "*tramp/METHOD HOST*" or "*tramp/METHOD USER@HOST*", depending whether there is a user name, or not. See function tramp-buffer-name. The Tramp connection processes are bound to these buffers. > 6. Now if I M-x eglot in a Tramp remote file that I had been working on > Eglot-less for a while, is the Eglot process tunneled through the > existing Tramp process, reusing it, or is a new one started? It depends. A synchronous process, created by process-file, is tunnelled through the connection process. An asynchronous process, created by make-process or start-file-process, is a new one. > 7. If an existing process is reused, is there any way to force Tramp to > open a new one, perhaps with slightly different configuration > options? No need for synchronous processes. And for asynchronous processes, they have their own process. If configuration options are changed prior the make-process call, they apply. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 12:59 ` Michael Albinus @ 2023-02-28 14:41 ` João Távora 0 siblings, 0 replies; 98+ messages in thread From: João Távora @ 2023-02-28 14:41 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, eliz, 61350 On Tue, Feb 28, 2023 at 12:59 PM Michael Albinus <michael.albinus@gmx.de> wrote: > Eglot is not the first user of this. But Eglot sends a huge amount of > data over the wire, which seems to trigger the problem. > > For example, lsp-mode uses start-file-process-shell-command, which calls > under the hood make-process similarly. Do you have reasons to think that lsp-mode's approach is better in this regard? I'd be happy to switch, if feasible, but judging from LSP mode's bug tracker it doesn't look very good. There are quite a few Tramp-related issues and some of them seem to reference hangs too. So Eglot might not be the first ever victim here. > I'm not against changing Tramp, really. But first I must understand > what's the problem, and why a change will solve it. Without other > collateral damages. Of course. I just happen think changing the default of a variable to a more conservative and safe one is a good starting point. > > Here are some questions. > [...] > No need for synchronous processes. And for asynchronous processes, they > have their own process. If configuration options are changed prior the > make-process call, they apply. That's good news. Thank you for your answers. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 0:10 ` João Távora 2023-02-28 10:38 ` Michael Albinus @ 2023-02-28 14:18 ` Michael Albinus 2023-02-28 14:51 ` João Távora 1 sibling, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-02-28 14:18 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 [-- Attachment #1: Type: text/plain, Size: 867 bytes --] João Távora <joaotavora@gmail.com> writes: Hi Joao, >> settings in ~/.ssh/config shall apply. And a new value, say `suppress', >> shall suppress this feature even if enabled in ~/.ssh/config. >> >> I'll try to implement something along this line. > > Thanks. I think that is a good start. But can Eglot or the user > somehow set 'suppress' for connections "motivated" by Eglot, i.e. M-x > eglot in some remote file? How? The appended patch fixes the issue for me with Thomas' example. It binds tramp-use-ssh-controlmaster-options to nil before calling make-process, and that's sufficient for the majority of users. Users, who have set ControlMaster in their ~/.ssh/config (like Thomas), are not satified. For them we need another setting of tramp-use-ssh-controlmaster-options. I'm working on this. > João Best regards, Michael. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: text/x-patch, Size: 1141 bytes --] diff --git a/lisp/progmodes/eglot.el b/lisp/progmodes/eglot.el index ffc9511469f..71271ca0b0a 100644 --- a/lisp/progmodes/eglot.el +++ b/lisp/progmodes/eglot.el @@ -1214,6 +1214,8 @@ eglot--cmd (defvar-local eglot--cached-server nil "A cached reference to the current Eglot server.") +(defvar tramp-use-ssh-controlmaster-options) + (defun eglot--connect (managed-modes project class contact language-id) "Connect to MANAGED-MODES, LANGUAGE-ID, PROJECT, CLASS and CONTACT. This docstring appeases checkdoc, that's all." @@ -1249,7 +1251,9 @@ eglot--connect (contact (cl-subseq contact 0 probe))) `(:process ,(lambda () - (let ((default-directory default-directory)) + (let ((default-directory default-directory) + ;; See bug#61350. + tramp-use-ssh-controlmaster-options) (make-process :name readable-name :command (setq server-info (eglot--cmd contact)) ^ permalink raw reply related [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 14:18 ` Michael Albinus @ 2023-02-28 14:51 ` João Távora 2023-02-28 15:01 ` Michael Albinus 2023-03-01 10:46 ` Gregory Heytings 0 siblings, 2 replies; 98+ messages in thread From: João Távora @ 2023-02-28 14:51 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 On Tue, Feb 28, 2023 at 2:18 PM Michael Albinus <michael.albinus@gmx.de> wrote: > The appended patch fixes the issue for me with Thomas' example. It binds > tramp-use-ssh-controlmaster-options to nil before calling make-process, > and that's sufficient for the majority of users. Thanks. Something like that patch is acceptable for now. > Users, who have set ControlMaster in their ~/.ssh/config (like Thomas), > are not satified. For them we need another setting of > tramp-use-ssh-controlmaster-options. I'm working on this. Looking at the Tramp code, I think Thomas's idea of using -o "ControlMaster=no" -o "ControlPath=none" would probably be quite effective. It would work with previous Tramp versions that Eglot users might be using. Thomas, can you try this patch? diff --git a/lisp/progmodes/eglot.el b/lisp/progmodes/eglot.el index e20d209332d..83462633dd9 100644 --- a/lisp/progmodes/eglot.el +++ b/lisp/progmodes/eglot.el @@ -130,6 +130,8 @@ (defvar markdown-fontify-code-blocks-natively) (defvar company-backends) (defvar company-tooltip-align-annotations) +(defvar tramp-ssh-controlmaster-options) +(defvar tramp-use-ssh-controlmaster-options) @@ -1247,7 +1249,15 @@ eglot--connect (contact (cl-subseq contact 0 probe))) `(:process ,(lambda () - (let ((default-directory default-directory)) + (let ((default-directory default-directory) + ;; bug#61350: Tramp turns a feature on + ;; by default that can't (yet) handle + ;; very much data so we turn it off + ;; unconditionally -- just for our + ;; process. + (tramp-use-ssh-controlmaster-options t) + (tramp-ssh-controlmaster-options + "-o ControlMaster=no -o ControlPath=none")) (make-process :name readable-name :command (setq server-info (eglot--cmd contact)) ^ permalink raw reply related [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 14:51 ` João Távora @ 2023-02-28 15:01 ` Michael Albinus 2023-02-28 17:55 ` Thomas Koch 2023-02-28 19:37 ` João Távora 2023-03-01 10:46 ` Gregory Heytings 1 sibling, 2 replies; 98+ messages in thread From: Michael Albinus @ 2023-02-28 15:01 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, > Looking at the Tramp code, I think Thomas's idea of using > > -o "ControlMaster=no" -o "ControlPath=none" > > would probably be quite effective. It would work with previous > Tramp versions that Eglot users might be using. Yes, that I have also in mind. > Thomas, can you try this patch? > > diff --git a/lisp/progmodes/eglot.el b/lisp/progmodes/eglot.el > index e20d209332d..83462633dd9 100644 > --- a/lisp/progmodes/eglot.el > +++ b/lisp/progmodes/eglot.el > @@ -130,6 +130,8 @@ > (defvar markdown-fontify-code-blocks-natively) > (defvar company-backends) > (defvar company-tooltip-align-annotations) > +(defvar tramp-ssh-controlmaster-options) > +(defvar tramp-use-ssh-controlmaster-options) > > > > @@ -1247,7 +1249,15 @@ eglot--connect > (contact (cl-subseq contact 0 probe))) > `(:process > ,(lambda () > - (let ((default-directory default-directory)) > + (let ((default-directory default-directory) > + ;; bug#61350: Tramp turns a feature on > + ;; by default that can't (yet) handle > + ;; very much data so we turn it off > + ;; unconditionally -- just for our > + ;; process. > + (tramp-use-ssh-controlmaster-options t) > + (tramp-ssh-controlmaster-options > + "-o ControlMaster=no -o ControlPath=none")) > (make-process > :name readable-name > :command (setq server-info (eglot--cmd contact)) Even better than my proposal :-) There is the small risk that a user runs a local ssh client which is not aware of the ControlMaster option. For such a user, Eglot on a remote host might fail. Tramp is busy to check, whether ControlMaster could be applied. OTOH, I don't know if such ssh clients are still used in the wild, at least by Eglot users. Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 15:01 ` Michael Albinus @ 2023-02-28 17:55 ` Thomas Koch 2023-03-01 14:10 ` João Távora 2023-03-02 10:40 ` Michael Albinus 2023-02-28 19:37 ` João Távora 1 sibling, 2 replies; 98+ messages in thread From: Thomas Koch @ 2023-02-28 17:55 UTC (permalink / raw) To: Michael Albinus, João Távora; +Cc: 61350 Thank you both for being so constructively engaged in the bug I opened. I'll give Joao's patch a try tomorrow. (I'm too tired now.) However I don't see ControlMaster as the root cause at the moment. My current working theory is: 1. There is some buffer in SSH (or TCP) that gets filled by the language server sending data to eglot. 2. Tramp sends a command over SSH. 3. Tramp sets the JUST-THIS-ONE option of accept-process-output to t since https://debbugs.gnu.org/12145. 4. The response to 2. can not arrive due to the buffer being filled in 1. Tramp blocks the emacs main thread. I tested my theory by deleting (and thus disabling) the JUST-THIS-ONE argument in all calls to accept-process-output in tramp.el and tramp-sh.el. Eglot did not freeze anymore in two tests with two different systems (but the same jdtls binary on the same Debian version). In one test however I saw this in the message buffer: [jsonrpc] (warning) Invalid JSON: (control character 0xd <callback> 1 381 381) {"jsonrpc":"2.0","method":"textDocument/publishDiagnostics","params":{"uri":"file:///home/thk/git/yacy_search_server/source/net/yacy/htroot/ConfigSearchBox.java","diagnostics":[{"range":{"start":{"line":35,"character":86},"end":{"line":35,"character":94}},"severity":3,"code":"1102","source":"Java","message":"At least one of the problems in category \u0027unuseContent-Length: 798 {"jsonrpc":"2.0","method":"textDocument/publ I already sent the above to Michael in an out-of-band thread (thanks for your patience!). I have a vague feeling, that Tramp could be improved with a work queue such that requests to tramp from notification or timer threads get blocked while another tramp command is still waiting for a reply. Additionally I understand that Joao has an idea to use markers controlled by Tramp. - I'm sorry that I can not (yet) contribute with putting both ideas into code. There are two statements in this bug thread that I don't yet understand and that might be worth more elaboration. I volunteer as a debugging rubber duck[1]. However if you both understand each other then never mind: [1] https://en.wikipedia.org/wiki/Rubber_duck_debugging Joao: "markers in the process buffer is what is used commonly" I understand that tramp sends only one command at a time per connection. Otherwise the reentrant error is thrown. The command result gets written to a connection-specific buffer. The output is parsed from point-min and the buffer content deleted after parsing and before sending the next command. So what should there be improved? Jsonrpc needs its own markers because N messages can arrive at any given time and the buffer can not be deleted after each message. Michael: "If another process has consumed the output, even if it is pushed into the correct buffer, Tramp doesn't know." What exact scenario should that be? Emacs writes the output (if there's no filter) in the correct buffer for the process. Tramp might call accept-process-output unnecessarily because the output is already in the buffer due to a accept-process-output call of some other code. But in this case Tramp will find the output there eventually, maybe after having waited until a timeout. This could be improved by checking for already present output before calling accept-process-output? ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 17:55 ` Thomas Koch @ 2023-03-01 14:10 ` João Távora 2023-03-01 16:19 ` João Távora 2023-03-02 11:01 ` Michael Albinus 2023-03-02 10:40 ` Michael Albinus 1 sibling, 2 replies; 98+ messages in thread From: João Távora @ 2023-03-01 14:10 UTC (permalink / raw) To: Thomas Koch; +Cc: Michael Albinus, 61350 Thomas Koch <thomas@koch.ro> writes: > Thank you both for being so constructively engaged in the bug I opened. > > I'll give Joao's patch a try tomorrow. (I'm too tired now.) However I don't see ControlMaster as the root cause at the moment. > > My current working theory is: > > 1. There is some buffer in SSH (or TCP) that gets filled by the language server sending data to eglot. > 2. Tramp sends a command over SSH. > 3. Tramp sets the JUST-THIS-ONE option of accept-process-output to t since https://debbugs.gnu.org/12145. > 4. The response to 2. can not arrive due to the buffer being filled in 1. Tramp blocks the emacs main thread. > I tested my theory by deleting (and thus disabling) the JUST-THIS-ONE > argument in all calls to accept-process-output in tramp.el and > tramp-sh.el. Eglot did not freeze anymore I don't think your observation necessarily validates your theory. If in '1.' you mean _an Emacs buffer_, the buffer in question is populated by Tramp's process filter (the default process filter, that is). Then accept-process-output with non-nil JUST-THIS-ONE would not prevent that buffer from getting the all the output, becasue the "THIS-ONE" is the process. But JUST-THIS-ONE _is_ relevant when there is more than one process. Here, there is. There's one process, the jsonrpc.el process, henceforth 'jprocess', and the Tramp process, henceforth 'tprocess'. jprocess receives only JSONRPC data from the LSP server. It "thinks" it is talking directly to a JSONRPC server, but in Tramp scenarios it is being fed data from tprocess, which is the process connected to the remote host. In tprocess, other things, such as shell interactions are going on. Michael can probably confirm, correct or deny this. When one (accept-process-output tprocess nil nil 'JUST-THIS-ONE=t) one must be absolutely sure that tprocess is going to send _something_ "soon". If it doesn't, we'll hang indefinitely (until the process dies or the user quits) That's what has been confirmed through a backtrace. It's a particular accept-process-output call in tramp-wait-for-regexp that hangs, understandibly so. The need to ensure safety of that call is why there is a prior for any 'found' messages in 'tramp-wait-for-regexp'. Only then is the "risky" 'accept-process-output' attempted. (defun tramp-wait-for-regexp (proc timeout regexp) ... (let ((found (tramp-check-for-regexp proc regexp))) (cond (...) (t (while (not found) (tramp-accept-process-output proc) ;; WE "HANG" HERE SOMETIMES (unless (process-live-p proc) (tramp-error-with-buffer nil proc 'file-error "Process has died")) (setq found (tramp-check-for-regexp proc regexp))))) 'found' relies on searching the contents of what is already in tprocess's buffer, which tprocess's filter dumped there. So far so good. Now, 'tramp-check-for-regexp' uses a somewhat non-standard technique of searching for messages: it searches them from the back, from the end of the tprocess's buffer. I don't know what motivated this, but I find it odd. I find one of its callees, tramp-search-regexp, particularly suspicious: (defun tramp-search-regexp (regexp) "Search for REGEXP backwards, starting at point-max."" (goto-char (point-max)) ;; We restrict ourselves to the last 256 characters. There were ;; reports of a shell command "git ls-files -zco --exclude-standard" ;; with 85k files involved, which has blocked Tramp forever. (re-search-backward regexp (max (point-min) (- (point) 256)) 'noerror)) See the comment there? Only 256 characters back are inspected. So, finally, here's my conjecture: 1. Tramp goes into 'tramp-wait-for-regexp'. tprocess's buffer already the message that 'found' is supposed to return, but it also has a lot more stuff, say a lot of JSONRPC data from the LSP server that also came into that tprocess buffer and is awaiting to be delivered to jprocess. 2. This data is for piping into jprocess, where the JSONRPC message will be decoded, but it will probably never arrive at its destination. 3. 'found' will be nil in tramp-wait-for-regexp, because of the tramp-search-regexp limitation. 4. tramp-wait-for-regexp will issue the "risky" accept-process-output call. 5. there is no more data that accept-process-output wants to put in the buffer, because the LSP server is fine for the moment. 6. Emacs hang Just a conjecture. More comments to your comments. > Jsonrpc needs its own markers because N messages can arrive at any > given time and the buffer can not be deleted after each message. Jsonrpc operates in jprocess's buffer, which is separate. > I have a vague feeling, that Tramp could be improved with a work queue > such that requests to tramp from notification or timer threads get > blocked while another tramp command is still waiting for a > reply. There are no (usable) threads in Emacs. Timers are events, and so are runs of each processe's process filter. Those two are what creates asynchronicity and the emulation of simultaneity in Emacs. When jprocess's filter sees a whole JSONRPC message, it calls the message handler. > Joao: "markers in the process buffer is what is used commonly" Markers are positions in the buffer that maintain relative positions if you add text before them. Point is a marker, and so is the process-mark, which marks where process last left input. With that concept in place, it's usually easi(er) to code processing of messages. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-01 14:10 ` João Távora @ 2023-03-01 16:19 ` João Távora 2023-03-02 11:01 ` Michael Albinus 1 sibling, 0 replies; 98+ messages in thread From: João Távora @ 2023-03-01 16:19 UTC (permalink / raw) To: Thomas Koch; +Cc: Michael Albinus, 61350 [-- Attachment #1: Type: text/plain, Size: 622 bytes --] On Wed, Mar 1, 2023, 14:08 João Távora <joaotavora@gmail.com> wrote: > in Tramp scenarios it is being > fed data from tprocess, which is the process connected to the remote > host. In tprocess, other things, such as shell interactions are going > on. > > Michael can probably confirm, correct or deny this. > In the mean time, someone reached out to me to point out this is the wrong model for how things work in this scenario, particularly in the relation between jprocess and tprocess. That may very well be. I'm very sure those two processes exist, but they don't share data like that. João > [-- Attachment #2: Type: text/html, Size: 1158 bytes --] ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-01 14:10 ` João Távora 2023-03-01 16:19 ` João Távora @ 2023-03-02 11:01 ` Michael Albinus 2023-03-02 11:22 ` João Távora 1 sibling, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-02 11:01 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, > But JUST-THIS-ONE _is_ relevant when there is more than one process. > Here, there is. There's one process, the jsonrpc.el process, henceforth > 'jprocess', and the Tramp process, henceforth 'tprocess'. jprocess > receives only JSONRPC data from the LSP server. It "thinks" it is > talking directly to a JSONRPC server, but in Tramp scenarios it is being > fed data from tprocess, which is the process connected to the remote > host. In tprocess, other things, such as shell interactions are going > on. > > Michael can probably confirm, correct or deny this. More or less correct. But I still can't say which process gets output when, because I cannot debug accept-process-output (it's a C function). And running Emacs under gdb changes timings, which is important I believe. > When one (accept-process-output tprocess nil nil 'JUST-THIS-ONE=t) one > must be absolutely sure that tprocess is going to send _something_ > "soon". If it doesn't, we'll hang indefinitely (until the process dies > or the user quits) Yes. But Tramp calls accept-process-output only, if it has send a command to the remote shell, and it expects something to be returned. At least the shell prompt. During my tests I have also changed this to --8<---------------cut here---------------start------------->8--- (while (accept-process-output tprocess 0 nil 'JUST-THIS-ONE)) --8<---------------cut here---------------end--------------->8--- but it didn't help either. > That's what has been confirmed through a backtrace. It's a particular > accept-process-output call in tramp-wait-for-regexp that hangs, > understandibly so. Yes. > Now, 'tramp-check-for-regexp' uses a somewhat non-standard technique of > searching for messages: it searches them from the back, from the end of > the tprocess's buffer. I don't know what motivated this, but I find it > odd. I find one of its callees, tramp-search-regexp, particularly > suspicious: > > (defun tramp-search-regexp (regexp) > "Search for REGEXP backwards, starting at point-max."" > (goto-char (point-max)) > ;; We restrict ourselves to the last 256 characters. There were > ;; reports of a shell command "git ls-files -zco --exclude-standard" > ;; with 85k files involved, which has blocked Tramp forever. > (re-search-backward regexp (max (point-min) (- (point) 256)) 'noerror)) > > See the comment there? Only 256 characters back are inspected. Yes. But the regexp it searches for is the final shell prompt. Something like "///4b3b7d4fa561141e84c94a1cf25e8575#$", which is shorter than 256 bytes for sure. > So, finally, here's my conjecture: > > 1. Tramp goes into 'tramp-wait-for-regexp'. tprocess's buffer already > the message that 'found' is supposed to return, but it also has a lot > more stuff, say a lot of JSONRPC data from the LSP server that also > came into that tprocess buffer and is awaiting to be delivered to > jprocess. > > 2. This data is for piping into jprocess, where the JSONRPC message will > be decoded, but it will probably never arrive at its destination. > > 3. 'found' will be nil in tramp-wait-for-regexp, because of the > tramp-search-regexp limitation. > > 4. tramp-wait-for-regexp will issue the "risky" accept-process-output > call. > > 5. there is no more data that accept-process-output wants to put in the > buffer, because the LSP server is fine for the moment. > > 6. Emacs hang > > Just a conjecture. Yes, this is more or less the scenario. But I still don't understand why not all data are delivered through the socket ssh is using. Could it be there is a limitation, how much data could be buffered by ssh? >> I have a vague feeling, that Tramp could be improved with a work queue >> such that requests to tramp from notification or timer threads get >> blocked while another tramp command is still waiting for a >> reply. > > There are no (usable) threads in Emacs. There are. I made Tramp using threads, and it worked fine, when no interactive dialogue inside a thread happened. > Timers are events, and so are runs of each processe's process filter. > Those two are what creates asynchronicity and the emulation of > simultaneity in Emacs. When jprocess's filter sees a whole JSONRPC > message, it calls the message handler. Timers and process filters are the cause of the "Forbidden reentrant call in Tramp" errors. Wwe must do anything, solving this. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-02 11:01 ` Michael Albinus @ 2023-03-02 11:22 ` João Távora 2023-03-02 11:50 ` Michael Albinus 2023-03-05 11:21 ` Michael Albinus 0 siblings, 2 replies; 98+ messages in thread From: João Távora @ 2023-03-02 11:22 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: >> Michael can probably confirm, correct or deny this. > More or less correct. Actually, I don't think it is correct, at least not the way I meant it: JSONRPC data never gets into the tprocess's buffer. Experiments also seem to disprove it. > But I still can't say which process gets output > when, because I cannot debug accept-process-output (it's a C > function). And running Emacs under gdb changes timings, which is > important I believe. Yes, we must probably write some gdb scripts. Eli is expert at that, but I've done some it myself. >> When one (accept-process-output tprocess nil nil 'JUST-THIS-ONE=t) one >> must be absolutely sure that tprocess is going to send _something_ >> "soon". If it doesn't, we'll hang indefinitely (until the process dies >> or the user quits) > > Yes. But Tramp calls accept-process-output only, if it has send a > command to the remote shell, and it expects something to be returned. At > least the shell prompt. Yes. Tramp is doing the right thing. It really expects a response to come. And more often than not, it does. But sometimes it doesn't, and that's when we hang. >> See the comment there? Only 256 characters back are inspected. > > Yes. But the regexp it searches for is the final shell prompt. Something > like "///4b3b7d4fa561141e84c94a1cf25e8575#$", which is shorter than 256 > bytes for sure. OK, but why can't one search for it from where you last left parsing, (i.e. point) up to process-mark? Anyway, I increased that value significantly and it didn't make a difference, so this is probably a red herring. >> So, finally, here's my conjecture: >> >> 1. Tramp goes into 'tramp-wait-for-regexp'. tprocess's buffer already >> the message that 'found' is supposed to return, but it also has a lot >> more stuff, say a lot of JSONRPC data from the LSP server that also >> came into that tprocess buffer and is awaiting to be delivered to >> jprocess. >> >> 2. This data is for piping into jprocess, where the JSONRPC message will >> be decoded, but it will probably never arrive at its destination. >> >> 3. 'found' will be nil in tramp-wait-for-regexp, because of the >> tramp-search-regexp limitation. >> >> 4. tramp-wait-for-regexp will issue the "risky" accept-process-output >> call. >> >> 5. there is no more data that accept-process-output wants to put in the >> buffer, because the LSP server is fine for the moment. >> >> 6. Emacs hang >> >> Just a conjecture. > > Yes, this is more or less the scenario. But I still don't understand why > not all data are delivered through the socket ssh is using. Could it be > there is a limitation, how much data could be buffered by ssh? After testing with a enhanced tramp-backtrace that prints out the contents of every process buffer, I don't think my conjecture is correct. diff --git a/lisp/net/tramp.el b/lisp/net/tramp.el index df4b7dfca2c..f24a3b51074 100644 --- a/lisp/net/tramp.el +++ b/lisp/net/tramp.el @@ -2212,12 +2212,21 @@ tramp-backtrace If VEC-OR-PROC is nil, the buffer *debug tramp* is used. FORCE forces the backtrace even if `tramp-verbose' is less than 10. This function is meant for debugging purposes." - (let ((tramp-verbose (if force 10 tramp-verbose))) + (let ((tramp-verbose (if force 10 tramp-verbose)) + (bt (lambda () + (backtrace) + (dolist (p (process-list)) + (let* ((buf (process-buffer p)) + (name (and buf (buffer-name buf)))) + (when buf + (princ (format "\n--8<---- begin contents of `%s' ------>8---\n" name)) + (princ (with-current-buffer buf (buffer-string))) + (princ (format "\n--8<---- end contents of `%s' ------>8---\n" name)))))))) (when (>= tramp-verbose 10) (if vec-or-proc (tramp-message - vec-or-proc 10 "\n%s" (with-output-to-string (backtrace))) - (with-output-to-temp-buffer "*debug tramp*" (backtrace)))))) + vec-or-proc 10 "\n%s" (with-output-to-string (funcall bt))) + (with-output-to-temp-buffer "*debug tramp*" (funcall bt)))))) When you press C-g after the hang occurs, the backtrace is correct but the tprocess buffer is simply empty, according to the new logs. IOW the response Tramp was waiting for never arrived. >> There are no (usable) threads in Emacs. > There are. I made Tramp using threads, and it worked fine, when no > interactive dialogue inside a thread happened. Right. There are so-called green threads, which could fit input-waiting situations like this one, not for achieving true simultaneity. In my opinion, they don't present any advantage over the evented model, which we understand much better (wait... except we don't :-) ) I don't think we should do that until we understand what is happening. >> Timers are events, and so are runs of each processe's process filter. >> Those two are what creates asynchronicity and the emulation of >> simultaneity in Emacs. When jprocess's filter sees a whole JSONRPC >> message, it calls the message handler. > > Timers and process filters are the cause of the "Forbidden reentrant > call in Tramp" errors. I've had problems with reentrant calls to process filters before, and I solved them with a timer. Not sure what that error is. > Wwe must do anything, solving this. I don't understand what you mean here. João ^ permalink raw reply related [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-02 11:22 ` João Távora @ 2023-03-02 11:50 ` Michael Albinus 2023-03-05 11:21 ` Michael Albinus 1 sibling, 0 replies; 98+ messages in thread From: Michael Albinus @ 2023-03-02 11:50 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >>> Michael can probably confirm, correct or deny this. >> More or less correct. > > Actually, I don't think it is correct, at least not the way I meant it: > JSONRPC data never gets into the tprocess's buffer. Experiments also > seem to disprove it. Hmm, right. *When* output arrives, it is pushed into the right buffer. But sometimes, the expected output doesn't arrive. >> But I still can't say which process gets output >> when, because I cannot debug accept-process-output (it's a C >> function). And running Emacs under gdb changes timings, which is >> important I believe. > > Yes, we must probably write some gdb scripts. Eli is expert at that, > but I've done some it myself. If you like, pls. I'm not so fluent with process handling in Emacs C core. > When you press C-g after the hang occurs, the backtrace is correct but > the tprocess buffer is simply empty, according to the new logs. IOW the > response Tramp was waiting for never arrived. Result is random. Sometimes the buffer is empty, sometimes the buffer contains the beginning of the output. The point is, that not everything is contained, and Tramp waits for the final shell prompt. >> Wwe must do anything, solving this. > > I don't understand what you mean here. Fixing the "Forbidden reentrant call ..." error now and forever. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-02 11:22 ` João Távora 2023-03-02 11:50 ` Michael Albinus @ 2023-03-05 11:21 ` Michael Albinus 2023-03-05 11:45 ` Thomas Koch ` (2 more replies) 1 sibling, 3 replies; 98+ messages in thread From: Michael Albinus @ 2023-03-05 11:21 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >> Yes. But Tramp calls accept-process-output only, if it has send a >> command to the remote shell, and it expects something to be returned. At >> least the shell prompt. > > Yes. Tramp is doing the right thing. It really expects a response to > come. And more often than not, it does. But sometimes it doesn't, and > that's when we hang. After digging further, I believe I understand now why it hangs. We have the following scenario: - Both Eglot and Tramp use the same ssh connection with enabled ControlMaster. - Eglot gets JSON output from the remote LSP server. In jsonrpc--process-filter, this output is handled. It includes a call to file-truename, which triggers Tramp to send a request in its own process to the remote side. - The remote side returns the answer for Tramp (shell output). However, the ssh socket is still full of the jsonrpc process output, which waits to be handled. - So the socket is blocked. The jsonrpc output cannot be read, because jsonrpc--process-filter waits for the result of file-truename. And the Tramp process output cannot be handled, because it is stuck in the socket after the jsonrpc output. The proper solution is indeed to have two connections, and to refuse use of ControlMaster. Surprisingly, this is not new to Tramp :-) But I've simply forgotten the case ... See tramp-integration.el. There is a comment for bug#45518, a similar blocking in compile.el. And the solution is indeed to disable ssh ControlMaster. Tramp hooks into compilation-mode-hook, and sets tramp-use-ssh-controlmaster-options buffer-local to nil. I'd like to apply the same solution for eglot.el. Unfortunately, there's no hook Tramp could use. I've played with eglot-server-initialized-hook, but this is applied too late. So would you mind to add a hook to Eglot, which runs before calling make-process, but in the proper process buffer? When you've added it, I'll let Tramp hook into, and you don't need any longer the Tramp specific code in eglot.el. In parallel, I'll extend tramp-use-ssh-controlmaster-options to accept a further value 'suppress', which overrides possible settings in ~/.ssh/config. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-05 11:21 ` Michael Albinus @ 2023-03-05 11:45 ` Thomas Koch 2023-03-05 12:23 ` Michael Albinus 2023-03-06 12:42 ` Michael Albinus 2023-03-06 13:42 ` João Távora 2 siblings, 1 reply; 98+ messages in thread From: Thomas Koch @ 2023-03-05 11:45 UTC (permalink / raw) To: Michael Albinus, João Távora; +Cc: 61350 Thanks Michael for digging! This is exactly what I also think happens. However my conclusion is different. I consider the root-cause to be the use of JUST-THIS-ONE in tramps call to accept-process-output. Please check my comment to bug https://debbugs.gnu.org/cgi/bugreport.cgi?bug=12145 Applying a workaround (adding extra code to surpress ControlMaster with a newly introduced hook) would increase the complexity and thus brittleness. I believe the right thing to do is remove JUST-THIS-ONE in Tramp and fix find-name-dired and any other place that's broken. As a practical roadmap you could add an option to Tramp to disable JUST-THIS-ONE and recommend its use. Later the options default could be toggled. Eventually the option can go away. Eglot could emit a warning if it sees the above tramp option not being used. (... could lead to Emacs hanging. Are you sure to continue (y/n)?) > Michael Albinus <michael.albinus@gmx.de> hat am 05.03.2023 13:21 EET geschrieben: > > > João Távora <joaotavora@gmail.com> writes: > > Hi João, > > >> Yes. But Tramp calls accept-process-output only, if it has send a > >> command to the remote shell, and it expects something to be returned. At > >> least the shell prompt. > > > > Yes. Tramp is doing the right thing. It really expects a response to > > come. And more often than not, it does. But sometimes it doesn't, and > > that's when we hang. > > After digging further, I believe I understand now why it hangs. We have > the following scenario: > > - Both Eglot and Tramp use the same ssh connection with enabled ControlMaster. > > - Eglot gets JSON output from the remote LSP server. In > jsonrpc--process-filter, this output is handled. It includes a call to > file-truename, which triggers Tramp to send a request in its own > process to the remote side. > > - The remote side returns the answer for Tramp (shell output). However, > the ssh socket is still full of the jsonrpc process output, which > waits to be handled. > > - So the socket is blocked. The jsonrpc output cannot be read, because > jsonrpc--process-filter waits for the result of file-truename. And the > Tramp process output cannot be handled, because it is stuck in the > socket after the jsonrpc output. > > The proper solution is indeed to have two connections, and to refuse use > of ControlMaster. > > Surprisingly, this is not new to Tramp :-) But I've simply forgotten the > case ... > > See tramp-integration.el. There is a comment for bug#45518, a similar > blocking in compile.el. And the solution is indeed to disable ssh > ControlMaster. Tramp hooks into compilation-mode-hook, and sets > tramp-use-ssh-controlmaster-options buffer-local to nil. > > I'd like to apply the same solution for eglot.el. Unfortunately, there's > no hook Tramp could use. I've played with eglot-server-initialized-hook, > but this is applied too late. > > So would you mind to add a hook to Eglot, which runs before calling > make-process, but in the proper process buffer? When you've added it, > I'll let Tramp hook into, and you don't need any longer the Tramp > specific code in eglot.el. > > In parallel, I'll extend tramp-use-ssh-controlmaster-options to accept a > further value 'suppress', which overrides possible settings in ~/.ssh/config. > > > João > > Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-05 11:45 ` Thomas Koch @ 2023-03-05 12:23 ` Michael Albinus 2023-03-07 12:49 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-05 12:23 UTC (permalink / raw) To: Thomas Koch; +Cc: João Távora, 61350 Thomas Koch <thomas@koch.ro> writes: Hi Thomas, > Thanks Michael for digging! This is exactly what I also think happens. > > However my conclusion is different. I consider the root-cause to be the use of JUST-THIS-ONE in tramps call to accept-process-output. Please check my comment to bug > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=12145 I've seen this. And yes, we shall fix the problems described there. > Applying a workaround (adding extra code to surpress ControlMaster > with a newly introduced hook) would increase the complexity and thus > brittleness. I believe the right thing to do is remove JUST-THIS-ONE > in Tramp and fix find-name-dired and any other place that's broken. I won't change Tramp this way in Emacs 29. It is in feature freeze, closed to pretest. I cannot risk to destabilize it. And as we see with the find-name-dired case, there might be collateral damages if we change Tramp this way. And I don't believe, that adding a hook in Eglot is worse than what we have now, where Eglot manipulates Tramp data. Whenever we need to change Tramp in this area, it wouldn't apply in Egloot. A hook is much better suited, and we use the very same technique already with compile.el. > As a practical roadmap you could add an option to Tramp to disable > JUST-THIS-ONE and recommend its use. Later the options default could > be toggled. Eventually the option can go away. Might be applicable in the master branch. But first we need much more checks that it doesn't break something else. I'm always uncomfortable to change Tramp such a way that other packages could be broken. Letting Tramp accept process output for all existing processes doesn't seem to be the right thing to me, although I admit that it seems to fix the specific problem we're discussing here. > Eglot could emit a warning if it sees the above tramp option not being > used. (... could lead to Emacs hanging. Are you sure to continue > (y/n)?) No check in Eglot. A hint in the manual would be sufficient I believe. Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-05 12:23 ` Michael Albinus @ 2023-03-07 12:49 ` Michael Albinus 2023-03-07 13:04 ` Thomas Koch 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-07 12:49 UTC (permalink / raw) To: Thomas Koch; +Cc: João Távora, 61350 [-- Attachment #1: Type: text/plain, Size: 2033 bytes --] Michael Albinus <michael.albinus@gmx.de> writes: Hi Thomas & João, before going offline just a short status update. >> However my conclusion is different. I consider the root-cause to be >> the use of JUST-THIS-ONE in tramps call to >> accept-process-output. Please check my comment to bug >> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=12145 > > I've seen this. And yes, we shall fix the problems described there. > >> As a practical roadmap you could add an option to Tramp to disable >> JUST-THIS-ONE and recommend its use. Later the options default could >> be toggled. Eventually the option can go away. > > Might be applicable in the master branch. But first we need much more > checks that it doesn't break something else. > > I'm always uncomfortable to change Tramp such a way that other packages > could be broken. Letting Tramp accept process output for all existing > processes doesn't seem to be the right thing to me, although I admit > that it seems to fix the specific problem we're discussing here. Thinking about, I came to a less invasive idea: Just the processes which possibly use the same socket for the ssh connection should accept their output in time. This avoids to change the behavior of other processes, which are not related to a given Tramp connection. And it makes me feel much better :-) The small patch below, on top of Tramp 2.6.0.2, seems to fix the problem. I've also deactivated (in Emacs 29) João's Tramp fix, and I've applied the recipe given by Thomas 10 times in a row, always starting with "emacs -Q ...". 10 times success. Could you please check how it works in your environment? You'll have some days until I'll be back. If it also works for you, I will add it to Tramp 2.6.0.3, and we could close this bug. For completeness, tramp-test44-asynchronous-requests now fails in tramp-tests.el. But this test is already flaky (not passing successfully every time), so I'll investigate later what's up with this. Best regards, Michael. [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: Type: text/x-patch, Size: 1037 bytes --] diff --git a/lisp/tramp.el b/lisp/tramp.el index 24b7c45d..50945ccb 100644 --- a/lisp/tramp.el +++ b/lisp/tramp.el @@ -5746,6 +5746,17 @@ Mostly useful to protect BODY from being interrupted by timers." This is needed in order to hide `last-coding-system-used', which is set for process communication also. If the user quits via `C-g', it is propagated up to `tramp-file-name-handler'." + ;; There could be other processes which use the same socket for + ;; communication. This could block the output for the current + ;; process. Read such output first. (Bug#61350) + ;; If `timeout' is nil, `accept-process-output' for `proc' could + ;; block another such a process. So we set `timeout' to 0 then. + (dolist (p (delq proc (process-list))) + (when (tramp-file-name-equal-p + (process-get proc 'vector) (process-get p 'vector)) + (setq timeout (or timeout 0)) + (accept-process-output p 0 nil t))) + (with-current-buffer (process-buffer proc) (let ((inhibit-read-only t) last-coding-system-used ^ permalink raw reply related [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-07 12:49 ` Michael Albinus @ 2023-03-07 13:04 ` Thomas Koch 2023-03-07 13:33 ` João Távora 2023-03-07 13:47 ` Michael Albinus 0 siblings, 2 replies; 98+ messages in thread From: Thomas Koch @ 2023-03-07 13:04 UTC (permalink / raw) To: Michael Albinus; +Cc: João Távora, 61350 Thanks Michael! What is the advantage of this patch over just removing the JUST-THIS-ONE argument? In both cases tramp is triggering accept-process-output for processes it does not own. However with your patch there is a potential timing issue: blocking process output could still arrive between your new loop and tramps call to accept-process-output with JUST-THIS-ONE. I'd think that there might be room for an essay "JUST-THIS-ONE considered evil"? > Michael Albinus <michael.albinus@gmx.de> hat am 07.03.2023 14:49 EET geschrieben: > > > Michael Albinus <michael.albinus@gmx.de> writes: > > Hi Thomas & João, > > before going offline just a short status update. > > >> However my conclusion is different. I consider the root-cause to be > >> the use of JUST-THIS-ONE in tramps call to > >> accept-process-output. Please check my comment to bug > >> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=12145 > > > > I've seen this. And yes, we shall fix the problems described there. > > > >> As a practical roadmap you could add an option to Tramp to disable > >> JUST-THIS-ONE and recommend its use. Later the options default could > >> be toggled. Eventually the option can go away. > > > > Might be applicable in the master branch. But first we need much more > > checks that it doesn't break something else. > > > > I'm always uncomfortable to change Tramp such a way that other packages > > could be broken. Letting Tramp accept process output for all existing > > processes doesn't seem to be the right thing to me, although I admit > > that it seems to fix the specific problem we're discussing here. > > Thinking about, I came to a less invasive idea: Just the processes which > possibly use the same socket for the ssh connection should accept their > output in time. This avoids to change the behavior of other processes, > which are not related to a given Tramp connection. And it makes me feel > much better :-) > > The small patch below, on top of Tramp 2.6.0.2, seems to fix the > problem. I've also deactivated (in Emacs 29) João's Tramp fix, and I've > applied the recipe given by Thomas 10 times in a row, always starting > with "emacs -Q ...". 10 times success. > > Could you please check how it works in your environment? You'll have > some days until I'll be back. If it also works for you, I will add it to > Tramp 2.6.0.3, and we could close this bug. > > For completeness, tramp-test44-asynchronous-requests now fails in > tramp-tests.el. But this test is already flaky (not passing successfully > every time), so I'll investigate later what's up with this. > > Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-07 13:04 ` Thomas Koch @ 2023-03-07 13:33 ` João Távora 2023-03-07 13:52 ` Michael Albinus 2023-03-07 13:47 ` Michael Albinus 1 sibling, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-07 13:33 UTC (permalink / raw) To: Thomas Koch; +Cc: Michael Albinus, 61350 On Tue, Mar 7, 2023 at 1:04 PM Thomas Koch <thomas@koch.ro> wrote: > > Thanks Michael! What is the advantage of this patch over just removing the JUST-THIS-ONE argument? > In both cases tramp is triggering accept-process-output for processes it does not own. Yes, I also ask myself that question. Though here, Tramp has "shared" ownership of them, sort of, so it's more conservative when compared to the current mechanics. Michael, your patch goes in the right direction but AFAIR a single accept-process-output is not guaranteed to bring into the process filter all the buffered data. In "40.9.4 Accepting Output from Processes": [In accept-process-output, ] If PROCESS is non-‘nil’ then this function does not return until some output has [^^^^^^^^^^^^^^^^^] been received from PROCESS or PROCESS has closed the connection. Note "some", not "all". So while less common, I think the hang can still happen. So maybe you meant: (while (accept-process-output p 0 nil t)) as suggested in that section? João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-07 13:33 ` João Távora @ 2023-03-07 13:52 ` Michael Albinus 2023-03-07 14:03 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-07 13:52 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >> Thanks Michael! What is the advantage of this patch over just removing the JUST-THIS-ONE argument? >> In both cases tramp is triggering accept-process-output for processes it does not own. > > Yes, I also ask myself that question. Though here, Tramp > has "shared" ownership of them, sort of, so it's more > conservative when compared to the current mechanics. Indeed. > Michael, your patch goes in the right direction but AFAIR > a single accept-process-output is not guaranteed to bring > into the process filter all the buffered data. > > In "40.9.4 Accepting Output from Processes": > > [In accept-process-output, ] If PROCESS is > non-‘nil’ then this function does not return until some output has > [^^^^^^^^^^^^^^^^^] > been received from PROCESS or PROCESS has closed the connection. > > > Note "some", not "all". So while less common, I think the hang > can still happen. > > So maybe you meant: > > (while (accept-process-output p 0 nil t)) > > as suggested in that section? That was my first idea as well. But tramp-accept-process-output itself is called in a loop, so there's no difference in practice[1]. Pls give my patch testing, in order to see whether Tramp still blocks. > João Best regards, Michael. [1]: In theory, there's no difference between theory and practice. In practice, there is. :-) ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-07 13:52 ` Michael Albinus @ 2023-03-07 14:03 ` João Távora 2023-03-07 14:31 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-07 14:03 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 On Tue, Mar 7, 2023 at 1:52 PM Michael Albinus <michael.albinus@gmx.de> wrote: > > [In accept-process-output, ] If PROCESS is > > non-‘nil’ then this function does not return until some output has > > [^^^^^^^^^^^^^^^^^] > > been received from PROCESS or PROCESS has closed the connection. > > > > > > Note "some", not "all". So while less common, I think the hang > > can still happen. > > > > So maybe you meant: > > > > (while (accept-process-output p 0 nil t)) > > > > as suggested in that section? > > That was my first idea as well. But tramp-accept-process-output itself is > called in a loop, so there's no difference in practice[1]. Hmmm, I still don't think they are equivalent. In tramp-a-p-o with TIMEOUT set to nil is where it normally hangs, right? Well then, if it does hang it will _not_ be called in a loop, by definition. So the prior "hang removal" a-p-o call that you added must _still_ be made in a loop itself to ensure that the outher existing inner call: (accept-process-output tprocess nil nil t) will not block due to the conditions you described. Regardless, if this were my code. I'd put a big fat FIXME there explaining that I want to remove the JUST-THIS-ONE, but I'm afraid to :-) > [1]: In theory, there's no difference between theory and practice. In > practice, there is. :-) hehe :-) > Pls give my patch testing, in order to see whether Tramp still blocks. I will do that later. But 10 out of 10 successes in your testing is a pretty solid indication that this is on the right track. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-07 14:03 ` João Távora @ 2023-03-07 14:31 ` Michael Albinus 2023-03-11 9:00 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-07 14:31 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >> > So maybe you meant: >> > >> > (while (accept-process-output p 0 nil t)) >> > >> > as suggested in that section? >> >> That was my first idea as well. But tramp-accept-process-output itself is >> called in a loop, so there's no difference in practice[1]. > > Hmmm, I still don't think they are equivalent. In tramp-a-p-o with > TIMEOUT set to nil is where it normally hangs, right? Well then, if > it does hang it will _not_ be called in a loop, by definition. But the hang happens only when there is another process using the same socket. As soon as such a parallel process is detected, my patch does --8<---------------cut here---------------start------------->8--- (setq timeout (or timeout 0)) --8<---------------cut here---------------end--------------->8--- No blocking anymore. >> Pls give my patch testing, in order to see whether Tramp still blocks. > > I will do that later. But 10 out of 10 successes in your testing is > a pretty solid indication that this is on the right track. Thanks. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-07 14:31 ` Michael Albinus @ 2023-03-11 9:00 ` Michael Albinus 2023-03-11 10:14 ` Thomas Koch 2023-03-11 11:42 ` João Távora 0 siblings, 2 replies; 98+ messages in thread From: Michael Albinus @ 2023-03-11 9:00 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: Hi João & Thomas, I'm back home. Not fully recovered yet, but able to sit at may desk :-) >>> Pls give my patch testing, in order to see whether Tramp still blocks. >> >> I will do that later. But 10 out of 10 successes in your testing is >> a pretty solid indication that this is on the right track. No further comments, so I guess it is OK for you? I plan to install the patch next days on master, unless you oppose. >> João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-11 9:00 ` Michael Albinus @ 2023-03-11 10:14 ` Thomas Koch 2023-03-11 11:47 ` João Távora 2023-03-11 12:27 ` Michael Albinus 2023-03-11 11:42 ` João Távora 1 sibling, 2 replies; 98+ messages in thread From: Thomas Koch @ 2023-03-11 10:14 UTC (permalink / raw) To: Michael Albinus, João Távora; +Cc: 61350 Hi Michael, gute Besserung! I'm sorry, I didn't test your patch yet. I didn't have the focus time to do so. But mainly it does not look good to me. Even if our tests would indicate no problem, we couldn't possibly test for all side effects to all elisp code that works with processes. For me the conservative approach is to remove the JUST-THIS-ONE argument. I digged into the history of the JUST-THIS-ONE argument. etc/NEWS.22 says, emphasis from me: """ *** Function 'accept-process-output' has a new optional fourth arg JUST-THIS-ONE. If non-nil, only output from the specified process is handled, suspending output from other processes. If value is an integer, also inhibit running timers. THIS FEATURE IS GENERALLY NOT RECOMMENDED, but may be necessary for specific applications, such as speech synthesis. """ The argument was discussed here: https://lists.gnu.org/archive/html/emacs-devel/2004-08/msg00141.html and introduced in this commit: https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=107ed38d4bdec03002b2a23619e205722cd5b8d1 I don't even think that the original motivation for introducing JUST-THIS-ONE was valid. Unfortunately there was not much discussion about it. It was argued, that it would be hard to make a process filter function reentrant. And I think that this was an invalid root cause analysis to start with. First, the emacs manual says[1]: "Note that if any of those functions are called by the filter, the filter may be called recursively." - So you should make your filter reentrant, If I understand correctly. [1] https://www.gnu.org/software/emacs/manual/html_node/elisp/Filter-Functions.html A process filter function Second, the manual further says: "Quitting is normally inhibited within a filter function". This indicates to me, that a filter function should be "side effect free" besides putting its input somewhere (e.g. in a buffer or message queue) and trigger an event if there is enough input for further processing. This also reduces the risk, that the function could be called recursively in a damaging way. It seems to me, that there is not yet a standard way in Emacs for continuations (or event driven programming) although the Emacs Wiki refers to the emacs-deferred library: https://www.emacswiki.org/emacs/ConcurrentEmacs Because there is no such library in Emacs, people either write their own code for continuations (eglot?) or do too much work in a process filter function (speechd-el in 2004 which led to JUST-THIS-ONE). I'm sending this email now to give you a quick response but will continue to read: https://jyp.github.io/posts/elisp-cps.html Thanks, Thomas ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-11 10:14 ` Thomas Koch @ 2023-03-11 11:47 ` João Távora 2023-03-11 12:27 ` Michael Albinus 1 sibling, 0 replies; 98+ messages in thread From: João Távora @ 2023-03-11 11:47 UTC (permalink / raw) To: Thomas Koch; +Cc: Michael Albinus, 61350 Thomas Koch <thomas@koch.ro> writes: > I'm sorry, I didn't test your patch yet. I didn't have the focus time > to do so. But mainly it does not look good to me. > Even if our tests would indicate no problem, we couldn't possibly test > for all side effects to all elisp code that works with processes. For > me the conservative approach is to remove the JUST-THIS-ONE argument. I agree that JUST-THIS-ONE would be cleaner, but I disagree that it would be more conservative. I can't speak for Michael, but in my view, "conservative" is a measure of the amplitude of the change. The change that Michael and I are proposing, despite being objectively ugly when compared to removing JUST-THIS-ONE, is _less_ removed from the current Tramp mechanisms than your idea. Thus, it is safer. > I digged into the history of the JUST-THIS-ONE argument. etc/NEWS.22 says, emphasis from me: > > [...] > integer, also inhibit running timers. THIS FEATURE IS GENERALLY NOT > RECOMMENDED, but may be necessary for specific applications, such as > speech synthesis. > """ > [...] > Because there is no such library in Emacs, people either write their > own code for continuations (eglot?) or do too much work in a process > filter function (speechd-el in 2004 which led to JUST-THIS-ONE). I think this analysis is very interesting and valid. Not necessarily for fixing this particular bug. You should send it to emacs-devel@gnu.org so it becomes more visible to people who may not be following the bug tracker. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-11 10:14 ` Thomas Koch 2023-03-11 11:47 ` João Távora @ 2023-03-11 12:27 ` Michael Albinus 1 sibling, 0 replies; 98+ messages in thread From: Michael Albinus @ 2023-03-11 12:27 UTC (permalink / raw) To: Thomas Koch; +Cc: João Távora, 61350 Thomas Koch <thomas@koch.ro> writes: > Hi Michael, Hi Thomas, > I'm sorry, I didn't test your patch yet. I didn't have the focus time > to do so. No problem. Whatever we change, it won't go into the emacs-29 branch, so we have plenty of time. > But mainly it does not look good to me. > > Even if our tests would indicate no problem, we couldn't possibly test > for all side effects to all elisp code that works with processes. For > me the conservative approach is to remove the JUST-THIS-ONE argument. I believe my patch is more conservative (as João has said as well). It changes the behavior ONLY for processes created by Tramp's make-process. If we would remove the JUST-THIS-ONE argument, it would change the behavior of ANY process, even for processes out of Tramp control. > Thanks, Thomas Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-11 9:00 ` Michael Albinus 2023-03-11 10:14 ` Thomas Koch @ 2023-03-11 11:42 ` João Távora 2023-03-11 12:44 ` Michael Albinus 1 sibling, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-11 11:42 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: >>> I will do that later. But 10 out of 10 successes in your testing is >>> a pretty solid indication that this is on the right track. > > No further comments, so I guess it is OK for you? I think the patch goes in the right direction. The force resetting of 'timeout' to 0, which you highlighted earlier, is odd. + (dolist (p (delq proc (process-list))) + (when (tramp-file-name-equal-p + (process-get proc 'vector) (process-get p 'vector)) + (setq timeout (or timeout 0)) + (accept-process-output p 0 nil t))) That's because callers of 'tramp-accept-process-output' who call it with TIMEOUT=nil will now see their expectations violated depending on conditions they do not control. You fully control these callers, so you may know better, but I think this patch makes more sense, because it keeps the contract. + (dolist (p (delq proc (process-list))) + (when (tramp-file-name-equal-p + (process-get proc 'vector) (process-get p 'vector)) + (while (accept-process-output p 0 nil t)))) (with-current-buffer (process-buffer proc) (let ((inhibit-read-only t) last-coding-system-used This is even more "conservative" than your patch, because it deviates less from the current behaviour. I've tested it (first removing the eglot.el workaround) with my reproduction recipe to a 100% success rate in 6-7 experiments (whereas without it it is 0%). > I plan to install the patch next days on master, unless you oppose. I don't, but consider my alternative. João PS: And I also think that Eglot over Tramp does seem to have become faster with ControlMaster, at least in a subjective evaluation. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-11 11:42 ` João Távora @ 2023-03-11 12:44 ` Michael Albinus 2023-03-11 14:01 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-11 12:44 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, > I think the patch goes in the right direction. The force resetting of > 'timeout' to 0, which you highlighted earlier, is odd. Yes. But I've tried first w/o this setting, and the recipe from Thomas still didn't work. > + (dolist (p (delq proc (process-list))) > + (when (tramp-file-name-equal-p > + (process-get proc 'vector) (process-get p 'vector)) > + (setq timeout (or timeout 0)) > + (accept-process-output p 0 nil t))) > > That's because callers of 'tramp-accept-process-output' who call it with > TIMEOUT=nil will now see their expectations violated depending on > conditions they do not control. You fully control these callers, so you > may know better, but I think this patch makes more sense, because it > keeps the contract. In fact, I'm even not sure that a nil timeout is still needed. This is something from the very beginning of Tramp. Maybe I shall review the code whether we still need a nil timeout. Because its initial idea, wait in accept-process-output until the remote shell prompt appears, is no longer needed, because we call tramp-accept-process-output (and therefore accept-process-output) in a loop. This is something to be changed later on. A short scan over the Tramp sources shows, that there are 12 calls of tramp-accept-process-output with a TIMEOUT of 0, and only two other calls with a nil TIMEOUT. And these two calls are already in while loop, so I guess it would be also OK to give them a default TIMEOUT of 0. I will try it, but this needs lot of regression tests (I have 100+ different configurations to run tramp-tests.el). > + (dolist (p (delq proc (process-list))) > + (when (tramp-file-name-equal-p > + (process-get proc 'vector) (process-get p 'vector)) > + (while (accept-process-output p 0 nil t)))) > (with-current-buffer (process-buffer proc) > (let ((inhibit-read-only t) > last-coding-system-used > > This is even more "conservative" than your patch, because it deviates > less from the current behaviour. I've tested it (first removing the > eglot.el workaround) with my reproduction recipe to a 100% success rate > in 6-7 experiments (whereas without it it is 0%). Hmm, in my experience there were still one or two blocking tests wit Thomas' recipe, IIRC. This is because the blocking situation, data is going through the only ssh socket for both tramp-accept-process-output and jsonrpc, is not unblocked then. So I would prefer to keep this (setq timeout (or timeout 0)) for now. And I will reviw the code, whether tramp-accept-process-output still needs a TIMEOUT ARGUMENT as sketched above. >> I plan to install the patch next days on master, unless you oppose. > > I don't, but consider my alternative. I consider it by removing the TIMEOUT argument, if possible. > João > > PS: And I also think that Eglot over Tramp does seem to have become > faster with ControlMaster, at least in a subjective evaluation. Of course. That is the reason behind using it. Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-11 12:44 ` Michael Albinus @ 2023-03-11 14:01 ` João Távora 2023-03-11 14:25 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-11 14:01 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: > Hmm, in my experience there were still one or two blocking tests wit > Thomas' recipe, IIRC. That's surprising to me. What tests are you running? I'm testing with the same "Thomas recipe" AFAICT. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-11 14:01 ` João Távora @ 2023-03-11 14:25 ` Michael Albinus 2023-03-12 0:48 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-11 14:25 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >> Hmm, in my experience there were still one or two blocking tests wit >> Thomas' recipe, IIRC. > > That's surprising to me. What tests are you running? I'm testing with > the same "Thomas recipe" AFAICT. Me too. Anyway, I've removed the TIMEOUT argument from t-a-p-o (defaulting it to 0 now), and I've started the Tramp regression test campaign. It takes 2 or 3 days usually, so let's see what's the outcome. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-11 14:25 ` Michael Albinus @ 2023-03-12 0:48 ` João Távora 2023-03-12 10:22 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-12 0:48 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, monnier, 61350 Michael Albinus <michael.albinus@gmx.de> writes: > Anyway, I've removed the TIMEOUT argument from t-a-p-o (defaulting it to > 0 now), and I've started the Tramp regression test campaign. It takes 2 > or 3 days usually, so let's see what's the outcome. OK, sounds reasonable. We're definitely in a better place now that the problem is understood. After you push the patch, are you also going to bump the Tramp version? I'd like to remove the Tramp-specific workaround in Eglot, but I don't know if I can: Eglot would have to "Package-Require" a certain version of Tramp. Normally, :core packages should only require other :core things. But Tramp is not a :core GNU ELPA package. At least according to elpa.git/elpa-packages, it lives in its own repo. Wait, but Tramp is _also_ in emacs.git. Huh? So maybe it's possible? I'm confused. Stefan, can you help? João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-12 0:48 ` João Távora @ 2023-03-12 10:22 ` Michael Albinus 2023-03-14 11:01 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-12 10:22 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, monnier, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >> Anyway, I've removed the TIMEOUT argument from t-a-p-o (defaulting it to >> 0 now), and I've started the Tramp regression test campaign. It takes 2 >> or 3 days usually, so let's see what's the outcome. > > OK, sounds reasonable. We're definitely in a better place now that the > problem is understood. After you push the patch, are you also going to > bump the Tramp version? Yes, this patch is also marked for the next Tramp release on GNU ELPA. This will be Tramp 2.6.0.3, roughly scheduled for end of March. FTR, the regression tests have shown problems with older Emacsen, Emacs 26 and 27, to say. So I need to fix this somehow, before the regression tests will continue. To tell the truth, the regression tests use the upcoming 2.6.0.3 version. Once everything is OK, it will be merged into the Emacs master branch. > I'd like to remove the Tramp-specific workaround in Eglot, but I don't > know if I can: Eglot would have to "Package-Require" a certain version > of Tramp. Normally, :core packages should only require other :core > things. But Tramp is not a :core GNU ELPA package. At least according > to elpa.git/elpa-packages, it lives in its own repo. Wait, but Tramp is > _also_ in emacs.git. Huh? So maybe it's possible? I'm confused. > Stefan, can you help? I'm not Stefan, but perhaps I could clarify. The Tramp version released on GNU ELPA doesn't correspond to the Tramp in Emacs master. This is mainly because I want to release a stable version on GNU ELPA. People with a need for the bleeding edge Tramp can always use Emacs master, or pull Tramp from its own repository on savannah. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-12 10:22 ` Michael Albinus @ 2023-03-14 11:01 ` Michael Albinus 2023-03-14 15:00 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-14 11:01 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, monnier, 61350 Michael Albinus <michael.albinus@gmx.de> writes: Hi João & Thomas, >>> Anyway, I've removed the TIMEOUT argument from t-a-p-o (defaulting it to >>> 0 now), and I've started the Tramp regression test campaign. It takes 2 >>> or 3 days usually, so let's see what's the outcome. >> >> OK, sounds reasonable. We're definitely in a better place now that the >> problem is understood. After you push the patch, are you also going to >> bump the Tramp version? > > Yes, this patch is also marked for the next Tramp release on GNU > ELPA. This will be Tramp 2.6.0.3, roughly scheduled for end of March. > > FTR, the regression tests have shown problems with older Emacsen, Emacs > 26 and 27, to say. So I need to fix this somehow, before the regression > tests will continue. > > To tell the truth, the regression tests use the upcoming 2.6.0.3 > version. Once everything is OK, it will be merged into the Emacs master > branch. Regression tests are finished, and there were indeed some minor annoyances I had to fix. I have pushed everything to Emacs master branch. If you like, you could test it with eglot.el w/o the Tramp fix. As said, release of Tramp 2.6.0.3 on GNU ELPA will follow end of March. This gives us also the time to see whether there are collateral damages, reported by somebody. Until then I guess the bug might still be open. >> João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-14 11:01 ` Michael Albinus @ 2023-03-14 15:00 ` Michael Albinus 2023-03-14 15:19 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-14 15:00 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, monnier, 61350 Michael Albinus <michael.albinus@gmx.de> writes: Hi João & Thomas, > As said, release of Tramp 2.6.0.3 on GNU ELPA will follow end of > March. This gives us also the time to see whether there are collateral > damages, reported by somebody. Until then I guess the bug might still be > open. FTR, the first collateral damage is reported already on EMBA: --8<---------------cut here---------------start------------->8--- FAILED file-notify-test04-autorevert-remote "Forbidden reentrant call of Tramp" FAILED file-notify-test11-symlinks-remote ((should (file-notify--test-with-actions-check actions)) :form (file-notify--test-with-actions-check (nil)) :value nil :explanation "Received actions do not match expected actions\n(deleted stopped)\nnil") --8<---------------cut here---------------end--------------->8--- Achhhhhh :-( I will investigate. IMHO, this is the final indication that we shouldn't allow a nil JUST-THIS-ONE in accept-process-output. Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-14 15:00 ` Michael Albinus @ 2023-03-14 15:19 ` João Távora 2023-03-14 15:42 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-14 15:19 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, monnier, 61350 On Tue, Mar 14, 2023 at 3:00 PM Michael Albinus <michael.albinus@gmx.de> wrote: > Achhhhhh :-( Pech! > I will investigate. IMHO, this is the final indication that we shouldn't > allow a nil JUST-THIS-ONE in accept-process-output. I don't see this failure as an indication of that, at all!! First, you didn't do that -- put nil in JUST-THIS-ONE -- did you? At most this means that you are doing more or less the same as setting JUST-THIS-ONE to nil, but in a much more convoluted way, so that you may as well start setting JUST-THIS-ONE to nil... ...and bite the bullet, continuing this analysis. We're making progress... So, I reach a completely different conclusion. For me, this means that those auto-revert and symlinks-remote tests, whatever they are, must be investigated. It's good that you have these tests! Do they fail in all Emacsen or just some? Always/sometimes? Can I reproduce this here? After we get a consistent failure, we should ask ourselves what exactly is the reentrancy in question here. Perhaps you already have a conjecture of the chain of events that leads to it. FWIW I've recently solved a reentrancy bug in jsonrpc.el. I identified the conditions (here, i was ignoring that process-send-string when given enough input to send, will cause output acceptance, and filters to run), and fixed it. João commit 8bf4cdcf79bc3254a9169f28f68922ab83bd4e78 Author: João Távora Date: Thu Dec 15 15:26:13 2022 +0000 Avoid recursive process filters in lisp/jsonrpc.el (bug#60088) jsonrpc.el may lose JSON-RPC messages because of recursive process filters. The problem happens in jsonrpc.el's jsonrpc--process-filter. The code of the process filter didn't expect to be called recursively and fails in that case. But that can happen if the three conditions are verified. 1. the client code invoked by its jsonrpc--connection-receive inside the process filter callee immediately sends follow-up input to process within the same Lisp stack. This is a common scenario, especially during LSP initialiation sequence used by Eglot, a jsonrpc.el client. 2. that follow-up message is large enough for process-send-string to send the input in bunches (output from processes can arrive in between bunches). 3. the process happens to have already some more output ready The fix in this commit detects recursive invocations and immediately re-schedules them as non-recursive calls to the jsonrpc--process-filter (but started from timers). * lisp/jsonrpc.el (jsonrpc--process-filter): Rework. (Version): Bump to 1.0.16. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-14 15:19 ` João Távora @ 2023-03-14 15:42 ` Michael Albinus 2023-03-15 17:47 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-14 15:42 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, monnier, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >> I will investigate. IMHO, this is the final indication that we shouldn't >> allow a nil JUST-THIS-ONE in accept-process-output. > > I don't see this failure as an indication of that, at all!! > > First, you didn't do that -- put nil in JUST-THIS-ONE -- did you? > At most this means that you are doing more or less the same > as setting JUST-THIS-ONE to nil, but in a much more convoluted > way, so that you may as well start setting JUST-THIS-ONE to nil... It is an indication, that, if some processes accept output at the same time, and this output runs through process filters, anything can happen. No, I didn't set JUST-THIS-ONE to nil. I just allow related processes, belonging to the same Tramp connection, to do so. This is a case we have under our control. When JUSt_THIS-ONE is set to nil, the same could happen to any arbitrary process running randomly in concurrency to a Tramp process, and this we wouldn't have under our control. > ...and bite the bullet, continuing this analysis. We're making > progress... Sure, I'll do. Coincidently, the author of filenotify.el and filenotify-tests.el is well known to us, I'll contact him :-) > For me, this means that those auto-revert and symlinks-remote tests, > whatever they are, must be investigated. It's good that you have these > tests! Do they fail in all Emacsen or just some? Always/sometimes? > Can I reproduce this here? This happens with Emacs master (which runs on EMBA). And I could reproduce it locally, with slightly modified error messages, by --8<---------------cut here---------------start------------->8--- # make -C test filenotify-tests --8<---------------cut here---------------end--------------->8--- > After we get a consistent failure, we should ask ourselves what exactly > is the reentrancy in question here. Perhaps you already have a conjecture > of the chain of events that leads to it. Yes, I have. Contacting the author ... > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-14 15:42 ` Michael Albinus @ 2023-03-15 17:47 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-15 18:05 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-15 17:47 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, João Távora, 61350 > It is an indication, that, if some processes accept output at the same > time, and this output runs through process filters, anything can happen. Yes, this is a fundamental problem in Emacs's handling of asynchronous execution. We should really revisit this. Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-15 17:47 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-15 18:05 ` João Távora 2023-03-15 18:30 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-15 18:05 UTC (permalink / raw) To: Stefan Monnier; +Cc: Thomas Koch, Michael Albinus, 61350 [-- Attachment #1: Type: text/plain, Size: 558 bytes --] On Wed, Mar 15, 2023, 17:47 Stefan Monnier <monnier@iro.umontreal.ca> wrote: > > It is an indication, that, if some processes accept output at the same > > time, and this output runs through process filters, anything can happen. > > Yes, this is a fundamental problem in Emacs's handling of asynchronous > execution. We should really revisit this. > We could start by understanding exactly what is triggering this reentrancy. AFAIU, we don't. If have a recipe to trigger is consistently, now we should work backwards from there. João > [-- Attachment #2: Type: text/html, Size: 1105 bytes --] ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-15 18:05 ` João Távora @ 2023-03-15 18:30 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-15 19:44 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-15 18:30 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, Michael Albinus, 61350 > We could start by understanding exactly what is triggering this reentrancy. > AFAIU, we don't. If have a recipe to trigger is consistently, now we should > work backwards from there. I think we understand some of the problems well enough to think about solutions. E.g. the case where we run a process filter for process P while we haven't finished running a previous invocation of P's process filter. [ While this tends to occur in hard-to-debug corner cases, it also tends to occur when Edebugging process filters. ] Another is that `process-send-string` should be more clear about its asynchronous behavior: I'd argue that it should *never* block, but it should additionally provide a way to check when the send is completed. I suspect we also have some loose ends in timer handling, where we should provide ways to distinguish "run every N seconds", "wait N seconds between each run", and whether re-entrancy should be allowed when running "every N seconds". And I seem to remember seeing cases where upon a process's exit we run the process sentinel before the last invocation of the process filter: we should clarify the ordering of these things (where "these things" should also include things like `process-live-p` and `process-status`). I haven't bumped into such problem recently, so maybe they've been fixed, but maybe it's just because I haven't played with those tools recently (not sure what the doc says about these things either). Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-15 18:30 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-15 19:44 ` João Távora 2023-03-15 20:14 ` João Távora 2023-03-15 21:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 2 replies; 98+ messages in thread From: João Távora @ 2023-03-15 19:44 UTC (permalink / raw) To: Stefan Monnier; +Cc: Thomas Koch, Michael Albinus, 61350 Stefan Monnier <monnier@iro.umontreal.ca> writes: >> We could start by understanding exactly what is triggering this reentrancy. >> AFAIU, we don't. If have a recipe to trigger is consistently, now we should >> work backwards from there. > > I think we understand some of the problems well enough to think > about solutions. Does that "some" include _this_ problem? Let's take this one for example. make -C test filenotify-tests SELECTOR=file-notify-test04-autorevert Michael reports that this reports "Forbidden reentrant call of Tramp" on EMBA. I've confirmed it fails locally with a different error: (file-error "‘tramp_stat_file_attributes /tmp/file-notify-testNkTum7/file-notify-testhEtklA’ does not return a valid Lisp expression: ‘’") Good enough, a failure is a failure is a failure. We know this was introduced by the recent changes 54ef338ba3670415cf47fabc33a92d4904707c7e. But we don't know a lot of other stuff * Why the difference between Local and EMBA? * How is this test supposed to work? * Can we split the test to be shorter, as currently it takes between 25 second and 120 seconds to run? * What exactly breaks here? The test fails consistently enough, that we should be able to log/instrument/debug. IMHO, we can be answering this in parallel with -- but ideally before -- a deeper review of Emacs's process machinery. We could also ask other questions, of course, like: could Tramp use a process filter? IMO these are very powerful tools and bring a substantial degree of freedom of implementation. Currently Tramp relies on accept-p-o and then searches the process buffer for regexps. To me, that just seems more difficult and error-prone than process filters. But even my own questions I would defer until after we have understood this particular failure of file-notify-test04-autorevert on master. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-15 19:44 ` João Távora @ 2023-03-15 20:14 ` João Távora 2023-03-15 21:34 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-15 21:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-15 21:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 2 replies; 98+ messages in thread From: João Távora @ 2023-03-15 20:14 UTC (permalink / raw) To: Stefan Monnier; +Cc: Thomas Koch, Michael Albinus, 61350 [-- Attachment #1: Type: text/plain, Size: 1542 bytes --] João Távora <joaotavora@gmail.com> writes: > Stefan Monnier <monnier@iro.umontreal.ca> writes: > Does that "some" include _this_ problem? Let's take this one for > example. > > make -C test filenotify-tests > SELECTOR=file-notify-test04-autorevert Sorry, here I meant to type 'file-notify-test04-autorevert-remote' > We know this was introduced by the recent changes > 54ef338ba3670415cf47fabc33a92d4904707c7e. FWIW, I've run the following experiment: 1. Reverted 54ef338ba3670415cf47fabc33a92d4904707c7e 2. Applied a simpler, more conservative, patch, discussed with Michael before. The patch keeps the protocol for tramp-a-p-o untouched. It simply adds a "while" to the 'a-p-o' call issued for "related" processes. 3. Ran the filenotify test suite. All but one tests pass. 'file-notify-test04-autorevert-remote' now passes. The failure I get, 'file-notify-test06-dir-validity-remote', is _also_ in Emacs 29 4. Removed the Tramp-specific Eglot workaround in lisp/progmodes/eglot.el which disabled Controlmaster. 5. Ran the Eglot-over-Tramp experiment with the JDTLS LSP server multiple times with 0% failures. 6. The two Tramp-related failures in the eglot-tests.el test suite are _also_ fixed. Ran this multiple times, 0% failures. It seems evident that my simpler patch puts us in a much better place than we are now, so I propose we push them. Patches are attached. Of course, we can continue to explore other more elegant solutions. João [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Revert-Improve-Tramp-processes-to-accept-output-over.patch --] [-- Type: text/x-patch, Size: 9830 bytes --] From f8ea932928fe37573d28c197c1c9d2da9a08089d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jo=C3=A3o=20T=C3=A1vora?= <joaotavora@gmail.com> Date: Wed, 15 Mar 2023 19:18:00 +0000 Subject: [PATCH 1/3] Revert "Improve Tramp processes to accept output over the same socket" This reverts commit 54ef338ba3670415cf47fabc33a92d4904707c7e. --- lisp/net/tramp-adb.el | 2 +- lisp/net/tramp-gvfs.el | 2 +- lisp/net/tramp-sh.el | 14 +------------- lisp/net/tramp-smb.el | 6 +++--- lisp/net/tramp-sudoedit.el | 2 +- lisp/net/tramp.el | 33 +++++++++------------------------ 6 files changed, 16 insertions(+), 43 deletions(-) diff --git a/lisp/net/tramp-adb.el b/lisp/net/tramp-adb.el index d338201ab72..64f45e7958d 100644 --- a/lisp/net/tramp-adb.el +++ b/lisp/net/tramp-adb.el @@ -990,7 +990,7 @@ tramp-adb-handle-make-process (progn (goto-char (point-min)) (not (search-forward "\n" nil t))) - (tramp-accept-process-output p)) + (tramp-accept-process-output p 0)) (delete-region (point-min) (point))) ;; Provide error buffer. This shows only ;; initial error messages; messages diff --git a/lisp/net/tramp-gvfs.el b/lisp/net/tramp-gvfs.el index c1ad37de1d2..266724c587f 100644 --- a/lisp/net/tramp-gvfs.el +++ b/lisp/net/tramp-gvfs.el @@ -1469,7 +1469,7 @@ tramp-gvfs-handle-file-notify-add-watch (set-process-sentinel p #'tramp-file-notify-process-sentinel) ;; There might be an error if the monitor is not supported. ;; Give the filter a chance to read the output. - (while (tramp-accept-process-output p)) + (while (tramp-accept-process-output p 0)) (unless (process-live-p p) (tramp-error p 'file-notify-error "Monitoring not supported for `%s'" file-name)) diff --git a/lisp/net/tramp-sh.el b/lisp/net/tramp-sh.el index 882b79b3ee7..48ebfff6cfe 100644 --- a/lisp/net/tramp-sh.el +++ b/lisp/net/tramp-sh.el @@ -2424,10 +2424,6 @@ tramp-do-copy-or-rename-file-out-of-band copy-program copy-args))) (tramp-message v 6 "%s" (string-join (process-command p) " ")) (process-put p 'vector v) - ;; This is neded for ssh or PuTTY based processes, and - ;; only if the respective options are set. Perhaps, - ;; the setting could be more fine-grained. - (process-put p 'shared-socket t) (process-put p 'adjust-window-size-function #'ignore) (set-process-query-on-exit-flag p nil) @@ -3757,10 +3753,6 @@ tramp-sh-handle-file-notify-add-watch (string-join sequence " ")) (tramp-message v 6 "Run `%s', %S" (string-join sequence " ") p) (process-put p 'vector v) - ;; This is neded for ssh or PuTTY based processes, and only if - ;; the respective options are set. Perhaps, the setting could - ;; be more fine-grained. - (process-put p 'shared-socket t) ;; Needed for process filter. (process-put p 'events events) (process-put p 'watch-name localname) @@ -3769,7 +3761,7 @@ tramp-sh-handle-file-notify-add-watch (set-process-sentinel p #'tramp-file-notify-process-sentinel) ;; There might be an error if the monitor is not supported. ;; Give the filter a chance to read the output. - (while (tramp-accept-process-output p)) + (while (tramp-accept-process-output p 0)) (unless (process-live-p p) (tramp-error p 'file-notify-error "Monitoring not supported for `%s'" file-name)) @@ -5121,10 +5113,6 @@ tramp-maybe-open-connection ;; Set sentinel and query flag. Initialize variables. (set-process-sentinel p #'tramp-process-sentinel) (process-put p 'vector vec) - ;; This is neded for ssh or PuTTY based processes, and - ;; only if the respective options are set. Perhaps, - ;; the setting could be more fine-grained. - (process-put p 'shared-socket t) (process-put p 'adjust-window-size-function #'ignore) (set-process-query-on-exit-flag p nil) (setq tramp-current-connection (cons vec (current-time))) diff --git a/lisp/net/tramp-smb.el b/lisp/net/tramp-smb.el index bb4ab9e3057..1aa4520eeb6 100644 --- a/lisp/net/tramp-smb.el +++ b/lisp/net/tramp-smb.el @@ -757,7 +757,7 @@ tramp-smb-action-get-acl "Read ACL data from connection buffer." (unless (process-live-p proc) ;; Accept pending output. - (while (tramp-accept-process-output proc)) + (while (tramp-accept-process-output proc 0)) (with-current-buffer (tramp-get-connection-buffer vec) ;; There might be a hidden password prompt. (widen) @@ -1363,7 +1363,7 @@ tramp-smb-action-set-acl "Set ACL data." (unless (process-live-p proc) ;; Accept pending output. - (while (tramp-accept-process-output proc)) + (while (tramp-accept-process-output proc 0)) (tramp-message vec 10 "\n%s" (tramp-get-buffer-string (tramp-get-connection-buffer vec))) (throw 'tramp-action 'ok))) @@ -2023,7 +2023,7 @@ tramp-smb-wait-for-output ;; Read pending output. (while (not (re-search-forward tramp-smb-prompt nil t)) - (while (tramp-accept-process-output p)) + (while (tramp-accept-process-output p 0)) (goto-char (point-min))) (tramp-message vec 6 "\n%s" (buffer-string)) diff --git a/lisp/net/tramp-sudoedit.el b/lisp/net/tramp-sudoedit.el index 3cacde2468c..abb9afc570b 100644 --- a/lisp/net/tramp-sudoedit.el +++ b/lisp/net/tramp-sudoedit.el @@ -692,7 +692,7 @@ tramp-sudoedit-action-sudo "Check, whether a sudo process has finished. Remove unneeded output." ;; There might be pending output for the exit status. (unless (process-live-p proc) - (while (tramp-accept-process-output proc)) + (while (tramp-accept-process-output proc 0)) ;; Delete narrowed region, it would be in the way reading a Lisp form. (goto-char (point-min)) (widen) diff --git a/lisp/net/tramp.el b/lisp/net/tramp.el index b6e985db6b1..47173b95bea 100644 --- a/lisp/net/tramp.el +++ b/lisp/net/tramp.el @@ -5087,11 +5087,6 @@ tramp-handle-make-process ;; t. See Bug#51177. (when filter (set-process-filter p filter)) - (process-put p 'vector v) - ;; This is neded for ssh or PuTTY based processes, and - ;; only if the respective options are set. Perhaps, the - ;; setting could be more fine-grained. - (process-put p 'shared-socket t) (process-put p 'remote-command orig-command) (tramp-set-connection-property p "remote-command" orig-command) @@ -5494,7 +5489,7 @@ tramp-handle-file-notify-rm-watch ;; There might be pending output. Avoid problems with reentrant ;; call of Tramp. (ignore-errors - (while (tramp-accept-process-output proc))) + (while (tramp-accept-process-output proc 0))) (tramp-message proc 6 "Kill %S" proc) (delete-process proc)) @@ -5646,13 +5641,13 @@ tramp-action-process-alive "Check, whether a process has finished." (unless (process-live-p proc) ;; There might be pending output. - (while (tramp-accept-process-output proc)) + (while (tramp-accept-process-output proc 0)) (throw 'tramp-action 'process-died))) (defun tramp-action-out-of-band (proc vec) "Check, whether an out-of-band copy has finished." ;; There might be pending output for the exit status. - (while (tramp-accept-process-output proc)) + (while (tramp-accept-process-output proc 0)) (cond ((and (not (process-live-p proc)) (zerop (process-exit-status proc))) (tramp-message vec 3 "Process has finished.") @@ -5683,7 +5678,7 @@ tramp-process-one-action (while (not found) ;; Reread output once all actions have been performed. ;; Obviously, the output was not complete. - (while (tramp-accept-process-output proc)) + (while (tramp-accept-process-output proc 0)) (setq todo actions) (while todo (setq item (pop todo) @@ -5800,21 +5795,11 @@ with-tramp-locked-connection ,@body) (tramp-flush-connection-property ,proc "locked")))) -(defun tramp-accept-process-output (proc &optional _timeout) +(defun tramp-accept-process-output (proc &optional timeout) "Like `accept-process-output' for Tramp processes. This is needed in order to hide `last-coding-system-used', which is set for process communication also. If the user quits via `C-g', it is propagated up to `tramp-file-name-handler'." - (declare (advertised-calling-convention (proc) "29.2")) - ;; There could be other processes which use the same socket for - ;; communication. This could block the output for the current - ;; process. Read such output first. (Bug#61350) - (when-let (((process-get proc 'shared-socket)) - (v (process-get proc 'vector))) - (dolist (p (delq proc (process-list))) - (when (tramp-file-name-equal-p v (process-get p 'vector)) - (accept-process-output p 0 nil t)))) - (with-current-buffer (process-buffer proc) (let ((inhibit-read-only t) last-coding-system-used @@ -5824,10 +5809,10 @@ tramp-accept-process-output ;; JUST-THIS-ONE is set due to Bug#12145. `with-local-quit' ;; returns t in order to report success. (if (with-local-quit - (setq result (accept-process-output proc 0 nil t)) t) + (setq result (accept-process-output proc timeout nil t)) t) (tramp-message - proc 10 "%s %s %s\n%s" - proc (process-status proc) result (buffer-string)) + proc 10 "%s %s %s %s\n%s" + proc timeout (process-status proc) result (buffer-string)) ;; Propagate quit. (keyboard-quit))) result))) @@ -6840,7 +6825,7 @@ tramp-interrupt-process (tramp-get-remote-null-device (process-get proc 'vector)))) ;; Wait, until the process has disappeared. If it doesn't, ;; fall back to the default implementation. - (while (tramp-accept-process-output proc)) + (while (tramp-accept-process-output proc 0)) (not (process-live-p proc)))))) (add-hook 'interrupt-process-functions #'tramp-interrupt-process) -- 2.39.2 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #3: 0003-Simpler-fix-for-bug-61350.patch --] [-- Type: text/x-patch, Size: 1118 bytes --] From 16e93416780e416d27ed514e3d8a876405a4f4bc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jo=C3=A3o=20T=C3=A1vora?= <joaotavora@gmail.com> Date: Wed, 15 Mar 2023 20:02:43 +0000 Subject: [PATCH 3/3] Simpler fix for bug#61350 * lisp/net/tramp.el (tramp-accept-process-output): Accept process from related processes. --- lisp/net/tramp.el | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/lisp/net/tramp.el b/lisp/net/tramp.el index 47173b95bea..885b29f9471 100644 --- a/lisp/net/tramp.el +++ b/lisp/net/tramp.el @@ -5800,6 +5800,11 @@ tramp-accept-process-output This is needed in order to hide `last-coding-system-used', which is set for process communication also. If the user quits via `C-g', it is propagated up to `tramp-file-name-handler'." + (when-let (((process-get proc 'shared-socket)) + (v (process-get proc 'vector))) + (dolist (p (delq proc (process-list))) + (when (tramp-file-name-equal-p v (process-get p 'vector)) + (while (accept-process-output p 0 nil t))))) (with-current-buffer (process-buffer proc) (let ((inhibit-read-only t) last-coding-system-used -- 2.39.2 ^ permalink raw reply related [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-15 20:14 ` João Távora @ 2023-03-15 21:34 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-15 21:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 1 sibling, 0 replies; 98+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-15 21:34 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, Michael Albinus, 61350 > @@ -990,7 +990,7 @@ tramp-adb-handle-make-process > (progn > (goto-char (point-min)) > (not (search-forward "\n" nil t))) > - (tramp-accept-process-output p)) > + (tramp-accept-process-output p 0)) > (delete-region (point-min) (point))) > ;; Provide error buffer. This shows only > ;; initial error messages; messages Could you explain why we need all those "0 second timeout" thingies? They make me nervous because it really means busy waiting in a tight loop. Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-15 20:14 ` João Távora 2023-03-15 21:34 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-15 21:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-16 13:28 ` João Távora 2023-03-18 12:34 ` Michael Albinus 1 sibling, 2 replies; 98+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-15 21:55 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, Michael Albinus, 61350 Sorry 'bout my previous email. > diff --git a/lisp/net/tramp.el b/lisp/net/tramp.el > index 47173b95bea..885b29f9471 100644 > --- a/lisp/net/tramp.el > +++ b/lisp/net/tramp.el > @@ -5800,6 +5800,11 @@ tramp-accept-process-output > This is needed in order to hide `last-coding-system-used', which is set > for process communication also. > If the user quits via `C-g', it is propagated up to `tramp-file-name-handler'." > + (when-let (((process-get proc 'shared-socket)) > + (v (process-get proc 'vector))) > + (dolist (p (delq proc (process-list))) > + (when (tramp-file-name-equal-p v (process-get p 'vector)) > + (while (accept-process-output p 0 nil t))))) > (with-current-buffer (process-buffer proc) > (let ((inhibit-read-only t) > last-coding-system-used Looks OK to me (tho this `dolist` in the middle of all those `(tramp-accept-process-output .. 0)` busy loops is definitely not the most efficient use of resources :-). Stefan PS: An unrelated comment, is that `shared-socket` and `vector` should be renamed to clarify that these are properties specific to Tramp processes (i.e. use a `tramp-` prefix). ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-15 21:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-16 13:28 ` João Távora 2023-03-18 12:34 ` Michael Albinus 1 sibling, 0 replies; 98+ messages in thread From: João Távora @ 2023-03-16 13:28 UTC (permalink / raw) To: Stefan Monnier; +Cc: Thomas Koch, Michael Albinus, 61350 Stefan Monnier <monnier@iro.umontreal.ca> writes: > Sorry 'bout my previous email. > >> diff --git a/lisp/net/tramp.el b/lisp/net/tramp.el >> index 47173b95bea..885b29f9471 100644 >> --- a/lisp/net/tramp.el >> +++ b/lisp/net/tramp.el >> @@ -5800,6 +5800,11 @@ tramp-accept-process-output >> This is needed in order to hide `last-coding-system-used', which is set >> for process communication also. >> If the user quits via `C-g', it is propagated up to `tramp-file-name-handler'." >> + (when-let (((process-get proc 'shared-socket)) >> + (v (process-get proc 'vector))) >> + (dolist (p (delq proc (process-list))) >> + (when (tramp-file-name-equal-p v (process-get p 'vector)) >> + (while (accept-process-output p 0 nil t))))) >> (with-current-buffer (process-buffer proc) >> (let ((inhibit-read-only t) >> last-coding-system-used > > Looks OK to me (tho this `dolist` in the middle of all those > `(tramp-accept-process-output .. 0)` busy loops is definitely not the > most efficient use of resources :-). It sure ain't pretty, but that's what is known to fix it for now. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-15 21:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-16 13:28 ` João Távora @ 2023-03-18 12:34 ` Michael Albinus 1 sibling, 0 replies; 98+ messages in thread From: Michael Albinus @ 2023-03-18 12:34 UTC (permalink / raw) To: Stefan Monnier; +Cc: Thomas Koch, João Távora, 61350 Stefan Monnier <monnier@iro.umontreal.ca> writes: Hi Stefan, > PS: An unrelated comment, is that `shared-socket` and `vector` should be > renamed to clarify that these are properties specific to Tramp processes > (i.e. use a `tramp-` prefix). That's true. There are three properties which shall keep their names, `remote-tty', `remote-pid' and `remote-command'. They are documented in (info "(tramp) Remote processes") All other Trmp internal properties I have prefixed with `tramp-', as proposed. Pushed to master. > Stefan Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-15 19:44 ` João Távora 2023-03-15 20:14 ` João Távora @ 2023-03-15 21:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-15 21:49 ` João Távora 1 sibling, 1 reply; 98+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-15 21:43 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, Michael Albinus, 61350 >> I think we understand some of the problems well enough to think >> about solutions. > Does that "some" include _this_ problem? Not on my side, sadly, no, but maybe on Michael's? > We could also ask other questions, of course, like: could Tramp use a > process filter? IMO these are very powerful tools and bring a > substantial degree of freedom of implementation. Currently Tramp relies > on accept-p-o and then searches the process buffer for regexps. To me, > that just seems more difficult and error-prone than process filters. FWIW, I believe the crux of the matter here is that process filters receive an arbitrary chunk at a time, so we almost always need to make them accumulate the received data into some kind of buffer-like data structure until we have "enough" data. IOW, most process filters need to do something like (lambda (proc string) (let* ((buffer (process-get proc 'my-buffer)) (new (concat buffer string))) (if (not (had-enough-p new)) (process-put proc 'my-buffer new) (process-put proc 'my-buffer nil) (do-the-actual-processing new)))) Another issue for Tramp is that basically all of the API it has to implement (for `file-name-handler-alist`) is synchronous, so that forces it to resort to `accept-p-o` :-( If we want Tramp to work better, we'll need to provide asynchronous versions of most of those operations (presumably using some kind of promise/future abstraction). Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-15 21:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-15 21:49 ` João Távora 2023-03-16 6:24 ` Jim Porter 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-15 21:49 UTC (permalink / raw) To: Stefan Monnier; +Cc: Thomas Koch, Michael Albinus, 61350 On Wed, Mar 15, 2023 at 9:43 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote: > IOW, most process filters need to do something like > > (lambda (proc string) > (let* ((buffer (process-get proc 'my-buffer)) > (new (concat buffer string))) > (if (not (had-enough-p new)) > (process-put proc 'my-buffer new) > (process-put proc 'my-buffer nil) > (do-the-actual-processing new)))) > > Another issue for Tramp is that basically all of the API it has to > implement (for `file-name-handler-alist`) is synchronous, so that forces > it to resort to `accept-p-o` :-( SLY, jsonrpc.el, and other code is synchronous as well (the completion API is synch, as you well know). `accept-p-o` + a filter that invokes a closure that throws out of the loop is a great way to do this. See jsonrpc-request, for example. No JUST-THIS-ONE, and has been working fine in many forms since Emacs 24.3 AFAIK. > If we want Tramp to work better, we'll need to provide asynchronous > versions of most of those operations (presumably using some kind of > promise/future abstraction). We could start with using a filter, just like the one you described above Tramp doesn't use a filter, and I think that's part of the problem. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-15 21:49 ` João Távora @ 2023-03-16 6:24 ` Jim Porter 2023-03-16 13:25 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-16 13:28 ` João Távora 0 siblings, 2 replies; 98+ messages in thread From: Jim Porter @ 2023-03-16 6:24 UTC (permalink / raw) To: João Távora, Stefan Monnier; +Cc: Thomas Koch, Michael Albinus, 61350 On 3/15/2023 2:49 PM, João Távora wrote: > SLY, jsonrpc.el, and other code is synchronous as well (the completion > API is synch, as you well know). `accept-p-o` + a filter that invokes > a closure that throws out of the loop is a great way to do this. > See jsonrpc-request, for example. No JUST-THIS-ONE, and has > been working fine in many forms since Emacs 24.3 AFAIK. Assuming I understand the context here correctly, this sort of thing is part of why Chris Wellons' emacs-aio package uses 'run-at-time' for handling resolved promises[1]: > If the result is ready call the callback in the next event loop turn using run-at-time. This is important because it keeps all the asynchronous components isolated from one another. They won’t see each others’ frames on the call stack, nor frames from aio. This is so important that the Promises/A+ specification is explicit about it. While a solution specific to the problem in this bug likely doesn't require an entire asynchrony framework, if we get one into Emacs, it should definitely support this case. (That said, a lot of the discussion from this is probably best left to the emacs-devel thread so as not to cause too much distraction... :)) [1] https://nullprogram.com/blog/2019/03/10/ ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-16 6:24 ` Jim Porter @ 2023-03-16 13:25 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-16 13:28 ` João Távora 1 sibling, 0 replies; 98+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-16 13:25 UTC (permalink / raw) To: Jim Porter; +Cc: Thomas Koch, Michael Albinus, João Távora, 61350 >> SLY, jsonrpc.el, and other code is synchronous as well (the completion >> API is synch, as you well know). `accept-p-o` + a filter that invokes >> a closure that throws out of the loop is a great way to do this. >> See jsonrpc-request, for example. No JUST-THIS-ONE, and has >> been working fine in many forms since Emacs 24.3 AFAIK. > > Assuming I understand the context here correctly, this sort of thing is part > of why Chris Wellons' emacs-aio package uses 'run-at-time' for handling > resolved promises[1]: >> If the result is ready call the callback in the next event loop turn using >> run-at-time. This is important because it keeps all the asynchronous >> components isolated from one another. They won’t see each others’ frames >> on the call stack, nor frames from aio. This is so important that the >> Promises/A+ specification is explicit about it. My futur.el code does the same, indeed (tho it uses `funcall-later` which I implemented in C to expose to ELisp the `pending_funcalls` we already had in C). Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-16 6:24 ` Jim Porter 2023-03-16 13:25 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-16 13:28 ` João Távora 2023-03-16 15:58 ` João Távora 1 sibling, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-16 13:28 UTC (permalink / raw) To: Jim Porter; +Cc: Thomas Koch, Michael Albinus, Stefan Monnier, 61350 Jim Porter <jporterbugs@gmail.com> writes: > On 3/15/2023 2:49 PM, João Távora wrote: >> SLY, jsonrpc.el, and other code is synchronous as well (the completion >> API is synch, as you well know). `accept-p-o` + a filter that invokes >> a closure that throws out of the loop is a great way to do this. >> See jsonrpc-request, for example. No JUST-THIS-ONE, and has >> been working fine in many forms since Emacs 24.3 AFAIK. > > Assuming I understand the context here correctly, this sort of thing > is part of why Chris Wellons' emacs-aio package uses 'run-at-time' for > handling resolved promises[1]: There's a relation, but it's not a direct relation. emacs-aio is a promise/future/cps library which uses timers. I was describing this design pattern for _synch_ requests over Emacs's asynchronous processes (set-process-filter proc (lambda (proc output) (internal-default-process-filter proc output) (when (and (process-get proc 'busy) (buffer-live-p (process-buffer proc))) (with-current-buffer (process-buffer proc) (when (search-forward ", DAVE.\n") (unwind-protect (throw 'done (buffer-substring (point-min) (point))) (process-put proc 'busy nil) (delete-region (point-min) (point)))))))) (setq answer-from-proc (catch 'done (process-put proc 'busy t) (process-send-string proc "OPEN THE POD BAY DOORS\n") (while (accept-process-output)))) This is very simplified (and untested), but that's the gist. You can then add cancellation, timeouts, etc. Also normally (in SLY and JSONRPC) instead of a single flag 'busy, you look up a table of pending requests This pattern has worked very well for me, for a long time, 10+ years. SLIME used this pattern back in 2000's, there where I saw it (and distilled it). It's similar to what "await" does other languages. >> If the result is ready call the callback in the next event loop turn >> using run-at-time. This is important because it keeps all the >> asynchronous components isolated from one another. They won’t see >> each others’ frames on the call stack, nor frames from aio. This is >> so important that the Promises/A+ specification is explicit about >> it. > While a solution specific to the problem in this bug likely doesn't > require an entire asynchrony framework, if we get one into Emacs, it > should definitely support this case. Sure, I'd be happy with just using the stuff that's been in Emacs for ages. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-16 13:28 ` João Távora @ 2023-03-16 15:58 ` João Távora 2023-03-16 20:36 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-16 15:58 UTC (permalink / raw) To: Jim Porter; +Cc: Thomas Koch, Michael Albinus, Stefan Monnier, 61350 João Távora <joaotavora@gmail.com> writes: > Jim Porter <jporterbugs@gmail.com> writes: > (setq answer-from-proc (catch 'done > (process-put proc 'busy t) > (process-send-string proc "OPEN THE POD BAY DOORS\n") > (while (accept-process-output)))) > > This is very simplified (and untested), but that's the gist. The gist it is, but also very broken. As ever, vapourware sucks. This is what that snippet should have been: ;;; hal.el --- asdasd -*- lexical-binding: t; -*- (defvar hal-proc nil) (defun hal-connect() (interactive) (when (processp hal-proc) (delete-process hal-proc)) (setq hal-proc (make-process :name "halproc" :command (list (expand-file-name "hal" default-directory)) :buffer (generate-new-buffer "*halbuf*") :filter (lambda (proc output) (internal-default-process-filter proc output) (when-let* ((buffer (process-buffer proc)) (callback (and (buffer-live-p buffer) (process-get proc 'callback)))) (with-current-buffer buffer (save-excursion (goto-char (point-min)) (when (search-forward ", DAVE\n" nil t) (unwind-protect (funcall callback (buffer-substring (point-min) (point))) (process-put proc 'callback nil) (delete-region (point-min) (point))))))))))) (defun hal-command () "Synch command to HAL that doesn't block others." (interactive) (message (catch 'hal-done (process-put hal-proc 'callback (lambda (result) (throw 'hal-done result))) (process-send-string hal-proc "OPEN THE POD BAY DOORS!\n") (while (accept-process-output hal-proc))))) You can test this with this very complex C++ program #include <cstdlib> #include <iostream> #include <string> #include <array> int main() { std::string line; const char* retorts[] = { "I'M AFRAID I CAN'T DO THAT", "I'D RATHER NOT", "THINK OF THE CONSEQUENCES", "PLEASE STOP" }; while (std::cin >> line) std::cout << retorts[std::rand()%4] << ", DAVE" << std::endl; } ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-16 15:58 ` João Távora @ 2023-03-16 20:36 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-16 22:04 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-16 20:36 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, Jim Porter, Michael Albinus, 61350 > The gist it is, but also very broken. As ever, vapourware sucks. This > is what that snippet should have been: > > ;;; hal.el --- asdasd -*- lexical-binding: t; -*- > (defvar hal-proc nil) > (defun hal-connect() > (interactive) > (when (processp hal-proc) (delete-process hal-proc)) > (setq hal-proc > (make-process > :name "halproc" > :command (list (expand-file-name "hal" default-directory)) > :buffer (generate-new-buffer "*halbuf*") > :filter (lambda (proc output) > (internal-default-process-filter proc output) > (when-let* ((buffer (process-buffer proc)) > (callback (and (buffer-live-p buffer) > (process-get proc 'callback)))) > (with-current-buffer buffer > (save-excursion > (goto-char (point-min)) > (when (search-forward ", DAVE\n" nil t) > (unwind-protect (funcall callback > (buffer-substring > (point-min) (point))) > (process-put proc 'callback nil) > (delete-region (point-min) > (point))))))))))) > > (defun hal-command () "Synch command to HAL that doesn't block others." > (interactive) > (message > (catch 'hal-done > (process-put hal-proc 'callback > (lambda (result) (throw 'hal-done result))) > (process-send-string hal-proc "OPEN THE POD BAY DOORS!\n") > (while (accept-process-output hal-proc))))) :-) I think doing more of the work asynchronously (i.e. from the process's filters/sentinels rather than after `accept-process-output`), like in this example, is virtually always better, yes. The use of `catch/throw` is not applicable to code which may be used asynchronously (i.e. where we don't know that there's always a `catch` waiting for our `throw`), so I hope we can get code to work just as reliably without using `catch/throw`. This mission is too important for me to allow you to jeopardize it. Stefan ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-16 20:36 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-03-16 22:04 ` João Távora 0 siblings, 0 replies; 98+ messages in thread From: João Távora @ 2023-03-16 22:04 UTC (permalink / raw) To: Stefan Monnier; +Cc: Thomas Koch, Jim Porter, Michael Albinus, 61350 On Thu, Mar 16, 2023 at 8:36 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote: > I think doing more of the work asynchronously (i.e. from the process's > filters/sentinels rather than after `accept-process-output`), like in > this example, is virtually always better, yes. > > The use of `catch/throw` is not applicable to code which may be used > asynchronously (i.e. where we don't know that there's always a `catch` > waiting for our `throw`), so I hope we can get code to work just as > reliably without using `catch/throw`. Of course, normally you want the async/sync pair of primitives: (defun hal-async-command (callback) "Send command, call CALLBACK." (when (process-get hal-proc 'callback) (error "HAL only takes one command at a time") (process-put hal-proc 'callback callback) (process-send-string hal-proc "OPEN THE POD BAY DOORS!\n")) (defun hal-sync-command () (catch 'hal-done (hal-async-command (lambda (r) (throw 'hal-done r)))) So what I mean is: * when synchronous APIs are needed (that is presumably what TRAMP needs, according to you) then *do* use catch/throw/filter/accept-process-output * when async is needed, use callbacks. You can mix in the upcoming emacs-io.el/futur.el/whatever frameworks, but when processes are involved, some variation on that technique should be used somewhere. My main point here is that this bug's root cause is the lack of good filter+accept-process-output design. I just think that basic tools to achieve that design are already in place and reasonably easy to grasp. I'm not opposed to newer tools. And I already mentioned that the above approach can be enhanced to include timeouts and cancellation on user input (for the latter, sit-for is usually preferable to a-p-o). > This mission is too important for me to allow you to jeopardize it. :-) Sorry to interrupt the festivities, Dave João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-07 13:04 ` Thomas Koch 2023-03-07 13:33 ` João Távora @ 2023-03-07 13:47 ` Michael Albinus 1 sibling, 0 replies; 98+ messages in thread From: Michael Albinus @ 2023-03-07 13:47 UTC (permalink / raw) To: Thomas Koch; +Cc: João Távora, 61350 Thomas Koch <thomas@koch.ro> writes: > Thanks Michael! Hi Thomas, > What is the advantage of this patch over just removing the JUST-THIS-ONE argument? In both cases tramp is triggering accept-process-output for processes it does not own. It touches only processes which share the same connection via ssh ControlMaster (and putty share). Nothing else. > However with your patch there is a potential timing issue: blocking > process output could still arrive between your new loop and tramps > call to accept-process-output with JUST-THIS-ONE. You have seen the timing? I'll change the timeout of accept-process-output to 0. This should prevent blocking. > I'd think that there might be room for an essay "JUST-THIS-ONE considered evil"? In my book, one shall touch only data one is related to. Processes, not related to the same Tramp connection, are taboo. Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-05 11:21 ` Michael Albinus 2023-03-05 11:45 ` Thomas Koch @ 2023-03-06 12:42 ` Michael Albinus 2023-03-06 13:45 ` João Távora 2023-03-06 13:42 ` João Távora 2 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-06 12:42 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: Hi João, > In parallel, I'll extend tramp-use-ssh-controlmaster-options to accept a > further value 'suppress', which overrides possible settings in ~/.ssh/config. I've pushed a respective change to the master branch. It will also be included in Tramp 2.6.0.3, the next planned version on GNU ELPA. If you want to check this feature, you might call --8<---------------cut here---------------start------------->8--- (safe-local-variable-p 'tramp-use-ssh-controlmaster-options 'suppress) --8<---------------cut here---------------end--------------->8--- >> João Best regards, Michael. PS: Starting on Wednesday, I'll be almost offline for a couple of days. I have no idea how long, 'tho. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-06 12:42 ` Michael Albinus @ 2023-03-06 13:45 ` João Távora 0 siblings, 0 replies; 98+ messages in thread From: João Távora @ 2023-03-06 13:45 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 On Mon, Mar 6, 2023 at 12:42 PM Michael Albinus <michael.albinus@gmx.de> wrote: > > Michael Albinus <michael.albinus@gmx.de> writes: > --8<---------------cut here---------------start------------->8--- > (safe-local-variable-p 'tramp-use-ssh-controlmaster-options 'suppress) > --8<---------------cut here---------------end--------------->8--- Thanks. As Tramp is a :core GNU ELPA package, I think we can make Eglot "Package-Require" tramp and use this new variable value. Or we can just keep using slightly more hacky version for now (since turning off ControlMaster is kind of a hack intrinsically). I don't see any rush. > PS: Starting on Wednesday, I'll be almost offline for a couple of > days. I have no idea how long, 'tho. Take care! João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-05 11:21 ` Michael Albinus 2023-03-05 11:45 ` Thomas Koch 2023-03-06 12:42 ` Michael Albinus @ 2023-03-06 13:42 ` João Távora 2 siblings, 0 replies; 98+ messages in thread From: João Távora @ 2023-03-06 13:42 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 On Sun, Mar 5, 2023 at 11:21 AM Michael Albinus <michael.albinus@gmx.de> wrote: > > João Távora <joaotavora@gmail.com> writes: > > Hi João, > > >> Yes. But Tramp calls accept-process-output only, if it has send a > >> command to the remote shell, and it expects something to be returned. At > >> least the shell prompt. > > > > Yes. Tramp is doing the right thing. It really expects a response to > > come. And more often than not, it does. But sometimes it doesn't, and > > that's when we hang. > > After digging further, I believe I understand now why it hangs. We have > the following scenario: > > - Both Eglot and Tramp use the same ssh connection with enabled ControlMaster. > > - Eglot gets JSON output from the remote LSP server. In > jsonrpc--process-filter, this output is handled. It includes a call to > file-truename, which triggers Tramp to send a request in its own > process to the remote side. > > - The remote side returns the answer for Tramp (shell output). However, > the ssh socket is still full of the jsonrpc process output, which > waits to be handled. > > - So the socket is blocked. The jsonrpc output cannot be read, because > jsonrpc--process-filter waits for the result of file-truename. And the > Tramp process output cannot be handled, because it is stuck in the > socket after the jsonrpc output. OK, I think I understand. But here, the main point that makes this deadlock possible is that tramp when called by the jsonrpc process filter is calling accept-process-output with JUST-THIS-ONE set to true. _That_ is why jsonrpc--process-filter cannot see the remaining output that is stuckl in the ssh socket, as you describe. > The proper solution is indeed to have two connections, and to refuse use > of ControlMaster. Here, I disagree. I think, like Thomas, that the proper solution is to stop passing JUST-THIS-ONE to tramp's a-p-output calls. > So would you mind to add a hook to Eglot, which runs before calling > make-process, but in the proper process buffer? When you've added it, > I'll let Tramp hook into, and you don't need any longer the Tramp > specific code in eglot.el. I don't think using this hook for this is the right idea, and I'm fine with having that bit of Tramp-specific code in there until we find a better solution. So no need to worry about that. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 17:55 ` Thomas Koch 2023-03-01 14:10 ` João Távora @ 2023-03-02 10:40 ` Michael Albinus 1 sibling, 0 replies; 98+ messages in thread From: Michael Albinus @ 2023-03-02 10:40 UTC (permalink / raw) To: Thomas Koch; +Cc: João Távora, 61350 Thomas Koch <thomas@koch.ro> writes: Hi Thomas, > My current working theory is: > > 1. There is some buffer in SSH (or TCP) that gets filled by the language server sending data to eglot. > > 2. Tramp sends a command over SSH. > > 3. Tramp sets the JUST-THIS-ONE option of accept-process-output to t since https://debbugs.gnu.org/12145. > > 4. The response to 2. can not arrive due to the buffer being filled in 1. Tramp blocks the emacs main thread. It is a little bit more specific. Tramp speaks with the remote shell in order to execute commands like '(file-truename "...")'. This is transformed by Tramp into the shell command --8<---------------cut here---------------start------------->8--- \readlink --canonicalize-missing /home/albinus/src/yacy_search_server/examples/SimpleSearchClient/pom.xml 2>/dev/null; echo tramp_exit_status $? --8<---------------cut here---------------end--------------->8--- Tramp expects the result of the readlink call, and the return value (in order to know whether this result can be used). And it expects a special shell prompt, which is the indication, that the remote command has been finished. So it calls tramp-accept-process-output, unitl the connection buffer contains something like --8<---------------cut here---------------start------------->8--- /home/albinus/src/yacy_search_server/examples/SimpleSearchClient/pom.xml tramp_exit_status 0 ///4b3b7d4fa561141e84c94a1cf25e8575#$ --8<---------------cut here---------------end--------------->8--- This special shell prompt is initialized at process startup, by setting PS1. This works fine, until there is the concurrency with the Eglot process. According to my debugging, Tramp sees only part of the output. The final shell prompt, ///4b3b7d4fa561141e84c94a1cf25e8575#$, doesn't arrive, and so Tramp is blocked. > I tested my theory by deleting (and thus disabling) the JUST-THIS-ONE > argument in all calls to accept-process-output in tramp.el and > tramp-sh.el. Eglot did not freeze anymore in two tests with two > different systems (but the same jdtls binary on the same Debian > version). Yes. But I still don't understand what's the difference in scenario. > In one test however I saw this in the message buffer: > > [jsonrpc] (warning) Invalid JSON: (control character 0xd <callback> 1 > 381 381) > {"jsonrpc":"2.0","method":"textDocument/publishDiagnostics","params":{"uri":"file:///home/thk/git/yacy_search_server/source/net/yacy/htroot/ConfigSearchBox.java","diagnostics":[{"range":{"start":{"line":35,"character":86},"end":{"line":35,"character":94}},"severity":3,"code":"1102","source":"Java","message":"At > least one of the problems in category \u0027unuseContent-Length: 798 > > {"jsonrpc":"2.0","method":"textDocument/publ > > I already sent the above to Michael in an out-of-band thread (thanks > for your patience!). I suspect this is due to the "Forbidden reentrant call", but it isn't obvious. > I have a vague feeling, that Tramp could be improved with a work queue > such that requests to tramp from notification or timer threads get > blocked while another tramp command is still waiting for a > reply. Exactly. This is what I want to achieve by using threads for protecting the communication with the remote processes. Or at least with the main connection process. > Additionally I understand that Joao has an idea to use markers > controlled by Tramp. - I'm sorry that I can not (yet) contribute with > putting both ideas into code. > > There are two statements in this bug thread that I don't yet understand and that might be worth more elaboration. I volunteer as a debugging rubber duck[1]. However if you both understand each other then never mind: > > [1] https://en.wikipedia.org/wiki/Rubber_duck_debugging > > Joao: "markers in the process buffer is what is used commonly" > > I understand that tramp sends only one command at a time per > connection. Otherwise the reentrant error is thrown. The command > result gets written to a connection-specific buffer. The output is > parsed from point-min and the buffer content deleted after parsing and > before sending the next command. So what should there be improved? > Jsonrpc needs its own markers because N messages can arrive at any > given time and the buffer can not be deleted after each message. Meanwhile I don't believe any longer that markers are the problem. > Michael: "If another process has consumed the output, even if it is pushed into the correct buffer, Tramp doesn't know." > > What exact scenario should that be? Emacs writes the output (if > there's no filter) in the correct buffer for the process. Tramp might > call accept-process-output unnecessarily because the output is already > in the buffer due to a accept-process-output call of some other > code. But in this case Tramp will find the output there eventually, > maybe after having waited until a timeout. This could be improved by > checking for already present output before calling > accept-process-output? I stand corrected. The problem isn't that another process has consumed the output, but rather that not everything arrives. For whatever reason. Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 15:01 ` Michael Albinus 2023-02-28 17:55 ` Thomas Koch @ 2023-02-28 19:37 ` João Távora 2023-03-01 8:44 ` Michael Albinus 1 sibling, 1 reply; 98+ messages in thread From: João Távora @ 2023-02-28 19:37 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: > João Távora <joaotavora@gmail.com> writes: > > Hi João, > >> Looking at the Tramp code, I think Thomas's idea of using >> >> -o "ControlMaster=no" -o "ControlPath=none" >> >> would probably be quite effective. It would work with previous >> Tramp versions that Eglot users might be using. > > Yes, that I have also in mind. > >> Thomas, can you try this patch? >> >> diff --git a/lisp/progmodes/eglot.el b/lisp/progmodes/eglot.el >> index e20d209332d..83462633dd9 100644 >> --- a/lisp/progmodes/eglot.el >> +++ b/lisp/progmodes/eglot.el >> @@ -130,6 +130,8 @@ >> (defvar markdown-fontify-code-blocks-natively) >> (defvar company-backends) >> (defvar company-tooltip-align-annotations) >> +(defvar tramp-ssh-controlmaster-options) >> +(defvar tramp-use-ssh-controlmaster-options) >> >> >> >> @@ -1247,7 +1249,15 @@ eglot--connect >> (contact (cl-subseq contact 0 probe))) >> `(:process >> ,(lambda () >> - (let ((default-directory default-directory)) >> + (let ((default-directory default-directory) >> + ;; bug#61350: Tramp turns a feature on >> + ;; by default that can't (yet) handle >> + ;; very much data so we turn it off >> + ;; unconditionally -- just for our >> + ;; process. >> + (tramp-use-ssh-controlmaster-options t) >> + (tramp-ssh-controlmaster-options >> + "-o ControlMaster=no -o ControlPath=none")) >> (make-process >> :name readable-name >> :command (setq server-info (eglot--cmd contact)) > > Even better than my proposal :-) > > There is the small risk that a user runs a local ssh client which is > not aware of the ControlMaster option. For such a user, Eglot on a > remote host might fail. > > Tramp is busy to check, whether ControlMaster could be applied. > > OTOH, I don't know if such ssh clients are still used in the wild, at > least by Eglot users. There is this risk indeed. But considering the two demographics: 1. People who use ControlMaster in ~/.ssh/config 2. People who have very old ssh clients I think that 2 is probably smaller than 1, especially if ControlMaster is such a momentous thing. So i think the risk is worth taking. Especially because an error for people in 2. is easier to recognize, understand and report than an inexplicable Emacs hang. And Eglot is a :core ELPA package, so bugfixes can be expedited. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 19:37 ` João Távora @ 2023-03-01 8:44 ` Michael Albinus 2023-03-01 11:15 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-01 8:44 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, 61350 João Távora <joaotavora@gmail.com> writes: Hi João, >> OTOH, I don't know if such ssh clients are still used in the wild, at >> least by Eglot users. > > There is this risk indeed. But considering the two demographics: > > 1. People who use ControlMaster in ~/.ssh/config > 2. People who have very old ssh clients > > I think that 2 is probably smaller than 1, especially if ControlMaster > is such a momentous thing. So i think the risk is worth taking. > Especially because an error for people in 2. is easier to recognize, > understand and report than an inexplicable Emacs hang. And Eglot is a > :core ELPA package, so bugfixes can be expedited. I agree with your judgement. But we know that it is just a workaround. I would sleep much better if I would know what's the root cause of the problem. Thomas seems to be on a good path with his analysis. Hopefully, we can find something more on his way. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-01 8:44 ` Michael Albinus @ 2023-03-01 11:15 ` João Távora 0 siblings, 0 replies; 98+ messages in thread From: João Távora @ 2023-03-01 11:15 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, 61350 Michael Albinus <michael.albinus@gmx.de> writes: >> I think that 2 is probably smaller than 1, especially if ControlMaster >> is such a momentous thing. So i think the risk is worth taking. >> Especially because an error for people in 2. is easier to recognize, >> understand and report than an inexplicable Emacs hang. And Eglot is a >> :core ELPA package, so bugfixes can be expedited. > > I agree with your judgement. I just pushed the patch to Emacs 29. > But we know that it is just a workaround. I would sleep much better if > I would know what's the root cause of the problem. Thomas seems to be > on a good path with his analysis. Hopefully, we can find something > more on his way. Agree. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-28 14:51 ` João Távora 2023-02-28 15:01 ` Michael Albinus @ 2023-03-01 10:46 ` Gregory Heytings 2023-03-01 11:08 ` João Távora 1 sibling, 1 reply; 98+ messages in thread From: Gregory Heytings @ 2023-03-01 10:46 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, Michael Albinus, 61350 >> Users, who have set ControlMaster in their ~/.ssh/config (like Thomas), >> are not satified. For them we need another setting of >> tramp-use-ssh-controlmaster-options. I'm working on this. > > Looking at the Tramp code, I think Thomas's idea of using > > -o "ControlMaster=no" -o "ControlPath=none" > > would probably be quite effective. > Instead of turning ControlMaster off, I suggest to first try to set ControlPersist on. The "freeze" problem described here is typical of multiplexing an ssh connection with ControlMaster set without also setting ControlPersist. (If you set ControlMaster without also setting ControlPersist, the socket is closed when the connection by which it was created is closed, even if other connections are still using it.) ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-01 10:46 ` Gregory Heytings @ 2023-03-01 11:08 ` João Távora 2023-03-01 11:23 ` Gregory Heytings 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-01 11:08 UTC (permalink / raw) To: Gregory Heytings; +Cc: Thomas Koch, Michael Albinus, 61350 Gregory Heytings <gregory@heytings.org> writes: >>> Users, who have set ControlMaster in their ~/.ssh/config (like >>> Thomas), are not satified. For them we need another setting of >>> tramp-use-ssh-controlmaster-options. I'm working on this. >> >> Looking at the Tramp code, I think Thomas's idea of using >> >> -o "ControlMaster=no" -o "ControlPath=none" >> >> would probably be quite effective. >> > > Instead of turning ControlMaster off, I suggest to first try to set > ControlPersist on. The "freeze" problem described here is typical of > multiplexing an ssh connection with ControlMaster set without also > setting ControlPersist. (If you set ControlMaster without also > setting ControlPersist, the socket is closed when the connection by > which it was created is closed, even if other connections are still > using it.) Still hangs. I tried my previously described test with this patch: diff --git a/lisp/net/tramp-sh.el b/lisp/net/tramp-sh.el index ec8437176db..f9ac83e4008 100644 --- a/lisp/net/tramp-sh.el +++ b/lisp/net/tramp-sh.el @@ -4974,10 +4974,10 @@ tramp-ssh-controlmaster-options (when (zerop (tramp-call-process vec "ssh" nil nil nil - "-G" "-o" "ControlPersist=no" "0.0.0.1")) + "-G" "-o" "ControlPersist=yes" "0.0.0.1")) (setq tramp-ssh-controlmaster-options (concat tramp-ssh-controlmaster-options - " -o ControlPersist=no"))))))) + " -o ControlPersist=yes"))))))) tramp-ssh-controlmaster-options))) (defun tramp-scp-strict-file-name-checking (vec) ^ permalink raw reply related [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-01 11:08 ` João Távora @ 2023-03-01 11:23 ` Gregory Heytings 2023-03-01 11:37 ` João Távora 0 siblings, 1 reply; 98+ messages in thread From: Gregory Heytings @ 2023-03-01 11:23 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, Michael Albinus, 61350 > > Still hangs. I tried my previously described test with this patch: > Are you sure that this patch actually turns ControlPersist on? Instead of trying to modify tramp-sh.el, I would try to add a section like Host foo ... ControlMaster yes ControlPath /some/path/%r@%h:%p ControlPersist yes in my .ssh/config. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-01 11:23 ` Gregory Heytings @ 2023-03-01 11:37 ` João Távora 2023-03-01 14:51 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: João Távora @ 2023-03-01 11:37 UTC (permalink / raw) To: Gregory Heytings; +Cc: Thomas Koch, Michael Albinus, 61350 On Wed, Mar 1, 2023 at 11:23 AM Gregory Heytings <gregory@heytings.org> wrote: > Are you sure that this patch actually turns ControlPersist on? From Tramp's debug log. 11:34:29.869596 tramp-maybe-open-connection (3) # Sending command ‘exec ssh -l sshuser -p 2022 -o ControlMaster=auto -o ControlPath=tramp.%C -o ControlPersist=yes -e none localhost’ 11:34:29.869679 tramp-send-command (6) # exec ssh -l sshuser -p 2022 -o ControlMaster=auto -o ControlPath=tramp.%C -o ControlPersist=yes -e none localhost 11:34:29.869814 tramp-process-actions (3) # Waiting for prompts from remote shell... > I would try to add a section like in my .ssh/config. You can try your own experiments if you follow my Docker-based MRE. João ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-01 11:37 ` João Távora @ 2023-03-01 14:51 ` Michael Albinus 2023-03-01 15:02 ` Gregory Heytings 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-03-01 14:51 UTC (permalink / raw) To: João Távora; +Cc: Thomas Koch, Gregory Heytings, 61350 João Távora <joaotavora@gmail.com> writes: Hi, > On Wed, Mar 1, 2023 at 11:23 AM Gregory Heytings <gregory@heytings.org> wrote: > >> Are you sure that this patch actually turns ControlPersist on? > > From Tramp's debug log. > > 11:34:29.869596 tramp-maybe-open-connection (3) # Sending command > ‘exec ssh -l sshuser -p 2022 -o ControlMaster=auto -o > ControlPath=tramp.%C -o ControlPersist=yes -e none localhost’ > 11:34:29.869679 tramp-send-command (6) # exec ssh -l sshuser -p 2022 > -o ControlMaster=auto -o ControlPath=tramp.%C -o ControlPersist=yes -e > none localhost > 11:34:29.869814 tramp-process-actions (3) # Waiting for prompts from > remote shell... I confirm João's observation. Setting ControlPersist=yes doesn't fix the problem. And it is clear why it doesn't change anything: - You open a remote connection. Tramp's main connection process becomes the ControlMaster. - Eglot is called, which reuses this ControlMaster socket. Both the Tramp main connection process, and the Eglot process, persist, and are used for data exchange. It doesn't make a difference what value we have for ControlPersist. > João Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-03-01 14:51 ` Michael Albinus @ 2023-03-01 15:02 ` Gregory Heytings 0 siblings, 0 replies; 98+ messages in thread From: Gregory Heytings @ 2023-03-01 15:02 UTC (permalink / raw) To: Michael Albinus; +Cc: Thomas Koch, João Távora, 61350 [-- Attachment #1: Type: text/plain, Size: 277 bytes --] > > I confirm João's observation. Setting ControlPersist=yes doesn't fix the > problem. > Okay, thanks. That was just a suggestion, I've seen similar hanging problems in setups without ControlPersist set, and AFAIR that setting wasn't mentioned in this thread. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-02-07 16:33 bug#61350: Eglot over Tramp freezes with large project Thomas Koch 2023-02-17 9:54 ` Michael Albinus 2023-02-23 12:17 ` João Távora @ 2023-04-24 1:44 ` Aaron Madlon-Kay 2023-05-05 11:32 ` Michael Albinus 2 siblings, 1 reply; 98+ messages in thread From: Aaron Madlon-Kay @ 2023-04-24 1:44 UTC (permalink / raw) To: 61350 Hello all. I wanted to share that changes related to this bug, specifically the changes in commit 54ef338 ("Improve Tramp processes to accept output over the same socket") break a package I use: https://github.com/jscheid/prettier.el It seems that prettier.el calls tramp-accept-process-output with a 20s timeout, and when that timeout is not respected it thinks that the process has died. Here is the relevant code: https://github.com/jscheid/prettier.el/blob/e419bb7a916c38a4a62632c49233de1523f41218/prettier.el#L1251-L1256 I don't know enough about process handling to understand if prettier.el is doing the wrong thing, and if so how to fix it. Thanks, Aaron ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-04-24 1:44 ` Aaron Madlon-Kay @ 2023-05-05 11:32 ` Michael Albinus 2023-05-05 13:14 ` Ruijie Yu via Bug reports for GNU Emacs, the Swiss army knife of text editors 0 siblings, 1 reply; 98+ messages in thread From: Michael Albinus @ 2023-05-05 11:32 UTC (permalink / raw) To: Aaron Madlon-Kay; +Cc: 61350 Aaron Madlon-Kay <aaron@madlon-kay.com> writes: > Hello all. Hi, > I wanted to share that changes related to this bug, > specifically the changes in commit 54ef338 ("Improve Tramp processes > to accept output over the same socket") break a package I use: > https://github.com/jscheid/prettier.el > > It seems that prettier.el calls tramp-accept-process-output with a 20s > timeout, and when that timeout is not respected it thinks that the > process has died. Here is the relevant code: > https://github.com/jscheid/prettier.el/blob/e419bb7a916c38a4a62632c49233de1523f41218/prettier.el#L1251-L1256 > > I don't know enough about process handling to understand if > prettier.el is doing the wrong thing, and if so how to fix it. tramp-accept-process-output is not a public function, it could change internally in an incompatible way. So you might contact the Tramp team on the respective mailing list in order to check what we could do with prettier.el. > Thanks, > Aaron Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-05-05 11:32 ` Michael Albinus @ 2023-05-05 13:14 ` Ruijie Yu via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-05-05 14:53 ` Michael Albinus 0 siblings, 1 reply; 98+ messages in thread From: Ruijie Yu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-05-05 13:14 UTC (permalink / raw) To: Michael Albinus; +Cc: 61350, Aaron Madlon-Kay Michael Albinus <michael.albinus@gmx.de> writes: > tramp-accept-process-output is not a public function, it could change > internally in an incompatible way. I'm confused -- I don't see "private" in its name (no double dash), nor did its docstring say anything about it. If this function is in fact private, is there any documentation on this topic? -- Best, RY [Please note that this mail might go to spam due to some misconfiguration in my mail server -- still investigating.] ^ permalink raw reply [flat|nested] 98+ messages in thread
* bug#61350: Eglot over Tramp freezes with large project 2023-05-05 13:14 ` Ruijie Yu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-05-05 14:53 ` Michael Albinus 0 siblings, 0 replies; 98+ messages in thread From: Michael Albinus @ 2023-05-05 14:53 UTC (permalink / raw) To: Ruijie Yu; +Cc: 61350, Aaron Madlon-Kay Ruijie Yu <ruijie@netyu.xyz> writes: Hi, >> tramp-accept-process-output is not a public function, it could change >> internally in an incompatible way. > > I'm confused -- I don't see "private" in its name (no double dash), nor > did its docstring say anything about it. If this function is in fact > private, is there any documentation on this topic? The interface to be used for Tramp are the functions described in the Elisp manual at (info "(elisp) Magic File Names") All Tramp functions should be regarded as internal, except some very few commands described the Tramp manual. In (info "(tramp) External packages") there is --8<---------------cut here---------------start------------->8--- 8.2 Integrating with external Lisp packages =========================================== In general, it is not recommended to use TRAMP functions and variables not described in this manual. They might change their signature and/or semantics without any announcement. --8<---------------cut here---------------end--------------->8--- Best regards, Michael. ^ permalink raw reply [flat|nested] 98+ messages in thread
end of thread, other threads:[~2023-05-05 14:53 UTC | newest] Thread overview: 98+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-02-07 16:33 bug#61350: Eglot over Tramp freezes with large project Thomas Koch 2023-02-17 9:54 ` Michael Albinus 2023-02-17 10:33 ` Thomas Koch 2023-02-18 11:10 ` Thomas Koch 2023-02-18 12:07 ` Michael Albinus 2023-02-23 11:55 ` Thomas Koch 2023-02-25 14:36 ` Michael Albinus 2023-02-23 12:17 ` João Távora 2023-02-23 14:18 ` João Távora 2023-02-23 14:47 ` Thomas Koch 2023-02-23 15:22 ` João Távora 2023-02-24 17:19 ` Michael Albinus 2023-02-24 17:45 ` João Távora 2023-02-25 14:27 ` Michael Albinus 2023-02-25 23:09 ` João Távora 2023-02-26 10:24 ` Thomas Koch 2023-02-26 15:58 ` Michael Albinus 2023-02-26 17:23 ` Michael Albinus 2023-02-26 21:13 ` João Távora 2023-02-26 21:45 ` João Távora 2023-02-27 7:53 ` Michael Albinus 2023-02-27 9:42 ` João Távora 2023-02-27 20:11 ` Michael Albinus 2023-02-27 7:47 ` Michael Albinus 2023-02-27 9:35 ` João Távora 2023-02-27 20:10 ` Michael Albinus 2023-02-28 0:10 ` João Távora 2023-02-28 10:38 ` Michael Albinus 2023-02-28 11:33 ` João Távora 2023-02-28 12:59 ` Michael Albinus 2023-02-28 14:41 ` João Távora 2023-02-28 14:18 ` Michael Albinus 2023-02-28 14:51 ` João Távora 2023-02-28 15:01 ` Michael Albinus 2023-02-28 17:55 ` Thomas Koch 2023-03-01 14:10 ` João Távora 2023-03-01 16:19 ` João Távora 2023-03-02 11:01 ` Michael Albinus 2023-03-02 11:22 ` João Távora 2023-03-02 11:50 ` Michael Albinus 2023-03-05 11:21 ` Michael Albinus 2023-03-05 11:45 ` Thomas Koch 2023-03-05 12:23 ` Michael Albinus 2023-03-07 12:49 ` Michael Albinus 2023-03-07 13:04 ` Thomas Koch 2023-03-07 13:33 ` João Távora 2023-03-07 13:52 ` Michael Albinus 2023-03-07 14:03 ` João Távora 2023-03-07 14:31 ` Michael Albinus 2023-03-11 9:00 ` Michael Albinus 2023-03-11 10:14 ` Thomas Koch 2023-03-11 11:47 ` João Távora 2023-03-11 12:27 ` Michael Albinus 2023-03-11 11:42 ` João Távora 2023-03-11 12:44 ` Michael Albinus 2023-03-11 14:01 ` João Távora 2023-03-11 14:25 ` Michael Albinus 2023-03-12 0:48 ` João Távora 2023-03-12 10:22 ` Michael Albinus 2023-03-14 11:01 ` Michael Albinus 2023-03-14 15:00 ` Michael Albinus 2023-03-14 15:19 ` João Távora 2023-03-14 15:42 ` Michael Albinus 2023-03-15 17:47 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-15 18:05 ` João Távora 2023-03-15 18:30 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-15 19:44 ` João Távora 2023-03-15 20:14 ` João Távora 2023-03-15 21:34 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-15 21:55 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-16 13:28 ` João Távora 2023-03-18 12:34 ` Michael Albinus 2023-03-15 21:43 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-15 21:49 ` João Távora 2023-03-16 6:24 ` Jim Porter 2023-03-16 13:25 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-16 13:28 ` João Távora 2023-03-16 15:58 ` João Távora 2023-03-16 20:36 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-03-16 22:04 ` João Távora 2023-03-07 13:47 ` Michael Albinus 2023-03-06 12:42 ` Michael Albinus 2023-03-06 13:45 ` João Távora 2023-03-06 13:42 ` João Távora 2023-03-02 10:40 ` Michael Albinus 2023-02-28 19:37 ` João Távora 2023-03-01 8:44 ` Michael Albinus 2023-03-01 11:15 ` João Távora 2023-03-01 10:46 ` Gregory Heytings 2023-03-01 11:08 ` João Távora 2023-03-01 11:23 ` Gregory Heytings 2023-03-01 11:37 ` João Távora 2023-03-01 14:51 ` Michael Albinus 2023-03-01 15:02 ` Gregory Heytings 2023-04-24 1:44 ` Aaron Madlon-Kay 2023-05-05 11:32 ` Michael Albinus 2023-05-05 13:14 ` Ruijie Yu via Bug reports for GNU Emacs, the Swiss army knife of text editors 2023-05-05 14:53 ` Michael Albinus
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).