From: Michael Albinus <michael.albinus@gmx.de>
To: Philippe Vaucher <philippe.vaucher@gmail.com>
Cc: Emacs developers <emacs-devel@gnu.org>
Subject: Re: TRAMP problem with large repositories
Date: Thu, 12 Dec 2019 14:35:50 +0100
Message-ID: <877e313bux.fsf@gmx.de>
In-Reply-To: CAGK7Mr6ghyJ_MpOOg+bLB-P4zcm-i21SReNaW_u41nhR=o-etg@mail.gmail.com
[-- Attachment #1: Type: text/plain, Size: 1411 bytes --]
Philippe Vaucher <philippe.vaucher@gmail.com> writes:
> Hello,
Hi Philippe,
> While helping someone with a projectile issue
> (https://github.com/bbatsov/projectile/issues/1480), I noticed that
> when `shell-command-to-string` tries to execute
> `git ls-files -zco --exclude-standard` over TRAMP on a repository that
> has 85K files, it seems to take forever to complete.
>
> We see that `tramp-wait-for-output` calls `tramp-wait-for-regexp`
> which calls `tramp-check-for-regexp`, and when looking at the source:
>
> My understanding is that it runs a loop that reads a bit of what the
> command outputs, then tries to parse line endings (or '\0'), and
> repeats until the process has died or it has found one. Because the
> command returns a huge string (85K files), this
> read-regexp-repeat cycle takes all the CPU (compared to reading the
> whole chunk in one go and then checking for the regexp).
>
> My questions are the following:
>
> 1. Did I understand the problem correctly? Is this a known issue?
Yes, your analysis is right. And no, I haven't seen related reports yet.
> 2. Is there something to be done about this? Or would it require
> too much refactoring or a faster implementation?
I have appended a patch which should fix the problem. Could you please
test it (or have it tested)?
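The core idea of the patch can be sketched as follows. This is only a
minimal illustration: `sketch-search-tail` and its optional LIMIT
argument are hypothetical names for this example. The function the
patch actually adds is `tramp-search-regexp`, which hard-codes a limit
of 256 characters.

```elisp
;; Hypothetical helper for illustration only; not part of Tramp.
(defun sketch-search-tail (regexp &optional limit)
  "Search backward for REGEXP in the last LIMIT characters of the buffer.
LIMIT defaults to 256.  Return the start of the match, or nil if
REGEXP does not occur in that tail."
  (let ((limit (or limit 256)))
    (goto-char (point-max))
    ;; The BOUND argument keeps the search from walking back through
    ;; the whole buffer, however large the output has grown.
    (re-search-backward
     regexp (max (point-min) (- (point-max) limit)) 'noerror)))

;; Usage: a large buffer with the marker near the end.
(with-temp-buffer
  (dotimes (_ 85000) (insert "some/file/name\0"))
  (insert "tramp_exit_status 0\n")
  (sketch-search-tail "tramp_exit_status [0-9]+"))
```

Since the prompt or exit status marker is expected at the very end of
the output, bounding the search this way should make each check cost
roughly the same regardless of how much output has already arrived.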
Btw, the latest Tramp release is always available via GNU ELPA.
> Kind regards,
> Philippe
Best regards, Michael.
[-- Attachment #2: Type: text/x-patch, Size: 3505 bytes --]
diff --git a/lisp/tramp-sh.el b/lisp/tramp-sh.el
index 8de88d35..506c33df 100644
--- a/lisp/tramp-sh.el
+++ b/lisp/tramp-sh.el
@@ -5102,9 +5102,8 @@ function waits for output unless NOOUTPUT is set."
(forward-line 1)
(delete-region (point-min) (point)))
;; Delete the prompt.
- (goto-char (point-max))
- (re-search-backward regexp nil t)
- (delete-region (point) (point-max)))
+ (when (tramp-search-regexp regexp)
+ (delete-region (point) (point-max))))
(if timeout
(tramp-error
proc 'file-error
@@ -5134,8 +5133,7 @@ DONT-SUPPRESS-ERR is non-nil, stderr won't be sent to /dev/null."
"echo tramp_exit_status $?"
(if subshell " )" "")))
(with-current-buffer (tramp-get-connection-buffer vec)
- (goto-char (point-max))
- (unless (re-search-backward "tramp_exit_status [0-9]+" nil t)
+ (unless (tramp-search-regexp "tramp_exit_status [0-9]+")
(tramp-error
vec 'file-error "Couldn't find exit status of `%s'" command))
(skip-chars-forward "^ ")
diff --git a/lisp/tramp.el b/lisp/tramp.el
index 03e04568..e05e0965 100644
--- a/lisp/tramp.el
+++ b/lisp/tramp.el
@@ -4196,19 +4196,35 @@ for process communication also."
(buffer-string))
result)))
+(defun tramp-search-regexp (regexp)
+ "Search for REGEXP backwards, starting at point-max.
+If found, set point to the end of the occurrence found, and return point.
+Otherwise, return nil."
+ (goto-char (point-max))
+ ;; We restrict ourselves to the last 256 characters. There were
+ ;; reports of 85kB output, which has blocked Tramp forever.
+ (re-search-backward regexp (max (point-min) (- (point) 256)) 'noerror))
+
(defun tramp-check-for-regexp (proc regexp)
"Check, whether REGEXP is contained in process buffer of PROC.
Erase echoed commands if exists."
(with-current-buffer (process-buffer proc)
(goto-char (point-min))
- ;; Check whether we need to remove echo output.
+ ;; Check whether we need to remove echo output. The max length of
+ ;; the echo mark regexp is taken for search. We restrict the
+ ;; search for the second echo mark to PIPE_BUF characters.
(when (and (tramp-get-connection-property proc "check-remote-echo" nil)
- (re-search-forward tramp-echoed-echo-mark-regexp nil t))
+ (re-search-forward
+ tramp-echoed-echo-mark-regexp
+ (+ (point) (* 5 tramp-echo-mark-marker-length)) t))
(let ((begin (match-beginning 0)))
- (when (re-search-forward tramp-echoed-echo-mark-regexp nil t)
+ (when
+ (re-search-forward
+ tramp-echoed-echo-mark-regexp
+ (+ (point) (tramp-get-connection-property proc "pipe-buf" 4096)) t)
;; Discard echo from remote output.
- (tramp-set-connection-property proc "check-remote-echo" nil)
+ (tramp-flush-connection-property proc "check-remote-echo")
(tramp-message proc 5 "echo-mark found")
(forward-line 1)
(delete-region begin (point))
@@ -4229,8 +4245,7 @@ Erase echoed commands if exists."
;; overflow in regexp matcher". For example, //DIRED// lines of
;; directory listings with some thousand files. Therefore, we
;; look from the end.
- (goto-char (point-max))
- (ignore-errors (re-search-backward regexp nil t)))))
+ (tramp-search-regexp regexp))))
(defun tramp-wait-for-regexp (proc timeout regexp)
"Wait for a REGEXP to appear from process PROC within TIMEOUT seconds.