From: Philippe Vaucher <philippe.vaucher@gmail.com>
To: Emacs developers <emacs-devel@gnu.org>
Cc: Michael Albinus <michael.albinus@gmx.de>
Subject: TRAMP problem with large repositories
Date: Wed, 11 Dec 2019 20:46:11 +0100
Message-ID: <CAGK7Mr6ghyJ_MpOOg+bLB-P4zcm-i21SReNaW_u41nhR=o-etg@mail.gmail.com>
Hello,
Sorry if this is not the right place to post, feel free to redirect me as
needed.
While helping someone with a Projectile issue (
https://github.com/bbatsov/projectile/issues/1480), it seems that when
`shell-command-to-string` runs `git ls-files -zco --exclude-standard`
over TRAMP in a repository that has 85K files, it takes forever to
complete.
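For anyone who wants to reproduce this, here is a minimal sketch. The helper name `my/time-remote-ls-files` and the remote path in the usage comment are placeholders of mine, not something taken from the issue:

```elisp
(require 'benchmark)

(defun my/time-remote-ls-files (dir)
  "Return the seconds spent listing the files of the Git repository in DIR.
DIR may be a TRAMP file name; `shell-command-to-string' then runs
the command on the remote host.  Hypothetical helper."
  (let ((default-directory (file-name-as-directory dir)))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
    (car (benchmark-run 1
           (shell-command-to-string
            "git ls-files -zco --exclude-standard")))))

;; Example call; host and path are placeholders:
;; (my/time-remote-ls-files "/ssh:user@example.org:/path/to/big-repo/")
```

Calling it once with a local checkout of the same repository and once with the remote file name should show whether the time is spent in git itself or in the transport.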
Here's a stacktrace:
https://user-images.githubusercontent.com/81829/70549675-72b07f00-1b29-11ea-90f6-91fe0c36b0f4.png
We see that `tramp-wait-for-output` calls `tramp-wait-for-regexp` which
calls `tramp-check-for-regexp`, and when looking at the source:
(defun tramp-wait-for-output (proc &optional timeout)
  "Wait for output from remote command."
  (unless (buffer-live-p (process-buffer proc))
    (delete-process proc)
    (tramp-error proc 'file-error "Process `%s' not available, try again" proc))
  (with-current-buffer (process-buffer proc)
    (let* (;; Initially, `tramp-end-of-output' is "#$ ". There might
           ;; be leading escape sequences, which must be ignored.
           ;; Busyboxes built with the EDITING_ASK_TERMINAL config
           ;; option send also escape sequences, which must be
           ;; ignored.
           (regexp (format "[^#$\n]*%s\\(%s\\)?\r?$"
                           (regexp-quote tramp-end-of-output)
                           tramp-device-escape-sequence-regexp))
           ;; Sometimes, the commands do not return a newline but a
           ;; null byte before the shell prompt, for example "git
           ;; ls-files -c -z ...".
           (regexp1 (format "\\(^\\|\000\\)%s" regexp))
           (found (tramp-wait-for-regexp proc timeout regexp1)))
      .... snip ...
My understanding is that it runs a loop that reads a bit of the
command's output, tries to match the end-of-output regexp (after a line
start or a '\0'), and repeats until the regexp matches or the process
dies. Because the command returns a huge string (85K files), this
read-regexp-repeat cycle takes all the CPU (compared to reading the
whole output in one go and then checking for the regexp once).
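To illustrate why I suspect the loop, here is a toy model; it is not TRAMP's code, and I have not verified that TRAMP searches this way. All `my/` names and the marker string are made up. It counts how many characters the searches span when the whole buffer is searched after every chunk, versus when only the newly arrived text is searched:

```elisp
(require 'cl-lib)

(defconst my/prompt "///END///"
  "Hypothetical end-of-output marker standing in for the remote prompt.")

(defun my/wait-rescan (chunks)
  "Insert CHUNKS one by one, searching the whole buffer after each.
Return the total number of characters those searches span."
  (with-temp-buffer
    (let ((scanned 0))
      (cl-dolist (chunk chunks)
        (goto-char (point-max))
        (insert chunk)
        (setq scanned (+ scanned (buffer-size)))
        (goto-char (point-min))
        (when (search-forward my/prompt nil t)
          (cl-return)))
      scanned)))

(defun my/wait-tail (chunks)
  "Like `my/wait-rescan', but search only text not yet examined.
A small overlap is kept so a marker split across two chunks is found."
  (with-temp-buffer
    (let ((scanned 0)
          (overlap (1- (length my/prompt))))
      (cl-dolist (chunk chunks)
        ;; START is computed before inserting, so it points into old text.
        (let ((start (max (point-min) (- (point-max) overlap))))
          (goto-char (point-max))
          (insert chunk)
          (setq scanned (+ scanned (- (point-max) start)))
          (goto-char start)
          (when (search-forward my/prompt nil t)
            (cl-return))))
      scanned)))

;; 1000 chunks of 100 characters, followed by the marker.
(let ((chunks (append (make-list 1000 (make-string 100 ?x))
                      (list my/prompt))))
  (list (my/wait-rescan chunks) (my/wait-tail chunks)))
```

The first number grows quadratically with the output size and the second only linearly, which would match the CPU usage I see if the real loop behaves like the first function.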
My questions are the following:
1. Did I understand the problem right? Is this something known?
2. Is there something to be done about this? Or would it maybe
require too much refactoring / a faster implementation?
Kind regards,
Philippe