From mboxrd@z Thu Jan 1 00:00:00 1970
From: Philippe Vaucher
Newsgroups: gmane.emacs.devel
Subject: TRAMP problem with large repositories
Date: Wed, 11 Dec 2019 20:46:11 +0100
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000023495059972e310"
Cc: Michael Albinus
To: Emacs developers
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:243309

--000000000000023495059972e310
Content-Type: text/plain; charset="UTF-8"

Hello,

Sorry if this is not the right place to post; feel free to redirect me
as needed.

While helping someone with a projectile issue
(https://github.com/bbatsov/projectile/issues/1480), I noticed that when
`shell-command-to-string` runs `git ls-files -zco --exclude-standard`
over TRAMP on a repository that has 85K files, it takes forever to
complete.

Here's a stacktrace:
https://user-images.githubusercontent.com/81829/70549675-72b07f00-1b29-11ea-90f6-91fe0c36b0f4.png

We see that `tramp-wait-for-output` calls `tramp-wait-for-regexp`, which
calls `tramp-check-for-regexp`. Looking at the source:

(defun tramp-wait-for-output (proc &optional timeout)
  "Wait for output from remote command."
  (unless (buffer-live-p (process-buffer proc))
    (delete-process proc)
    (tramp-error proc 'file-error "Process `%s' not available, try again" proc))
  (with-current-buffer (process-buffer proc)
    (let* (;; Initially, `tramp-end-of-output' is "#$ ".  There might
           ;; be leading escape sequences, which must be ignored.
           ;; Busyboxes built with the EDITING_ASK_TERMINAL config
           ;; option send also escape sequences, which must be
           ;; ignored.
           (regexp (format "[^#$\n]*%s\\(%s\\)?\r?$"
                           (regexp-quote tramp-end-of-output)
                           tramp-device-escape-sequence-regexp))
           ;; Sometimes, the commands do not return a newline but a
           ;; null byte before the shell prompt, for example "git
           ;; ls-files -c -z ...".
           (regexp1 (format "\\(^\\|\000\\)%s" regexp))
           (found (tramp-wait-for-regexp proc timeout regexp1)))
      .... snip ...

My understanding is that it runs a loop that reads a bit of the
command's output, tries to match the shell prompt at the start of a
line (or after a '\0'), and repeats until it finds a match or the
process dies. Because the command returns a huge string (85K files),
this read-match-repeat cycle takes all the CPU (compared with reading
the whole output in one go and then checking for the regexp once).

My questions are the following:
  1. Did I understand the problem correctly? Is this a known issue?
  2. Can something be done about this? Or would it require too much
     refactoring or a faster implementation?

Kind regards,
Philippe
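P.S. To make my suspicion concrete, here is a toy model in Emacs Lisp.
It is not TRAMP's code: every `demo-` name is made up for illustration,
and the "scanned" count is only a rough proxy for regexp work (the size
of the region handed to the search), not a measurement. It assumes the
slowdown comes from searching the whole accumulated buffer again after
each chunk arrives, which is exactly the part I have not verified.

```elisp
;; Toy model only; all `demo-' names are hypothetical, not part of TRAMP.

(defconst demo-prompt "#$ "
  "Stand-in for the end-of-output marker used in this sketch.")

(defun demo-wait (chunks tail-only)
  "Insert CHUNKS one by one into a temp buffer until `demo-prompt' ends it.
When TAIL-ONLY is nil, every check searches the buffer from its start.
When TAIL-ONLY is non-nil, every check searches only the last few
characters.  Return a cons of FOUND and SCANNED, where SCANNED is the
total size of all regions that were searched."
  (with-temp-buffer
    (let ((regexp (concat (regexp-quote demo-prompt) "\\'"))
          (scanned 0)
          (found nil))
      (while (and chunks (not found))
        ;; Simulate a new chunk of process output arriving.
        (goto-char (point-max))
        (insert (pop chunks))
        ;; Pick where this check starts searching.
        (let ((start (if tail-only
                         (max (point-min)
                              (- (point-max) (length demo-prompt)))
                       (point-min))))
          (setq scanned (+ scanned (- (point-max) start)))
          (goto-char start)
          (setq found (and (re-search-forward regexp nil t) t))))
      (cons found scanned))))

;; Full rescan after each chunk: 4 + 8 + 11 characters searched.
(demo-wait '("aaaa" "bbbb" "#$ ") nil)   ; => (t . 23)
;; Tail-only check: 3 characters searched per chunk.
(demo-wait '("aaaa" "bbbb" "#$ ") t)     ; => (t . 9)
```

With three chunks the difference is small, but in this model the
full-rescan count grows roughly with the square of the output size,
while the tail-only count grows linearly with the number of chunks.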
--000000000000023495059972e310--