From: Paul Pogonyshev
Newsgroups: gmane.emacs.bugs
Subject: bug#56342: TRAMP (sh) issues way too many commands, thus being very slow over high-ping networks
Date: Sat, 2 Jul 2022 20:14:35 +0200
References: <8735fjh5ge.fsf@gmx.de>
In-Reply-To: <8735fjh5ge.fsf@gmx.de>
To: Michael Albinus
Cc: 56342@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="00000000000050098e05e2d67ade"

--00000000000050098e05e2d67ade
Content-Type: text/plain; charset="UTF-8"

Some more thoughts. Why does it even need `echo are you awake'? It's a
network connection, it can still fail even if it worked fine 1 ms before
when you checked. So, why not just let the first command fail if the
connection is dead and restart the connection if it fails in such a way
as to suspect that it is dead (i.e. no output)? Maybe limit this to read
commands.
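
A minimal sketch of that idea, with everything in it hypothetical: `send' and `reconnect' are stand-ins for a real transport, and the marker file only simulates a live connection. It skips the liveness probe, tries the command right away, and reconnects once only when the command comes back with no output.

```shell
# Stand-ins for a real transport (hypothetical, for illustration only):
# the "connection" counts as alive while the marker file exists.
conn_marker="${TMPDIR:-/tmp}/demo-conn.$$"

reconnect() { : > "$conn_marker"; }

send() {
    # A dead connection yields no output and a non-zero status.
    [ -e "$conn_marker" ] || return 1
    "$@"
}

run_or_reconnect() {
    # Optimistic path: no separate "are you awake" probe.
    status=0
    out=$(send "$@") || status=$?
    if [ -n "$out" ]; then
        printf '%s\n' "$out"
        return "$status"
    fi
    # No output at all: suspect a dead connection, reopen it, retry once.
    reconnect || return 1
    send "$@"
}
```

A command that legitimately prints nothing would be run twice here, which is one reason to restrict this to read commands that always produce output.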

A way to let higher-level code avoid certain `file-exists-p' calls: add
a dynamic variable that tells TRAMP to skip certain commands if the
result is not available from a cache. Something similar to
`process-file-side-effects'. Calling code could then do something like
this:

    ;; TRAMP will return t or nil if it knows, or 'unknown if not cached;
    ;; for local files there is no effect.
    (when (let ((tramp-may-skip-if-not-cached `((file-exists-p unknown ,file))))
            (file-exists-p file))
      ...)

Suggested semantics: list of (FUNCTION INSTANT-RESULT-IF-NOT-CACHED
ARGUMENT...). Any element of the list with unknown function name etc.
would be simply ignored.

Code that doesn't let-bind this variable will behave as before. Code
that cares can be optimized.

Paul

On Sat, 2 Jul 2022 at 17:58, Michael Albinus <michael.albinus@gmx.de> wrote:
Paul Pogonyshev <pogonyshev@gmail.com> writes:

Hi Paul,

> 1) check if connection is alive (`echo are you awake');
> 2) test if the file exists;
> 3) creating a temporary file for the chunk to be inserted; I guess it
> tries until it finds an unused filename, e.g. here it seems to be done
> after `test -e /tmp/tramp.OD3cCu', which doesn't exist;
> 4) 'touch' on the temporary file, presumably to create it;
> 5) 'chmod' on the temporary, presumably so that other users cannot
> read it;
> 6) copying the requested chunk from the full file into the temporary
> (using `dd');
> 7) finding the real name of the temporary with `readlink';
> 8) finding attributes of the temporary with `stat';
> 9) gzipping the temporary for transmission from the remote to the local
> machine;
> 10) testing if the temporary is a directory (WTF?);
> 11) removing the temporary.
>
> I guess it should be obvious that this is a bit too much for one
> `insert-file-contents' call.

In general, I agree. However, some of the commands are caused by
primitive file operations, like file-exists-p. Tramp cannot know what
will be the next call, and it doesn't have all the opportunities to
optimize, compared with the overall picture you see in the eleven steps.
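
When a caller does see the whole picture, most of the eleven round trips could travel as one compound command. A rough sketch under assumptions: `fetch_chunk' is a hypothetical helper, not anything Tramp sends, and `mktemp' (not POSIX, but widely available) is swapped in for the separate `test -e', `touch' and `chmod' steps.

```shell
# fetch_chunk FILE OFFSET COUNT (hypothetical helper): write COUNT bytes of
# FILE, starting at byte OFFSET, gzip-compressed to stdout.
fetch_chunk() {
    # mktemp creates a private file in one step (steps 3 to 5 in the list).
    chunk_tmp=$(mktemp "${TMPDIR:-/tmp}/chunk.XXXXXX") || return 1
    # Copy the chunk (step 6), then compress it for transfer (step 9).
    chunk_status=0
    { dd if="$1" of="$chunk_tmp" bs=1 skip="$2" count="$3" 2>/dev/null &&
        gzip -c "$chunk_tmp"; } || chunk_status=$?
    # Remove the temporary whether or not the copy succeeded (step 11).
    rm -f "$chunk_tmp"
    return "$chunk_status"
}
```

Sent as a single command line, a call to this helper would cost one round trip in this sketch; `bs=1' keeps the offsets simple at the price of a slow byte-wise copy.
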
> Suggested improvements:
>
> * TRAMP should issue just one `stat' command to find out most of the
> things about a file: whether it exists, if it is a directory, its real
> name when dereferencing links and whatever stats it is used to find
> now; from `$ stat --help' this seems to be possible. In other words,
> TRAMP shouldn't use simple commands like `test -e': any ping, even
> nominal, will negate any gains from using a tad faster command.
> Instead, if it needs to find anything about a file, it should ask the
> remote about as many things as possible in one go: it is very likely
> that the additional information will be needed soon and even if not,
> this is basically free compared to ping anyway.

Not all remote hosts carry a stat command, and not all existing stat
implementations are GNU compatible. But yes, if possible, Tramp shall
gather as much information in one run, and cache the results for further
use.

I will see what could be done. Will come back with a proposal in the
next days (note that this will be for Emacs 29, i.e. git master).
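
One way to picture gathering everything in one run, sketched with assumptions: `probe_file' and `yes_no' are hypothetical names, the key=value output format is invented for illustration, and the `readlink -f' and `stat -c' lines rely on GNU-style tools and are simply skipped where those fail. The `test'-based lines use only POSIX primaries, so they still work on hosts without a usable `stat'.

```shell
# yes_no TEST-ARGS...: print "yes" if `test TEST-ARGS` succeeds, else "no".
yes_no() { if test "$@"; then echo yes; else echo no; fi; }

# probe_file FILE (hypothetical): report several facts about FILE at once.
probe_file() {
    echo "exists=$(yes_no -e "$1")"
    echo "directory=$(yes_no -d "$1")"
    echo "symlink=$(yes_no -L "$1")"
    echo "readable=$(yes_no -r "$1")"
    # Optional extras: printed only where these GNU-style options work.
    if real=$(readlink -f -- "$1" 2>/dev/null); then echo "realname=$real"; fi
    stat -c 'size=%s' -- "$1" 2>/dev/null || true
}
```

The caller would parse the reply once and cache every line, so later questions about the same file need no further round trip.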

> * TRAMP code should prefer the approach "try to do something and handle
> resulting errors" where possible. For example, don't check if the file
> exists, try to read it right away and handle failures properly. Code
> like `(when (file-exists-p ...) do-something)' adds an unnecessary
> command call and creates a race condition anyway. Also, error-free
> requests should be more frequent, so they should be the main
> optimization goal. I'm not sure if it is applicable to TRAMP itself
> and doesn't come from a higher level, though.

Indeed, this is not Tramp's responsibility. Tramp is a stupid
library. If there is a call for file-exists-p, it must return the
answer. It doesn't know what will be the next request. So I'm rather
pessimistic that Tramp can improve here.
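
For code that does control the sequence, the try-first pattern can be shown in shell terms. A minimal sketch with a hypothetical helper name: one command attempts the read and its exit status carries the error, so no separate `test -e' is sent first.

```shell
# read_or_report FILE (hypothetical): print FILE, or explain why not.
read_or_report() {
    # Attempt the read directly; the exit status doubles as the existence
    # check, so there is no window between "check" and "use".
    if ! cat -- "$1" 2>/dev/null; then
        echo "cannot read: $1" >&2
        return 1
    fi
}
```

The success path costs one command, and the failure path costs no more than the check it replaces.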

Best regards, Michael.
--00000000000050098e05e2d67ade--