unofficial mirror of guile-user@gnu.org 
 help / color / mirror / Atom feed
* reading from input pipe is slow?
@ 2018-06-22 16:40 Thomas Danckaert
  2018-06-22 21:47 ` Mark H Weaver
  0 siblings, 1 reply; 3+ messages in thread
From: Thomas Danckaert @ 2018-06-22 16:40 UTC (permalink / raw)
  To: guile-user

Dear guile users,

I've notice that reading a child process' stdout from a port opened 
with (open-input-pipe "command") is very slow.  In fact, it's faster 
to redirect the child process' output to a temporary file, and then 
read this temporary file from Guile, than to directly read the child 
process' output using a port (I've added some example code below 
where you can see this).

Is there an explanation for this?  Or even better: a solution?

Thank you for any information!

Thomas

You can execute the following at the REPL.  Here, I "cat" a big file 
to simulate a process generating a largish amount of text.

(use-modules (ice-9 textual-ports) (ice-9 popen))

(define big-file "/tmp/big-file.txt")

;; write 5MB to big-file
(with-output-to-file big-file
   (lambda ()
     (let put-line ((n 0))
       (simple-format (current-output-port) "~a~%" (make-string 1024 
#\a))
       (unless (> n (* 1024 5))
	(put-line (1+ n))))
     (simple-format (current-output-port) "This is the end")))

(define my-command (format #f "cat ~a" big-file))

;; 15 seconds on my machine:
,time (format #t "end of file: '~a'~%"
	      (string-take-right
	       (get-string-all (open-input-pipe my-command))
	       15))

;; 1 second on my machine:
,time (let ((tmpfile (tmpnam)))
	;; redirect my-command into a temporary file and read the 
file:
	(system (format #f "~a > ~a" my-command tmpfile))
	(format #t "end of file: '~a'~%"
		(string-take-right
		 (call-with-input-file tmpfile get-string-all)
		 15))
	(delete-file tmpfile))



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: reading from input pipe is slow?
  2018-06-22 16:40 reading from input pipe is slow? Thomas Danckaert
@ 2018-06-22 21:47 ` Mark H Weaver
  2018-06-25 11:32   ` Thomas Danckaert
  0 siblings, 1 reply; 3+ messages in thread
From: Mark H Weaver @ 2018-06-22 21:47 UTC (permalink / raw)
  To: Thomas Danckaert; +Cc: guile-user

Hi Thomas,

Thomas Danckaert <post@thomasdanckaert.be> writes:

> I've notice that reading a child process' stdout from a port opened
> with (open-input-pipe "command") is very slow.  In fact, it's faster
> to redirect the child process' output to a temporary file, and then
> read this temporary file from Guile, than to directly read the child
> process' output using a port (I've added some example code below where
> you can see this).
>
> Is there an explanation for this?  Or even better: a solution?

For some reason that I don't know, possibly historical, the pipes
created by procedures in (ice-9 popen) are unbuffered by default.  When
reading or writing from an unbuffered port, a system call is made for
every byte.

To enable buffering on pipe ports, call (setvbuf port 'block) after
opening it.  Something like this, where (rnrs io ports) is added to the
module imports you already had, for 'call-with-port':

  (call-with-port (open-input-pipe my-command)
    (lambda (port)
      (setvbuf port 'block)
      (string-take-right (get-string-all port)
                         15)))

Hope this helps,

      Mark



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: reading from input pipe is slow?
  2018-06-22 21:47 ` Mark H Weaver
@ 2018-06-25 11:32   ` Thomas Danckaert
  0 siblings, 0 replies; 3+ messages in thread
From: Thomas Danckaert @ 2018-06-25 11:32 UTC (permalink / raw)
  To: mhw; +Cc: guile-user

Hi Mark,

indeed, setting buffering with (setvbuf ...)  solved the issue.  
Thanks a lot!

> For some reason that I don't know, possibly historical, the pipes
> created by procedures in (ice-9 popen) are unbuffered by default.

I think you're right, but doesn't the manual suggest otherwise, when 
it says "Each direction has PIPE_BUF bytes of buffering" 
(https://www.gnu.org/software/guile/manual/html_node/Pipes.html)?  Or 
do I misinterpret that?

Also, let me mention that the statistical profiler from the 
(statprof) module has been very useful to isolate the problem. 
Initially I assumed the slowness was due to the external process.  
Only after running statprof did I realize that the problem was in my 
code.

Thomas



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-06-25 11:32 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-06-22 16:40 reading from input pipe is slow? Thomas Danckaert
2018-06-22 21:47 ` Mark H Weaver
2018-06-25 11:32   ` Thomas Danckaert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).