* process.c: read_process_output: hard coded 4096 bytes read limit
@ 2013-06-22 10:05 Miguel Guedes
2013-06-22 17:27 ` Paul Eggert
2013-06-23 9:28 ` Andreas Schwab
0 siblings, 2 replies; 8+ messages in thread
From: Miguel Guedes @ 2013-06-22 10:05 UTC (permalink / raw)
To: emacs-devel
Hello List,
In process.c, function read_process_output, all reads are limited to a
maximum of 4096 bytes. What is the rationale behind imposing a hardcoded
limit of 4096 bytes per read from a process channel? (why isn't the
user allowed to specify this read limit when creating a process/
connection?)
A minor mode I've written that sometimes receives 500-800KB of data as a
response from an async network connection (opened via make-network-
process) takes several seconds to fully receive the data due to the hard
coded limit mentioned above and the fact that emacs waits between reads
(probably in wait_reading_process_output upon receiving a signal in
sigchld_handler).
--
Miguel
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit
2013-06-22 10:05 process.c: read_process_output: hard coded 4096 bytes read limit Miguel Guedes
@ 2013-06-22 17:27 ` Paul Eggert
2013-06-23 8:59 ` Miguel Guedes
2013-06-23 9:28 ` Andreas Schwab
1 sibling, 1 reply; 8+ messages in thread
From: Paul Eggert @ 2013-06-22 17:27 UTC (permalink / raw)
To: Miguel Guedes; +Cc: emacs-devel
On 06/22/13 03:05, Miguel Guedes wrote:
> In process.c, function read_process_output, all reads are limited to a
> maximum of 4096 bytes. What is the rationale behind imposing a hardcoded
> limit of 4096 bytes per read from a process channel? (why isn't the
> user allowed to specify this read limit when creating a process/
> connection?)
I don't know, and that limit has bothered me in the past, too.
It's a simple matter to increase it to 16 K bytes; does the
patch below help your performance? If so, I can install it.
Letting the user specify a larger limit
might make sense, but would take a bit of thought and hacking.
=== modified file 'src/coding.h'
--- src/coding.h 2013-05-22 14:53:21 +0000
+++ src/coding.h 2013-06-22 16:53:36 +0000
@@ -399,6 +399,7 @@
int rejected;
};
+enum { CARRYOVER_BYTES_MAX = 64 };
struct coding_system
{
@@ -491,7 +492,7 @@
/* Set to 1 if charbuf contains an annotation. */
unsigned annotated : 1;
- unsigned char carryover[64];
+ unsigned char carryover[CARRYOVER_BYTES_MAX];
int carryover_bytes;
int default_char;
=== modified file 'src/process.c'
--- src/process.c 2013-06-22 16:43:39 +0000
+++ src/process.c 2013-06-22 17:08:46 +0000
@@ -4949,7 +4949,7 @@
starting with our buffered-ahead character if we have one.
Yield number of decoded characters read.
- This function reads at most 4096 characters.
+ This function reads at most MAX_ALLOCA bytes.
If you want to read all available subprocess output,
you must call it repeatedly until it returns zero.
@@ -4959,16 +4959,16 @@
static int
read_process_output (Lisp_Object proc, register int channel)
{
- register ssize_t nbytes;
- char *chars;
- register struct Lisp_Process *p = XPROCESS (proc);
+ ssize_t nbytes;
+ char chars[MAX_ALLOCA];
+ struct Lisp_Process *p = XPROCESS (proc);
struct coding_system *coding = proc_decode_coding_system[channel];
int carryover = p->decoding_carryover;
- int readmax = 4096;
+ verify (CARRYOVER_BYTES_MAX < sizeof chars);
+ int readmax = sizeof chars - carryover;
ptrdiff_t count = SPECPDL_INDEX ();
Lisp_Object odeactivate;
- chars = alloca (carryover + readmax);
if (carryover)
/* See the comment above. */
memcpy (chars, SDATA (p->decoding_buf), carryover);
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit
2013-06-22 17:27 ` Paul Eggert
@ 2013-06-23 8:59 ` Miguel Guedes
0 siblings, 0 replies; 8+ messages in thread
From: Miguel Guedes @ 2013-06-23 8:59 UTC (permalink / raw)
To: emacs-devel
On Sat, 22 Jun 2013 10:27:55 -0700, Paul Eggert wrote:
> I don't know, and that limit has bothered me in the past, too.
>
> It's a simple matter to increase it to 16 K bytes; does the patch below
> help your performance? If so, I can install it. Letting the user
> specify a larger limit might make sense, but would take a bit of thought
> and hacking.
>
Thank you for your reply, Paul!
I am unable to apply and test the patch at the moment but I'm sure it
will help with performance, though not entirely fix it. I have this
minor mode that I'm developing that receives code completion data
asynchronously from a server bound to localhost. The actual server side
processing of code completion always takes less than 100ms and the
results are sent to the client in full. It is then ironic (as well as
frustrating) that the bottleneck (latency lasting many seconds) I'm
experiencing is being caused by emacs when just receiving the raw data
before even any processing takes place in the client side (which is very
fast anyway)!
There are a couple of ways I can use to mitigate the latency issues
mentioned above so I already have a fix, however IMHO there is a problem
in the way emacs reads data from process channels that needs to be fixed,
which I wouldn't mind contributing to. I assume you're an active emacs
developer (my own knowledge of emacs' internal code and architecture is
extremely limited), how would you tackle this, Paul?
Apart from raising the hardcoded limit, one (perhaps simpler) way would
be to allow the user to specify the buffer size when spawning a process/
network connection, however that still leaves open the problem of the
waits between reads. So perhaps the best strategy to fix this, though
probably more complicated, would be to tackle the waits between reads
head on? How problematic with this be?
Apologies for not understanding but by "I can install it", did you mean
you can commit it?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit
2013-06-22 10:05 process.c: read_process_output: hard coded 4096 bytes read limit Miguel Guedes
2013-06-22 17:27 ` Paul Eggert
@ 2013-06-23 9:28 ` Andreas Schwab
2013-06-25 8:20 ` Miguel Guedes
1 sibling, 1 reply; 8+ messages in thread
From: Andreas Schwab @ 2013-06-23 9:28 UTC (permalink / raw)
To: Miguel Guedes; +Cc: emacs-devel
Miguel Guedes <miguel.a.guedes@gmail.com> writes:
> A minor mode I've written that sometimes receives 500-800KB of data as a
> response from an async network connection (opened via make-network-
> process) takes several seconds to fully receive the data due to the hard
> coded limit mentioned above and the fact that emacs waits between reads
> (probably in wait_reading_process_output upon receiving a signal in
> sigchld_handler).
If you call accept-process-output with this process it should read
whatever is available without delay.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit
2013-06-23 9:28 ` Andreas Schwab
@ 2013-06-25 8:20 ` Miguel Guedes
2013-06-26 11:57 ` Thien-Thi Nguyen
0 siblings, 1 reply; 8+ messages in thread
From: Miguel Guedes @ 2013-06-25 8:20 UTC (permalink / raw)
To: emacs-devel
On Sun, 23 Jun 2013 11:28:38 +0200, Andreas Schwab wrote:
> If you call accept-process-output with this process it should read
> whatever is available without delay.
Thanks, Andreas.
How would you use accept-process-output with an asynchronous process
channel? E.g. when would you call it? How do you know when there is data
available? The only way (that I see) of knowing when there is data
available is when the filter is first called but can accept-process-
output be called from within the filter; wouldn't it lead to potentially
massive recursiveness as the filter is called again and again?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit
2013-06-25 8:20 ` Miguel Guedes
@ 2013-06-26 11:57 ` Thien-Thi Nguyen
2013-06-26 13:16 ` Stefan Monnier
0 siblings, 1 reply; 8+ messages in thread
From: Thien-Thi Nguyen @ 2013-06-26 11:57 UTC (permalink / raw)
To: Miguel Guedes; +Cc: emacs-devel
[-- Attachment #1: Type: text/plain, Size: 2548 bytes --]
() Miguel Guedes <miguel.a.guedes@gmail.com>
() Tue, 25 Jun 2013 08:20:40 +0000 (UTC)
How would you use accept-process-output with an asynchronous process
channel? E.g. when would you call it? How do you know when there is
data available?
Presumably the protocol specifies a format for the received data that
includes some kind of "end of message" marker. You know there is data
available when that marker is not present (given a well-formed message).
For example:
(defun gnugo-synchronous-send/return (message)
"Return (TIME . STRING) where TIME is that returned by `current-time' and
STRING omits the two trailing newlines. See also `gnugo-query'."
(when (gnugo-get :waitingp)
(error "Sorry, still waiting for %s to play" (gnugo-get :gnugo-color)))
(gnugo-put :sync-return "")
(let ((proc (gnugo-get :proc)))
(set-process-filter
proc (lambda (proc string)
(let* ((so-far (gnugo-get :sync-return))
(start (max 0 (- (length so-far) 2))) ; backtrack a little
(full (gnugo-put :sync-return (concat so-far string))))
(when (string-match "\n\n" full start)
(gnugo-put :sync-return
(cons (current-time) (substring full 0 -2)))))))
(gnugo-send-line message)
(let (rv)
;; type change => break
(while (stringp (setq rv (gnugo-get :sync-return)))
(accept-process-output proc))
(gnugo-put :sync-return "")
rv)))
Here, the "end of message" marker is two consecutive newlines, detected
by ‘string-match’. The (gratuitously dynamically prepared, but somewhat
convenient for copy-and-paste purposes :-D) filter func accumulates the
input and does the final (at EOM) transform, only. The trigger for the
initial process output is from ‘gnugo-send-line’ after which it is
‘gnugo-synchronous-send/return’ that controls the looping, by monitoring
the accumulated state to detect the transform.
The only way (that I see) of knowing when there is data available is
when the filter is first called but can accept-process- output be
called from within the filter; wouldn't it lead to potentially
massive recursiveness as the filter is called again and again?
Yes. That's why the filter func and its (patiently looping) controller
must adhere to some mini protocol defining access to shared state, and
what that state "means". [Insert VMS mailbox mumblings, here...]
--
Thien-Thi Nguyen
GPG key: 4C807502
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit
2013-06-26 11:57 ` Thien-Thi Nguyen
@ 2013-06-26 13:16 ` Stefan Monnier
2013-06-27 6:03 ` Thien-Thi Nguyen
0 siblings, 1 reply; 8+ messages in thread
From: Stefan Monnier @ 2013-06-26 13:16 UTC (permalink / raw)
To: Thien-Thi Nguyen; +Cc: Miguel Guedes, emacs-devel
> How would you use accept-process-output with an asynchronous process
> channel? E.g. when would you call it? How do you know when there is
> data available?
> Presumably the protocol specifies a format for the received data that
> includes some kind of "end of message" marker. You know there is data
> available when that marker is not present (given a well-formed message).
> For example:
But accept-process-output is synchronous: while running it, nothing else
can happen. It's important to make sure that the performance is good
enough with accept-process-output but it should *also* be good enough
without accept-process-output.
Stefan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit
2013-06-26 13:16 ` Stefan Monnier
@ 2013-06-27 6:03 ` Thien-Thi Nguyen
0 siblings, 0 replies; 8+ messages in thread
From: Thien-Thi Nguyen @ 2013-06-27 6:03 UTC (permalink / raw)
To: emacs-devel
[-- Attachment #1: Type: text/plain, Size: 610 bytes --]
() Stefan Monnier <monnier@iro.umontreal.ca>
() Wed, 26 Jun 2013 09:16:13 -0400
But accept-process-output is synchronous: while running it, nothing
else can happen. It's important to make sure that the performance is
good enough with accept-process-output but it should *also* be good
enough without accept-process-output.
I cannot agree or disagree w/o knowing the particulars. In the case of
gnugo.el, message size never exceeds a few kilobytes, and performance is
not critical. Anyway, i'm interested in hearing what choices OP makes.
--
Thien-Thi Nguyen
GPG key: 4C807502
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2013-06-27 6:03 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-22 10:05 process.c: read_process_output: hard coded 4096 bytes read limit Miguel Guedes
2013-06-22 17:27 ` Paul Eggert
2013-06-23 8:59 ` Miguel Guedes
2013-06-23 9:28 ` Andreas Schwab
2013-06-25 8:20 ` Miguel Guedes
2013-06-26 11:57 ` Thien-Thi Nguyen
2013-06-26 13:16 ` Stefan Monnier
2013-06-27 6:03 ` Thien-Thi Nguyen
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).