* process.c: read_process_output: hard coded 4096 bytes read limit @ 2013-06-22 10:05 Miguel Guedes 2013-06-22 17:27 ` Paul Eggert 2013-06-23 9:28 ` Andreas Schwab 0 siblings, 2 replies; 8+ messages in thread From: Miguel Guedes @ 2013-06-22 10:05 UTC (permalink / raw) To: emacs-devel Hello List, In process.c, function read_process_output, all reads are limited to a maximum of 4096 bytes. What is the rationale behind imposing a hardcoded limit of 4096 bytes per read from a process channel? (why isn't the user allowed to specify this read limit when creating a process/ connection?) A minor mode I've written that sometimes receives 500-800KB of data as a response from an async network connection (opened via make-network- process) takes several seconds to fully receive the data due to the hard coded limit mentioned above and the fact that emacs waits between reads (probably in wait_reading_process_output upon receiving a signal in sigchld_handler). -- Miguel ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit 2013-06-22 10:05 process.c: read_process_output: hard coded 4096 bytes read limit Miguel Guedes @ 2013-06-22 17:27 ` Paul Eggert 2013-06-23 8:59 ` Miguel Guedes 2013-06-23 9:28 ` Andreas Schwab 1 sibling, 1 reply; 8+ messages in thread From: Paul Eggert @ 2013-06-22 17:27 UTC (permalink / raw) To: Miguel Guedes; +Cc: emacs-devel On 06/22/13 03:05, Miguel Guedes wrote: > In process.c, function read_process_output, all reads are limited to a > maximum of 4096 bytes. What is the rationale behind imposing a hardcoded > limit of 4096 bytes per read from a process channel? (why isn't the > user allowed to specify this read limit when creating a process/ > connection?) I don't know, and that limit has bothered me in the past, too. It's a simple matter to increase it to 16 K bytes; does the patch below help your performance? If so, I can install it. Letting the user specify a larger limit might make sense, but would take a bit of thought and hacking. === modified file 'src/coding.h' --- src/coding.h 2013-05-22 14:53:21 +0000 +++ src/coding.h 2013-06-22 16:53:36 +0000 @@ -399,6 +399,7 @@ int rejected; }; +enum { CARRYOVER_BYTES_MAX = 64 }; struct coding_system { @@ -491,7 +492,7 @@ /* Set to 1 if charbuf contains an annotation. */ unsigned annotated : 1; - unsigned char carryover[64]; + unsigned char carryover[CARRYOVER_BYTES_MAX]; int carryover_bytes; int default_char; === modified file 'src/process.c' --- src/process.c 2013-06-22 16:43:39 +0000 +++ src/process.c 2013-06-22 17:08:46 +0000 @@ -4949,7 +4949,7 @@ starting with our buffered-ahead character if we have one. Yield number of decoded characters read. - This function reads at most 4096 characters. + This function reads at most MAX_ALLOCA bytes. If you want to read all available subprocess output, you must call it repeatedly until it returns zero. @@ -4959,16 +4959,16 @@ static int read_process_output (Lisp_Object proc, register int channel) { - register ssize_t nbytes; - char *chars; - register struct Lisp_Process *p = XPROCESS (proc); + ssize_t nbytes; + char chars[MAX_ALLOCA]; + struct Lisp_Process *p = XPROCESS (proc); struct coding_system *coding = proc_decode_coding_system[channel]; int carryover = p->decoding_carryover; - int readmax = 4096; + verify (CARRYOVER_BYTES_MAX < sizeof chars); + int readmax = sizeof chars - carryover; ptrdiff_t count = SPECPDL_INDEX (); Lisp_Object odeactivate; - chars = alloca (carryover + readmax); if (carryover) /* See the comment above. */ memcpy (chars, SDATA (p->decoding_buf), carryover); ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit 2013-06-22 17:27 ` Paul Eggert @ 2013-06-23 8:59 ` Miguel Guedes 0 siblings, 0 replies; 8+ messages in thread From: Miguel Guedes @ 2013-06-23 8:59 UTC (permalink / raw) To: emacs-devel On Sat, 22 Jun 2013 10:27:55 -0700, Paul Eggert wrote: > I don't know, and that limit has bothered me in the past, too. > > It's a simple matter to increase it to 16 K bytes; does the patch below > help your performance? If so, I can install it. Letting the user > specify a larger limit might make sense, but would take a bit of thought > and hacking. > Thank you for your reply, Paul! I am unable to apply and test the patch at the moment but I'm sure it will help with performance, though not entirely fix it. I have this minor mode that I'm developing that receives code completion data asynchronously from a server bound to localhost. The actual server side processing of code completion always takes less than 100ms and the results are sent to the client in full. It is then ironic (as well as frustrating) that the bottleneck (latency lasting many seconds) I'm experiencing is being caused by emacs when just receiving the raw data before even any processing takes place in the client side (which is very fast anyway)! There are a couple of ways I can use to mitigate the latency issues mentioned above so I already have a fix, however IMHO there is a problem in the way emacs reads data from process channels that needs to be fixed, which I wouldn't mind contributing to. I assume you're an active emacs developer (my own knowledge of emacs' internal code and architecture is extremely limited), how would you tackle this, Paul? Apart from raising the hardcoded limit, one (perhaps simpler) way would be to allow the user to specify the buffer size when spawning a process/ network connection, however that still leaves open the problem of the waits between reads. So perhaps the best strategy to fix this, though probably more complicated, would be to tackle the waits between reads head on? How problematic with this be? Apologies for not understanding but by "I can install it", did you mean you can commit it? ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit 2013-06-22 10:05 process.c: read_process_output: hard coded 4096 bytes read limit Miguel Guedes 2013-06-22 17:27 ` Paul Eggert @ 2013-06-23 9:28 ` Andreas Schwab 2013-06-25 8:20 ` Miguel Guedes 1 sibling, 1 reply; 8+ messages in thread From: Andreas Schwab @ 2013-06-23 9:28 UTC (permalink / raw) To: Miguel Guedes; +Cc: emacs-devel Miguel Guedes <miguel.a.guedes@gmail.com> writes: > A minor mode I've written that sometimes receives 500-800KB of data as a > response from an async network connection (opened via make-network- > process) takes several seconds to fully receive the data due to the hard > coded limit mentioned above and the fact that emacs waits between reads > (probably in wait_reading_process_output upon receiving a signal in > sigchld_handler). If you call accept-process-output with this process it should read whatever is available without delay. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit 2013-06-23 9:28 ` Andreas Schwab @ 2013-06-25 8:20 ` Miguel Guedes 2013-06-26 11:57 ` Thien-Thi Nguyen 0 siblings, 1 reply; 8+ messages in thread From: Miguel Guedes @ 2013-06-25 8:20 UTC (permalink / raw) To: emacs-devel On Sun, 23 Jun 2013 11:28:38 +0200, Andreas Schwab wrote: > If you call accept-process-output with this process it should read > whatever is available without delay. Thanks, Andreas. How would you use accept-process-output with an asynchronous process channel? E.g. when would you call it? How do you know when there is data available? The only way (that I see) of knowing when there is data available is when the filter is first called but can accept-process- output be called from within the filter; wouldn't it lead to potentially massive recursiveness as the filter is called again and again? ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit 2013-06-25 8:20 ` Miguel Guedes @ 2013-06-26 11:57 ` Thien-Thi Nguyen 2013-06-26 13:16 ` Stefan Monnier 0 siblings, 1 reply; 8+ messages in thread From: Thien-Thi Nguyen @ 2013-06-26 11:57 UTC (permalink / raw) To: Miguel Guedes; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 2548 bytes --] () Miguel Guedes <miguel.a.guedes@gmail.com> () Tue, 25 Jun 2013 08:20:40 +0000 (UTC) How would you use accept-process-output with an asynchronous process channel? E.g. when would you call it? How do you know when there is data available? Presumably the protocol specifies a format for the received data that includes some kind of "end of message" marker. You know there is data available when that marker is not present (given a well-formed message). For example: (defun gnugo-synchronous-send/return (message) "Return (TIME . STRING) where TIME is that returned by `current-time' and STRING omits the two trailing newlines. See also `gnugo-query'." (when (gnugo-get :waitingp) (error "Sorry, still waiting for %s to play" (gnugo-get :gnugo-color))) (gnugo-put :sync-return "") (let ((proc (gnugo-get :proc))) (set-process-filter proc (lambda (proc string) (let* ((so-far (gnugo-get :sync-return)) (start (max 0 (- (length so-far) 2))) ; backtrack a little (full (gnugo-put :sync-return (concat so-far string)))) (when (string-match "\n\n" full start) (gnugo-put :sync-return (cons (current-time) (substring full 0 -2))))))) (gnugo-send-line message) (let (rv) ;; type change => break (while (stringp (setq rv (gnugo-get :sync-return))) (accept-process-output proc)) (gnugo-put :sync-return "") rv))) Here, the "end of message" marker is two consecutive newlines, detected by ‘string-match’. The (gratuitously dynamically prepared, but somewhat convenient for copy-and-paste purposes :-D) filter func accumulates the input and does the final (at EOM) transform, only. The trigger for the initial process output is from ‘gnugo-send-line’ after which it is ‘gnugo-synchronous-send/return’ that controls the looping, by monitoring the accumulated state to detect the transform. The only way (that I see) of knowing when there is data available is when the filter is first called but can accept-process- output be called from within the filter; wouldn't it lead to potentially massive recursiveness as the filter is called again and again? Yes. That's why the filter func and its (patiently looping) controller must adhere to some mini protocol defining access to shared state, and what that state "means". [Insert VMS mailbox mumblings, here...] -- Thien-Thi Nguyen GPG key: 4C807502 [-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit 2013-06-26 11:57 ` Thien-Thi Nguyen @ 2013-06-26 13:16 ` Stefan Monnier 2013-06-27 6:03 ` Thien-Thi Nguyen 0 siblings, 1 reply; 8+ messages in thread From: Stefan Monnier @ 2013-06-26 13:16 UTC (permalink / raw) To: Thien-Thi Nguyen; +Cc: Miguel Guedes, emacs-devel > How would you use accept-process-output with an asynchronous process > channel? E.g. when would you call it? How do you know when there is > data available? > Presumably the protocol specifies a format for the received data that > includes some kind of "end of message" marker. You know there is data > available when that marker is not present (given a well-formed message). > For example: But accept-process-output is synchronous: while running it, nothing else can happen. It's important to make sure that the performance is good enough with accept-process-output but it should *also* be good enough without accept-process-output. Stefan ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: process.c: read_process_output: hard coded 4096 bytes read limit 2013-06-26 13:16 ` Stefan Monnier @ 2013-06-27 6:03 ` Thien-Thi Nguyen 0 siblings, 0 replies; 8+ messages in thread From: Thien-Thi Nguyen @ 2013-06-27 6:03 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 610 bytes --] () Stefan Monnier <monnier@iro.umontreal.ca> () Wed, 26 Jun 2013 09:16:13 -0400 But accept-process-output is synchronous: while running it, nothing else can happen. It's important to make sure that the performance is good enough with accept-process-output but it should *also* be good enough without accept-process-output. I cannot agree or disagree w/o knowing the particulars. In the case of gnugo.el, message size never exceeds a few kilobytes, and performance is not critical. Anyway, i'm interested in hearing what choices OP makes. -- Thien-Thi Nguyen GPG key: 4C807502 [-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2013-06-27 6:03 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-06-22 10:05 process.c: read_process_output: hard coded 4096 bytes read limit Miguel Guedes 2013-06-22 17:27 ` Paul Eggert 2013-06-23 8:59 ` Miguel Guedes 2013-06-23 9:28 ` Andreas Schwab 2013-06-25 8:20 ` Miguel Guedes 2013-06-26 11:57 ` Thien-Thi Nguyen 2013-06-26 13:16 ` Stefan Monnier 2013-06-27 6:03 ` Thien-Thi Nguyen
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).