From: Dmitry Antipov
Newsgroups: gmane.emacs.devel
Subject: Simple optimization for read_avail_input()
Date: Fri, 30 Jan 2004 18:57:50 +0300
Message-ID: <401A7EFE.8030509@mail.ru>

Hello,

this is the top of the gprof output for an Emacs CVS snapshot. It was
compiled with '-O0 -ftest-coverage -g -pg -fprofile-arcs', started, and
then immediately exited with C-x C-c:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 12.12      0.04     0.04     2464     0.02     0.02  ccl_driver
 12.12      0.08     0.04      546     0.07     0.07  read_avail_input
  9.09      0.11     0.03    23731     0.00     0.00  read1
  9.09      0.14     0.03     4452     0.01     0.01  mark_object
  6.06      0.16     0.02   289315     0.00     0.00  readchar
  6.06      0.18     0.02     8335     0.00     0.00  Fbyte_code
  6.06      0.20     0.02      743     0.03     0.03  Fassoc
  3.03      0.21     0.01   136877     0.00     0.00  translate_char

It's clear here that the very simple function read_avail_input() wastes
a lot of CPU time. IMHO this is because on every call it zeroes the whole
large 'struct input_event buf' array, which is KBD_BUFFER_SIZE (4096,
except on old Mac OS) * sizeof (struct input_event) (44 bytes on 32-bit
systems), i.e. 180224 bytes (176 KB). But we can clear all of 'buf' only
once and, on later calls, clear only the slots that the previous call
used. The following patch illustrates this idea:

--- keyboard.c.~1.761.~ 2004-01-21 23:19:41.000000000 +0300
+++ keyboard.c 2004-01-30 18:37:04.000000000 +0300
@@ -6568,6 +6568,8 @@
    Returns the number of keyboard chars read, or -1 meaning
    this is a bad time to try to read input.  */
 
+static int prev_read = KBD_BUFFER_SIZE;
+
 static int
 read_avail_input (expected)
      int expected;
@@ -6576,7 +6578,7 @@
   register int i;
   int nread;
 
-  for (i = 0; i < KBD_BUFFER_SIZE; i++)
+  for (i = 0; i < prev_read; i++)
     EVENT_INIT (buf[i]);
 
   if (read_socket_hook)
@@ -6592,12 +6594,12 @@
 
       /* Determine how many characters we should *try* to read.  */
 #ifdef WINDOWSNT
-      return 0;
+      return (prev_read = 0);
 #else /* not WINDOWSNT */
 #ifdef MSDOS
       n_to_read = dos_keysns ();
       if (n_to_read == 0)
-        return 0;
+        return (prev_read = 0);
 #else /* not MSDOS */
 #ifdef FIONREAD
       /* Find out how much input is available.  */
@@ -6615,7 +6617,7 @@
             n_to_read = 0;
         }
       if (n_to_read == 0)
-        return 0;
+        return (prev_read = 0);
       if (n_to_read > sizeof cbuf)
         n_to_read = sizeof cbuf;
 #else /* no FIONREAD */
@@ -6706,7 +6708,7 @@
             break;
         }
 
-  return nread;
+  return (prev_read = nread);
 }
 #endif /* not VMS */

Here is an example of the gprof output with this patch applied (all
other conditions are exactly the same):

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 18.42      0.07     0.07     4453     0.02     0.02  mark_object
 10.53      0.11     0.04    36279     0.00     0.00  specbind
  7.89      0.14     0.03     2465     0.01     0.01  ccl_driver
  5.26      0.16     0.02     8358     0.00     0.01  Fbyte_code
  5.26      0.18     0.02     7197     0.00     0.00  re_search_2
  2.63      0.19     0.01   289315     0.00     0.00  readchar
  2.63      0.20     0.01   156900     0.00     0.00  Faref
...
  0.00      0.38     0.00      563     0.00     0.00  call1
  0.00      0.38     0.00      548     0.00     0.00  XTread_socket
  0.00      0.38     0.00      548     0.00     0.00  read_avail_input
  0.00      0.38     0.00      544     0.00     0.00  handle_async_input

So, over several runs, read_avail_input() drops from 1st or 2nd place
to below 200th place (238th here) when ranked by CPU usage.

What do you think about this idea?

Dmitry
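
P.S. For anyone who wants to try the "clear only the used slots" idea
outside Emacs, here is a minimal standalone sketch. Everything in it is
a hypothetical stand-in, not the keyboard.c code: 'struct event',
BUF_SLOTS, dirty_slots, read_events() and the counter are made-up names,
and memset() stands in for EVENT_INIT. It also assumes the buffer has
static storage, so that slots cleared on one call are still clear on the
next one; the patch above does not show how 'buf' is declared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the Emacs definitions. */
#define BUF_SLOTS 4096

struct event { int kind; int code; };

/* Assumption: static storage, so cleared slots stay cleared between calls. */
struct event buf[BUF_SLOTS];

/* Slots that may hold stale data.  Like prev_read in the patch, it starts
   at the full size so that the first call clears everything once. */
size_t dirty_slots = BUF_SLOTS;

/* Total number of slots cleared so far; only here to make the saving visible. */
size_t slots_cleared_total = 0;

/* Clear only the slots the previous call may have written. */
size_t clear_dirty_slots(void)
{
    size_t n = dirty_slots;

    memset(buf, 0, n * sizeof buf[0]);
    dirty_slots = 0;
    slots_cleared_total += n;
    return n;
}

/* Simulated read: clear the stale slots, then fill 'available' slots
   and remember that count for the next call. */
size_t read_events(size_t available)
{
    size_t i;

    assert(available <= BUF_SLOTS);
    clear_dirty_slots();
    for (i = 0; i < available; i++) {
        buf[i].kind = 1;
        buf[i].code = (int) i;
    }
    dirty_slots = available;
    return available;
}
```

In this sketch only the first call pays for clearing all BUF_SLOTS
entries; every later call clears just as many slots as the call before
it filled. A real version would also have to decide what to record when
the read reports an error, since read_avail_input() can return -1.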