From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kenichi Handa
Newsgroups: gmane.emacs.devel
Subject: Re: size of emacs executable after unicode merge
Date: Thu, 27 Nov 2008 20:20:21 +0900
To: Chong Yidong
Cc: rms@gnu.org, emanuele.giaquinta@gmail.com, emacs-devel@gnu.org,
    dann@ics.uci.edu, monnier@iro.umontreal.ca, evilborisnet@netscape.net,
    jasonr@gnu.org
In-reply-to: <87r65flh5n.fsf@cyd.mit.edu> (message from Chong Yidong on
    Thu, 13 Nov 2008 11:33:40 -0500)
References: <200805141652.m4EGqikr018644@sallyv1.ics.uci.edu>
    <200805151529.m4FFTlF1004684@sallyv1.ics.uci.edu>
    <482D8435.6060407@gnu.org> <20081030101819.GA15223@orion.lan>
    <200810311507.m9VF7EAl022755@mothra.ics.uci.edu>
    <873ai7t7fx.fsf@cyd.mit.edu> <87iqqwk672.fsf@cyd.mit.edu>
    <873ahym8ji.fsf@cyd.mit.edu> <87r65flh5n.fsf@cyd.mit.edu>
Xref: news.gmane.org gmane.emacs.devel:106222

In article <87r65flh5n.fsf@cyd.mit.edu>, Chong Yidong writes:

> Kenichi Handa writes:

>>> > One idea is to have a single boolean vector of size #x110000
>>> > (139264 bytes), setup it for CHARSET everytime when we call
>>> > map-charset-chars for the different charset.  In that
>>> > vector, only the bit for #x3000, #x3001, #x3002, etc are 1
>>> > for chinese-gb2312.
>>> > Then map-charset-chars can know for
>>> > which characters FUNCTION must be called.
> >
>>> but it appears to free a negligible about of memory.
> >
> > Did you comment out the calls of unify-charset in
> > mule-conf.el and change the encoding of all preloaded *.el
> > files to utf-8?

> Commenting out the calls to unify-charset does reduce the memory by
> several megabytes.

After taking over Chong's experiment, I was able to reduce the size
of the Emacs executable by about 7M bytes.

About 4M bytes of that came from the charset mapping tables.  It was
saved by setting up a C structure, temp_charset_work (see charset.c
for details), instead of creating many Lisp objects (char-tables and
vectors).

The other 3M bytes came from the big standard category table.  It
was saved by hashing the table entries (see hash_get_category_set in
category.c for details).

As a result, the executable is now 10,671,313 bytes on GNU/Linux.
That is still 1.6M bytes larger than Emacs 22, but I'm not sure it's
worth more effort to reduce it further.

---
Kenichi Handa
handa@ni.aist.go.jp
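
For readers who want the category table change in concrete form, here
is a minimal sketch, in Python rather than the C of category.c, of
sharing identical table entries through a hash table.  It assumes that
many characters carry the same category set, so each distinct set only
needs to be stored once.  The class CategoryTable, its methods, and
the category letters used below are hypothetical names chosen for
illustration; none of them are taken from the Emacs sources.

```python
class CategoryTable:
    """Hypothetical table mapping code points to shared category sets."""

    def __init__(self):
        # Pool of distinct category sets: each set maps to itself, so a
        # lookup by value returns the single stored instance.
        self._pool = {}
        # Per-character entries; each value is an object from the pool.
        self._entries = {}

    def _intern(self, category_set):
        # Return the already-stored equal set if there is one; otherwise
        # store this one and return it.
        return self._pool.setdefault(category_set, category_set)

    def set_categories(self, char, categories):
        # Build an immutable set, then replace it by the shared instance.
        self._entries[char] = self._intern(frozenset(categories))

    def categories(self, char):
        # Characters without an entry have no categories.
        return self._entries.get(char, frozenset())

    def distinct_sets(self):
        # Number of category sets actually kept in memory.
        return len(self._pool)


if __name__ == "__main__":
    table = CategoryTable()
    table.set_categories(0x3042, "Hj")   # illustrative category letters
    table.set_categories(0x3044, "jH")   # same set, different order
    table.set_categories(0x0041, "a")
    print(table.distinct_sets())
    print(table.categories(0x3042) is table.categories(0x3044))
```

In this sketch the saving comes from the two entries with equal sets
pointing at one stored object instead of two; with a large table and
few distinct sets, the pool stays small however many characters are
entered.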