From: Dan Nicolaescu
Newsgroups: gmane.emacs.devel
Subject: Re: size of emacs executable after unicode merge
Date: Thu, 27 Nov 2008 08:12:29 -0800 (PST)
Message-ID: <200811271612.mARGCT3f021393@mothra.ics.uci.edu>
References: <200805151529.m4FFTlF1004684@sallyv1.ics.uci.edu> <482D8435.6060407@gnu.org> <20081030101819.GA15223@orion.lan> <200810311507.m9VF7EAl022755@mothra.ics.uci.edu> <873ai7t7fx.fsf@cyd.mit.edu> <87iqqwk672.fsf@cyd.mit.edu> <873ahym8ji.fsf@cyd.mit.edu> <87r65flh5n.fsf@cyd.mit.edu>
In-Reply-To: (Kenichi Handa's message of "Thu, 27 Nov 2008 20:20:21 +0900")
To: Kenichi Handa
Cc: rms@gnu.org, emanuele.giaquinta@gmail.com, Chong Yidong, emacs-devel@gnu.org, monnier@iro.umontreal.ca, evilborisnet@netscape.net, jasonr@gnu.org
Xref: news.gmane.org gmane.emacs.devel:106239

Kenichi Handa writes:

> In article <87r65flh5n.fsf@cyd.mit.edu>, Chong Yidong writes:
>
> > Kenichi Handa writes:
> >>> > One idea is to have a single boolean vector of size #x110000
> >>> > (139264 bytes), setup it for CHARSET everytime when we call
> >>> > map-charset-chars for the different charset.  In that
> >>> > vector, only the bit for #x3000, #x3001, #x3002, etc are 1
> >>> > for chinese-gb2312.  Then map-charset-chars can know for
> >>> > which characters FUNCTION must be called.
> >
> > >>> but it appears to free a negligible about of memory.
> > >
> > > Did you comment out the calls of unify-charset in
> > > mule-conf.el and change the encoding of all preloaded *.el
> > > files to utf-8?
>
> > Commenting out the calls to unify-charset does reduce the memory by
> > several megabytes.
>
> After taking over Chong's experiment, I could reduce the
> size of Emacs executables about 7M bytes.  About 4M bytes
> were actually because of charset mapping tables, and it
> could be reduced by setting up C structure temp_charset_work
> (see charset.c for the detail) instead of making many Lisp
> objects (char-table and vector).  Another 3M bytes were
> because of big standard category table.  It could be reduced
> by hashing the table entries (see hash_get_category_set in
> category.c for the detail).
>
> As a result, now the executable is 10,671,313 bytes on
> GNU/Linux.

Thanks for doing this!

> It's still 1.6M bytes larger than Emacs 22, but I'm not sure it's
> worth making more effort to reduce it.

In that case that size increase might be with us for ever and ever,
which is not ideal.

Also a related question: could the data in the .map files in
emacs/etc/charsets be transformed into elisp?  That way the normal
loading mechanism could be used for them, and no parser + other code
would be needed...
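
For readers following the thread, the boolean-vector idea quoted at the
top can be sketched roughly as below.  This is only an illustration in
Python, not the code in charset.c: the names CharBitmap, mark, contains
and map_marked_chars are hypothetical, and the three code points are the
ones mentioned in the quoted message.  It shows why the vector costs
139264 bytes (#x110000 bits / 8) and how one vector can be reset and
reused for each charset.

```python
# Illustrative sketch only; not the implementation in charset.c.
MAX_CHAR = 0x110000  # number of code points covered, as in the quoted message


class CharBitmap:
    """One bit per code point, meant to be reused for each charset in turn."""

    def __init__(self):
        self.bits = bytearray(MAX_CHAR // 8)  # 0x110000 / 8 = 139264 bytes

    def clear(self):
        # Zero the existing buffer in place before handling another charset.
        self.bits[:] = bytes(len(self.bits))

    def mark(self, code_point):
        self.bits[code_point >> 3] |= 1 << (code_point & 7)

    def contains(self, code_point):
        return bool(self.bits[code_point >> 3] & (1 << (code_point & 7)))


def map_marked_chars(bitmap, function):
    """Call FUNCTION once for every marked code point, in increasing order."""
    for byte_index, byte in enumerate(bitmap.bits):
        if byte == 0:
            continue  # eight unmarked code points, skip them together
        for bit in range(8):
            if byte & (1 << bit):
                function(byte_index * 8 + bit)


# Usage: mark a few code points (hypothetical stand-in for a charset's
# contents), then visit only those.
bitmap = CharBitmap()
for cp in (0x3000, 0x3001, 0x3002):
    bitmap.mark(cp)
seen = []
map_marked_chars(bitmap, seen.append)
```

The trade-off in this sketch is a fixed cost of about 136 KB for the
one shared vector, in exchange for not allocating a separate table per
charset.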