From: Thomas Lord
Newsgroups: gmane.emacs.devel
Subject: Re: size of emacs executable after unicode merge
Date: Fri, 16 May 2008 16:01:45 -0700
Message-ID: <482E1259.7040808@emf.net>
References: <200805140351.m4E3pQuE004549@sallyv1.ics.uci.edu> <200805141652.m4EGqikr018644@sallyv1.ics.uci.edu> <200805151529.m4FFTlF1004684@sallyv1.ics.uci.edu> <482D8435.6060407@gnu.org> <482DAF4B.60900@emf.net> <873aohucld.fsf@uwakimon.sk.tsukuba.ac.jp>
In-Reply-To: <873aohucld.fsf@uwakimon.sk.tsukuba.ac.jp>
To: "Stephen J. Turnbull"
Cc: rms@gnu.org, Kenichi Handa, emacs-devel@gnu.org, dann@ics.uci.edu, evilborisnet@netscape.net, Jason Rumney

Stephen J. Turnbull wrote:
>  > If it would be helpful,
>
> Did you do much better than 60% savings?

As I recall, I did considerably better, though I'm not clear whether or not we're talking about the same tables. I could be mistaken, hence the passive request for prompting to indicate whether or not it's worth really refreshing my memory here.

You are on the right track to observe that the density of stuff that matters is the key to optimization. Trie-based sparse-array approaches seem to work very well. The trick is to do some off-line computation to work out a roughly optimal breadth and depth. I found it worked well to vary the breadth according to depth. That's, in a nutshell, what I'm talking about.

You talk about range encoding. Ick. Too many tests and branches, in my experience. A simple trie will do -- just take care to get its shape correct.
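
[Editor's sketch, not part of the original message.] To make the trie idea above concrete, here is a minimal C sketch of a three-level lookup table over Unicode code points in which the nodes get broader with depth, and identical blocks are stored only once. Everything named here is an assumption made for illustration: the 5 + 7 + 9 bit split, the capacities, the names `struct trie`, `trie_build`, `trie_get`, and the toy property function `toy_prop`. The off-line tuning of breadth and depth that the message describes is not shown; the widths are simply picked by hand.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical shape: a 21-bit code point is split into a plane number
   (17 planes), then 7 bits, then 9 bits, so nodes get broader with depth.
   These widths are illustrative, not tuned. */
enum { MID_BITS = 7, LEAF_BITS = 9 };
enum { TOP_SIZE = 17, MID_SIZE = 1 << MID_BITS, LEAF_SIZE = 1 << LEAF_BITS };
enum { MAX_MIDS = 17, MAX_LEAVES = 256 };   /* arbitrary sketch capacities */

typedef uint8_t (*prop_fn)(uint32_t cp);

struct trie {
    uint8_t  top[TOP_SIZE];                 /* plane -> mid block index */
    uint16_t mids[MAX_MIDS][MID_SIZE];      /* mid block -> leaf block index */
    uint8_t  leaves[MAX_LEAVES][LEAF_SIZE]; /* leaf block -> property value */
    unsigned n_mids, n_leaves;
};

/* Return the index of a stored block equal to `blk`, appending it if new.
   This is where the sparseness pays off: identical blocks are shared. */
static unsigned intern(void *pool, unsigned *count, unsigned cap,
                       const void *blk, size_t size)
{
    unsigned char *p = pool;
    for (unsigned i = 0; i < *count; i++)
        if (memcmp(p + i * size, blk, size) == 0)
            return i;
    assert(*count < cap);
    memcpy(p + *count * size, blk, size);
    return (*count)++;
}

/* Off-line step: evaluate `prop` for every code point and fold the
   results into shared leaf and mid blocks. */
static void trie_build(struct trie *t, prop_fn prop)
{
    t->n_mids = t->n_leaves = 0;
    for (uint32_t plane = 0; plane < TOP_SIZE; plane++) {
        uint16_t mid[MID_SIZE];
        for (uint32_t m = 0; m < MID_SIZE; m++) {
            uint8_t leaf[LEAF_SIZE];
            uint32_t base = (plane << (MID_BITS + LEAF_BITS)) | (m << LEAF_BITS);
            for (uint32_t i = 0; i < LEAF_SIZE; i++)
                leaf[i] = prop(base | i);
            mid[m] = (uint16_t)intern(t->leaves, &t->n_leaves, MAX_LEAVES,
                                      leaf, sizeof leaf);
        }
        t->top[plane] = (uint8_t)intern(t->mids, &t->n_mids, MAX_MIDS,
                                        mid, sizeof mid);
    }
}

/* Lookup: three array indexings, no range tests or data-dependent branches. */
static uint8_t trie_get(const struct trie *t, uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    unsigned mid  = t->top[cp >> (MID_BITS + LEAF_BITS)];
    unsigned leaf = t->mids[mid][(cp >> LEAF_BITS) & (MID_SIZE - 1)];
    return t->leaves[leaf][cp & (LEAF_SIZE - 1)];
}

/* Toy property, invented for this sketch: 1 for ASCII letters, 2 for the
   CJK Unified Ideographs block U+4E00..U+9FFF, 0 for everything else. */
static uint8_t toy_prop(uint32_t cp)
{
    if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z')) return 1;
    if (cp >= 0x4E00 && cp <= 0x9FFF) return 2;
    return 0;
}
```

A caller would declare a `struct trie`, run `trie_build(&t, toy_prop)` once, and then query with `trie_get(&t, cp)`. With this toy property the 2176 leaf-sized ranges collapse to three stored leaf blocks and two mid blocks, which is the effect the message attributes to getting the shape right. A real table would have many more distinct blocks, and the best split would have to be measured rather than guessed.
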

> In other words, even with a naive strategy, the Unicode BMP database
> should only add about 1.1MB to 1.4MB, ie, about 10% of the size
> increase seen here, if coded compactly but straightforwardly in C.

I'm not talking about boatloads of code and, if done right, it has other applications as well.

It's no big deal either way. I don't mean to argue. I just thought it might be helpful. I'm just a patzer or kibbitzer here, take yr pick.

As an aside: virtual memory hardware sucks and is pointless. Segmentation rocks, on the other hand. But, that's a topic for a day a ways in the future, unfortunately.

-t