From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Optimizing memory footprint of bidi code
Date: Mon, 07 Jun 2010 12:01:26 -0400
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
In-Reply-To: (Eli Zaretskii's message of "Mon, 07 Jun 2010 10:30:56 -0400")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)

> I see a bidi_dir_t enumeration that has only 3 possible values. Can
> we trust modern compilers to use the smallest integer type for
> enumerations?

Not that I know, no.

> If so, this part is already taken care of. If not, how to make this
> type smaller without losing the benefits of an enumerated type (I
> don't want to manipulate magic values or macros)?

I'm not sure what you're worried about. What's wrong with adding a ":2"
to the field?

> bits for the latter, at the time this code was designed, 9 years ago,
> there was a significant performance hit for manipulating bits on the
> then popular architectures. Or at least that's what I thought. These

Accessing and modifying a bitfield is and has always been slower than
an "int", yes. Some machines (typically the P4) have relatively slow
bit-shifting operations, among other things.
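
To make the ":2" suggestion above concrete, here is a minimal sketch;
the type and field names are made up for illustration rather than the
actual declarations, and using an enum type for a bitfield is
implementation-defined C (GCC accepts it):

    /* Illustrative only: a 2-bit field is wide enough for a 3-valued
       direction enum, and the member keeps its enumerated type for
       readability.  Each access pays for the masking/shifting.  */
    typedef enum { HYP_NEUTRAL_DIR, HYP_L2R, HYP_R2L } hyp_dir_t;

    struct hyp_it
    {
      hyp_dir_t paragraph_dir : 2;  /* 3 values fit in 2 bits */
      unsigned resolved_level : 6;  /* UAX#9 levels 0..62 fit in 6 bits */
    };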
> Maybe manipulating bitfields is no longer an issue with current
> architectures and compiler technology. Maybe this problem didn't

I think the issue is not so much whether bitfield access has gotten
faster, but that processors in general have gotten faster, while memory
has sped up much less. So in many cases it's worthwhile to do a bit
more work if it lets you access less memory. In many cases (tho
obviously not all), time can be estimated fairly well by looking at the
amount of memory that's accessed, treating the actual computation as
being completely free.

> Can someone "in the know" tell what is the current state of art in
> these matters?

I'm not sure I'm "in the know" enough, but the general rule is that if
those fields amount to a significant quantity of memory (as compared to
the "working set" within the various levels of caches) and if those
fields are often accessed at about the same time (at the time-scale of
the corresponding level of cache), then packing them into bitfields can
be a win. If they're not accessed at the same time, then you don't save
any memory (you still bring in a whole cache line to access either the
int or the bitfield). Also, if you almost constantly access the same
fields (i.e. you'd almost want them in registers), then bitfields are
probably not a good choice.

> . struct bidi_it weighs in at 712 bytes on a 64-bit machine. Can
>   this amount of memory cause more frequent cache misses, with
>   today's cache sizes?

If there's only one such struct used, then it's probably small enough
that it doesn't matter. Typical L1 caches are in the 16KB range ("have
been and always will be", it seems: apparently smaller would make them
useless and larger would make them slow enough that it's worthwhile to
add a smaller lower-level cache), so 712 bytes are not lost in the
noise at that scale, but they fit well.

> . Out of 712 bytes, the level stack takes 512. The Unicode Standard
>   requires support for 62 levels (I will remove the extra 2 when I
>   have time), but I have never seen more than 4 used in any
>   real-life text. I don't want to give up full compliance with
>   UAX#9, and I don't want to allocate memory from the heap for each
>   extra level. But maybe starting with a 4-level stack and
>   allocating the other 58 when the 5th is required will be a good
>   enough optimization, leaving us with a 250-byte structure for all
>   but the very extreme use-cases?

That's not worth the trouble: as long as these extra bytes are
contiguous and are not accessed, they only cost "main memory" but don't
slow us down (they won't be brought into the cache at all).

> . OTOH, the code on bidi.c consults a much larger char-table of
>   bidirectional properties _for_each_character_it_processes_. Given
>   this, how significant is the impact of a few hundred bytes in
>   struct bidi_it for causing a possible increase in cache misses?

For a single bidi_it, with fields accessed "all the time", you're
probably better off with int than with bitfields (bitfields will slow
you down every time you access them, and they'll mostly stay in cache
if they're used often enough, so their cost is only in terms of pushing
out other useful things from the cache). Of course, only measurements
can give reliable answers.


        Stefan
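
As a footnote to that last point, here is a throwaway comparison
(made-up field names, not the real struct bidi_it) showing how one
might check the memory side of the trade-off with sizeof; the bytes
saved by packing are what get weighed against the per-access
shift/mask cost:

    /* Hypothetical layouts, for size comparison only.  */
    #include <stdio.h>

    struct plain_fields             /* one full int per field */
    {
      int type, orig_type, resolved_level, invalid_levels;
    };

    struct packed_fields            /* the same fields as bitfields */
    {
      unsigned type : 5, orig_type : 5;
      unsigned resolved_level : 6, invalid_levels : 6;
    };

    int
    main (void)
    {
      printf ("plain:  %d bytes\n", (int) sizeof (struct plain_fields));
      printf ("packed: %d bytes\n", (int) sizeof (struct packed_fields));
      return 0;
    }

On a typical 64-bit machine the first struct is 16 bytes and the second
4, which is the kind of saving the discussion above is weighing against
the extra work on every access.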