From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.devel Subject: Optimizing memory footprint of bidi code Date: Mon, 07 Jun 2010 10:30:56 -0400 Message-ID: Reply-To: Eli Zaretskii NNTP-Posting-Host: lo.gmane.org X-Trace: dough.gmane.org 1275921067 5459 80.91.229.12 (7 Jun 2010 14:31:07 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Mon, 7 Jun 2010 14:31:07 +0000 (UTC) To: emacs-devel@gnu.org Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Jun 07 16:31:05 2010 connect(): No such file or directory Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1OLdM1-0001X5-3q for ged-emacs-devel@m.gmane.org; Mon, 07 Jun 2010 16:31:05 +0200 Original-Received: from localhost ([127.0.0.1]:46499 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OLdM0-0001bJ-FU for ged-emacs-devel@m.gmane.org; Mon, 07 Jun 2010 10:31:04 -0400 Original-Received: from [199.232.76.173] (port=46575 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OLdLv-0001aR-9t for emacs-devel@gnu.org; Mon, 07 Jun 2010 10:30:59 -0400 Original-Received: from Debian-exim by monty-python.gnu.org with spam-scanned (Exim 4.60) (envelope-from ) id 1OLdLs-0003fR-SS for emacs-devel@gnu.org; Mon, 07 Jun 2010 10:30:59 -0400 Original-Received: from fencepost.gnu.org ([140.186.70.10]:37470) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1OLdLs-0003fN-Iv for emacs-devel@gnu.org; Mon, 07 Jun 2010 10:30:56 -0400 Original-Received: from eliz by fencepost.gnu.org with local (Exim 4.69) (envelope-from ) id 1OLdLs-0006ZF-Dw for emacs-devel@gnu.org; Mon, 07 Jun 2010 10:30:56 -0400 X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.6 (newer, 3) X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:125580 Archived-At: In a recent bug report ("bug#6365: bidi data structure inefficiencies") Dan Nicolaescu writes: > Some bidi data structures are bigger that they need to be, this > probably results in additional cache misses. > Examples: > > struct bidi_saved_info could use bitfields for the bidi_type_t members > Same for bidi_stack > > bidi_it could use bitfields for a lot of it's members. I see a couple of boolean flags that could indeed become one-bit fields. I see a bidi_dir_t enumeration that has only 3 possible values. Can we trust modern compilers to use the smallest integer type for enumerations? If so, this part is already taken care of. If not, how to make this type smaller without losing the benefits of an enumerated type (I don't want to manipulate magic values or macros)? The other members which could be converted to bitfields are small integers, such as bidi_type_t (20 possible values) or stack_idx (up to 62). While it would be possible to use 5 bits for the former and 6 bits for the latter, at the time this code was designed, 9 years ago, there was a significant performance hit for manipulating bits on the then popular architectures. Or at least that's what I thought. These members use `int' deliberately, because I was (and still am) afraid of the slow-down from replacing a simple increment of 2 integer variables (in the unidirectional display engine) with the 1500-line reordering code. After all, this code runs in the innermost loops of Emacs display engine, and it runs for each character we display in each window. Maybe manipulating bitfields is no longer an issue with current architectures and compiler technology. Maybe this problem didn't exist, except in my head, even back when I designed this. But before I change the code, I'd like to know for sure. Because if it's still more expensive today to extract bitfields, then it may be worth our while to give up some memory for extra performance. And if the benefits from using bitfields are insignificant, perhaps it's not worth the hassle to change the code for gaining a few hundred bytes. Can someone "in the know" tell what is the current state of art in these matters? Failing that, can someone measure the differences and publish the results? A few more things we may wish to consider: . struct bidi_it weighs in at 712 bytes on a 64-bit machine. Can this amount of memory cause more frequent cache misses, with today's cache sizes? . Out of 712 bytes, the level stack takes 512. The Unicode Standard requires support for 62 levels (I will remove the extra 2 when I have time), but I have never seen more than 4 used in any real-life text. I don't want to give up full compliance with UAX#9, and I don't want to allocate memory from the heap for each extra level. But maybe starting with a 4-level stack and allocating the other 58 when the 5th is required will be a good enough optimization, leaving us with a 250-byte structure for all but the very extreme use-cases? . OTOH, the code on bidi.c consults a much larger char-table of bidirectional properties _for_each_character_it_processes_. Given this, how significant is the impact of a few hundred bytes in struct bidi_it for causing a possible increase in cache misses? Comments welcome. I will add to my todo implementation of any conclusions we will reach here (and encourage anyone with enough time to beat me to it). Thanks in advance.