From: Paul Eggert
Newsgroups: gmane.emacs.devel
Subject: Re: Proposal: immediate strings
Date: Wed, 23 May 2012 23:08:15 -0700
Organization: UCLA Computer Science Department
Message-ID: <4FBDD04F.9010203@cs.ucla.edu>
References: <4FBB51E7.6080601@yandex.ru>
To: Stefan Monnier
Cc: Dmitry Antipov, emacs-devel@gnu.org

On 05/23/2012 10:17 PM, Stefan Monnier wrote:
> maybe we can move both the mark bit and the immbit into the `intervals'
> fields (after all, we know those are aligned on a multiple of at least
> 4 on all architectures on which Emacs is known to run, so we have
> 2 bits free for (ab)use there).

That might work.  But this raises another idea: assuming most tiny
strings don't have text properties, won't it improve performance
overall if any string with text properties is forced to be an
ordinary string, so that immediate strings can reuse the rarely-used
'intervals' member for data?

Another thought that comes to mind is that we could leave the mark
bit where it is (the most significant bit of 'size'), and reserve
'size' values >= (EMACS_INT_MAX & ~0xffff) to represent immediate
strings, with the size and size_byte values packed into the
low-order 16 bits of 'size'.
Something like this:

   struct Lisp_String
   {
     EMACS_INT size;
     union
     {
       struct Ordinary_Lisp_String_Component
       {
         EMACS_INT size_byte;
         INTERVAL intervals;
         unsigned char *data;
       } ordinary;
       /* Immediate strings store their bytes here instead.  */
       unsigned char data[sizeof (struct Ordinary_Lisp_String_Component)];
     } u;
   };

   /* Parenthesized because '<=' binds tighter than '&'.  */
   #define IMMEDIATE_STRING(s) ((EMACS_INT_MAX & ~0xffff) <= (s)->size)
   #define SDATA(s) (IMMEDIATE_STRING (XSTRING (s)) \
                     ? XSTRING (s)->u.data \
                     : XSTRING (s)->u.ordinary.data)
   #define SCHARS(s) (IMMEDIATE_STRING (XSTRING (s)) \
                      ? XSTRING (s)->size & 0xff : XSTRING (s)->size)
   #define SBYTES(s) (IMMEDIATE_STRING (XSTRING (s)) \
                      ? (XSTRING (s)->size >> 8) & 0xff \
                      : STRING_BYTES (XSTRING (s)))

etc.
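
To make the packing concrete, here is a rough standalone sketch of the same scheme in C.  It is not Emacs code.  The names emacs_int, IMM_BASE, sketch_string, make_immediate, make_ordinary and the sketch_* accessors are made up for illustration, and the sketch ignores the mark bit and the real STRING_BYTES handling of unibyte strings.  It assumes the character count goes in bits 0-7 of 'size' and the byte count in bits 8-15, as in the macros above.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Stand-in for EMACS_INT; an assumption of this sketch.  */
typedef long long emacs_int;

/* Sizes at or above this value denote an immediate string.  The low
   16 bits then hold the char count (bits 0-7) and byte count (8-15).  */
#define IMM_BASE (LLONG_MAX & ~(emacs_int) 0xffff)

struct ordinary_part
{
  emacs_int size_byte;
  void *intervals;              /* stand-in for INTERVAL */
  unsigned char *data;
};

struct sketch_string
{
  emacs_int size;
  union
  {
    struct ordinary_part ordinary;
    unsigned char data[sizeof (struct ordinary_part)];
  } u;
};

static int
is_immediate (const struct sketch_string *s)
{
  return IMM_BASE <= s->size;
}

static emacs_int
sketch_chars (const struct sketch_string *s)
{
  return is_immediate (s) ? (s->size & 0xff) : s->size;
}

static emacs_int
sketch_bytes (const struct sketch_string *s)
{
  return is_immediate (s) ? ((s->size >> 8) & 0xff)
                          : s->u.ordinary.size_byte;
}

static const unsigned char *
sketch_data (const struct sketch_string *s)
{
  return is_immediate (s) ? s->u.data : s->u.ordinary.data;
}

/* Store NBYTES bytes of TEXT (NCHARS characters) inline in S.
   The text must fit in the union with room for a trailing NUL.  */
static void
make_immediate (struct sketch_string *s, const char *text,
                int nchars, int nbytes)
{
  assert (0 <= nchars && nchars <= nbytes);
  assert ((size_t) nbytes < sizeof s->u.data);
  s->size = IMM_BASE | ((emacs_int) nbytes << 8) | nchars;
  memcpy (s->u.data, text, nbytes);
  s->u.data[nbytes] = '\0';
}

/* Point S at externally allocated TEXT, as an ordinary string would.  */
static void
make_ordinary (struct sketch_string *s, unsigned char *text,
               emacs_int nchars, emacs_int nbytes)
{
  assert (0 <= nchars && nchars < IMM_BASE);
  s->size = nchars;
  s->u.ordinary.size_byte = nbytes;
  s->u.ordinary.intervals = NULL;
  s->u.ordinary.data = text;
}
```

With 8-byte pointers and an 8-byte emacs_int the inline buffer is typically 24 bytes, so an immediate string in this sketch can hold up to 23 bytes of text plus the terminating NUL, comfortably within the 8 bits reserved for each count.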