unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Dmitry Antipov <dmantipov@yandex.ru>
Cc: emacs-devel@gnu.org
Subject: Re: Proposal: immediate strings
Date: Thu, 24 May 2012 01:17:31 -0400	[thread overview]
Message-ID: <jwvk402dvsf.fsf-monnier+emacs@gnu.org> (raw)
In-Reply-To: <4FBB51E7.6080601@yandex.ru> (Dmitry Antipov's message of "Tue, 22 May 2012 12:44:23 +0400")

> +#if __GNUC__ > 1 /* Any GCC still in use should support this.  */
> +#define PACKED __attribute__((packed))
> +#else
> +#define PACKED
> +#endif

Let's not use "packed" structures: too much trouble.

> +      if (nbytes >= STRING_IMM_MAX)
> +	/* Impossible immediate string.  */

Actually, we can allocate all the pure strings as "immediate strings"
(within the limits of imm.nbytes, of course), in which case we'd have
strings with more bytes than STRING_IMM_MAX, so while this assertion may
have been useful for debugging, we'll want to get rid of it.

> +      if (nbytes < STRING_IMM_MAX)
> +	/* Impossible normal string.  */
> +	abort ();

Please use "eassert (nbytes >= STRING_IMM_MAX);" instead.

> +#ifdef GC_STRING_STATS
> +  total_imm_strings = total_dat_strings = 0;
> +  total_imm_bytes = total_dat_bytes = 0;
> +  total_imm_intervals = total_dat_intervals = 0;
> +#endif

Have you looked at some of those stats?  I'd be interested to know which
proportion of the strings are "immediate and using intervals",
v.s. "just slightly too big to be immediate, but not using intervals".

> +		/* Fill data with special pattern. Used by
> +		   GC to find dead immediate strings.  */
> +		memset (s->u.imm.data, 0xff, STRING_IMM_MAX);

Is that really 100% reliable?  Why not use the `intervals' field with
a 0xdeadbeef pointer instead?

> +#define SDATA(string)		(XSTRING (string)->u.imm.immbit ? \
> +				 (XSTRING (string)->u.imm.data + 0) : \
> +				 (XSTRING (string)->u.dat.data + 0))

IIUC the "+ 0" are unneeded here, because using a "..?..:.." already
makes sure we have an rvalue.  Same for SCHARS.

> +   This assumes that sizeof (EMACS_INT) is equal to sizeof (void * ).  */
> +#define STRING_IMM_MAX (3 * sizeof (EMACS_INT) - 2)

Miles already pointed out that this is not a valid assumption.

>    do { if (EQ (STR, empty_multibyte_string))  \
>        (STR) = empty_unibyte_string;  \
> -    else XSTRING (STR)->size_byte = -1; } while (0)
> +    else if (XSTRING (STR)->u.imm.immbit) \
> +      XSTRING (STR)->u.imm.size_byte = -1; \
> +    else \
> +      XSTRING (STR)->u.dat.size_byte = -1; } while (0)

BTW, TAB in cc-mode will not only properly indent the code, but also
nicely align the \ at the end of lines.  ;-)

> +    union {
> +
> +      /* GC mark bit and subtype bit are in IMM just by convention - when
> +	 IMMBIT is 0, the DAT field is used except it's UNUSED field.  */
> +
> +      struct {
> +	unsigned gcmarkbit : 1;
> +	unsigned immbit : 1;
> +	EMACS_INT size : 7;
> +	EMACS_INT size_byte : 7;
> +	unsigned char data[STRING_IMM_MAX];
> +      } PACKED imm;
> +      
> +      struct {
> +	unsigned unused : 2; /* Do not access this placeholder.  */
> +	EMACS_INT size : BITS_PER_EMACS_INT - 1;
> +	EMACS_INT size_byte : BITS_PER_EMACS_INT - 1;
> +	unsigned char *data;
> +      } PACKED dat;
> +
> +    } u;
>    };

Access to your `size' field is inefficient because it will straddle
two words.  We could move the gcmarkbit as Paul suggests (at the cost of
having to check immbit before knowing where gcmarkbit is located), or
maybe we can move both the mark bit and the immbit into the `intervals'
fields (after all, we know those are aligned on a multiple of at least
4 on all architectures on which Emacs is known to run, so we have
2 bits free for (ab)use there).


        Stefan



  parent reply	other threads:[~2012-05-24  5:17 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-22  8:44 Proposal: immediate strings Dmitry Antipov
2012-05-22 20:51 ` Miles Bader
2012-05-22 22:13   ` Paul Eggert
2012-05-24  5:17 ` Stefan Monnier [this message]
2012-05-24  5:41   ` Ken Raeburn
2012-05-24  5:50     ` Miles Bader
2012-05-24  6:08   ` Paul Eggert
2012-05-24  7:14     ` Stefan Monnier
2012-05-24  7:52       ` Paul Eggert
2012-05-24 12:51         ` Stefan Monnier
2012-05-24 16:35           ` Paul Eggert
2012-05-25  6:43             ` Dmitry Antipov
2012-05-25  7:30               ` Paul Eggert
2012-05-28 11:32       ` Dmitry Antipov
2012-05-28 14:25         ` Stefan Monnier
2012-05-29  6:55   ` Dmitry Antipov
2012-05-29  7:38     ` Paul Eggert
2012-05-29 13:33       ` Dmitry Antipov
2012-05-29 15:24         ` Paul Eggert
2012-05-31  9:28           ` Dmitry Antipov
2012-05-31 16:34             ` Paul Eggert
2012-06-06  6:14               ` Dmitry Antipov
2012-06-06  6:41                 ` Paul Eggert
2012-06-06  7:29                   ` Dmitry Antipov
2012-06-06 15:14                     ` Eli Zaretskii
2012-06-06 21:44                       ` Paul Eggert
2012-07-04  8:27                       ` Old topic(s) again [was: Re: Proposal: immediate strings] Dmitry Antipov
2012-07-04 13:08                         ` Stefan Monnier
2012-07-04 19:32                           ` Paul Eggert
2012-05-29  7:38     ` Proposal: immediate strings Andreas Schwab

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jwvk402dvsf.fsf-monnier+emacs@gnu.org \
    --to=monnier@iro.umontreal.ca \
    --cc=dmantipov@yandex.ru \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).