From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Oleksandr Gavenko <gavenkoa@gmail.com>
Newsgroups: gmane.emacs.help
Subject: Size and length limits for Emacs primitive types and etc data?
Date: Wed, 23 Jan 2013 00:06:04 +0200
Organization: Oleksandr Gavenko <gavenkoa@gmail.com>,
	http://gavenkoa.users.sf.net
Message-ID: <87sj5s50vn.fsf@gavenkoa.example.com>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain
X-Trace: ger.gmane.org 1358892386 15095 80.91.229.3 (22 Jan 2013 22:06:26 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Tue, 22 Jan 2013 22:06:26 +0000 (UTC)
To: help-gnu-emacs@gnu.org
Original-X-From: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org Tue Jan 22 23:06:45 2013
Return-path: <help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org>
Envelope-to: geh-help-gnu-emacs@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org>)
	id 1TxlzQ-0004Wv-D2
	for geh-help-gnu-emacs@m.gmane.org; Tue, 22 Jan 2013 23:06:44 +0100
Original-Received: from localhost ([::1]:46481 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org>)
	id 1Txlz9-0007T6-4M
	for geh-help-gnu-emacs@m.gmane.org; Tue, 22 Jan 2013 17:06:27 -0500
Original-Received: from eggs.gnu.org ([208.118.235.92]:50033)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <geh-help-gnu-emacs@m.gmane.org>) id 1Txlz2-0007T0-Pi
	for help-gnu-emacs@gnu.org; Tue, 22 Jan 2013 17:06:22 -0500
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <geh-help-gnu-emacs@m.gmane.org>) id 1Txlyy-0006qT-LS
	for help-gnu-emacs@gnu.org; Tue, 22 Jan 2013 17:06:20 -0500
Original-Received: from plane.gmane.org ([80.91.229.3]:36651)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <geh-help-gnu-emacs@m.gmane.org>) id 1Txlyy-0006qN-BM
	for help-gnu-emacs@gnu.org; Tue, 22 Jan 2013 17:06:16 -0500
Original-Received: from list by plane.gmane.org with local (Exim 4.69)
	(envelope-from <geh-help-gnu-emacs@m.gmane.org>) id 1TxlzD-0004MO-Gr
	for help-gnu-emacs@gnu.org; Tue, 22 Jan 2013 23:06:31 +0100
Original-Received: from 37.229.4.200 ([37.229.4.200])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <help-gnu-emacs@gnu.org>; Tue, 22 Jan 2013 23:06:31 +0100
Original-Received: from gavenkoa by 37.229.4.200 with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <help-gnu-emacs@gnu.org>; Tue, 22 Jan 2013 23:06:31 +0100
X-Injected-Via-Gmane: http://gmane.org/
Original-Lines: 226
Original-X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: 37.229.4.200
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)
Cancel-Lock: sha1:+6wEIyTgyle98c4mANf6Sog9rw8=
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
	recognized.
X-Received-From: 80.91.229.3
X-BeenThere: help-gnu-emacs@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Users list for the GNU Emacs text editor <help-gnu-emacs.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/help-gnu-emacs>,
	<mailto:help-gnu-emacs-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/help-gnu-emacs>
List-Post: <mailto:help-gnu-emacs@gnu.org>
List-Help: <mailto:help-gnu-emacs-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/help-gnu-emacs>,
	<mailto:help-gnu-emacs-request@gnu.org?subject=subscribe>
Errors-To: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org
Original-Sender: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.help:88772
Archived-At: <http://permalink.gmane.org/gmane.emacs.help/88772>

While searching, I found these sources of information about the limits of the
Emacs runtime:

  (info "(elisp)Programming Types")
                Programming Types
  http://www.emacswiki.org/emacs/EmacsFileSizeLimit
                EmacsFileSizeLimit
  http://article.gmane.org/gmane.emacs.devel/139119
                 Re: stack overflow limit
                 The value of re_max_failures we use now needs 4MB of stack on
                 a 32-bit machine, twice as much on a 64-bit machine. We also
                 need stack space for GC.

>From official docs:

For integers: 29 bits + sign on typical 32-bit builds (-2^29 to 2^29 - 1; see
most-positive-fixnum).
For chars: 22-bit.

The following types have no documented size limits in the manual, but:

================================================================

For float: Emacs uses the IEEE floating point standard where possible. But
which precision exactly (half/single/double
http://en.wikipedia.org/wiki/IEEE_754#Basic_formats)?

/* Lisp floating point type.  */
struct Lisp_Float  /* src/lisp.h */
  {
    union
    {
      double data;
      struct Lisp_Float *chain;
    } u;
  };

It seems to use 64-bit (double-precision) IEEE 754 on most 32-bit platforms.

Is there any function at runtime that returns the digit and exponent widths
for a float?
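
Since Lisp_Float just wraps a C double, the widths can at least be read at the
C level from <float.h>. The following is a minimal sketch, not an Emacs API:
the function names are made up for illustration, and the exponent width is
derived on the assumption of an IEEE-style layout with one sign bit and an
implicit leading significand bit.

```c
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Significand width of a C double in bits, counting the implicit leading
   bit (hypothetical helper, not part of Emacs).  */
int double_significand_bits (void)
{
  return DBL_MANT_DIG;
}

/* Exponent field width in bits.  Assumes an IEEE-style layout:
   total = 1 sign bit + exponent bits + (significand bits - 1).  */
int double_exponent_bits (void)
{
  return (int) (sizeof (double) * CHAR_BIT) - DBL_MANT_DIG;
}

/* Print the limits that the C implementation reports for double.  */
void print_double_limits (void)
{
  printf ("significand bits: %d\n", double_significand_bits ());
  printf ("exponent bits:    %d\n", double_exponent_bits ());
  printf ("decimal digits:   %d\n", DBL_DIG);
  printf ("max:              %g\n", DBL_MAX);
}
```

Calling print_double_limits () from a small main () shows what the compiler
used to build Emacs would see on the same platform.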

================================================================

For lists: I think their length is not limited at all (except by available
memory).

================================================================

But how many bytes does a symbol take? For example, 'foo'?

>From src/lisp.h:

typedef struct { EMACS_INT i; } Lisp_Object;

struct Lisp_Symbol
{
  unsigned gcmarkbit : 1;
  ENUM_BF (symbol_redirect) redirect : 3;
  unsigned constant : 2;
  unsigned interned : 2;
  unsigned declared_special : 1;
  Lisp_Object name;
  union {
    Lisp_Object value;
    struct Lisp_Symbol *alias;
    struct Lisp_Buffer_Local_Value *blv;
    union Lisp_Fwd *fwd;
  } val;
  Lisp_Object function;
  Lisp_Object plist;
  struct Lisp_Symbol *next;
};

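For a 32-bit arch I count 4*6=24 bytes (one word for the bit-fields plus five
word-sized fields), not counting the string that holds the name.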
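
The count for a symbol can be checked with sizeof on a stand-in struct. This is
only a sketch: mock_symbol32 and word32 are hypothetical names, every
Lisp_Object and pointer is modelled as one 32-bit word so that the result does
not depend on the host, and bit-field packing is implementation-defined, so
the 24 bytes is what I would expect from GCC or Clang rather than a guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: one 32-bit word for each Lisp_Object or pointer,
   as on a 32-bit build.  */
typedef uint32_t word32;

/* Same field order as struct Lisp_Symbol in the excerpt above.  */
struct mock_symbol32
{
  unsigned gcmarkbit : 1;
  unsigned redirect : 3;
  unsigned constant : 2;
  unsigned interned : 2;
  unsigned declared_special : 1;
  word32 name;      /* Lisp_Object name */
  word32 val;       /* union of value / alias / blv / fwd */
  word32 function;  /* Lisp_Object function */
  word32 plist;     /* Lisp_Object plist */
  word32 next;      /* struct Lisp_Symbol *next */
};

/* Size of the stand-in symbol in bytes.  */
size_t mock_symbol32_size (void)
{
  return sizeof (struct mock_symbol32);
}
```

The nine flag bits share a single word, which is where the first of the six
words goes.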

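It seems that a Lisp_Object is a tagged machine word rather than an index into
a hash table: a few bits hold the type tag and the rest hold either an
immediate integer or a pointer to the actual value (like the actual name or
function code...).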

================================================================

How much memory does a cons cell take?

struct Lisp_Cons
  {
    Lisp_Object car;
    union
    {
      Lisp_Object cdr;
      struct Lisp_Cons *chain;
    } u;
  };

For a 32-bit arch I count 4*2=8 bytes.

================================================================

How much memory does a plist take to store a single property?

From:

DEFUN ("plist-put", Fplist_put, Splist_put, 3, 3, 0,
  (Lisp_Object plist, register Lisp_Object prop, Lisp_Object val)
{
  register Lisp_Object tail, prev;
  Lisp_Object newcell;
  prev = Qnil;
  for (tail = plist; CONSP (tail) && CONSP (XCDR (tail));
       tail = XCDR (XCDR (tail)))

it seems to take 2 cons cells per property, or 8*2=16 bytes.
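
The plist arithmetic can be written down as a small sketch. The names
mock_cons32, word32 and plist_cells_bytes are hypothetical, and both fields of
the cons cell are modelled as 32-bit words to mimic a 32-bit build. The loop
in plist-put steps by XCDR (XCDR (tail)), which is what suggests a flat
(prop1 val1 prop2 val2 ...) list with two cells per property.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit Lisp_Object or pointer.  */
typedef uint32_t word32;

/* Same shape as struct Lisp_Cons in the excerpt above.  */
struct mock_cons32
{
  word32 car;
  union
  {
    word32 cdr;
    word32 chain;  /* free-list link, shares storage with cdr */
  } u;
};

/* Bytes of cons cells needed for a plist with NPROPS properties:
   one cell for the property name, one for its value.  */
size_t plist_cells_bytes (size_t nprops)
{
  return nprops * 2 * sizeof (struct mock_cons32);
}
```

This counts only the cells themselves, not the objects that the cars point to.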

================================================================

How much memory does a string take (as used for buffer strings and symbol
names)?

typedef struct interval *INTERVAL;
struct Lisp_String
  {
    ptrdiff_t size;
    ptrdiff_t size_byte;
    INTERVAL intervals;		/* Text properties in this string.  */
    unsigned char *data;
  };

It seems to be 4*4=16 bytes for the four word-sized header fields (on 32-bit)
+ lengthOf(data) bytes.

The manual says that "strings really contain integers" and "strings are
arrays, and therefore sequences as well".

So does each char (in data) use 4 bytes? It seems not, because:

     To conserve memory, Emacs does not hold fixed-length 22-bit numbers that
  are codepoints of text characters within buffers and strings. Rather, Emacs
  uses a variable-length internal representation of characters, that stores
  each character as a sequence of 1 to 5 8-bit bytes, depending on the
  magnitude of its codepoint.

and:

  Encoded text is not really text, as far as Emacs is concerned, but rather a
  sequence of raw 8-bit bytes. We call buffers and strings that hold encoded
  text "unibyte" buffers and strings, because Emacs treats them as a sequence
  of individual bytes.

With unibyte strings I understand that it is easy to get a char by index.

But with multibyte strings I don't understand it. And I don't understand why
strings are arrays in this case; is it an inefficient array?
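
To show why indexing a variable-length encoding is not O(1), here is a minimal
sketch that converts a character index to a byte offset by scanning. It uses
plain UTF-8 as a stand-in for the internal representation described in the
manual, char_to_byte is a hypothetical name, and it ignores any position
caching that Emacs may do.

```c
#include <stddef.h>

/* Return the byte offset of the character at index CHAR_INDEX in the
   UTF-8 text S of N bytes, or N if the index is past the end.
   Hypothetical helper, not an Emacs function.  */
size_t char_to_byte (const unsigned char *s, size_t n, size_t char_index)
{
  size_t byte = 0;
  while (byte < n && char_index > 0)
    {
      /* Step over the lead byte of the current character.  */
      byte++;
      /* Step over its continuation bytes, which look like 10xxxxxx.  */
      while (byte < n && (s[byte] & 0xC0) == 0x80)
        byte++;
      char_index--;
    }
  return byte;
}
```

For the five bytes D0 91 61 D0 92 (two Cyrillic letters around an "a"),
character 1 starts at byte 2 and character 2 at byte 3, so the cost of a
lookup grows with the index unless positions are cached.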

It seems that buffer text is stored the same way as string data:

struct buffer_text   /* from src/buffer.h */
  {
    unsigned char *beg;
    ptrdiff_t gpt;		/* Char pos of gap in buffer.  */
    ptrdiff_t z;		/* Char pos of end of buffer.  */
    ptrdiff_t gpt_byte;		/* Byte pos of gap in buffer.  */
    ptrdiff_t z_byte;		/* Byte pos of end of buffer.  */
    ptrdiff_t gap_size;		/* Size of buffer's gap.  */
    EMACS_INT modiff;		/* This counts buffer-modification events ...  */
    EMACS_INT chars_modiff;	/* This is modified with character change ...  */
    EMACS_INT save_modiff;	/* Previous value of modiff, as of last ...  */
    EMACS_INT overlay_modiff;	/* Counts modifications to overlays.  */
    EMACS_INT compact;		/* Set to modiff each time when compact_buffer ...  */
    ptrdiff_t beg_unchanged;
    ptrdiff_t end_unchanged;
    EMACS_INT unchanged_modified;
    EMACS_INT overlay_unchanged_modified;
    INTERVAL intervals;
    struct Lisp_Marker *markers;
    bool inhibit_shrinking;
  };

So opening a 10 KiB Russian file in cp1251 actually takes up to 2*10 KiB for
the buffer, as each Russian letter in a multibyte string takes 2 bytes, while
ASCII characters such as spaces and newlines still take 1 (just type C-u C-x =
and look at "buffer code: #xD0 #x91").
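
The "#xD0 #x91" bytes can be reproduced by hand. A minimal sketch, assuming
the buffer code shown is ordinary UTF-8 for code points in this range;
utf8_len and utf8_encode2 are hypothetical helpers, and U+0411 (CYRILLIC
CAPITAL LETTER BE) is the character I take those two bytes to be.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes standard UTF-8 needs for code point CP, or 0 if CP is
   outside the Unicode range.  Surrogates are not treated specially.  */
size_t utf8_len (uint32_t cp)
{
  if (cp < 0x80)
    return 1;
  if (cp < 0x800)
    return 2;
  if (cp < 0x10000)
    return 3;
  if (cp < 0x110000)
    return 4;
  return 0;
}

/* Encode a code point in the two-byte range U+0080..U+07FF into OUT.  */
void utf8_encode2 (uint32_t cp, unsigned char out[2])
{
  out[0] = (unsigned char) (0xC0 | (cp >> 6));    /* 110xxxxx */
  out[1] = (unsigned char) (0x80 | (cp & 0x3F));  /* 10xxxxxx */
}
```

All Cyrillic letters in cp1251 map to the U+0400 block, so each of them falls
in the two-byte case, while ASCII stays at one byte.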

I think that strings have no length limit (except the fixnum limit,
most-positive-fixnum, on the index on a 32-bit platform).

================================================================

It seems that arrays/vectors also have no length limit (except the fixnum
limit, most-positive-fixnum, on the index on a 32-bit platform):

/* Regular vector is just a header plus array of Lisp_Objects.  */
struct Lisp_Vector   /* src/lisp.h */
  {
    struct vectorlike_header header;
    Lisp_Object contents[1];
  };

/* A boolvector is a kind of vectorlike, with contents are like a string.  */
struct Lisp_Bool_Vector
  {
    struct vectorlike_header header;
    /* This is the size in bits.  */
    EMACS_INT size;
    /* This contains the actual bits, packed into bytes.  */
    unsigned char data[1];
  };
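
Going by the comments in Lisp_Bool_Vector (size in bits, bits packed into
bytes), the data part should need one byte per eight elements, rounded up. A
small sketch of that arithmetic, with bool_vector_data_bytes as a hypothetical
name and without the header or any allocator padding:

```c
#include <limits.h>
#include <stddef.h>

/* Bytes needed to store NBITS booleans packed CHAR_BIT per byte,
   rounding up to a whole byte.  Header and padding are not counted.  */
size_t bool_vector_data_bytes (size_t nbits)
{
  return (nbits + CHAR_BIT - 1) / CHAR_BIT;
}
```

So a bool-vector of 9 elements would need 2 data bytes, compared with 9 words
for a regular vector of the same length.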

================================================================

Hash tables are a more complex data type, and I can't work out the limit on
the number of key-value pairs from:

struct Lisp_Hash_Table
{
  struct vectorlike_header header;
  Lisp_Object weak;
  Lisp_Object rehash_size;
  Lisp_Object rehash_threshold;
  Lisp_Object hash;
  Lisp_Object next;
  Lisp_Object next_free;
  Lisp_Object index;
  ptrdiff_t count;
  Lisp_Object key_and_value;
  struct hash_table_test test;
  struct Lisp_Hash_Table *next_weak;
};

================================================================

Please correct me and answer the questions...

-- 
Best regards!