unofficial mirror of emacs-devel@gnu.org 
* Using empty_string as the only "" string
@ 2007-04-24 16:32 Dmitry Antipov
  2007-04-24 17:05 ` Juanma Barranquero
                   ` (2 more replies)
  0 siblings, 3 replies; 59+ messages in thread
From: Dmitry Antipov @ 2007-04-24 16:32 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 192 bytes --]

Hello all,

I've probably missed something, but what is the reason for having so
many "" (zero-length) strings? Why not unify them into a single
one? Here is how I'm doing this...

Dmitry


[-- Attachment #2: empty_string.patch --]
[-- Type: text/plain, Size: 3698 bytes --]

Index: alloc.c
===================================================================
RCS file: /sources/emacs/emacs/src/alloc.c,v
retrieving revision 1.409
diff -u -r1.409 alloc.c
--- alloc.c	16 Apr 2007 03:09:33 -0000	1.409
+++ alloc.c	24 Apr 2007 15:38:29 -0000
@@ -1947,7 +1947,7 @@
    S->data.  Set S->size to NCHARS and S->size_byte to NBYTES.  Free
    S->data if it was initially non-null.  */
 
-void
+struct Lisp_String *
 allocate_string_data (s, nchars, nbytes)
      struct Lisp_String *s;
      int nchars, nbytes;
@@ -2049,6 +2049,7 @@
     }
 
   consing_since_gc += needed;
+  return s;
 }
 
 
@@ -2493,14 +2494,14 @@
      int nchars, nbytes;
 {
   Lisp_Object string;
-  struct Lisp_String *s;
 
   if (nchars < 0)
     abort ();
+  if (!nbytes)
+    return empty_string;
 
-  s = allocate_string ();
-  allocate_string_data (s, nchars, nbytes);
-  XSETSTRING (string, s);
+  XSETSTRING (string, allocate_string_data (allocate_string (), 
+					    nchars, nbytes));
   string_chars_consed += nbytes;
   return string;
 }
@@ -6469,6 +6470,12 @@
   Qpost_gc_hook = intern ("post-gc-hook");
   staticpro (&Qpost_gc_hook);
 
+  /* Must be initialized before any other possible string
+     allocation can be made, and before syms_of_lread ().  */
+  XSETSTRING (empty_string, allocate_string_data (allocate_string (), 0, 0));
+  STRING_SET_UNIBYTE (empty_string);
+  staticpro (&empty_string);
+
   DEFVAR_LISP ("memory-signal-data", &Vmemory_signal_data,
 	       doc: /* Precomputed `signal' argument for memory-full error.  */);
   /* We build this in advance because if we wait until we need it, we might
Index: emacs.c
===================================================================
RCS file: /sources/emacs/emacs/src/emacs.c,v
retrieving revision 1.401
diff -u -r1.401 emacs.c
--- emacs.c	3 Apr 2007 15:25:28 -0000	1.401
+++ emacs.c	24 Apr 2007 15:38:38 -0000
@@ -2468,9 +2468,6 @@
 The hook is not run in batch mode, i.e., if `noninteractive' is non-nil.  */);
   Vkill_emacs_hook = Qnil;
 
-  empty_string = build_string ("");
-  staticpro (&empty_string);
-
   DEFVAR_INT ("emacs-priority", &emacs_priority,
 	      doc: /* Priority for Emacs to run at.
 This value is effective only if set before Emacs is dumped,
Index: lisp.h
===================================================================
RCS file: /sources/emacs/emacs/src/lisp.h,v
retrieving revision 1.574
diff -u -r1.574 lisp.h
--- lisp.h	17 Mar 2007 18:27:10 -0000	1.574
+++ lisp.h	24 Apr 2007 15:38:42 -0000
@@ -2545,7 +2545,8 @@
 
 /* Defined in alloc.c */
 extern void check_pure_size P_ ((void));
-extern void allocate_string_data P_ ((struct Lisp_String *, int, int));
+extern struct Lisp_String * allocate_string_data P_ ((struct Lisp_String *,
+						      int, int));
 extern void reset_malloc_hooks P_ ((void));
 extern void uninterrupt_malloc P_ ((void));
 extern void malloc_warning P_ ((char *));
Index: lread.c
===================================================================
RCS file: /sources/emacs/emacs/src/lread.c,v
retrieving revision 1.369
diff -u -r1.369 lread.c
--- lread.c	28 Mar 2007 08:16:19 -0000	1.369
+++ lread.c	24 Apr 2007 15:38:47 -0000
@@ -4070,8 +4070,7 @@
 in order to do so.  However, if you want to customize which suffixes
 the loading functions recognize as compression suffixes, you should
 customize `jka-compr-load-suffixes' rather than the present variable.  */);
-  /* We don't use empty_string because it's not initialized yet.  */
-  Vload_file_rep_suffixes = Fcons (build_string (""), Qnil);
+  Vload_file_rep_suffixes = Fcons (empty_string, Qnil);
 
   DEFVAR_BOOL ("load-in-progress", &load_in_progress,
 	       doc: /* Non-nil iff inside of `load'.  */);
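
The idea behind the patch above can be sketched outside Emacs as follows. This is a minimal standalone illustration, not the patched code: `toy_string`, `toy_make_string` and `toy_eq` are hypothetical names, and the struct is not the real `Lisp_String` layout. It only shows the core move, which is that a zero-length request returns one shared object instead of allocating a new one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for struct Lisp_String; not the real Emacs layout.  */
struct toy_string
{
  size_t nbytes;
  char *data;
};

/* The one shared zero-length string (plays the role of empty_string).  */
static char toy_empty_data[1] = "";
static struct toy_string toy_empty_string = { 0, toy_empty_data };

/* Return a string holding a copy of the first NBYTES bytes of SRC.
   A zero-length request returns the shared object and allocates nothing.  */
struct toy_string *
toy_make_string (const char *src, size_t nbytes)
{
  struct toy_string *s;

  if (nbytes == 0)
    return &toy_empty_string;

  assert (src != NULL);
  s = malloc (sizeof *s);
  if (s == NULL)
    abort ();
  s->data = malloc (nbytes + 1);
  if (s->data == NULL)
    abort ();
  memcpy (s->data, src, nbytes);
  s->data[nbytes] = '\0';
  s->nbytes = nbytes;
  return s;
}

/* Pointer identity, the rough analogue of Lisp `eq'.  */
int
toy_eq (const struct toy_string *a, const struct toy_string *b)
{
  return a == b;
}
```

With this scheme any two empty strings compare identical under `toy_eq`, while two separately created non-empty strings with the same contents do not.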


* Re: using empty_string as the only "" string
@ 2007-04-25  5:38 dmantipov
  2007-04-25  5:49 ` Miles Bader
                   ` (3 more replies)
  0 siblings, 4 replies; 59+ messages in thread
From: dmantipov @ 2007-04-25  5:38 UTC (permalink / raw)
  To: emacs-devel

That was an interesting discussion, thanks to all.

All CLs I've installed (clisp, cmucl and franz) give (eq 0 0) => t and
(eq "" "") => nil. But a) we can tweak 'eq' to handle this special case
(it looks poor, but just to keep the language pure) and b) Emacs isn't a CL
and need not obey CLtL2 completely, does it?

Immediately after startup but before any user interaction, my emacs
binary creates >260 empty strings, and >60 of them survive the first GC.
Saving 960 bytes (on a 32-bit system) of Lisp_Strings may be considered
marginal. But, for example, after you have gnus loaded, you will have
>1000 empty strings created, and >600 of them survive the next GC.
I don't agree that approx. 10K is a marginal space optimization even
if your desktop has 4G RAM.
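
The arithmetic behind these figures can be checked with a short sketch. The 16-byte header size is an assumption inferred from the numbers in the message (960 bytes / 60 strings), not taken from the Emacs sources, and `empty_string_overhead` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed size of one string header on a 32-bit build; inferred from
   960 bytes / 60 strings in the message, not from the Emacs sources.  */
#define ASSUMED_HEADER_BYTES 16

/* Bytes occupied by SURVIVING separate empty strings, given
   HEADER_BYTES per string header.  */
size_t
empty_string_overhead (size_t surviving, size_t header_bytes)
{
  assert (header_bytes > 0);
  return surviving * header_bytes;
}
```

Under that assumption, 60 surviving strings cost 960 bytes and 600 cost 9600 bytes, which matches the "approx. 10K" quoted for the gnus case.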

Immediate (built into Lisp_String) short strings are a nice and
interesting idea too, IMHO.

I don't expect too much from the canonicalization of other objects,
'frequently-used' float numbers like 1.0 or 0.0 in particular. I believe
these objects are very rare (in comparison with empty strings) in the most
common situations, so the gain will be just 0.0001% over a no-op.

> You can modify the multibyteness of an empty string, and a
> unibyte empty string and a multibyte empty string behave a
> little bit differently, for instance, when concatenated with
> a unibyte 8-bit string.

How can you modify the multibyteness of an empty string? You can't aset
a multibyte char (or anything else) into an empty string, and conversion
functions like 'string-make-unibyte' or 'string-to-multibyte' always create
new strings instead of touching their argument. Moreover, since "" is a
no-op in concatenation operations, it may be silently discarded without
looking into its internal structure, may it not?
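
The suggestion in the last sentence can be illustrated with a toy model. This sketch only shows what "discarding empty operands without looking inside them" would mean; it does not describe how Emacs concatenation actually treats empty strings, which is the point under dispute in the quoted reply. `toy_operand` and `toy_result_is_multibyte` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Toy operand: a length and a multibyte flag; not the Emacs representation.  */
struct toy_operand
{
  size_t nbytes;
  int multibyte;
};

/* Decide whether a concatenation result is multibyte, skipping
   zero-length operands entirely.  */
int
toy_result_is_multibyte (const struct toy_operand *ops, size_t n)
{
  size_t i;

  assert (ops != NULL || n == 0);
  for (i = 0; i < n; i++)
    {
      if (ops[i].nbytes == 0)
        continue;		/* empty operand: its flag is never inspected */
      if (ops[i].multibyte)
        return 1;
    }
  return 0;
}
```

In this model a multibyte-flagged empty operand followed by a unibyte string yields a unibyte result, because the empty operand contributes nothing to the decision.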

Dmitry


end of thread, other threads:[~2007-06-08 19:16 UTC | newest]

Thread overview: 59+ messages
2007-04-24 16:32 Using empty_string as the only "" string Dmitry Antipov
2007-04-24 17:05 ` Juanma Barranquero
2007-04-24 18:11   ` Andreas Schwab
2007-04-24 18:50     ` Juanma Barranquero
2007-04-24 21:38       ` Andreas Schwab
2007-04-24 21:54         ` Juanma Barranquero
2007-04-24 22:11           ` Andreas Schwab
2007-04-24 22:54             ` Juanma Barranquero
2007-04-24 21:57         ` David Kastrup
2007-04-24 22:07           ` Lennart Borgman (gmail)
2007-04-24 22:29             ` David Kastrup
2007-04-24 22:35               ` Andreas Schwab
2007-04-25  0:55                 ` Kenichi Handa
2007-04-25  9:51                   ` Andreas Schwab
2007-04-25  9:58                     ` David Kastrup
2007-04-25 10:50                       ` Andreas Schwab
2007-04-24 22:40               ` Lennart Borgman (gmail)
2007-04-24 22:12           ` Andreas Schwab
2007-04-24 22:31             ` David Kastrup
2007-04-24 22:56               ` Andreas Schwab
2007-04-24 21:39       ` Miles Bader
2007-04-24 21:45         ` Juanma Barranquero
2007-04-24 22:11           ` Miles Bader
2007-04-24 22:59             ` Juanma Barranquero
2007-04-24 23:37               ` Miles Bader
2007-04-24 23:44                 ` Johan Bockgård
2007-04-25  1:47                   ` Miles Bader
2007-04-25 14:52                   ` Richard Stallman
2007-04-26 15:03                     ` Daniel Brockman
2007-04-27 20:40                       ` Richard Stallman
2007-04-25  2:05       ` Richard Stallman
2007-04-25 12:00         ` Juanma Barranquero
2007-04-25  2:05   ` Richard Stallman
2007-04-24 17:48 ` Stefan Monnier
2007-04-25  2:05   ` Richard Stallman
2007-04-26 14:24   ` Dmitry Antipov
2007-04-25  2:05 ` Richard Stallman
  -- strict thread matches above, loose matches on Subject: below --
2007-04-25  5:38 using " dmantipov
2007-04-25  5:49 ` Miles Bader
2007-04-25 11:50 ` Juanma Barranquero
2007-04-25 11:56 ` Kenichi Handa
2007-04-25 13:22   ` Dmitry Antipov
2007-04-25 16:07     ` Stefan Monnier
2007-04-26  4:23 ` Richard Stallman
2007-04-26 13:03   ` Dmitry Antipov
2007-04-27  6:00     ` Richard Stallman
2007-04-27 10:04       ` Dmitry Antipov
2007-04-27 10:29         ` David Kastrup
2007-04-28  4:06         ` Richard Stallman
2007-04-28  8:54           ` Dmitry Antipov
2007-04-28 18:35             ` Richard Stallman
2007-06-05 15:43               ` Juanma Barranquero
2007-06-05 19:17                 ` Richard Stallman
2007-06-05 19:45                   ` Juanma Barranquero
2007-06-06  1:17                     ` Stefan Monnier
2007-06-06 11:04                       ` Juanma Barranquero
2007-06-06 22:09                         ` Richard Stallman
2007-06-08 15:49                           ` Juanma Barranquero
2007-06-08 19:16                             ` Stefan Monnier
