unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* indirect threading for bytecode interpreter
@ 2009-09-17 16:46 Tom Tromey
  2009-09-17 18:21 ` Stefan Monnier
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Tom Tromey @ 2009-09-17 16:46 UTC (permalink / raw)
  To: Emacs development discussions

This patch implements indirect threading for the Emacs bytecode
interpreter.  It uses a GNU C extension, and falls back to the current
switch-based code if Emacs is compiled with some other compiler.

This speeds up the bytecode interpreter a bit.  On my benchmark, it
improved the overall time between 4-7% (lots of noise in the runs).
That isn't bad when you consider that Fbyte_code was the 3rd biggest
time user (the first two being re_match_2_internal and mark_object).

I think re_match_2_internal would benefit from something similar.
There, however, I think direct threading would be preferable, because it
performs better, and because the size of the intermediate form is not as
big a concern.  (At least, I think that is true as long as Emacs does
not have first-class compiled regular expressions.)

Tom

2009-09-17  Tom Tromey  <tromey@redhat.com>

	* bytecode.c (BYTE_CODE_THREADED): New define.
	(Bvarref1, Bvarref2, Bvarref3, Bvarref4, Bvarref5, Bvarref6)
	(Bvarref7, Bvarset1, Bvarset2, Bvarset3, Bvarset4, Bvarset5)
	(Bvarset6, Bvarset7, Bvarbind1, Bvarbind2, Bvarbind3, Bvarbind4)
	(Bvarbind5, Bvarbind6, Bvarbind7, Bcall1, Bcall2, Bcall3)
	(Bcall4, Bcall5, Bcall6, Bcall7, Bunbind1, Bunbind2, Bunbind3)
	(Bunbind4, Bunbind5, Bunbind6, Bunbind7): New defines.
	(CASE, LABEL, NEXT, FIRST, CASE_DEFAULT, CASE_ABORT): New
	defines.
	(Fbyte_code): Use indirect threading.  Rewrite switch and case
	labels.

diff --git a/src/bytecode.c b/src/bytecode.c
index 43d972e..bd69a89 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -54,6 +54,13 @@ by Hallvard:
 /* #define BYTE_CODE_SAFE */
 /* #define BYTE_CODE_METER */
 
+/* If BYTE_CODE_THREADED is defined, then the interpreter will be
+   indirect threaded, using GCC's computed goto extension.  This is
+   incompatible with BYTE_CODE_SAFE and BYTE_CODE_METER.  */
+#if defined (__GNUC__) && !defined (BYTE_CODE_SAFE) && !defined (BYTE_CODE_METER)
+#define BYTE_CODE_THREADED
+#endif
+
 \f
 #ifdef BYTE_CODE_METER
 
@@ -90,10 +97,45 @@ Lisp_Object Qbytecode;
 /*  Byte codes: */
 
 #define Bvarref 010
+#define Bvarref1 011
+#define Bvarref2 012
+#define Bvarref3 013
+#define Bvarref4 014
+#define Bvarref5 015
+#define Bvarref6 016
+#define Bvarref7 017
 #define Bvarset 020
+#define Bvarset1 021
+#define Bvarset2 022
+#define Bvarset3 023
+#define Bvarset4 024
+#define Bvarset5 025
+#define Bvarset6 026
+#define Bvarset7 027
 #define Bvarbind 030
+#define Bvarbind1 031
+#define Bvarbind2 032
+#define Bvarbind3 033
+#define Bvarbind4 034
+#define Bvarbind5 035
+#define Bvarbind6 036
+#define Bvarbind7 037
 #define Bcall 040
+#define Bcall1 041
+#define Bcall2 042
+#define Bcall3 043
+#define Bcall4 044
+#define Bcall5 045
+#define Bcall6 046
+#define Bcall7 047
 #define Bunbind 050
+#define Bunbind1 051
+#define Bunbind2 052
+#define Bunbind3 053
+#define Bunbind4 054
+#define Bunbind5 055
+#define Bunbind6 056
+#define Bunbind7 057
 
 #define Bnth 070
 #define Bsymbolp 071
@@ -463,27 +505,324 @@ If the third argument is incorrect, Emacs may crash.  */)
       this_op = op = FETCH;
       METER_CODE (prev_op, op);
 #else
+#ifndef BYTE_CODE_THREADED
       op = FETCH;
 #endif
+#endif
+
+#ifdef BYTE_CODE_THREADED
+#define CASE(OP) insn_ ## OP
+#define LABEL(OP) &&insn_ ## OP
+#define NEXT goto *(targets[op = FETCH])
+#define FIRST NEXT;
+#define CASE_DEFAULT
+#define CASE_ABORT CASE (default)
+#else
+#define CASE(OP) case OP
+#define NEXT break
+#define FIRST switch (op)
+#define CASE_DEFAULT case 255: default:
+#define CASE_ABORT case 0
+#endif
+
+#ifdef BYTE_CODE_THREADED
+      static const void *const targets[256] =
+	{
+	  LABEL (default),	/* 0 */
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),	/* 7 */
+	  LABEL (Bvarref),
+	  LABEL (Bvarref1),
+	  LABEL (Bvarref2),
+	  LABEL (Bvarref3),
+	  LABEL (Bvarref4),
+	  LABEL (Bvarref5),
+	  LABEL (Bvarref6),
+	  LABEL (Bvarref7),
+	  LABEL (Bvarset),
+	  LABEL (Bvarset1),
+	  LABEL (Bvarset2),
+	  LABEL (Bvarset3),
+	  LABEL (Bvarset4),
+	  LABEL (Bvarset5),
+	  LABEL (Bvarset6),
+	  LABEL (Bvarset7),
+	  LABEL (Bvarbind),
+	  LABEL (Bvarbind1),
+	  LABEL (Bvarbind2),
+	  LABEL (Bvarbind3),
+	  LABEL (Bvarbind4),
+	  LABEL (Bvarbind5),
+	  LABEL (Bvarbind6),
+	  LABEL (Bvarbind7),
+	  LABEL (Bcall),
+	  LABEL (Bcall1),
+	  LABEL (Bcall2),
+	  LABEL (Bcall3),
+	  LABEL (Bcall4),
+	  LABEL (Bcall5),
+	  LABEL (Bcall6),
+	  LABEL (Bcall7),
+	  LABEL (Bunbind),
+	  LABEL (Bunbind1),
+	  LABEL (Bunbind2),
+	  LABEL (Bunbind3),
+	  LABEL (Bunbind4),
+	  LABEL (Bunbind5),
+	  LABEL (Bunbind6),
+	  LABEL (Bunbind7),
+	  LABEL (default),	/* 060 */
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),	/* 067 */
+	  LABEL (Bnth),
+	  LABEL (Bsymbolp),
+	  LABEL (Bconsp),
+	  LABEL (Bstringp),
+	  LABEL (Blistp),
+	  LABEL (Beq),
+	  LABEL (Bmemq),
+	  LABEL (Bnot),
+	  LABEL (Bcar),
+	  LABEL (Bcdr),
+	  LABEL (Bcons),
+	  LABEL (Blist1),
+	  LABEL (Blist2),
+	  LABEL (Blist3),
+	  LABEL (Blist4),
+	  LABEL (Blength),
+	  LABEL (Baref),
+	  LABEL (Baset),
+	  LABEL (Bsymbol_value),
+	  LABEL (Bsymbol_function),
+	  LABEL (Bset),
+	  LABEL (Bfset),
+	  LABEL (Bget),
+	  LABEL (Bsubstring),
+	  LABEL (Bconcat2),
+	  LABEL (Bconcat3),
+	  LABEL (Bconcat4),
+	  LABEL (Bsub1),
+	  LABEL (Badd1),
+	  LABEL (Beqlsign),
+	  LABEL (Bgtr),
+	  LABEL (Blss),
+	  LABEL (Bleq),
+	  LABEL (Bgeq),
+	  LABEL (Bdiff),
+	  LABEL (Bnegate),
+	  LABEL (Bplus),
+	  LABEL (Bmax),
+	  LABEL (Bmin),
+	  LABEL (Bmult),
+	  LABEL (Bpoint),
+	  LABEL (Bsave_current_buffer),
+	  LABEL (Bgoto_char),
+	  LABEL (Binsert),
+	  LABEL (Bpoint_max),
+	  LABEL (Bpoint_min),
+	  LABEL (Bchar_after),
+	  LABEL (Bfollowing_char),
+	  LABEL (Bpreceding_char),
+	  LABEL (Bcurrent_column),
+	  LABEL (Bindent_to),
+	  LABEL (Bscan_buffer),
+	  LABEL (Beolp),
+	  LABEL (Beobp),
+	  LABEL (Bbolp),
+	  LABEL (Bbobp),
+	  LABEL (Bcurrent_buffer),
+	  LABEL (Bset_buffer),
+	  LABEL (Bsave_current_buffer_1),
+	  LABEL (Bset_mark),
+	  LABEL (Binteractive_p),
+	  LABEL (Bforward_char),
+	  LABEL (Bforward_word),
+	  LABEL (Bskip_chars_forward),
+	  LABEL (Bskip_chars_backward),
+	  LABEL (Bforward_line),
+	  LABEL (Bchar_syntax),
+	  LABEL (Bbuffer_substring),
+	  LABEL (Bdelete_region),
+	  LABEL (Bnarrow_to_region),
+	  LABEL (Bwiden),
+	  LABEL (Bend_of_line),
+	  LABEL (default),	/* 0200 */
+	  LABEL (Bconstant2),
+	  LABEL (Bgoto),
+	  LABEL (Bgotoifnil),
+	  LABEL (Bgotoifnonnil),
+	  LABEL (Bgotoifnilelsepop),
+	  LABEL (Bgotoifnonnilelsepop),
+	  LABEL (Breturn),
+	  LABEL (Bdiscard),
+	  LABEL (Bdup),
+	  LABEL (Bsave_excursion),
+	  LABEL (Bsave_window_excursion),
+	  LABEL (Bsave_restriction),
+	  LABEL (Bcatch),
+	  LABEL (Bunwind_protect),
+	  LABEL (Bcondition_case),
+	  LABEL (Btemp_output_buffer_setup),
+	  LABEL (Btemp_output_buffer_show),
+	  LABEL (Bunbind_all),
+	  LABEL (Bset_marker),
+	  LABEL (Bmatch_beginning),
+	  LABEL (Bmatch_end),
+	  LABEL (Bupcase),
+	  LABEL (Bdowncase),
+	  LABEL (Bstringeqlsign),
+	  LABEL (Bstringlss),
+	  LABEL (Bequal),
+	  LABEL (Bnthcdr),
+	  LABEL (Belt),
+	  LABEL (Bmember),
+	  LABEL (Bassq),
+	  LABEL (Bnreverse),
+	  LABEL (Bsetcar),
+	  LABEL (Bsetcdr),
+	  LABEL (Bcar_safe),
+	  LABEL (Bcdr_safe),
+	  LABEL (Bnconc),
+	  LABEL (Bquo),
+	  LABEL (Brem),
+	  LABEL (Bnumberp),
+	  LABEL (Bintegerp),
+	  LABEL (default),	/* 0251 */
+	  LABEL (BRgoto),
+	  LABEL (BRgotoifnil),
+	  LABEL (BRgotoifnonnil),
+	  LABEL (BRgotoifnilelsepop),
+	  LABEL (BRgotoifnonnilelsepop),
+	  LABEL (BlistN),
+	  LABEL (BconcatN),
+	  LABEL (BinsertN),
+	  LABEL (default),	/* 0262 */
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),	/* 0270 */
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),
+	  LABEL (default),	/* 0277 */
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant),
+	  LABEL (Bconstant)
+	};
+#endif
+
+#if 0
+      /* Sanity checks that are useful if you modify the table.  */
+      if (targets[Bsymbol_value] != LABEL (Bsymbol_value))
+	abort ();
+      if (targets[Bconstant2] != LABEL (Bconstant2))
+	abort ();
+      if (targets[Bnegate] != LABEL (Bnegate))
+	abort ();
+      if (targets[BinsertN] != LABEL (BinsertN))
+	abort ();
+      if (targets[0277] != LABEL (default))
+	abort ();
+      if (targets[Bconstant-1] == LABEL (Bconstant))
+	abort ();
+      if (targets[Bconstant] != LABEL (Bconstant))
+	abort ();
+#endif
 
-      switch (op)
+      FIRST
 	{
-	case Bvarref + 7:
+	CASE (Bvarref7):
 	  op = FETCH2;
 	  goto varref;
 
-	case Bvarref:
-	case Bvarref + 1:
-	case Bvarref + 2:
-	case Bvarref + 3:
-	case Bvarref + 4:
-	case Bvarref + 5:
+	CASE (Bvarref):
+	CASE (Bvarref1):
+	CASE (Bvarref2):
+	CASE (Bvarref3):
+	CASE (Bvarref4):
+	CASE (Bvarref5):
 	  op = op - Bvarref;
 	  goto varref;
 
 	/* This seems to be the most frequently executed byte-code
 	   among the Bvarref's, so avoid a goto here.  */
-	case Bvarref+6:
+	CASE (Bvarref6):
 	  op = FETCH;
 	varref:
 	  {
@@ -507,10 +846,10 @@ If the third argument is incorrect, Emacs may crash.  */)
 		AFTER_POTENTIAL_GC ();
 	      }
 	    PUSH (v2);
-	    break;
+	    NEXT;
 	  }
 
-	case Bgotoifnil:
+	CASE (Bgotoifnil):
 	  {
 	    Lisp_Object v1;
 	    MAYBE_GC ();
@@ -522,57 +861,57 @@ If the third argument is incorrect, Emacs may crash.  */)
 		CHECK_RANGE (op);
 		stack.pc = stack.byte_string_start + op;
 	      }
-	    break;
+	    NEXT;
 	  }
 
-	case Bcar:
+	CASE (Bcar):
 	  {
 	    Lisp_Object v1;
 	    v1 = TOP;
 	    TOP = CAR (v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Beq:
+	CASE (Beq):
 	  {
 	    Lisp_Object v1;
 	    v1 = POP;
 	    TOP = EQ (v1, TOP) ? Qt : Qnil;
-	    break;
+	    NEXT;
 	  }
 
-	case Bmemq:
+	CASE (Bmemq):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fmemq (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bcdr:
+	CASE (Bcdr):
 	  {
 	    Lisp_Object v1;
 	    v1 = TOP;
 	    TOP = CDR (v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Bvarset:
-	case Bvarset+1:
-	case Bvarset+2:
-	case Bvarset+3:
-	case Bvarset+4:
-	case Bvarset+5:
+	CASE (Bvarset):
+	CASE (Bvarset1):
+	CASE (Bvarset2):
+	CASE (Bvarset3):
+	CASE (Bvarset4):
+	CASE (Bvarset5):
 	  op -= Bvarset;
 	  goto varset;
 
-	case Bvarset+7:
+	CASE (Bvarset7):
 	  op = FETCH2;
 	  goto varset;
 
-	case Bvarset+6:
+	CASE (Bvarset6):
 	  op = FETCH;
 	varset:
 	  {
@@ -596,54 +935,54 @@ If the third argument is incorrect, Emacs may crash.  */)
 	      }
 	  }
 	  (void) POP;
-	  break;
+	  NEXT;
 
-	case Bdup:
+	CASE (Bdup):
 	  {
 	    Lisp_Object v1;
 	    v1 = TOP;
 	    PUSH (v1);
-	    break;
+	    NEXT;
 	  }
 
 	/* ------------------ */
 
-	case Bvarbind+6:
+	CASE (Bvarbind6):
 	  op = FETCH;
 	  goto varbind;
 
-	case Bvarbind+7:
+	CASE (Bvarbind7):
 	  op = FETCH2;
 	  goto varbind;
 
-	case Bvarbind:
-	case Bvarbind+1:
-	case Bvarbind+2:
-	case Bvarbind+3:
-	case Bvarbind+4:
-	case Bvarbind+5:
+	CASE (Bvarbind):
+	CASE (Bvarbind1):
+	CASE (Bvarbind2):
+	CASE (Bvarbind3):
+	CASE (Bvarbind4):
+	CASE (Bvarbind5):
 	  op -= Bvarbind;
 	varbind:
 	  /* Specbind can signal and thus GC.  */
 	  BEFORE_POTENTIAL_GC ();
 	  specbind (vectorp[op], POP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bcall+6:
+	CASE (Bcall6):
 	  op = FETCH;
 	  goto docall;
 
-	case Bcall+7:
+	CASE (Bcall7):
 	  op = FETCH2;
 	  goto docall;
 
-	case Bcall:
-	case Bcall+1:
-	case Bcall+2:
-	case Bcall+3:
-	case Bcall+4:
-	case Bcall+5:
+	CASE (Bcall):
+	CASE (Bcall1):
+	CASE (Bcall2):
+	CASE (Bcall3):
+	CASE (Bcall4):
+	CASE (Bcall5):
 	  op -= Bcall;
 	docall:
 	  {
@@ -666,47 +1005,47 @@ If the third argument is incorrect, Emacs may crash.  */)
 #endif
 	    TOP = Ffuncall (op + 1, &TOP);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bunbind+6:
+	CASE (Bunbind6):
 	  op = FETCH;
 	  goto dounbind;
 
-	case Bunbind+7:
+	CASE (Bunbind7):
 	  op = FETCH2;
 	  goto dounbind;
 
-	case Bunbind:
-	case Bunbind+1:
-	case Bunbind+2:
-	case Bunbind+3:
-	case Bunbind+4:
-	case Bunbind+5:
+	CASE (Bunbind):
+	CASE (Bunbind1):
+	CASE (Bunbind2):
+	CASE (Bunbind3):
+	CASE (Bunbind4):
+	CASE (Bunbind5):
 	  op -= Bunbind;
 	dounbind:
 	  BEFORE_POTENTIAL_GC ();
 	  unbind_to (SPECPDL_INDEX () - op, Qnil);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bunbind_all:
+	CASE (Bunbind_all):
 	  /* To unbind back to the beginning of this frame.  Not used yet,
 	     but will be needed for tail-recursion elimination.  */
 	  BEFORE_POTENTIAL_GC ();
 	  unbind_to (count, Qnil);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bgoto:
+	CASE (Bgoto):
 	  MAYBE_GC ();
 	  BYTE_CODE_QUIT;
 	  op = FETCH2;    /* pc = FETCH2 loses since FETCH2 contains pc++ */
 	  CHECK_RANGE (op);
 	  stack.pc = stack.byte_string_start + op;
-	  break;
+	  NEXT;
 
-	case Bgotoifnonnil:
+	CASE (Bgotoifnonnil):
 	  {
 	    Lisp_Object v1;
 	    MAYBE_GC ();
@@ -718,10 +1057,10 @@ If the third argument is incorrect, Emacs may crash.  */)
 		CHECK_RANGE (op);
 		stack.pc = stack.byte_string_start + op;
 	      }
-	    break;
+	    NEXT;
 	  }
 
-	case Bgotoifnilelsepop:
+	CASE (Bgotoifnilelsepop):
 	  MAYBE_GC ();
 	  op = FETCH2;
 	  if (NILP (TOP))
@@ -731,9 +1070,9 @@ If the third argument is incorrect, Emacs may crash.  */)
 	      stack.pc = stack.byte_string_start + op;
 	    }
 	  else DISCARD (1);
-	  break;
+	  NEXT;
 
-	case Bgotoifnonnilelsepop:
+	CASE (Bgotoifnonnilelsepop):
 	  MAYBE_GC ();
 	  op = FETCH2;
 	  if (!NILP (TOP))
@@ -743,15 +1082,15 @@ If the third argument is incorrect, Emacs may crash.  */)
 	      stack.pc = stack.byte_string_start + op;
 	    }
 	  else DISCARD (1);
-	  break;
+	  NEXT;
 
-	case BRgoto:
+	CASE (BRgoto):
 	  MAYBE_GC ();
 	  BYTE_CODE_QUIT;
 	  stack.pc += (int) *stack.pc - 127;
-	  break;
+	  NEXT;
 
-	case BRgotoifnil:
+	CASE (BRgotoifnil):
 	  {
 	    Lisp_Object v1;
 	    MAYBE_GC ();
@@ -762,10 +1101,10 @@ If the third argument is incorrect, Emacs may crash.  */)
 		stack.pc += (int) *stack.pc - 128;
 	      }
 	    stack.pc++;
-	    break;
+	    NEXT;
 	  }
 
-	case BRgotoifnonnil:
+	CASE (BRgotoifnonnil):
 	  {
 	    Lisp_Object v1;
 	    MAYBE_GC ();
@@ -776,10 +1115,10 @@ If the third argument is incorrect, Emacs may crash.  */)
 		stack.pc += (int) *stack.pc - 128;
 	      }
 	    stack.pc++;
-	    break;
+	    NEXT;
 	  }
 
-	case BRgotoifnilelsepop:
+	CASE (BRgotoifnilelsepop):
 	  MAYBE_GC ();
 	  op = *stack.pc++;
 	  if (NILP (TOP))
@@ -788,9 +1127,9 @@ If the third argument is incorrect, Emacs may crash.  */)
 	      stack.pc += op - 128;
 	    }
 	  else DISCARD (1);
-	  break;
+	  NEXT;
 
-	case BRgotoifnonnilelsepop:
+	CASE (BRgotoifnonnilelsepop):
 	  MAYBE_GC ();
 	  op = *stack.pc++;
 	  if (!NILP (TOP))
@@ -799,56 +1138,56 @@ If the third argument is incorrect, Emacs may crash.  */)
 	      stack.pc += op - 128;
 	    }
 	  else DISCARD (1);
-	  break;
+	  NEXT;
 
-	case Breturn:
+	CASE (Breturn):
 	  result = POP;
 	  goto exit;
 
-	case Bdiscard:
+	CASE (Bdiscard):
 	  DISCARD (1);
-	  break;
+	  NEXT;
 
-	case Bconstant2:
+	CASE (Bconstant2):
 	  PUSH (vectorp[FETCH2]);
-	  break;
+	  NEXT;
 
-	case Bsave_excursion:
+	CASE (Bsave_excursion):
 	  record_unwind_protect (save_excursion_restore,
 				 save_excursion_save ());
-	  break;
+	  NEXT;
 
-	case Bsave_current_buffer:
-	case Bsave_current_buffer_1:
+	CASE (Bsave_current_buffer):
+	CASE (Bsave_current_buffer_1):
 	  record_unwind_protect (set_buffer_if_live, Fcurrent_buffer ());
-	  break;
+	  NEXT;
 
-	case Bsave_window_excursion:
+	CASE (Bsave_window_excursion):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fsave_window_excursion (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bsave_restriction:
+	CASE (Bsave_restriction):
 	  record_unwind_protect (save_restriction_restore,
 				 save_restriction_save ());
-	  break;
+	  NEXT;
 
-	case Bcatch:
+	CASE (Bcatch):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = internal_catch (TOP, Feval, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bunwind_protect:
+	CASE (Bunwind_protect):
 	  record_unwind_protect (Fprogn, POP);
-	  break;
+	  NEXT;
 
-	case Bcondition_case:
+	CASE (Bcondition_case):
 	  {
 	    Lisp_Object handlers, body;
 	    handlers = POP;
@@ -856,18 +1195,18 @@ If the third argument is incorrect, Emacs may crash.  */)
 	    BEFORE_POTENTIAL_GC ();
 	    TOP = internal_lisp_condition_case (TOP, body, handlers);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Btemp_output_buffer_setup:
+	CASE (Btemp_output_buffer_setup):
 	  BEFORE_POTENTIAL_GC ();
 	  CHECK_STRING (TOP);
 	  temp_output_buffer_setup (SDATA (TOP));
 	  AFTER_POTENTIAL_GC ();
 	  TOP = Vstandard_output;
-	  break;
+	  NEXT;
 
-	case Btemp_output_buffer_show:
+	CASE (Btemp_output_buffer_show):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
@@ -877,10 +1216,10 @@ If the third argument is incorrect, Emacs may crash.  */)
 	    /* pop binding of standard-output */
 	    unbind_to (SPECPDL_INDEX () - 1, Qnil);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bnth:
+	CASE (Bnth):
 	  {
 	    Lisp_Object v1, v2;
 	    BEFORE_POTENTIAL_GC ();
@@ -894,173 +1233,173 @@ If the third argument is incorrect, Emacs may crash.  */)
 	      v1 = XCDR (v1);
 	    immediate_quit = 0;
 	    TOP = CAR (v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Bsymbolp:
+	CASE (Bsymbolp):
 	  TOP = SYMBOLP (TOP) ? Qt : Qnil;
-	  break;
+	  NEXT;
 
-	case Bconsp:
+	CASE (Bconsp):
 	  TOP = CONSP (TOP) ? Qt : Qnil;
-	  break;
+	  NEXT;
 
-	case Bstringp:
+	CASE (Bstringp):
 	  TOP = STRINGP (TOP) ? Qt : Qnil;
-	  break;
+	  NEXT;
 
-	case Blistp:
+	CASE (Blistp):
 	  TOP = CONSP (TOP) || NILP (TOP) ? Qt : Qnil;
-	  break;
+	  NEXT;
 
-	case Bnot:
+	CASE (Bnot):
 	  TOP = NILP (TOP) ? Qt : Qnil;
-	  break;
+	  NEXT;
 
-	case Bcons:
+	CASE (Bcons):
 	  {
 	    Lisp_Object v1;
 	    v1 = POP;
 	    TOP = Fcons (TOP, v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Blist1:
+	CASE (Blist1):
 	  TOP = Fcons (TOP, Qnil);
-	  break;
+	  NEXT;
 
-	case Blist2:
+	CASE (Blist2):
 	  {
 	    Lisp_Object v1;
 	    v1 = POP;
 	    TOP = Fcons (TOP, Fcons (v1, Qnil));
-	    break;
+	    NEXT;
 	  }
 
-	case Blist3:
+	CASE (Blist3):
 	  DISCARD (2);
 	  TOP = Flist (3, &TOP);
-	  break;
+	  NEXT;
 
-	case Blist4:
+	CASE (Blist4):
 	  DISCARD (3);
 	  TOP = Flist (4, &TOP);
-	  break;
+	  NEXT;
 
-	case BlistN:
+	CASE (BlistN):
 	  op = FETCH;
 	  DISCARD (op - 1);
 	  TOP = Flist (op, &TOP);
-	  break;
+	  NEXT;
 
-	case Blength:
+	CASE (Blength):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Flength (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Baref:
+	CASE (Baref):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Faref (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Baset:
+	CASE (Baset):
 	  {
 	    Lisp_Object v1, v2;
 	    BEFORE_POTENTIAL_GC ();
 	    v2 = POP; v1 = POP;
 	    TOP = Faset (TOP, v1, v2);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bsymbol_value:
+	CASE (Bsymbol_value):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fsymbol_value (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bsymbol_function:
+	CASE (Bsymbol_function):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fsymbol_function (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bset:
+	CASE (Bset):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fset (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bfset:
+	CASE (Bfset):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Ffset (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bget:
+	CASE (Bget):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fget (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bsubstring:
+	CASE (Bsubstring):
 	  {
 	    Lisp_Object v1, v2;
 	    BEFORE_POTENTIAL_GC ();
 	    v2 = POP; v1 = POP;
 	    TOP = Fsubstring (TOP, v1, v2);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bconcat2:
+	CASE (Bconcat2):
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (1);
 	  TOP = Fconcat (2, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bconcat3:
+	CASE (Bconcat3):
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (2);
 	  TOP = Fconcat (3, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bconcat4:
+	CASE (Bconcat4):
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (3);
 	  TOP = Fconcat (4, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case BconcatN:
+	CASE (BconcatN):
 	  op = FETCH;
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (op - 1);
 	  TOP = Fconcat (op, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bsub1:
+	CASE (Bsub1):
 	  {
 	    Lisp_Object v1;
 	    v1 = TOP;
@@ -1075,10 +1414,10 @@ If the third argument is incorrect, Emacs may crash.  */)
 		TOP = Fsub1 (v1);
 		AFTER_POTENTIAL_GC ();
 	      }
-	    break;
+	    NEXT;
 	  }
 
-	case Badd1:
+	CASE (Badd1):
 	  {
 	    Lisp_Object v1;
 	    v1 = TOP;
@@ -1093,10 +1432,10 @@ If the third argument is incorrect, Emacs may crash.  */)
 		TOP = Fadd1 (v1);
 		AFTER_POTENTIAL_GC ();
 	      }
-	    break;
+	    NEXT;
 	  }
 
-	case Beqlsign:
+	CASE (Beqlsign):
 	  {
 	    Lisp_Object v1, v2;
 	    BEFORE_POTENTIAL_GC ();
@@ -1114,57 +1453,57 @@ If the third argument is incorrect, Emacs may crash.  */)
 	      }
 	    else
 	      TOP = (XINT (v1) == XINT (v2) ? Qt : Qnil);
-	    break;
+	    NEXT;
 	  }
 
-	case Bgtr:
+	CASE (Bgtr):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fgtr (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Blss:
+	CASE (Blss):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Flss (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bleq:
+	CASE (Bleq):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fleq (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bgeq:
+	CASE (Bgeq):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fgeq (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bdiff:
+	CASE (Bdiff):
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (1);
 	  TOP = Fminus (2, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bnegate:
+	CASE (Bnegate):
 	  {
 	    Lisp_Object v1;
 	    v1 = TOP;
@@ -1179,209 +1518,209 @@ If the third argument is incorrect, Emacs may crash.  */)
 		TOP = Fminus (1, &TOP);
 		AFTER_POTENTIAL_GC ();
 	      }
-	    break;
+	    NEXT;
 	  }
 
-	case Bplus:
+	CASE (Bplus):
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (1);
 	  TOP = Fplus (2, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bmax:
+	CASE (Bmax):
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (1);
 	  TOP = Fmax (2, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bmin:
+	CASE (Bmin):
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (1);
 	  TOP = Fmin (2, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bmult:
+	CASE (Bmult):
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (1);
 	  TOP = Ftimes (2, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bquo:
+	CASE (Bquo):
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (1);
 	  TOP = Fquo (2, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Brem:
+	CASE (Brem):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Frem (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bpoint:
+	CASE (Bpoint):
 	  {
 	    Lisp_Object v1;
 	    XSETFASTINT (v1, PT);
 	    PUSH (v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Bgoto_char:
+	CASE (Bgoto_char):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fgoto_char (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Binsert:
+	CASE (Binsert):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Finsert (1, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case BinsertN:
+	CASE (BinsertN):
 	  op = FETCH;
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (op - 1);
 	  TOP = Finsert (op, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bpoint_max:
+	CASE (Bpoint_max):
 	  {
 	    Lisp_Object v1;
 	    XSETFASTINT (v1, ZV);
 	    PUSH (v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Bpoint_min:
+	CASE (Bpoint_min):
 	  {
 	    Lisp_Object v1;
 	    XSETFASTINT (v1, BEGV);
 	    PUSH (v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Bchar_after:
+	CASE (Bchar_after):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fchar_after (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bfollowing_char:
+	CASE (Bfollowing_char):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = Ffollowing_char ();
 	    AFTER_POTENTIAL_GC ();
 	    PUSH (v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Bpreceding_char:
+	CASE (Bpreceding_char):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = Fprevious_char ();
 	    AFTER_POTENTIAL_GC ();
 	    PUSH (v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Bcurrent_column:
+	CASE (Bcurrent_column):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    XSETFASTINT (v1, (int) current_column ()); /* iftc */
 	    AFTER_POTENTIAL_GC ();
 	    PUSH (v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Bindent_to:
+	CASE (Bindent_to):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Findent_to (TOP, Qnil);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Beolp:
+	CASE (Beolp):
 	  PUSH (Feolp ());
-	  break;
+	  NEXT;
 
-	case Beobp:
+	CASE (Beobp):
 	  PUSH (Feobp ());
-	  break;
+	  NEXT;
 
-	case Bbolp:
+	CASE (Bbolp):
 	  PUSH (Fbolp ());
-	  break;
+	  NEXT;
 
-	case Bbobp:
+	CASE (Bbobp):
 	  PUSH (Fbobp ());
-	  break;
+	  NEXT;
 
-	case Bcurrent_buffer:
+	CASE (Bcurrent_buffer):
 	  PUSH (Fcurrent_buffer ());
-	  break;
+	  NEXT;
 
-	case Bset_buffer:
+	CASE (Bset_buffer):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fset_buffer (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Binteractive_p:
+	CASE (Binteractive_p):
 	  PUSH (Finteractive_p ());
-	  break;
+	  NEXT;
 
-	case Bforward_char:
+	CASE (Bforward_char):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fforward_char (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bforward_word:
+	CASE (Bforward_word):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fforward_word (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bskip_chars_forward:
+	CASE (Bskip_chars_forward):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fskip_chars_forward (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bskip_chars_backward:
+	CASE (Bskip_chars_backward):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fskip_chars_backward (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bforward_line:
+	CASE (Bforward_line):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fforward_line (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bchar_syntax:
+	CASE (Bchar_syntax):
 	  {
 	    int c;
 
@@ -1393,51 +1732,51 @@ If the third argument is incorrect, Emacs may crash.  */)
 	      MAKE_CHAR_MULTIBYTE (c);
 	    XSETFASTINT (TOP, syntax_code_spec[(int) SYNTAX (c)]);
 	  }
-	  break;
+	  NEXT;
 
-	case Bbuffer_substring:
+	CASE (Bbuffer_substring):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fbuffer_substring (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bdelete_region:
+	CASE (Bdelete_region):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fdelete_region (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bnarrow_to_region:
+	CASE (Bnarrow_to_region):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fnarrow_to_region (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bwiden:
+	CASE (Bwiden):
 	  BEFORE_POTENTIAL_GC ();
 	  PUSH (Fwiden ());
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bend_of_line:
+	CASE (Bend_of_line):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fend_of_line (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bset_marker:
+	CASE (Bset_marker):
 	  {
 	    Lisp_Object v1, v2;
 	    BEFORE_POTENTIAL_GC ();
@@ -1445,72 +1784,72 @@ If the third argument is incorrect, Emacs may crash.  */)
 	    v2 = POP;
 	    TOP = Fset_marker (TOP, v2, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bmatch_beginning:
+	CASE (Bmatch_beginning):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fmatch_beginning (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bmatch_end:
+	CASE (Bmatch_end):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fmatch_end (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bupcase:
+	CASE (Bupcase):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fupcase (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bdowncase:
+	CASE (Bdowncase):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fdowncase (TOP);
 	  AFTER_POTENTIAL_GC ();
-	break;
+	NEXT;
 
-	case Bstringeqlsign:
+      CASE (Bstringeqlsign):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fstring_equal (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bstringlss:
+	CASE (Bstringlss):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fstring_lessp (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bequal:
+	CASE (Bequal):
 	  {
 	    Lisp_Object v1;
 	    v1 = POP;
 	    TOP = Fequal (TOP, v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Bnthcdr:
+	CASE (Bnthcdr):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fnthcdr (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Belt:
+	CASE (Belt):
 	  {
 	    Lisp_Object v1, v2;
 	    if (CONSP (TOP))
@@ -1535,104 +1874,105 @@ If the third argument is incorrect, Emacs may crash.  */)
 		TOP = Felt (TOP, v1);
 		AFTER_POTENTIAL_GC ();
 	      }
-	    break;
+	    NEXT;
 	  }
 
-	case Bmember:
+	CASE (Bmember):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fmember (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bassq:
+	CASE (Bassq):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fassq (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bnreverse:
+	CASE (Bnreverse):
 	  BEFORE_POTENTIAL_GC ();
 	  TOP = Fnreverse (TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bsetcar:
+	CASE (Bsetcar):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fsetcar (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bsetcdr:
+	CASE (Bsetcdr):
 	  {
 	    Lisp_Object v1;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
 	    TOP = Fsetcdr (TOP, v1);
 	    AFTER_POTENTIAL_GC ();
-	    break;
+	    NEXT;
 	  }
 
-	case Bcar_safe:
+	CASE (Bcar_safe):
 	  {
 	    Lisp_Object v1;
 	    v1 = TOP;
 	    TOP = CAR_SAFE (v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Bcdr_safe:
+	CASE (Bcdr_safe):
 	  {
 	    Lisp_Object v1;
 	    v1 = TOP;
 	    TOP = CDR_SAFE (v1);
-	    break;
+	    NEXT;
 	  }
 
-	case Bnconc:
+	CASE (Bnconc):
 	  BEFORE_POTENTIAL_GC ();
 	  DISCARD (1);
 	  TOP = Fnconc (2, &TOP);
 	  AFTER_POTENTIAL_GC ();
-	  break;
+	  NEXT;
 
-	case Bnumberp:
+	CASE (Bnumberp):
 	  TOP = (NUMBERP (TOP) ? Qt : Qnil);
-	  break;
+	  NEXT;
 
-	case Bintegerp:
+	CASE (Bintegerp):
 	  TOP = INTEGERP (TOP) ? Qt : Qnil;
-	  break;
+	  NEXT;
 
 #ifdef BYTE_CODE_SAFE
-	case Bset_mark:
+	CASE (Bset_mark):
 	  BEFORE_POTENTIAL_GC ();
 	  error ("set-mark is an obsolete bytecode");
 	  AFTER_POTENTIAL_GC ();
-	  break;
-	case Bscan_buffer:
+	  NEXT;
+	CASE (Bscan_buffer):
 	  BEFORE_POTENTIAL_GC ();
 	  error ("scan-buffer is an obsolete bytecode");
 	  AFTER_POTENTIAL_GC ();
-	  break;
-#endif
-
-	case 0:
+	  NEXT;
+#else
+	CASE (Bset_mark):
+	CASE (Bscan_buffer):
 	  abort ();
+#endif
 
-	case 255:
-	default:
+	CASE_DEFAULT
+	CASE (Bconstant):
 #ifdef BYTE_CODE_SAFE
 	  if (op < Bconstant)
 	    {
@@ -1646,6 +1986,10 @@ If the third argument is incorrect, Emacs may crash.  */)
 #else
 	  PUSH (vectorp[op - Bconstant]);
 #endif
+	  NEXT;
+
+	CASE_ABORT:
+	  abort ();
 	}
     }
 




^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-17 16:46 indirect threading for bytecode interpreter Tom Tromey
@ 2009-09-17 18:21 ` Stefan Monnier
  2009-09-17 19:20 ` Helmut Eller
  2009-09-18 19:15 ` Dan Nicolaescu
  2 siblings, 0 replies; 20+ messages in thread
From: Stefan Monnier @ 2009-09-17 18:21 UTC (permalink / raw)
  To: tromey; +Cc: Emacs development discussions

> This patch implements indirect threading for the Emacs bytecode
> interpreter.  It uses a GNU C extension, and falls back to the current
> switch-based code if Emacs is compiled with some other compiler.

Looks good, thank you.  I'd prefer a rewrite of bytecode.c using vmgen,
since it should make the code (c)leaner (and eliminate the risk of
inconsistency between different parts of the code, like your sanity
checks implement).  But it's a good change.

I'd rather wait a little bit before installing it, tho:
there's a chance we may be able to switch to Bzr fairly soon, in which
case I think I'll prefer to install it in the Emacs-24 branch.


        Stefan





^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-17 16:46 indirect threading for bytecode interpreter Tom Tromey
  2009-09-17 18:21 ` Stefan Monnier
@ 2009-09-17 19:20 ` Helmut Eller
  2009-09-17 19:38   ` Tom Tromey
  2009-09-18 19:15 ` Dan Nicolaescu
  2 siblings, 1 reply; 20+ messages in thread
From: Helmut Eller @ 2009-09-17 19:20 UTC (permalink / raw)
  To: emacs-devel

* Tom Tromey [2009-09-17 18:46+0200] writes:

> This patch implements indirect threading for the Emacs bytecode
> interpreter.  It uses a GNU C extension, and falls back to the current
> switch-based code if Emacs is compiled with some other compiler.

According to the usual Forth terminology
http://en.wikipedia.org/wiki/Threaded_code
your patch implements token threading not indirect threading.

[Indirect threading uses pointers (to Forth-style "word headers") which
 contain the address of the machine code.  There are no such pointers in
 your patch.  At the machine code level, token threaded code uses a
 "token" as index in a table.  That's what your code is doing.  Unlike
 token threaded code, indirect threaded code is typically not position
 independent.]

Despite that, a few years ago somebody has already proposed that 
http://lists.gnu.org/archive/html/emacs-devel/2004-05/msg00095.html
and it wasn't deemed worthwhile back then.

Helmut





^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-17 19:20 ` Helmut Eller
@ 2009-09-17 19:38   ` Tom Tromey
  2009-09-17 20:41     ` Helmut Eller
  2009-09-17 20:43     ` Stefan Monnier
  0 siblings, 2 replies; 20+ messages in thread
From: Tom Tromey @ 2009-09-17 19:38 UTC (permalink / raw)
  To: Helmut Eller; +Cc: emacs-devel

>>>>> "Helmut" == Helmut Eller <eller.helmut@gmail.com> writes:

Helmut> Despite that, a few years ago somebody has already proposed that 
Helmut> http://lists.gnu.org/archive/html/emacs-devel/2004-05/msg00095.html
Helmut> and it wasn't deemed worthwhile back then.

That is not my take on that thread.  It just sort of petered out with no
results.  I didn't see any actual objection.


Also, the Emacs project has a problem with losing patches.
This is the second time I've reimplemented something only to find that a
nearly identical patch has been sitting in the archives for years.

Tom




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-17 19:38   ` Tom Tromey
@ 2009-09-17 20:41     ` Helmut Eller
  2009-09-17 21:06       ` Tom Tromey
  2009-09-17 20:43     ` Stefan Monnier
  1 sibling, 1 reply; 20+ messages in thread
From: Helmut Eller @ 2009-09-17 20:41 UTC (permalink / raw)
  To: emacs-devel

* Tom Tromey [2009-09-17 21:38+0200] writes:

>>>>>> "Helmut" == Helmut Eller <eller.helmut@gmail.com> writes:
>
> Helmut> Despite that, a few years ago somebody has already proposed that 
> Helmut> http://lists.gnu.org/archive/html/emacs-devel/2004-05/msg00095.html
> Helmut> and it wasn't deemed worthwhile back then.
>
> That is not my take on that thread.  It just sort of petered out with no
> results.  I didn't see any actual objection.

Hmm.. maybe I'm misremembering something.  But then such a change
introduces quite a bit of clutter and it's not so clear how much speedup
is required to justify that.  5% doesn't sound like a lot to some
people.  Well, at least not to me. :-) 

vmgen sounds like a good idea, but I fear that it makes the build
process quite a bit more complicated.

I'm wondering why gcc can't perform this transformation from the switch
based code.  Is there no compiler setting to skip the range check in the
switch statement?  Quite bizarre, considering C's philosophy of skipping
safety checks.

> Also, the Emacs project has a problem with losing patches.
> This is the second time I've reimplemented something only to find that a
> nearly identical patch has been sitting in the archives for years.

Looks like it.  Maybe things are better now with the bug tracker.

Helmut





^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-17 19:38   ` Tom Tromey
  2009-09-17 20:41     ` Helmut Eller
@ 2009-09-17 20:43     ` Stefan Monnier
  2009-09-17 20:57       ` Tom Tromey
  1 sibling, 1 reply; 20+ messages in thread
From: Stefan Monnier @ 2009-09-17 20:43 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Helmut Eller, emacs-devel

> Also, the Emacs project has a problem with losing patches.
> This is the second time I've reimplemented something only to find that a
> nearly identical patch has been sitting in the archives for years.

That's one of the reasons we setup a bugtracker.  I recommend you send
your patch there, to (try to) make sure it doesn't get lost.


        Stefan




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-17 20:43     ` Stefan Monnier
@ 2009-09-17 20:57       ` Tom Tromey
  0 siblings, 0 replies; 20+ messages in thread
From: Tom Tromey @ 2009-09-17 20:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Helmut Eller, emacs-devel

Stefan> That's one of the reasons we setup a bugtracker.  I recommend you send
Stefan> your patch there, to (try to) make sure it doesn't get lost.

Ok, I sent it.  Thanks.

Tom




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-17 20:41     ` Helmut Eller
@ 2009-09-17 21:06       ` Tom Tromey
  2009-09-17 22:48         ` Helmut Eller
  2009-09-18  0:59         ` Stefan Monnier
  0 siblings, 2 replies; 20+ messages in thread
From: Tom Tromey @ 2009-09-17 21:06 UTC (permalink / raw)
  To: Helmut Eller; +Cc: emacs-devel

>>>>> "Helmut" == Helmut Eller <eller.helmut@gmail.com> writes:

Helmut> 5% doesn't sound like a lot to some people.

Shrug.  Obviously I think the tradeoff is worth it, or I would not have
sent the patch.  I don't think the result is all that ugly.  And,
importantly, it is very low-hanging fruit.

Helmut> vmgen sounds like a good idea, but I fear that it makes the build
Helmut> process quite a bit more complicated.

You can check in the generated code.

vmgen is a nice idea.  I rejected writing this as a direct-threaded
interpreter because I assumed that the added memory use would be a bad
tradeoff.  But, if you are interested in that, perhaps I could take a
stab at it.

Helmut> I'm wondering why gcc can't perform this transformation from the
Helmut> switch based code.  Is there no compiler setting to skip the
Helmut> range check in the switch statement?

It isn't about range checking but about eliminating a jump during the
dispatch.

GCC could be taught to do this.  I imagine that it has always been
simpler for people to just update their interpreter than it has been to
try to fix GCC.

I don't think that some possible future GCC change should affect whether
this patch goes in.

Tom




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-17 21:06       ` Tom Tromey
@ 2009-09-17 22:48         ` Helmut Eller
  2009-09-18  0:59         ` Stefan Monnier
  1 sibling, 0 replies; 20+ messages in thread
From: Helmut Eller @ 2009-09-17 22:48 UTC (permalink / raw)
  To: emacs-devel

* Tom Tromey [2009-09-17 23:06+0200] writes:

>>>>>> "Helmut" == Helmut Eller <eller.helmut@gmail.com> writes:
>
> Helmut> 5% doesn't sound like a lot to some people.
>
> Shrug.  Obviously I think the tradeoff is worth it, or I would not have
> sent the patch.  I don't think the result is all that ugly.  And,
> importantly, it is very low-hanging fruit.

I just expressed my opinion.  I'm not the one to decide those issues.

> Helmut> vmgen sounds like a good idea, but I fear that it makes the build
> Helmut> process quite a bit more complicated.
>
> You can check in the generated code.

vmgen generates quite a few files, though.

> vmgen is a nice idea.  I rejected writing this as a direct-threaded
> interpreter because I assumed that the added memory use would be a bad
> tradeoff.  But, if you are interested in that, perhaps I could take a
> stab at it.

Yes, sounds a bit expensive memory-wise.  Especially on 64-bit machines.

> Helmut> I'm wondering why gcc can't perform this transformation from the
> Helmut> switch based code.  Is there no compiler setting to skip the
> Helmut> range check in the switch statement?
>
> It isn't about range checking but about eliminating a jump during the
> dispatch.

You mean the shared NEXT vs. copied NEXT issue.  Agreed, gcc can't
figure that out easily.

Helmut





^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-17 21:06       ` Tom Tromey
  2009-09-17 22:48         ` Helmut Eller
@ 2009-09-18  0:59         ` Stefan Monnier
  2009-09-18  2:59           ` Tom Tromey
  1 sibling, 1 reply; 20+ messages in thread
From: Stefan Monnier @ 2009-09-18  0:59 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Helmut Eller, emacs-devel

Helmut> 5% doesn't sound like a lot to some people.
> Shrug.  Obviously I think the tradeoff is worth it, or I would not have
> sent the patch.  I don't think the result is all that ugly.  And,
> importantly, it is very low-hanging fruit.

I agree it doesn't seem that ugly.  Looking at
http://lists.gnu.org/archive/html/emacs-devel/2004-05/txt1OKi7Cs5BI.txt
again, I like his use of

   # define OPLABL(X)   [X] = &&lbl_ ##X

to initialize the table, making sure that it's initialized correctly
(no need for your sanity checks).

Helmut> vmgen sounds like a good idea, but I fear that it makes the build
Helmut> process quite a bit more complicated.
> You can check in the generated code.

IIUC the code it generates may depend on the platform (tho, it can
probably output platform-independent code as well, I guess).

> vmgen is a nice idea.

Yes, and it could bring yet more optimization tricks, for free.

> I rejected writing this as a direct-threaded interpreter because
> I assumed that the added memory use would be a bad tradeoff.  But, if
> you are interested in that, perhaps I could take a stab at it.

I think a direct-threaded interpreter would take a bit more work,
because you need to replace the bytecode with "word-code".  I don't know
how much of an impact it would have on memory use.

Helmut> I'm wondering why gcc can't perform this transformation from the
Helmut> switch based code.  Is there no compiler setting to skip the
Helmut> range check in the switch statement?

> It isn't about range checking but about eliminating a jump during the
> dispatch.

Actually, IIRC Anton Ertl (vmgen's author) has some articles which
indicate that a big part of the win isn't just the removal of some
instructions, but more importantly the multiplication of "jump to next
target": instead of having only 1 computed jump to the next byte-code
target (plus N jumps back to the starting point), you have N computed
jumps, so each one can be predicted independently.  The single computed
jump in gcc's output code is terribly difficult for the CPU to predict,
leading to lots and lots of cycles wasted due to mispredictions.
The N computed jumps aren't very easy to predict either, but some of
them at least are a bit easier, because some sequences of byte-code are
more common than others, so the CPU's jump prediction can fail a bit
less often, leading to fewer wasted cycles.

Anton had some experiments where he duplicated some byte-codes, and
showed that it could also improve performance (again, by making the
jumps more predictable: the actual executed instructions were exactly
identical).

> GCC could be taught to do this.  I imagine that it has always been
> simpler for people to just update their interpreter than it has been to
> try to fix GCC.

IIRC, some people experimented with gcc to teach it to do this kind of
copy the initial jump to the end of each block, but IIRC it was
difficult for gcc to tell automatically when it was a good idea and when
it wasn't.  After all, by duplicating this code, you increase the code
size, and if each branch's prediction is pretty much identical, you
might be better off with a single jump so the prediction data from one
branch helps the other branches as well (as so it doesn't use up as
much space in the jump prediction table).

For interpreters it's almost always a good thing to do, because a lot of
execution time will be spent in this loop+switch, but in general it's
not that clear cut.

> I don't think that some possible future GCC change should affect whether
> this patch goes in.

No, indeed.


        Stefan




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-18  0:59         ` Stefan Monnier
@ 2009-09-18  2:59           ` Tom Tromey
  0 siblings, 0 replies; 20+ messages in thread
From: Tom Tromey @ 2009-09-18  2:59 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Helmut Eller, emacs-devel

>>>>> "Stefan" == Stefan Monnier <monnier@iro.umontreal.ca> writes:

Stefan>    # define OPLABL(X)   [X] = &&lbl_ ##X
Stefan> to initialize the table, making sure that it's initialized correctly
Stefan> (no need for your sanity checks).

Yeah, I considered that but then forgot that all this was gcc-specific
and rejected it due to a mistaken concern for portability.  Oops.

I will implement this.

Stefan> Yes, and it could bring yet more optimization tricks, for free.

Yeah.  I was also wondering about GNU lightning or one of the other jit
libraries.  But that is a whole other big project.

Stefan> Actually, IIRC Anton Ertl (vmgen's author) has some articles which
[...]

Delightful exposition, thanks.
Clearly I am behind on my Ertl reading.

Tom




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-17 16:46 indirect threading for bytecode interpreter Tom Tromey
  2009-09-17 18:21 ` Stefan Monnier
  2009-09-17 19:20 ` Helmut Eller
@ 2009-09-18 19:15 ` Dan Nicolaescu
  2009-09-18 20:26   ` Tom Tromey
  2 siblings, 1 reply; 20+ messages in thread
From: Dan Nicolaescu @ 2009-09-18 19:15 UTC (permalink / raw)
  To: tromey; +Cc: Emacs development discussions

Tom Tromey <tromey@redhat.com> writes:

  > This patch implements indirect threading for the Emacs bytecode
  > interpreter.  It uses a GNU C extension, and falls back to the current
  > switch-based code if Emacs is compiled with some other compiler.
  > 
  > This speeds up the bytecode interpreter a bit.  On my benchmark, it
  > improved the overall time between 4-7% (lots of noise in the runs).
  > That isn't bad when you consider that Fbyte_code was the 3rd biggest
  > time user (the first two being re_match_2_internal and mark_object).

If your patch applies to emacs-22.3 (quite likely), and you are curious,
you might want to test the performance there.  The overhead of
mark_object should be lower in that version (it's higher in 23 because
of the way charsets are represented).




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-18 19:15 ` Dan Nicolaescu
@ 2009-09-18 20:26   ` Tom Tromey
  2009-09-21  1:58     ` Dan Nicolaescu
  0 siblings, 1 reply; 20+ messages in thread
From: Tom Tromey @ 2009-09-18 20:26 UTC (permalink / raw)
  To: Dan Nicolaescu; +Cc: Emacs development discussions

>>>>> "Dan" == Dan Nicolaescu <dann@ics.uci.edu> writes:

Dan> If your patch applies to emacs-22.3 (quite likely), and you are curious,
Dan> you might want to test the performance there.

I probably won't get to this.

Dan> The overhead of mark_object should be lower in that version (it's
Dan> higher in 23 because of the way charsets are represented).

I would be interested if you, or anybody, has ideas on how to improve
the performance of mark_object.  That is, without a whole GC rewrite :-)

Tom




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-18 20:26   ` Tom Tromey
@ 2009-09-21  1:58     ` Dan Nicolaescu
  2009-09-21  3:17       ` Tom Tromey
  0 siblings, 1 reply; 20+ messages in thread
From: Dan Nicolaescu @ 2009-09-21  1:58 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Emacs development discussions

Tom Tromey <tromey@redhat.com> writes:

  > >>>>> "Dan" == Dan Nicolaescu <dann@ics.uci.edu> writes:
  > 
  > Dan> The overhead of mark_object should be lower in that version (it's
  > Dan> higher in 23 because of the way charsets are represented).
  > 
  > I would be interested if you, or anybody, has ideas on how to improve
  > the performance of mark_object.  That is, without a whole GC rewrite :-)

Not sure if much can be done about mark_object itself, the issue is
the amount of things that need to be marked.

Making the GC generational might help.  The first GC after startup does
200K mark_object calls, so if most of those don't need to be scanned for
each GC, then that might help.

If you have a testcase that shows GC problems, maybe posting the test
case would enticemore people look into it...
     
     




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-21  1:58     ` Dan Nicolaescu
@ 2009-09-21  3:17       ` Tom Tromey
  2009-09-21 13:13         ` Stefan Monnier
  0 siblings, 1 reply; 20+ messages in thread
From: Tom Tromey @ 2009-09-21  3:17 UTC (permalink / raw)
  To: Dan Nicolaescu; +Cc: Emacs development discussions

>>>>> "Dan" == Dan Nicolaescu <dann@ics.uci.edu> writes:

Dan> Making the GC generational might help.  The first GC after startup does
Dan> 200K mark_object calls, so if most of those don't need to be scanned for
Dan> each GC, then that might help.

Yeah.

Whatever happened to the Boehm GC branch?  That seems like the simplest
way to implement this feature.

Dan> If you have a testcase that shows GC problems, maybe posting the test
Dan> case would enticemore people look into it...
     
Here's a script I've been running on the GDB sources:

http://sourceware.org/ml/gdb-patches/2009-08/msg00390.html

I've profiled (using oprofile) Emacs running this a few times; typical
results show mark_object at 10-12% of runtime.  It is probably pretty
easy to reproduce this with a variety of elisp programs.

Tom




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-21  3:17       ` Tom Tromey
@ 2009-09-21 13:13         ` Stefan Monnier
  2009-09-21 13:21           ` David Kastrup
  2009-09-21 14:27           ` Tom Tromey
  0 siblings, 2 replies; 20+ messages in thread
From: Stefan Monnier @ 2009-09-21 13:13 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Dan Nicolaescu, Emacs development discussions

> I've profiled (using oprofile) Emacs running this a few times; typical
> results show mark_object at 10-12% of runtime.  It is probably pretty
> easy to reproduce this with a variety of elisp programs.

So no matter how hard we try, we won't gain more than about 10% speed-up
trying to optimize the GC.  Doesn't sound like a great motivator.


        Stefan




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-21 13:13         ` Stefan Monnier
@ 2009-09-21 13:21           ` David Kastrup
  2009-09-21 13:47             ` joakim
  2009-09-21 14:46             ` Stefan Monnier
  2009-09-21 14:27           ` Tom Tromey
  1 sibling, 2 replies; 20+ messages in thread
From: David Kastrup @ 2009-09-21 13:21 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> I've profiled (using oprofile) Emacs running this a few times;
>> typical results show mark_object at 10-12% of runtime.  It is
>> probably pretty easy to reproduce this with a variety of elisp
>> programs.
>
> So no matter how hard we try, we won't gain more than about 10%
> speed-up trying to optimize the GC.  Doesn't sound like a great
> motivator.

The 10% occur in bursts of unresponsiveness.  If such a burst is halved
in size, the total runtime improvement will not be impressive, but the
responsiveness gets quite better.

-- 
David Kastrup





^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-21 13:21           ` David Kastrup
@ 2009-09-21 13:47             ` joakim
  2009-09-21 14:46             ` Stefan Monnier
  1 sibling, 0 replies; 20+ messages in thread
From: joakim @ 2009-09-21 13:47 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

David Kastrup <dak@gnu.org> writes:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> I've profiled (using oprofile) Emacs running this a few times;
>>> typical results show mark_object at 10-12% of runtime.  It is
>>> probably pretty easy to reproduce this with a variety of elisp
>>> programs.
>>
>> So no matter how hard we try, we won't gain more than about 10%
>> speed-up trying to optimize the GC.  Doesn't sound like a great
>> motivator.
>
> The 10% occur in bursts of unresponsiveness.  If such a burst is halved
> in size, the total runtime improvement will not be impressive, but the
> responsiveness gets quite better.

Unresponsiveness is the main cause of intestinal agony, so seems quite
worthwile to me.

-- 
Joakim Verona




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-21 13:13         ` Stefan Monnier
  2009-09-21 13:21           ` David Kastrup
@ 2009-09-21 14:27           ` Tom Tromey
  1 sibling, 0 replies; 20+ messages in thread
From: Tom Tromey @ 2009-09-21 14:27 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Dan Nicolaescu, Emacs development discussions

>>>>> "Stefan" == Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> I've profiled (using oprofile) Emacs running this a few times; typical
>> results show mark_object at 10-12% of runtime.  It is probably pretty
>> easy to reproduce this with a variety of elisp programs.

Stefan> So no matter how hard we try, we won't gain more than about 10% speed-up
Stefan> trying to optimize the GC.  Doesn't sound like a great motivator.

Yeah, though it does vary by workload.

re_match_2_internal is actually the top time user in my profiles, using
17-25% of the runtime.

Tom




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: indirect threading for bytecode interpreter
  2009-09-21 13:21           ` David Kastrup
  2009-09-21 13:47             ` joakim
@ 2009-09-21 14:46             ` Stefan Monnier
  1 sibling, 0 replies; 20+ messages in thread
From: Stefan Monnier @ 2009-09-21 14:46 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

>>> I've profiled (using oprofile) Emacs running this a few times;
>>> typical results show mark_object at 10-12% of runtime.  It is
>>> probably pretty easy to reproduce this with a variety of elisp
>>> programs.
>> So no matter how hard we try, we won't gain more than about 10%
>> speed-up trying to optimize the GC.  Doesn't sound like a great
>> motivator.
> The 10% occur in bursts of unresponsiveness.  If such a burst is halved
> in size, the total runtime improvement will not be impressive, but the
> responsiveness gets quite better.

That is true.  To help solve this, we should aim to move towards
a concurrent GC (i.e. a GC that's actually likely to be slower, but
won't cause unresponsiveness).


        Stefan




^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2009-09-21 14:46 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-09-17 16:46 indirect threading for bytecode interpreter Tom Tromey
2009-09-17 18:21 ` Stefan Monnier
2009-09-17 19:20 ` Helmut Eller
2009-09-17 19:38   ` Tom Tromey
2009-09-17 20:41     ` Helmut Eller
2009-09-17 21:06       ` Tom Tromey
2009-09-17 22:48         ` Helmut Eller
2009-09-18  0:59         ` Stefan Monnier
2009-09-18  2:59           ` Tom Tromey
2009-09-17 20:43     ` Stefan Monnier
2009-09-17 20:57       ` Tom Tromey
2009-09-18 19:15 ` Dan Nicolaescu
2009-09-18 20:26   ` Tom Tromey
2009-09-21  1:58     ` Dan Nicolaescu
2009-09-21  3:17       ` Tom Tromey
2009-09-21 13:13         ` Stefan Monnier
2009-09-21 13:21           ` David Kastrup
2009-09-21 13:47             ` joakim
2009-09-21 14:46             ` Stefan Monnier
2009-09-21 14:27           ` Tom Tromey

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).