unofficial mirror of bug-gnu-emacs@gnu.org 
From: Kenichi Handa <handa@m17n.org>
To: 3208@emacsbugs.donarmstrong.com
Cc: schierlm@gmx.de, cyd@stupidchicken.com
Subject: bug#3208: 23.0.93; Memory full / crash when displaying lots of characters from a large font (like Arial Unicode or Code2000) which is not explicitly selected (on Win32)
Date: Thu, 02 Jul 2009 21:13:12 +0900
Message-ID: <E1MMLA8-0007F3-D6@etlken>
In-Reply-To: <E1MK0Cs-0006cp-AB@etlken> (message from Kenichi Handa on Fri, 26 Jun 2009 10:26:22 +0900)

Attached is a new patch I propose for 23.1 to solve the bug.
It is not the best one, but is simpler and safer.

I'll summarize the problem here.  Yidong and Stefan, please
consider installing it.

Emacs searches for a font for a character C in this order:

(1) search the font-group for C in the current fontset.
(2) search the font-group for C in the default fontset.
(3) search the fallback font-group of the current fontset.
(4) search the fallback font-group of the default fontset.
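
As a rough illustration of the order above, here is a minimal
sketch in C.  It is not the Emacs code: struct font_group,
struct fontset, group_has_char and find_font_step are
hypothetical names, and a font-group is modeled as a plain
code-point range just to keep the example small.

```c
#include <assert.h>

/* Hypothetical stand-in for a font-group: the characters it can
   display are modeled as one inclusive code-point range.  */
struct font_group { int lo, hi; };

/* Hypothetical fontset: a font-group for the character plus a
   fallback font-group.  */
struct fontset
{
  struct font_group group;
  struct font_group fallback;
};

static int
group_has_char (const struct font_group *g, int c)
{
  return g->lo <= c && c <= g->hi;
}

/* Return the step (1..4) at which a font for C is found, or 0 if
   none of the four steps succeeds.  */
static int
find_font_step (const struct fontset *current,
                const struct fontset *deflt, int c)
{
  assert (current != 0 && deflt != 0);
  if (group_has_char (&current->group, c))
    return 1;
  if (group_has_char (&deflt->group, c))
    return 2;
  if (group_has_char (&current->fallback, c))
    return 3;
  if (group_has_char (&deflt->fallback, c))
    return 4;
  return 0;
}
```

A character that is only covered at step (4) pays for the three
failed steps before it, which is why that case matters below.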

The problem occurs when a very long line contains
characters whose font is found only in the second or a later
entry of the font-group at step (4).  There are actually two
problems: a memory-full error and extreme slowness.  They
typically happen on Windows because both the uniscribe and
gdi font-backends return many fonts, but in theory they can
also happen on GNU/Linux.

First, the reason for the memory-full error is this.

When fonts are found in multiple font-backends, Emacs
concatenates them into a single vector, sorts the elements
of the vector, then returns the best matching font.  This is
done for each character in a line.  But, as Emacs doesn't
perform GC while displaying that long line, the created (and
then abandoned) vectors consume memory without being
reused.  This results in the memory-full error.

The changes to font.c in the patch solve this.  The
strategy is to cache vectors of specific sizes in a weak
hash table for reuse until the next GC.
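
The idea can be sketched outside Emacs as follows.  This is
only an illustration under assumptions: plain int arrays stand
in for font-entity vectors, a fixed array indexed by length
stands in for the weak hash table, and clear_vector_cache
stands in for what GC does to it.  MAX_CACHED_LEN,
vector_cache, concat_cached and clear_vector_cache are
hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CACHED_LEN 64   /* hypothetical bound on cached sizes */

/* One cached buffer per total length; NULL means "none yet".  */
static int *vector_cache[MAX_CACHED_LEN + 1];

/* Concatenate NLISTS int arrays into one buffer whose length is
   the sum of LENS.  If a buffer of that length is already
   cached, overwrite and return it instead of allocating.  */
static int *
concat_cached (int **lists, const size_t *lens, size_t nlists,
               size_t *total_out)
{
  size_t total = 0, i, j = 0;

  for (i = 0; i < nlists; i++)
    total += lens[i];
  assert (total <= MAX_CACHED_LEN);

  int *vec = vector_cache[total];
  if (vec == NULL)
    {
      /* First request for this size: allocate and remember it.  */
      vec = malloc ((total ? total : 1) * sizeof *vec);
      assert (vec != NULL);
      vector_cache[total] = vec;
    }
  for (i = 0; i < nlists; i++)
    {
      memcpy (vec + j, lists[i], lens[i] * sizeof *vec);
      j += lens[i];
    }
  *total_out = total;
  return vec;
}

/* Stand-in for GC: drop every cached buffer.  */
static void
clear_vector_cache (void)
{
  for (size_t i = 0; i <= MAX_CACHED_LEN; i++)
    {
      free (vector_cache[i]);
      vector_cache[i] = NULL;
    }
}
```

Note that in this sketch a second call with the same total
length overwrites the buffer returned by the first call, so
the caller must be finished with the old contents by then.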

Next, the reason for the slowness is this.

When a font is found by the second or a later spec in the
fallback font-group, Emacs first sorts the fonts matching
the first spec in the font-group, then checks one by one
whether they support C, even though this ends in
no-font-for-C.  This is because Emacs doesn't remember which
font-spec in the font-group can return a suitable font for
C.  We can't remember such info for each character because
it would require lots of memory.  So this sorting and
checking is done for each character in that long line, and
it's very slow.

The changes to fontset.c in the patch solve this by
re-ordering the entries in the font-group so that the most
recently used entry comes first.  We do this re-ordering
only for a fallback font-group, assuming that the order is
not that important in a fallback font-group.  For a
font-group corresponding to a character, the font-specs are
usually more specific and the number of matching fonts is
small, so the above sorting and checking is not that heavy.
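
The re-ordering is a move-to-front step.  A minimal sketch in
plain C, assuming an int array in place of the vector of
font-group entries (move_to_front and find_and_promote are
hypothetical names, not functions in fontset.c):

```c
#include <assert.h>
#include <stddef.h>

/* Move the element at FOUND_INDEX to the front of VEC, shifting
   the elements before it one slot to the right.  The relative
   order of the other elements is preserved.  */
static void
move_to_front (int *vec, size_t len, size_t found_index)
{
  assert (found_index < len);
  int found = vec[found_index];
  for (size_t i = found_index; i > 0; i--)
    vec[i] = vec[i - 1];
  vec[0] = found;
}

/* Linear search that promotes a hit to the front.  Return the
   index at which KEY was found before promotion, or -1 if KEY
   is absent.  */
static long
find_and_promote (int *vec, size_t len, int key)
{
  for (size_t i = 0; i < len; i++)
    if (vec[i] == key)
      {
        move_to_front (vec, len, i);
        return (long) i;
      }
  return -1;
}
```

After one hit on an entry, the next search for the same entry
stops at index 0, which is the effect wanted for a long run of
characters served by the same fallback entry.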

By the way, I'm going to install a different change (bigger
but better) for 23.2.

---
Kenichi Handa
handa@m17n.org

2009-07-02  Kenichi Handa  <handa@m17n.org>

	* font.c (font_entity_vector_cache): New variable.
	(syms_of_font): Initialize it.
	(font_concat_entities): New function.
	(font_list_entities): Use font_concat_entities instead of
	Fvconcat.

	* fontset.c (fontset_find_font): Re-order a fallback font-group.
	(fontset_font): Treat return value t of fontset_find_font as a
	sign of no-font in that fontset, not as a sign of no-font
	anywhere.

Index: font.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/font.c,v
retrieving revision 1.133
diff -u -r1.133 font.c
--- font.c	10 Jun 2009 01:26:15 -0000	1.133
+++ font.c	2 Jul 2009 11:13:16 -0000
@@ -2763,6 +2763,33 @@
   return Fnreverse (val);
 }
 
+static Lisp_Object font_entity_vector_cache;
+
+/* Concatenate lists of font-entities in VEC_ENTITY_LIST of length LEN.  */
+
+static Lisp_Object
+font_concat_entities (Lisp_Object *vec_entity_list, int len)
+{
+  int i, j, num;
+  Lisp_Object vec, tail;
+  
+  for (i = 0, num = 0; i < len; i++)
+    num += XINT (Flength (vec_entity_list[i]));
+  vec = Fgethash (make_number (num), font_entity_vector_cache, Qnil);
+  if (NILP (vec))
+    {
+      vec = Fvconcat (len, vec_entity_list);
+      Fputhash (make_number (num), vec, font_entity_vector_cache);
+    }
+  else
+    {
+      for (i = 0, j = 0; i < len; i++)
+	for (tail = vec_entity_list[i]; CONSP (tail); tail = XCDR (tail), j++)
+	  ASET (vec, j, XCAR (tail));
+    }
+  return vec;
+}
+
 
 /* Return a vector of font-entities matching with SPEC on FRAME.  */
 
@@ -2831,7 +2858,7 @@
 	  vec[i++] = val;
       }
 
-  val = (i > 0 ? Fvconcat (i, vec) : null_vector);
+  val = (i > 0 ? font_concat_entities (vec, i) : null_vector);
   font_add_log ("list", spec, val);
   return (val);
 }
@@ -5178,6 +5205,14 @@
   staticpro (&Vfont_log_deferred);
   Vfont_log_deferred = Fmake_vector (make_number (3), Qnil);
 
+  staticpro (&font_entity_vector_cache);
+  { /* Here we rely on the fact that syms_of_font is called fairly
+       late, when QCweakness is known to be set.  */
+    Lisp_Object args[2];
+    args[0] = QCweakness;
+    args[1] = Qt;
+    font_entity_vector_cache = Fmake_hash_table (2, args);
+  }
 #if 0
 #ifdef HAVE_LIBOTF
   staticpro (&otf_list);
Index: fontset.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/fontset.c,v
retrieving revision 1.173
diff -u -r1.173 fontset.c
--- fontset.c	8 Jun 2009 04:33:40 -0000	1.173
+++ fontset.c	2 Jul 2009 11:06:14 -0000
@@ -525,6 +525,8 @@
 {
   Lisp_Object vec, font_group;
   int i, charset_matched = -1;
+  Lisp_Object rfont_def;
+  int found_index;
   FRAME_PTR f = (FRAMEP (FONTSET_FRAME (fontset)))
     ? XFRAME (selected_frame) : XFRAME (FONTSET_FRAME (fontset));
 
@@ -564,21 +566,22 @@
   /* Find the first available font in the vector of RFONT-DEF.  */
   for (i = 0; i < ASIZE (vec); i++)
     {
-      Lisp_Object rfont_def, font_def;
+      Lisp_Object font_def;
       Lisp_Object font_entity, font_object;
 
       if (i == 0 && charset_matched >= 0)
 	{
 	  /* Try the element matching with the charset ID at first.  */
-	  rfont_def = AREF (vec, charset_matched);
+	  found_index = charset_matched;
 	  charset_matched = -1;
 	  i--;
 	}
       else if (i != charset_matched)
-	rfont_def = AREF (vec, i);
+	found_index = i;
       else
 	continue;
 
+      rfont_def = AREF (vec, found_index);
       if (NILP (rfont_def))
 	/* This is a sign of not to try the other fonts.  */
 	return Qt;
@@ -623,7 +626,7 @@
 	}
 
       if (font_has_char (f, font_object, c))
-	return rfont_def;
+	goto found;
 
       /* Find a font already opened, maching with the current spec,
 	 and supporting C. */
@@ -637,7 +640,7 @@
 	    break;
 	  font_object = RFONT_DEF_OBJECT (AREF (vec, i));
 	  if (! NILP (font_object) && font_has_char (f, font_object, c))
-	    return rfont_def;
+	    goto found;
 	}
 
       /* Find a font-entity with the current spec and supporting C.  */
@@ -661,10 +664,12 @@
 	  for (j = 0; j < i; j++)
 	    ASET (new_vec, j, AREF (vec, j));
 	  ASET (new_vec, j, rfont_def);
+	  found_index = j;
 	  for (j++; j < ASIZE (new_vec); j++)
 	    ASET (new_vec, j, AREF (vec, j - 1));
 	  XSETCDR (font_group, new_vec);
-	  return rfont_def;
+	  vec = new_vec;
+	  goto found;
 	}
 
       /* No font of the current spec for C.  Try the next spec.  */
@@ -673,6 +678,20 @@
 
   FONTSET_SET (fontset, make_number (c), make_number (0));
   return Qnil;
+
+ found:
+  if (fallback && found_index > 0)
+    {
+      /* The order of fonts in the fallback font-group is not that
+	 important, and it is better to move the found font to the
+	 first of the group so that the next try will find it
+	 quickly. */
+      for (i = found_index; i > 0; i--)
+	ASET (vec, i, AREF (vec, i - 1));
+      ASET (vec, 0, rfont_def);
+      found_index = 0;
+    }
+  return rfont_def;
 }
 
 
@@ -685,13 +704,14 @@
 {
   Lisp_Object rfont_def;
   Lisp_Object base_fontset;
+  int try_fallback = 0, try_default_fallback = 0;
 
   /* Try a font-group of FONTSET. */
   rfont_def = fontset_find_font (fontset, c, face, id, 0);
   if (VECTORP (rfont_def))
     return rfont_def;
-  if (EQ (rfont_def, Qt))
-    goto no_font;
+  if (! EQ (rfont_def, Qt))
+    try_fallback = 1;
 
   /* Try a font-group of the default fontset. */
   base_fontset = FONTSET_BASE (fontset);
@@ -703,29 +723,30 @@
       rfont_def = fontset_find_font (FONTSET_DEFAULT (fontset), c, face, id, 0);
       if (VECTORP (rfont_def))
 	return rfont_def;
-      if (EQ (rfont_def, Qt))
-	goto no_font;
+      if (! EQ (rfont_def, Qt))
+	try_default_fallback = 1;
     }
 
   /* Try a fallback font-group of FONTSET. */
-  rfont_def = fontset_find_font (fontset, c, face, id, 1);
-  if (VECTORP (rfont_def))
-    return rfont_def;
-  if (EQ (rfont_def, Qt))
-    goto no_font;
+  if (try_fallback)
+    {
+      rfont_def = fontset_find_font (fontset, c, face, id, 1);
+      if (VECTORP (rfont_def))
+	return rfont_def;
+      /* Remember that FONTSET has no font for C.  */
+      FONTSET_SET (fontset, make_number (c), Qt);
+    }
 
   /* Try a fallback font-group of the default fontset . */
-  if (! EQ (base_fontset, Vdefault_fontset))
+  if (try_default_fallback)
     {
       rfont_def = fontset_find_font (FONTSET_DEFAULT (fontset), c, face, id, 1);
       if (VECTORP (rfont_def))
 	return rfont_def;
+      /* Remember that the default fontset has no font for C.  */
+      FONTSET_SET (FONTSET_DEFAULT (fontset), make_number (c), Qt);
     }
 
- no_font:
-  /* Remember that we have no font for C.  */
-  FONTSET_SET (fontset, make_number (c), Qt);
-
   return Qnil;
 }
 