From: Kenichi Handa
Newsgroups: gmane.emacs.bugs
Subject: bug#3208: 23.0.93; Memory full / crash when displaying lots of characters from a large font (like Arial Unicode or Code2000) which is not explicitly selected (on Win32)
Date: Thu, 02 Jul 2009 21:13:12 +0900
References: <49FF3340.2040008@gmx.de> <4A005A64.5050908@gnu.org> <4A3F1B05.7030105@gnu.org> <4A3F7058.902@gnu.org> <4A3F81AC.1070404@gnu.org> <4A420357.8050706@gnu.org> <4A422909.9060800@gnu.org> <4A4379F7.4000100@gnu.org>
Reply-To: Kenichi Handa , 3208@emacsbugs.donarmstrong.com
Cc: schierlm@gmx.de, cyd@stupidchicken.com
To: 3208@emacsbugs.donarmstrong.com
X-Emacs-PR-Message: followup 3208
X-Emacs-PR-Package: emacs,w32
In-reply-to: (message from Kenichi Handa on Fri, 26 Jun 2009 10:26:22 +0900)

Attached is a new patch I propose for 23.1 to solve the bug. It is not the best one, but it is simpler and safer. I'll summarize the problem here. Yidong and Stefan, please consider installing it.

Emacs searches for a font for a character C in this order:

(1) search a font-group for C in the current fontset.
(2) search a font-group for C in the default fontset.
(3) search a fallback font-group of the current fontset.
(4) search a fallback font-group of the default fontset.
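
To make the search order above concrete, here is a minimal standalone sketch in C. It is not Emacs code: `char_range`, `font_group`, `fontset`, and `find_font_step` are hypothetical names, and a font-group is reduced to the code-point ranges it covers, which is a strong simplification of what a real font-group holds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a font-group: the code-point ranges that
   its fonts cover.  Real font-groups hold font-specs, not ranges.  */
typedef struct { int lo, hi; } char_range;

typedef struct {
  const char_range *ranges;
  size_t n_ranges;
} font_group;

/* Simplification: one regular group and one fallback group per fontset.  */
typedef struct {
  font_group group;
  font_group fallback;
} fontset;

static int
group_has_char (const font_group *g, int c)
{
  for (size_t i = 0; i < g->n_ranges; i++)
    if (c >= g->ranges[i].lo && c <= g->ranges[i].hi)
      return 1;
  return 0;
}

/* Return the step (1..4) at which a font for C is found, or 0 if none.  */
int
find_font_step (const fontset *current, const fontset *dflt, int c)
{
  assert (current != NULL && dflt != NULL);
  const font_group *order[4] = {
    &current->group,     /* (1) font-group in the current fontset */
    &dflt->group,        /* (2) font-group in the default fontset */
    &current->fallback,  /* (3) fallback group of the current fontset */
    &dflt->fallback      /* (4) fallback group of the default fontset */
  };
  for (int step = 0; step < 4; step++)
    if (group_has_char (order[step], c))
      return step + 1;
  return 0;
}
```

A character that is only covered at step (4) pays for the three failed lookups before it on every call, which is the situation the rest of this message is about.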

The problem occurs when there's a very long line that contains characters whose fonts are found only in the second or later font in the group at step (4). Actually there are two problems: memory full and extreme slowness. They typically happen on Windows because both the uniscribe and gdi font-backends return many fonts, but theoretically they can also happen on GNU/Linux.

First, the reason for the memory full error is this. When sets of fonts are found in multiple font-backends, Emacs concatenates them into a single vector, sorts the elements of the vector, then returns the best matching font. This is done for each character in a line. But, as Emacs doesn't perform GC while displaying that long line, the created (and then abandoned) vectors consume memory without being reused. This results in a memory full error. The changes to font.c in the patch solve it. The strategy is to cache vectors of specific sizes in a weak hash table for reuse until the next GC.

Next, the reason for the slowness is this. When a font is found in the second or a later spec in the fallback font-group, Emacs sorts the fonts matching the first spec in the font-group, then checks one by one whether those fonts support C, even if the result is no-font-for-C. This is because Emacs doesn't remember which font-spec in the font-group can return a suitable font for C. We can't remember such info for each character because it requires lots of memory. So this sorting and checking is done for each character in that long line, and it's very slow. The changes to fontset.c in the patch solve it by re-ordering the entries in the font-group so that the most recently used entry comes first. We do this re-ordering only for a fallback font-group, assuming that the order is not that important there. For a font-group corresponding to a character, the font-specs are usually more specific and the number of matching fonts is small, so the above sorting and checking is not that heavy.
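
The re-ordering idea can be illustrated outside Emacs with a plain array. The sketch below is only an illustration under assumptions: the font-group is modelled as an array of `int` keys, and `move_to_front` and `lookup_move_to_front` are hypothetical names, not functions from fontset.c.

```c
#include <assert.h>
#include <stddef.h>

/* Move vec[found_index] to the front, shifting the entries that were
   before it one slot to the right.  */
void
move_to_front (int *vec, size_t len, size_t found_index)
{
  assert (found_index < len);
  int found = vec[found_index];
  for (size_t i = found_index; i > 0; i--)
    vec[i] = vec[i - 1];
  vec[0] = found;
}

/* Linear search that moves a hit to the front.  Return the number of
   entries examined, or 0 if KEY is absent.  */
size_t
lookup_move_to_front (int *vec, size_t len, int key)
{
  for (size_t i = 0; i < len; i++)
    if (vec[i] == key)
      {
        if (i > 0)
          move_to_front (vec, len, i);
        return i + 1;
      }
  return 0;
}
```

With entries { 10, 20, 30, 40 }, the first lookup of 30 examines three entries and the second examines one. A long run of characters served by the same late entry therefore pays the full search cost only once, at the price of changing the order of the group.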

By the way, I'm going to install a different change (bigger but better) for 23.2.

---
Kenichi Handa
handa@m17n.org

2009-07-02  Kenichi Handa

    * font.c (font_entity_vector_cache): New variable.
    (syms_of_font): Initialize it.
    (font_concat_entities): New function.
    (font_list_entities): Use font_concat_entities instead of
    Fvconcat.

    * fontset.c (fontset_find_font): Re-order a fallback font-group.
    (fontset_font): Treat return value t of fontset_find_font as a
    sign of no-font in that fontset, not as a sign of no-font
    anywhere.

Index: font.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/font.c,v
retrieving revision 1.133
diff -u -r1.133 font.c
--- font.c  10 Jun 2009 01:26:15 -0000  1.133
+++ font.c  2 Jul 2009 11:13:16 -0000
@@ -2763,6 +2763,33 @@
   return Fnreverse (val);
 }
 
+static Lisp_Object font_entity_vector_cache;
+
+/* Concatenate lists of font-entities in VEC_ENTITY_LIST or length LEN.  */
+
+static Lisp_Object
+font_concat_entities (Lisp_Object *vec_entity_list, int len)
+{
+  int i, j, num;
+  Lisp_Object vec, tail;
+
+  for (i = 0, num = 0; i < len; i++)
+    num += XINT (Flength (vec_entity_list[i]));
+  vec = Fgethash (make_number (num), font_entity_vector_cache, Qnil);
+  if (NILP (vec))
+    {
+      vec = Fvconcat (len, vec_entity_list);
+      Fputhash (make_number (num), vec, font_entity_vector_cache);
+    }
+  else
+    {
+      for (i = 0, j = 0; i < len; i++)
+        for (tail = vec_entity_list[i]; CONSP (tail); tail = XCDR (tail), j++)
+          ASET (vec, j, XCAR (tail));
+    }
+  return vec;
+}
+
 
 /* Return a vector of font-entities matching with SPEC on FRAME.  */
 
@@ -2831,7 +2858,7 @@
         vec[i++] = val;
     }
 
-  val = (i > 0 ? Fvconcat (i, vec) : null_vector);
+  val = (i > 0 ? font_concat_entities (vec, i) : null_vector);
   font_add_log ("list", spec, val);
   return (val);
 }
@@ -5178,6 +5205,14 @@
   staticpro (&Vfont_log_deferred);
   Vfont_log_deferred = Fmake_vector (make_number (3), Qnil);
 
+  staticpro (&font_entity_vector_cache);
+  { /* Here we rely on the fact that syms_of_font is called fairly
+       late, when QCweakness is known to be set.  */
+    Lisp_Object args[2];
+    args[0] = QCweakness;
+    args[1] = Qt;
+    font_entity_vector_cache = Fmake_hash_table (2, args);
+  }
 #if 0
 #ifdef HAVE_LIBOTF
   staticpro (&otf_list);
Index: fontset.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/fontset.c,v
retrieving revision 1.173
diff -u -r1.173 fontset.c
--- fontset.c  8 Jun 2009 04:33:40 -0000  1.173
+++ fontset.c  2 Jul 2009 11:06:14 -0000
@@ -525,6 +525,8 @@
 {
   Lisp_Object vec, font_group;
   int i, charset_matched = -1;
+  Lisp_Object rfont_def;
+  int found_index;
   FRAME_PTR f = (FRAMEP (FONTSET_FRAME (fontset)))
     ? XFRAME (selected_frame) : XFRAME (FONTSET_FRAME (fontset));
 
@@ -564,21 +566,22 @@
   /* Find the first available font in the vector of RFONT-DEF.  */
   for (i = 0; i < ASIZE (vec); i++)
     {
-      Lisp_Object rfont_def, font_def;
+      Lisp_Object font_def;
       Lisp_Object font_entity, font_object;
 
       if (i == 0 && charset_matched >= 0)
         {
           /* Try the element matching with the charset ID at first.  */
-          rfont_def = AREF (vec, charset_matched);
+          found_index = charset_matched;
           charset_matched = -1;
           i--;
         }
       else if (i != charset_matched)
-        rfont_def = AREF (vec, i);
+        found_index = i;
       else
         continue;
 
+      rfont_def = AREF (vec, found_index);
       if (NILP (rfont_def))
         /* This is a sign of not to try the other fonts.  */
         return Qt;
@@ -623,7 +626,7 @@
         }
 
       if (font_has_char (f, font_object, c))
-        return rfont_def;
+        goto found;
 
       /* Find a font already opened, maching with the current spec,
          and supporting C. */
@@ -637,7 +640,7 @@
             break;
           font_object = RFONT_DEF_OBJECT (AREF (vec, i));
           if (! NILP (font_object) && font_has_char (f, font_object, c))
-            return rfont_def;
+            goto found;
         }
 
       /* Find a font-entity with the current spec and supporting C.  */
@@ -661,10 +664,12 @@
           for (j = 0; j < i; j++)
             ASET (new_vec, j, AREF (vec, j));
           ASET (new_vec, j, rfont_def);
+          found_index = j;
           for (j++; j < ASIZE (new_vec); j++)
             ASET (new_vec, j, AREF (vec, j - 1));
           XSETCDR (font_group, new_vec);
-          return rfont_def;
+          vec = new_vec;
+          goto found;
         }
 
       /* No font of the current spec for C.  Try the next spec.  */
@@ -673,6 +678,20 @@
 
   FONTSET_SET (fontset, make_number (c), make_number (0));
   return Qnil;
+
+ found:
+  if (fallback && found_index > 0)
+    {
+      /* The order of fonts in the fallback font-group is not that
+         important, and it is better to move the found font to the
+         first of the group so that the next try will find it
+         quickly. */
+      for (i = found_index; i > 0; i--)
+        ASET (vec, i, AREF (vec, i - 1));
+      ASET (vec, 0, rfont_def);
+      found_index = 0;
+    }
+  return rfont_def;
 }
 
 
@@ -685,13 +704,14 @@
 {
   Lisp_Object rfont_def;
   Lisp_Object base_fontset;
+  int try_fallback = 0, try_default_fallback = 0;
 
   /* Try a font-group of FONTSET. */
   rfont_def = fontset_find_font (fontset, c, face, id, 0);
   if (VECTORP (rfont_def))
     return rfont_def;
-  if (EQ (rfont_def, Qt))
-    goto no_font;
+  if (! EQ (rfont_def, Qt))
+    try_fallback = 1;
 
   /* Try a font-group of the default fontset. */
   base_fontset = FONTSET_BASE (fontset);
@@ -703,29 +723,30 @@
       rfont_def = fontset_find_font (FONTSET_DEFAULT (fontset), c, face, id, 0);
       if (VECTORP (rfont_def))
         return rfont_def;
-      if (EQ (rfont_def, Qt))
-        goto no_font;
+      if (! EQ (rfont_def, Qt))
+        try_default_fallback = 1;
     }
 
   /* Try a fallback font-group of FONTSET. */
-  rfont_def = fontset_find_font (fontset, c, face, id, 1);
-  if (VECTORP (rfont_def))
-    return rfont_def;
-  if (EQ (rfont_def, Qt))
-    goto no_font;
+  if (try_fallback)
+    {
+      rfont_def = fontset_find_font (fontset, c, face, id, 1);
+      if (VECTORP (rfont_def))
+        return rfont_def;
+      /* Remember that FONTSET has no font for C.  */
+      FONTSET_SET (fontset, make_number (c), Qt);
+    }
 
   /* Try a fallback font-group of the default fontset . */
-  if (! EQ (base_fontset, Vdefault_fontset))
+  if (try_default_fallback)
     {
       rfont_def = fontset_find_font (FONTSET_DEFAULT (fontset), c, face, id, 1);
       if (VECTORP (rfont_def))
         return rfont_def;
+      /* Remember that the default fontset has no font for C.  */
+      FONTSET_SET (FONTSET_DEFAULT (fontset), make_number (c), Qt);
     }
 
- no_font:
-  /* Remember that we have no font for C.  */
-  FONTSET_SET (fontset, make_number (c), Qt);
-
   return Qnil;
 }
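
For readers who do not know the Emacs internals, the vector reuse done by font_concat_entities in the font.c part of the patch can be approximated in standalone C. This is a rough sketch under assumptions, not the patch itself: a fixed array indexed by total length stands in for the weak hash table, `clear_cache` stands in for what GC does to that table, and `concat_reusing`, `clear_cache`, and `MAX_CACHED_LEN` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CACHED_LEN 64

/* cache[n] holds a previously allocated buffer of n ints, or NULL.
   Assumption: this array replaces the weak hash table keyed by size.  */
static int *cache[MAX_CACHED_LEN + 1];

/* Concatenate NPARTS non-empty arrays into one buffer, reusing a
   cached buffer of the same total length when there is one.  The
   result is only valid until the next call that produces the same
   total length.  Return NULL on allocation failure.  */
int *
concat_reusing (const int *const *parts, const size_t *lens,
                size_t nparts, size_t *total_out)
{
  size_t total = 0;
  for (size_t i = 0; i < nparts; i++)
    total += lens[i];
  assert (total > 0 && total <= MAX_CACHED_LEN);

  int *buf = cache[total];
  if (buf == NULL)
    {
      buf = malloc (total * sizeof *buf);
      if (buf == NULL)
        return NULL;
      cache[total] = buf;
    }

  /* Overwrite the (possibly reused) buffer with the new contents.  */
  size_t j = 0;
  for (size_t i = 0; i < nparts; i++)
    {
      memcpy (buf + j, parts[i], lens[i] * sizeof *buf);
      j += lens[i];
    }
  *total_out = total;
  return buf;
}

/* Stand-in for the effect of GC on the weak table: drop every buffer.  */
void
clear_cache (void)
{
  for (size_t n = 0; n <= MAX_CACHED_LEN; n++)
    {
      free (cache[n]);
      cache[n] = NULL;
    }
}
```

The trade-off is the same as in the patch: repeated calls with the same total length allocate only once, but a caller must finish with the returned buffer before the next call of that length overwrites it.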