From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Elias Oltmanns <oltmanns@uni-bonn.de>
Newsgroups: gmane.emacs.devel
Subject: New buffer-case-table makes search_buffer painfully slow
Date: Thu, 04 May 2006 15:46:05 +0200
Message-ID: <87y7xhq4wy.fsf@denkblock.local>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: sea.gmane.org 1146750650 14931 80.91.229.2 (4 May 2006 13:50:50 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Thu, 4 May 2006 13:50:50 +0000 (UTC)
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Thu May 04 15:50:49 2006
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by ciao.gmane.org with esmtp (Exim 4.43)
	id 1FbeDz-0001SN-8B
	for ged-emacs-devel@m.gmane.org; Thu, 04 May 2006 15:50:35 +0200
Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1FbeDy-00072j-Ea
	for ged-emacs-devel@m.gmane.org; Thu, 04 May 2006 09:50:34 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1FbeDi-00072F-75
	for emacs-devel@gnu.org; Thu, 04 May 2006 09:50:18 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1FbeDh-00071l-NJ
	for emacs-devel@gnu.org; Thu, 04 May 2006 09:50:17 -0400
Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1FbeDh-00071i-Cp
	for emacs-devel@gnu.org; Thu, 04 May 2006 09:50:17 -0400
Original-Received: from [80.91.229.2] (helo=ciao.gmane.org)
	by monty-python.gnu.org with esmtps (TLS-1.0:RSA_AES_256_CBC_SHA:32)
	(Exim 4.52) id 1FbeEJ-0005Fj-En
	for emacs-devel@gnu.org; Thu, 04 May 2006 09:50:55 -0400
Original-Received: from root by ciao.gmane.org with local (Exim 4.43)
	id 1FbeDS-0001K9-EX
	for emacs-devel@gnu.org; Thu, 04 May 2006 15:50:02 +0200
Original-Received: from p508868f6.dip.t-dialin.net ([80.136.104.246])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <emacs-devel@gnu.org>; Thu, 04 May 2006 15:50:02 +0200
Original-Received: from oltmanns by p508868f6.dip.t-dialin.net with local (Gmexim 0.1
	(Debian)) id 1AlnuQ-0007hv-00
	for <emacs-devel@gnu.org>; Thu, 04 May 2006 15:50:02 +0200
X-Injected-Via-Gmane: http://gmane.org/
Original-To: emacs-devel@gnu.org
Original-Lines: 29
Original-X-Complaints-To: usenet@sea.gmane.org
X-Gmane-NNTP-Posting-Host: p508868f6.dip.t-dialin.net
User-Agent: Gnus/5.110004 (No Gnus v0.4)
Cancel-Lock: sha1:J2LV85PS4IhID9UjmV1JmvMLZbQ=
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:53901
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/53901>

Hi all,

Switching from Emacs 21 to Emacs 22 has a very significant performance
impact on packages that make heavy use of search_buffer. The example
that made me aware of this problem is Gnus processing large mbox
files. Further analysis revealed that in Emacs 22 an "i" in the search
string makes search_buffer use simple_search() instead of
boyer_moore(). This means that, for instance, a loop repeatedly
calling re-search-forward with the search string
"X-Gnus-Article-Number" takes several orders of magnitude more time in
Emacs 22 than in Emacs 21, just because of the "i" in "Article", at
least in a multibyte buffer. The cause seems to be a change in the
buffer case table. Comparing the output of M-x
describe-buffer-case-table in Emacs 21 and Emacs 22 makes me wonder
whether the culprit is an entry in the Emacs 22 table that matches a
certain character in Unicode row 32 with "i". If so, what would be the
right thing to do about it? Of course, applications like Gnus have to
open mbox files in multibyte mode, simply because mails in different
languages and charsets may be stored in them. Yet I'm confident that
many people, if not the majority, will never need "i" to match this
obscure character but would certainly prefer the boyer_moore algorithm
when searching for strings containing an "i".
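
For anyone who wants to reproduce the slowdown, here is a rough sketch
of a timing loop. The function name my-time-search, the filler size and
the message count are made up for illustration; they are not taken from
Gnus or from Emacs itself. It only assumes that a regexp without
special characters goes through search_buffer, as described above.

```elisp
;; Hypothetical helper for illustration; not part of Emacs or Gnus.
(defun my-time-search (string count)
  "Return the seconds needed to find all COUNT occurrences of STRING.
The search runs in a temporary multibyte buffer filled with fake
header lines, each followed by a line of filler text."
  (with-temp-buffer
    (set-buffer-multibyte t)
    (let ((filler (concat (make-string 2000 ?x) "\n"))
          (case-fold-search t)          ; case folding consults the case table
          start)
      ;; Build COUNT fake "messages": one header line plus filler.
      (dotimes (i count)
        (insert string ": " (number-to-string i) "\n" filler))
      (goto-char (point-min))
      (setq start (float-time))
      ;; STRING contains no regexp special characters, so this is a
      ;; plain string search as far as search_buffer is concerned.
      (while (re-search-forward string nil t))
      (- (float-time) start))))

;; Compare a pattern containing "i" with one that does not:
;; (my-time-search "X-Gnus-Article-Number" 5000)
;; (my-time-search "X-Gnus-Artcle-Number" 5000)
```

If the analysis above is right, the first call should be markedly
slower than the second in Emacs 22, while both should take roughly the
same time in Emacs 21. Absolute numbers will depend on the machine and
on the filler size.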

Any ideas and thoughts concerning this problem?

Regards,

Elias