From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Thomas Lord Newsgroups: gmane.emacs.devel Subject: Re: Release plans Date: Tue, 02 Sep 2008 13:43:24 -0700 Message-ID: <48BDA56C.10502@emf.net> References: <878wusz0v9.fsf@uwakimon.sk.tsukuba.ac.jp> <87vdxp27z6.fsf@uwakimon.sk.tsukuba.ac.jp> <87prnxe5hc.fsf@rattlesnake.com> <873aktck5d.fsf@uwakimon.sk.tsukuba.ac.jp> <87k5e5dsvq.fsf@rattlesnake.com> <48B44802.1080302@emf.net> <48B5D5EF.2030501@emf.net> <20080827220906.GB5374@muc.de> <48B5FB11.6090708@emf.net> <48BC4DF0.3090908@emf.net> <87ljybp6t1.fsf@uwakimon.sk.tsukuba.ac.jp> NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Trace: ger.gmane.org 1220385214 601 80.91.229.12 (2 Sep 2008 19:53:34 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Tue, 2 Sep 2008 19:53:34 +0000 (UTC) Cc: Stefan Monnier , rms@gnu.org, emacs-devel@gnu.org To: "Stephen J. Turnbull" Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Tue Sep 02 21:54:28 2008 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.50) id 1Kabwv-0000Lh-9V for ged-emacs-devel@m.gmane.org; Tue, 02 Sep 2008 21:54:01 +0200 Original-Received: from localhost ([127.0.0.1]:55410 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Kabvw-0001wK-8j for ged-emacs-devel@m.gmane.org; Tue, 02 Sep 2008 15:53:00 -0400 Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1KabvE-0001OJ-7j for emacs-devel@gnu.org; Tue, 02 Sep 2008 15:52:16 -0400 Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1KabvC-0001MR-UC for emacs-devel@gnu.org; Tue, 02 Sep 2008 15:52:15 -0400 Original-Received: from [199.232.76.173] (port=50208 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1KabvC-0001M1-5f for emacs-devel@gnu.org; Tue, 02 Sep 2008 15:52:14 -0400 Original-Received: from mail.42inc.com ([205.149.0.25]:50674) by monty-python.gnu.org with esmtps (SSL 3.0:RSA_3DES_EDE_CBC_SHA1:24) (Exim 4.60) (envelope-from ) id 1Kabv7-0008Dp-4h; Tue, 02 Sep 2008 15:52:09 -0400 X-TFF-CGPSA-Version: 1.5 X-TFF-CGPSA-Filter-42inc: Scanned X-42-Virus-Scanned: by 42 Antivirus -- Found to be clean. Original-Received: from [69.236.75.128] (account lord@emf.net HELO [192.168.1.64]) by mail.42inc.com (CommuniGate Pro SMTP 5.0.13) with ESMTPA id 38136027; Tue, 02 Sep 2008 12:52:04 -0700 User-Agent: Thunderbird 1.5.0.5 (X11/20060808) In-Reply-To: <87ljybp6t1.fsf@uwakimon.sk.tsukuba.ac.jp> X-detected-kernel: by monty-python.gnu.org: Linux 2.6, seldom 2.4 (older, 4) X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:103439 Archived-At: Stephen J. Turnbull wrote: > Thomas Lord writes: > > > of markers, but because this will be a self-adjusting tree > > all those operations will be O(1) in the expected case where > > changes display pretty good locality. > > I would advise you not to expect that case. That's two of you now (saying that). Ok, then, I'll assume the worst (which should still be "good enough"). > Experience in XEmacs > shows that some applications (the VM MUA in particular, but also the > old implementation of font-lock, dunno about jit-lock) like to run up > and down the buffer a lot. AFAICS it's really important to have O(log > N) worst-case behavior. > > Sorry I can't be much more precise about why this happens, I just know > that our algorithms that deal with extents (which we use to support > overlay-like behavior and text properties) are tuned for good locality > and lose badly in large buffers; they show up as time hogs in profiling. > > I looked at (an old version of?) the XEmacs internals manual. It says extents are implemented as a pair of gap buffers. It also talks about the concept of cached stacks of events -- e.g., a cache of the list of extents at a given point from which the extents over a nearby point can be quickly computed. I don't know if that's current. The gap buffer and extent-stack cache would fit your description of "optimized for locality" and would exhibit the (performance) failure mode you describe under access patterns about like you describe ("[running] up and down the buffer"). In that access pattern, you're frequently moving lots of data across the gap and if you're jumping around at all during this traversal then you're constantly getting cache misses or unwanted cache hits for the extent stack. The splay tree of gap buffers should do considerably better with those access patterns. That's the theory, anyway: The new data structure has an operation equivalent to "move the gap" (so, e.g., insertions just write to a linear piece of memory) however gap motion is O(log N) buffer size in terms of operations and O(1) for the amount of text that has to move. Marker updates, getting the position of a marker, and getting back a list of text properties over a given point are all O(log N) -- O(1) with good locality. When I say "locality" I'm using that more loosely than it is used when talking about ordinary gap buffers. With ordinary gap buffers, the optimized case is a lot of action near one point at a buffer, then an infrequent motion far away, then lots of closely placed action. When I say "locality" I mean to include (a) cases where the action at a given time forms a dense set, even if that set is scattered; (b) linear access patterns. By "dense set" I mean that programs can be simultaneously modifying, say, 5 widely separated areas of the buffer in a multiplexed way -- every modification is probably close to a recent modification (so the set is dense) but every modification is also probably far from the *most* recent modification (so the set is scattered). By "linear access" I mean cases where code is scanning forwards or backwards. > It's possible it's something internal to the implementation of > extents, too, but I think a word to the wise is appropriate here. > If extents are still the double gap buffer + extent stack cache then I can see where you'd have problems like those you describe. In theory, this new thing does better in those cases. Thanks, -t