From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Alin Soare Newsgroups: gmane.emacs.devel Subject: Re: line buffer as Red Black Trees instead of linear Date: Wed, 21 May 2014 11:34:10 +0300 Message-ID: References: <83mwein6w1.fsf@gnu.org> <83fvkamg9v.fsf@gnu.org> <837g5mm8ke.fsf@gnu.org> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=047d7b2e12db51d39804f9e4dc30 X-Trace: ger.gmane.org 1400661269 29343 80.91.229.3 (21 May 2014 08:34:29 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Wed, 21 May 2014 08:34:29 +0000 (UTC) Cc: Stefan Monnier , emacs-devel@gnu.org To: Eli Zaretskii Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed May 21 10:34:22 2014 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1Wn1yf-000798-8L for ged-emacs-devel@m.gmane.org; Wed, 21 May 2014 10:34:21 +0200 Original-Received: from localhost ([::1]:57779 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Wn1ye-0007xs-SZ for ged-emacs-devel@m.gmane.org; Wed, 21 May 2014 04:34:20 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:40519) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Wn1yZ-0007ux-3h for emacs-devel@gnu.org; Wed, 21 May 2014 04:34:16 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Wn1yX-0002b8-JZ for emacs-devel@gnu.org; Wed, 21 May 2014 04:34:14 -0400 Original-Received: from mail-ig0-x22d.google.com ([2607:f8b0:4001:c05::22d]:61970) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Wn1yW-0002ab-1Z; Wed, 21 May 2014 04:34:12 -0400 Original-Received: by mail-ig0-f173.google.com with SMTP id hn18so5900426igb.12 for ; Wed, 21 May 2014 01:34:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=Tnr7jBYxiSa3rAjnctphc92cYwHv01xJKHDkft11TKA=; b=Yn/gbPnjKf3+wFPMJPEuXdU0xVqfc3Fq+QXepLNXycCDbISL7J3bJf9fCRZKJOtOuY RY4nEpT3kxTX7Q3UO3g1fmnzgE9s1+OMq5BLIGZnf7IFFgoJUaR74EvA/sJSAhIDF9vP mJYpWkI4ldtS9h90/Iph/iW+iZyNRf61X2ow/3mouoAcTt1xWieVaMukd+Ae4wlXIBP4 qrtBtD7DDZMDjWkGdDzeeNNpFqvJ0M79Y2dJtM6i54yV/LXZGjGsVtAUU0hl7kVHzfiE e0jA+3fgVjla/ONb2cjS6q7xNagqtqygS7F8h+HnC27bYTM5TB11+2rS25R+cBGCwSj2 NDIA== X-Received: by 10.50.131.138 with SMTP id om10mr11662354igb.25.1400661251050; Wed, 21 May 2014 01:34:11 -0700 (PDT) Original-Received: by 10.43.142.70 with HTTP; Wed, 21 May 2014 01:34:10 -0700 (PDT) In-Reply-To: X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2607:f8b0:4001:c05::22d X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:171983 Archived-At: --047d7b2e12db51d39804f9e4dc30 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Here is how the buffer text could be encoded. In this case, I pasted here the output of a Visual method , with the input text "RedBlackTree." Each node has a block of the whole text, the whole text is ordered in in-order (left - print_node - right). In this demo that I wrote in Java, each node keeps only 1 character of the whole text. Each node has also a number, which represents the depth of that node (those that are starred are at the same level as their parent, because they are marked RED -- red means star in my visual output). (Use a font with fixed width size of each character to see the output clearly) ** In Emacs each node there must be a linked list of structures that keeps not a char, but a few blocks of text, like this struct block_text { char *text; struct block_text * next; struct block_text * prev; Node *owner; ... } The owner is a pointer to the node where this block is kept, and it is useful for fast splitting and joining nodes. Note that at each node there can be more than 1 structure into the double linked list , because this helps operations like delete and insert to work fast, without having to move text in the moment when 2 nodes are joined or removed. The last node of a list of a given node points to the beginning at the list of the following node, such that operations like search-forward and backward, whose NFA needs to know the following char to work as fast as it does in current linear buffer implementation, RBT corresponding to the input text "RedBlackTree." // Visual RedBlackTree k----0 | // Example =C2=B7 | // :=C2=B4=C2=B4=C2=B4=C2=B4=C2=B4=C2=B4=C2=B4=C2=B4= =C2=B4=C2=B4=C2=B4=C2=B4=C2=B4=C2=B4=C2=B4=C2=B0=C2=AF=C2=AF=C2=AF=C2=AF=C2= =AF=C2=AF=C2=AF=C2=A6 | // B----0-(*) r---1 | // =C2=B7 =C2=B7 = | // =C2=A6=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=B0=C2= =AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=A6 =C2=A6=C2=AF=C2=AF=C2= =AF=C2=AF=C2=AF=C2=B0=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF= =C2=AF=C2=AF=C2=A6 | // e---1 a---1 T---2 e---2 | // =C2=B7 =C2=B7 =C2=B7 = =C2=B7 | // =C2=A6=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=B0=C2=AF=C2=A6 = =C2=A6=C2=AF=C2=AF=C2=AF=C2=AF=C2=AF=C2=B0=C2=AF=C2=A6 =C2=A6=C2=B0=C2= =A6 :=C2=B4=C2=B4=C2=B4=C2=B4=C2=B4=C2=B4=C2=B4=C2=B4=C2=B4=C2=B0=C2= =A6 | // R---2 d---2 l---2 c---2 # # e---2-(*) # | // =C2=B7 =C2=B7 =C2=B7 =C2=B7 =C2= =B7 | // =C2=A6=C2=B0=C2=A6 =C2=A6=C2=B0=C2=A6 =C2=A6=C2=B0=C2=A6= =C2=A6=C2=B0=C2=A6 =C2=A6=C2=B0=C2=A6 | // # # # # # # # # # # | Ideally, at each node all the text should have the same properties (marks, fonts, etc), and chaging 1 such property can be implemented by splitting the node or removing a few nodes, and their text be preserved in 1 node (only the field owner must be changed when a few nodes are joined in only 1). Each node in RBT must have pre-computed all the characteristics of the block of text keps at that node, including the number of newlines of the nodes at its left, the number of chars at its left, such that the searach operations of a given line and a given point offset to work fast. Note that the gap space has trivial implementation like this, and all the operations on gap work logirithmically in the worse case. This is the idea that I will implement in future Clearly, it works, and all the details can be solved with such a data structure. For me the red black trees are trivial to imeplement, and integrating them in emacs buffer will be a lot of fun. --047d7b2e12db51d39804f9e4dc30 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: base64 PGRpdiBkaXI9Imx0ciI+SGVyZSBpcyBob3cgdGhlIGJ1ZmZlciB0ZXh0IGNvdWxkIGJlIGVuY29k ZWQuIEluIHRoaXMgY2FzZSwgSSBwYXN0ZWQgaGVyZSB0aGUgb3V0cHV0IG9mIGEgVmlzdWFsIG1l dGhvZCAsIHdpdGggdGhlIGlucHV0IHRleHQgJnF1b3Q7UmVkQmxhY2tUcmVlLiZxdW90OzxkaXY+ PGJyPjwvZGl2PjxkaXY+RWFjaCBub2RlIGhhcyBhIGJsb2NrIG9mIHRoZSB3aG9sZSB0ZXh0LCB0 aGUgd2hvbGUgdGV4dCBpcyBvcmRlcmVkIGluIGluLW9yZGVyIChsZWZ0IC0gcHJpbnRfbm9kZSAt IHJpZ2h0KS48L2Rpdj4NCjxkaXY+PGJyPjwvZGl2PjxkaXY+SW4gdGhpcyBkZW1vIHRoYXQgSSB3 cm90ZSBpbiBKYXZhLCBlYWNoIG5vZGUga2VlcHMgb25seSAxIGNoYXJhY3RlciBvZiB0aGUgd2hv bGUgdGV4dC4gRWFjaCBub2RlIGhhcyBhbHNvIGEgbnVtYmVyLCB3aGljaCByZXByZXNlbnRzIHRo ZSBkZXB0aCBvZiB0aGF0IG5vZGUgKHRob3NlIHRoYXQgYXJlIHN0YXJyZWQgYXJlIGF0IHRoZSBz YW1lIGxldmVsIGFzIHRoZWlyIHBhcmVudCwgYmVjYXVzZSB0aGV5IGFyZSBtYXJrZWQgUkVEIC0t IHJlZCBtZWFucyBzdGFyIGluIG15IHZpc3VhbCBvdXRwdXQpLjwvZGl2Pg0KPGRpdj48YnI+PC9k aXY+PGRpdj4oVXNlIGEgZm9udCB3aXRoIGZpeGVkIHdpZHRoIHNpemUgb2YgZWFjaCBjaGFyYWN0 ZXIgdG8gc2VlIHRoZSBvdXRwdXQgY2xlYXJseSk8L2Rpdj48ZGl2Pjxicj48L2Rpdj48ZGl2Pioq IEluIEVtYWNzIGVhY2ggbm9kZSB0aGVyZSBtdXN0IGJlIGEgbGlua2VkIGxpc3Qgb2Ygc3RydWN0 dXJlcyB0aGF0IGtlZXBzIG5vdCBhIGNoYXIsIGJ1dCBhIGZldyBibG9ja3Mgb2YgdGV4dCwgbGlr ZSB0aGlzPC9kaXY+DQo8ZGl2Pjxicj48L2Rpdj48ZGl2PnN0cnVjdCBibG9ja190ZXh0IHs8L2Rp dj48ZGl2PsKgIGNoYXIgKnRleHQ7PC9kaXY+PGRpdj7CoCBzdHJ1Y3QgYmxvY2tfdGV4dCAqIG5l eHQ7PGJyPjwvZGl2PjxkaXY+wqAgc3RydWN0IGJsb2NrX3RleHQgKiBwcmV2Ozxicj48L2Rpdj48 ZGl2PsKgIE5vZGUgKm93bmVyOzwvZGl2PjxkaXY+wqAgLi4uPC9kaXY+PGRpdj59PC9kaXY+PGRp dj48YnI+PC9kaXY+DQo8ZGl2PlRoZSBvd25lciBpcyBhIHBvaW50ZXIgdG8gdGhlIG5vZGUgd2hl cmUgdGhpcyBibG9jayBpcyBrZXB0LCBhbmQgaXQgaXMgdXNlZnVsIGZvciBmYXN0IHNwbGl0dGlu ZyBhbmQgam9pbmluZyBub2Rlcy48L2Rpdj48ZGl2Pjxicj48L2Rpdj48ZGl2Pk5vdGUgdGhhdCBh dCBlYWNoIG5vZGUgdGhlcmUgY2FuIGJlIG1vcmUgdGhhbiAxIHN0cnVjdHVyZSBpbnRvIHRoZSBk b3VibGUgbGlua2VkIGxpc3QgLCBiZWNhdXNlIHRoaXMgaGVscHMgb3BlcmF0aW9ucyBsaWtlIGRl bGV0ZSBhbmQgaW5zZXJ0IHRvIHdvcmsgZmFzdCwgd2l0aG91dCBoYXZpbmcgdG8gbW92ZSB0ZXh0 IGluIHRoZSBtb21lbnQgd2hlbiAyIG5vZGVzIGFyZSBqb2luZWQgb3IgcmVtb3ZlZC48YnI+DQo8 L2Rpdj48ZGl2Pjxicj48L2Rpdj48ZGl2PlRoZSBsYXN0IG5vZGUgb2YgYSBsaXN0IG9mIGEgZ2l2 ZW4gbm9kZSBwb2ludHMgdG8gdGhlIGJlZ2lubmluZyBhdCB0aGUgbGlzdCBvZiB0aGUgZm9sbG93 aW5nIG5vZGUsIHN1Y2ggdGhhdCBvcGVyYXRpb25zIGxpa2Ugc2VhcmNoLWZvcndhcmQgYW5kIGJh Y2t3YXJkLCB3aG9zZSBORkEgbmVlZHMgdG8ga25vdyB0aGUgZm9sbG93aW5nIGNoYXIgdG8gd29y ayBhcyBmYXN0IGFzIGl0IGRvZXMgaW4gY3VycmVudCBsaW5lYXIgYnVmZmVyIGltcGxlbWVudGF0 aW9uLDwvZGl2Pg0KPGRpdj48YnI+PC9kaXY+PGRpdj48YnI+PC9kaXY+PGRpdj48YnI+PC9kaXY+ PGRpdj5SQlQgY29ycmVzcG9uZGluZyB0byB0aGUgaW5wdXQgdGV4dCAmcXVvdDtSZWRCbGFja1Ry ZWUuJnF1b3Q7PGJyPjwvZGl2PjxkaXY+PGRpdj7CoCDCoCDCoCDCoCDCoCDCoMKgPC9kaXY+PGRp dj7CoCDCoCDCoCDCoCAvLyBWaXN1YWwgUmVkQmxhY2tUcmVlIMKgIMKgIMKgIMKgIMKgIMKgay0t LS0wIMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgIHw8L2Rpdj4NCjxkaXY+wqAgwqAgwqAg wqAgLy8gRXhhbXBsZSDCoCDCoCDCoCDCoCDCoCDCoCDCoCDCoCDCoCDCoCDCoCDCoMK3IMKgIMKg IMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgfDwvZGl2PjxkaXY+wqAgwqAgwqAgwqAg Ly8gwqAgwqAgwqAgwqAgwqAgwqAgwqAgwqA6wrTCtMK0wrTCtMK0wrTCtMK0wrTCtMK0wrTCtMK0 wrDCr8Kvwq/Cr8Kvwq/Cr8KmIMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgfDwvZGl2PjxkaXY+ wqAgwqAgwqAgwqAgLy8gwqAgwqAgwqAgwqAgwqAgwqAgwqAgwqBCLS0tLTAtKCopIMKgIMKgIMKg IMKgIMKgIMKgIMKgci0tLTEgwqAgwqAgwqAgwqAgwqAgwqAgwqB8PC9kaXY+DQo8ZGl2PsKgIMKg IMKgIMKgIC8vIMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgwrcgwqAgwqAgwqAgwqAgwqAgwqAgwqAg wqAgwqAgwqAgwqAgwrcgwqAgwqAgwqAgwqAgwqAgwqAgwqAgwqAgwqB8PC9kaXY+PGRpdj7CoCDC oCDCoCDCoCAvLyDCoCDCoCDCoCDCoMKmwq/Cr8Kvwq/Cr8Kvwq/CsMKvwq/Cr8Kvwq/Cr8KvwqYg wqAgwqAgwqAgwqAgwqbCr8Kvwq/Cr8KvwrDCr8Kvwq/Cr8Kvwq/Cr8Kvwq/Cr8KvwqYgwqAgwqAg wqB8PC9kaXY+PGRpdj7CoCDCoCDCoCDCoCAvLyDCoCDCoCDCoCDCoGUtLS0xIMKgIMKgIMKgIMKg IMKgIGEtLS0xIMKgIMKgIFQtLS0yIMKgIMKgIMKgIMKgIMKgIMKgIGUtLS0yIMKgfDwvZGl2Pg0K PGRpdj7CoCDCoCDCoCDCoCAvLyDCoCDCoCDCoCDCoMK3IMKgIMKgIMKgIMKgIMKgIMKgIMKgIMK3 IMKgIMKgIMKgIMKgIMK3IMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgIMK3IMKgIMKgIMKgfDwvZGl2 PjxkaXY+wqAgwqAgwqAgwqAgLy8gwqDCpsKvwq/Cr8Kvwq/CsMKvwqYgwqAgwqAgwqAgwqbCr8Kv wq/Cr8KvwrDCr8KmIMKgIMKgIMKgwqbCsMKmIMKgIMKgIMKgOsK0wrTCtMK0wrTCtMK0wrTCtMKw wqYgwqAgwqAgfDwvZGl2PjxkaXY+wqAgwqAgwqAgwqAgLy8gwqBSLS0tMiDCoCBkLS0tMiDCoCBs LS0tMiDCoCBjLS0tMiDCoCMgIyDCoCDCoCDCoGUtLS0yLSgqKSDCoCMgwqAgwqAgfDwvZGl2Pg0K PGRpdj7CoCDCoCDCoCDCoCAvLyDCoMK3IMKgIMKgIMKgIMK3IMKgIMKgIMKgIMK3IMKgIMKgIMKg IMK3IMKgIMKgIMKgIMKgIMKgIMKgIMKgIMK3IMKgIMKgIMKgIMKgIMKgIMKgIMKgIMKgfDwvZGl2 PjxkaXY+wqAgwqAgwqAgwqAgLy8gwqbCsMKmIMKgIMKgIMKmwrDCpiDCoCDCoCDCpsKwwqYgwqAg wqAgwqbCsMKmIMKgIMKgIMKgIMKgIMKgIMKgIMKmwrDCpiDCoCDCoCDCoCDCoCDCoCDCoCDCoCB8 PC9kaXY+PGRpdj7CoCDCoCDCoCDCoCAvLyAjICMgwqAgwqAgIyAjIMKgIMKgICMgIyDCoCDCoCAj ICMgwqAgwqAgwqAgwqAgwqAgwqAgIyAjIMKgIMKgIMKgIMKgIMKgIMKgIMKgIHw8L2Rpdj4NCjxk aXY+PGJyPjwvZGl2PjxkaXY+wqAgwqAgwqAgwqAgwqAgwqDCoDxicj48L2Rpdj48L2Rpdj48ZGl2 PjxkaXY+PGRpdj5JZGVhbGx5LCBhdCBlYWNoIG5vZGUgYWxsIHRoZSB0ZXh0IHNob3VsZCBoYXZl IHRoZSBzYW1lIHByb3BlcnRpZXMgKG1hcmtzLCBmb250cywgZXRjKSwgYW5kIGNoYWdpbmcgMSBz dWNoIHByb3BlcnR5IGNhbiBiZSBpbXBsZW1lbnRlZCBieSBzcGxpdHRpbmcgdGhlIG5vZGUgb3Ig cmVtb3ZpbmcgYSBmZXcgbm9kZXMsIGFuZCB0aGVpciB0ZXh0IGJlIHByZXNlcnZlZCBpbiAxIG5v ZGUgKG9ubHkgdGhlIGZpZWxkIG93bmVyIG11c3QgYmUgY2hhbmdlZCB3aGVuIGEgZmV3IG5vZGVz IGFyZSBqb2luZWQgaW4gb25seSAxKS48L2Rpdj4NCjxkaXY+PGJyPjwvZGl2PjxkaXY+RWFjaCBu b2RlIGluIFJCVCBtdXN0IGhhdmUgcHJlLWNvbXB1dGVkIGFsbCB0aGUgY2hhcmFjdGVyaXN0aWNz IG9mIHRoZSBibG9jayBvZiB0ZXh0IGtlcHMgYXQgdGhhdCBub2RlLCBpbmNsdWRpbmcgdGhlIG51 bWJlciBvZiBuZXdsaW5lcyBvZiB0aGUgbm9kZXMgYXQgaXRzIGxlZnQsIHRoZSBudW1iZXIgb2Yg Y2hhcnMgYXQgaXRzIGxlZnQsIHN1Y2ggdGhhdCB0aGUgc2VhcmFjaCBvcGVyYXRpb25zIG9mIGEg Z2l2ZW4gbGluZSBhbmQgYSBnaXZlbiBwb2ludCBvZmZzZXQgdG8gd29yayBmYXN0LjwvZGl2Pg0K PGRpdj48YnI+PC9kaXY+PGRpdj5Ob3RlIHRoYXQgdGhlIGdhcCBzcGFjZSBoYXMgdHJpdmlhbCBp bXBsZW1lbnRhdGlvbiBsaWtlIHRoaXMsIGFuZCBhbGwgdGhlIG9wZXJhdGlvbnMgb24gZ2FwIHdv cmsgbG9naXJpdGhtaWNhbGx5IGluIHRoZSB3b3JzZSBjYXNlLjwvZGl2PjxkaXY+PGJyPjwvZGl2 PjxkaXY+PGJyPjwvZGl2PjxkaXY+PGJyPjwvZGl2PjxkaXY+PGJyPjwvZGl2PjxkaXY+DQpUaGlz IGlzIHRoZSBpZGVhIHRoYXQgSSB3aWxsIGltcGxlbWVudCBpbiBmdXR1cmUgQ2xlYXJseSwgaXQg d29ya3MsIGFuZCBhbGwgdGhlIGRldGFpbHMgY2FuIGJlIHNvbHZlZCB3aXRoIHN1Y2ggYSBkYXRh IHN0cnVjdHVyZS4gRm9yIG1lIHRoZSByZWQgYmxhY2sgdHJlZXMgYXJlIHRyaXZpYWwgdG8gaW1l cGxlbWVudCwgYW5kIGludGVncmF0aW5nIHRoZW0gaW4gZW1hY3MgYnVmZmVyIHdpbGwgYmUgYSBs b3Qgb2YgZnVuLjwvZGl2Pg0KPC9kaXY+PGRpdj48YnI+PC9kaXY+PGRpdj48YnI+PC9kaXY+PGRp dj48YnI+PC9kaXY+PGRpdj48YnI+PC9kaXY+PC9kaXY+PC9kaXY+DQo= --047d7b2e12db51d39804f9e4dc30--