Here is how the buffer text could be encoded. In this case, I pasted here the output of a Visual method , with the input text "RedBlackTree."
Each node has a block of the whole text, the whole text is ordered in in-order (left - print_node - right).
In this demo that I wrote in Java, each node keeps only 1 character of the whole text. Each node has also a number, which represents the depth of that node (those that are starred are at the same level as their parent, because they are marked RED -- red means star in my visual output).
(Use a font with fixed width size of each character to see the output clearly)
** In Emacs each node there must be a linked list of structures that keeps not a char, but a few blocks of text, like this
struct block_text {
char *text;
struct block_text * next;
struct block_text * prev;
Node *owner;
...
}
The owner is a pointer to the node where this block is kept, and it is useful for fast splitting and joining nodes.
Note that at each node there can be more than 1 structure into the double linked list , because this helps operations like delete and insert to work fast, without having to move text in the moment when 2 nodes are joined or removed.
The last node of a list of a given node points to the beginning at the list of the following node, such that operations like search-forward and backward, whose NFA needs to know the following char to work as fast as it does in current linear buffer implementation,
RBT corresponding to the input text "RedBlackTree."
// Visual RedBlackTree k----0 |
// Example · |
// :´´´´´´´´´´´´´´´°¯¯¯¯¯¯¯¦ |
// B----0-(*) r---1 |
// · · |
// ¦¯¯¯¯¯¯¯°¯¯¯¯¯¯¯¦ ¦¯¯¯¯¯°¯¯¯¯¯¯¯¯¯¯¯¦ |
// e---1 a---1 T---2 e---2 |
// · · · · |
// ¦¯¯¯¯¯°¯¦ ¦¯¯¯¯¯°¯¦ ¦°¦ :´´´´´´´´´°¦ |
// R---2 d---2 l---2 c---2 # # e---2-(*) # |
// · · · · · |
// ¦°¦ ¦°¦ ¦°¦ ¦°¦ ¦°¦ |
// # # # # # # # # # # |
Ideally, at each node all the text should have the same properties (marks, fonts, etc), and chaging 1 such property can be implemented by splitting the node or removing a few nodes, and their text be preserved in 1 node (only the field owner must be changed when a few nodes are joined in only 1).
Each node in RBT must have pre-computed all the characteristics of the block of text keps at that node, including the number of newlines of the nodes at its left, the number of chars at its left, such that the searach operations of a given line and a given point offset to work fast.
Note that the gap space has trivial implementation like this, and all the operations on gap work logirithmically in the worse case.
This is the idea that I will implement in future Clearly, it works, and all the details can be solved with such a data structure. For me the red black trees are trivial to imeplement, and integrating them in emacs buffer will be a lot of fun.