* Clarification on blank lines following list items
From: Tom Alexander @ 2023-08-19 5:30 UTC (permalink / raw)
To: emacs-orgmode
I am noticing that list items have some very specific, context-sensitive behavior regarding ownership of their trailing blank lines. I was hoping to get some clarification on this (namely, are my observations correct, am I stumbling across a bug, or have I not dug deep enough to suss out the real rules?). The org-mode documentation states:
> With the exception of list items and footnote definitions blank lines belong to the preceding element with the narrowest possible scope
but it does not state who ends up owning those blank lines.
In a previous email the incredibly helpful Ihor Radchenko expanded on this further with:
> Also, in addition to list items, footnote-definitions do not extend their contents to the trailing blank lines.
which I would interpret as meaning that list items do not own their trailing blank lines, but rather the list owns them. But that is not the behavior I am seeing. If I had to summarize the behavior I am seeing in words, it would be:
> List items own their trailing blank lines unless they are both the final list item and not the final element of a non-final list item.
Below is how I reached this conclusion, but before diving into the weeds I want to point out two things:
1. I have hastily thrown together a tool to help rapidly visualize the ownership of nodes in org-mode's AST. Before this tool, I was manually running org-element-parse-buffer and using M-g c to jump around to the various indices to see where nodes started/ended. With this tool, I can paste in my org-mode source and get a tree showing the contents of each node and I can click on the nodes to highlight the relevant characters in the org-mode source. It is available at https://github.com/tomalexander/org_mode_ast_investigation_tool .
2. I have flattened my analysis for plain-text consumption over email below, but if you'd prefer the original org-mode version of this investigation it is available at https://github.com/tomalexander/org_mode_ast_investigation_tool/blob/cba1d1e988be230f3104f5f63dfaeaaf5cd0d280/notes/plain_list_ownership_notes.org .
And now, here is how I reached that conclusion:
*** Test case 1
```
1. foo
   1. bar
   2. baz
2. lorem
ipsum
```
| Plain List *Item*      | Owns trailing blank lines |
|------------------------+---------------------------|
| foo (includes bar baz) | Yes                       |
| bar                    | Yes                       |
| baz                    | Yes                       |
| lorem                  | No                        |
In this test case, we see that the only list item that doesn't own its trailing blank lines is "lorem", the final list item of the outer-most list.
*** Test case 2
We add "cat" as a paragraph at the end of foo, which makes "baz" lose its trailing blank lines.
```
1. foo
   1. bar
   2. baz
   cat
2. lorem
ipsum
```
| Plain List *Item*             | Owns trailing blank lines |
|-------------------------------+---------------------------|
| foo -> cat (includes bar baz) | Yes                       |
| bar                           | Yes                       |
| baz                           | No                        |
| lorem                         | No                        |
In isolation, this implies that the final plain list item does not own its trailing blank lines, which conflicts with "baz" from test 1.
New theory: List items own their trailing blank lines unless they are both the final list item and not the final element of a list item.
Adding why to the table:
| Plain List *Item*             | Owns trailing blank lines | Why                                                       |
|-------------------------------+---------------------------+-----------------------------------------------------------|
| foo -> cat (includes bar baz) | Yes                       | Not the final list item                                   |
| bar                           | Yes                       | Not the final list item                                   |
| baz                           | No                        | Final item of bar->baz and not the final element of "foo" |
| lorem                         | No                        | Final item of foo->lorem and not contained in a list item |
*** Test case 3
So if that theory is true, taking the entire (foo -> lorem) list from test 1 and nesting it inside a list should coerce "lorem" to own its trailing blank lines since it would then be a final list item (of foo -> lorem) and the final element of the new list.
```
1. cat
   1. foo
      1. bar
      2. baz
   2. lorem
ipsum
```
| Plain List *Item*           | Owns trailing blank lines |
|-----------------------------+---------------------------|
| cat (includes foo -> lorem) | No                        |
| foo (includes bar baz)      | Yes                       |
| bar                         | Yes                       |
| baz                         | Yes                       |
| lorem                       | No                        |
Against expectations, we did not coerce lorem to consume its trailing blank lines. What is different between "baz" and "lorem"? Well, "baz" is contained within "foo" which has a "lorem" after it, whereas "lorem" is contained within "cat" which does not have any list items after it.
New theory: List items own their trailing blank lines unless they are both the final list item and not the final element of a non-final list item.
| Plain List *Item*           | Owns trailing blank lines | Why                                                  |
|-----------------------------+---------------------------+------------------------------------------------------|
| cat (includes foo -> lorem) | No                        | Final list item and not contained in a list item     |
| foo (includes bar baz)      | Yes                       | Not the final list item                              |
| bar                         | Yes                       | Not the final list item                              |
| baz                         | Yes                       | Final element of non-final list item                 |
| lorem                       | No                        | Final list item and final element of final list item |
*** Test case 4
So if that theory is true, then we should be able to coerce lorem to consume its trailing blank lines by adding a second item to the cat list.
```
1. cat
   1. foo
      1. bar
      2. baz
   2. lorem
2. dog
ipsum
```
| Plain List *Item*           | Owns trailing blank lines |
|-----------------------------+---------------------------|
| cat (includes foo -> lorem) | Yes                       |
| foo (includes bar baz)      | Yes                       |
| bar                         | Yes                       |
| baz                         | Yes                       |
| lorem                       | Yes                       |
| dog                         | No                        |
For the first time our expectations were met!
Enduring theory: List items own their trailing blank lines unless they are both the final list item and not the final element of a non-final list item.
| Plain List *Item*           | Owns trailing blank lines | Why                                              |
|-----------------------------+---------------------------+--------------------------------------------------|
| cat (includes foo -> lorem) | Yes                       | Not the final list item                          |
| foo (includes bar baz)      | Yes                       | Not the final list item                          |
| bar                         | Yes                       | Not the final list item                          |
| baz                         | Yes                       | Final element of non-final list item             |
| lorem                       | Yes                       | Final element of non-final list item             |
| dog                         | No                        | Final list item and not contained in a list item |
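For anyone who wants to check the enduring theory against the four tables, it can be written as a small predicate. The following is a Python sketch, not anything taken from Org's parser: the `Item` class, its `trailing_element` flag (true when something such as the "cat" paragraph follows the nested list inside the item), and the `ownership` function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical model of a plain list item (not an Org data structure)."""
    name: str
    sublist: tuple = ()             # nested list items, in document order
    trailing_element: bool = False  # True if an element follows the nested list


def ownership(items, parent_non_final=False, list_is_last_child=False):
    """Map each item name to True if the theory says it owns its blank lines."""
    result = {}
    for index, item in enumerate(items):
        is_final = index == len(items) - 1
        # "the final element of a non-final list item"
        ends_non_final_parent = parent_non_final and list_is_last_child
        result[item.name] = (not is_final) or ends_non_final_parent
        result.update(
            ownership(
                item.sublist,
                parent_non_final=not is_final,
                list_is_last_child=not item.trailing_element,
            )
        )
    return result


inner = (Item("bar"), Item("baz"))
test_case_2 = (Item("foo", inner, trailing_element=True), Item("lorem"))
test_case_4 = (
    Item("cat", (Item("foo", inner), Item("lorem"))),
    Item("dog"),
)
```

Under these assumptions, `ownership(test_case_4)` marks every item except dog as owning its trailing blank lines, and `ownership(test_case_2)` marks baz and lorem as not owning theirs, which agrees with the tables above.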
--
Tom Alexander
* Re: Clarification on blank lines following list items
From: Ihor Radchenko @ 2023-08-19 8:43 UTC (permalink / raw)
To: Tom Alexander; +Cc: emacs-orgmode
"Tom Alexander" <tom@fizz.buzz> writes:
> I am noticing the list items have some very context-sensitive specific behavior regarding ownership of the trailing blank lines. I was hoping to get some clarification on this (namely, are my observations correct, am I stumbling across a bug, or have I not dug deep enough to suss out the real rules?). The org-mode documentation states:
>
>> With the exception of list items and footnote definitions blank lines belong to the preceding element with the narrowest possible scope
>
> but it does not state who ends up owning those blank lines.
I can see how this explanation steered you into the wrong line of thought.
It would be better explained from the widest scope to the narrowest
scope, not the opposite.
Greater Org elements are generally represented by contents where child
elements are located + markup defining the greater element itself +
optional trailing blank lines.
For example, drawers are
:NAME:
<contents begin>
...
<contents end>:END:
<blank lines>
Naturally, blank lines are an attribute of such a drawer - they belong to
it and are recorded as the :post-blank property.
The above works for many greater elements. However, it becomes a bit
tricky when a greater element does not have any "end" delimiter:
--------
- item
Some text
Or even
:drawer:
with text
:end:
Not an item.
--------
Now, assigning contents vs. blank lines is not so obvious. We can either
include these blank lines into contents or keep them separate within
:post-blank property.
Then, there are two kinds of greater elements that can end with blank
lines without separator:
1. Elements where blank lines do not affect parsing (headlines)
2. Elements where trailing blank lines are syntactically meaningful and
   by themselves serve as a marker of element ending.
   - footnote-definition ends when Org sees two consecutive blank lines
     or a heading or another footnote-definition.
   - plain-list also uses two consecutive blank lines as delimiter.
In the second case, Org makes the plain-list/footnote-definition element
"own" the blank lines (set :post-blank) instead of putting these blank
lines inside contents. If we did otherwise, changes in contents could
make the parent plain-list/footnote-definition invalid - if the double
blank delimiter is edited away.
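
As a rough illustration of the second case, here is a Python sketch of how a footnote definition could be split into contents and trailing blank lines. It is not Org's actual parser: the function name `split_contents_and_post_blank` and both regular expressions are simplified assumptions, the sketch ignores indentation and every rule not mentioned above, and counting all blank lines at the end as post-blank is likewise an assumption.

```python
import re

# Simplified, assumed patterns (Org's real regexps are more detailed).
HEADING = re.compile(r"^\*+ ")
FOOTNOTE_DEF = re.compile(r"^\[fn:[^\]]+\]")


def split_contents_and_post_blank(lines, start):
    """Return (contents_end, post_blank) for the definition at lines[start].

    contents_end is the index of the first line after the contents;
    post_blank is the number of blank lines kept outside the contents.
    """
    i = start + 1
    blank_run = 0
    while i < len(lines):
        line = lines[i]
        if not line.strip():
            blank_run += 1
        elif blank_run >= 2 or HEADING.match(line) or FOOTNOTE_DEF.match(line):
            # Two consecutive blank lines, a heading, or another
            # footnote definition ends the current definition.
            break
        else:
            # A single blank line inside the definition stays in contents.
            blank_run = 0
        i += 1
    return i - blank_run, blank_run


example = ["[fn:1] first line", "more text", "", "", "Not part of the footnote"]
contents_end, post_blank = split_contents_and_post_blank(example, 0)
```

For `example`, the contents cover the first two lines and the two blank lines are reported separately, so editing the contents cannot remove the delimiter that ends the definition.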
-----
Further, there is a special case with greater elements without contents:
* Heading with no contents
* Another heading
The first heading does not have contents, yet we want to record the fact
that it has multiple blank lines before the next element - :post-blank
is set here, unlike for a heading with contents.
-----
Finally, :post-blank in items is special.
Consider:
- item 1
- item 2
- item 3
Historically, we do not treat blank lines between items as part of their
paragraphs. This also makes sense for such short items.
(there are actually some reasons why we might want to alter this
historical convention, but for now it is how it is)
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* Re: Clarification on blank lines following list items
From: Tom Alexander @ 2023-08-21 1:56 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: emacs-orgmode
Thank you so much for explaining all of that! There is some good information there I was missing. I think the most important bit I was missing is the post-blank stuff. I was only looking at begin->end but I think digging into the post-blank is what makes this consistent.
I've got 2 separate questions:
1. Is the following statement true? "Two elements can count the same character in their post-blank?"
I am seeing dual-ownership of the post-blank in the examples below, but at the same time if I put a plain-list inside a footnote definition, the footnote definition ends up with sole custody of the post-blank.
2. I'm still not sure about some behavior I'm seeing. I think it would be easiest to see if we focus on exactly 1 blank line:
```
1. bar
2. baz
<---- this blank line here
ipsum
```
In this example, the blank line gets counted in the post-blank for the plain-list but not for the item:
```
plain-list: post-blank 1 | begin 1 end 16 | contents-begin 1 contents-end 15
  item: post-blank 0 | begin 1 end 8 | contents-begin 4 contents-end 8
    paragraph: post-blank 0 | begin 4 end 8 | contents-begin 4 contents-end 8
  item: post-blank 0 | begin 8 end 15 | contents-begin 11 contents-end 15
    paragraph: post-blank 0 | begin 11 end 15 | contents-begin 11 contents-end 15
paragraph: post-blank 0 | begin 16 end 22 | contents-begin 16 contents-end 22
```
but if we take that plain-list and nest it inside another plain-list:
```
1. foo
   1. bar
   2. baz
<---- this blank line here
2. lorem
ipsum
```
The blank line gets counted as a post-blank for both the item "foo" and the item "baz":
```
plain-list: post-blank 0 | begin 1 end 38 | contents-begin 1 contents-end 38
  item: post-blank 1 | begin 1 end 29 | contents-begin 4 contents-end 28
    paragraph: post-blank 0 | begin 4 end 8 | contents-begin 4 contents-end 8
    plain-list: post-blank 0 | begin 8 end 29 | contents-begin 8 contents-end 29
      item: post-blank 0 | begin 8 end 18 | contents-begin 14 contents-end 18
        paragraph: post-blank 0 | begin 14 end 18 | contents-begin 14 contents-end 18
      item: post-blank 1 | begin 18 end 29 | contents-begin 24 contents-end 28
        paragraph: post-blank 0 | begin 24 end 28 | contents-begin 24 contents-end 28
  item: post-blank 0 | begin 29 end 38 | contents-begin 32 contents-end 38
    paragraph: post-blank 0 | begin 32 end 38 | contents-begin 32 contents-end 38
paragraph: post-blank 0 | begin 38 end 44 | contents-begin 38 contents-end 44
```
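The dual ownership can be checked mechanically from the numbers in the two dumps. Below is a Python sketch; `blank_span` and `shared_blank_owners` are hypothetical helper names, and the sketch assumes that `end` includes the trailing blank lines and that each blank line is exactly one newline character, which holds for these examples.

```python
from itertools import combinations

# (type, begin, end, post_blank), copied from the two dumps above.
FLAT = [
    ("plain-list", 1, 16, 1),
    ("item", 1, 8, 0),
    ("paragraph", 4, 8, 0),
    ("item", 8, 15, 0),
    ("paragraph", 11, 15, 0),
    ("paragraph", 16, 22, 0),
]
NESTED = [
    ("plain-list", 1, 38, 0),
    ("item", 1, 29, 1),
    ("paragraph", 4, 8, 0),
    ("plain-list", 8, 29, 0),
    ("item", 8, 18, 0),
    ("paragraph", 14, 18, 0),
    ("item", 18, 29, 1),
    ("paragraph", 24, 28, 0),
    ("item", 29, 38, 0),
    ("paragraph", 32, 38, 0),
    ("paragraph", 38, 44, 0),
]


def blank_span(node):
    """Positions covered by a node's post-blank (one character per blank line)."""
    _, _, end, post_blank = node
    return range(end - post_blank, end)


def shared_blank_owners(nodes):
    """Return pairs of nodes whose post-blank spans share a position."""
    owners = [node for node in nodes if node[3] > 0]
    return [
        (a, b)
        for a, b in combinations(owners, 2)
        if set(blank_span(a)) & set(blank_span(b))
    ]
```

With this data, `shared_blank_owners(FLAT)` is empty, while `shared_blank_owners(NESTED)` reports the "foo" item (1 to 29) and the "baz" item (18 to 29) as both counting position 28.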
Meaning the post-blank did this movement:
```
plain-list: post-blank 0
  item: post-blank 1         <---<----<----<-\
    paragraph: post-blank 0                  |
    plain-list: post-blank 0         >---->--|
      item: post-blank 0                     |
        paragraph: post-blank 0              |
      item: post-blank 1             <---<---/
        paragraph: post-blank 0
  item: post-blank 0
    paragraph: post-blank 0
paragraph: post-blank 0
```
Question ---> So why is the item "baz" gaining a post-blank instead of the inner plain-list (bar baz) keeping that post-blank?
I would expect it to instead be:
```
            plain-list: post-blank 0
              item: post-blank 1
                paragraph: post-blank 0
here ->         plain-list: post-blank 1
                  item: post-blank 0
                    paragraph: post-blank 0
not here ->       item: post-blank 0
                    paragraph: post-blank 0
              item: post-blank 0
                paragraph: post-blank 0
            paragraph: post-blank 0
```
I re-did both test cases using greater blocks and lesser blocks instead of paragraphs to make sure it wasn't that historical exception at the end of your email, and the post-blank behavior was exactly the same.
--
Tom Alexander
* Re: Clarification on blank lines following list items
From: Ihor Radchenko @ 2023-08-22 7:47 UTC (permalink / raw)
To: Tom Alexander; +Cc: emacs-orgmode
"Tom Alexander" <tom@fizz.buzz> writes:
> 1. Is the following statement true? "Two elements can count the same character in their post-blank?"
This statement ought to be false.
> I am seeing dual-ownership of the post-blank in the examples below, but at the same time if I put a plain-list inside a footnote definition, the footnote definition ends up with sole custody of the post-blank.
It is a bug in the list parser.
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* Re: Clarification on blank lines following list items
From: Ihor Radchenko @ 2023-08-22 8:26 UTC (permalink / raw)
To: Tom Alexander; +Cc: emacs-orgmode
Ihor Radchenko <yantar92@posteo.net> writes:
>> I am seeing dual-ownership of the post-blank in the examples below, but at the same time if I put a plain-list inside a footnote definition, the footnote definition ends up with sole custody of the post-blank.
>
> It is a bug in the list parser.
Fixed, on main.
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=53c9d91d3
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* Re: Clarification on blank lines following list items
From: Tom Alexander @ 2023-08-24 18:46 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: emacs-orgmode
Thanks!
--
Tom Alexander
* Re: Clarification on blank lines following list items
From: Ihor Radchenko @ 2023-09-16 11:26 UTC (permalink / raw)
To: Tom Alexander; +Cc: emacs-orgmode
Ihor Radchenko <yantar92@posteo.net> writes:
> "Tom Alexander" <tom@fizz.buzz> writes:
>
>> I am noticing the list items have some very context-sensitive specific behavior regarding ownership of the trailing blank lines. I was hoping to get some clarification on this (namely, are my observations correct, am I stumbling across a bug, or have I not dug deep enough to suss out the real rules?). The org-mode documentation states:
>>
>>> With the exception of list items and footnote definitions blank lines belong to the preceding element with the narrowest possible scope
>>
>> but it does not state who ends up owning those blank lines.
>
> I can see how this explanation steered you into wrong line of thoughts.
> It should better be explained from the widest scope to the narrowest
> scope, not the opposite.
I tried to clarify things in
https://git.sr.ht/~bzg/worg/commit/ac9de71c1e73ef3a1d63e58e364fc55e83f0214e
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>