emacs-orgmode@gnu.org archives
* Possible bug getting bounds of URL at point?
@ 2024-07-16 20:10 Karl Fogel
  2024-07-16 20:21 ` Ihor Radchenko
  0 siblings, 1 reply; 5+ messages in thread
From: Karl Fogel @ 2024-07-16 20:10 UTC (permalink / raw)
  To: Org Mode

In Org Mode buffers, `bounds-of-thing-at-point-provider-alist' 
names an Org-Mode-specific URL provider:

  ((url . org--bounds-of-link-at-point))

That handler is defined in org.el:

  (defun org--bounds-of-link-at-point ()
    "`bounds-of-thing-at-point' provider function."
    (let ((context (org-element-context)))
      (when (eq (org-element-type context) 'link)
        (cons (org-element-begin context)
              (org-element-end context)))))

(This is in the tree as of today, commit f2141541b45.)

I think this is causing URL boundaries to be calculated 
incorrectly.

REPRODUCTION:

Assume we have this line in an Org Mode buffer (note there are 
three trailing spaces after the final "m" -- hopefully the MTAs 
and MUAs will leave those spaces there):

  https://example.com   

Let's say the initial "h" is at position 22205, the position right 
after the final "m" is 22224, and the final position on the line 
(after the three spaces) is 22227.

With point anywhere inside the URL, if I run 
(bounds-of-thing-at-point 'url), I currently get this result:

  (22205 . 22227)

But I expected this result instead:

  (22205 . 22224)

Is (22205 . 22227) correct, and I'm just misunderstanding how URL 
boundaries are supposed to work in Org Mode?
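
The reproduction above can be sketched as a small helper. This is only an illustration: `my/url-bounds-in-org-line' is a hypothetical name, and the result depends on whether the running Emacs and Org register the Org provider (the thread refers to commit f2141541b45). In a fresh temporary buffer the URL starts at position 1, so the two results under discussion correspond to (1 . 23) and (1 . 20) rather than the positions quoted above.

```elisp
(require 'org)
(require 'thingatpt)

;; Hypothetical helper, not part of Org or thingatpt.
(defun my/url-bounds-in-org-line (line offset)
  "Insert LINE into a temporary Org buffer and return the URL bounds.
Point is placed OFFSET characters after the start of LINE."
  (with-temp-buffer
    (org-mode)
    (insert line "\n")
    (goto-char (+ (point-min) offset))
    (bounds-of-thing-at-point 'url)))

;; Three trailing spaces after the final "m", as in the report.
(my/url-bounds-in-org-line "https://example.com   " 5)
```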

I haven't yet debugged into `org-element-end' (nor into 
`org-element-property', which is what `org-element-end' wraps). 
First I want to check that my expectations are correct.

Is there a bug here?

Best regards,
-Karl



* Re: Possible bug getting bounds of URL at point?
  2024-07-16 20:10 Possible bug getting bounds of URL at point? Karl Fogel
@ 2024-07-16 20:21 ` Ihor Radchenko
  2024-07-16 20:34   ` Karl Fogel
  0 siblings, 1 reply; 5+ messages in thread
From: Ihor Radchenko @ 2024-07-16 20:21 UTC (permalink / raw)
  To: Karl Fogel; +Cc: Org Mode

Karl Fogel <kfogel@red-bean.com> writes:

> In Org Mode buffers, `bounds-of-thing-at-point-provider-alist' 
> names an Org-Mode-specific URL provider:
> ...
> I think this is causing URL boundaries to be calculated 
> incorrectly.
>
> REPRODUCTION:
>
> Assume we have this line in an Org Mode buffer (note there are 
> three trailing spaces after the final "m" -- hopefully the MTAs 
> and MUAs will leave those spaces there):
>
>   https://example.com   
>
> Let's say the initial "h" is at position 22205, the position right 
> after the final "m" is 22224, and the final position on the line 
> (after the three spaces) is 22227.
>
> With point anywhere inside the URL, if I run 
> (bounds-of-thing-at-point 'url), I currently get this result:
>
>   (22205 . 22227)
>
> But I expected this result instead:
>
>   (22205 . 22224)
>
> Is (22205 . 22227) correct, and I'm just misunderstanding how URL 
> boundaries are supposed to work in Org Mode?

This is correct. Trailing whitespace belongs to the preceding node in
Org syntax. This is not a bug.

Moreover, if you have something like
[[https://orgmode.org][description]]
the whole thing will be considered a URL.
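
For code that needs the link without the trailing whitespace, one possible approach is to subtract the element's `:post-blank' count from its `:end'. The following is a minimal sketch, assuming the standard `:begin', `:end', and `:post-blank' properties of Org elements; `my/org-link-bounds-sans-blank' is a hypothetical name, not an Org function.

```elisp
(require 'org)
(require 'org-element)

;; Hypothetical helper, not part of Org.
(defun my/org-link-bounds-sans-blank ()
  "Return bounds of the Org link at point, excluding trailing blanks.
Return nil when point is not on a link."
  (let ((context (org-element-context)))
    (when (eq (org-element-type context) 'link)
      (cons (org-element-property :begin context)
            ;; :end includes trailing whitespace; :post-blank counts it.
            (- (org-element-property :end context)
               (or (org-element-property :post-blank context) 0))))))
```

For a bracket link this still covers the brackets and description, since those are part of the link object itself; only the whitespace after it is dropped.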

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Possible bug getting bounds of URL at point?
  2024-07-16 20:21 ` Ihor Radchenko
@ 2024-07-16 20:34   ` Karl Fogel
  2024-07-17 14:37     ` Ihor Radchenko
  0 siblings, 1 reply; 5+ messages in thread
From: Karl Fogel @ 2024-07-16 20:34 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Org Mode

On 16 Jul 2024, Ihor Radchenko wrote:
>> Assume we have this line in an Org Mode buffer (note there are 
>> three trailing spaces after the final "m" -- hopefully the MTAs 
>> and MUAs will leave those spaces there):
>>
>>   https://example.com   
>>
>> Let's say the initial "h" is at position 22205, the position
>> right after the final "m" is 22224, and the final position on
>> the line (after the three spaces) is 22227.
>>
>> With point anywhere inside the URL, if I run 
>> (bounds-of-thing-at-point 'url), I currently get this result:
>>
>>   (22205 . 22227)
>>
>> But I expected this result instead:
>>
>>   (22205 . 22224)
>>
>> Is (22205 . 22227) correct, and I'm just misunderstanding how
>> URL boundaries are supposed to work in Org Mode?
>
>This is correct. Trailing whitespace belongs to the preceding
>node in Org syntax. This is not a bug.
>
>Moreover, if you have something like
>[[https://orgmode.org][description]]
>the whole thing will be considered a URL.

Thank you, Ihor.

I admit that I don't immediately understand why this is a good 
thing.  The user asked for the bounds of the URL at point, but got 
instead the bounds of some other thing (the Org "node"). 
Especially in the case of a standalone URL, with no description 
text, I don't see how including the whitespace is useful.

However, there could be issues here that I'm not familiar with. 
It sounds like you've already thought this out and concluded that 
including the trailing whitespace is the right behavior.  If 
you have time to explain why in more detail, I'd appreciate a 
chance to learn more about it.  However, if you don't have time to 
do that, it's no problem.

I can change the code I'm writing to do things a different way, so 
this behavior need not interfere with my current task, in any 
case.

Best regards,
-Karl



* Re: Possible bug getting bounds of URL at point?
  2024-07-16 20:34   ` Karl Fogel
@ 2024-07-17 14:37     ` Ihor Radchenko
  2024-07-17 17:05       ` Karl Fogel
  0 siblings, 1 reply; 5+ messages in thread
From: Ihor Radchenko @ 2024-07-17 14:37 UTC (permalink / raw)
  To: Karl Fogel; +Cc: Org Mode

Karl Fogel <kfogel@red-bean.com> writes:

> I admit that I don't immediately understand why this is a good 
> thing.  The user asked for the bounds of the URL at point, but got 
> instead the bounds of some other thing (the Org "node"). 
> Especially in the case of a standalone URL, with no description 
> text, I don't see how including the whitespace is useful.
>
> However, there could be issues here that I'm not familiar with. 
> It sounds like you've already thought this out and concluded that 
> including the trailing whitespace is the right behavior.  If 
> you have time to explain why in more detail, I'd appreciate a 
> chance to learn more about it.  However, if you don't have time to 
> do that, it's no problem.

The notion of "URL", and especially "URL at point" in Org mode needs to
be special. Consider something like

[[https://orgmode.org][this is a very long and /convoluted/
description of this url; all the text here is clickable as a link]].

Org mode will consider point anywhere inside the link as "at URL".
That "URL" will be https://orgmode.org, and it is indeed what
(thing-at-point 'url) will return on that link in Org mode, even when
point is on the link description.
Hope it makes sense.

What would not make sense in such a scenario is for
(bounds-of-thing-at-point 'url) to return bounds that do not include
point. So we instead return the bounds of the relevant syntax object,
the link object. That object includes the description, the brackets,
and the whitespace after.

There is no reason to make plain links special in this regard, so we
don't.
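
To compare the two calls side by side, a small helper can return both the URL string and the buffer text covered by the reported bounds. This is a sketch only; `my/describe-url-at-point' is a hypothetical name, and what it returns in an Org buffer depends on which providers the running Emacs and Org versions register.

```elisp
(require 'thingatpt)

;; Hypothetical helper, not part of Org or thingatpt.
(defun my/describe-url-at-point ()
  "Return (URL . TEXT) for the `url' thing at point, or nil.
URL is what `thing-at-point' returns; TEXT is the buffer text
covered by `bounds-of-thing-at-point'."
  (let ((url (thing-at-point 'url))
        (bounds (bounds-of-thing-at-point 'url)))
    (when bounds
      (cons url
            (buffer-substring-no-properties (car bounds)
                                            (cdr bounds))))))
```

Per the explanation above, with point on the description of a bracket link in an Org buffer, URL would be the link target while TEXT would span the whole link object.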

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: Possible bug getting bounds of URL at point?
  2024-07-17 14:37     ` Ihor Radchenko
@ 2024-07-17 17:05       ` Karl Fogel
  0 siblings, 0 replies; 5+ messages in thread
From: Karl Fogel @ 2024-07-17 17:05 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Org Mode

On 17 Jul 2024, Ihor Radchenko wrote:
>The notion of "URL", and especially "URL at point" in Org mode
>needs to be special. Consider something like
>
>[[https://orgmode.org][this is a very long and /convoluted/
>description of this url; all the text here is clickable as a
>link]].
>
>Org mode will consider point anywhere inside the link as "at
>URL". That "URL" will be https://orgmode.org, and it is indeed
>what (thing-at-point 'url) will return on that link in Org mode,
>even when point is on the link description.
>Hope it makes sense.
>
>What would not make sense in such a scenario is for
>(bounds-of-thing-at-point 'url) to return bounds that do not
>include point. So we instead return the bounds of the relevant
>syntax object, the link object. That object includes the
>description, the brackets, and the whitespace after.
>
>There is no reason to make plain links special in this regard, so
>we don't.

Thank you for the explanation, Ihor.

I'm sure there are people depending on the fact that the rest of 
the line, after a plain link, is still part of the same node.  My 
code does a manipulation of just the link itself -- for example, 
in one keystroke, it turns

  https://example.org/

into

  [[https://example.org/][example.org]]

In order for that to work, I needed the bounds to be the start and 
end of the link text itself.  But that was easy to do: I just 
temporarily override `bounds-of-thing-at-point-provider-alist' in 
Org Mode, so that I get the default thingatpt handler instead.

So: problem solved.
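
The workaround described above might look roughly like the following. This is a sketch under assumptions, not the actual code from this message: the `my/' names are hypothetical, using the host name as the description is a guess based on the example, and `bounds-of-thing-at-point-provider-alist' is assumed to be the thingatpt variable named earlier in the thread.

```elisp
(require 'thingatpt)
(require 'url-parse)

;; Declare the variable special so the `let' below binds it dynamically
;; even on an Emacs that does not define it.
(defvar bounds-of-thing-at-point-provider-alist)

;; Hypothetical helper: bypass mode-specific providers for one call.
(defun my/plain-url-bounds ()
  "Return URL bounds at point using the default thingatpt handler."
  (let ((bounds-of-thing-at-point-provider-alist nil))
    (bounds-of-thing-at-point 'url)))

;; Hypothetical command: wrap the plain URL at point in an Org
;; bracket link, using the host name as the description.
(defun my/bracketize-url-at-point ()
  "Turn the plain URL at point into [[URL][HOST]]."
  (interactive)
  (let ((bounds (my/plain-url-bounds)))
    (when bounds
      (let* ((url (buffer-substring-no-properties (car bounds)
                                                  (cdr bounds)))
             (host (url-host (url-generic-parse-url url))))
        (save-excursion
          (goto-char (car bounds))
          (delete-region (car bounds) (cdr bounds))
          (insert (format "[[%s][%s]]" url host)))))))
```

Binding the alist with `let' keeps the override local to the one call, so the Org provider remains in effect for everything else in the buffer.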

Best regards,
-Karl

