* Bug in identification of links?
@ 2020-06-11 22:00 Daniele Nicolodi
2020-06-12 1:19 ` Kyle Meyer
From: Daniele Nicolodi @ 2020-06-11 22:00 UTC
To: emacs-orgmode
Hello,
org-mode fails to recognize https://doi.org/10.1016/0370-1573(89)90087-2
as a valid URL: it ends the link right after the closing parenthesis ")".
I don't understand why. If the ")" character were not allowed in URLs, I
would expect the link to end before it, not after it. I haven't tried to
find the code responsible for this, so I don't know exactly what is going
on. Does anyone have an idea?
Thank you.
Cheers,
Dan
* Re: Bug in identification of links?
From: Kyle Meyer @ 2020-06-12 1:19 UTC
To: Daniele Nicolodi; +Cc: emacs-orgmode
Daniele Nicolodi writes:
> org-mode fails to recognize https://doi.org/10.1016/0370-1573(89)90087-2
> as a valid URL: it ends the link right after the closing parenthesis ")".
> I don't understand why. If the ")" character were not allowed in URLs, I
> would expect the link to end before it, not after it. I haven't tried to
> find the code responsible for this, so I don't know exactly what is going
> on. Does anyone have an idea?
The link is matched by org-link-plain-re, which is created by
org-link-make-regexps. The relevant part looks like this:
  \\([^][ \t\n()<>]+\\(?:([[:word:]0-9_]+)\\|\\([^[:punct:] \t\n]\\|/\\)\\)\\)
                         -----------------
The underlined bit is what is matching "(89)". This subpattern
appeared, without the underscore, in facedba05 (Use John Gruber's
regular expression for URL's, 2009-12-09). The commit message links to
an article [0] that has this to say about the parentheses matching:
    It attempts to be particularly clever with regard to parentheses,
    which, in my experience, only ever seem to occur in the wild in
    Wikipedia URLs, and which many URL matching patterns seem to
    botch. The pattern looks for a single pair of balanced parentheses
    within the URL, which is how it correctly omits the trailing
    parenthesis in the following line:

        (Something like http://foo.com/blah_blah)
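To make the behavior concrete, here is a rough sketch in Python rather than
Emacs Lisp. It is only an approximation of the subpattern quoted above, not
the real org-link-plain-re: the "https?:" prefix, the names PLAIN_LINK and
match_plain_link, and the use of \w for [[:word:]] and ASCII punctuation for
[[:punct:]] are all assumptions made for illustration.

```python
import re
import string

# Assumption: ASCII punctuation stands in for Emacs's [[:punct:]] class.
PUNCT = re.escape(string.punctuation)

# Approximate translation of the quoted subpattern (hypothetical, untested
# against Org itself).
PLAIN_TAIL = (
    r"[^\]\[ \t\n()<>]+"            # run with no brackets, parens, or spaces
    r"(?:\(\w+\)"                   # then ONE parenthesized word, e.g. "(89)"
    r"|[^" + PUNCT + r" \t\n]|/)"   # or one non-punctuation character or "/"
)

# Assumption: a fixed "https?:" prefix replaces Org's list of link types.
PLAIN_LINK = re.compile(r"https?:" + PLAIN_TAIL)


def match_plain_link(text):
    """Return the first plain-link match in TEXT, or None."""
    m = PLAIN_LINK.search(text)
    return m.group(0) if m else None


# The pattern ends right after the single "(89)" group, dropping "90087-2".
print(match_plain_link("https://doi.org/10.1016/0370-1573(89)90087-2"))
# The trailing ")" of the surrounding sentence is left out of the match.
print(match_plain_link("(Something like http://foo.com/blah_blah)"))
```

Under these assumptions, the parenthesized group can only be the last thing
in the match, which is why the link ends after ")" rather than before it.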
That article also has an update that recommends an improved variant of
the pattern. I haven't tested it, but it looks like it would handle your
case.
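As a minimal sketch of the general idea (again in Python), a pattern can
allow any number of single-level balanced parenthesis groups anywhere in the
URL and then trim trailing sentence punctuation. This is not the pattern
from the article's update and has not been tried in Org; the names BODY,
BALANCED_LINK, TRAILING, and match_link_balanced, as well as the "https?:"
prefix, are hypothetical.

```python
import re

# Characters allowed outside and inside a parenthesis group (assumption:
# the same exclusions as the character class in the existing subpattern).
BODY = r"[^\]\[ \t\n()<>]"

# A run of body characters, then zero or more "(...)" groups, each
# optionally followed by more body characters.
BALANCED_LINK = re.compile(
    r"https?:" + BODY + r"+(?:\(" + BODY + r"*\)" + BODY + r"*)*"
)

# Assumption: punctuation that is more likely part of the sentence than
# part of the URL when it appears at the very end.
TRAILING = ".,;:!?'\""


def match_link_balanced(text):
    """Return the first link in TEXT, keeping balanced parentheses."""
    m = BALANCED_LINK.search(text)
    if m is None:
        return None
    return m.group(0).rstrip(TRAILING)


print(match_link_balanced("https://doi.org/10.1016/0370-1573(89)90087-2"))
print(match_link_balanced("(Something like http://foo.com/blah_blah)"))
```

Because an unmatched ")" is never consumed, the sketch still leaves out the
closing parenthesis of a surrounding sentence. It does not handle nested
parentheses.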
This issue has been around for a long time, and it is minor: there will
always be cases that fool the regexp, and they can be handled by
enclosing the link in <...> or [[...]]. Still, in my view it'd be
worth taking a look at tweaking the regexp after the release of v9.4.
[0] https://daringfireball.net/2009/11/liberal_regex_for_matching_urls
Related thread on mailing list:
https://orgmode.org/list/loom.20091130T200527-783@post.gmane.org/