To: emacs-orgmode@gnu.org
From: Max Nikulin
Subject: Re: [BUG] Trailing dash is not included in link [9.7.3 (9.7.3-2f1844
@ /home/mwillcock/.emacs.d/elpa/org-9.7.3/)]
Date: Sun, 16 Jun 2024 22:43:39 +0700
References: <87sexh9ddv.fsf@ice9.digital> <87le37k4c8.fsf@localhost>
On 14/06/2024 21:04, Ihor Radchenko wrote:
> Morgan Willcock writes:
>
>> i.e. Inserting "https://domain/test-" into the buffer will create a
>> clickable link for "https://domain/test".
>>
> I improved the heuristics we use to detect plain links.
> Fixed, on main.
> https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=73da6beb5
> +++ b/etc/ORG-NEWS
[...]
> +*** Trailing =-= is now allowed in plain links
After a look at
7dcb1afb6 2021-03-24 21:27:24 +0800 Ihor Radchenko: Improve org-link-plain-re
I suspect it worked prior to v9.5. Without a unit test, it may be
accidentally broken again.
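A regression test for this could look roughly like the sketch below. The test name is made up and is not part of the Org test suite, and I would expect it to pass only on Org versions that contain the fix.

```elisp
;; Sketch only: `my-test-plain-link-trailing-dash' is a hypothetical name.
(require 'ert)
(require 'org)

(ert-deftest my-test-plain-link-trailing-dash ()
  "A plain link ending in a dash should be matched in full."
  (let* ((url "https://example.org/test-")
         ;; Text matched by the plain link regexp, or nil if no match.
         (matched (and (string-match org-link-plain-re url)
                       (match-string 0 url))))
    (should (equal url matched))))
```

It can be run with M-x ert RET my-test-plain-link-trailing-dash RET.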
> +: https://domain/test-
example.org, example.net, and example.com are domains reserved for use in
examples.
> (or (regexp "[^[:punct:] \t\n]")
I have realized that some Org regexps use the [:punct:] *regexp class*
while others use the punctuation *syntax class* (see the LaTeX math
regexp). I am not sure whether the discrepancy is intentional.
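To illustrate the difference, here is a small probe. The function name is hypothetical. It gives a character word syntax in a temporary buffer and then tries both forms. The "\\s." form follows the buffer's syntax table, so it stops matching. As far as I understand the regexp engine, [:punct:] ignores the syntax table for ASCII characters, so the two results may differ.

```elisp
(defun my-punct-class-vs-syntax (char)
  "Return (CLASS-MATCH SYNTAX-MATCH) for CHAR given word syntax.
Hypothetical helper, for experiments only."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      ;; Pretend CHAR is a word constituent in this buffer only.
      (modify-syntax-entry char "w" table)
      (set-syntax-table table)
      (let ((s (string char)))
        (list (string-match-p "[[:punct:]]" s) ; regexp character class
              (string-match-p "\\s." s))))))   ; punctuation syntax class

(my-punct-class-vs-syntax ?-)
```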
I have noticed that the following change
09ced6d2c 2024-02-03 15:15:46 +0100 Ihor Radchenko: org-link-plain-re: Improve regexp heuristics
causes
(link http://example.org/a
(link http://example.org/a%3Cb)
I expect that ")" should not be parsed as part of the link. Balanced
brackets are tricky with regexps (and it is not possible to match
arbitrarily nested ones).
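One alternative to doing it all in the regexp is to post-process the match with a counter. The sketch below is not what Org does. The function name is hypothetical and only round parentheses are handled.

```elisp
(defun my-trim-unbalanced-close-parens (url)
  "Return URL without trailing \")\" characters that lack a matching \"(\".
Hypothetical helper; only round parentheses are considered."
  (let ((end (length url)))
    (while (and (> end 0)
                (eq (aref url (1- end)) ?\))
                ;; Count parentheses in the candidate prefix [0, END).
                (let ((open 0) (close 0))
                  (dotimes (i end)
                    (let ((c (aref url i)))
                      (cond ((eq c ?\() (setq open (1+ open)))
                            ((eq c ?\)) (setq close (1+ close))))))
                  (> close open)))
      ;; The last ")" is unbalanced, so drop it and check again.
      (setq end (1- end)))
    (substring url 0 end)))

(my-trim-unbalanced-close-parens "http://example.org/a)")
;; => "http://example.org/a"
(my-trim-unbalanced-close-parens "http://example.org/wiki/Foo_(bar)")
;; => "http://example.org/wiki/Foo_(bar)"
```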
Perhaps "[^[:punct:] \t\n]" is too strict with respect to spaces. It does
not allow the recommended workaround with a zero-width space:
(org-export-string-as
"http://example.org\N{ZERO WIDTH SPACE}[fn::footnote]" 'html 'body)
"
http://example.org[fn::footnote]
"
Actually, some kind of non-breaking space would be better in such cases:
(org-export-string-as
"http://example.org\N{NO-BREAK SPACE}[fn::footnote]" 'html 'body)
"
http://example.org [fn::footnote]
"
I would consider [:space:] or \s-.
As to the original bug report, while reading it I noticed that
Thunderbird includes the dash in the recognized link for
"https://domain/test-"
I decided to look into its implementation and, to my surprise, found:
``punctation chars and "-" at the end are stipped off.'' I realized that
double quotes, along with angle brackets, are treated as a recommended way
to mark URLs in plain text. Thunderbird does not consider a dash to be part
of a link for e.g. http://example.org/t- This might be an attempt to
preserve the possibility of reassembling URLs wrapped across several lines
with added hyphenation marks, but that has not been implemented (RFC2396
appendix E warns about accidentally added hyphens).
https://www.bucksch.org/1/projects/mozilla/16507/
https://searchfox.org/mozilla-central/source/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp#line-243
mozTXTToHTMLConv::FindURLEnd
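For undelimited URLs, the rule quoted above amounts to something like the following sketch. It is not a port of the Thunderbird code. The function name is hypothetical, and the exact set of characters is my assumption.

```elisp
(defun my-strip-trailing-punct (url)
  "Return URL without a trailing run of punctuation or \"-\".
Hypothetical helper; the character set is an assumption."
  ;; \\' anchors the run of stripped characters at the end of the string.
  (if (string-match "[-.,;:!?]+\\'" url)
      (substring url 0 (match-beginning 0))
    url))

(my-strip-trailing-punct "http://example.org/t-")
;; => "http://example.org/t"
(my-strip-trailing-punct "http://example.org/a.b")
;; => "http://example.org/a.b"
```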
The implementation is tricky; I have not noticed anything that could be
reused to improve the heuristics for Org. Nowadays it is likely better to
inspect the autolinking code for GitHub/GitLab or widely used Python packages.