* bug#23343: 25.0.93; URI schemes are not regexp-quoted for `goto-address-url-regexp'
@ 2016-04-23 13:51 Phil Sainty
  2016-04-23 14:02 ` bug#23343: 25.0.93; [PATCH] " Phil Sainty
  0 siblings, 1 reply; 6+ messages in thread
From: Phil Sainty @ 2016-04-23 13:51 UTC (permalink / raw)
  To: 23343

The URI schemes from `thing-at-point-uri-schemes' are concatenated
into a regexp group without being regexp-quoted, so any scheme that
contains a regexp meta-character is matched incorrectly, e.g.:

bzr+ssh://
iris.beep:
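
A minimal sketch of the failure, using only `string-match-p' and
`regexp-quote' (the host names and sample strings are made up for
illustration):

```elisp
;; Used unquoted, "+" is a repetition operator and "." matches any character.
(string-match-p "bzr+ssh://" "bzr+ssh://host/repo")   ; => nil (literal "+" never matches)
(string-match-p "bzr+ssh://" "bzrrssh://host/repo")   ; => 0   (one or more "r")
(string-match-p "iris.beep:" "irisXbeep:")            ; => 0   ("." matches "X")

;; `regexp-quote' escapes the meta-characters so each scheme matches literally.
(regexp-quote "bzr+ssh://")                                         ; => "bzr\\+ssh://"
(string-match-p (regexp-quote "bzr+ssh://") "bzr+ssh://host/repo")  ; => 0
(string-match-p (regexp-quote "iris.beep:") "irisXbeep:")           ; => nil
```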

* bug#23343: 25.0.93; [PATCH] URI schemes are not regexp-quoted for `goto-address-url-regexp'
  2016-04-23 13:51 bug#23343: 25.0.93; URI schemes are not regexp-quoted for `goto-address-url-regexp' Phil Sainty
@ 2016-04-23 14:02 ` Phil Sainty
  2016-04-23 22:48   ` Phil Sainty
  0 siblings, 1 reply; 6+ messages in thread
From: Phil Sainty @ 2016-04-23 14:02 UTC (permalink / raw)
  To: 23343

This could be resolved with

-   (mapconcat 'identity
+   (mapconcat 'regexp-quote

But on account of 
http://stackoverflow.com/questions/36787889/how-to-remove-some-link-type-in-emacs
I thought that making it easier to customise this feature was a useful
improvement to make.
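
For reference, the one-line fix on its own would look roughly like the
sketch below. `example-scheme-alternation' is a hypothetical helper
written for this illustration; it is not part of goto-addr.el, and the
scheme list is a made-up sample.

```elisp
(defun example-scheme-alternation (schemes)
  "Return a regexp group matching any of SCHEMES literally.
Hypothetical helper for illustration; not part of goto-addr.el."
  (concat "\\<\\("
          ;; Quote each scheme before joining, instead of using `identity'.
          (mapconcat #'regexp-quote schemes "\\|")
          "\\)"))

(example-scheme-alternation '("http://" "bzr+ssh://" "iris.beep:"))
;; => "\\<\\(http://\\|bzr\\+ssh://\\|iris\\.beep:\\)"
```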


-Phil

[-- Attachment #2: 0001-Fix-goto-address-url-regexp.patch --]
[-- Type: text/x-patch, Size: 2558 bytes --]

From 0d7a39a1c56700020127c01a622d13c27aec498d Mon Sep 17 00:00:00 2001
From: Phil Sainty <psainty@orcon.net.nz>
Date: Sun, 24 Apr 2016 01:40:47 +1200
Subject: [PATCH] Fix `goto-address-url-regexp'

* lisp/net/goto-addr.el: The URI schemes to be recognised by
`goto-address-mode' were not regexp-quoted. (Bug#23343)
---
 lisp/net/goto-addr.el | 36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/lisp/net/goto-addr.el b/lisp/net/goto-addr.el
index bc3c403..e4bbf76 100644
--- a/lisp/net/goto-addr.el
+++ b/lisp/net/goto-addr.el
@@ -59,6 +59,7 @@
 
 ;;; Code:
 
+(require 'seq)
 (require 'thingatpt)
 (autoload 'browse-url-url-at-point "browse-url")
 
@@ -101,23 +102,26 @@ goto-address-mail-regexp
   "[-a-zA-Z0-9=._+]+@\\([-a-zA-z0-9_]+\\.\\)+[a-zA-Z0-9]+"
   "A regular expression probably matching an e-mail address.")
 
+(defvar goto-address-uri-schemes-ignored
+  ;; By default we exclude `mailto:' (email addresses are matched
+  ;; by `goto-address-mail-regexp') and also `data:', as it is not
+  ;; terribly useful to follow those URIs, and leaving them causes
+  ;; `use Data::Dumper;' to be fontified oddly in Perl files.
+  '("mailto:" "data:")
+  "List of URI schemes to exclude from `goto-address-uri-schemes'.")
+
+(defvar goto-address-uri-schemes
+  ;; We use `thing-at-point-uri-schemes', with a few exclusions,
+  ;; as listed in `goto-address-uri-schemes-ignored'.
+  (seq-reduce (lambda (accum elt) (delete elt accum))
+              goto-address-uri-schemes-ignored
+              (copy-sequence thing-at-point-uri-schemes))
+  "List of URI schemes matched by `goto-address-url-regexp'.")
+
 (defvar goto-address-url-regexp
-  (concat
-   "\\<\\("
-   (mapconcat 'identity
-              (delete "mailto:"
-		      ;; Remove `data:', as it's not terribly useful to follow
-		      ;; those.  Leaving them causes `use Data::Dumper;' to be
-		      ;; fontified oddly in Perl files.
-                      (delete "data:"
-                              (copy-sequence thing-at-point-uri-schemes)))
-              "\\|")
-   "\\)"
-   thing-at-point-url-path-regexp)
-  ;; (concat "\\b\\(s?https?\\|ftp\\|file\\|gopher\\|news\\|"
-  ;; 	  "telnet\\|wais\\):\\(//[-a-zA-Z0-9_.]+:"
-  ;; 	  "[0-9]*\\)?[-a-zA-Z0-9_=?#$@~`%&*+|\\/.,]*"
-  ;; 	  "[-a-zA-Z0-9_=#$@~`%&*+|\\/]")
+  (concat "\\<"
+          (regexp-opt goto-address-uri-schemes t)
+          thing-at-point-url-path-regexp)
   "A regular expression probably matching a URL.")
 
 (defvar goto-address-highlight-keymap
-- 
2.8.0
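
The two building blocks of the patch can be tried in isolation. This is
only a sketch: the `example-' functions and the short scheme lists are
hypothetical stand-ins for the real variables, and the path part of the
regexp (`thing-at-point-url-path-regexp') is left out.

```elisp
(require 'seq)

(defun example-filter-schemes (all ignored)
  "Return a copy of ALL with every element of IGNORED removed.
Hypothetical helper mirroring the `seq-reduce' form in the patch."
  (seq-reduce (lambda (accum elt) (delete elt accum))
              ignored
              (copy-sequence all)))   ; copy first, since `delete' is destructive

(defun example-url-prefix-regexp (schemes)
  "Return a regexp matching any of SCHEMES at the start of a word.
`regexp-opt' quotes each string; the second argument t adds the group."
  (concat "\\<" (regexp-opt schemes t)))

(let* ((kept (example-filter-schemes
              '("http://" "mailto:" "data:" "bzr+ssh://")
              '("mailto:" "data:")))
       (re (example-url-prefix-regexp kept)))
  (list kept
        (string-match-p re "bzr+ssh://host/repo")
        (string-match-p re "mailto:someone")))
;; => (("http://" "bzr+ssh://") 0 nil)
```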



* bug#23343: 25.0.93; [PATCH] URI schemes are not regexp-quoted for `goto-address-url-regexp'
  2016-04-23 14:02 ` bug#23343: 25.0.93; [PATCH] " Phil Sainty
@ 2016-04-23 22:48   ` Phil Sainty
  2016-04-24 11:36     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 6+ messages in thread
From: Phil Sainty @ 2016-04-23 22:48 UTC (permalink / raw)
  To: 23343

I've made docstring changes to point out that it's only useful to
set goto-address-uri-schemes-ignored and goto-address-uri-schemes
prior to loading the library.

I'm not sure if there's any policy about such things, but it seemed
like it would be quite a lot more work to get around that. I imagine
the :set ability of defcustom would make it possible to have changes
to those variables dynamically update goto-address-url-regexp as well;
but in doing that you'd need to take care not to clobber values which
had themselves been customized also, and it all seemed like a lot of
added complexity for little benefit.
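
To make the trade-off concrete, here is a rough sketch of what the
`:set' approach might look like. It is not what the patch does. The
`example-' names are hypothetical, and the setter deliberately shows
the problem described above: it overwrites the derived regexp even if
that was customized separately.

```elisp
(defvar example-url-regexp nil
  "Hypothetical derived regexp, rebuilt by the setter below.")

(defcustom example-uri-schemes '("http://" "https://")
  "Hypothetical list of URI schemes to match."
  :type '(repeat string)
  :group 'convenience
  :set (lambda (symbol value)
         (set-default symbol value)
         ;; Recompute the dependent variable on every change.  This
         ;; clobbers any value the user gave `example-url-regexp'.
         (setq example-url-regexp
               (concat "\\<" (regexp-opt value t)))))

;; Changing the option through Custom re-runs the setter:
(customize-set-variable 'example-uri-schemes '("gopher://"))
(string-match-p example-url-regexp "gopher://host")   ; => 0
(string-match-p example-url-regexp "http://host")     ; => nil
```

Note that a plain `setq' of the option would bypass the setter, which
is part of why this adds complexity.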


-Phil

[-- Attachment #2: 0001-Fix-goto-address-url-regexp.patch --]
[-- Type: text/x-patch, Size: 2702 bytes --]

From 9cbe7647f7f1558274afaabb26c84d85f3e6c6a4 Mon Sep 17 00:00:00 2001
From: Phil Sainty <psainty@orcon.net.nz>
Date: Sun, 24 Apr 2016 01:40:47 +1200
Subject: [PATCH] Fix `goto-address-url-regexp'

* lisp/net/goto-addr.el: The URI schemes to be recognised by
`goto-address-mode' were not regexp-quoted. (Bug#23343)
---
 lisp/net/goto-addr.el | 40 ++++++++++++++++++++++++----------------
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/lisp/net/goto-addr.el b/lisp/net/goto-addr.el
index bc3c403..e8d1b62 100644
--- a/lisp/net/goto-addr.el
+++ b/lisp/net/goto-addr.el
@@ -59,6 +59,7 @@
 
 ;;; Code:
 
+(require 'seq)
 (require 'thingatpt)
 (autoload 'browse-url-url-at-point "browse-url")
 
@@ -101,23 +102,30 @@ goto-address-mail-regexp
   "[-a-zA-Z0-9=._+]+@\\([-a-zA-z0-9_]+\\.\\)+[a-zA-Z0-9]+"
   "A regular expression probably matching an e-mail address.")
 
+(defvar goto-address-uri-schemes-ignored
+  ;; By default we exclude `mailto:' (email addresses are matched
+  ;; by `goto-address-mail-regexp') and also `data:', as it is not
+  ;; terribly useful to follow those URIs, and leaving them causes
+  ;; `use Data::Dumper;' to be fontified oddly in Perl files.
+  '("mailto:" "data:")
+  "List of URI schemes to exclude from `goto-address-uri-schemes'.
+
+Customisations made after goto-addr is loaded will have no effect.")
+
+(defvar goto-address-uri-schemes
+  ;; We use `thing-at-point-uri-schemes', with a few exclusions,
+  ;; as listed in `goto-address-uri-schemes-ignored'.
+  (seq-reduce (lambda (accum elt) (delete elt accum))
+              goto-address-uri-schemes-ignored
+              (copy-sequence thing-at-point-uri-schemes))
+  "List of URI schemes matched by `goto-address-url-regexp'.
+
+Customisations made after goto-addr is loaded will have no effect.")
+
 (defvar goto-address-url-regexp
-  (concat
-   "\\<\\("
-   (mapconcat 'identity
-              (delete "mailto:"
-		      ;; Remove `data:', as it's not terribly useful to follow
-		      ;; those.  Leaving them causes `use Data::Dumper;' to be
-		      ;; fontified oddly in Perl files.
-                      (delete "data:"
-                              (copy-sequence thing-at-point-uri-schemes)))
-              "\\|")
-   "\\)"
-   thing-at-point-url-path-regexp)
-  ;; (concat "\\b\\(s?https?\\|ftp\\|file\\|gopher\\|news\\|"
-  ;; 	  "telnet\\|wais\\):\\(//[-a-zA-Z0-9_.]+:"
-  ;; 	  "[0-9]*\\)?[-a-zA-Z0-9_=?#$@~`%&*+|\\/.,]*"
-  ;; 	  "[-a-zA-Z0-9_=#$@~`%&*+|\\/]")
+  (concat "\\<"
+          (regexp-opt goto-address-uri-schemes t)
+          thing-at-point-url-path-regexp)
   "A regular expression probably matching a URL.")
 
 (defvar goto-address-highlight-keymap
-- 
2.8.0



* bug#23343: 25.0.93; [PATCH] URI schemes are not regexp-quoted for `goto-address-url-regexp'
  2016-04-23 22:48   ` Phil Sainty
@ 2016-04-24 11:36     ` Lars Magne Ingebrigtsen
  2016-04-24 13:48       ` Phil Sainty
  0 siblings, 1 reply; 6+ messages in thread
From: Lars Magne Ingebrigtsen @ 2016-04-24 11:36 UTC (permalink / raw)
  To: Phil Sainty; +Cc: 23343

Phil Sainty <psainty@orcon.net.nz> writes:

> I've made docstring changes to point out that it's only useful to
> set goto-address-uri-schemes-ignored and goto-address-uri-schemes
> prior to loading the library.
>
> I'm not sure if there's any policy about such things, but it seemed
> like it would be quite a lot more work to get around that. I imagine
> the :set ability of defcustom would make it possible to have changes
> to those variables dynamically update goto-address-url-regexp as well;
> but in doing that you'd need to take care not to clobber values which
> had themselves been customized also, and it all seemed like a lot of
> added complexity for little benefit.

Having variable defaults depend on each other can be awkward, especially
when it's the "first" variable that users will realistically be
tweaking.

[...]

> +(defvar goto-address-uri-schemes-ignored
> +  ;; By default we exclude `mailto:' (email addresses are matched
> +  ;; by `goto-address-mail-regexp') and also `data:', as it is not
> +  ;; terribly useful to follow those URIs, and leaving them causes
> +  ;; `use Data::Dumper;' to be fontified oddly in Perl files.
> +  '("mailto:" "data:")

Which is this one.  But I don't really see how to fix that without
having a different interface here (i.e., getting rid of
goto-address-url-regexp), so perhaps it's OK...

But:

> +(defvar goto-address-uri-schemes
> +  ;; We use `thing-at-point-uri-schemes', with a few exclusions,
> +  ;; as listed in `goto-address-uri-schemes-ignored'.
> +  (seq-reduce (lambda (accum elt) (delete elt accum))
> +              goto-address-uri-schemes-ignored
> +              (copy-sequence thing-at-point-uri-schemes))

I don't see much value in this "intermediate" variable.  It just makes
things even more complicated to work with, I think.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


* bug#23343: 25.0.93; [PATCH] URI schemes are not regexp-quoted for `goto-address-url-regexp'
  2016-04-24 11:36     ` Lars Magne Ingebrigtsen
@ 2016-04-24 13:48       ` Phil Sainty
  2019-06-25 14:02         ` Lars Ingebrigtsen
  0 siblings, 1 reply; 6+ messages in thread
From: Phil Sainty @ 2016-04-24 13:48 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: 23343

On 24/04/16 23:36, Lars Magne Ingebrigtsen wrote:
 > Having variable defaults depend on each other can be awkward, especially
 > when it's the "first" variable that users will realistically be
 > tweaking.

Yes indeed. I don't especially like the way I've done it -- but it
seemed like an improvement, nevertheless.

 >> +(defvar goto-address-uri-schemes
 >
 > I don't see much value in this "intermediate" variable.  It just makes
 > things even more complicated to work with, I think.

It's true that this one is an intermediate by default; but it's one
which users might potentially want to set in order to say "I don't
care about the default schemes at all; I only want the specific
schemes I've specified here to be matched."

So of the two new variables I'm adding, a user might set one or the
other depending on their needs. That was my line of thought, anyhow.
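
As a sketch of the two use cases in an init file, assuming an Emacs
that includes this patch (the particular schemes chosen are just
examples). Both variables are `defvar's, so they have to be set before
goto-addr is loaded:

```elisp
;; Case 1: keep the defaults but drop a few more schemes.
;; (setq goto-address-uri-schemes-ignored '("mailto:" "data:" "news:"))

;; Case 2: ignore the defaults entirely and list the wanted schemes.
(setq goto-address-uri-schemes '("http://" "https://"))

;; `defvar' keeps a value that is already set, so the regexp below is
;; built from the list above when the library loads.
(require 'goto-addr)

(string-match-p goto-address-url-regexp "https://example.org")  ; => 0
(string-match-p goto-address-url-regexp "ftp://example.org")    ; => nil
```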

-Phil


* bug#23343: 25.0.93; [PATCH] URI schemes are not regexp-quoted for `goto-address-url-regexp'
  2016-04-24 13:48       ` Phil Sainty
@ 2019-06-25 14:02         ` Lars Ingebrigtsen
  0 siblings, 0 replies; 6+ messages in thread
From: Lars Ingebrigtsen @ 2019-06-25 14:02 UTC (permalink / raw)
  To: Phil Sainty; +Cc: 23343

Phil Sainty <psainty@orcon.net.nz> writes:

> It's true that this one is an intermediate by default; but it's one
> which users might potentially want to set in order to say "I don't
> care about the default schemes at all; I only want the specific
> schemes I've specified here to be matched."
>
> So of the two new variables I'm adding, a user might set one or the
> other depending on their needs. That was my line of thought, anyhow.

That's true.  I've now applied your patch to the Emacs trunk.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
