unofficial mirror of emacs-devel@gnu.org 
* [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
@ 2007-08-30  7:15 Richard Stallman
  2007-08-31  8:32 ` Glenn Morris
  2007-09-01 21:00 ` (thing-at-point 'defun) always returns NIL (was: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]) Leo
  0 siblings, 2 replies; 22+ messages in thread
From: Richard Stallman @ 2007-08-30  7:15 UTC (permalink / raw)
  To: emacs-devel

Would someone please DTRT and ack?

------- Start of forwarded message -------
From: Leo <sdl.web@gmail.com>
To: emacs-pretest-bug@gnu.org
Date: Wed, 29 Aug 2007 19:40:57 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: 
Subject: 23.0.0; (thing-at-point 'url) returns invalid urls


Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

Put point on any word that is not part of a URL and evaluate
(thing-at-point 'url): it returns a URL like "http://something", where
'something' is the word under point.

If Emacs crashed, and you have the Emacs process in the gdb debugger,
please include the output from the following gdb commands:
    `bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/usr/local/packages/emacs/share/emacs/23.0.0/etc/DEBUG for instructions.


In GNU Emacs 23.0.0.8 (i686-pc-linux-gnu, GTK+ Version 2.10.14)
 of 2007-08-29 on sl392.st-edmunds.cam.ac.uk
Windowing system distributor `The X.Org Foundation', version 11.0.10300000
configured using `configure  '--prefix=/usr/local/packages/emacs' '--with-kerberos5' '--enable-locallisppath=/usr/local/share/emacs/site-lisp' '--without-toolkit-scroll-bars' '--with-xft' '--enable-font-backend' '--with-x-toolkit=yes''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_GB.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: ERC

Minor modes in effect:
  erc-spelling-mode: t
  erc-page-mode: t
  erc-menu-mode: t
  erc-services-mode: t
  erc-autojoin-mode: t
  erc-button-mode: t
  erc-ring-mode: t
  erc-pcomplete-mode: t
  erc-track-mode: t
  erc-track-minor-mode: t
  erc-match-mode: t
  erc-fill-mode: t
  erc-stamp-mode: t
  erc-netsplit-mode: t
  erc-smiley-mode: t
  erc-readonly-mode: t
  erc-scrolltobottom-mode: t
  flyspell-mode: t
  dired-omit-mode: t
  recentf-mode: t
  icomplete-mode: t
  show-paren-mode: t
  savehist-mode: t
  xterm-mouse-mode: t
  delete-selection-mode: t
  global-auto-revert-mode: t
  display-time-mode: t
  minibuffer-indicate-depth-mode: t
  partial-completion-mode: t
  which-function-mode: t
  shell-dirtrack-mode: t
  tooltip-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
t h e SPC b r o w s e r SPC c a n SPC n o t SPC f i 
n d <return> C-x o <tab> <return> <down> <down> <down> 
<down> <down> <down> <down> <down> <down> <down> <down> 
<down> <down> <down> <down> <down> <down> <down> <down> 
<down> <down> <down> <down> <down> <down> <down> <down> 
<down> <down> <down> <down> <down> <down> <down> <down> 
<down> <down> <down> <up> <up> <up> <up> <up> <up> 
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up> 
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up> 
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up> 
<up> <up> <up> <up> <up> <up> C-x o <down> <down> <down> 
<down> <up> <up> <down> <down> <down> <left> <left> 
<left> <left> <left> <left> <left> <left> <left> <left> 
<left> <left> <left> <left> <left> <left> <left> <left> 
<left> <left> <left> <left> <left> <left> <left> C-x 
0 C-x k <return> C-x b <return> i SPC w a n t SPC t 
o SPC a v o i d SPC d <backspace> o p e n SPC m y SPC 
b r o w s e r SPC t o SPC a SPC u r l SPC l i k e SPC 
t h a t <return> <up> <up> <up> <up> <up> <up> <up> 
<up> <up> <down> <right> <right> <right> <right> <right> 
<right> <right> <right> <right> <right> <right> <right> 
<right> <right> <right> <right> <right> <right> <right> 
<right> <right> <right> <right> <right> <right> <right> 
M-: ( t h i n g - a t - p o i <tab> SPC ' u r l <return> 
M-x r e b <tab> <backspace> <backspace> <backspace> 
<backspace> <backspace> <backspace> <backspace> <backspace> 
<backspace> <backspace> <backspace> <backspace> <backspace> 
<backspace> <backspace> <backspace> <backspace> <backspace> 
<backspace> e p <tab> o <tab> r <tab> b <tab> <return>

Recent messages:
Using try-expand-dabbrev
Type C-x 1 to remove help window.  
Quit
Using try-expand-dabbrev
Using try-expand-list
"http://that"
call-interactively: End of buffer
mouse-2, RET: find function's definition
"http://create"
Loading emacsbug...done


_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel
------- End of forwarded message -------


* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-08-30  7:15 [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls] Richard Stallman
@ 2007-08-31  8:32 ` Glenn Morris
  2007-08-31 14:42   ` Stefan Monnier
  2007-09-01  4:06   ` Richard Stallman
  2007-09-01 21:00 ` (thing-at-point 'defun) always returns NIL (was: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]) Leo
  1 sibling, 2 replies; 22+ messages in thread
From: Glenn Morris @ 2007-08-31  8:32 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

Richard Stallman wrote:

> Would someone please DTRT and ack?

There's no problem here and nothing to do. The URLs returned are
perfectly valid. See the discussion that followed the original email.

> From: Leo <sdl.web@gmail.com>
> Subject: 23.0.0; (thing-at-point 'url) returns invalid urls
> To: emacs-pretest-bug@gnu.org
> Cc: 
> Date: Wed, 29 Aug 2007 19:40:57 +0100


* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-08-31  8:32 ` Glenn Morris
@ 2007-08-31 14:42   ` Stefan Monnier
  2007-08-31 16:18     ` Drew Adams
  2007-09-01  4:06   ` Richard Stallman
  1 sibling, 1 reply; 22+ messages in thread
From: Stefan Monnier @ 2007-08-31 14:42 UTC (permalink / raw)
  To: Glenn Morris; +Cc: rms, emacs-devel

>> Would someone please DTRT and ack?
> There's no problem here and nothing to do. The URLs returned are
> perfectly valid. See the discussion that followed the original email.

Maybe the docstring (and/or behavior) should be adjusted to make the
distinction clear between "assuming there's a URL at point, return it" and
"check if there's a URL at point, and if there is, return it".

This is often a somewhat subtle but important difference.  When the user
does M-x browse-url RET, it makes perfect sense to place as default in the
minibuffer the best URL we could come up with, even if it doesn't look very
likely to be a good one.

OTOH if we create a command which can do two different things depending on
whether there's a URL at point, we would want to be much more stringent on
what we consider as an acceptable "URL at point".


        Stefan
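
The "check if there's a URL at point" variant described above could look
roughly like the following. This is only a sketch: `my-url-at-point-strict'
is a hypothetical helper, not part of thingatpt.el, and the scheme regexp
(letters, digits, "+", "." and "-" followed by a colon) is an assumption
about what should count as "looks like a URL".

```elisp
;; Sketch only: `my-url-at-point-strict' is a hypothetical helper,
;; not part of thingatpt.el.
(require 'thingatpt)

(defun my-url-at-point-strict ()
  "Return the URL at point, or nil unless the buffer text has a scheme.
Unlike a lenient lookup, this never invents a scheme for a bare word."
  (let ((bounds (bounds-of-thing-at-point 'url)))
    (when bounds
      (let ((text (buffer-substring-no-properties
                   (car bounds) (cdr bounds))))
        ;; Accept only text that already starts with "scheme:".
        (when (string-match-p "\\`[a-zA-Z][a-zA-Z0-9+.-]*:" text)
          text)))))
```

A command that behaves differently depending on whether there is a URL at
point would call the strict helper, while a minibuffer default for
browse-url could keep using the lenient (thing-at-point 'url).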


* RE: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-08-31 14:42   ` Stefan Monnier
@ 2007-08-31 16:18     ` Drew Adams
  2007-08-31 20:27       ` Stefan Monnier
  0 siblings, 1 reply; 22+ messages in thread
From: Drew Adams @ 2007-08-31 16:18 UTC (permalink / raw)
  To: emacs-devel

> >> Would someone please DTRT and ack?
> > There's no problem here and nothing to do. The URLs returned are
> > perfectly valid. See the discussion that followed the original email.
>
> Maybe the docstring (and/or behavior) should be adjusted to make the
> distinction clear between "assuming there's a URL at point, return it" and
> "check if there's a URL at point, and if there is, return it".
>
> This is often a somewhat subtle but important difference.  When the user
> does M-x browse-url RET, it makes perfect sense to place as default in the
> minibuffer the best URL we could come up with, even if it doesn't look
> very likely to be a good one.
>
> OTOH if we create a command which can do two different things depending on
> whether there's a URL at point, we would want to be much more stringent on
> what we consider as an acceptable "URL at point".

Seconded.

The doc can distinguish between the URL itself being valid (syntactically)
and the URL having a live target. The first is a property of the URL; the
second is a property of its target.


* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-08-31 16:18     ` Drew Adams
@ 2007-08-31 20:27       ` Stefan Monnier
  2007-08-31 20:34         ` Drew Adams
  0 siblings, 1 reply; 22+ messages in thread
From: Stefan Monnier @ 2007-08-31 20:27 UTC (permalink / raw)
  To: Drew Adams; +Cc: emacs-devel

>> >> Would someone please DTRT and ack?
>> > There's no problem here and nothing to do. The URLs returned are
>> > perfectly valid. See the discussion that followed the original email.
>> 
>> Maybe the docstring (and/or behavior) should be adjusted to make the
>> distinction clear between "assuming there's a URL at point, return it" and
>> "check if there's a URL at point, and if there is, return it".
>> 
>> This is often a somewhat subtle but important difference.  When the user
>> does M-x browse-url RET, it makes perfect sense to place as default in the
>> minibuffer the best URL we could come up with, even if it doesn't look
>> very likely to be a good one.
>> 
>> OTOH if we create a command which can do two different things depending on
>> whether there's a URL at point, we would want to be much more stringent on
>> what we consider as an acceptable "URL at point".

> Seconded.

> The doc can distinguish between the URL itself being valid (syntactically)
> and the URL having a live target. The first is a property of the URL; the
> second is a property of its target.

No, I think this would be a mistake.  We're still talking only about URLs
independently from their target.  Liveness of a URL target is a property
that is difficult/impossible to ascertain and can change at any moment.
I don't want `browse-url' to ignore "http://www.iro.umontreal.ca" under
point just because I'm currently not connected to the internet or because it
is temporarily down, and neither do I want it to wait for a timeout before
deciding it.

The difference we want to talk about instead is between something that
heuristically is likely to be meant as a URL and something which may be used
as a URL but doesn't particularly "look like a URL".


        Stefan


* RE: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-08-31 20:27       ` Stefan Monnier
@ 2007-08-31 20:34         ` Drew Adams
  2007-08-31 21:35           ` Thien-Thi Nguyen
  2007-09-01  1:57           ` Leo
  0 siblings, 2 replies; 22+ messages in thread
From: Drew Adams @ 2007-08-31 20:34 UTC (permalink / raw)
  To: emacs-devel

> >> OTOH if we create a command which can do two different things
> >> depending on whether there's a URL at point, we would want to be much
> >> more stringent on what we consider as an acceptable "URL at point".
>
> > Seconded.
>
> > The doc can distinguish between the URL itself being valid
> > (syntactically) and the URL having a live target. The first
> > is a property of the URL; the second is a property of its target.
>
> No, I think this would be a mistake.  We're still talking only about URLs
> independently from their target.  Liveness of a URL target is a property
> that is difficult/impossible to ascertain and can change at any moment.
> I don't want `browse-url' to ignore "http://www.iro.umontreal.ca" under
> point just because I'm currently not connected to the internet or
> because it is temporarily down, and neither do I want it to wait for a
> timeout before deciding it.
>
> The difference we want to talk about instead is between something that
> heuristically is likely to be meant as a URL and something which
> may be used as a URL but doesn't particularly "look like a URL".

That's OK too. I thought people were asking for the ability to get only
"valid" URLs in the sense of having live targets. Isn't that what the
initial request was for?

In any case, let's at least be able to get any syntactically valid URL, and
preferably as the default behavior.


* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-08-31 20:34         ` Drew Adams
@ 2007-08-31 21:35           ` Thien-Thi Nguyen
  2007-08-31 23:57             ` Drew Adams
  2007-09-01  1:57           ` Leo
  1 sibling, 1 reply; 22+ messages in thread
From: Thien-Thi Nguyen @ 2007-08-31 21:35 UTC (permalink / raw)
  To: emacs-devel

() "Drew Adams" <drew.adams@oracle.com>
() Fri, 31 Aug 2007 13:34:41 -0700

   That's OK too. I thought people were asking for the ability to
   get only "valid" URLs in the sense of having live
   targets. Isn't that what the initial request was for?

iiuc the initial request was to disambiguate the two cases whereby
(thing-at-point 'url) when point is on one of:

  something
  http://something

(both cases return "http://something").  i posted a suggestion (to
gnu-emacs-help but i guess it got dropped somehow) along the lines
of adding a variable thing-at-point-url-autoprefix which when t
would do the heuristic autoprefixing (http:// or ftp://) that is
the present behavior, and when nil, would not.  default value is
another question, of course.

below is a quick patch.  here are some tests from *scratch*:

 (defun test (setting text)
   (setq thing-at-point-url-autoprefix setting)
   (save-excursion
     (insert text))
   (prog1 (thing-at-point 'url)
     (delete-char (length text))))
 
 (test   t "http://something")
 "http://something"
 
 (test nil "http://something")
 "http://something"
 
 (test   t "something")
 "http://something"
 
 (test nil "something")
 "something"

seems harmless enough.  what do people think?

thi


_______________________________________
*** thingatpt.el	26 Jul 2007 05:26:35 -0000	1.43
--- thingatpt.el	31 Aug 2007 21:26:54 -0000
***************
*** 238,243 ****
--- 238,247 ----
    "A regular expression matching a URL marked up per RFC1738.
  This may contain whitespace (including newlines) .")
  
+ (defvar thing-at-point-url-autoprefix nil
+   "Controls how `thing-at-point' recognizes URLs.
+ If nil, no access scheme is presumed.")
+ 
  (put 'url 'bounds-of-thing-at-point 'thing-at-point-bounds-of-url-at-point)
  (defun thing-at-point-bounds-of-url-at-point ()
    (let ((strip (thing-at-point-looking-at
***************
*** 261,267 ****
  
  Search backwards for the start of a URL ending at or after point.  If
  no URL found, return nil.  The access scheme will be prepended if
! absent: \"mailto:\" if the string contains \"@\", \"ftp://\" if it
  starts with \"ftp\" and not \"ftp:/\", or \"http://\" by default."
  
    (let ((url "") short strip)
--- 265,272 ----
  
  Search backwards for the start of a URL ending at or after point.  If
  no URL found, return nil.  The access scheme will be prepended if
! absent (and if `thing-at-point-url-autoprefix' has non-nil value),
! one of \"mailto:\" if the string contains \"@\", \"ftp://\" if it
  starts with \"ftp\" and not \"ftp:/\", or \"http://\" by default."
  
    (let ((url "") short strip)
***************
*** 278,284 ****
  	  ;; strip whitespace
  	  (while (string-match "[ \t\n\r]+" url)
  	    (setq url (replace-match "" t t url)))
! 	  (and short (setq url (concat (cond ((string-match "^[a-zA-Z]+:" url)
  					       ;; already has a URL scheme.
  					       "")
  					     ((string-match "@" url)
--- 283,290 ----
  	  ;; strip whitespace
  	  (while (string-match "[ \t\n\r]+" url)
  	    (setq url (replace-match "" t t url)))
! 	  (and short thing-at-point-url-autoprefix
!                      (setq url (concat (cond ((string-match "^[a-zA-Z]+:" url)
  					       ;; already has a URL scheme.
  					       "")
  					     ((string-match "@" url)


* RE: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-08-31 21:35           ` Thien-Thi Nguyen
@ 2007-08-31 23:57             ` Drew Adams
  2007-09-01 19:13               ` Johannes Weiner
  0 siblings, 1 reply; 22+ messages in thread
From: Drew Adams @ 2007-08-31 23:57 UTC (permalink / raw)
  To: Emacs-Devel

>    That's OK too. I thought people were asking for the ability to
>    get only "valid" URLs in the sense of having live
>    targets. Isn't that what the initial request was for?
>
> iiuc the initial request was to disambiguate the two cases whereby
> (thing-at-point 'url) when point is on one of:
>   something
>   http://something
> (both cases return "http://something").

Admittedly, the text of the initial bug report (OP) is not too clear:

>> Put point on any word that is not part of a URL and evaluate
>> (thing-at-point 'url): it returns a URL like "http://something", where
>> 'something' is the word under point.

That and other posts seem to suggest that the requested (optional, perhaps)
behavior is to return nil unless the URL under point contains a scheme (e.g.
http://, ftp://). For example, "something" at point and www.google.com at
point would each return nil.

However, other posts seem to suggest that the perceived problem is that
http://something is not a "valid" URL because the target is not (currently
or perhaps usually) live (accessible):

>> Because a url created by concat "http://" and the word under point is
>> unlikely to be accessible by a browser.

Admittedly, it says "is unlikely to be accessible", not "is not accessible".
But the subject line of the thread is "... returns invalid urls". Since the
URLs in question are syntactically valid, and people do sometimes refer to
URLs as "invalid" when their targets are not live, I guessed that
inaccessible target was what was meant.

Putting it all together, it's not clear to me just what those who don't like
the current behavior would prefer.

I like the current behavior, so I don't really care what additional
behaviors are offered. I do hope that the current behavior will be the
default, however. In terms of Thi's patch, this means that the new variable
he introduced has the wrong default value, for me (it is also a defvar, not
a defcustom). The current behavior is better.

What alternative behavior is being requested for each of the following cases
(text under cursor)? Return nil? Return the text as is? Return the text with
http:// prepended unless there is already a URL scheme? (The currently
returned text is indicated in parens, when it differs from what is at
point.) Assume that http://nosuchlivetarget does not point to a live target
(now or usually).

a. nosuchlivetarget   (http://nosuchlivetarget)
b. http://nosuchlivetarget
c. www.google.com     (http://www.google.com)
d. http://www.google.com

Under the two interpretations above, the result would be #1, #2:

1. "valid" meaning has valid URL syntax that includes a scheme: (a) nil, (b)
http://nosuchlivetarget, (c) nil, (d) http://www.google.com.

2. "valid" meaning has a live target (either now or usually): (a) nil, (b)
nil, (c) http://www.google.com, (d) http://www.google.com.

3. Stefan's criterion is "likely to be meant as a URL". That might give the
same result as #2, but he rejected link testing as the means. Perhaps,
depending on the heuristic used, the result he expects is this: (a) nil, (b)
http://nosuchlivetarget, (c) http://www.google.com, (d)
http://www.google.com. One could argue that both (b) and (c) are likely to
be meant as URLs. Dunno what the heuristic would be.

4. Thi's patch (with new variable = nil default value) gives this: (a)
nosuchlivetarget, (b) http://nosuchlivetarget, (c) www.google.com, (d)
http://www.google.com.

Thi's patch doesn't seem to fit the OP, regardless of interpretation, since
it returns nil only if the thing at point is not even a simple URL (without
scheme) - for example, when point is over whitespace or punctuation. The OP
clearly wanted nil for "something" (whatever the reason - whether no URL
scheme or inaccessible target).

So what alternative behavior do people want?


* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-08-31 20:34         ` Drew Adams
  2007-08-31 21:35           ` Thien-Thi Nguyen
@ 2007-09-01  1:57           ` Leo
  2007-09-01 16:39             ` Drew Adams
  1 sibling, 1 reply; 22+ messages in thread
From: Leo @ 2007-09-01  1:57 UTC (permalink / raw)
  To: emacs-devel

On 2007-08-31 21:34 +0100, Drew Adams wrote:
> That's OK too. I thought people were asking for the ability to get only
> "valid" URLs in the sense of having live targets. Isn't that what the
> initial request was for?

Stefan gets my idea right. It is definitely not about aliveness.

-- 
Leo <sdl.web AT gmail.com>                         (GPG Key: 9283AA3F)

         Gnus is one component of the Emacs operating system.


* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-08-31  8:32 ` Glenn Morris
  2007-08-31 14:42   ` Stefan Monnier
@ 2007-09-01  4:06   ` Richard Stallman
  1 sibling, 0 replies; 22+ messages in thread
From: Richard Stallman @ 2007-09-01  4:06 UTC (permalink / raw)
  To: Glenn Morris; +Cc: emacs-devel

    > Would someone please DTRT and ack?

    There's no problem here and nothing to do.

Thanks for DTRT.


* RE: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-09-01  1:57           ` Leo
@ 2007-09-01 16:39             ` Drew Adams
  2007-09-01 20:57               ` Leo
                                 ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Drew Adams @ 2007-09-01 16:39 UTC (permalink / raw)
  To: Emacs-Pretest-Bug

> Stefan gets my idea right. It is definitely not about aliveness.

Fine. So what is the heuristic to use to recognize something that "is likely
to be meant as a URL"? Presence of a URL scheme (e.g. http://, ftp://)?
Presence of a URL scheme or "www." (e.g. www.whatever.anything)? There are
already regexps defined to recognize URLs, with and without schemes. How
should they be used or modified?

And what to return when probably-intended-URL recognition fails? nil?
Whatever is currently at point, without prepending http://? Should http://
ever be prepended (e.g. if "www." satisfies the test for likely URL, as in
www.google.com)?

Let's stop being so vague and go beyond saying things like (1) just DTRT and
(2) we'll have a heuristic that recognizes TRT. Which value do you want
returned for which text at point? And what heuristic do you propose to use
to recognize a likely URL intention?

The only difficult problem seen so far is knowing what is being requested.
It's not an alligator. It's bigger than a breadbox. It doesn't contain
chlorophyll. OK, so what is it?


* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-08-31 23:57             ` Drew Adams
@ 2007-09-01 19:13               ` Johannes Weiner
  2007-09-01 19:59                 ` Leo
  0 siblings, 1 reply; 22+ messages in thread
From: Johannes Weiner @ 2007-09-01 19:13 UTC (permalink / raw)
  To: Drew Adams; +Cc: Emacs-Devel



Hi,

On Fri, Aug 31, 2007 at 04:57:04PM -0700, Drew Adams wrote:
> >    That's OK too. I thought people were asking for the ability to
> >    get only "valid" URLs in the sense of having live
> >    targets. Isn't that what the initial request was for?
> >
> > iiuc the initial request was to disambiguate the two cases whereby
> > (thing-at-point 'url) when point is on one of:
> >   something
> >   http://something
> > (both cases return "http://something").
> 
> Admittedly, the text of the initial bug report (OP) is not too clear:
> 
> >> Put point on any word that is not part of a URL and evaluate
> >> (thing-at-point 'url): it returns a URL like "http://something", where
> >> 'something' is the word under point.
> 
> That and other posts seem to suggest that the requested (optional, perhaps)
> behavior is to return nil unless the URL under point contains a scheme (e.g.
> http://, ftp://). For example, "something" at point and www.google.com at
> point would each return nil.
> 
> However, other posts seem to suggest that the perceived problem is that
> http://something is not a "valid" URL because the target is not (currently
> or perhaps usually) live (accessible):
> 
> >> Because a url created by concat "http://" and the word under point is
> >> unlikely to be accessible by a browser.

Come on, people. If the point is above `something' and I don't want EMACS to
return an URL-string but NIL, why would I run (thing-at-point 'url) in the
first place? This is bogus.

I like the behaviour as it is right now. Returning nil is as unusable as
http://something for most, but for some people, http://something can still be
of use.

	Hannes


* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-09-01 19:13               ` Johannes Weiner
@ 2007-09-01 19:59                 ` Leo
  2007-09-01 20:04                   ` Johannes Weiner
  0 siblings, 1 reply; 22+ messages in thread
From: Leo @ 2007-09-01 19:59 UTC (permalink / raw)
  To: emacs-devel

On 2007-09-01 20:13 +0100, Johannes Weiner wrote:
> Come on, people. If the point is above `something' and I don't want EMACS to
> return an URL-string but NIL, why would I run (thing-at-point 'url) in the
> first place? This is bogus.

To test whether there is a URL under point: if yes, do something; if not,
do something else.

> I like the behaviour as it is right now. Returning nil is as unusable
> as http://something for most, but for some people, http://something
> can still be of use.

Those claims are not true.

-- 
Leo <sdl.web AT gmail.com>                (GPG Key: 9283AA3F)

      Gnus is one component of the Emacs operating system.


* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-09-01 19:59                 ` Leo
@ 2007-09-01 20:04                   ` Johannes Weiner
  0 siblings, 0 replies; 22+ messages in thread
From: Johannes Weiner @ 2007-09-01 20:04 UTC (permalink / raw)
  To: Leo; +Cc: emacs-devel



Hi,

On Sat, Sep 01, 2007 at 08:59:28PM +0100, Leo wrote:
> On 2007-09-01 20:13 +0100, Johannes Weiner wrote:
> > Come on, people. If the point is above `something' and I don't want EMACS to
> > return an URL-string but NIL, why would I run (thing-at-point 'url) in the
> > first place? This is bogus.
> 
> To test whether there is a URL under point: if yes, do something; if not,
> do something else.

Okay, that is something else.

> > I like the behaviour as it is right now. Returning nil is as unusable
> > as http://something for most, but for some people, http://something
> > can still be of use.
> 
> Those claims are not true.

Which claims? I talk from experience. I really make use of the current
behaviour.

	Hannes


* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-09-01 16:39             ` Drew Adams
@ 2007-09-01 20:57               ` Leo
  2007-09-02  6:39                 ` David Kastrup
  2007-09-04 22:37                 ` Davis Herring
  2007-09-03 20:51               ` Stefan Monnier
  2007-09-04 22:40               ` Davis Herring
  2 siblings, 2 replies; 22+ messages in thread
From: Leo @ 2007-09-01 20:57 UTC (permalink / raw)
  To: emacs-devel

On 2007-09-01 17:39 +0100, Drew Adams wrote:
>> Stefan gets my idea right. It is definitely not about aliveness.
>
> Fine. So what is the heuristic to use to recognize something that "is likely
> to be meant as a URL"? Presence of a URL scheme (e.g. http://, ftp://)?
> Presence of a URL scheme or "www." (e.g. www.whatever.anything)? There are
> already regexps defined to recognize URLs, with and without schemes. How
> should they be used or modified?
>
> And what to return when probably-intended-URL recognition fails? nil?
> Whatever is currently at point, without prepending http://? Should http://
> ever be prepended (e.g. if "www." satisfies the test for likely URL, as in
> www.google.com)?

Requiring that a URL contain a '.' would reduce the risk of returning
random URLs by more than 90%. Can you imagine that each time I feed
browse-url with (thing-at-point 'url), I get:

,----
| Cannot retrieve URL: http://something (exit status: 0)
| 
| something could not be found. Please check the name, and try again.
`----

and it happens when point is in any word.
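
One possible reading of the '.' heuristic, as a rough sketch on plain
strings: accept text that either carries an explicit scheme or has a dot
between two alphanumeric characters. The name `my-likely-url-p' and both
regexps are assumptions made for illustration; nothing like this exists in
thingatpt.el.

```elisp
;; Sketch only: `my-likely-url-p' is a hypothetical predicate.
(defun my-likely-url-p (text)
  "Return non-nil if TEXT looks like it was meant as a URL.
TEXT qualifies if it starts with \"scheme://\" or contains a dot
between two alphanumeric characters."
  (or (string-match-p "\\`[a-zA-Z][a-zA-Z0-9+.-]*://" text)
      (string-match-p "[[:alnum:]]\\.[[:alnum:]]" text)))
```

Under this reading "something" is rejected while "www.google.com" and
"http://something" are accepted. Dotted file names such as "myscript.sh"
are accepted too, so the heuristic reduces false positives without
eliminating them.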

> Let's stop being so vague and go beyond saying things like (1) just
> DTRT and (2) we'll have a heuristic that recognizes TRT. Which value
> do you want returned for which text at point? And what heuristic do
> you propose to use to recognize a likely URL intention?

Sometimes there might not be a single TRT, but there is a better thing to
do, i.e. make thing-at-point more useful.

I see that erc and ffap each use their own URL regexp. That kind of
duplication can be avoided.

> The only difficult problem seen so far is knowing what is being
> requested.  It's not an alligator. It's bigger than a breadbox. It
> doesn't contain chlorophyll. OK, so what is it?

-- 
Leo <sdl.web AT gmail.com>                (GPG Key: 9283AA3F)

      Gnus is one component of the Emacs operating system.


* (thing-at-point 'defun) always returns NIL (was: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls])
  2007-08-30  7:15 [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls] Richard Stallman
  2007-08-31  8:32 ` Glenn Morris
@ 2007-09-01 21:00 ` Leo
  2007-09-01 21:41   ` (thing-at-point 'defun) always returns NIL (was:[sdl.web@gmail.com: " Drew Adams
  1 sibling, 1 reply; 22+ messages in thread
From: Leo @ 2007-09-01 21:00 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

On 2007-08-30 08:15 +0100, Richard Stallman wrote:
> Would someone please DTRT and ack?

Maybe someone can take a look at thingatpt.el. I found another bug:

(thing-at-point 'defun) always returns NIL

HTH,
Leo

* RE: (thing-at-point 'defun) always returns NIL (was:[sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls])
  2007-09-01 21:00 ` (thing-at-point 'defun) always returns NIL (was: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]) Leo
@ 2007-09-01 21:41   ` Drew Adams
  0 siblings, 0 replies; 22+ messages in thread
From: Drew Adams @ 2007-09-01 21:41 UTC (permalink / raw)
  To: emacs-devel

> Maybe someone can take a look at thingatpt.el. I found another bug:
>
> (thing-at-point 'defun) always returns NIL

I sent a patch that fixed this back in July (Subject: "patch for
thingatpt.el").

The code needs to do this:

(put 'defun 'beginning-op 'beginning-of-defun)
(put 'defun 'end-op 'end-of-defun)
(put 'defun 'forward-op 'end-of-defun)
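
As a rough illustration of what those three forms do (a sketch, not the
patch itself): thing-at-point looks up its movement functions on the
property list of the thing's symbol, so once the properties are set the
lookup for 'defun has something to find. The function name
my-defun-at-point-demo, the sample defun `foo', and the temporary buffer
are all invented for this example.

```elisp
(require 'thingatpt)

;; The three property settings from the message above.
(put 'defun 'beginning-op 'beginning-of-defun)
(put 'defun 'end-op 'end-of-defun)
(put 'defun 'forward-op 'end-of-defun)

;; The properties can be inspected with `get'.
(get 'defun 'beginning-op)   ; => beginning-of-defun
(get 'defun 'end-op)         ; => end-of-defun

;; Hypothetical check: `foo' and the buffer contents are made up here.
(defun my-defun-at-point-demo ()
  "Return the defun found at point in a scratch Emacs Lisp buffer."
  (with-temp-buffer
    (emacs-lisp-mode)
    (insert "(defun foo () 1)\n")
    (goto-char 3)                 ; somewhere inside the defun
    (thing-at-point 'defun)))
```

Evaluating (my-defun-at-point-demo) should then return the text of the
sample defun rather than nil.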

* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-09-01 20:57               ` Leo
@ 2007-09-02  6:39                 ` David Kastrup
  2007-09-02 19:20                   ` Glenn Morris
  2007-09-04 22:37                 ` Davis Herring
  1 sibling, 1 reply; 22+ messages in thread
From: David Kastrup @ 2007-09-02  6:39 UTC (permalink / raw)
  To: Leo; +Cc: emacs-devel

Leo <sdl.web@gmail.com> writes:

> I see that erc and ffap each use their own url-regexp. That kind of
> duplication can be avoided.

Well, for some fun, place the cursor on myscript.sh and type M-x
ffap RET.  This gives (assuming that the script does not exist in the
current directory)

Pinging myscript.sh (Saint Helena)...

which I consider quite hilarious.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-09-02  6:39                 ` David Kastrup
@ 2007-09-02 19:20                   ` Glenn Morris
  0 siblings, 0 replies; 22+ messages in thread
From: Glenn Morris @ 2007-09-02 19:20 UTC (permalink / raw)
  To: David Kastrup; +Cc: Leo, emacs-devel

David Kastrup wrote:

> Well, for some fun, place the cursor on myscript.sh and type M-x
> ffap RET.  This gives (assuming that the script does not exist in the
> current directory)
>
> Pinging myscript.sh (Saint Helena)...
>
> which I consider quite hilarious.

Previously discussed:

http://lists.gnu.org/archive/html/emacs-devel/2007-03/msg00392.html

* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-09-01 16:39             ` Drew Adams
  2007-09-01 20:57               ` Leo
@ 2007-09-03 20:51               ` Stefan Monnier
  2007-09-04 22:40               ` Davis Herring
  2 siblings, 0 replies; 22+ messages in thread
From: Stefan Monnier @ 2007-09-03 20:51 UTC (permalink / raw)
  To: Drew Adams; +Cc: Emacs-Pretest-Bug

>> Stefan gets my idea right. It is definitely not about aliveness.
> Fine. So what is the heuristic to use to recognize something that "is likely
> to be meant as a URL"? Presence of a URL scheme (e.g. http://, ftp://)?

My point was only that the documentation for (thing-at-point 'url) should
make it clear that this does not try and determine *if* there is a URL.
For uses where we want to check that point is near a URL, we want some
other operation.
I don't know if this one exists and will let others invent it if needed.
So again: I don't think there's a need to change the behavior of
(thing-at-point 'url), but maybe there's a need to make its doc
more explicit.


        Stefan

* Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-09-01 20:57               ` Leo
  2007-09-02  6:39                 ` David Kastrup
@ 2007-09-04 22:37                 ` Davis Herring
  1 sibling, 0 replies; 22+ messages in thread
From: Davis Herring @ 2007-09-04 22:37 UTC (permalink / raw)
  To: Leo; +Cc: emacs-devel

> Requiring that a URL contain a '.' would reduce the risk of returning
> random URLs by more than 90%.

I would test for a '.', for a '/', and for an explicit scheme (some,
like "mailto:", do not come with slashes).  If any of those tests pass,
it's probably a URL (that is, IMHO it is sufficiently rare that it won't
be).
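
A minimal sketch of that three-way test, assuming the candidate text has
already been extracted as a string. The name my-likely-url-p is
hypothetical and is not part of thingatpt.el; the scheme pattern follows
the general URI scheme syntax (a letter, then letters, digits, '+', '-'
or '.', then a colon).

```elisp
;; Hypothetical helper, not part of thingatpt.el: a sketch of the
;; "dot, slash, or explicit scheme" test described above.
(defun my-likely-url-p (string)
  "Return t if STRING looks like it was meant as a URL, else nil.
STRING passes if it contains a dot, contains a slash, or starts
with an explicit scheme such as \"mailto:\"."
  (and (stringp string)
       (or (string-match-p "\\." string)   ; contains a '.'
           (string-match-p "/" string)     ; contains a '/'
           ;; Starts with an explicit scheme followed by a colon.
           (string-match-p "\\`[a-zA-Z][a-zA-Z0-9+.-]*:" string))
       t))

(my-likely-url-p "something")        ; => nil
(my-likely-url-p "www.google.com")   ; => t
(my-likely-url-p "mailto:rms")       ; => t
```

Note that a file name such as myscript.sh also passes, since it contains
a dot, so this only filters out plain words.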

Davis

-- 
This product is sold by volume, not by mass.  If it appears too dense or
too sparse, it is because mass-energy conversion has occurred during
shipping.

* RE: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
  2007-09-01 16:39             ` Drew Adams
  2007-09-01 20:57               ` Leo
  2007-09-03 20:51               ` Stefan Monnier
@ 2007-09-04 22:40               ` Davis Herring
  2 siblings, 0 replies; 22+ messages in thread
From: Davis Herring @ 2007-09-04 22:40 UTC (permalink / raw)
  To: Drew Adams; +Cc: Emacs-Pretest-Bug

> And what to return when probably-intended-URL recognition fails? nil?
> Whatever is currently at point, without prepending http://? Should http://
> ever be prepended (e.g. if "www." satisfies the test for likely URL, as in
> www.google.com)?

In the context of the uneager URL-at-point function (which, as Stefan has
said, should not replace the current behavior of (thing-at-point 'url)):

If we think it's not a URL, we return nil.  Otherwise, we prepend
"http://" (customizable) if there's no scheme present and it doesn't look
like an email address (to which "mailto:" should be prepended, of course,
to make it a URL).  On a somewhat-related note, such things as "www." and
".com" should in no circumstances be added; it is a misfeature of (many)
Web browsers that they conflate keyword searches with the DNS.
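
Roughly, the rules above could be sketched as follows. The names
my-url-default-scheme and my-uneager-url are hypothetical, and the
patterns used to spot a scheme or an email address are simplifying
assumptions, not anything taken from thingatpt.el.

```elisp
;; Hypothetical option standing in for the "customizable" prefix.
(defvar my-url-default-scheme "http://"
  "Prefix added to scheme-less text that looks like a URL.")

;; Hypothetical function: a sketch of the "uneager" behavior, taking
;; the candidate text as a string.
(defun my-uneager-url (string)
  "Return STRING as a URL, or nil if it does not look like one."
  (cond
   ((not (stringp string)) nil)
   ;; An explicit scheme is present: return the text unchanged.
   ((string-match-p "\\`[a-zA-Z][a-zA-Z0-9+.-]*:" string) string)
   ;; Looks like an email address: make it a mailto: URL.
   ((string-match-p "\\`[^@/ \t\n]+@[^@/ \t\n]+\\'" string)
    (concat "mailto:" string))
   ;; Contains a dot or a slash: assume a URL without a scheme.
   ((string-match-p "[./]" string)
    (concat my-url-default-scheme string))
   ;; Otherwise we think it is not a URL.  Nothing such as "www." or
   ;; ".com" is ever added.
   (t nil)))

(my-uneager-url "something")        ; => nil
(my-uneager-url "www.google.com")   ; => "http://www.google.com"
(my-uneager-url "leo@example.org")  ; => "mailto:leo@example.org"
```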

Davis

-- 
This product is sold by volume, not by mass.  If it appears too dense or
too sparse, it is because mass-energy conversion has occurred during
shipping.
