From: Leo
Newsgroups: gmane.emacs.devel
Subject: Re: [sdl.web@gmail.com: 23.0.0; (thing-at-point 'url) returns invalid urls]
Date: Sat, 01 Sep 2007 21:57:02 +0100
To: emacs-devel@gnu.org

On 2007-09-01 17:39 +0100, Drew Adams wrote:

>> Stefan gets my idea right. It is definitely not about aliveness.
>
> Fine. So what is the heuristic to use to recognize something that "is likely
> to be meant as a URL"? Presence of a URL scheme (e.g. http://, ftp://)?
> Presence of a URL scheme or "www." (e.g. www.whatever.anything)?
> There are already regexps defined to recognize URLs, with and without
> schemes. How should they be used or modified?
>
> And what to return when probably-intended-URL recognition fails? nil?
> Whatever is currently at point, without prepending http://? Should http://
> ever be prepended (e.g. if "www." satisfies the test for likely URL, as in
> www.google.com)?

Requiring that a URL contain a '.' would reduce the risk of returning
random URLs by more than 90%. Can you imagine that each time I feed
browse-url with (thing-at-point 'url), I get:

,----
| Cannot retrieve URL: http://something (exit status: 0)
|
| something could not be found. Please check the name, and try again.
`----

and it happens when point is on any word.

> Let's stop being so vague and go beyond saying things like (1) just
> DTRT and (2) we'll have a heuristic that recognizes TRT. Which value
> do you want returned for which text at point? And what heuristic do
> you propose to use to recognize a likely URL intention?

Sometimes there might not be a single TRT, but there is still a better
thing to do, i.e. make thing-at-point more useful. I see that erc and
ffap each use their own url-regexp. That kind of duplication can be
avoided.

> The only difficult problem seen so far is knowing what is being
> requested. It's not an alligator. It's bigger than a breadbox. It
> doesn't contain chlorophyll. OK, so what is it?

-- 
Leo (GPG Key: 9283AA3F)

      Gnus is one component of the Emacs operating system.
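
For concreteness, the '.' heuristic proposed above could be sketched
roughly as follows. This is only an illustration under assumptions: the
names my-likely-url-p and my-url-at-point are hypothetical and are not
part of thingatpt.el, and stripping the scheme before looking for the
dot is one possible reading of the rule, not an agreed design.

```elisp
;; Sketch only: `my-likely-url-p' and `my-url-at-point' are hypothetical
;; helpers, not functions provided by thingatpt.el.
(require 'thingatpt)

(defun my-likely-url-p (string)
  "Return non-nil if STRING contains a dot after any scheme prefix.
Assumed heuristic: drop a leading \"scheme://\" and require a dot
with a non-dot, non-slash character on each side of it."
  (when (stringp string)
    (let ((rest (replace-regexp-in-string
                 "\\`[A-Za-z][A-Za-z0-9+.-]*://" "" string)))
      (and (string-match-p "[^./]\\.[^./]" rest) t))))

(defun my-url-at-point ()
  "Return the URL at point, or nil if it does not look like a URL."
  (let ((url (thing-at-point 'url)))
    ;; Reject results such as \"http://something\" built from a bare word.
    (and (my-likely-url-p url) url)))
```

With this sketch, (my-likely-url-p "http://something") returns nil,
while (my-likely-url-p "http://www.google.com") returns t, so a caller
such as browse-url would no longer be handed a bare word with http://
prepended.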