From mboxrd@z Thu Jan 1 00:00:00 1970
From: Karl Fogel
Newsgroups: gmane.emacs.devel
Subject: Re: Adding email address support to thingatpt.el.
Date: Tue, 27 Feb 2007 04:08:50 -0800
Message-ID: <87abyz7qgd.fsf@floss.red-bean.com>
References: <87vehqox9u.fsf@floss.red-bean.com> <87ps7xavui.fsf@floss.red-bean.com> <45E32048.8050309@easy-emacs.de>
In-Reply-To: <45E32048.8050309@easy-emacs.de> (Andreas Roehler's message of "Mon\, 26 Feb 2007 19\:00\:40 +0100")
Reply-To: Karl Fogel
To: Andreas Roehler
Cc: emacs-devel
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.93 (gnu/linux)
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:66894

Andreas Roehler writes:
> Just for consideration:
>
> Email at point might be used to pick emails from a
> csv-database. Than `;' and `,' as delimiters should be
> possible together with or instead of angles.

It's okay, they're already treated as boundaries, because they're not
legal in the email address.  (I tested just now to make sure.)

Unless you mean they should be returned *as part of* the email
address, like ",andreas.roehler@easy-emacs.ed,"?  But that wouldn't be
good -- commas and semicolons are not the same as angle brackets in
that respect.

> AFAIU rfc2822, several more chars are allowed to be
> part of an email-adress than regexp honours now:

It's tough to know what to include.  Many characters that technically
could be part of an email address are rarely used in practice, and
instead appear much more often as delimiters (in certain contexts).
So if thingatpt.el is to Do The Right Thing most often for the user,
it probably can't comply precisely with the RFC.

I'm including the latest patch below, for reference, but I won't do
anything with it until after the release.

-Karl

2007-02-25  Karl Fogel

	* thingatpt.el: Add support for email addresses (`email').
	(thing-at-point, bounds-of-thing-at-point): Document `email' support.
	(thing-at-point-email-regexp): New variable.
	(`email'): Put `bounds-of-thing-at-point' and `thing-at-point'
	properties on this symbol, with lambda forms for values.

Index: thingatpt.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/thingatpt.el,v
retrieving revision 1.40
diff -u -r1.40 thingatpt.el
--- thingatpt.el	21 Jan 2007 03:53:10 -0000	1.40
+++ thingatpt.el	27 Feb 2007 03:07:51 -0000
@@ -67,7 +67,7 @@
   "Determine the start and end buffer locations for the THING at point.
 THING is a symbol which specifies the kind of syntactic entity you want.
 Possibilities include `symbol', `list', `sexp', `defun', `filename', `url',
-`word', `sentence', `whitespace', `line', `page' and others.
+`email', `word', `sentence', `whitespace', `line', `page' and others.
 
 See the file `thingatpt.el' for documentation on how to define
 a symbol as a valid THING.
@@ -124,7 +124,7 @@
   "Return the THING at point.
 THING is a symbol which specifies the kind of syntactic entity you want.
 Possibilities include `symbol', `list', `sexp', `defun', `filename', `url',
-`word', `sentence', `whitespace', `line', `page' and others.
+`email', `word', `sentence', `whitespace', `line', `page' and others.
 
 See the file `thingatpt.el' for documentation on how to define
 a symbol as a valid THING."
@@ -340,6 +340,33 @@
       (goto-char (car bounds))
       (error "No URL here")))))
 
+;; Email addresses
+(defvar thing-at-point-email-regexp
+  "?"
+  "A regular expression probably matching an email address.
+This does not match the real name portion, only the address, optionally
+with angle brackets.")
+
+;; Haven't set 'forward-op on 'email nor defined 'forward-email' because
+;; not sure they're actually needed, and URL seems to skip them too.
+;; Note that (end-of-thing 'email) and (beginning-of-thing 'email)
+;; work automagically, though.
+
+(put 'email 'bounds-of-thing-at-point
+     (lambda ()
+       (let ((thing (thing-at-point-looking-at thing-at-point-email-regexp)))
+         (if thing
+             (let ((beginning (match-beginning 0))
+                   (end (match-end 0)))
+               (cons beginning end))))))
+
+(put 'email 'thing-at-point
+     (lambda ()
+       (let ((boundary-pair (bounds-of-thing-at-point 'email)))
+         (if boundary-pair
+             (buffer-substring-no-properties
+              (car boundary-pair) (cdr boundary-pair))))))
+
 ;; Whitespace
 
 (defun forward-whitespace (arg)
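
P.S. For anyone who wants to try the delimiter behaviour discussed
above without applying the patch, here is a minimal sketch.  It
registers a throwaway thing the same way the patch registers `email'.
Everything named `demo-...' is hypothetical, and the regexp is a
simplified stand-in chosen for illustration, not the one from the
patch.

```elisp
(require 'thingatpt)

;; ASSUMPTION: a simplified stand-in pattern, not the regexp from the patch.
(defvar demo-email-regexp
  "<?[-+_.~a-zA-Z0-9]+@[-.a-zA-Z0-9]+>?"
  "Hypothetical regexp loosely matching an email address.")

;; Register a hypothetical `demo-email' thing: the property value is a
;; function returning (START . END) for the thing at point, or nil.
(put 'demo-email 'bounds-of-thing-at-point
     (lambda ()
       (when (thing-at-point-looking-at demo-email-regexp)
         (cons (match-beginning 0) (match-end 0)))))

(defun demo-email-at (text needle)
  "Return the `demo-email' thing at the start of NEEDLE within TEXT.
TEXT is inserted into a temporary buffer; return nil if nothing matches."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (search-forward needle)
    (goto-char (match-beginning 0))
    (thing-at-point 'demo-email)))

;; Point at the start of the address in a `;' and `,' separated record.
(demo-email-at "Karl;andreas@example.org,42" "andreas")
```

Since neither `;' nor `,' is in the stand-in character classes, the
last form should return the bare address with no delimiters attached.
Only `bounds-of-thing-at-point' is set here because `thing-at-point'
falls back to it when a symbol has no `thing-at-point' property.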