unofficial mirror of help-gnu-emacs@gnu.org
* How do I check if the format of an email address is valid?
@ 2006-03-30 12:35 Lennart Borgman
  2006-03-30 14:16 ` Reiner Steib
  0 siblings, 1 reply; 12+ messages in thread
From: Lennart Borgman @ 2006-03-30 12:35 UTC (permalink / raw)


How do I do this in Elisp?


* Re: How do I check if the format of an email address is valid?
  2006-03-30 12:35 How do I check if the format of an email address is valid? Lennart Borgman
@ 2006-03-30 14:16 ` Reiner Steib
  2006-03-30 21:00   ` Lennart Borgman
       [not found]   ` <mailman.196.1143752412.2481.help-gnu-emacs@gnu.org>
  0 siblings, 2 replies; 12+ messages in thread
From: Reiner Steib @ 2006-03-30 14:16 UTC (permalink / raw)


On Thu, Mar 30 2006, Lennart Borgman wrote:

> How do I do this in Elisp?

How do you define a "valid email address"?

(a) Valid according to RFC2822?

(b) Valid in the sense that the given domain exists, can receive mail,
    and the localpart corresponds to a mailbox?

(c) Something different?

For (a), the variables `gnus-button-mid-or-mail-regexp',
`gnus-button-valid-localpart-regexp' and
`gnus-button-valid-fqdn-regexp' in `gnus-art.el' contain some related
checks.
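
For (a), a minimal sketch of how two of those variables could be
combined into a whole-string check.  The function name
`my-mail-address-format-p' is hypothetical, and anchoring the pattern at
both ends of the string is an assumption about the intended use (Gnus
uses these regexps to find addresses inside article text).  This checks
only a rough format, not full RFC 2822 syntax.

```elisp
;; Sketch only: reuse the Gnus button regexps to test a whole string.
(require 'gnus-art)   ; defines the gnus-button-valid-*-regexp variables

(defun my-mail-address-format-p (address)
  "Return t if ADDRESS looks like LOCALPART@FQDN according to Gnus.
Hypothetical helper, not part of Gnus."
  (let ((case-fold-search t))           ; the regexps are written in lower case
    (and (string-match-p
          (concat "\\`" gnus-button-valid-localpart-regexp
                  "@\\(?:" gnus-button-valid-fqdn-regexp "\\)\\'")
          address)
         t)))
```

With this, (my-mail-address-format-p "reiner.steib@gmx.de") should
return t, while a string without an "@" returns nil.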

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: How do I check if the format of an email address is valid?
  2006-03-30 14:16 ` Reiner Steib
@ 2006-03-30 21:00   ` Lennart Borgman
       [not found]   ` <mailman.196.1143752412.2481.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 12+ messages in thread
From: Lennart Borgman @ 2006-03-30 21:00 UTC (permalink / raw)
  Cc: help-gnu-emacs

Reiner Steib wrote:
> On Thu, Mar 30 2006, Lennart Borgman wrote:
>
>   
>> How do I do this in Elisp?
>>     
>
> How do you define a "valid email address"?
>
> (a) Valid according to RFC2822?
>
> (b) Valid in the sense that the given domain exists, can receive mail,
>     and the localpart corresponds to a mailbox?
>
> (c) Something different?
>
> For (a), the variables `gnus-button-mid-or-mail-regexp',
> `gnus-button-valid-localpart-regexp' and
> `gnus-button-valid-fqdn-regexp' in `gnus-art.el' contain some related
> checks.
>
> Bye, Reiner.
>   
Thanks Reiner! Two small worries:

- `message-valid-fqdn-regexp' (which is copied to
`gnus-button-valid-local-part-regexp') has a list of TLDs. Is this
really wise? I guess there will be new TLDs. (In fact there may already
be some that are not in this list.)

- It is maybe a pity that you have to load those big modules (message
and gnus) just to get the patterns.

And then I cannot stop myself from wondering whether there really is any
format of email addresses that assures that conforming addresses are
actually existing email addresses ;-)


* Re: How do I check if the format of an email address is valid?
       [not found]   ` <mailman.196.1143752412.2481.help-gnu-emacs@gnu.org>
@ 2006-03-30 21:24     ` Pascal Bourguignon
  2006-03-30 22:25       ` Lennart Borgman
  0 siblings, 1 reply; 12+ messages in thread
From: Pascal Bourguignon @ 2006-03-30 21:24 UTC (permalink / raw)


Lennart Borgman <lennart.borgman.073@student.lu.se> writes:
> Thanks Reiner! Two small worries:
>
> - `message-valid-fqdn-regexp' (which is copied to
> `gnus-button-valid-local-part-regexp') has a list of TLDs. Is this
> really wise? I guess there will be new TLDs. (In fact there may
> already be some that are not in this list.)

You can customize the variable!


> - It is maybe a pity that you have to load those big modules (message
> and gnus) just to get the patterns.


You only need to (require 'message) to get message-valid-fqdn-regexp.
In any case, if you're processing email addresses, it's rather probable
you'll have to process emails too.

And you only need to (require 'gnus-art) to get the other variables.

> And then I cannot stop myself from wondering whether there really is any
> format of email addresses that assures that conforming addresses are
> actually existing email addresses ;-)

Of course not.  

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

Nobody can fix the economy.  Nobody can be trusted with their finger
on the button.  Nobody's perfect.  VOTE FOR NOBODY.


* Re: How do I check if the format of an email address is valid?
  2006-03-30 21:24     ` Pascal Bourguignon
@ 2006-03-30 22:25       ` Lennart Borgman
  2006-03-30 23:30         ` Reiner Steib
  0 siblings, 1 reply; 12+ messages in thread
From: Lennart Borgman @ 2006-03-30 22:25 UTC (permalink / raw)
  Cc: help-gnu-emacs

Pascal Bourguignon wrote:
> Lennart Borgman <lennart.borgman.073@student.lu.se> writes:
>   
>> Thanks Reiner! Two small worries:
>>
>> - `message-valid-fqdn-regexp' (which is copied to
>> `gnus-button-valid-local-part-regexp') has a list of TLDs. Is this
>> really wise? I guess there will be new TLDs. (In fact there may
>> already be some that are not in this list.)
>>     
>
> You can customize the variable!
>   
This argument can be used in the reverse direction too. Do most users
really gain something from the list of known TLDs in the regexp?

>
>   
>> - It is maybe a pity that you have to load those big modules (message
>> and gnus) just to get the patterns.
>>     
>
>
> You only need to (require 'message) to get message-valid-fqdn-regexp.
> In any case, if you're processing email addresses, it's rather probable
> you'll have to process emails too.
>
> And you only need to (require 'gnus-art) to get the other variables.
>   
I just want to check the format of email addresses entered on an HTML page.

>   
>> And then I cannot stop myself from wondering whether there really is any
>> format of email addresses that assures that conforming addresses are
>> actually existing email addresses ;-)
>>     
>
> Of course not.  
>   
And I will probably never learn the fine art of joking in email messages ;-)


* Re: How do I check if the format of an email address is valid?
  2006-03-30 22:25       ` Lennart Borgman
@ 2006-03-30 23:30         ` Reiner Steib
  2006-03-31 11:26           ` Lennart Borgman
  0 siblings, 1 reply; 12+ messages in thread
From: Reiner Steib @ 2006-03-30 23:30 UTC (permalink / raw)


On Fri, Mar 31 2006, Lennart Borgman wrote:

> Pascal Bourguignon wrote:
>> Lennart Borgman <lennart.borgman.073@student.lu.se> writes:
>>> - `message-valid-fqdn-regexp' (which is copied to
>>> `gnus-button-valid-local-part-regexp')

You meant `gnus-button-valid-fqdn-regexp' here.

>>> has a list of TLDs. Is this really wise?

Consider the purpose of these variables in Gnus:

  `message-valid-fqdn-regexp' is used to exclude obviously bogus system
  names from being used as the domain part in a Message-ID.

  `gnus-button-valid-fqdn-regexp' is used when creating buttons
  (clickable links) when a mail address or a Message-ID appears in the
  body of mail/usenet articles: Say null123@marauder.physik.uni-ulm.de
  or reiner.steib@gmx.de etc.  The regexp should catch virtually all
  valid addresses without generating too many false positives,
  e.g. TeX code \def\foo@randomstuff.bar{} or similar.

If we miss some new TLD, it's not critical and the user can easily add
it or upgrade Gnus.
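
To illustrate how small such an addition is, here is a sketch that
builds a regexp of the same shape from a list of named TLDs.  The names
`my-fqdn-regexp' and `my-fqdn-p' and the sample TLD "jobs" are
illustrative assumptions; in Gnus itself one would simply customize
`message-valid-fqdn-regexp'.

```elisp
;; Sketch only: same shape as `message-valid-fqdn-regexp', but the named
;; TLDs come from a list, so adding one is a one-word change.
(defun my-fqdn-regexp (tlds)
  "Return a regexp matching a FQDN ending in a two letter TLD or one of TLDS.
TLDS is a non-empty list of strings.  Hypothetical helper."
  (concat "[a-z0-9][-.a-z0-9]+\\."      ; [hostname.subdomain.]domain.
          "\\(?:[a-z][a-z]\\|"          ; two letter country TLDs
          (regexp-opt tlds)             ; named TLDs
          "\\)"))

(defun my-fqdn-p (name tlds)
  "Return t if the whole string NAME matches (my-fqdn-regexp TLDS)."
  (let ((case-fold-search t))
    (and (string-match-p (concat "\\`" (my-fqdn-regexp tlds) "\\'") name)
         t)))
```

For example, (my-fqdn-p "example.jobs" '("com" "org")) returns nil, and
adding "jobs" to the list makes the same call return t.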

>>> I guess there will be new TLDs.  (In fact there may already be
>>> some that are not in this list.)

Is anyone aware of any missing TLDs?

(defcustom message-valid-fqdn-regexp
  (concat "[a-z0-9][-.a-z0-9]+\\." ;; [hostname.subdomain.]domain.
	  ;; valid TLDs:
	  "\\([a-z][a-z]" ;; two letter country TLDs
	  "\\|biz\\|com\\|edu\\|gov\\|int\\|mil\\|net\\|org"
	  "\\|aero\\|coop\\|info\\|name\\|museum"
	  "\\|arpa\\|pro\\|uucp\\|bitnet\\|bofh" ;; old style?
	  "\\)")
  "Regular expression that matches a valid FQDN."
  ;; see also: gnus-button-valid-fqdn-regexp
  :version "22.1"
  :group 'message-headers
  :type 'regexp)
     
>> You can customize the variable!

BTW, the tin newsreader even hard-codes a matrix of (in)valid
two letter country domains. The user/admin has to recompile to add new
domains.  IIRC it's used when checking for a valid FQDN.

> This argument can be used in the reverse direction too. Do most
> users really gain something from the list of known TLDs in the
> regexp?

For the purpose in Gnus: Yes, IMHO.  What would you suggest to use
instead?

>>> - It is maybe a pity that you have to load those big modules (message
>>> and gnus) just to get the patterns.
>> You only need to (require 'message) to get message-valid-fqdn-regexp.
[...]
>> And you only need to (require 'gnus-art) to get the other variables.

Well, `gnus-art.el' will in turn load "half of the moon": ;-)

,----[ emacs -Q -f ielm ]
| ELISP> (length features)
| 76
| ELISP> (require 'gnus-art)
| gnus-art
| ELISP> (length features)
| 130
`----

,----[ emacs -Q -f ielm ]
| ELISP> (length features)
| 76
| ELISP> (require 'message)
| message
| ELISP> (length features)
| 107
`----
   
> I just want to check the format of email addresses entered on an HTML page.

And you'll post-process it using Emacs?

>>> And then I cannot stop myself from wondering whether there really is any
>>> format of email addresses that assures that conforming addresses are
>>> actually existing email addresses ;-)
>> Of course not.    
> And I will probably never learn the fine art of joking in email messages ;-)

I wasn't joking. :-)  Of course not from the format, but...

Basically you need to do something like "host -t mx" (using
`dig.el'?), connect to the smtp port, send "RCPT TO:" and parse the
output ("user unknown"?).  But I'm not sure if this still works
nowadays.
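
The last step, parsing the server's answer, could look roughly like the
sketch below.  `my-smtp-rcpt-status' is a hypothetical name, the code
assumes the reply line starts with a standard three-digit SMTP status
code, and it does no network access at all.  A positive reply proves
little, since a server may accept every recipient at this stage.

```elisp
;; Sketch only: classify one SMTP reply line by its first digit.
(defun my-smtp-rcpt-status (reply)
  "Classify REPLY, one line answered to \"RCPT TO:\", by its status code.
Return `accepted' for 2xx, `temporary-failure' for 4xx, `rejected'
for 5xx, and nil if REPLY does not start with such a code."
  (when (string-match "\\`\\([245]\\)[0-9][0-9]\\(?:[ -]\\|\\'\\)" reply)
    (cdr (assoc (match-string 1 reply)
                '(("2" . accepted)
                  ("4" . temporary-failure)
                  ("5" . rejected))))))
```

For example, (my-smtp-rcpt-status "550 5.1.1 User unknown") returns the
symbol `rejected'.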

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: How do I check if the format of an email address is valid?
  2006-03-30 23:30         ` Reiner Steib
@ 2006-03-31 11:26           ` Lennart Borgman
  2006-03-31 14:05             ` Reiner Steib
  0 siblings, 1 reply; 12+ messages in thread
From: Lennart Borgman @ 2006-03-31 11:26 UTC (permalink / raw)


Reiner Steib wrote:
> Is anyone aware of any missing TLDs?
>
> (defcustom message-valid-fqdn-regexp
>   (concat "[a-z0-9][-.a-z0-9]+\\." ;; [hostname.subdomain.]domain.
> 	  ;; valid TLDs:
> 	  "\\([a-z][a-z]" ;; two letter country TLDs
> 	  "\\|biz\\|com\\|edu\\|gov\\|int\\|mil\\|net\\|org"
> 	  "\\|aero\\|coop\\|info\\|name\\|museum"
> 	  "\\|arpa\\|pro\\|uucp\\|bitnet\\|bofh" ;; old style?
> 	  "\\)")
>   "Regular expression that matches a valid FQDN."
>   ;; see also: gnus-button-valid-fqdn-regexp
>   :version "22.1"
>   :group 'message-headers
>   :type 'regexp)
>      
>   
>>> You can customize the variable!
>>>       
>
> BTW, the tin newsreader even hard-codes a matrix of (in)valid
> two letter country domains. The user/admin has to recompile to add new
> domains.  IIRC it's used when checking for a valid FQDN.
>
>   
>> This argument can be used in the reverse direction too. Do most
>> users really gain something from the list of known TLDs in the
>> regexp?
>>     
>
> For the purpose in Gnus: Yes, IMHO.  What would you suggest to use
> instead?
>   
Why not just [a-z]{2, 6}
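
In Emacs regexp syntax the interval needs backslashes and no space, so
the pattern would be written `[a-z]\{2,6\}', with each backslash doubled
inside a Lisp string.  A small sketch, using the hypothetical name
`my-loose-tld-p':

```elisp
;; Sketch only: the "two to six letters" idea in Emacs regexp syntax.
(defun my-loose-tld-p (tld)
  "Return t if the whole string TLD consists of two to six letters.
Hypothetical helper illustrating an interval in an Emacs regexp."
  (let ((case-fold-search t))
    (and (string-match-p "\\`[a-z]\\{2,6\\}\\'" tld) t)))
```

Here (my-loose-tld-p "museum") returns t and (my-loose-tld-p "x")
returns nil.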

>> I just want to check the format of email addresses entered on an HTML page.
>>     
>
> And you'll post-process it using Emacs?
>   
Just what the user entered (in completion in nxhtml-mode).

> I wasn't joking. :-)  Of course not from the format, but...
>
> Basically you need to do something like "host -t mx" (using
> `dig.el'?), connect to the smtp port, send "RCPT TO:" and parse the
> output ("user unknown"?).  But I'm not sure if this still works
> nowadays.
>
> Bye, Reiner.
>   
Thanks, I did not know. But that seems out of scope for me right
now. (And it is a bit complicated to find the tools on w32 - but they
seem to be in BIND.)


* Re: How do I check if the format of an email address is valid?
       [not found] <mailman.169.1143722125.2481.help-gnu-emacs@gnu.org>
@ 2006-03-31 13:33 ` Mathias Dahl
  2006-04-02 12:20   ` Xavier Maillard
  0 siblings, 1 reply; 12+ messages in thread
From: Mathias Dahl @ 2006-03-31 13:33 UTC (permalink / raw)


Lennart Borgman <lennart.borgman.073@student.lu.se> writes:

> How do I do this in Elisp?

In the book Mastering Regular Expressions, there is a regexp covering
a full page (or was it more?) just for matching an e-mail
address. Good luck! ... :)


* Re: How do I check if the format of an email address is valid?
  2006-03-31 11:26           ` Lennart Borgman
@ 2006-03-31 14:05             ` Reiner Steib
  2006-03-31 21:02               ` Lennart Borgman
  0 siblings, 1 reply; 12+ messages in thread
From: Reiner Steib @ 2006-03-31 14:05 UTC (permalink / raw)


On Fri, Mar 31 2006, Lennart Borgman wrote:

[ Regexp for valid TLDs ]
> Why not just [a-z]{2, 6}

Too many false positives.
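
To make the trade-off concrete, here is a sketch comparing the two TLD
patterns on the TeX example mentioned earlier in the thread.  The
simplified localpart pattern, the trailing \b and all `my-' names are
assumptions made for illustration, not what Gnus does verbatim; the TLD
alternatives are the ones from the regexp quoted earlier in the thread.

```elisp
;; Sketch only: search TEXT for something address-like, with the TLD
;; part of the pattern supplied by the caller.
(defconst my-listed-tlds
  (concat "\\(?:[a-z][a-z]"
          "\\|biz\\|com\\|edu\\|gov\\|int\\|mil\\|net\\|org"
          "\\|aero\\|coop\\|info\\|name\\|museum"
          "\\|arpa\\|pro\\|uucp\\|bitnet\\|bofh\\)")
  "TLD alternatives copied from the regexp quoted in this thread.")

(defconst my-loose-tlds "[a-z]\\{2,6\\}"
  "The proposed replacement: any two to six letters.")

(defun my-find-address (tld-regexp text)
  "Return the first address-like substring of TEXT, or nil.
TLD-REGEXP decides what counts as a top-level domain."
  (let ((case-fold-search t))
    (when (string-match
           (concat "[a-z0-9._-]+@"             ; simplified localpart
                   "[a-z0-9][-.a-z0-9]+\\."    ; [hostname.subdomain.]domain.
                   tld-regexp "\\b")
           text)
      (match-string 0 text))))
```

With the loose pattern, searching the string \def\foo@randomstuff.bar{}
returns "foo@randomstuff.bar", a false positive; with the listed TLDs
the same search returns nil.  Both patterns find reiner.steib@gmx.de in
running text.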

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: How do I check if the format of an email address is valid?
  2006-03-31 14:05             ` Reiner Steib
@ 2006-03-31 21:02               ` Lennart Borgman
  0 siblings, 0 replies; 12+ messages in thread
From: Lennart Borgman @ 2006-03-31 21:02 UTC (permalink / raw)
  Cc: help-gnu-emacs

Reiner Steib wrote:
> On Fri, Mar 31 2006, Lennart Borgman wrote:
>
> [ Regexp for valid TLDs ]
>   
>> Why not just [a-z]{2, 6}
>>     
>
> Too many false positives.
>
> Bye, Reiner.
>   
One can of course think in many ways here. Currently there seem to be 10 
proposed new TLDs: http://www.icann.org/tlds/stld-apps-19mar04/

And as Mathias pointed out, it seems to be very difficult to write an
entirely correct regexp for a valid mail address format. What about the
current regexp in Emacs? Does it give many false positives? Or, maybe
worse, many false negatives?


* Re: How do I check if the format of an email address is valid?
  2006-03-31 13:33 ` Mathias Dahl
@ 2006-04-02 12:20   ` Xavier Maillard
  2006-04-02 13:02     ` Lennart Borgman
  0 siblings, 1 reply; 12+ messages in thread
From: Xavier Maillard @ 2006-04-02 12:20 UTC (permalink / raw)
  Cc: help-gnu-emacs

   From: Mathias Dahl <brakjoller@gmail.com>

   Lennart Borgman <lennart.borgman.073@student.lu.se> writes:

   > How do I do this in Elisp?

   In the book Mastering Regular Expressions, there is a regexp covering
   a full page (or was it more?) just for matching an e-mail
   address. Good luck! ... :)

Yeah, this is the longest regexp I have ever seen. Awesome!

Once again, good luck :)

Xavier


* Re: How do I check if the format of an email address is valid?
  2006-04-02 12:20   ` Xavier Maillard
@ 2006-04-02 13:02     ` Lennart Borgman
  0 siblings, 0 replies; 12+ messages in thread
From: Lennart Borgman @ 2006-04-02 13:02 UTC (permalink / raw)
  Cc: help-gnu-emacs, Mathias Dahl

Xavier Maillard wrote:
>    From: Mathias Dahl <brakjoller@gmail.com>
>
>    Lennart Borgman <lennart.borgman.073@student.lu.se> writes:
>
>    > How do I do this in Elisp?
>
>    In the book Mastering Regular Expressions, there is a regexp covering
>    a full page (or was it more?) just for matching an e-mail
>    address. Good luck! ... :)
>
> Yeah, this is the longest regexp I have ever seen. Awesome!
>
> Once again, good luck :)
>
> Xavier
>   
For those who want to see the regexp, a Perl version is here:

    http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

