all messages for Emacs-related lists mirrored at yhetil.org
* Re: cleaning up a big regexp
       [not found] <mailman.8887.1410779723.1147.help-gnu-emacs@gnu.org>
@ 2014-09-15 11:18 ` Joost Kremers
  2014-09-15 11:26   ` Tory S. Anderson
  2014-09-15 11:41   ` Michael Albinus
  0 siblings, 2 replies; 12+ messages in thread
From: Joost Kremers @ 2014-09-15 11:18 UTC (permalink / raw)
  To: help-gnu-emacs

Tory S. Anderson wrote:
> Using gnus I have a growing regexp that represents the criteria for bulk email and splits accordingly:

[...]

> Is there a way to clean this up to make it both more readable and more
> easily editable? It seems like keeping some kind of list would be the
> way to do it, instead of an ever-lengthening string.

There's the function `regexp-opt', which takes a list of strings and
returns a regular expression that will match any of those strings.
Perhaps you can use that?
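
As a minimal sketch of what that looks like (the variable names and the
three domains below are made up for illustration, they are not taken from
the original regexp):

```elisp
;; Hypothetical list of domains, for illustration only.
(defvar my-example-domains '("twitter.com" "youtube.com" "linkedin.com"))

;; `regexp-opt' returns one regexp that matches any string in the list.
;; Special characters such as "." are quoted, so they match literally.
(defvar my-example-domain-regexp (regexp-opt my-example-domains))

(string-match-p my-example-domain-regexp "noreply@youtube.com") ; non-nil
(string-match-p my-example-domain-regexp "noreply@example.org") ; nil
```

Adding a new sender then only means adding one string to the list.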


-- 
Joost Kremers                                   joostkremers@fastmail.fm
Selbst in die Unterwelt dringt durch Spalten Licht
EN:SiS(9)



* Re: cleaning up a big regexp
  2014-09-15 11:18 ` cleaning up a big regexp Joost Kremers
@ 2014-09-15 11:26   ` Tory S. Anderson
  2014-09-15 11:41   ` Michael Albinus
  1 sibling, 0 replies; 12+ messages in thread
From: Tory S. Anderson @ 2014-09-15 11:26 UTC (permalink / raw)
  To: help-gnu-emacs

That function would shrink the regexp and make it more efficient, but my goal here is to make it easier to read and append to, rather than to optimize it. I would like to have a list, maybe something like:

("^From:.*" ((".*@bulk1.com")
          (".*@bulk2.com")
          (".*@bulk3.com")))

I'm new enough to [e]lisp that I'm not sure what list->string concatenation functions would do the trick here. 





* Re: cleaning up a big regexp
  2014-09-15 11:18 ` cleaning up a big regexp Joost Kremers
  2014-09-15 11:26   ` Tory S. Anderson
@ 2014-09-15 11:41   ` Michael Albinus
  2014-09-15 12:06     ` Tory S. Anderson
  1 sibling, 1 reply; 12+ messages in thread
From: Michael Albinus @ 2014-09-15 11:41 UTC (permalink / raw)
  To: help-gnu-emacs

Joost Kremers <joost.m.kremers@gmail.com> writes:

> There's the function `regexp-opt', which takes a list of strings and
> returns a regular expression that will match any of those strings.
> Perhaps you can use that?

regexp-opt cannot handle meta characters like "*". The OP showed such a regexp.

Maybe something like this works (untested):

(setq my-gnus-bulk-regexp
      (concat
       "^\\(From:.*@"
       (regexp-opt
	'("maillist.codeproject.com"
	  "papajohns-specials.com"
	  "qomail.quikorder.com"
	  "linkedin.com"
	  "facebookmail.com"
	  "plus.google.com"
	  "twitter.com"
	  "youtube.com"
	  "linguistlist.org"
	  "sportsauthority.com")
	'par)
       "\\)\\|\\(To:.*torysanderson@gmail.com\\)"))

Best regards, Michael.
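
A quick way to try a regexp built like the one above is to run it against a
few sample header lines with `string-match-p'. This is only a sketch: the
variable name, the shortened domain list and the sample headers are
hypothetical, and `t' is used for the PAREN argument where the code above
passes 'par (any non-nil value asks `regexp-opt' for a surrounding group).

```elisp
;; Shortened, hypothetical version of the regexp built above.
(defvar my-example-bulk-regexp
  (concat "^\\(From:.*@"
          (regexp-opt '("twitter.com" "youtube.com") t)
          "\\)\\|\\(To:.*torysanderson@gmail.com\\)"))

;; Sample header lines, made up for this check.
(string-match-p my-example-bulk-regexp "From: News <news@twitter.com>") ; non-nil
(string-match-p my-example-bulk-regexp "From: friend@example.org")      ; nil
(string-match-p my-example-bulk-regexp "To: torysanderson@gmail.com")   ; non-nil
```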




* Re: cleaning up a big regexp
  2014-09-15 11:41   ` Michael Albinus
@ 2014-09-15 12:06     ` Tory S. Anderson
  2014-09-15 12:18       ` Michael Albinus
  2014-09-15 12:36       ` Stefan Monnier
  0 siblings, 2 replies; 12+ messages in thread
From: Tory S. Anderson @ 2014-09-15 12:06 UTC (permalink / raw)
  To: Michael Albinus; +Cc: help-gnu-emacs

Ok. Applying the advice I've received so far, I have the following (which doesn't quite evaluate). Clearly my syntax is wrong. 

(setq my-gnus-bulk-from-address-list '("@maillist.codeproject.com"
				      "@papajohns-specials.com"
				      "@qomail.quikorder.com"
				      "@linkedin.com"
				      "@facebookmail.com"
				      "@plus.google.com"
				      "@twitter.com"
				      "@youtube.com"
				      "@linguistlist.org"
				      "sportsauthority.com")) ;; list of bulkmail addresses
(setq my-gnus-bulk-from-address-regexp (mapconcat `regexp-quote my-gnus-bulk-from-address-list "\\|")) ;; make OR

(setq my-gnus-bulk-from-regexp
      (regexp-opt (mapconcat `(concat "^From:.*") my-gnus-bulk-from-address-regexp "\\|"))) ;; apply "From.*" to the start of each address




* Re: cleaning up a big regexp
  2014-09-15 12:06     ` Tory S. Anderson
@ 2014-09-15 12:18       ` Michael Albinus
  2014-09-15 12:36       ` Stefan Monnier
  1 sibling, 0 replies; 12+ messages in thread
From: Michael Albinus @ 2014-09-15 12:18 UTC (permalink / raw)
  To: Tory S. Anderson; +Cc: help-gnu-emacs

torys.anderson@gmail.com (Tory S. Anderson) writes:

> (setq my-gnus-bulk-from-address-regexp (mapconcat `regexp-quote my-gnus-bulk-from-address-list "\\|")) ;; make OR

Use a single apostrophe "'" here.

> (setq my-gnus-bulk-from-regexp
>       (regexp-opt (mapconcat `(concat "^From:.*") my-gnus-bulk-from-address-regexp "\\|"))) ;; apply "From.*" to the start of each address

That's wrong. You `concat' just one element, which is not needed. `regexp-opt'
cannot be applied over meta characters, like "*". Even if you quote
them, it doesn't work.
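
For reference, `mapconcat' expects a function of one argument as its first
argument, so one way to put the "^From:.*" prefix in front of every address
is a lambda. A sketch, assuming a shortened list and hypothetical variable
names:

```elisp
;; Hypothetical, shortened address list.
(defvar my-example-addresses '("@twitter.com" "@youtube.com"))

;; Prefix each quoted address with "^From:.*" and join with "\\|".
(defvar my-example-from-regexp
  (mapconcat (lambda (addr) (concat "^From:.*" (regexp-quote addr)))
             my-example-addresses
             "\\|"))
;; my-example-from-regexp is now
;; "^From:.*@twitter\\.com\\|^From:.*@youtube\\.com"
```

Here `regexp-quote' is applied to the addresses only, so the ".*" in the
prefix keeps its special meaning.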

Best regards, Michael.




* Re: cleaning up a big regexp
  2014-09-15 12:06     ` Tory S. Anderson
  2014-09-15 12:18       ` Michael Albinus
@ 2014-09-15 12:36       ` Stefan Monnier
  2014-09-15 14:11         ` Michael Albinus
  1 sibling, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2014-09-15 12:36 UTC (permalink / raw)
  To: help-gnu-emacs

> (mapconcat `regexp-quote my-gnus-bulk-from-address-list "\\|"))

A.k.a (regexp-opt my-gnus-bulk-from-address-list)


-- Stefan





* Re: cleaning up a big regexp
  2014-09-15 12:36       ` Stefan Monnier
@ 2014-09-15 14:11         ` Michael Albinus
  2014-09-15 14:42           ` Thanks! " Tory S. Anderson
  2014-09-15 19:00           ` Stefan Monnier
  0 siblings, 2 replies; 12+ messages in thread
From: Michael Albinus @ 2014-09-15 14:11 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: help-gnu-emacs

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> (mapconcat `regexp-quote my-gnus-bulk-from-address-list "\\|"))
>
> A.k.a (regexp-opt my-gnus-bulk-from-address-list)

Not when the strings contain meta characters like "*". I keep repeating this
because I fell into this trap very recently. See commit
trunk r117880, where I refuse to admit that I have been an idiot.

> -- Stefan

Best regards, Michael.




* Thanks! Re: cleaning up a big regexp
  2014-09-15 14:11         ` Michael Albinus
@ 2014-09-15 14:42           ` Tory S. Anderson
  2014-09-15 19:00           ` Stefan Monnier
  1 sibling, 0 replies; 12+ messages in thread
From: Tory S. Anderson @ 2014-09-15 14:42 UTC (permalink / raw)
  To: Michael Albinus; +Cc: help-gnu-emacs, Stefan Monnier

Okay. In my code I now have: 
(setq my-gnus-bulk-from-address-list '("@maillist.codeproject.com"
				      "@papajohns-specials.com"
				      "@qomail.quikorder.com"
				      "@linkedin.com"
				      "@facebookmail.com"
				      "@plus.google.com"
				      "@twitter.com"
				      "@youtube.com"
				      "@linguistlist.org"
				      "@sportsauthority.com")) ;; list of bulkmail addresses
(setq my-bulk-from (mapconcat (lambda (x) (concat "^From:.*" x)) my-gnus-bulk-from-address-list "\\|"))

(setq nnmail-split-methods
       '(("mail.bulk" my-bulk-from)


Looks much better and seems to work. It prepends "^From:.*" to each string and then joins them into a single string, separated by "\\|". Any suggestions for improvement are welcome; big thanks for the suggestions that helped me figure this out. regexp-opt will probably make this more efficient (especially for big lists), but I haven't figured out how to plug it in yet.






* Re: cleaning up a big regexp
  2014-09-15 14:11         ` Michael Albinus
  2014-09-15 14:42           ` Thanks! " Tory S. Anderson
@ 2014-09-15 19:00           ` Stefan Monnier
  2014-09-16 11:05             ` Regexp in nnmail-split-methods (was Re: cleaning up a big regexp) Tory S. Anderson
       [not found]             ` <mailman.8968.1410865533.1147.help-gnu-emacs@gnu.org>
  1 sibling, 2 replies; 12+ messages in thread
From: Stefan Monnier @ 2014-09-15 19:00 UTC (permalink / raw)
  To: Michael Albinus; +Cc: help-gnu-emacs

>>> (mapconcat `regexp-quote my-gnus-bulk-from-address-list "\\|"))
>> A.k.a (regexp-opt my-gnus-bulk-from-address-list)
> Not when the strings contain meta characters like "*".

No, no: even if they do.  Read again: he mapconcats `regexp-quote'.
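
To make the point concrete, here is a small sketch with two made-up strings
that contain metacharacters (the variable names are hypothetical). Both
approaches quote the strings, so "." and "*" only match themselves:

```elisp
;; Hypothetical strings that contain regexp metacharacters.
(defvar my-example-strings '("a.b" "c*d"))

;; Quote each string by hand and join the results with "\\|".
(defvar my-example-by-hand
  (mapconcat #'regexp-quote my-example-strings "\\|"))
;; my-example-by-hand is "a\\.b\\|c\\*d"

;; Let `regexp-opt' do the quoting and joining.
(defvar my-example-by-opt (regexp-opt my-example-strings))

;; Both match the literal strings and reject near misses.
(string-match-p my-example-by-opt "c*d")  ; non-nil
(string-match-p my-example-by-opt "ccd")  ; nil
(string-match-p my-example-by-hand "ccd") ; nil
```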


        Stefan




* Regexp in nnmail-split-methods (was Re: cleaning up a big regexp)
  2014-09-15 19:00           ` Stefan Monnier
@ 2014-09-16 11:05             ` Tory S. Anderson
       [not found]             ` <mailman.8968.1410865533.1147.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 12+ messages in thread
From: Tory S. Anderson @ 2014-09-16 11:05 UTC (permalink / raw)
  To: help-gnu-emacs

Okay; I've run into one more problem that is probably (hopefully) a simple fix. Running the following code, I get a nice regexp in my "my-bulk-from" variable. However, if I set that variable as the criterion for "mail.bulk" splitting, all my split methods break and all my mail ends up in my main inbox. If, on the other hand, I paste the value of that variable (rather than referencing the variable), it works as expected. I expect I'm missing a quote or something; more importantly, I'm missing some understanding.

Why does `("mail.bulk" my-bulk-from)` fail but `("mail.bulk" "BIG REGEXP")` work, when the value of my-bulk-from is BIG REGEXP?



(setq my-gnus-bulk-from-address-list '("@maillist.codeproject.com"
				      "@papajohns-specials.com"
				      "@qomail.quikorder.com"
				      "@linkedin.com"
				      "@facebookmail.com"
				      "@plus.google.com"
				      "@twitter.com"
				      "@youtube.com"
				      "@linguistlist.org"
				      "@sportsauthority.com")) ;; list of bulkmail addresses
(setq my-bulk-from (concat "^From:.*" (regexp-opt my-gnus-bulk-from-address-list)))

(setq nnmail-split-methods
;       '(("mail.bulk" my-bulk-from) ;; breaks my split-methods
	 ("mail.bulk" "^From:.*\\(?:@\\(?:facebookmail\\.com\\|lin\\(?:guistlist\\.org\\|kedin\\.com\\)\\|\\(?:maillist\\.codeproject\\|p\\(?:apajohns-specials\\|lus\\.google\\)\\|qomail\\.quikorder\\|sportsauthority\\|twitter\\|youtube\\)\\.com\\)\\)")




* Re: Regexp in nnmail-split-methods (was Re: cleaning up a big regexp)
       [not found]             ` <mailman.8968.1410865533.1147.help-gnu-emacs@gnu.org>
@ 2014-09-16 12:33               ` sokobania.01
  2014-09-17 21:20                 ` RESOLVED " Tory S. Anderson
  0 siblings, 1 reply; 12+ messages in thread
From: sokobania.01 @ 2014-09-16 12:33 UTC (permalink / raw)
  To: help-gnu-emacs

On Tuesday, 16 September 2014 13:05:16 UTC+2, Tory S. Anderson wrote:
> (setq nnmail-split-methods
> ;       '(("mail.bulk" my-bulk-from) ;; breaks my split-methods
> 	 ("mail.bulk" "^From:.*\\(?:@\\(?:facebookmail\\.com\\|lin\\(?:guistlist\\.org\\|kedin\\.com\\)\\|\\(?:maillist\\.codeproject\\|p\\(?:apajohns-specials\\|lus\\.google\\)\\|qomail\\.quikorder\\|sportsauthority\\|twitter\\|youtube\\)\\.com\\)\\)")

You have to use a backquote (backtick) and a comma so that the symbol my-bulk-from is replaced with its value in the list you create:
(setq nnmail-split-methods
       `(("mail.bulk" ,my-bulk-from) ;; Won't break my split-methods
         ("mail.bulk" "^From:.*\\(?:@\\(?:facebookmail\\.com\\|lin\\(?:guistlist\\.org\\|kedin\\.com\\)\\|\\(?:maillist\\.codeproject\\|p\\(?:apajohns-specials\\|lus\\.google\\)\\|qomail\\.quikorder\\|sportsauthority\\|twitter\\|youtube\\)\\.com\\)\\)")

This is some lispish stuff...
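
The difference between the two quoting styles can be seen in isolation with
a small sketch. All names here are hypothetical, the regexp stands in for
the real value of my-bulk-from, and the "mail.misc" catch-all entry is only
an assumed example of a second rule:

```elisp
;; Hypothetical regexp standing in for the real my-bulk-from value.
(defvar my-example-bulk-from "^From:.*@example\\.com")

;; Plain quote: nothing inside is evaluated, so the second element
;; of the entry is the symbol `my-example-bulk-from', not a string.
(defvar my-example-quoted
  '(("mail.bulk" my-example-bulk-from)
    ("mail.misc" "")))

;; Backquote plus comma: only the form after the comma is evaluated,
;; so the entry holds the regexp string.
(defvar my-example-backquoted
  `(("mail.bulk" ,my-example-bulk-from)
    ("mail.misc" "")))

(symbolp (cadr (car my-example-quoted)))     ; t
(stringp (cadr (car my-example-backquoted))) ; t
```

Building the list with `list' instead, as in
(list (list "mail.bulk" my-example-bulk-from)), would also evaluate the
variable; backquote is just more compact when most of the list is constant.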
HTH
)jack(



* RESOLVED Re: Regexp in nnmail-split-methods (was Re: cleaning up a big regexp)
  2014-09-16 12:33               ` sokobania.01
@ 2014-09-17 21:20                 ` Tory S. Anderson
  0 siblings, 0 replies; 12+ messages in thread
From: Tory S. Anderson @ 2014-09-17 21:20 UTC (permalink / raw)
  To: sokobania.01; +Cc: help-gnu-emacs

That did the trick. Thanks!




