* Re: Spam through the newsgroup gateway
From: Nuno Silva @ 2018-10-25 20:30 UTC
To: help-gnu-emacs

On 2018-10-25, Bob Proulx wrote:

> Since we have been talking about the newsgroup gateway of late...
>
> The recent spam messages just now to the mailing list came through the
> newsgroup and not the mailing list. There isn't a way to filter it
> from the mailing list since it is done upstream by Mailman outside of
> our control.
>
> It is an example of a peeve of mine with the way Mailman handles
> email. When spam enters the newsgroup it is gateway'd directly to the
> mailing list bypassing spam filtering. If it went through the spam
> filtering the same as other mail then I would be okay with it. The
> only direct way to stop it is to block messages from the gateway. And
> obviously we have already talked about why that isn't desired.
>
> Public Service Announcement: Please do not reply to spam. If a valid
> message is in reply to a spam message then it refers to it and in a
> sense validates it. To talk about spam please use an independent
> thread so as not to validate the original spam.

Here I only saw the two recent spam messages in the Gmane group. They
did not appear in the USENET group.

--
Nuno Silva
* Re: Spam through the newsgroup gateway
From: Bob Proulx @ 2018-10-25 20:41 UTC
To: help-gnu-emacs

Nuno Silva wrote:
> Here I only saw the two recent spam messages in the Gmane group. They
> did not appear in the USENET group.

I will guess that they were filtered by one of the news servers in the
mesh of hosts passing news articles around, and therefore you didn't
see them on your news server. Here is the host Path they listed. Anyone
reading from any of those news servers would have seen the messages
there.

Path: usenet.stanford.edu!e5-v6no2874978qtr.0!news-out.google.com!o27-v6ni12010qtk.1!nntp.google.com!e5-v6no2874964qtr.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail

It is also possible that they were posted, and then an anti-spam cancel
control (possibly an automated cancel-bot) canceled the messages from
the newsgroup. That means that if one reads the newsgroup later, the
spam will have been removed before it is seen. But it would already
have been passed on via the gateway to the mailing list.

Just guessing...

Bob
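A note on reading the Path header quoted above: each news server that
relays an article prepends its own name, separated by "!", so the
leftmost entry is the most recent hop and the rightmost is closest to
where the article was injected. A minimal Python sketch of splitting it
into hops follows; the function name path_hops is illustrative only and
not part of any news software.

```python
def path_hops(path_header: str) -> list[str]:
    """Split a Usenet Path header into hosts, most recent hop first."""
    value = path_header
    # Drop the "Path:" field name if the whole header line was passed in.
    if value.lower().startswith("path:"):
        value = value.split(":", 1)[1]
    return [hop.strip() for hop in value.split("!") if hop.strip()]


# The header exactly as quoted in the message above.
PATH = (
    "Path: usenet.stanford.edu!e5-v6no2874978qtr.0!news-out.google.com!"
    "o27-v6ni12010qtk.1!nntp.google.com!e5-v6no2874964qtr.0!"
    "postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail"
)

hops = path_hops(PATH)
for position, host in enumerate(hops, start=1):
    print(position, host)
```

Read right to left, the list suggests the article was posted through
Google Groups and relayed outward from there, which fits Bob's guess
that other servers in the mesh may have dropped or canceled it.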
* Re: Spam through the newsgroup gateway
From: Emanuel Berg @ 2018-10-25 20:57 UTC
To: help-gnu-emacs

Sorry, I didn't look closely enough. I see the spam in the Gmane group
and in the newsgroup.

The reason I didn't see it is that "solution manual" is a spammer
thing that has been around for years, so I killfiled it long ago.

Yes, that is most definitely coming from the newsgroup!

--
underground experts united
http://user.it.uu.se/~embe8573
* Re: Spam through the newsgroup gateway
From: Van L @ 2018-10-25 22:06 UTC
To: help-gnu-emacs

> "solution manual" is a spammer thing

On my mailing list reader, I can mark the thing as junk but even the
junk folder barfs it back out if I try to move it there.
* Re: Spam through the newsgroup gateway
From: Emanuel Berg @ 2018-10-26 10:57 UTC
To: help-gnu-emacs

Van L wrote:

>> "solution manual" is a spammer thing
>
> On my mailing list reader, I can mark the
> thing as junk but even the junk folder barfs
> it back out if I try to move it there.

The guy who wrote whatever software it is that propagates it forever
sure did a fine job :) I hope he ended up doing something better with
his talent than a spammer career.

--
underground experts united
http://user.it.uu.se/~embe8573
* Re: Spam through the newsgroup gateway
From: Emanuel Berg @ 2018-10-25 20:48 UTC
To: help-gnu-emacs

Nuno Silva wrote:

>> Since we have been talking about the
>> newsgroup gateway of late... The recent spam
>> messages just now to the mailing list came
>> through the newsgroup and not the mailing
>> list. There isn't a way to filter it from
>> the mailing list since it is done upstream
>> by Mailman outside of our control. It is an
>> example of a peeve of mine with the way
>> Mailman handles email. When spam enters the
>> newsgroup it is gateway'd directly to the
>> mailing list bypassing spam filtering. If it
>> went through the spam filtering the same as
>> other mail then I would be okay with it.
>> The only direct way to stop it is to block
>> messages from the gateway. And obviously we
>> have already talked about why that isn't
>> desired. Public Service Announcement: Please
>> do not reply to spam. If a valid message is
>> in reply to a spam message then it refers to
>> it and in a sense validates it. To talk
>> about spam please use an independent thread
>> so as not to validate the original spam.
>
> Here I only saw the two recent spam messages in
> the Gmane group. They did not appear in the
> USENET group.

I have not seen any spam in the Gmane group either?

--
underground experts united
http://user.it.uu.se/~embe8573
* Spam through the newsgroup gateway
From: Bob Proulx @ 2018-10-25 19:13 UTC
To: help-gnu-emacs

Since we have been talking about the newsgroup gateway of late...

The recent spam messages just now to the mailing list came through the
newsgroup and not the mailing list. There isn't a way to filter it
from the mailing list since it is done upstream by Mailman outside of
our control.

It is an example of a peeve of mine with the way Mailman handles
email. When spam enters the newsgroup it is gateway'd directly to the
mailing list bypassing spam filtering. If it went through the spam
filtering the same as other mail then I would be okay with it. The
only direct way to stop it is to block messages from the gateway. And
obviously we have already talked about why that isn't desired.

Public Service Announcement: Please do not reply to spam. If a valid
message is in reply to a spam message then it refers to it and in a
sense validates it. To talk about spam please use an independent
thread so as not to validate the original spam.

Bob
* Re: Spam through the newsgroup gateway
From: Garreau, Alexandre @ 2018-10-27 17:01 UTC
To: help-gnu-emacs

On 2018-10-25 at 13:13, Bob Proulx wrote:
> Since we have been talking about the newsgroup gateway of late...
>
> The recent spam messages just now to the mailing list came through the
> newsgroup and not the mailing list. There isn't a way to filter it
> from the mailing list since it is done upstream by Mailman outside of
> our control.
>
> It is an example of a peeve of mine with the way Mailman handles
> email. When spam enters the newsgroup it is gateway'd directly to the
> mailing list bypassing spam filtering. If it went through the spam
> filtering the same as other mail then I would be okay with it. The
> only direct way to stop it is to block messages from the gateway. And
> obviously we have already talked about why that isn't desired.
>
> Public Service Announcement: Please do not reply to spam. If a valid
> message is in reply to a spam message then it refers to it and in a
> sense validates it. To talk about spam please use an independent
> thread so as not to validate the original spam.

Why so? If nothing is sent to whoever sent the mail, will they track
the mailing list or its archive to find some other mail referring to
it, take this as encouragement, and post more spam?

Otherwise, what’s the problem with validation if it’s for a single
spam? Let’s say someone’s antispam blocked that spam: it seems normal
to me that, whenever a discussion is about some spam that has been
relayed by the list, the user either sees the aforementioned spam, to
acknowledge the problem others are experiencing (and get a sample of
it), or does not see the thread at all, as they’re not concerned.

Ideally there should be a way to attach metadata so that when you
reply to something you can mark it as spam for the people seeing your
message, for instance with a mail header.
* Re: Spam through the newsgroup gateway
From: Bob Proulx @ 2018-11-10 22:17 UTC
To: help-gnu-emacs; +Cc: Garreau, Alexandre

Alexandre Garreau wrote:
> Bob Proulx wrote:
> > Public Service Announcement: Please do not reply to spam. If a valid
> > message is in reply to a spam message then it refers to it and in a
> > sense validates it. To talk about spam please use an independent
> > thread so as not to validate the original spam.
>
> Why so?

The best anti-spam engines in practice are learning engines such as
Bayes and others. Spam characteristics change quickly, and the human
senders keep trying to be sneakier than before. We use no fewer than
three! SpamAssassin, Bogofilter, and CRM114. By far CRM114 is the
best of those three. But there are subtle differences that keep me
playing one off the other and therefore continuing to add engines
rather than remove them.

Since they are learning engines they must be trained in order to
learn. The best training has been training on error. When the
classification is wrong it must be corrected.

All messages are fed through the anti-spam classification engines
twice. Once on the frontend in order to classify the message to
determine if it should be automatically discarded. And then once
again after the messages go through the mailing list to train on any
errors. Since the mailing lists are relatively spam free (IMNHO), I
assume that any message through the mailing list is a desired
message. If any of the learning engines think otherwise then it
triggers training to learn that message as non-spam.

SpamAssassin knows the structure of email, what's a header and what is
the body. Bogofilter and CRM114 have no knowledge of email structure
and process the message as a raw file looking at tokens in the headers
and structure and learning them as either indicators or not
dynamically. For them this includes IP addresses and email addresses
and everything. Everything is open to being latched onto.

Just recently, due to our conversations about the newsgroup gateway
here, I have modified this algorithm slightly. I now look for the
newsgroup gateway header. If a message entered through the newsgroup
then I ignore it. There isn't anything I can do about it. Training
on it makes no sense. Therefore I ignore it. No training. But until
recently I did train on newsgroup messages too.

If someone replies to the spam then the email headers, the structure
of it and, goodness forbid, anything they quote from the message (top
posting on the entire spam is worst) may all have been associated with
spam, but when it comes through the mailing list it will now be
associated with non-spam. Training the learning engines on it will
pull the database toward thinking that that type of message, spam
though it is, is desirable on the mailing list and will pass it
through in the future. It will eventually correct itself but that may
take a while. A while being around a month for the size of the token
database we keep. From week to week the trend in spam changes.

> If nothing is sent to whoever sent the mail, will they track the
> mailing list or its archive to find some other mail referring to it,
> take this as encouragement, and post more spam?

Not likely. I think for spammers it is mostly send and forget (like a
"fire and forget" military missile).

> Otherwise, what's the problem with validation if it's for a single
> spam? Let's say someone's antispam blocked that spam: it seems normal
> to me that, whenever a discussion is about some spam that has been
> relayed by the list, the user either sees the aforementioned spam, to
> acknowledge the problem others are experiencing (and get a sample of
> it), or does not see the thread at all, as they're not concerned.

If it is a single spam it isn't the end of the world. It is all just
incremental, because it will be used to train the learning engines.
And they will recover given enough time and good later input. But
every little bit counts!

> Ideally there should be a way to attach metadata so that when you
> reply to something you can mark it as spam for the people seeing your
> message, for instance with a mail header.

There are systems in use where the community can vote upon messages.
They usually require multiple votes, say five, from known quality
voters, and then the message is hidden. But mostly we see those with
web page forums. Since this is a mailing list, in order to install
such a thing we would need to have users trained on how to do this.

As another data point in this area, the Debian mailing lists have an
address to which people can "bounce" spam for further training of
their anti-spam learning engines. It also serves as a notification to
the listmaster that a new type of spam is slipping through and needs
help to be blocked (they use procmail rules, we do too). (Mutt has a
'b'ounce mail action, other mailers may or may not.) We could set up
something like that but one does not exist at the moment. With some
more work it could be useful, if people were to contribute to it the
spams that slip through into the mailing list.

Sorry for the long delay in answering this message. Life and time is
what keeps everything from happening all at once.

Bob
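The second pass Bob describes (train on error after list delivery,
skipping anything that arrived via the newsgroup gateway) can be
sketched roughly as below. This is only an illustration under
assumptions: TokenFilter is a toy word counter standing in for a real
engine and is not how Bogofilter or CRM114 work internally, and the
header name X-Example-News-Gateway is hypothetical, since the thread
does not say which header marks gatewayed messages.

```python
from collections import Counter
from email import message_from_string

# Hypothetical header name; the real gateway header is not given in the thread.
GATEWAY_HEADER = "X-Example-News-Gateway"


class TokenFilter:
    """Toy learning engine: counts raw tokens, headers and body alike."""

    def __init__(self) -> None:
        self.spam = Counter()
        self.ham = Counter()

    def is_spam(self, raw: str) -> bool:
        tokens = raw.lower().split()
        spam_score = sum(self.spam[t] for t in tokens)
        ham_score = sum(self.ham[t] for t in tokens)
        return spam_score > ham_score

    def learn(self, raw: str, spam: bool) -> None:
        target = self.spam if spam else self.ham
        target.update(raw.lower().split())


def train_on_error(engine: TokenFilter, raw: str) -> str:
    """Post-delivery pass: anything that reached the list is assumed wanted."""
    msg = message_from_string(raw)
    if msg[GATEWAY_HEADER] is not None:
        return "skipped"  # entered via the newsgroup: no training at all
    if engine.is_spam(raw):
        # The engine disagrees with the assumption, so correct it.
        engine.learn(raw, spam=False)
        return "trained"
    return "ok"
```

Feeding this sketch a list reply that quotes the spam shows the drift
described above: once the reply is learned as non-spam, the toy filter
may stop flagging the original spam tokens as well, until later
training pulls it back.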
* Re: Spam through the newsgroup gateway
From: Rusi @ 2018-11-16 3:38 UTC
To: help-gnu-emacs

On Sunday, November 11, 2018 at 3:53:08 AM UTC+5:30, Bob Proulx wrote:
> Alexandre Garreau wrote:
> > Bob Proulx wrote:
> > > Public Service Announcement: Please do not reply to spam. If a valid
> > > message is in reply to a spam message then it refers to it and in a
> > > sense validates it. To talk about spam please use an independent
> > > thread so as not to validate the original spam.
> >
> > Why so?
>
> The best anti-spam engines in practice are learning engines such as
> Bayes and others. Spam characteristics change quickly, and the human
> senders keep trying to be sneakier than before. We use no fewer than
> three! SpamAssassin, Bogofilter, and CRM114. By far CRM114 is the
> best of those three. But there are subtle differences that keep me
> playing one off the other and therefore continuing to add engines
> rather than remove them.
>
> Since they are learning engines they must be trained in order to
> learn. The best training has been training on error. When the
> classification is wrong it must be corrected.
>
> All messages are fed through the anti-spam classification engines
> twice. Once on the frontend in order to classify the message to
> determine if it should be automatically discarded. And then once
> again after the messages go through the mailing list to train on any
> errors. Since the mailing lists are relatively spam free (IMNHO), I
> assume that any message through the mailing list is a desired
> message. If any of the learning engines think otherwise then it
> triggers training to learn that message as non-spam.
>
> SpamAssassin knows the structure of email, what's a header and what is
> the body. Bogofilter and CRM114 have no knowledge of email structure
> and process the message as a raw file looking at tokens in the headers
> and structure and learning them as either indicators or not
> dynamically. For them this includes IP addresses and email addresses
> and everything. Everything is open to being latched onto.
>
> Just recently, due to our conversations about the newsgroup gateway
> here, I have modified this algorithm slightly. I now look for the
> newsgroup gateway header. If a message entered through the newsgroup
> then I ignore it. There isn't anything I can do about it. Training
> on it makes no sense. Therefore I ignore it. No training. But until
> recently I did train on newsgroup messages too.
>
> If someone replies to the spam then the email headers, the structure
> of it and, goodness forbid, anything they quote from the message (top
> posting on the entire spam is worst) may all have been associated with
> spam, but when it comes through the mailing list it will now be
> associated with non-spam. Training the learning engines on it will
> pull the database toward thinking that that type of message, spam
> though it is, is desirable on the mailing list and will pass it
> through in the future. It will eventually correct itself but that may
> take a while. A while being around a month for the size of the token
> database we keep. From week to week the trend in spam changes.
>
> > If nothing is sent to whoever sent the mail, will they track the
> > mailing list or its archive to find some other mail referring to it,
> > take this as encouragement, and post more spam?
>
> Not likely. I think for spammers it is mostly send and forget (like a
> "fire and forget" military missile).
>
> > Otherwise, what's the problem with validation if it's for a single
> > spam? Let's say someone's antispam blocked that spam: it seems normal
> > to me that, whenever a discussion is about some spam that has been
> > relayed by the list, the user either sees the aforementioned spam, to
> > acknowledge the problem others are experiencing (and get a sample of
> > it), or does not see the thread at all, as they're not concerned.
>
> If it is a single spam it isn't the end of the world. It is all just
> incremental, because it will be used to train the learning engines.
> And they will recover given enough time and good later input. But
> every little bit counts!
>
> > Ideally there should be a way to attach metadata so that when you
> > reply to something you can mark it as spam for the people seeing your
> > message, for instance with a mail header.
>
> There are systems in use where the community can vote upon messages.
> They usually require multiple votes, say five, from known quality
> voters, and then the message is hidden. But mostly we see those with
> web page forums. Since this is a mailing list, in order to install
> such a thing we would need to have users trained on how to do this.
>
> As another data point in this area, the Debian mailing lists have an
> address to which people can "bounce" spam for further training of
> their anti-spam learning engines. It also serves as a notification to
> the listmaster that a new type of spam is slipping through and needs
> help to be blocked (they use procmail rules, we do too). (Mutt has a
> 'b'ounce mail action, other mailers may or may not.) We could set up
> something like that but one does not exist at the moment. With some
> more work it could be useful, if people were to contribute to it the
> spams that slip through into the mailing list.
>
> Sorry for the long delay in answering this message. Life and time is
> what keeps everything from happening all at once.
>
> Bob

You seem to be managing a splendid job with ML-news gateway spam
[ Compare https://groups.google.com/forum/#!forum/comp.lang.python ]

Wonder how easy it would be for you to share your know-how in
capsule/summary form? (assuming the folks managing comp.lang.python
are interested)