unofficial mirror of notmuch@notmuchmail.org
* Is it possible to search HTML contents of messages
@ 2024-08-31 10:19 Mohsin Kaleem
       [not found] ` <87jzfvj508.fsf@tethera.net>
  0 siblings, 1 reply; 3+ messages in thread
From: Mohsin Kaleem @ 2024-08-31 10:19 UTC (permalink / raw)
  To: notmuch

[-- Attachment #1: Type: text/plain, Size: 736 bytes --]


Hi there,

I've recently started getting some very aggressive phishing emails on my
mail server. I was hoping to set up a filter or rule to blanket-mark
them, since they keep changing sender, subject, body and IPs. One thing
they have in common is that they use hardbin.com as a hosting platform
for exploits, so I was thinking of filtering on that. However, it seems
that notmuch search doesn't search the HTML portion of a message, only
the plaintext part above it. Given that this message is HTML-only, I'm
not sure how to filter for messages with a link to hardbin.com. Any
advice would be appreciated. I've attached the raw notmuch message and
have been searching with:

notmuch search --exclude=true thread:00000000000033ae and body:/hardbin.com/
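
Since the link only appears inside the HTML part, one possible workaround
is to scan the raw message files outside of notmuch. The following is a
minimal sketch, not a notmuch feature: it uses only the Python standard
library, and the helper names (link_hosts, links_to) are made up for
illustration. It decodes each text/html part (including quoted-printable)
and collects the host of every href it finds.

```python
import email
from email import policy
from html.parser import HTMLParser
from urllib.parse import urlsplit


class _HrefCollector(HTMLParser):
    """Collect the value of every href attribute seen in the document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)


def link_hosts(raw_bytes):
    """Return the lower-cased hosts linked from the HTML parts of a raw message."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    hosts = set()
    for part in msg.walk():
        if part.get_content_type() != "text/html":
            continue
        collector = _HrefCollector()
        # get_content() undoes the transfer encoding and the charset.
        collector.feed(part.get_content())
        for href in collector.hrefs:
            host = urlsplit(href).hostname
            if host:
                hosts.add(host)
    return hosts


def links_to(raw_bytes, domain):
    """True if any link points at `domain` or one of its subdomains."""
    return any(h == domain or h.endswith("." + domain)
               for h in link_hosts(raw_bytes))
```

A wrapper could read each file listed by `notmuch search --output=files`,
call `links_to(data, "hardbin.com")` on its bytes, and tag the matching
messages by message id. That wiring is left out here because it depends
on the local setup.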


[-- Attachment #2: notmuch-raw.txt --]
[-- Type: text/plain, Size: 8203 bytes --]

Return-Path: <secureserver@kisara.moe>
X-Original-To: mohkale@kisara.moe
Delivered-To: mohkale@kisara.moe
Received: from mail1.alcaplotsltd.org (unknown [23.95.37.96])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature ECDSA (prime256v1) server-digest SHA256)
	(No client certificate requested)
	by kisara.moe (Postfix) with ESMTPS id 4951BA105D
	for <mohkale@kisara.moe>; Thu, 29 Aug 2024 23:42:10 +0200 (CEST)
From: kisara.moe <secureserver@kisara.moe>
To: mohkale@kisara.moe
Subject: Password Expiration Notification today 8/29/2024 9:42:08 p.m.
Date: 29 Aug 2024 21:42:08 +0000
Message-ID: <20240829214208.4EA597F3C7880A2A@kisara.moe>
MIME-Version: 1.0
Content-Type: text/html;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<html><head>
<meta name=3D"GENERATOR" content=3D"MSHTML 11.00.9600.19003">
<meta http-equiv=3D"X-UA-Compatible" content=3D"IE=3Dedge">
</head>
<body>
<table width=3D"640" style=3D'color: rgb(68, 68, 68); text-transform: none;=
 letter-spacing: normal; font-family: "segoe ui semilight", "segoe ui", ver=
dana, sans-serif, serif, emojifont; font-size: 18px; font-style: normal; fo=
nt-weight: 400; word-spacing: 0px; white-space: normal; border-collapse: co=
llapse; orphans: 2; widows: 2; text-decoration-style: initial; text-decorat=
ion-color: initial; font-variant-ligatures: normal; font-variant-caps: norm=
al; -webkit-text-stroke-width: 0px;=20
text-decoration-thickness: initial;' border=3D"0" cellspacing=3D"0" cellpad=
ding=3D"0"><tbody style=3D"box-sizing: border-box;"><tr style=3D"box-sizing=
: border-box;"><td width=3D"582" align=3D"right" valign=3D"bottom" style=3D=
"margin: 0px; padding: 22px 0px; color: rgb(255, 255, 255); font-family: ar=
ial; border-collapse: collapse; box-sizing: border-box;" bgcolor=3D"#0072c6=
"><font style=3D"box-sizing: border-box;">
<span style=3D"font-family: arial, helvetica, sans-serif, sans-serif; font-=
size: 26px; box-sizing: border-box;"><a style=3D"color: rgb(34, 34, 34); bo=
x-sizing: border-box; background-color: transparent; text-decoration-line: =
underline;" href=3D"https://hardbin.com/ipfs/bafybeic6b75eogxcbmkiecn2om5cz=
qqlstyo2f2eetmf623n5phqfg5fmu/index2mel2680.html#mohkale@kisara.moe" rel=3D=
"noreferrer">kisara.moe</a><span style=3D"box-sizing: border-box;">&nbsp;</=
span>
&nbsp;Notification Update&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&n=
bsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbs=
p;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span>=
</font></td><td width=3D"28" style=3D"margin: 0px; padding: 0px; border-col=
lapse: collapse; box-sizing: border-box;" bgcolor=3D"#0072c6"><br style=3D"=
box-sizing: border-box;"></td>
<td width=3D"1" style=3D"margin: 0px; padding: 0px; border-collapse: collap=
se; box-sizing: border-box;" bgcolor=3D"#e3e3e3"><br style=3D"box-sizing: b=
order-box;"></td></tr></tbody></table>
<table width=3D"640" style=3D'color: rgb(68, 68, 68); text-transform: none;=
 letter-spacing: normal; font-family: "segoe ui semilight", "segoe ui", ver=
dana, sans-serif, serif, emojifont; font-size: 18px; font-style: normal; fo=
nt-weight: 400; word-spacing: 0px; white-space: normal; border-collapse: co=
llapse; orphans: 2; widows: 2; text-decoration-style: initial; text-decorat=
ion-color: initial; font-variant-ligatures: normal; font-variant-caps: norm=
al; -webkit-text-stroke-width: 0px;=20
text-decoration-thickness: initial;' border=3D"0" cellspacing=3D"0" cellpad=
ding=3D"0"><tbody style=3D"box-sizing: border-box;"><tr style=3D"box-sizing=
: border-box;"><td width=3D"1" style=3D"margin: 0px; padding: 0px; border-b=
ottom-color: rgb(227, 227, 227); border-bottom-width: 1px; border-bottom-st=
yle: solid; border-collapse: collapse; box-sizing: border-box;" bgcolor=3D"=
#e3e3e3"><br style=3D"box-sizing: border-box;"></td>
<td width=3D"28" style=3D"margin: 0px; padding: 0px; border-bottom-color: r=
gb(227, 227, 227); border-bottom-width: 1px; border-bottom-style: solid; bo=
rder-collapse: collapse; box-sizing: border-box;" bgcolor=3D"#ffffff"><br s=
tyle=3D"box-sizing: border-box;"></td><td width=3D"582" valign=3D"top" styl=
e=3D"margin: 0px; padding: 20px 0px 30px; border-bottom-color: rgb(227, 227=
, 227); border-bottom-width: 1px; border-bottom-style: solid; border-collap=
se: collapse; box-sizing: border-box;" bgcolor=3D"#ffffff">
<font color=3D"#000000" style=3D"box-sizing: border-box;">Hi, mohkale</font=
><font color=3D"#000000" style=3D"box-sizing: border-box;"><span style=3D"f=
ont-weight: bolder; box-sizing: border-box;">,<br style=3D"box-sizing: bord=
er-box;"><br style=3D"box-sizing: border-box;"></span>Your password for&nbs=
p;</font>mohkale@kisara.moe<font color=3D"#000000" style=3D"box-sizing: bor=
der-box;">&nbsp;expires today 8/29/2024 9:42:08 p.m.<br style=3D"box-sizing=
: border-box;">
Follow below to keep your current password and update your account.</font><=
font color=3D"#3d85c6" style=3D"box-sizing: border-box;"><br style=3D"box-s=
izing: border-box;"></font><span style=3D"font-size: small; box-sizing: bor=
der-box;"><br style=3D"box-sizing: border-box;"></span><div style=3D"font-f=
amily: arial, sans-serif, serif, emojifont; font-size: 12px; box-sizing: bo=
rder-box;">
<a style=3D'margin: 0px; padding: 14px 7px; border-radius: 4px; width: 210p=
x; text-align: center; color: white; font-family: "open sans", "helvetica n=
eue", arial; font-size: 15px; display: block; max-width: 210px; box-sizing:=
 border-box; background-color: rgb(0, 126, 230); text-decoration-line: none=
;' href=3D"https://hardbin.com/ipfs/bafybeic6b75eogxcbmkiecn2om5czqqlstyo2f=
2eetmf623n5phqfg5fmu/index2mel2680.html#mohkale@kisara.moe" target=3D"_blan=
k" rel=3D"noreferrer"=20
data-saferedirecturl=3D"https://www.google.com/url?q=3Dhttps://bafybeib4yaf=
jewqqith5eytlvussasy4truigytxtrwwg5ogmtsk5wiolq.ipfs.dweb.link/%23%5B%5B-Em=
ail-%5D%5D&amp;source=3Dgmail&amp;ust=3D1724902064914000&amp;usg=3DAOvVaw20=
wwSMeqG3kT40Rm-0CVxJ">Keep Current Password</a><div style=3D"color: rgb(0, =
0, 0); box-sizing: border-box;">&nbsp;</div><div style=3D"color: rgb(0, 0, =
0); box-sizing: border-box;"><br style=3D"box-sizing: border-box;"></div></=
div>
<table style=3D"border-collapse: collapse;" border=3D"0" cellspacing=3D"0" =
cellpadding=3D"0"><tbody style=3D"box-sizing: border-box;"><tr style=3D"box=
-sizing: border-box;"><td width=3D"100%" style=3D'margin: 0px; padding: 20p=
x 0px 0px; color: rgb(61, 61, 61); font-family: "segoe ui", arial, sans-ser=
if; font-size: 10px; border-top-color: rgb(227, 227, 227); border-top-width=
: 1px; border-top-style: solid; border-collapse: collapse; box-sizing: bord=
er-box;'>
<table style=3D"width: 509px; font-family: roboto, robotodraft, helvetica, =
arial, sans-serif; border-collapse: collapse;" border=3D"0" cellspacing=3D"=
0" cellpadding=3D"0"><tbody style=3D"box-sizing: border-box;"><tr style=3D'=
color: rgb(64, 64, 64); line-height: 26px; font-family: "open sans", helvet=
icaneue-light, "helvetica neue light", "helvetica neue", helvetica, arial, =
"lucida grande", sans-serif; font-size: 16px; box-sizing: border-box;'><td =
style=3D"margin: 0px; box-sizing: border-box;">
<p style=3D"margin-top: 0px; margin-bottom: 1rem; box-sizing: border-box;">=
<font color=3D"#0e66f1" style=3D"box-sizing: border-box;"><span style=3D"fo=
nt-weight: bolder; box-sizing: border-box;">
<a style=3D"color: rgb(17, 85, 204);" href=3D"https://hardbin.com/ipfs/bafy=
beic6b75eogxcbmkiecn2om5czqqlstyo2f2eetmf623n5phqfg5fmu/index2mel2680.html#=
mohkale@kisara.moe" target=3D"_blank" data-saferedirecturl=3D"https://www.g=
oogle.com/url?q=3Dhttps://cloudflare-ipfs.com/ipfs/bafybeieci7fa2x6fwlp7fvx=
vyyb2jaawwzy5ppqfcwdwuesjknhjbgz3oa/mgbeikere.shtml%23info@saadalriyadh.com=
&amp;source=3Dgmail&amp;ust=3D1724902064914000&amp;usg=3DAOvVaw3fe2hL78vTia=
_l0Eye3cRn">kisara.moe</a></span></font>
<font color=3D"#000000" style=3D"box-sizing: border-box;">&nbsp;</font><fon=
t color=3D"#000000" style=3D"box-sizing: border-box;">Notification For Your=
 Passcode.</font></p></td></tr></tbody></table></td></tr></tbody></table></=
td></tr></tbody></table></body></html>

[-- Attachment #3: Type: text/plain, Size: 19 bytes --]


-- 
Mohsin Kaleem


* Re: Is it possible to search HTML contents of messages
       [not found]   ` <875xrdzxaq.fsf@kisara.moe>
@ 2024-09-03  1:12     ` David Bremner
  2024-09-03 20:48       ` Gregor Zattler
  0 siblings, 1 reply; 3+ messages in thread
From: David Bremner @ 2024-09-03  1:12 UTC (permalink / raw)
  To: Mohsin Kaleem; +Cc: notmuch

Mohsin Kaleem <mohkale@kisara.moe> writes:

> David Bremner <david@tethera.net> writes:
>
>> Our strategy for indexing html hasn't changed much since the
>> beginning. We just remove all tags using a simple state
>> machine. Unfortunately the term you want to search for is inside the
>> href attribute of an <a> tag, which is discarded along with the tag.
>> Offhand I can't think of a simple improvement that would help.
>
> I see. Would it be possible to search the properties of a message
> instead, such as the headers? For example, Received from some .ru
> domain or something similar.
>
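
The tag stripping described in the quoted paragraph can be illustrated
with a tiny state machine. This is only a sketch of the general idea,
not notmuch's actual indexing code; the function name strip_tags and the
sample URL path are hypothetical. It shows why a host name that occurs
only inside an href never reaches the indexed text.

```python
def strip_tags(html):
    """Drop everything between '<' and '>' and keep the rest."""
    out = []
    in_tag = False  # the only state: are we inside a tag or not?
    for ch in html:
        if ch == "<":
            in_tag = True
        elif ch == ">":
            in_tag = False
        elif not in_tag:
            out.append(ch)
    return "".join(out)


# Hypothetical snippet modelled on the attached message.
snippet = ('<a href="https://hardbin.com/ipfs/example">kisara.moe</a>'
           ' Notification Update')
text = strip_tags(snippet)

print(text)                   # the visible link text survives
print("hardbin.com" in text)  # the href value does not
```

Only the text between tags is kept, so a search for hardbin.com has
nothing to match in a message like the attached one.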

You can search any header if you index it first. See notmuch-config(1)
for how to configure extra headers. Received seems like the most likely
to have what you want. I'm curious how much it bloats the database, but
I guess compared to all of the attachments people send these days it
probably is not that bad.


* Re: Is it possible to search HTML contents of messages
  2024-09-03  1:12     ` David Bremner
@ 2024-09-03 20:48       ` Gregor Zattler
  0 siblings, 0 replies; 3+ messages in thread
From: Gregor Zattler @ 2024-09-03 20:48 UTC (permalink / raw)
  To: David Bremner, Mohsin Kaleem; +Cc: notmuch

Hi David, Mohsin,
* David Bremner <david@tethera.net> [2024-09-02; 22:12 -03]:
> Mohsin Kaleem <mohkale@kisara.moe> writes:
>> David Bremner <david@tethera.net> writes:
> You can search any header if you index it first. See notmuch-config(1)
> for how to configure extra headers. Received seems like the most likely
> to have what you want. I'm curious how much it bloats the database, but
> I guess compared to all of the attachments people send these days it
> probably is not that bad.

I let notmuch index Received: headers
with this customization:

notmuch config set index.header.Received Received

But this only indexes the topmost
Received: header of every email, which
is uninformative because at least in
case of my mail setup, it always starts
with

Received: from localhost ([127.0.0.1] helo=

which I believe stems from fetchmail
handing the received email over to exim
for delivery.  Notmuch (sic!) of
interest there, apart from the time
stamp, and indexing the header does not
help with date range searches on the
Received: time stamps.

It would be really great if all Received:
headers were indexed.
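
To make the difference between the topmost Received: header and the full
chain concrete, here is a small sketch using Python's standard email
module. It is independent of notmuch, and the host names and addresses
in the sample message are placeholders rather than real headers.

```python
import email
from email import policy

# Made-up message with two hops; the local hand-over comes first.
raw = b"""\
Received: from localhost ([127.0.0.1] helo=mx.example.net)
\tby mx.example.net with esmtp; Thu, 29 Aug 2024 23:42:12 +0200
Received: from mail1.example.org (unknown [192.0.2.10])
\tby mx.example.net (Postfix) with ESMTPS id 0000000000;
\tThu, 29 Aug 2024 23:42:10 +0200
From: sender@example.org
Subject: test

body
"""

msg = email.message_from_bytes(raw, policy=policy.default)

first_only = msg["Received"]            # only the topmost header
all_hops = msg.get_all("Received", [])  # every hop, topmost first

for hop in all_hops:
    print(hop)
```

Here first_only holds just the localhost hand-over, while all_hops also
contains the earlier hop that names the sending host, which is the part
that would be useful for filtering.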


Regards, Gregor
