* Re: master ce63f91025: Add textsec functions for verifying email addresses [not found] ` <20220118122012.7A3A4C0DA1B@vcs2.savannah.gnu.org> @ 2022-01-18 13:30 ` Po Lu 2022-01-18 18:42 ` Eli Zaretskii 2022-01-18 13:38 ` Robert Pluim 1 sibling, 1 reply; 18+ messages in thread From: Po Lu @ 2022-01-18 13:30 UTC (permalink / raw) To: emacs-devel; +Cc: Lars Ingebrigtsen Lars Ingebrigtsen <larsi@gnus.org> writes: > +(defun textsec-name-suspicious-p (name) > + "Say whether NAME looks suspicious. > +NAME is (for instance) the free-text name from an email address. > + > +If it isn't, nil is returned. If it is, a string explaining > +the problem is returned." > + (cond > + ((not (equal name (ucs-normalize-NFC-string name))) > + (format "`%s' is not in normalized format `%s'" > + name (ucs-normalize-NFC-string name))) > + ((seq-find (lambda (char) > + (and (member char bidi-control-characters) > + (not (member char > + '( ?\N{left-to-right mark} > + ?\N{right-to-left mark} > + ?\N{arabic letter mark}))))) > + name) > + (format "The string contains bidirectional control characters")) > + ((textsec-suspicious-nonspacing-p name)))) I thought the consensus from the last discussion about this subject was to use `bidi-find-overridden-directionality' for this kind of thing, to avoid false positives with legitimate use of bidirectional control characters. Thanks. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-18 13:30 ` master ce63f91025: Add textsec functions for verifying email addresses Po Lu @ 2022-01-18 18:42 ` Eli Zaretskii 2022-01-20 8:47 ` Lars Ingebrigtsen 2022-01-20 8:49 ` Lars Ingebrigtsen 0 siblings, 2 replies; 18+ messages in thread From: Eli Zaretskii @ 2022-01-18 18:42 UTC (permalink / raw) To: Po Lu; +Cc: larsi, emacs-devel > From: Po Lu <luangruo@yahoo.com> > Cc: Lars Ingebrigtsen <larsi@gnus.org> > Date: Tue, 18 Jan 2022 21:30:39 +0800 > > Lars Ingebrigtsen <larsi@gnus.org> writes: > > > +(defun textsec-name-suspicious-p (name) > > + "Say whether NAME looks suspicious. > > +NAME is (for instance) the free-text name from an email address. > > + > > +If it isn't, nil is returned. If it is, a string explaining > > +the problem is returned." > > + (cond > > + ((not (equal name (ucs-normalize-NFC-string name))) > > + (format "`%s' is not in normalized format `%s'" > > + name (ucs-normalize-NFC-string name))) > > + ((seq-find (lambda (char) > > + (and (member char bidi-control-characters) > > + (not (member char > > + '( ?\N{left-to-right mark} > > + ?\N{right-to-left mark} > > + ?\N{arabic letter mark}))))) > > + name) > > + (format "The string contains bidirectional control characters")) > > + ((textsec-suspicious-nonspacing-p name)))) > > I thought the consensus from the last discussion about this subject was > to use `bidi-find-overridden-directionality' for this kind of thing, to > avoid false positives with legitimate use of bidirectional control > characters. Yes, using the Unicode security guidelines would produce unnecessary false positives. Which could be OK for paranoid minds, I guess, who are afraid of any bidi controls, even if they don't actually affect the display order. Like in this example: "אבגד שונה מרגיל" I do hope we will eventually offer separate functions to do that with fewer false positives (or a way of customizing these textsec functions to do that). ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-18 18:42 ` Eli Zaretskii @ 2022-01-20 8:47 ` Lars Ingebrigtsen 2022-01-20 9:40 ` Eli Zaretskii 2022-01-20 8:49 ` Lars Ingebrigtsen 1 sibling, 1 reply; 18+ messages in thread From: Lars Ingebrigtsen @ 2022-01-20 8:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Po Lu, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Yes, using the Unicode security guidelines would produce unnecessary > false positives. Which could be OK for paranoid minds, I guess, who > are afraid of any bidi controls, even if they don't actually affect > the display order. Like in this example: > > "אבגד שונה מרגיל" > > I do hope we will eventually offer separate functions to do that with > fewer false positives (or a way of customizing these textsec functions > to do that). It seems safe to allow names that pass bidi-find-overridden-directionality ... but is there an off-by-one error there? It says that this is OK: "Lars Ingebrigtsen\N{LEFT-TO-RIGHT OVERRIDE}" But this isn't: "Lars Ingebrigtsen\N{LEFT-TO-RIGHT OVERRIDE}f" And both are equally suspicious. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-20 8:47 ` Lars Ingebrigtsen @ 2022-01-20 9:40 ` Eli Zaretskii 2022-01-20 9:49 ` Lars Ingebrigtsen 0 siblings, 1 reply; 18+ messages in thread From: Eli Zaretskii @ 2022-01-20 9:40 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: luangruo, emacs-devel > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Po Lu <luangruo@yahoo.com>, emacs-devel@gnu.org > Date: Thu, 20 Jan 2022 09:47:42 +0100 > > It seems safe to allow names that pass > bidi-find-overridden-directionality ... but is there an off-by-one > error there? > > It says that this is OK: > > "Lars Ingebrigtsen\N{LEFT-TO-RIGHT OVERRIDE}" > > But this isn't: > > "Lars Ingebrigtsen\N{LEFT-TO-RIGHT OVERRIDE}f" > > And both are equally suspicious. Why do you think the former one is suspicious? The override there doesn't affect any character, because there's nothing after it. bidi-find-overridden-directionality works by looking at characters which have their directionality affected by the bidi controls. It doesn't look at the controls themselves, because those controls by themselves aren't doing any harm. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-20 9:40 ` Eli Zaretskii @ 2022-01-20 9:49 ` Lars Ingebrigtsen 2022-01-20 10:20 ` Eli Zaretskii 0 siblings, 1 reply; 18+ messages in thread From: Lars Ingebrigtsen @ 2022-01-20 9:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Why do you think the former one is suspicious? The override there > doesn't affect any character, because there's nothing after it. Because strings aren't used in isolation. In this case we're checking the name part of an email address header, and: (insert "Lars Ingebrigtsen\N{RIGHT-TO-LEFT OVERRIDE}" "larsi@gnus.org") Boom. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-20 9:49 ` Lars Ingebrigtsen @ 2022-01-20 10:20 ` Eli Zaretskii 2022-01-20 11:08 ` Lars Ingebrigtsen 0 siblings, 1 reply; 18+ messages in thread From: Eli Zaretskii @ 2022-01-20 10:20 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: luangruo, emacs-devel > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: luangruo@yahoo.com, emacs-devel@gnu.org > Date: Thu, 20 Jan 2022 10:49:00 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Why do you think the former one is suspicious? The override there > > doesn't affect any character, because there's nothing after it. > > Because strings aren't used in isolation. In this case we're checking > the name part of an email address header, and: > > (insert "Lars Ingebrigtsen\N{RIGHT-TO-LEFT OVERRIDE}" "larsi@gnus.org") > > Boom. Then you must pass the entire concatenated string to the function. Or call it on buffer text after inserting the string there. This function must see the characters affected by the bidi controls, to tell whether the controls do any harm. ^ permalink raw reply [flat|nested] 18+ messages in thread
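[Editorial sketch, not part of the thread.] The difference between checking the name alone and checking the concatenated header can be tried out roughly as follows. The helper name `my-compare-bidi-checks' is hypothetical; the only real API used is `bidi-find-overridden-directionality', called with the same (FROM TO STRING) arguments that appear elsewhere in this thread.

```elisp
;; Sketch only: `my-compare-bidi-checks' is a hypothetical helper name.
(defun my-compare-bidi-checks (name address)
  "Check NAME alone, then NAME concatenated with ADDRESS.
Return a list (ALONE COMBINED).  Each element is whatever
`bidi-find-overridden-directionality' returns for that string:
nil, or the position of the first character whose directionality
is overridden."
  (let ((header (concat name address)))
    (list (bidi-find-overridden-directionality 0 (length name) name)
          (bidi-find-overridden-directionality 0 (length header) header))))

;; The example from the thread: a trailing override followed by an address.
(my-compare-bidi-checks "Lars Ingebrigtsen\N{RIGHT-TO-LEFT OVERRIDE}"
                        "larsi@gnus.org")
```

Going by the explanation above, the first element should only become non-nil when some character inside NAME itself is affected, while the second also covers the address text that follows the control.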
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-20 10:20 ` Eli Zaretskii @ 2022-01-20 11:08 ` Lars Ingebrigtsen 2022-01-20 11:29 ` Eli Zaretskii 0 siblings, 1 reply; 18+ messages in thread From: Lars Ingebrigtsen @ 2022-01-20 11:08 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Because strings aren't used in isolation. In this case we're checking >> the name part of an email address header, and: >> >> (insert "Lars Ingebrigtsen\N{RIGHT-TO-LEFT OVERRIDE}" "larsi@gnus.org") >> >> Boom. > > Then you must pass the entire concatenated string to the function. Or > call it on buffer text after inserting the string there. This > function must see the characters affected by the bidi controls, to > tell whether the controls do any harm. I see. Perhaps we should have another function in addition -- one that says "does this string (if inserted into a buffer) possibly affect other text"? I.e., "does it have dangling directional modifiers"? I think that's really what we want here (and why the Unicode recommendations are like they are in this area). If we had such a function, then textsec-name-suspicious-p could use both the current bidi-find-overridden-directionality and the new predicate as a filter. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-20 11:08 ` Lars Ingebrigtsen @ 2022-01-20 11:29 ` Eli Zaretskii 2022-01-20 12:46 ` Lars Ingebrigtsen 0 siblings, 1 reply; 18+ messages in thread From: Eli Zaretskii @ 2022-01-20 11:29 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: luangruo, emacs-devel > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: luangruo@yahoo.com, emacs-devel@gnu.org > Date: Thu, 20 Jan 2022 12:08:35 +0100 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> (insert "Lars Ingebrigtsen\N{RIGHT-TO-LEFT OVERRIDE}" "larsi@gnus.org") > >> > >> Boom. > > > > Then you must pass the entire concatenated string to the function. Or > > call it on buffer text after inserting the string there. This > > function must see the characters affected by the bidi controls, to > > tell whether the controls do any harm. > > I see. Perhaps we should have another function in addition -- one that > says "does this string (if inserted into a buffer) possibly affect other > text"? I.e., "does it have dangling directional modifiers"? I think > that's really what we want here (and why the Unicode recommendations are > like they are in this area). The problem is that the answer to that question depends on the following text. E.g., if RIGHT-TO-LEFT OVERRIDE is followed by R2L characters, they will not be affected. We could try appending some representative text to the string being tested, of course. For example, append a fixed string like this: a1א:! and see if the function returns a non-nil position that points to one of those characters; if so, consider the original string "unsafe". Would that be good enough for textsec purposes? ^ permalink raw reply [flat|nested] 18+ messages in thread
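[Editorial sketch, not part of the thread.] The probe idea proposed above could look roughly like this. It is a minimal sketch under stated assumptions, not the code that was eventually committed: the function name `my-bidi-probe-string' and the returned symbols are hypothetical, and only `bidi-find-overridden-directionality' with the (FROM TO STRING) call shape used in this thread is assumed to exist.

```elisp
;; Sketch only: the function name and the returned symbols are
;; hypothetical, chosen for illustration.
(defun my-bidi-probe-string (string)
  "Append a fixed probe to STRING and look for overridden characters.
Return nil if no character's directionality is overridden,
`within-string' if the first overridden character lies inside
STRING itself, and `affects-following-text' if it lies in the
appended probe."
  (let* ((probe "a1א:!")                 ; the probe text suggested above
         (combined (concat string probe))
         (pos (bidi-find-overridden-directionality
               0 (length combined) combined)))
    (cond
     ((null pos) nil)
     ;; Only the first overridden position is reported, so a hit inside
     ;; STRING says nothing either way about the probe characters.
     ((< pos (length string)) 'within-string)
     (t 'affects-following-text))))
```

A caller would presumably treat both non-nil results as suspicious. They are kept apart here only to show which part of the combined text was hit first.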
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-20 11:29 ` Eli Zaretskii @ 2022-01-20 12:46 ` Lars Ingebrigtsen 2022-01-22 10:01 ` Eli Zaretskii 0 siblings, 1 reply; 18+ messages in thread From: Lars Ingebrigtsen @ 2022-01-20 12:46 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > The problem is that the answer to that question depends on the > following text. E.g., if RIGHT-TO-LEFT OVERRIDE is followed by R2L > characters, they will not be affected. Yes. But it's certainly suspicious to have such dangling control characters in a string, which is what we're wondering about. > We could try appending some representative text to the string being > tested, of course. For example, append a fixed string like this: > > a1א:! > > and see if the function returns a non-nil position that points to one of > those characters; if so, consider the original string "unsafe". > > Would that be good enough for textsec purposes? Sounds good to me. It should probably be baked into its own utility function, so that other people who wonder about strings they have don't have to know anything about these things. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-20 12:46 ` Lars Ingebrigtsen @ 2022-01-22 10:01 ` Eli Zaretskii 2022-01-22 11:25 ` Lars Ingebrigtsen 0 siblings, 1 reply; 18+ messages in thread From: Eli Zaretskii @ 2022-01-22 10:01 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: luangruo, emacs-devel > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: luangruo@yahoo.com, emacs-devel@gnu.org > Date: Thu, 20 Jan 2022 13:46:24 +0100 > > > We could try appending some representative text to the string being > > tested, of course. For example, append a fixed string like this: > > > > a1א:! > > > > and see if the function returns a non-nil position that points to one of > > those characters; if so, consider the original string "unsafe". > > > > Would that be good enough for textsec purposes? > > Sounds good to me. It should probably be baked into its own utility > function, so that other people who wonder about strings they have > don't have to know anything about these things. Now done, see the new function textsec-bidi-controls-suspicious-p. Feel free to tweak as needed, if the API is not convenient. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-22 10:01 ` Eli Zaretskii @ 2022-01-22 11:25 ` Lars Ingebrigtsen 0 siblings, 0 replies; 18+ messages in thread From: Lars Ingebrigtsen @ 2022-01-22 11:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Now done, see the new function textsec-bidi-controls-suspicious-p. > Feel free to tweak as needed, if the API is not convenient. Thanks; looks good to me. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-18 18:42 ` Eli Zaretskii 2022-01-20 8:47 ` Lars Ingebrigtsen @ 2022-01-20 8:49 ` Lars Ingebrigtsen 2022-01-20 10:04 ` Eli Zaretskii 1 sibling, 1 reply; 18+ messages in thread From: Lars Ingebrigtsen @ 2022-01-20 8:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Po Lu, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > the display order. Like in this example: > > "אבגד שונה מרגיל" > > I do hope we will eventually offer separate functions to do that with And (let ((string "אבגד שונה מרגיל")) (bidi-find-overridden-directionality 0 (length string) string)) => nil as expected when I do it interactively, but if I run it from a batch Emacs, it returns non-nil? -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-20 8:49 ` Lars Ingebrigtsen @ 2022-01-20 10:04 ` Eli Zaretskii 2022-01-20 10:07 ` Lars Ingebrigtsen 0 siblings, 1 reply; 18+ messages in thread From: Eli Zaretskii @ 2022-01-20 10:04 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: luangruo, emacs-devel > From: Lars Ingebrigtsen <larsi@gnus.org> > Cc: Po Lu <luangruo@yahoo.com>, emacs-devel@gnu.org > Date: Thu, 20 Jan 2022 09:49:50 +0100 > > (let ((string "אבגד שונה מרגיל")) > (bidi-find-overridden-directionality 0 (length string) string)) > => nil > > as expected when I do it interactively, but if I run it from a batch > Emacs, it returns non-nil? Oops! Should be fixed now. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-20 10:04 ` Eli Zaretskii @ 2022-01-20 10:07 ` Lars Ingebrigtsen 0 siblings, 0 replies; 18+ messages in thread From: Lars Ingebrigtsen @ 2022-01-20 10:07 UTC (permalink / raw) To: Eli Zaretskii; +Cc: luangruo, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Oops! Should be fixed now. Thanks; I can confirm that it now works. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses [not found] ` <20220118122012.7A3A4C0DA1B@vcs2.savannah.gnu.org> 2022-01-18 13:30 ` master ce63f91025: Add textsec functions for verifying email addresses Po Lu @ 2022-01-18 13:38 ` Robert Pluim 2022-01-19 14:37 ` Lars Ingebrigtsen 1 sibling, 1 reply; 18+ messages in thread From: Robert Pluim @ 2022-01-18 13:38 UTC (permalink / raw) To: emacs-devel; +Cc: Lars Ingebrigtsen >>>>> On Tue, 18 Jan 2022 07:20:12 -0500 (EST), Lars Ingebrigtsen <larsi@gnus.org> said: Lars> +(defun textsec-email-suspicious-p (email) Lars> + "Say whether EMAIL looks suspicious. Lars> +If it isn't, nil is returned. If it is, a string explaining the Lars> +problem is returned." Lars> + (pcase-let* ((`(,address . ,name) (mail-header-parse-address email t)) Lars> + (`(,local ,domain) (split-string address "@"))) Lars> + (or Lars> + (textsec-domain-suspicious-p domain) Lars> + (textsec-local-address-suspicious-p local) Lars> + (textsec-name-suspicious-p name)))) Lars> + Lars> (provide 'textsec) Does it really matter if the display name of an email address has possibly confusable characters in it? Itʼs not actually used for anything except to be displayed to the user. (as an aside: (mail-header-parse-address "Robert Pluim rpluim@gmail.com") => ("RobertPluimrpluim@gmail.com") I know thatʼs not a valid format, but the behaviour is somewhat surprising) Robert -- ^ permalink raw reply [flat|nested] 18+ messages in thread
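[Editorial sketch, not part of the thread.] The destructuring step in the quoted patch can be looked at in isolation. The sketch below reuses the same `mail-header-parse-address' call and `pcase-let*' patterns as the patch, but only returns the three parts instead of checking them. The function name `my-split-email-header' is hypothetical, and the `mail-parse' library is assumed to be what provides `mail-header-parse-address'.

```elisp
(require 'mail-parse)   ; assumed to provide `mail-header-parse-address'

;; Sketch only: `my-split-email-header' is a hypothetical helper name.
(defun my-split-email-header (email)
  "Split EMAIL into its display name, local part and domain.
Return a plist (:name NAME :local LOCAL :domain DOMAIN).
This assumes EMAIL is a well-formed header value such as
\"Name <local@domain>\"; other input is not handled here."
  (pcase-let* ((`(,address . ,name) (mail-header-parse-address email t))
               (`(,local ,domain) (split-string address "@")))
    (list :name name :local local :domain domain)))

(my-split-email-header "Lars Ingebrigtsen <larsi@gnus.org>")
```

Each of the three parts is then what the patch hands to the corresponding textsec predicate, with the display name going to textsec-name-suspicious-p.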
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-18 13:38 ` Robert Pluim @ 2022-01-19 14:37 ` Lars Ingebrigtsen 2022-01-19 14:49 ` Robert Pluim 0 siblings, 1 reply; 18+ messages in thread From: Lars Ingebrigtsen @ 2022-01-19 14:37 UTC (permalink / raw) To: Robert Pluim; +Cc: emacs-devel Robert Pluim <rpluim@gmail.com> writes: > Does it really matter if the display name of an email address has > possibly confusable characters in it? Itʼs not actually used for > anything except to be displayed to the user. The Unicode people think so. > (as an aside: > > (mail-header-parse-address "Robert Pluim rpluim@gmail.com") > => > ("RobertPluimrpluim@gmail.com") > > I know thatʼs not a valid format, but the behaviour is somewhat > surprising) Looks fine to me. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-19 14:37 ` Lars Ingebrigtsen @ 2022-01-19 14:49 ` Robert Pluim 2022-01-19 14:54 ` Lars Ingebrigtsen 0 siblings, 1 reply; 18+ messages in thread From: Robert Pluim @ 2022-01-19 14:49 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: emacs-devel >>>>> On Wed, 19 Jan 2022 15:37:53 +0100, Lars Ingebrigtsen <larsi@gnus.org> said: Lars> Robert Pluim <rpluim@gmail.com> writes: >> Does it really matter if the display name of an email address has >> possibly confusable characters in it? Itʼs not actually used for >> anything except to be displayed to the user. Lars> The Unicode people think so. OK >> (as an aside: >> >> (mail-header-parse-address "Robert Pluim rpluim@gmail.com") >> => >> ("RobertPluimrpluim@gmail.com") >> >> I know thatʼs not a valid format, but the behaviour is somewhat >> surprising) Lars> Looks fine to me. It does? I was expecting it to error out or drop everything prior to the last space. Robert -- ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: master ce63f91025: Add textsec functions for verifying email addresses 2022-01-19 14:49 ` Robert Pluim @ 2022-01-19 14:54 ` Lars Ingebrigtsen 0 siblings, 0 replies; 18+ messages in thread From: Lars Ingebrigtsen @ 2022-01-19 14:54 UTC (permalink / raw) To: Robert Pluim; +Cc: emacs-devel Robert Pluim <rpluim@gmail.com> writes: > It does? I was expecting it to error out or drop everything prior to > the last space. It's a function for parsing RFC2047 (well, DRUMS, actually) strings, so if you feed it something else, it's undefined what you get out. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 18+ messages in thread