Richard Stallman writes:

This is indeed worrisome and has been around for a while. There is an
even more insidious form of this hack, in which Unicode characters that
"appear like English letters" are substituted for the real ones, so that
a quick visual scan will miss them. Spammers often use the trick in
domain names within URLs; for example, there are Cyrillic letters that
"look like" Roman letters.

> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
> There is a thread now about confusables.
>
> I read this,
>
> Unicode allows user tracking by means of invisible text marking. Any
> string can be converted into its binary form and then recoded into a
> string of zero-width characters, which can then be invisibly inserted
> into the text. If the text is posted elsewhere, the zero-width
> character string can be extracted and the process reversed to figure
> out the identity of the person who copied it.
>
> which seems to be about a special case of confusables, and it makes me
> wonder whether Emacs does, or could, show users when Unicode confusion
> occurs, or prevent or fix it somehow.
>
> First, is that issue of invisible characters real?
>
> Second, does Emacs do anything now such that these tricks
> won't succeed?
>
> If the problem exists in Emacs now, could we prevent it? I see a few
> ways to try. I don't know whether they would work well.
>
> * Indicate the different encodings on the screen somehow.
>
> * Canonicalize such sequences (perhaps when reading text into Emacs),
>   so that different encodings of the same text become identical.
>
> * Use a stand-alone canonicalizer program.

-- 
Thanks,

--Raman(I Search, I Find, I Misplace, I Research)
♈ Id: kg:/m/0285kf1 🦮
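
PS: To make the quoted description of invisible marking concrete, here
is a minimal Python sketch of how such a scheme could work. It is only
an illustration: the choice of U+200B and U+200C as the two "bits" and
all of the function names are my own assumptions, not taken from any
real tracking tool or from Emacs.

```python
# Illustrative sketch only. The marker characters and every function name
# here are assumptions made for this example.
ZERO = "\u200b"  # ZERO WIDTH SPACE stands for bit 0
ONE = "\u200c"   # ZERO WIDTH NON-JOINER stands for bit 1


def encode_mark(tag: str) -> str:
    """Turn a tag into zero-width characters, 8 bits per UTF-8 byte."""
    bits = "".join(f"{byte:08b}" for byte in tag.encode("utf-8"))
    return "".join(ONE if bit == "1" else ZERO for bit in bits)


def embed(text: str, tag: str) -> str:
    """Hide the encoded tag right after the first character of text."""
    return text[:1] + encode_mark(tag) + text[1:]


def extract(text: str) -> str:
    """Recover a hidden tag by reading the marker characters back as bits."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")


def strip_marks(text: str) -> str:
    """Drop the two marker characters, which defeats this particular scheme."""
    return "".join(ch for ch in text if ch not in (ZERO, ONE))


if __name__ == "__main__":
    original = "Meeting moved to Friday."
    marked = embed(original, "user-42")
    print(len(original), len(marked))      # lengths differ, display does not
    print(extract(marked))                 # the hidden tag comes back
    print(strip_marks(marked) == original)
```

Note that strip_marks only removes the two code points this sketch
uses. A real canonicalizer would have to cover the other invisible and
format characters as well, and would have to take care not to break
text where those characters are legitimate.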