From: "Garreau\, Alexandre"
Newsgroups: gmane.emacs.devel
Subject: Re: Change of Lisp syntax for "fancy" quotes in Emacs 27?
Date: Sat, 06 Oct 2018 14:10:17 +0200
Message-ID: <1q9xavzzzzzz.vci.xxuns.g6.gal@portable.galex-713.eu>
In-Reply-To: <83h8hz1eg1.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 06 Oct 2018 14:50:54 +0300")
To: Eli Zaretskii
Cc: npostavs@users.sourceforge.net, eggert@cs.ucla.edu, drew.adams@oracle.com, emacs-devel@gnu.org

On 06/10/2018 at 14:50, Eli Zaretskii wrote:
>> From: "Garreau\, Alexandre"
>> Cc: Eli Zaretskii, emacs-devel@gnu.org,
>> drew.adams@oracle.com, npostavs@users.sourceforge.net
>> Date: Sat, 06 Oct 2018 13:22:14 +0200
>>
>> In a world where unicode is increasingly present and confusion about its
>> characters increasingly problematic (typosquatting, etc.) wouldn’t it be
>> reasonable to expect unicode-related semantic functions to be provided
>> in most frameworks, systems and languages to allow better handling of
>> such problems, thus making that problem the interface’s one?
>
> I don't think I understand what this means in practice; please
> elaborate.

AFAIK, indistinguishable Unicode characters also cause problems in
content other than source code: the Latin ?o and the Cyrillic ?о (the
first example of Unicode-powered typosquatting I ever heard of), the
various spaces (sometimes not distinguishable in a monospace font), or,
to stay with monospacing problems, the dashes: I have great trouble
writing correct French text, as I must always check outside Emacs which
of ?– and ?— is the medium dash and which is the long one (I normally
remember by their position on my keyboard, but as they are next to each
other I often forget). Not to mention the various bidirectionality hacks
you highlighted earlier. I have also heard of emails that confuse
semantic Bayesian spam filters by including non-spammy words that,
thanks to some Unicode tricks, are never displayed to the user.

These problems aren’t local to source code, nor to Emacs (many people
use something other than Emacs to read mail, websites and news, and to
read domain names). AFAIK there are canonicalization functions and
functions for semantic Unicode categories that help to know what is
punctuation, what is combining, what is displayed and how much space it
takes, and maybe, but I’m unsure, which characters are difficult or
even impossible to distinguish. There may also be canonicalization
functions that make two strings compare equal when they are encoded
differently because of combining characters, such as "é" and "é" (the
latter made of ?e followed by ?́; it’s fun to see how strangely this
last one is displayed, and how well Emacs evaluates it), or when they
use different but look-alike characters.
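
As a rough illustration of the canonicalization and category functions
mentioned above, here is a minimal Python sketch using the standard
`unicodedata` module. The helper name `canonically_equal` is made up
for this example, and Python is only one of many environments offering
such functions; Emacs has its own equivalents.

```python
import unicodedata

composed = "\u00e9"     # "é" as a single precomposed code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

def canonically_equal(a, b):
    """Compare two strings after canonical composition (NFC).

    Hypothetical helper for this example, not a standard function.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(composed == decomposed)                   # False: different code points
print(canonically_equal(composed, decomposed))  # True: same canonical form

# General categories tell punctuation from combining marks.
print(unicodedata.category("\u0301"))  # Mn (nonspacing mark)
print(unicodedata.category("\u2013"))  # Pd (dash punctuation)

# Character names settle which dash is which.
print(unicodedata.name("\u2013"))  # EN DASH
print(unicodedata.name("\u2014"))  # EM DASH
```

Note that canonical normalization only unifies different encodings of
the same character; it does not make look-alike characters from
different scripts compare equal.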
I guess this issue is even going to be less a problem in free softwares where theorically the writers should be well-intentioned and shouldn=E2=80= =99t try to trick the readers on what the software do (and/or it should at least be reviewed with capable tools and/or knowledge), compared to cases where this is going to be abusable and profitable, such as typosquating ("google.com" and "g=D0=BE=D0=BEgle.com" are not the same (it= =E2=80=99s interesting to notice too how emacs forward/backward-word detects and use the language-switching to stop at the "=D0=BE=D0=BE", I=E2=80=99m astou= nished by these capabilities I have to thank you guy for a such great piece of software!) but google could aford (and took care) to buy both while not everyone could do as well (and nobody yet reserved "amaz=D0=BEn.com"), and people might crack, steal or blackmail using something like that).
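
A crude way to flag names like the ones above is to check whether a
label mixes letters from several scripts. The sketch below is only a
heuristic under stated assumptions: it guesses the script from the
first word of each character's Unicode name, and the function names
`scripts_of` and `looks_mixed_script` are invented for this example.
It would miss a label written entirely in look-alike Cyrillic letters;
the Unicode Consortium's security mechanisms report (UTS #39) and its
confusables data are meant for more serious checks.

```python
import unicodedata

def scripts_of(text):
    """Guess the scripts used in text from Unicode character names.

    Heuristic: the first word of a letter's name, for example "LATIN"
    or "CYRILLIC". Non-letters such as "." are ignored.
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def looks_mixed_script(label):
    """Return True when the letters of label come from several scripts."""
    return len(scripts_of(label)) > 1

plain = "google.com"
spoofed = "g\u043e\u043egle.com"  # the two "o" are CYRILLIC SMALL LETTER O

print(sorted(scripts_of(plain)))    # ['LATIN']
print(sorted(scripts_of(spoofed)))  # ['CYRILLIC', 'LATIN']
print(looks_mixed_script(plain))    # False
print(looks_mixed_script(spoofed))  # True
```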