From: Drew Adams
Newsgroups: gmane.emacs.devel
Subject: RE: ASCII-only startup message?
Date: Sun, 27 Dec 2015 14:47:04 -0800 (PST)
To: Per Starbäck, emacs-devel@gnu.org

> > Or consider character HYPHEN-MINUS (U+002D), character HYPHEN
> > (U+2010), and character MINUS SIGN (U+2212).
> >
> > You might say that the first of these is analogous to the ASCII
> > apostrophe (U+0027) - it is essentially for compatibility.
>
> Yes, that is true, but not for compatibility between "apostrophe" and
> "right single quotation mark" as that imagined argument continues in
> your post, but for compatibility between "left single quotation mark"
> and "right single quotation mark" as well as less common characters
> like "prime".

Huh? The Unicode _name_ of character U+0027 is... "APOSTROPHE".
And the Unicode "old name" of it is "APOSTROPHE-QUOTE".

Claiming that Unicode intends this character only for compatibility
between "left single quotation mark", "right single quotation mark",
and less common characters like "prime", and NOT for compatibility
between "apostrophe" and "right single quotation mark" is, well,
imaginative. Where do you get that notion?

---

And then there is this, which echoes the point I made that an
apostrophe _is not_ a closing quotation mark.

https://tedclancy.wordpress.com/2015/06/03/which-unicode-character-should-represent-the-english-apostrophe-and-why-the-unicode-committee-is-very-wrong/

(cited here, BTW:
http://ilovetypography.com/2015/08/07/this-month-in-typography-6/)

  Using U+2019 is inconsistent with the rest of the standard
  ----------------------------------------------------------

  Earlier in section 6.2, the standard explains the difference
  between punctuation marks and modifier letters:

    Punctuation marks generally break words; modifier letters
    generally are considered part of a word.

  Consider any English word with an apostrophe, e.g. “don’t”. The
  word “don’t” is a single word. It is not the word “don” juxtaposed
  against the word “t”. The apostrophe is part of the word, which,
  in Unicode-speak, means it’s a modifier letter, not a punctuation
  mark, regardless of what colloquial English calls it.

  According to the Unicode character database, U+2019 is a
  punctuation mark (General Category = Pf), while U+02BC is a
  modifier letter (General Category = Lm).
  Since English apostrophes are part of the words they’re in, they
  are modifier letters, and hence should be represented by U+02BC,
  not U+2019.

And this, which makes a somewhat different argument:

https://www.mail-archive.com/unicode@unicode.org/msg35871.html

It refers to the previous argument thus:

  Were there no modifier letters at all, Unicode had have to
  introduce an apostrophe character, because an apostrophe is not
  at all the same as a quotation mark and does not work the same
  way neither. By handling text, not theories, Ted Clancy at
  Mozilla clearly shows us that ambiguating the apostrophe with a
  close-quote brings up counterproductive complications that impact
  severely the productivity of the users.

Reply: https://www.mail-archive.com/unicode@unicode.org/msg35851.html

And this URL provides a history of the move from U+02BC to U+2019:

http://charupdate.info/#ambiguation

It points out that this move was so odd that it required the
invention of the word "ambiguation" to cover the confusion. The same
article suggests that the Unicode Consortium itself "is not at ease
with the new preference".

  A search in the Mail Archives shows why the apostrophe and the
  single close quote were ambiguated -- a process that needs even a
  new word to put on it, as ordinarily everybody works for
  disambiguation. It was for simplification's sake, in word
  processing software.

Simplification for word-processing software! Aka MS Word and its
notorious misuse of _left_ single quotation mark for things like
"‘Tis the season" (it should be "’Tis"):

  The phenomenon called “the Apostrophe Catastrophe” consists in a
  huge number of instances where text processing software (word
  processor, desktop publishing) inserts an open quote instead of a
  leading apostrophe.
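
The General Category argument quoted above can be checked mechanically. What follows is a minimal sketch in Python, using only the standard `unicodedata` and `re` modules; the helper name `describe` is hypothetical, and the word-splitting part relies on Python's own definition of `\w`, which is an assumption about one tool and not a statement about how every program tokenizes text.

```python
import re
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, General Category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


# The three apostrophe candidates discussed in this thread.
for ch in ("\u0027", "\u2019", "\u02bc"):
    print(describe(ch))

# Python's \w accepts letters (including modifier letters, category Lm)
# but not punctuation (such as category Pf), so the choice of apostrophe
# decides whether "don't" is found as one word or as two pieces.
words_pf = re.findall(r"\w+", "don\u2019t")  # apostrophe written as U+2019
words_lm = re.findall(r"\w+", "don\u02bct")  # apostrophe written as U+02BC
print(words_pf, words_lm)
```

With U+2019 the match breaks at the apostrophe; with U+02BC the word stays whole, which is the behavior the quoted article argues for.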
Interestingly, a similar discussion surrounds the use of hyphen:

https://www.mail-archive.com/unicode@unicode.org/msg35852.html

  But luckily, the miscategorisation of U+2010 hasn't led to any
  pressing practical problems, unlike the misuse of U+2019 for the
  apostrophe.

This discussion, BTW, is from _2015_, 16 years after the Unicode
decision to switch from using U+02BC to using U+2019 as apostrophe.
Still problematic, it would seem. Certainly not cut-and-dried.

---

To be clear, I am NOT arguing that _Emacs_ should use U+02BC instead
of U+2019 as apostrophe. I argue that Emacs should (continue to) use
U+0027 (ASCII apostrophe) as apostrophe (in its own doc, *scratch*
comments, and so on). Not because it is a more genuine apostrophe
but because it is much easier for users (and programs) to work with.
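
One concrete sense in which U+0027 may be easier for programs to work with is plain ASCII handling. The sketch below is in Python; the helper `ascii_safe` and the sample strings are hypothetical illustrations, and this is only one possible reading of "easier", not a claim about what Emacs itself does.

```python
def ascii_safe(text):
    """Return True if text can be encoded as ASCII without loss."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# The same word written with each of the three apostrophe candidates.
samples = {
    "U+0027": "don't",
    "U+2019": "don\u2019t",
    "U+02BC": "don\u02bct",
}
results = {label: ascii_safe(text) for label, text in samples.items()}
print(results)

# A literal search typed on an ordinary keyboard only finds the U+0027 form.
found = {label: "don't" in text for label, text in samples.items()}
print(found)
```

Only the U+0027 spelling survives an ASCII-only pipeline and matches a search string typed with the apostrophe key.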