From: Aidan Kehoe
Newsgroups: gmane.emacs.devel
Subject: Re: [PATCH] Unicode Lisp reader escapes
Date: Thu, 4 May 2006 18:41:17 +0200
Message-ID: <17498.11949.75640.41779@parhasard.net>
References: <17491.34779.959316.484740@parhasard.net> <87odyfnqcj.fsf-monnier+emacs@gnu.org>
Cc: emacs-devel@gnu.org
Original-To: rms@gnu.org

On the fourth of May, Richard Stallman wrote:

 > Regarding \u: the question is whether an Emacs escape for Unicode
 > characters should be compatible with C string syntax for Unicode
 > characters, or coherent with the Emacs \x escape.

The thing with the Emacs \x escape is that anyone using it for characters
outside of ASCII is asking for pain, and always has been. It has only ever
been clearly defined for that character set; any existing code in the
repository that uses it for other characters, for example, _will definitely_
break with the merging of the Unicode branch.

Now, there is lots of code in 21.4’s source tree that uses the \x syntax for
things that are conceptually numbers and not Emacs characters. That code is
not broken, but it is bad style; that’s what the #x syntax is for.

So when people have been using the variable-length syntax with more than two
digits, they are either writing buggy code or using bad style. I’m not sure
that merits emulation.

 > I think one relevant question is to what extent the C and Emacs Lisp
 > string syntax are compatible in the first place. Emacs Lisp string
 > syntax was largely based on C string syntax in 1984, but I don't know
 > how C has developed since 1990. Can someone report on this question?

The \u syntax (with a fixed number of digits) came into wide use with Java
in 1996.
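To make the fixed-length versus variable-length distinction concrete, here is
a toy sketch in Python. It is not the Emacs reader, and `read_hex_escape` is a
hypothetical helper written only for this illustration: with `width=None` it
consumes every hex digit that follows, as a variable-length escape does, and
with `width=4` it stops after exactly four digits, as \u does. The last line
uses Python's own string syntax, where \u is fixed at four digits.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def read_hex_escape(text, start, width=None):
    # Hypothetical helper: parse the hex digits of an escape beginning at
    # text[start] and return (code_point, index_of_next_unread_character).
    # width=None consumes every hex digit that follows (variable length);
    # width=4 or width=8 requires exactly that many digits (fixed length).
    end = start
    while end < len(text) and text[end] in HEX_DIGITS:
        if width is not None and end - start == width:
            break
        end += 1
    if end == start or (width is not None and end - start != width):
        raise ValueError("malformed hex escape")
    return int(text[start:end], 16), end


# Intent in both cases: the character U+00E9 followed by the plain text "bc".
variable = read_hex_escape("e9bc", 0)           # variable length: eats "bc" too
fixed = read_hex_escape("00e9bc", 0, width=4)   # fixed length: stops after "00e9"

print(hex(variable[0]), "then", repr("e9bc"[variable[1]:]))
print(hex(fixed[0]), "then", repr("00e9bc"[fixed[1]:]))

# Python's own \u escape is fixed at four digits, so "bc" stays ordinary text.
literal = "\u00e9bc"
print(len(literal))
```

Under the variable-length rule the trailing "bc" is absorbed into the number,
so the author has to break the string up to get the intended character; under
the fixed-length rule the escape ends at a known position.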
The necessity for the \U extension arose with progress towards version 3.0
of Unicode and its ~1.1 million available code points. That version of the
standard was released in 1999; the C99 ISO standard for C of the same year
included both \u and \U. Various other C-oriented programming languages have
incorporated the syntax since.

-- 
Aidan Kehoe, http://www.parhasard.net/