From: Philipp Stephani
Newsgroups: gmane.emacs.devel
Subject: Re: Character literals for Unicode (control) characters
Date: Sun, 06 Mar 2016 19:16:37 +0000
To: Paul Eggert, Eli Zaretskii
Cc: larsi@gnus.org, johnw@gnu.org, emacs-devel@gnu.org
References: <87r3fsjenn.fsf@gnus.org> <56D8623F.6060806@cs.ucla.edu> <838u1vwqj9.fsf@gnu.org> <56DC7227.10708@cs.ucla.edu> <56DC7F18.8050103@cs.ucla.edu>
In-Reply-To: <56DC7F18.8050103@cs.ucla.edu>

Paul Eggert <eggert@cs.ucla.edu> wrote on Sun, 6 Mar 2016 at 20:03:

> Philipp Stephani wrote:
> > Initially I used ucs-names, but then decided against it because it lacks
> > most characters.
>
> Can you describe in general terms the difference between what's in
> ucs-names and what's in the new hash table? Should the two things be unified?

ucs-names uses a whitelist of ranges to consider:

    '((#x0000 . #x33FF)
      ;; (#x3400 . #x4DBF) CJK Ideographs Extension A
      (#x4DC0 . #x4DFF)
      ;; (#x4E00 . #x9FFF) CJK Unified Ideographs
      (#xA000 . #xD7FF)
      ;; (#xD800 . #xFAFF) Surrogate/Private
      (#xFB00 . #x134FF)
      ;; (#x13500 . #x167FF) unused
      (#x16800 . #x16A3F)
      ;; (#x16A40 . #x1AFFF) unused
      (#x1B000 . #x1B0FF)
      ;; (#x1B100 . #x1CFFF) unused
      (#x1D000 . #x1FFFF)
      ;; (#x20000 . #xDFFFF) CJK Ideograph Extension A, B, etc, unused
      (#xE0000 . #xE01FF))

This is probably for practical purposes (there is no point in showing thousands of "CJK UNIFIED IDEOGRAPH-xyz" completions). For a character escape these considerations don't apply, and it would be very surprising and confusing to not accept all characters.
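
To make the gap concrete, here is a minimal sketch of a predicate that checks a code point against the ranges quoted above. The names `my-ucs-names-ranges` and `my-ucs-names-covers-p` are hypothetical and not part of Emacs; the range list is copied from the quoted whitelist, not read from a live ucs-names.

```elisp
;; Hypothetical illustration only: neither name below exists in Emacs.
(defconst my-ucs-names-ranges
  '((#x0000 . #x33FF)
    (#x4DC0 . #x4DFF)
    (#xA000 . #xD7FF)
    (#xFB00 . #x134FF)
    (#x16800 . #x16A3F)
    (#x1B000 . #x1B0FF)
    (#x1D000 . #x1FFFF)
    (#xE0000 . #xE01FF))
  "Code point ranges copied from the whitelist quoted in this message.")

(defun my-ucs-names-covers-p (char)
  "Return non-nil if CHAR lies inside one of `my-ucs-names-ranges'."
  (let ((found nil))
    ;; Each range is an inclusive (START . END) pair.
    (dolist (range my-ucs-names-ranges found)
      (when (<= (car range) char (cdr range))
        (setq found t)))))

(my-ucs-names-covers-p ?A)       ; U+0041 is in the first range
(my-ucs-names-covers-p #x4E2D)   ; a CJK Unified Ideograph, skipped
(my-ucs-names-covers-p #x20000)  ; in the commented-out #x20000 block
```

Under these assumptions the first call returns non-nil and the other two return nil, which is the kind of omission that would be surprising in a character escape.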