From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Character literals for Unicode (control) characters
Date: Mon, 14 Mar 2016 22:30:11 +0200
Message-ID: <83y49kby5o.fsf@gnu.org>
References: <87r3fsjenn.fsf@gnus.org> <56D8623F.6060806@cs.ucla.edu>
 <838u1vwqj9.fsf@gnu.org> <56DC7227.10708@cs.ucla.edu>
 <56DC7F18.8050103@cs.ucla.edu> <83si03v0c3.fsf@gnu.org>
 <56E7191A.60507@cs.ucla.edu>
In-reply-to: <56E7191A.60507@cs.ucla.edu> (message from Paul Eggert on
 Mon, 14 Mar 2016 13:03:38 -0700)
Reply-To: Eli Zaretskii
To: Paul Eggert
Cc: p.stephani2@gmail.com, johnw@gnu.org, larsi@gnus.org, emacs-devel@gnu.org

> Cc: larsi@gnus.org, johnw@gnu.org, emacs-devel@gnu.org
> From: Paul Eggert
> Date: Mon, 14 Mar 2016 13:03:38 -0700
>
> What's the likelihood that the numbers in the above test will
> change?

Zero, given the UTC's stability policy.  But note that Unicode 9.0.0
adds another range of ideographs similar to CJK; their names begin
with "TANGUT IDEOGRAPH-".

> > +      /* 200 characters is hopefully long enough.  Increase if
> > +         not.  */
> > +      char name[200];
>
> Give a name to this constant, e.g.,
>
>    /* Bound on the length of a Unicode character name.
>       As of Unicode 9.0.0 the maximum is 83, so this should be safe.  */
>    enum { UNICODE_CHARACTER_NAME_LENGTH_BOUND = 199 };
>    ...
>    char name[UNICODE_CHARACTER_NAME_LENGTH_BOUND + 1];

Perhaps we should ask on the Unicode mailing list; I seem to remember
seeing a mandatory limit on the length of a character's name.

Thanks.
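
P.S. To make the suggestion above concrete, here is a minimal sketch of how a named bound could guard a buffer that holds an algorithmically generated name (a fixed prefix such as "CJK UNIFIED IDEOGRAPH-" or "TANGUT IDEOGRAPH-" followed by the code point in hex).  The helper format_ideograph_name is hypothetical and is not part of the patch under discussion; the value 199 is just the one proposed in the quoted text, not a limit confirmed by the Unicode Standard.

```c
#include <stdio.h>
#include <string.h>

/* Bound on the length of a Unicode character name, as proposed in
   the quoted review.  The value is an assumption, not a verified
   limit.  */
enum { UNICODE_CHARACTER_NAME_LENGTH_BOUND = 199 };

/* Hypothetical helper, for illustration only: write PREFIX followed
   by CODE in uppercase hex (at least 4 digits) into NAME, which must
   have room for UNICODE_CHARACTER_NAME_LENGTH_BOUND + 1 bytes.
   Return 1 on success, 0 if the name would not fit (NAME is then
   truncated but still null-terminated).  */
static int
format_ideograph_name (char *name, const char *prefix, unsigned long code)
{
  int len = snprintf (name, UNICODE_CHARACTER_NAME_LENGTH_BOUND + 1,
                      "%s%04lX", prefix, code);
  return 0 <= len && len <= UNICODE_CHARACTER_NAME_LENGTH_BOUND;
}
```

A caller would declare the buffer as char name[UNICODE_CHARACTER_NAME_LENGTH_BOUND + 1] and check the return value, so that an overly long name is reported instead of silently truncated.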