From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: "Drew Adams"
Newsgroups: gmane.emacs.bugs
Subject: bug#9653: 24.0.50; `ucs-names' - Why all of the ("" . XXX) entries?
Date: Sun, 2 Oct 2011 10:38:43 -0700
Message-ID: <3EB320EE64B3419489F05F06AE8B3486@us.oracle.com>
References: <74B14D2A03144E798C9415172D5FE01A@us.oracle.com>
In-Reply-To: <74B14D2A03144E798C9415172D5FE01A@us.oracle.com>
To: <9653@debbugs.gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-GNU-PR-Message: followup 9653
X-GNU-PR-Package: emacs
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:52071

(Not claiming this additional question relates to a bug, in particular to this bug report - except in so far as it asks for better doc.)

In `ucs-names', what are the CHAR-NAMEs "VARIATION SELECTOR-n" all about (for n=17...256)? Are those actually character names?

Googling indicates that a variation selector is a metacharacter that selects one of a set of semantically equivalent glyphs. I no doubt do not fully understand this, even after scanning the Unicode standard (http://unicode.org/reports/tr28/tr28-3.html#13_7_variation_selectors) and this post about it: http://babelstone.blogspot.com/2007/06/secret-life-of-variation-selectors.html

I can, however, see the difference variation selectors can make, e.g. here: http://www.w3.org/TR/xml-entity-names/U0FE00.html

But why are the "VARIATION SELECTOR-n" included as CHAR-NAMEs in `ucs-names'? IIUC, variation selectors, when used, follow the characters whose representation/appearance they modify in some sense. Why do we treat variation selectors, in `ucs-names', as "character names", if they are only "metacharacters", "combining marks" used to indicate how to change the appearance of the characters they follow?

I see that the Unicode standard also refers to variation selectors as "default ignorable characters", so I guess they are characters in some sense. But how about providing a function that filters out all such "ignorable characters" from `ucs-names', or how about at least providing a list of all such chars?

I see this in the standard too: "If a user requires a visual distinction between a character and a particular variant of that character, then fonts must be used to make that distinction." The "variation selector" information seems to be only about visual appearance, not about names of displayable characters. Does it really belong in `ucs-names'?

And I see that such "ignorable" stuff is apparently supposed to be invisible - e.g., "default_ignorable_code points...are invisible, have no glyph...". If so, how about a function that filters out all such invisible stuff from `ucs-names' (or at least a list of such stuff)?

How about a little more doc for `ucs-names', so that any programmer who might want to use `ucs-names' (e.g. for completion) might know how to reasonably use/deal with such particular CHAR-NAMEs? Please do not simply say that `ucs-names' is only "internal" so you need not describe it better. It's already being used in various 3rd-party code.

Again, this is not really part of this bug report (which is only about "" as a CHAR-NAME), unless you see that it is related (e.g. wrt doc). But I would like to know more about the "ignorable characters" - how to recognize them etc. - so that I can (optionally, at least) remove them as completion candidates.

I understand that the Emacs doc does not have as its purpose to teach the details of the Unicode standard, but perhaps a little more explanation of the content of `ucs-names' wouldn't hurt?
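
To make the filtering request concrete, here is a minimal sketch of the kind of filter meant above. It is written in Python rather than Emacs Lisp only so that it is self-contained, and it is not a proposal for the actual implementation. It assumes the candidates are available as (CHAR-NAME, CODE) pairs, and it treats only the two variation-selector blocks (U+FE00..U+FE0F and U+E0100..U+E01EF) as ignorable; the full set of default ignorable code points in the Unicode data is larger. The names `is_variation_selector` and `completion_candidates` are hypothetical.

```python
# Hypothetical helper names; only the two code point ranges come from Unicode.
VARIATION_SELECTOR_RANGES = (
    (0xFE00, 0xFE0F),    # VARIATION SELECTOR-1 .. VARIATION SELECTOR-16
    (0xE0100, 0xE01EF),  # VARIATION SELECTOR-17 .. VARIATION SELECTOR-256
)


def is_variation_selector(code):
    """Return True if CODE falls in one of the variation-selector blocks."""
    return any(lo <= code <= hi for lo, hi in VARIATION_SELECTOR_RANGES)


def completion_candidates(pairs):
    """Drop entries with an empty name and entries that are variation selectors.

    PAIRS is an iterable of (name, code) tuples, assumed here to mirror
    the (CHAR-NAME . CODE) entries discussed in this report.
    """
    return [(name, code) for name, code in pairs
            if name and not is_variation_selector(code)]


# Made-up sample data for illustration only.
sample = [
    ("LATIN SMALL LETTER A", 0x61),
    ("", 0x80),
    ("VARIATION SELECTOR-1", 0xFE00),
    ("VARIATION SELECTOR-17", 0xE0100),
]
print(completion_candidates(sample))
```

A range test on the code is used instead of matching on the name so that the filter does not depend on how the names happen to be spelled.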