* bug#16216: 24.3.50; <control> entries in `ucs-names'
From: Drew Adams @ 2013-12-22 2:09 UTC
To: 16216

The doc for `insert-char' and `ucs-names' is sketchy. But it does at
least say that it is about inserting a character "using its UNICODE
name or its code point."

So what are all of those `<control>' character names about? Many
characters are listed in `ucs-names' as having this same "character
name", `<control>':

C-x 8 RET TAB C-g
C-h v ucs-names
C-s <control> C-s C-s...

And yet, AFAICT, there is no UNICODE character that has the name
`<control>', or even any name that has that as a substring.

http://www.unicode.org/charts/charindex.html

This seems like a bug. But since the description of `ucs-names' is so
sketchy it's hard to assert that.

If this is not a bug, then:

1. In what way is `<control>' a "CHAR-NAME" for a character with any
   code point? What does CHAR-NAME mean in this case?

2. What is the purpose of the multiple `<control>' CHAR-NAMEs?

3. Why are different CHAR-CODE values associated with the same
   CHAR-NAME, `<control>'? What does that mean?

4. Try `C-x 8 RET <contr TAB RET'. You get only one particular
   character "named" <control>, the one with code point decimal 159.
   That's the character named "APPLICATION PROGRAM COMMAND". Why that
   one?

In GNU Emacs 24.3.50.1 (i686-pc-mingw32)
 of 2013-12-16 on ODIEONE
Bzr revision: 115543 rudalics@gmx.at-20131216095844-lbjh5yerk6ff0tm7
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
 `configure --prefix=/c/Devel/emacs/binary
 --enable-checking=yes,glyphs 'CFLAGS=-O0 -g3'
 LDFLAGS=-Lc:/Devel/emacs/lib CPPFLAGS=-Ic:/Devel/emacs/include'

* bug#16216: 24.3.50; <control> entries in `ucs-names'
From: Eli Zaretskii @ 2013-12-22 3:55 UTC
To: Drew Adams; +Cc: 16216

> Date: Sat, 21 Dec 2013 18:09:17 -0800 (PST)
> From: Drew Adams <drew.adams@oracle.com>
>
> 1. In what way is `<control>' a "CHAR-NAME" for a character with any
>    code point? What does CHAR-NAME mean in this case?

Look at UnicodeData.txt, near the beginning of the file.

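For readers without the file at hand, the entries in question can be inspected with a short sketch like the one below. The three sample records are quoted from UnicodeData.txt as published in the Unicode Character Database; the helper name `parse_record` and the dictionary keys are illustrative choices made here, not part of any Emacs or Unicode API.

```python
# Sketch: split UnicodeData.txt records into the fields relevant to this thread.
# Field 0 is the code point (hex), field 1 the name or a label such as
# <control>, field 2 the general category, field 10 the Unicode 1.0 name.

SAMPLE_RECORDS = [
    "0000;<control>;Cc;0;BN;;;;;N;NULL;;;;",
    "009F;<control>;Cc;0;BN;;;;;N;APPLICATION PROGRAM COMMAND;;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
]


def parse_record(line):
    """Return a dict with the code point, name field, category and old name."""
    fields = line.split(";")
    return {
        "code": int(fields[0], 16),
        "name": fields[1],
        "category": fields[2],
        "old_name": fields[10],
    }


records = [parse_record(line) for line in SAMPLE_RECORDS]
for rec in records:
    print(f"U+{rec['code']:04X} name={rec['name']!r} "
          f"category={rec['category']} old_name={rec['old_name']!r}")
```

Several records share the same `<control>` text in field 1 while differing in code point and in field 10, which is the pattern the bug report describes.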
* bug#16216: 24.3.50; <control> entries in `ucs-names'
From: Drew Adams @ 2013-12-22 5:08 UTC
To: Eli Zaretskii; +Cc: 16216

> Look at UnicodeData.txt, near the beginning of the file.

I see; thanks. And I recall now that you pointed me to that file once
before.

Still, that does not really answer the questions I posed, AFAICT. At
least not for a user of `ucs-names' or the other functions mentioned.

If `ucs-names' essentially corresponds to UnicodeData.txt, how about
citing that in its doc? Better yet, perhaps cite this, which seems to
be the place that the fields of UnicodeData.txt are described:

http://www.unicode.org/Public/5.1.0/ucd/UCD.html#UnicodeData.txt

Still, part of my question is about `insert-char' and
`read-char-by-name', which is really what most users will see. (Those
are admittedly not the same as `ucs-names'. But they are currently the
only consumers of the latter.)

Should the `<control>' entries of `ucs-names' be included for the
completion provided by `read-char-by-name'? You can only choose one of
them, anyway. What is the use case for that - the reason it is
included as a possibility for `C-x 8 RET'?

* bug#16216: 24.3.50; <control> entries in `ucs-names'
From: Drew Adams @ 2013-12-22 5:10 UTC
To: Eli Zaretskii; +Cc: 16216

> http://www.unicode.org/Public/5.1.0/ucd/UCD.html#UnicodeData.txt

(That seems to have been replaced by this:
http://www.unicode.org/reports/tr44/#UnicodeData.txt)

* bug#16216: 24.3.50; <control> entries in `ucs-names'
From: Eli Zaretskii @ 2013-12-22 18:13 UTC
To: Drew Adams; +Cc: 16216

> Date: Sat, 21 Dec 2013 21:10:50 -0800 (PST)
> From: Drew Adams <drew.adams@oracle.com>
> Cc: 16216@debbugs.gnu.org
>
> > http://www.unicode.org/Public/5.1.0/ucd/UCD.html#UnicodeData.txt
>
> (That seems to have been replaced by this:
> http://www.unicode.org/reports/tr44/#UnicodeData.txt)

The best references are to the "latest" version:

http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt

* bug#16216: 24.3.50; <control> entries in `ucs-names'
From: Eli Zaretskii @ 2013-12-22 18:10 UTC
To: Drew Adams; +Cc: 16216-done

> Date: Sat, 21 Dec 2013 21:08:35 -0800 (PST)
> From: Drew Adams <drew.adams@oracle.com>
> Cc: 16216@debbugs.gnu.org
>
> > Look at UnicodeData.txt, near the beginning of the file.
>
> I see; thanks. And I recall now that you pointed me to that
> file once before.
>
> Still, that does not really answer the questions I posed, AFAICT.
> At least not for a user of `ucs-names' or the other functions
> mentioned.

I looked deeper and decided that this was a bug. The Unicode Standard
explicitly says that control characters have no 'name' property (see
Section 4.8 in the Standard), and that those "<control>" things are
just labels. The 'name' property cannot have lower-case characters or
"<>" in it anyway.

So starting with trunk revision 115693, all control characters will
have nil as their 'name' property, and "C-x 8 RET < TAB" will say "No
match". (Some of the control characters have 'old-name' property, so
they still can be called out by name.)

> If `ucs-names' essentially corresponds to UnicodeData.txt, how
> about citing that in its doc?

The exact file is an implementation detail (there's a corresponding
XML file, which could be used if we wanted); the ELisp manual
documents that the properties are derived from UCD, the Unicode
Character Database.

Thanks.

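The distinction between a label and a 'name' property can also be observed outside Emacs. As a rough illustration only, and not the Emacs implementation, Python's standard `unicodedata` module likewise reports no name for control characters. The helper `name_or_none` is a hypothetical name used just for this sketch.

```python
import unicodedata


def name_or_none(code_point):
    """Return the Unicode 'name' property for code_point, or None if it has none."""
    # unicodedata.name raises ValueError for unnamed characters unless a
    # default is supplied; passing None as the default avoids the exception.
    return unicodedata.name(chr(code_point), None)


# Three control characters (general category Cc) and one ordinary letter.
for cp in (0x00, 0x09, 0x9F, 0x41):
    category = unicodedata.category(chr(cp))
    print(f"U+{cp:04X}: name={name_or_none(cp)!r} category={category}")
```

The control characters come back with no name at all rather than the string "<control>", which matches the behavior described above for Emacs after the fix.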