unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name
@ 2016-04-21  4:40 Tino Calancha
  2016-04-21 14:00 ` Eli Zaretskii
  2016-04-23 19:52 ` bug#23325: 25.0.92; insert-char: cannot find all chars if input, " Paul Eggert
  0 siblings, 2 replies; 10+ messages in thread
From: Tino Calancha @ 2016-04-21  4:40 UTC (permalink / raw)
  To: 23325


Hello,

Interactive calls to `insert-char' cannot find all
characters when CHARACTER is given as the Unicode character name.

By contrast, it finds all characters when CHARACTER is given as
the code point.


emacs -Q:
M-x insert-char RET cjk SPC TAB
;; Buffer "*Completions*" just shows entries starting
;; with 'CJK RADICAL' or 'CJK STROKE'.

M-x insert-char RET 2eea RET
;; ok
M-x insert-char RET cjk SPC radical SPC c-simplified SPC frog RET
;; ok
M-x insert-char RET 79c1 RET
;; ok
M-x insert-char RET cjk SPC ideograph-79c1 RET
;; Signals the error 'Invalid character'


In GNU Emacs 25.0.92.1 (x86_64-pc-linux-gnu, GTK+ Version 2.24.30)
  of 2016-04-21 built on calancha-pc
Repository revision: a77cf24ada2f89194c0ac64aae27bcdf7021e697






* bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name
  2016-04-21  4:40 bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name Tino Calancha
@ 2016-04-21 14:00 ` Eli Zaretskii
  2016-04-21 14:32   ` Tino Calancha
  2016-04-23 19:52 ` bug#23325: 25.0.92; insert-char: cannot find all chars if input, " Paul Eggert
  1 sibling, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2016-04-21 14:00 UTC (permalink / raw)
  To: Tino Calancha; +Cc: 23325

> Date: Thu, 21 Apr 2016 13:40:01 +0900 (JST)
> From: Tino Calancha <f92capac@gmail.com>
> 
> Interactive calls to `insert-char' cannot find all
> characters when CHARACTER is given as the Unicode character name.
> 
> By contrast, it finds all characters when CHARACTER is given as
> the code point.
> 
> 
> emacs -Q:
> M-x insert-char RET cjk SPC TAB
> ;; Buffer "*Completions*" just shows entries starting
> ;; with 'CJK RADICAL' or 'CJK STROKE'.
> 
> M-x insert-char RET 2eea RET
> ;; ok
> M-x insert-char RET cjk SPC radical SPC c-simplified SPC frog RET
> ;; ok
> M-x insert-char RET 79c1 RET
> ;; ok
> M-x insert-char RET cjk SPC ideograph-79c1 RET
> ;; Signals the error 'Invalid character'

It's a feature, see ucs-names.  We deliberately filter out
non-descriptive names like "CJK COMPATIBILITY IDEOGRAPH-2F803",
because (a) if we didn't, the list of completions would sometimes be
much longer; and (b) it makes very little sense to show these names as
completion candidates: they include the codepoint, so if you know
which of these you need to insert, you can insert it by the codepoint.
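
To make the filtering concrete, here is a minimal sketch of how one
might check whether a name is among those that `ucs-names' provides.
The helper `my-char-name-known-p' is hypothetical and not part of
Emacs.  The sketch assumes `ucs-names' returns either an alist of
(NAME . CODE) pairs, as in the Emacs 25 code under discussion, or a
hash table keyed by name, as in later releases.

```elisp
;; Sketch only: `my-char-name-known-p' is a hypothetical helper,
;; not part of Emacs or of any patch in this thread.
(defun my-char-name-known-p (name)
  "Return the code point for NAME if `ucs-names' knows it, else nil.
NAME is compared case-sensitively, so pass it in upper case."
  (let ((names (ucs-names)))
    (if (hash-table-p names)
        ;; Assumed for later releases: hash table of NAME -> CODE.
        (gethash name names)
      ;; Emacs 25: alist of (NAME . CODE) pairs.
      (cdr (assoc-string name names)))))

;; A descriptive name that should be present:
(my-char-name-known-p "HIRAGANA LETTER KI")
;; A codepoint-bearing name of the kind described above as filtered out:
(my-char-name-known-p "CJK COMPATIBILITY IDEOGRAPH-2F803")
```

Which names are filtered depends on the Emacs version, so the second
call is best evaluated in the Emacs at hand rather than assumed.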






* bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name
  2016-04-21 14:00 ` Eli Zaretskii
@ 2016-04-21 14:32   ` Tino Calancha
  2016-04-21 15:44     ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Tino Calancha @ 2016-04-21 14:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Tino Calancha, 23325


> It's a feature, see ucs-names.  We deliberately filter out
> non-descriptive names like "CJK COMPATIBILITY IDEOGRAPH-2F803",

Yes, that was my guess, and actually I would not cache such
trivial key-value relations either.

There is one related minor detail:

M-x insert-char RET 79c1 RET
;; ok, the character appears
C-b
M-x describe-char RET
;; Line 8 shows:
;; to input: type "C-x 8 RET 79c1" or "C-x 8 RET CJK IDEOGRAPH-79C1"

Shouldn't `describe-char' omit the input method by unicode name in 
those cases?






* bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name
  2016-04-21 14:32   ` Tino Calancha
@ 2016-04-21 15:44     ` Eli Zaretskii
  2016-04-21 17:39       ` Tino Calancha
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2016-04-21 15:44 UTC (permalink / raw)
  To: Tino Calancha; +Cc: 23325

> Date: Thu, 21 Apr 2016 23:32:49 +0900 (JST)
> From: Tino Calancha <f92capac@gmail.com>
> cc: Tino Calancha <f92capac@gmail.com>, 23325@debbugs.gnu.org
> 
> M-x describe-char RET
> ;; Line 8 shows:
> ;; to input: type "C-x 8 RET 79c1" or "C-x 8 RET CJK IDEOGRAPH-79C1"
> 
> Shouldn't `describe-char' omit the input method by unicode name in 
> those cases?

Yes, it should show a method that works; patches welcome.






* bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name
  2016-04-21 15:44     ` Eli Zaretskii
@ 2016-04-21 17:39       ` Tino Calancha
  2016-04-21 19:42         ` Eli Zaretskii
  0 siblings, 1 reply; 10+ messages in thread
From: Tino Calancha @ 2016-04-21 17:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Tino Calancha, 23325

[-- Attachment #1: Type: text/plain, Size: 289 bytes --]



> Yes, it should show a method that works; patches welcome.

With attached patch:

`describe-char' on 0x79c1 shows:
to input: type "C-x 8 RET 79c1"

OTOH, for cached entries in `ucs-names' (e.g. 0x304d) it shows as usual:
to input: type "C-x 8 RET 304d" or "C-x 8 RET HIRAGANA LETTER KI"

[-- Attachment #2: Type: text/plain, Size: 1152 bytes --]

diff --git a/lisp/descr-text.el b/lisp/descr-text.el
index a352ed0..21f2214 100644
--- a/lisp/descr-text.el
+++ b/lisp/descr-text.el
@@ -619,7 +619,13 @@ describe-char
                           (let ((name
                                  (or (get-char-code-property char 'name)
                                      (get-char-code-property char 'old-name))))
-                            (if name
+                            (if (and name
+                                     (not (string-prefix-p "CJK IDEOGRAPH-" name))
+                                     (not (string-prefix-p "CJK COMPATIBILITY IDEOGRAPH-" name))
+                                     (not (string-prefix-p "LOW SURROGATE-" name))
+                                     (not (string-prefix-p "HIGH SURROGATE-" name))
+                                     (not (string-prefix-p "TANGUT IDEOGRAPH-" name))
+                                     (not (string-prefix-p "TANGUT COMPONENT-" name)))
                                 (format
                                  "type \"C-x 8 RET %x\" or \"C-x 8 RET %s\""
                                  char name)


* bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name
  2016-04-21 17:39       ` Tino Calancha
@ 2016-04-21 19:42         ` Eli Zaretskii
  2016-04-22  3:08           ` Tino Calancha
  0 siblings, 1 reply; 10+ messages in thread
From: Eli Zaretskii @ 2016-04-21 19:42 UTC (permalink / raw)
  To: Tino Calancha; +Cc: 23325

> Date: Fri, 22 Apr 2016 02:39:24 +0900 (JST)
> From: Tino Calancha <f92capac@gmail.com>
> cc: Tino Calancha <f92capac@gmail.com>, 23325@debbugs.gnu.org
> 
> > Yes, it should show a method that works; patches welcome.
> 
> With attached patch:
> 
> `describe-char' on 0x79c1 shows:
> to input: type "C-x 8 RET 79c1"
> 
> OTOH, for cached entries in `ucs-names' (e.g. 0x304d) it shows as usual:
> to input: type "C-x 8 RET 304d" or "C-x 8 RET HIRAGANA LETTER KI"

Thanks.  But I think it would be better if the code didn't have to
define in yet another place which characters are omitted; instead, how
about checking if the character is in the ucs-names list?






* bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name
  2016-04-21 19:42         ` Eli Zaretskii
@ 2016-04-22  3:08           ` Tino Calancha
  2016-04-22  7:32             ` Andreas Schwab
  0 siblings, 1 reply; 10+ messages in thread
From: Tino Calancha @ 2016-04-22  3:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Tino Calancha, 23325

[-- Attachment #1: Type: text/plain, Size: 227 bytes --]


> It would be better if the code didn't have to
> define in yet another place which characters are omitted; instead, how
> about checking if the character is in the ucs-names list?

Absolutely.  Implemented in the new patch.


[-- Attachment #2: Type: text/plain, Size: 1137 bytes --]

From 592473bed4a6986e9dc96489f63a174112cc90d5 Mon Sep 17 00:00:00 2001
From: Tino Calancha <f92capac@gmail.com>
Date: Fri, 22 Apr 2016 12:00:14 +0900
Subject: [PATCH] describe-char: fix insert char documentation

* lisp/descr-text.el (describe-char):
Only 'ucs-names' entries can be inserted by unicode name (Bug#23325).
---
 lisp/descr-text.el | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lisp/descr-text.el b/lisp/descr-text.el
index a352ed0..4dac6bb 100644
--- a/lisp/descr-text.el
+++ b/lisp/descr-text.el
@@ -619,7 +619,8 @@ describe-char
                           (let ((name
                                  (or (get-char-code-property char 'name)
                                      (get-char-code-property char 'old-name))))
-                            (if name
+                            (if (and name
+                                     (assoc-string name (or ucs-names (ucs-names))))
                                 (format
                                  "type \"C-x 8 RET %x\" or \"C-x 8 RET %s\""
                                  char name)
-- 
2.8.0.rc3



* bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name
  2016-04-22  3:08           ` Tino Calancha
@ 2016-04-22  7:32             ` Andreas Schwab
  2016-04-22  7:56               ` Tino Calancha
  0 siblings, 1 reply; 10+ messages in thread
From: Andreas Schwab @ 2016-04-22  7:32 UTC (permalink / raw)
  To: Tino Calancha; +Cc: 23325

Tino Calancha <f92capac@gmail.com> writes:

> +                                     (assoc-string name (or ucs-names (ucs-names))))

There is no need to refer to the variable ucs-names as the function uses
it anyway.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."






* bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name
  2016-04-22  7:32             ` Andreas Schwab
@ 2016-04-22  7:56               ` Tino Calancha
  0 siblings, 0 replies; 10+ messages in thread
From: Tino Calancha @ 2016-04-22  7:56 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Tino Calancha, 23325

[-- Attachment #1: Type: text/plain, Size: 152 bytes --]



> There is no need to refer to the variable ucs-names as the function uses
> it anyway.

Thank you.  Corrected in the attached new patch.

Cheers,
Tino

[-- Attachment #2: Type: text/plain, Size: 1081 bytes --]

From 4c9b21e7a65cfc17f679906d9ebfa6ac9a11bedb Mon Sep 17 00:00:00 2001
From: Tino Calancha <f92capac@gmail.com>
Date: Fri, 22 Apr 2016 16:50:17 +0900
Subject: [PATCH] describe-char: fix insert char documentation

* lisp/descr-text.el (describe-char):
Only 'ucs-names' entries can be inserted by unicode name (Bug#23325).
---
 lisp/descr-text.el | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lisp/descr-text.el b/lisp/descr-text.el
index a352ed0..5f1a430 100644
--- a/lisp/descr-text.el
+++ b/lisp/descr-text.el
@@ -619,7 +619,7 @@ describe-char
                           (let ((name
                                  (or (get-char-code-property char 'name)
                                      (get-char-code-property char 'old-name))))
-                            (if name
+                            (if (and name (assoc-string name (ucs-names)))
                                 (format
                                  "type \"C-x 8 RET %x\" or \"C-x 8 RET %s\""
                                  char name)
-- 
2.8.0.rc3



* bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name
  2016-04-21  4:40 bug#23325: 25.0.92; insert-char: cannot find all chars if input is unicode name Tino Calancha
  2016-04-21 14:00 ` Eli Zaretskii
@ 2016-04-23 19:52 ` Paul Eggert
  1 sibling, 0 replies; 10+ messages in thread
From: Paul Eggert @ 2016-04-23 19:52 UTC (permalink / raw)
  To: Tino Calancha; +Cc: 23325-done

Thanks, I installed that patch and am closing the bug report.






