bug#10857: ucs-insert deals inconsistently with errors

* bug#10857: ucs-insert deals inconsistently with errors
@ 2012-02-20 15:53 Juanma Barranquero
  2012-02-21  0:37 ` Juri Linkov
  0 siblings, 1 reply; 8+ messages in thread
From: Juanma Barranquero @ 2012-02-20 15:53 UTC
  To: 10857

Package: emacs
Severity: minor


`ucs-insert' does not deal very consistently with errors.

Two anomalies:

1)  M-x ucs-insert <RET> zzz <RET>   => "Not a Unicode character code: nil"
    Which is caused by `read-char-by-name' not having a way to pass
back what the user really typed. Still, I typed "zzz", not "nil", so
the message is unhelpful.

2) When called from Lisp code, it deals differently with erroneous
strings and erroneous non-strings:
    (ucs-insert 'zzz)  =>  "Not a Unicode character code: zzz"   ;; correct
    (ucs-insert "zzz")  =>  any non-hex string is turned into ^@ and
inserted, and no error is produced.

The second problem can be trivially fixed with (not (string-match-p
"[^[:xdigit:]]" character)), though the docstring of `ucs-insert' does
not really say much about the valid forms the CHARACTER arg can take.
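
For illustration, a minimal sketch of why the string case silently
inserts ^@: `string-to-number' returns 0 when the string does not
start with a valid number in the given base. `my-hex-string-p' is a
hypothetical helper name used only here, not part of Emacs.

```elisp
;; Hypothetical helper (not in Emacs): wraps the check proposed above.
(defun my-hex-string-p (s)
  "Return non-nil if S contains no character other than hex digits."
  (not (string-match-p "[^[:xdigit:]]" s)))

(string-to-number "zzz" 16)   ; => 0, i.e. NUL, displayed as ^@
(string-to-number "2318" 16)  ; => 8984
(my-hex-string-p "zzz")       ; => nil
(my-hex-string-p "2318")      ; => t
(my-hex-string-p "")          ; => t, the empty string has no non-hex character
```

As the last line shows, this form of the check also accepts the empty
string, so a caller may want to test for that separately.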

    Juanma






* bug#10857: ucs-insert deals inconsistently with errors
  2012-02-20 15:53 bug#10857: ucs-insert deals inconsistently with errors Juanma Barranquero
@ 2012-02-21  0:37 ` Juri Linkov
  2012-02-21  1:25   ` Juanma Barranquero
  0 siblings, 1 reply; 8+ messages in thread
From: Juri Linkov @ 2012-02-21  0:37 UTC
  To: Juanma Barranquero; +Cc: 10857

> 1)  M-x ucs-insert <RET> zzz <RET>   => "Not a Unicode character code: nil"
>     Which is caused by `read-char-by-name' not having a way to pass
> back what the user really typed. Still, I typed "zzz", not "nil", so
> the message is unhelpful.

Wouldn't it be too weird for `read-char-by-name' to return "zzz"
when the purpose of this function is to return a character,
not a string the user typed?

> 2) When called from Lisp code, it deals differently with erroneous
> strings and erroneous non-strings:
>     (ucs-insert 'zzz)  =>  "Not a Unicode character code: zzz"   ;; correct
>     (ucs-insert "zzz")  =>  any non-hex string is turned into ^@ and
> inserted, and no error is produced.
>
> The second problem can be trivially fixed with
> (not (string-match-p "[^[:xdigit:]]" character)),

In `read-char-by-name', the condition for this purpose is:

  (string-match-p "^[0-9a-fA-F]+$" input)






* bug#10857: ucs-insert deals inconsistently with errors
  2012-02-21  0:37 ` Juri Linkov
@ 2012-02-21  1:25   ` Juanma Barranquero
  2012-02-21  9:16     ` Andreas Schwab
  2012-02-22  0:09     ` Juri Linkov
  0 siblings, 2 replies; 8+ messages in thread
From: Juanma Barranquero @ 2012-02-21  1:25 UTC
  To: Juri Linkov; +Cc: 10857

On Tue, Feb 21, 2012 at 01:37, Juri Linkov <juri@jurta.org> wrote:

> Wouldn't it be too weird for `read-char-by-name' to return "zzz"
> when the purpose of this function is to return a character,
> not a string the user typed?

Yes. I don't think `read-char-by-name' should return "zzz", I think
`ucs-insert' should not say the "nil" part. Perhaps just "Not a
Unicode character".

>> The second problem can be trivially fixed with
>> (not (string-match-p "[^[:xdigit:]]" character)),
>
> In `read-char-by-name', the condition for this purpose is:
>
>  (string-match-p "^[0-9a-fA-F]+$" input)

They are equivalent, aren't they? But my point was that the docstring
does not say what to expect in CHARACTER.

    Juanma






* bug#10857: ucs-insert deals inconsistently with errors
  2012-02-21  1:25   ` Juanma Barranquero
@ 2012-02-21  9:16     ` Andreas Schwab
  2012-02-21 10:39       ` Juanma Barranquero
  2012-02-22  0:09     ` Juri Linkov
  1 sibling, 1 reply; 8+ messages in thread
From: Andreas Schwab @ 2012-02-21  9:16 UTC
  To: Juanma Barranquero; +Cc: 10857

Juanma Barranquero <lekktu@gmail.com> writes:

> On Tue, Feb 21, 2012 at 01:37, Juri Linkov <juri@jurta.org> wrote:
>
>>> The second problem can be trivially fixed with
>>> (not (string-match-p "[^[:xdigit:]]" character)),
>>
>> In `read-char-by-name', the condition for this purpose is:
>>
>>  (string-match-p "^[0-9a-fA-F]+$" input)
>
> They are equivalent, aren't they?

No.  The latter ignores anything before or after a newline character, as
long as there is a match on the other side of it.  That can be fixed by
using "\\`[0-9a-fA-F]+\\'".
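
A small sketch of the difference, assuming a multi-line string such as
"zzz\n41" as input. `my-hex-checks' is a hypothetical name used only
for this illustration; it returns the results of the three tests
discussed in this thread.

```elisp
;; Hypothetical helper (not in Emacs): run the three candidate tests.
(defun my-hex-checks (input)
  "Return a list with the result of each hex-string test on INPUT."
  (list
   ;; ^ and $ match at line boundaries, so one matching line is enough.
   (string-match-p "^[0-9a-fA-F]+$" input)
   ;; \` and \' match only at the start and end of the whole string.
   (string-match-p "\\`[0-9a-fA-F]+\\'" input)
   ;; Negated-class test: fails if any non-hex character is present.
   (not (string-match-p "[^[:xdigit:]]" input))))

(my-hex-checks "zzz\n41")  ; => (4 nil nil)
(my-hex-checks "41")       ; => (0 0 t)
```

With "zzz\n41" the first test matches at position 4, on the second
line, while the other two reject the string.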

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."






* bug#10857: ucs-insert deals inconsistently with errors
  2012-02-21  9:16     ` Andreas Schwab
@ 2012-02-21 10:39       ` Juanma Barranquero
  0 siblings, 0 replies; 8+ messages in thread
From: Juanma Barranquero @ 2012-02-21 10:39 UTC
  To: Andreas Schwab; +Cc: 10857

On Tue, Feb 21, 2012 at 10:16, Andreas Schwab <schwab@linux-m68k.org> wrote:

> No.  The latter ignores anything before or after a newline character, as
> long as there is a match on the other side of it.  That can be fixed by
> using "\\`[0-9a-fA-F]+\\'".

I didn't say "identical". That seems like a corner case.

    Juanma






* bug#10857: ucs-insert deals inconsistently with errors
  2012-02-21  1:25   ` Juanma Barranquero
  2012-02-21  9:16     ` Andreas Schwab
@ 2012-02-22  0:09     ` Juri Linkov
  2012-02-22  9:03       ` Andreas Schwab
  1 sibling, 1 reply; 8+ messages in thread
From: Juri Linkov @ 2012-02-22  0:09 UTC
  To: Juanma Barranquero; +Cc: 10857

tags 10857 patch
thanks

> Yes. I don't think `read-char-by-name' should return "zzz", I think
> `ucs-insert' should not say the "nil" part. Perhaps just "Not a
> Unicode character".
>
>> In `read-char-by-name', the condition for this purpose is:
>>
>>  (string-match-p "^[0-9a-fA-F]+$" input)
>
> They are equivalent, aren't they? But my point was that the docstring
> does not say what to expect in CHARACTER.

This should be fixed by this patch:

=== modified file 'lisp/international/mule-cmds.el'
--- lisp/international/mule-cmds.el	2012-02-10 19:35:28 +0000
+++ lisp/international/mule-cmds.el	2012-02-22 00:07:34 +0000
@@ -2949,7 +2949,7 @@ (defun read-char-by-name (prompt)
                        '(metadata (category . unicode-name))
                      (complete-with-action action (ucs-names) string pred))))))
     (cond
-     ((string-match-p "^[0-9a-fA-F]+$" input)
+     ((string-match-p "\\`[0-9a-fA-F]+\\'" input)
       (string-to-number input 16))
      ((string-match-p "^#" input)
       (read input))
@@ -2967,6 +2967,10 @@ (defun ucs-insert (character &optional c
 the characters whose names include that substring, not necessarily
 at the beginning of the name.
 
+This function also accepts a hexadecimal number of Unicode code
+point or a number in hash notation, e.g. #o21430 for octal,
+#x2318 for hex, or #10r8984 for decimal.
+
 The optional third arg INHERIT (non-nil when called interactively),
 says to inherit text properties from adjoining text, if those
 properties are sticky."
@@ -2975,9 +2979,12 @@ (defun ucs-insert (character &optional c
 	 (prefix-numeric-value current-prefix-arg)
 	 t))
   (unless count (setq count 1))
-  (if (stringp character)
+  (if (and (stringp character)
+	   (string-match-p "\\`[0-9a-fA-F]+\\'" character))
       (setq character (string-to-number character 16)))
   (cond
+   ((null character)
+    (error "Not a Unicode character"))
    ((not (integerp character))
     (error "Not a Unicode character code: %S" character))
    ((or (< character 0) (> character #x10FFFF))
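
For illustration, the argument checks in the patched `ucs-insert' can
be approximated by a standalone function. This is only a sketch:
`my-ucs-normalize' is a hypothetical name, and the out-of-range
message is an assumption, since that branch is cut off in the hunk
above.

```elisp
;; Hypothetical helper mirroring the checks in the patch; it is not
;; the actual `ucs-insert' definition.
(defun my-ucs-normalize (character)
  "Return CHARACTER as an integer code point, or signal an error."
  ;; Convert only strings made up entirely of hex digits.
  (when (and (stringp character)
             (string-match-p "\\`[0-9a-fA-F]+\\'" character))
    (setq character (string-to-number character 16)))
  (cond
   ((null character)
    (error "Not a Unicode character"))
   ((not (integerp character))
    (error "Not a Unicode character code: %S" character))
   ((or (< character 0) (> character #x10FFFF))
    ;; Assumed message; the real one is not shown in the hunk.
    (error "Code point out of range: %S" character))
   (t character)))

(my-ucs-normalize "2318")   ; => 8984
(my-ucs-normalize #x2318)   ; => 8984
```

Under these checks, both 'zzz and "zzz" reach the "Not a Unicode
character code" branch, and nil gets the shorter message.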







* bug#10857: ucs-insert deals inconsistently with errors
  2012-02-22  0:09     ` Juri Linkov
@ 2012-02-22  9:03       ` Andreas Schwab
  2012-02-22 23:35         ` Juri Linkov
  0 siblings, 1 reply; 8+ messages in thread
From: Andreas Schwab @ 2012-02-22  9:03 UTC
  To: Juri Linkov; +Cc: Juanma Barranquero, 10857

Juri Linkov <juri@jurta.org> writes:

> This should be fixed by this patch:
>
> === modified file 'lisp/international/mule-cmds.el'
> --- lisp/international/mule-cmds.el	2012-02-10 19:35:28 +0000
> +++ lisp/international/mule-cmds.el	2012-02-22 00:07:34 +0000
> @@ -2949,7 +2949,7 @@ (defun read-char-by-name (prompt)
>                         '(metadata (category . unicode-name))
>                       (complete-with-action action (ucs-names) string pred))))))
>      (cond
> -     ((string-match-p "^[0-9a-fA-F]+$" input)
> +     ((string-match-p "\\`[0-9a-fA-F]+\\'" input)
>        (string-to-number input 16))
>       ((string-match-p "^#" input)

This should also use \`.
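
A short sketch of the hazard, assuming a multi-line input such as
"zzz\n#x2318". `my-hash-checks' is a hypothetical name used only for
this illustration.

```elisp
;; Hypothetical helper (not in Emacs): compare the two anchors and show
;; what `read' returns for the same input.
(defun my-hash-checks (input)
  "Return the results of both \"#\" tests on INPUT and of reading it."
  (list (string-match-p "^#" input)    ; matches at any line start
        (string-match-p "\\`#" input)  ; matches only at string start
        (read input)))                 ; reads the first object only

(my-hash-checks "zzz\n#x2318")  ; => (4 nil zzz)
(my-hash-checks "#x2318")       ; => (0 0 8984)
```

With "^#" the multi-line input passes the test, yet `read' returns the
symbol zzz rather than a number.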

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."






* bug#10857: ucs-insert deals inconsistently with errors
  2012-02-22  9:03       ` Andreas Schwab
@ 2012-02-22 23:35         ` Juri Linkov
  0 siblings, 0 replies; 8+ messages in thread
From: Juri Linkov @ 2012-02-22 23:35 UTC
  To: Andreas Schwab; +Cc: Juanma Barranquero, 10857-done

>>       ((string-match-p "^#" input)
>
> This should also use \`.

All right, installed.




