* bug#10857: ucs-insert deals inconsistently with errors
From: Juanma Barranquero @ 2012-02-20 15:53 UTC
To: 10857
Package: emacs
Severity: minor
`ucs-insert' does not deal very consistently with errors.
Two anomalies:
1) M-x ucs-insert <RET> zzz <RET> => "Not a Unicode character code: nil"
Which is caused by `read-char-by-name' not having a way to pass
back what the user really typed. Still, I typed "zzz", not "nil", so
the message is unhelpful.
2) When called from lisp code, it deals differently with erroneous
strings and erroneous non-strings:
(ucs-insert 'zzz) => "Not a Unicode character code: zzz" ;; correct
(ucs-insert "zzz") => any non-hex string is turned into ^@ and
inserted, and no error is produced.
The second problem can be trivially fixed with (not (string-match-p
"[^[:xdigit:]]" character)), though the docstring of `ucs-insert' does
not really say much about the valid forms the CHARACTER arg can take.
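The silent ^@ in case 2 follows from `string-to-number' returning 0 when a string has no leading digits in the requested base. A minimal sketch of the guard suggested above; `my-parse-hex-codepoint' is a hypothetical helper written for illustration, not a function in mule-cmds.el:

```elisp
;; `string-to-number' returns 0 when STRING has no leading digits in
;; the given base, so "zzz" becomes character code 0 (shown as ^@).
(string-to-number "zzz" 16)   ; => 0

;; Hypothetical helper: return the code point for a string made only
;; of hex digits, or nil if STRING contains any other character.
(defun my-parse-hex-codepoint (string)
  (and (stringp string)
       (not (string-match-p "[^[:xdigit:]]" string))
       (string-to-number string 16)))

(my-parse-hex-codepoint "2318")   ; => 8984
(my-parse-hex-codepoint "zzz")    ; => nil
```

Note that this negated form also accepts the empty string (it contains no non-hex character), in which case the result is again 0.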
Juanma
* bug#10857: ucs-insert deals inconsistently with errors
From: Juri Linkov @ 2012-02-21 0:37 UTC
To: Juanma Barranquero; +Cc: 10857
> 1) M-x ucs-insert <RET> zzz <RET> => "Not a Unicode character code: nil"
> Which is caused by `read-char-by-name' not having a way to pass
> back what the user really typed. Still, I typed "zzz", not "nil", so
> the message is unhelpful.
Wouldn't it be too weird for `read-char-by-name' to return "zzz",
when the purpose of this function is to return a character,
not a string the user typed?
> 2) When called from lisp code, it deals differently with erroneous
> strings and erroneous non-strings:
> (ucs-insert 'zzz) => "Not a Unicode character code: zzz" ;; correct
> (ucs-insert "zzz") => any non-hex string is turned into ^@ and
> inserted, and no error is produced.
>
> The second problem can be trivially fixed with
> (not (string-match-p "[^[:xdigit:]]" character)),
In `read-char-by-name', the condition for this purpose is:
(string-match-p "^[0-9a-fA-F]+$" input)
* bug#10857: ucs-insert deals inconsistently with errors
From: Juanma Barranquero @ 2012-02-21 1:25 UTC
To: Juri Linkov; +Cc: 10857
On Tue, Feb 21, 2012 at 01:37, Juri Linkov <juri@jurta.org> wrote:
> Wouldn't it be too weird for `read-char-by-name' to return "zzz",
> when the purpose of this function is to return a character,
> not a string the user typed?
Yes. I don't think `read-char-by-name' should return "zzz", I think
`ucs-insert' should not say the "nil" part. Perhaps just "Not a
Unicode character".
>> The second problem can be trivially fixed with
>> (not (string-match-p "[^[:xdigit:]]" character)),
>
> In `read-char-by-name', the condition for this purpose is:
>
> (string-match-p "^[0-9a-fA-F]+$" input)
They are equivalent, aren't they? But my point was that the docstring
does not say what to expect in CHARACTER.
Juanma
* bug#10857: ucs-insert deals inconsistently with errors
From: Andreas Schwab @ 2012-02-21 9:16 UTC
To: Juanma Barranquero; +Cc: 10857
Juanma Barranquero <lekktu@gmail.com> writes:
> On Tue, Feb 21, 2012 at 01:37, Juri Linkov <juri@jurta.org> wrote:
>
>>> The second problem can be trivially fixed with
>>> (not (string-match-p "[^[:xdigit:]]" character)),
>>
>> In `read-char-by-name', the condition for this purpose is:
>>
>> (string-match-p "^[0-9a-fA-F]+$" input)
>
> They are equivalent, aren't they?
No. The latter ignores anything before or after a newline character, as
long as there is a match on the other side of it. That can be fixed by
using "\\`[0-9a-fA-F]+\\'".
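The difference can be seen with a string that has a non-hex line followed by a hex line. This is only an illustrative sketch; the sample input "zzz\n41" is made up for the example:

```elisp
;; In Emacs regexps, ^ and $ match at the start and end of any line,
;; while \` and \' match only at the start and end of the whole string.
(let ((input "zzz\n41"))
  (list
   ;; Line anchors: matches the "41" after the newline, at index 4.
   (string-match-p "^[0-9a-fA-F]+$" input)
   ;; String anchors: the whole string must be hex digits, so no match.
   (string-match-p "\\`[0-9a-fA-F]+\\'" input)))
;; => (4 nil)
```

With the line-anchored form, such input would pass the check and then be handed to `string-to-number', which stops at the first non-hex character.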
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* bug#10857: ucs-insert deals inconsistently with errors
From: Juanma Barranquero @ 2012-02-21 10:39 UTC
To: Andreas Schwab; +Cc: 10857
On Tue, Feb 21, 2012 at 10:16, Andreas Schwab <schwab@linux-m68k.org> wrote:
> No. The latter ignores anything before or after a newline character, as
> long as there is a match on the other side of it. That can be fixed by
> using "\\`[0-9a-fA-F]+\\'".
I didn't say "identical". That seems like a corner case.
Juanma
* bug#10857: ucs-insert deals inconsistently with errors
From: Juri Linkov @ 2012-02-22 0:09 UTC
To: Juanma Barranquero; +Cc: 10857
tags 10857 patch
thanks
> Yes. I don't think `read-char-by-name' should return "zzz", I think
> `ucs-insert' should not say the "nil" part. Perhaps just "Not a
> Unicode character".
>
>> In `read-char-by-name', the condition for this purpose is:
>>
>> (string-match-p "^[0-9a-fA-F]+$" input)
>
> They are equivalent, aren't they? But my point was that the docstring
> does not say what to expect in CHARACTER.
This should be fixed by this patch:
=== modified file 'lisp/international/mule-cmds.el'
--- lisp/international/mule-cmds.el 2012-02-10 19:35:28 +0000
+++ lisp/international/mule-cmds.el 2012-02-22 00:07:34 +0000
@@ -2949,7 +2949,7 @@ (defun read-char-by-name (prompt)
'(metadata (category . unicode-name))
(complete-with-action action (ucs-names) string pred))))))
(cond
- ((string-match-p "^[0-9a-fA-F]+$" input)
+ ((string-match-p "\\`[0-9a-fA-F]+\\'" input)
(string-to-number input 16))
((string-match-p "^#" input)
(read input))
@@ -2967,6 +2967,10 @@ (defun ucs-insert (character &optional c
the characters whose names include that substring, not necessarily
at the beginning of the name.
+This function also accepts a hexadecimal number of Unicode code
+point or a number in hash notation, e.g. #o21430 for octal,
+#x2318 for hex, or #10r8984 for decimal.
+
The optional third arg INHERIT (non-nil when called interactively),
says to inherit text properties from adjoining text, if those
properties are sticky."
@@ -2975,9 +2979,12 @@ (defun ucs-insert (character &optional c
(prefix-numeric-value current-prefix-arg)
t))
(unless count (setq count 1))
- (if (stringp character)
+ (if (and (stringp character)
+ (string-match-p "\\`[0-9a-fA-F]+\\'" character))
(setq character (string-to-number character 16)))
(cond
+ ((null character)
+ (error "Not a Unicode character"))
((not (integerp character))
(error "Not a Unicode character code: %S" character))
((or (< character 0) (> character #x10FFFF))
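The three hash-notation examples in the new docstring denote the same code point as the plain hex form. A small sketch that checks this with the Lisp reader, mirroring the `(read input)' and `(string-to-number input 16)' branches visible in the patch context:

```elisp
;; Hash notation is parsed by the Lisp reader: #o is octal, #x is hex,
;; and #10r gives an explicit radix of 10.
(mapcar #'read '("#o21430" "#x2318" "#10r8984"))
;; => (8984 8984 8984)

;; A bare hex string is converted with `string-to-number' instead.
(string-to-number "2318" 16)
;; => 8984
```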
* bug#10857: ucs-insert deals inconsistently with errors
From: Andreas Schwab @ 2012-02-22 9:03 UTC
To: Juri Linkov; +Cc: Juanma Barranquero, 10857
Juri Linkov <juri@jurta.org> writes:
> This should be fixed by this patch:
>
> === modified file 'lisp/international/mule-cmds.el'
> --- lisp/international/mule-cmds.el 2012-02-10 19:35:28 +0000
> +++ lisp/international/mule-cmds.el 2012-02-22 00:07:34 +0000
> @@ -2949,7 +2949,7 @@ (defun read-char-by-name (prompt)
> '(metadata (category . unicode-name))
> (complete-with-action action (ucs-names) string pred))))))
> (cond
> - ((string-match-p "^[0-9a-fA-F]+$" input)
> + ((string-match-p "\\`[0-9a-fA-F]+\\'" input)
> (string-to-number input 16))
> ((string-match-p "^#" input)
This should also use \`.
Andreas.
* bug#10857: ucs-insert deals inconsistently with errors
From: Juri Linkov @ 2012-02-22 23:35 UTC
To: Andreas Schwab; +Cc: Juanma Barranquero, 10857-done
>> ((string-match-p "^#" input)
>
> This should also use \`.
All right, installed.