From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Eli Zaretskii <eliz@gnu.org>
Cc: emacs-devel@gnu.org
Subject: Re: emacs-26 8f18d12: Improve documentation of decoding into a unibyte buffer
Date: Tue, 28 May 2019 13:43:47 -0400
Message-ID: <jwv7eaav6d3.fsf-monnier+emacs@gnu.org>
In-Reply-To: <83woiazjyo.fsf@gnu.org> (Eli Zaretskii's message of "Tue, 28 May 2019 18:18:07 +0300")
> "Use the source, Luke!"
But the dark side is so enticing!
>   (let* ((str1 (string-as-multibyte (string char)))
>          (str2 (string-as-multibyte (string char char)))
Why on earth do we call string-as-multibyte here? AFAIK, the only case
where `string` returns a unibyte string is when char <128 (it could make
sense to also do that for char ≥128 and <160, but we don't seem to do
that currently) and these are better turned into multibyte via
string-TO-multibyte (tho here we don't even need that, since the unibyte
string works just as well for what we do) than string-AS-multibyte.
I think this is an error. The patch below seems in order.
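
To make the distinction concrete, here is a minimal sketch (not part of
the patch) of how `string`, string-to-multibyte and string-as-multibyte
treat ASCII and non-ASCII input. The function name
`my-demo-string-kinds` and the sample values are made up for
illustration, and string-as-multibyte is marked obsolete in recent Emacs
releases, so take this as an experiment rather than recommended usage.

```elisp
;; Hypothetical demo helper; the name is an assumption, not Emacs API.
(defun my-demo-string-kinds ()
  "Return a list describing how a few strings are represented."
  (let* ((ascii (string ?a))                 ; every char is < 128
         (latin (string #xE9))               ; é, a char >= 128
         (bytes (unibyte-string #xC3 #xA9))  ; the two UTF-8 bytes of é
         ;; Keeps each byte as its own (raw-byte) character.
         (to (string-to-multibyte bytes))
         ;; Reinterprets the bytes as Emacs's internal representation.
         (as (string-as-multibyte bytes)))
    (list (multibyte-string-p ascii)   ; unibyte for pure ASCII
          (multibyte-string-p latin)   ; multibyte otherwise
          (length to)                  ; still two characters
          (length as))))               ; collapsed into one character

(my-demo-string-kinds)  ; I'd expect (nil t 2 1)
```

For pure-ASCII input the two conversions give the same characters, which
is why the difference is easy to overlook in this function.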
>          (found (find-coding-systems-string str1))
>          enc1 enc2 i1 i2)
>     (if (and (consp found)
>              (eq (car found) 'undecided))
>         str1     <<<<<<<<<<<<<<<<<<<<<<<<<
>
> If we return here, the value is str1, which is a multibyte string, see
> how it was calculated.
I think it's a bug. Largely harmless since it only applies to ASCII
chars for which we conflate the char/byte status, but still, it's a wart.
> I didn't think enough about this to figure out if there can be less
> trivial use cases. If you can describe all the cases where
> find-coding-systems-string will return a list whose 'car' is
> 'undecided', my hat off to you.
AFAIK it only happens for pure-ASCII strings.
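
A quick way to poke at this claim, as a sketch only: the helper name
`my-demo-undecided-p` is hypothetical, and two sample strings obviously
don't prove the general statement.

```elisp
;; Hypothetical helper mirroring the test that the old code performed.
(defun my-demo-undecided-p (str)
  "Return non-nil if `find-coding-systems-string' says `undecided' for STR."
  (let ((found (find-coding-systems-string str)))
    (and (consp found)
         (eq (car found) 'undecided))))

(my-demo-undecided-p "abc")          ; pure ASCII: non-nil
(my-demo-undecided-p (string #xE9))  ; contains é: nil
```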
Stefan
diff --git a/lisp/international/mule-cmds.el b/lisp/international/mule-cmds.el
index 2b0aaca664..391efbedc8 100644
--- a/lisp/international/mule-cmds.el
+++ b/lisp/international/mule-cmds.el
@@ -2926,12 +2926,11 @@ encode-coding-char
 If CODING-SYSTEM can't safely encode CHAR, return nil.
 The 3rd optional argument CHARSET, if non-nil, is a charset preferred
 on encoding."
-  (let* ((str1 (string-as-multibyte (string char)))
-         (str2 (string-as-multibyte (string char char)))
+  (let* ((str1 (string char))
+         (str2 (string char char))
          (found (find-coding-systems-string str1))
          enc1 enc2 i1 i2)
-    (if (and (consp found)
-             (eq (car found) 'undecided))
+    (if (not (multibyte-string-p str1))
         str1
       (when (memq (coding-system-base coding-system) found)
         ;; We must find the encoded string of CHAR. But, just encoding
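
As a rough sanity check of what callers should see from
encode-coding-char, something like the following could be evaluated.
This is only a sketch: `my-demo-encode-results` is a made-up name, the
sample characters are my choice, and I'm deliberately not asserting
whether the ASCII result is unibyte or multibyte, since that is the very
thing the patch changes.

```elisp
;; Hypothetical demo helper, not part of mule-cmds.el.
(defun my-demo-encode-results ()
  "Return sample results of `encode-coding-char'."
  (list
   ;; ASCII char: returned as-is (the early-return branch).
   (encode-coding-char ?a 'utf-8)
   ;; é in UTF-8: the two bytes #xC3 #xA9 as a unibyte string.
   (encode-coding-char #xE9 'utf-8)
   ;; Greek alpha can't be encoded in Latin-1, so this is nil.
   (encode-coding-char #x3B1 'iso-8859-1)))
```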