From: Eli Zaretskii <eliz@gnu.org>
To: Stefan Monnier <monnier@IRO.UMontreal.CA>
Cc: a.s@realize.ch, toke@toke.dk, dgutov@yandex.ru, emacs-devel@gnu.org
Subject: Re: distinguishing multibyte/unibyte ASCII (was: [PATCH] url: Wrap cookie headers in url-http--encode-string.)
Date: Sat, 10 Sep 2016 08:50:42 +0300
Message-ID: <83y4308frh.fsf@gnu.org>
In-Reply-To: <jwvd1kc7t4v.fsf-monnier+Inbox@gnu.org> (message from Stefan Monnier on Fri, 09 Sep 2016 16:01:57 -0400)
> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: Alain Schneble <a.s@realize.ch>, toke@toke.dk, emacs-devel@gnu.org,
> dgutov@yandex.ru
> Date: Fri, 09 Sep 2016 16:01:57 -0400
>
> At some point I tried to change this handling (not exactly fix it) by
> treating multibyte ASCII strings specially (it's easy to recognize by
> checking that the char length is equal to the byte length and both are
> readily available in the "struct Lisp_String" object). Then when we
> read an ASCII string, instead of making it unibyte, I'd keep it as
> multibyte. And then change things like "concat" so that those "ASCII
> multibyte" strings don't force the result to be multibyte.
>
> My local Emacs still runs with those changes, but in the end I don't
> think the result is really better (or sufficiently better to justify
> the subtle incompatibilities it introduces).
>
> [ Also, I wouldn't be surprised to hear that such a change causes real
> problems with utf-7 or EBCDIC, or other systems where decoding/encoding
> a string of bytes/chars all <127 is not a no-op. ]

We could change concat (and other relevant functions, like format) to
recognize ASCII strings safely and reliably, but I think that would
slow those functions down, and they are used a lot in inner loops.

This issue is actually rather marginal in Emacs: the url-http use case
is one of very few that care, because it needs to report the length in
bytes.
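
For illustration only, the check described in the quoted text, and one
plausible source of extra cost, might look roughly like the C sketch
below. The struct and function names (toy_string, toy_multibyte_ascii_p,
toy_unibyte_ascii_p) are hypothetical and deliberately simplified; this
is not Emacs's actual "struct Lisp_String" or its concat code. The
sketch assumes a UTF-8-like internal encoding in which every non-ASCII
character takes at least two bytes, and a negative byte length marking
a unibyte string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a string object; the real
   "struct Lisp_String" has more fields and a different layout.  */
struct toy_string {
  ptrdiff_t size;             /* length in characters */
  ptrdiff_t size_byte;        /* length in bytes; negative if unibyte */
  const unsigned char *data;  /* the string's bytes */
};

/* O(1): in a multibyte string, character count == byte count implies
   pure ASCII, because (by assumption) every non-ASCII character
   occupies at least two bytes.  */
static bool
toy_multibyte_ascii_p (const struct toy_string *s)
{
  assert (s != NULL);
  return s->size_byte >= 0 && s->size == s->size_byte;
}

/* O(n): a unibyte string offers no such shortcut, so every byte has
   to be inspected.  Returns false for multibyte strings.  */
static bool
toy_unibyte_ascii_p (const struct toy_string *s)
{
  assert (s != NULL);
  if (s->size_byte >= 0)
    return false;
  for (ptrdiff_t i = 0; i < s->size; i++)
    if (s->data[i] >= 0x80)
      return false;
  return true;
}
```

Under these assumptions, a multibyte "abc" (3 characters, 3 bytes)
passes the constant-time test, while a multibyte "é" (1 character,
2 bytes) does not; a unibyte string has to be scanned byte by byte
before it can be called ASCII.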