From: Eli Zaretskii <eliz@gnu.org>
To: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
	Kenichi Handa <handa@gnu.org>
Cc: 40407@debbugs.gnu.org, mattiase@acm.org
Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE
Date: Mon, 06 Apr 2020 17:21:34 +0300
Message-ID: <835zecsnip.fsf@gnu.org>
In-Reply-To: <87blo46i1j.fsf@mail.parknet.co.jp> (message from OGAWA Hirofumi on Mon, 06 Apr 2020 19:10:48 +0900)

> From: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
> Cc: Mattias Engdegård <mattiase@acm.org>,
>         40407@debbugs.gnu.org
> Date: Mon, 06 Apr 2020 19:10:48 +0900
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> -  if (BUFFERP (dst_object))
> >> +  if (EQ (dst_object, Qt))
> >> +    {
> >> +      /* Fast path for ASCII-only input and an ASCII-compatible coding:
> >> +         act as identity.  */
> >> +      Lisp_Object attrs = CODING_ID_ATTRS (coding.id);
> >> +      if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs))
> >> +          && (STRING_MULTIBYTE (string)
> >> +              ? (chars == bytes) : string_ascii_p (string)))
> >> +        return string;
> 
> While using the latest master branch, I noticed that this change
> causes a decoding error.
> 
> A simple test that reproduces it:
> 
> 	(decode-coding-string "&abc" 'utf-7-imap)
> 	=> "&abc"
> 
> As the result above shows, decoding with utf-7-imap did not work,
> because (coding-system-get 'utf-7-imap :ascii-compatible-p) => t.

Thanks.

> I'm not sure whether the utf-7* coding systems should be fixed to be
> non-ASCII-compatible, or string_ascii_p() should check more strictly.

The former, since UTF-7 is definitely *not* ASCII-compatible.  Does
the patch below produce good results?
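
To illustrate why an ASCII-only check alone is not enough for UTF-7,
here is a minimal sketch in C.  The function names (ascii_only_p,
fast_path_applies, utf7_may_need_decoding) are hypothetical and are not
taken from coding.c; the sketch only models the idea that the fast
path is safe when the coding system is truly ASCII-compatible, and
that UTF-7 input can be pure ASCII yet still contain a shift character
('+' for utf-7, '&' for utf-7-imap) that starts an encoded sequence.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if every byte of S is in the ASCII range 0x00..0x7F.  */
static bool
ascii_only_p (const char *s)
{
  for (const unsigned char *p = (const unsigned char *) s; *p; p++)
    if (*p >= 0x80)
      return false;
  return true;
}

/* Hypothetical model of the fast path: decoding is skipped when the
   coding system claims ASCII compatibility and the input is pure
   ASCII.  ASCII_COMPAT stands in for the :ascii-compatible-p
   attribute of the coding system.  */
static bool
fast_path_applies (bool ascii_compat, const char *s)
{
  return ascii_compat && ascii_only_p (s);
}

/* In the UTF-7 variants, SHIFT introduces an encoded sequence, so an
   ASCII-only string that contains it cannot be assumed to decode to
   itself.  */
static bool
utf7_may_need_decoding (const char *s, char shift)
{
  return strchr (s, shift) != NULL;
}
```

With the attribute wrongly set, fast_path_applies (true, "&abc") is
true even though utf7_may_need_decoding ("&abc", '&') is also true,
which matches the report above.  Once the attribute is false for the
UTF-7 systems, the fast path no longer applies to them.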

Kenichi, why was the coding-type of the UTF-7 coding systems set to
'utf-8'?  Wouldn't it be better to set it to 'utf-16'?  Or is there
some subtlety here that we should be aware of?  Do you have any
comments on the patch below?

Thanks.

diff --git a/src/coding.c b/src/coding.c
index 97a6eb9..71ff93c 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -11301,7 +11301,10 @@ DEFUN ("define-coding-system-internal", Fdefine_coding_system_internal,
 	  CHECK_CODING_SYSTEM (val);
 	}
       ASET (attrs, coding_attr_utf_bom, bom);
-      if (NILP (bom))
+      if (NILP (bom)
+	  /* UTF-7 has :coding-type set to 'utf-8' (why not
+	     'utf-16'?), but it is definitely NOT ASCII-compatible.  */
+	  && !EQ (name, Qutf_7) && !EQ (name, Qutf_7_imap))
 	ASET (attrs, coding_attr_ascii_compat, Qt);
 
       category = (CONSP (bom) ? coding_category_utf_8_auto
@@ -11673,6 +11676,9 @@ syms_of_coding (void)
   DEFSYM (Qutf_8_unix, "utf-8-unix");
   DEFSYM (Qutf_8_emacs, "utf-8-emacs");
 
+  DEFSYM (Qutf_7, "utf-7");
+  DEFSYM (Qutf_7_imap, "utf-7-imap");
+
 #if defined (WINDOWSNT) || defined (CYGWIN)
   /* No, not utf-16-le: that one has a BOM.  */
   DEFSYM (Qutf_16le, "utf-16le");




