From: "Mattias Engdegård" <mattiase@acm.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 40407@debbugs.gnu.org
Subject: bug#40407: [PATCH] slow ENCODE_FILE and DECODE_FILE
Date: Sat, 4 Apr 2020 18:41:39 +0200
Message-ID: <729DE2D1-EA0F-46F9-8B4B-2ED146CE6892@acm.org>
In-Reply-To: <83mu7rvbyk.fsf@gnu.org>
[-- Attachment #1: Type: text/plain, Size: 1158 bytes --]
On 4 Apr 2020, at 11:26, Eli Zaretskii <eliz@gnu.org> wrote:
> DECODE_FILE is called because the file name in question starts with a
> "~"? Otherwise, I don't think I understand why would expand-file-name
> need to decode a file name.
Maybe it's because default-directory started with a tilde. It doesn't really matter: it's a common case, as the profiler shows.
> IME, the cases where we can safely assume it's OK to return the same
> string are actually very rare. It is no accident that you saw so few
> calls of these functions where we use that optional behavior.
This does not mean that the remaining 179 calls require a copy; they just use the default value of the parameter.
> Neither, IMO. Again, it's a separate problem, and let's keep our
> sights squarely on the original issue you wanted to fix. Let's tackle
> the NOCOPY issue in a separate discussion, OK?
Thank you, a separate bug for it is fine.
Here is a revised patch which takes the nocopy parameter into account (in its inverted sense). Obviously it needs to be adapted if the nocopy inversion is dealt with first; the two bugs do not commute.
[-- Attachment #2: 0001-Avoid-expensive-recoding-for-ASCII-identity-cases-bu.patch --]
[-- Type: application/octet-stream, Size: 2146 bytes --]
From 0c6139ab490733f3c1257665535fc4ed2ad0dbe7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?= <mattiase@acm.org>
Date: Fri, 3 Apr 2020 16:01:01 +0200
Subject: [PATCH] Avoid expensive recoding for ASCII identity cases (bug#40407)
Optimise for the common case of encoding or decoding an ASCII-only
string using an ASCII-compatible coding, for file names in particular.
* src/coding.c (string_ascii_p): New function.
(code_convert_string): Return the input string for ASCII-only inputs
and ASCII-compatible codings.
---
src/coding.c | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/src/coding.c b/src/coding.c
index 0bea2a0c2b..0fdbc95939 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -9471,6 +9471,17 @@ used (which may be different from CODING-SYSTEM if CODING-SYSTEM is
   return code_convert_region (start, end, coding_system, destination, 1, 0);
 }
+/* Whether a (unibyte) string only contains chars in the 0..127 range.  */
+static bool
+string_ascii_p (Lisp_Object str)
+{
+  ptrdiff_t nbytes = SBYTES (str);
+  for (ptrdiff_t i = 0; i < nbytes; i++)
+    if (SREF (str, i) > 127)
+      return false;
+  return true;
+}
+
Lisp_Object
code_convert_string (Lisp_Object string, Lisp_Object coding_system,
Lisp_Object dst_object, bool encodep, bool nocopy,
@@ -9502,7 +9513,17 @@ code_convert_string (Lisp_Object string, Lisp_Object coding_system,
   chars = SCHARS (string);
   bytes = SBYTES (string);
-  if (BUFFERP (dst_object))
+  if (EQ (dst_object, Qt))
+    {
+      /* Fast path for ASCII-only input and an ASCII-compatible coding:
+         act as identity.  */
+      Lisp_Object attrs = CODING_ID_ATTRS (coding.id);
+      if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs))
+          && (STRING_MULTIBYTE (string)
+              ? (chars == bytes) : string_ascii_p (string)))
+        return nocopy ? Fcopy_sequence (string) : string;
+    }
+  else if (BUFFERP (dst_object))
     {
       struct buffer *buf = XBUFFER (dst_object);
       ptrdiff_t buf_pt = BUF_PT (buf);
--
2.21.1 (Apple Git-122.3)
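For readers without the Emacs sources at hand, the idea behind the fast path in the patch can be sketched in plain C, outside Emacs. This is only an illustration under stated assumptions: struct byte_string, bytes_ascii_p, utf8_char_count and ascii_identity are hypothetical names invented here and are not part of src/coding.c, and the copy flag uses the straightforward sense (true means "make a copy") rather than the inverted nocopy sense the patch works around. The character count relies on the input being valid UTF-8, where only non-continuation bytes start a character, so a character count equal to the byte count implies ASCII-only content.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a byte string; not an Emacs type.  */
struct byte_string
{
  unsigned char *data;
  size_t nbytes;
};

/* True if every byte of S is in the 0..127 range.  */
static bool
bytes_ascii_p (const struct byte_string *s)
{
  assert (s != NULL);
  for (size_t i = 0; i < s->nbytes; i++)
    if (s->data[i] > 127)
      return false;
  return true;
}

/* Number of characters in S, assuming valid UTF-8: every byte that is
   not a continuation byte (10xxxxxx) starts a character.  */
static size_t
utf8_char_count (const struct byte_string *s)
{
  size_t chars = 0;
  for (size_t i = 0; i < s->nbytes; i++)
    if ((s->data[i] & 0xC0) != 0x80)
      chars++;
  return chars;
}

/* Identity fast path.  If the coding is ASCII-compatible and S is
   ASCII-only, return S itself, or a fresh copy when COPY is true.
   Return NULL when the caller must fall back to the full conversion
   (or when allocating the copy failed).  */
static struct byte_string *
ascii_identity (struct byte_string *s, bool ascii_compatible, bool copy)
{
  if (!ascii_compatible || !bytes_ascii_p (s))
    return NULL;
  if (!copy)
    return s;

  struct byte_string *dup = malloc (sizeof *dup);
  if (dup == NULL)
    return NULL;
  dup->data = malloc (s->nbytes + 1);
  if (dup->data == NULL)
    {
      free (dup);
      return NULL;
    }
  if (s->nbytes > 0)
    memcpy (dup->data, s->data, s->nbytes);
  dup->data[s->nbytes] = '\0';
  dup->nbytes = s->nbytes;
  return dup;
}
```

A caller would try ascii_identity first and run the expensive conversion only when it returns NULL; the saving comes from the byte scan being far cheaper than setting up and running a full decoder or encoder.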