all messages for Emacs-related lists mirrored at yhetil.org
From: Tassilo Horn <tsdh@gnu.org>
To: ljell <laszlomail@protonmail.com>
Cc: "43016@debbugs.gnu.org" <43016@debbugs.gnu.org>,
	Paul Eggert <eggert@cs.ucla.edu>
Subject: bug#43016: replace-region-contents takes a lot of time when called from json-pretty-print-buffer
Date: Mon, 24 Aug 2020 19:14:50 +0200
Message-ID: <875z98c6et.fsf@gnu.org>
In-Reply-To: <g0fj8DSywCYTI9O75o22EM_zKPrP65v-r1KqPhpunAix_v4TiLffKZXcRQSUvuKkjDZoVf5Mvi-iBSAqrNz2wkrp0Yd74iE2mbhyAxa4o1Y=@protonmail.com> (ljell's message of "Mon, 24 Aug 2020 12:13:15 +0000")

ljell <laszlomail@protonmail.com> writes:

Hi all,

>> Thank you for your report and the data. Is it possible to have the
>> file with which you've seen this problem?
>
> Looks like the problem occurs only with accented characters present,
> so I created a json file with them. Attached.

I can easily reproduce the problem using this file.

Back when I introduced that feature, i.e., that JSON pretty printing
uses replace-region-contents (which in turn uses
replace-buffer-contents), I added a MAX-SECS parameter to those
functions.  It is supposed to restrict the runtime of the compareseq
call to that many seconds; if the comparison takes longer, it gives up
and the replacement is just a delete + insert (which doesn't have the
benefit of retaining point and marks but is fast).  That's controllable
via the variable json-pretty-print-max-secs, whose default is 2.0
seconds.

In 975893b2290 Paul (in Cc) improved that change so that the context
struct given to compareseq doesn't store MAX-SECS but a timespec
time_limit that is computed beforehand and then used in the
compareseq_early_abort tests, saving conversions there.  (The change
looks good to me with my very limited C knowledge.)
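
To illustrate the idea of converting the budget to an absolute deadline
once and only comparing timestamps afterwards, here is a minimal sketch.
It is not the code from editfns.c: all sketch_* names are hypothetical,
and the current time is passed in as an argument so the example stays
deterministic.  Only the "negative tv_nsec means no limit" convention
and the test counter are taken from the patch attached to this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the context handed to compareseq.  */
struct sketch_context
{
  struct timespec time_limit;   /* tv_nsec < 0 means "no limit" */
  unsigned early_abort_tests;   /* counter, as in the debugging patch */
};

/* Convert a budget in seconds into an absolute deadline, once.  */
static struct timespec
sketch_deadline (struct timespec now, double max_secs)
{
  assert (max_secs >= 0);
  time_t whole = (time_t) max_secs;
  long nsec = (long) ((max_secs - (double) whole) * 1e9);
  struct timespec t = { .tv_sec = now.tv_sec + whole,
                        .tv_nsec = now.tv_nsec + nsec };
  if (t.tv_nsec >= 1000000000L)
    {
      /* Carry the overflowing nanoseconds into the seconds field.  */
      t.tv_sec++;
      t.tv_nsec -= 1000000000L;
    }
  return t;
}

/* Negative, zero, or positive as A is before, equal to, or after B.  */
static int
sketch_timespec_cmp (struct timespec a, struct timespec b)
{
  if (a.tv_sec != b.tv_sec)
    return a.tv_sec < b.tv_sec ? -1 : 1;
  return (a.tv_nsec > b.tv_nsec) - (a.tv_nsec < b.tv_nsec);
}

/* The per-test check: a plain comparison, no seconds-to-timespec
   conversion on this path.  */
static bool
sketch_early_abort (struct sketch_context *ctx, struct timespec now)
{
  ctx->early_abort_tests++;
  if (ctx->time_limit.tv_nsec < 0)
    return false;
  return sketch_timespec_cmp (ctx->time_limit, now) < 0;
}
```

A caller would compute sketch_deadline once before starting the
comparison and then call sketch_early_abort with the current time at
each test point.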

The actual problem here is that for the specific test file, there are
(only) 321 compareseq_early_abort tests performed and it seems that the
first 320 are executed almost immediately (before MAX-SECS are over),
then no test is performed for minutes, and then a last test is performed
leading to an early_abort of compareseq followed by delete + insert.

This "early_abort if it takes too long" thingy doesn't work if the
compareseq_early_abort tests aren't performed somewhat regularly.  If
there can be minutes between two consecutive tests like here, then this
whole thing doesn't work out.
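
The effect can be shown with a toy model.  This is only a sketch under
simplifying assumptions and has nothing to do with how compareseq is
actually structured: time is a simulated tick counter, the work is a
list of phases with made-up costs, and the deadline is consulted only
between phases.  All sim_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Result of one simulated run.  */
struct sim_result
{
  long abort_tick;   /* tick at which the deadline was noticed, or -1 */
  size_t checks;     /* number of deadline checks performed */
};

/* Run N_PHASES phases of work.  The deadline is only looked at before
   each phase, so a single expensive phase cannot be interrupted.  */
static struct sim_result
sim_run (const long *phase_cost, size_t n_phases, long deadline)
{
  assert (phase_cost != NULL || n_phases == 0);
  struct sim_result r = { .abort_tick = -1, .checks = 0 };
  long now = 0;
  for (size_t i = 0; i < n_phases; i++)
    {
      /* The only place where the deadline is consulted.  */
      r.checks++;
      if (now > deadline)
        {
          r.abort_tick = now;
          return r;
        }
      now += phase_cost[i];   /* uninterruptible work between checks */
    }
  return r;
}
```

With phase costs {1, 1, 1, 500, 1} and a deadline of 10 ticks, the
first four checks pass right away and the abort is only noticed at tick
503, on the fifth check.  With evenly sized phases of 3 ticks each, the
same deadline is noticed at tick 12.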

replace-region-contents and replace-buffer-contents have another
parameter, MAX-COSTS, which json.el sets to 64 (with a FIXME since I had
no clue what a sensible value was but extracted that value from
editfns.c where it was hard-coded before).  I've reduced the value in
the json.el call to 16 for testing and indeed it became faster but still
too slow for the very same reason.  Now there were only 65
compareseq_early_abort tests, but again 64 occurred almost immediately
whereas the last one, aborting the comparison, happened after ~40 seconds.

So basically I'd say the problem is in gnulib's compareseq.  If it can't
be fixed there, I see no other possibility than to stop using
replace-buffer/region-contents in json.el (and wherever it might also be
used).  That would be sad because except for the performance in some
cases, it's very nice. :-(

Maybe Paul has some idea (or knows compareseq better)?

I'm attaching a patch adding some printfs which I used to diagnose the
issue.  (Yes, I cannot operate GDB without guidance...)

[-- Attachment #2: replace_buffer_contents_debugging.patch --]
[-- Type: text/x-patch, Size: 1580 bytes --]

diff --git a/src/editfns.c b/src/editfns.c
index 949f3825a3..2037d1a133 100644
--- a/src/editfns.c
+++ b/src/editfns.c
@@ -2000,6 +2000,7 @@ DEFUN ("replace-buffer-contents", Freplace_buffer_contents,
   else
     CHECK_FIXNUM (max_costs);
 
+  printf("max_secs: %f\n", XFLOAT_DATA (max_secs));
   struct timespec time_limit = make_timespec (0, -1);
   if (!NILP (max_secs))
     {
@@ -2008,7 +2009,11 @@ DEFUN ("replace-buffer-contents", Freplace_buffer_contents,
 			     lisp_time_argument (max_secs)),
 	tmax = make_timespec (TYPE_MAXIMUM (time_t), TIMESPEC_HZ - 1);
       if (timespec_cmp (tlim, tmax) < 0)
-	time_limit = tlim;
+        {
+          time_limit = tlim;
+          printf("time_limit: %lld.%.9ld\n",
+                 (long long)time_limit.tv_sec, time_limit.tv_nsec);
+        }
     }
 
   /* Micro-optimization: Casting to size_t generates much better
@@ -2038,6 +2043,10 @@ DEFUN ("replace-buffer-contents", Freplace_buffer_contents,
      later.  */
   bool early_abort = compareseq (0, size_a, 0, size_b, false, &ctx);
 
+  printf("early_abort: %d\n", early_abort);
+  printf("early_abort_tests: %u\n", ctx.early_abort_tests);
+  printf("size_a: %ld, size_b: %ld\n", size_a, size_b);
+
   if (early_abort)
     {
       del_range (min_a, ZV);
@@ -2186,6 +2195,8 @@ buffer_chars_equal (struct context *ctx,
 static bool
 compareseq_early_abort (struct context *ctx)
 {
+  printf("early_abort_tests\n");
+  ctx->early_abort_tests++;
   if (ctx->time_limit.tv_nsec < 0)
     return false;
   return timespec_cmp (ctx->time_limit, current_timespec ()) < 0;

Bye,
Tassilo
