unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Re: master 979308b4ca 5/9: org-export-data: Concatenate strings in temporary buffer for performance
       [not found] ` <20220616080945.7DD14C01685@vcs2.savannah.gnu.org>
@ 2022-06-16 12:19   ` Stefan Monnier
  2022-06-16 12:45     ` Lars Ingebrigtsen
                       ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Stefan Monnier @ 2022-06-16 12:19 UTC (permalink / raw)
  To: emacs-devel; +Cc: Ihor Radchenko

Eli Zaretskii [2022-06-16 04:09:44] wrote:
>     org-export-data: Concatenate strings in temporary buffer for performance
>     
>     * lisp/org/ox.el (org-export-data): Use temporary buffer to collect export
>     data instead of `mapconcat'.  Using buffer puts less load on garbage
>     collector.

AFAICT this just replaces `mapconcat` with a `with-temp-buffer + dolist
+ insert`, so if this makes a significant performance difference, it
points to a performance problem in our `mapconcat`, right?


        Stefan




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: master 979308b4ca 5/9: org-export-data: Concatenate strings in temporary buffer for performance
  2022-06-16 12:19   ` master 979308b4ca 5/9: org-export-data: Concatenate strings in temporary buffer for performance Stefan Monnier
@ 2022-06-16 12:45     ` Lars Ingebrigtsen
  2022-06-16 13:33       ` Robert Pluim
  2022-06-16 12:49     ` Ihor Radchenko
  2022-06-16 12:53     ` Eli Zaretskii
  2 siblings, 1 reply; 7+ messages in thread
From: Lars Ingebrigtsen @ 2022-06-16 12:45 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel, Ihor Radchenko

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> AFAICT this just replaces `mapconcat` with a `with-temp-buffer + dolist
> + insert`, so if this makes a significant performance difference, it
> points to a performance problem in our `mapconcat`, right?

It looks like it shouldn't be that difficult to improve the performance
radically in the common case of FUNCTION being #'identity.  If there's
no SEPARATOR we can just pass the arguments more or less directly on to
Fconcat, and if there is a SEPARATOR, we just have to adjust the arg
list.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: master 979308b4ca 5/9: org-export-data: Concatenate strings in temporary buffer for performance
  2022-06-16 12:19   ` master 979308b4ca 5/9: org-export-data: Concatenate strings in temporary buffer for performance Stefan Monnier
  2022-06-16 12:45     ` Lars Ingebrigtsen
@ 2022-06-16 12:49     ` Ihor Radchenko
  2022-06-16 17:43       ` Stefan Monnier
  2022-06-16 12:53     ` Eli Zaretskii
  2 siblings, 1 reply; 7+ messages in thread
From: Ihor Radchenko @ 2022-06-16 12:49 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Eli Zaretskii [2022-06-16 04:09:44] wrote:
>>     org-export-data: Concatenate strings in temporary buffer for performance
>>     
>>     * lisp/org/ox.el (org-export-data): Use temporary buffer to collect export
>>     data instead of `mapconcat'.  Using buffer puts less load on garbage
>>     collector.
>
> AFAICT this just replaces `mapconcat` with a `with-temp-buffer + dolist
> + insert`, so if this makes a significant performance difference, it
> points to a performance problem in our `mapconcat`, right?

I hope that I did not get it wrong. I _believe_ that I did see an
improvement. So, you better check if it makes a difference on your side
if you revert that patch (especially with un-optimized build where the
differences should be more prominent).

AFAIK, there are several caveats with `mapconcat':
1. The way it was used in Org before the patch was
   (mapconcat #function sequence "")
   Here, `mapconcat' will call `concat' to generate the new string with 2N
   number of arguments. Half of the arguments will be "".

2. `concat' is doing ad-hoc save/restore of text properties. IDK if it
   matters in this particular case, but buffers should work more
   efficiently with handling text properties (AFAIK)

3. `concat' tries to consider non-string argument adding some (probably
   small) extra overheads.

Best,
Ihor



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: master 979308b4ca 5/9: org-export-data: Concatenate strings in temporary buffer for performance
  2022-06-16 12:19   ` master 979308b4ca 5/9: org-export-data: Concatenate strings in temporary buffer for performance Stefan Monnier
  2022-06-16 12:45     ` Lars Ingebrigtsen
  2022-06-16 12:49     ` Ihor Radchenko
@ 2022-06-16 12:53     ` Eli Zaretskii
  2 siblings, 0 replies; 7+ messages in thread
From: Eli Zaretskii @ 2022-06-16 12:53 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel, yantar92

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Ihor Radchenko <yantar92@gmail.com>
> Date: Thu, 16 Jun 2022 08:19:33 -0400
> 
> Eli Zaretskii [2022-06-16 04:09:44] wrote:
> >     org-export-data: Concatenate strings in temporary buffer for performance
> >     
> >     * lisp/org/ox.el (org-export-data): Use temporary buffer to collect export
> >     data instead of `mapconcat'.  Using buffer puts less load on garbage
> >     collector.
> 
> AFAICT this just replaces `mapconcat` with a `with-temp-buffer + dolist
> + insert`, so if this makes a significant performance difference, it
> points to a performance problem in our `mapconcat`, right?

I didn't test each of the changes separately, so I don't know.



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: master 979308b4ca 5/9: org-export-data: Concatenate strings in temporary buffer for performance
  2022-06-16 12:45     ` Lars Ingebrigtsen
@ 2022-06-16 13:33       ` Robert Pluim
  0 siblings, 0 replies; 7+ messages in thread
From: Robert Pluim @ 2022-06-16 13:33 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Stefan Monnier, emacs-devel, Ihor Radchenko

>>>>> On Thu, 16 Jun 2022 14:45:26 +0200, Lars Ingebrigtsen <larsi@gnus.org> said:

    Lars> Stefan Monnier <monnier@iro.umontreal.ca> writes:
    >> AFAICT this just replaces `mapconcat` with a `with-temp-buffer + dolist
    >> + insert`, so if this makes a significant performance difference, it
    >> points to a performance problem in our `mapconcat`, right?

    Lars> It looks like it shouldn't be that difficult to improve the performance
    Lars> radically in the common case of FUNCTION being #'identity.  If there's
    Lars> no SEPARATOR we can just pass the arguments more or less directly on to
    Lars> Fconcat, and if there is a SEPARATOR, we just have to adjust the arg
    Lars> list.

I did some profiling of concat and similar about a year ago. One of
the things that got me some improvement was adjusting the type checks
so that the most common case came first. Of course thatʼs highly
dependent on which benchmarks you choose :-)

Robert
-- 



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: master 979308b4ca 5/9: org-export-data: Concatenate strings in temporary buffer for performance
  2022-06-16 12:49     ` Ihor Radchenko
@ 2022-06-16 17:43       ` Stefan Monnier
  2022-06-17 11:35         ` Ihor Radchenko
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Monnier @ 2022-06-16 17:43 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-devel

Ihor Radchenko [2022-06-16 20:49:27] wrote:
> I hope that I did not get it wrong.  I _believe_ that I did see an
> improvement. So, you better check if it makes a difference on your side
> if you revert that patch (especially with un-optimized build where the
> differences should be more prominent).

The fact that you think you saw a significant difference is already
a good hint that there might be something there, in any case.
But of course, we need to look more closely to see not just "if" but
"how" it is faster.

> AFAIK, there are several caveats with `mapconcat':
> 1. The way it was used in Org before the patch was
>    (mapconcat #function sequence "")
>    Here, `mapconcat' will call `concat' to generate the new string with 2N
>    number of arguments. Half of the arguments will be "".

That's true but we're talking about a single vector, which is allocated
via SAFE_ALLOCA, so it shouldn't put any significant extra pressure on
the GC, and compared to the time taken to do the "map" part of
`mapconcat`, it should be negligible.

This said, it's trivial to eliminate this cost and is probably
a good optimization.  See patch below.

> 2. `concat' is doing ad-hoc save/restore of text properties. IDK if it
>    matters in this particular case, but buffers should work more
>    efficiently with handling text properties (AFAIK)

Indeed, I suspect if there's a performance difference it is likely
coming from that area.

> 3. `concat' tries to consider non-string argument adding some (probably
>    small) extra overheads.

This should be dwarfed by the "map" part of `mapconcat`, so I'd be
surprised if it makes any difference.  Also your replacement code uses
`insert` here instead which performs the same kind of dance anyway.

Lars Ingebrigtsen [2022-06-16 14:45:26] wrote:
> It looks like it shouldn't be that difficult to improve the performance
> radically in the common case of FUNCTION being #'identity.  If there's
> no SEPARATOR we can just pass the arguments more or less directly on to
> Fconcat, and if there is a SEPARATOR, we just have to adjust the arg
> list.

Indeed, we could optimize the common case of function being `identity`,
but it won't help in this particular case.


        Stefan


diff --git a/src/fns.c b/src/fns.c
index d81a3bfcac3..c9251a80f53 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -2843,12 +2843,18 @@ DEFUN ("mapconcat", Fmapconcat, Smapconcat, 2, 3, 0,
   SAFE_ALLOCA_LISP (args, args_alloc);
   ptrdiff_t nmapped = mapcar1 (leni, args, function, sequence);
   ptrdiff_t nargs = 2 * nmapped - 1;
+  eassert (nmapped == leni);
 
-  for (ptrdiff_t i = nmapped - 1; i > 0; i--)
-    args[i + i] = args[i];
+  if (!NILP (Fequal (separator, empty_multibyte_string)))
+    nargs = nmapped;
+  else
+    {
+      for (ptrdiff_t i = nmapped - 1; i > 0; i--)
+        args[i + i] = args[i];
 
-  for (ptrdiff_t i = 1; i < nargs; i += 2)
-    args[i] = separator;
+      for (ptrdiff_t i = 1; i < nargs; i += 2)
+        args[i] = separator;
+    }
 
   Lisp_Object ret = Fconcat (nargs, args);
   SAFE_FREE ();




^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: master 979308b4ca 5/9: org-export-data: Concatenate strings in temporary buffer for performance
  2022-06-16 17:43       ` Stefan Monnier
@ 2022-06-17 11:35         ` Ihor Radchenko
  0 siblings, 0 replies; 7+ messages in thread
From: Ihor Radchenko @ 2022-06-17 11:35 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Ihor Radchenko [2022-06-16 20:49:27] wrote:
>> I hope that I did not get it wrong.  I _believe_ that I did see an
>> improvement. So, you better check if it makes a difference on your side
>> if you revert that patch (especially with un-optimized build where the
>> differences should be more prominent).
>
> The fact that you think you saw a significant difference is already
> a good hint that there might be something there, in any case.
> But of course, we need to look more closely to see not just "if" but
> "how" it is faster.

I just tested the performance difference before/after reverting this
particular commit. There is no measurable difference on my system.
So, what I saw was probably a result of multiple commits I cramped
together during testing.

In any case, optimizing mapconcat should not heart.

Best,
Ihor



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-06-17 11:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <165536698076.5328.17316430648307086249@vcs2.savannah.gnu.org>
     [not found] ` <20220616080945.7DD14C01685@vcs2.savannah.gnu.org>
2022-06-16 12:19   ` master 979308b4ca 5/9: org-export-data: Concatenate strings in temporary buffer for performance Stefan Monnier
2022-06-16 12:45     ` Lars Ingebrigtsen
2022-06-16 13:33       ` Robert Pluim
2022-06-16 12:49     ` Ihor Radchenko
2022-06-16 17:43       ` Stefan Monnier
2022-06-17 11:35         ` Ihor Radchenko
2022-06-16 12:53     ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).