* string_char_to_byte and string_byte_to_char micro-optimisation
From: Robert Pluim @ 2019-06-14 12:37 UTC (permalink / raw)
To: emacs-devel
Hi,
in <https://nullprogram.com/blog/2019/05/29/> a benchmark is shown:
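(require 'cl-lib)  ; provides `cl-loop'
(defun compare (string-a string-b)
  "Return the first pair of differing elements of STRING-A and STRING-B, or nil."
  (cl-loop for a being the elements of string-a
           for b being the elements of string-b
           unless (eql a b)
           return (cons a b)))
(benchmark-run
  (let ((a (make-string 100000 0))
        (b (make-string 100000 0)))
    ;; Storing a non-ASCII character (256) makes both strings multibyte.
    (setf (aref a (1- (length a))) 256
          (aref b (1- (length b))) 256)
    (compare a b)))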
which runs very slowly because string_char_to_byte and
string_byte_to_char only cache the found values for 1 previous string.
I have a patch which extends this cache to two (count 'em, two!)
previous strings, which fixes this particular benchmark.
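To make the cache behaviour concrete, here is a rough standalone sketch in C. It is not the Emacs source and not the patch: the names `pos_cache`, `char_to_byte` and `compare_steps` are made up for illustration, the strings are assumed to be plain UTF-8, and the scan only ever moves forward from the start or from a cached position. It counts how many characters get re-scanned when two strings are walked in lockstep with one cache slot versus two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SLOTS 2

/* One remembered (string, char position, byte position) triple. */
struct pos_entry {
  const unsigned char *str;   /* NULL while the slot is unused */
  size_t charpos;
  size_t bytepos;
};

struct pos_cache {
  struct pos_entry slot[MAX_SLOTS];
  int nslots;            /* how many slots are in use: 1 or 2 */
  int next;              /* slot to overwrite on the next miss */
  unsigned long steps;   /* characters walked so far, for comparison */
};

/* Length in bytes of the UTF-8 sequence starting with LEAD. */
static size_t utf8_len(unsigned char lead)
{
  if (lead < 0x80) return 1;
  if (lead < 0xE0) return 2;
  if (lead < 0xF0) return 3;
  return 4;
}

/* Convert CHARPOS in S to a byte offset, resuming from a cached
   position for the same string when one exists at or before CHARPOS. */
size_t char_to_byte(struct pos_cache *pc, const unsigned char *s, size_t charpos)
{
  struct pos_entry *e = NULL;
  size_t c = 0, b = 0;

  for (int i = 0; i < pc->nslots; i++)
    if (pc->slot[i].str == s) { e = &pc->slot[i]; break; }

  if (e && e->charpos <= charpos) { c = e->charpos; b = e->bytepos; }
  for (; c < charpos; c++) { b += utf8_len(s[b]); pc->steps++; }

  if (!e) {   /* miss: pick a slot to evict, round-robin */
    e = &pc->slot[pc->next];
    pc->next = (pc->next + 1) % pc->nslots;
  }
  e->str = s; e->charpos = c; e->bytepos = b;
  return b;
}

/* Walk two N-character strings in lockstep, as `compare` does, and
   return how many characters had to be scanned in total. */
unsigned long compare_steps(int nslots, size_t n)
{
  assert(nslots >= 1 && nslots <= MAX_SLOTS && n >= 1);
  struct pos_cache pc = { .nslots = nslots };
  /* n-1 ASCII characters followed by U+0100 (0xC4 0x80), plus NUL. */
  unsigned char *a = malloc(n + 2), *b = malloc(n + 2);
  assert(a && b);
  memset(a, 'a', n - 1);
  memcpy(a + n - 1, "\xC4\x80", 3);
  memcpy(b, a, n + 2);

  for (size_t i = 0; i < n; i++) {
    char_to_byte(&pc, a, i);
    char_to_byte(&pc, b, i);
  }
  free(a);
  free(b);
  return pc.steps;
}
```

In this toy model, `compare_steps(1, n)` grows quadratically with n because each string keeps evicting the other from the single slot, whereas `compare_steps(2, n)` grows linearly.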
What I donʼt have is any intuition on whether such a change actually
makes any difference in real-world Emacs usage. Can anyone suggest any
benchmarks?
Thanks
Robert
* Re: string_char_to_byte and string_byte_to_char micro-optimisation
From: Paul Eggert @ 2019-06-14 16:53 UTC (permalink / raw)
To: emacs-devel
On 6/14/19 5:37 AM, Robert Pluim wrote:
> What I donʼt have is any intuition on whether such a change actually
> makes any difference in real-world Emacs usage. Can anyone suggest any
> benchmarks?
My usual benchmark for this sort of thing is 'make compile-always' in
the lisp directory.
* Re: string_char_to_byte and string_byte_to_char micro-optimisation
From: Eli Zaretskii @ 2019-06-14 19:00 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 14 Jun 2019 09:53:50 -0700
>
> On 6/14/19 5:37 AM, Robert Pluim wrote:
> > What I donʼt have is any intuition on whether such a change actually
> > makes any difference in real-world Emacs usage. Can anyone suggest any
> > benchmarks?
>
> My usual benchmark for this sort of thing is 'make compile-always' in
> the lisp directory.
I don't think that will do for this case. Strings are used rather
rarely in Emacs. We need to find a command that uses strings
extensively, and uses non-ASCII text in strings in particular. Some
JSON processing with non-ASCII strings inside, perhaps?
* Re: string_char_to_byte and string_byte_to_char micro-optimisation
From: Stefan Monnier @ 2019-06-14 20:11 UTC (permalink / raw)
To: emacs-devel
> I don't think that will do for this case. Strings are used rather
> rarely in Emacs. We need to find a command that uses strings
> extensively, and uses non-ASCII text in strings in particular.
... and uses `aref` on it extensively.
Most strings are used via regexp-search in which case the conversion
between charpos and bytepos is generally lost in the noise.
Stefan
* Re: string_char_to_byte and string_byte_to_char micro-optimisation
From: Eli Zaretskii @ 2019-06-15 6:22 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Fri, 14 Jun 2019 16:11:47 -0400
>
> > I don't think that will do for this case. Strings are used rather
> > rarely in Emacs. We need to find a command that uses strings
> > extensively, and uses non-ASCII text in strings in particular.
>
> ... and uses `aref` on it extensively.
Right. And/or 'aset'. Other candidates are 'string-match' and
'replace-match'. All that with non-ASCII strings, of course.
* Re: string_char_to_byte and string_byte_to_char micro-optimisation
From: Stefan Monnier @ 2019-06-15 7:48 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
>> ... and uses `aref` on it extensively.
> Right. And/or 'aset'.
Right, but `aset` is even more rare on multibyte strings.
> Other candidates are 'string-match' and 'replace-match'.
`replace-match` has to copy the string, so charpos<->bytepos conversion
doesn't slow it down significantly (I'd guess it's at most a factor of 2).
`string-match` is only affected by charpos<->bytepos conversion if you use the
`start` argument, and the time to perform the actual regexp search will
usually dwarf the charpos<->bytepos conversion, so I think it can only
be noticeably slowed down by charpos<->bytepos conversion in
"pathological" cases where we `start` in the middle of a longish string
and we immediately find a short match.
In contrast, `aref` never does much more than the charpos<->bytepos
conversion itself.
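A minimal sketch of why that is, assuming plain UTF-8 text (the names `aref_utf8` and `utf8_decode` are hypothetical, not Emacs functions, and no cache is modelled): fetching the character is constant work once the byte offset is known, so the linear scan to find that offset is essentially the whole cost.

```c
#include <stddef.h>

/* Length in bytes of the UTF-8 sequence starting with LEAD. */
static size_t utf8_len(unsigned char lead)
{
  if (lead < 0x80) return 1;
  if (lead < 0xE0) return 2;
  if (lead < 0xF0) return 3;
  return 4;
}

/* Decode the UTF-8 sequence at P, which is assumed to be valid. */
static unsigned long utf8_decode(const unsigned char *p)
{
  size_t len = utf8_len(p[0]);
  unsigned long c;

  if (len == 1)
    return p[0];
  c = p[0] & (0xFF >> (len + 1));        /* payload bits of the lead byte */
  for (size_t i = 1; i < len; i++)
    c = (c << 6) | (p[i] & 0x3F);        /* six bits per continuation byte */
  return c;
}

/* Return the character at index CHARPOS of the UTF-8 string S. */
unsigned long aref_utf8(const unsigned char *s, size_t charpos)
{
  size_t bytepos = 0;

  /* charpos -> bytepos: linear in CHARPOS when nothing is cached. */
  for (size_t c = 0; c < charpos; c++)
    bytepos += utf8_len(s[bytepos]);

  /* The fetch itself touches at most four bytes. */
  return utf8_decode(s + bytepos);
}
```

A regexp search or a string copy, by comparison, already does work proportional to the text it touches, which is why the conversion tends to disappear into the noise there.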
Stefan
* Re: string_char_to_byte and string_byte_to_char micro-optimisation
From: Noam Postavsky @ 2019-06-15 11:11 UTC (permalink / raw)
To: Stefan Monnier; +Cc: Eli Zaretskii, Emacs developers
On Sat, 15 Jun 2019 at 03:49, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> be noticeably slowed down by charpos<->bytepos conversion in
> "pathological" cases where we `start` in the middle of a longish string
> and we immediately find a short match.
Would this include cases where you iterate through string-match
results in a loop, incrementing the `start` argument each time, as in
replace-regexp-in-string? (I guess if its REP argument is a function
which aref's another multibyte string, then it should miss the cache
each time).
* Re: string_char_to_byte and string_byte_to_char micro-optimisation
From: Stefan Monnier @ 2019-06-16 11:17 UTC (permalink / raw)
To: Noam Postavsky; +Cc: Eli Zaretskii, Emacs developers
>> be noticeably slowed down by charpos<->bytepos conversion in
>> "pathological" cases where we `start` in the middle of a longish string
>> and we immediately find a short match.
> Would this include cases where you iterate through string-match
> results in a loop, incrementing the `start` argument each time, as in
> replace-regexp-in-string?
Yes, that's exactly the case I had in mind.
Stefan
* Re: string_char_to_byte and string_byte_to_char micro-optimisation
From: Robert Pluim @ 2019-06-17 9:37 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel
>>>>> On Fri, 14 Jun 2019 09:53:50 -0700, Paul Eggert <eggert@cs.ucla.edu> said:
Paul> On 6/14/19 5:37 AM, Robert Pluim wrote:
>> What I donʼt have is any intuition on whether such a change actually
>> makes any difference in real-world Emacs usage. Can anyone suggest any
>> benchmarks?
Paul> My usual benchmark for this sort of thing is 'make compile-always' in
Paul> the lisp directory.
It doesnʼt make a significant difference, so I donʼt think thereʼs any
point in complicating the code:
With patch, run 1:
    real 4m21.097s
    user 3m39.020s
    sys  0m33.267s
With patch, run 2:
    real 4m13.649s
    user 3m34.102s
    sys  0m31.834s
Without patch, run 1:
    real 4m15.264s
    user 3m34.305s
    sys  0m32.719s
Without patch, run 2:
    real 4m18.266s
    user 3m36.531s
    sys  0m33.315s
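As a rough cross-check of the "real" times above, the following sketch averages the two runs of each configuration and compares the difference of the means with the run-to-run spread. The helper names are invented for illustration, and two runs per configuration is far too small a sample for anything beyond a sanity check.

```c
/* Convert a `time`-style reading of MINUTES and SECONDS to seconds. */
static double secs(int minutes, double seconds)
{
  return 60.0 * minutes + seconds;
}

static double mean2(double x, double y) { return (x + y) / 2.0; }

static double spread(double x, double y) { return x > y ? x - y : y - x; }

/* Mean "real" time with the patch minus mean "real" time without it. */
double patch_delta(void)
{
  double with_patch    = mean2(secs(4, 21.097), secs(4, 13.649));
  double without_patch = mean2(secs(4, 15.264), secs(4, 18.266));
  return with_patch - without_patch;
}

/* Difference between the two runs made with the patch applied. */
double spread_with_patch(void)
{
  return spread(secs(4, 21.097), secs(4, 13.649));
}

/* Difference between the two runs made without the patch. */
double spread_without_patch(void)
{
  return spread(secs(4, 15.264), secs(4, 18.266));
}
```

The means differ by about 0.6 s (roughly 0.2%), while the two patched runs alone differ by about 7.4 s and the two unpatched runs by about 3.0 s, so any effect of the patch on this benchmark is well inside the noise.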