* Passing buffers to function in elisp
From: Petteri Hintsanen @ 2023-02-21 21:18 UTC
To: help-gnu-emacs

Hello list,

Alan J. Perlis said "A LISP programmer knows the value of everything,
but the cost of nothing."

I'm reading some bytes into a temp buffer, like so:

  (with-temp-buffer
    (set-buffer-multibyte nil)
    (insert-file-contents-literally filename nil 0 64000))

then I pass these bytes to functions for processing, like this

  (func1 (buffer-string))

or sometimes just part of them

  (func2 (substring (buffer-string) 100 200))

Now:

. does this generate garbage?  (I believe it does.)

. if there are many funcalls like that, will there be lots of garbage?
  (I guess there will be.)

. is this bad style?  (I'm afraid it is, hence asking.)

Is it better just to assume in functions that the current buffer is the
data buffer and work on that, instead of passing data as function
arguments?

[Why am I doing it like this?  It is /slightly/ easier to write tests
when functions get their data in their arguments.]

Also: is it a good idea to try to limit the number of temp buffers
(with-temp-buffer expressions)?  Or are they somehow recycled within the
elisp interpreter?

Thanks,
Petteri

* RE: [External] : Passing buffers to function in elisp
From: Drew Adams @ 2023-02-21 23:21 UTC
To: Petteri Hintsanen, help-gnu-emacs@gnu.org

> I'm reading some bytes into a temp buffer, like so:
>
>   (with-temp-buffer
>     (set-buffer-multibyte nil)
>     (insert-file-contents-literally filename nil 0 64000))
>
> then I pass these bytes to functions for processing, like this
>
>   (func1 (buffer-string))
>
> or sometimes just part of them
>
>   (func2 (substring (buffer-string) 100 200))

Why aren't you passing the buffer itself to func1?
Why aren't you passing the buffer itself and the limits 100 and 200 to func2?

What is it that you're really trying to do?

Yes, if you start manipulating strings instead of buffer text you will
pay a performance penalty, in general.

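A minimal sketch of what "passing the buffer itself and the limits" could look like. The function name `my-sum-bytes` and the summing task are hypothetical stand-ins for func2; the point is only that the callee reads bytes with `char-after` and never builds a string copy. Note that buffer positions start at 1 while string indices start at 0, so `(substring s 100 200)` corresponds to buffer positions 101 up to (not including) 201.

```elisp
(require 'cl-lib)

;; Hypothetical helper, not part of EMMS or of the original post.
(defun my-sum-bytes (buffer beg end)
  "Return the sum of the bytes in BUFFER from position BEG up to END.
END is exclusive.  No string copy of the buffer text is made."
  (with-current-buffer buffer
    (let ((sum 0)
          (pos beg))
      (while (< pos end)
        ;; In a unibyte buffer `char-after' returns the raw byte value.
        (setq sum (+ sum (char-after pos)))
        (setq pos (1+ pos)))
      sum)))

;; Usage: positions 2 and 3 hold ?b (98) and ?c (99).
(with-temp-buffer
  (set-buffer-multibyte nil)
  (insert "abcdef")
  (my-sum-bytes (current-buffer) 2 4))   ; => 197
```

Such a function is still easy to test: the test fills a temp buffer with fixed bytes and passes it in, much like it would pass a fixed string.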
* Re: [External] : Passing buffers to function in elisp
From: tomas @ 2023-02-22 5:35 UTC
To: help-gnu-emacs

On Tue, Feb 21, 2023 at 11:21:47PM +0000, Drew Adams wrote:
> > I'm reading some bytes into a temp buffer, like so:
> >
> >   (with-temp-buffer
> >     (set-buffer-multibyte nil)
> >     (insert-file-contents-literally filename nil 0 64000))
> >
> > then I pass these bytes to functions for processing, like this
> >
> >   (func1 (buffer-string))
> >
> > or sometimes just part of them
> >
> >   (func2 (substring (buffer-string) 100 200))
>
> Why aren't you passing the buffer itself to func1?
> Why aren't you passing the buffer itself and the limits 100 and 200 to func2?
>
> What is it that you're really trying to do?

That's exactly the point, yes.

> Yes, if you start manipulating strings instead of buffer text you will
> pay a performance penalty, in general.

...the question being whether it's worth it or not.  Sometimes it is,
sometimes it isn't :-)

Cheers
--
t

* Re: [External] : Passing buffers to function in elisp
From: Petteri Hintsanen @ 2023-02-24 20:08 UTC
To: help-gnu-emacs

<tomas@tuxteam.de> writes:

> On Tue, Feb 21, 2023 at 11:21:47PM +0000, Drew Adams wrote:
>> What is it that you're really trying to do?
>
> That's exactly the point, yes.

Specifics, as usual, are somewhat messy.  But I'll try to summarize
below.

I'm working with Ogg audio files, specifically Vorbis and Opus.  I need
to extract certain metadata from such files.  This code is part of EMMS
(see
https://git.savannah.gnu.org/cgit/emms.git/tree/emms-info-native.el if
you're curious, but please note that it's not the same version I'm
working on.)

An Ogg file is basically a sequence of logical "pages", and each page
has zero or more logical "packets".  I need to read and decode the first
two packets and the last page from a given file.  Page size is bounded
by 65307 bytes, while packets can be of any size (they can span multiple
pages).

My code extracts the first two packets by repeatedly reading and
decoding a single page along with its packet data ("payload"), until I
have assembled two complete packets.  These packets contain most of the
metadata I'm interested in.

Each page is read and decoded in a somewhat wasteful manner by reading
65307 bytes worth of data from a certain offset (a page boundary) into a
temporary buffer.

So "func1" in my original posting is actually this:

  (defun emms-info-native--read-and-decode-ogg-page (filename offset)
    (with-temp-buffer
      (set-buffer-multibyte nil)
      (insert-file-contents-literally filename
                                      nil
                                      offset
                                      (+ offset
                                         emms-info-native--ogg-page-size))
      (emms-info-native--decode-ogg-page (buffer-string))))

The function emms-info-native--decode-ogg-page uses bindat to do the
actual decoding, and packs the results into a plist, which is then
returned to the caller.  I'm using a separate function here because it
is easy to test -- just supply fixed byte vectors for it and check that
you get correct results.

The calling code looks like this:

  (defun emms-info-native--decode-ogg-packets (filename packets)
    (let ((num-packets 0)
          (offset 0)
          (stream (vector)))
      (while (< num-packets packets)
        (let ((page (emms-info-native--read-and-decode-ogg-page filename
                                                                offset)))
          (cl-incf num-packets (or (plist-get page :num-packets) 0))
          (cl-incf offset (plist-get page :num-bytes))
          (setq stream (vconcat stream (plist-get page :stream)))))
      stream))

This function calls emms-info-native--read-and-decode-ogg-page in a loop
until the desired number of packets has been extracted.

So by evaluating

  (emms-info-native--decode-ogg-packets filename 2)

I get what I need.

All data is read-only in the sense that it is read from the disk and
then just copied around to alists, plists, vectors and so on.

-----

I added a counter for tracking the number of temp buffers and ran a
benchmark against some 3000+ Ogg files.  This was done on a primed cache
so disk I/O should have had minimal effect.

There were 12538 temp buffers created (= 12538 pages decoded).
Benchmark function output was

  "Elapsed time: 23.806966s (18.743661s in 373 GCs)"

So this means that ~78% of the time was spent on garbage collection?

If so, I think my design sucks.

-----

I am well aware that premature optimization is best avoided.  Also,
"when in doubt, use brute force."  And even the current performance is
good enough.

My problem here is of a more fundamental sort: I don't know what the
right data structures and calling conventions are.  I am still learning
(emacs) lisp, and it shows.

In C or C++ it is "easier": just pass pointers or references and you're
good.  With Lisp and especially Emacs Lisp things are more convoluted --
at least until you learn the necessary idioms.

Thanks,
Petteri

* Re: [External] : Passing buffers to function in elisp
From: tomas @ 2023-02-25 6:40 UTC
To: help-gnu-emacs

On Fri, Feb 24, 2023 at 10:08:11PM +0200, Petteri Hintsanen wrote:
> <tomas@tuxteam.de> writes:
>
> > On Tue, Feb 21, 2023 at 11:21:47PM +0000, Drew Adams wrote:
> >> What is it that you're really trying to do?
> >
> > That's exactly the point, yes.
>
> Specifics, as usual, are somewhat messy.  But I try to summarize below.

[...]

Thanks for this very interesting dive :)

It seems you are so deep in the rabbit hole that my general handwaving
doesn't do justice to it.

I'd suggest calling `garbage-collect' explicitly from some strategic
point in your code; it will tell you what kinds (and how many) of
objects have been collected.  You could then at least have a rough idea
of where to focus your efforts (are the many buffers killing you -- or
rather loads and loads of small cons pairs?  Or those many vectors?)

There are many knobs and variables to "look into" what the garbage
collector is thinking, see "Garbage Collection" and "Memory Usage" in
Appendix E of the Elisp manual (the Web version is here [1], if you
prefer that).

Thanks for hacking :-)

Cheers

[1] https://www.gnu.org/software/emacs/manual/html_node/elisp/GNU-Emacs-Internals.html

 - tomás

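A rough sketch of how one might look at what the collector is doing around a piece of code. The helper name `my-gc-stats` is hypothetical; `gcs-done` and `gc-elapsed` are the built-in counters for the number of collections and the seconds spent in them. The return value of `garbage-collect` itself is an alist of per-type entries such as `(strings SIZE USED FREE)`, which as far as I can tell describes what is still in use rather than what was just freed.

```elisp
;; Hypothetical helper, assuming only the built-in GC counters.
(defun my-gc-stats (fn)
  "Call FN with no arguments and return (GC-COUNT GC-SECONDS).
GC-COUNT is how many collections ran during the call and
GC-SECONDS is the time they took."
  (let ((count-before gcs-done)
        (time-before gc-elapsed))
    (funcall fn)
    (list (- gcs-done count-before)
          (- gc-elapsed time-before))))

;; Usage: allocate and discard some page-sized strings.
(my-gc-stats
 (lambda ()
   (dotimes (_ 200)
     (make-string 65307 0))))

;; Per-type usage after a forced collection, e.g. the strings entry:
(assq 'strings (garbage-collect))
```

Comparing the counts for two variants of the same function gives a first hint about which one produces less garbage, although the numbers depend on `gc-cons-threshold` and on whatever else the session is doing.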
* Re: [External] : Passing buffers to function in elisp
From: Michael Heerdegen @ 2023-02-25 11:23 UTC
To: help-gnu-emacs

Petteri Hintsanen <petterih@iki.fi> writes:

> (defun emms-info-native--read-and-decode-ogg-page (filename offset)
>   (with-temp-buffer
>     (set-buffer-multibyte nil)
>     (insert-file-contents-literally filename
>                                     nil
>                                     offset
>                                     (+ offset
>                                        emms-info-native--ogg-page-size))
>     (emms-info-native--decode-ogg-page (buffer-string))))
> [...]
>
> (defun emms-info-native--decode-ogg-packets (filename packets)
>   (let ((num-packets 0)
>         (offset 0)
>         (stream (vector)))
>     (while (< num-packets packets)
>       (let ((page (emms-info-native--read-and-decode-ogg-page filename
>                                                               offset)))
>         (cl-incf num-packets (or (plist-get page :num-packets) 0))
>         (cl-incf offset (plist-get page :num-bytes))
>         (setq stream (vconcat stream (plist-get page :stream)))))
>     stream))

If `emms-info-native--read-and-decode-ogg-page' is called very often
(hundreds of times or more), it's probably better to use one single
buffer instead of a fresh temp buffer every single time.  Using temp
buffers creates quite a bunch of garbage IME.

Michael.

* Re: [External] : Passing buffers to function in elisp
From: tomas @ 2023-02-25 13:45 UTC
To: help-gnu-emacs

On Sat, Feb 25, 2023 at 12:23:22PM +0100, Michael Heerdegen wrote:

[...]

> If `emms-info-native--read-and-decode-ogg-page' is called very often
> (hundreds of times or more), it's probably better to use one single
> buffer instead of a fresh temp buffer every single time.  Using temp
> buffers creates quite a bunch of garbage IME.

And then do (erase-buffer) then (insert-file-contents-literally)?

Cheers
 - t

* Re: [External] : Passing buffers to function in elisp
From: Michael Heerdegen @ 2023-02-25 18:31 UTC
To: help-gnu-emacs

<tomas@tuxteam.de> writes:

> > If `emms-info-native--read-and-decode-ogg-page' is called very often
> > (hundreds of times or more), it's probably better to use one single
> > buffer instead of a fresh temp buffer every single time.  Using temp
> > buffers creates quite a bunch of garbage IME.
>
> And then do (erase-buffer) then (insert-file-contents-literally)?

Yes.  I had a case in my own personal code where recycling temp buffers
made a big difference wrt garbage.  Not sure if the cases are
comparable, but I would give it a try.

Michael.

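The erase-and-reinsert pattern discussed here could be sketched roughly as follows. The buffer name and the function `my-read-bytes` are hypothetical, not taken from EMMS; the leading space in the name follows the convention for internal buffers, which keeps the buffer out of buffer lists. Whether this actually reduces garbage compared to `with-temp-buffer' would need to be measured for the workload at hand.

```elisp
(require 'cl-lib)

(defvar my-scratch-buffer-name " *my-ogg-scratch*"
  "Hypothetical name of the single buffer reused for all reads.")

(defun my-read-bytes (filename beg end)
  "Return bytes BEG up to END (exclusive) of FILENAME as a unibyte string.
One persistent buffer is reused instead of creating a temp buffer
for every call."
  (with-current-buffer (get-buffer-create my-scratch-buffer-name)
    (set-buffer-multibyte nil)
    ;; Throw away whatever the previous call left behind.
    (erase-buffer)
    (insert-file-contents-literally filename nil beg end)
    (buffer-string)))

;; Usage with a throwaway file containing the ten ASCII digits.
(let ((file (make-temp-file "my-read-bytes")))
  (unwind-protect
      (progn
        (with-temp-file file (insert "0123456789"))
        (my-read-bytes file 2 5))        ; => "234"
    (delete-file file)))
```

The final `buffer-string' still copies the data; a decoder that works directly on the scratch buffer would avoid that copy as well.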
* Re: [External] : Passing buffers to function in elisp
From: tomas @ 2023-02-25 19:05 UTC
To: help-gnu-emacs

On Sat, Feb 25, 2023 at 07:31:01PM +0100, Michael Heerdegen wrote:
> <tomas@tuxteam.de> writes:
>

[reuse buffer]

> > And then do (erase-buffer) then (insert-file-contents-literally)?
>
> Yes.

Thanks :-)

Cheers
 - t

* Re: [External] : Passing buffers to function in elisp
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2023-02-25 23:52 UTC
To: help-gnu-emacs

> If `emms-info-native--read-and-decode-ogg-page' is called very often
> (hundreds of times or more), it's probably better to use one single
> buffer instead of a fresh temp buffer every single time.  Using temp
> buffers creates quite a bunch of garbage IME.

That's definitely something to consider.  Another is whether the ELisp
code was byte-compiled (if not, then all bets are off, the interpreter
itself generates a fair bit of garbage, especially if you use a lot of
macros).

Are you using `bindat-type`?


        Stefan

* Re: [External] : Passing buffers to function in elisp
From: Petteri Hintsanen @ 2023-02-27 20:44 UTC
To: help-gnu-emacs

Stefan Monnier via Users list for the GNU Emacs text editor
<help-gnu-emacs@gnu.org> writes:

>> If `emms-info-native--read-and-decode-ogg-page' is called very often
>> (hundreds of times or more), it's probably better to use one single
>> buffer instead of a fresh temp buffer every single time.

I tried this and, for a moment, I _think_ it shaved off something like
20-25% of the memory usage (according to the memory profiler).  That
would be a big win.

Sadly enough, it was just for a moment, because I cannot replicate it
anymore.  It wasn't a particularly controlled setup, so probably I just
messed up something at some point.

Nonetheless, using a persistent buffer seems to be the right thing to
do, and seeing how many " *foo-bar-baz*" buffers there are, it even
looks like a pattern.

Also, if I interpreted profiler's hieroglyphs correctly, it told me that
this setq

  (setq stream (vconcat stream (plist-get page :stream)))

is a pig -- well, of course it is.  I'm accumulating a byte vector by
copying its parts.  Similarly bindat consumes a lot of memory.

I think I can replace vectors with strings, which should, according to
the elisp manual, "occupy one-fourth the space of a vector of the same
elements."  And I guess that accumulation would be best done with a
buffer, not with strings or vectors.

But bindat internals are beyond me.

> That's definitely something to consider.  Another is whether the ELisp
> code was byte-compiled (if not, then all bets are off, the interpreter
> itself generates a fair bit of garbage, especially if you use a lot of
> macros).

No, it was not byte-compiled.  I don't know how many macros there are.
Just by hand-waving I'd say "not that many".  But again what bindat does
is beyond me.

I'll try byte-compiling after the code is in good enough shape to do
controlled experiments.

> Are you using `bindat-type`?

No, not yet.  I have been thinking about it, not only because the
current implementation is riddled with ugly evals and kludges, but also
because I want to save the kittens ;-D

I also need to discuss with the EMMS maintainer whether using an Emacs
28+ feature is okay.

Thanks to all for insights, I learned a lot.

Petteri

* Re: [External] : Passing buffers to function in elisp
From: tomas @ 2023-02-28 5:37 UTC
To: help-gnu-emacs

On Mon, Feb 27, 2023 at 10:44:43PM +0200, Petteri Hintsanen wrote:

[...]

> Also, if I interpreted profiler's hieroglyphs correctly, it told me that
> this setq
>
>   (setq stream (vconcat stream (plist-get page :stream)))
>
> is a pig -- well, of course it is.  I'm accumulating byte vector by
> copying its parts.  Similarly bindat consumes a lot of memory.
>
> I think I can replace vectors with strings, which should, according to
> the elisp manual, "occupy one-fourth the space of a vector of the same
> elements."  And I guess that accumulation would be best done with a
> buffer, not with strings or vectors.

I must admit I didn't look too closely into your code, but this one
stuck out too.  Not only the copying, but the throwing away of so many
vectors.

I don't know whether it applies in your case, but one "classical" Lispy
pattern when you have to concatenate many things in order is just
consing them (at the front of the list) and nreversing the list at the
end, like so:

  (let ((result '()))
    (while (more)
      (setq result (cons (next) result)))
    (nreverse result))

(nreverse does things "in place", so it reuses the cons pairs: never do
that when someone else is looking ;-)

Another, more functional, approach is of course to arrange things so you
can use map or similar.

Then, at the end you can concatenate the whole list, if need be.
Basically it pays off when the "spine" of the whole thing (i.e. all
those cons pairs you are using) is significantly smaller than whatever
hangs off it.

Cheers
--
t

* Re: [External] : Passing buffers to function in elisp
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2023-03-03 15:19 UTC
To: help-gnu-emacs

> Also, if I interpreted profiler's hieroglyphs correctly, it told me that
> this setq
>
>   (setq stream (vconcat stream (plist-get page :stream)))

This is a typical source of unnecessary O(N²) complexity: the above
line takes O(N) time, so if you do it O(N) times, you get your
N² blowup.  You're usually better off doing

    (push (plist-get page :stream) stream-chunks)

and then at the end get the `stream` with

    (mapconcat #'identity (nreverse stream-chunks) nil)
or
    (apply #'vconcat (nreverse stream-chunks))

Of course that depends on what else happens with `stream` (I haven't
really looked at your code, sorry).

> I think I can replace vectors with strings, which should, according to
> the elisp manual, "occupy one-fourth the space of a vector of the same
> elements."

More likely one-eighth nowadays (64 bit machines).

> Similarly bindat consumes a lot of memory.

Hmm... IIRC it should not use up very much "auxiliary" memory.  IOW
its memory usage should be determined by the amount of data it
returns.  So, when producing the bytestring it should be quite
efficient memorywise.  When reading the bytestring it may be wastefully
allocating memory for all the alists (and also it may be wasteful if you
only need some info because you still need to parse everything and
allocate data to represent its parsed form).

> But bindat internals are beyond me.

I can be of help here :-)

>> That's definitely something to consider.  Another is whether the ELisp
>> code was byte-compiled (if not, then all bets are off, the interpreter
>> itself generates a fair bit of garbage, especially if you use a lot of
>> macros).
> No, it was not byte-compiled.

Then stop right there and fix this problem.  There's absolutely no point
worrying about performance (including memory use) if the code is
not compiled because compilation can change the behavior drastically.

The only reason to run interpreted code nowadays is when you're
Edebugging a piece of code.

> I'll try byte-compiling after the code is in good enough shape to do
> controlled experiments.

The compiler is your friend.  He can help you get the code in good shape :-)


        Stefan

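To make the push-then-reverse suggestion concrete, here is a small sketch. The function name `my-collect-stream` and the literal page plists are made up for illustration; the real pages would come from the decoder. Each `push' is O(1), the list of chunks is reversed once, and a single `vconcat' copies every byte exactly once, so the chunks themselves come out in their original order.

```elisp
(require 'cl-lib)

;; Hypothetical stand-in for the accumulation loop in the thread.
(defun my-collect-stream (pages)
  "Concatenate the :stream vectors of PAGES, a list of plists, in order."
  (let ((stream-chunks '()))
    (dolist (page pages)
      ;; Build the list of chunks backward; nothing is copied here.
      (push (plist-get page :stream) stream-chunks))
    ;; Reverse the list (not the chunks), then copy everything once.
    (apply #'vconcat (nreverse stream-chunks))))

;; Usage:
(my-collect-stream '((:stream [1 2]) (:stream [3]) (:stream [4 5])))
;; => [1 2 3 4 5]
```

If the chunks are strings rather than vectors, replacing `vconcat' with `concat' in the last line should give the same shape of solution.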
* Re: [External] : Passing buffers to function in elisp
From: Petteri Hintsanen @ 2023-03-07 21:48 UTC
To: help-gnu-emacs
Cc: Stefan Monnier

Stefan Monnier via Users list for the GNU Emacs text editor
<help-gnu-emacs@gnu.org> writes:

> This is a typical source of unnecessary O(N²) complexity: the above
> line takes O(N) time, so if you do it O(N) times, you get your
> N² blowup.  You're usually better off doing
>
>     (push (plist-get page :stream) stream-chunks)
>
> and then at the end get the `stream` with
>
>     (mapconcat #'identity (nreverse stream-chunks) nil)
> or
>     (apply #'vconcat (nreverse stream-chunks))

Right, I see.  Stream chunks are in this case byte vectors, so
just reversing those chunks does not do the trick.
But surely I can get from an order of N² to 2N or so.

> Of course that depends on what else happens with `stream` (I haven't
> really looked at your code, sorry).

It's ok, I'm not expecting any reviews here.  All these comments from
you and others have been valuable already.

>> No, it was not byte-compiled.
>
> Then stop right there and fix this problem.  There's absolutely no point
> worrying about performance (including memory use) if the code is
> not compiled because compilation can change the behavior drastically.
>
> The only reason to run interpreted code nowadays is when you're
> Edebugging a piece of code.

Okay, this is something I did not foresee.  But what about eval-defun
and eval-... in general?  They are very convenient when trying out
things.

Should I bind compile-defun to C-M-x then?  And instead of eval-buffer
use byte-compile-file?  Or emacs-lisp-byte-compile-and-load?  The manual
is a bit spotty here; emacs-lisp-byte-compile-... functions are not
mentioned.

>> I'll try byte-compiling after the code is in good enough shape to do
>> controlled experiments.
>
> The compiler is your friend.  He can help you get the code in good shape :-)

I'm afraid that even the compiler cannot help against quadratic
complexity blunders.  But I think I got your point.

Thanks,
Petteri

* Re: [External] : Passing buffers to function in elisp
From: Stefan Monnier @ 2023-03-07 22:45 UTC
To: Petteri Hintsanen
Cc: help-gnu-emacs

>> This is a typical source of unnecessary O(N²) complexity: the above
>> line takes O(N) time, so if you do it O(N) times, you get your
>> N² blowup.  You're usually better off doing
>>
>>     (push (plist-get page :stream) stream-chunks)
>>
>> and then at the end get the `stream` with
>>
>>     (mapconcat #'identity (nreverse stream-chunks) nil)
>> or
>>     (apply #'vconcat (nreverse stream-chunks))
>
> Right, I see.  Stream chunks are in this case byte vectors, so
> just reversing those chunks does not do the trick.
> But surely I can get from an order of N² to 2N or so.

I'm suggesting to build a list of chunks backward and to reverse *the
list*, not the chunks.  So the end result should still be the same.

> Okay, this is something I did not foresee.  But what about eval-defun
> and eval-... in general?  They are very convenient when trying out
> things.

It's OK to use them, of course.  It usually means you still have 98% of
your code compiled.

>> The compiler is your friend.  He can help you get the code in good
>> shape :-)
> I'm afraid that even the compiler cannot help against quadratic
> complexity blunders.

:-)

It's just a friend, yes.


        Stefan

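When mixing evaluated and compiled definitions like this, it can help to check which form a given function is currently in before benchmarking it. A small sketch, with `my-square` as a made-up example function: `byte-compile' called on a symbol compiles that symbol's current definition, and `byte-code-function-p' reports whether a function object is byte-compiled.

```elisp
(require 'cl-lib)

;; Made-up function; evaluating this form (e.g. with eval-defun)
;; leaves an interpreted definition.
(defun my-square (x)
  "Return X multiplied by itself."
  (* x x))

;; Compile just this one definition in place.
(byte-compile 'my-square)

;; Now the symbol's function cell holds a byte-code object.
(byte-code-function-p (symbol-function 'my-square))   ; => t
(my-square 7)                                         ; => 49
```

For whole files, byte-compile-file and loading the resulting .elc is the usual route; the sketch above only covers the single-definition case asked about.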
* Re: [External] : Passing buffers to function in elisp
From: tomas @ 2023-03-08 5:38 UTC
To: help-gnu-emacs

On Tue, Mar 07, 2023 at 05:45:49PM -0500, Stefan Monnier wrote:
> >> This is a typical source of unnecessary O(N²) complexity: the above
> >> line takes O(N) time, so if you do it O(N) times, you get your
> >> N² blowup.  You're usually better off doing
> >>
> >>     (push (plist-get page :stream) stream-chunks)
> >>
> >> and then at the end get the `stream` with
> >>
> >>     (mapconcat #'identity (nreverse stream-chunks) nil)
> >> or
> >>     (apply #'vconcat (nreverse stream-chunks))
> >
> > Right, I see.  Stream chunks are in this case byte vectors, so
> > just reversing those chunks does not do the trick.
> > But surely I can get from an order of N² to 2N or so.
>
> I'm suggesting to build a list of chunks backward and to reverse *the
> list*, not the chunks.  So the end result should still be the same.

Judging by the "2N instead of N^2" I guess Petteri had the right
mental model, though.

> > Okay, this is something I did not foresee.  But what about eval-defun
> > and eval-... in general?  They are very convenient when trying out
> > things.
>
> It's OK to use them, of course.  It usually means you still have 98% of
> your code compiled.
>
> >> The compiler is your friend.  He can help you get the code in good
> >> shape :-)
> > I'm afraid that even the compiler cannot help against quadratic
> > complexity blunders.
>
> :-)
>
> It's just a friend, yes.

:-)

Cheers
--
t

* Re: Passing buffers to function in elisp
From: Petteri Hintsanen @ 2023-09-06 19:05 UTC
To: help-gnu-emacs
Cc: Stefan Monnier

Hello all,

It took some time to do these memory optimizations I asked about a few
months ago.  Here are some remarks.

>> Also, if I interpreted profiler's hieroglyphs correctly, it told me that
>> this setq
>>
>>   (setq stream (vconcat stream (plist-get page :stream)))
>
> This is a typical source of unnecessary O(N²) complexity: the above
> line takes O(N) time, so if you do it O(N) times, you get your
> N² blowup.  You're usually better off doing
>
>     (push (plist-get page :stream) stream-chunks)
>
> and then at the end get the `stream` with
>
>     (mapconcat #'identity (nreverse stream-chunks) nil)
> or
>     (apply #'vconcat (nreverse stream-chunks))

I replaced vconcat with push.  However it did not have a significant
effect (measured with Emacs memory profiler).  Perhaps the chunks were
quite small after all.  In complexity speak, with small N one usually
does not need to worry about quadratics.  But it is no worse either, so
I left it that way.

>> I think I can replace vectors with strings, which should, according to
>> the elisp manual, "occupy one-fourth the space of a vector of the same
>> elements."
>
> More likely one-eighth nowadays (64 bit machines).

Changing vectors to strings did indeed have a significant effect.  It is
also the right thing to do, because, frankly, much of the data *are*
strings.

>> Similarly bindat consumes a lot of memory.
>
> Hmm... IIRC it should not use up very much "auxiliary" memory.  IOW
> its memory usage should be determined by the amount of data it
> returns.  So, when producing the bytestring it should be quite
> efficient memorywise.

This is correct.  Bindat is very conservative.  I probably misread the
profiler report back then and unjustly put part of the blame on bindat.

>>> That's definitely something to consider.  Another is whether the ELisp
>>> code was byte-compiled (if not, then all bets are off, the interpreter
>>> itself generates a fair bit of garbage, especially if you use a lot of
>>> macros).
>> No, it was not byte-compiled.
>
> Then stop right there and fix this problem.  There's absolutely no point
> worrying about performance (including memory use) if the code is
> not compiled because compilation can change the behavior drastically.

This is also absolutely correct.  There is no point in profiling
non-compiled code.  Non-compiled code gives profiles that change wildly
from one run to the next.

>> I'll try byte-compiling after the code is in good enough shape to do
>> controlled experiments.
>
> The compiler is your friend.  He can help you get the code in good shape :-)

Truly he does.  I also have native compilation enabled.  I don't know
how much effect it had.

I also tried to replace with-temp-buffer forms (such forms are called
hundreds of times) with a static buffer for holding temporary data.  It
produced mixed results.  In some limited settings, memory savings were
considerable, but in some other cases it blew up memory usage.  I cannot
explain why that happened.  But it seems safest to stick to
with-temp-buffer.

Nonetheless, the code is now much better.

Thank you all for your insights,
Petteri

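The vector-to-string change mentioned above can be illustrated with a short sketch. The helper name `my-bytes-to-string` is hypothetical; `unibyte-string' is the built-in that builds a string from byte values, and the resulting string can still be indexed with `aref' just like the vector it replaces.

```elisp
(require 'cl-lib)

;; Hypothetical converter from a vector of byte values to a string.
(defun my-bytes-to-string (bytes)
  "Convert BYTES, a vector of integers in 0..255, to a unibyte string."
  (apply #'unibyte-string (append bytes nil)))

;; Usage: the four bytes that spell the Ogg capture pattern.
(let ((s (my-bytes-to-string [79 103 103 83])))
  (list s
        (aref s 0)                 ; bytes are still reachable by index
        (multibyte-string-p s)     ; nil: one byte per element
        (string-bytes s)))
;; => ("OggS" 79 nil 4)
```

Reading the data as a unibyte string in the first place, for example via `buffer-string' in a unibyte buffer, avoids the conversion step altogether.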
* Re: Passing buffers to function in elisp
From: Stefan Monnier @ 2023-09-06 21:12 UTC
To: Petteri Hintsanen
Cc: help-gnu-emacs

>> This is a typical source of unnecessary O(N²) complexity: the above
>> line takes O(N) time, so if you do it O(N) times, you get your
>> N² blowup.  You're usually better off doing
[...]
> I replaced vconcat with push.  However it did not have a significant
> effect (measured with Emacs memory profiler).  Perhaps the chunks were
> quite small after all.

That's usually the case, indeed.

> In complexity speak, with small N one usually
> does not need to worry about quadratics.

But: it's rare to be sure that N will *always* be small :-(

> I also tried to replace with-temp-buffer forms (such forms are called
> hundreds of times) with a static buffer for holding temporary data.  It
> produced mixed results.  In some limited settings, memory savings were
> considerable, but in some others cases it blew up memory usage.  I
> cannot explain why that happened.  But it seems safest to stick to
> with-temp-buffer.

`with-temp-buffer` is fairly costly, but to the extent that it's pretty
much a constant cost it shouldn't [knock on wood] bring surprises in
unexpected circumstances, so if it's fast enough it's a good choice.


        Stefan

* Re: Passing buffers to function in elisp
  2023-02-21 21:18 Passing buffers to function in elisp Petteri Hintsanen
  2023-02-21 23:21 ` [External] : " Drew Adams
@ 2023-02-22 5:30 ` tomas
  2023-02-23 9:34 ` Michael Heerdegen
From: tomas @ 2023-02-22 5:30 UTC
To: help-gnu-emacs

On Tue, Feb 21, 2023 at 11:18:25PM +0200, Petteri Hintsanen wrote:
> Hello list,
>
> Alan J. Perlis said "A LISP programmer knows the value of everything,
> but the cost of nothing."
>
> I'm reading some bytes into a temp buffer, like so:
>
>   (with-temp-buffer
>     (set-buffer-multibyte nil)
>     (insert-file-contents-literally filename nil 0 64000))
>
> then I pass these bytes to functions for processing, like this
>
>   (func1 (buffer-string))
>
> or sometimes just part of them
>
>   (func2 (substring (buffer-string) 100 200))
>
> Now:
>
> . does this generate garbage? (I believe it does.)

It most probably does, but that will depend on the future history
of the buffer and on whatever func1 does with its arg. If both are
needed later, it isn't garbage (yet). They become garbage once they
aren't needed.

> . if there are many funcalls like that, will there be lots of garbage?
>   (I guess there will be.)

See above. Note that the documentation of `buffer-string' hints that
it makes a copy. If you modify the string, the buffer will stay the
same and vice versa. If that is what you want, then go for it :-)

> . is this bad style? (I'm afraid it is, hence asking.)

See above: it depends. If you want func1 to operate on the buffer
content, then you'd better pass it the buffer itself (actually a
reference to the buffer, but that's "details" ;-)

If you'd be surprised that func1 is able to change the buffer,
then better pass it a copy: `buffer-string' seems a good way to
do that.

> Is it better just to assume in functions that the current buffer is the
> data buffer and work on that, instead of passing data as function
> arguments?

That depends on your style and on the "contracts" you make
with yourself (and ultimately, of course, on what you are
trying to do: for each different purpose, some style will
be clearer/more efficient -- ideally both, but life and
things).

> [Why am I doing like this? It is /slightly/ easier to write tests when
> functions get their data in their arguments.]

Then go for it. To accompany your nice Perlis quote above
I offer "Premature optimization is the root of all evil",
which is attributed to Donald Knuth (some say it was Tony
Hoare).

Keep an eye on things and be ready to notice whether it is
creating performance problems.

> Also: is it good idea to try to limit the number temp buffers
> (with-temp-buffer expressions)? Or are they somehow recycled within the
> elisp interpreter?

Once the interpreter (well, it's a hybrid these days. Let's
call it the "run time") can prove they aren't needed, they
will get recycled, yes. If you are curious, just invoke
(garbage-collect) after you have accumulated some. It will
tell you what it found.

Cheers
-- 
t
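The two styles discussed in this message (pass the buffer, or pass a copy) can be sketched as follows. The function names are hypothetical and chosen only for illustration. The second function also shows that `buffer-substring-no-properties' copies just the requested slice, whereas (substring (buffer-string) ...) copies the whole buffer first; note that buffer positions start at 1 while string indices start at 0, so the two calls need different bounds.

```elisp
;; Hypothetical helpers for illustration; not code from the thread.
(defun demo-count-byte (byte start end &optional buffer)
  "Count occurrences of BYTE between positions START and END in BUFFER.
BUFFER defaults to the current buffer.  The text is read in place;
no string copy is made."
  (with-current-buffer (or buffer (current-buffer))
    (let ((count 0)
          (pos start))
      (while (< pos end)
        (when (eq (char-after pos) byte)
          (setq count (1+ count)))
        (setq pos (1+ pos)))
      count)))

(defun demo-slice-two-ways ()
  "Return the same three bytes via a full copy and via a direct slice."
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (insert "abcdefgh")
    (list
     ;; Copies the whole buffer, then copies the slice again.
     (substring (buffer-string) 2 5)
     ;; Copies only the slice.  Positions 3..6 in the buffer
     ;; correspond to indices 2..5 in the string.
     (buffer-substring-no-properties 3 6))))
```

A function written like `demo-count-byte' is still easy to test: the test can fill a temp buffer with known bytes and call the function on it.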
* Re: Passing buffers to function in elisp
  2023-02-22 5:30 ` tomas
@ 2023-02-23 9:34 ` Michael Heerdegen
  2023-02-23 9:51 ` tomas
  2023-02-23 16:19 ` Marcin Borkowski
From: Michael Heerdegen @ 2023-02-23 9:34 UTC
To: help-gnu-emacs

<tomas@tuxteam.de> writes:

> > Is it better just to assume in functions that the current buffer is
> > the data buffer and work on that, instead of passing data as
> > function arguments?
>
> That depends on your style and on the "contracts" you make
> with yourself (and ultimately, of course, on what you are
> trying to do: for each different purpose, some style will
> be clearer/more efficient -- ideally both, but life and
> things).

And there is not only garbage, there is also the aspect of speed: many
operations can be performed in buffers and likewise for strings, but
sometimes operations are a lot faster for strings (modifying a buffer is
a more complicated operation).

Michael.
* Re: Passing buffers to function in elisp
  2023-02-23 9:34 ` Michael Heerdegen
@ 2023-02-23 9:51 ` tomas
  2023-02-23 16:19 ` Marcin Borkowski
From: tomas @ 2023-02-23 9:51 UTC
To: help-gnu-emacs

On Thu, Feb 23, 2023 at 10:34:32AM +0100, Michael Heerdegen wrote:
> <tomas@tuxteam.de> writes:
>
> > > Is it better just to assume in functions that the current buffer is
> > > the data buffer and work on that, instead of passing data as
> > > function arguments?
> >
> > That depends on your style and on the "contracts" you make
> > with yourself (and ultimately, of course, on what you are
> > trying to do: for each different purpose, some style will
> > be clearer/more efficient -- ideally both, but life and
> > things).
>
> And there is not only garbage, there is also the aspect of speed: many
> operations can be performed in buffers and likewise for strings, but
> sometimes operations are a lot faster for strings (modifying a buffer is
> a more complicated operation).

And then, if you have the right garbage collector, creating
some garbage might be faster than modifying things in place
(if some stars align, and you take into account other things
and all that :-)

Cheers
-- 
t
* Re: Passing buffers to function in elisp
  2023-02-23 9:34 ` Michael Heerdegen
  2023-02-23 9:51 ` tomas
@ 2023-02-23 16:19 ` Marcin Borkowski
From: Marcin Borkowski @ 2023-02-23 16:19 UTC
To: Michael Heerdegen; +Cc: help-gnu-emacs

On 2023-02-23, at 10:34, Michael Heerdegen <michael_heerdegen@web.de> wrote:

> <tomas@tuxteam.de> writes:
>
>> > Is it better just to assume in functions that the current buffer is
>> > the data buffer and work on that, instead of passing data as
>> > function arguments?
>>
>> That depends on your style and on the "contracts" you make
>> with yourself (and ultimately, of course, on what you are
>> trying to do: for each different purpose, some style will
>> be clearer/more efficient -- ideally both, but life and
>> things).
>
> And there is not only garbage, there is also the aspect of speed: many
> operations can be performed in buffers and likewise for strings, but
> sometimes operations are a lot faster for strings (modifying a buffer is
> a more complicated operation).

Well, I am fairly sure there are things which are faster for buffers,
too...

A few years ago I did some experimenting with that:
https://mbork.pl/2019-03-25_Using_benchmark_to_measure_speed_of_Elisp_code

As for testing with buffers, you might be interested in the
`elisp-tests-with-temp-buffer' macro I've written a long time ago (see
emacs/test/lisp/emacs-lisp/lisp-tests.el:317).

The bottom line is probably this: do whatever you prefer, and optimize
when it's needed (as Tomas said).

Hth,

-- 
Marcin Borkowski
http://mbork.pl
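Whether strings or buffers are faster for a given task is easiest to settle by measuring. The sketch below uses `benchmark-run' from the benchmark library that ships with Emacs; the two functions being compared are hypothetical examples, not taken from the thread or from the linked article. `benchmark-run' returns a list of the elapsed time in seconds, the number of garbage collections that ran, and the time spent in them.

```elisp
(require 'benchmark)

;; Hypothetical functions to compare; not code from the thread.
(defun demo-build-with-strings (n)
  "Build a string of N characters by repeated `concat'.
Every step copies the string built so far."
  (let ((s ""))
    (dotimes (_ n)
      (setq s (concat s "x")))
    s))

(defun demo-build-with-buffer (n)
  "Build a string of N characters by inserting into a temp buffer."
  (with-temp-buffer
    (dotimes (_ n)
      (insert "x"))
    (buffer-string)))

(defun demo-compare (n repetitions)
  "Time both builders for size N, REPETITIONS times each.
Return a list of two `benchmark-run' results, strings first."
  (list (benchmark-run repetitions (demo-build-with-strings n))
        (benchmark-run repetitions (demo-build-with-buffer n))))
```

For example, (demo-compare 2000 10) times each builder ten times. The numbers depend on the machine and on the Emacs build, and as said earlier in the thread, they are only meaningful once the code is byte-compiled.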