unofficial mirror of emacs-devel@gnu.org 
* Question: estimating size of a buffer when written to file?
@ 2021-12-08 15:33 Qiantan Hong
  2021-12-08 15:52 ` Stephen Berman
  2021-12-08 17:03 ` Stefan Monnier
  0 siblings, 2 replies; 7+ messages in thread
From: Qiantan Hong @ 2021-12-08 15:33 UTC (permalink / raw)
  To: emacs-devel@gnu.org

Is there any reliable way to get (or conservatively estimate) 
the size of a buffer when written to file?

I’m thinking about using it as an optimization for a persistent kv store.
If we know that a log entry, when written out, is shorter than PIPE_BUF,
then we can skip the file locking/unlocking.
Emacs’s file locking doesn’t seem to be particularly fast.

Best,
Qiantan



* Re: Question: estimating size of a buffer when written to file?
  2021-12-08 15:33 Question: estimating size of a buffer when written to file? Qiantan Hong
@ 2021-12-08 15:52 ` Stephen Berman
  2021-12-08 15:55   ` Qiantan Hong
  2021-12-08 17:03 ` Stefan Monnier
  1 sibling, 1 reply; 7+ messages in thread
From: Stephen Berman @ 2021-12-08 15:52 UTC (permalink / raw)
  To: Qiantan Hong; +Cc: emacs-devel@gnu.org

On Wed, 8 Dec 2021 15:33:06 +0000 Qiantan Hong <qhong@mit.edu> wrote:

> Is there any reliable way to get (or conservatively estimate)
> the size of a buffer when written to file?

Is (string-bytes (buffer-string)) not reliable?

Steve Berman




* Re: Question: estimating size of a buffer when written to file?
  2021-12-08 15:52 ` Stephen Berman
@ 2021-12-08 15:55   ` Qiantan Hong
  2021-12-08 16:01     ` Stephen Berman
  0 siblings, 1 reply; 7+ messages in thread
From: Qiantan Hong @ 2021-12-08 15:55 UTC (permalink / raw)
  To: Stephen Berman; +Cc: emacs-devel@gnu.org

> Is (string-bytes (buffer-string)) not reliable?
> 
> Steve Berman
I’m not very familiar with different kinds of encoding,
but would some coding system result in final
writes longer than internal string-bytes?

Best,
Qiantan



* Re: Question: estimating size of a buffer when written to file?
  2021-12-08 15:55   ` Qiantan Hong
@ 2021-12-08 16:01     ` Stephen Berman
  2021-12-08 16:29       ` Daniel Martín
  0 siblings, 1 reply; 7+ messages in thread
From: Stephen Berman @ 2021-12-08 16:01 UTC (permalink / raw)
  To: Qiantan Hong; +Cc: emacs-devel@gnu.org

On Wed, 8 Dec 2021 15:55:42 +0000 Qiantan Hong <qhong@mit.edu> wrote:

>> Is (string-bytes (buffer-string)) not reliable?
>> 
>> Steve Berman
> I’m not very familiar with different kinds of encoding,
> but would some coding system result in final
> writes longer than internal string-bytes?

I'm afraid I don't know either.

Steve Berman

> Best,
> Qiantan




* Re: Question: estimating size of a buffer when written to file?
  2021-12-08 16:01     ` Stephen Berman
@ 2021-12-08 16:29       ` Daniel Martín
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Martín @ 2021-12-08 16:29 UTC (permalink / raw)
  To: Stephen Berman; +Cc: Qiantan Hong, emacs-devel@gnu.org

Stephen Berman <stephen.berman@gmx.net> writes:

> On Wed, 8 Dec 2021 15:55:42 +0000 Qiantan Hong <qhong@mit.edu> wrote:
>
>>> Is (string-bytes (buffer-string)) not reliable?
>>> 
>>> Steve Berman
>> I’m not very familiar with different kinds of encoding,
>> but would some coding system result in final
>> writes longer than internal string-bytes?
>
> I'm afraid I don't know either.
>

Yes, if the coding system is not the UTF-8 variant used by Emacs
internally, the result won't be accurate for some scripts.  You need to
encode the contents of the buffer/string using the value of
buffer-file-coding-system, for example, and call string-bytes on it.
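
A minimal sketch of that suggestion, for illustration only.  The helper
name `my/encoded-size' is hypothetical, and the fallback to utf-8 when
buffer-file-coding-system is nil is an assumption:

```elisp
(defun my/encoded-size (&optional coding-system)
  "Return the number of bytes the current buffer occupies once encoded.
CODING-SYSTEM defaults to `buffer-file-coding-system', falling back to
`utf-8' when that is nil.  Hypothetical helper, not part of Emacs."
  (let ((cs (or coding-system buffer-file-coding-system 'utf-8)))
    ;; `encode-coding-string' returns a unibyte string, so `string-bytes'
    ;; on the result is the byte count after encoding.
    (string-bytes (encode-coding-string (buffer-string) cs t))))
```

This only measures what the coding system produces.  Anything else the
write path adds (for example via write-region-annotate-functions) is not
counted.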




* Re: Question: estimating size of a buffer when written to file?
  2021-12-08 15:33 Question: estimating size of a buffer when written to file? Qiantan Hong
  2021-12-08 15:52 ` Stephen Berman
@ 2021-12-08 17:03 ` Stefan Monnier
  2021-12-08 17:09   ` Qiantan Hong
  1 sibling, 1 reply; 7+ messages in thread
From: Stefan Monnier @ 2021-12-08 17:03 UTC (permalink / raw)
  To: Qiantan Hong; +Cc: emacs-devel@gnu.org

> Is there any reliable way to get (or conservatively estimate) 
> the size of a buffer when written to file?

How 'bout you first encode the buffer's content and then write it
instead of relying on the write to do the encoding?
This way you get to see the size trivially.

If you need this estimate faster (e.g. because you may end up not
writing the text at all depending on the result), then I think the only
way is to multiply the number of chars by the amplification factor that
is a property of the encoding (this is used in the C code to allocate
a destination byte-array that's guaranteed to be large enough.  Not sure
if we currently expose it to ELisp but it would be easy to do).
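
The encode-first approach could look roughly like the sketch below.
The name `my/append-encoded' is hypothetical; binding
coding-system-for-write to no-conversion is my assumption about how to
keep write-region from encoding the already-encoded bytes a second time:

```elisp
(defun my/append-encoded (text file coding-system)
  "Encode TEXT with CODING-SYSTEM, append the bytes to FILE.
Return the number of bytes written.  Hypothetical helper."
  (let* ((bytes (encode-coding-string text coding-system))
         (size (string-bytes bytes))
         ;; BYTES is already encoded, so write it out verbatim.
         (coding-system-for-write 'no-conversion))
    ;; A string as the first argument makes `write-region' write that
    ;; string instead of buffer contents; the fourth argument appends.
    (write-region bytes nil file t 'silent)
    size))
```

The size is known before write-region is called, so a caller could
compare it against whatever limit it cares about and decide on locking
before writing.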


        Stefan





* Re: Question: estimating size of a buffer when written to file?
  2021-12-08 17:03 ` Stefan Monnier
@ 2021-12-08 17:09   ` Qiantan Hong
  0 siblings, 0 replies; 7+ messages in thread
From: Qiantan Hong @ 2021-12-08 17:09 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel@gnu.org

> How 'bout you first encode the buffer's content and then write it
> instead of relying on the write to do the encoding?
> This way you get to see the size trivially.
I just found the function encode-coding-region; I never knew it existed, thanks!
The log will eventually get written out, so this is good enough.

BTW, on second thought, I think this is probably a premature optimization.

What about not caring about locking at all in the kv-store APIs and letting
package authors do it?

On the other hand, I’m currently implementing a high-level persistent-variable
API that runs in an idle timer. I can easily lock/unlock only once per idle timer
event (instead of before/after every kv-put), and then the cost should be
negligible.
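
A rough sketch of the batching idea, under assumptions: every `my/kv-'
name below is hypothetical, the log format is just printed cons cells,
and the locking functions are pluggable variables (defaulting to
`ignore') so that a package author can supply whatever locking they want:

```elisp
(defvar my/kv-pending nil
  "Entries queued since the last flush, newest first (hypothetical).")

(defvar my/kv-lock-function #'ignore
  "Function called with the log file name before a flush (hypothetical).")

(defvar my/kv-unlock-function #'ignore
  "Function called with the log file name after a flush (hypothetical).")

(defun my/kv-put (key value)
  "Queue KEY and VALUE in memory; no lock is taken here."
  (push (cons key value) my/kv-pending))

(defun my/kv-flush (file)
  "Append all queued entries to FILE under one lock/unlock pair."
  (when my/kv-pending
    (let ((entries (nreverse my/kv-pending)))
      (setq my/kv-pending nil)
      (funcall my/kv-lock-function file)
      (unwind-protect
          (with-temp-buffer
            ;; One printed entry per line.
            (dolist (entry entries)
              (prin1 entry (current-buffer))
              (insert "\n"))
            (write-region nil nil file t 'silent))
        ;; Release the lock even if the write signals an error.
        (funcall my/kv-unlock-function file)))))
```

Something like (run-with-idle-timer 5 t #'my/kv-flush FILE) would then
pay for locking once per idle period rather than once per kv-put; the
5-second delay is arbitrary.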

