* Against sqlite3!!! (Was: sqlite3)
[not found] <MN2PR12MB3391BC76A0D05236AC76C94E946E9@MN2PR12MB3391.namprd12.prod.outlook.com>
@ 2021-12-07 8:13 ` Qiantan Hong
2021-12-07 9:14 ` Qiantan Hong
` (3 more replies)
0 siblings, 4 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-07 8:13 UTC (permalink / raw)
To: emacs-devel@gnu.org
[-- Attachment #1.1: Type: text/plain, Size: 81 bytes --]
I’ve attached a pure Emacs Lisp implementation of persistent key value store.
[-- Attachment #2: resist!.el --]
[-- Type: application/octet-stream, Size: 1667 bytes --]
;;; resist!.el --- Against SQLite3! -*- lexical-binding: t; -*-
(require 'cl-lib) ; for `cl-defstruct'
(defvar kv-store-table)
(defmacro --kv (key value)
(puthash key value kv-store-table)
nil)
(cl-defstruct (kv-store (:constructor make-kv-store-1)) path table)
(defun make-kv-store (path)
(let* ((kv-store (make-kv-store-1 :path path))
(kv-store-table (make-hash-table :test 'equal))
need-compactification)
(when (file-exists-p path)
(condition-case c
(load-file path)
(end-of-file
;; We might encounter trailing unbalanced form if Emacs
;; crashed in the middle of `kv-put'. We compact the file
;; and fix unbalanced form as a side effect
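(setq need-compactification t))))
;; Install the table outside the `when', so a store whose file does
;; not exist yet still gets a usable hash table.
(setf (kv-store-table kv-store) kv-store-table)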
(when need-compactification
(compact-kv-store kv-store))
kv-store))
(defsubst kv-put-log (key value kv-store)
(let ((print-length nil) (print-level nil))
(prin1 `(--kv ,key ,value) (current-buffer)))
(insert "\n"))
(defun compact-kv-store (kv-store)
;; dump the full content of kv-store-table at once
;; to compress the log
(with-temp-buffer
(maphash (lambda (key value) (kv-put-log key value kv-store))
(kv-store-table kv-store))
(let ((file-precious-flag t))
(write-file (kv-store-path kv-store)))))
(defun kv-put (key value kv-store)
(with-temp-buffer
(kv-put-log key value kv-store)
(append-to-file nil nil (kv-store-path kv-store)))
(puthash key value (kv-store-table kv-store)))
(defun kv-get (key kv-store)
(gethash key (kv-store-table kv-store)))
(provide 'resist!)
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Against sqlite3!!! (Was: sqlite3)
2021-12-07 8:13 ` Against sqlite3!!! (Was: sqlite3) Qiantan Hong
@ 2021-12-07 9:14 ` Qiantan Hong
2021-12-07 12:49 ` Against sqlite3!!! Colin Baxter 😺
` (2 subsequent siblings)
3 siblings, 0 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-07 9:14 UTC (permalink / raw)
To: emacs-devel@gnu.org
A first unscientific benchmark, people!
(defvar test (make-kv-store "~/Projects/test.eld"))
(cl-loop for i below 10000
         do (kv-put i '(1 3 3) test))
(measure-time
(make-kv-store "~/Projects/test.eld"))
"0.025423"
Hua, 0.025s for loading 10k entries, by abusing built-in C load directly.
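`measure-time` is not a built-in Emacs function, so the benchmark above presumably relies on a local helper. A minimal sketch of such a helper, assuming it only needs to report elapsed wall-clock seconds as a string (the name `my-measure-time` is hypothetical; `benchmark-elapse` comes from the bundled benchmark library):

```elisp
(require 'benchmark)

(defmacro my-measure-time (&rest body)
  "Evaluate BODY once and return the elapsed time in seconds as a string.
Hypothetical stand-in for the `measure-time' used in the benchmark."
  `(format "%.6f" (benchmark-elapse ,@body)))

;; Usage, assuming resist!.el is loaded and the file exists:
;; (my-measure-time (make-kv-store "~/Projects/test.eld"))
```

A single run like this is noisy; `benchmark-run` with a repetition count would probably give a steadier number.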
> On Dec 7, 2021, at 12:13 AM, Qiantan Hong <qhong@MIT.EDU> wrote:
>
>> I’ve attached a pure Emacs Lisp implementation of persistent key value store.
> <resist!.el>
> It seems to have been stuck for too long, so I’ve resent it; sorry for the spam.
* Re: Against sqlite3!!!
2021-12-07 8:13 ` Against sqlite3!!! (Was: sqlite3) Qiantan Hong
2021-12-07 9:14 ` Qiantan Hong
@ 2021-12-07 12:49 ` Colin Baxter 😺
2021-12-07 13:21 ` Stefan Monnier
2021-12-07 13:45 ` Against sqlite3!!! (Was: sqlite3) Zhu Zihao
3 siblings, 0 replies; 26+ messages in thread
From: Colin Baxter 😺 @ 2021-12-07 12:49 UTC (permalink / raw)
To: Qiantan Hong; +Cc: emacs-devel@gnu.org
>>>>> "Qiantan" == Qiantan Hong <qhong@mit.edu> writes:
Qiantan> I’ve attached a pure Emacs Lisp implementation of
Qiantan> persistent key value store.
Excellent! Long live text solutions.
* Re: Against sqlite3!!!
2021-12-07 8:13 ` Against sqlite3!!! (Was: sqlite3) Qiantan Hong
2021-12-07 9:14 ` Qiantan Hong
2021-12-07 12:49 ` Against sqlite3!!! Colin Baxter 😺
@ 2021-12-07 13:21 ` Stefan Monnier
2021-12-07 13:55 ` Qiantan Hong
2021-12-07 13:45 ` Against sqlite3!!! (Was: sqlite3) Zhu Zihao
3 siblings, 1 reply; 26+ messages in thread
From: Stefan Monnier @ 2021-12-07 13:21 UTC (permalink / raw)
To: Qiantan Hong; +Cc: emacs-devel@gnu.org
> (load-file path)
Please don't:
- The function is called `load` (`load-file` is just one of the
*commands* defined to access the functionality interactively).
- This is a security hole.
`insert-file-contents + read` should hopefully still be fast enough.
Stefan
* Re: Against sqlite3!!! (Was: sqlite3)
2021-12-07 8:13 ` Against sqlite3!!! (Was: sqlite3) Qiantan Hong
` (2 preceding siblings ...)
2021-12-07 13:21 ` Stefan Monnier
@ 2021-12-07 13:45 ` Zhu Zihao
2021-12-07 14:50 ` Against sqlite3!!! David Engster
3 siblings, 1 reply; 26+ messages in thread
From: Zhu Zihao @ 2021-12-07 13:45 UTC (permalink / raw)
To: Qiantan Hong; +Cc: emacs-devel
[-- Attachment #1: Type: text/plain, Size: 634 bytes --]
Actually, Emacs can serialize/deserialize hash table directly via
prin1-to-string & read
```
(let ((ht (make-hash-table)))
(puthash "test" "value" ht)
(format "%S" ht))
```
You can use `read` to "parse" the string returned by that snippet and
get a hash table.
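The round trip described above can be sketched as follows. This is only an illustration: `my-roundtrip-hash-table` is a hypothetical helper name, and the table is created with `:test 'equal` (an assumption added here) because a string key that has just been read back is a new object and would not be found by the default `eql` test.

```elisp
(defun my-roundtrip-hash-table (table)
  "Print TABLE to a string and read it back as a fresh hash table.
Hypothetical helper, for illustration only."
  ;; `read-from-string' returns (OBJECT . END-POSITION); keep the object.
  (car (read-from-string (prin1-to-string table))))

(let ((ht (make-hash-table :test 'equal)))
  (puthash "test" "value" ht)
  ;; Look the key up in the restored copy, not in the original table.
  (gethash "test" (my-roundtrip-hash-table ht)))
```

Note that the whole table is printed and parsed each time, which is the cost the rest of the thread is concerned with.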
Qiantan Hong <qhong@mit.edu> writes:
> I’ve attached a pure Emacs Lisp implementation of persistent key value store.
>
> [4. resist!.el --- application/emacs-lisp; resist!.el]...
>
> [5. ATT00001.htm --- text/html; ATT00001.htm]...
--
Retrieve my PGP public key:
gpg --recv-keys D47A9C8B2AE3905B563D9135BE42B352A9F6821F
Zihao
* Re: Against sqlite3!!!
2021-12-07 13:21 ` Stefan Monnier
@ 2021-12-07 13:55 ` Qiantan Hong
2021-12-07 15:51 ` Tassilo Horn
0 siblings, 1 reply; 26+ messages in thread
From: Qiantan Hong @ 2021-12-07 13:55 UTC (permalink / raw)
To: Stefan Monnier, Zhu Zihao; +Cc: emacs-devel@gnu.org
[-- Attachment #1: Type: text/plain, Size: 730 bytes --]
> Please don't:
> - The function is called `load` (`load-file` is just one of the
> *commands* defined to access the functionality interactively).
> - This is a security hole.
>
> `insert-file-contents + read` should hopefully still be fast enough.
Indeed. I’ve attached an updated version; it runs just as fast.
I also added kv-rem.
> Actually, Emacs can serialize/deserialize hash table directly via
> prin1-to-string & read
I know; this is the current standard practice.
If you follow the sqlite3 thread, you’ll see the problem is that saving/loading
(aka printing/reading) a whole hash table or alist takes too much time.
The point of my implementation is to do it incrementally every time
the key-value store is mutated.
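To make the incremental idea concrete, here is a small sketch of replaying such a log, assuming records of the form `(++ KEY VALUE)` and `(-- KEY)` as written by the attached version. `my-replay-log` is a hypothetical name, and unlike the attachment this sketch simply stops at a truncated trailing form instead of scheduling a compaction.

```elisp
(defun my-replay-log (log-text)
  "Replay LOG-TEXT and return the resulting hash table.
LOG-TEXT is a string of `(++ KEY VALUE)' and `(-- KEY)' forms, one per
record.  Hypothetical helper, for illustration only."
  (let ((table (make-hash-table :test 'equal)))
    (with-temp-buffer
      (insert log-text)
      (goto-char (point-min))
      (condition-case nil
          (while t
            (pcase (read (current-buffer))
              (`(++ ,key ,value) (puthash key value table))
              (`(-- ,key) (remhash key table))))
        ;; `read' signals `end-of-file' both at the real end of the log
        ;; and on a half-written last record; stop in either case.
        (end-of-file nil)))
    table))

;; Later records win, and `--' removes a key:
;; (my-replay-log "(++ \"a\" 1)\n(++ \"b\" 2)\n(-- \"a\")\n(++ \"b\" 3)\n")
```

Each mutation costs one appended line, while the full cost of rebuilding the table is paid only when the log is replayed at load time.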
[-- Attachment #2: resist!.el --]
[-- Type: application/octet-stream, Size: 2199 bytes --]
;;; resist!.el --- Against SQLite3! -*- lexical-binding: t; -*-
(require 'cl-lib) ; for `cl-defstruct'
(cl-defstruct (kv-store (:constructor make-kv-store-1)) path table)
(defun make-kv-store (path)
(let* ((kv-store (make-kv-store-1 :path path))
(kv-store-table (make-hash-table :test 'equal))
need-compactification)
(when (file-exists-p path)
(condition-case nil
(with-temp-buffer
(insert-file-contents path)
(while (< (point) (1- (point-max))) ; exclude trailing newline
(let ((entry (read (current-buffer))))
(pcase (car entry)
('++ (puthash (cadr entry) (caddr entry) kv-store-table))
('-- (remhash (cadr entry) kv-store-table))))))
(end-of-file
;; We might encounter trailing unbalanced form if Emacs
;; crashed in the middle of `kv-put'. We compact the file
;; and fix unbalanced form as a side effect
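(setq need-compactification t))))
;; Install the table outside the `when', so a store whose file does
;; not exist yet still gets a usable hash table.
(setf (kv-store-table kv-store) kv-store-table)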
(when need-compactification
(compact-kv-store kv-store))
kv-store))
(defsubst kv-put-log (key value)
(let ((print-length nil) (print-level nil))
(prin1 (list '++ key value) (current-buffer)))
(insert "\n"))
(defsubst kv-rem-log (key)
(let ((print-length nil) (print-level nil))
(prin1 (list '-- key) (current-buffer)))
(insert "\n"))
(defun compact-kv-store (kv-store)
;; dump the full content of kv-store-table at once
;; to compress the log
(with-temp-buffer
(maphash (lambda (key value) (kv-put-log key value))
(kv-store-table kv-store))
(let ((file-precious-flag t))
(write-file (kv-store-path kv-store)))))
(defun kv-put (key value kv-store)
(with-temp-buffer
(kv-put-log key value)
(append-to-file nil nil (kv-store-path kv-store)))
(puthash key value (kv-store-table kv-store)))
(defun kv-rem (key)
(with-temp-buffer
(kv-rem-log key)
(append-to-file nil nil (kv-store-path kv-store)))
(remhash key (kv-store-table kv-store)))
(defun kv-get (key kv-store)
(gethash key (kv-store-table kv-store)))
(provide 'resist!)
* Re: Against sqlite3!!!
2021-12-07 13:45 ` Against sqlite3!!! (Was: sqlite3) Zhu Zihao
@ 2021-12-07 14:50 ` David Engster
2021-12-07 20:00 ` Lars Ingebrigtsen
0 siblings, 1 reply; 26+ messages in thread
From: David Engster @ 2021-12-07 14:50 UTC (permalink / raw)
To: Zhu Zihao; +Cc: Qiantan Hong, emacs-devel
> Actually, Emacs can serialize/deserialize hash table directly via
> prin1-to-string & read
>
> ```
> (let ((ht (make-hash-table)))
> (puthash "test" "value" ht)
> (format "%S" ht))
> ```
>
> You can use `read` to "parse" the string returned by that snippet and
> get a hash table.
Yes, and it's slow. The Gnus registry is saved/loaded this way, and this
has annoyed me for years.
-David
* Re: Against sqlite3!!!
2021-12-07 13:55 ` Qiantan Hong
@ 2021-12-07 15:51 ` Tassilo Horn
2021-12-07 16:35 ` Qiantan Hong
0 siblings, 1 reply; 26+ messages in thread
From: Tassilo Horn @ 2021-12-07 15:51 UTC (permalink / raw)
To: Qiantan Hong; +Cc: emacs-devel, Stefan Monnier, Zhu Zihao
Hi Hong,
I'm not very knowledgeable about cl-* stuff, but doesn't your
implementation load all key-value pairs at once? That would be quite a
disadvantage compared to a DB approach where I'd naturally expect that
only the value I'm asking for is loaded.
Also, how would it ensure consistency when I have 2 parallel emacs
sessions (like one for mail/irc and one for programming/editing) where
session 1 modifies the value of key A and the other of key B? It looks
like the values of the kv-store that gets saved later will win. In case
that's the emacs session which has modified B, it'll revert A to the
state before the other session modified it, no?
If that were true, I'd say your resist!.el is a non-starter in the
current form. It should at least load only values explicitly asked for
and only persist/override actually changed values. A trivial solution
could use one file per key. Not sure how sensible that is.
Bye,
Tassilo
* Re: Against sqlite3!!!
2021-12-07 15:51 ` Tassilo Horn
@ 2021-12-07 16:35 ` Qiantan Hong
2021-12-07 18:43 ` Arthur Miller
` (2 more replies)
0 siblings, 3 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-07 16:35 UTC (permalink / raw)
To: Tassilo Horn; +Cc: Zhu Zihao, Stefan Monnier, emacs-devel@gnu.org
> I'm not very knowledgable with cl-* stuff but doesn't your
> implementation load all key-value pairs at once? That would be quite a
> disadvantage compared to a DB approach where I'd naturally expect that
> only the value I'm asking for is loaded.
It does, but I’ve done some benchmarks and it loads 10k entries in 0.02~0.03
seconds. 100k entries take <0.5s.
I’d say that should suffice for most Emacs applications I know of.
Nobody is using Emacs for trillions of business records.
On the other hand, the save operation is fully incremental and doesn’t
even need to be invoked explicitly.
That said, if we really find out loading is not fast enough, I might
come up with some way to load it in segments lazily. I doubt that will
ever become necessary.
> Also, how would it ensure consistency when I have 2 parallel emacs
> sessions (like one for mail/irc and one for programming/editing) where
> session 1 modifies the value of key A and the other of key B? It looks
> like the values of the kv-store that gets saved later will win. In case
> that's the emacs session which has modified B, it'll revert A to the
> state before the other session modified it, no?
Since it records a log of deltas instead of printing the whole data structure,
different keys won’t interfere.
That said, currently it probably won’t work, because UNIX append
is not atomic and the records will probably be interleaved into nonsense.
There are various workarounds, a lock file being one, but I prefer
the idea of keeping only one “controller” instance with exclusive
access to the file.
> If that were true, I'd say your resist!.el is a non-starter in the
> current form. It should at least load only values explicitly asked for
> and only persist/override actually changed values. A trivial solution
> could use one file per key. Not sure how sensible that is.
It does the latter.
As I’ve mentioned, from the benchmark results, the former doesn’t
seem to be a big problem. You’ll do it at most once for every Emacs
instance anyway.
Best,
Qiantan
* Re: Against sqlite3!!!
2021-12-07 16:35 ` Qiantan Hong
@ 2021-12-07 18:43 ` Arthur Miller
2021-12-07 19:13 ` Qiantan Hong
2021-12-07 19:34 ` Tassilo Horn
2021-12-07 19:52 ` Stefan Monnier
2 siblings, 1 reply; 26+ messages in thread
From: Arthur Miller @ 2021-12-07 18:43 UTC (permalink / raw)
To: Qiantan Hong; +Cc: emacs-devel@gnu.org, Zhu Zihao, Stefan Monnier, Tassilo Horn
Qiantan Hong <qhong@mit.edu> writes:
>> That would be quite a
>> disadvantage compared to a DB approach where I'd naturally expect that
>> only the value I'm asking for is loaded.
SQLite does not read one key per request; it reads at least a page. So if your
system uses 64k pages, SQLite will read at least 64k at once. Then you also have
the system doing disk I/O, caching nodes, and so on; so you are never really
getting just "the value you are asking for to be loaded".
> It does, but I’ve done some benchmark and it loads 10k entries in 0.02~0.03
> seconds. 100k entries takes <0.5s.
> I’d say it should be suffice for most Emacs application I know of.
> Nobody is using Emacs for trillions of business records.
>
> On the other hand save operation is fully incremental and don’t
> even need to be invoked explicitly.
>
> Being said that, if we really find out loading is not fast enough, I might
> come up with some way to load it in segments lazily. I doubt if that will
> ever become necessary.
>
>> Also, how would it ensure consistency when I have 2 parallel emacs
>> sessions (like one for mail/irc and one for programming/editing) where
>> session 1 modifies the value of key A and the other of key B? It looks
>> like the values of the kv-store that gets saved later will win. In case
>> that's the emacs session which has modified B, it'll revert A to the
>> state before the other session modified it, no?
> Since it records a log of deltas instead of printing the whole data structure,
> different key won’t interfere.
>
> Being said that, currently it probably won’t work because UNIX append
> is not atomic and will probably be interleaved into nonsense.
> There’re various workarounds, lock file being one, but I like
> the idea of keeping only one “controller” instance with exclusive
> access to the file more.
You can do the same as SQLite does: just lock the entire file (db). SQLite locks
the entire db while writes are performed. It can be configured to allow
concurrent reads, but only if there are no ongoing writes.
* Re: Against sqlite3!!!
2021-12-07 18:43 ` Arthur Miller
@ 2021-12-07 19:13 ` Qiantan Hong
0 siblings, 0 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-07 19:13 UTC (permalink / raw)
To: Arthur Miller
Cc: emacs-devel@gnu.org, Zhu Zihao, Stefan Monnier, Tassilo Horn
> You can do same as sqlite does: just lock entire file (db). Sqlite locks entire
> db while writes are performed. It can be configured to allow concurrent reads
> but only if there are no ongoing writes.
Indeed, that’s a straightforward solution. Not sure if
the Emacs lock file mechanism is fast enough to run for
every write, though (Emacs doesn’t seem to expose
UNIX flock/fcntl stuff).
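One way to try this without flock/fcntl is an exclusive-create lock file. The following is a rough sketch under several assumptions: `my-call-with-lock-file` is a hypothetical name, the retry count and sleep interval are arbitrary, and a crash while holding the lock would leave a stale lock file that something else has to clean up. It relies on `write-region` with a MUSTBENEW argument of `excl`, which signals `file-already-exists` if the file is already there.

```elisp
(defun my-call-with-lock-file (lock-file fn &optional retries)
  "Call FN with no arguments while holding LOCK-FILE; return FN's value.
Retry up to RETRIES times (default 50) before giving up.
Hypothetical helper, for illustration only."
  (let ((tries (or retries 50))
        (held nil))
    (while (and (not held) (> tries 0))
      (condition-case nil
          (progn
            ;; Create the lock file only if it does not exist yet.
            (write-region "" nil lock-file nil 'silent nil 'excl)
            (setq held t))
        (file-already-exists
         (setq tries (1- tries))
         (sleep-for 0.01))))
    (unless held
      (error "Could not acquire lock file %s" lock-file))
    (unwind-protect
        (funcall fn)
      ;; Release the lock even if FN signals an error.
      (delete-file lock-file))))

;; Usage sketch, wrapping one append:
;; (my-call-with-lock-file "~/Projects/test.eld.lock"
;;                         (lambda () (kv-put 1 '(1 3 3) test)))
```

Whether creating and deleting a file around every write is fast enough is exactly the open question raised above; timing it would be the next step.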
There’s also the question of whether we should try to make writes from one
Emacs visible to reads from another Emacs.
AFAIC it doesn’t make much sense if we just want persistent
backing storage. Should we also consider the (ab)use
as an IPC channel?
Best,
Qiantan
* Re: Against sqlite3!!!
2021-12-07 16:35 ` Qiantan Hong
2021-12-07 18:43 ` Arthur Miller
@ 2021-12-07 19:34 ` Tassilo Horn
2021-12-08 10:00 ` Yuri Khan
2021-12-07 19:52 ` Stefan Monnier
2 siblings, 1 reply; 26+ messages in thread
From: Tassilo Horn @ 2021-12-07 19:34 UTC (permalink / raw)
To: Qiantan Hong; +Cc: emacs-devel, Zhu Zihao, Stefan Monnier
Qiantan Hong <qhong@mit.edu> writes:
>> I'm not very knowledgable with cl-* stuff but doesn't your
>> implementation load all key-value pairs at once? That would be quite
>> a disadvantage compared to a DB approach where I'd naturally expect
>> that only the value I'm asking for is loaded.
> It does, but I’ve done some benchmark and it loads 10k entries in
> 0.02~0.03 seconds. 100k entries takes <0.5s.
But your values are only very small lists.
> I’d say it should be suffice for most Emacs application I know of.
> Nobody is using Emacs for trillions of business records.
I'd imagine that if such a feature became available, packages would
start using it and store more data than they do now, for whatever
reasons. Like eww/elpher could want to store the list of the last 100
visited URLs.
>> Also, how would it ensure consistency when I have 2 parallel emacs
>> sessions (like one for mail/irc and one for programming/editing) where
>> session 1 modifies the value of key A and the other of key B? It looks
>> like the values of the kv-store that gets saved later will win. In case
>> that's the emacs session which has modified B, it'll revert A to the
>> state before the other session modified it, no?
>
> Since it records a log of deltas instead of printing the whole data
> structure, different key won’t interfere.
Ah, all right. By skimming the code I've first thought the log would
only be for analysis/debugging purposes.
> Being said that, currently it probably won’t work because UNIX append
> is not atomic and will probably be interleaved into nonsense.
> There’re various workarounds, lock file being one, but I like the idea
> of keeping only one “controller” instance with exclusive access to the
> file more.
Interesting, but how would emacs instances interact with that controller
instance? And would that controller instance simply be the first emacs
instance of a user? What if the controller blocks because Gnus is
running and currently downloading mails with huge attachments?
>> If that were true, I'd say your resist!.el is a non-starter in the
>> current form. It should at least load only values explicitly asked for
>> and only persist/override actually changed values. A trivial solution
>> could use one file per key. Not sure how sensible that is.
> It does the latter.
No, it uses one file per kv-store but you can have as many kv-stores as
you like, e.g., one per package. Or do you mean "it only overrides
changed values" with "the latter"? Indeed, that's true. I've played
with the code and now understand it. Nice! :-) (kv-rem is missing a
kv-store arg.)
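For reference, a version of kv-rem with the missing argument might look like the sketch below. The struct and the logging helper are repeated from the attachment only so the example is self-contained; the one intended change is the added `kv-store` parameter.

```elisp
(require 'cl-lib)

;; Repeated from the attachment so this sketch stands alone.
(cl-defstruct (kv-store (:constructor make-kv-store-1)) path table)

(defsubst kv-rem-log (key)
  (let ((print-length nil) (print-level nil))
    (prin1 (list '-- key) (current-buffer)))
  (insert "\n"))

(defun kv-rem (key kv-store)
  "Remove KEY from KV-STORE and append a `--' record to its log file."
  (with-temp-buffer
    (kv-rem-log key)
    ;; With nil START and END the whole temp buffer is appended.
    (append-to-file nil nil (kv-store-path kv-store)))
  (remhash key (kv-store-table kv-store)))
```

The argument order (key first, store last) follows kv-put and kv-get in the attachment.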
> As I’ve mentioned, from the benchmark results, the former doesn’t
> seem to be a big problem. You’ll do it at most once for every Emacs
> instance anyway.
Yeah, and since it's no global store in the sense of "every package
feeds its data into the same store", my argument is void. If my emacs
session doesn't start Gnus, Gnus won't load its own store.
Bye,
Tassilo
* Re: Against sqlite3!!!
2021-12-07 16:35 ` Qiantan Hong
2021-12-07 18:43 ` Arthur Miller
2021-12-07 19:34 ` Tassilo Horn
@ 2021-12-07 19:52 ` Stefan Monnier
2 siblings, 0 replies; 26+ messages in thread
From: Stefan Monnier @ 2021-12-07 19:52 UTC (permalink / raw)
To: Qiantan Hong; +Cc: Tassilo Horn, emacs-devel@gnu.org, Zhu Zihao
> Since it records a log of deltas instead of printing the whole data structure,
> different key won’t interfere.
> Being said that, currently it probably won’t work because UNIX append
> is not atomic and will probably be interleaved into nonsense.
> There’re various workarounds, lock file being one, but I like
> the idea of keeping only one “controller” instance with exclusive
> access to the file more.
I think allowing several instances to use the file at the same time is
important (I always have 2 sessions active at the same time and I'd like
to be able to use (and share) savehist with both of them).
But that only means having to lock while we're saving, which is a very
short time. It also requires being able to refresh the in-memory data
once we detect that the on-disk data has been changed.
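Detecting such a change could be as simple as comparing modification times, as in this sketch. Both function names are hypothetical, and it assumes the filesystem's timestamp resolution is fine enough to notice two writes in quick succession, which is not guaranteed.

```elisp
(defun my-file-mtime (path)
  "Return the modification time of PATH, or nil if PATH does not exist.
Hypothetical helper, for illustration only."
  (let ((attrs (file-attributes path)))
    (and attrs (file-attribute-modification-time attrs))))

(defun my-file-changed-p (path last-seen-mtime)
  "Return non-nil if PATH's modification time differs from LAST-SEEN-MTIME.
LAST-SEEN-MTIME is a value previously returned by `my-file-mtime'."
  (not (equal (my-file-mtime path) last-seen-mtime)))
```

A store would remember the time after each of its own writes and re-read the file (or just its new tail) whenever `my-file-changed-p` reports a difference.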
Stefan
* Re: Against sqlite3!!!
2021-12-07 14:50 ` Against sqlite3!!! David Engster
@ 2021-12-07 20:00 ` Lars Ingebrigtsen
2021-12-08 6:11 ` Arthur Miller
0 siblings, 1 reply; 26+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-07 20:00 UTC (permalink / raw)
To: David Engster; +Cc: Qiantan Hong, Zhu Zihao, emacs-devel
David Engster <deng@randomsample.de> writes:
> Yes, and it's slow. The Gnus registry is saved/loaded this way, and this
> has annoyed me for years.
Yup. The Gnus registry would be well suited to use sqlite directly,
though -- it's basically hand-maintaining a (large) database, and sqlite
is a good fit for that.
That is, I don't think the new normal persistence method would be ideal
for the registry.
But, yes, the Gnus registry is a good demonstration of why serialising
hash tables to disk is unworkable in practice.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* Re: Against sqlite3!!!
2021-12-07 20:00 ` Lars Ingebrigtsen
@ 2021-12-08 6:11 ` Arthur Miller
2021-12-08 6:20 ` Qiantan Hong
2021-12-09 7:12 ` Alexandre Garreau
0 siblings, 2 replies; 26+ messages in thread
From: Arthur Miller @ 2021-12-08 6:11 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: Qiantan Hong, Zhu Zihao, David Engster, emacs-devel
Lars Ingebrigtsen <larsi@gnus.org> writes:
> David Engster <deng@randomsample.de> writes:
>
>> Yes, and it's slow. The Gnus registry is saved/loaded this way, and this
>> has annoyed me for years.
>
> Yup. The Gnus registry would be well suited to use sqlite directly,
> though -- it's basically hand-maintaining a (large) database, and sqlite
> is a good fit for that.
>
> That is, I don't think the new normal persistence method would be ideal
> for the registry.
>
> But, yes, the Gnus registry is a good demonstration of why serialising
> hash tables to disk is unworkable in practice.
Then implement a way for Emacs to dump Lisp objects to files faster. It would be
very useful for Emacs in general.
You will still have to serialize your Gnus db to an SQLite db, and write it to
disk via SQLite, so disk access will still be there.
Have you tried to write it in chunks in an idle timer, or all in one go? Is that
possible?
* Re: Against sqlite3!!!
2021-12-08 6:11 ` Arthur Miller
@ 2021-12-08 6:20 ` Qiantan Hong
2021-12-08 9:21 ` Arthur Miller
2021-12-09 7:12 ` Alexandre Garreau
1 sibling, 1 reply; 26+ messages in thread
From: Qiantan Hong @ 2021-12-08 6:20 UTC (permalink / raw)
To: Arthur Miller
Cc: larsi@gnus.org, Zhu Zihao, David Engster, emacs-devel@gnu.org
> Than implement a way for Emacs to dump lisp objects to files faster. It would be
> very useful for Emacs in general.
I think for this question, the once-and-for-all solution is to have a fully
incremental persistent object store, i.e. all mutations are stored
incrementally without ever needing to print out the “fully value”
of a Lisp value. What do you think?
resist!.el in its current form basically implements a special case
of the above, where only mutations to the top-level hash table
are persisted incrementally.
I can’t see a way to implement a persistent object store without
some non-trivial memory overhead, though (because each
object has to get a unique ID). The best thing I can come up
with needs a hash table that maps every object in the
store to a numeric ID. Is that too much?
* Re: Against sqlite3!!!
2021-12-08 6:20 ` Qiantan Hong
@ 2021-12-08 9:21 ` Arthur Miller
2021-12-08 9:28 ` Qiantan Hong
0 siblings, 1 reply; 26+ messages in thread
From: Arthur Miller @ 2021-12-08 9:21 UTC (permalink / raw)
To: Qiantan Hong
Cc: larsi@gnus.org, Zhu Zihao, David Engster, emacs-devel@gnu.org
Qiantan Hong <qhong@mit.edu> writes:
>> Than implement a way for Emacs to dump lisp objects to files faster. It would be
>> very useful for Emacs in general.
> I think for this question, the one-and-for-all solution is. to have a fully
> incremental persistent object store, i.e. all mutation are stored
> incrementally without ever needing to print out the “fully value”
> of a Lisp value. What do you think?
I am not sure I understand what you mean. What is the "fully value" of a Lisp
value? You mean the entire object?
> resist!.el in its current form basically implemented a special case
> of the above, where only mutation to the top level hash table
> is persisted incrementally.
>
> I can’t see a way to implement persistent object store without
> some non-trivial memory overhead, though (because each
> object has to get an unique id). The best thing I can come up
> with has to have a hash table that maps every object in the
> store to a numeric ID. Is that too much?
Honestly, I have no idea.
I just meant that it is very useful to be able to serialize/deserialize Lisp
objects, in general, without needing to go through some intermediate key/value
database.
I don't know what you are doing to start with; I haven't looked at your
resist.el, but maybe you can cache keys that need to be flushed to disk in some
'dirty state (a list of keys) and flush just those keys to a file in an idle
timer.
* Re: Against sqlite3!!!
2021-12-08 9:21 ` Arthur Miller
@ 2021-12-08 9:28 ` Qiantan Hong
0 siblings, 0 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-08 9:28 UTC (permalink / raw)
To: Arthur Miller
Cc: larsi@gnus.org, Zhu Zihao, David Engster, emacs-devel@gnu.org
> I am not sure I understand what you mean. What is "fully value" of a Lisp
> value. You mean entire object?
Yes, and this is less than optimal. For example, there are lots of variables
that hold a list which is frequently added to or removed from.
A straightforward implementation would need to print/read the whole list
every time.
A cleverer object store would be able to figure out exactly which CONS changed,
and store a single record like
(rplacd *id-of-the-cons* `(*new-element* . ,(object *id-of-old-tail*)))
> I don't know what you are doing to start with; I haven't looked at your
> resist.el, but maybe you can cache keys that needs to be flushed to disk in some
> 'dirty state (a list of kyes) and flush just those keys to a file in idle timer.
That’s not really relevant. resist!.el persists every kv-put operation
immediately, without the need for any explicit save operation.
It might be a good idea to run compact-kv-store in an idle timer to save disk
space, though.
* Re: Against sqlite3!!!
2021-12-07 19:34 ` Tassilo Horn
@ 2021-12-08 10:00 ` Yuri Khan
0 siblings, 0 replies; 26+ messages in thread
From: Yuri Khan @ 2021-12-08 10:00 UTC (permalink / raw)
To: Tassilo Horn; +Cc: Qiantan Hong, Zhu Zihao, Stefan Monnier, Emacs developers
On Wed, 8 Dec 2021 at 03:19, Tassilo Horn <tsdh@gnu.org> wrote:
> > There’re various workarounds, lock file being one, but I like the idea
> > of keeping only one “controller” instance with exclusive access to the
> > file more.
>
> Interesting, but how would emacs instances interact with that controller
> instance? And would that controller instance simply be the first emacs
> instance of a user? What if the controller blocks because Gnus is
> running and currently downloading mails with huge attachments?
This line of thought, if followed naturally, leads to a dedicated
database server process. With multiple concurrent clients, isolation
levels, transaction control, etc. And there are better database
servers than a special Emacs configuration owning an SQLite database
file.
* Re: Against sqlite3!!!
2021-12-08 6:11 ` Arthur Miller
2021-12-08 6:20 ` Qiantan Hong
@ 2021-12-09 7:12 ` Alexandre Garreau
2021-12-09 7:27 ` Qiantan Hong
1 sibling, 1 reply; 26+ messages in thread
From: Alexandre Garreau @ 2021-12-09 7:12 UTC (permalink / raw)
To: emacs-devel
On Wednesday, 8 December 2021, at 07:11:19 CET, Arthur Miller wrote:
> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
> > David Engster <deng@randomsample.de> writes:
> >
> >> Yes, and it's slow. The Gnus registry is saved/loaded this way, and
> >> this has annoyed me for years.
> >
> > Yup. The Gnus registry would be well suited to use sqlite directly,
> > though -- it's basically hand-maintaining a (large) database, and
> > sqlite is a good fit for that.
> >
> > That is, I don't think the new normal persistence method would be
> > ideal for the registry.
> >
> > But, yes, the Gnus registry is a good demonstration of why serialising
> > hash tables to disk is unworkable in practice.
>
> Than implement a way for Emacs to dump lisp objects to files faster. It
> would be very useful for Emacs in general.
>
> You will still have to serialize your gnus db to sqlite db, and write it
> to disk via sqlite, so disk access will still be there.
Well, I think, wrt writes, it’s impossible to go faster than just writing
plaintext. But sqlite is here to make *reads* faster, not writes. The
only way I can see to make writes faster without any form of serialization
is just garbage collecting related data together, and directly dumping
memory onto disk, in a native but non-portable format, along with metadata
about the native format, and specific implementations to decode it from
other platforms for each memory format. I’d love to see that. But I
think writing is not the bottleneck that we try to improve here.
* Re: Against sqlite3!!!
2021-12-09 7:12 ` Alexandre Garreau
@ 2021-12-09 7:27 ` Qiantan Hong
[not found] ` <24465971.J1OoJ6LT5i@galex-713.eu>
2021-12-09 13:17 ` Stefan Monnier
0 siblings, 2 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-09 7:27 UTC (permalink / raw)
To: Alexandre Garreau; +Cc: emacs-devel@gnu.org
> Well, I think, wrt writes, it’s impossible to go faster than just writing
> plaintext. But sqlite is here to make *reads* faster, not writes. The
> only way I can see to make writes faster without any form of serialization
> is just garbage collecting related data together, and directly dumping
> memory onto disk, in a native but non-portable format, along with metadata
> about the native format, and specific implementations to decode it from
> other platforms for each memory format. I’d love to see that. But I
> think writing is not the bottleneck that we try to improve here.
I think writes still *used to* be a bottleneck, in the sense that without
incremental updates, every write requires dumping the whole store,
which prevents Emacs from remembering things *early and eagerly*.
resist! solves exactly this. Any mutation to the kv-store is immediately
remembered by appending to a log, and make-persistent-variable
remembers all changes every persistent-variable-idle-time (1 second
seems reasonable).
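To make the idea concrete, here is a minimal sketch of an append-only log
combined with an idle-time flush. It is not the code of resist! itself: every
name prefixed with my- is hypothetical, and the log entry format (a two-element
list) is an assumption made only for this illustration.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical sketch; the `my-' names are NOT part of resist!.

(defvar my-log-file (make-temp-file "my-kv-" nil ".log")
  "Append-only log file (a temporary file, for illustration only).")

(defvar my-idle-time 1
  "Seconds of idleness before changed variables are flushed.")

(defvar my-last-saved (make-hash-table :test 'eq)
  "Maps each persisted symbol to a copy of the value last written.")

(defun my-put (key value)
  "Append one (KEY VALUE) entry to `my-log-file'."
  (with-temp-buffer
    (let ((print-length nil) (print-level nil))
      (prin1 (list key value) (current-buffer)))
    (insert "\n")
    ;; APPEND is t: only the new entry is written, the store is not rewritten.
    (write-region nil nil my-log-file t 'silent)))

(defun my-flush-variable (symbol)
  "Log SYMBOL's value, but only if it changed since the last flush."
  (let ((value (symbol-value symbol)))
    (unless (equal value (gethash symbol my-last-saved 'my-unset))
      (my-put symbol value)
      ;; Keep a copy so later destructive changes are still detected.
      (puthash symbol (copy-tree value) my-last-saved))))

(defun my-persist-variable (symbol)
  "Flush SYMBOL to the log whenever Emacs is idle for `my-idle-time'."
  (run-with-idle-timer my-idle-time t #'my-flush-variable symbol))

(defun my-replay ()
  "Rebuild a hash table from the log; later entries override earlier ones."
  (let ((table (make-hash-table :test 'equal)))
    (with-temp-buffer
      (insert-file-contents my-log-file)
      (condition-case nil
          (while t
            (let ((entry (read (current-buffer))))
              (puthash (car entry) (cadr entry) table)))
        ;; `read' signals `end-of-file' once the log is exhausted.
        (end-of-file nil)))
    table))
```

In this sketch the cost of remembering a change is proportional to the size of
the changed value, not to the size of the whole store, which is what makes
saving "early and eagerly" affordable.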
> Well, I think, wrt writes, it’s impossible to go faster than just writing
> plaintext. But sqlite is here to make *reads* faster, not writes.
Did you mean the initial loading up, or substantial reads?
For the substantial reads, sqlite3 is probably not much faster
than a single gethash.
resist! in its current form does require initially loading the whole store,
but I think it’s unclear whether that is a bottleneck. See
https://lists.gnu.org/archive/html/emacs-devel/2021-12/msg00865.html
* Re: Against sqlite3!!!
[not found] ` <24465971.J1OoJ6LT5i@galex-713.eu>
@ 2021-12-09 7:50 ` Qiantan Hong
2021-12-09 19:16 ` Thierry Volpiatto
0 siblings, 1 reply; 26+ messages in thread
From: Qiantan Hong @ 2021-12-09 7:50 UTC (permalink / raw)
To: Alexandre Garreau; +Cc: emacs-devel@gnu.org
> If you start dividing your file into pages and loading one page at a time,
> you’re starting to reimplement a kv-database, just as gdbm (or some
> filesystems). You’d better reuse gdbm,
There is a subtle but IMO significant benefit of resist!: it understands
more than kv-put. In the current version, it also understands
kv-push and kv-delete (aka list operations).
I imagine this can be very useful for Lisp because a lot of the time people
just add or remove data from a list. An implementation that doesn’t
understand this would need to dump the whole list again.
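As an illustration of why list-aware log entries help, here is a small sketch
of replaying a log that records put, push and delete operations. The entry
format (OP KEY VALUE) and the my-op- names are assumptions made for this
example; they are not the actual log format or API of resist!.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical sketch; entry format and `my-op-' names are assumptions.

(defun my-op-apply (table entry)
  "Apply one log ENTRY of the form (OP KEY VALUE) to hash TABLE."
  (pcase entry
    ;; Replace the whole value stored under KEY.
    (`(put ,key ,value) (puthash key value table))
    ;; Add one element to the front of the list stored under KEY.
    (`(push ,key ,value) (push value (gethash key table)))
    ;; Drop every element `equal' to VALUE from the list under KEY.
    (`(delete ,key ,value)
     (puthash key (remove value (gethash key table)) table))
    (_ (error "Unknown log entry: %S" entry))))

(defun my-op-replay (entries)
  "Return a fresh hash table built by applying ENTRIES in order."
  (let ((table (make-hash-table :test 'equal)))
    (dolist (entry entries table)
      (my-op-apply table entry))))

;; Usage: three small entries instead of three full dumps of the list.
(gethash "todo"
         (my-op-replay '((put "todo" ("a"))
                         (push "todo" "b")
                         (delete "todo" "a"))))
```

With this design, pushing onto or deleting from a long list appends an entry
whose size depends only on the element involved, whereas a store that only
knows put would have to write the entire list again.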
> and work on something higher level
> and more fun like a lisp sql-like query language, inspired from (or even
> compatible with) sql or sparql or prolog, except each expression would
> return a lisp value. That would be very beautiful.
That’s definitely a cool idea. I’ll probably just go for an embedded Prolog
with S-exp syntax.
It’s probably not my current priority though, because
Emacs packages at the moment still use mostly LISt Processing,
and somehow people still find enough incentive to introduce SQLite3,
which I attribute to the lack of a decent pure Lisp persistent store.
So that’s what resist! aims to solve at this moment.
If people still want SQLite3 after resist! has solved the problem, maybe
I’ll write another package, revolt!, which implements Prolog.
Best,
Qiantan
* Re: Against sqlite3!!!
2021-12-09 7:27 ` Qiantan Hong
[not found] ` <24465971.J1OoJ6LT5i@galex-713.eu>
@ 2021-12-09 13:17 ` Stefan Monnier
1 sibling, 0 replies; 26+ messages in thread
From: Stefan Monnier @ 2021-12-09 13:17 UTC (permalink / raw)
To: Qiantan Hong; +Cc: Alexandre Garreau, emacs-devel@gnu.org
> resist! in its current form does require initially loading the whole store,
> but I think it’s unclear whether that is a bottleneck. See
> https://lists.gnu.org/archive/html/emacs-devel/2021-12/msg00865.html
It depends on the use case.
sqlite is the winner for databases that are not meant to be
memory-resident (usually because they're too large).
It's also the clear winner when you have to read some other
application's sqlite file ;-)
Stefan
* Re: Against sqlite3!!!
2021-12-09 7:50 ` Qiantan Hong
@ 2021-12-09 19:16 ` Thierry Volpiatto
2021-12-09 19:24 ` Qiantan Hong
2021-12-09 19:28 ` Qiantan Hong
0 siblings, 2 replies; 26+ messages in thread
From: Thierry Volpiatto @ 2021-12-09 19:16 UTC (permalink / raw)
To: Qiantan Hong; +Cc: Alexandre Garreau, emacs-devel@gnu.org
Qiantan Hong <qhong@mit.edu> writes:
>> If you start dividing your file into pages and loading one page at a time,
>> you’re starting to reimplement a kv-database, just as gdbm (or some
>> filesystems). You’d better reuse gdbm,
> There is a subtle but IMO significant benefit of resist!: it understands
> more than kv-put. In the current version, it also understands
> kv-push and kv-delete (aka list operations).
>
> I imagine this can be very useful for Lisp because a lot of the time people
> just add or remove data from a list. An implementation that doesn’t
> understand this would need to dump the whole list again.
>
>> and work on something higher level
>> and more fun like a lisp sql-like query language, inspired from (or even
>> compatible with) sql or sparql or prolog, except each expression would
>> return a lisp value. That would be very beautiful.
> That’s definitely a cool idea. I’ll probably just go for an embedded Prolog
> with S-exp syntax.
>
> It’s probably not my current priority though, because
> Emacs packages at the moment still use mostly LISt Processing,
> and somehow people still find enough incentive to introduce SQLite3,
> which I attribute to the lack of a decent pure Lisp persistent store.
Emacs Lisp provides eval-when-compile, which allows saving elisp objects
in compiled files; I have used this for years to save my variables in Emacs.
See the psession package.
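A minimal sketch of the technique, for readers who have not seen it: write a
tiny source file that assigns the variable from an eval-when-compile form, then
byte-compile it so the current value ends up as a constant in the .elc file.
The my-dump-variable helper is hypothetical and is not claimed to match the
actual psession code.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical sketch; `my-dump-variable' is not taken from psession.
(require 'bytecomp)

(defun my-dump-variable (symbol file)
  "Save SYMBOL's current value into the compiled form of FILE.
FILE should end in \".el\".  Return the name of the compiled file."
  (with-temp-file file
    (insert ";;; -*- lexical-binding: t; -*-\n")
    (let ((print-length nil) (print-level nil))
      ;; Declare the variable so the compiler does not warn about it.
      (prin1 `(defvar ,symbol) (current-buffer))
      (insert "\n")
      ;; `eval-when-compile' evaluates SYMBOL at compile time and embeds
      ;; the resulting object as a constant in the compiled file.
      (prin1 `(setq ,symbol (eval-when-compile ,symbol)) (current-buffer))
      (insert "\n")))
  (byte-compile-file file)
  (byte-compile-dest-file file))
```

Loading the returned .elc file in a later session restores the value that the
variable had when my-dump-variable was called.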
--
Thierry
* Re: Against sqlite3!!!
2021-12-09 19:16 ` Thierry Volpiatto
@ 2021-12-09 19:24 ` Qiantan Hong
2021-12-09 19:28 ` Qiantan Hong
1 sibling, 0 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-09 19:24 UTC (permalink / raw)
To: Thierry Volpiatto; +Cc: Alexandre Garreau, emacs-devel@gnu.org
> Emacs Lisp provides eval-when-compile, which allows saving elisp objects
> in compiled files; I have used this for years to save my variables in Emacs.
> See the psession package.
Interesting. Is the representation produced this way different from, or more
compact than, what you get by printing the object directly?
* Re: Against sqlite3!!!
2021-12-09 19:16 ` Thierry Volpiatto
2021-12-09 19:24 ` Qiantan Hong
@ 2021-12-09 19:28 ` Qiantan Hong
1 sibling, 0 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-09 19:28 UTC (permalink / raw)
To: Thierry Volpiatto; +Cc: Alexandre Garreau, emacs-devel@gnu.org
>> Emacs Lisp provides eval-when-compile, which allows saving elisp objects
>> in compiled files; I have used this for years to save my variables in Emacs.
>> See the psession package.
> Interesting. Is the representation produced this way different from, or more
> compact than, what you get by printing the object directly?
I tested some simple objects, and it seems that the representation in the .elc
file is exactly the same as what PRINT would produce, so it’s probably no
faster than PRINT/READing directly.