* Against sqlite3!!! (Was: sqlite3)
       [not found] <MN2PR12MB3391BC76A0D05236AC76C94E946E9@MN2PR12MB3391.namprd12.prod.outlook.com>
@ 2021-12-07  8:13 ` Qiantan Hong
  2021-12-07  9:14   ` Qiantan Hong
                     ` (3 more replies)
  0 siblings, 4 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-07  8:13 UTC (permalink / raw)
  To: emacs-devel@gnu.org

[-- Attachment #1.1: Type: text/plain, Size: 81 bytes --]

I’ve attached a pure Emacs Lisp implementation of persistent key value store.

[-- Attachment #1.2: Type: text/html, Size: 852 bytes --]

[-- Attachment #2: resist!.el --]
[-- Type: application/octet-stream, Size: 1667 bytes --]

;;; resist!.el --- Against SQLite3!  -*- lexical-binding: t; -*-

(defvar kv-store-table)
(defmacro --kv (key value)
  (puthash key value kv-store-table)
  nil)
(cl-defstruct (kv-store (:constructor make-kv-store-1))
  path table)
(defun make-kv-store (path)
  (let* ((kv-store (make-kv-store-1 :path path))
         (kv-store-table (make-hash-table :test 'equal))
         need-compactification)
    (when (file-exists-p path)
      (condition-case c
          (load-file path)
        (end-of-file
         ;; We might encounter trailing unbalanced form if Emacs
         ;; crashed in the middle of `kv-put'. We compact the file
         ;; and fix unbalanced form as a side effect
         (setq need-compactification t)))
      (setf (kv-store-table kv-store) kv-store-table))
    (when need-compactification
      (compact-kv-store kv-store))
    kv-store))
(defsubst kv-put-log (key value kv-store)
  (let ((print-length nil)
        (print-level nil))
    (prin1 `(--kv ,key ,value) (current-buffer)))
  (insert "\n"))
(defun compact-kv-store (kv-store)
  ;; dump the full content of kv-store-table at once
  ;; to compress the log
  (with-temp-buffer
    (maphash (lambda (key value) (kv-put-log key value kv-store))
             (kv-store-table kv-store))
    (let ((file-precious-flag t))
      (write-file (kv-store-path kv-store)))))
(defun kv-put (key value kv-store)
  (with-temp-buffer
    (kv-put-log key value kv-store)
    (append-to-file nil nil (kv-store-path kv-store)))
  (puthash key value (kv-store-table kv-store)))
(defun kv-get (key kv-store)
  (gethash key (kv-store-table kv-store)))
(provide 'resist!)

[-- Attachment #3: ATT00001.htm --]
[-- Type: text/html, Size: 332 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!! (Was: sqlite3)
  2021-12-07  8:13 ` Against sqlite3!!! (Was: sqlite3) Qiantan Hong
@ 2021-12-07  9:14   ` Qiantan Hong
  2021-12-07 12:49   ` Against sqlite3!!! Colin Baxter 😺
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-07  9:14 UTC (permalink / raw)
  To: emacs-devel@gnu.org

A first unscientific benchmark, people!

(defvar test (make-kv-store "~/Projects/test.eld"))
(loop for i below 10000
  do (kv-put i '(1 3 3) test))
(measure-time (make-kv-store "~/Projects/test.eld"))
"0.025423"

Hua, 0.025s for loading 10k entries, by abusing the built-in C load directly.

> On Dec 7, 2021, at 12:13 AM, Qiantan Hong <qhong@MIT.EDU> wrote:
>
>> I’ve attached a pure Emacs Lisp implementation of persistent key value store.
> <resist!.el>
> Seems to be stuck for too long, so I’ve resent it; sorry for the spam.

^ permalink raw reply	[flat|nested] 26+ messages in thread
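The benchmark above relies on a `measure-time` macro and a home-directory path that are not shown in the thread. A minimal, self-contained sketch of a comparable measurement follows, using the standard `benchmark-elapse` macro and a temporary file. The name `my-kv-bench-load` and the `(++ KEY VALUE)` entry shape are assumptions made for illustration, and the timing it reports will differ from the figure quoted above.

```elisp
(require 'benchmark)

(defun my-kv-bench-load (n)
  "Write N log entries to a temporary file and time reading them back.
Return (ELAPSED-SECONDS . ENTRY-COUNT).  Hypothetical helper, not part
of resist!.el."
  (let ((file (make-temp-file "kv-bench" nil ".eld"))
        (table (make-hash-table :test 'equal)))
    (unwind-protect
        (progn
          ;; Build the whole log in one buffer and write it in one call.
          (with-temp-file file
            (dotimes (i n)
              (prin1 (list '++ i '(1 3 3)) (current-buffer))
              (insert "\n")))
          (cons
           ;; Time only the load: read each form and store it in TABLE.
           (benchmark-elapse
             (with-temp-buffer
               (insert-file-contents file)
               (condition-case nil
                   (while t
                     (let ((entry (read (current-buffer))))
                       (puthash (nth 1 entry) (nth 2 entry) table)))
                 ;; `read' signals `end-of-file' once the log is exhausted.
                 (end-of-file nil))))
           (hash-table-count table)))
      (delete-file file))))
```

Evaluating `(my-kv-bench-load 10000)` returns the elapsed time in seconds paired with the number of entries loaded, which makes it easy to repeat the measurement with larger values.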
* Re: Against sqlite3!!!
  2021-12-07  8:13 ` Against sqlite3!!! (Was: sqlite3) Qiantan Hong
  2021-12-07  9:14   ` Qiantan Hong
@ 2021-12-07 12:49   ` Colin Baxter 😺
  2021-12-07 13:21   ` Stefan Monnier
  2021-12-07 13:45   ` Against sqlite3!!! (Was: sqlite3) Zhu Zihao
  3 siblings, 0 replies; 26+ messages in thread
From: Colin Baxter 😺 @ 2021-12-07 12:49 UTC (permalink / raw)
  To: Qiantan Hong; +Cc: emacs-devel@gnu.org

>>>>> "Qiantan" == Qiantan Hong <qhong@mit.edu> writes:

    Qiantan> I’ve attached a pure Emacs Lisp implementation of
    Qiantan> persistent key value store.

Excellent! Long live text solutions.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-07  8:13 ` Against sqlite3!!! (Was: sqlite3) Qiantan Hong
  2021-12-07  9:14   ` Qiantan Hong
  2021-12-07 12:49   ` Against sqlite3!!! Colin Baxter 😺
@ 2021-12-07 13:21   ` Stefan Monnier
  2021-12-07 13:55     ` Qiantan Hong
  2021-12-07 13:45   ` Against sqlite3!!! (Was: sqlite3) Zhu Zihao
  3 siblings, 1 reply; 26+ messages in thread
From: Stefan Monnier @ 2021-12-07 13:21 UTC (permalink / raw)
  To: Qiantan Hong; +Cc: emacs-devel@gnu.org

>           (load-file path)

Please don't:
- The function is called `load` (`load-file` is just one of the
  *commands* defined to access the functionality interactively).
- This is a security hole.

`insert-file-contents + read` should hopefully still be fast enough.


        Stefan

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-07 13:21   ` Stefan Monnier
@ 2021-12-07 13:55     ` Qiantan Hong
  2021-12-07 15:51       ` Tassilo Horn
  0 siblings, 1 reply; 26+ messages in thread
From: Qiantan Hong @ 2021-12-07 13:55 UTC (permalink / raw)
  To: Stefan Monnier, Zhu Zihao; +Cc: emacs-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 730 bytes --]

> Please don't:
> - The function is called `load` (`load-file` is just one of the
>  *commands* defined to access the functionality interactively).
> - This is a security hole.
>
> `insert-file-contents + read` should hopefully still be fast enough.
Indeed. I’ve attached an updated version, and it runs just as fast.
I also added kv-rem.

> Actually, Emacs can serialize/deserialize hash table directly via
> prin1-to-string & read
I know, this is the current standard practice. If you follow the
sqlite3 thread, you’ll see that the problem is that saving/loading
(aka printing/reading) the whole hash table/alist takes too much time.
The point of my implementation is to do it incrementally, every time
the key value store is mutated.


[-- Attachment #2: resist!.el --]
[-- Type: application/octet-stream, Size: 2199 bytes --]

;;; resist!.el --- Against SQLite3!  -*- lexical-binding: t; -*-

(cl-defstruct (kv-store (:constructor make-kv-store-1))
  path table)
(defun make-kv-store (path)
  (let* ((kv-store (make-kv-store-1 :path path))
         (kv-store-table (make-hash-table :test 'equal))
         need-compactification)
    (when (file-exists-p path)
      (condition-case nil
          (with-temp-buffer
            (insert-file-contents path)
            (while (< (point) (1- (point-max))) ; exclude trailing newline
              (let ((entry (read (current-buffer))))
                (pcase (car entry)
                  ('++ (puthash (cadr entry) (caddr entry) kv-store-table))
                  ('-- (remhash (cadr entry) kv-store-table))))))
        (end-of-file
         ;; We might encounter trailing unbalanced form if Emacs
         ;; crashed in the middle of `kv-put'. We compact the file
         ;; and fix unbalanced form as a side effect
         (setq need-compactification t)))
      (setf (kv-store-table kv-store) kv-store-table))
    (when need-compactification
      (compact-kv-store kv-store))
    kv-store))
(defsubst kv-put-log (key value)
  (let ((print-length nil)
        (print-level nil))
    (prin1 (list '++ key value) (current-buffer)))
  (insert "\n"))
(defsubst kv-rem-log (key)
  (let ((print-length nil)
        (print-level nil))
    (prin1 (list '-- key) (current-buffer)))
  (insert "\n"))
(defun compact-kv-store (kv-store)
  ;; dump the full content of kv-store-table at once
  ;; to compress the log
  (with-temp-buffer
    (maphash (lambda (key value) (kv-put-log key value))
             (kv-store-table kv-store))
    (let ((file-precious-flag t))
      (write-file (kv-store-path kv-store)))))
(defun kv-put (key value kv-store)
  (with-temp-buffer
    (kv-put-log key value)
    (append-to-file nil nil (kv-store-path kv-store)))
  (puthash key value (kv-store-table kv-store)))
(defun kv-rem (key)
  (with-temp-buffer
    (kv-rem-log key)
    (append-to-file nil nil (kv-store-path kv-store)))
  (remhash key (kv-store-table kv-store)))
(defun kv-get (key kv-store)
  (gethash key (kv-store-table kv-store)))
(provide 'resist!)

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-07 13:55     ` Qiantan Hong
@ 2021-12-07 15:51       ` Tassilo Horn
  2021-12-07 16:35         ` Qiantan Hong
  0 siblings, 1 reply; 26+ messages in thread
From: Tassilo Horn @ 2021-12-07 15:51 UTC (permalink / raw)
  To: Qiantan Hong; +Cc: emacs-devel, Stefan Monnier, Zhu Zihao

Hi Hong,

I'm not very knowledgeable with cl-* stuff but doesn't your
implementation load all key-value pairs at once?  That would be quite a
disadvantage compared to a DB approach where I'd naturally expect that
only the value I'm asking for is loaded.

Also, how would it ensure consistency when I have 2 parallel emacs
sessions (like one for mail/irc and one for programming/editing) where
session 1 modifies the value of key A and the other of key B?  It looks
like the values of the kv-store that gets saved later will win.  In case
that's the emacs session which has modified B, it'll revert A to the
state before the other session modified it, no?

If that were true, I'd say your resist!.el is a non-starter in the
current form.  It should at least load only values explicitly asked for
and only persist/override actually changed values.  A trivial solution
could use one file per key.  Not sure how sensible that is.

Bye,
Tassilo

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-07 15:51       ` Tassilo Horn
@ 2021-12-07 16:35         ` Qiantan Hong
  2021-12-07 18:43           ` Arthur Miller
                             ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-07 16:35 UTC (permalink / raw)
  To: Tassilo Horn; +Cc: Zhu Zihao, Stefan Monnier, emacs-devel@gnu.org

> I'm not very knowledgeable with cl-* stuff but doesn't your
> implementation load all key-value pairs at once?  That would be quite a
> disadvantage compared to a DB approach where I'd naturally expect that
> only the value I'm asking for is loaded.
It does, but I’ve done some benchmarks and it loads 10k entries in
0.02~0.03 seconds. 100k entries take <0.5s.
I’d say it should suffice for most Emacs applications I know of.
Nobody is using Emacs for trillions of business records.

On the other hand, the save operation is fully incremental and doesn’t
even need to be invoked explicitly.

That said, if we really find out loading is not fast enough, I might
come up with some way to load it in segments lazily. I doubt that will
ever become necessary.

> Also, how would it ensure consistency when I have 2 parallel emacs
> sessions (like one for mail/irc and one for programming/editing) where
> session 1 modifies the value of key A and the other of key B?  It looks
> like the values of the kv-store that gets saved later will win.  In case
> that's the emacs session which has modified B, it'll revert A to the
> state before the other session modified it, no?
Since it records a log of deltas instead of printing the whole data
structure, different keys won’t interfere.

That said, currently it probably won’t work because UNIX append
is not atomic and writes will probably be interleaved into nonsense.
There’re various workarounds, lock file being one, but I like
the idea of keeping only one “controller” instance with exclusive
access to the file more.

> If that were true, I'd say your resist!.el is a non-starter in the
> current form.  It should at least load only values explicitly asked for
> and only persist/override actually changed values.  A trivial solution
> could use one file per key.  Not sure how sensible that is.
It does the latter. As I’ve mentioned, from the benchmark results,
the former doesn’t seem to be a big problem. You’ll do it at most
once for every Emacs instance anyway.


Best,
Qiantan

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-07 16:35         ` Qiantan Hong
@ 2021-12-07 18:43           ` Arthur Miller
  2021-12-07 19:13             ` Qiantan Hong
  2021-12-07 19:34           ` Tassilo Horn
  2021-12-07 19:52           ` Stefan Monnier
  2 siblings, 1 reply; 26+ messages in thread
From: Arthur Miller @ 2021-12-07 18:43 UTC (permalink / raw)
  To: Qiantan Hong; +Cc: emacs-devel@gnu.org, Zhu Zihao, Stefan Monnier, Tassilo Horn

Qiantan Hong <qhong@mit.edu> writes:

>>                                                   That would be quite a
>> disadvantage compared to a DB approach where I'd naturally expect that
>> only the value I'm asking for is loaded.

Sqlite does not read one key per request, it reads at least a page. So if
your system uses 64k pages, sqlite will read at least 64k at once. Then you
also have the system doing disk i/o, caching nodes and so on; so you are
never really getting just "the value you are asking for to be loaded".

> It does, but I’ve done some benchmarks and it loads 10k entries in
> 0.02~0.03 seconds. 100k entries take <0.5s.
> I’d say it should suffice for most Emacs applications I know of.
> Nobody is using Emacs for trillions of business records.
>
> On the other hand, the save operation is fully incremental and doesn’t
> even need to be invoked explicitly.
>
> That said, if we really find out loading is not fast enough, I might
> come up with some way to load it in segments lazily. I doubt that will
> ever become necessary.
>
>> Also, how would it ensure consistency when I have 2 parallel emacs
>> sessions (like one for mail/irc and one for programming/editing) where
>> session 1 modifies the value of key A and the other of key B?  It looks
>> like the values of the kv-store that gets saved later will win.  In case
>> that's the emacs session which has modified B, it'll revert A to the
>> state before the other session modified it, no?
> Since it records a log of deltas instead of printing the whole data
> structure, different keys won’t interfere.
>
> That said, currently it probably won’t work because UNIX append
> is not atomic and writes will probably be interleaved into nonsense.
> There’re various workarounds, lock file being one, but I like
> the idea of keeping only one “controller” instance with exclusive
> access to the file more.

You can do the same as sqlite does: just lock the entire file (db). Sqlite
locks the entire db while writes are performed. It can be configured to
allow concurrent reads, but only if there are no ongoing writes.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-07 18:43           ` Arthur Miller
@ 2021-12-07 19:13             ` Qiantan Hong
  0 siblings, 0 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-07 19:13 UTC (permalink / raw)
  To: Arthur Miller
  Cc: emacs-devel@gnu.org, Zhu Zihao, Stefan Monnier, Tassilo Horn

> You can do the same as sqlite does: just lock the entire file (db). Sqlite
> locks the entire db while writes are performed. It can be configured to
> allow concurrent reads, but only if there are no ongoing writes.

Indeed, that’s a straightforward solution. Not sure if the Emacs
lock file mechanism is fast enough to run for every write, though
(Emacs doesn’t seem to expose UNIX flock/fcntl stuff).

There’s also the question of whether we should try to make a write
from one Emacs visible to a read from another Emacs.
AFAIC it doesn’t make much sense if we just want a persistent
backing storage. Should we also consider the (ab)use as an IPC
channel?


Best,
Qiantan

^ permalink raw reply	[flat|nested] 26+ messages in thread
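As a rough sketch of the whole-file locking idea discussed above, one option that needs no flock/fcntl access is an advisory lock file created atomically with the `excl` value of `write-region`'s MUSTBENEW argument. The names `my-kv-call-with-file-lock` and `my-kv-append-entry`, the `.lock` suffix, and the retry count are all assumptions chosen for illustration; this is not how resist!.el or Emacs's own file locking works, and the sketch makes no attempt to detect stale locks left by a crashed session.

```elisp
(defun my-kv-call-with-file-lock (path fn)
  "Call FN while holding an advisory lock file for PATH.
The lock is PATH with \".lock\" appended, created atomically.
Hypothetical helper; it does not detect stale locks."
  (let ((lock (concat path ".lock"))
        (tries 0)
        (locked nil))
    (while (not locked)
      (condition-case nil
          (progn
            ;; MUSTBENEW = `excl' makes creation fail if the file exists.
            (write-region "" nil lock nil 'silent nil 'excl)
            (setq locked t))
        (file-already-exists
         ;; Someone else holds the lock: wait briefly and retry.
         (setq tries (1+ tries))
         (when (>= tries 100)
           (error "Could not lock %s" path))
         (sleep-for 0.01))))
    (unwind-protect
        (funcall fn)
      (delete-file lock))))

(defun my-kv-append-entry (path entry)
  "Append ENTRY, printed with `prin1', to the log at PATH under the lock."
  (my-kv-call-with-file-lock
   path
   (lambda ()
     (with-temp-buffer
       (let ((print-length nil)
             (print-level nil))
         (prin1 entry (current-buffer)))
       (insert "\n")
       ;; APPEND = t adds the buffer contents to the end of PATH.
       (write-region nil nil path t 'silent)))))
```

Whether creating and deleting a lock file on every write is fast enough is exactly the open question raised above; batching several entries under one lock would reduce the cost.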
* Re: Against sqlite3!!!
  2021-12-07 16:35         ` Qiantan Hong
  2021-12-07 18:43           ` Arthur Miller
@ 2021-12-07 19:34           ` Tassilo Horn
  2021-12-08 10:00             ` Yuri Khan
  2021-12-07 19:52           ` Stefan Monnier
  2 siblings, 1 reply; 26+ messages in thread
From: Tassilo Horn @ 2021-12-07 19:34 UTC (permalink / raw)
  To: Qiantan Hong; +Cc: emacs-devel, Zhu Zihao, Stefan Monnier

Qiantan Hong <qhong@mit.edu> writes:

>> I'm not very knowledgeable with cl-* stuff but doesn't your
>> implementation load all key-value pairs at once?  That would be quite
>> a disadvantage compared to a DB approach where I'd naturally expect
>> that only the value I'm asking for is loaded.
> It does, but I’ve done some benchmarks and it loads 10k entries in
> 0.02~0.03 seconds. 100k entries take <0.5s.

But your values are only very small lists.

> I’d say it should suffice for most Emacs applications I know of.
> Nobody is using Emacs for trillions of business records.

I'd imagine that if such a feature became available, packages would
start using it and store more data than they do now for whatever
reasons.  Like eww/elpher could want to store the list of the last 100
visited URLs.

>> Also, how would it ensure consistency when I have 2 parallel emacs
>> sessions (like one for mail/irc and one for programming/editing) where
>> session 1 modifies the value of key A and the other of key B?  It looks
>> like the values of the kv-store that gets saved later will win.  In case
>> that's the emacs session which has modified B, it'll revert A to the
>> state before the other session modified it, no?
>
> Since it records a log of deltas instead of printing the whole data
> structure, different keys won’t interfere.

Ah, all right.  By skimming the code I first thought the log would
only be for analysis/debugging purposes.

> That said, currently it probably won’t work because UNIX append
> is not atomic and writes will probably be interleaved into nonsense.
> There’re various workarounds, lock file being one, but I like the idea
> of keeping only one “controller” instance with exclusive access to the
> file more.

Interesting, but how would emacs instances interact with that controller
instance?  And would that controller instance simply be the first emacs
instance of a user?  What if the controller blocks because Gnus is
running and currently downloading mails with huge attachments?

>> If that were true, I'd say your resist!.el is a non-starter in the
>> current form.  It should at least load only values explicitly asked for
>> and only persist/override actually changed values.  A trivial solution
>> could use one file per key.  Not sure how sensible that is.
> It does the latter.

No, it uses one file per kv-store but you can have as many kv-stores as
you like, e.g., one per package.  Or do you mean "it only overrides
changed values" with "the latter"?  Indeed, that's true.  I've played
with the code and now understand it.  Nice! :-)

(kv-rem is missing a kv-store arg.)

> As I’ve mentioned, from the benchmark results, the former doesn’t
> seem to be a big problem. You’ll do it at most once for every Emacs
> instance anyway.

Yeah, and since it's no global store in the sense of "every package
feeds its data into the same store", my argument is void.  If my emacs
session doesn't start Gnus, Gnus won't load its own store.

Bye,
Tassilo

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-07 19:34           ` Tassilo Horn
@ 2021-12-08 10:00             ` Yuri Khan
  0 siblings, 0 replies; 26+ messages in thread
From: Yuri Khan @ 2021-12-08 10:00 UTC (permalink / raw)
  To: Tassilo Horn; +Cc: Qiantan Hong, Zhu Zihao, Stefan Monnier, Emacs developers

On Wed, 8 Dec 2021 at 03:19, Tassilo Horn <tsdh@gnu.org> wrote:

> > There’re various workarounds, lock file being one, but I like the idea
> > of keeping only one “controller” instance with exclusive access to the
> > file more.
>
> Interesting, but how would emacs instances interact with that controller
> instance?  And would that controller instance simply be the first emacs
> instance of a user?  What if the controller blocks because Gnus is
> running and currently downloading mails with huge attachments?

This line of thought, if followed naturally, leads to a dedicated
database server process. With multiple concurrent clients, isolation
levels, transaction control, etc. And there are better database
servers than a special Emacs configuration owning an SQLite database
file.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-07 16:35         ` Qiantan Hong
  2021-12-07 18:43           ` Arthur Miller
  2021-12-07 19:34           ` Tassilo Horn
@ 2021-12-07 19:52           ` Stefan Monnier
  2 siblings, 0 replies; 26+ messages in thread
From: Stefan Monnier @ 2021-12-07 19:52 UTC (permalink / raw)
  To: Qiantan Hong; +Cc: Tassilo Horn, emacs-devel@gnu.org, Zhu Zihao

> Since it records a log of deltas instead of printing the whole data
> structure, different keys won’t interfere.
> That said, currently it probably won’t work because UNIX append
> is not atomic and writes will probably be interleaved into nonsense.
> There’re various workarounds, lock file being one, but I like
> the idea of keeping only one “controller” instance with exclusive
> access to the file more.

I think allowing several instances to use the file at the same time is
important (I always have 2 sessions active at the same time and I'd like
to be able to use (and share) savehist with both of them).

But that only means having to lock while we're saving, which is a very
short time.  It also requires being able to refresh the in-memory data
once we detect that the on-disk data has been changed.


        Stefan

^ permalink raw reply	[flat|nested] 26+ messages in thread
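One way the refresh step mentioned above might look for an append-only log is to remember how many bytes have already been replayed and read only the tail when the file has grown. The sketch below is an assumption-laden illustration: `my-kv-apply-entry` and `my-kv-refresh` are hypothetical names, the `(++ KEY VALUE)` / `(-- KEY)` entry shapes follow the updated resist!.el posted earlier in the thread, and it assumes writers hold a lock so that only complete entries are ever on disk. A shrunken file is treated as a sign of compaction; a compacted file that happens to be larger than the saved offset would need a separate signal (for example a generation counter), which is not handled here.

```elisp
(defun my-kv-apply-entry (entry table)
  "Apply one log ENTRY, (++ KEY VALUE) or (-- KEY), to hash TABLE."
  (pcase (car entry)
    ('++ (puthash (nth 1 entry) (nth 2 entry) table))
    ('-- (remhash (nth 1 entry) table))))

(defun my-kv-refresh (path table offset)
  "Replay into TABLE the entries appended to PATH since byte OFFSET.
Return the new offset.  Hypothetical helper; it assumes writers only
append complete entries and hold a lock while doing so."
  (let ((size (or (file-attribute-size (file-attributes path)) 0)))
    (when (< size offset)
      ;; The file shrank, so it was compacted: start over.
      (clrhash table)
      (setq offset 0))
    (when (> size offset)
      (with-temp-buffer
        ;; BEG and END are byte offsets into the file.
        (insert-file-contents path nil offset size)
        (condition-case nil
            (while t
              (my-kv-apply-entry (read (current-buffer)) table))
          ;; `read' signals `end-of-file' after the last entry.
          (end-of-file nil))))
    size))
```

A session would call `my-kv-refresh` with an offset of 0 at startup, keep the returned value, and call it again before reads or from a timer to pick up entries written by other sessions.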
* Re: Against sqlite3!!! (Was: sqlite3)
  2021-12-07  8:13 ` Against sqlite3!!! (Was: sqlite3) Qiantan Hong
                     ` (2 preceding siblings ...)
  2021-12-07 13:21   ` Stefan Monnier
@ 2021-12-07 13:45   ` Zhu Zihao
  2021-12-07 14:50     ` Against sqlite3!!! David Engster
  3 siblings, 1 reply; 26+ messages in thread
From: Zhu Zihao @ 2021-12-07 13:45 UTC (permalink / raw)
  To: Qiantan Hong; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 634 bytes --]


Actually, Emacs can serialize/deserialize hash table directly via
prin1-to-string & read

```
(let ((ht (make-hash-table)))
  (puthash "test" "value" ht)
  (format "%S" ht))
```

You can use `read` to "parse" the string returned by that snippet and
get a hash table.

Qiantan Hong <qhong@mit.edu> writes:

> I’ve attached a pure Emacs Lisp implementation of persistent key value store.
>
> [4. resist!.el --- application/emacs-lisp; resist!.el]...
>
> [5. ATT00001.htm --- text/html; ATT00001.htm]...

--
Retrieve my PGP public key:

  gpg --recv-keys D47A9C8B2AE3905B563D9135BE42B352A9F6821F

Zihao

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 255 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread
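To make the round trip described above concrete, here is a small sketch that prints a hash table and reads it back. The function name `my-hash-table-round-trip` is hypothetical. Note one deliberate difference from the snippet in the message: the table is created with `:test 'equal`, because with the default `eql` test a lookup using a freshly created string key would not find the entry after reading.

```elisp
(defun my-hash-table-round-trip ()
  "Print a hash table to a string, read it back, and look up a key.
Return (IS-HASH-TABLE . VALUE).  Hypothetical helper for illustration."
  (let ((ht (make-hash-table :test 'equal))
        ;; Make sure nothing is abbreviated in the printed form.
        (print-length nil)
        (print-level nil))
    (puthash "test" "value" ht)
    (let* ((text (prin1-to-string ht))  ; a "#s(hash-table ...)" form
           (copy (read text)))
      (cons (hash-table-p copy)
            (gethash "test" copy)))))
```

The copy is a new table: later changes to the original are not reflected in it, so the whole table has to be printed again after each change, which is the cost the rest of the thread is concerned with.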
* Re: Against sqlite3!!!
  2021-12-07 13:45   ` Against sqlite3!!! (Was: sqlite3) Zhu Zihao
@ 2021-12-07 14:50     ` David Engster
  2021-12-07 20:00       ` Lars Ingebrigtsen
  0 siblings, 1 reply; 26+ messages in thread
From: David Engster @ 2021-12-07 14:50 UTC (permalink / raw)
  To: Zhu Zihao; +Cc: Qiantan Hong, emacs-devel

> Actually, Emacs can serialize/deserialize hash table directly via
> prin1-to-string & read
>
> ```
> (let ((ht (make-hash-table)))
>   (puthash "test" "value" ht)
>   (format "%S" ht))
> ```
>
> You can use `read` to "parse" the string returned by that snippet and
> get a hash table.

Yes, and it's slow. The Gnus registry is saved/loaded this way, and this
has annoyed me for years.

-David

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-07 14:50     ` Against sqlite3!!! David Engster
@ 2021-12-07 20:00       ` Lars Ingebrigtsen
  2021-12-08  6:11         ` Arthur Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-07 20:00 UTC (permalink / raw)
  To: David Engster; +Cc: Qiantan Hong, Zhu Zihao, emacs-devel

David Engster <deng@randomsample.de> writes:

> Yes, and it's slow. The Gnus registry is saved/loaded this way, and this
> has annoyed me for years.

Yup.  The Gnus registry would be well suited to use sqlite directly,
though -- it's basically hand-maintaining a (large) database, and sqlite
is a good fit for that.

That is, I don't think the new normal persistence method would be ideal
for the registry.

But, yes, the Gnus registry is a good demonstration of why serialising
hash tables to disk is unworkable in practice.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-07 20:00       ` Lars Ingebrigtsen
@ 2021-12-08  6:11         ` Arthur Miller
  2021-12-08  6:20           ` Qiantan Hong
  2021-12-09  7:12           ` Alexandre Garreau
  0 siblings, 2 replies; 26+ messages in thread
From: Arthur Miller @ 2021-12-08  6:11 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: Qiantan Hong, Zhu Zihao, David Engster, emacs-devel

Lars Ingebrigtsen <larsi@gnus.org> writes:

> David Engster <deng@randomsample.de> writes:
>
>> Yes, and it's slow. The Gnus registry is saved/loaded this way, and this
>> has annoyed me for years.
>
> Yup.  The Gnus registry would be well suited to use sqlite directly,
> though -- it's basically hand-maintaining a (large) database, and sqlite
> is a good fit for that.
>
> That is, I don't think the new normal persistence method would be ideal
> for the registry.
>
> But, yes, the Gnus registry is a good demonstration of why serialising
> hash tables to disk is unworkable in practice.

Then implement a way for Emacs to dump lisp objects to files faster. It
would be very useful for Emacs in general.

You will still have to serialize your gnus db to a sqlite db, and write it
to disk via sqlite, so disk access will still be there.

Have you tried to write it in chunks in an idle timer, or all in one go?
Is that possible?

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-08  6:11         ` Arthur Miller
@ 2021-12-08  6:20           ` Qiantan Hong
  2021-12-08  9:21             ` Arthur Miller
  2021-12-09  7:12           ` Alexandre Garreau
  1 sibling, 1 reply; 26+ messages in thread
From: Qiantan Hong @ 2021-12-08  6:20 UTC (permalink / raw)
  To: Arthur Miller
  Cc: larsi@gnus.org, Zhu Zihao, David Engster, emacs-devel@gnu.org

> Then implement a way for Emacs to dump lisp objects to files faster. It
> would be very useful for Emacs in general.
I think for this question, the once-and-for-all solution is to have a
fully incremental persistent object store, i.e. all mutations are stored
incrementally without ever needing to print out the “fully value”
of a Lisp value. What do you think?

resist!.el in its current form basically implements a special case
of the above, where only mutation to the top level hash table
is persisted incrementally.

I can’t see a way to implement a persistent object store without
some non-trivial memory overhead, though (because each
object has to get a unique id). The best thing I can come up
with has to have a hash table that maps every object in the
store to a numeric ID. Is that too much?

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-08  6:20           ` Qiantan Hong
@ 2021-12-08  9:21             ` Arthur Miller
  2021-12-08  9:28               ` Qiantan Hong
  0 siblings, 1 reply; 26+ messages in thread
From: Arthur Miller @ 2021-12-08  9:21 UTC (permalink / raw)
  To: Qiantan Hong
  Cc: larsi@gnus.org, Zhu Zihao, David Engster, emacs-devel@gnu.org

Qiantan Hong <qhong@mit.edu> writes:

>> Then implement a way for Emacs to dump lisp objects to files faster. It
>> would be very useful for Emacs in general.
> I think for this question, the once-and-for-all solution is to have a
> fully incremental persistent object store, i.e. all mutations are stored
> incrementally without ever needing to print out the “fully value”
> of a Lisp value. What do you think?

I am not sure I understand what you mean. What is the "fully value" of a
Lisp value? You mean the entire object?

> resist!.el in its current form basically implements a special case
> of the above, where only mutation to the top level hash table
> is persisted incrementally.
>
> I can’t see a way to implement a persistent object store without
> some non-trivial memory overhead, though (because each
> object has to get a unique id). The best thing I can come up
> with has to have a hash table that maps every object in the
> store to a numeric ID. Is that too much?

Honestly, I have no idea. I just meant that it is very useful to be able to
serialize/deserialize lisp objects, in general, without the need to go
through some intermediate key/value database.

I don't know what you are doing to start with; I haven't looked at your
resist.el, but maybe you can cache keys that need to be flushed to disk in
some 'dirty state (a list of keys) and flush just those keys to a file in
an idle timer.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-08  9:21             ` Arthur Miller
@ 2021-12-08  9:28               ` Qiantan Hong
  0 siblings, 0 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-08  9:28 UTC (permalink / raw)
  To: Arthur Miller
  Cc: larsi@gnus.org, Zhu Zihao, David Engster, emacs-devel@gnu.org

> I am not sure I understand what you mean. What is the "fully value" of a
> Lisp value? You mean the entire object?
Yes, and this is less than optimal. For example, there’re lots of
variables that hold a list which is added to/removed from frequently.
A straightforward implementation would need to print/read the whole
list every time. A cleverer object store would be able to figure out
exactly which CONS changed, and store a single record like
(rplacd *id-of-the-cons* `(*new-element* . ,(object *id-of-old-tail*)))

> I don't know what you are doing to start with; I haven't looked at your
> resist.el, but maybe you can cache keys that need to be flushed to disk in
> some 'dirty state (a list of keys) and flush just those keys to a file in
> an idle timer.
That’s not really relevant. resist!.el persists every kv-put operation
immediately, without the need for any explicit save operation.
It might be a good idea to run compact-kv-store in an idle timer
to save disk space, though.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-08  6:11         ` Arthur Miller
  2021-12-08  6:20           ` Qiantan Hong
@ 2021-12-09  7:12           ` Alexandre Garreau
  2021-12-09  7:27             ` Qiantan Hong
  1 sibling, 1 reply; 26+ messages in thread
From: Alexandre Garreau @ 2021-12-09  7:12 UTC (permalink / raw)
  To: emacs-devel

On Wednesday, 8 December 2021, 07:11:19 CET, Arthur Miller wrote:
> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
> > David Engster <deng@randomsample.de> writes:
> >
> >> Yes, and it's slow. The Gnus registry is saved/loaded this way, and
> >> this has annoyed me for years.
> >
> > Yup.  The Gnus registry would be well suited to use sqlite directly,
> > though -- it's basically hand-maintaining a (large) database, and
> > sqlite is a good fit for that.
> >
> > That is, I don't think the new normal persistence method would be
> > ideal for the registry.
> >
> > But, yes, the Gnus registry is a good demonstration of why serialising
> > hash tables to disk is unworkable in practice.
>
> Then implement a way for Emacs to dump lisp objects to files faster. It
> would be very useful for Emacs in general.
>
> You will still have to serialize your gnus db to a sqlite db, and write it
> to disk via sqlite, so disk access will still be there.

Well, I think, wrt writes, it’s impossible to go faster than just writing
plaintext.  But sqlite is here to make *reads* faster, not writes.  The
only way I can see to make writes faster without any form of serialization
is just garbage collecting related data together, and directly dumping
memory onto disk, in a native but non-portable format, along with metadata
about the native format, and specific implementations to decode it from
other platforms for each memory format.  I’d love to see that.  But I
think writing is not the bottleneck that we try to improve here.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: Against sqlite3!!!
  2021-12-09  7:12           ` Alexandre Garreau
@ 2021-12-09  7:27             ` Qiantan Hong
       [not found]               ` <24465971.J1OoJ6LT5i@galex-713.eu>
  2021-12-09 13:17               ` Stefan Monnier
  0 siblings, 2 replies; 26+ messages in thread
From: Qiantan Hong @ 2021-12-09  7:27 UTC (permalink / raw)
  To: Alexandre Garreau; +Cc: emacs-devel@gnu.org

> Well, I think, wrt writes, it’s impossible to go faster than just writing
> plaintext.  But sqlite is here to make *reads* faster, not writes.  The
> only way I can see to make writes faster without any form of serialization
> is just garbage collecting related data together, and directly dumping
> memory onto disk, in a native but non-portable format, along with metadata
> about the native format, and specific implementations to decode it from
> other platforms for each memory format.  I’d love to see that.  But I
> think writing is not the bottleneck that we try to improve here.
I think write still *used to* be a bottleneck, in the sense that
without incremental updates, every write requires dumping
the whole store, thus preventing Emacs from remembering
things *early and eagerly*.

resist! solves exactly this. Any mutation to the kv-store
is immediately remembered by appending to a log, and
make-persistent-variable remembers all changes every
persistent-variable-idle-time (1 second seems reasonable).

> Well, I think, wrt writes, it’s impossible to go faster than just writing
> plaintext.  But sqlite is here to make *reads* faster, not writes.
Did you mean the initial loading up, or substantial reads?
For the substantial reads, sqlite3 is probably not much faster
than a single gethash.
In its current form, resist! does require initially loading
the whole store, but I think it’s unclear whether that is
a bottleneck. See
https://lists.gnu.org/archive/html/emacs-devel/2021-12/msg00865.html

^ permalink raw reply	[flat|nested] 26+ messages in thread
[parent not found: <24465971.J1OoJ6LT5i@galex-713.eu>]
* Re: Against sqlite3!!! [not found] ` <24465971.J1OoJ6LT5i@galex-713.eu> @ 2021-12-09 7:50 ` Qiantan Hong 2021-12-09 19:16 ` Thierry Volpiatto 0 siblings, 1 reply; 26+ messages in thread From: Qiantan Hong @ 2021-12-09 7:50 UTC (permalink / raw) To: Alexandre Garreau; +Cc: emacs-devel@gnu.org

> If you start dividing your file into pages and loading one page at a time,
> you’re starting to reimplement a kv-database, just as gdbm (or some
> filesystems). You’d better reuse gdbm,

There is a subtle but IMO significant benefit of resist!: it understands more than kv-put. In the current version, it also understands kv-push and kv-delete (aka list operations).

I imagine this can be very useful for Lisp, because a lot of the time people just add or remove data from a list. An implementation that doesn’t understand this would need to dump the whole list again.

> and work on something higher level
> and more fun like a lisp sql-like query language, inspired from (or even
> compatible with) sql or sparql or prolog, except each expression would
> return a lisp value. That would be very beautiful.

That’s definitely a cool idea. I’ll probably just go for an embedded Prolog with S-exp syntax.

It’s probably not my current priority though, because Emacs packages at the moment still mostly use LISt Processing, and somehow people still find enough incentive to introduce SQLite3, which I attribute to the lack of a decent pure Lisp persistent store. So that’s what resist! aims to solve at this moment. Maybe if people still want SQLite3 after resist! has solved the problem, I’ll write another package, revolt!, which implements Prolog.

Best,
Qiantan
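To illustrate why a log that understands list operations avoids re-dumping whole lists, here is a minimal sketch. It is an assumption about the general technique, not the actual kv-push and kv-delete code from resist!, which does not appear in this thread; all `sketch-` names and the `(OP KEY VALUE)` entry format are hypothetical.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical sketch: names prefixed `sketch-' are not from resist!.

(defvar sketch-table (make-hash-table :test 'equal)
  "In-memory image of the store.")

(defun sketch-apply (entry)
  "Apply one log ENTRY of the form (OP KEY VALUE) to `sketch-table'."
  (pcase-exhaustive entry
    (`(put ,key ,value)
     (puthash key value sketch-table))
    (`(push ,key ,value)
     (puthash key (cons value (gethash key sketch-table)) sketch-table))
    (`(delete ,key ,value)
     (puthash key (remove value (gethash key sketch-table)) sketch-table))))

(defun sketch-log (file entry)
  "Apply ENTRY in memory, then append its printed form to FILE."
  (sketch-apply entry)
  (let ((print-length nil) (print-level nil))
    (write-region (concat (prin1-to-string entry) "\n")
                  nil file t 'silent)))

(defun sketch-replay (file)
  "Rebuild `sketch-table' by reading every entry logged in FILE."
  (clrhash sketch-table)
  (with-temp-buffer
    (insert-file-contents file)
    (condition-case nil
        (while t (sketch-apply (read (current-buffer))))
      ;; Reached the end of the log, or a truncated trailing entry.
      (end-of-file nil))))
```

With this layout, pushing one element onto a long list appends one short line such as `(push "k" 3)` to the log; the full list is only reconstructed when the log is replayed.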
* Re: Against sqlite3!!! 2021-12-09 7:50 ` Qiantan Hong @ 2021-12-09 19:16 ` Thierry Volpiatto 2021-12-09 19:24 ` Qiantan Hong 2021-12-09 19:28 ` Qiantan Hong 0 siblings, 2 replies; 26+ messages in thread From: Thierry Volpiatto @ 2021-12-09 19:16 UTC (permalink / raw) To: Qiantan Hong; +Cc: Alexandre Garreau, emacs-devel@gnu.org

Qiantan Hong <qhong@mit.edu> writes:

>> If you start dividing your file into pages and loading one page at a time,
>> you’re starting to reimplement a kv-database, just as gdbm (or some
>> filesystems). You’d better reuse gdbm,
> There is a subtle but IMO significant benefit of resist!: it understands
> more than kv-put. In the current version, it also understands
> kv-push and kv-delete (aka list operations).
>
> I imagine this can be very useful for Lisp, because a lot of the time people
> just add or remove data from a list. An implementation that doesn’t
> understand this would need to dump the whole list again.
>
>> and work on something higher level
>> and more fun like a lisp sql-like query language, inspired from (or even
>> compatible with) sql or sparql or prolog, except each expression would
>> return a lisp value. That would be very beautiful.
> That’s definitely a cool idea. I’ll probably just go for an embedded Prolog
> with S-exp syntax.
>
> It’s probably not my current priority though, because
> Emacs packages at the moment still mostly use LISt Processing,
> and somehow people still find enough incentive to introduce SQLite3,
> which I attribute to the lack of a decent pure Lisp persistent store.

Emacs Lisp provides eval-when-compile, which allows saving Elisp objects in compiled files. I have used this for years to save my variables in Emacs. See the psession package.

--
Thierry
* Re: Against sqlite3!!! 2021-12-09 19:16 ` Thierry Volpiatto @ 2021-12-09 19:24 ` Qiantan Hong 2021-12-09 19:28 ` Qiantan Hong 1 sibling, 0 replies; 26+ messages in thread From: Qiantan Hong @ 2021-12-09 19:24 UTC (permalink / raw) To: Thierry Volpiatto; +Cc: Alexandre Garreau, emacs-devel@gnu.org

> Emacs Lisp provides eval-when-compile, which allows saving Elisp objects
> in compiled files. I have used this for years to save my variables in Emacs.
> See the psession package.

Interesting. Is the representation produced in this way different from, or more compact than, what you get by printing the object directly?
* Re: Against sqlite3!!! 2021-12-09 19:16 ` Thierry Volpiatto 2021-12-09 19:24 ` Qiantan Hong @ 2021-12-09 19:28 ` Qiantan Hong 1 sibling, 0 replies; 26+ messages in thread From: Qiantan Hong @ 2021-12-09 19:28 UTC (permalink / raw) To: Thierry Volpiatto; +Cc: Alexandre Garreau, emacs-devel@gnu.org

>> Emacs Lisp provides eval-when-compile, which allows saving Elisp objects
>> in compiled files. I have used this for years to save my variables in Emacs.
>> See the psession package.
> Interesting. Is the representation produced in this way different from, or
> more compact than, what you get by printing the object directly?

I tested some simple objects, and it seems that the representation in the .elc file is exactly the same as what PRINT would produce, so it’s probably no faster than PRINT/READing directly.
* Re: Against sqlite3!!! 2021-12-09 7:27 ` Qiantan Hong [not found] ` <24465971.J1OoJ6LT5i@galex-713.eu> @ 2021-12-09 13:17 ` Stefan Monnier 1 sibling, 0 replies; 26+ messages in thread From: Stefan Monnier @ 2021-12-09 13:17 UTC (permalink / raw) To: Qiantan Hong; +Cc: Alexandre Garreau, emacs-devel@gnu.org

> resist! in its current form does require initially loading the whole store,
> but I think it’s unclear whether it is a bottleneck. See
> https://lists.gnu.org/archive/html/emacs-devel/2021-12/msg00865.html

It depends on the use case. sqlite is the winner for databases that are not meant to be memory-resident (usually because they're too large). It's also the clear winner when you have to read some other application's sqlite file ;-)

Stefan