From: Alexandre Garreau
Newsgroups: gmane.emacs.devel
Subject: Re: sqlite3
Date: Thu, 16 Dec 2021 06:05:28 +0100
Message-ID: <2601056.g262snxK1C@galex-713.eu>
References: <87tufmjyai.fsf@gnus.org> <2328395.9DzazV271f@galex-713.eu>
Cc: "larsi@gnus.org", Qiantan Hong, "emacs-devel@gnu.org"
To: emacs-devel@gnu.org

On Wednesday, 15 December 2021, at 16:36:19 CET, Qiantan Hong wrote:
> > the current highest-level interface, the choice between sqlite/files?
> > even before that, you could add your incremental-log-store to them,
> > and also, why not, at least for benchmark sake, a store where one
> > file contains everything such as custom.el
>
> I realized some serious problem about file-per-key method, I see Lars
> also use it so guess it will help to discuss.
> Basically, printing the key as file name does not guarantee
> distinguishing the key
> 1. Even if one escape the key, the printed representation may be larger
> than 255 bytes, or 8bytes in FAT32 (idk if any one still use it). In
> such case, two different key may give the same filename because the
> prefix is the same
> 2. What’s worse, some FS are case insensitive.

Strictly speaking, elisp symbols are case-sensitive (it’s a Lisp-2, of
the same kind as CL), but by convention elisp variable names are
lowercase, so case collisions should be rare in practice.

Also, the canonical *standard* for naming elisp variables is a subset
of what is allowed in filenames.
Nobody seriously quotes their variable names or exceeds the 72-column
limit with a single variable name and no indentation, so it might be
okay to throw an error in edge cases, for portability… or maybe just a
warning.

Plus, you could also hash the names, so you get only base 16, or 32,
or something similar.

> >> AFAIK the general way to avoid these issues is to store/log not the
> >> "data-diff" but the higher-level operation that caused this diff.
> >> E.g. log something like "add X to tree" instead of recording which
> >> nodes in the tree were modified in which way. This way, the
> >> presence or absence of cycles in the representation of the tree
> >> doesn't come into the picture at all.
>
> > Looks like Qiantan’s implementation of incremental log-like store.
>
> I think maybe this hinted us that we can get the best out of both
> by using filename as a “bucket splitter”, and have an incremental
> log store in each file? Then we can be loose about file name
> normalization because it’s just an optimization (and make it more
> readable). It’s also almost as easy for people to explore as
> one-file-per-key.

Why are you complicating it that much???
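
To make the hashing idea mentioned earlier in this message a bit more
concrete (hash the key, use the digest as the file name), here is a
minimal sketch. It is in Python rather than elisp, purely to show the
principle. The name `key_to_filename`, the choice of SHA-256, and the
truncation to 32 hex characters are all assumptions for illustration,
not something anyone proposed in this thread.

```python
import hashlib


def key_to_filename(key: str, digest_chars: int = 32) -> str:
    """Map an arbitrary key to a short file name (hypothetical helper).

    The result contains only the characters 0-9 and a-f and always has
    the same length, so it stays within typical file name length limits
    and is unaffected by case-insensitive filesystems.
    """
    # Hash the UTF-8 bytes of the key; hexdigest() yields lowercase hex.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    # Truncation is an assumption: 32 hex characters keep 128 bits.
    return digest[:digest_chars]


if __name__ == "__main__":
    for key in ("foo", "Foo", "x" * 300 + "a", "x" * 300 + "b"):
        print(key[:20], "->", key_to_filename(key))
```

With this, keys that differ only in case, or only after a long common
prefix, get different file names. The price is that the file names are
no longer readable, which is the trade-off against the plain
one-file-per-key layout. In elisp, `secure-hash` should give a
comparable hex digest.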