From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Dmitry Gutov Newsgroups: gmane.emacs.devel Subject: Re: sqlite3 Date: Tue, 14 Dec 2021 20:32:59 +0300 Message-ID: <853a500f-ae15-ccd7-561c-a5c03afcaae5@yandex.ru> References: <87tufmjyai.fsf@gnus.org> <87lf0nr2b4.fsf@gnus.org> <87tufbnnlq.fsf@gnus.org> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="10074"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 Cc: emacs-devel@gnu.org To: Lars Ingebrigtsen Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Tue Dec 14 18:35:02 2021 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1mxBhd-0002Nn-SB for ged-emacs-devel@m.gmane-mx.org; Tue, 14 Dec 2021 18:35:01 +0100 Original-Received: from localhost ([::1]:39218 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mxBhc-0006Wa-13 for ged-emacs-devel@m.gmane-mx.org; Tue, 14 Dec 2021 12:35:00 -0500 Original-Received: from eggs.gnu.org ([209.51.188.92]:58142) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mxBgf-0005bE-Dg for emacs-devel@gnu.org; Tue, 14 Dec 2021 12:34:01 -0500 Original-Received: from [2a00:1450:4864:20::134] (port=46683 helo=mail-lf1-x134.google.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1mxBgd-0006C8-J4 for emacs-devel@gnu.org; Tue, 14 Dec 2021 12:34:01 -0500 Original-Received: by mail-lf1-x134.google.com with SMTP id b1so38149305lfs.13 for ; Tue, 14 Dec 2021 09:33:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=sender:subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=HJQAYKdefCSmG4v/XdPU4Zlwp/lNmi75/6YQKVcaeNo=; b=TeyVOF1IkygkyPogSRz0SLghA077xXTqmtIQykeHLJQPJJsj28V6x6SCZbeDkpG2oJ tSiwspCaZP09Am17mSQtPMJOTJBBdgb5f3EkYBh++SAUF5sQeAQatZuQe+ioPZ98mJpb 8c9lSi3q/D7z46kl8UqQT7lEntBoclyKJNtvJEh20HXrd5G7XxBljQOz8EdQAPDKQ/vw sk6bITXQm3g2I47szOfPQiDfUT+Ok3xRluR7VCDaLc7uGJChr1FYVZfHUTdQ3M5vI3YG VEqEVJm/RCoj5odQ2a+6sOHutaNwaH+L25NgHXAroLSEPDCMXt5fCfwEXRpCq+3axuaN GKFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:sender:subject:to:cc:references:from:message-id :date:user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=HJQAYKdefCSmG4v/XdPU4Zlwp/lNmi75/6YQKVcaeNo=; b=bmU79SvbG2DfUrlboHYkflV1nEqPpmMYGSru9ysXX2YKwXrK5J2arlXCvAW0tYzkB3 TgBROQLcY+bFp4ZrQuWL0B5GqvbR4yZ3VUzWL7FibwJZCryGMIcUQtmLHAck3E/aqdHK NHQNOCPt5jBe+/v12iwv71gTwyYFXgA/cBkh+rzHoNnK2cRyM1tbB16w1rE17H8PPtnZ kSm12bG0BoQGbJn62DWpcmqcN+5IRXAFO9HgPHpO2jtbi0B9q/Ku1sMf16uUZAJGIxvK GdOoYnP2unznHUNuWsOC0bKZP1THYw7SPlgpNzR2f0aBuabU5FH8/sazSk7pCaDZYaev YXnQ== X-Gm-Message-State: AOAM5336ofHH9BevW/RmtjvGIfrUO0Iu31gA8GOcRwkGOCPUKIidvy9o Y1NkIliFdq4UogK41c1QZt0rkWvnJfc= X-Google-Smtp-Source: ABdhPJydOTsURGMbv0EEEdZTT/XEOpGQ4c2FXdrrlXqU2UdL5L12E1qTQydDXMVqD9E8bAylQF5Vig== X-Received: by 2002:ac2:4f8e:: with SMTP id z14mr5707041lfs.316.1639503237698; Tue, 14 Dec 2021 09:33:57 -0800 (PST) Original-Received: from [192.168.0.103] ([5.18.233.185]) by smtp.googlemail.com with ESMTPSA id b10sm57950lfj.230.2021.12.14.09.33.56 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 14 Dec 2021 09:33:57 -0800 (PST) In-Reply-To: <87tufbnnlq.fsf@gnus.org> Content-Language: en-US X-Host-Lookup-Failed: Reverse DNS lookup failed for 2a00:1450:4864:20::134 (failed) Received-SPF: pass client-ip=2a00:1450:4864:20::134; envelope-from=raaahh@gmail.com; helo=mail-lf1-x134.google.com X-Spam_score_int: -16 X-Spam_score: -1.7 X-Spam_bar: - X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FORGED_FROMDOMAIN=0.248, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.248, NICE_REPLY_A=-0.962, RCVD_IN_DNSWL_NONE=-0.0001, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:281927 Archived-At: On 14.12.2021 19:43, Lars Ingebrigtsen wrote: > Dmitry Gutov writes: > >> But a "proper" database might give other advantages like a faster >> search in the loaded data (unless it's already "indexed" by using hash >> tables everywhere where they could be used). >> >> Or being able to read the data without loading the whole file into >> memory. Which, for certain scenarios and data sets, might be a bigger >> advantage than faster writes. > > Here's a matrix of advantages and disadvantages to three approaches: > sqlite, one-file-per-value, and > one-file-with-a-hash-table-with-several/all-values: > > sqlite files hash > Read/write value speed ⚄ ⚅ ⚀ > Read/write value mem ⚅ ⚄ ⚀ > List all values speed ⚅ ⚀ ⚅ > List all values mem ⚃ ⚁ ⚃ > Ease of moving around ⚄ ⚀ ⚅ I'm not 100% sure how to interpret (is a higher value for "mem" better or worse?), but it seems like, at least, for the original scenario of having large data sets sqlite might still be optimal. >>> But it turns out that sqlite3 is actually slower for this particular use >>> case than just writing the data to a file (i.e., using the file system >>> as the database; one file per value). So multisession.el now offers two >>> backends (`files' and `sqlite'), and defaults to `files'. >> >> Does the latter scenario use as many files as you do 'COMMIT' in the >> former scenario? > > No, if you (cl-incf (multisession-value foo)) you'll get one COMMIT per > time, but there'll only be one foo.value file (at a time). OK, but it's still the same number of writes, more or less? IO is the slow part of most programs, and when it comes to an SQL database, it might have to do an update in multiple places (e.g. the data and the index), rather than do one smooth write. Might also depend on the size of the write (how big the values are). Speaking of the latter scheme, I might be missing some details, but sqlite should provide better atomicity guarantess in the same of being interrupted mid-write. Like, if we have one-file-per-value, then the total list of keys must live somewhere, and they can get desynchronized.