From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: Tassilo Horn Newsgroups: gmane.emacs.devel Subject: Re: [OFFTOPIC] size of issue tracker Date: Mon, 13 May 2019 20:59:51 +0200 Message-ID: <8736liuqmw.fsf@gnu.org> References: <1552789070.5272.1@yandex.ru> <87imwhmmt8.fsf@gmail.com> <87y347g1l3.fsf@iotcl.com> <9ac21e82-8e47-f9b5-f88d-23c0c56946d1@yandex.ru> <87pnpc1lby.fsf@iotcl.com> <83zhoezdqc.fsf@gnu.org> <87imuivfcr.fsf@iotcl.com> <83k1eyfxls.fsf@gnu.org> <17D21056-10B2-4813-AE90-9B2706936CE9@icloud.com> <83imuifqjc.fsf@gnu.org> <87lfzehrug.fsf@gmail.com> <83k1exec2n.fsf@gnu.org> <8736lkikqf.fsf@gmail.com> <83y33cb5nc.fsf@gnu.org> <87a7fsf91l.fsf@gnu.org> <87pnongeju.fsf@gnu.org> <83tvdy9rpm.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="167329"; mail-complaints-to="usenet@blaine.gmane.org" User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) Cc: toon@iotcl.com, emacs-devel@gnu.org, Stefan Monnier , agrambot@gmail.com, dgutov@yandex.ru To: Eli Zaretskii Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon May 13 21:02:51 2019 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([209.51.188.17]) by blaine.gmane.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1hQGDq-000hQD-Mn for ged-emacs-devel@m.gmane.org; Mon, 13 May 2019 21:02:50 +0200 Original-Received: from localhost ([127.0.0.1]:33743 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hQGDp-00008E-NI for ged-emacs-devel@m.gmane.org; Mon, 13 May 2019 15:02:49 -0400 Original-Received: from eggs.gnu.org ([209.51.188.92]:60895) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hQGBQ-0006zO-19 for emacs-devel@gnu.org; Mon, 13 May 2019 15:00:21 -0400 Original-Received: from fencepost.gnu.org ([2001:470:142:3::e]:49854) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hQGBP-0004Lu-05; Mon, 13 May 2019 15:00:19 -0400 Original-Received: from auth2-smtp.messagingengine.com ([66.111.4.228]:51275) by fencepost.gnu.org with esmtpsa (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256) (Exim 4.82) (envelope-from ) id 1hQGBI-0003cL-6b; Mon, 13 May 2019 15:00:17 -0400 Original-Received: from compute7.internal (compute7.nyi.internal [10.202.2.47]) by mailauth.nyi.internal (Postfix) with ESMTP id A9C5724837; Mon, 13 May 2019 15:00:11 -0400 (EDT) Original-Received: from mailfrontend1 ([10.202.2.162]) by compute7.internal (MEProxy); Mon, 13 May 2019 15:00:11 -0400 X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduuddrleeggddufeduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhephffvufhfffgjkfgfgggtsehttdertddtredtnecuhfhrohhmpefvrghsshhi lhhoucfjohhrnhcuoehtshguhhesghhnuhdrohhrgheqnecukfhppeelfedrvdefiedrud eftddrvddukeenucfrrghrrghmpehmrghilhhfrhhomhepthhhohhrnhdomhgvshhmthhp rghuthhhphgvrhhsohhnrghlihhthidqkeeijeefkeejkeegqdeifeehvdelkedqthhsug hhpeepghhnuhdrohhrghesfhgrshhtmhgrihhlrdhfmhenucevlhhushhtvghrufhiiigv pedt X-ME-Proxy: Original-Received: from thinkpad-t440p (p5dec82da.dip0.t-ipconnect.de [93.236.130.218]) by mail.messagingengine.com (Postfix) with ESMTPA id A08758006C; Mon, 13 May 2019 15:00:08 -0400 (EDT) Mail-Followup-To: Eli Zaretskii , Stefan Monnier , agrambot@gmail.com, toon@iotcl.com, dgutov@yandex.ru, emacs-devel@gnu.org In-Reply-To: <83tvdy9rpm.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 13 May 2019 20:42:13 +0300") X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:236481 Archived-At: Eli Zaretskii writes: >> > Why does Forge insist downloading the entire DB to the local >> > machine? That sounds like something that wouldn't scale well. >> >> I don't know why it does so, but it sounds like a good idea to me, so >> you can later use them when you're offline. > > If the DB is small enough, sure. It sounds like somehow it isn't the > case with Forge and GitLab. My database (~/.emacs.d/forge-database.sqlite) is currently 38 MB, and it contains the contents of 4 different projects from which gtk with its 1900 issues and 900 MRs is by far the largest one. There's a repository table from which you can join your way to the issues, merge/pull requests, their comments, labels, and assignees. That's the current schema if you are interested in details. --8<---------------cut here---------------start------------->8--- sqlite> .schema CREATE TABLE repository (class NOT NULL, id NOT NULL PRIMARY KEY, forge_id , forge , owner , name , apihost , githost , remote , sparse_p , created , updated , pushed , parent , description , homepage , default_branch , archived_p , fork_p , locked_p , mirror_p , private_p , issues_p , wiki_p , stars , watchers , assignees DEFAULT eieio_unbound, forks DEFAULT eieio_unbound, issues DEFAULT eieio_unbound, labels DEFAULT eieio_unbound, revnotes DEFAULT eieio_unbound, pullreqs DEFAULT eieio_unbound); CREATE TABLE assignee (repository NOT NULL, id NOT NULL PRIMARY KEY, login , name , forge_id , FOREIGN KEY (repository) REFERENCES repository (id) ON DELETE CASCADE); CREATE TABLE fork (parent NOT NULL, id NOT NULL PRIMARY KEY, owner , name , FOREIGN KEY (parent) REFERENCES repository (id) ON DELETE CASCADE); CREATE TABLE issue (class NOT NULL, id NOT NULL PRIMARY KEY, repository , number , state , author , title , created , updated , closed , unread_p , locked_p , milestone , body , assignees DEFAULT eieio_unbound, cards DEFAULT eieio_unbound, edits DEFAULT eieio_unbound, labels DEFAULT eieio_unbound, participants DEFAULT eieio_unbound, posts DEFAULT eieio_unbound, reactions DEFAULT eieio_unbound, timeline DEFAULT eieio_unbound, FOREIGN KEY (repository) REFERENCES repository (id) ON DELETE CASCADE); CREATE TABLE issue_assignee (issue NOT NULL, id NOT NULL, FOREIGN KEY (issue) REFERENCES issue (id) ON DELETE CASCADE); CREATE TABLE issue_label (issue NOT NULL, id NOT NULL, FOREIGN KEY (issue) REFERENCES issue (id) ON DELETE CASCADE, FOREIGN KEY (id) REFERENCES label (id) ON DELETE CASCADE); CREATE TABLE issue_post (class NOT NULL, id NOT NULL PRIMARY KEY, issue , number , author , created , updated , body , edits DEFAULT eieio_unbound, reactions DEFAULT eieio_unbound, FOREIGN KEY (issue) REFERENCES issue (id) ON DELETE CASCADE); CREATE TABLE label (repository NOT NULL, id NOT NULL PRIMARY KEY, name , color , description , FOREIGN KEY (repository) REFERENCES repository (id) ON DELETE CASCADE); CREATE TABLE notification (class NOT NULL, id NOT NULL PRIMARY KEY, repository , forge , reason , unread_p , last_read , updated , title , type , topic , url , FOREIGN KEY (repository) REFERENCES repository (id) ON DELETE CASCADE); CREATE TABLE pullreq (class NOT NULL, id NOT NULL PRIMARY KEY, repository , number , state , author , title , created , updated , closed , merged , unread_p , locked_p , editable_p , cross_repo_p , base_ref , base_repo , head_ref , head_user , head_repo , milestone , body , assignees DEFAULT eieio_unbound, cards DEFAULT eieio_unbound, commits DEFAULT eieio_unbound, edits DEFAULT eieio_unbound, labels DEFAULT eieio_unbound, participants DEFAULT eieio_unbound, posts DEFAULT eieio_unbound, reactions DEFAULT eieio_unbound, review_requests DEFAULT eieio_unbound, reviews DEFAULT eieio_unbound, timeline DEFAULT eieio_unbound, FOREIGN KEY (repository) REFERENCES repository (id) ON DELETE CASCADE); CREATE TABLE pullreq_assignee (pullreq NOT NULL, id NOT NULL, FOREIGN KEY (pullreq) REFERENCES pullreq (id) ON DELETE CASCADE); CREATE TABLE pullreq_label (pullreq NOT NULL, id NOT NULL, FOREIGN KEY (pullreq) REFERENCES pullreq (id) ON DELETE CASCADE, FOREIGN KEY (id) REFERENCES label (id) ON DELETE CASCADE); CREATE TABLE pullreq_post (class NOT NULL, id NOT NULL PRIMARY KEY, pullreq , number , author , created , updated , body , edits DEFAULT eieio_unbound, reactions DEFAULT eieio_unbound, FOREIGN KEY (pullreq) REFERENCES pullreq (id) ON DELETE CASCADE); CREATE TABLE revnote (class NOT NULL, id NOT NULL PRIMARY KEY, repository , "commit" , file , line , author , body , FOREIGN KEY (repository) REFERENCES repository (id) ON DELETE CASCADE); --8<---------------cut here---------------end--------------->8--- Fetching all new 19 issues & 10 MRs since my initial 2-hour fetch yesterday took about 3 minutes. I think the bottleneck here is the current GitLab API. If I find some time, I can re-run the procedure with an even bigger GitHub project. I've read GitHub has a more efficient GraphQL API for this stuff, and AFAIK, GitLab is or will implement something similar, so this slowness might be resolved in the mid-term future. If you have a preference for some project hosted on GitHub (e.g., because it has about the same numbers of issues as we have), drop me a mail. Bye, Tassilo