From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Bozhidar Batsov"
Newsgroups: gmane.emacs.devel
Subject: Re: Subprojects in project.el (Was: Eglot, project.el, and python virtual environments)
Date: Fri, 25 Nov 2022 09:07:38 +0200
Message-ID: <466bbd65-a7ae-4a29-b25a-e91c44695dad@app.fastmail.com>
References: <87zgcq68zp.fsf@ericabrahamsen.net>
 <4c5f4b07-3df6-d700-83f8-9a9d1b684afc@yandex.ru>
 <84781346-5b88-2be5-38bb-02696fcf1364@yandex.ru>
 <87o7t2vj19.fsf@dfreeman.email>
 <877czqtyfy.fsf@dfreeman.email>
 <87zgcml7g7.fsf@gmail.com>
 <2ba04533-097a-a1da-ff3f-2c9506fd488e@yandex.ru>
 <875yf9bbzb.fsf@gmail.com>
 <87wn7oa0aw.fsf@gmail.com>
 <7a5b76fd-fb15-8c1e-ea29-bf11f7e0d2ae@yandex.ru>
 <87bkoya815.fsf@gmail.com>
 <0024a67d-b8e5-b35c-1b22-82541a170eb3@yandex.ru>
 <871qptai4d.fsf_-_@gmail.com>
 <86bkowdjx5.fsf@gmail.com>
 <43aa2f10-d947-dfcd-82b0-f6f1be3aaaec@yandex.ru>
In-Reply-To: <43aa2f10-d947-dfcd-82b0-f6f1be3aaaec@yandex.ru>
To: "Emacs Devel"
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:300463
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=08fb7397c89b4a0f987e8eb5a463aed1

--08fb7397c89b4a0f987e8eb5a463aed1
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: quoted-printable

Here are my 2 (very generic) cents on the subject.

I'll just mention that sub-projects have been haunting me for years in
Projectile, so you'll definitely want to think long and hard about their
implementation, as people tend to have all sorts of setups. Sometimes I
even wonder if it's worth trying to support every possible use case, as
it's definitely a path of growing complexity and diminishing returns.

I definitely agree that such big and complex projects are the exception,
not the norm, so I wouldn't optimize for them, but rather aim to support
them in the least complex and most common way (e.g. Projectile mostly
focuses on marking subprojects with `.projectile` markers and on git
submodules). Obviously there are many ways to approach this, but at the
end of the day I'm always thinking about how common such projects are in
the real world and whether there are reasonable workarounds as an
alternative to supporting them "natively/out-of-the-box". Given that
project.el has been out for years and this topic is coming up only now,
there's clearly no great demand for subprojects functionality (and I've
had similar observations in the 11 years I've spent working on
Projectile).

On Fri, Nov 25, 2022, at 1:38 AM, Dmitry Gutov wrote:
> On 25/11/22 00:46, Tim Cross wrote:
> >
> > João Távora writes:
> >
> >> On Thu, Nov 24, 2022 at 3:01 AM Dmitry Gutov wrote:
> >>
> >>
> >>> I'm imagining that traversing a directory tree with an arbitrary
> >>> predicate is going to be slow. If the predicate is limited somehow (e.g.
> >>> to a list of "markers" as base file name, or at least wildcards), 'git
> >>> ls-files' can probably handle this, with certain but bounded cost.
> >
> > I've seen references to superior performance benefits of git ls-file a
> > couple of times in this thread, which has me a little confused.
> >
> > There has been lots in other threads regarding the importance of not
> > relying on and not basing development on an underlying assumption
> > regarding the VCS being used. For example, I would expect project.el to
> > be completely neutral with respect to the VCS used in a project.
>
> That's the situation where we can optimize this case: when a project is
> Git/Hg.
>
> > So how is git ls-file at all relevant when discussing performance
> > characteristics when identifying files in a project?
>
> Not files, though. Subprojects. Meaning, listing all (direct and
> indirect) subdirectories which satisfy a particular predicate. If the
> predicate is simple (has a particular project marker: file name or
> wildcard), it can be fetched in one shell command, like:
>
>    git ls-files -co -- "Makefile" "package.json"
>
> (which will traverse the directory tree for you, but will also use Git's
> cache).
>
> If the predicate is arbitrary (i.e. implemented in Lisp), the story
> would become harder.
>
> > I also wonder if some of the performance concerns may be premature. I've
> > seen references to poor performance in projects with 400k or even 100k
> > files. What is the expected/acceptable performance for projects of that
> > size? How common are projects of that size? When considering
> > performance, are we not better off focusing on the common case rather
> > than extreme cases, leaving the extremes for once we have a known
> > problem we can then focus in on?
>
> OT1H, large projects are relatively rare. OT2H, having a need for
> subprojects seems to be correlated with working on large projects.
>
> What is the common case, in your experience, and how is it better
> solved? Globally customizing a list of "markers", or customizing a list
> of subprojects for every "parent" project?
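
To make the marker-based idea quoted above a bit more concrete, here is a
rough shell sketch of how the output of such a `git ls-files` call could be
reduced to a list of candidate subproject directories. This is only an
illustration, not what project.el or Projectile actually do. The function
names `marker_dirs` and `list_subprojects` are made up for this example, and
`--exclude-standard` and the `:(glob)**/` pathspec form are my additions:
as far as I know, a pathspec without wildcards such as "Makefile" is matched
literally against paths relative to the current directory, so nested markers
need a glob. The sketch also assumes file names without newlines or other
characters that Git would quote in its output.

```sh
#!/bin/sh
# Hypothetical helper: read marker-file paths (one per line, as printed by
# `git ls-files`) on stdin and print the unique directories containing them.
marker_dirs() {
    while IFS= read -r path; do
        case "$path" in
            */*) printf '%s\n' "${path%/*}" ;;  # nested marker: strip the file name
            *)   printf '%s\n' "." ;;           # marker at the top of the tree
        esac
    done | LC_ALL=C sort -u
}

# Hypothetical wrapper: list candidate subproject directories below the
# current directory of a Git work tree. The marker names are examples only.
#   -c  tracked files, -o  untracked files,
#   --exclude-standard  skip ignored files (e.g. node_modules).
list_subprojects() {
    git ls-files -co --exclude-standard -- \
        ':(glob)**/Makefile' ':(glob)**/package.json' | marker_dirs
}
```

Running `list_subprojects` at the root of a work tree would then print one
directory per line, with "." standing for the root itself. An arbitrary Lisp
predicate has no such shortcut, which is the harder case mentioned above.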
= --08fb7397c89b4a0f987e8eb5a463aed1--