From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#50686: Show number of downloads on packages on GNU ELPA/NonGNU ELPA
Date: Mon, 11 Mar 2024 18:13:28 -0400
References: <985acef0-69f1-39c3-1354-9a49149c9df9@alphapapa.net> <1f2a10bf-c135-480d-9b79-17b64090fc7e@alphapapa.net>
In-Reply-To: (Adam Porter's message of "Mon, 11 Mar 2024 15:55:47 -0500")
Reply-To: Stefan Monnier
To: Adam Porter
Cc: 50686@debbugs.gnu.org, stefan@marxist.se, larsi@gnus.org

>>>> I had the logs only for two weeks or so (plus some old logs from
>>>> many years ago, actually), indeed.
>>> I see.  Are the rest of the logs still available on the ELPA server, or is
>>> that all we have for historical data?
>> That's all we have.
> Ok.  Going forward, will the logs we have now be preserved, or do they get
> rotated away?

They get rotated away.  We do keep the weekly counts that we accumulate
in our `wsl-stats.eld` file.

>>>>> a list of downloads per version, etc.
>>>> Currently I count the "interest" in the package, so I don't distinguish
>>>> the version of the package, nor whether the access is for the tarball or
>>>> the package's web page, or the package's readme.txt, or the package's badge.
>>> That seems like a very different kind of data than the number of times
>>> a package has been downloaded (i.e. by an Emacs instance).  IME a small
>>> fraction of hits to a package's GitHub repo seem to result in installations;
>>> "interest" tends to be far more than "interested enough to install."
>> Just because the "interest" tends to be far more than "interested enough
>> to install" doesn't mean that the two aren't strongly correlated.
>> Also my impression is that package web pages in `elpa.gnu.org` are not
>> visited nearly as often as a GitHub project page.
>> But it'd be definitely worth checking how the two measures compare.
>> Patches welcome.
> Ok, meaning that you'd accept a patch that does...what, exactly, to the
> database?  :)

I guess keep separate counts for tarballs and other files, so we can compare?

>>>> I'd like to keep the stats database reasonably small (it's currently
>>>> around 150kB, and I expect it'll take a year before it reaches 1MB), so
>>>> I'd rather not segregate per version.
>>> Is there a way that I could change your mind about that?  Having the actual
>>> download counts per version would be very useful.
>> Maybe if you argue about what kind of use would make it useful?
>
> For example, if a package at version V has N downloads after 6 months, and
> then the package is updated to version V+1, how many downloads that version
> has after 6 months would give some indication of whether the package is
> growing in popularity, whether initial users are still using it and
> upgrading it, or whether it's falling out of favor.  And, over time, that
> might help determine whether an obsolete package should be removed
> from ELPA.

Ah, so as to factor out the fact that frequently updated packages will
naturally see more downloads?
I guess that would make sense.
Not completely sure how to write the code, tho: I can see how to go and
dig in the numbers to answer "is the new version less/more popular than
the old one", but not how to use that insight to adjust the percentile
ranking of the package.

>> My goal was mostly to show relative popularity, so when you search for
>> packages providing a given feature and you find 4 different options, the
>> rank percentile can give you an idea of which one is more popular.
>
> That's definitely a worthy goal.
>
> Another goal that's relevant to me, as a package author, is to determine
> whether a package of mine is still in use at all.
> For example, my package org-ql is intended to subsume my older package,
> org-rifle, but I hear now and then about people who still use org-rifle.
> Eventually I'd like to see that the downloads of org-rifle fall off to
> the point that I could declare it an archived, obsoleted package, but
> I don't want to do that prematurely.
> (Those packages are on MELPA, but the principle applies regardless.)

Right.  I guess it would be hard to do because of the mirroring-style
downloads, so even the least popular package still gets downloads.

It's not super high on my todo list for now, but if you're interested in
improving this, I'll be happy to take your patches, install them and let
you play with it to see what comes up.

Currently the `wsl-stats.eld` "database" is not exposed on the web
site; part of the reason is that it contains some "irrelevant" entries
(accesses to non-existing files, some of them very much on purpose
because their names look like "_nonexisting") which may contain
information I'd rather not expose.  We should try and sanitize it first
to only keep things which do correspond to existing packages/files
(which will also improve the quality of the rankings).


        Stefan
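
PS: To make the sanitizing step above a bit more concrete, here is a
rough sketch in Python.  It is only an illustration under assumptions:
the thread does not show the layout of `wsl-stats.eld` or how the
ranking is really computed, so the plain name-to-count mapping, the
helper names `sanitize` and `rank_percentiles`, the definition of
"percentile" used here, and all the numbers are hypothetical.

```python
from bisect import bisect_left


def sanitize(counts, known_packages):
    """Drop entries that do not correspond to a known package."""
    return {name: n for name, n in counts.items() if name in known_packages}


def rank_percentiles(counts):
    """Map each package to the percentage of packages with strictly fewer accesses."""
    ordered = sorted(counts.values())
    total = len(ordered)
    return {
        name: 100.0 * bisect_left(ordered, n) / total
        for name, n in counts.items()
    }


# Made-up access counts, for illustration only (hypothetical package names).
raw = {"pkg-a": 200, "pkg-b": 10, "pkg-c": 50, "_nonexisting": 7}
known = {"pkg-a", "pkg-b", "pkg-c"}

clean = sanitize(raw, known)     # the "_nonexisting" entry is dropped
ranks = rank_percentiles(clean)  # pkg-b ranks lowest, pkg-a highest
```

Filtering before ranking matters because every junk entry left in the
table counts as one more "package" and so shifts the percentiles of the
real ones.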