unofficial mirror of bug-gnu-emacs@gnu.org 
From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: Adam Porter <adam@alphapapa.net>
Cc: 50686@debbugs.gnu.org, stefan@marxist.se, larsi@gnus.org
Subject: bug#50686: Show number of downloads on packages on GNU ELPA/NonGNU ELPA
Date: Mon, 11 Mar 2024 18:13:28 -0400
Message-ID: <jwvsf0wh1kp.fsf-monnier+emacs@gnu.org>
In-Reply-To: <d68bd5a7-1f1a-4f12-b37b-dde053981a2b@alphapapa.net> (Adam Porter's message of "Mon, 11 Mar 2024 15:55:47 -0500")

>>>> I had the logs only for two weeks or so (plus some old logs from
>>>> many years ago, actually), indeed.
>>> I see.  Are the rest of the logs still available on the ELPA server, or is
>>> that all we have for historical data?
>> That's all we have.
> Ok.  Going forward, will the logs we have now be preserved, or do they get
> rotated away?

They get rotated away.  We do keep the weekly counts that we accumulate
in our `wsl-stats.eld` file.

>>>>> a list of downloads per version, etc.
>>>> Currently I count the "interest" in the package, so I don't distinguish
>>>> the version of the package, nor whether the access is for the tarball or
>>>> the package's web page, or the package's readme.txt, or the package's badge.
>>> That seems like a very different kind of data than the number of times
>>> a package has been downloaded (i.e. by an Emacs instance).  IME a small
>>> fraction of hits to a package's GitHub repo seem to result in installations;
>>> "interest" tends to be far more than "interested enough to install."
>> Just because the "interest" tends to be far more than "interested enough
>> to install" doesn't mean that the two aren't strongly correlated.
>> Also my impression is that package web pages in `elpa.gnu.org` are not
>> visited nearly as often as a GitHub project page.
>> But it'd definitely be worth checking how the two measures compare.
>> Patches welcome.
> Ok, meaning that you'd accept a patch that does...what, exactly, to the
> database?  :)

I guess keep separate counts for tarballs and other files, so we can compare?
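
A minimal sketch of what such separate counts could look like, in Python for
illustration only. The URL patterns below (`/packages/NAME-VERSION.tar`,
`/packages/NAME.html`, `/packages/NAME-readme.txt`, `/packages/NAME.svg`) and
the function name `count_accesses` are assumptions made for this example, not
the actual server layout or existing code.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) URL layout; the real server layout may differ.
TARBALL_RE = re.compile(r"^/packages/(?P<name>[^/]+?)-\d[^/]*\.tar(?:\.\w+)?$")
OTHER_RE = re.compile(r"^/packages/(?P<name>[^/]+?)(?:-readme\.txt|\.html|\.svg)$")


def count_accesses(paths):
    """Return {package: {"tarball": n, "other": m}} for the given request paths."""
    counts = defaultdict(lambda: {"tarball": 0, "other": 0})
    for path in paths:
        match = TARBALL_RE.match(path)
        kind = "tarball"
        if match is None:
            match = OTHER_RE.match(path)
            kind = "other"
        if match is None:
            continue  # Not a recognized package file; ignore it.
        counts[match.group("name")][kind] += 1
    return dict(counts)
```

Keeping the two counters side by side per package would make it possible to
check how well "interest" tracks actual tarball downloads.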

>>>> I'd like to keep the stats database reasonably small (it's currently
>>>> around 150kB, and I expect it'll take a year before it reaches 1MB), so
>>>> I'd rather not segregate per version.
>>> Is there a way that I could change your mind about that?  Having the actual
>>> download counts per version would be very useful.
>> Maybe if you argue about what kind of use would make it useful?
>
> For example, if a package at version V has N downloads after 6 months, and
> then the package is updated to version V+1, how many downloads that version
> has after 6 months would give some indication of whether the package is
> growing in popularity, whether initial users are still using it and
> upgrading it, or whether it's falling out of favor.  And, over time, that
> might help determine whether an obsolete package should be removed
> from ELPA.

Ah, so as to factor out the fact that frequently updated packages will
naturally see more downloads?  I guess that would make sense.

Not completely sure how to write the code, tho: I can see how to go and
dig in the numbers to answer "is the new version less/more popular than
the old one", but not how to use that insight to adjust the percentile
ranking of the package.
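
For reference, a plain percentile ranking over per-package counts might be
sketched as below (Python, for illustration). The definition used here, the
share of packages whose count is at or below the package's own count, is an
assumption and not necessarily the formula the ELPA scripts use; the sketch
also does not answer the open question of how to fold per-version data into
the ranking.

```python
from bisect import bisect_right


def percentile_ranks(counts):
    """Map each package to the percentage of packages with a count <= its own.

    `counts` is a hypothetical {package_name: access_count} mapping.
    """
    ordered = sorted(counts.values())
    total = len(ordered)
    return {
        name: 100.0 * bisect_right(ordered, count) / total
        for name, count in counts.items()
    }
```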

>> My goal was mostly to show relative popularity, so when you search for
>> packages providing a given feature and you find 4 different options, the
>> rank percentile can give you an idea of which one is more popular.
>
> That's definitely a worthy goal.
>
> Another goal that's relevant to me, as a package author, is to determine
> whether a package of mine is still in use at all.  For example, my package
> org-ql is intended to subsume my older package, org-rifle, but I hear now
> and then about people who still use org-rifle.  Eventually I'd like to see
> that the downloads of org-rifle fall off to the point that I could declare
> it an archived, obsoleted package, but I don't want to do that prematurely.
> (Those packages are on MELPA, but the principle applies regardless.)

Right.  I guess it would be hard to do because of the mirroring-style
downloads, so even the least popular package still gets downloads.

It's not super high on my todo list for now, but if you're interested in
improving this, I'll be happy to take your patches, install them and let
you play with it to see what comes up.

Currently the `wsl-stats.eld` "database" is not exposed on the web
site, in part because it contains some "irrelevant" entries
(accesses to non-existing files, some of them very much on purpose
because their names look like "<RANDOM>_nonexisting") which may contain
information I'd rather not expose.  We should try and sanitize it first
to only keep things which do correspond to existing packages/files
(which will also improve the quality of the rankings).
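
A rough sketch of that sanitizing step, in Python for illustration. It
assumes the stats can be viewed as a mapping from a package name to a count
and that a list of existing package names is available; both the data shape
and the name `sanitize_stats` are hypothetical, since the real format of the
`.eld` file is not described here.

```python
def sanitize_stats(stats, known_packages):
    """Keep only the entries whose key names an existing package.

    `stats` is a hypothetical {name: count} mapping; `known_packages` is any
    iterable of the package names that actually exist.
    """
    known = set(known_packages)
    return {name: count for name, count in stats.items() if name in known}
```

Filtering against an allow-list of known names, rather than trying to
recognize the bogus entries, means nothing unexpected can slip through.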


        Stefan





