unofficial mirror of emacs-devel@gnu.org 
* Distribution statistics for ELPA and EMMS
From: Yoni Rabkin @ 2023-07-13 20:54 UTC (permalink / raw)
  To: emacs-devel


Hello all,

I'm putting together a talk for EmacsConf on Emms, which is installable
via M-x list-packages.

I want to start the talk with the claim that Emms is a popular music and
video package for Emacs. However, I have absolutely no numbers or
statistics to back that claim. Is there a way to find out? Are any
statistics collected by ELPA?

Is there a way to answer the question: how many times was Emms installed
in 2022?


(I'm sending this question to emacs-devel because on
"https://elpa.gnu.org/" under "Contact" it says that "general inquiries"
should go to "the emacs-devel mailing list". Apologies in advance if
this isn't the right place.)

-- 
   "Cut your own wood and it will warm you twice"




* Re: Distribution statistics for ELPA and EMMS
From: Eduardo Ochs @ 2023-07-13 23:16 UTC (permalink / raw)
  To: Yoni Rabkin; +Cc: emacs-devel

On Thu, 13 Jul 2023 at 17:55, Yoni Rabkin <yoni@rabkins.net> wrote:
>
> I want to start the talk with the claim that Emms is a popular music and
> video package for Emacs. However, I have absolutely no numbers or
> statistics to back that claim. Is there a way to find out? Are any
> statistics collected by ELPA?
>
> Is there a way to answer the question: how many times was Emms installed
> in 2022?

Hey, thanks for asking that!

I'm the author of an Emacs package that is in ELPA, and I have no
idea how many users it has (I mean, besides the ones that I've
interacted with). I was planning to start my talk about it at the
next EmacsConf by saying that it has "at least 10 users"... =/

  Cheers =),
  Eduardo Ochs
    http://anggtwu.net/#eev




* Re: Distribution statistics for ELPA and EMMS
From: Philip Kaludercic @ 2023-07-14  7:03 UTC (permalink / raw)
  To: Eduardo Ochs; +Cc: Yoni Rabkin, emacs-devel

Eduardo Ochs <eduardoochs@gmail.com> writes:

> On Thu, 13 Jul 2023 at 17:55, Yoni Rabkin <yoni@rabkins.net> wrote:
>>
>> I want to start the talk with the claim that Emms is a popular music and
>> video package for Emacs. However, I have absolutely no numbers or
>> statistics to back that claim. Is there a way to find out? Are any
>> statistics collected by ELPA?
>>
>> Is there a way to answer the question: how many times was Emms installed
>> in 2022?
>
> Hey, thanks for asking that!
>
> I'm the author of an Emacs package that is in ELPA and that I don't
> have any idea how many users it has - I mean, besides the ones that
> I've interacted with - and I was planning to start my talk about it
> in the next EmacsConf by saying that it has "at least 10 users"... =/

Then again, if you go by download counts like MELPA, you will severely
overestimate the number of users, since AFAIK they do not distinguish
between downloads and updates, nor do they know if someone just
installed a package and then immediately removed it.

There was some discussion about updating the protocol that package.el
uses; in that context, it would be nice to think about some reliable
yet privacy-preserving method of estimating the user count.

>   Cheers =),
>   Eduardo Ochs
>     http://anggtwu.net/#eev




* Re: Distribution statistics for ELPA and EMMS
From: Yoni Rabkin @ 2023-07-14 14:02 UTC (permalink / raw)
  To: emacs-devel

Philip Kaludercic <philipk@posteo.net> writes:

> Eduardo Ochs <eduardoochs@gmail.com> writes:
>
>> On Thu, 13 Jul 2023 at 17:55, Yoni Rabkin <yoni@rabkins.net> wrote:
>>>
>>> I want to start the talk with the claim that Emms is a popular music and
>>> video package for Emacs. However, I have absolutely no numbers or
>>> statistics to back that claim. Is there a way to find out? Are any
>>> statistics collected by ELPA?
>>>
>>> Is there a way to answer the question: how many times was Emms installed
>>> in 2022?
>>
>> Hey, thanks for asking that!
>>
>> I'm the author of an Emacs package that is in ELPA and that I don't
>> have any idea how many users it has - I mean, besides the ones that
>> I've interacted with - and I was planning to start my talk about it
>> in the next EmacsConf by saying that it has "at least 10 users"... =/
>
> Then again, if you go by download counts like MELPA, you will severely
> overestimate the number of users, since AFAIK they do not distinguish
> between downloads and updates, nor do they know if someone just
> installed a package and then immediately removed it.

I don't think that's important since we are not selling copies of the
software, nor trying to drive advertisement. Instead, what I had in mind
is a way of gauging activity. I'm thinking simply of the number of
installs and/or updates, even if that number is normalized to the most
"active" package.

A developer who wants insight could visit the ELPA page for their
package once in a while, note the change in the numbers, and draw their
own conclusions.
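
A minimal sketch of the "normalized to the most active package" idea
mentioned above, written in Python. The function name
`normalize_counts` and the package names and numbers are hypothetical;
nothing like this is known to exist on the ELPA server.

```python
def normalize_counts(counts):
    """Scale raw download/update counts so the most active package is 1.0."""
    top = max(counts.values(), default=0)
    if top == 0:
        # No activity at all (or no packages): every score is zero.
        return {name: 0.0 for name in counts}
    return {name: n / top for name, n in counts.items()}


# Made-up numbers, for illustration only.
raw = {"pkg-a": 400, "pkg-b": 100, "pkg-c": 20}
relative = normalize_counts(raw)
```

Publishing only the relative figure would hide the absolute numbers
while still letting a developer watch the trend over time.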

I think that seeing your package being installed and/or updated would be
a great way of encouraging people to continue developing. Developers,
especially of niche packages, could otherwise feel like they are "vox
clamantis in deserto"; shouting into a great empty wilderness.


> There was some discussion about updating the protocol that package.el
> uses, in which context thinking about some reliable yet privacy
> preserving method of estimating the user count would be nice to have.

I see no pressing reason to identify unique downloads as opposed to
simply downloads/updates, so I don't think privacy will be a concern.

Currently we have two developers who have voiced that they would like
this feature in ELPA. Perhaps if others chime in then it should be
considered.


-- 
   "Cut your own wood and it will warm you twice"




* Re: Distribution statistics for ELPA and EMMS
From: Adam Porter @ 2023-07-14 19:45 UTC (permalink / raw)
  To: yoni; +Cc: emacs-devel

> Currently we have two developers who have voiced that they would
> like this feature in ELPA. Perhaps if others chime in then it should
> be considered.

Obviously, I think every developer would like to see the download counts 
of their packages.  I check mine on MELPA now and then to get a sense of 
how popular they are, at least relative to each other.  If nothing else 
it gives me an idea about how I may need to better document or publicize 
them (the ones that seem under-utilized, that is).  And, of course, it 
can be a nice ego boost.  :)

So, yes, it would be nice if ELPA offered something similar.  Nothing 
fancy, just a download counter, would be good.  Even better would be a 
page showing downloads per month, or something like that, for historical 
purposes.
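
As a rough illustration of the "downloads per month" page, here is a
small Python sketch. It assumes the server log has already been parsed
into (timestamp, package) pairs; the function name
`downloads_per_month` and the sample events are made up.

```python
from collections import Counter
from datetime import datetime


def downloads_per_month(events):
    """Count downloads per (package, 'YYYY-MM') bucket.

    `events` is an iterable of (timestamp, package) pairs.
    """
    per_month = Counter()
    for when, package in events:
        per_month[(package, when.strftime("%Y-%m"))] += 1
    return per_month


# Made-up events, for illustration only.
events = [
    (datetime(2023, 6, 30, 23, 59), "pkg-a"),
    (datetime(2023, 7, 1, 0, 1), "pkg-a"),
    (datetime(2023, 7, 14, 12, 0), "pkg-a"),
    (datetime(2023, 7, 14, 12, 5), "pkg-b"),
]
monthly = downloads_per_month(events)
```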

However, another factor to keep in mind is that some packages may be 
downloaded regularly for continuous integration testing, which may 
inflate the download count.  One might think that this would only be a 
small number, and I guess for most packages it is, but I have seen on 
some of mine, which seem to be included in some distros with regular CI, 
that the Git repo is cloned hundreds or thousands of times per week, 
numbers far beyond the users of the package.  Maybe that pattern is 
confined to cloning the Git repo rather than downloading from ELPA, but 
I can't say.

As far as uniqueness, I would, of course, suggest that IP addresses 
should be protected.  Maybe some simple aggregation in the sense of 
collating multiple downloads from one address in a short period of time 
could be useful, but anything beyond that would seem bogus.
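
The aggregation described above could be sketched in Python roughly as
follows. Everything here is an assumption for illustration: the names
`anonymize` and `collapse_bursts`, the ten-minute window, the salt, and
the (timestamp, address, package) log format. A salted hash is only a
modest protection for addresses, not real anonymity.

```python
import hashlib
from datetime import datetime, timedelta


def anonymize(address, salt):
    """Replace an address with a salted SHA-256 digest."""
    return hashlib.sha256((salt + address).encode("utf-8")).hexdigest()


def collapse_bursts(events, window=timedelta(minutes=10), salt="example-salt"):
    """Count downloads per package, ignoring quick repeats.

    A download is skipped when the same address fetched the same
    package less than `window` after its last *counted* download.
    `events` is an iterable of (timestamp, address, package) tuples.
    """
    last_counted = {}  # (hashed address, package) -> time last counted
    counts = {}
    for when, address, package in sorted(events):
        key = (anonymize(address, salt), package)
        previous = last_counted.get(key)
        if previous is None or when - previous >= window:
            counts[package] = counts.get(package, 0) + 1
            last_counted[key] = when
    return counts


# Made-up events using documentation-only addresses.
t0 = datetime(2023, 7, 14, 12, 0)
events = [
    (t0, "192.0.2.1", "pkg-a"),
    (t0 + timedelta(minutes=2), "192.0.2.1", "pkg-a"),   # repeat, skipped
    (t0 + timedelta(minutes=30), "192.0.2.1", "pkg-a"),  # counted again
    (t0 + timedelta(minutes=1), "192.0.2.2", "pkg-a"),   # other address
]
counts = collapse_bursts(events)
```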

My two cents.  :)




* Re: Distribution statistics for ELPA and EMMS
From: Richard Stallman @ 2023-07-17  2:25 UTC (permalink / raw)
  To: Adam Porter; +Cc: yoni, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

We could have two options for downloading, one which is "for a real
user" and one which is "for periodic testing".

The only difference would be that the former increments the user
download count and the latter does not.

-- 
Dr Richard Stallman (https://stallman.org)
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






* Re: Distribution statistics for ELPA and EMMS
From: Stefan Kangas @ 2023-09-07 16:46 UTC (permalink / raw)
  To: Yoni Rabkin, emacs-devel

Yoni Rabkin <yoni@rabkins.net> writes:

> Currently we have two developers who have voiced that they would like
> this feature in ELPA. Perhaps if others chime in then it should be
> considered.

(I'm a bit late on the ball here.)

We have Bug#50686 where I requested this feature.  It seems like the
status is that we need someone to volunteer to write up the scripts.




* Re: Distribution statistics for ELPA and EMMS
From: Yoni Rabkin @ 2023-09-07 17:10 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: emacs-devel

Stefan Kangas <stefankangas@gmail.com> writes:

> Yoni Rabkin <yoni@rabkins.net> writes:
>
>> Currently we have two developers who have voiced that they would like
>> this feature in ELPA. Perhaps if others chime in then it should be
>> considered.
>
> (I'm a bit late on the ball here.)
>
> We have Bug#50686 where I requested this feature.  It seems like the
> status is that we need someone to volunteer to write up the scripts.

Thank you. I'll send a message to the emms developer mailing list in
case someone there wants to tackle it.

-- 
   "Cut your own wood and it will warm you twice"




* Re: Distribution statistics for ELPA and EMMS
From: Akib Azmain Turja @ 2023-09-07 21:35 UTC (permalink / raw)
  To: Yoni Rabkin; +Cc: Stefan Kangas, emacs-devel


Yoni Rabkin <yoni@rabkins.net> writes:

> Stefan Kangas <stefankangas@gmail.com> writes:
>
>> Yoni Rabkin <yoni@rabkins.net> writes:
>>
>>> Currently we have two developers who have voiced that they would like
>>> this feature in ELPA. Perhaps if others chime in then it should be
>>> considered.

May I also wish for this feature?  :)

As the maintainer of a few NonGNU ELPA packages, it would be nice for
me.

>>
>> (I'm a bit late on the ball here.)
>>
>> We have Bug#50686 where I requested this feature.  It seems like the
>> status is that we need someone to volunteer to write up the scripts.
>
> Thank you. I'll send a message to the emms developer mailing list in
> case someone there wants to tackle it.

-- 
Akib Azmain Turja, GPG key: 70018CE5819F17A3BBA666AFE74F0EFA922AE7F5
Fediverse: akib@hostux.social
Codeberg: akib
emailselfdefense.fsf.org | "Nothing can be secure without encryption."



* Re: Distribution statistics for ELPA and EMMS
From: Stefan Kangas @ 2023-09-07 22:07 UTC (permalink / raw)
  To: Akib Azmain Turja, Yoni Rabkin; +Cc: emacs-devel, Stefan Monnier

Akib Azmain Turja <akib@disroot.org> writes:

> May I also wish this feature?  :)

Yes, absolutely.  Even better if you want to volunteer to work on a
script, of course.

To make the job more straightforward, I just sent Bug#50686 some sample
logs that I got from Stefan Monnier in 2021.




* Re: Distribution statistics for ELPA and EMMS
From: Lynn Winebarger @ 2023-09-07 23:09 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: Yoni Rabkin, emacs-devel

On Thu, Sep 7, 2023 at 12:47 PM Stefan Kangas <stefankangas@gmail.com> wrote:
> Yoni Rabkin <yoni@rabkins.net> writes:
> > Currently we have two developers who have voiced that they would like
> > this feature in ELPA. Perhaps if others chime in then it should be
> > considered.
>
> (I'm a bit late on the ball here.)
>
> We have Bug#50686 where I requested this feature.  It seems like the
> status is that we need someone to volunteer to write up the scripts.
>
Hi, Stefan,
As a *user*, I like seeing the number of downloads on MELPA as a (very
crude) gauge of how much scrutiny a package has received.  Especially
before I contributed something to MELPA and knew they do actually
review the code of packages before they go on there.

What I'd love to see is some measure of the number of users a package
has in the "list-packages" mode.  Is this feature request just to
calculate the statistics for the developers, or does it include any
changes to package.el to display them to potential users?

Thanks,
Lynn




* Re: Distribution statistics for ELPA and EMMS
From: Philip Kaludercic @ 2023-09-08  7:51 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: Yoni Rabkin, emacs-devel

Stefan Kangas <stefankangas@gmail.com> writes:

> Yoni Rabkin <yoni@rabkins.net> writes:
>
>> Currently we have two developers who have voiced that they would like
>> this feature in ELPA. Perhaps if others chime in then it should be
>> considered.
>
> (I'm a bit late on the ball here.)
>
> We have Bug#50686 where I requested this feature.  It seems like the
> status is that we need someone to volunteer to write up the scripts.

I can look into this, but it will probably not be easy to share this
information in the current archive-contents format (though I might be
mistaken, I'll have to check).  If it does turn out to be difficult, I'd
add this to the list of features that would be of interest when updating
the archive-contents format, as was proposed by Jonas Bernoulli a few
months back.

-- 
Philip Kaludercic




* Re: Distribution statistics for ELPA and EMMS
From: Adam Porter @ 2023-09-19 14:49 UTC (permalink / raw)
  To: rms; +Cc: yoni, emacs-devel

[I just noticed this message from a few months ago.]

On 7/16/23 21:25, Richard Stallman wrote:
> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
> 
> We could have two options for downloading, one which is "for a real
> user" and one which is "for periodic testing".
> 
> The only difference would be that the former increments the user
> download count and the latter does not.

I like this idea, but it seems like it would be hard to enforce.  It 
could even go the other way, i.e. have Emacs send a query string or 
header when installing a package manually, which could be logged and 
used to filter the download logs later.  But even that might be harder 
than it seems, e.g. if I call a command like:

   emacs --eval "(package-install FOO)"

...to non-interactively install a package into a local directory for 
testing, how far, and in how many places, would some kind of flag need 
to be propagated to end up in the server's logs?




* Re: Distribution statistics for ELPA and EMMS
From: Philip Kaludercic @ 2023-09-19 16:38 UTC (permalink / raw)
  To: Adam Porter; +Cc: rms, yoni, emacs-devel

Adam Porter <adam@alphapapa.net> writes:

> [I just noticed this message from a few months ago.]
>
> On 7/16/23 21:25, Richard Stallman wrote:
>> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
>> [[[ whether defending the US Constitution against all enemies,     ]]]
>> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>> We could have two options for downloading, one which is "for a real
>> user" and one which is "for periodic testing".
>> The only difference would be that the former increments the user
>> download count and the latter does not.
>
> I like this idea, but it seems like it would be hard to enforce.  It
> could even go the other way, i.e. have Emacs send a query string or
> header when installing a package manually, which could be logged and
> used to filter the download logs later.  But even that might be harder
> than it seems, e.g. if I call a command like:
>
>   emacs --eval "(package-install FOO)"
>
> ...to non-interactively install a package into a local directory for
> testing, how far, and in how many places, would some kind of flag need
> to be propagated to end up in the server's logs?

There is an inherent unreliability in these kinds of statistics that has
to be accepted.  The question is therefore whether issues like these are
significant enough to skew the results.  This has to be considered
under a false-positive and a false-negative approach, depending on what
we want to measure.  If it is all about dopamine-boosting, I think a
false-positive approach would be better ;^)




* Re: Distribution statistics for ELPA and EMMS
From: Akib Azmain Turja @ 2023-09-19 19:00 UTC (permalink / raw)
  To: Philip Kaludercic; +Cc: Adam Porter, rms, yoni, emacs-devel


Philip Kaludercic <philipk@posteo.net> writes:

> Adam Porter <adam@alphapapa.net> writes:
>
>> [I just noticed this message from a few months ago.]
>>
>> On 7/16/23 21:25, Richard Stallman wrote:
>>> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
>>> [[[ whether defending the US Constitution against all enemies,     ]]]
>>> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>>> We could have two options for downloading, one which is "for a real
>>> user" and one which is "for periodic testing".
>>> The only difference would be that the former increments the user
>>> download count and the latter does not.
>>
>> I like this idea, but it seems like it would be hard to enforce.  It
>> could even go the other way, i.e. have Emacs send a query string or
>> header when installing a package manually, which could be logged and
>> used to filter the download logs later.  But even that might be harder
>> than it seems, e.g. if I call a command like:
>>
>>   emacs --eval "(package-install FOO)"
>>
>> ...to non-interactively install a package into a local directory for
>> testing, how far, and in how many places, would some kind of flag need
>> to be propagated to end up in the server's logs?
>
> There is an inherent unreliability in these kinds of statistics that has
> to be accepted.  The question is therefore are issues like these
> significant or would they skew the results.  This has to be considered
> under a false-positive and a false-negative approach, depending on what
> we want to measure.

How are these numbers going to be useful?  This can't be a measure of
"popularity."

Say, for example, the package "git-commit" is the 11th most downloaded
package on MELPA.  Is it really popular?  Few people install it
explicitly.  Only one package depends on it, which is Magit, a super
popular package.  So git-commit is automatically installed as a
dependency when Magit is installed.

Also, packages that are updated more frequently are downloaded more
often than those that are updated less frequently.  So it's indeed
possible for a less popular but frequently updated package to be
downloaded more than a mature, well-written, more popular package.

Also, there are the straight.el, Elpaca and Quelpa users, who don't use
ELPA at all.

>                      If it is all about dopamine-boosting, I think a
> false-positive approach would be better ;^)
>

OK...

--8<---------------cut here---------------start------------->8---
(while t
  (package-install 'eat)
  (package-delete (cadr (assoc 'eat package-alist))))
--8<---------------cut here---------------end--------------->8---

Soon: Eat is the most popular terminal emulator.  xD

-- 
Akib Azmain Turja, GPG key: 70018CE5819F17A3BBA666AFE74F0EFA922AE7F5
Fediverse: akib@hostux.social
Codeberg: akib
emailselfdefense.fsf.org | "Nothing can be secure without encryption."



* Re: Distribution statistics for ELPA and EMMS
From: Emanuel Berg @ 2023-09-19 19:13 UTC (permalink / raw)
  To: emacs-devel

Akib Azmain Turja wrote:

> How are these numbers going to be useful? This can't be
> a measure of "popularity."

Stats are always useful, they only measure what they measure
and what that is should always be stated.

> Say, for example, the package "git-commit" is 11th most
> downloaded package on MELPA. Is it really popular?
> Few people install it explicitly. Only one package depends
> on it, which is Magit, a super popular package.
> So git-commit is automatically installed as a dependency
> when Magit is installed.

Thanks for that note, and do include it in the description of what
the stats say.

-- 
underground experts united
https://dataswamp.org/~incal





* Re: Distribution statistics for ELPA and EMMS
From: Yoni Rabkin @ 2023-09-19 19:42 UTC (permalink / raw)
  To: Akib Azmain Turja; +Cc: emacs-devel

Akib Azmain Turja <akib@disroot.org> writes:

> Philip Kaludercic <philipk@posteo.net> writes:
>
>> Adam Porter <adam@alphapapa.net> writes:
>>
>>> [I just noticed this message from a few months ago.]
>>>
>>> On 7/16/23 21:25, Richard Stallman wrote:
>>>> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
>>>> [[[ whether defending the US Constitution against all enemies,     ]]]
>>>> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>>>> We could have two options for downloading, one which is "for a real
>>>> user" and one which is "for periodic testing".
>>>> The only difference would be that the former increments the user
>>>> download count and the latter does not.
>>>
>>> I like this idea, but it seems like it would be hard to enforce.  It
>>> could even go the other way, i.e. have Emacs send a query string or
>>> header when installing a package manually, which could be logged and
>>> used to filter the download logs later.  But even that might be harder
>>> than it seems, e.g. if I call a command like:
>>>
>>>   emacs --eval "(package-install FOO)"
>>>
>>> ...to non-interactively install a package into a local directory for
>>> testing, how far, and in how many places, would some kind of flag need
>>> to be propagated to end up in the server's logs?
>>
>> There is an inherent unreliability in these kinds of statistics that has
>> to be accepted.  The question is therefore are issues like these
>> significant or would they skew the results.  This has to be considered
>> under a false-positive and a false-negative approach, depending on what
>> we want to measure.
>
> How are these numbers going to be useful?  This can't be a measure of
> "popularity."

Agreed. We haven't defined popularity (nor should we), so we can't
measure it.

But we can most certainly measure the number of downloads. I would be
interested in this number for Emms (and to a lesser degree for
rt-liberation.)

Moreover, the data will change over time, and I'll find observing those
changes interesting as well.

-- 
   "Cut your own wood and it will warm you twice"




* Re: Distribution statistics for ELPA and EMMS
From: Philip Kaludercic @ 2023-09-19 22:06 UTC (permalink / raw)
  To: Akib Azmain Turja; +Cc: Adam Porter, rms, yoni, emacs-devel

Akib Azmain Turja <akib@disroot.org> writes:

> Philip Kaludercic <philipk@posteo.net> writes:
>
>> Adam Porter <adam@alphapapa.net> writes:
>>
>>> [I just noticed this message from a few months ago.]
>>>
>>> On 7/16/23 21:25, Richard Stallman wrote:
>>>> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
>>>> [[[ whether defending the US Constitution against all enemies,     ]]]
>>>> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>>>> We could have two options for downloading, one which is "for a real
>>>> user" and one which is "for periodic testing".
>>>> The only difference would be that the former increments the user
>>>> download count and the latter does not.
>>>
>>> I like this idea, but it seems like it would be hard to enforce.  It
>>> could even go the other way, i.e. have Emacs send a query string or
>>> header when installing a package manually, which could be logged and
>>> used to filter the download logs later.  But even that might be harder
>>> than it seems, e.g. if I call a command like:
>>>
>>>   emacs --eval "(package-install FOO)"
>>>
>>> ...to non-interactively install a package into a local directory for
>>> testing, how far, and in how many places, would some kind of flag need
>>> to be propagated to end up in the server's logs?
>>
>> There is an inherent unreliability in these kinds of statistics that has
>> to be accepted.  The question is therefore are issues like these
>> significant or would they skew the results.  This has to be considered
>> under a false-positive and a false-negative approach, depending on what
>> we want to measure.
>
> How are these numbers going to be useful?  This can't be a measure of
> "popularity."

Yes, they are at best an indicator.  A malicious person could always
manipulate them, unless considerable effort is put into verifying the
information -- which not only comes at the cost of time but also is
likely to decrease the amount of available information.

> Say, for example, the package "git-commit" is 11th most downloaded
> package on MELPA.  Is it really popular?  Few people install it
> explicitly.  Only one package depends on it, which is Magit, a super
> popular package.  So git-commit is automatically installed as a
> dependency when Magit is installed.

We should be able to solve that problem by adding a query string to the
request, as Adam suggests:

https://elpa.gnu.org/packages/poker-0.2.tar?selected=yes
https://elpa.gnu.org/packages/seq-2.24.tar?selected=no
https://elpa.gnu.org/packages/project-0.10.0.tar?selected=yes&upgrade=yes
etc.

Given this information, you know the user doesn't object to having this
information used (depending on whether or not this is an opt-in or
opt-out thing), the version being fetched, whether it is a dependency or
not and whether it was an upgrade.
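
On the server side, such requests could be classified with something
like the Python sketch below. The `selected` and `upgrade` parameters
come from the example URLs above; the function name `classify_request`,
the assumption that every file is named NAME-VERSION.tar, and the choice
to report a missing parameter as None (unknown) are all hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def classify_request(url):
    """Split a package download URL into name, version and flags."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)

    filename = parts.path.rsplit("/", 1)[-1]  # e.g. "poker-0.2.tar"
    stem = filename[:-len(".tar")] if filename.endswith(".tar") else filename
    # The version is assumed to follow the last hyphen in the file name.
    name, _, version = stem.rpartition("-")

    def flag(key):
        # None means the client sent no information (e.g. an old Emacs).
        values = query.get(key)
        return None if values is None else values[0] == "yes"

    return {
        "package": name,
        "version": version,
        "selected": flag("selected"),
        "upgrade": flag("upgrade"),
    }


info = classify_request(
    "https://elpa.gnu.org/packages/project-0.10.0.tar?selected=yes&upgrade=yes"
)
```

Keeping "unknown" separate from "no" would let the statistics
distinguish dependency installs from requests sent by clients that
never set the flags.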

> And also, packages that get more frequent update are downloaded more
> than whose update less frequently.  So its indeed possible for a less
> popular but frequently updated package gets more downloaded than a
> mature well written more popular package.

We can remember upgrade-counts over the last week, year and all time.
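
A small Python sketch of such windowed counts, assuming the upgrade
timestamps for one package are already available as a list. The
function name `upgrade_counts`, the 7-day and 365-day windows, and the
sample dates are made up for illustration.

```python
from datetime import datetime, timedelta


def upgrade_counts(timestamps, now):
    """Return upgrade counts for the last week, last year and all time."""
    timestamps = list(timestamps)  # iterated several times below
    week_ago = now - timedelta(days=7)
    year_ago = now - timedelta(days=365)
    return {
        "week": sum(1 for t in timestamps if week_ago <= t <= now),
        "year": sum(1 for t in timestamps if year_ago <= t <= now),
        "all_time": sum(1 for t in timestamps if t <= now),
    }


# Made-up timestamps, for illustration only.
now = datetime(2023, 9, 19, 22, 0)
stamps = [
    datetime(2023, 9, 18, 10, 0),  # within the last week
    datetime(2023, 8, 1, 10, 0),   # within the last year
    datetime(2021, 1, 1, 10, 0),   # older than a year
]
summary = upgrade_counts(stamps, now)
```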

> And also there are straight.el, Elpaca and Quelpa guys who don't use the
> ELPA at all.

Of course, hence "inherent unreliability", though I would be surprised
if the choice of package manager has a strong causal effect on what
packages one uses (setting aside that from-source package managers can
install unreleased packages that are not distributed in any archive).

>>                      If it is all about dopamine-boosting, I think a
>> false-positive approach would be better ;^)
>
> OK...
>
> (while t
>   (package-install 'eat)
>   (package-delete (cadr (assoc 'eat package-alist))))
>
> Soon: Eat is the most popular terminal emulator.  xD

Good point (though just asynchronously spamming the right URL would be
more efficient).  My idea would be to count an IP address only once per
day, regardless of how many requests it sent, and also to filter a
list of excluded addresses, such as Tor exit nodes, out of the
statistics.

This approach, together with the fact that from-source package
managers wouldn't participate unless they are actively instructed to do
so, is a further argument for a false-negative approach.
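
The once-per-day counting with an exclusion list could look roughly
like this in Python. It is only a sketch: the function name
`daily_unique_counts`, the (timestamp, address, package) log format, and
the reading of "once per day" as once per address, per package, per
calendar day are assumptions. The addresses are documentation-only
examples.

```python
from datetime import datetime


def daily_unique_counts(events, excluded=frozenset()):
    """Count each (day, address, package) combination at most once.

    `events` is an iterable of (timestamp, address, package) tuples;
    addresses listed in `excluded` are dropped entirely.
    """
    seen = set()
    counts = {}
    for when, address, package in events:
        if address in excluded:
            continue
        key = (when.date(), address, package)
        if key in seen:
            continue  # this address already counted for this package today
        seen.add(key)
        counts[package] = counts.get(package, 0) + 1
    return counts


# Made-up events, for illustration only.
events = [
    (datetime(2023, 9, 19, 9, 0), "192.0.2.1", "pkg-a"),
    (datetime(2023, 9, 19, 9, 1), "192.0.2.1", "pkg-a"),     # same day, skipped
    (datetime(2023, 9, 20, 9, 0), "192.0.2.1", "pkg-a"),     # next day, counted
    (datetime(2023, 9, 19, 9, 0), "198.51.100.7", "pkg-a"),  # excluded below
    (datetime(2023, 9, 19, 9, 0), "203.0.113.5", "pkg-b"),
]
counts = daily_unique_counts(events, excluded={"198.51.100.7"})
```

With this rule, an install/delete loop run from a single address adds
at most one to the count per day.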



