unofficial mirror of emacs-devel@gnu.org 
From: Jonas Bernoulli <jonas@bernoul.li>
To: emacs-devel@gnu.org
Subject: Re: Some hard numbers on licenses used by elisp packages
Date: Wed, 12 Jul 2017 14:49:52 +0200
Message-ID: <87zic9zuof.fsf@bernoul.li>
In-Reply-To: <87shi4z7ps.fsf@bernoul.li>

Richard has asked me privately (by accident, I suspect) for some
clarifications.  Many of his questions were already addressed by the
page I linked to, and most others were already answered by the code
that that page in turn linked to.

I have now improved the introductory text on the linked page and I am
including that text here for your convenience:

> This page contains statistics about the licenses used by known Emacs
> packages.  *These statistics are not legal advice.  They are
> distributed in the hope that they will be useful, but WITHOUT ANY
> WARRANTY; without even the implied warranty of MERCHANTABILITY or
> FITNESS FOR A PARTICULAR PURPOSE.*
>
> The information used here is available from the Emacsmirror database
> (also known as the Epkg database).  For more information about the
> Emacsmirror see these [[https://emacsair.me/2016/04/16/re-introducing-the-emacsmirror][blog]] [[https://emacsair.me/2016/05/17/assimilate-emacs-packages-as-git-submodules][posts]].
>
> I have created this page to accompany [[http://lists.gnu.org/archive/html/emacs-devel/2017-07/msg00341.html][this]] conversation on
> ~emacs-devel~.
>
> I will periodically update these statistics.  If you want to do so
> yourself, then read the relevant documentation.  You may also ask me
> for guidance.
>
> This information is extracted using the function ~elx-license~, which is
> provided by my package [[https://github.com/tarsius/elx][elx]] (~git clone https://github.com/tarsius/elx.git~).
>
> The license is determined from the contents of the "main library" of
> the package alone (the library whose name matches the name of the
> package).  First this function looks for a permission statement for a
> license published by the Free Software Foundation, if any.  If that
> fails, then the value of the "License" header keyword is considered.
> Finally it searches for brief, and potentially ambiguous, permission
> statements for non-FSF licenses.  For FSF licenses a "+" is appended
> if the text "or (at your option) any later version", or similar was
> found.  An effort is made to normalize the returned value.  This
> function also accounts for some commonly used variations in wording,
> typos, and other complications.
>
> However the returned value is sometimes false or ambiguous.  In
> particular note that if a license is "unknown", then that merely means
> that it is /not known/ what license applies.  This may be because the
> library lacks a permission statement altogether (possibly because an
> accompanying ~LICENSE~ file is considered sufficient by the upstream),
> but it may also be because ~elx-license~ does not attempt to detect the
> used non-standard and/or non-FSF permission statement, or because of
> typos in the statement, or for a number of other reasons.

I have also improved the code used to extract this information and made
a new `elx' release.  This is the relevant code, including doc-strings:

> (defconst elx-gnu-permission-statement-regexp
>   (replace-regexp-in-string
>    "\s" "[\s\t\n;]+"
>    ;; is free software[.,:;]? \
>    ;; you can redistribute it and/or modify it under the terms of the \
>    "\
> GNU \\(?1:Lesser \\|Library \\|Affero \\|Free \\)?\
> General Public Licen[sc]e[.,:;]? \
> \\(?:as published by the \\(?:Free Software Foundation\\|FSF\\)[.,:;]? \\)?\
> \\(?:either \\)?\
> \\(?:GPL \\)?\
> version \\(?2:[0-9.]*[0-9]\\)[.,:;]?\
> \\(?: of the Licen[sc]e[.,:;]?\\)?\
> \\(?3: or \\(?:(at your option) \\)?any later version\\)?"))
>
> (defconst elx-gnu-license-keyword-regexp "\
> \\(?:GNU \\(?1:Lesser \\|Library \\|Affero \\|Free \\)?General Public Licen[sc]e\
> \\|\\(?4:[laf]?gpl\\)[- ]?\
> \\)\
> \\(?:\\(?:v\\|version \\)?\\(?2:[0-9.]*[0-9]\\)\\)?\
> \\(?3: or \\(?:(at your option) \\)?\\(?:any \\)?later\\(?: version\\)?\\)?")
>
> (defconst elx-non-gnu-license-keyword-alist
>   '(("Apache-2.0"    .  "apache-2\\.0")
>     ("MIT"           .  "mit")
>     ("as-is"         .  "as-?is")
>     ("public-domain" . "public[- ]domain")))
>
> (defconst elx-non-gnu-license-keyword-regexp "\
> \\`\\(?4:[a-z]+\\)\\(?:\\(?:v\\|version \\)?\\(?2:[0-9.]*[0-9]\\)\\)?\\'")
>
> (defconst elx-non-gnu-permission-statement-alist
>   `(("Apache-2.0"    . "^;.* Apache License, Version 2\\.0")
>     ("MIT"           . "^;.* mit license")
>     ("public-domain" . "^;.*in\\(to\\)? the public[- ]domain")
>     ("public-domain" . "^;+ +Public domain\\.")
>     ("as-is"         . "^;.* \\(provided\\|distributed\\) \
> \\(by the author \\)?[\"`']\\{0,2\\}as[- ]is[\"`']\\{0,2\\}")))
>
> (defun elx-license (&optional file)
>   "Attempt to return the license used for the file FILE.
> Or the license used for the file that is being visited in the
> current buffer if FILE is nil.
>
> *** A value is returned in the hope that it will be useful, but
> *** WITHOUT ANY WARRANTY; without even the implied warranty of
> *** MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> This function completely ignores any \"LICENSE\" or similar file
> in the proximity of FILE.  The returned value is solely based on
> the contents of FILE itself.
>
> The license is determined from the permission statement, if any.
> Otherwise the value of the \"License\" header keyword is
> considered.  An effort is made to normalize the returned value.
>
> *** However this function does not always return the correct
> *** value and the returned value is not legal advice.
>
> Note in particular that if this function returns nil, then that
> merely means that it is not known what license applies.
> This may be because the library lacks a permission statement
> altogether (possibly because an accompanying \"LICENSE\" file
> is considered sufficient by the upstream), but it may also be
> because this function does not attempt to detect the used
> non-standard and/or non-fsf permission statement, or because
> of typos in the statement, or for a number of other reasons."
>   (lm-with-file file
>     (cl-flet ((format-gnu-abbrev
>                (&optional object)
>                (let ((abbrev  (match-string 1 object))
>                      (version (match-string 2 object))
>                      (later   (match-string 3 object))
>                      (prefix  (match-string 4 object)))
>                  (concat (if prefix
>                              (upcase prefix)
>                            (pcase abbrev
>                              ("Lesser "  "LGPL")
>                              ("Library " "LGPL")
>                              ("Affero "  "AGPL")
>                              ("Free "    "FDL")
>                              (`nil       "GPL")))
>                          (and version (concat "-" version))
>                          (and later "+")))))
>       (let ((bound (lm-code-start))
>             (case-fold-search t))
>         (or (and (re-search-forward elx-gnu-permission-statement-regexp bound t)
>                  (format-gnu-abbrev))
>             (-when-let (license (lm-header "Licen[sc]e"))
>               (or (and (string-match elx-gnu-license-keyword-regexp license)
>                        (format-gnu-abbrev license))
>                   (car (cl-find-if (pcase-lambda (`(,_ . ,re))
>                                      (string-match re license))
>                                    elx-non-gnu-license-keyword-alist))
>                   (and (string-match elx-non-gnu-license-keyword-regexp license)
>                        (format-gnu-abbrev license))))
>             (and (re-search-forward
>                   "^;\\{1,4\\} Licensed under the same terms as Emacs" bound t)
>                  "GPL-3+")
>             (and ;; Some libraries are released "under the *GPL and
>                  ;; <other license>", while the GPL is mentioned in
>                  ;; a way the above code does not recognize.  Return
>                  ;; nil instead of "<other license>" in such cases.
>                  (not (re-search-forward elx-gnu-license-keyword-regexp bound t))
>                  (car (cl-find-if (pcase-lambda (`(,_ . ,re))
>                                     (re-search-forward re bound t))
>                                   elx-non-gnu-permission-statement-alist))))))))
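
To give a feel for what the non-GNU patterns accept, here is a minimal
sketch that applies the "as-is" regexp from the alist above to single
comment lines.  The names `my/as-is-regexp' and `my/as-is-line-p' are
made up for this illustration and are not part of elx; the real code
searches the buffer up to the start of the code instead of testing
individual strings.

```elisp
;; Regexp copied from the "as-is" entry of
;; `elx-non-gnu-permission-statement-alist' as posted above.
;; `my/as-is-regexp' and `my/as-is-line-p' are hypothetical names.
(defconst my/as-is-regexp
  "^;.* \\(provided\\|distributed\\) \
\\(by the author \\)?[\"`']\\{0,2\\}as[- ]is[\"`']\\{0,2\\}")

(defun my/as-is-line-p (line)
  "Return t if the comment LINE contains an \"as is\" style statement."
  ;; Match case-insensitively, as `elx-license' does.
  (let ((case-fold-search t))
    (and (string-match-p my/as-is-regexp line) t)))
```

For example, (my/as-is-line-p ";; This program is provided \"as is\".")
returns t, while a line that merely names the MIT license returns nil.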

Note that this function now returns e.g. "GPL-3+" if the "or (at your
option) any later version" pattern was detected.  I also made some other
changes to avoid false positives, at the cost of no longer matching
some patterns that were previously matched correctly.

I can provide lists of packages that fall into a particular "category".
These lists can contain the names and email addresses of the maintainer,
links to the homepage and repository and many other things you might
find useful.

I would also be willing to contribute this code to the `lisp-mnt.el'
library, which is part of Emacs.  It certainly could still be improved
a lot, but it is a start.

Oh, and I almost forgot - here is an updated table:

| License       | Count | Percent |
|---------------+-------+---------|
| GPL-3+        |  2230 |      61 |
| GPL-2+        |   611 |      17 |
| (unknown)     |   511 |      14 |
| as-is         |    91 |       2 |
| MIT           |    70 |       2 |
| public-domain |    52 |       1 |
| GPL-3         |    41 |       1 |
| GPL-2         |    31 |       1 |
| Apache-2.0    |    18 |       0 |
| GPL-1+        |     4 |       0 |
| BSD           |     3 |       0 |
| GPL           |     2 |       0 |
| LGPL          |     2 |       0 |
| AGPL-3        |     1 |       0 |
| AGPL-3+       |     1 |       0 |
| BSD-3         |     1 |       0 |
| EPL           |     1 |       0 |
| LGPL-3+       |     1 |       0 |
| LGPL-3.0      |     1 |       0 |
|---------------+-------+---------|
| total GNU     |  2925 |      80 |
|---------------+-------+---------|
| total         |  3672 |     100 |
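
The totals and percentages can be re-derived from the counts.  The
following is a minimal sketch, not part of elx: all `my/' names are
made up for this illustration, and it assumes that "total GNU" covers
every GPL, LGPL and AGPL row, which is what the counts add up to.

```elisp
(require 'cl-lib)

;; Counts copied from the table above.  All `my/' names are hypothetical.
(defconst my/license-counts
  '(("GPL-3+" . 2230) ("GPL-2+" . 611) ("(unknown)" . 511)
    ("as-is" . 91) ("MIT" . 70) ("public-domain" . 52)
    ("GPL-3" . 41) ("GPL-2" . 31) ("Apache-2.0" . 18)
    ("GPL-1+" . 4) ("BSD" . 3) ("GPL" . 2) ("LGPL" . 2)
    ("AGPL-3" . 1) ("AGPL-3+" . 1) ("BSD-3" . 1) ("EPL" . 1)
    ("LGPL-3+" . 1) ("LGPL-3.0" . 1)))

(defun my/gnu-license-p (name)
  "Return non-nil if NAME starts with GPL, LGPL or AGPL."
  (let ((case-fold-search nil))
    (string-match-p "\\`[LA]?GPL" name)))

(defun my/license-total (&optional predicate)
  "Sum the counts of all licenses whose name satisfies PREDICATE.
Without PREDICATE, sum all counts."
  (cl-loop for (name . count) in my/license-counts
           when (or (null predicate) (funcall predicate name))
           sum count))

(defun my/license-percent (count)
  "Return COUNT as a whole-number percentage of the grand total."
  (round (* 100.0 count) (my/license-total)))
```

With these definitions (my/license-total) gives 3672,
(my/license-total #'my/gnu-license-p) gives 2925, and
(my/license-percent 2230) gives 61.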

And to briefly answer the questions posed:

>   > | (unknown)     |   509 |      14 |
>
> Could you explain what "unknown" means?  If a program
> does not explicitly state a license, it is proprietary.

Either the license was not specified, or the code was unable to find
a permission statement that is actually present.

>   > | as-is         |   117 |       3 |
>
> Could you tell me what "as-is" means, here?  Is "as-is" meant to
> identify a specific license?  If so, could you please show it to me?  I
> need to determine whether it is a free license and GPL-compatible.

Essentially the string "as-is" was found in the header.  I do agree
that this is ambiguous and problematic, but I decided to provide
this information anyway, because it is at least less ambiguous than
"unknown".

>   > | MIT           |    45 |       1 |
>
> "MIT" as the name of a license is ambiguous; see

Merely reporting that the string "MIT license" was found.

>   > | GPL           |    29 |       1 |
>
> What does that mean, concretely?
> Do these packages say, "any version of the GNU GPL"?
> That would be peculiar but not a substantive problem.
>
>   > | GPL-1         |     4 |       0 |
>
> Do these packages carry "GPL version 1 only"
> or "GPL version 1 or later"?

This has been improved now:

* "GPL"     => the GPL was mentioned, no version was mentioned
               (or possibly was just not detected)
* "GPL-N"   => the GPL and version N were mentioned
* "GPL-N+"  => ... additionally "or (at your option) any later version"
               was found (or a variation thereof).
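
The three cases can be illustrated with a deliberately simplified
sketch.  This is not the elx code: `my/gpl-regexp' and `my/gpl-abbrev'
are made-up names, the regexp only handles short "License:" header
values that begin with "GPL" (no LGPL/AGPL, no full permission
statements), and it borrows only the version and "or later" parts of
the regexps quoted earlier.

```elisp
;; Hypothetical, simplified regexp: group 1 is the version (if any),
;; group 2 is the "or later" clause (if any).
(defconst my/gpl-regexp
  "\\`GPL\\(?:[- ]?\\(?:v\\|version \\)?\\(?1:[0-9.]*[0-9]\\)\\)?\
\\(?2: or \\(?:(at your option) \\)?\\(?:any \\)?later\\(?: version\\)?\\)?")

(defun my/gpl-abbrev (string)
  "Return \"GPL\", \"GPL-N\" or \"GPL-N+\" for STRING, or nil if no GPL."
  (let ((case-fold-search t))
    (when (string-match my/gpl-regexp string)
      (let ((version (match-string 1 string))
            (later   (match-string 2 string)))
        (concat "GPL"
                (and version (concat "-" version))
                (and later "+"))))))
```

For example, (my/gpl-abbrev "GPL") returns "GPL", (my/gpl-abbrev
"GPL-3") returns "GPL-3", and (my/gpl-abbrev "GPLv2 or later") returns
"GPL-2+".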

>   > | EPL           |     1 |       0 |
>
> Does that mean the Eclipse Public License?

My guess is as good as yours; the string ";; License: EPL" was found.

  Best regards,
  Jonas


