From: Jonas Bernoulli <jonas@bernoul.li>
To: emacs-devel@gnu.org
Subject: Re: Some hard numbers on licenses used by elisp packages
Date: Wed, 12 Jul 2017 14:49:52 +0200
Message-ID: <87zic9zuof.fsf@bernoul.li>
In-Reply-To: <87shi4z7ps.fsf@bernoul.li>
Richard has asked me privately (by accident, I suspect) for some
clarifications. Many of his questions were already addressed by the
page I linked to, and most others were already answered by the code
that that page in turn linked to.
I have now improved the introductory text on the linked page and I am
including that text here for your convenience:
> This page contains statistics about the licenses used by known Emacs
> packages. *These statistics are not legal advice. They are
> distributed in the hope that they will be useful, but WITHOUT ANY
> WARRANTY; without even the implied warranty of MERCHANTABILITY or
> FITNESS FOR A PARTICULAR PURPOSE.*
>
> The information used here is available from the Emacsmirror database
> (also known as the Epkg database). For more information about the
> Emacsmirror see these [[https://emacsair.me/2016/04/16/re-introducing-the-emacsmirror][blog]] [[https://emacsair.me/2016/05/17/assimilate-emacs-packages-as-git-submodules][posts]].
>
> I have created this page to accompany [[http://lists.gnu.org/archive/html/emacs-devel/2017-07/msg00341.html][this]] conversation on
> ~emacs-devel~.
>
> I will periodically update these statistics. If you want to do so
> yourself, then read the relevant documentation. You may also ask me
> for guidance.
>
> This information is extracted using the function ~elx-license~, which is
> provided by my package [[https://github.com/tarsius/elx][elx]] (~git clone https://github.com/tarsius/elx.git~).
>
> The license is determined from the contents of the "main library" of
> the package alone (the library whose name matches the name of the
> package). First this function looks for a permission statement for a
> license published by the Free Software Foundation, if any. If that
> fails, then the value of the "License" header keyword is considered.
> Finally it searches for brief, and potentially ambiguous, permission
> statements for non-FSF licenses. For FSF licenses a "+" is appended
> if the text "or (at your option) any later version", or a variation
> of it, was found. An effort is made to normalize the returned value.
> This function also accounts for some commonly used variations in
> wording, typos, and other complications.
>
> However, the returned value is sometimes wrong or ambiguous. In
> particular, note that if a license is "unknown", then that merely
> means that it is /not known/ what license applies. This may be because
> the library lacks a permission statement altogether (possibly because
> upstream considers an accompanying ~LICENSE~ file sufficient), but it
> may also be because ~elx-license~ does not attempt to detect the
> non-standard and/or non-FSF permission statement that is used, because
> of typos in the statement, or for a number of other reasons.
I have also improved the code used to extract this information and made
a new `elx' release. This is the relevant code, including doc-strings:
> (defconst elx-gnu-permission-statement-regexp
>   (replace-regexp-in-string
>    "\s" "[\s\t\n;]+"
>    ;; is free software[.,:;]? \
>    ;; you can redistribute it and/or modify it under the terms of the \
>    "\
> GNU \\(?1:Lesser \\|Library \\|Affero \\|Free \\)?\
> General Public Licen[sc]e[.,:;]? \
> \\(?:as published by the \\(?:Free Software Foundation\\|FSF\\)[.,:;]? \\)?\
> \\(?:either \\)?\
> \\(?:GPL \\)?\
> version \\(?2:[0-9.]*[0-9]\\)[.,:;]?\
> \\(?: of the Licen[sc]e[.,:;]?\\)?\
> \\(?3: or \\(?:(at your option) \\)?any later version\\)?"))
>
> (defconst elx-gnu-license-keyword-regexp "\
> \\(?:GNU \\(?1:Lesser \\|Library \\|Affero \\|Free \\)?General Public Licen[sc]e\
> \\|\\(?4:[laf]?gpl\\)[- ]?\
> \\)\
> \\(?:\\(?:v\\|version \\)?\\(?2:[0-9.]*[0-9]\\)\\)?\
> \\(?3: or \\(?:(at your option) \\)?\\(?:any \\)?later\\(?: version\\)?\\)?")
>
> (defconst elx-non-gnu-license-keyword-alist
>   '(("Apache-2.0" . "apache-2\\.0")
>     ("MIT" . "mit")
>     ("as-is" . "as-?is")
>     ("public-domain" . "public[- ]domain")))
>
> (defconst elx-non-gnu-license-keyword-regexp "\
> \\`\\(?4:[a-z]+\\)\\(?:\\(?:v\\|version \\)?\\(?2:[0-9.]*[0-9]\\)\\)?\\'")
>
> (defconst elx-non-gnu-permission-statement-alist
>   `(("Apache-2.0" . "^;.* Apache License, Version 2\\.0")
>     ("MIT" . "^;.* mit license")
>     ("public-domain" . "^;.*in\\(to\\)? the public[- ]domain")
>     ("public-domain" . "^;+ +Public domain\\.")
>     ("as-is" . "^;.* \\(provided\\|distributed\\) \
> \\(by the author \\)?[\"`']\\{0,2\\}as[- ]is[\"`']\\{0,2\\}")))
>
> (defun elx-license (&optional file)
>   "Attempt to return the license used for the file FILE.
> Or the license used for the file that is being visited in the
> current buffer if FILE is nil.
>
> *** A value is returned in the hope that it will be useful, but
> *** WITHOUT ANY WARRANTY; without even the implied warranty of
> *** MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> This function completely ignores any \"LICENSE\" or similar file
> in the proximity of FILE. The returned value is solely based on
> the contents of FILE itself.
>
> The license is determined from the permission statement, if any.
> Otherwise the value of the \"License\" header keyword is
> considered. An effort is made to normalize the returned value.
>
> *** However, this function does not always return the correct
> *** value and the returned value is not legal advice.
>
> Note in particular that if this function returns nil, then that
> merely means that it is not known what license applies.
> This may be because the library lacks a permission statement
> altogether (possibly because an accompanying \"LICENSE\" file
> is considered sufficient by the upstream), but it may also be
> because this function does not attempt to detect the used
> non-standard and/or non-FSF permission statement, or because
> of typos in the statement, or for a number of other reasons."
>   (lm-with-file file
>     (cl-flet ((format-gnu-abbrev
>                (&optional object)
>                (let ((abbrev (match-string 1 object))
>                      (version (match-string 2 object))
>                      (later (match-string 3 object))
>                      (prefix (match-string 4 object)))
>                  (concat (if prefix
>                              (upcase prefix)
>                            (pcase abbrev
>                              ("Lesser " "LGPL")
>                              ("Library " "LGPL")
>                              ("Affero " "AGPL")
>                              ("Free " "FDL")
>                              (`nil "GPL")))
>                          (and version (concat "-" version))
>                          (and later "+")))))
>       (let ((bound (lm-code-start))
>             (case-fold-search t))
>         (or (and (re-search-forward elx-gnu-permission-statement-regexp bound t)
>                  (format-gnu-abbrev))
>             (-when-let (license (lm-header "Licen[sc]e"))
>               (or (and (string-match elx-gnu-license-keyword-regexp license)
>                        (format-gnu-abbrev license))
>                   (car (cl-find-if (pcase-lambda (`(,_ . ,re))
>                                      (string-match re license))
>                                    elx-non-gnu-license-keyword-alist))
>                   (and (string-match elx-non-gnu-license-keyword-regexp license)
>                        (format-gnu-abbrev license))))
>             (and (re-search-forward
>                   "^;\\{1,4\\} Licensed under the same terms as Emacs" bound t)
>                  "GPL-3+")
>             (and ;; Some libraries are released "under the *GPL and
>                  ;; <other license>", while the GPL is mentioned in
>                  ;; a way the above code does not recognize. Return
>                  ;; nil instead of "<other license>" in such cases.
>                  (not (re-search-forward elx-gnu-license-keyword-regexp bound t))
>                  (car (cl-find-if (pcase-lambda (`(,_ . ,re))
>                                     (re-search-forward re bound t))
>                                   elx-non-gnu-permission-statement-alist))))))))
Note that this function now returns e.g. "GPL-3+" if the "or (at your
option) any later version" pattern was detected. I also made some other
changes to avoid false positives (at the cost of no longer matching
some patterns that were previously matched correctly).
I can provide lists of packages that fall into a particular "category".
These lists can contain the names and email addresses of the maintainer,
links to the homepage and repository and many other things you might
find useful.
I would also be willing to contribute this code to the `lisp-mnt.el'
library, which is part of Emacs. It certainly could still be improved
a lot, but it is a start.
Oh, and I almost forgot - here is an updated table:
| License       | Count | Percent |
|---------------+-------+---------|
| GPL-3+        |  2230 |      61 |
| GPL-2+        |   611 |      17 |
| (unknown)     |   511 |      14 |
| as-is         |    91 |       2 |
| MIT           |    70 |       2 |
| public-domain |    52 |       1 |
| GPL-3         |    41 |       1 |
| GPL-2         |    31 |       1 |
| Apache-2.0    |    18 |       0 |
| GPL-1+        |     4 |       0 |
| BSD           |     3 |       0 |
| GPL           |     2 |       0 |
| LGPL          |     2 |       0 |
| AGPL-3        |     1 |       0 |
| AGPL-3+       |     1 |       0 |
| BSD-3         |     1 |       0 |
| EPL           |     1 |       0 |
| LGPL-3+       |     1 |       0 |
| LGPL-3.0      |     1 |       0 |
|---------------+-------+---------|
| total GNU     |  2925 |      80 |
|---------------+-------+---------|
| total         |  3672 |     100 |
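The Percent column appears to be the count divided by the total,
rounded to the nearest integer. A minimal sketch to recompute it,
assuming that rounding rule (`my-percent' is a hypothetical helper,
not part of elx):

```elisp
;; Hypothetical helper, not part of elx.  Assumes the Percent column
;; is COUNT / TOTAL rounded to the nearest integer.
(defun my-percent (count total)
  "Return COUNT as an integer percentage of TOTAL."
  (round (/ (* 100.0 count) total)))

;; A few rows copied from the table above.
(mapcar (lambda (row)
          (cons (car row) (my-percent (cdr row) 3672)))
        '(("GPL-3+" . 2230)
          ("GPL-2+" . 611)
          ("(unknown)" . 511)
          ("total GNU" . 2925)))
;; => (("GPL-3+" . 61) ("GPL-2+" . 17) ("(unknown)" . 14) ("total GNU" . 80))
```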
And to briefly answer the posed questions:
> > | (unknown) | 509 | 14 |
>
> Could you explain what "unknown" means? If a program
> does not explicitly state a license, it is proprietary.
Either no license was specified, or the code failed to find a
permission statement that is actually present.
> > | as-is | 117 | 3 |
>
> Could you tell me what "as-is" means, here? Is "as-is" meant to
> identify a specific license? If so, could you please show it to me? I
> need to determine whether it is a free license and GPL-compatible.
Essentially the string "as-is" was found in the header. I do agree
that this is ambiguous and problematic, but I decided to provide
this information anyway, because it is at least less ambiguous than
"unknown".
> > | MIT | 45 | 1 |
>
> "MIT" as the name of a license is ambiguous; see
Merely reporting that the string "MIT license" was found.
> > | GPL | 29 | 1 |
>
> What does that mean, concretely?
> Do these packages say, "any version of the GNU GPL"?
> That would be peculiar but not a substantive problem.
>
> > | GPL-1 | 4 | 0 |
>
> Do these packages carry "GPL version 1 only"
> or "GPL version 1 or later"?
This has been improved now:
* "GPL"    => the GPL was mentioned, but no version was mentioned
              (or possibly it was just not detected)
* "GPL-N"  => the GPL and version N were mentioned
* "GPL-N+" => additionally, "or (at your option) any later version"
              (or a variation thereof) was found
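To illustrate the naming scheme, here is a minimal, self-contained
sketch that normalizes the value of a "License" header. It uses a much
simplified regexp that only knows about the GPL; `my-license-abbrev'
is a hypothetical name and not part of elx, and the real `elx-license'
handles many more variations.

```elisp
;; Hypothetical helper, not part of elx.  Only recognizes "GPL",
;; an optional version, and an optional "or later" / "+" suffix.
(defun my-license-abbrev (value)
  "Return a normalized abbreviation for the License header VALUE.
Return nil if VALUE does not look like a GPL keyword."
  (let ((case-fold-search t))
    (when (string-match
           (concat "\\`gpl[- ]?"
                   "\\(?:\\(?:v\\|version \\)?\\([0-9.]*[0-9]\\)\\)?"
                   "\\( or later\\|\\+\\)?\\'")
           value)
      (let ((version (match-string 1 value))
            (later (match-string 2 value)))
        (concat "GPL"
                (and version (concat "-" version))
                (and later "+"))))))

(my-license-abbrev "GPL")            ; => "GPL"
(my-license-abbrev "GPL-2")          ; => "GPL-2"
(my-license-abbrev "GPLv3 or later") ; => "GPL-3+"
(my-license-abbrev "MIT")            ; => nil
```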
> > | EPL | 1 | 0 |
>
> Does that mean the Eclipse Public License?
My guess is as good as yours; the string ";; License: EPL" was found.
Best regards,
Jonas