Something like this could be quite convenient. The following spdx->guix
license symbol converter might save you some time:
http://paste.lisp.org/display/322105

- Jelle

2016-08-03 19:55 GMT+02:00 Danny Milosavljevic:

> On Wed, 3 Aug 2016 18:28:38 +0200
> David Craven wrote:
>
> > How can I tell the difference between a lgpl2.1 and lgpl2.1+ license?
>
> "or later"
>
> > Is this a job that an automated tool could do? Detecting licenses
> > included in a tarball?
>
> I also wonder about that. Usually, the license text is just copied &
> pasted anyway, so it should be quite regular.
>
> If there isn't one, I could write one which would basically, per source
> file,
> - try to find SPDX identifier, if that doesn't work:
> - ignore newline, "#" or ";" or "*" or "//" at the beginning of the line
> - lex that into words, where "word" is either [a-zA-Z0-9-]+ or [.,;]
> - try to 1:1 match with all the licenses similarly mapped
> - if that didn't work, try to find signal words and guess the license and
>   print the difference in a short form.
>
> I could do that program in maybe 2 hours and find and extract all the
> official license texts in a few more hours. But does such a thing already
> exist? [Seems like something obvious to have and I'm writing many other
> things already.]
>
> A human would still have to review the non-1:1 things - there could always
> be strange exceptions in the README or whatever - but the majority of cases
> should work just fine.
>
> See also (especially <https://github.com/triplecheck/>),
> <http://www.sciencedirect.com/science/article/pii/S0164121216300905> (also
> lists several license checkers; Fossology seems to be a whole webservice
> which does that).
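
For illustration, the per-file steps listed in the quoted message could be
sketched roughly as below, in Python. This is only a sketch under
assumptions, not Danny's program: the names lex, contains and
detect_license, the `licenses` and `signal_words` inputs, and the regular
expression for the SPDX tag are all made up here, and the "print the
difference in a short form" step is left out.

```python
import re

# Assumed form of an SPDX tag, e.g. "SPDX-License-Identifier: LGPL-2.1-or-later".
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")
# Comment leaders to ignore at the beginning of a line: "#", ";", "*", "//".
PREFIX_RE = re.compile(r"^\s*(?:#|;|\*|//)+\s?")
# A "word" is either [a-zA-Z0-9-]+ or one of [.,;].
WORD_RE = re.compile(r"[a-zA-Z0-9-]+|[.,;]")


def lex(text):
    """Drop leading comment markers on each line, then split into words.

    Newlines are ignored because the words of all lines end up in one list.
    """
    words = []
    for line in text.splitlines():
        words.extend(WORD_RE.findall(PREFIX_RE.sub("", line)))
    return words


def contains(haystack, needle):
    """True if the word list `needle` occurs contiguously in `haystack`."""
    n = len(needle)
    return n > 0 and any(
        haystack[i:i + n] == needle for i in range(len(haystack) - n + 1)
    )


def detect_license(source, licenses, signal_words):
    """Return (name, how), where how is "spdx", "exact", "guess" or "unknown".

    licenses:     dict mapping a license name to its reference text
                  (hypothetical input, to be filled with the official texts)
    signal_words: dict mapping a license name to a set of lower-case words
                  that hint at it (hypothetical input)
    """
    # Step 1: try to find an SPDX identifier.
    match = SPDX_RE.search(source)
    if match:
        return match.group(1), "spdx"

    # Steps 2-4: lex the file and try a 1:1 match against each license,
    # lexed the same way.
    words = lex(source)
    for name, text in licenses.items():
        if contains(words, lex(text)):
            return name, "exact"

    # Step 5: fall back to signal words; a human should review these guesses.
    lowered = {w.lower() for w in words}
    for name, hints in signal_words.items():
        if hints <= lowered:
            return name, "guess"
    return None, "unknown"
```

Anything reported as "guess" or "unknown" would still need the human review
mentioned above. Telling lgpl2.1 from lgpl2.1+ would then come down to
whether the matched reference text includes the "or later" wording.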