Hi,

as just written in another mail, I'm currently working on an
Erlang/rebar build system. This includes an importer from hex.pm, a
package repository for Elixir and Erlang packages. (Since this is built
into rebar3, I assume it is what PyPI is for Python and CPAN for Perl.)

At hex.pm, packages are provided in a tar-file [1] wrapping the source
tar-file:

-rw-r--r-- 0/0    1 2017-06-14 21:57 VERSION
-rw-r--r-- 0/0   64 2017-06-14 21:57 CHECKSUM
-rw-r--r-- 0/0  532 2017-06-14 21:57 metadata.config
-rw-r--r-- 0/0 4744 2017-06-14 21:57 contents.tar.gz

IMHO it does not make sense to keep this wrapping tar-file in the store.

So my idea is to create a "hexpm-fetch" method, which downloads the
tar-file and only stores the "contents.tar.gz" in the store (using a
proper name, of course).

How can this be done?

[1] https://github.com/hexpm/specifications/blob/master/package_tarball.md

--
Regards
Hartmut Goebel

| Hartmut Goebel | h.goebel@crazy-compilers.com |
| www.crazy-compilers.com | compilers which you thought are impossible |
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Saturday, May 30, 2020 10:39 AM, Hartmut Goebel <h.goebel@crazy-compilers.com> wrote:
> Hi,
>
> as just written in another mail, I'm currently working on an
> Erlang/rebar build system. This includes an importer from hex.pm, a
> package repository for Elixir and Erlang packages. (Since this is built
> into rebar3, I assume it is what PyPI is for Python and CPAN for Perl.)
>
> At hex.pm, packages are provided in a tarfile [1] wrapping the source
> tar-file:
>
> -rw-r--r-- 0/0 1 2017-06-14 21:57 VERSION
> -rw-r--r-- 0/0 64 2017-06-14 21:57 CHECKSUM
> -rw-r--r-- 0/0 532 2017-06-14 21:57 metadata.config
> -rw-r--r-- 0/0 4744 2017-06-14 21:57 contents.tar.gz
>
> IMHO it does not make sense to keep this wrapping tar-file in the store.
>
> So my idea is to create a "hexpm-fetch" method, which downloads the
> tar-file and only stores the "contents.tar.gz" in the store (using a
> proper name, of course).
>
> How can this be done?
>
> [1] https://github.com/hexpm/specifications/blob/master/package_tarball.md
>
>
Hi,
You have probably reached the same conclusions as I did, but anyway...
I took a look at guix/download.scm. I think you just need to check what url-fetch/zipbomb does, because the use case is similar to what you are looking for.
Hope this helps at least a little.
Thanks for the work you are doing. I'm interested in it because I want to package Wings3D, so once you are done you'll probably have a tester :)
Best,
Ekaitz
On 30.05.20 at 12:24, Ekaitz Zarraga wrote:

> I took a look at guix/download.scm. I think you just need to check what url-fetch/zipbomb does, because the use case is similar to what you are looking for.

Yes, I've already seen this. And there also is url-fetch/tarbomb. But
this "%store-monad" in there discourages me, as I'm afraid this will
keep the file in the store.

> Thanks for the work you are doing. I'm interested in it because I want to package Wings3D, so once you are done you'll probably have a tester :)

You can already start testing the rebar3 builder :-) You can find my
WIP at
<https://gitlab.digitalcourage.de/htgoebel/guix/-/tree/HG-rebar-build-system>

--
Regards
Hartmut Goebel
Hi,

related to the "wrapped tarball downloader":

Will this work with Software Heritage? E.g. will Software Heritage be
able to archive the unwrapped tarball?

--
Kind regards
Hartmut Goebel

Dipl.-Informatiker (univ), CISSP, CSSLP, ISO 27001 Lead Implementer
Information Security Management, Security Governance, Secure Software Development

Goebel Consult, Landshut
http://www.goebel-consult.de

Blog: https://www.goe-con.de/blog/35.000-gegen-vorratdatenspeicherung
Column: https://www.goe-con.de/hartmut-goebel/cissp-gefluester/2011-09-kommerz-uber-recht-fdp-die-gefaellt-mir-partei
Dear Hartmut,

On Sun, 31 May 2020 at 10:21, Hartmut Goebel <h.goebel@goebel-consult.de> wrote:

> related to the "wrapped tarball downloader":

Sorry, I have not followed this topic closely. Could you provide a
link or entry point about the "wrapped tarball downloader"?

> Will this work with Software Heritage? E.g. will Software Heritage be
> able to archive the unwrapped tarball?

As said above, I do not know exactly what "unwrapped tarball" means,
but the current situation regarding SWH is: "guix lint" queues the
origin if it is 'git-fetch', and SWH (will soon) fetch the tarballs from
http://guix.gnu.org/sources.json (type: 'url' for now, if I have not
misread the last updates).

Thanks,
simon
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Sunday, May 31, 2020 10:19 AM, Hartmut Goebel <h.goebel@crazy-compilers.com> wrote:

> On 30.05.20 at 12:24, Ekaitz Zarraga wrote:
>
> > I took a look at guix/download.scm. I think you just need to check what url-fetch/zipbomb does, because the use case is similar to what you are looking for.
>
> Yes, I've already seen this. And there also is url-fetch/tarbomb. But
> this "%store-monad" in there discourages me, as I'm afraid this will
> keep the file in the store.

I've been thinking about this and I don't expect those files to be kept in the store. Doesn't the store just keep the result of the packaging rather than the source of it?

> > Thanks for the work you are doing. I'm interested in it because I want to package Wings3D, so once you are done you'll probably have a tester :)
>
> You can already start testing the rebar3 builder :-) You can find my
> WIP at
> https://gitlab.digitalcourage.de/htgoebel/guix/-/tree/HG-rebar-build-system

I'll take a look if I have some free time, thanks for the link!
Hartmut Goebel <h.goebel@crazy-compilers.com> writes:

> Hi,
>
> as just written in another mail, I'm currently working on an
> Erlang/rebar build system. This includes an importer from hex.pm, a
> package repository for Elixir and Erlang packages. (Since this is built
> into rebar3, I assume it is what PyPI is for Python and CPAN for Perl.)
>
> At hex.pm, packages are provided in a tar-file [1] wrapping the source
> tar-file:
>
> -rw-r--r-- 0/0    1 2017-06-14 21:57 VERSION
> -rw-r--r-- 0/0   64 2017-06-14 21:57 CHECKSUM
> -rw-r--r-- 0/0  532 2017-06-14 21:57 metadata.config
> -rw-r--r-- 0/0 4744 2017-06-14 21:57 contents.tar.gz
>
> IMHO it does not make sense to keep this wrapping tar-file in the store.
>
> So my idea is to create a "hexpm-fetch" method, which downloads the
> tar-file and only stores the "contents.tar.gz" in the store (using a
> proper name, of course).
>
> How can this be done?

Tarballs from rubygems.org have the same problem; there it is worked
around by special support in ruby-build-system.

It would be ideal to have an origin method that could extract the
"inner" tarball, i.e. contents.tar.gz for hex.pm and data.tar.gz in the
case of RubyGems. As zimoun mentioned, a good place to start is to look
at how other origin methods are implemented, such as url-fetch/tarbomb,
etc.
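To make the "inner tarball" idea concrete, here is a minimal sketch in Python of what such an unwrapping step does. It is only an illustration, not Guix code: a real origin method would be written in Guile, and the names extract_inner and make_tar as well as the archive contents below are made up for the demo. The only thing taken from the thread is the layout: an outer tar whose member contents.tar.gz (or data.tar.gz for RubyGems) is the actual source.

```python
import io
import tarfile


def extract_inner(outer: bytes, member: str) -> bytes:
    """Return the raw bytes of `member` stored inside the outer tar archive."""
    with tarfile.open(fileobj=io.BytesIO(outer), mode="r:*") as tar:
        extracted = tar.extractfile(member)  # raises KeyError if member is missing
        if extracted is None:  # member exists but is not a regular file
            raise ValueError(f"{member!r} is not a regular file")
        return extracted.read()


def make_tar(files: dict, mode: str = "w") -> bytes:
    """Demo helper: build a tar archive in memory from a name -> bytes mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Synthetic wrapper with the same member names as in the listing above.
inner = make_tar({"src/demo.erl": b"-module(demo).\n"}, mode="w:gz")
outer = make_tar({
    "VERSION": b"3",
    "CHECKSUM": b"0" * 64,
    "metadata.config": b"made-up metadata\n",
    "contents.tar.gz": inner,
})

unwrapped = extract_inner(outer, "contents.tar.gz")
print(len(outer), "bytes outer,", len(unwrapped), "bytes inner")
```

For a RubyGems-style wrapper the same function would be called with "data.tar.gz" as the member name.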
On 02.06.20 at 21:41, Marius Bakke wrote:

> It would be ideal to have an origin method that could extract the
> "inner" tarball, i.e. contents.tar.gz for hex.pm and data.tar.gz in the
> case of RubyGems. As zimoun mentioned, a good place to start is look at
> how other origin methods are implemented such as url-fetch/tarbomb, etc.

I started implementing in this direction and would like your advice on
the design. I found two options:

1. When implementing some "url-fetch/wrapped" (name TBD), *two* items
   will be kept in the store: the "outer" and the "inner" tarball. This
   is because "url-fetch" and its siblings use the built-in downloader,
   which AFAIK always puts the downloaded files into the store.

   In this case we need to check the hash of the "outer" tarball, as
   the built-in downloader requires a hash to be passed and to match.
   But then we cannot check the hash of the "inner" tarball.

   How would this work with substitutes and download-nar?

2. When implementing some "wrapped-fetch" (name TBD), modeled like
   "git-fetch", there is no easy way for the user to verify the hash,
   as this is taken from the "inner" tarball.

   How does this work with substitutes, download-nar and SWH?

--
Regards
Hartmut Goebel
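The difference between the two options comes down to which bytes the recorded hash covers. The following Python sketch is only an illustration under stated assumptions, not Guix code: it uses hex-encoded SHA-256 rather than the base32 form Guix package definitions use, the names sha256_hex, unwrap_verified and tar_of are hypothetical, and the archive content is a stand-in. It shows the option 1 flow, where the outer archive is checked against the recorded hash before the inner member is extracted.

```python
import hashlib
import io
import tarfile


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()


def unwrap_verified(outer: bytes, expected_outer_sha256: str, member: str) -> bytes:
    """Check the outer archive's hash, then return the inner member's bytes."""
    actual = sha256_hex(outer)
    if actual != expected_outer_sha256:
        raise ValueError(f"hash mismatch: expected {expected_outer_sha256}, got {actual}")
    with tarfile.open(fileobj=io.BytesIO(outer), mode="r:*") as tar:
        extracted = tar.extractfile(member)
        if extracted is None:
            raise ValueError(f"{member!r} is not a regular file")
        return extracted.read()


def tar_of(name: str, data: bytes) -> bytes:
    """Demo helper: an in-memory tar archive holding a single member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


inner = b"stand-in for the bytes of contents.tar.gz"
outer = tar_of("contents.tar.gz", inner)

outer_hash = sha256_hex(outer)  # what option 1 would record
inner_hash = sha256_hex(unwrap_verified(outer, outer_hash, "contents.tar.gz"))  # what option 2 would record

print("outer:", outer_hash)
print("inner:", inner_hash)
```

The two digests differ, so a user who hashes the file as downloaded obtains the option 1 value; to reproduce the option 2 value they would have to unwrap the archive first.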
Dear Hartmut,

On Sat, 6 Jun 2020 at 17:29, Hartmut Goebel <h.goebel@crazy-compilers.com> wrote:

> 2. When implementing some "wrapped-fetch" (name TBD), modeled like
> "git-fetch", there is no easy way for the user to verify the hash, as
> this is taken from the "inner" tarball. How does this work with
> substitutes, download-nar and SWH?

Today, if I understand correctly, Guix feeds SWH through only one
stream, "guix lint", and only for 'git-fetch' packages. The origin
methods of Guix packages are distributed as follows:

      1 bzr-fetch
      3 cvs-fetch
      9 url-fetch/tarbomb
     24 url-fetch/zipbomb
     28 hg-fetch
     30 computed-origin-method
     67 no-origin
    115 svn-fetch
    135 svn-multi-fetch
   3574 git-fetch
   9690 url-fetch

where the 'svn-multi-fetch' ones are mainly CTAN/TeX packages. As you
can see, most of the packages are not yet archived in SWH. Since SWH
supports 'svn-fetch' and 'hg-fetch', it is doable to add them to
"guix lint", but it is low priority, at least on my TODO list. :-)

The WIP on the SWH side is about 'url-fetch'. I have not followed all
the recent developments by lewo, but roughly speaking they are
implementing another "lister" [2,3,4] for tarballs. The final aim is
that SWH automatically ingests https://guix.gnu.org/sources.json, which
is automatically generated every X minutes. Currently, the compliance
of this 'sources.json' is still a WIP; the format is changing and the
specification is not yet fixed.

What SWH archives is the upstream source, i.e., *not* the output of
"guix build -S" but what comes from 'origin'. What happens afterwards,
and what Guix does with it, does not matter for SWH. Therefore, if I
understand correctly, SWH will archive the initial tarball. (Sorry, I
am lost with the "inner/outer" terminology.)

Note that only the package tarball you pointed to [5] needs to be
checksummed: if this initial package tarball matches, then
'contents.tar.gz' will match too, won't it?

I hope I have not misread or missed something.
All the best,
simon

[1] https://archive.softwareheritage.org/save/
[2] https://docs.softwareheritage.org/devel/swh-lister/index.html
[3] https://forge.softwareheritage.org/D2025
[4] https://forge.softwareheritage.org/T1991
[5] https://github.com/hexpm/specifications/blob/master/package_tarball.md
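As a quick check of the "most packages are not yet archived" remark, the shares can be worked out from the counts quoted in the message above. This is just arithmetic on those numbers in Python; the variable names are arbitrary, and the assumption (taken from the message) is that only 'git-fetch' origins are currently queued via "guix lint".

```python
# Origin-method counts exactly as listed in the message above.
counts = {
    "bzr-fetch": 1,
    "cvs-fetch": 3,
    "url-fetch/tarbomb": 9,
    "url-fetch/zipbomb": 24,
    "hg-fetch": 28,
    "computed-origin-method": 30,
    "no-origin": 67,
    "svn-fetch": 115,
    "svn-multi-fetch": 135,
    "git-fetch": 3574,
    "url-fetch": 9690,
}

total = sum(counts.values())

# Percentage of all packages per origin method, largest first.
shares = {
    method: 100 * n / total
    for method, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
}

print(f"total packages: {total}")
for method, share in shares.items():
    print(f"{method:24s} {share:5.1f}%")
```

With these numbers 'git-fetch' covers roughly a quarter of the packages, while 'url-fetch', the method the SWH-side work targets, covers roughly 70%.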
Hi,

I'm pleased to announce that the hex.pm downloader and the rebar3
build system are now available for testing and review, at either

<https://gitlab.digitalcourage.de/htgoebel/guix/-/tree/HG-rebar-build-system>

or

<http://debbugs.gnu.org/cgi/bugreport.cgi?bug=42180>

--
Regards
Hartmut Goebel