* [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s)
@ 2022-12-23 22:07 Denis 'GNUtoo' Carikli
2022-12-23 22:20 ` [bug#60288] [PATCH v1 1/2] build-system/copy: Add #:substitutable? argument Denis 'GNUtoo' Carikli
2022-12-28 18:10 ` [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s) Christopher Baines
0 siblings, 2 replies; 6+ messages in thread
From: Denis 'GNUtoo' Carikli @ 2022-12-23 22:07 UTC (permalink / raw)
To: 60288; +Cc: Denis 'GNUtoo' Carikli
Hi,
Here are two small patches.
The first one adds #:substitutable? to the copy build system.
I don't know how to check that it works as intended, though. It's
similar to commit d0050ea8ad1c32d94cf5ba6725a0fc961bb23f38
("build-system/go: Add #:substitutable? argument."), so normally
it shouldn't be an issue. Still, it would be best if someone could
double-check it, since the point is to avoid keeping very large
substitutes around.
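One possible way to check this from `guix repl` is sketched below. It
assumes both patches are applied, so that (gnu packages zim-files) and
wikipedia-en-all-maxi exist, and it relies on substitutable-derivation?
from (guix derivations). Computing the derivation should not trigger the
download of the ZIM file.

```scheme
(use-modules (guix store)
             (guix derivations)
             (guix packages)
             (gnu packages zim-files))   ;only exists with patch 2/2 applied

(with-store store
  (let ((drv (package-derivation store wikipedia-en-all-maxi)))
    ;; A result of #f means the derivation is marked as non-substitutable,
    ;; i.e. #:substitutable? #f made it through copy-build.
    (substitutable-derivation? drv)))
```

A result of #t would suggest that the keyword was dropped somewhere
between the package arguments and gexp->derivation.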
The second patch adds a ZIM file. I'll most likely send more
patches to add additional ZIM file packages (about 10) later
on. I prefer doing it this way as it avoids having to deal with
potential rebases breaking if there is something wrong with my
second patch.
Denis 'GNUtoo' Carikli (2):
build-system/copy: Add #:substitutable? argument.
gnu: Add wikipedia_en_all_maxi
gnu/local.mk | 1 +
gnu/packages/zim-files.scm | 86 ++++++++++++++++++++++++++++++++++++++
guix/build-system/copy.scm | 4 +-
3 files changed, 90 insertions(+), 1 deletion(-)
create mode 100644 gnu/packages/zim-files.scm
base-commit: c193b5203b31246a6d74270c8086c45851561947
--
2.38.1
* [bug#60288] [PATCH v1 1/2] build-system/copy: Add #:substitutable? argument.
2022-12-23 22:07 [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s) Denis 'GNUtoo' Carikli
@ 2022-12-23 22:20 ` Denis 'GNUtoo' Carikli
2022-12-23 22:20 ` [bug#60288] [PATCH v1 2/2] gnu: Add wikipedia_en_all_maxi Denis 'GNUtoo' Carikli
2022-12-28 18:10 ` [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s) Christopher Baines
1 sibling, 1 reply; 6+ messages in thread
From: Denis 'GNUtoo' Carikli @ 2022-12-23 22:20 UTC (permalink / raw)
To: 60288; +Cc: Denis 'GNUtoo' Carikli
* guix/build-system/copy.scm (copy-build): Add 'substitutable?'
argument and pass it to 'gexp->derivation'.
---
guix/build-system/copy.scm | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/guix/build-system/copy.scm b/guix/build-system/copy.scm
index 4894ba46fb..bb4d2daaa8 100644
--- a/guix/build-system/copy.scm
+++ b/guix/build-system/copy.scm
@@ -96,7 +96,8 @@ (define* (copy-build name inputs
(target #f)
(imported-modules %copy-build-system-modules)
(modules '((guix build copy-build-system)
- (guix build utils))))
+ (guix build utils)))
+ (substitutable? #t))
"Build SOURCE using INSTALL-PLAN, and with INPUTS."
(define builder
(with-imported-modules imported-modules
@@ -129,6 +130,7 @@ (define builder
(gexp->derivation name builder
#:system system
#:target #f
+ #:substitutable? substitutable?
#:guile-for-build guile)))
(define copy-build-system
--
2.38.1
* [bug#60288] [PATCH v1 2/2] gnu: Add wikipedia_en_all_maxi
2022-12-23 22:20 ` [bug#60288] [PATCH v1 1/2] build-system/copy: Add #:substitutable? argument Denis 'GNUtoo' Carikli
@ 2022-12-23 22:20 ` Denis 'GNUtoo' Carikli
0 siblings, 0 replies; 6+ messages in thread
From: Denis 'GNUtoo' Carikli @ 2022-12-23 22:20 UTC (permalink / raw)
To: 60288; +Cc: Denis 'GNUtoo' Carikli
* gnu/packages/zim-files.scm (wikipedia-en-all-maxi): New variable.
---
gnu/local.mk | 1 +
gnu/packages/zim-files.scm | 86 ++++++++++++++++++++++++++++++++++++++
2 files changed, 87 insertions(+)
create mode 100644 gnu/packages/zim-files.scm
diff --git a/gnu/local.mk b/gnu/local.mk
index 5b8944f568..8957554fc2 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -643,6 +643,7 @@ GNU_SYSTEM_MODULES = \
%D%/packages/xfce.scm \
%D%/packages/zig.scm \
%D%/packages/zile.scm \
+ %D%/packages/zim-files.scm \
%D%/packages/zwave.scm \
\
%D%/services.scm \
diff --git a/gnu/packages/zim-files.scm b/gnu/packages/zim-files.scm
new file mode 100644
index 0000000000..49b7accb52
--- /dev/null
+++ b/gnu/packages/zim-files.scm
@@ -0,0 +1,86 @@
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2022 Denis 'GNUtoo' Carikli <GNUtoo@cyberdimension.org>
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix. If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (gnu packages zim-files)
+ #:use-module (gnu packages)
+ #:use-module (guix build-system copy)
+ #:use-module (guix download)
+ #:use-module (guix gexp)
+ #:use-module (guix utils)
+ #:use-module ((guix licenses) #:prefix license:)
+ #:use-module (guix packages))
+
+;;; Commentary:
+;;;
+;;; Many Guix contributors have a tendency to update packages in this
+;;; way: they only update the package revision and then launch a build
+;;; that fails just to make Guix tell them the right base32 hash. They
+;;; then update the base32 hash and launch the build again.
+;;;
+;;; However some ZIM files are quite big. At the time of writing,
+;;; wikipedia_en_all_maxi_2022-05.zim is about 89 GiB.
+;;;
+;;; So this approach is time consuming: on the second build, Guix
+;;; downloads the same file again from scratch.
+;;;
+;;; The solution is to download the published SHA-256 sum (simply
+;;; append .sha256 to the URL of the ZIM file). This gives a file
+;;; like this:
+;;; f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021 wikipedia_en_all_maxi_2022-05.zim
+;;;
+;;; You can then use this hash to compute the base32 with nix-hash:
+;;; $ nix-hash --type sha256 --to-base32 \
+;;; f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021
+;;; 08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi
+
+(define-public wikipedia-en-all-maxi
+ (package
+ (name "wikipedia-en-all-maxi")
+ (version "2022-05")
+ (source (origin
+ (method url-fetch)
+ (uri (string-append
+ "https://mirror.download.kiwix.org/zim/wikipedia/"
+ (string-replace-substring name "-" "_")
+ "_" version ".zim"))
+ (sha256
+ (base32
+ "08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi"))))
+ (build-system copy-build-system)
+ (arguments
+ (list
+ ;; We are not (yet) generating the zim file, so it doesn't make sense to
+ ;; build substitutes.
+ #:substitutable? #f
+ ;; If we use kiwix-serve, the path of the ZIM file needs to be passed to
+ ;; it. And if the filename has a version in it, we'd need to update the
+ ;; path manually each time the package is updated. We also need to
+ ;; change the filename to match the package name.
+ #:install-plan #~'((#$(string-append
+ (string-replace-substring name "-" "_")
+ "_" version ".zim")
+ #$(string-append "share/" name ".zim")))))
+ (synopsis
+ "Complete English Wikipedia in a ZIM file, for offline use with Kiwix")
+ (description
+ "Wikipedia is a free encyclopedia. This is the English version. It
+contains all the articles, and all the media (images, etc.) present in
+the articles, at a scaled-down resolution. The ZIM file can be read
+offline with Kiwix.")
+ (home-page "https://en.wikipedia.org/wiki/Main_Page")
+ (license license:cc-by-sa3.0)))
--
2.38.1
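If nix-hash is not available, the hex-to-base32 conversion described in
the commentary of zim-files.scm can probably be done with Guix's own
modules instead. A minimal sketch, assuming (guix base16) and
(guix base32) are on the load path (for example inside `guix repl`):

```scheme
(use-modules (guix base16)    ;base16-string->bytevector
             (guix base32))   ;bytevector->nix-base32-string

;; Hex digest copied from the .sha256 file quoted in the commentary.
(define hex-digest
  "f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021")

;; Decode the hex string to raw bytes, then re-encode those bytes in the
;; nix-base32 alphabet expected by the (base32 "...") form of an origin.
(bytevector->nix-base32-string
 (base16-string->bytevector hex-digest))
```

For the digest above, this should return the same string that nix-hash
prints in the commentary, without downloading the ZIM file at all.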
* [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s)
2022-12-23 22:07 [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s) Denis 'GNUtoo' Carikli
2022-12-23 22:20 ` [bug#60288] [PATCH v1 1/2] build-system/copy: Add #:substitutable? argument Denis 'GNUtoo' Carikli
@ 2022-12-28 18:10 ` Christopher Baines
2022-12-29 23:19 ` Denis 'GNUtoo' Carikli
2023-01-02 20:01 ` Denis 'GNUtoo' Carikli
1 sibling, 2 replies; 6+ messages in thread
From: Christopher Baines @ 2022-12-28 18:10 UTC (permalink / raw)
To: Denis 'GNUtoo' Carikli; +Cc: 60288
Denis 'GNUtoo' Carikli <GNUtoo@cyberdimension.org> writes:
> Here are two small patches.
>
> The first one adds #:substitutable? to the copy build system.
>
> I don't know how to check if it works as intended though. It's
> similar to the commit d0050ea8ad1c32d94cf5ba6725a0fc961bb23f38
> ("build-system/go: Add #:substitutable? argument.") so normally
> it shouldn't be an issue, but if someone can double check it it
> would be best as it would avoid keeping around substitutes of
> very big sizes.
>
> The second patch adds a ZIM file. I'll most likely send more
> patches to add additional ZIM file packages (about 10) later
> on. I prefer doing it this way as it avoids having to deal with
> potential rebases breaking if there is something wrong with my
> second patch.
>
> Denis 'GNUtoo' Carikli (2):
> build-system/copy: Add #:substitutable? argument.
> gnu: Add wikipedia_en_all_maxi
I haven't looked at this in detail, but one comment on the QA
failures. Building the package for this large file involves copying it
from the store to another place in the store. This requires twice the
space that the file takes up, which is pretty wasteful.
This is the reason behind the build failures I've seen: the build
machines run out of space when attempting the file copy. Maybe an
alternative if you want to have a package would be to symlink to the
source. That way, there's only a large file and a symlink in the store,
rather than two copies of the same large file.
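As a rough illustration of the symlink idea, here is a sketch rather
than a tested patch. Using trivial-build-system is an assumption made
for the example, as is referring to the source with
(package-source this-package); the origin, names and hash are taken from
patch 2/2, and #:substitutable? is left out on purpose.

```scheme
(use-modules (guix packages)
             (guix download)
             (guix gexp)
             (guix build-system trivial)
             ((guix licenses) #:prefix license:))

(define-public wikipedia-en-all-maxi
  (package
    (name "wikipedia-en-all-maxi")
    (version "2022-05")
    (source (origin
              (method url-fetch)
              (uri (string-append
                    "https://mirror.download.kiwix.org/zim/wikipedia/"
                    "wikipedia_en_all_maxi_" version ".zim"))
              (sha256
               (base32
                "08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi"))))
    (build-system trivial-build-system)   ;assumption: not from the patch
    (arguments
     (list
      #:builder
      #~(begin
          ;; Create OUTPUT/share and point a symlink at the downloaded
          ;; file, so the large file exists only once in the store.
          (mkdir #$output)
          (mkdir (string-append #$output "/share"))
          (symlink #$(package-source this-package)
                   (string-append #$output "/share/" #$name ".zim")))))
    (synopsis "Complete English Wikipedia in a ZIM file")
    (description "English Wikipedia as a ZIM file, for offline use with
Kiwix.")
    (home-page "https://en.wikipedia.org/wiki/Main_Page")
    (license license:cc-by-sa3.0)))
```

With this shape the build step only creates two directories and one
symlink, so it should need almost no extra disk space on the build
machines.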
Chris
* [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s)
2022-12-28 18:10 ` [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s) Christopher Baines
@ 2022-12-29 23:19 ` Denis 'GNUtoo' Carikli
2023-01-02 20:01 ` Denis 'GNUtoo' Carikli
1 sibling, 0 replies; 6+ messages in thread
From: Denis 'GNUtoo' Carikli @ 2022-12-29 23:19 UTC (permalink / raw)
To: Christopher Baines; +Cc: 60288
On Wed, 28 Dec 2022 18:10:54 +0000
Christopher Baines <mail@cbaines.net> wrote:
> I haven't looked at this in detail, but one comment on the QA
> failures. Building the package for this large file involves copying it
> from the store, to another place in the store. This requires 2x the
> space which this large file takes up, which is a pretty wasteful
> approach.
Not only that, but the copy also takes a very long time on
slower machines with an encrypted rootfs.
> This is the reason behind the build failures I've seen, the build
> machines run out of space when attempting the file copy. Maybe an
> alternative if you want to have a package would be to symlink to the
> source. That way, there's only a large file and a symlink in the
> store, rather than two copies of the same large file.
I'll try that. I hope that guix gc will not garbage collect the source
though.
Do you know if it's possible just to have a source package somehow
(and download the source to a specific filename) and not copy anything
at all?
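For the "specific filename" part of the question, one option that may be
worth trying is the file-name field of origin, which sets the name of
the downloaded store item. The sketch below is only an illustration: the
variable name wikipedia-en-all-maxi-zim is hypothetical, and the URL and
hash are the ones from patch 2/2.

```scheme
(use-modules (guix packages)
             (guix download))

;; Hypothetical variable name; a bare origin, with no package around it.
(define wikipedia-en-all-maxi-zim
  (origin
    (method url-fetch)
    (uri (string-append
          "https://mirror.download.kiwix.org/zim/wikipedia/"
          "wikipedia_en_all_maxi_2022-05.zim"))
    ;; Name of the store item, without the version.
    (file-name "wikipedia-en-all-maxi.zim")
    (sha256
     (base32
      "08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi"))))
```

Note that the store file name would still start with a hash that changes
on every update, and as far as I know a bare origin cannot be installed
into a profile, so this alone probably does not give kiwix-serve a
stable path.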
Denis.
* [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s)
2022-12-28 18:10 ` [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s) Christopher Baines
2022-12-29 23:19 ` Denis 'GNUtoo' Carikli
@ 2023-01-02 20:01 ` Denis 'GNUtoo' Carikli
1 sibling, 0 replies; 6+ messages in thread
From: Denis 'GNUtoo' Carikli @ 2023-01-02 20:01 UTC (permalink / raw)
To: Christopher Baines; +Cc: 60288
On Wed, 28 Dec 2022 18:10:54 +0000
Christopher Baines <mail@cbaines.net> wrote:
> Maybe an alternative if you want to have a package would be to
> symlink to the source.
The issue is that I don't know how to refer to the source in a
situation like that.
I didn't really find good examples of this. So far the best
I saw was either to define (source [...]) once and reuse it in multiple
packages, or to reuse the source of another package with (package-source
<package name>), like in linux.scm.
The gnu build system copies the source into the current
directory, so I really have no idea what to do here. We might also need
to add the source to the inputs, native-inputs, or propagated-inputs
somehow, so that it isn't garbage collected once the ZIM file is
installed. Is propagated-inputs the way to go?
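On the garbage collection question: as far as I understand, the daemon
scans build outputs for store file names, symlink targets included, and
records them as references, which would keep the source alive without
propagated-inputs. A sketch of how this could be checked for any package
follows; the procedure name built-output-references is made up for the
example, and calling it does build the package.

```scheme
(use-modules (guix store)
             (guix derivations)
             (guix packages))

;; Hypothetical helper, not part of Guix.
(define (built-output-references package)
  "Build PACKAGE and return the list of store items its output refers to."
  (with-store store
    (let* ((drv (package-derivation store package))
           (out (derivation->output-path drv)))
      ;; The output must exist in the store before it can be queried.
      (build-derivations store (list drv))
      (references store out))))
```

If the store file name of the downloaded ZIM file shows up in the
returned list, the source is a run-time reference of the package and
`guix gc` should keep it for as long as the package itself is live.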
Denis.