unofficial mirror of guix-devel@gnu.org 
* wishlist: “repack” generations history of profile
@ 2022-05-20 13:47 zimoun
  2022-05-21 11:30 ` Liliana Marie Prikler
  2022-05-23 15:42 ` Ludovic Courtès
  0 siblings, 2 replies; 10+ messages in thread
From: zimoun @ 2022-05-20 13:47 UTC (permalink / raw)
  To: Guix Devel

Hi,

For instance, I have these,

--8<---------------cut here---------------start------------->8---
$ guix package --list-generations -p ~/.config/guix/profiles/emacs/emacs
Generation 14	Dec 30 2021 21:49:01
Generation 15	Dec 30 2021 22:11:51
Generation 16	Dec 30 2021 22:26:48
Generation 17	Dec 30 2021 23:34:14
Generation 18	Dec 31 2021 19:10:15
Generation 19	Apr 26 2022 14:50:34
Generation 20	Apr 26 2022 14:50:45
Generation 21	May 03 2022 14:14:20	(current)
--8<---------------cut here---------------end--------------->8---

or

--8<---------------cut here---------------start------------->8---
$ guix pull --list-generations -p ~/.config/guix/current
Generation 72	Dec 29 2021 22:52:13
Generation 73	Dec 31 2021 19:02:14
Generation 74	Jan 04 2022 10:04:35
Generation 75	Jan 06 2022 10:30:48
Generation 76	Feb 04 2022 11:15:54
Generation 77	Apr 05 2022 10:27:45
Generation 78	Apr 20 2022 01:26:24
Generation 79	Apr 26 2022 13:43:13	(current)
--8<---------------cut here---------------end--------------->8---

Now, assume I am running out of space.  If I run,

    guix gc --delete-generations=2m

then I am removing the items in the store (data) and also the metadata
(manifest, date, channels, etc.).

I am fine with deleting the old items in the store.  I do not want to
keep things I am not using.  However, for tracking and monitoring, I
would still like to keep this metadata and potentially be able to
rebuild such a generation.

The question is: what should we do when we delete?

I am proposing to delete the content, i.e., everything, but keep the
metadata, i.e., the manifest file.  We could have a soft option (keep
the metadata) and a hard option (remove everything, metadata included,
as today) for guix gc.

Or an option “guix gc --repack=2m” which would remove everything except
the manifest file.  The ability to switch to an old generation is kept,
space is saved, and it is almost transparent for the user.

If I later wanted to switch to a previous generation, Guix would
recompute all the derivations, i.e., just refill the store (cache).
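
To make the “keep only the manifest” idea concrete, here is a minimal
user-side sketch of what can be done by hand before a GC runs.  It is
not an existing Guix feature: the function name and the backup directory
are hypothetical, and it only assumes that the generations of a profile
are the usual <profile>-<N>-link symlinks, each with a ‘manifest’ file
at its root.

```shell
#!/bin/sh
# Sketch: copy the 'manifest' file of every generation of a profile to a
# directory outside the store, so that it survives a later "guix gc -d".
# Assumption: generation symlinks are named <profile>-<N>-link.
save_generation_manifests() {
    profile=$1   # e.g. ~/.config/guix/profiles/emacs/emacs
    dest=$2      # hypothetical backup directory chosen by the user
    mkdir -p "$dest" || return 1
    for link in "$profile"-*-link; do
        # Skip dangling links, and the unexpanded pattern when nothing matches.
        [ -e "$link/manifest" ] || continue
        n=${link#"$profile"-}   # strip the leading "<profile>-"
        n=${n%-link}            # strip the trailing "-link"
        # -f: store files are read-only, so an earlier copy may be too.
        cp -f "$link/manifest" "$dest/manifest-generation-$n" || return 1
    done
}
```

For instance,

    save_generation_manifests ~/.config/guix/profiles/emacs/emacs ~/generations-backup

would leave one manifest-generation-<N> file per generation that is
still present.  Rebuilding a generation from such a file is a separate
problem that this sketch does not address.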


When working on projects that last several years, the old generations
get deleted and part of the history is lost.  Obviously, I track some
manifest.scm and channels.scm files in the Git repo of the project.
Still, it would be better to be able to keep generations going back
years in a very light and transparent way.

On machines where many users run a lot of profiles, the sysadmin
periodically runs “guix gc” to save some space, and Guix therefore
deletes material for users who did not track the correct information.


WDYT?


Cheers,
simon



* Re: wishlist: “repack” generations history of profile
  2022-05-20 13:47 wishlist: “repack” generations history of profile zimoun
@ 2022-05-21 11:30 ` Liliana Marie Prikler
  2022-05-23 16:20   ` zimoun
  2022-05-23 15:42 ` Ludovic Courtès
  1 sibling, 1 reply; 10+ messages in thread
From: Liliana Marie Prikler @ 2022-05-21 11:30 UTC (permalink / raw)
  To: zimoun, Guix Devel

Hi,

On Friday, 20.05.2022 at 15:47 +0200, zimoun wrote:
> Hi,
> 
> [...]
> Now, assume I am running out of space.  If I run,
> 
>     guix gc --delete-generations=2m
> 
> then I am removing the items in the store (data) and also the meta
> data (manifest, date, channels, etc.).
> 
> I am fine to delete the old items in the store.  I do not want to
> keep things I am not using.  However, for tracking and monitoring, I
> would like to still keep these meta and potentially be able to
> rebuild such generation.
> 
> The question is what to do when we delete?
> 
> I am proposing to delete the content, i.e., all but keep the meta,
> i.e., the file manifest.  We could have an option soft (keep meta)
> and hard (remove all, meta included, as today) for guix gc.
> [...]
> WDYT?
I think we should implement this as a single --keep=stuff operator,
where stuff can be a comma-separated list.  In your case, I think the
stuff you wanted to keep were the profile manifests, but you could as
easily say that you want to keep all the .drv files and only drop the
store items.  With some imagination, we could even add "essentials", which
would be git-minimal and other packages that are native inputs in the
building of guix itself.

I'm not so sure if I understand the generation bit correctly, but with
the switch proposed above, you'd --keep=generations, which keeps just
enough data to make switch-generation work "lazily".

WDYT?



* Re: wishlist: “repack” generations history of profile
  2022-05-20 13:47 wishlist: “repack” generations history of profile zimoun
  2022-05-21 11:30 ` Liliana Marie Prikler
@ 2022-05-23 15:42 ` Ludovic Courtès
  2022-05-23 16:58   ` zimoun
  1 sibling, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2022-05-23 15:42 UTC (permalink / raw)
  To: zimoun; +Cc: Guix Devel

Hello!

zimoun <zimon.toutoune@gmail.com> writes:

> The question is what to do when we delete?
>
> I am proposing to delete the content, i.e., all but keep the meta, i.e.,
> the file manifest.  We could have an option soft (keep meta) and hard
> (remove all, meta included, as today) for guix gc.

Exactly!  ‘guix pull’ profiles are entirely reproducible: we can rebuild
them from the output of ‘guix describe’.

So ‘guix gc’ (or something) could automatically remove old generation
symlinks and instead store the output of ‘guix describe’.  That way,
‘--list-generations’ or ‘--switch-generation’ could transparently
display the info or rebuild the generation.
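
A rough sketch of the “store the output of ‘guix describe’” part, done
by hand for one ‘guix pull’ generation.  The function name, the
destination directory and the GUIX and GUIX_CURRENT override variables
are hypothetical; the only Guix features assumed are the -p (profile)
and -f channels (format) options of ‘guix describe’ and the
<profile>-<N>-link naming of generation symlinks.

```shell
#!/bin/sh
# Sketch: save the channel description of one 'guix pull' generation
# before its symlink is deleted.  Not an existing Guix feature.
# GUIX and GUIX_CURRENT are hypothetical overrides; they default to the
# "guix" command and to ~/.config/guix/current respectively.
record_pull_generation() {
    n=$1      # generation number, e.g. 72
    dest=$2   # hypothetical directory that holds the saved descriptions
    link="${GUIX_CURRENT:-$HOME/.config/guix/current}-$n-link"
    mkdir -p "$dest" || return 1
    # '-f channels' prints a channel list usable as a channels.scm file.
    "${GUIX:-guix}" describe -p "$link" -f channels \
        > "$dest/channels-generation-$n.scm"
}
```

After

    record_pull_generation 72 ~/generations-backup

the saved file could be given to ‘guix time-machine -C’ or ‘guix pull
-C’ to get the same Guix revision back, even once the generation itself
has been collected.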

System and Home generations are usually, but not necessarily,
reproducible: usually the channel info + config file are enough to
rebuild them, but in theory the config file might refer to resources not
known to Guix (e.g., SSH key files, modules, whatever).  That said, we
could arrange for ‘guix gc -d’ to keep the metadata around.

For regular profiles, we might do the same, but with no guarantee that
we can rebuild them, unless all the packages come from the same channels
(which is the case if the profile was built with ‘guix package -m’).

Ludo’.



* Re: wishlist: “repack” generations history of profile
  2022-05-21 11:30 ` Liliana Marie Prikler
@ 2022-05-23 16:20   ` zimoun
  0 siblings, 0 replies; 10+ messages in thread
From: zimoun @ 2022-05-23 16:20 UTC (permalink / raw)
  To: Liliana Marie Prikler, Guix Devel

Hi Liliana,

On Sat, 21 May 2022 at 13:30, Liliana Marie Prikler <liliana.prikler@gmail.com> wrote:

> I think we should implement this as a single --keep=stuff operator,

[...]

> I'm not so sure if I understand the generation bit correctly, but with
> the switch proposed above, you'd --keep=generations, which keeps just
> enough data to make switch-generation work "lazily".

From my understanding, this operator would add complexity and I am not
convinced that being so fine-grained is necessary.

Basically, the workflow is:

        guix package -m foo.scm
        guix package -m bar.scm
        guix pull
        guix package -m baz.scm
        guix gc -d 3m
        guix pull
        guix package -m bong.scm

and “guix gc” potentially removes all of the generation “foo.scm” (say
generation 1).  Instead, I would like to remove only the content to save
space but keep the metadata (basically the internal manifest).  This
way, the store would no longer contain all the items, but it would still
be possible to have,

        guix package --switch-generation=1

rebuild the missing items for generation 1.

Today, it is not possible to keep the history of generations and delete
the store items of old generations.  I am proposing an option for
“guix gc” that allows this: keep the history and the ability to switch,
and at the same time remove the old items.


Cheers,
simon





* Re: wishlist: “repack” generations history of profile
  2022-05-23 15:42 ` Ludovic Courtès
@ 2022-05-23 16:58   ` zimoun
  2022-05-30 15:40     ` Ludovic Courtès
  0 siblings, 1 reply; 10+ messages in thread
From: zimoun @ 2022-05-23 16:58 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Guix Devel

Hi Ludo,

On Mon, 23 May 2022 at 17:42, Ludovic Courtès <ludo@gnu.org> wrote:

> Exactly!  ‘guix pull’ profiles are entirely reproducible: we can rebuild
> them from the output of ‘guix describe’.
>
> So ‘guix gc’ (or something) could automatically remove old generation
> symlinks and instead store the output of ‘guix describe’.  That way,
> ‘--list-generations’ or ‘--switch-generation’ could transparently
> display the info or rebuild the generation.

What I have in mind is to remove everything except the manifest file.

--8<---------------cut here---------------start------------->8---
$ tree -L 1 $(readlink -f ~/.config/guix/current)
/gnu/store/jfnnd975724kdr8q61z4fwabrm4qvzff-profile/
├── bin -> /gnu/store/fnfzidl79gjdki2d8v2ghn6a42n75rqc-guix-58f372776/bin
├── etc
├── lib
├── manifest
└── share

4 directories, 1 file
--8<---------------cut here---------------end--------------->8---

After ’guix gc’ (or something), it would read:

--8<---------------cut here---------------start------------->8---
/gnu/store/jfnnd975724kdr8q61z4fwabrm4qvzff-profile/
└── manifest

0 directories, 1 file
--8<---------------cut here---------------end--------------->8---

And similarly for a regular profile, after ’guix gc’ (or something) it
would become

--8<---------------cut here---------------start------------->8---
$ tree -L 1 $(readlink -f ~/.guix-profile)
/gnu/store/3kglj010azkdydsgn6inmxqa3d24yz8a-profile
└── manifest

0 directories, 1 file
--8<---------------cut here---------------end--------------->8---

instead of the 7 directories I get today.


Well, I am not familiar enough with the various internals, such as the
SQL database and the daemon, to have an opinion about the technical
issues.

Or maybe it is easier to delete as we do today but save the relevant
information elsewhere.  WDYT?


> System and Home generations are usually, but not necessarily,
> reproducible: usually the channel info + config file are enough to
> rebuild them, but in theory the config file might refer to resources not
> known to Guix (e.g., SSH key files, modules, whatever).  That said, we
> could arrange so that ‘guix gc -d’ keeps the metadata around.
>
> For regular profiles, we might do the same, but no guarantee we can
> rebuild them, unless all the packages come from the same channels (which
> is the case if the profile was built with ‘guix package -m’).

Yes.  The same limitation as --export-manifest and --export-channels.
Somehow, if the full history is kept, we should be able to reproduce.
However, it could be slow or inefficient.


Today, I manually store the output of “guix describe” before I
manipulate a profile.  This channels.scm output is tracked by Git as
part of my project.  Then I run “git log” to find the previous state of
interest and go back using,

    guix time-machine -C channels.scm -- shell -m manifest.scm

which is fine and works well.  Probably good practice. :-)

Instead of this external tracking, I would like to allow this workflow:

    guix package -p project --list-generations
    guix package -p project --switch-generation=12

whatever the sysadmin garbage-collects from the old generations.


Cheers,
simon



* Re: wishlist: “repack” generations history of profile
  2022-05-23 16:58   ` zimoun
@ 2022-05-30 15:40     ` Ludovic Courtès
  2022-05-30 17:18       ` zimoun
  0 siblings, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2022-05-30 15:40 UTC (permalink / raw)
  To: zimoun; +Cc: Guix Devel

Hi,

zimoun <zimon.toutoune@gmail.com> writes:

> On Mon, 23 May 2022 at 17:42, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> Exactly!  ‘guix pull’ profiles are entirely reproducible: we can rebuild
>> them from the output of ‘guix describe’.
>>
>> So ‘guix gc’ (or something) could automatically remove old generation
>> symlinks and instead store the output of ‘guix describe’.  That way,
>> ‘--list-generations’ or ‘--switch-generation’ could transparently
>> display the info or rebuild the generation.
>
> I have in mind to remove all except the manifest file.

[...]

> Or maybe, it is easier to remove as today but save the relevant
> information elsewhere.  WDYT?

Yes, that ‘manifest’ file would have to be copied elsewhere (we can’t
just remove part of what’s in a /gnu/store directory).

> Today, I manually store the output of “guix describe” before I
> manipulate a profile.  This channels.scm output is tracked by Git as
> part of my project.  Then I run “git log” to find the previous state of
> interest and go back using,
>
>     guix time-machine -C channels.scm -- shell -m manifest.scm
>
> which is fine and works well.  Probably the good practise. :-)
>
> Instead of this external tracking, I would like to allow this workflow:
>
>     guix package -p project --list-generations
>     guix package -p project --switch-generation=12
>
> whatever the sysadmin collect about the old generations.

Do you expect ‘--list-generations’ to look at older revisions of your
version-controlled ‘manifest.scm’?

Ludo’.



* Re: wishlist: “repack” generations history of profile
  2022-05-30 15:40     ` Ludovic Courtès
@ 2022-05-30 17:18       ` zimoun
  2022-06-04  7:39         ` Giovanni Biscuolo
  0 siblings, 1 reply; 10+ messages in thread
From: zimoun @ 2022-05-30 17:18 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Guix Devel

Hi,

On Mon, 30 May 2022 at 17:40, Ludovic Courtès <ludo@gnu.org> wrote:

> Yes, that ‘manifest’ file would have to be copied elsewhere (we can’t
> just remove part of what’s in a /gnu/store directory).

[...]

>> Instead of this external tracking, I would like to allow this workflow:
>>
>>     guix package -p project --list-generations
>>     guix package -p project --switch-generation=12
>>
>> whatever the sysadmin collect about the old generations.
>
> Do you expect ‘--list-generations’ to look at older revisions of your
> version-controlled ‘manifest.scm’?

I do not expect Guix to use my version-controlled ’manifest.scm’ but
rather its own internal one.

If the sysadmin of my cluster runs “guix gc --delete-generations=3m” as
root, then this GC is out of my control and unexpected by me, which
somehow breaks the rootless argument.

In other words, because “guix gc” can be run periodically (for good
reasons!), it is hard for me as a user to predict what I could lose.

Well, consider the situation:

 1. User installs foo and bar in the profile my-project in January
 2. User updates foo and bar in February
 3. User works on another project
 4. Months later, user works on my-project again

Generation #1 can be lost.  Of course it depends on the cluster policy
but, as a sysadmin, I do not tell all the users that a GC will be run,
and even if I did, I am sure that some users would forget to save the
channels.scm and manifest.scm for each generation.

That’s why something like “repack” is missing.  As a user, I should be
able to do

    guix package --switch-generation=1

whatever the sysadmin garbage-collects from the old generations and
whatever I saved using some external tools.

At GC time, enough information about the old generations should be kept
to allow “guix package --switch-generation”, --export-manifest, or
similar.

We could imagine an intermediate mode between the two current ones:

 + full generation
 + repack (only keep some text files)
 + purge (remove these few text files too)



Cheers,
simon



* Re: wishlist: “repack” generations history of profile
  2022-05-30 17:18       ` zimoun
@ 2022-06-04  7:39         ` Giovanni Biscuolo
  2022-06-05  9:45           ` zimoun
  0 siblings, 1 reply; 10+ messages in thread
From: Giovanni Biscuolo @ 2022-06-04  7:39 UTC (permalink / raw)
  To: zimoun; +Cc: Guix Devel


Hi Simon,

I know you know very well all I'm saying here, I'm just commenting for
casual readers of this thread

zimoun <zimon.toutoune@gmail.com> writes:

[...]

> The generation #1 can be lost.  For sure it depends on the cluster
> policy but, as a sysadmin, I do not tell all the users that a GC will be
> run – and even if I am doing, I am sure that some user will miss to save
> the channels.scm and manifest.scm for each generation.

I don't know how easy this feature would be to implement, and it would
certainly be a plus, but IMHO all users must understand that for their
projects (profiles) to be reproducible and versioned, the /only/ way is
to keep channels.scm and manifest.scm in a VCS (e.g. Git)

> That’s why, something like “repack” is missing.  As a user, I should be
> able to do
>
>     guix package --switch-generation=1
>
> whatever the sysadmin collects about the old generations and whatever I
> saved using some external tools.

...unless you wish to reproduce the project on another machine, or
/gnu/store is lost or corrupted for some reason

Also consider that people in teams sometimes choose to work on the same
project in different (not shared) profiles (e.g. for reproducibility
testing); in that case the generation history is not the same and the
only way to "sync" would be to exchange channels.scm and manifest.scm

Also, from a collaborative workflow point of view, keeping the two
"reproduce me" files (channels and manifest) is more efficient since
people can describe what they changed (and why) between "saved" project
generations; uncommitted channels.scm and manifest.scm files should be
considered "local testing"

IMVHO there is no easy substitute for keeping channels.scm and
manifest.scm in a VCS; the sooner users do it, the better

[...]

Happy hacking! Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures



* Re: wishlist: “repack” generations history of profile
  2022-06-04  7:39         ` Giovanni Biscuolo
@ 2022-06-05  9:45           ` zimoun
  2022-06-05 11:16             ` Giovanni Biscuolo
  0 siblings, 1 reply; 10+ messages in thread
From: zimoun @ 2022-06-05  9:45 UTC (permalink / raw)
  To: Giovanni Biscuolo; +Cc: Guix Devel

Hi,


On Sat, 04 Jun 2022 at 09:39, Giovanni Biscuolo <g@xelera.eu> wrote:

>                                  IMHO all users must understand that for
> their projects (profiles) to be reproducible and versioned the /only/
> way is to keep channels.scm and manifests.scm in a VCS (i.e. git)

I agree.  This practice is the goal but, as a matter of fact, we are
not there yet.  From what I see daily, scientists are starting to
integrate Git into their workflow, and they are also starting to
document how they generate their computational environment, but some
are slower than others. ;-)

>> That’s why, something like “repack” is missing.  As a user, I should be
>> able to do
>>
>>     guix package --switch-generation=1
>>
>> whatever the sysadmin collects about the old generations and whatever I
>> saved using some external tools.
>
> ...except you wish to reproduce the project on another machine, or
> /gnu/store is lost or corrupted for some reason

On the same machine, by the same user.  Say the project is 2 years
long: you start by installing some packages and running an analysis; 4
months later you receive other data and analyse them using an updated
version of the tools; 2 months later you want to reanalyse everything
using yet another updated version of the tools… and you get some
differences.  Therefore, you want to roll back to the first generation
and see…  Bah, you cannot, because it is many months old and the
sysadmin runs “guix gc -d 3m” to save some space.

Such a roll-back should be possible, although it means a full rebuild of
the profile.  I mean, let GC run as usual but also “repack” the
necessary information.  Then, if you wish to run on another machine, you
can always run ’export-manifest’ and ’export-channels’ from this
“repack”.

I agree that tracking channels.scm and manifest.scm is good practice.
And I am trying very hard to promote it. :-)

However, we often say: do not worry, you can always travel back in time
(implicitly assuming Guix has the information :-)).  And this assumption
is often overlooked, which leads to uncomfortable situations, not to
mention that some scientists sometimes blame the sysadmin and/or the
Guix promoter. :-)


Somehow my point is: the time scale of a project is often very different
from the time scale of GC on a machine.  Most of the time, the old
generations are useless and it is fine to remove them.  But in a few
rare cases they are necessary, and it is impossible to know in advance
which ones or how far back.  These few exceptions do not justify keeping
all the old generations; that does not make sense because of the
“environmental costs” (storage, electricity, etc.).  Today, the only way
is manual tracking, when it would be nice to have a more automatic
feature.  It is similar to ’export-manifest’ and ’export-channels’: they
are not necessary per se, because good practice is to track the files
using Git, but they are very handy in many situations. :-)


Cheers,
simon



* Re: wishlist: “repack” generations history of profile
  2022-06-05  9:45           ` zimoun
@ 2022-06-05 11:16             ` Giovanni Biscuolo
  0 siblings, 0 replies; 10+ messages in thread
From: Giovanni Biscuolo @ 2022-06-05 11:16 UTC (permalink / raw)
  To: zimoun; +Cc: Guix Devel


Hi Simon and developers,

what about a flag - e.g. --backup - and a related function for "guix
package -d generations" and "guix gc -d generations" (others?) that
saves "channels-generation-<N>.scm" and "manifest-generation-<N>.scm"
for each deleted generation?

This way we can keep the current logic for deleting generations and
their status while giving users a utility to automatically keep old
channels.scm and manifest.scm files; of course the responsibility for
storing the backups (where, how, why) is on the users' shoulders
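
A hand-rolled approximation of such a backup step, as a minimal sketch.
The function name, the destination directory and the GUIX override
variable are hypothetical; it relies only on the existing
--export-manifest and --export-channels options of ‘guix package’, and
it assumes that a generation symlink (<profile>-<N>-link) can be passed
to -p like any other profile.  The limitations of those two options
apply here as well.

```shell
#!/bin/sh
# Sketch: save manifest and channels of one generation before deleting it.
# Not an existing Guix flag; this only wraps two existing options.
# GUIX is a hypothetical override that defaults to the "guix" command.
backup_generation() {
    profile=$1   # e.g. ~/.guix-profile
    n=$2         # generation number to back up
    dest=$3      # hypothetical directory for the backups
    link="$profile-$n-link"   # assumed generation symlink name
    mkdir -p "$dest" || return 1
    "${GUIX:-guix}" package -p "$link" --export-manifest \
        > "$dest/manifest-generation-$n.scm" || return 1
    "${GUIX:-guix}" package -p "$link" --export-channels \
        > "$dest/channels-generation-$n.scm" || return 1
}
```

Running

    backup_generation ~/.guix-profile 12 ~/generations-backup

before deleting generation 12 would leave the two files behind; where
and how to keep them is still up to the user.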

Sorry I'm not able to help with such implementation, it's just an idea
for an alternative one.

zimoun <zimon.toutoune@gmail.com> writes:

[...]

> Therefore, you want to roll-back to the first generation and see… Bah
> you cannot because it is many months old and the sysadmin runs “guix gc
> -d 3m” to save some space.

I'm a sysadmin, please understand the thankless job of administering a
machine in a "shared servers context" in which users have the power to
administer their software profiles... except they are not willing to do
it properly. Each and every user is /also/ a little sysadmin ;-)

"guix gc -d 3m" by sysadmins for their users is "hard delegation"

"guix package -d <generations>" by users and "guix gc -C" by sysadmins
is "soft delegation" and more fair IMHO

[...]

> However, we are often saying: do not worry, you can always travel back
> in time (implicitly assuming Guix have the information :-)).

If this is the case, IMHO we should patch the manual: what part of the
manual do you have in mind?

> And this assumption is often missed which leads to uncomfortable
> situations, not to say maybe some scientists are sometime blaming
> sysadmin and/or Guix promoter. :-)

I know: when a system does not work as expected it is always someone
else's responsibility, usually the sysadmins' :-O

> Somehow my point is: The time scale of a project is often very different
> to the time scale of GC on a machine.  Most of the time, the old

Somehow my point is: sysadmins and users should peacefully agree on a
Guix package and profile management policy, documenting it for the
organization

[...]

Happy Guixing! :-D  Gio'

-- 
Giovanni Biscuolo

Xelera IT Infrastructures



Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git
