unofficial mirror of bug-guix@gnu.org 
* bug#45676: Store references inside compressed data
@ 2021-01-05 14:36 Miguel Ángel Arruga Vivas
  2021-01-05 20:22 ` Leo Famulari
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Miguel Ángel Arruga Vivas @ 2021-01-05 14:36 UTC (permalink / raw)
  To: 45676

There are several binary formats that allow compression of the
executable image, or some of its data, which is decompressed at runtime:

  - Kernel images.
  - Compressed libraries: e.g. Smalltalk modules.
  - Compressed executable or data files: e.g. library.el.gz.

These aren't taken into account by the grafting process, which may lead
to issues when store paths are located inside that kind of file.
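
To make the failure mode concrete, here is a minimal Python sketch.  It is
not Guix's actual reference scanner; the regular expression, the scan_raw
helper and the store item are all made up for illustration.  It only shows
that a scan over raw bytes finds a store file name in plain text but not in
the gzipped copy of the same text.

```python
import gzip
import re

# Rough approximation of a store file name (an assumption, not Guix's rule):
# /gnu/store/<32-character hash>-<name>.
STORE_RE = re.compile(rb"/gnu/store/[0-9a-z]{32}-[-+._0-9A-Za-z]+")


def scan_raw(data: bytes) -> set:
    """Return the store file names found by scanning the bytes as-is."""
    return set(STORE_RE.findall(data))


# Hypothetical store item, invented for this example.
FAKE_ITEM = b"/gnu/store/" + b"a" * 32 + b"-hello-2.10"
source = b'(defvar hello-program "' + FAKE_ITEM + b'/bin/hello")\n'

plain_refs = scan_raw(source)                   # the reference is visible
gzipped_refs = scan_raw(gzip.compress(source))  # the reference is hidden
```

A tool that decides what to keep or what to rewrite from the raw bytes alone
would treat the gzipped file as having no references at all.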





* bug#45676: Store references inside compressed data
  2021-01-05 14:36 bug#45676: Store references inside compressed data Miguel Ángel Arruga Vivas
@ 2021-01-05 20:22 ` Leo Famulari
  2021-01-05 20:22 ` Leo Famulari
  2021-01-05 22:33 ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
  2 siblings, 0 replies; 11+ messages in thread
From: Leo Famulari @ 2021-01-05 20:22 UTC (permalink / raw)
  To: Miguel Ángel Arruga Vivas; +Cc: 45676

On Tue, Jan 05, 2021 at 03:36:07PM +0100, Miguel Ángel Arruga Vivas wrote:
> There are several binary formats that allow compression of the
> executable image, or some of its data, which is decompressed at runtime:
> 
>   - Kernel images.
>   - Compressed libraries: e.g. Smalltalk modules.
>   - Compressed executable or data files: e.g. library.el.gz.
> 
> These aren't taken into account by the grafting process, which may lead
> to issues when store paths are located inside that kind of file.

It's a serious problem, and not just because of grafting. These obscured
references can cause things to be garbage collected inappropriately.

Here is an older case of the same problem:

https://bugs.gnu.org/24703

It was resolved by patching GCC.





* bug#45676: Store references inside compressed data
  2021-01-05 14:36 bug#45676: Store references inside compressed data Miguel Ángel Arruga Vivas
  2021-01-05 20:22 ` Leo Famulari
@ 2021-01-05 20:22 ` Leo Famulari
  2021-01-06 11:35   ` Ludovic Courtès
  2021-01-05 22:33 ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
  2 siblings, 1 reply; 11+ messages in thread
From: Leo Famulari @ 2021-01-05 20:22 UTC (permalink / raw)
  To: Miguel Ángel Arruga Vivas; +Cc: 45676

On Tue, Jan 05, 2021 at 03:36:07PM +0100, Miguel Ángel Arruga Vivas wrote:
> There are several binary formats that allow compression of the
> executable image, or some of its data, which is decompressed at runtime:
> 
>   - Kernel images.
>   - Compressed libraries: e.g. Smalltalk modules.
>   - Compressed executable or data files: e.g. library.el.gz.
> 
> These aren't taken into account by the grafting process, which may lead
> to issues when store paths are located inside that kind of file.

If you have specific instances of this type of bug, please report them.





* bug#45676: Store references inside compressed data
  2021-01-05 14:36 bug#45676: Store references inside compressed data Miguel Ángel Arruga Vivas
  2021-01-05 20:22 ` Leo Famulari
  2021-01-05 20:22 ` Leo Famulari
@ 2021-01-05 22:33 ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
  2021-01-06  8:54   ` Leo Prikler
  2021-01-06 18:40   ` Miguel Ángel Arruga Vivas
  2 siblings, 2 replies; 11+ messages in thread
From: Tobias Geerinckx-Rice via Bug reports for GNU Guix @ 2021-01-05 22:33 UTC (permalink / raw)
  To: Miguel Ángel Arruga Vivas; +Cc: 45676


Hi!

Miguel Ángel Arruga Vivas wrote:
> These aren't taken into account by the grafting process, which 
> may lead
> to issues when store paths are located inside that kind of
> file.

It's true.  It's a known trade-off of an otherwise 
almost-zero-effort yet fast reference scanner.  I don't think it's 
a bug per se, but it is something of which to be aware.  I also 
think this trade-off is worth it.

Luckily, this case is easier to fix than the infamous 
<http://issues.guix.gnu.org/24703>, because the right solution is 
simple:

>   - Compressed libraries: e.g. Smalltalk modules.
>   - Compressed executable or data files: e.g. library.el.gz.

Let's stop installing compressed executables & data files.  We 
already avoid compressed .jars and other renamed zip files.  It 
ain't right.

It's not 1998, my hard drive isn't 1.1GB, and I didn't just 
reinstall Slackware because I ‘accidentally’ gzexe'd gzip.

Gzipping a tiny handful of Lisp or Smalltalk files is pointless 
when zstd {,de}compresses my entire 500GB SSD better and faster, 
at the file system level where it now squarely belongs.  Without 
breaking Guix.

Kind regards,

T G-R



* bug#45676: Store references inside compressed data
  2021-01-05 22:33 ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
@ 2021-01-06  8:54   ` Leo Prikler
  2021-01-14 21:31     ` Ludovic Courtès
  2021-01-06 18:40   ` Miguel Ángel Arruga Vivas
  1 sibling, 1 reply; 11+ messages in thread
From: Leo Prikler @ 2021-01-06  8:54 UTC (permalink / raw)
  To: Tobias Geerinckx-Rice, Miguel Ángel Arruga Vivas; +Cc: 45676


Hi!
On Tuesday, 05.01.2021 at 23:33 +0100, Tobias Geerinckx-Rice wrote:
> Let's stop installing compressed executables & data files.  We 
> already avoid compressed .jars and other renamed zip files.  It 
> ain't right.
Would this be strictly necessary even if the same references are kept
through other files, e.g. uncompressed binaries?
I'll attach a patch that fixes Emacs, just in case.

Regards, Leo

[-- Attachment #2: 0001-gnu-emacs-Don-t-install-compressed-archives.patch --]
[-- Type: text/x-patch, Size: 1475 bytes --]

From 57c23bf6ecac79c397cb49ff251176ec3a7b1cf5 Mon Sep 17 00:00:00 2001
From: Leo Prikler <leo.prikler@student.tugraz.at>
Date: Wed, 6 Jan 2021 09:24:07 +0100
Subject: [PATCH] gnu: emacs: Don't install compressed archives.

See <http://issues.guix.gnu.org/45676#3>.

* gnu/packages/emacs.scm (emacs)[#:configure-flags]:
Add --without-compress-install.
(emacs-minimal)[#:configure-flags]: Likewise.
---
 gnu/packages/emacs.scm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gnu/packages/emacs.scm b/gnu/packages/emacs.scm
index ca14584ada..aa636b8c9b 100644
--- a/gnu/packages/emacs.scm
+++ b/gnu/packages/emacs.scm
@@ -124,6 +124,7 @@
      `(#:tests? #f                      ; no check target
        #:configure-flags (list "--with-modules"
                                "--with-cairo"
+                               "--without-compress-install"
                                "--disable-build-details")
        #:phases
        (modify-phases %standard-phases
@@ -355,7 +356,8 @@ also enabled and works without glitches even on X server."))))
     (arguments
      (substitute-keyword-arguments (package-arguments emacs)
        ((#:configure-flags flags ''())
-        `(list "--with-gnutls=no" "--disable-build-details"))
+        `(list "--with-gnutls=no" "--disable-build-details"
+               "--without-compress-install"))
        ((#:phases phases)
         `(modify-phases ,phases
            (delete 'restore-emacs-pdmp)
-- 
2.30.0



* bug#45676: Store references inside compressed data
  2021-01-05 20:22 ` Leo Famulari
@ 2021-01-06 11:35   ` Ludovic Courtès
  2021-01-06 16:57     ` Miguel Ángel Arruga Vivas
  0 siblings, 1 reply; 11+ messages in thread
From: Ludovic Courtès @ 2021-01-06 11:35 UTC (permalink / raw)
  To: Leo Famulari; +Cc: 45676, Miguel Ángel Arruga Vivas

Hi,

Leo Famulari <leo@famulari.name> writes:

> On Tue, Jan 05, 2021 at 03:36:07PM +0100, Miguel Ángel Arruga Vivas wrote:
>> There are several binary formats that allow compression of the
>> executable image, or some of its data, which is decompressed at runtime:
>> 
>>   - Kernel images.
>>   - Compressed libraries: e.g. Smalltalk modules.
>>   - Compressed executable or data files: e.g. library.el.gz.
>> 
>> These aren't taken into account by the grafting process, which may lead
>> to issues when store paths are located inside that kind of file.
>
> If you have specific instances of this type of bug, please report them.

Agreed.  The general issue is “well known” as we say, but what I think
we need to do is look for specific instances and address them.

Ludo’.





* bug#45676: Store references inside compressed data
  2021-01-06 11:35   ` Ludovic Courtès
@ 2021-01-06 16:57     ` Miguel Ángel Arruga Vivas
  2021-01-07 11:05       ` Ludovic Courtès
  0 siblings, 1 reply; 11+ messages in thread
From: Miguel Ángel Arruga Vivas @ 2021-01-06 16:57 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 45676

Hi Ludo and Leo,

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Leo Famulari <leo@famulari.name> writes:
>
>> On Tue, Jan 05, 2021 at 03:36:07PM +0100, Miguel Ángel Arruga Vivas wrote:
>>> There are several binary formats that allow compression of the
>>> executable image, or some of its data, which is decompressed at runtime:
>>> 
>>>   - Kernel images.
>>>   - Compressed libraries: e.g. Smalltalk modules.
>>>   - Compressed executable or data files: e.g. library.el.gz.
>>> 
>>> These aren't taken into account by the grafting process, which may lead
>>> to issues when store paths are located inside that kind of file.
>>
>> If you have specific instances of this type of bug, please report them.
>
> Agreed.  The general issue is “well known” as we say, but what I think
> we need to do is look for specific instances and address them.

It can be tagged notabug if you consider it so.  I've tagged it as
wishlist (I should have done that before) for that reason (it's "well
known"), but I haven't found any specific instance yet.  OTOH, I think
it might be closely related to #33848, as both issues could be solved
by extending the dumpPath code path---or the equivalent in the Scheme
implementation, as pointed out there.
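
As a rough illustration of what such an extension would have to do, here is
a Python sketch of a scan that also looks inside gzip data.  I don't claim
this is how dumpPath or its Scheme counterpart would be extended; the scan
function, the regular expression and the sample store item are assumptions
made up for the example.

```python
import gzip
import re
import zlib

# Rough approximation of a store file name (an assumption, not Guix's rule).
STORE_RE = re.compile(rb"/gnu/store/[0-9a-z]{32}-[-+._0-9A-Za-z]+")
GZIP_MAGIC = b"\x1f\x8b"


def scan(data: bytes) -> set:
    """Scan raw bytes, then recurse into the payload if it is gzip data."""
    refs = set(STORE_RE.findall(data))
    if data.startswith(GZIP_MAGIC):
        try:
            refs |= scan(gzip.decompress(data))
        except (OSError, EOFError, zlib.error):
            pass  # looked like gzip but was not; keep the raw result
    return refs


# Hypothetical store item, invented for this example.
FAKE_ITEM = b"/gnu/store/" + b"a" * 32 + b"-hello-2.10"
source = b'(defvar hello-program "' + FAKE_ITEM + b'/bin/hello")\n'

found = scan(gzip.compress(source))
not_gzip = scan(GZIP_MAGIC + b"garbage")
```

This only covers finding references.  Grafting would additionally need to
decompress, rewrite and recompress each such file, and every further format
(xz, zstd, compressed kernel images) would need its own handling, which is
where the cost of this approach shows up.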

Happy hacking!
Miguel





* bug#45676: Store references inside compressed data
  2021-01-05 22:33 ` Tobias Geerinckx-Rice via Bug reports for GNU Guix
  2021-01-06  8:54   ` Leo Prikler
@ 2021-01-06 18:40   ` Miguel Ángel Arruga Vivas
  1 sibling, 0 replies; 11+ messages in thread
From: Miguel Ángel Arruga Vivas @ 2021-01-06 18:40 UTC (permalink / raw)
  To: Tobias Geerinckx-Rice; +Cc: 45676

Hi!

Tobias Geerinckx-Rice <me@tobias.gr> writes:

> It's true.  It's a known trade-off of an otherwise almost-zero-effort
> yet fast reference scanner.  I don't think it's a bug per se, but it
> is something of which to be aware.
>
> Let's stop installing compressed executables & data files.  We already
> avoid compressed .jars and other renamed zip files.

This is the current trade-off between build time and closure size for
executable code, but it isn't the current status regarding data files.

> Gzipping a tiny handful of Lisp or Smalltalk files is pointless when
> zstd {,de}compresses my entire 500GB SSD better and faster, at the
> file system level where it now squarely belongs.

Not every system has a file system with compression, nor do most of us
mortals have an SSD to test that. ;-)

> Without breaking Guix.

Software bugs are related to the number of lines, and this probably
would end up adding more, so I get that idea, hehe. :-P

With your proposal, closures wouldn't benefit from the "standard tricks"
that package maintainers use to reduce their footprint on uncompressed
file systems.  An option to remove that compression seems best for
handling it at the file system level---perhaps wrappers that make the
compression tools always use -0 could do most of the trick---but I'd
still like the option of paying for the storage savings at build/graft
time.  Of course, this is still only a wish.
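
A small Python sketch of the -0 idea, under stated assumptions: it uses
Python's gzip module rather than a wrapper around the gzip program, and the
two store items are made up.  At level 0 the payload is stored verbatim, so
a raw scan can see the file name.  Note, however, that a same-length rewrite
of the kind grafting performs would leave the gzip CRC-32 trailer stale.

```python
import gzip

# Hypothetical store items of equal length, invented for this example.
FAKE_ITEM = b"/gnu/store/" + b"a" * 32 + b"-hello-2.10"
REPLACEMENT = b"/gnu/store/" + b"b" + b"a" * 31 + b"-hello-2.10"
source = b'(defvar hello-program "' + FAKE_ITEM + b'/bin/hello")\n'

# Level 0 emits "stored" deflate blocks: the input bytes appear unchanged.
stored = gzip.compress(source, compresslevel=0)
visible = FAKE_ITEM in stored

# Graft-style rewrite: swap one file name for another of the same length.
grafted = stored.replace(FAKE_ITEM, REPLACEMENT)

try:
    gzip.decompress(grafted)
    checksum_ok = True
except OSError:  # gzip.BadGzipFile is a subclass of OSError
    checksum_ok = False
```

So, if this sketch is representative, -0 wrappers would help the scanner
see the reference, but grafting would still have to fix up the checksum or
the file would fail to decompress.  For larger inputs the stored data is
split into blocks, so a file name could also straddle a block boundary.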

Happy hacking!
Miguel





* bug#45676: Store references inside compressed data
  2021-01-06 16:57     ` Miguel Ángel Arruga Vivas
@ 2021-01-07 11:05       ` Ludovic Courtès
  0 siblings, 0 replies; 11+ messages in thread
From: Ludovic Courtès @ 2021-01-07 11:05 UTC (permalink / raw)
  To: Miguel Ángel Arruga Vivas; +Cc: 45676

Howdy,

Miguel Ángel Arruga Vivas <rosen644835@gmail.com> writes:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Hi,
>>
>> Leo Famulari <leo@famulari.name> writes:
>>
>>> On Tue, Jan 05, 2021 at 03:36:07PM +0100, Miguel Ángel Arruga Vivas wrote:
>>>> There are several binary formats that allow compression of the
>>>> executable image, or some of its data, which is decompressed at runtime:
>>>> 
>>>>   - Kernel images.
>>>>   - Compressed libraries: e.g. Smalltalk modules.
>>>>   - Compressed executable or data files: e.g. library.el.gz.
>>>> 
>>>> These aren't taken into account by the grafting process, which may lead
>>>> to issues when store paths are located inside that kind of file.
>>>
>>> If you have specific instances of this type of bug, please report them.
>>
>> Agreed.  The general issue is “well known” as we say, but what I think
>> we need to do is look for specific instances and address them.
>
> It can be tagged notabug if you consider it so.  I've tagged it as
> wishlist (I should have done that before) for that reason (it's "well
> known"), but I haven't found any specific instance yet.  OTOH, I think
> it might be closely related to #33848, as both issues could be solved
> by extending the dumpPath code path---or the equivalent in the Scheme
> implementation, as pointed out there.

Yes, though I’d prefer simple workarounds if possible—after all, we’ve
lived with it since the beginning and there’s only ever been a handful
of instances of that problem (one of them was really tricky, see
‘gcc-strmov-store-file-names.patch’…).

Ludo’.





* bug#45676: Store references inside compressed data
  2021-01-06  8:54   ` Leo Prikler
@ 2021-01-14 21:31     ` Ludovic Courtès
  2021-01-14 22:24       ` Leo Prikler
  0 siblings, 1 reply; 11+ messages in thread
From: Ludovic Courtès @ 2021-01-14 21:31 UTC (permalink / raw)
  To: Leo Prikler; +Cc: 45676, Miguel Ángel Arruga Vivas

Hi Leo,

Leo Prikler <leo.prikler@student.tugraz.at> writes:

> From 57c23bf6ecac79c397cb49ff251176ec3a7b1cf5 Mon Sep 17 00:00:00 2001
> From: Leo Prikler <leo.prikler@student.tugraz.at>
> Date: Wed, 6 Jan 2021 09:24:07 +0100
> Subject: [PATCH] gnu: emacs: Don't install compressed archives.
>
> See <http://issues.guix.gnu.org/45676#3>.

Perhaps make it a comment next to the option.

> * gnu/packages/emacs.scm (emacs)[#:configure-flags]:
> Add --without-compress-install.
> (emacs-minimal)[#:configure-flags]: Likewise.

[...]

> +                               "--without-compress-install"

Does that disable .el file compression altogether for Emacs’ own files?

If so, isn’t it too much?  Do these files currently contain store file
names?

(I know EMMS .el files for instance are full of store file names, so
that one should definitely not be gzipped, but Emacs itself may be
fine?)
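
One way to answer that question empirically is to decompress every .gz
file in a package's output and look for the store prefix.  Here is a
minimal Python sketch; the helper name is made up, and the directory to
pass in (for instance the output of a package build) is left to the
reader.

```python
import gzip
import os
import zlib

STORE_PREFIX = b"/gnu/store/"


def gzipped_files_with_store_names(root):
    """Return the .gz files under ROOT whose contents mention the store."""
    hits = []
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".gz"):
                continue
            path = os.path.join(directory, name)
            try:
                with gzip.open(path, "rb") as stream:
                    if STORE_PREFIX in stream.read():
                        hits.append(path)
            except (OSError, EOFError, zlib.error):
                continue  # unreadable or not really gzip; skip it
    return sorted(hits)
```

An empty result for a given output would suggest that its compressed files
are harmless as far as references go; a non-empty one lists the files worth
installing uncompressed.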

Ludo’.





* bug#45676: Store references inside compressed data
  2021-01-14 21:31     ` Ludovic Courtès
@ 2021-01-14 22:24       ` Leo Prikler
  0 siblings, 0 replies; 11+ messages in thread
From: Leo Prikler @ 2021-01-14 22:24 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 45676, Miguel Ángel Arruga Vivas

Hi Ludo,

On Thursday, 14.01.2021 at 22:31 +0100, Ludovic Courtès wrote:
> Hi Leo,
> 
> Leo Prikler <leo.prikler@student.tugraz.at> writes:
> 
> > From 57c23bf6ecac79c397cb49ff251176ec3a7b1cf5 Mon Sep 17 00:00:00 2001
> > From: Leo Prikler <leo.prikler@student.tugraz.at>
> > Date: Wed, 6 Jan 2021 09:24:07 +0100
> > Subject: [PATCH] gnu: emacs: Don't install compressed archives.
> > 
> > See <http://issues.guix.gnu.org/45676#3>.
> 
> Perhaps make it a comment next to the option.
I'll keep that in mind, but I wasn't going to commit this unless it is
absolutely needed.

> > * gnu/packages/emacs.scm (emacs)[#:configure-flags]:
> > Add --without-compress-install.
> > (emacs-minimal)[#:configure-flags]: Likewise.
> 
> [...]
> 
> > +                               "--without-compress-install"
> 
> Does that disable .el file compression altogether for Emacs’ own
> files?
> 
> If so, isn’t it too much?  Do these files currently contain store file
> names?
> 
> (I know EMMS .el files for instance are full of store file names, so
> that one should definitely not be gzipped, but Emacs itself may be
> fine?)
As far as I know, this is an all-or-nothing deal.  If I'm not mistaken,
though, all those references should still exist in the compiled (and
uncompressed) .elc files, so it makes little difference.
Perhaps time stamps could be added during compression, but I think our
Emacs reproducibility issues lie elsewhere as well.

All in all, I don't think there's a technical reason to do this (yet),
merely the somewhat purist stance of "no compressed source files".

Regards,
Leo







Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git
