unofficial mirror of guix-devel@gnu.org 
From: Danny Milosavljevic <dannym@scratchpost.org>
To: Leo Prikler <leo.prikler@student.tugraz.at>
Cc: guix-devel@gnu.org
Subject: Re: GNOME in Guix
Date: Tue, 3 Nov 2020 20:26:16 +0100
Message-ID: <20201103202416.628375fb@scratchpost.org>
In-Reply-To: <898d3e29025888a0d218ddf8b468a676ece2490f.camel@student.tugraz.at>

Hi Leo,

On Tue, 03 Nov 2020 14:41:31 +0100
Leo Prikler <leo.prikler@student.tugraz.at> wrote:

> >   (note: "-l guix.scm")
> > 
> > seems to have fixed most of the problems.
> > (There is no automated diagnostic--so who knows whether it did fix
> > them for real?)  
> What diagnostic would you want here?

Whether there exist packages with the same name but different derivations in
the environment (especially if mixing hidden and non-hidden packages with
the same name).

Also, guix probably has some way to print the list of files that make up an
environment.  Using it, I could list which libgobjects are in the environment
without having to use an LD_PRELOAD to find out which they are...

Does it exist?  If so, how to use it?

> Is that really the problem at hand? I don't see glib explicitly
> mentioned here, so you should not build glib-with-documentation here. 
> Adding glib to one of your --ad-hoc "chains" should yield a different
> profile.

It's not that direct--but yes, something like that is indeed the problem at hand.

If you want more detail, the guile-gi bug report is pretty detailed on it.

Better still, I put the "historical" section at the end of my previous post
so that the situation can be reproduced later and we can find out what exactly
happened here.

Following that you will get exactly the same situation--and using (only) the
LD_PRELOAD discussed on guile-gi, you will find the culprit, and you will see
that it is two different libgobjects being loaded.
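
For reference, on Linux the files mapped into a process are also listed in
/proc/<pid>/maps, so a check for duplicate libgobjects can be sketched without
LD_PRELOAD.  What follows is only a rough sketch in Python: the function names
are made up for illustration, it is not the LD_PRELOAD approach discussed on
guile-gi, and it only sees what is mapped at the moment it runs.

```python
import os


def loaded_copies(maps_text, soname_prefix="libgobject-2.0.so"):
    """Return the distinct mapped file paths whose basename starts with
    soname_prefix, given the text of a /proc/<pid>/maps file."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if os.path.basename(path).startswith(soname_prefix):
            paths.add(path)
    return sorted(paths)


def loaded_copies_of_process(pid="self", soname_prefix="libgobject-2.0.so"):
    """Same check, reading the maps file of a live process (Linux only)."""
    with open(f"/proc/{pid}/maps") as maps:
        return loaded_copies(maps.read(), soname_prefix)
```

Getting more than one path back for the pid of the running guile process
would confirm that two different libgobjects are loaded.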

We are now in a better situation because I do all my development using version
control--even throwaway stuff.  guile-gi devs had this happen before--but their
reproducers got lost to time.  Now we have reproducers that cause it every
single time (both the historical version of guix-gui mentioned and also
file test/insanity.scm in guile-gi by now).

> What is the meaning of "doesn't like" here?  Is it explicitly
> discouraged, poorly supported, work in progress, ...?

It means it does not work, and given what they are doing in the source code
GNOME evidently did not think about this case at all.

And at runtime using guix environment, this causes problems pretty much
always, without anything special having to be done to trigger it.

You can read a lot of details on the bug report on the guile-gi website.

> > This was both my fault, and guix's fault for being VERY obtuse.  

> I don't think looking for someone to blame is particularly useful.

We are engineers--so finding out what is to blame for a problem and
fixing that place is indeed what we do all day (that includes deciding
which projects should have the change, and which shouldn't.  If the
answer is "none" the situation will not improve.  So that's right out).

If you mean blaming people, I don't do that--that would be silly.

"Blaming" programs I will do all day every day (including mine).  Problems like
this need to be seen, otherwise we guix devs don't know that it's like this.

I know it from my own stuff I put into Guix master.  *I* did get those
to work by configuring stuff.  But is it usable for a regular user?

For this guix environment stuff, the answer is a definite no.  (there's
another thread on the guix-devel list about improving that--so it's not
like I'm the only one--it's just a bad interface.  It happens.)

>Regarding the safety aspect, I don't think that's what they're
> going for.  They might be thinking, that "--pure --no-grafts" is close
> enough to what you'd get inside `guix build`, so that you don't have to
> use `guix build` for debugging.

If grafts are broken (which seems likely according to guile-gi devs), grafting
should be fixed now, before a security problem happens and we miss it (...but
state that we fixed it).

I know that guile-gi argues for doing development with --no-grafts, but

(1) I don't have any readable characters when I do that
(2) If I did, it would still not help the end user because it is not the right fix

> > > Do you still encounter Guile-GI#96 after packaging?  
> > 
> > I've not finished packaging it yet (I tried--but that's pretty
> > difficult,
> > too!  Help wanted--see README and guix.scm in the guix-gui repo).  

> I'll see what I can do to help, but the only thing I can see are
> "FIXME" strings in guix.scm.  The README even suggests to run `guix
> build`.

Thanks!

If you do the guix build and then run the final script, you will see that
it does not start up because it cannot find its guile modules.  I don't know
how to deploy large guile programs, so I'd appreciate help with that.

(I can totally find a hacky way to make it work--but I'd rather use the
canonical way to find external guile modules that surely exists,
especially in guix (... which is written in guile and is a large program
with external guile modules))

> One could disagree about that based on the workflow you've posted here
> ;)

I can assure you this is what anyone would try, until they get burned like
this and then do one or more of the following:

(1) gives up and deletes guix
(2) posts a bad review to the media
(3) posts bug reports detailing what happened
(4) uses another way than "guix environment" and it finally works (but does
that really do the right thing?  Who knows...)

I prefer people do (3) and (4).  Ideally, getting information for (3) would be
automatic, or at least more automatic than "<GValue> is not compatible with
<GValue>".  Oh yeah?  It's not compatible?  Good to know ;)

Also, guile-gi devs always used "--pure -l guix.scm" and it still happens to
them!

> Jokes aside, the Guile-GI devs seem to be hitting a similar issue, but
> as they write, it appears to mainly affect their own development
> environments (i.e. exactly the thing that hurts you the most as a
> developer).  There is potentially a discrepancy between `guix build`
> and `guix environment`.

Apparently so.

Note that with the current state of guix (no gui...) we mainly can target only
developers--so if the story for developers is bad, well, that's bad for guix.

> Having a look at Guile-GI specifically, ldd `guix build guile-gi` has
> the following:
> libgobject-2.0.so.0 => /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-
> glib-2.62.6/lib/libgobject-2.0.so.0
> 
> You should recognize that hash by now ;)

Definitely.  With how often in different contexts I saw gobject hashes by now,
I'll probably dream about those ;)
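
Output like that can be turned into a map from soname to store item, which
makes it easier to compare what a built package links against with what ends up
loaded at runtime.  A minimal sketch in Python, assuming the usual
"soname => path (address)" line shape of ldd; the function name is made up for
illustration.

```python
import re

# "libfoo.so.0 => /some/path/libfoo.so.0 (0x...)"
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(/\S+)")
# Leading "/gnu/store/<hash>-<name>" component of a path.
_STORE_ITEM = re.compile(r"^(/gnu/store/[^/]+)")


def store_items_by_soname(ldd_output):
    """Map each resolved soname in ldd output to the store item it comes
    from (or to the full path if it is not in /gnu/store)."""
    result = {}
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if not match:
            continue  # e.g. linux-vdso, the loader line, or "not found"
        soname, path = match.groups()
        item = _STORE_ITEM.match(path)
        result[soname] = item.group(1) if item else path
    return result
```

Feeding it the text printed by ldd for the library in `guix build guile-gi`
gives the store item that libgobject-2.0.so.0 is expected to come from.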

> You can eliminate a huge array of those issues by using pure
> environments.

As I said, "guix environment --pure" did not fix all (or even many) of these
problems.  I added "--pure" very soon--and WITH the "--pure" it still loads
two different libgobjects.  Using "-l" helps tremendously.  "--pure"? Nope.

> I'm not sure whether they must specifically enforce one and only one
> type registry or whether those can be made to coexist.  You should
> probably talk to gobject-introspection about this.

Yeah.  But I will see what guile-gi devs have to say about it first.

> It would also make it extremely easy to shoot yourself in the foot,
> thereby costing you even more time.  It's like shooting flies with a
> cannon.  Sure, you might kill some bugs, but you can just as well tear
> down castle walls with that.

I don't see what is even controversial about it.  The same duplicate symbol
checking *always* happens with regular linking.  So if you now dlopen stuff
that usually is regularly linked into C programs, the developers of those C
programs already ensured that the libraries don't have duplicate symbols
(in a certain sense), otherwise the libraries wouldn't link with the C
programs in the first place.

I'm only suggesting that dlopen do the same thing in diagnostics mode.

Also, I'm not suggesting to always enable the feature--but it should *exist*.
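
To make the idea concrete, here is a rough sketch in Python of what such a
diagnostic would compute: given the defined dynamic symbols of each library
(for example from the output of `nm -D --defined-only`), report every symbol
that more than one library defines.  This only illustrates the check; it is not
an existing dlopen feature, the function names are made up, and weak symbols
and symbol versions are not treated specially.

```python
from collections import defaultdict


def defined_symbols(nm_output):
    """Parse "<address> <type> <name>" lines into a set of symbol names."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3:
            symbols.add(fields[2])
    return symbols


def duplicate_symbols(symbols_by_library):
    """Return {symbol: [libraries...]} for symbols defined more than once."""
    owners = defaultdict(set)
    for library, symbols in symbols_by_library.items():
        for symbol in symbols:
            owners[symbol].add(library)
    return {
        symbol: sorted(libraries)
        for symbol, libraries in owners.items()
        if len(libraries) > 1
    }
```

Two copies of libgobject loaded side by side would show up here with nearly
their entire symbol list duplicated.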

But you are right, an easier fix would be guix environment just warning about
packages that have the same name but are different packages in the same
environment.  That way you wouldn't have to care about linker stuff at all.
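
A warning like that could be sketched as follows (Python, illustration only;
this is not how guix environment works today, and the function name is made
up).  It assumes store file names of the form /gnu/store/<hash>-<label> and
flags every label, such as glib-2.62.6, that appears under more than one hash.
Comparing by package name alone would additionally need a rule for splitting
off the version, which is left out here.

```python
import re
from collections import defaultdict

# /gnu/store/<32-character hash>-<label>, label being name-version[-output]
_STORE_ITEM = re.compile(r"^/gnu/store/([0-9a-z]{32})-([^/]+)")


def conflicting_store_items(store_paths):
    """Return {label: [hashes...]} for labels present under several hashes."""
    hashes_by_label = defaultdict(set)
    for path in store_paths:
        match = _STORE_ITEM.match(path)
        if match:
            store_hash, label = match.groups()
            hashes_by_label[label].add(store_hash)
    return {
        label: sorted(hashes)
        for label, hashes in hashes_by_label.items()
        if len(hashes) > 1
    }
```

The input can be any list of paths from the environment, for instance the
library paths found in a process's memory map.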

Bad debugging features are the Achilles' heel of guix--and it's difficult
for me to justify spending time on guix when every time there's something
to debug it is so bad.

Examples:

(1) There is no debug info by default.  Ludo fixed that: one can now use
package transformations to build certain packages with debug info without
having to recompile a guix checkout.

Nice.  (previously, you had to check out guix, manually edit packages (first
find out how), then compile it (which is not easy) and then use ./pre-inst-env
to rebuild whatever you wanted to debug in the first place so you get line
number information when one of the programs segfaults and you are in gdb.
Think joe regular dev would do that when collecting info for reporting bugs?
I don't)

(2) guix guile backtraces always just say eval.scm and not the actual
source file or line something happened in.  That is still the case.
I basically use divide and conquer "write"s in order to find out where in
my program the actual line it is talking about is.

(3) guix environment and the actual guix build daemon set up slightly
different environments.  That way, it can happen that a bug that
happens inside the guix build container doesn't happen when you search
for it using guix environment.
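
For (3), one way to see what differs is to dump the environment variables in
both places (for example with `env`) and compare the two dumps.  A small
sketch in Python, assuming one NAME=value pair per line (values containing
newlines are not handled); the function names are made up for illustration.

```python
def parse_env_dump(text):
    """Parse NAME=value lines into a dict."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            env[name] = value
    return env


def environment_differences(env_a, env_b):
    """Return {name: (value_in_a, value_in_b)} for variables that differ.
    A value of None means the variable is unset on that side."""
    differences = {}
    for name in sorted(set(env_a) | set(env_b)):
        value_a, value_b = env_a.get(name), env_b.get(name)
        if value_a != value_b:
            differences[name] = (value_a, value_b)
    return differences
```

This only covers environment variables; differences in the file system view
or in the container setup would need a separate comparison.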

> I'm not sure how sane loading another version of GObject is in other GI
> implementations.

Not sane at all.

Loading two libraries with the same basename into one process is not supposed
to be done.

I mean it's nice if someone gets it to work anyway--but I suspect that that
won't fly with gobject's runtime type checking.

> I don't think I can help you with that here, but if you have something
> usable for debugging, like a stack trace, it might be worth to ask
> around Guile-GI (assuming it is not just a repetition of their #96).

The problem is that the software stack does not fail correctly,
and somewhere the software stack needs to be adapted so that it fails
instead of dragging the error state all the way through the entire profile
without it being detected.  There are only usable stacktraces when you
go out of your way using LD_PRELOAD.  That is not a tenable approach.

I want to be clear that guile-gi is not at fault, by which I mean that
this part of the software stack should not be changed.  If you did,
gobject-introspection would still do it "wrong".  If you changed that, the
next GNOME library over there would STILL do it "wrong".  That is not the
way to fix it.

Also, what is "fixing it"?  Is it preventing duplicate loading for good
or is it changing 29 libraries upstream to support duplicate loading?
Or is it warning if it happens?  All of those, at different points in time?

With this, I hope to at least have raised awareness and started the
discussion so progress can be made.

(For example, it does not make much sense to bother guile-gi devs with a
problem that guile-gi did not cause.  gobject-introspection has the exact same
design (and is a dependency of guile-gi), so everyone who uses
gobject-introspection inherits that limitation.  The guile-gi people
are just nice enough to look at it anyway--even though it's not something
they can reasonably fix unilaterally)
