From: Leo Prikler <leo.prikler@student.tugraz.at>
To: Danny Milosavljevic <dannym@scratchpost.org>
Cc: guix-devel@gnu.org
Subject: Re: GNOME in Guix
Date: Wed, 04 Nov 2020 00:20:30 +0100
Message-ID: <b01907b7a3395fabf7bae1818b75484075e17b62.camel@student.tugraz.at>
In-Reply-To: <20201103202416.628375fb@scratchpost.org>

Hi Danny,

On Tuesday, 03.11.2020 at 20:26 +0100, Danny Milosavljevic wrote:
> Hi Leo,
> 
> On Tue, 03 Nov 2020 14:41:31 +0100
> Leo Prikler <leo.prikler@student.tugraz.at> wrote:
> 
> > >   (note: "-l guix.scm")
> > > 
> > > seems to have fixed most of the problems.
> > > (There is no automated diagnostic--so who knows whether it did
> > > fix
> > > them for real?)  
> > What diagnostic would you want here?
> 
> Whether there exist packages with the same name but different
> derivations in
> the environment (especially if mixing hidden and non-hidden packages
> with
> the same name).
If conflicting versions of the same package are pulled in through
propagated inputs, building the profile would fail.  If two different
files end up in the same profile some other way, Guix silently resolves
that conflict, but you can get a warning about it at higher verbosity.
I'm not exactly sure which case this falls into, however.  I have the
weird feeling that you don't get two different packages into the
environment, at least not directly.  And at that point, making
meaningful predictions based on (transitive) inputs is going to be
complicated.
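
For illustration only, here is a minimal Python sketch of the kind of
check being asked for.  It is not an existing Guix feature; the function
name `find_conflicts` is hypothetical, and it assumes you already have a
list of store paths (for example, collected by hand from the
environment).  It only groups store items by their "name-version" part
and reports those that occur with more than one hash, so it says nothing
about why the duplicates exist.

```python
import re
from collections import defaultdict

# A store item looks like /gnu/store/<32-character hash>-<name-version>[/...]
STORE_ITEM = re.compile(r"^/gnu/store/([0-9a-z]{32})-([^/]+)")


def find_conflicts(paths):
    """Return {name-version: sorted hashes} for items seen with several hashes."""
    seen = defaultdict(set)
    for path in paths:
        match = STORE_ITEM.match(path)
        if match:  # silently skip anything that is not a store path
            seen[match.group(2)].add(match.group(1))
    return {name: sorted(hashes)
            for name, hashes in seen.items()
            if len(hashes) > 1}
```

Fed the two libgobject paths discussed in this thread, it would report
"glib-2.62.6" with both hashes.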

> Also, there's probably some kind of environment build file list
> printing in guix.
> Using it, it could list which libgobjects are in the environment without me
> having
> to use a LD_PRELOAD to find which they are...
> 
> Does it exist?  If so, how to use it?
I'm not sure I'm interpreting this correctly, but you're talking about
the derivation (.drv) files, right?  If so, Guix prints them on first
build, and you can also grep /gnu/store for the path of
$GUIX_ENVIRONMENT in order to find them.
Reading them, on the other hand…

> > Is that really the problem at hand? I don't see glib explicitly
> > mentioned here, so you should not build glib-with-documentation
> > here. 
> > Adding glib to one of your --ad-hoc "chains" should yield a
> > different
> > profile.
> 
> It's not that direct--but yes, something like that is indeed the
> problem at hand.
> 
> If you want more detail, the guile-gi bug report is pretty detailed
> on it.
I think they mention glib-with-documentation as something the user
might want to put in their development environments for some other
purpose (or inadvertently pull in anyway), not as the thing that gets
put there in this specific instance.  Either way, you'd need to install
glib-with-documentation (= user-facing glib) either globally or in your
development environment for that to happen.

> Also, better, the reason I put the "historical" section at the end of
> my
> previous post is that it is reproducible later, so we can find out
> what exactly
> happened here.
> 
> Following that you will get exactly the same situation--and using
> (only) the
> LD_PRELOAD discussed on guile-gi, you will find the culprit, and you
> will see
> that it is two different libgobjects being loaded.
> 
> We are now in a better situation because I do all my development
> using version
> control--even throwaway stuff.  guile-gi devs had this happen before-
> -but their
> reproducers got lost to time.  Now we have reproducers that cause it
> every
> single time (both the historical version of guix-gui mentioned and
> also
> file test/insanity.scm in guile-gi by now).
Having reproducible reproducers is a wonderful thing, but you need to
understand them too.  I guess in the context of guile-gi,
test/insanity.scm catches the case where the same structure has two
GTypes because it is loaded from different places.

> > What is the meaning of "doesn't like" here?  Is it explicitly
> > discouraged, poorly supported, work in progress, ...?
> 
> It means it does not work, and given what they are doing in the
> source code
> GNOME evidently did not think about this case at all.
> 
> And at runtime using guix environment, this does (pretty much always)
> cause
> problems without any special things to be done to cause it.
> 
> You can read a lot of details on the bug report on the guile-gi
> website.
I am following this bug somewhat closely, but to me there doesn't seem
to be this hard evidence you are talking about.  If I read this
correctly, upstream thinks it might be possible to work around such
issues on their end.  What do GI maintainers say about loading a
different version of GObject?

> > > This was both my fault, and guix's fault for being VERY obtuse.  
> > I don't think looking for someone to blame is particularly useful.
> 
> We are engineers--so finding out what is to blame for a problem and
> fixing that place is indeed what we do all day (that includes
> deciding
> which projects should have the change, and which shouldn't.  If the
> answer is "none" the situation will not improve.  So that's right
> out).
Emphasis on "someone".  You can (and should of course) search for
issues in code and architecture, but not in people :)

> If you mean blaming people, I don't do that--that would be silly.
Fuck, you've anticipated my response.

> "Blaming" programs I will do all day every day (including
> mine).  Problems like
> this need to be seen, otherwise us guix devs don't know that it's
> like this.
> 
> I know it from my own stuff I put into Guix master.  *I* did get
> those
> to work by configuring stuff.  But is it usable for a regular user?
I think in this case having a pinned issue in Guile-GI saying "sorry
folks, `guix environment guile-gi` is broken" is a step forward.  The
question is how Guix or Guile-GI devs can "unbreak" this environment.

The crux seems to be two different versions of glib spawning from what
should be one.  Of the following:
(1) /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgobject-2.0.so.0
(2) /gnu/store/xkfc1275h55ynpgfr3wwmzy9707nblwc-glib-2.62.6/lib/libgobject-2.0.so.0
only (1) should be built in my opinion, but I have nothing to back up
that reasoning except for the fact that it's the one GI wants.
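
As a rough alternative to the LD_PRELOAD approach, the following Python
sketch looks for the same library basename being mapped from two
different paths in a running process.  It assumes Linux and the usual
/proc/<pid>/maps layout (address, perms, offset, dev, inode, pathname);
the function name `duplicate_libraries` is made up for this example and
nothing here is part of Guix or Guile-GI.

```python
import os
from collections import defaultdict


def duplicate_libraries(maps_text):
    """Return {basename: sorted paths} for .so files mapped from several paths."""
    by_name = defaultdict(set)
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping: no pathname column
        path = fields[5].strip()
        name = os.path.basename(path)
        if path.startswith("/") and ".so" in name:
            by_name[name].add(path)
    return {name: sorted(paths)
            for name, paths in by_name.items()
            if len(paths) > 1}
```

One would call it on the contents of /proc/<pid>/maps for the affected
process, e.g. `duplicate_libraries(open("/proc/self/maps").read())` from
within a Python process.  It only tells you that a duplicate is mapped,
not who loaded it.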

> For this guix environment stuff, the answer is a definite
> no.  (there's
> another thread on the guix-devel list about improving that--so it's
> not
> like I'm the only one--it's just a bad interface.  It happens.)
Which other thread are you referring to?  The most relevant thing I
found was you encountering something similar with mrustc 2 years ago.

> > Regarding the safety aspect, I don't think that's what they're
> > going for.  They might be thinking, that "--pure --no-grafts" is
> > close
> > enough to what you'd get inside `guix build`, so that you don't
> > have to
> > use `guix build` for debugging.
> 
> If grafts are broken (which seems likely according to guile-gi devs),
> grafting
> should be fixed now, before a security problem happens and we miss it
> (...but
> state that we fixed it).
I don't think grafting itself is broken.  What might be broken is `guix
environment --yes-grafts`, as it departs from the expectation that the
environment you get is "just like `guix build`".

> I know that guile-gi argues for doing development with --no-grafts,
> but
> 
> (1) I don't have any readable characters when I do that
> (2) If I did, it would still not help the end user because it is not
> the right fix
I don't think the end user will care too much about some development
hiccup.  What matters is that the resulting package is functional, and
you should get that by enforcing GI_TYPELIB_PATH = (getenv
"GI_TYPELIB_PATH").

> > > > Do you still encounter Guile-GI#96 after packaging?  
> > > 
> > > I've not finished packaging it yet (I tried--but that's pretty
> > > difficult,
> > > too!  Help wanted--see README and guix.scm in the guix-gui
> > > repo).  
> > I'll see what I can do to help, but the only thing I can see are
> > "FIXME" strings in guix.scm.  The README even suggests to run `guix
> > build`.
> 
> Thanks!
> 
> If you do the guix build and then run the final script, then
> you will see that it does not start up because it cannot find guile
> modules.  I am not knowledgeable how to deploy large guile programs,
> so I'd appreciate help in doing that.
> 
> (I can totally find a hacky way to make it work--but I'd rather use
> the
> canonical way to find external guile modules that surely exists,
> especially in guix (... which is written in guile and is a large
> program
> with external guile modules))
Given that your program is an application, you would wrap
GUILE_LOAD_PATH and GUILE_LOAD_COMPILED_PATH into the launcher using
wrap-program.  If it were a library, you'd ask the user to install
guile in the same environment and propagate any Guile modules you
depend on.
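
To make the idea of "wrapping" concrete, here is a small Python sketch
that generates the text of such a launcher.  It is not Guix's
wrap-program (which is Scheme and runs at build time), only an
illustration of the kind of script wrapping results in: export the two
variables, then exec the real program.  The function names and all
paths in the usage note are hypothetical.

```python
import shlex


def _export_prefix(var, dirs):
    """Build a shell line that prepends DIRS to VAR, keeping any old value."""
    prefix = shlex.quote(":".join(dirs))
    # "${VAR:+:$VAR}" expands to ":<old value>" only if VAR is set and non-empty.
    return "export " + var + "=" + prefix + '"${' + var + ":+:$" + var + '}"'


def launcher_script(real_program, load_path, compiled_path):
    """Return a /bin/sh launcher that sets Guile's search paths and execs REAL_PROGRAM."""
    lines = [
        "#!/bin/sh",
        _export_prefix("GUILE_LOAD_PATH", load_path),
        _export_prefix("GUILE_LOAD_COMPILED_PATH", compiled_path),
        "exec " + shlex.quote(real_program) + ' "$@"',
    ]
    return "\n".join(lines) + "\n"
```

For example, `launcher_script("/opt/app/bin/.app-real",
["/opt/app/share/guile/site/3.0"],
["/opt/app/lib/guile/3.0/site-ccache"])` yields a four-line script whose
last line execs the real binary with the original arguments.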

> > One could disagree about that based on the workflow you've posted
> > here
> > ;)
> 
> I can assure you this is what anyone would try, until he gets burned
> like
> this and then does some of
> 
> (1) gives up and deletes guix
> (2) posts a bad review to the media
> (3) posts bug reports detailing what happened
> (4) uses another way than "guix environment" and it finally works
> (but does
> that really do the right thing?  Who knows...)
Building increasingly arcane commands is the exact opposite of what you
should do in Guix.  If it looks wrong and feels wrong it's most likely
wrong ;)
I know that npm and others educate their users to pack even more
brittle constructs onto already brittle constructs, but that's the
state of software that we want to get rid of.

> I prefer people do (3) and (4).  Ideally, getting information for (3)
> would be
> automatic (at least more automatic than "<GValue> is not compatible
> with
> <GValue>".  Oh yeah?  It's not compatible?  Good to know ;)
I think Guile-GI devs are at least going to have a laugh if you post
that over there.  Sadly I get stuff like "<GObject> imported from both
(x) and (y)", which is a slightly better clue that things are going
wrong.

> Also, guile-gi devs always used "--pure -l guix.scm" and it still
> happens to them!
Is this perhaps related to their request to use --no-grafts?

> > Jokes aside, the Guile-GI devs seem to be hitting a similar issue,
> > but
> > as they write, it appears to mainly affect their own development
> > environments (i.e. exactly the thing that hurts you the most as a
> > developer).  There is potentially a discrepancy between `guix
> > build`
> > and `guix environment`.
> 
> Apparently so.
> 
> Note that with the current state of guix (no gui...) we mainly can
> target only
> developers--so if the story for developers is bad, well, that's bad
> for guix.
Okay, but what is Guix to do?  Make environments "--no-grafts" by
default to bring them closer to `guix build`?  There is already a
movement saying they want "--ad-hoc" by default, which goes in the
completely opposite direction.  "Graft more carefully"?  If so, how?

> > Having a look at Guile-GI specifically, ldd `guix build guile-gi`
> > has
> > the following:
> > libgobject-2.0.so.0 => /gnu/store/xa1vfhfc42x655hi7vxqmbyvwldnz7r0-glib-2.62.6/lib/libgobject-2.0.so.0
> > 
> > You should recognize that hash by now ;)
> 
> Definitely.  With how often in different contexts I saw gobject
> hashes by now,
> I'll probably dream about those ;)
xa1vf...

> > You can eliminate a huge array of those issues by using pure
> > environments.
> 
> As I said, "guix environment --pure" did not fix all (or even many)
> of these
> problems.  I added "--pure" very soon--and WITH the "--pure" it still
> loads
> two different libgobjects.  Using "-l" helps tremendously.  "--pure"?
> Nope.
I'm quite certain Guile-GI devs use "--pure -l".  Where is your God
now?

> > It would also make it extremely easy to shoot yourself in the foot,
> > thereby costing you even more time.  It's like shooting flies with
> > a
> > cannon.  Sure, you might kill some bugs, but you can just as well
> > tear
> > down castle walls with that.
> 
> I don't see what is even controversial with it.  The same duplicate
> symbol
> checking *always* happens with regular linking.  So if you now dlopen
> stuff
> that usually is regularly linked into C programs, the latter C
> program
> developers already ensured that the libraries don't have duplicate
> symbols
> (in a certain sense), otherwise the libraries wouldn't link with the
> C
> programs in the first place.
I wouldn't be too certain about that.  You are excluding very benign
behaviour here.  Going back to your earlier examples with A, B, Q and
Q', I'm not sure how you'd get that to link statically, whereas
linking them together dynamically should take only moderate trickery.

> I'm only suggesting for dlopen to do the same thing in diagnostics
> mode.
> 
> Also, I'm not suggesting to always enable the feature--but it should
> *exist*.
Anyway, we are departing a bit much from the original topic, are we
not?  I also feel as if this question is something better posed
separately, perhaps even as a general libc question on stackoverflow or
similar?

> But you are right, an easier fix would be guix environment just
> warning about
> packages that have the same name but are different packages in the
> same
> environment.  That way you wouldn't have to care about linker stuff
> at all.
I still wonder what exactly you mean by that, see my earlier point
regarding meaningful predictions.

> Bad debugging features are the achilles heel of guix--and it's
> difficult
> for me to justify spending time on guix when every time there's
> something
> to debug it is so bad.

To be honest, I have already gazed into the deepest abyss of debugging
there is and it gazed back, so nothing you say after this line will
faze me.  It also means you should take my counterpoints with a grain
of salt.

> (1) There is no debug info by default.  Ludo fixed that in that now
> one can
> use package transformations to build certain packages with debug info
> without
> having to recompile a guix checkout.
> 
> Nice.  (previously, you had to check out guix, manually edit packages
> (first
> find out how), then compile it (which is not easy) and then use
> ./pre-inst-env
> to rebuild whatever you wanted to debug in the first place so you get
> line
> number information when one of the programs segfaults and you are in
> gdb.
> Think joe regular dev would do that when collecting info for
> reporting bugs?
> I don't)
I don't find it nice to bring up already solved problems here, but Guix
has a huge number of ways to write custom packages (even more with guix
transformations), and of those only one requires a checkout and
recompilation.
It also has a pretty decent manual.

> (2) guix guile backtraces always just say eval.scm and not the actual
> source file or line something happened in.  That is still the case.
> I basically use divide and conquer "write"s in order to find out
> where in
> my program the actual line it is talking about is.
Can't say I've encountered too many of those, but I think it might have
to do with (lack of) compilation.  Not really sure what I do wrong
here, I've been searching for less meaningful backtraces far and wide.

> (3) guix environment and the actual guix build daemon set up slightly
> different environments.  That way, it can happen that a bug that
> happens inside the guix build container doesn't happen when you
> search
> for it using guix environment.
Back to that one, aren't we?  I feel like this is the main reason
why Guile-GI devs suggest --no-grafts.
That being said, I don't feel like there is a consensus on what `guix
environment` should be, so it ends up filling different niches all at
once.  I also don't know whether those niches can be neatly separated
into different commands and, if so, how to name them.

> > I'm not sure how sane loading another version of GObject is in
> > other GI
> > implementations.
> 
> Not sane at all.
> 
> That duplicate loading of libraries with the same basename is not
> supposed to
> be done.
> 
> I mean it's nice if someone gets it to work anyway--but I suspect
> that that
> won't fly with gobject's runtime type checking.
That's nice as far as theory goes, but have you tried?
I'm joking of course.  There is no other version of GObject you could
load but one that is slightly differently grafted.

> > I don't think I can help you with that here, but if you have
> > something
> > usable for debugging, like a stack trace, it might be worth to ask
> > around Guile-GI (assuming it is not just a repetition of their
> > #96).
> 
> The problem is that the software stack does not fail correctly,
> and somewhere the software stack needs to be adapted so that it fails
> instead of dragging the error state all the way through the entire
> profile
> without it being detected.  There are only usable stacktraces when
> you
> go out of your way using LD_PRELOAD.  That is not a tenable way.
What kind of stack traces do you get from that?  Traces of when
conflicting libraries are loaded, I assume.

> I want to be clear that guile-gi is not at fault, by which I mean
> that
> this part of the software stack should not be changed.  If you did,
> gobject-introspection would still do it "wrong".  If you changed
> that, the
> next GNOME library over there would STILL do it "wrong".  That is not
> the
> way to fix it.
> 
> Also, what is "fixing it"?  Is it preventing duplicate loading for
> good
> or is it changing 29 libraries upstream to support duplicate loading?
> Or is it warning if it happens?  All of those in different points of
> time?
I feel like you've read me wrong here.  The point was that the
Guile-GI devs can aid you in debugging the situation if it's not
something you are already familiar with.  They are (presumably) the
ones who know their codebase best, so if you raise an issue along the
lines of "if I do this and that, then this segfault happens with that
warning printed" or "if I spawn a Gtk Window and wait for three
seconds, my application crashes", they will likely help you out.

Of course, if it's a direct result of #96, you're not adding more than
a data point unless you go out of your way to prove something about it,
but I was speaking in slightly broader terms.

Whether it is their issue or someone else's is up to them to decide
and perhaps negotiate.  Looking at it from another perspective, even if
GI changed in the way you wanted, Guile-GI could still misuse it in a
way that breaks things.  Perhaps, seen that way, one could interpret
#96 as them trying their best not to be the mood killer.

> With this, I hope to at least have raised awareness and started the
> discussion so progress can be made.
> 
> (For example, it makes not much sense to bother guile-gi devs with a
> problem
> that guile-gi did not cause.  gobject-introspection has the exact
> same
> design (and is a dependency of guile-gi), so everyone who uses
> gobject-introspection inherits that limitation.  The guile-gi people
> are just nice enough to look at it anyway--even though it's not
> something
> they can unilaterally reasonably fix anyway)
I understand the sentiment, but at the same time I don't think you
should tell them not to bother with that issue.  Look at it this way:
If they somehow do manage against all the odds to more or less
meaningfully load a different GObject through GI, would that not
encourage GI to adapt so as to make that easier?

Regards, Leo


