RFC: new syntax for inline patches

unofficial mirror of guix-devel@gnu.org 
 help / color / mirror / code / Atom feed

* RFC: new syntax for inline patches
@ 2022-01-04 16:50 Ricardo Wurmus
  2022-01-05  8:16 ` Attila Lendvai
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Ricardo Wurmus @ 2022-01-04 16:50 UTC (permalink / raw)
  To: guix-devel

Hi Guix,

does this pattern look familiar to you?

  (arguments
    (list
    #:phases
    '(modify-phases %standard-phases
      (add-after 'unpack 'i-dont-care
        (lambda _
          (substitute* "this-file"
            (("^# some unique string, oh, careful! gotta \\(escape\\) this\\." m)
             (string-append m "\nI ONLY WANTED TO ADD THIS LINE!\n"))))))))

This is a lot of boilerplate just to add a line to a file.  I’m using
“substitute*” but actually I don’t want to substitute anything.  I just
know that I need to add a line somewhere in “this-file”.

Or maybe it’s a CMakeLists.txt file that inexplicably wants to download
stuff?  I should patch that file but it’s a multi-line change.  So I’m
trying to do the same as above with several different anchor strings to
comment out lines.

We have a lot of packages like that.  And while this boilerplate pattern
looks familiar to most of us now, it is really unclear.  It is
imperative and abuses regular expression matching when really it should
have been a patch.

There are a few reasons why we don’t use patches as often:

1. the source code is precious and we prefer to modify the original
sources as little as necessary, so that users can get the source code as
upstream intended with “guix build -S foo”.  We patch the sources
primarily to get rid of bundled source code, pre-built binaries, or
code that encroaches on users’ freedom.

2. the (patches …) field uses patch files.  These are annoying and
inflexible.  They have to be added to dist_patch_DATA in gnu/local.mk,
and they cannot contain computed store locations.  They are separate
from the package definition, which is inconvenient.

3. snippets feel like less convenient build phases.  Snippets are not
thunked, so we can’t do some things that we would do in a build phase
substitution.  We also can’t access %build-inputs or %outputs.  (I don’t
know if we can use Gexps there.)

I feel that the first point is perhaps a little overvalued.  I have
often felt annoyed that I had to manually apply all this build phase
patching to source code obtained with “guix build -S”, but I never felt
that source code I got from “guix build -S” was too far removed from
upstream.

It may not be possible to apply patches with computed store locations —
because when we compute the source derivation (which is an input to the
package derivation) we don’t yet know the outputs of the package
derivation.  But perhaps we can still agree on a more declarative way to
express patches that are to be applied before the build starts; syntax
that would be more declarative than a serious of brittle substitute*
expressions that latch onto hopefully unique strings in the target
files.

(We have something remotely related in etc/committer.scm.in, where we
define a record describing a diff hunk.)

Here’s a colour sample for the new bikeshed:

  (arguments
    (list
      #:patches
      #~(patch "the-file"
         ((line 10)
          (+ "I ONLY WANTED TO ADD THIS LINE"))
         ((line 3010)
          (- "maybe that’s better")
          (+ (string-append #$guix " is better"))
          (+ "but what do you think?")))))

-- 
Ricardo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC: new syntax for inline patches
  2022-01-04 16:50 RFC: new syntax for inline patches Ricardo Wurmus
@ 2022-01-05  8:16 ` Attila Lendvai
  2022-01-06  0:19 ` Liliana Marie Prikler
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Attila Lendvai @ 2022-01-05  8:16 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel

i may be lacking the necessary bird's eye view here... but why not just use good old copy-pasted diff for this?

introducing extra complexity in the form of a new DSL has all kinds of costs that people usually ignore (*), and it's not clear to me how those costs would pay off compared to just copy-pasting a small diff into the package declaration.

* lack of tooling; more opportunity for bugs; the ((line 10) (+ ...)) DSL is more fragile to upstream changes, etc)

sure, diff would be more verbose and arguably less aesthetic, but it's much easier on the brain of the coders, and is less fragile, too. and larger patches could go through the current, convoluted local.mk way.

- attila
PGP: 5D5F 45C7 DFCD 0A39

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC: new syntax for inline patches
  2022-01-04 16:50 RFC: new syntax for inline patches Ricardo Wurmus
  2022-01-05  8:16 ` Attila Lendvai
@ 2022-01-06  0:19 ` Liliana Marie Prikler
  2022-01-06  1:20   ` Jelle Licht
  2022-01-08 21:34 ` Ludovic Courtès
  2022-01-12 17:56 ` Efraim Flashner
  3 siblings, 1 reply; 10+ messages in thread
From: Liliana Marie Prikler @ 2022-01-06  0:19 UTC (permalink / raw)
  To: Ricardo Wurmus, guix-devel

Hi Ricardo,

Am Dienstag, dem 04.01.2022 um 17:50 +0100 schrieb Ricardo Wurmus:
> Hi Guix,
> 
> does this pattern look familiar to you?
> 
>   (arguments
>     (list
>     #:phases
>     '(modify-phases %standard-phases
>       (add-after 'unpack 'i-dont-care
>         (lambda _
>           (substitute* "this-file"
>             (("^# some unique string, oh, careful! gotta \\(escape\\)
> this\\." m)
>              (string-append m "\nI ONLY WANTED TO ADD THIS
> LINE!\n"))))))))
> 
> This is a lot of boilerplate just to add a line to a file.  I’m using
> “substitute*” but actually I don’t want to substitute anything.  I
> just know that I need to add a line somewhere in “this-file”.
Now many of these substitute*s are actually sane.  E.g. those which
match the beginning of a defun in Emacs terms.  However, there for sure
are cases in which I think "when all you have is a substitute*, every
problem you have starts to look like... oh shit, I can't match this
multi-line string, back to square 0".

> CMakeLists.txt
I feel your pain.

> We have a lot of packages like that.  And while this boilerplate
> pattern looks familiar to most of us now, it is really unclear.  It
> is imperative and abuses regular expression matching when really it
> should have been a patch.
Now imperative is not really the bad thing here, but I'll get to that a
little bit later when discussing your mockup.  I do however agree that
it's too limited for its task.

> There are a few reasons why we don’t use patches as often:
> 
> 1. the source code is precious and we prefer to modify the original
> sources as little as necessary, so that users can get the source code
> as upstream intended with “guix build -S foo”.  We patch the sources
> primarily to get rid of bundled source code, pre-built binaries, or
> code that encroaches on users’ freedom.
> 
> 2. the (patches …) field uses patch files.  These are annoying and
> inflexible.  They have to be added to dist_patch_DATA in
> gnu/local.mk, and they cannot contain computed store locations.  They
> are separate from the package definition, which is inconvenient.
> 
> 3. snippets feel like less convenient build phases.  Snippets are not
> thunked, so we can’t do some things that we would do in a build phase
> substitution.  We also can’t access %build-inputs or %outputs.  (I
> don’t know if we can use Gexps there.)
Both patches and snippets serve the functions you've outlined in 1. 
Now 2. is indeed an issue, but it's still an issue if you include patch
as native input as well as the actual patch file you want to apply
(assuming it is free of store locations, which can be and has been
worked around).

> It may not be possible to apply patches with computed store locations
> — because when we compute the source derivation (which is an input to
> the package derivation) we don’t yet know the outputs of the package
> derivation.  But perhaps we can still agree on a more declarative way
> to express patches that are to be applied before the build starts;
> syntax that would be more declarative than a serious of brittle
> substitute* expressions that latch onto hopefully unique strings in
> the target files.
> 
> (We have something remotely related in etc/committer.scm.in, where we
> define a record describing a diff hunk.)
> 
> Here’s a colour sample for the new bikeshed:
> 
>   (arguments
>     (list
>       #:patches
>       #~(patch "the-file"
>          ((line 10)
>           (+ "I ONLY WANTED TO ADD THIS LINE"))
>          ((line 3010)
>           (- "maybe that’s better")
>           (+ (string-append #$guix " is better"))
>           (+ "but what do you think?")))))
Now this thing is again just as brittle as the patch it encodes and if
I know something about context-less patches then it's that they're
super not trustworthy.

However, have you considered that something similar has been lying
around in our codebase all this time and 99% of the packages ignore it
because it's so obscure and well hidden?  Why yes, of course I'm
talking about emacs-batch-edit-file.

Now of course there are some problems when it comes to invoking Emacs
inside Guix build.  For one, not every package has emacs-minimal
available at build time and honestly, that's a shame.  Even then, you'd
have to dance around and add emacs-utils to enable it.  And after that?
You code Emacs Lisp in the middle of Guile code!  Now certainly, this
UI can be improved.

Particularly, I'd propose a set of monadic functions that operate on a
buffer in much the same way Emacs does.  Now we wouldn't need to
implement all of Emacs in that way (though we certainly could if given
enough time).

Basic primitives would be the following to name a few: search-forward,
re-search-forward, goto-char, forward-line, point-min, insert,
beginning-of-line, end-of-line, delete-region, point.  Of course we
could rename them to be more guily, but that's a minor concern imo. 
Notice how I omitted find-file and save-buffer, because that should be
taken care of by the monad.

WDYT, should we pursue that?  Or should we just add an Emacs to our
native inputs?  :P


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC: new syntax for inline patches
  2022-01-06  0:19 ` Liliana Marie Prikler
@ 2022-01-06  1:20   ` Jelle Licht
  2022-01-06  6:43     ` Liliana Marie Prikler
  0 siblings, 1 reply; 10+ messages in thread
From: Jelle Licht @ 2022-01-06  1:20 UTC (permalink / raw)
  To: Liliana Marie Prikler, Ricardo Wurmus, guix-devel

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

> Hi Ricardo,
>
> Am Dienstag, dem 04.01.2022 um 17:50 +0100 schrieb Ricardo Wurmus:
>> Hi Guix,
>> 
>> does this pattern look familiar to you?
>> 
>>   (arguments
>>     (list
>>     #:phases
>>     '(modify-phases %standard-phases
>>       (add-after 'unpack 'i-dont-care
>>         (lambda _
>>           (substitute* "this-file"
>>             (("^# some unique string, oh, careful! gotta \\(escape\\)
>> this\\." m)
>>              (string-append m "\nI ONLY WANTED TO ADD THIS
>> LINE!\n"))))))))
>> 
>> This is a lot of boilerplate just to add a line to a file.  I’m using
>> “substitute*” but actually I don’t want to substitute anything.  I
>> just know that I need to add a line somewhere in “this-file”.
> Now many of these substitute*s are actually sane.  E.g. those which
> match the beginning of a defun in Emacs terms.  However, there for sure
> are cases in which I think "when all you have is a substitute*, every
> problem you have starts to look like... oh shit, I can't match this
> multi-line string, back to square 0".
>
>> CMakeLists.txt
> I feel your pain.
>
>> We have a lot of packages like that.  And while this boilerplate
>> pattern looks familiar to most of us now, it is really unclear.  It
>> is imperative and abuses regular expression matching when really it
>> should have been a patch.
> Now imperative is not really the bad thing here, but I'll get to that a
> little bit later when discussing your mockup.  I do however agree that
> it's too limited for its task.
>
>> There are a few reasons why we don’t use patches as often:
>> 
>> 1. the source code is precious and we prefer to modify the original
>> sources as little as necessary, so that users can get the source code
>> as upstream intended with “guix build -S foo”.  We patch the sources
>> primarily to get rid of bundled source code, pre-built binaries, or
>> code that encroaches on users’ freedom.
>> 
>> 2. the (patches …) field uses patch files.  These are annoying and
>> inflexible.  They have to be added to dist_patch_DATA in
>> gnu/local.mk, and they cannot contain computed store locations.  They
>> are separate from the package definition, which is inconvenient.
>> 
>> 3. snippets feel like less convenient build phases.  Snippets are not
>> thunked, so we can’t do some things that we would do in a build phase
>> substitution.  We also can’t access %build-inputs or %outputs.  (I
>> don’t know if we can use Gexps there.)
> Both patches and snippets serve the functions you've outlined in 1. 
> Now 2. is indeed an issue, but it's still an issue if you include patch
> as native input as well as the actual patch file you want to apply
> (assuming it is free of store locations, which can be and has been
> worked around).
>
>> It may not be possible to apply patches with computed store locations
>> — because when we compute the source derivation (which is an input to
>> the package derivation) we don’t yet know the outputs of the package
>> derivation.  But perhaps we can still agree on a more declarative way
>> to express patches that are to be applied before the build starts;
>> syntax that would be more declarative than a serious of brittle
>> substitute* expressions that latch onto hopefully unique strings in
>> the target files.
>> 
>> (We have something remotely related in etc/committer.scm.in, where we
>> define a record describing a diff hunk.)
>> 
>> Here’s a colour sample for the new bikeshed:
>> 
>>   (arguments
>>     (list
>>       #:patches
>>       #~(patch "the-file"
>>          ((line 10)
>>           (+ "I ONLY WANTED TO ADD THIS LINE"))
>>          ((line 3010)
>>           (- "maybe that’s better")
>>           (+ (string-append #$guix " is better"))
>>           (+ "but what do you think?")))))
> Now this thing is again just as brittle as the patch it encodes and if
> I know something about context-less patches then it's that they're
> super not trustworthy.

What do you mean here, with 'brittle' and 'trustworthy'? Is it the fact
that line numbers are hardcoded, compared to the substitute* approach?

> However, have you considered that something similar has been lying
> around in our codebase all this time and 99% of the packages ignore it
> because it's so obscure and well hidden?  Why yes, of course I'm
> talking about emacs-batch-edit-file.
>
> Now of course there are some problems when it comes to invoking Emacs
> inside Guix build.  For one, not every package has emacs-minimal
> available at build time and honestly, that's a shame.  Even then, you'd
> have to dance around and add emacs-utils to enable it.  And after that?
> You code Emacs Lisp in the middle of Guile code!  Now certainly, this
> UI can be improved.

> Particularly, I'd propose a set of monadic functions that operate on a
> buffer in much the same way Emacs does.  Now we wouldn't need to
> implement all of Emacs in that way (though we certainly could if given
> enough time).
>
> Basic primitives would be the following to name a few: search-forward,
> re-search-forward, goto-char, forward-line, point-min, insert,
> beginning-of-line, end-of-line, delete-region, point.  Of course we
> could rename them to be more guily, but that's a minor concern imo. 
> Notice how I omitted find-file and save-buffer, because that should be
> taken care of by the monad.

I'd prefer something that is declarative in the sense that the 'how' is
abstracted away, with a focus on the 'what'; both Ricardo's proposal and
emacs-batch-edit-file do this, and to a much lesser extent the current
substitute* mess. 'Navigation primitives' as you propose here are
perhaps abstract in the technical sense of the phrase, but still
indicate to me a focus on the 'how' instead of the 'what'.

Another nice property of keeping these concerns separated is that it
should be easy (or at least possible) to later reimplement
e.g. Ricardo's proposal on top of such monadic primitives.  Doing it the
other way around seems much less fun, if even feasible at all :-).

> WDYT, should we pursue that?  Or should we just add an Emacs to our
> native inputs?  :P

At the risk of escalating this perhaps-not-totally-serious proposal, I
wrote some snippets that should allow one to write elisp in
g-expressions using guile's built-in elisp reader + some hacks to
serialize guile's internal representation of parsed elisp to valid elisp
again.

- Jelle


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC: new syntax for inline patches
  2022-01-06  1:20   ` Jelle Licht
@ 2022-01-06  6:43     ` Liliana Marie Prikler
  2022-01-06  7:12       ` Ricardo Wurmus
  0 siblings, 1 reply; 10+ messages in thread
From: Liliana Marie Prikler @ 2022-01-06  6:43 UTC (permalink / raw)
  To: Jelle Licht, Ricardo Wurmus, guix-devel

Hi,

Am Donnerstag, dem 06.01.2022 um 02:20 +0100 schrieb Jelle Licht:
> > 
> > 
> > > Here’s a colour sample for the new bikeshed:
> > > 
> > >   (arguments
> > >     (list
> > >       #:patches
> > >       #~(patch "the-file"
> > >          ((line 10)
> > >           (+ "I ONLY WANTED TO ADD THIS LINE"))
> > >          ((line 3010)
> > >           (- "maybe that’s better")
> > >           (+ (string-append #$guix " is better"))
> > >           (+ "but what do you think?")))))
> > Now this thing is again just as brittle as the patch it encodes and
> > if I know something about context-less patches then it's that
> > they're super not trustworthy.
> 
> What do you mean here, with 'brittle' and 'trustworthy'? Is it the
> fact that line numbers are hardcoded, compared to the substitute*
> approach?
What Ricardo is writing here as a colour sample is a context-less diff
and if you've ever worked with those, then you'll know they apply
exactly without context.  So that line 10 is stuck there, it doesn't
move with regards to whatever entity is interesting at line 10.  The
second hunk is better in that it needs a line to match and replace, but
it throws an error if it doesn't find that at line 3010, even if it'd
exist and 3020 or 2048.

Now with substitute*s, you also need to account for wrong regexps, both
matching too many and too few lines (including the notorious 0, that
has sparked many a religious debate whether substitute* is safe
actually), so you might call that an even tradeoff.  However,
substitute is still more flexible in what it does than the proposed
patch objects.  Particularly, even if we added context, which would
trade away some of the nice look of it, we now have to go into and
manually regenerate our broken patches rather than using existing merge
tools with our plain old file patches.

> > However, have you considered that something similar has been lying
> > around in our codebase all this time and 99% of the packages ignore
> > it because it's so obscure and well hidden?  Why yes, of course I'm
> > talking about emacs-batch-edit-file.
> > 
> > Now of course there are some problems when it comes to invoking
> > Emacs inside Guix build.  For one, not every package has emacs-
> > minimal available at build time and honestly, that's a shame.  Even
> > then, you'd have to dance around and add emacs-utils to enable it. 
> > And after that?  You code Emacs Lisp in the middle of Guile code! 
> > Now certainly, this UI can be improved.
> 
> > Particularly, I'd propose a set of monadic functions that operate
> > on a buffer in much the same way Emacs does.  Now we wouldn't need
> > to implement all of Emacs in that way (though we certainly could if
> > given enough time).
> > 
> > Basic primitives would be the following to name a few: search-
> > forward, re-search-forward, goto-char, forward-line, point-min,
> > insert, beginning-of-line, end-of-line, delete-region, point.  Of
> > course we could rename them to be more guily, but that's a minor
> > concern imo. Notice how I omitted find-file and save-buffer,
> > because that should be taken care of by the monad.
> 
> I'd prefer something that is declarative in the sense that the 'how'
> is abstracted away, with a focus on the 'what'; both Ricardo's
> proposal and emacs-batch-edit-file do this, and to a much lesser
> extent the current substitute* mess. 'Navigation primitives' as you
> propose here are perhaps abstract in the technical sense of the
> phrase, but still indicate to me a focus on the 'how' instead of the
> 'what'.
I don't think you understood me correctly there.  The monadic
procedures I propose would implement a subset of Emacs' buffer
operations – we can bikeshed about what set of Emacs, so that the set
lets you focus on the 'how' instead of the 'what' – only inside Scheme
and inside some well-known (guix build) module.

Consider the translation of e.g. 
(emacs-batch-edit-file "foo.org"
  '(progn (goto-char (point-max))
          (insert "More content.")
          (basic-save-buffer))) 
to
(with-buffer "foo.org"
  (goto-char (point-max))
  (insert "More content."))
which would roughly equate to
(run-with-buffer (open-buffer "foo.org")
  (lambda (buffer-monad)
    (return [something magical going on here])))
where run-with-buffer would do an atomic file replacement which
temporarily creates a buffer monad for the translation from in to out.

> Another nice property of keeping these concerns separated is that it
> should be easy (or at least possible) to later reimplement
> e.g. Ricardo's proposal on top of such monadic primitives.  Doing it
> the other way around seems much less fun, if even feasible at all :-
> ).
I haven't heard about diff being Turing-complete while Emacs
demonstrably is, so buffer monads would be somewhere in between.

> > WDYT, should we pursue that?  Or should we just add an Emacs to our
> > native inputs?  :P
> 
> At the risk of escalating this perhaps-not-totally-serious proposal,
> I wrote some snippets that should allow one to write elisp in
> g-expressions using guile's built-in elisp reader + some hacks to
> serialize guile's internal representation of parsed elisp to valid
> elisp again.
I don't think we would need to use G-expressions here, because we would
have to quote or quasi-quote the batch-edit-file expression anyway and
if we do the latter, we can insert store references using the usual set
of primitives that you'd expect inside a build phase, e.g. search-
input-file.  However, that's still a cool thing.

Cheers


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC: new syntax for inline patches
  2022-01-06  6:43     ` Liliana Marie Prikler
@ 2022-01-06  7:12       ` Ricardo Wurmus
  2022-01-06  8:12         ` Liliana Marie Prikler
  0 siblings, 1 reply; 10+ messages in thread
From: Ricardo Wurmus @ 2022-01-06  7:12 UTC (permalink / raw)
  To: Liliana Marie Prikler; +Cc: guix-devel

Liliana Marie Prikler <liliana.prikler@gmail.com> writes:

> Am Donnerstag, dem 06.01.2022 um 02:20 +0100 schrieb Jelle Licht:
>> > 
>> > 
>> > > Here’s a colour sample for the new bikeshed:
>> > > 
>> > >   (arguments
>> > >     (list
>> > >       #:patches
>> > >       #~(patch "the-file"
>> > >          ((line 10)
>> > >           (+ "I ONLY WANTED TO ADD THIS LINE"))
>> > >          ((line 3010)
>> > >           (- "maybe that’s better")
>> > >           (+ (string-append #$guix " is better"))
>> > >           (+ "but what do you think?")))))
>> > Now this thing is again just as brittle as the patch it encodes and
>> > if I know something about context-less patches then it's that
>> > they're super not trustworthy.
>> 
>> What do you mean here, with 'brittle' and 'trustworthy'? Is it the
>> fact that line numbers are hardcoded, compared to the substitute*
>> approach?
> What Ricardo is writing here as a colour sample is a context-less diff
> and if you've ever worked with those, then you'll know they apply
> exactly without context.  So that line 10 is stuck there, it doesn't
> move with regards to whatever entity is interesting at line 10.  The
> second hunk is better in that it needs a line to match and replace, but
> it throws an error if it doesn't find that at line 3010, even if it'd
> exist and 3020 or 2048.

Yes, this example is context-less.  But you know what it’s like looking
at bikeshed colours on a screen: they just don’t pop right, and
dependent on screen calibration or lack thereof they can seem outright
hideous — nothing like the real thing.

So lets take a step back and look at the location and shape of the
bikeshed rather than its color.  Do we agree that it would be lovely to
have a less flexible but declarative pattern to describe changes to
files?  Less flexible than a full-blown editing DSL as that of Emacs,
less flexible than perhaps arbitrary regular expression replacements as
provided by substitute*?  I just think that sometimes we want to focus
on the change itself and not how we get there.

It’s primarily a matter of style and readability.  I think it’s
regrettable to have all these boilerplate build phase shenanigans to
express a simple change in a file.  A large chunk of that is just boring
set up work to be permitted to use “substitute*”, and then the
“substitute*” itself is primarily concerned with an anchor that could
not be much uglier: a regular expression DSL embedded in a string with
double escaping.  Yuck!

Even something as simple as diff-in-a-string seems more appealing in
some cases than all these “substitute*” expressions.

-- 
Ricardo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC: new syntax for inline patches
  2022-01-06  7:12       ` Ricardo Wurmus
@ 2022-01-06  8:12         ` Liliana Marie Prikler
  0 siblings, 0 replies; 10+ messages in thread
From: Liliana Marie Prikler @ 2022-01-06  8:12 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel

Hi Ricardo,

Am Donnerstag, dem 06.01.2022 um 08:12 +0100 schrieb Ricardo Wurmus:
> So lets take a step back and look at the location and shape of the
> bikeshed rather than its color.  Do we agree that it would be lovely
> to have a less flexible but declarative pattern to describe changes
> to files?  Less flexible than a full-blown editing DSL as that of
> Emacs, less flexible than perhaps arbitrary regular expression
> replacements as provided by substitute*?  I just think that sometimes
> we want to focus on the change itself and not how we get there.
I can't say that we do.  Look back at Attila's case for plain old
diffs.  It seems to me that if all you want to encode is a patch file,
you ought to use a patch file.  That doesn't necessarily entail adding
it to origin, however, and I think we can find some agreement if we
change the way we write things.

> It’s primarily a matter of style and readability.  I think it’s
> regrettable to have all these boilerplate build phase shenanigans to
> express a simple change in a file.  A large chunk of that is just
> boring set up work to be permitted to use “substitute*”, and then the
> “substitute*” itself is primarily concerned with an anchor that could
> not be much uglier: a regular expression DSL embedded in a string
> with double escaping.  Yuck!
> 
> Even something as simple as diff-in-a-string seems more appealing in
> some cases than all these “substitute*” expressions.
Now X in a string is usually very appealing due to its pretty low
barrier for entry.  For a long time, I had the custom Ren'py launcher
be a big format¹, because that's what worked.  However, I've since
changed it to an aux-file + substitute*, and that gets my intent across
much more clearly.

I think we can do something similar for patches.  Rather than coding
them into the file, we'd have 
#:patches #~(list #$(locate-first-patch) #$(locate-second-patch)), and
a post-unpack phase (let's call it "patch") would take care of applying
these patches.  If you need store paths, you can write
@HERE_COMES_THE_STORE_PATH@ and easily match that in a substitute that
runs in a post-pack phase.  If you prefer we do so atomically in unpack
(i.e. unpack becomes "unpack and apply all #:patches") so as to not
change our phase API, that'd also work for me personally.

In short, I'd say "yes" to making it easier to apply patches at build
time.  Currently, you have to add both the patch and the patch program
as well as code up your own phase to do so – not ideal.  We could
lessen that to just making sure patch is in native-inputs if the
#:patches keyword is used.

Now you still have to adjust dist_patch_DATA and that might be a pain
point for maintainers, but in the grand scheme of things it's in my
opinion a lesser evil.  WDYT?

¹ We need one, because the one supplied by upstream happily loads the
same files twice only to then fail with a huge stack trace.  I'm not
sure if I inadvertently fixed the reason why it loads the same files
twice elsewhere, but it's nice to have that control.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC: new syntax for inline patches
  2022-01-04 16:50 RFC: new syntax for inline patches Ricardo Wurmus
  2022-01-05  8:16 ` Attila Lendvai
  2022-01-06  0:19 ` Liliana Marie Prikler
@ 2022-01-08 21:34 ` Ludovic Courtès
  2022-01-12 18:06   ` Efraim Flashner
  2022-01-12 17:56 ` Efraim Flashner
  3 siblings, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2022-01-08 21:34 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel

Hi!

Ricardo Wurmus <rekado@elephly.net> skribis:

>   (arguments
>     (list
>     #:phases
>     '(modify-phases %standard-phases
>       (add-after 'unpack 'i-dont-care
>         (lambda _
>           (substitute* "this-file"
>             (("^# some unique string, oh, careful! gotta \\(escape\\) this\\." m)
>              (string-append m "\nI ONLY WANTED TO ADD THIS LINE!\n"))))))))

[...]

> There are a few reasons why we don’t use patches as often:
>
> 1. the source code is precious and we prefer to modify the original
> sources as little as necessary, so that users can get the source code as
> upstream intended with “guix build -S foo”.  We patch the sources
> primarily to get rid of bundled source code, pre-built binaries, or
> code that encroaches on users’ freedom.
>
> 2. the (patches …) field uses patch files.  These are annoying and
> inflexible.  They have to be added to dist_patch_DATA in gnu/local.mk,
> and they cannot contain computed store locations.  They are separate
> from the package definition, which is inconvenient.
>
> 3. snippets feel like less convenient build phases.  Snippets are not
> thunked, so we can’t do some things that we would do in a build phase
> substitution.  We also can’t access %build-inputs or %outputs.  (I don’t
> know if we can use Gexps there.)

I agree that #1 is overrated.

As for #3, we could make ‘snippet’ thunked (a snippet can be a gexp
already).  We cannot refer to build inputs there, but that’s on purpose:
snippets, like patches, are supposed to be architecture-independent and
unable to insert store file names.

[...]

> (We have something remotely related in etc/committer.scm.in, where we
> define a record describing a diff hunk.)
>
> Here’s a colour sample for the new bikeshed:
>
>   (arguments
>     (list
>       #:patches
>       #~(patch "the-file"
>          ((line 10)
>           (+ "I ONLY WANTED TO ADD THIS LINE"))
>          ((line 3010)
>           (- "maybe that’s better")
>           (+ (string-append #$guix " is better"))
>           (+ "but what do you think?")))))

Like Attila my first reaction was skepticism.

… but thinking about it, we could have a <computed-patch> record,
similar to the <diff-hunk> record you mention; it would be a file-like
object that, when lowered, would give an actual patch.

So you could write:

  (origin
    ;; …
    (patches (list (computed-patch
                     (hunk (line 10) (+ "new line") (- "old line"))))))

The good thing is that the implementation of <computed-patch> would be
entirely orthogonal, separate from the package machinery.

OTOH, if we do that, we might as well write the actual patch right away.

I wonder how frequent the pattern we’re discussing is.  I know I’ve used
it a few times, but I wonder if it warrants sophisticated tooling.

Thoughts?

Ludo’.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC: new syntax for inline patches
  2022-01-04 16:50 RFC: new syntax for inline patches Ricardo Wurmus
                   ` (2 preceding siblings ...)
  2022-01-08 21:34 ` Ludovic Courtès
@ 2022-01-12 17:56 ` Efraim Flashner
  3 siblings, 0 replies; 10+ messages in thread
From: Efraim Flashner @ 2022-01-12 17:56 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 4077 bytes --]

On Tue, Jan 04, 2022 at 05:50:31PM +0100, Ricardo Wurmus wrote:
> Hi Guix,
> 
> does this pattern look familiar to you?
> 
>   (arguments
>     (list
>     #:phases
>     '(modify-phases %standard-phases
>       (add-after 'unpack 'i-dont-care
>         (lambda _
>           (substitute* "this-file"
>             (("^# some unique string, oh, careful! gotta \\(escape\\) this\\." m)
>              (string-append m "\nI ONLY WANTED TO ADD THIS LINE!\n"))))))))
> 
> This is a lot of boilerplate just to add a line to a file.  I’m using
> “substitute*” but actually I don’t want to substitute anything.  I just
> know that I need to add a line somewhere in “this-file”.
> 
> Or maybe it’s a CMakeLists.txt file that inexplicably wants to download
> stuff?  I should patch that file but it’s a multi-line change.  So I’m
> trying to do the same as above with several different anchor strings to
> comment out lines.
> 
> We have a lot of packages like that.  And while this boilerplate pattern
> looks familiar to most of us now, it is really unclear.  It is
> imperative and abuses regular expression matching when really it should
> have been a patch.
> 
> There are a few reasons why we don’t use patches as often:
> 
> 1. the source code is precious and we prefer to modify the original
> sources as little as necessary, so that users can get the source code as
> upstream intended with “guix build -S foo”.  We patch the sources
> primarily to get rid of bundled source code, pre-built binaries, or
> code that encroaches on users’ freedom.
> 
> 2. the (patches …) field uses patch files.  These are annoying and
> inflexible.  They have to be added to dist_patch_DATA in gnu/local.mk,
> and they cannot contain computed store locations.  They are separate
> from the package definition, which is inconvenient.

It also feels wrong to add a 30 line patch, taking into account the
header bits, to make a 3 line change.

> 3. snippets feel like less convenient build phases.  Snippets are not
> thunked, so we can’t do some things that we would do in a build phase
> substitution.  We also can’t access %build-inputs or %outputs.  (I don’t
> know if we can use Gexps there.)

I believe you can leave out the modules line and use a gexp in the
snippet (without the "'(begin" portion )

> I feel that the first point is perhaps a little overvalued.  I have
> often felt annoyed that I had to manually apply all this build phase
> patching to source code obtained with “guix build -S”, but I never felt
> that source code I got from “guix build -S” was too far removed from
> upstream.
> 
> It may not be possible to apply patches with computed store locations —
> because when we compute the source derivation (which is an input to the
> package derivation) we don’t yet know the outputs of the package
> derivation.  But perhaps we can still agree on a more declarative way to
> express patches that are to be applied before the build starts; syntax
> that would be more declarative than a serious of brittle substitute*
> expressions that latch onto hopefully unique strings in the target
> files.
> 
> (We have something remotely related in etc/committer.scm.in, where we
> define a record describing a diff hunk.)
> 
> Here’s a colour sample for the new bikeshed:
> 
>   (arguments
>     (list
>       #:patches
>       #~(patch "the-file"
>          ((line 10)
>           (+ "I ONLY WANTED TO ADD THIS LINE"))
>          ((line 3010)
>           (- "maybe that’s better")
>           (+ (string-append #$guix " is better"))
>           (+ "but what do you think?")))))

I have on at least one occasion stopped myself from trying to use ed (it
IS the standard editor) to apply something that SHOULD BE trivial to
change.

-- 
Efraim Flashner   <efraim@flashner.co.il>   רנשלפ םירפא
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC: new syntax for inline patches
  2022-01-08 21:34 ` Ludovic Courtès
@ 2022-01-12 18:06   ` Efraim Flashner
  0 siblings, 0 replies; 10+ messages in thread
From: Efraim Flashner @ 2022-01-12 18:06 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 3748 bytes --]

On Sat, Jan 08, 2022 at 10:34:15PM +0100, Ludovic Courtès wrote:
> Hi!
> 
> Ricardo Wurmus <rekado@elephly.net> skribis:
> 
> >   (arguments
> >     (list
> >     #:phases
> >     '(modify-phases %standard-phases
> >       (add-after 'unpack 'i-dont-care
> >         (lambda _
> >           (substitute* "this-file"
> >             (("^# some unique string, oh, careful! gotta \\(escape\\) this\\." m)
> >              (string-append m "\nI ONLY WANTED TO ADD THIS LINE!\n"))))))))
> 
> [...]
> 
> > There are a few reasons why we don’t use patches as often:
> >
> > 1. the source code is precious and we prefer to modify the original
> > sources as little as necessary, so that users can get the source code as
> > upstream intended with “guix build -S foo”.  We patch the sources
> > primarily to get rid of bundled source code, pre-built binaries, or
> > code that encroaches on users’ freedom.
> >
> > 2. the (patches …) field uses patch files.  These are annoying and
> > inflexible.  They have to be added to dist_patch_DATA in gnu/local.mk,
> > and they cannot contain computed store locations.  They are separate
> > from the package definition, which is inconvenient.
> >
> > 3. snippets feel like less convenient build phases.  Snippets are not
> > thunked, so we can’t do some things that we would do in a build phase
> > substitution.  We also can’t access %build-inputs or %outputs.  (I don’t
> > know if we can use Gexps there.)
> 
> I agree that #1 is overrated.
> 
> As for #3, we could make ‘snippet’ thunked (a snippet can be a gexp
> already).  We cannot refer to build inputs there, but that’s on purpose:
> snippets, like patches, are supposed to be architecture-independent and
> unable to insert store file names.
> 
> [...]
> 
> > (We have something remotely related in etc/committer.scm.in, where we
> > define a record describing a diff hunk.)
> >
> > Here’s a colour sample for the new bikeshed:
> >
> >   (arguments
> >     (list
> >       #:patches
> >       #~(patch "the-file"
> >          ((line 10)
> >           (+ "I ONLY WANTED TO ADD THIS LINE"))
> >          ((line 3010)
> >           (- "maybe that’s better")
> >           (+ (string-append #$guix " is better"))
> >           (+ "but what do you think?")))))
> 
> Like Attila my first reaction was skepticism.
> 
> … but thinking about it, we could have a <computed-patch> record,
> similar to the <diff-hunk> record you mention; it would be a file-like
> object that, when lowered, would give an actual patch.
> 
> So you could write:
> 
>   (origin
>     ;; …
>     (patches (list (computed-patch
>                      (hunk (line 10) (+ "new line") (- "old line"))))))
> 
> The good thing is that the implementation of <computed-patch> would be
> entirely orthogonal, separate from the package machinery.
> 
> OTOH, if we do that, we might as well write the actual patch right away.
> 
> I wonder how frequent the pattern we’re discussing is.  I know I’ve used
> it a few times, but I wonder if it warrants sophisticated tooling.
> 
> Thoughts?

I'm OK with needing to change the exact line needed if it moves (EXACTLY
line 10, not 8 or 12 or 25).

It comes up a lot when glibc headers move or split, suddenly we're
looking at the sources, trying to find somewhere to stuff in an extra
include statement. Or qt-5.11, I think we came up with 3 different ways
of dealing with the missing header over the 10 patches.

-- 
Efraim Flashner   <efraim@flashner.co.il>   רנשלפ םירפא
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-01-12 18:07 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-01-04 16:50 RFC: new syntax for inline patches Ricardo Wurmus
2022-01-05  8:16 ` Attila Lendvai
2022-01-06  0:19 ` Liliana Marie Prikler
2022-01-06  1:20   ` Jelle Licht
2022-01-06  6:43     ` Liliana Marie Prikler
2022-01-06  7:12       ` Ricardo Wurmus
2022-01-06  8:12         ` Liliana Marie Prikler
2022-01-08 21:34 ` Ludovic Courtès
2022-01-12 18:06   ` Efraim Flashner
2022-01-12 17:56 ` Efraim Flashner

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).