all messages for Guix-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
blob 75367331c571a0d61c937ef5535d01eaa76d8a82 19004 bytes (raw)
name: website/posts/dissecting-guix-2-store-monad.md 	 # note: path name is non-authoritative(*)

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
 
title: Dissecting Guix, Part 2: The Store Monad
date: TBC
author: (
tags: Dissecting Guix, Functional package management, Programming interfaces, Scheme API
---
Hello again!

In [the last post](https://guix.gnu.org/en/blog/2023/dissecting-guix-part-1-derivations/),
we briefly mentioned the `with-store` and `run-with-store` macros.  Today, we'll
be looking at those in further detail, along with the related monad library and
the [`%store-monad`](https://guix.gnu.org/manual/devel/en/html_node/The-Store-Monad.html)!

Typically, we use monads to chain operations together, and the `%store-monad` is
no different; it's used to combine operations that work on the Guix store (for
instance, creating derivations, building derivations, or adding data files to
the store).

However, monads are a little hard to explain, and from a distance, they seem to
be quite incomprehensible.  So, I want you to erase them from your mind for now.
We'll come back to them later.  And be aware that if you can't seem to get your
head around them, it's okay; you can understand most of the architecture of Guix
without understanding monads.

# Yes, No, Maybe So

Let's instead implement another M of functional programming, _`maybe`_ values,
representing a value that may or may not exist.  For instance, there could be a
procedure that attempts to pop a stack, returning the result if there is one, or
`nothing` if the stack has no elements.

`maybe` is a very common feature of statically-typed functional languages, and
you'll see it all over the place in Haskell and OCaml code. However, Guile is
dynamically typed, so we usually use ad-hoc `#f` values as the "null value"
instead of a proper "nothing" or "none".

Just for fun, though, we'll implement a proper `maybe` in Guile.  Fire up that
REPL once again, and let's import a bunch of modules that we'll need:

```scheme
(use-modules (ice-9 match)
             (srfi srfi-9))
```

We'll implement `maybe` as a record with two fields, `is?` and `value`.  If the
value contains something, `is?` will be `#t` and `value` will contain the thing
in question, and if it's empty, `is?`'ll be `#f`.

```scheme
(define-record-type <maybe>
  (make-maybe is? value)
  maybe?
  (is? maybe-is?)
  (value maybe-value))
```

Now we'll define constructors for the two possible states:

```scheme
(define (something value)
  (make-maybe #t value))

(define (nothing)
  (make-maybe #f #f)) ;the value here doesn't matter; we'll just use #f
```

And make some silly functions that return optional values:

```scheme
(define (remove-a str)
  (if (eq? (string-ref str 0) #\a)
      (something (substring str 1))
      (nothing)))

(define (remove-b str)
  (if (eq? (string-ref str 0) #\b)
      (something (substring str 1))
      (nothing)))
      
(remove-a "ahh")
⇒ #<<maybe> is?: #t value: "hh">

(remove-a "ooh")
⇒ #<<maybe> is?: #f value: #f>

(remove-b "bad")
⇒ #<<maybe> is?: #t value: "ad">
```

But what if we want to compose the results of these functions?

# Keeping Your Composure

As you might have guessed, this is not fun.  Cosplaying as a compiler backend
typically isn't.

```scheme
(let ((t1 (remove-a "abcd")))
  (if (maybe-is? t1)
      (remove-b (maybe-value t1))
      (nothing)))
⇒ #<<maybe> is?: #t value: "cd">

(let ((t1 (remove-a "bbcd")))
  (if (maybe-is? t1)
      (remove-b (maybe-value t1))
      (nothing)))
⇒ #<<maybe> is?: #f value: #f>
```

I can almost hear the heckling.  Even worse, composing three:

```scheme
(let* ((t1 (remove-a "abad"))
       (t2 (if (maybe-is? t1)
               (remove-b (maybe-value t1))
               (nothing))))
  (if (maybe-is? t2)
      (remove-a (maybe-value t2))
      (nothing)))
⇒ #<<maybe> is?: #t value: "d">
```

So, how do we go about making this more bearable?  Well, one way could be to
make `remove-a` and `remove-b` accept `maybe`s:

```scheme
(define (remove-a ?str)
  (match ?str
    (($ <maybe> #t str)
     (if (eq? (string-ref str 0) #\a)
         (something (substring str 1))
         (nothing)))
    (_ (nothing))))

(define (remove-b ?str)
  (match ?str
    (($ <maybe> #t str)
     (if (eq? (string-ref str 0) #\b)
         (something (substring str 1))
         (nothing)))
    (_ (nothing))))
```

Not at all pretty, but it works!

```scheme
(remove-b (remove-a (something "abc")))
⇒ #<<maybe> is?: #t value: "c">
```

Still, our procedures now require quite a bit of boilerplate.  Might there be a
better way?

# The Ties That `>>=` Us

First of all, we'll revert to our original definitions of `remove-a` and
`remove-b`, that is to say, the ones that take a regular value and return a
`maybe`.

```scheme
(define (remove-a str)
  (if (eq? (string-ref str 0) #\a)
      (something (substring str 1))
      (nothing)))

(define (remove-b str)
  (if (eq? (string-ref str 0) #\b)
      (something (substring str 1))
      (nothing)))
```

What if tried introducing higher-order procedures (procedures that accept other
procedures as arguments) into the equation?  Because we're functional
programmers and we have an unhealthy obsession with that sort of thing.

```scheme
(define (maybe-chain maybe proc)
  (if (maybe-is? maybe)
      (proc (maybe-value maybe))
      (nothing)))

(maybe-chain (something "abc")
             remove-a)
⇒ #<<maybe> is?: #t value: "bc">

(maybe-chain (nothing)
             remove-a)
⇒ #<<maybe> is?: #f value: #f>
```

It lives!  To make it easier to compose procedures like this, we'll define a
macro that allows us to perform any number of sequenced operations with only one
composition form:

```scheme
(define-syntax maybe-chain*
  (syntax-rules ()
    ((_ maybe proc)
     (maybe-chain maybe proc))
    ((_ maybe proc rest ...)
     (maybe-chain* (maybe-chain maybe proc)
                   rest ...))))

(maybe-chain* (something "abad")
              remove-a
              remove-b
              remove-a)
⇒ #<<maybe> is?: #t value: "d">
```

Congratulations, you've just implemented the `bind` operation, commonly written
as `>>=`, for our `maybe` type.  And it turns out that a monad is just any
container-like value for which `>>=` (along with another procedure called
`return`, which wraps a given value in the simplest possible form of a monad)
has been implemented.

A more formal definition would be that a monad is a mathematical object composed
of three parts: a type, a `bind` function, and a `return` function.  So, how do
monads relate to Guix?

# New Wheel, Old Wheel

Now that we've reinvented the wheel, we'd better learn to use the original
wheel.  Guix provides a generic, high-level monads library, along with the two
generic monads `%identity-monad` and `%state-monad`, and the Guix-specific
`%store-monad`.  Since `maybe` is not one of them, let's integrate our version
into the Guix monad system!

First we'll import the module that provides the aforementioned library:

```scheme
(use-modules (guix monads))
```

To define a monad's behaviour in Guix, we simply use the `define-monad` macro,
and provide two procedures: `bind`, and `return`.

```scheme
(define-monad %maybe-monad
  (bind maybe-chain)
  (return something))
```

`bind` is just the procedure that we use to compose monadic procedure calls
together, and `return` is the procedure that wraps values in the most basic form
of the monad.  A properly implemented `bind` and `return` must follow the
so-called _monad laws_:

1. `(bind (return x) proc)` must be equivalent to `(proc x)`.
2. `(bind monad return)` must be equivalent to just `monad`.
3. `(bind (bind monad proc-1) proc-2)` must be equivalent to
   `(bind monad (lambda (x) (bind (proc-1 x) proc-2)))`.

Let's verify that our `maybe-chain` and `something` procedures adhere to the
monad laws:

```scheme
(define (mlaws-proc-1 x)
  (something (+ x 1)))

(define (mlaws-proc-2 x)
  (something (+ x 2)))
  
;; First law: the left identity.
(equal? (maybe-chain (something 0)
                     mlaws-proc-1)
        (mlaws-proc-1 0))
⇒ #t
 
;; Second law: the right identity.
(equal? (maybe-chain (something 0)
                     something)
        (something 0))
⇒ #t

;; Third law: associativity.
(equal? (maybe-chain (maybe-chain (something 0)
                                  mlaws-proc-1)
                     mlaws-proc-2)
        (maybe-chain (something 0)
                     (lambda (x)
                       (maybe-chain (mlaws-proc-1 x)
                                    mlaws-proc-2))))
⇒ #t
```

Now that we know they're valid, we can use the `with-monad` macro to tell Guix
to use these specific implementations of `bind` and `return`, and the `>>=`
macro to thread monads through procedure calls!

```scheme
(with-monad %maybe-monad
  (>>= (something "aabbc")
       remove-a
       remove-a
       remove-b
       remove-b))
⇒ #<<maybe> is?: #t value: "c">
```

We can also now use `return`:

```scheme
(with-monad %maybe-monad
  (return 32))
⇒ #<<maybe> is?: #t value: 32>
```

But Guix provides many higher-level interfaces than `>>=` and `return`, as we
will see.  There's `mbegin`, which evaluates monadic expressions without binding
them to symbols, returning the last one.  This, however, isn't particularly
useful with our `%maybe-monad`, as it's only really usable if the monadic
operations within have side effects, just like the non-monadic `begin`.

There's also `mlet` and `mlet*`, which _do_ bind the results of monadic
expressions to symbols, and are essentially equivalent to a chain of
`(>>= MEXPR (lambda (BINDING) ...))`:

```scheme
;; This is equivalent...
(mlet* %maybe-monad ((str -> "abad") ;non-monadic binding uses the -> symbol
                     (str1 (remove-a str))
                     (str2 (remove-b str)))
  (remove-a str))
⇒ #<<maybe> is?: #t value: "d">

;; ...to this:
(with-monad %maybe-monad
  (>>= (return "abad")
       (lambda (str)
         (remove-a str))
       (lambda (str1)
         (remove-b str))
       (lambda (str2)
         (remove-a str))))
```

Various abstractions over these two exist too, such as `mwhen` (a `when` plus an
`mbegin`), `munless` (an `unless` plus an `mbegin`), and `mparameterize`
(dynamically-scoped value rebinding, like `parameterize`, in a monadic context).
`lift` takes a procedure and a monad and creates a new procedure that returns
a monadic value.

There are also interfaces for manipulating lists wrapped in monads; `listm`
creates such a list, `sequence` turns a list of monads into a list wrapped in a
monad, and the `anym`, `mapm`, and `foldm` procedures are like their non-monadic
equivalents, except that they return lists wrapped in monads.

This is all well and good, you may be thinking, but why does Guix need a monad
library, anyway?  The answer is technically that it doesn't.  But building on
the monad API makes a lot of things much easier, and to learn why, we're going
to look at one of Guix's built-in monads.

# In a State

Guix implements a monad called `%state-monad`, and it works with single-argument
procedures returning two values.  Behold:

```scheme
(with-monad %state-monad
  (return 33))
⇒ #<procedure 21dc9a0 at <unknown port>:1106:22 (state)>
```

The `run-with-state` value turns this procedure into an actually useful value,
or, rather, two values:

```scheme
(run-with-state (with-monad %state-monad (return 33))
  (list "foo" "bar" "baz"))
⇒ 33
⇒ ("foo" "bar" "baz")
```

What can this actually do for us, though? Well, it gets interesting if we do
some `>>=`ing:

```scheme
(define state-seq
  (mlet* %state-monad ((number (return 33)))
    (state-push number)))
result
⇒ #<procedure 7fcb6f466960 at <unknown port>:1484:24 (state)>

(run-with-state state-seq (list 32))
⇒ (32)
⇒ (33 32)

(run-with-state state-seq (list 30 99))
⇒ (30 99)
⇒ (33 30 99)
```

What is `state-push`?  It's a monadic procedure for `%state-monad` that takes
whatever's currently in the first value (the primary value) and pushes it onto
the second value (the state value), which is assumed to be a list, returning the
old state value as the primary value and the new list as the state value.

So, when we do `(run-with-state result (list 32))`, we're passing `(list 32)` as
the initial state value, and then the `>>=` form passes that and `33` to
`state-push`.  What `%state-monad` allows us to do is thread together some
procedures that require some kind of state, while essentially pretending the
state value is stored globally, like you might do in, say, C, and then retrieve
both the final state and the result at the end!

If you're a bit confused, don't worry.  We'll write some of our own
`%state-monad`-based monadic procedures and hopefully all will become clear.
Consider, for instance, the
[Fibonacci sequence](https://en.wikipedia.org/wiki/Fibonacci_number), in which
each value is computed by adding the previous two.  We could use the
`%state-monad` to compute Fibonacci numbers by storing the previous number as
the primary value and the number before that as the state value:

```scheme
(define (fibonacci-thing value)
  (lambda (state)
    (values (+ value state)
            value)))
```

Now we can feed our Fibonacci-generating procedure the first value using
`run-with-state` and the second using `return`:

```scheme
(run-with-state
    (mlet* %state-monad ((starting (return 1))
                         (n1 (fibonacci-thing starting))
                         (n2 (fibonacci-thing n1)))
      (fibonacci-thing n2))
  0)
⇒ 3
⇒ 2

(run-with-state
    (mlet* %state-monad ((starting (return 1))
                         (n1 (fibonacci-thing starting))
                         (n2 (fibonacci-thing n1))
                         (n3 (fibonacci-thing n2))
                         (n4 (fibonacci-thing n3))
                         (n5 (fibonacci-thing n4)))
      (fibonacci-thing n5))
  0)
⇒ 13
⇒ 8
```

This is all very nifty, and possibly useful in general, but what does this have
to do with Guix?  Well, many Guix store-based operations are meant to be used
in concert with yet another monad, called the `%store-monad`.  But if we look at
`(guix store)`, where `%store-monad` is defined...

```scheme
(define-alias %store-monad %state-monad)
(define-alias store-return state-return)
(define-alias store-bind state-bind)
```

It was all a shallow façade!  All the "store monad" is is a special case of the
state monad, where a value representing the store is passed as the state value.

# Lies, Damned Lies, and Abstractions

We mentioned that, technically, we didn't need monads for Guix.  Indeed, many
(now deprecated) procedures take a store value as the argument, such as
`build-expression->derivation`.  However, monads are far more elegant and
simplify store code by quite a bit.

`build-expression->derivation`, being deprecated, should never of course be
used.  For one thing, it uses the "quoted build expression" style, rather than
G-expressions (we'll discuss gexps another time).  The best way to create a
derivation from some basic build code is to use the new-fangled
`gexp->derivation` procedure:

```scheme
(use-modules (guix gexp)
             (gnu packages irc))

(define symlink-irssi
  (gexp->derivation "link-to-irssi"
    #~(symlink #$(file-append irssi "/bin/irssi") #$output)))
⇒ #<procedure 7fddcc7b81e0 at guix/gexp.scm:1180:2 (state)>
```

You don't have to understand the `#~(...)` form yet, only everything surrounding
it.  We can see that this `gexp->derivation` returns a procedure taking the
initial state (store), just like our `%state-monad` procedures did, and like we
used `run-with-state` to pass the initial state to a `%state-monad` monadic
value, we use our old friend `run-with-store` when we have a `%store-monad`
monadic value!

```scheme
(define symlink-irssi-drv
  (with-store store
    (run-with-store store
      symlink-irssi)))
⇒ #<derivation /gnu/store/q7kwwl4z6psifnv4di1p1kpvlx06fmyq-link-to-irssi.drv => /gnu/store/6a94niigx4ii0ldjdy33wx9anhifr25x-link-to-irssi 7fddb7ef52d0>
```

Let's just check this derivation is as expected by reading the code from the
builder script.

```scheme
(define symlink-irssi-builder
  (list-ref (derivation-builder-arguments symlink-irssi-drv) 1))

(call-with-input-file symlink-irssi-builder
  (lambda (port)
    (read port)))
    
⇒ (symlink
   "/gnu/store/hrlmypx1lrdjlxpkqy88bfrzg5p0bn6d-irssi-1.4.3/bin/irssi"
   ((@ (guile) getenv) "out"))
```

And indeed, it symlinks the `irssi` binary to the output path.  Some other,
higher-level, monadic procedures include `interned-file`, which copies a file
from outside the store into it, and `text-file`, which copies some text into it.
Generally, these procedures aren't used, as there are higher-level procedures
that perform similar functions (which we will discuss later), but for the sake
of this blog post, here's an example:

```scheme
(with-store store
  (run-with-store store
    (text-file "unmatched-paren"
      "( <paren@disroot.org>")))
⇒ "/gnu/store/v6smacxvdk4yvaa3s3wmd54lixn1dp3y-unmatched-paren"
```

# Conclusion

What have we learned about monads?  The key points we can take away are:

1. Monads are a way of composing together procedures and values that are wrapped
   in containers that give them extra context, like `maybe` values.
2. Guix provides a high-level monad library that compensates for Guile's lack of
   static typing or an interface-like system.
3. The `(guix monads)` module provides the state monad, which allows you to
   thread state through procedures, allowing you to essentially pretend it's a
   global variable that's modified by each procedure.
4. Guix uses the store monad frequently to thread a store connection through
   procedures that need it.
5. The store monad is really just the state monad in disguise, where the state
   value is used to thread the store object through monadic procedures.

If you've read this post in its entirety but still don't yet quite get it, don't
worry.  Try to modify and tinker about with the examples, and ask any questions
on the IRC channel `#guix:libera.chat` and mailing list at `help-guix@gnu.org`,
and hopefully it will all click eventually!

#### About GNU Guix

[GNU Guix](https://guix.gnu.org) is a transactional package manager and
an advanced distribution of the GNU system that [respects user
freedom](https://www.gnu.org/distros/free-system-distribution-guidelines.html).
Guix can be used on top of any system running the Hurd or the Linux
kernel, or it can be used as a standalone operating system distribution
for i686, x86_64, ARMv7, AArch64 and POWER9 machines.

In addition to standard package management features, Guix supports
transactional upgrades and roll-backs, unprivileged package management,
per-user profiles, and garbage collection.  When used as a standalone
GNU/Linux distribution, Guix offers a declarative, stateless approach to
operating system configuration management.  Guix is highly customizable
and hackable through [Guile](https://www.gnu.org/software/guile)
programming interfaces and extensions to the
[Scheme](http://schemers.org) language.

debug log:

solving 7536733 ...
found 7536733 in https://yhetil.org/guix/20230216170015.5188-1-paren@disroot.org/

applying [1/1] https://yhetil.org/guix/20230216170015.5188-1-paren@disroot.org/
diff --git a/website/posts/dissecting-guix-2-store-monad.md b/website/posts/dissecting-guix-2-store-monad.md
new file mode 100644
index 0000000..7536733

1:84: trailing whitespace.
      
1:269: trailing whitespace.
  
1:275: trailing whitespace.
 
1:515: trailing whitespace.
    
Checking patch website/posts/dissecting-guix-2-store-monad.md...
Applied patch website/posts/dissecting-guix-2-store-monad.md cleanly.
warning: 4 lines add whitespace errors.

index at:
100644 75367331c571a0d61c937ef5535d01eaa76d8a82	website/posts/dissecting-guix-2-store-monad.md

(*) Git path names are given by the tree(s) the blob belongs to.
    Blobs themselves have no identifier aside from the hash of its contents.^

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.