unofficial mirror of bug-guix@gnu.org 
* bug#52654: shepherd lacks error reporting
@ 2021-12-19  5:13 raingloom
  2021-12-19  6:02 ` raingloom
  0 siblings, 1 reply; 3+ messages in thread
From: raingloom @ 2021-12-19  5:13 UTC (permalink / raw)
  To: 52654

[-- Attachment #1: Type: text/plain, Size: 1114 bytes --]

I'm writing a one-shot shepherd-service that expands the (ext4) root
file system on first boot. I used the hostname service as a template
and pass the script directly as a G-expression instead of using
`make-forkexec-constructor`.
Of course there is a bug in it. Trouble is, I have no idea what it is,
because Shepherd won't tell me. :)
The VM boots and completes the ssh initialization phase and then
apparently just gets stuck. Doesn't even show a login prompt.
It's... not a great debugging experience.
I'm going to attempt to at the very least add some error reporting.
It would also be really nice if the failure modes for Shepherd services
were better documented, like what happens when the procedure passed in
the `start` field fails, or is not even a procedure, etc.
Since I never touched Shepherd internals, help would be greatly
appreciated.

P.S.: I'm attaching the system definition for completeness's sake and so
that someone might point out where the error is, but honestly the exact
bug in my code does not matter for the feature. All that matters is that
there is an error, and it should be logged but isn't.

[-- Attachment #2: cloud-deploy-bootstrap.scm --]
[-- Type: text/x-scheme, Size: 2928 bytes --]

(define-module (raingloom machines cloud-deploy-bootstrap))
(use-modules
 (gnu)
 (gnu system nss)
 (guix channels)
 (guix modules))
 

(use-service-modules
 admin
 networking
 shepherd
 ssh)

(use-package-modules
 admin
 bootloaders
 certs
 gnome
 linux
 networking
 ssh
 tmux
 tls
 version-control)

(define disk "/dev/vda")
(define partition "2")

(define ext-autoexpand-service-type
  (let
      ((name 'ext-autoexpand)
       (desc
        "Automatically expand ext2 root")
       (modules
        '((ice-9 popen))))
    (shepherd-service-type
     name
     (lambda (config)
       (shepherd-service
        (documentation desc)
        (provision (list name))
        (requirement '(file-systems))
        (one-shot? #t)
        (start
         (with-imported-modules
             (source-module-closure modules)
           #~(begin
               (use-modules #$@modules)
               (let ((port
                      (open-pipe*
                       OPEN_WRITE
                       #$(file-append util-linux "/sbin/sfdisk")
                       ;; don't check if the block is in use
                       ;; it is, and we don't care.
                       "--no-reread"
                       #$disk
                       "-N" #$partition)))
                 (display ",+" port)
                 (close-port port))
               (system* #$(file-append util-linux "/sbin/partx") "--update" #$disk)
               (system*
                #$(file-append e2fsprogs "/sbin/resize2fs")
                (string-append #$disk #$partition)))))))
     (description desc))))

(define-public %system
  (operating-system
   (host-name "cloud-deploy-bootstrap")
   (timezone "Europe/Budapest")
   (locale "en_US.utf8")
   (keyboard-layout (keyboard-layout "us"))
   (bootloader (bootloader-configuration
                (bootloader grub-bootloader)
                (targets '("/dev/vda"))
                (keyboard-layout keyboard-layout)))
   (file-systems (append
                  (list (file-system
                         (device (file-system-label "cloudimg-rootfs"))
                         (mount-point "/")
                         (type "btrfs")))
                  %base-file-systems))

   ;; This is where we specify system-wide packages.
   (packages (append (list
                      nss-certs
                      tmux)
                     %base-packages))

   (services
    (append
     (list
      (service ext-autoexpand-service-type #f)
      (service dhcp-client-service-type)
      (service openssh-service-type
               (openssh-configuration
                (openssh openssh-sans-x)
                (permit-root-login #t)
                (authorized-keys
                 `(("root" ,(local-file (string-append (getenv "HOME") "/.ssh/id_ed25519.pub"))))))))
     %base-services))
   ;; Allow resolution of '.local' host names with mDNS.
   (name-service-switch %mdns-host-lookup-nss)))

%system


* bug#52654: shepherd lacks error reporting
  2021-12-19  5:13 bug#52654: shepherd lacks error reporting raingloom
@ 2021-12-19  6:02 ` raingloom
  2023-04-29 15:22   ` Maxim Cournoyer
  0 siblings, 1 reply; 3+ messages in thread
From: raingloom @ 2021-12-19  6:02 UTC (permalink / raw)
  To: 52654

On Sun, 19 Dec 2021 06:13:20 +0100
raingloom <raingloom@riseup.net> wrote:

> I'm writing a single-shot shepherd-service that expands the (ext4)
> root file system on first boot, using the hostname service as a
> template, just passing the script as a G-expression, instead of using
> the forkexec constructor.
> Of course there is a bug in it. Trouble is, I have no idea what it is,
> because Shepherd won't tell me. :)
> The VM boots and completes the ssh initialization phase and then
> apparently just gets stuck. Doesn't even show a login prompt.
> It's... not a great debugging experience.
> I'm going to attempt to at the very least add some error reporting.
> It would also be really nice if the failure modes for Shepherd
> services were better documented, like what happens when the procedure
> passed in the `start` field fails, or is not even a procedure, etc.
> Since I never touched Shepherd internals, help would be greatly
> appreciated.
> 
> ps.: I'm attaching the system definition for completeness's sake and
> so that someone might point out where the error is, but honestly the
> exact bug in my code does not matter for the feature. All that
> matters is there is an error and it should be logged but isn't.

So the error in my config turned out to be the G-expression not
evaluating to a lambda, but the issue with Shepherd still stands.
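
For reference, here is a rough sketch of what a `start` that does evaluate
to a procedure could look like. It is only an illustration under some
assumptions: `disk` and `partition` are the host-side strings from the
attachment, `%default-modules` is the list exported by (gnu services
shepherd), and passing (ice-9 popen) through the `modules` field is
assumed to be enough, since that module ships with Guile. I have not
tested it in a VM.

```scheme
;; Sketch only, not a tested service definition.
;; `disk' and `partition' are the host-side strings from the attachment.
(shepherd-service
 (documentation "Automatically expand the root file system.")
 (provision '(ext-autoexpand))
 (requirement '(file-systems))
 (one-shot? #t)
 ;; Modules visible to the `start' code when Shepherd evaluates it.
 (modules `((ice-9 popen) ,@%default-modules))
 (start
  ;; The G-expression now evaluates to a procedure, as `start' expects.
  #~(lambda _
      (let ((port (open-pipe* OPEN_WRITE
                              #$(file-append util-linux "/sbin/sfdisk")
                              ;; Don't check whether the device is in use.
                              "--no-reread" #$disk "-N" #$partition)))
        (display ",+" port)
        (close-pipe port))
      (system* #$(file-append util-linux "/sbin/partx") "--update" #$disk)
      ;; Report success only if resize2fs exits with status 0.
      (zero? (system* #$(file-append e2fsprogs "/sbin/resize2fs")
                      #$(string-append disk partition))))))
```

The host-side values are spliced in with `#$`, so the code that runs
inside Shepherd does not refer to variables that only exist in the
system configuration.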





* bug#52654: shepherd lacks error reporting
  2021-12-19  6:02 ` raingloom
@ 2023-04-29 15:22   ` Maxim Cournoyer
  0 siblings, 0 replies; 3+ messages in thread
From: Maxim Cournoyer @ 2023-04-29 15:22 UTC (permalink / raw)
  To: raingloom; +Cc: 52654

Hi,

I also encountered that issue; it's really puzzling.

Here's the problematic start slot that got my mpd test to hang the boot,
with the last message being "Please wait while gathering entropy to
generate the key pair;":

--8<---------------cut here---------------start------------->8---
(start
        (with-imported-modules (source-module-closure
                                '((gnu build activation)))
          #~(begin
              (use-modules (gnu build activation))

              (let ((user (getpw #$username)))

                (define (init-directory directory)
                  (unless (file-exists? directory)
                    (mkdir-p/perms directory user #o755)))

                (for-each
                 init-directory
                 (cons '#$(map dirname
                               ;; XXX: Delete the potential "syslog"
                               ;; log-file value, which is not a directory.
                               (delete "syslog"
                                       (filter-map maybe-value
                                                   (list db-file
                                                         log-file
                                                         state-file
                                                         sticker-file)))))))

              (make-forkexec-constructor
               (list #$(file-append package "/bin/mpd") "--no-daemon"
                     #$config-file)
               #:environment-variables '#$environment-variables))))
--8<---------------cut here---------------end--------------->8---

The error was the lonely cons, which was called with a single
argument.  Taking it out, the test then passed:

--8<---------------cut here---------------start------------->8---
(start
        (with-imported-modules (source-module-closure
                                '((gnu build activation)))
          #~(begin
              (use-modules (gnu build activation))

              (let ((user (getpw #$username)))

                (define (init-directory directory)
                  (unless (file-exists? directory)
                    (mkdir-p/perms directory user #o755)))

                (for-each
                 init-directory
                 '#$(map dirname
                         ;; XXX: Delete the potential "syslog"
                         ;; log-file value, which is not a directory.
                         (delete "syslog"
                                 (filter-map maybe-value
                                             (list db-file
                                                   log-file
                                                   state-file
                                                   sticker-file))))))

              (make-forkexec-constructor
               (list #$(file-append package "/bin/mpd") "--no-daemon"
                     #$config-file)
               #:environment-variables '#$environment-variables))))
--8<---------------cut here---------------end--------------->8---

Shepherd should report the error, fail that one service and attempt to
keep booting (if the service is not required by other critical ones).
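
Until something like that exists, a service author can approximate it by
hand. The snippet below is a minimal sketch in plain Guile, not Shepherd
code: `call-with-error-logging` is a hypothetical helper name, and the
idea is simply to wrap whatever code might throw in `catch`, print the
exception, and return #f so that the failure is visible instead of
silent.

```scheme
;; Hypothetical helper, not part of Shepherd or Guix.
(define (call-with-error-logging name thunk)
  "Call THUNK.  If it throws, log the exception for service NAME and
return #f instead of letting the exception propagate."
  (catch #t
    thunk
    (lambda (key . args)
      (format (current-error-port)
              "service ~a: start failed: ~s ~s~%" name key args)
      #f)))

;; A one-argument `cons', like the one in the message above, is now
;; logged and turned into #f rather than raised as an unhandled error.
(call-with-error-logging 'mpd
  (lambda ()
    (for-each display (cons '("a" "b")))))
```

The same `catch` form could be placed around the body of a start slot
inside the G-expression; that only helps the author of the one service,
though, and does not change how Shepherd itself behaves.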

-- 
Thanks,
Maxim



