all messages for Guix-related lists mirrored at yhetil.org
* bug#74501: ntpd segfaults at boot (take 2)
@ 2024-11-24  0:32 Fredrik Salomonsson
  2024-12-15  0:52 ` bug#74501: Problem confirmed Danny Milosavljevic
  0 siblings, 1 reply; 2+ messages in thread
From: Fredrik Salomonsson @ 2024-11-24  0:32 UTC (permalink / raw)
  To: 74501

Hi,

Similar to issue 73873 [0], I'm also seeing ntpd segfault at boot, and
it looks to be due to IPv6.  In /var/log/messages:
--8<---------------cut here---------------start------------->8---
Nov 23 16:13:41 localhost ntpd[1019]: ntpd 4.2.8p18@1.4062-o Thu Jan  1 00:00:01 UTC 1970 (1): Starting
Nov 23 16:13:41 localhost ntpd[1019]: Command line: /gnu/store/s4ra0g0ym1q1wh5jrqs60092x1nrb8h9-ntp-4.2.8p18/bin/ntpd -n -c /gnu/store/ghh3m9wzraszf7p4ynac006x96svddbq-ntpd.conf -u ntpd -g
Nov 23 16:13:41 localhost ntpd[1019]: ----------------------------------------------------
Nov 23 16:13:41 localhost ntpd[1019]: ntp-4 is maintained by Network Time Foundation,
Nov 23 16:13:41 localhost ntpd[1019]: Inc. (NTF), a non-profit 501(c)(3) public-benefit
Nov 23 16:13:41 localhost ntpd[1019]: corporation.  Support and training for ntp-4 are
Nov 23 16:13:41 localhost ntpd[1019]: available at https://www.nwtime.org/support
Nov 23 16:13:41 localhost ntpd[1019]: ----------------------------------------------------
Nov 23 16:13:41 localhost ntpd[1019]: DEBUG behavior is enabled - a violation of any diagnostic assertion will cause ntpd to abort
Nov 23 16:13:41 localhost ntpd[1019]: proto: precision = 0.040 usec (-24)
Nov 23 16:13:41 localhost ntpd[1019]: baseday_set_day: invalid day (25556), UNIX epoch substituted
Nov 23 16:13:41 localhost ntpd[1019]: basedate set to 1970-01-01
Nov 23 16:13:41 localhost ntpd[1019]: gps base set to 1980-01-06 (week 0)
Nov 23 16:13:41 localhost ntpd[1019]: Listen and drop on 0 v6wildcard [::]:123
Nov 23 16:13:41 localhost ntpd[1019]: Listen and drop on 1 v4wildcard 0.0.0.0:123
Nov 23 16:13:41 localhost ntpd[1019]: Listen normally on 2 lo 127.0.0.1:123
Nov 23 16:13:41 localhost ntpd[1019]: Listen normally on 3 enp37s0 192.168.1.8:123
Nov 23 16:13:41 localhost vmunix: [   22.648239] ntpd[1019]: segfault at 24 ip 000055fe102ab29b sp 00007ffc26382ca0 error 4 in ntpd[7f29b,55fe1023e000+86000] likely on CPU 0 (core 0, socket 0)
Nov 23 16:13:41 localhost ntpd[1019]: Listen normally on 4 lo [::1]:123
Nov 23 16:13:41 localhost vmunix: [   22.649529] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
Nov 23 16:13:41 localhost ntpd[1019]: bind(21) AF_INET6 [2001:REDACTED:cedf]:123 flags 0x11 failed: Cannot assign requested address
Nov 23 16:13:41 localhost ntpd[1019]: unable to create socket on enp37s0 (5) for [2001:REDACTED:cedf]:123
Nov 23 16:13:41 localhost shepherd[1]: Service ntpd (PID 1019) terminated with signal 11. 
Nov 23 16:13:41 localhost shepherd[1]: Service ntpd has been disabled. 
Nov 23 16:13:41 localhost shepherd[1]:   (Respawning too fast.) 
--8<---------------cut here---------------end--------------->8---

And `sudo dmesg`:

--8<---------------cut here---------------start------------->8---
[   21.871447] ntpd[954]: segfault at 24 ip 000055abbdf0029b sp 00007ffebf673770 error 4 in ntpd[7f29b,55abbde93000+86000] likely on CPU 7 (core 9, socket 0)
[   21.871453] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.002809] ntpd[1005]: segfault at 24 ip 000055ac349d229b sp 00007fff8be14a00 error 4 in ntpd[7f29b,55ac34965000+86000] likely on CPU 12 (core 0, socket 0)
[   22.002863] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.131272] ntpd[1008]: segfault at 24 ip 0000556dc1ad529b sp 00007ffef46b9d50 error 4 in ntpd[7f29b,556dc1a68000+86000] likely on CPU 3 (core 3, socket 0)
[   22.132111] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.264012] ntpd[1011]: segfault at 24 ip 000055e02824f29b sp 00007fffa1e29970 error 4 in ntpd[7f29b,55e0281e2000+86000] likely on CPU 4 (core 4, socket 0)
[   22.264019] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.390893] ntpd[1014]: segfault at 24 ip 0000555b2757129b sp 00007ffe2d0ea050 error 4 in ntpd[7f29b,555b27504000+86000] likely on CPU 4 (core 4, socket 0)
[   22.390898] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.517794] ntpd[1016]: segfault at 24 ip 000056387455529b sp 00007ffde75cabf0 error 4 in ntpd[7f29b,5638744e8000+86000] likely on CPU 4 (core 4, socket 0)
[   22.518953] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[   22.648239] ntpd[1019]: segfault at 24 ip 000055fe102ab29b sp 00007ffc26382ca0 error 4 in ntpd[7f29b,55fe1023e000+86000] likely on CPU 0 (core 0, socket 0)
[   22.649529] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
--8<---------------cut here---------------end--------------->8---

It's been doing that since around the time of issue 73873 [0].  I double
checked and it does use the 2.guix.pool.ntp.org pool.  I also reverted
to 0.guix.pool.ntp.org in case that would work for some reason.  ntpd
segfaults with both.  Did 2.guix.pool.ntp.org stop supporting IPv6?

Thanks

[0] https://issues.guix.gnu.org/73873
-- 
s/Fred[re]+i[ck]+/Fredrik/g





* bug#74501: Problem confirmed
  2024-11-24  0:32 bug#74501: ntpd segfaults at boot (take 2) Fredrik Salomonsson
@ 2024-12-15  0:52 ` Danny Milosavljevic
  0 siblings, 0 replies; 2+ messages in thread
From: Danny Milosavljevic @ 2024-12-15  0:52 UTC (permalink / raw)
  To: 74501

Hi,

I also have this problem on x86_64 znver3.

I disassembled my "Code:" block and I get:

8b 04 25 28 00 00 00    mov    eax, DWORD PTR ds:0x28
48 89 44 24 08          mov    QWORD PTR [rsp+0x8], rax
31 c0                   xor    eax, eax
e8 dc 2d f9 ff          call   <relative_address>
44 8b 28                mov    r13d, DWORD PTR [rax]
48 89 c5                mov    rbp, rax
e8 61 9e ff ff          call   <relative_address>
49 89 c4                mov    r12, rax
48 85 db                test   rbx, rbx
0f 84 e5 00 00 00       je     <forward_jump>
<44> 0f b7 0b           movzx  r9d, WORD PTR [rbx]         ; <-- This is where <44> is
66 41 83 f9 02          cmp    r9w, 0x2
0f 84 f6 00 00 00       je     <forward_jump>
66 41 83 f9 0a          cmp    r9w, 0xa
74 57                   je     <forward_jump>
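
To cross-check the numbers, here is a rough sketch that pulls the fields out of the kernel's "segfault at" line and locates the marked byte in the "Code:" line.  Both strings are copied from the log earlier in this thread; the function names and the regular expression are my own, and the first value inside the brackets is captured but not interpreted.  Since the marked instruction reads from [rbx] and the kernel reports the fault address as 24, rbx presumably held 0x24: not NULL, so the "test rbx, rbx" check passes, but still an unmapped low address.

```python
import re

# Both strings are copied from the log earlier in this thread.
SEGFAULT_LINE = (
    "ntpd[1019]: segfault at 24 ip 000055fe102ab29b sp 00007ffc26382ca0 "
    "error 4 in ntpd[7f29b,55fe1023e000+86000] likely on CPU 0 (core 0, socket 0)"
)
CODE_LINE = (
    "8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 "
    "48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b "
    "66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57"
)

# Hypothetical pattern for this log format.  All numeric fields are hex.
# "first" is the first value inside the brackets; it is left uninterpreted.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<error>[0-9a-f]+) in (?P<name>[^\[]+)"
    r"\[(?P<first>[0-9a-f]+),(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return the fields of a 'segfault at' line, numbers converted from hex."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    fields = {k: int(v, 16) for k, v in m.groupdict().items() if k != "name"}
    fields["name"] = m.group("name")
    return fields


def split_code(code):
    """Return (bytes before the marked byte, marked byte, bytes after it)."""
    tokens = code.split()
    idx = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    marked = int(tokens[idx].strip("<>"), 16)
    return idx, marked, len(tokens) - idx - 1


info = parse_segfault(SEGFAULT_LINE)
ip_offset = info["ip"] - info["base"]  # offset of the faulting ip in the mapping
before, marked, after = split_code(CODE_LINE)

print(f"fault address: {info['addr']:#x}, error code: {info['error']:#x}")
print(f"ip is {ip_offset:#x} bytes into the {info['size']:#x}-byte mapping")
print(f"{before} bytes precede the marked byte {marked:#x}, {after} follow")
```

The offset of the instruction pointer within the mapping is the same for every crash in the dmesg output (each ip ends in 29b and each base in 000), which suggests the same instruction faults each time.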

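The angle brackets in the kernel's "Code:" line mark the first byte of the faulting instruction.  Here that byte, 0x44, is a REX prefix with the R bit set; it extends the ModRM reg field so that the destination is an extended register (r9d in this case).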
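
As a sanity check on that reading, here is a small sketch that splits the REX prefix and the ModRM byte of "44 0f b7 0b" into their bit fields.  The helper names are mine; the field layout is the standard x86-64 instruction encoding.

```python
def decode_rex(byte):
    """Split a REX prefix (0100WRXB) into its four flag bits."""
    if byte >> 4 != 0b0100:
        raise ValueError(f"{byte:#x} is not a REX prefix")
    return {
        "W": (byte >> 3) & 1,  # 64-bit operand size
        "R": (byte >> 2) & 1,  # extends ModRM.reg
        "X": (byte >> 1) & 1,  # extends SIB.index
        "B": byte & 1,         # extends ModRM.rm (or SIB.base)
    }


def decode_modrm(byte):
    """Split a ModRM byte into mod (2 bits), reg (3 bits) and rm (3 bits)."""
    return {"mod": byte >> 6, "reg": (byte >> 3) & 7, "rm": byte & 7}


# Faulting instruction bytes: 44 0f b7 0b  (0f b7 is the movzx opcode)
rex = decode_rex(0x44)
modrm = decode_modrm(0x0B)

dest_reg = (rex["R"] << 3) | modrm["reg"]  # register number of the destination
base_reg = (rex["B"] << 3) | modrm["rm"]   # register number used as the address

print(rex, modrm, dest_reg, base_reg)
```

Register number 9 is r9 and register number 3 is rbx; mod = 0 means a plain memory operand with no displacement, which matches "movzx r9d, WORD PTR [rbx]".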

The error code is a combination of several error bits defined in fault.c in the Linux kernel:

/*
 * Page fault error code bits:
 *
 *   bit 0 ==    0: no page found       1: protection fault
 *   bit 1 ==    0: read access         1: write access
 *   bit 2 ==    0: kernel-mode access  1: user-mode access
 *   bit 3 ==                           1: use of reserved bit detected
 *   bit 4 ==                           1: fault was an instruction fetch
 *   bit 5 ==                           1: protection keys block access
 *   bit 6 ==                           1: shadow stack access fault
 *   bit 15 =                           1: SGX MMU page-fault
 */
enum x86_pf_error_code {
        X86_PF_PROT     =               1 << 0,
        X86_PF_WRITE    =               1 << 1,
        X86_PF_USER     =               1 << 2,
        X86_PF_RSVD     =               1 << 3,
        X86_PF_INSTR    =               1 << 4,
        X86_PF_PK       =               1 << 5,
        X86_PF_SHSTK    =               1 << 6,
        X86_PF_SGX      =               1 << 15,
};

Since ntpd is a user-mode program, X86_PF_USER is set and the error code is at least 4.

The error code here is exactly 4, so only X86_PF_USER is set: the faulty memory access is a user-mode read of an address for which no page was found.

In total:

- User-mode access.
- Read access.
- No page found.




