From: Ludovic Courtès
To: 58320@debbugs.gnu.org
Cc: bug-hurd@gnu.org
Subject: bug#58320: Hurd VM fails to boot on AMD EPYC (kvm-amd)
Date: Mon, 17 Oct 2022 14:51:01 +0200
Message-ID: <87czaqsjey.fsf@gnu.org>
In-Reply-To: <8735c0f3d5.fsf@gnu.org> ("Ludovic Courtès"'s message of
 "Fri, 07 Oct 2022 10:24:22 +0200")
References: <87k05eouh8.fsf@inria.fr> <8735c1nlga.fsf@gnu.org>
 <20221006135316.ijevz5ddnet4aqkr@begin> <87r0zkfvso.fsf@gnu.org>
 <20221006224219.mn7zp7lhzxwlyrpx@begin> <8735c0f3d5.fsf@gnu.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.1 (gnu/linux)
List-Id: Bug reports for GNU Guix

Hi,

Ludovic Courtès skribis:

> … so ‘exec_load’ is doing its job, it seems.

Turns out that may not be the case.

Here’s a *bad* mapping on the second ‘task_resume’ breakpoint (when
‘exec’ is about to start):

--8<---------------cut here---------------start------------->8---
db> show all threads
TASK        THREADS
  0 gnumach (f5f7cf00): 7 threads:
              0 (f5f7be18) .W..N. 0xc11dac04
              1 (f5f7bcd0) R..O..(idle_thread_continue)
              2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4
              3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c
              4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0
              5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74
              6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8
  1 ext2fs (f5f7ce40): 6 threads:
              0 (f5f7b520) R....F
              1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0
              2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0
              3 (f5f7b000) .W.O..(mach_msg_continue) 0
              4 (f67d3e20) .W.O..(mach_msg_receive_continue) 0
              5 (f67d3cd8) .W.O..(mach_msg_continue) 0
  2 exec (f5f7cd80): (f5f7b3d8) ..SO..(thread_bootstrap_return)
db> trace
task_resume(f593e010,fb7d9010,f5f73e80,c106972a)
ipc_kobject_server(f593e000,3,18,0)+0x1eb
mach_msg_trap(bffff4c0,3,18,20,8)+0x1703
>>>>> user space <<<<<
db> x/tbx 0xcbc 0xf5f7b3d8
no memory is assigned to address 00000cbc
0
db> show map $map2
Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
size=290816,resident:225280,wired=0
version=13
 map entry 0xf625ec08: start=0x0, end=0x1000
 prot=1/7/copy, object=0x0, offset=0x0
 map entry 0xf625ebb0: start=0x1000, end=0x26000
 prot=5/7/copy, object=0xf5f6ad70, offset=0x0
  Object 0xf5f6ad70: size=0x25000, 1 references
  37 resident pages, 0 absent pages, 0 paging ops
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
   uninitialized,temporary internal,copy_strategy=0
   shadow=0x0 (offset=0x0),copy=0x0
 map entry 0xf625eb58: start=0x26000, end=0x34000
 prot=1/7/copy, object=0xf5f6ad20, offset=0x0
  Object 0xf5f6ad20: size=0xe000, 1 references
  14 resident pages, 0 absent pages, 0 paging ops
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730
   uninitialized,temporary internal,copy_strategy=0
   shadow=0x0 (offset=0x0),copy=0x0
 map entry 0xf625eb00: start=0x34000, end=0x37000
 prot=3/7/copy, object=0xf5f6acd0, offset=0x0
  Object 0xf5f6acd0: size=0x3000, 1 references
  3 resident pages,--db_more--
--8<---------------cut here---------------end--------------->8---

Compare with what a “good” mapping looks like at that same moment:

--8<---------------cut here---------------start------------->8---
start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1]Kernel Breakpoint trap, eip 0xc1030d5b
Breakpoint at task_resume: pushl %ebp
db> show all threads
TASK        THREADS
  0 gnumach (f5f7cf00): 7 threads:
              0 (f5f7be18) .W..N. 0xc11dac04
              1 (f5f7bcd0) R..O..(idle_thread_continue)
              2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4
              3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c
              4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0
              5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74
              6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8
  1 ext2fs (f5f7ce40): 6 threads:
              0 (f5f7b520) R....F
              1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0
              2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0
              3 (f5f7b000) .W.O..(mach_msg_continue) 0
              4 (f67d2e20) .W.O..(mach_msg_receive_continue) 0
              5 (f67d2cd8) .W.O..(mach_msg_continue) 0
  2 exec (f5f7cd80): (f5f7b3d8) ..SO..(thread_bootstrap_return)
db> x/tbx 0xcbc 0xf5f7b3d8
8
db> show map $map2
Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
size=290816,resident:229376,wired=0
version=14
 map entry 0xf625ec08: start=0x0, end=0x1000
 prot=1/7/copy, object=0xf5f6ad70, offset=0x0
  Object 0xf5f6ad70: size=0x1000, 1 references
  1 resident pages, 0 absent pages, 0 paging ops
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
   uninitialized,temporary internal,copy_strategy=0
   shadow=0x0 (offset=0x0),copy=0x0
 map entry 0xf625ebb0: start=0x1000, end=0x26000
 prot=5/7/copy, object=0xf5f6ad20, offset=0x0
  Object 0xf5f6ad20: size=0x25000, 1 references
  37 resident pages, 0 absent pages, 0 paging ops
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730
   uninitialized,temporary internal,copy_strategy=0
   shadow=0x0 (offset=0x0),copy=0x0
 map entry 0xf625eb58: start=0x26000, end=0x34000
 prot=1/7/copy, object=0xf5f6acd0, offset=0x0
  Object 0xf5f6acd0: size=0xe000, 1 references
  14 resident pages, 0 absent pages, 0 paging ops
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f826e0
   uninitialized,temporary internal,copy_strategy=0
   shadow=0x0 (offset=0x0),copy=0x0
 map entry 0xf625eb00: start=0x34000, end=0x37000
 prot=3/7/copy, object=0xf5f6ac80, offset=0x0
  Object 0xf5f6ac80: size=0x3000, 1 references
  3 resident pages, 0 absent pages, 0 paging ops
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82690
   uninitialized,temporary internal,copy_strategy=0
   shadow=0x0 (offset=0x0),copy=0x0
 map entry 0xf625eaa8: start=0xbfff0000, end=0xc0000000
 prot=3/7/copy, object=0xf5f6ac30, offset=0x0
  Object 0xf5f6ac30: size=0x10000, 1 references
  1 resident pages, 0 absent pages, 0 paging ops
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82640
   uninitialized,temporary internal,copy_strategy=0
   shadow=0x0 (offset=0x0),copy=0x0
--8<---------------cut here---------------end--------------->8---

Notice that 0xcbc reads a valid relocation, where 8 = R_386_RELATIVE.

In the “bad” case, the first map entry is empty, with no associated
memory object and zero resident pages.

My reading of ‘read_exec’ is that the page is supposed to be populated
eagerly by the ‘copyout’ call here:

--8<---------------cut here---------------start------------->8---
static int
read_exec(void *handle, vm_offset_t file_ofs, vm_size_t file_size,
          vm_offset_t mem_addr, vm_size_t mem_size,
          exec_sectype_t sec_type)
{
        struct multiboot_module *mod = handle;

        [...]
        err = vm_allocate(user_map, &start_page, end_page - start_page, FALSE);
        assert(err == 0);
        assert(start_page == trunc_page(mem_addr));

        if (file_size > 0)
        {
                err = copyout((char *)phystokv (mod->mod_start) + file_ofs,
                              (void *)mem_addr, file_size);
                assert(err == 0);
        }

        [...]

        return 0;
}
--8<---------------cut here---------------end--------------->8---

There are interesting tricks in ‘copyout_retry’ to fake a page fault so
the copy can actually be made, IIUC.  Could it be that this bit isn’t
quite working?

Ideas?

The problem with debugging this is that setting a breakpoint on
‘exec_load’ causes the system to boot fine (breaking on ‘task_resume’ is
fine though, go figure…).

Ludo’.
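
PS: For anyone wanting to redo the ‘x/tbx 0xcbc’ check by hand, here is
a minimal C sketch of how a relocation type is read out of an ELF32
‘r_info’ word.  It assumes the byte at 0xcbc is the low byte of such a
word in exec’s relocation table; the message above does not say which
entry it belongs to.  The struct and helper names are made up for
illustration and are not gnumach or glibc functions; only the ‘r_info’
layout and the values R_386_NONE = 0 and R_386_RELATIVE = 8 come from
the i386 ELF ABI.

```c
#include <stdint.h>

/* Relocation types from the i386 ELF psABI, defined locally so the
   sketch does not depend on <elf.h>.  */
#define R_386_NONE      0
#define R_386_RELATIVE  8

/* An ELF32 REL entry: r_info packs (symbol index << 8) | type.
   Hypothetical local copy of the layout, for illustration only.  */
struct elf32_rel
{
  uint32_t r_offset;            /* where the relocation applies */
  uint32_t r_info;              /* symbol index and type */
};

/* Hypothetical helper: extract the relocation type (low 8 bits).  */
static inline unsigned int
rel_type (uint32_t r_info)
{
  return r_info & 0xff;
}

/* Hypothetical helper: extract the symbol index (upper 24 bits).  */
static inline uint32_t
rel_sym (uint32_t r_info)
{
  return r_info >> 8;
}

/* Return nonzero if the entry is a RELATIVE relocation with symbol
   index 0, which is what the ABI requires for that type.  */
static inline int
looks_like_relative_reloc (const struct elf32_rel *rel)
{
  return rel_type (rel->r_info) == R_386_RELATIVE
    && rel_sym (rel->r_info) == 0;
}
```

With an ‘r_info’ word of 0x00000008, as the “good” run suggests,
‘rel_type’ yields R_386_RELATIVE; an all-zero word would decode as
R_386_NONE instead.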