From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Marusich
Subject: bug#36687: guix gc: error: executing SQLite statement: database disk image is malformed
Date: Sat, 03 Aug 2019 02:11:22 -0700
Message-ID: <87wofuipfp.fsf@gmail.com>
References: <87v9w2ctpm.fsf@gmail.com> <87tvbmgndy.fsf@elephly.net> <87pnm6egm4.fsf@gmail.com> <87blxqfo25.fsf@elephly.net>
Mime-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature"
In-Reply-To: <87blxqfo25.fsf@elephly.net> (Ricardo Wurmus's message of "Fri, 19 Jul 2019 10:04:02 +0200")
List-Id: Bug reports for GNU Guix
To: Ricardo Wurmus
Cc: 36687@debbugs.gnu.org

--=-=-=
Content-Type: text/plain; charset=utf-8

Ricardo Wurmus writes:

> Chris Marusich writes:
>
>> is there anything we can check to understand why the database
>> corruption has occurred?
> […]
>> For the record, I do sometimes abruptly power off my machine, and it
>> does rarely abruptly crash (I think I have reported this when it occurs,
>> but I can't be sure).
>
> This can be a cause of corruption.
>
> Have you manually run fsck on the unmounted disk yet?

Yes, I have.  The file system containing the database is my root file
system, and it is ext4.  On my system, Guix runs e2fsck on the root file
system automatically on each boot [1].  However, just to be sure, I ran
it manually on the unmounted device with the -f option to force a check.
It reported the following (I've copied this output by hand):

--8<---------------cut here---------------start------------->8---
root@spaceship ~# e2fsck -f /dev/mapper/home
e2fsck 1.45.2 (27-May-2019)
Pass 1: Checking inodes, blocks, and sizes
Inode 290237 extent tree (at level 2) could be narrower.  Optimize? yes
Inode 291644 extent tree (at level 2) could be narrower.  Optimize? yes
Inode 446822 extent tree (at level 1) could be narrower.  Optimize? yes
Inode 3016396 extent tree (at level 2) could be narrower.  Optimize? yes
Pass 1E: Optimizing extent trees
Pass 2: Checking directory structure
Problem in HTREE directory inode 12451844: block #21606 has bad min hash
Invalid HTREE directory inode 12451844 (/gnu/store/.links).  Clear HTree index? yes
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Pass 5: Checking group summary information
root: ***** FILE SYSTEM WAS MODIFIED *****
root: 1761282/14000128 files (0.4% non-contiguous), 50020294/55985664 blocks
root@spaceship ~# echo $?
1
--8<---------------cut here---------------end--------------->8---

Unfortunately, when I rebooted and tried "guix gc", the same problem as
before occurred.  I saw that Mike Gerwitz encountered the same error, so
I thought I would try his work-around [2].

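As an aside on the e2fsck run above: the exit status of 1 printed by
"echo $?" is a bit mask, and bit 1 means "file system errors corrected".
Here is a minimal sketch of how the status can be decoded.  The bit
meanings follow the e2fsck(8) manual page; the helper name
decode_e2fsck_status is made up for illustration and is not part of any
tool.

```python
# Meaning of each bit in e2fsck's exit status, as listed in the
# e2fsck(8) manual page.  A status of 0 means "no errors".
E2FSCK_EXIT_BITS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def decode_e2fsck_status(status):
    """Return the conditions encoded in an e2fsck exit status.

    Hypothetical helper, written only to illustrate the bit mask.
    """
    if status == 0:
        return ["no errors"]
    return [text for bit, text in E2FSCK_EXIT_BITS.items() if status & bit]


if __name__ == "__main__":
    # The run quoted above exited with status 1.
    print(decode_e2fsck_status(1))
```

Read this way, the run above found problems and repaired them, and it did
not report any errors left uncorrected (bit 4 is not set).
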
Before proceeding, I stopped guix-daemon ("sudo herd stop guix-daemon")
and verified that no processes were accessing the database file ("sudo
fuser db.sqlite").  Then I made a backup of my entire /var/guix
directory.  I dumped the database to SQL and recreated it like so:

  sudo sqlite3 db.sqlite .dump > /tmp/db.sqlite.dump
  sudo rm db.sqlite
  sudo sqlite3 db.sqlite < /tmp/db.sqlite.dump

This succeeded without errors or warnings, and it reduced the size of
the db.sqlite file from 145 MiB to 49 MiB.  Although Mike's dump ended
with a "ROLLBACK" and needed to be manually modified, mine ended with a
"COMMIT" and did not require manual modification.

I then started guix-daemon ("sudo herd start guix-daemon") and attempted
to run "guix gc -C 1GiB", which succeeded.  I followed that up with
another "guix gc -C 40GiB", which also succeeded.

The sqlite documentation says that this method of dumping and recreating
the database is supported [3], so I think it's a valid work-around.
However, you never know...  Dumping and restoring the database file
obviously changed it, so even if the restored database is "equivalent"
to the original, it's still conceivable that the difference might cause
a problem somehow.  After all, obviously it was different enough that
guix-daemon stopped failing.  But maybe I'm being too pessimistic.

In any case, it's notable that guix-daemon claimed the database was
corrupt, even though sqlite3 didn't complain about it.  Since the size
was reduced, I suppose sqlite3 cleaned it up or optimized it somehow.
Maybe in doing so, it fixed something that guix-daemon didn't know how
to handle?  That's just a guess.

It's concerning that this "corruption" has occurred for two different
people recently.  We might want to check the Nix bug reports to see if
they've encountered a bug like this.

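For anyone who wants to experiment with the dump-and-recreate approach
without touching a real /var/guix, here is a rough sketch using Python's
standard sqlite3 module instead of the sqlite3 command-line tool.  It is
only an illustration under stated assumptions: the function names, the
toy table and the file names are made up, and it has not been run
against Guix's actual db.sqlite.  It also runs SQLite's own "PRAGMA
integrity_check" on the recreated file, which is one way to ask SQLite
directly whether it considers a database well-formed.

```python
import os
import sqlite3
import tempfile


def integrity_check(path):
    """Run SQLite's PRAGMA integrity_check; a healthy file yields ['ok']."""
    conn = sqlite3.connect(path)
    try:
        return [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()


def dump_and_recreate(old_path, new_path):
    """Dump old_path to SQL text and replay it into a new file.

    new_path must not already contain the dumped tables.  Returns the
    number of SQL statements replayed.
    """
    src = sqlite3.connect(old_path)
    try:
        statements = list(src.iterdump())
    finally:
        src.close()
    # Refuse to replay a dump that does not finish with COMMIT.
    if statements[-1].strip() != "COMMIT;":
        raise RuntimeError("dump did not end with COMMIT")
    dst = sqlite3.connect(new_path)
    try:
        dst.executescript("\n".join(statements))
    finally:
        dst.close()
    return len(statements)


if __name__ == "__main__":
    # Toy stand-in for db.sqlite; the schema here is invented.
    workdir = tempfile.mkdtemp()
    old = os.path.join(workdir, "old.sqlite")
    new = os.path.join(workdir, "new.sqlite")
    conn = sqlite3.connect(old)
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, path TEXT)")
    conn.executemany("INSERT INTO items (path) VALUES (?)",
                     [("/example/%d" % i,) for i in range(100)])
    conn.commit()
    conn.close()

    dump_and_recreate(old, new)
    print(integrity_check(new))
```

As with the shell commands above, the real thing should only be
attempted with guix-daemon stopped and a backup of /var/guix in hand.
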
I did a (very) cursory search and found only this possibly similar, but
possibly unrelated issue:

https://github.com/NixOS/nixpkgs/issues/3958

It may be worth looking harder before diving too deeply into debugging.

If you can think of anything I can do to drill into the files or debug
this more, please let me know.  If the database files do not contain
sensitive information, I would also be willing to share them with you
for further debugging.

Footnotes:
[1]  https://git.savannah.gnu.org/cgit/guix.git/tree/gnu/build/linux-boot.scm?id=afb986e77cd669c2f21953f501f7893237730ca7#n383

[2]  https://lists.gnu.org/archive/html/help-guix/2019-08/msg00012.html

[3]  https://sqlite.org/cli.html#dump

-- 
Chris

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEy/WXVcvn5+/vGD+x3UCaFdgiRp0FAl1FT7oACgkQ3UCaFdgi
Rp0unA//Q3adnK5tsXdLuLSHDYlhTWv7H+MBYVtuDT+l6AN3TVpPrmLrrXG/0qXN
RPww50/TgjDCl+1G99a3ocxBJvUSbnq2IssGylE+Ott06YRG5LTXxZfA9HZg7j2Y
skBm6I7YFFN0JCdNJCceobSaSWcl4ZWV+70RudarI6VOnku3PDmMM9Rz2a4+88wF
MSPa3piVK6j6KGI/VQZf3ocdOo8s3oQFSf+VBtvVnq5a+53wKLBOL8jVpjJ3jrpi
qYO92zvVs04FvTClLPjmWrV4HzSVotg2yf6egfSucLsFg2807YSmkz6mAYkkPIXk
Fei9kTRH17bKUGckmIEEhJJpHAYyKSYTEkVhBDOesreHWw4sGTQq+JoRImcAdarQ
gmwaCrw8+LadZrT2xRjh68DK7F69wFlOFO1Vc3E9p7oxWJWDi2Wl77uqbGQBZQVp
CiaAPM4/MgSPjggP/EtaOH8tUQsiNFmm+1C4FfqFRSs2vo3xdzcUgsq9SxPZz7Na
OxhnRT1P7asC57PdnAY/bvaIhuZfHWnBmjS2BxAlGrXS+Wt3vcKwnQJpnvJ3gSFv
LZWdf4g7WcnKIXKGpD1U/oZhJu9xqxAfJW8GLpDZkuY6+ULOkKMZaLtuRtamklxK
QiQA2EvXuFL86SdB5b2mCBiaW8p5aVoyyYnEXiCIPh852mGQ7tA=
=9ur8
-----END PGP SIGNATURE-----
--=-=-=--