From: Mark H Weaver
Subject: bug#30116: [PATCH] `substitute' crashes when file contains NUL characters (core-updates)
Date: Sun, 21 Jan 2018 13:17:45 -0500
Message-ID: <877esb84ae.fsf@netris.org>
References: <87r2qrc3mq.fsf@gmail.com> <87o9lu6o9m.fsf@gnu.org> <87607vu9dp.fsf@gmail.com>
In-Reply-To: <87607vu9dp.fsf@gmail.com> (Maxim Cournoyer's message of "Sat, 20 Jan 2018 23:24:34 -0500")
To: Maxim Cournoyer
Cc: 30116@debbugs.gnu.org

Maxim Cournoyer writes:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Maxim Cournoyer skribis:
>>
>>> I've encountered the following crash when trying to use substitute on a
>>> file which contains NUL characters:
>>
>> Yes, that’s because Guile’s ‘regexp-exec’ simply wraps libc’s ‘regexec’,
>> which does not handle NULs.
>>
>> We should consider switching to the pure-Scheme SRFI-115:
>>
>>   https://srfi.schemers.org/srfi-115/srfi-115.html
>
> This looks good, and I started looking into porting `substitute' to it,
> but quickly noticed it doesn't seem to be implemented in Guile yet?

Indeed.  SRFI-115 for Guile is on my TODO list, although it might be
better to wait until after we switch to using UTF-8 encoding internally
for strings, since that will drastically affect the implementation of
any efficient regexp matcher on Scheme strings.

Anyway, 'substitute*' is to be used only on text files, and NUL bytes
are not valid textual characters.  So, I think that this case is outside
of what 'substitute*' is meant to do, and therefore not a bug in
'substitute*', although of course a more graceful error would surely be
preferable.

What do you think?

      Mark
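
For readers unfamiliar with the libc limitation mentioned above, here is a
minimal C sketch of it.  It is not Guile's code and says nothing about how
Guile itself reports the problem; it only shows that POSIX regexec() takes
a NUL-terminated string, so bytes after an embedded NUL are never examined.
The helper names matches_c_string and contains_nul are hypothetical, chosen
for this illustration; contains_nul is one possible pre-check a caller
could use to produce a more graceful error.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if PATTERN matches the part of BUF that regexec() can see,
   0 if it does not, and -1 if PATTERN fails to compile.  regexec()
   expects a NUL-terminated string, so it stops at the first NUL byte. */
int matches_c_string(const char *pattern, const char *buf)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && buf != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, buf, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}

/* Hypothetical pre-check: return 1 if the LEN-byte buffer BUF contains
   a NUL byte, 0 otherwise.  The length is passed explicitly because
   strlen() would itself stop at the first NUL. */
int contains_nul(const char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, '\0', len) != NULL;
}
```

With the 7-byte buffer "foo\0bar", matches_c_string finds "foo" but not
"bar", while contains_nul flags the buffer as non-text before any matching
is attempted.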