From: ludo@gnu.org (Ludovic Courtès)
Subject: bug#30116: [PATCH] `substitute' crashes when file contains NUL characters (core-updates)
Date: Thu, 25 Jan 2018 12:11:30 +0100
Message-ID: <87shauxkf1.fsf@gnu.org>
In-Reply-To: <87po5y8qv5.fsf@gmail.com> (Maxim Cournoyer's message of "Thu, 25 Jan 2018 00:11:26 -0500")
To: Maxim Cournoyer
Cc: 30116@debbugs.gnu.org

Maxim Cournoyer writes:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Maxim Cournoyer writes:
>>
>>> In the `patch-el-files' phase of the emacs-build-system, we find the
>>> following snippet:
>>>
>>>         (with-directory-excursion el-dir
>>>           ;; Some old '.el' files (e.g., tex-buf.el in AUCTeX) are still encoded
>>>           ;; with the "ISO-8859-1" locale.
>>>           (unless (false-if-exception (substitute-cmd))
>>>             (with-fluids ((%default-port-encoding "ISO-8859-1"))
>>>               (substitute-cmd))))
>>>
>>> If an exception is raised while processing the file, the substitution
>>> is retried with the file opened in the "ISO-8859-1" encoding. Now, this
>>> resolves to a call to `open-file', whose documentation says:
>>>
>>>      ‘b’
>>>           Use binary mode, ensuring that each byte in the file will be
>>>           read as one Scheme character.
>>>
>>>           To provide this property, the file will be opened with the
>>>           8-bit character encoding "ISO-8859-1", ignoring the default
>>>           port encoding.  *Note Ports::, for more information on port
>>>           encodings.
>>>
>>> So, by opening a file whose encoding is unknown as an ISO-8859-1 file,
>>> we are doing the same as if we had passed the 'binary option. Could this
>>> explain why we end up with NUL characters where we were expecting text?
>>
>> That could be the reason.  Guile provides a way to honor Emacs-style
>> ‘encoding’ declarations, and ‘call-with-input-file’ does that if we pass
>> #:guess-encoding #t (info "(guile) Character Encoding of Source Files").
>>
>> Did the faulty file have such a declaration?
>
> Sadly, it doesn't.  Although even if it did, I don't think it would be
> very robust to expect every misbehaving file we might encounter to
> include one!

Sure, I was asking just because it’s an Emacs-related package.

> So I think we should apply my v2 patch to core-updates for now (see my
> previous reply on this thread), until we have our substitute routine
> implemented using srfi-115!

Sounds good!  Note that I’ll wait until after the current ‘core-updates’
has been merged.  Please do ping me if you think I’ve forgotten!

Thanks,
Ludo’.
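
Illustrative note on the encoding point discussed above: the sketch below
is not taken from the patch or from the emacs-build-system. It only
demonstrates, under the assumption that Guile's (ice-9 iconv) and
(rnrs bytevectors) modules are available, why decoding arbitrary bytes
as "ISO-8859-1" always succeeds and yields one character per byte, so
that a NUL byte in the file becomes a NUL character in the string. The
byte values and the names `raw' and `latin1' are made up for the example.

```scheme
(use-modules (rnrs bytevectors)   ; u8-list->bytevector
             (ice-9 iconv))       ; bytevector->string

;; Hypothetical file contents: "a", a NUL byte, 0xE9 ("é" in Latin-1,
;; but not a valid UTF-8 sequence when followed by "b"), then "b".
(define raw (u8-list->bytevector '(97 0 233 98)))

;; ISO-8859-1 maps every byte 0..255 to the code point of the same
;; value, so this conversion cannot fail, whatever the input is.
(define latin1 (bytevector->string raw "ISO-8859-1"))

;; One character per byte: length 4, with the NUL character at index 1.
(format #t "~a characters, NUL at index ~a~%"
        (string-length latin1)
        (string-index latin1 (integer->char 0)))

;; By contrast, a strict UTF-8 decode of the same bytes is expected to
;; raise a decoding error; `false-if-exception' turns that into #f,
;; which is the condition the fallback in the quoted snippet relies on.
(define utf8-attempt
  (false-if-exception (bytevector->string raw "UTF-8" 'error)))
```

If this holds, the ISO-8859-1 retry behaves like binary mode: it never
rejects a file, and any NUL bytes reach `substitute' as NUL characters.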