From mboxrd@z Thu Jan 1 00:00:00 1970
From: ludo@gnu.org (Ludovic Courtès)
Subject: bug#30116: [PATCH] `substitute' crashes when file contains NUL characters (core-updates)
Date: Tue, 23 Jan 2018 15:11:04 +0100
Message-ID: <87o9lk64xz.fsf@gnu.org>
References: <87r2qrc3mq.fsf@gmail.com> <87o9lu6o9m.fsf@gnu.org> <87607vu9dp.fsf@gmail.com> <877esb84ae.fsf@netris.org> <87o9lmp3c2.fsf@gnu.org> <87k1w9ryhz.fsf@gmail.com>
In-Reply-To: <87k1w9ryhz.fsf@gmail.com> (Maxim Cournoyer's message of "Mon, 22 Jan 2018 23:27:04 -0500")
To: Maxim Cournoyer
Cc: 30116@debbugs.gnu.org

Maxim Cournoyer wrote:

> In the `patch-el-files' phase of the emacs-build-system, we find the
> following snippet:
>
>     (with-directory-excursion el-dir
>       ;; Some old '.el' files (e.g., tex-buf.el in AUCTeX) are still encoded
>       ;; with the "ISO-8859-1" locale.
>       (unless (false-if-exception (substitute-cmd))
>         (with-fluids ((%default-port-encoding "ISO-8859-1"))
>           (substitute-cmd))))
>
> In case an exception is raised while processing the file, the
> substitution is retried with the file opened in the "ISO-8859-1"
> encoding. Now, this resolves to a call to `open-file', whose
> documentation says:
>
>     ‘b’
>          Use binary mode, ensuring that each byte in the file will be
>          read as one Scheme character.
>
>          To provide this property, the file will be opened with the
>          8-bit character encoding "ISO-8859-1", ignoring the default
>          port encoding.  *Note Ports::, for more information on port
>          encodings.
>
> So, by opening a file whose encoding is unknown as an ISO-8859-1 file,
> we are doing the same as if we had passed the 'binary option. Could this
> explain why we end up with NUL characters where we were expecting text?

That could be the reason.

Guile provides a way to honor Emacs-style ‘encoding’ declarations, and
‘call-with-input-file’ does that if we pass #:guess-encoding #t (info
"(guile) Character Encoding of Source Files").

Did the faulty file have such a declaration?

Thanks,
Ludo’.