unofficial mirror of guix-devel@gnu.org 
From: ludo@gnu.org (Ludovic Courtès)
To: Mark H Weaver <mhw@netris.org>
Cc: guix-devel@gnu.org
Subject: Re: 01/02: utils: Change 'patch-shebangs' to use binary input.
Date: Sat, 28 Feb 2015 15:50:23 +0100
Message-ID: <87vbimjdjk.fsf@gnu.org>
In-Reply-To: <87r3tay7xv.fsf@netris.org> (Mark H. Weaver's message of "Fri, 27 Feb 2015 23:30:04 -0500")

Mark H Weaver <mhw@netris.org> writes:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> commit ca1e3ad2faa59d5b32289f84e0937fa476e21a1a
>> Author: Ludovic Courtès <ludo@gnu.org>
>> Date:   Sat Feb 28 01:01:51 2015 +0100
>>
>>     utils: Change 'patch-shebangs' to use binary input.
>>     
>>     * guix/build/utils.scm (get-char*): New procedure.
>>       (patch-shebang): Use it instead of 'read-char'.
>>       (fold-port-matches): Remove local 'get-char' and use 'get-char*'
>>       instead.
>> ---
>>  guix/build/utils.scm |   22 +++++++++++-----------
>>  1 files changed, 11 insertions(+), 11 deletions(-)
>>
>> diff --git a/guix/build/utils.scm b/guix/build/utils.scm
>> index a3f8911..c98c4ca 100644
>> --- a/guix/build/utils.scm
>> +++ b/guix/build/utils.scm
>> @@ -618,6 +618,14 @@ transferred and the continuation of the transfer as a thunk."
>>           (stat:atimensec stat)
>>           (stat:mtimensec stat)))
>>  
>> +(define (get-char* p)
>> +  ;; We call it `get-char', but that's really a binary version
>> +  ;; thereof.  (The real `get-char' cannot be used here because our
>> +  ;; bootstrap Guile is hacked to always use UTF-8.)
>> +  (match (get-u8 p)
>> +    ((? integer? x) (integer->char x))
>> +    (x x)))
>> +
>
> This is equivalent to reading with the ISO-8859-1 encoding.  The problem
> is that the procedures that use 'get-char*' will then typically use
> UTF-8 to write these characters back, so all non-ASCII characters will
> get corrupted by these filters.
>
> For now, I would suggest just using ISO-8859-1 for all of these build
> utilities that filter or substitute existing files, and then use the
> textual I/O procedures.

The difficulty is that ISO-8859-1 is not available during bootstrap, due
to guile-default-utf8.patch.

Commit dd0a8ef asks for ISO-8859-1 in the patch-* procedures, as you
suggest, but during bootstrap that request is not actually honored,
since the bootstrap Guile is patched to always use UTF-8.
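
To make the corruption Mark describes concrete, here is a minimal sketch
for a regular (non-bootstrap) Guile.  It reads bytes with 'get-char*' and
then encodes the resulting characters both as UTF-8 and as ISO-8859-1.
The 'read-all*' helper and the one-byte sample input are made up for
illustration; 'string->bytevector' comes from (ice-9 iconv).

```scheme
(use-modules (ice-9 binary-ports)
             (ice-9 iconv)
             (ice-9 match)
             (rnrs bytevectors))

(define (get-char* p)
  ;; Same definition as in the commit quoted above: each byte becomes
  ;; the character with the same code point.
  (match (get-u8 p)
    ((? integer? x) (integer->char x))
    (x x)))

(define (read-all* bv)
  ;; Hypothetical helper: read every byte of BV through 'get-char*'
  ;; and return the characters as a string.
  (let ((in (open-bytevector-input-port bv)))
    (let loop ((chars '()))
      (match (get-char* in)
        ((? eof-object?) (list->string (reverse chars)))
        (c (loop (cons c chars)))))))

;; A single byte #xE9, which is "é" in ISO-8859-1.
(define input (u8-list->bytevector '(#xE9)))
(define text (read-all* input))

;; Writing the character back as UTF-8 yields two bytes, so the file
;; contents change; writing it as ISO-8859-1 gives the original byte.
(define as-utf8   (bytevector->u8-list (string->bytevector text "UTF-8")))
(define as-latin1 (bytevector->u8-list (string->bytevector text "ISO-8859-1")))
```

Here 'as-utf8' is (195 169) while 'as-latin1' is (233), so only the
ISO-8859-1 round trip preserves the input.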

If the bootstrap glibc had statically-linked gconv modules, we could get
rid of guile-default-utf8.patch.

> A better solution going forward would be to implement and use a
> permissive UTF-8 encoding in Guile.

Probably, although it’s not completely clear to me how that would work.
I suppose the idea would be to change to ISO-8859-1 when an invalid byte
sequence is encountered?
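
As a rough sketch of that reading (not necessarily what Mark has in
mind, and not an existing Guile facility), a decoder could accept
well-formed UTF-8 sequences and otherwise treat the offending byte as
ISO-8859-1.  The names 'lead-info', 'decode-utf8-at' and
'permissive-utf8->string' are hypothetical.

```scheme
(use-modules (ice-9 match)
             (rnrs bytevectors))

(define (lead-info b0)
  ;; Return (LENGTH PAYLOAD-MASK MIN-CODE-POINT) for lead byte B0, or
  ;; #f if B0 cannot start a UTF-8 sequence.
  (cond ((< b0 #x80)               '(1 #x7F #x0))
        ((= (logand b0 #xE0) #xC0) '(2 #x1F #x80))
        ((= (logand b0 #xF0) #xE0) '(3 #x0F #x800))
        ((= (logand b0 #xF8) #xF0) '(4 #x07 #x10000))
        (else #f)))

(define (decode-utf8-at bv i)
  ;; Return (CHAR . LENGTH) for a well-formed UTF-8 sequence starting
  ;; at index I of BV, or #f if there is none.
  (match (lead-info (bytevector-u8-ref bv i))
    ((n mask minimum)
     (and (<= (+ i n) (bytevector-length bv))
          (let loop ((k 1)
                     (cp (logand (bytevector-u8-ref bv i) mask)))
            (cond ((= k n)
                   (and (>= cp minimum)              ; reject overlong forms
                        (<= cp #x10FFFF)
                        (not (<= #xD800 cp #xDFFF))  ; reject surrogates
                        (cons (integer->char cp) n)))
                  ((= (logand (bytevector-u8-ref bv (+ i k)) #xC0) #x80)
                   (loop (+ k 1)
                         (logior (ash cp 6)
                                 (logand (bytevector-u8-ref bv (+ i k))
                                         #x3F))))
                  (else #f)))))
    (#f #f)))

(define (permissive-utf8->string bv)
  ;; Decode BV as UTF-8; a byte that does not start a well-formed
  ;; sequence is taken as the ISO-8859-1 character with that code.
  (let loop ((i 0) (chars '()))
    (if (= i (bytevector-length bv))
        (list->string (reverse chars))
        (match (decode-utf8-at bv i)
          ((c . n) (loop (+ i n) (cons c chars)))
          (#f (loop (+ i 1)
                    (cons (integer->char (bytevector-u8-ref bv i))
                          chars)))))))

;; "é" as UTF-8 (#xC3 #xA9), then a stray ISO-8859-1 "é" (#xE9), then "a".
(define decoded
  (permissive-utf8->string (u8-list->bytevector '(#xC3 #xA9 #xE9 #x61))))
```

One caveat with this particular fallback: it is lossy.  Both the valid
sequence and the stray byte decode to the same character, so encoding
'decoded' back with 'string->utf8' gives five bytes rather than the
original four.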

Ludo’.
