From: taylanbayirli@gmail.com (Taylan Ulrich Bayırlı/Kammer)
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#26058: utf16->string and utf32->string don't conform to R6RS
Date: Thu, 16 Mar 2017 20:34:14 +0100
Message-ID: <87r31xdnih.fsf@gmail.com>
In-Reply-To: <877f3r7ti2.fsf@pobox.com> (Andy Wingo's message of "Tue, 14 Mar 2017 16:44:37 +0100")
To: Andy Wingo
Cc: 26058@debbugs.gnu.org

Andy Wingo writes:

> Adopting the behavior is more or less fine.  If it can be done while
> relying on the existing behavior, that is better than something ad-hoc
> in a module.

You mean somehow leveraging the existing BOM handling code of Guile
(found in the ports code) would be preferable to reimplementing it like
in this patch, correct?

In that light, I had this attempt:

  (define r6rs-utf16->string
    (case-lambda
      ((bv default-endianness)
       (let* ((binary-port (open-bytevector-input-port bv))
              (transcoder (make-transcoder (utf-16-codec)))
              (utf16-port (transcoded-port binary-port transcoder)))
         ;; XXX how to set default-endianness for a port?
         (get-string-all utf16-port)))
      ((bv endianness endianness-mandatory?)
       (if endianness-mandatory?
           (utf16->string bv endianness)
           (r6rs-utf16->string bv endianness)))))

As commented in the first branch of the case-lambda, this does not yet
make use of the 'default-endianness' parameter to tell the port
transcoder (or whoever) what to do in case no BOM is found in the
stream.

From what I can tell, Guile is currently hardcoded to *transparently*
default to big-endian in ports.c, port_clear_stream_start_for_bom_read.
Is there a way to detect when Guile was unable to find a BOM?  (In that
case one could set the endianness explicitly to the desired default.)

Or do you see another way to implement this?

Thanks for the feedback!

Taylan


P.S.: Huge congrats on the big release. :-)
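
For comparison, here is a minimal sketch of the manual fallback, i.e. checking for the BOM by hand on the bytevector instead of relying on the ports code. It is not the port-based solution asked about above, and it is not the code from the patch. The names utf16-bom-endianness, bytevector-drop and bom-aware-utf16->string are hypothetical, made up for this illustration; only the (rnrs bytevectors) procedures are assumed to exist as in Guile.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper: return the endianness indicated by a leading
;; UTF-16 BOM, or #f when BV does not start with one.
(define (utf16-bom-endianness bv)
  (and (>= (bytevector-length bv) 2)
       (let ((b0 (bytevector-u8-ref bv 0))
             (b1 (bytevector-u8-ref bv 1)))
         (cond ((and (= b0 #xFE) (= b1 #xFF)) (endianness big))
               ((and (= b0 #xFF) (= b1 #xFE)) (endianness little))
               (else #f)))))

;; Hypothetical helper: a fresh copy of BV without its first N bytes.
(define (bytevector-drop bv n)
  (let* ((len (- (bytevector-length bv) n))
         (out (make-bytevector len)))
    (bytevector-copy! bv n out 0 len)
    out))

;; Sketch of the two-argument behaviour: honour a BOM when one is
;; present, otherwise decode with DEFAULT-ENDIANNESS.
(define (bom-aware-utf16->string bv default-endianness)
  (let ((bom (utf16-bom-endianness bv)))
    (if bom
        (utf16->string (bytevector-drop bv 2) bom)
        (utf16->string bv default-endianness))))
```

Because the check happens before any decoding, the "no BOM found" case is directly visible here as the #f result, which is exactly the information the port-based version cannot currently get at.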