From: Jean Abou Samra
Newsgroups: gmane.lisp.guile.user
Subject: UTF16 encoding adds BOM everywhere?
Date: Thu, 14 Jul 2022 21:16:44 +0200
Message-ID: <63f03f91-58d8-c247-2d2a-78848c2e5ca9@abou-samra.fr>
To: Guile User

Hi,

With this code:

(let ((p (open-output-file "x.txt")))
  (set-port-encoding! p "UTF16")
  (display "ABC" p)
  (close-port p))

the sequence of bytes in the output file x.txt is

['FF', 'FE', '41', '0', 'FF', 'FE', '42', '0', 'FF', 'FE', '43', '0']

FF FE is a little-endian Byte Order Mark (BOM), fine. But why is Guile
adding it before every character instead of just once at the beginning
of the output? Is that expected?

This is a curiosity question. Since what I want is big endian, I just
used "UTF16BE", which outputs big endian and doesn't add the BOM. I can
then add a BOM manually. Still, I'm puzzled by this behavior.

Thanks,
Jean
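
P.S. For reference, the workaround mentioned above could look roughly like the sketch below. It assumes that writing U+FEFF through a "UTF16BE" port produces the bytes FE FF, and the helper names write-utf16be-with-bom and file-bytes are made up for illustration; they are not part of Guile.

```scheme
;; Sketch only: write-utf16be-with-bom and file-bytes are hypothetical helpers.
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

;; Write STR to FILE as big-endian UTF-16, preceded by a single BOM.
(define (write-utf16be-with-bom file str)
  (let ((p (open-output-file file)))
    ;; "UTF16BE" is the encoding name used above; it emits no BOM by itself.
    (set-port-encoding! p "UTF16BE")
    ;; Assumption: U+FEFF written through a big-endian port becomes FE FF.
    (write-char #\xFEFF p)
    (display str p)
    (close-port p)))

;; Read FILE back as a list of raw byte values, to inspect the result.
(define (file-bytes file)
  (bytevector->u8-list
   (call-with-input-file file get-bytevector-all #:binary #t)))

(write-utf16be-with-bom "x.txt" "ABC")
(display (file-bytes "x.txt"))
(newline)
```

If the port behaves as described above, the printed list should correspond to FE FF 00 41 00 42 00 43, that is, one BOM followed by the three characters in big-endian order.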