From: Stephen Berman
Newsgroups: gmane.emacs.help
Subject: Re: Why does using aset sometimes output raw bytes?
Date: Sun, 09 Dec 2018 19:50:08 +0100
Message-ID: <87pnuamt5r.fsf@gmx.net>
References: <87h8fmohmo.fsf@gmx.net> <83y38y3exe.fsf@gnu.org> <87d0qaog92.fsf@gmx.net> <83r2eq39q7.fsf@gnu.org> <87wooimwr9.fsf@gmx.net> <83pnua384o.fsf@gnu.org>
In-Reply-To: <83pnua384o.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 09 Dec 2018 19:47:03 +0200")
To: Eli Zaretskii
Cc: help-gnu-emacs@gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

On Sun, 09 Dec 2018 19:47:03 +0200 Eli Zaretskii wrote:

>> From: Stephen Berman
>> Cc: help-gnu-emacs@gnu.org
>> Date: Sun, 09 Dec 2018 18:32:26 +0100
>>
>> >> why are raw bytes inserted only with some
>> >> multibyte strings (e.g. with "äöüß" but not with "ſðđŋ")?
>> >
>> > Because ſ doesn't fit in a single byte, so when you insert it, the
>> > entire string is made multibyte, and then the other characters are
>> > inserted into a multibyte string.
>>
>> This seems to imply that ä, ö, ü and ß do fit in a single byte? Yet
>> (multibyte-string-p "äöüß") returns t. So I still don't understand.
>
> Look at the codepoints: the above are all less than FF hex, so they
> can fit in a single byte. By contrast, ſ is 17F hex, more than a
> single byte can hold. So inserting ſ into a unibyte string _must_
> first make that string multibyte, whereas inserting ä etc. can leave
> it unibyte.
>
> Why (multibyte-string-p "äöüß") returns t is an unrelated issue: it
> has to do with how the Lisp reader reads the string. The result is a
> multibyte string, where ä is represented by its UTF-8 sequence and not
> by its single-byte codepoint E4 hex. If you want a unibyte string
> with these bytes, use (multibyte-string-p "\344\366\374\337") instead.

Thanks for the very clear and enlightening explanations; I feel I
understand this better now.

>> >> is there some way in Lisp to say "treat the value of s0 as multibyte
>> >> (regardless of what characters it contains)"?
>> >
>> > Not that I know of, no. And I don't really understand how could such
>> > a thing exist: how do you "treat as multibyte" an arbitrary byte that
>> > is beyond 127 decimal?
>>
>> Actually, for the code I was experimenting with, it seems to suffice to
>> use (make-string len 128) as the input to aset (before, I had used
>> (make-string len 32), which led to raw bytes being displayed).
>
> Not sure I understand what you mean by "suffice". Feel free to ask
> questions if there are some left.

I was experimenting with aset to make random permutations of a string
and didn't understand why there were sometimes raw bytes in the result
(which also led to args-out-of-range errors), but using (make-string len
128) as the container for the permutations prevents that. And with your
above explanations I now think I understand why.

Steve Berman
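
A minimal sketch of the behavior discussed above, assuming the aset
semantics described in this thread (Emacs 27.0.50). The function name
demo-aset-small-then-large is hypothetical and exists only for this
illustration. It stores ä (E4 hex) and then ſ (17F hex) into a fresh
container built with make-string, so the only thing that varies is the
fill character.

```elisp
;; Illustration only; `demo-aset-small-then-large' is a made-up name.
;; #xe4 is ä (fits in a single byte), #x17f is ſ (does not).
(defun demo-aset-small-then-large (init)
  "Store ä, then ſ, into a fresh 2-character string filled with INIT.
Return the resulting string, or the symbol `args-out-of-range' if the
second `aset' signals that error."
  (let ((s (make-string 2 init)))
    ;; In a unibyte container this is stored as the raw byte E4 hex;
    ;; in a multibyte container it is stored as the character ä.
    (aset s 0 #xe4)
    (condition-case nil
        ;; ſ needs a multibyte string.  A unibyte container that
        ;; already holds a non-ASCII byte cannot be converted.
        (progn (aset s 1 #x17f) s)
      (args-out-of-range 'args-out-of-range))))

;; Fill character 32 (space) gives a unibyte container.
(multibyte-string-p (make-string 2 32))
;; Fill character 128 is non-ASCII, so the container is multibyte.
(multibyte-string-p (make-string 2 128))

(demo-aset-small-then-large 32)
(demo-aset-small-then-large 128)
```

With 32 as the fill character the container starts out unibyte, so ä
goes in as a raw byte and the later ſ is rejected with
args-out-of-range. With 128 the container is multibyte from the start
and both characters are stored as characters.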