From: Eli Zaretskii
Newsgroups: gmane.emacs.help
Subject: Re: Why does using aset sometimes output raw bytes?
Date: Sun, 09 Dec 2018 19:12:32 +0200
Message-ID: <83r2eq39q7.fsf@gnu.org>
References: <87h8fmohmo.fsf@gmx.net> <83y38y3exe.fsf@gnu.org> <87d0qaog92.fsf@gmx.net>
In-reply-to: <87d0qaog92.fsf@gmx.net> (message from Stephen Berman on Sun, 09 Dec 2018 16:46:01 +0100)
To: help-gnu-emacs@gnu.org

> From: Stephen Berman
> Cc: help-gnu-emacs@gnu.org
> Date: Sun, 09 Dec 2018 16:46:01 +0100
>
> > s0 and s2 originally include only pure ASCII characters, so they are
> > unibyte strings.  Try making them multibyte before using aset.
>
> Thanks, that works.  But why are raw bytes inserted only with some
> multibyte strings (e.g. with "äöüß" but not with "ſðđŋ")?

Because ſ doesn't fit in a single byte, so when you store it with
aset, the entire string is made multibyte, and the other characters
are then stored into a multibyte string.  By contrast, each of ä, ö, ü
and ß has a character code below 256, so it fits in a single byte and
is stored into the unibyte string as a raw byte.

> Also, is there some way to ensure a string is handled as multibyte
> if it's not known what characters it contains?  E.g., s0 in my
> example sexp could be bound to some string by a function call and
> before applying the function it is not known if the string is
> multibyte;

You should generally stay away from such situations, but you haven't
said enough about what you are trying to accomplish for me to give
more practical advice.

To answer your question: you can test whether a string is multibyte
with multibyte-string-p, and you can make it multibyte if it is not
(e.g. with string-to-multibyte).  The only problematic situation is
when a unibyte string includes non-ASCII bytes; the right thing to do
in that case depends on the circumstances.
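The check-and-convert approach described above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names my-ensure-multibyte and my-overwrite-prefix are hypothetical, and string-to-multibyte is assumed to be an acceptable conversion. That assumption holds for pure ASCII input; bytes above 127 become raw-byte characters rather than being decoded.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-ensure-multibyte (string)
  "Return STRING as a multibyte string.
A unibyte STRING is converted with `string-to-multibyte', which maps
bytes above 127 to raw-byte characters instead of decoding them."
  (if (multibyte-string-p string)
      string
    (string-to-multibyte string)))

;; Hypothetical helper, not part of Emacs.
(defun my-overwrite-prefix (target source)
  "Return a multibyte copy of TARGET whose leading chars come from SOURCE.
TARGET itself is left unmodified."
  (let ((result (copy-sequence (my-ensure-multibyte target))))
    (dotimes (i (min (length result) (length source)))
      ;; RESULT is already multibyte, so `aset' stores characters,
      ;; never raw bytes.
      (aset result i (aref source i)))
    result))

;; The target is pure ASCII (unibyte) before conversion.
(my-overwrite-prefix "aous" "äöüß")   ; → "äöüß"
```

Converting before the first aset means the outcome no longer depends on whether the first stored character happens to fit in a single byte.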
> is there some way in Lisp to say "treat the value of s0 as multibyte
> (regardless of what characters it contains)"?

Not that I know of, no.  And I don't really understand how such a
thing could exist: how do you "treat as multibyte" an arbitrary byte
that is beyond 127 decimal?

> Also "aous" is also pure ASCII, so why don't raw bytes get inserted with
> (insert (aset "aous" i (aref "äöüß" i)))?

This inserts characters one by one into the current buffer, and the
buffer is multibyte, so Emacs does the conversion.  IOW, you don't
insert the string, you insert the individual characters which aset
returns.
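The last point can be illustrated with a small sketch. It is not code from the thread: the helper name my-insert-via-aset is hypothetical, and a temporary buffer (multibyte by default) stands in for the current buffer. The scratch string is a unibyte copy of the target, and its contents after the loop are deliberately not inspected; only the characters that aset returns reach the buffer.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my-insert-via-aset (target source)
  "Insert the values returned by `aset' into a temp buffer; return its text.
Each character of SOURCE is stored into a copy of TARGET, and the
return value of `aset' (the character itself) is passed to `insert'."
  (with-temp-buffer
    (let ((scratch (copy-sequence target)))
      (dotimes (i (min (length scratch) (length source)))
        ;; `aset' returns the element it stored, so `insert' is handed
        ;; a character, not the modified string.
        (insert (aset scratch i (aref source i)))))
    (buffer-string)))

(my-insert-via-aset "aous" "äöüß")   ; → "äöüß"
```

Inserting the scratch string itself after the loop, rather than the individual characters, is the case that produced raw bytes earlier in the thread.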