From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Albinus
Newsgroups: gmane.emacs.help
Subject: Re: Parsing of multibyte strings from process output
Date: Tue, 08 May 2018 14:01:22 +0200
Message-ID: <87wowe1hm5.fsf@gmx.de>
In-Reply-To: (Helmut Eller's message of "Tue, 08 May 2018 13:00:13 +0200")
To: Helmut Eller
Cc: help-gnu-emacs@gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

Helmut Eller writes:

Hi Helmut,

>> However, I don't know how to parse it that I could retrieve it. All
>> what I have tried returns always the *two* characters ?\xc2 ?\x9a,
>> multibyte encoded.
>> How could I get just the multibyte character ?\x9a
>> from this?
>
> You could use (set-process-coding-system 'utf-8) if you know that
> the all output of the process is indeed utf-8 encoded.

I've done this already, for other purposes. But it doesn't help: the
string /home/albinus/tmp/\xc2\x9abung is still written literally into
the output buffer.

> Alternatively, you could use 'binary as coding system and manually call
> decode-coding-string on the parts that are utf-8 encoded.  However keep
> in mind, that "raw bytes" in multibyte strings have char codes in the
> range #x3FFF00..#x3FFFFF.

I tried that, with no luck. But I didn't know that "raw" bytes are in
that range.

> (decode-coding-string (string #x3FFFc3 #x3FFF9c) 'utf-8) => "Ü"

That's it! The following code works for me (res-symlink-target holds
the file name from the process output, as shown above):

--8<---------------cut here---------------start------------->8---
(setq res-symlink-target
      ;; Turn each literal \xNN escape into a raw-byte character,
      ;; then decode the resulting byte sequences as UTF-8.
      (decode-coding-string
       (replace-regexp-in-string
        "\\\\x\\([[:xdigit:]]\\{2\\}\\)"
        (lambda (x)
          (string
           (string-to-number
            (concat "3FFF" (match-string 1 x)) 16)))
        res-symlink-target)
       'utf-8))
--8<---------------cut here---------------end--------------->8---

Thanks a lot!

> Helmut

Best regards, Michael.
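
Addendum: the snippet above can be wrapped in a reusable function.
The following is only a sketch, not tested against real process
output; the name my-decode-hex-escapes is hypothetical and not part of
Emacs. It differs from the snippet in two ways. First, as far as the
Emacs Lisp manual describes it, only the bytes #x80..#xFF have raw-byte
character codes (#x3FFF80..#x3FFFFF), so an escape such as \x41 would
not be treated as a raw byte when mapped to #x3FFF41; the sketch keeps
such ASCII bytes as ordinary characters. Second, it passes LITERAL to
replace-regexp-in-string, so that a decoded backslash (\x5c) is not
interpreted as a replacement escape. It assumes the input contains no
non-ASCII text apart from the escapes.

--8<---------------cut here---------------start------------->8---
(defun my-decode-hex-escapes (string)
  "Return STRING with literal \\xNN escapes decoded as UTF-8 bytes.
Hypothetical helper for illustration; not part of Emacs."
  (decode-coding-string
   (replace-regexp-in-string
    "\\\\x\\([[:xdigit:]]\\{2\\}\\)"
    (lambda (match)
      (let ((byte (string-to-number (match-string 1 match) 16)))
        ;; Bytes #x80..#xFF become raw-byte characters
        ;; (#x3FFF80..#x3FFFFF); ASCII bytes stay ordinary characters.
        (string (if (< byte #x80) byte (+ #x3FFF00 byte)))))
    string
    t t)                  ; FIXEDCASE, LITERAL: insert result verbatim
   'utf-8))
--8<---------------cut here---------------end--------------->8---

With this definition, (my-decode-hex-escapes "/tmp/\\xc3\\x9cbung")
should return "/tmp/Übung", matching the decode-coding-string example
quoted above.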