From mboxrd@z Thu Jan 1 00:00:00 1970
From: npostavs@users.sourceforge.net
Newsgroups: gmane.emacs.bugs
Subject: bug#25288: 25.1; term, ansi-term, broken output of utf8 text
Date: Wed, 28 Dec 2016 14:10:30 -0500
Message-ID: <87r34r98ex.fsf@users.sourceforge.net>
In-Reply-To: (Vjacheslav's message of "Wed, 28 Dec 2016 13:41:55 +0300")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
To: Vjacheslav
Cc: 25288@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:127525

found 25288 24.5
tags 25288 confirmed
quit

Vjacheslav writes:

> Trying to use this command from terminal running bash:
>
> [fva@localhost ~]$ python -c 'print "ш"*5000'
>
> produces garbage (шшш\321\210шшш) in output. Terminal needs
> reset. Possibly this is a bug which seen in very old linux, (breaks
> multibyte characters on buffer borders).
>
> default-process-coding-system is OK:
>
> default-process-coding-system is a variable defined in ‘C source code’.
> Its value is (utf-8-unix .
> utf-8-unix)

It looks like the problem is that the process filter function,
term-emulate-terminal, receives the output in chunks of 4096 bytes[1].
The ш character is encoded in 2 bytes (0xD1 0x88, which is the \321\210
in the garbage above), so it can be split across two chunks, and each
half is then presumably decoded on its own. Is there a way to recognize
incomplete decoding from Lisp? I can't see one.

[1]: It's getting bytes rather than characters because in term-exec-1 we
have:

	     ;; The process's output contains not just chars but also binary
	     ;; escape codes, so we need to see the raw output.  We will have to
	     ;; do the decoding by hand on the parts that are made of chars.
	     (coding-system-for-read 'binary))
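
To make the suspected mechanism concrete, here is a minimal sketch in Python (the language of the test command above, not Emacs Lisp, and not the code in term.el). It cuts a UTF-8 byte stream into 4096-byte chunks and compares decoding each chunk separately with decoding that carries an incomplete trailing sequence over to the next chunk. The leading "$" is an assumption: it stands in for whatever earlier output shifts the 2-byte characters to an odd offset. The function names are made up for illustration.

```python
import codecs

CHUNK_SIZE = 4096  # chunk size mentioned in the message


def split_chunks(data, size=CHUNK_SIZE):
    """Cut a byte string into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def decode_per_chunk(chunks):
    """Decode every chunk on its own.

    Bytes of a character that straddles a chunk boundary cannot be
    decoded and are rendered as backslash escapes instead.
    """
    return "".join(c.decode("utf-8", errors="backslashreplace") for c in chunks)


def decode_carrying(chunks):
    """Decode chunk by chunk, holding back an incomplete trailing sequence."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = [decoder.decode(c) for c in chunks]
    parts.append(decoder.decode(b"", final=True))  # flush; raises if bytes remain
    return "".join(parts)


# Hypothetical input: one ASCII byte, then 5000 two-byte characters.
data = ("$" + "ш" * 5000).encode("utf-8")
chunks = split_chunks(data)

per_chunk = decode_per_chunk(chunks)
carried = decode_carrying(chunks)

print("chunks:", len(chunks))
print("split characters:", per_chunk.count("\\xd1\\x88"))
print("carried decoding intact:", carried == "$" + "ш" * 5000)
```

The incremental decoder keeps the dangling 0xD1 byte in its own state until the 0x88 arrives. A process filter could get the same effect by holding back the trailing bytes of a chunk when they form an incomplete sequence and prepending them to the next chunk; whether that fits term-emulate-terminal is not something this sketch settles.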