From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: string-to-unibyte in image-jpeg-p
Date: Wed, 23 May 2018 23:01:08 +0300
Message-ID: <831se2cfaz.fsf@gnu.org>
References: <87zi0q3c5b.fsf@gnuvola.org>
In-reply-to: <87zi0q3c5b.fsf@gnuvola.org> (message from Thien-Thi Nguyen on Wed, 23 May 2018 12:21:52 +0200)
To: Thien-Thi Nguyen
Cc: emacs-devel@gnu.org

> From: Thien-Thi Nguyen
> Date: Wed, 23 May 2018 12:21:52 +0200
>
>   (setq data (let ((tem (encode-coding-string data 'binary)))
>                (unless (string-match-p "\xc2" tem)
>                  tem)))
>
> My questions are:
> - Is my reasoning correct?  In particular, i'd like to confirm
>   that the "error" detection via "\xc2" is a valid strategy.

I'm not sure.  #xC2 is the leading byte of a 2-byte representation of
raw bytes, right?  There are also 2-byte representations, which begin
with C0 and C1 respectively.

> - If so, which branch gets the change, ‘emacs-26’ or ‘master’?

master in any case, but see below.

> - If not, what am i missing?

Doesn't it feel strange to encode a string using 'binary', which is an
alias for 'no-conversion', as the encoding?  Is it intuitive to do
something like that?  Is the meaning obvious?  (I guess not.)

So I'd rather we left string-to-unibyte alone in this case, instead of
artificially replacing it with encode-coding-string.  And maybe we
should un-deprecate string-to-unibyte, since this is the _only_ place
where it's called, and if used correctly, that function has no good
replacements.

Thanks.
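
For readers following the thread, here is a minimal sketch of the
alternative argued for above: keep string-to-unibyte and treat the
error it signals as the "not pure bytes" signal, rather than scanning
encoded output for a particular lead byte.  The helper name
`my-unibyte-or-nil' is hypothetical and is not proposed anywhere in
the thread.  It relies only on string-to-unibyte signaling an error
when the string holds a character that is neither ASCII nor a raw
byte.

```elisp
;; Sketch only: `my-unibyte-or-nil' is a hypothetical helper name.
(defun my-unibyte-or-nil (string)
  "Return STRING as a unibyte string, or nil if that is impossible.
`string-to-unibyte' signals an error when STRING contains a
character that is neither ASCII nor a raw byte; that error is
caught here and turned into a nil return value."
  (condition-case nil
      (string-to-unibyte string)
    (error nil)))

;; Raw bytes carried in a multibyte string convert back cleanly.
(my-unibyte-or-nil (string-to-multibyte "\xff\xd8"))

;; A genuine non-ASCII character (U+00E9) cannot be converted,
;; so the helper returns nil.
(my-unibyte-or-nil (string ?a #xe9))

;; For comparison: encoding the same character with `binary'
;; does not signal an error, so the caller has to inspect the
;; resulting bytes to notice that something went wrong.
(encode-coding-string (string #xe9) 'binary)
```

Note that U+00E9 is C3 A9 in UTF-8, so a check that looks only for
"\xc2" would presumably not catch it.  That is one reason an explicit
error from the conversion function may be easier to reason about than
a byte scan.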