From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Abrahamsen
Newsgroups: gmane.emacs.help
Subject: Re: More confusion about multibyte vs unibyte strings
Date: Thu, 05 May 2022 11:44:41 -0700
Message-ID: <87v8ujn7ja.fsf@ericabrahamsen.net>
References: <874k23or0c.fsf@ericabrahamsen.net> <83zgjv288x.fsf@gnu.org>
To: help-gnu-emacs@gnu.org

Eli Zaretskii writes:

>> From: Eric Abrahamsen
>> Date: Thu, 05 May 2022 09:58:43 -0700
>>
>> The function above uses `multibyte-string-p' to test whether the string
>> needs the extra handling. This works correctly in the minibuffer and
>> *scratch*:
>>
>> (multibyte-string-p "FROM eric") -> nil
>>
>> (multibyte-string-p "FROM 张三") -> t
>>
>> but when I edebug the code during an actual IMAP search, the test
>> returns t for both strings, which messes things up.
>
> Why does it "mess things up", and what exactly is the nature of the
> mess-up? A pure-ASCII string can be either unibyte or multibyte, and
> that shouldn't change a thing.

If the string is not ASCII, we need to encode it before sending to the
server, and tell the server what encoding we used. Microsoft Exchange
servers can't handle any encoding other than ascii. So if our code
thinks a string isn't ascii, it sends the encoding message to the IMAP
server, and Exchange blows up. If the string is ascii, we don't try to
encode it, and everything's fine. So I need to know whether the string
is actually ascii or not.

I can solve this some other way, like

(equal (length str) (string-bytes str))

but I'm just trying to figure out why this doesn't behave the way I
expect it to. I'd thought that `multibyte-string-p' essentially
performed the above length test.
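
For illustration, here is a minimal sketch of the distinction discussed above: `multibyte-string-p' reports how a string is represented, not whether its contents are ASCII. The helper name `my/ascii-only-p' is hypothetical (it is not part of Gnus or Emacs), and the regexp check is just one possible way to test the contents directly.

```elisp
;; Hypothetical helper (the name `my/ascii-only-p' is an assumption, not
;; something from Gnus or Emacs): test the contents, not the representation.
(defun my/ascii-only-p (str)
  "Return non-nil if every character in STR is ASCII."
  (not (string-match-p "[^[:ascii:]]" str)))

;; A pure-ASCII string can still be multibyte, so the two predicates
;; answer different questions.
(let ((uni "FROM eric")
      (multi (string-to-multibyte "FROM eric")))
  (list (multibyte-string-p uni)     ; ASCII-only literal, read as unibyte
        (multibyte-string-p multi)   ; same characters, multibyte representation
        (my/ascii-only-p uni)
        (my/ascii-only-p multi)))
;; => (nil t t t)
```

One caveat with the length comparison: a unibyte string always has equal `length' and `string-bytes', even when it contains raw 8-bit bytes, so that test only identifies ASCII content reliably when the string is known to be multibyte.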