From: Eric Abrahamsen
Newsgroups: gmane.emacs.help
Subject: More confusion about multibyte vs unibyte strings
Date: Thu, 05 May 2022 09:58:43 -0700
Message-ID: <874k23or0c.fsf@ericabrahamsen.net>
To: help-gnu-emacs@gnu.org

In gnus-search.el, we do some work on search strings before sending
them to an IMAP server as a query: particular formats have to be used
depending on whether the string is plain ASCII, or needs to be encoded
as UTF-8 or something. From the code itself:

    (gnus-search-imap-handle-string
     (make-instance 'gnus-search-imap :literal-plus t)
     "FROM eric")
    -> "FROM eric"

    (gnus-search-imap-handle-string
     (make-instance 'gnus-search-imap :literal-plus t)
     "FROM 张三")
    -> "{11+} FROM \345\274\240\344\270\211"

The function above uses `multibyte-string-p' to test whether the string
needs the extra handling. This works correctly in the minibuffer and
*scratch*:

    (multibyte-string-p "FROM eric") -> nil
    (multibyte-string-p "FROM 张三") -> t

but when I edebug the code during an actual IMAP search, the test
returns t for both strings, which messes things up.

I must be using it wrong! But I don't understand why. What can change
in the evaluation environment such that the calls to
`multibyte-string-p' would return different results at different times?
And what check *should* I be using to see if a string is pure ASCII?

Thanks,
Eric
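
The discrepancy described above can be reproduced outside of Gnus. The sketch below shows one way an all-ASCII string can end up multibyte: by explicit conversion, or by being pulled out of a multibyte buffer. Whether either of these is what actually happens during the IMAP search is an assumption, not something confirmed from gnus-search.el. The helper name `my/ascii-only-p' is hypothetical; it tests the characters themselves rather than the string's internal representation.

```elisp
(require 'seq)

;; A literal containing only ASCII is read as a unibyte string.
(multibyte-string-p "FROM eric")        ; nil

;; The same characters after conversion, or after a round trip through
;; a multibyte buffer (temporary buffers are multibyte by default),
;; come back as a multibyte string.
(multibyte-string-p (string-to-multibyte "FROM eric")) ; t
(with-temp-buffer
  (insert "FROM eric")
  (multibyte-string-p (buffer-string))) ; t

;; Hypothetical helper, not part of Gnus: non-nil if every element of
;; STR is below 128, regardless of whether STR is unibyte or multibyte.
(defun my/ascii-only-p (str)
  "Return non-nil if STR contains only ASCII characters."
  (seq-every-p (lambda (ch) (< ch 128)) str))
```

With this helper, "FROM eric" counts as ASCII in either representation, while "FROM 张三" does not, so the result no longer depends on where the string came from.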