From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Scan of regexps in Emacs (March 17)
Date: Tue, 02 Apr 2019 12:58:18 -0400
To: emacs-devel@gnu.org

> (string-match "\xff" "\xff") => 0
> (string-match "[\xff]" "\xff") => 0
> (string-match "\xffé?" "\xff") => nil
> (string-match "[\xff]é?" "\xff") => 0
> (string-match "\xff" "\xffé") => 0
> (string-match "[\xff]" "\xffé") => nil
> (string-match "\xffé?" "\xffé") => 0
> (string-match "[\xff]é?" "\xffé") => nil

Check (multibyte-string-p "...") on those strings, to see some of the
reasons why.  IIRC the treatment of those escape sequences to determine
unibyte/multibyte strings is pretty tricky (last time I looked at it,
I found its behavior to be undesirable, but I believe it has slightly
changed since and I can't remember what were the problems I bumped
into).


        Stefan
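
A minimal sketch of the check suggested above, for anyone who wants to
try it.  It assumes the code is read from a UTF-8 file or buffer, so
that é is a genuine multibyte character; `my-multibyte-report' is a
hypothetical helper name used only for illustration, not part of Emacs.
The results may differ between Emacs versions, so no expected output is
given here.

```elisp
;; Hypothetical helper (not part of Emacs): pair each string with the
;; value `multibyte-string-p' returns for it.
(defun my-multibyte-report (strings)
  "Return an alist mapping each of STRINGS to its `multibyte-string-p' value."
  (mapcar (lambda (s) (cons s (multibyte-string-p s))) strings))

;; The patterns and subject strings from the quoted tests.  How the
;; reader treats "\xff" depends on what else the literal contains.
(dolist (entry (my-multibyte-report
                (list "\xff" "[\xff]" "\xffé?" "[\xff]é?" "\xffé")))
  (message "%S => %s"
           (car entry)
           (if (cdr entry) "multibyte" "unibyte")))
```

Comparing the reported representation of each pattern with that of the
string it is matched against should show which of the quoted cases mix
a unibyte argument with a multibyte one.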