From: Paul Eggert
To: Eli Zaretskii
Cc: mattiase@acm.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: Scan of regexps in Emacs (March 17)
Date: Wed, 3 Apr 2019 10:02:33 -0700
Organization: UCLA Computer Science Department
Message-ID: <384f994e-be12-6a6e-5ffe-3b97657926fd@cs.ucla.edu>
In-Reply-To: <83v9zvelmz.fsf@gnu.org>
References: <5363970c-3207-1bb4-8b30-74a7d12277cc@cs.ucla.edu>
 <05269D79-B016-4FCB-94B8-068BF7D1C2D2@acm.org>
 <3974269b-6cad-0744-bd1f-66c067f94192@cs.ucla.edu>
 <4b1164c4-e302-ce41-07c3-145d31a97b4c@cs.ucla.edu>
 <21CCFA3D-B391-44E1-9ED5-1D37009F1988@acm.org>
 <09AE372B-3A30-4596-8C4E-B9F4CBF6E348@acm.org>
 <692fe297-1c72-0cda-8765-c119fd0b5ef6@cs.ucla.edu>
 <83v9zvelmz.fsf@gnu.org>

On 4/2/19 9:52 PM, Eli Zaretskii wrote:
> Emacs sometimes does use regexps when dealing with unibyte
> buffers and strings, where these ranges could be significant.

Quite possibly we'd want to support the case where a range in a unibyte
pattern is used against a unibyte buffer or string, as the semantics are
straightforward there and the code should work (or at least I think it
should work; the implementation is not clear).

The troublesome cases with ranges and raw 8-bit bytes occur when either
the pattern or the buffer/string is multibyte - there it's not clear
what the implementation *should* do, much less what it does.