From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Embedded modifiers in the regex engine
Date: Thu, 25 Feb 2016 18:15:52 +0200
Message-ID: <83oab4g407.fsf@gnu.org>
References: <87ziupinhq.fsf@secretsauce.net>
In-reply-to: <87ziupinhq.fsf@secretsauce.net> (message from Dima Kogan on Wed, 24 Feb 2016 17:32:01 -0800)
To: Dima Kogan
Cc: emacs-devel@gnu.org

> From: Dima Kogan
> Date: Wed, 24 Feb 2016 17:32:01 -0800
>
> I've been thinking of ways to make some fancier aspects of isearch and
> hi-lock work better, specifically, the way we handle the different
> modes: case-fold, char-fold, lax-whitespace, etc.
>
> The relevant bugs I filed recently:
>
>   http://debbugs.gnu.org/22541
>   http://debbugs.gnu.org/22520
>   http://debbugs.gnu.org/22479
>
> In short, different parts of emacs (isearch, isearch history, hi-lock,
> etc) treat these modes inconsistently, which results in unexpected
> behavior.
>
> The best solution I can think of to clean this up is also the most
> intrusive: adding support for pcre-style embedded modifiers to
> activate/deactivate the modes.
>
> So for instance "\\(?i\\)asdf" would be interpreted as a case-folding
> regex regardless of the value of case-fold-search. I think this would be
> a great thing to have in general, but for the specific issues in the
> bugs above, it'd make things simpler and more correct.

I hope you are not proposing this as a replacement for the M-s
toggles, because if so, I'm very much opposed.

> As an example, currently hi-lock generates a complicated-looking regex
> to emulate char-folding and case-folding. If we supported the modifiers,
> this change would simply be a prepend of "\\(?i\\)" or whatever other
> modes we want. This is simple and expected to be bug-free on the hi-lock
> level.
> Bugs such as hi-lock not supporting char-fold and case-fold at
> the same time would not happen.

They will also not happen once character-folding is implemented via
translation tables, instead of regular expressions. The current
implementation will go away at some point (one hopes).
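
For readers unfamiliar with the embedded modifiers discussed in the quoted proposal, the sketch below shows how they behave in an engine that already supports them. It uses Python's `re` module purely as a stand-in; it is not Emacs code, and the Emacs regex engine discussed in this thread does not accept this syntax. The helper name `found` is made up for the illustration.

```python
import re


def found(pattern: str, text: str, flags: int = 0) -> bool:
    """Hypothetical helper: True if PATTERN matches anywhere in TEXT."""
    return re.search(pattern, text, flags) is not None


# Without a modifier, case folding depends on something outside the
# pattern (here the flags argument), loosely comparable to how an Emacs
# search depends on the current value of case-fold-search.
plain_default = found("asdf", "ASDF")
plain_with_flag = found("asdf", "ASDF", re.IGNORECASE)

# With the embedded modifier, the pattern itself carries the mode, so
# the result no longer depends on how the caller configured the search.
embedded = found("(?i)asdf", "ASDF")

print(plain_default, plain_with_flag, embedded)
```

The point of the proposal is the last case: a pattern stored in isearch history or handed to hi-lock would keep its case-folding behavior without anything else having to remember the mode.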