From: Lars Ingebrigtsen
Newsgroups: gmane.emacs.devel
Subject: Make regexp handling more regular
Date: Wed, 02 Dec 2020 10:05:25 +0100
Message-ID: <87lfeg60iy.fsf@gnus.org>
To: emacs-devel@gnu.org

Today's idle shower thought:

A constant source of confusion and subtle bugs is the way Emacs does
regexp match handling: The way `string-match' (and the rest) sets
global state, so that you sort of have to catch the match data
"early", is often a challenge for new users.

Experienced Emacs Lisp programmers know to be safe and will say:

  (when (string-match "[a-z]" string)
    (let ((match (match-string 0 string)))
      (foo)
      (bar match)))

while people new to Emacs Lisp will expect this to work:

  (when (string-match "[a-z]" string)
    (foo)
    (bar (match-string 0 string)))

And sometimes it does, and sometimes it doesn't, depending on whether
`foo' also messes with the match data.

So my idle shower thought for the day is: Is there any reasonable
path forward that the Emacs Lisp language could take here?

Well, we obviously can't alter functions like `string-match' and
`re-search-forward' -- they have well-defined semantics, and we can't
make them return a match object. But we could make a new set of
functions that are more, er, functional.

Naming is, of course, the most difficult problem here. I wondered
whether the namespace would allow us to just add -p to the functions,
but names like `string-match-p' are already taken for variations on
the non-p functions.

In any case, if we happen upon a naming convention that's good, the
new interface for these functions would then be to return a "match
object" that can then be used for looking at details of the match.
I.e.,

  (when (setq match (rx-string-match "[a-z]" string))
    (foo)
    (bar (match match 0)))

The match object would know what it had matched, too.
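To make the string case concrete, here is a minimal sketch of how such
a function could be layered on top of the existing primitives. The
names `my-string-match' and `my-match' are made up for illustration;
they are not existing Emacs functions, and not necessarily what the
proposed interface would look like. The "match object" is assumed to
be just a plist holding the subject string and a copy of the match
data, which is how it "knows" what it matched.

```elisp
;; Hypothetical sketch: `my-string-match' and `my-match' are
;; illustrative names, not existing Emacs functions.

(defun my-string-match (regexp string &optional start)
  "Match REGEXP against STRING, optionally starting at START.
Return a match object (here a plist) on success, nil otherwise.
The caller's global match data is left untouched."
  (save-match-data
    (when (string-match regexp string start)
      ;; `match-data' returns a fresh list of positions, so later
      ;; searches cannot change what was captured here.
      (list :string string :data (match-data)))))

(defun my-match (match group)
  "Return the text of subexpression GROUP in MATCH, or nil."
  (let* ((data (plist-get match :data))
         (beg (nth (* 2 group) data))
         (end (nth (1+ (* 2 group)) data)))
    (when (and beg end)
      (substring (plist-get match :string) beg end))))

;; Usage: the intervening `string-match' call clobbers the global
;; match data, but the match object is unaffected.
(let ((match (my-string-match "\\([a-z]+\\)-\\([0-9]+\\)" "foo-42")))
  (when match
    (string-match "x" "xyz")            ; stands in for `foo'
    (list (my-match match 0) (my-match match 1))))
```

Wrapping the search in `save-match-data' is a design choice of this
sketch: it means calling the new function never disturbs code that
still relies on the old global state.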
The following code is an error, because `match-string' takes the text
from the current buffer, which by then is the temp buffer:

  (when (re-search-forward "p[a-z]+" nil t)
    (with-temp-buffer
      (insert (match-string 0))
      (buffer-string)))

But the following would work:

  (when (setq match (rx-search-forward "p[a-z]+" nil t))
    (with-temp-buffer
      (insert (match match 0))
      (buffer-string)))

And the same for functions working on strings, of course. And
equivalent forms for match-beginning/-end.

And we could finally get rid of the confusingly-named `match-string'
function.

There's nothing but upsides, people!

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no