From: Lars Ingebrigtsen
To: Stefan Monnier
Cc: emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: Re: Make regexp handling more regular
Date: Thu, 03 Dec 2020 09:38:21 +0100
Message-ID: <87blfbz3lu.fsf@gnus.org>
References: <87lfeg60iy.fsf@gnus.org>

Stefan Monnier writes:

>> Naming is, of course, the most difficult problem here.
>
> I agree that it might be worth looking at what other languages do.
> But we could also just follow "traditional regexp" libraries'
> suggestions for naming and go with something like:
>
>     (re-match REGEXP &optional OBJECT START END)
>     (re-search REGEXP &optional OBJECT START END)
>
> [ the first being like `looking-at` (i.e. an "anchored" match). ]

I like it.

Off-list, it's been pointed out that the current implementation of
functions like re-search-forward would be faster than these interfaces
because it produces less garbage: there's just one global, static match
object, while

  (while (setq match (re-search "[a-z]+"))
    (bar (re-data match 0)))

would allocate a new match object on every iteration, creating a whole
lot of garbage to be collected.

Now, that's not really that much of an issue if you're just saying

  (when (re-match "foo") ...)

here and there, but searching through a buffer for matches is a very
common use case, and we wouldn't want that to be slower than it is now.

So the suggestion is to be able to pass in an (optional) match object.
However, we don't really need to create that explicitly everywhere.  The
idiom could be just:

  (while (setq match (re-search "[a-z]+" match))
    (bar (re-data match 0)))

In the first iteration, `match' is nil, which means that `re-search'
will allocate a new match object, but in subsequent iterations that same
object will be reused.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
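
The reuse idiom described in the message can be approximated with
functions that already exist, which may help in judging how it would
read in practice.  The sketch below is only an illustration under
stated assumptions: `my-re-search', `my-re-data' and `my-collect-words'
are hypothetical names invented here, not the proposed API, and a plain
match-data list stands in for the "match object".  The one real
facility it leans on is the REUSE argument of `match-data', which
overwrites the cells of an existing list instead of consing a new one.

```elisp
;; Hypothetical stand-ins for the proposed `re-search' and `re-data'.
;; A match-data list plays the role of the match object (an assumption
;; of this sketch; the thread does not specify the representation).

(defun my-re-search (regexp &optional match)
  "Search forward from point for REGEXP in the current buffer.
Return the match data as a list, or nil if nothing matches.
If MATCH is a list, it is destructively reused to hold the result."
  (when (re-search-forward regexp nil t)
    ;; INTEGERS = t asks for integer positions instead of markers;
    ;; REUSE = MATCH lets `match-data' overwrite the old list's cells.
    (match-data t match)))

(defun my-re-data (match n)
  "Return the text of group N recorded in MATCH, or nil if it did not match.
The text is read from the current buffer."
  (let ((beg (nth (* 2 n) match))
        (end (nth (1+ (* 2 n)) match)))
    (and beg end (buffer-substring-no-properties beg end))))

(defun my-collect-words ()
  "Return every match of \"[a-z]+\" in the current buffer, in order."
  (let (match words)
    (save-excursion
      (goto-char (point-min))
      ;; MATCH is nil on the first iteration, so a list is allocated
      ;; once; later iterations hand the same list back for reuse.
      (while (setq match (my-re-search "[a-z]+" match))
        (push (my-re-data match 0) words)))
    (nreverse words)))
```

In a buffer containing "foo 12 bar-baz", calling (my-collect-words)
should return ("foo" "bar" "baz"), with the match list allocated only
on the first iteration.  Whether a real `re-search' would use a list, a
vector, or a dedicated object type is left open here.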