From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Make regexp handling more regular
Date: Thu, 03 Dec 2020 10:10:06 -0500
To: Lars Ingebrigtsen
Cc: emacs-devel@gnu.org
In-Reply-To: <87blfbz3lu.fsf@gnus.org> (Lars Ingebrigtsen's message of "Thu, 03 Dec 2020 09:38:21 +0100")

> Off-list, it's been pointed out that the current implementation of
> functions like re-search-forward would be faster than these interfaces
> because they produce less garbage -- since there's just one global match
> object, it's static, while

Yes, it's indeed my main worry.
Reusing the match-data sounds like a good practical approach.

Maybe another approach would be to use an API where the match doesn't
return a "match data" but instead let-binds some variables with the
relevant data.  IOW, specify right away in which part of the data you're
interested, so only the relevant data is returned.  That would also
remove the need for the `string-match-p` alternatives which don't return
any match data.

I'm not completely sure what it would look like, tho.  Maybe

    (let-re-match (overall (beg end)) (re-match "regexp")
      ...)

which would be equivalent to

    (progn
      (re-match "regexp")
      (let ((overall (match-string 0))
            (beg (match-beginning 1))
            (end (match-end 1)))
        ...))

??

This has problems dealing with match-failure tho: it works, but with
a lot of spurious match-data extraction.


        Stefan
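
For illustration only, here is a minimal sketch of how a macro in the spirit of the `let-re-match` idea above could be written on top of today's primitives. The name `my-let-re-match` and its binding syntax are hypothetical, not an existing Emacs API. It assumes the Nth element of the binding list describes subgroup N: a symbol is bound to the matched text, a two-element list to the group's start and end positions, and nil skips the group. Unlike the `progn` expansion shown above, this sketch wraps the bindings in `when`, so nothing is extracted (and the body is skipped) when the match fails. It still goes through the global match data, so it does not address the garbage concern by itself.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical macro: a sketch of the idea, not an existing Emacs API.
(defmacro my-let-re-match (bindings match-form &rest body)
  "Evaluate MATCH-FORM; if it returns non-nil, bind BINDINGS and run BODY.
The Nth element of BINDINGS describes subgroup N.  A symbol is bound
to the text matched by that group, a two-element list of symbols is
bound to the group's start and end positions, and nil skips the group.
Return nil without extracting anything when MATCH-FORM fails."
  (declare (indent 2))
  (let ((group 0)
        (let-bindings '()))
    (dolist (spec bindings)
      (cond
       ;; nil must be tested first, since nil is also a symbol.
       ((null spec))
       ((symbolp spec)
        (push `(,spec (match-string ,group)) let-bindings))
       ((and (consp spec) (= (length spec) 2))
        (push `(,(car spec) (match-beginning ,group)) let-bindings)
        (push `(,(cadr spec) (match-end ,group)) let-bindings))
       (t (error "Invalid binding spec: %S" spec)))
      (setq group (1+ group)))
    ;; Only extract the requested pieces, and only after a successful match.
    `(when ,match-form
       (let ,(nreverse let-bindings)
         ,@body))))

;; Usage with a buffer search; `re-search-forward' stands in for the
;; hypothetical `re-match' of the message above.
(with-temp-buffer
  (insert "key = value")
  (goto-char (point-min))
  (my-let-re-match (overall (beg end))
      (re-search-forward "\\([a-z]+\\) = [a-z]+" nil t)
    (list overall beg end)))
;; → ("key = value" 1 4)
```

Because the accessors are called without a string argument, this sketch only suits buffer matches (`looking-at`, `re-search-forward`, and the like); a variant for `string-match` would need to pass the matched string to `match-string`.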