From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#61514: 30.0.50; sadistically long xml line hangs emacs
Date: Tue, 21 Feb 2023 11:58:02 -0500
References: <87lel0c65v.fsf@everybody.org> <838rgvymcd.fsf@gnu.org> <831qmkwmux.fsf@gnu.org> <83cz64v3v7.fsf@gnu.org> <83r0ujte97.fsf@gnu.org> <6abb5de688808f8d363d@heytings.org> <83bklntbeb.fsf@gnu.org> <6abb5de688545b31428e@heytings.org> <837cwbt6p7.fsf@gnu.org> <6abb5de688b004f6f267@heytings.org> <6abb5de6884bd36cb594@heytings.org>
In-Reply-To: <6abb5de6884bd36cb594@heytings.org> (Gregory Heytings's message of "Tue, 21 Feb 2023 15:44:57 +0000")
Reply-To: Stefan Monnier
To: Gregory Heytings
Cc: Eli Zaretskii, 61514@debbugs.gnu.org, mah@everybody.org

>> OK, thanks. Stefan, do you have any further comments/objections on
>> that version?

LGTM.

> By the way, I noted that a variant of the regexp still produces stack
> overflows:
>
> (with-current-buffer (get-buffer-create "*bug*")
>   (erase-buffer)
>   (insert (make-string 266665 ?x) "=")
>   (goto-char (point-min))
>   (looking-at "[^y]*=*"))
>
> 266665 overflows, 266664 does not. Is that expected?

Yes, there's "nothing" we can do about it (short of a significant
redesign of the engine): [^y] also matches = so at every iteration of
the loop, both paths (perform one more iteration, or exit the loop) are
valid, so we need to try them both, which we do via backtracking.
We'd need a "Thompson NFA" or something along the same lines to avoid it.

Of course, we could also just backtrack less deep by exploring the
search space in a different order (e.g. the `*?` repetition does that),
but if we want to still return the same end result, we'd then have to
explore more of the search space (and after the fact, choose which match
we should return) rather than stop at the first match.


        Stefan
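
To illustrate the "Thompson NFA" idea mentioned in the reply, here is a
minimal sketch for this one regexp only. It is not how the Emacs regexp
engine works, and `my-nfa-match-end` is a hypothetical name made up for
this example. Instead of recording a backtrack point at every iteration,
it tracks which of the two loop states ("inside [^y]*" and "inside =*")
are still alive and advances them in lockstep, so the memory used does
not grow with the length of the input.

```elisp
(defun my-nfa-match-end (string)
  "Return the end of the longest prefix of STRING matching `[^y]*=*'.
Hypothetical helper: simulates both loop states in lockstep, without
backtracking."
  (let ((s0 t)    ; still inside the `[^y]*' loop
        (s1 t)    ; inside the `=*' loop (reachable without consuming input)
        (pos 0)
        (len (length string))
        (done nil))
    (while (and (not done) (< pos len))
      (let* ((c (aref string pos))
             ;; `[^y]*' can take one more iteration if C is not `y'.
             (n0 (and s0 (/= c ?y)))
             ;; `=*' is alive if we can still leave the first loop here,
             ;; or if we were already in `=*' and C is `='.
             (n1 (or n0 (and s1 (= c ?=)))))
        (if n1
            (setq s0 n0 s1 n1 pos (1+ pos))
          ;; No state survives this character: the match ends here.
          (setq done t))))
    pos))

(my-nfa-match-end (concat (make-string 266665 ?x) "="))  ; => 266666
(my-nfa-match-end "xx==y")                               ; => 4
```

Both loops may match the empty string, so every position reached is a
valid match end, and the last one reached is the longest. For this
regexp that is also the match the greedy backtracking search returns.
In general, as noted above, a lockstep matcher has to decide after the
fact which of the candidate matches to report.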