From: Mattias Engdegård
Newsgroups: gmane.emacs.bugs
Subject: bug#64128: regexp parser zero-width assertion bugs
Date: Sun, 18 Jun 2023 22:26:28 +0200
To: Eli Zaretskii
Cc: Paul Eggert, monnier@iro.umontreal.ca, 64128@debbugs.gnu.org

On 18 June 2023 at 06:55, Eli Zaretskii wrote:

> My comment is that since this was a documented feature, I'm not
> interested in making it an error.

Yes, it would be unwise to raise an error for "^*" or the like; it's in active use.
The manual is a bit hazy about what we actually promise, though.
As Paul notes, we must be able to document it and that might not be easy, so perhaps we shouldn't even try (to change, or document)?

To make everything clear, we have two groups of zero-width assertions:

Group A: ^ $ \` \' \b \B
Group B: \< \> \_< \_> \=

Group B assertions work like ordinary elements, syntactically and semantically. Simple, predictable, but also useless.

Group A assertions are more interesting: either there is nothing before a train of such assertions, such as

  "^\\`\\b\\`*?"

which turns the first character of the operator into a literal (and a second character, if present, now becomes an operator acting on that literal).

Or there is something, and the operator acts on the last element preceding the assertions, except that multiple literal characters coalesce to a single element. Except if one of the literal chars is an out-of-place `^`, which splits a sequence of literals into separate segments, but not exactly where you think it would. For example,

  "abc^def\\B\\B+?"

means, I think,

  (seq "ab" (+? "c^def" not-word-boundary not-word-boundary))
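
For anyone who wants to check that last reading, here is a minimal sketch (an illustration, not something measured for this message). It writes the conjectured parse explicitly with `rx` and runs both it and the raw string against one sample. The sample string and the result keys are made up for the example, and the outcome may well differ between Emacs versions, so no expected result is given.

```elisp
;; Illustration only: compare the raw regexp with the parse conjectured above.
(require 'rx)

(let* ((raw "abc^def\\B\\B+?")
       ;; The conjectured reading, spelled out so the grouping is explicit.
       (explicit (rx (seq "ab"
                          (+? "c^def" not-word-boundary not-word-boundary))))
       ;; Hypothetical sample input: "ab" followed by two copies of "c^def".
       (sample "abc^defc^def"))
  (list :explicit-regexp explicit
        :raw-match (and (string-match raw sample)
                        (match-string 0 sample))
        :explicit-match (and (string-match explicit sample)
                             (match-string 0 sample))))
```

Evaluate it in *scratch* or ielm. If the two matches differ, the raw string is not being parsed the way the `rx` form says, at least in that build.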