From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: philippe schnoebelen Newsgroups: gmane.emacs.devel Subject: regular expressions that match nothing Date: Tue, 14 May 2019 09:25:59 +0200 Message-ID: Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="0000000000007a44840588d3f2ea" Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="231976"; mail-complaints-to="usenet@blaine.gmane.org" To: emacs-devel@gnu.org Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Tue May 14 09:27:16 2019 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([209.51.188.17]) by blaine.gmane.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1hQRqF-000yG1-JC for ged-emacs-devel@m.gmane.org; Tue, 14 May 2019 09:27:15 +0200 Original-Received: from localhost ([127.0.0.1]:41083 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hQRqE-00011U-3J for ged-emacs-devel@m.gmane.org; Tue, 14 May 2019 03:27:14 -0400 Original-Received: from eggs.gnu.org ([209.51.188.92]:36155) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hQRpG-0000yj-2w for emacs-devel@gnu.org; Tue, 14 May 2019 03:26:15 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hQRpF-0007yr-2m for emacs-devel@gnu.org; Tue, 14 May 2019 03:26:14 -0400 Original-Received: from mail-vs1-xe43.google.com ([2607:f8b0:4864:20::e43]:39022) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hQRpE-0007wM-S0 for emacs-devel@gnu.org; Tue, 14 May 2019 03:26:13 -0400 Original-Received: by mail-vs1-xe43.google.com with SMTP id m1so2519485vsr.6 for ; Tue, 14 May 2019 00:26:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; 
bh=Zr67Xux3BtZWN5y8xTuKZQm6yoPuplt00EnZE63GEgA=; b=bO2T/+f426I9zeAHfqJQmdIJzvLgxuYkJPNTwrZVXq/Kf16Ix8BT4QwK/IzRkQUPtd zJrc//hlHpX3138eC56KChAoaMbufpz0JwOS5b1ksmAZQsHWWvs71yrQ2iLIQLXDKVbI bujxzz27JFlY/HRUOJ/xFzfQzHlHfds4jKbo2kTa0bK3VoanFooVG+Y/BOG7y3gxCRtE lDLC8vk9CWijmN3fdsQCYnQmANQjvMD8vBAYt7LBs4UatFsPsZQQkMwVCJKpxdO1ZBW7 Ojn+nNERj1AESGUYqAbVQ//deaXoL2o/e3QewNuMUX6Uwy4dV2UaUXtsGO+I2kQY6bhy aY6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=Zr67Xux3BtZWN5y8xTuKZQm6yoPuplt00EnZE63GEgA=; b=MpXH5J4IkBmByCLxX+jZ41hLGoBRO484ddtDTAvkrp9kHoqFLpQNAzj8tCslQrzwUh vqN+qRqchOnc9gwgZlGtY6EUfsiE2Vdc8rwtsFouGYFZ6NzXfQREgqCpxXE4eldbS51V rOEdaP4rYAdrjI8ItOijFW1jPstc7+Pln/UCay0wI+66F3BorNLaQ1iUARdxfsVkrNHH cpgvqqQwekSOwDvT2XFggjrK1qncP6e8KxFKfkrowgtFM+oCfL5JXDUH0XibEB573NGZ td2Kn1WhtcPvmpGq5E0OYDwajs49DX6MecdYd9WPCtk8HpuxnUTaediidglD1RUvj0oD 7gAQ== X-Gm-Message-State: APjAAAVCcE5o7OhSzgJ2I5zjyw4haaZApAuZfDvRojAWQgWuLMojNVgL KyWgk/zeK31LyfP+lVSvh1TS490dRKyaOch5SOfzsD5D5oJ5rW5r X-Google-Smtp-Source: APXvYqzV9IHO7XpeGbdT0vX073ZlY9bc0+m7lly2Q5fV8I/+7BvOmMAJgs3v+tfCvjUns4Rwxey50KHY6G2UImsSCpA= X-Received: by 2002:a67:e07:: with SMTP id 7mr16000296vso.220.1557818771121; Tue, 14 May 2019 00:26:11 -0700 (PDT) X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::e43 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." 
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: "Emacs-devel"
Xref: news.gmane.org gmane.emacs.devel:236492
Archived-At:

--0000000000007a44840588d3f2ea
Content-Type: multipart/alternative; boundary="0000000000007a44810588d3f2e7"

--0000000000007a44810588d3f2e7
Content-Type: text/plain; charset="UTF-8"

I was very happy to see that in v27.0.50 (regexp-opt nil) now properly returns a regular expression that matches nothing, namely a\`. Thanks to whoever fixed that old bug.

I was wondering why (regexp-opt nil) uses a\` and not \'a or another option like \=a\= so I did some profiling (see attached code).

The different options that I tried have more or less the same response time when one checks, via looking-at, whether the regexp matches at point. But when one searches for a match across a whole buffer, some options behave notably faster than the others. And a\` is not the best, e.g., \=a\= is way faster. Maybe some other solutions would be even faster.

Of course this may be dependent on the internals of the specific regexp library at hand. I do not know enough to judge. In fact I believe that a solid regular expression library should provide a specific regular expression that matches nothing with special but easy treatment that guarantees best response time.

--phs

--0000000000007a44810588d3f2e7
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--0000000000007a44810588d3f2e7-- --0000000000007a44840588d3f2ea Content-Type: application/octet-stream; name="profile-empty-regexp.el" Content-Disposition: attachment; filename="profile-empty-regexp.el" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_jvnh29kz0 KGRlZnVuIHByb2ZpbGUtZW1wdHktcmVnZXhwcyAoJm9wdGlvbmFsIGJ1ZmZlcikKICAiUmVwb3J0 IHNvbWUgbWF0Y2hpbmcgdGltZXMgZm9yIHNldmVyYWwgcmVndWxhciBleHByZXNzaW9ucy4iCiAg KGludGVyYWN0aXZlKQogICh1bmxlc3MgYnVmZmVyCiAgICAoc2V0cSBidWZmZXIgKGZpbmQtZmls ZS1ub3NlbGVjdCAobG9jYXRlLWxpYnJhcnkgInJlZ2V4cC1vcHQiKSkpKQogICh3aXRoLW91dHB1 dC10by10ZW1wLWJ1ZmZlciAiUHJvZmlsaW5nIgogICAgKHByaW5jIChwcm9maWxlLW9uZS1yZWdl eHAgImFcXGAiIGJ1ZmZlcikpICAgICAgICAocHJpbmMgIlxuIikKICAgIChwcmluYyAocHJvZmls ZS1vbmUtcmVnZXhwICJcXCdhIiBidWZmZXIpKSAgICAgICAgKHByaW5jICJcbiIpCiAgICAocHJp bmMgKHByb2ZpbGUtb25lLXJlZ2V4cCAiXFw9LlxcPSIgYnVmZmVyKSkgICAgIChwcmluYyAiXG4i KQogICAgKHByaW5jIChwcm9maWxlLW9uZS1yZWdleHAgIlxcPWFcXD0iIGJ1ZmZlcikpICAgICAo cHJpbmMgIlxuIikKICAgIChwcmluYyAocHJvZmlsZS1vbmUtcmVnZXhwICJcXC4iIGJ1ZmZlcikp ICAgICAgICAgKHByaW5jICJcbiIpCiAgICApKQoKKGRlZnVuIHByb2ZpbGUtb25lLXJlZ2V4cCAo cmVnZXhwIGJ1ZmZlciAmb3B0aW9uYWwgbmJyZXBlYXRzKQogIDs7IFRoZSB3b3JraG9yc2UKICAo c2V0cSBuYnJlcGVhdHMgKG9yIG5icmVwZWF0cyA1MDAwMCkpCiAgKGxldCAoc3RhcnQtdGltZSBk dXJhdGlvbjEgZHVyYXRpb24yIGZvdW5kKQogICAgKHdpdGgtY3VycmVudC1idWZmZXIgYnVmZmVy CiAgICAgIChzYXZlLWV4Y3Vyc2lvbgoJKHNldHEgc3RhcnQtdGltZSAoY3VycmVudC10aW1lKSkK CShnb3RvLWNoYXIgKHBvaW50LW1pbikpCgkoZG90aW1lcyAoXyBuYnJlcGVhdHMpCgkgIChsb29r aW5nLWF0IHJlZ2V4cCkpCgkoc2V0cSBkdXJhdGlvbjEgKHRpbWUtc3VidHJhY3QgKGN1cnJlbnQt dGltZSkgc3RhcnQtdGltZSkpCgkoc2V0cSBzdGFydC10aW1lIChjdXJyZW50LXRpbWUpKQoJKGdv dG8tY2hhciAocG9pbnQtbWluKSkKCShkb3RpbWVzIChfIG5icmVwZWF0cykKCSAgKHdoZW4gKHJl LXNlYXJjaC1mb3J3YXJkIHJlZ2V4cCBuaWwgdCkgCgkgICAgKHNldHEgZm91bmQgdCkKCSAgICAo Z290by1jaGFyIChwb2ludC1taW4pKSkpCTs7IHJldHVybiB0byB0ZXN0IHBvc2l0aW9uCgkoc2V0 cSBkdXJhdGlvbjIgKHRpbWUtc3VidHJhY3QgKGN1cnJlbnQtdGltZSkgc3RhcnQtdGltZSkpKSkK 
ICAgIChmb3JtYXQgIlRlc3RpbmcgcmVnZXhwICVzICVkIHRpbWVzXG5cdG1hdGNoIGF0IHBvaW50 LW1pbjogJS40ZnNcblx0c2VhcmNoIGluIGJ1ZmZlciAlcyAoc2l6ZSAlZCk6ICUuNGZzXG4lcyIg cmVnZXhwIG5icmVwZWF0cyAoZmxvYXQtdGltZSBkdXJhdGlvbjEpIChidWZmZXItbmFtZSBidWZm ZXIpIAoJICAgIChidWZmZXItc2l6ZSBidWZmZXIpIChmbG9hdC10aW1lIGR1cmF0aW9uMikKCSAg ICAoaWYgZm91bmQgIlx0KioqIFdBUk5JTkcgKioqIGEgbWF0Y2ggd2FzIGZvdW5kXG4iICIiKSkp KQoK --0000000000007a44840588d3f2ea--