From: Michal Nazarewicz
Newsgroups: gmane.emacs.bugs
Subject: bug#24071: [PATCH] Refactor regex character class parsing in [:name:]
Date: Wed, 27 Jul 2016 17:29:04 +0200
Organization: http://mina86.com/
References: <1469487245-11126-1-git-send-email-mina86@mina86.com> <83d1m0tq25.fsf@gnu.org>
To: Eli Zaretskii, Dima Kogan
Cc: 24071@debbugs.gnu.org

On Tue, Jul 26 2016, Eli Zaretskii wrote:
>> From: Michal Nazarewicz
>> Date: Tue, 26 Jul 2016 00:54:05 +0200
>>
>> re_wctype function is used in three separate places and in all of
>> those places almost exact code extracting the name from [:name:]
>> surrounds it.  Furthermore, re_wctype requires a NUL-terminated
>> string, so the name of the character class is copied to a temporary
>> buffer.
>>
>> The code duplication and unnecessary memory copying can be avoided by
>> pushing the responsibility of parsing the whole [:name:] sequence to
>> the function.
>>
>> Furthermore, since now the function has access to the length of the
>> character class name (since it’s doing the parsing), it can take
>> advantage of that information in skipping some string comparisons and
>> using a constant-length memcmp instead of strcmp which needs to take
>> care of NUL bytes.
>
> Thanks.
>
> If we are going to make some serious refactoring in regex.c, I think
> we should start with having a test suite for it.

I agree, which is why I started test/src/regex-tests.el¹.  Since this
patch touches only character classes, I limited the tests to character
classes.

¹ In fact, the bug I fixed with the previous patch was discovered
  precisely because I wrote tests for this patch.

> The dima_regex_embedded_modifiers branch, created by Dima Kogan
> (CC'ed) in the Emacs repository includes a suite taken from glibc.
> Dima, could you perhaps merge the parts of the test suite that can
> already be used to the master branch, so that we could use them to
> verify changes in regex.c?

This looks relatively straightforward; I can take care of it.  I’ll
send a link to the result soon.

-- 
Best regards
ミハウ “𝓶𝓲𝓷𝓪86” ナザレヴイツ
«If at first you don’t succeed, give up skydiving»
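
For illustration only: the idea in the quoted patch summary (let the
function parse the whole [:name:] sequence itself, then use the known
name length to skip impossible candidates and compare with a
fixed-length memcmp) might look roughly like the sketch below.  Every
identifier here (parse_char_class, the CC_* values, the limit
parameter) is hypothetical and is not taken from regex.c or from the
patch; the list of classes is deliberately incomplete.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical class identifiers; NOT the enum used in regex.c.  */
typedef enum { CC_INVALID = 0, CC_ALPHA, CC_ALNUM, CC_DIGIT, CC_SPACE,
               CC_UPPER, CC_LOWER, CC_XDIGIT, CC_WORD } char_class;

/* Parse "[:name:]" at *STRP, where LIMIT bytes are readable.
   On success, advance *STRP past the closing "]" and return the class.
   Otherwise leave *STRP untouched and return CC_INVALID.
   No NUL terminator and no temporary copy of the name are needed.  */
static char_class
parse_char_class (const char **strp, size_t limit)
{
  const char *beg = *strp;
  if (limit < 4 || beg[0] != '[' || beg[1] != ':')
    return CC_INVALID;
  beg += 2;
  limit -= 2;

  /* Find the ':' ending the name; a ']' must follow it in range.  */
  const char *colon = memchr (beg, ':', limit);
  if (!colon || (size_t) (colon - beg) + 1 >= limit || colon[1] != ']')
    return CC_INVALID;

  size_t len = (size_t) (colon - beg);
  char_class cc = CC_INVALID;

  /* Within each case LEN equals the literal's length, so a
     constant-length memcmp is enough.  */
#define IS(lit) (memcmp (beg, lit, sizeof lit - 1) == 0)
  switch (len)
    {
    case 4:
      if (IS ("word")) cc = CC_WORD;
      break;
    case 5:
      if (IS ("alpha")) cc = CC_ALPHA;
      else if (IS ("alnum")) cc = CC_ALNUM;
      else if (IS ("digit")) cc = CC_DIGIT;
      else if (IS ("space")) cc = CC_SPACE;
      else if (IS ("upper")) cc = CC_UPPER;
      else if (IS ("lower")) cc = CC_LOWER;
      break;
    case 6:
      if (IS ("xdigit")) cc = CC_XDIGIT;
      break;
    }
#undef IS

  if (cc != CC_INVALID)
    *strp = colon + 2;
  return cc;
}
```

A caller would pass a pointer to the "[" and the number of bytes left
in the pattern; on CC_INVALID the pointer is unchanged, so the caller
can fall back to treating "[" as an ordinary bracket character.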