From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#24071: [PATCH] Refactor regex character class parsing in [:name:]
Date: Tue, 26 Jul 2016 17:46:42 +0300
Message-ID: <83d1m0tq25.fsf@gnu.org>
References: <1469487245-11126-1-git-send-email-mina86@mina86.com>
In-reply-to: <1469487245-11126-1-git-send-email-mina86@mina86.com> (message from Michal Nazarewicz on Tue, 26 Jul 2016 00:54:05 +0200)
To: Michal Nazarewicz, Dima Kogan
Cc: 24071@debbugs.gnu.org

> From: Michal Nazarewicz
> Date: Tue, 26 Jul 2016 00:54:05 +0200
>
> re_wctype function is used in three separate places and in all of
> those places almost exact code extracting the name from [:name:]
> surrounds it.  Furthermore, re_wctype requires a NUL-terminated
> string, so the name of the character class is copied to a temporary
> buffer.
>
> The code duplication and unnecessary memory copying can be avoided by
> pushing the responsibility of parsing the whole [:name:] sequence to
> the function.
>
> Furthermore, since now the function has access to the length of the
> character class name (since it’s doing the parsing), it can take
> advantage of that information in skipping some string comparisons and
> using a constant-length memcmp instead of strcmp which needs to take
> care of NUL bytes.

Thanks.

If we are going to make some serious refactoring in regex.c, I think
we should start with having a test suite for it.  The
dima_regex_embedded_modifiers branch, created by Dima Kogan (CC'ed) in
the Emacs repository includes a suite taken from glibc.

Dima, could you perhaps merge the parts of the test suite that can
already be used to the master branch, so that we could use them to
verify changes in regex.c?
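
For readers without the patch at hand, the approach in the quoted description can be illustrated roughly as below. This is only a sketch under stated assumptions, not the code from the patch or from regex.c: the function name parse_char_class, the char_class enum, and the reduced set of class names are all hypothetical. It parses the whole [:name:] sequence in place, so no NUL-terminated copy is needed, and uses the known name length to pick which fixed-length memcmp calls to try.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical class identifiers; regex.c defines its own set.  */
typedef enum
{
  CC_ERROR, CC_ALPHA, CC_ALNUM, CC_DIGIT, CC_SPACE,
  CC_UPPER, CC_LOWER, CC_XDIGIT
} char_class;

/* Parse a "[:name:]" sequence at *STRP, reading at most LIMIT bytes.
   On success, advance *STRP past the closing ":]" and return the
   class.  On failure, leave *STRP unchanged and return CC_ERROR.
   The input does not have to be NUL-terminated.  */
char_class
parse_char_class (const char **strp, size_t limit)
{
  const char *beg = *strp;
  const char *end = beg + limit;
  const char *p;
  size_t len;
  char_class cc = CC_ERROR;

  /* The shortest well-formed sequence, "[::]", is four bytes.  */
  if (limit < 4 || beg[0] != '[' || beg[1] != ':')
    return CC_ERROR;
  beg += 2;

  /* Find the ':' ending the name; it must be followed by ']'.  */
  p = memchr (beg, ':', (size_t) (end - beg));
  if (p == NULL || end - p < 2 || p[1] != ']')
    return CC_ERROR;
  len = (size_t) (p - beg);

  /* The length is known, so only names of that length are compared,
     each with a constant-length memcmp.  */
  switch (len)
    {
    case 5:
      if (memcmp (beg, "alpha", 5) == 0) cc = CC_ALPHA;
      else if (memcmp (beg, "alnum", 5) == 0) cc = CC_ALNUM;
      else if (memcmp (beg, "digit", 5) == 0) cc = CC_DIGIT;
      else if (memcmp (beg, "space", 5) == 0) cc = CC_SPACE;
      else if (memcmp (beg, "upper", 5) == 0) cc = CC_UPPER;
      else if (memcmp (beg, "lower", 5) == 0) cc = CC_LOWER;
      break;
    case 6:
      if (memcmp (beg, "xdigit", 6) == 0) cc = CC_XDIGIT;
      break;
    default:
      break;
    }

  if (cc != CC_ERROR)
    *strp = p + 2;
  return cc;
}
```

A caller would pass a pointer to its current position in the pattern together with the number of bytes remaining, and treat CC_ERROR as "not a character class here". Having a test suite in place first would make it possible to check that a rewrite of this kind keeps the existing behavior.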