From: Mattias Engdegård
Newsgroups: gmane.emacs.devel
Subject: Re: Scan of regexp mistakes
Date: Sat, 9 Mar 2019 13:36:59 +0100
Message-ID: <0AAB9274-F50F-45D7-9043-FB7A49416CF8@acm.org>
References: <3ef768c2-98d9-a42d-067a-4a5ffc945cf4@cs.ucla.edu>
To: Paul Eggert
Cc: emacs-devel

On 8 March 2019, at 18:13, Paul Eggert wrote:
>
> On 3/5/19 7:06 AM, Mattias Engdegård wrote:
>>
>> I can run it periodically but would surely forget. Should I put the trawler in the Emacs source tree (if so, where?), in ELPA, or elsewhere?
>
> Stefan mentioned one possibility. Though even then I daresay it'd be
> helpful if you ran it periodically, just as I periodically run
> admin/merge-gnulib. (If you don't run it, it's likely nobody else will....)

You are both right. I shall try to run it periodically but also see if it can be integrated into `make check'.

It depends on an ELPA package (xr); does this require any special treatment? We don't want `make check' to fail if xr isn't installed.

Thanks for your updates, and I agree with your changes.

>> (replace-regexp-in-string "[\000-\032\177<>#%\"{}|\\^[]`%?;]"
>>
>> That \032 doesn't look right (number base confusion?), and it looks like it's meant as a single character alternative but it isn't, given the misplaced `]'.
>
> The regexp has other troubles. It doesn't include !$'()*+,/:@&= (all of
> which are reserved characters according to RFC 3986), and it has
> duplicate %. The attached patch fixes the % and puts in a FIXME about
> the other chars.

Thank you. It annoyed me that I couldn't catch this regexp with any formal rule violation (maybe we should try tax evasion). An ad-hoc pattern to catch [...[...]...] did work, and caught nothing else, but I was afraid it would become a false positive in legitimate patterns.

>> --- a/lisp/progmodes/fortran.el
>> +++ b/lisp/progmodes/fortran.el
>> @@ -2052,7 +2052,7 @@ If ALL is nil, only match comments that start in column > 0."
>>  (when (<= (point) bos)
>>  (move-to-column (1+ fill-column))
>>  ;; What is this doing???
>> - (or (re-search-forward "[\t\n,'+-/*)=]" eol t)
>> + (or (re-search-forward "[-\t\n,'+./*)=]" eol t)
>>
>> Where did the . come from? Don't you think that `+-/*' were meant to include those four symbols only?
>
> I couldn't figure out what the code was doing (note the comment...) so
> decided to preserve the semantics of the old regexp. But you're right,
> "." is likely not intended there. I removed it in the attached.

It appears to look for the first good place to desperately break a line that is already indented beyond the margin, using a convention of breaking after binary operators. From this point of view, excluding . is probably correct, lest we split .EQ. . I can't really explain the "\t\n')" part; in particular, the search bound should make it impossible for \n to match.

It still mangles some things: for instance,

  ALPHA**BETA

is line-broken as

  ALPHA*
  *BETA

Thus we might want to improve the regexp to (rx (or "**" (any "\t\n,')=*/+-"))), but we should really ask a Fortran expert.
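
To make the two points above concrete, here is a small sketch that can be evaluated in *scratch*. It only exercises the regexps on plain strings and does not reproduce the fortran.el code; the helper `my-break-position' is a hypothetical name introduced purely for illustration.

```elisp
(require 'rx)

;; Inside a bracket expression, "+-/" is a range from ?+ (43) to ?/ (47),
;; so the old regexp also matches "," and ".".
(string-match-p "[\t\n,'+-/*)=]" ".")   ; non-nil: "." lies inside the range
(string-match-p "[-\t\n,'+/*)=]" ".")   ; nil: a leading "-" is literal

;; Hypothetical helper (not part of fortran.el): return the position
;; just after the first match of REGEXP in STRING, or nil if none.
(defun my-break-position (regexp string)
  (and (string-match regexp string)
       (match-end 0)))

;; A single-character class stops between the two asterisks.
(my-break-position "[-\t\n,'+/*)=]" "ALPHA**BETA")                     ; => 6

;; Trying "**" first keeps the exponentiation operator in one piece.
(my-break-position (rx (or "**" (any "\t\n,')=*/+-"))) "ALPHA**BETA")  ; => 7
```

This relies on the Emacs matcher trying alternatives left to right, which is why "**" is listed before the character set. Whether breaking after "**" is actually good Fortran style remains the open question for a Fortran expert.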