From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#58847: Patch to properly parse c++11 multiline strings
Date: Sat, 29 Oct 2022 10:41:26 +0300
Message-ID: <83h6znhyah.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: 58847@debbugs.gnu.org
To: Jan Stranik, Gerd Möllmann
X-GNU-PR-Message: followup 58847
X-GNU-PR-Package: emacs
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"

> Date: Fri, 28 Oct 2022 16:13:42 -0400
> From: Jan Stranik via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors"
>
> Hello -
> I’m a happy user of Emacs and its ebrowse feature. Recently I noticed that ebrowse does not work for multi-line strings in C++.
> The raw-string parsing is also enabled for C files, but that does not matter, since C does not have raw strings.

Thanks.

Gerd, any comments?

> EBROWSE: parse c++11 rstrings
>
> C++11 allows the definition of multi-line strings. This patch makes ebrowse parse these strings properly.
>
> Example of test multi-line string:
>
> repro.cxx:
> ----------
> struct Foo {
>   using STR = const char*;
>   STR rstrprefix = R"prefix(is is a C++11 multi
> line string
> )prefix";
>
>   STR rstr = R"(
> multiline string without a prefix
> )";
>
>   STR rstr_test = R"prefix(
> )prefix not at end
> )prefixtoolong"
> )pref" to short
>
> string still continues
> )prefix";
>
>   const char* str = "a regular string";
>
>   void func() {
>   }
> };
> ----------
>
> ~/project/test/lit_repro $ c++ -std=c++11 -c repro.cxx  # repro.cxx compiles
>
> ~/project/test/lit_repro $ ebrowse repro.cxx  # current ebrowse chokes on the file and produces wrong symbols
> repro.cxx:3: newline in string constant
> repro.cxx:4: newline in string constant
> repro.cxx:7: newline in string constant
> repro.cxx:8: newline in string constant
> repro.cxx:11: newline in string constant
> repro.cxx:12: newline in string constant
> repro.cxx:14: newline in string constant
> repro.cxx:15: newline in string constant
> repro.cxx:16: newline in string constant
> ~/project/test/lit_repro $ cat BROWSE
> [ebrowse-hs "ebrowse 5.0" " -x" () ()][ebrowse-ts [ebrowse-cs "Foo" () 0"repro.cxx" "struct Foo {" 12"repro.cxx" ]
> ()([ebrowse-ms "R" () 0 () "multiline string without a prefix
> )\";" 175 0 () () 0]
> [ebrowse-ms "pref" () 0 () ")prefix\";" 291 0 () () 0]
> [ebrowse-ms "str" () 0 () " const char* str = \"a regular string\";" 334 0 () () 0]
> )
> ([ebrowse-ms "func" () 0 () " void func()" 351 0 () " void func()" 351]
> )
> ~/project/test/lit_repro $ ~/Downloads/emacs-master/lib-src/ebrowse repro.cxx  # patched ebrowse properly parses the source and generates symbols
> ~/project/test/lit_repro $ cat BROWSE
> [ebrowse-hs "ebrowse 5.0" " -x" () ()][ebrowse-ts [ebrowse-cs "Foo" () 0"repro.cxx" "struct Foo {" 12"repro.cxx" ]
> ()([ebrowse-ms "rstr" () 0 () "multiline string without a prefix
> )\";" 175 0 () () 0]
> [ebrowse-ms "rstr_test" () 0 () ")prefix\";" 291 0 () () 0]
> [ebrowse-ms "rstrprefix" () 0 () ")prefix\";" 117 0 () () 0]
> [ebrowse-ms "str" () 0 () " const char* str = \"a regular string\";" 334 0 () () 0]
> )
> ([ebrowse-ms "func" () 0 () " void func()" 351 0 () " void func()" 351]
> )
> Index: emacs-master/lib-src/ebrowse.c
> ===================================================================
> --- emacs-master.orig/lib-src/ebrowse.c
> +++ emacs-master/lib-src/ebrowse.c
> @@ -1574,6 +1574,51 @@ yylex (void)
>
>          end_string:
>            return end_char == '\'' ? CCHAR : CSTRING;
> +        case 'R':
> +          if (GET (c) == '"') {
> +            /* c++11 rstrings */
> +
> +            #define RSTRING_EOF_CHECK do {if (c=='\0') { yyerror("unterminated c++11 rstring", NULL); UNGET(); return CSTRING;}}while(0)
> +            char *rstring_prefix_start = in;
> +
> +            while (GET (c) != '(') {
> +              RSTRING_EOF_CHECK;
> +              if (c == '"')
> +                {
> +                  yyerror ("malformed c++11 rstring", NULL);
> +                  return CSTRING;
> +                }
> +            }
> +            char *rstring_prefix_end = in - 1;
> +            while (TRUE) {
> +              switch(GET (c)) {
> +              default:
> +                RSTRING_EOF_CHECK;
> +                break;
> +              case '\n':
> +                INCREMENT_LINENO;
> +                break;
> +              case ')':
> +                {
> +                  char *in_saved = in;
> +                  char *prefix = rstring_prefix_start;
> +                  while (prefix != rstring_prefix_end && GET (c) == *prefix) {
> +                    RSTRING_EOF_CHECK;
> +                    prefix++;
> +                  }
> +                  if (prefix == rstring_prefix_end) {
> +                    if (GET(c) == '"')
> +                      return CSTRING;
> +                    RSTRING_EOF_CHECK;
> +                  }
> +                  in = in_saved;
> +                }
> +              }
> +            }
> +          }
> +
> +          UNGET ();
> +          /* fall through to ident */
>
>          case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': case 'g':
>          case 'h': case 'i': case 'j': case 'k': case 'l': case 'm': case 'n':
> @@ -1581,7 +1626,7 @@ yylex (void)
>          case 'v': case 'w': case 'x': case 'y': case 'z':
>          case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': case 'G':
>          case 'H': case 'I': case 'J': case 'K': case 'L': case 'M': case 'N':
> -        case 'O': case 'P': case 'Q': case 'R': case 'S': case 'T': case 'U':
> +        case 'O': case 'P': case 'Q': case 'S': case 'T': case 'U':
>          case 'V': case 'W': case 'X': case 'Y': case 'Z': case '_':
>            {
>              /* Identifier and keywords.  */
>
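
As an illustration of the delimiter matching that the patch has to do, here is a minimal stand-alone sketch in C.  It is not the code from the patch and not anything in ebrowse.c: the name skip_raw_string and its interface are hypothetical, it assumes a NUL-terminated buffer with the pointer already past the opening R", and it does not enforce the standard's restrictions on delimiter characters or length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of ebrowse.c.  P points just past the
   opening R" of a raw string literal in a NUL-terminated buffer.
   Return a pointer just past the closing '"', or NULL if the literal
   is malformed or unterminated.  If LINES is non-null, add the number
   of newlines skipped to *LINES.  */
static const char *
skip_raw_string (const char *p, int *lines)
{
  assert (p != NULL);

  /* The delimiter is everything between the opening quote and '('.  */
  const char *delim = p;
  while (*p != '(')
    {
      if (*p == '\0' || *p == '"' || *p == '\n')
        return NULL;            /* no '(' before end of delimiter */
      p++;
    }
  size_t delim_len = (size_t) (p - delim);
  p++;                          /* skip the '(' */

  /* The literal ends only at ')' + delimiter + '"'; any other ')'
     is ordinary string content.  */
  for (; *p != '\0'; p++)
    {
      if (*p == '\n' && lines != NULL)
        ++*lines;
      else if (*p == ')'
               && strncmp (p + 1, delim, delim_len) == 0
               && p[1 + delim_len] == '"')
        return p + delim_len + 2;
    }
  return NULL;                  /* hit end of buffer: unterminated */
}
```

On the rstr_test literal from repro.cxx, a scanner like this would treat the ")prefix not at end", ")prefixtoolong"" and ")pref"" lines as string content and stop only at the final )prefix".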