From: Mattias Engdegård
Newsgroups: gmane.emacs.bugs
Subject: bug#44173: 28.0.50; gdb-mi mangles strings with octal escapes
Date: Sat, 24 Oct 2020 20:27:13 +0200
Message-ID: <3191F82D-0C4C-4E56-B96D-798C22234E63@acm.org>
In-Reply-To: <837drfh72d.fsf@gnu.org>
To: Eli Zaretskii
Cc: 44173@debbugs.gnu.org

On 24 Oct 2020, at 19:23, Eli Zaretskii wrote:

>> If gdb-mi-decode-strings is non-nil, then file names, string contents etc are properly decoded as UTF-8 as expected
>
> Not UTF-8, but the value of gdb-mi-decode-strings, if it's a
> coding-system, right?

Right.

> I hoped/thought you intended to solve this issue as well, but if the
> situation is no worse than it was before, it's fine to leave it at
> that. However, please retain at least part of the comment regarding
> gdb-mi-decode-strings and the ambiguity related to its use, I think
> it's important that people know that.

Yes, the valid parts of the comment will be kept.

I'm not sure what a solution to the remaining problems would look like, but it would probably involve splitting gdb-mi-decode-strings into separate variables for file names and program values. On the other hand, given that the world is converging on UTF-8, it may be a disappearing problem?

In any case, should we want to decode strings differently depending on their structural position in the answer, I believe that it would be better done in the field accessors instead of the parser. For example,

  (bindat-get-field breakpoint 'fullname)

might become something like

  (gdb-mi--get-string-field breakpoint 'fullname 'filename)

which would tell the accessor how to decode the field.

In the short term I suggest changing the default value of gdb-mi-decode-strings to 't' as this gives the behaviour most commonly expected by the user. However, it is not critical, and in any case orthogonal to the issue at hand. What do you think?

> And I hope you've verified that this does still fix the problem in
> bug#21572, which this variable and the related code tries to fix?

Yes -- I tried debugging programs whose source file names contain Unicode chars and they were shown correctly (with gdb-mi-decode-strings = t).

>> + (t
>> +  (error "Unrecognised escape char: %c" (following-char))))
>
> How about leaving the text unchanged instead of signaling an error
> (and thus preventing the entire data from getting to the higher
> levels)?

Maybe, but I really dislike hiding bugs by being overly tolerant. It is precisely this tolerant nature of 'json-read' that caused this bug in the first place. (I'm not sure whether this is compliant with RFC 8259, by the way.)

I think it's fine to signal errors if the syntax isn't what we expect; after all, that is what the JSON parser does in other cases.

Thanks for the helpful comments. I'll prepare a proper patch.
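
For illustration only, here is a minimal sketch of what the accessor idea above could look like. It is not code from gdb-mi.el: the function name is the one floated above, the two coding-system variables are hypothetical stand-ins for a split of gdb-mi-decode-strings, and the sketch assumes the parser hands back undecoded unibyte strings.

```elisp
;;; -*- lexical-binding: t -*-
(require 'bindat)

;; Hypothetical variables; neither exists in gdb-mi.el.  They stand in
;; for splitting `gdb-mi-decode-strings' by kind of string.
(defvar my-gdb-mi-filename-coding 'utf-8
  "Coding system assumed for file names reported by GDB.")
(defvar my-gdb-mi-value-coding 'utf-8
  "Coding system assumed for string values of the debugged program.")

;; Hypothetical accessor: the name comes from the suggestion above,
;; the implementation is only a sketch.
(defun gdb-mi--get-string-field (struct field kind)
  "Return FIELD of STRUCT, decoded according to KIND.
KIND is the symbol `filename' or `value'.  Assumes the parser left
the field as a unibyte string of raw bytes."
  (let ((raw (bindat-get-field struct field))
        (coding (pcase kind
                  ('filename my-gdb-mi-filename-coding)
                  ('value my-gdb-mi-value-coding)
                  (_ (error "Unknown field kind: %S" kind)))))
    ;; A missing field stays nil instead of signalling an error.
    (and raw (decode-coding-string raw coding))))

;; Usage with an alist shaped like a parsed breakpoint record; the
;; octal escapes yield the two raw bytes of a UTF-8 encoded "é".
(gdb-mi--get-string-field
 '((fullname . "/tmp/caf\303\251.c")) 'fullname 'filename)
;; => "/tmp/café.c"
```

Decoding in the accessor keeps the parser ignorant of what each field means, at the cost of every call site having to state the kind of string it expects.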