From: Juri Linkov
Newsgroups: gmane.emacs.devel
Subject: Re: Syntactic fontification of diff hunks
Date: Fri, 17 Aug 2018 20:47:08 +0300
Organization: LINKOV.NET
Message-ID: <87o9e0khr7.fsf@mail.linkov.net>
References: <87in4af29r.fsf@mail.linkov.net>
In-Reply-To: (Yuri Khan's message of "Fri, 17 Aug 2018 13:47:57 +0700")
To: Yuri Khan
Cc: Andreas Röhler, Emacs developers

>> > I suppose some integration will be needed before this works in Magit, too?

If Magit doesn't use diff-mode, then yes, the code should be
duplicated in Magit.

>> > Also, it looks like it’s going to be somewhere between slightly and
>> > horribly inaccurate depending on where the hunk starts (e.g. in the
>> > middle of a string literal or comment)?
>>
>> Which probably thwarts the whole project.
>
> Not necessarily; I think it might be good enough for a major part of
> real-life use cases. Also, if the diff comes from an interactive tool,
> the user can probably just increase context size a few times so as to
> shift the balance towards “slightly inaccurate”.

For example, git's --function-context option will provide enough context.

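To make the context-widening idea concrete, here is a minimal sketch of
how a front end might assemble the `git diff` command line.  The helper
name `git_diff_command` and the file name `foo.el` are hypothetical;
the only git features assumed are the `--function-context` and
`--unified=<n>` options of `git diff`.

```python
def git_diff_command(paths=(), context_lines=None, function_context=False):
    """Build an argv list for `git diff` with a wider context.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    argv = ["git", "diff"]
    if function_context:
        # Show the whole enclosing function as context for each change.
        argv.append("--function-context")
    if context_lines is not None:
        # Show this many context lines around each change.
        argv.append("--unified=%d" % context_lines)
    argv.append("--")  # everything after this is a path, not an option
    argv.extend(paths)
    return argv


# `foo.el` is a placeholder file name.
command = git_diff_command(["foo.el"], function_context=True)
```

The resulting list can be handed to `subprocess.run(command, capture_output=True, text=True)`
inside a git working tree; whether the extra context is enough still
depends on where the hunk happens to start.
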
>> Theoretically if being behind
>> the end or start of a string could be fetched from source files -
>
> That condition is undecidable based on the hunk text alone.
>
> An implementation that has access to at least one of the source files
> could just fontify that whole file and extract fontification from that
> using hunk line offsets (assuming they are accurate). This probably
> covers Ediff, vc, and Magit, but not reading standalone patches.

I see that GitLab and GitHub highlight syntax on the server using whole
files, then send the highlighted HTML chunks to the browser.  In the same
way, we can easily get whole files from git by their SHA from the index
line in the diff headers, or get files by relative path when running
diff on regular files.

But when reading standalone patches, e.g. in Gnus, there is indeed
a potential problem.  In practice, though, when I looked at large patches
there were only a few such problematic places, and fortunately each is
confined to a single diff hunk, unlike font-lock failures in some
programming language modes (I most often remember such cases in cperl mode),
where an incorrectly recognized quote breaks fontification to the end
of the whole file.

> There is also the theoretical issue of syntax being changed by the
> patch — e.g. introducing an unbalanced multiline string or comment
> opener or closer on a separate line.
>
>      (defun foo ()
>     +  "
>        (bar baz)
>     +  "
>        (quux xyzzy))
>
> The middle line here has (punctuation identifier identifier
> punctuation) syntax according to the “before” file, but (string)
> according to “after”.

I think “after” should have priority over “before” in context, because
the main goal of reading patches is to see how the code will look after
the changes, so in this case ‘(bar baz)’ should be highlighted as a string.
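
To illustrate the “before” versus “after” point, here is a rough sketch
(in Python rather than Emacs Lisp, and not the diff-mode implementation)
that splits one unified-diff hunk into its two sides, numbers the lines
using the offsets from the hunk header, and runs a toy string scanner
over each side.  The names `hunk_side` and `starts_inside_string` are
made up for this example, the `@@ -1,3 +1,5 @@` header is assumed for
the hunk quoted above, and the scanner only understands double quotes
and backslash escapes.

```python
import re

# Unified diff hunk header: @@ -old_start[,old_len] +new_start[,new_len] @@
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def hunk_side(hunk_lines, side):
    """Return [(line_number, text)] for the "before" or "after" side of a hunk."""
    if side not in ("before", "after"):
        raise ValueError("side must be 'before' or 'after'")
    match = HUNK_HEADER.match(hunk_lines[0])
    if match is None:
        raise ValueError("not a unified diff hunk header: %r" % hunk_lines[0])
    line_no = int(match.group(1 if side == "before" else 3))
    own_marker = "-" if side == "before" else "+"
    numbered = []
    for line in hunk_lines[1:]:
        marker, text = line[:1], line[1:]
        # Context lines (' ', or '' when trailing whitespace was stripped)
        # belong to both sides; '-' and '+' lines belong to one side only.
        # Anything else, such as "\ No newline at end of file", is skipped.
        if marker in (" ", "", own_marker):
            numbered.append((line_no, text))
            line_no += 1
    return numbered


def starts_inside_string(lines):
    """Toy scanner: for each line, does it begin inside a double-quoted string?

    Ignores comments and character literals, so it is only an illustration.
    """
    inside = False
    flags = []
    for text in lines:
        flags.append(inside)
        escaped = False
        for ch in text:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                inside = not inside
    return flags


# The hunk from the quoted example, with an assumed header.
HUNK = [
    "@@ -1,3 +1,5 @@",
    " (defun foo ()",
    '+  "',
    "   (bar baz)",
    '+  "',
    "   (quux xyzzy))",
]

before = hunk_side(HUNK, "before")
after = hunk_side(HUNK, "after")
before_flags = starts_inside_string([text for _, text in before])
after_flags = starts_inside_string([text for _, text in after])
```

For this hunk, `before_flags` marks no line as starting inside a string,
while `after_flags` marks the ‘(bar baz)’ line (line 3 of the new file)
as inside one.  With access to the whole file, the line numbers returned
by `hunk_side` are what one would use to look up fontification computed
on the full text, assuming the header offsets are accurate.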