From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.ciao.gmane.io!not-for-mail From: Anders Lindgren Newsgroups: gmane.emacs.bugs Subject: bug#41441: 26.3; mhtml misbehaving font-lock extend region can cause Emacs to hang Date: Wed, 27 May 2020 22:40:55 +0200 Message-ID: References: <87pnasro4o.fsf@tromey.com> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="00000000000030c90d05a6a73bd5" Injection-Info: ciao.gmane.io; posting-host="ciao.gmane.io:159.69.161.202"; logging-data="52940"; mail-complaints-to="usenet@ciao.gmane.io" Cc: 41441@debbugs.gnu.org To: Tom Tromey Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Wed May 27 22:42:12 2020 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1je2sM-000Des-Jr for geb-bug-gnu-emacs@m.gmane-mx.org; Wed, 27 May 2020 22:42:10 +0200 Original-Received: from localhost ([::1]:53648 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1je2sL-0003EQ-Kz for geb-bug-gnu-emacs@m.gmane-mx.org; Wed, 27 May 2020 16:42:09 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:35710) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1je2sE-0003BN-IG for bug-gnu-emacs@gnu.org; Wed, 27 May 2020 16:42:02 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]:38530) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1je2sE-0001jn-7V for bug-gnu-emacs@gnu.org; Wed, 27 May 2020 16:42:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1je2sE-0002d0-4G for bug-gnu-emacs@gnu.org; Wed, 27 May 2020 16:42:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Anders Lindgren Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Wed, 27 May 2020 20:42:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 41441 X-GNU-PR-Package: emacs Original-Received: via spool by 41441-submit@debbugs.gnu.org id=B41441.159061209010060 (code B ref 41441); Wed, 27 May 2020 20:42:02 +0000 Original-Received: (at 41441) by debbugs.gnu.org; 27 May 2020 20:41:30 +0000 Original-Received: from localhost ([127.0.0.1]:50076 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1je2rT-0002bu-Fp for submit@debbugs.gnu.org; Wed, 27 May 2020 16:41:30 -0400 Original-Received: from mail-lf1-f53.google.com ([209.85.167.53]:46239) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1je2rQ-0002bf-Ix for 41441@debbugs.gnu.org; Wed, 27 May 2020 16:41:13 -0400 Original-Received: by mail-lf1-f53.google.com with SMTP id r125so15236507lff.13 for <41441@debbugs.gnu.org>; Wed, 27 May 2020 13:41:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=cDLEkBnCjFd0nrPHCHuaj+D1yPY3ZVH6zc08RhfR2Mc=; b=p8pdBM9k4K5wmQKfM4oGDluoItniiXHkBlTgzwzLR4zRyJiFRvjEp2JhAGzPdwQ/h3 NuLFhwUvP77ETf/FRcmzaCAPlGd6Q2nQZe4gpDw/Z20h4ROYLEDXn6+4uTCeQ6zf+cxt 8ZtD6nLuvp8XOay9C5AmvbE5VtMWO3YcapvK/trkKvRNnwXGKrVRNh5O9wVCpsWDs42R bGYw1QCaWbccKw97oLOOzSKZVTeUFZCkj68SvnRwaRPzbp3jAylm6pIebZWnxPwHcEEd 7PlbdyUGdzWe9mj6tLUy66sB49lqNORL1WO0X/mCUjyfvmOsDLwrtLFpFFW3r69MtJz/ /XUw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=cDLEkBnCjFd0nrPHCHuaj+D1yPY3ZVH6zc08RhfR2Mc=; b=Pls84k3ATj6luuTam0rHgzrz9BA/s9o1MlmCS81vStV3vWVt1UTQOLucv/CJJiBIc/ aQFRO+yea6LW8JNd26hnw2GTjWnsmmMAvMLD+FnubzS+SQDVZPP81rPJNK3E/EY5p5Dg stXFLmx/Pg7E+BvWJIAEvs3pU3PBCzr9LL/yAt8Cj+dzMHPQVdukoKNnw/iMa3ullQ7a XH81RBQZvXto5HLY6vKaKt7BqMJ8Odh+7cvTyDsKEfAzFlgg8IiPicstqg4p1lUqE3ZI JAfwd0MALmrd9wPLTdBUc1ZcFXlE9YgowLj5np1i9wW54uD/WJ75rGFVAAUW/yNl+Iab 1tKA== X-Gm-Message-State: AOAM533SZWazaqsNMwV521pvfiWdw0unGiBgJ5GlnfUg8nxZ3knmdBF8 4BShyfL3UZxfvy5JsJe8fZxbpakBlaTS7t1UJnQO3sXO X-Google-Smtp-Source: ABdhPJw+9s5LUdZeT5f0S4Gmlmz01lfef8EoIGUeIT8UfdRxiSiUocV4EHkPJeUfzt5uN6IMHYvWyI+vDNbEf01kEuo= X-Received: by 2002:ac2:550a:: with SMTP id j10mr3879362lfk.46.1590612066354; Wed, 27 May 2020 13:41:06 -0700 (PDT) In-Reply-To: <87pnasro4o.fsf@tromey.com> X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:181106 Archived-At: --00000000000030c90d05a6a73bd5 Content-Type: text/plain; charset="UTF-8" Hi! To me it's obviously a bug. First, the name of the function contains "extend", not "change". Secondly, the loop in font-lock-default-fontify-region is written so that it loops until all functions in font-lock-extend-region-functions are satisfied (which is also documented). To me, when code is written this way it's clear that monotonic growth of the region is assumed (even though that isn't explicitly stated in the documentation). Anyway, I took a look at the mhtml code today. This is from mhtml--extend-font-lock-region: (goto-char font-lock-beg) (unless (eobp) (forward-char)) (setq font-lock-beg (previous-single-property-change (point) 'mhtml-submode nil (line-beginning-position))) Without having a deep knowledge of this, I don't think this do the correct thing when font-lock-beg is at the end of a line (as it is when the line is empty). The problem is that the code move forward one character (the newline) and then search backward with line-beginning-position (i.e. the beginning of the line following the original font-lock-beg) as limit, effectively shrinking the region. On Mon, May 25, 2020 at 6:11 PM Tom Tromey wrote: > Anders> * Fix the problem in `mhtml--extend-font-lock-region`. > > It's not entirely clear to me that (1) this is a bug, or (2) that it can > be changed. > > My recollection is that mhtml mode has to shrink the region in some > cases, because we don't want font-locking to extend beyond the end of a > sub-mode. > > For example consider things like > >

more html here

> > Here, we want to font-lock the body of the script using one set of > rules, and the rest with another set. > > Looking at mhtml--submode-fontify-region, though, I wonder if maybe > region extension isn't needed at all, since that function seems to > handle sub-mode region boundaries. So I guess that is one > experiment that could be done. > > Anders> One minor mode that, when enabled, would cause Emacs to hang > Anders> with the above buffer is > Anders> https://github.com/Lindydancer/char-font-lock (this package > Anders> highlights incorrect whitespace). > > Maybe this mode could be changed instead. Unfortunately, with the current font-lock interface, it's not. (Well, at least I haven't figured out a way to do it, and I've been writing Emacs packages for 25 years.) The problem is that when the "expand" function is called, the functions has no way of knowing whether it's the first time it has been called, or if it has been called repeatedly. Besides, what should it do if there is another expand function that has adjusted the region in the opposite order? Who has the right of way? Anyway, it's not just a problem for this mode, but for all font-lock packages that expand the region. -- Anders > > Tom > --00000000000030c90d05a6a73bd5 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi!

To me it's obviously= a bug. First, the name of the function contains "extend", not &q= uot;change". Secondly, the loop in font-lock-default-fontify-region is= written so that it loops until all functions in font-lock-extend-region-fu= nctions are satisfied (which is also documented). To me, when code is writt= en this=C2=A0way it's clear that monotonic growth of the region is assu= med (even though that isn't explicitly stated in the documentation).

Anyway, I took a look at the mhtml code today. This = is from mhtml--extend-font-lock-region:

     =
 (goto-char font-lock-beg)
      (unless (eobp)
        (forward-char))
      (setq font-lock-beg (prev=
ious-single-property-change
                           (point) 'mhtml-submode nil
                           (line-beginning-position)))
Without having a deep knowledge of this, I don't think= this do the correct thing when font-lock-beg is at the end of a line (as i= t is when the line is empty). The problem is that the code move forward one= character (the newline) and then search backward with line-beginning-posit= ion (i.e. the beginning of the line following the original font-lock-beg) a= s limit, effectively shrinking the region.


On Mon,= May 25, 2020 at 6:11 PM Tom Tromey <tom@tromey.com> wrote:
Anders> * Fix the problem in `mhtml--ex= tend-font-lock-region`.

It's not entirely clear to me that (1) this is a bug, or (2) that it ca= n
be changed.

My recollection is that mhtml mode has to shrink the region in some
cases, because we don't want font-locking to extend beyond the end of a=
sub-mode.

For example consider things like

<script>some js here;</script><p>more html here</p>=

Here, we want to font-lock the body of the script using one set of
rules, and the rest with another set.

Looking at mhtml--submode-fontify-region, though, I wonder if maybe
region extension isn't needed at all, since that function seems to
handle sub-mode region boundaries.=C2=A0 So I guess that is one
experiment that could be done.

Anders> One minor mode that, when enabled, would cause Emacs to hang
Anders> with the above buffer is
Anders> https://github.com/Lindydancer/char-font-loc= k (this package
Anders> highlights incorrect whitespace).

Maybe this mode could be changed instead.

U= nfortunately, with the current font-lock interface, it's not. (Well, at= least I haven't figured out a way to do it, and I've been writing = Emacs packages for 25 years.)

The problem is that = when the "expand" function is called, the=C2=A0functions has no w= ay of knowing whether it's the first time it has been called, or if it = has been called repeatedly. Besides, what should it do if there is another = expand function that has adjusted the region in the opposite order? Who has= the right of way?

Anyway, it's not just a pro= blem for this mode, but for all font-lock packages that expand=C2=A0the reg= ion.

=C2=A0 =C2=A0 -- Anders
=C2=A0

Tom
--00000000000030c90d05a6a73bd5--