From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.ciao.gmane.io!not-for-mail
From: Dmitry Gutov <dgutov@yandex.ru>
Newsgroups: gmane.emacs.devel
Subject: Re: emacs rendering comparisson between emacs23 and emacs26.3
Date: Sun, 19 Apr 2020 21:12:57 +0300
Message-ID: <34fc9563-479e-f026-9640-1b70ca9885b9@yandex.ru>
References: <83lfn9s63n.fsf@gnu.org>
 <c73564b8-f6af-5c61-5fe6-4fa142010323@yandex.ru> <83h7xvqsgc.fsf@gnu.org>
 <90749329-ccb1-f96e-29c0-b4ecbb81d1d4@yandex.ru> <20200407174217.GC4009@ACM>
 <50acd968-4459-2fab-1609-7869e1ed072a@yandex.ru> <20200408020913.GA3992@ACM>
 <a8eb7e65-c5c8-ce55-68af-c27965d02c5c@yandex.ru> <20200412153458.GA5249@ACM>
 <6d65d90c-178e-87e2-68dd-236275a5e038@yandex.ru> <20200419171209.GA23044@ACM>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: ciao.gmane.io; posting-host="ciao.gmane.io:159.69.161.202";
	logging-data="130216"; mail-complaints-to="usenet@ciao.gmane.io"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
 Thunderbird/68.4.1
Cc: rudalics@gmx.at, Eli Zaretskii <eliz@gnu.org>, rrandresf@gmail.com,
 rms@gnu.org, emacs-devel@gnu.org
To: Alan Mackenzie <acm@muc.de>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Sun Apr 19 20:13:39 2020
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
	by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(Exim 4.92)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>)
	id 1jQERm-000XmJ-Sw
	for ged-emacs-devel@m.gmane-mx.org; Sun, 19 Apr 2020 20:13:38 +0200
Original-Received: from localhost ([::1]:46196 helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>)
	id 1jQERm-0004I1-0C
	for ged-emacs-devel@m.gmane-mx.org; Sun, 19 Apr 2020 14:13:38 -0400
Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:33312 helo=eggs1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <raaahh@gmail.com>) id 1jQERE-0003hm-CD
 for emacs-devel@gnu.org; Sun, 19 Apr 2020 14:13:04 -0400
Original-Received: from Debian-exim by eggs1p.gnu.org with spam-scanned (Exim 4.90_1)
 (envelope-from <raaahh@gmail.com>) id 1jQERD-0001dZ-CS
 for emacs-devel@gnu.org; Sun, 19 Apr 2020 14:13:04 -0400
Original-Received: from mail-wr1-x42d.google.com ([2a00:1450:4864:20::42d]:37022)
 by eggs1p.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from <raaahh@gmail.com>)
 id 1jQERC-0001bQ-VD; Sun, 19 Apr 2020 14:13:03 -0400
Original-Received: by mail-wr1-x42d.google.com with SMTP id k1so9305457wrx.4;
 Sun, 19 Apr 2020 11:13:02 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=sender:subject:to:cc:references:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-language:content-transfer-encoding;
 bh=ol+5oubmW0XJVx6G8dWhZE/UDHPOLUinO43KI5dpQFY=;
 b=lRv5x4P4DxtZzBZKCs1PvG9IylTM527MLSwKluJyEYzQDaA94lyuPGaRJiVnqXgvBt
 fH4eIG2Xp7oTaKqP6gCUOmtLASJy9jNW3K+Son9V44JTfy12HG3fJxLm0sfJmmpCfnm0
 8ryvWo9n6fLOy/mLrj8S/glTLujU8Sf+rJnP4f6LvJZsTPpOLioo2bn+vwZ6taLvDiU0
 o523O3bRKsURvxmoxBb0fPd43ZeTC/jNmqj0btQ0xfk/MnLRYThJjLRkFu3n1Pew9p6O
 EOzho5KWIjuV7343BGvkMiQBA0R47+ieN92orHdWVpEmlGuH60rsWoxinCvBmTKSshoh
 TAjg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:sender:subject:to:cc:references:from:message-id
 :date:user-agent:mime-version:in-reply-to:content-language
 :content-transfer-encoding;
 bh=ol+5oubmW0XJVx6G8dWhZE/UDHPOLUinO43KI5dpQFY=;
 b=cbQby5P53kHafkxZNE9RqJNffr2qzxYAOCSbzjeU/KMKsMAZqwryZU0Yt+MsB2ve4i
 2m4Spn2EkG8ijLNgxc98hUGAoQm4oLeJYjFDNPpbSViUyONUxOjhBsXjEvlh/7/Ftng/
 rf8RyATVSs6AyckA9RVQGbG8ey4ndY5T0OEZwn4/Z6mr0deRF1zFuRz+v4oq/zzk/so8
 o6gNTKJ4i06pTUcP9ydV75zD5aTKY2PFLPYkknYuLp/4oQHX0ZhaiNYtuOmyQQfFJrvU
 6vBkfCd+JC9j8URcHFpGBukLG9Ap2/B7lj6R089FDkqplx5chNhcId1uRBLjGH539MYI
 UAqg==
X-Gm-Message-State: AGi0PuZqXbHc1dX9QAgn2UAWiqhqVWyPIls09XLPrShVs3BnT662uczj
 hESrEODHnz1UwQLYmOPgLE+Mw6iGShQ=
X-Google-Smtp-Source: APiQypL7JqSBjsQ0EFoEXVnWK7wK1YfFwlbKPj+pCXbMx8rVrmoOva80hvXpS+hFIzzN8EllQSNG1w==
X-Received: by 2002:adf:dbce:: with SMTP id e14mr13716331wrj.337.1587319980585; 
 Sun, 19 Apr 2020 11:13:00 -0700 (PDT)
Original-Received: from [192.168.0.2] ([66.205.73.129])
 by smtp.googlemail.com with ESMTPSA id p7sm41458154wrf.31.2020.04.19.11.12.58
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Sun, 19 Apr 2020 11:12:59 -0700 (PDT)
In-Reply-To: <20200419171209.GA23044@ACM>
Content-Language: en-US
Received-SPF: pass client-ip=2a00:1450:4864:20::42d;
 envelope-from=raaahh@gmail.com; helo=mail-wr1-x42d.google.com
X-detected-operating-system: by eggs1p.gnu.org: Genre and OS details not
 recognized.
X-Received-From: 2a00:1450:4864:20::42d
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
 <mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
 <mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org
Original-Sender: "Emacs-devel"
 <emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org>
Xref: news.gmane.io gmane.emacs.devel:247322
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/247322>

On 19.04.2020 20:12, Alan Mackenzie wrote:

>> Yes, on average it's only 2x benefit, but then again, if the buffer
>> opens at its beginning first, the extra initialization will be spread
>> across user commands. Which should have some additional positive effect
>> on apparent responsiveness.
> 
> Will it?  A slightly faster appearance of the buffer at line 1 will be
> offset by a slightly more sluggish response when doing M-x imenu foo.  And
> "slightly" here means a fraction of a second on a modern machine,
> presumably somewhat longer on an older one.

M-x find-file followed by M-x imenu should take about as much time as 
before. And if the user doesn't go for Imenu or scroll to EOB right 
away, they get extra performance.

I don't know how slow it is for the average user, but there certainly 
are some with slow machines out there. Or people compiling without 
optimizations.

>>> Inserting a C++ raw string opener does typically necessitate a full
>>> scan (a search for a matching closer), but that would also be the case
>>> using syntax-propertize.
> 
>> Not really. It would just mark the opener as a string opener (maybe with
>> some extra text property), and that's that.
> 
> You don't know whether it's an unterminated raw string (the usual case)
> until you've scanned for a potential closing delimiter.

Is C++ syntax so ambiguous? Can R"( mean something else?

> This affects the
> font locking.  (An unterminated opening delimiter gets
> font-lock-warning-face, a terminated one doesn't.)

If everything after R"( is fontified as a string, it serves as a 
"warning" of sorts as well.

> This is the sort of feature which I'm not willing to sacrifice.

Is it worth a full buffer scan every time you write a new raw string 
literal?

>> Then font-lock would fontify the following text as string contents
>> (until the end of the window or a little bit farther). Then you type
>> the closer, it only has to scan a little far back (it'll call
>> syntax-ppss to find the string opener), the closer is propertized as
>> appropriate, and that's that. No full buffer scans at any step.
> 
>> I recall that fontifying the rest of the buffer as text after a simple
>> string opener could be a sore topic for you, but raw strings should be
>> rare enough (aren't they?), or if they are not, fontification logic
>> could opt to do something different, while syntax-table properties will
>> be applied the "correct" way.
> 
> I'm not sure what you mean by "as text".

Sorry, I meant "as string".

> I've no reason to think raw
> strings are at all rare.  I've had several bug reports for them.  I'm not
> sure what you mean by "fontification logic ... something different" - do
> you mean in the raw string case?

I mean that if a raw string is unterminated, the default behavior should 
be to fontify the rest of the buffer as string. But then again, you 
could choose some different highlighting in font-lock rules.

> Or by "while s-t properties will be
> applied in the "correct" way" - these properties are currently correctly
> applied (modulo any remaining bugs).

In that case, you should always be able to find the beginning of the 
current string literal (raw or plain) using syntax-ppss.
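
To make that concrete, here is a rough model in Python (not Emacs Lisp, 
and not how syntax-ppss is implemented). The function name string_start 
and the simplified raw-string pattern are made up for the example; it 
ignores comments, character literals and prefixes such as u8R. The point 
it tries to show is that answering "where does the string around POS 
begin?" only needs the text before POS, whether or not a closer exists 
further on.

```python
import re

# Simplified C++ raw string opener: R"delim( with a short delimiter.
# (Assumption for this sketch; real C++ lexing has more cases.)
RAW_OPENER = re.compile(r'R"([^()\\\s"]{0,16})\(')


def string_start(text, pos):
    """Return the offset where the string literal containing POS begins,
    or None if POS is not inside a string.  Never looks past POS."""
    i = 0
    while i < pos:
        m = RAW_OPENER.match(text, i)
        if m:
            closer = ")" + m.group(1) + '"'
            # Search for the closer only in the text before POS.
            end = text.find(closer, m.end(), pos)
            if end == -1:
                return i          # still inside the raw string at POS
            i = end + len(closer)
        elif text[i] == '"':
            j = i + 1
            while j < pos and text[j] != '"':
                j += 2 if text[j] == "\\" else 1   # skip escaped chars
            if j >= pos:
                return i          # still inside the plain string at POS
            i = j + 1
        else:
            i += 1
    return None


src = 'a = R"bar(x)bar"; b = "hi'
print(string_start(src, 13))        # inside the raw string: 4
print(string_start(src, 17))        # between the two literals: None
print(string_start(src, len(src)))  # inside the unterminated "hi: 22
```

This toy version rescans from the beginning of the text on every call. 
syntax-ppss avoids that by caching parser states, so the real cost is 
usually much lower.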

>> Yes, I think before-change-functions should become empty. Or much emptier.
> 
> It can't become empty.  after-change-functions is fine for dealing with
> insertions, but can't do much after a deletion.  Consider the case where
> you're in a string and all you know is that 5 characters have been
> deleted.  Those characters might have been )foo", so after checking the
> beginning of the string starts off with R"bar(, you've then got to scan a
> long way forward looking for )bar".  Effectively every deletion within a
> string would involve scanning to the end of that string.

This is an example of extra complexity you have to retain to implement 
the above feature the way you do.

It's probably also an example of how before/after-change-functions 
essentially duplicate the knowledge of language syntax. I'm guessing 
here, but to make it work like that, you need to have multiple functions 
"understand" the raw string syntax.

Whereas with syntax-propertize-function, that knowledge is concentrated 
in one place (maybe two, if font-lock rules do something unusual). This 
way, the code is simplified.

>>> Having actually gone through all the issues and implemented raw strings,
>>> I can't agree with you there.  There are all sorts of subtleties which
>>> necessitate intimate cooperation between the before-change-functions and
>>> after-change-functions.  Such cooperation seems to be excluded by the
>>> syntax-propertize mechanism.
> 
>> It encourages a different approach. Again: there are examples of raw
>> strings support in other major modes.
> 
> Do any of them have the syntactical complexities of C++'s raw strings?

You tell me:

https://docs.ruby-lang.org/en/2.4.0/syntax/literals_rdoc.html#label-Percent+Strings
https://rosettacode.org/wiki/Here_document#Ruby
https://github.com/atom/language-ruby/issues/109#issue-98715165

>>> Maybe, but with a slowdown.  More of these properties will get erased
>>> than needed (with nested template forms), and they will all need to get
>>> put back again.
> 
>> We won't really know until we can measure the result.
> 
> What's the point in investing all the effort to make the change, when
> there's not even a prediction of a speed up?

In principle, the speed-up will come from:

- Deferred execution (where several buffer changes can be handled 
together and not right away),
- No parsing the buffer contents much farther than the current window, 
in most cases. Which can speed up the majority of user actions. The 
exceptions will remain slower, but that is often a good tradeoff.
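
A rough model of those two points, in Python rather than Emacs Lisp. The 
class LazyScanner and its fields are hypothetical names for this sketch, 
not anything in Emacs. It assumes the scheme is roughly: an edit only 
lowers a "scanned up to here" marker, and the actual scanning happens 
later, on demand, and only up to the position that is asked for.

```python
class LazyScanner:
    """Toy model of deferred, on-demand scanning (not Emacs code)."""

    def __init__(self, text):
        self.text = text
        self.done = 0            # everything before this offset is scanned
        self.scanned_chars = 0   # total work done, for illustration only

    def edit(self, pos, new_text, deleted=0):
        """Apply a change.  Nothing is rescanned here; the change only
        invalidates whatever was scanned at or after POS."""
        self.text = self.text[:pos] + new_text + self.text[pos + deleted:]
        self.done = min(self.done, pos)

    def ensure_scanned(self, pos):
        """Scan from the 'done' marker up to POS, if that is needed."""
        pos = min(pos, len(self.text))
        if pos > self.done:
            self.scanned_chars += pos - self.done
            self.done = pos


buf = LazyScanner("x" * 100_000)
buf.ensure_scanned(2_000)      # first redisplay: only the window's worth
buf.edit(10, "a")              # three edits in a row,
buf.edit(11, "b")              # none of which triggers a scan
buf.edit(12, "c")
buf.ensure_scanned(2_000)      # next redisplay handles them together
print(buf.scanned_chars)       # 2000 + 1990 = 3990, not 100000
```

The slow case is still there: asking for a position near the end of the 
buffer (Imenu, going to EOB) pays for the whole scan at that moment.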

> And I'm not sure where the proof of the syntax-propertize mechanism being
> helpful is.  Has anybody but its originator positively chosen to use it,
> whilst being aware of the alternatives?

The alternatives being reinventing the relevant logic from zero in each 
major mode? And writing syntax caching logic each time?

> To become usable for CC Mode, it would need to provide something on
> before-change-functions to complement what's on a-c-f, and it would need
> to provide some control to the major mode over which syntax-table
> properties get erased.

Not something I can comment on.