unofficial mirror of emacs-devel@gnu.org 
* Thinking about changed buffers
@ 2016-03-28 17:31 Lars Magne Ingebrigtsen
  2016-03-28 17:56 ` Andreas Schwab
                   ` (5 more replies)
  0 siblings, 6 replies; 46+ messages in thread
From: Lars Magne Ingebrigtsen @ 2016-03-28 17:31 UTC (permalink / raw)
  To: emacs-devel

In conjunction with the wishlist item "`M-q' shouldn't say that the
buffer has changed when it hasn't", we started talking a bit about
further issues around what it means for a buffer to have changed or not.

If you load a file, and then hit "a", and then delete the "a", then
Emacs will say that the buffer has changed.  If you hit "a" and then
`undo', Emacs will say that it hasn't.

If there was a way to deal with this discrepancy, that would be very
nice, I think.

One idea that popped up is that whenever we mark a buffer as unchanged
(that is, `(set-buffer-modified-p nil)'), we save the byte size of the
buffer and a cryptographic hash of the buffer.  Then `buffer-modified-p'
would simply check whether the size had changed, and if not,
whether the hash had changed.  If both are identical, then the buffer
hasn't changed.
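
To make the idea concrete, here is a minimal sketch of the size-then-hash
check.  It is a toy model in Python, not Emacs code: the class and method
names are made up for illustration, and SHA-1 is just one possible choice
of hash.

```python
import hashlib


class ModificationTracker:
    """Toy model of the proposed check; hypothetical names, not Emacs code."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.mark_unmodified()

    def _bytes(self) -> bytes:
        # A real buffer would know its byte size without re-encoding;
        # this toy model encodes on every call for simplicity.
        return self.text.encode("utf-8")

    def mark_unmodified(self) -> None:
        # Analogue of (set-buffer-modified-p nil): remember size and hash.
        data = self._bytes()
        self.saved_size = len(data)
        self.saved_hash = hashlib.sha1(data).digest()

    def modified_p(self) -> bool:
        data = self._bytes()
        # Cheap test first: a different byte size means the buffer changed.
        if len(data) != self.saved_size:
            return True
        # Same size, so only the hash can tell.
        return hashlib.sha1(data).digest() != self.saved_hash


buf = ModificationTracker("hello\n")
buf.text += "a"            # hit "a"
after_insert = buf.modified_p()
buf.text = buf.text[:-1]   # delete the "a" again
after_delete = buf.modified_p()
```

With this model, typing "a" and deleting it gives the same answer as typing
"a" and undoing it: the buffer is reported as unchanged in both cases.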

This would basically allow us to really tell the users "yes, your buffer
is now back to the state it was when you loaded it".  I think that would
be very nice.

However, there are two problems:

1) Speed.  When editing files normally, `buffer-modified-p' would be
very fast, because buffers would change size, and we'd just be comparing
the sizes and say "yup, changed".  If, however, you're somehow altering
the buffer a lot but always returning to the same size, you'd have to
compute the hash.  (On my five-year-old machine, the current implementation
takes 2.7s on a 1GB buffer.)
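
For a rough feel of the slow path on your own machine, something like the
following can be used.  It times Python's hashlib rather than the Emacs
implementation, so the numbers are only indicative; `time_hash' is a
hypothetical helper and the 16 MiB size is an arbitrary choice.

```python
import hashlib
import time


def time_hash(n_bytes: int, algorithm: str = "sha1") -> float:
    """Return the seconds spent hashing n_bytes of zero bytes."""
    data = bytes(n_bytes)  # stand-in for buffer contents
    start = time.perf_counter()
    hashlib.new(algorithm, data).digest()
    return time.perf_counter() - start


elapsed = time_hash(16 * 1024 * 1024)  # 16 MiB; scale up to taste
print(f"hashed 16 MiB in {elapsed:.3f}s")
```

This cost only matters when the size comparison comes out equal; in the
common case the size differs and no hash is computed at all.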

2) Text properties.  If you call `add-text-properties' on a buffer, the
buffer becomes marked as changed.  The hashing function could look at
the intervals, too, so that's not a problem, but many (most?) of the
text properties are added by font locking modes with
`with-silent-modifications', which means "no, no, these text properties
here don't change the buffer".  But there's nothing in the text
properties themselves that will reveal this after the fact, unless I'm
reading the code incorrectly.

Óscar suggested that to deal with 2), Emacs should simply not regard
text properties as changing the buffer at all, but I think there are
various "rich text" modes that use text properties to generate the
output file (i.e., you put "bold" on some text and it gets written out
as <bold>).  I may be wrong about that.  Anybody know?

Anybody have any thoughts on this issue?

---------
I feel the need to add this, given the way the discussion went in the
`M-q' bug report, but let's hope it's unnecessary:

(Let's take it as a given that, yes, you can create hash collisions, but
that's irrelevant.  In normal, non-cryptographically-constructed text,
the likelihood of two texts having the same MD5 hash is about 10^-39
(2^-128) and for SHA1 it's about 10^-49 (2^-160), so it's Not Going To
Happen and we don't need to have that discussion.  (And yes, you can construct
MD5 collisions as fast as you want, but it. is. irrelevant.)  Sheesh.
There's something about cryptography that brings out the most irrelevant
stuff in some people.  If you want to discuss that part, please take it
to emacs-tangents.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






