From: "Stephen J. Turnbull"
Newsgroups: gmane.emacs.devel
Subject: Re: base
Date: Thu, 26 Aug 2010 19:25:35 +0900
Message-ID: <87zkw94qn4.fsf@uwakimon.sk.tsukuba.ac.jp>
In-Reply-To: <83sk22msp4.fsf@gnu.org>
To: Eli Zaretskii
Cc: miles@gnu.org, emacs-devel@gnu.org

Eli Zaretskii writes:

> That's not a "mental model", at least not by your definition

Of course it is, by *my* definition. Of course I can't speak for
yours, though.

*sigh* I guess I'm going to have to go into interminable detail on
this....

> I don't see how users would need to know that stuff in order to be
> able to use the tool safely and efficiently.

Really? Don't you mean "I'd like to believe that users don't need to
know, because it's inconvenient and tedious"?

> It's like saying that Emacs users need to know how Lisp data types
> are implemented or what is the glyph matrix, in order to make good
> _use_ of Emacs (as opposed to _extend__ it).

No, it isn't. Your analogy is broken, because you are ignoring the
difference between private and public data.

When you use Emacs as an editor, the public data is text. When that
text is structured, you demand that other contributors understand the
structure of the text, whether it's C code, LISP code, Texinfo
documentation, docstrings, or even ChangeLogs and the NEWS file.

LISP data and the glyph matrix are private data, and it's reasonable
to ask Emacs to deal with them and not bother you with those details
when you're using Emacs. You can even ask Emacs to help with the
syntax of text. But eventually you have to tell the users "You need
to know Texinfo or whatever to work on the manual, and you need to
know when to use @code and when to use @samp." Emacs will help you
produce correct syntax for @code, but it can't tell you when to use
it.

The history DAG and the commit metadata are public data. They are
shared, they can be seen not only by the user who produces them but by
everybody in the project. There are choices to be made, and they
don't have technical answers; they're matters of taste and policy.
You cannot expect the VCS to make correct decisions according to
project policy about when and where to branch, or which branch to
merge into which other, or when to rebase and when to merge at this
stage of the technology.

Project policy about what constitutes a "nice readable" history is a
matter of taste and a matter of software capability. If users are
going to participate in those discussions, they need to understand the
capabilities of the software.
Bazaar is very feature-poor as far as history restructuring goes; if
you want a particular structure, you need to follow a workflow that
produces it. Those workflows turn out to be more complex for Emacs
than the old workflows, and people immediately complained. That
resulted in people posting alternative workflows which work in *some*
situations but not others, and yet other people screwing up history by
taking shortcuts with the alternative workflows without having any
idea what they were doing to the DAG.

> Sure, it's nice to know all that, but it isn't (and shouldn't) be
> necessary for a user. If you want to _extend_ the tool, then yes,
> you'd need this and some more.

You may be right about "shouldn't", but you are wrong about "isn't",
at least given the capabilities of the chosen tool and at the level of
anybody who wants to participate in discussions about what the Emacs
workflow should be.

> > I don't think any such thing exists for bzr.
> http://doc.bazaar.canonical.com/bzr.2.2/developers/overview.html
> and other docs in that area comes close, maybe. But that stuff is
> rightfully in the developers' department, IMO.

And in mine. That is not even on the same planet with a mental model.
It's full of details that developers need to know to write the code to
work with Bazaar internals, but completely unnecessary to an
explanation of how the things the user can see work together with each
other and with the users. OTOH, it is no help in understanding
externally visible behavior, whether intentional or buggy.

By contrast, the object database model as presented by the Git
Community Book is abstract. In fact if you look into a real git ODB,
you'll just see files full of apparently random bits -- they're all
delta-compressed and gzipped. Such details are not part of the mental
model, and they're not in the book (not emphasized in that chapter, at
least).

And that model can be visualized directly in git by gitk. If you can
see it there, it's in your repository.
You can look at directory objects or at files (the GUI version of the
use of cat-file in gittutorial-2), or at diffs. This is not
necessarily true in bzr, which supports "ghost revisions" -- don't
ask, I don't know the details. But that has to complicate the mental
model, and it definitely complicates and weakens the implementation:
"ghost revisions" are implicated in several of the cases of wedged
branches I've seen reported on bazaar@canonical.

You can logically traverse the graph you see in gitk using reset or
checkout, and it will have a visible effect on the display in gitk
(after doing refresh or maybe reload). Modern versions of gitk will
also display a representation of the index to you (which you can also
get from the status command). The bzr model behind status is much
fuzzier (although this probably doesn't matter as much as the object
model).

So having a model means that it is easy to understand what a git
operation will do in terms of the graph you see in gitk. "commit"
makes a node and links it on to the head of the active branch,
"branch" and "tag" produce labels, etc. Fetch adds hidden nodes and
arcs to the graph, pull or merge makes them visible by attaching them
to the visible part of the graph and committing on top, then advancing
the branch head to the new commit (thus making the new nodes
"reachable from HEAD" in graph-theoretic terms, and gitk will adjust
the display by making these newly reachable nodes visible upon
refresh/reload). In bzr, however, it's not so obvious how to think
about this. The commit is not automatic, so "merge" seems to be about
content rather than the DAG.

And there's one really big difference. In the mental model I use, git
objects are *eternal* and *universal*. They've always existed, they
always will, and they're the same for everyone. This is because of
the (model) 1-1 map between SHA1 hash values and all the possible
objects that could in theory exist.
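
The "same for everyone" property can be illustrated with a minimal
sketch. It assumes only the documented naming scheme for git blobs
(the SHA-1 of a "blob <size>" header, a NUL byte, and the raw content);
the function name git_blob_id and the sample content are made up for
illustration and are not part of git.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object id git would assign to a blob with this content."""
    # git hashes a short header ("blob <size>" plus a NUL byte)
    # followed by the raw bytes of the file.
    header = f"blob {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical example: two people who never exchanged anything, but who
# store the same bytes, end up with the same object name.
alice = git_blob_id(b"(defun hello () (message \"hi\"))\n")
bob = git_blob_id(b"(defun hello () (message \"hi\"))\n")
print(alice == bob)
```

Nothing about who created the content, where, or when enters the
name, which is the sense in which the object "already exists" for
everyone in the model.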
(In reality, a hash isn't good enough; for "eternity" and "all" you
really want something like Gödel numbers but with better compression
properties -- a typical difference between "mental model" and
"implementation".)

In the actual git implementation, once you've seen an object's
content, it's "eternal enough". That is, a non-garbage object is
never deleted, and even garbage is kept around for 60 days by default.
(Hey, that's even better than Scheme's promise of eternity! And if
that worries you, you can set garbage collection to "never".) Once
you realize that, recovery from rebase madness or inadvertent deletion
of a branch[1] becomes a certainty -- you just need to find the
command that does the tracing part of garbage collection, which will
list the dangling heads and other unreachables for you.

This may not mean much to you, but I think of git as very reliable
because I know what it promises about my data, and I'm quite confident
that Linus's original design was straightforward and good ("just how
hard can it be to design a linked list?" he asks the Emacs
developers), and that improvements since then didn't require a genius
to work on the ODB, just ordinarily smart hackers.

This *simply isn't true* of bzr. Eternity is a little shaky. bzr
uncommit will (eventually) destroy the commit, and nothing in the bzr
docs or that I've read on bazaar@canonical gives me a model of *when*,
except that I sorta think that it won't be copied when you branch from
that branch. Practically, the bzrtools plugin does have a "heads"
command which will allow you to find such objects, so probably this is
no worse than git. But not having a model worries me.

Worse, bzr objects are definitely *not* universal from the external
viewpoint (eg, two identical programs can be different objects in
bazaar). bzr tracks "containers" (delete all the content and metadata
like name from a file, and you have a "container").
This is what allows it to do the much-vaunted "provably correct
tracking of copies and renames". Containers have identifiers (those
arch-ids you see in Emacs files are actually container ids), and those
ids depend on who created them, where, and when. Two containers can
be exactly the same as files, but they'll be different anyway.

I don't have a model of this. That is, I don't understand why this is
the provably correct way to treat renames and copies.

And who knows how the history DAG is represented in Bazaar? I don't,
and I suspect it changes from branch format to branch format. Why do
the "rich root" formats need to be incompatible with the non-rich-root
formats, and what are they good for anyway? You got me on both
counts.

This kind of thing just goes on and on and on and on and on and on and
on and on and on and on and on and on and on and on and ....

Note that I'm not saying that Bazaar is inconsistent or incoherent.
I'm saying that I personally have trouble giving an account of why
it's consistent and coherent. I would not want to try to debug *any*
failure in bzr without the help of the developers, because I have no
idea where I'd start.

OTOH, I can give an account of why git does what it does, and I've had
success in teaching that model to others. Although the first thing
I'd do in case of a bug is report it, the second thing I'd do is to
start browsing code. I think I'd have a good chance of localizing the
issue, because I have a model of how git is supposed to work.

YMMV.

Footnotes:
[1] Branches are *not* objects, they are *refs* = references to
objects, and so can be created or deleted at will without breaking any
promises about object permanence.
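
The distinction in the footnote, and the recovery argument that rests
on it, can be sketched with a toy model. This is an illustration
only, not how git stores anything: the class Repo, its methods, and
the one-letter commit ids are all hypothetical. Commits are nodes that
are never removed, branches are labels pointing at nodes, and "finding
dangling heads" is a plain reachability trace from the labels.

```python
class Repo:
    """Toy history DAG: permanent commit nodes plus movable branch labels."""

    def __init__(self):
        self.parents = {}  # commit id -> tuple of parent commit ids
        self.refs = {}     # branch name -> commit id (a label, not an object)

    def commit(self, branch, cid, extra_parents=()):
        """Add node `cid` on top of `branch` and advance the branch label."""
        head = self.refs.get(branch)
        first = (head,) if head is not None else ()
        self.parents[cid] = first + tuple(extra_parents)
        self.refs[branch] = cid

    def reachable(self):
        """Every commit reachable from some branch label."""
        seen = set()
        stack = list(self.refs.values())
        while stack:
            cid = stack.pop()
            if cid not in seen:
                seen.add(cid)
                stack.extend(self.parents[cid])
        return seen

    def dangling_heads(self):
        """Unreachable commits that no other unreachable commit points to."""
        unreachable = set(self.parents) - self.reachable()
        pointed_to = {p for c in unreachable for p in self.parents[c]}
        return sorted(unreachable - pointed_to)


repo = Repo()
repo.commit("master", "a")
repo.commit("master", "b")
repo.refs["topic"] = "b"        # "branch" only creates a label
repo.commit("topic", "c")
repo.commit("topic", "d")

del repo.refs["topic"]          # inadvertent branch deletion
lost = repo.dangling_heads()    # the nodes are still there, just unlabeled
repo.refs["rescued"] = lost[0]  # recovery is re-attaching a label
```

Deleting the label removes nothing from `parents`, so the trace finds
the old tip of the deleted branch and a new label makes the whole
chain reachable again.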