From: "Eric S. Raymond"
Newsgroups: gmane.emacs.devel
Subject: Re: Goals for repo conversion day
Date: Sat, 25 Jan 2014 16:01:32 -0500
Message-ID: <20140125210132.GB13305@thyrsus.com>
To: Eli Zaretskii
Cc: schwab@linux-m68k.org, emacs-devel@gnu.org

Eli Zaretskii:
> > *I* object to this. On the grounds that I've been through this dance
> > many times before, and know that such out-of-band representations
> > generally cost more hassle and deliver less than people expect when
> > they think them up.
>
> With all due respect, this is not necessarily good enough. You have
> come to the project offering help, but no one gave you the right to
> make unilateral decisions about these issues. It won't be you who
> will need to use this data in the years to come. And whatever the
> other projects which you converted in the past, I doubt that any of
> them had as long and complex history as Emacs.

Your doubt is justified. None of my previous conversions have been
quite this complex. The only conversion I have heard of that might
have been hairier was that of Blender, which was done by other people
using my tools.

But as the size and complexity of the repo goes up, so does the value
of in-band references actually working.
Emacs is an exceptionally *bad* case for relying solely on an external
reference map, not an exceptionally good one.

> I'd appreciate if you posted the final list of the references, when
> you are finished with it, so we could have some QA.

Here is the current list. It is not final because I expect to resolve
at least a few more of these, and it is still possible more fossil
references could turn up in odd places.

ChangeLog:
revno 108687 -> 2012-06-22T21:17:42Z!eggert@cs.ucla.edu revno:105007 -> 2011-07-07T04:21:49Z!handa@m17n.org r112148 -> 2013-03-26T22:08:58Z!aidalgol@no8wireless.co.nz revno:108936 -> 2012-07-07T10:34:37Z!cyd@gnu.org revision 106831 -> 2012-01-10T08:27:22Z!cyd@gnu.org revision 1.59 CVS-1.61 1.61 in CVS revno:106608 -> 2011-12-04T17:13:01Z!lekktu@gmail.com revno 100789 -> 2010-07-12T05:26:57Z!handa@etlken rev. 110325 -> 2012-10-01T18:10:29Z!cyd@gnu.org r115470 -> 2013-12-11T19:01:44Z!tzz@lifelogs.com of 2012-12-20 (r111276) -> 2012-12-20T11:15:38Z!michael.albinus@gmx.de 2013-12-11 (r115470) -> 2013-12-11T19:01:44Z!tzz@lifelogs.com revno:114543 -> 2013-10-07T01:28:34Z!sdl.web@gmail.com revno:113793 -> 2013-08-11T00:07:48Z!lekktu@gmail.com revno:113117 -> 2013-06-21T12:24:37Z!lekktu@gmail.com r114834 -> 2013-10-29T02:50:24Z!dancol@dancol.org revno:113431 -> 2013-07-16T11:41:06Z!jan.h.d@swipnet.se revno:113147 -> 2013-06-23T20:29:18Z!lekktu@gmail.com revno 101897 -> 2010-10-10T14:43:05Z!dann@ics.uci.edu revno 101876 -> 2010-10-09T18:31:12Z!rgm@gnu.org revno 100306 -> 2010-05-15T21:21:30Z!raeburn@raeburn.org revno 108687 -> 2012-06-22T21:17:42Z!eggert@cs.ucla.edu revision 114614 (commit of 2013-10-10) -> 2013-10-10T19:15:33Z!eggert@cs.ucla.edu revno:113431 -> 2013-07-16T11:41:06Z!jan.h.d@swipnet.se revno 101949 -> 2010-10-13T14:50:06Z!lekktu@gmail.com revno:103013 -> 2011-01-28T22:12:05Z!monnier@iro.umontreal.ca rev 102609 -> 2010-12-08T08:09:27Z!kfogel@red-bean.com revno 101688 -> 2010-09-30T02:53:26Z!lekktu@gmail.com revno 101459 ->
2010-09-17T13:30:30Z!monnier@iro.umontreal.ca revnos 101381 -> 2010-09-08T14:42:54Z!michael.albinus@gmx.de 101422 -> 2010-09-13T15:17:01Z!michael.albinus@gmx.de rev 100010 -> 2010-04-23T16:26:11Z!monnier@iro.umontreal.ca revno:109911 -> 2012-09-07T04:15:56Z!dgutov@yandex.ru 109621 -> 2012-08-15T03:33:55Z!monnier@iro.umontreal.ca revno:88805 -> 2008-06-21T01:38:39Z!monnier@iro.umontreal.ca revno:88864 -> 2008-06-22T13:57:28Z!monnier@iro.umontreal.ca revno:89810 -> 2008-07-31T05:33:56Z!dann@ics.uci.edu revision 106664 -> 2011-12-11T14:49:48Z!vincentb1@users.sourceforge.net revno:105285 -> 2011-07-19T15:01:49Z!larsi@gnus.org revno:104787 (2011-06-30) -> 2011-06-30T01:09:13Z!larsi@gnus.org revno:104988 (2011-07-06) -> 2011-07-06T15:49:19Z!larsi@gnus.org revno:101730 (2010-10-02) -> 2010-10-02T13:21:43Z!michael.albinus@gmx.de revno:103877 (2011-04-09) -> 2011-04-09T20:28:01Z!cyd@stupidchicken.com revno:99634.2.463 (2010-10-09) -> 2010-10-09T04:09:19Z!cyd@stupidchicken.com revno:101913 -> 2010-10-11T23:57:49Z!lekktu@gmail.com revno 95090 dated 2009-03-06 -> 2009-03-06T07:51:52Z!handa@m17n.org revno 101757 -> 2010-10-03T13:59:56Z!dann@ics.uci.edu revno 82799 (2007-11-30) -> 2007-11-30T13:57:21Z!jasonr@gnu.org 2010-07-29 (revno 100939) -> 2010-07-29T16:49:59Z!jan.h.d@swipnet.se revno 100928 -> 2010-07-29T03:25:08Z!dann@ics.uci.edu revnos 100982 -> 2010-08-05T23:15:24Z!dann@ics.uci.edu 100984 -> 2010-08-05T23:34:12Z!dann@ics.uci.edu revno 99854.1.6 -> 2010-04-17T12:33:05Z!eliz@gnu.org revno 99950 -> 2010-04-20T13:31:28Z!eliz@gnu.org revno:100708 -> 2010-07-04T07:50:25Z!dann@ics.uci.edu revno:110851 -> 2012-11-09T04:10:16Z!monnier@iro.umontreal.ca revision 1.1 -> the initial version cvs-1.12.1 Revision 1.694 -> 2004-05-20T23:29:24Z!teirllm@auburn.edu revno 108687 -> 2012-06-22T21:17:42Z!eggert@cs.ucla.edu revno:108521 -> 2012-06-08T08:44:45Z!eliz@gnu.org revno:108341 -> 2012-05-22T16:20:27Z!eggert@cs.ucla.edu 2011-08-30 (revision 105619) -> 2011-08-30T17:32:44Z!eliz@gnu.org 
2011-08-30 (revision 105619) -> 2011-08-30T17:32:44Z!eliz@gnu.org revision 84777 on 2008-02-22 -> 2008-02-22T17:42:09Z!monnier@iro.umontreal.ca revno:102982 (2011-01-26) -> 2011-01-26T20:02:07Z!monnier@iro.umontreal.ca revision 104625 -> 2011-06-18T18:49:19Z!cyd@stupidchicken.com revision 104134 -> 2011-05-06T07:13:19Z!eggert@cs.ucla.edu revno:20537 (1998-01-01) -> 1998-01-01T02:27:27Z!rms@gnu.org revno:87605 (2008-05-14) -> 2008-05-14T01:40:23Z!handa@m17n.org revno:50135 (2003-03-16) -> 2003-03-16T20:45:46Z!storm@cua.dk revno:87605 (2008-05-14) -> 2008-05-14T01:40:23Z!handa@m17n.org revno:34925 (2000-12-29) -> 2000-12-29T14:24:09Z!gerd@gnu.org revno:20537 (1998-01-01) -> 1998-01-01T02:27:27Z!rms@gnu.org revno:25013 (1999-07-21) -> 1999-07-21T21:43:52Z!gerd@gnu.org revno:43563.1.17 (2002-03-01) -> 2002-03-01T01:17:24Z!handa@m17n.org revno:84043 (2008-02-1) -> 2008-02-01T16:01:31Z!miles@gnu.org revno:25356 (1999-08-21) -> 1999-08-21T19:30:21Z!gerd@gnu.org revno:20870 (1998-02-08) -> 1998-02-08T21:33:56Z!rms@gnu.org revno:36704 (2001-03-09) -> 2001-03-09T18:41:50Z!gerd@gnu.org revno:32591 (2000-10-17) -> 2000-10-17T16:08:18Z!gerd@gnu.org revno:25013 (1999-07-21) -> 1999-07-21T21:43:52Z!gerd@gnu.org revno:43563.1.32 (2002-03-01) -> 2002-03-01T01:17:24Z!handa@m17n.org revno:14998 (1996-04-12) -> 1996-04-12T06:01:29Z!rms@gnu.org revno:86854 (2008-04-19) -> 2008-04-19T19:30:53Z!monnier@iro.umontreal.ca revno:20569 (1998-01-02) -> 1998-01-02T21:29:48Z!rms@gnu.org revno 103623 -> 2011-03-11T07:24:21Z!eggert@cs.ucla.edu revision 1.32 of saveplace.el -> saveplace.el at 2005-05-29T08:36:26Z!rms@gnu.org revision 1.30 of saveplace.el -> saveplace.el at 2005-04-10T23:32:00Z!rms@gnu.org version 1.100 -> 2007-12-06T19:56:41Z!deego@gnufans.org erc.el 1.39 -> 2007-12-01T03:41:01Z!rgm@gnu.org revision 1.104, made on 2000-05-21 -> 2000-05-21T17:04:47Z!fx@gnu.org 2007-07-18 (revision 1.51) revision 1.90 (commitid mWoPbju3pgNotDps) -> 2007-07-13T18:16:17Z!kfogel@red-bean.com revision 
1.117 -> 2008-10-29T17:42:49Z!cyd@stupidchicken.com 1.85 1.878 1.113 1.244 1.34 1.233 rev 1.82 -> 1994-08-03T07:39:00Z!rms@gnu.org 1.70 (Jan 5 changes) -> 1994-01-03T07:21:12Z!rms@gnu.org r99212 -> 2009-12-29T07:22:00Z!nickrob@snap.net.nz rev. 110325 -> 2012-10-01T18:10:29Z!cyd@gnu.org revno r112320 -> 2013-04-18T00:12:33Z!monnier@iro.umontreal.ca Change comments: bzrs 111300 -> 2012-12-22T19:57:35Z!rgm@gnu.org 111840 -> 2013-02-21T02:42:30Z!eggert@cs.ucla.edu revision 111647 -> 2013-02-01T07:23:18Z!dmantipov@yandex.ru revno:11026 -> 1995-03-15T21:55:37Z!kwzh@gnu.org revno:88864 -> 2008-06-22T13:57:28Z!monnier@iro.umontreal.ca revno:88805 -> 2008-06-21T01:38:39Z!monnier@iro.umontreal.ca revno:89810 -> 2008-07-31T05:33:56Z!dann@ics.uci.edu revision 10835 -> 1995-02-25T20:57:45Z!rms@gnu.org revision 106726 -> 2011-12-23T14:51:51Z!eliz@gnu.org revision 87208 -> 2008-05-02T07:12:59Z!esr@snark.thyrsus.com revision 84777 on 2008-02-22 -> 2008-02-22T17:42:09Z!monnier@iro.umontreal.ca revno:99634.2.463 (2010-10-09) -> 2010-10-09T04:09:19Z!cyd@stupidchicken.com revno:101913 (2010-10-12). 
-> 2010-10-11T23:57:49Z!lekktu@gmail.com revno:20537 (1998-01-01) -> 1998-01-01T02:27:27Z!rms@gnu.org revno:87605 (2008-05-14) -> 2008-05-14T01:40:23Z!handa@m17n.org revno:87605 (2008-05-14) -> 2008-05-14T01:40:23Z!handa@m17n.org revno:34925 (2000-12-29) -> 2000-12-29T14:24:09Z!gerd@gnu.org revno:20537 (1998-01-01) -> 1998-01-01T02:27:27Z!rms@gnu.org revno:25013 (1999-07-21) -> 1999-07-21T21:43:52Z!gerd@gnu.org revno:43563.1.16 (2002-03-01) -> 2002-03-01T01:16:34Z!handa@m17n.org revno:84043 (2008-02-1) -> 2008-02-01T16:01:31Z!miles@gnu.org revno:20870 (1998-02-08) -> 1998-02-08T21:33:56Z!rms@gnu.org revno:36704 (2001-03-09) -> 2001-03-09T18:41:50Z!gerd@gnu.org revno:32591 (2000-10-17) -> 2000-10-17T16:08:18Z!gerd@gnu.org revno:25356 (1999-08-21) -> 1999-08-21T19:30:21Z!gerd@gnu.org revno:14998 (1996-04-12) -> 1996-04-12T06:01:29Z!rms@gnu.org revno:86854 (2008-04-19) -> 2008-04-19T19:30:53Z!monnier@iro.umontreal.ca revno:20569 (1998-01-02) -> 1998-01-02T21:29:48Z!rms@gnu.org r100577 -> 2010-06-10T12:56:11Z!michael.albinus@gmx.de CVS rev 1.49, 2001-09-12 CVS rev 1.47, 2003/01/27 CVS r1.35 revno 95090 dated 2009-03-06 -> 2009-03-06T07:51:52Z!handa@m17n.org 2005-02-15 (revno 60055) -> 2005-02-15T23:19:26Z!jasonr@gnu.org r111320 -> 2012-12-24T15:56:17Z!eliz@gnu.org revno 99854.1.6 -> 2010-04-17T12:33:05Z!eliz@gnu.org revno 99950 -> 2010-04-20T13:31:28Z!eliz@gnu.org revision 99649 -> 2010-03-12T16:34:27Z!eliz@gnu.org rev 99649 -> 2010-03-12T16:34:27Z!eliz@gnu.org rev 99553 -> 2010-02-24T22:07:26Z!bob@gnu.org revno 99212 -> 2009-12-29T07:22:00Z!nickrob@snap.net.nz revision 94343 -> 2009-01-30T13:06:07Z!lekktu@gmail.com revision 1.32 -> 2005-05-29T08:36:26Z!rms@gnu.org revision 1.30 -> 2005-04-10T23:32:00Z!rms@gnu.org version 1.100 -> 2007-12-06T19:56:41Z!deego@gnufans.org r1.135 -> 2009-10-10T21:48:22Z!kfogel@red-bean.com rev 1.114 1.878 revision 1.117 -> 2008-10-29T17:42:49Z!cyd@stupidchicken.com rev 1.14395 revision 1.56 3.85 1.17 revision 1.69 revision 1.1 -> initial 
revision rev 1.5 revisions 1.40 1.41 1.39 -> 2007-12-01T03:41:01Z!rgm@gnu.org revision 1.104 revision 1.51 revision 1.90 (commitid mWoPbju3pgNotDps) -> 2007-07-13T18:16:17Z!kfogel@red-bean.com revision 1.1509 revision 7.8 CVS v1.12.8 and 1.12.9 cvs-1.12.1 1.103 HEAD (1.72) v1.275 1.58 v1.5046 v1.5039 rev 1.82 -> 1994-08-03T07:39:00Z!rms@gnu.org rev. 1.761 revision 1.3831 1.3832 revision 1.12 revision 1.13 revision 1.14 revision 1.15

The ChangeLog references are not attributed to individual files
because they moved as the files rotated.

Some of the remaining CVS references cannot be resolved within the
Emacs history; they actually point to other projects. One particularly
fertile source of these, which I think accounts for this group

    1.85 1.878 1.113 1.244 1.34 1.233

in ChangeLogs, is the CVS history of the erc files before they were
merged into Emacs.

> The problem is not the size of the repository alone. The problem is
> that different portions of a single changeset were committed many
> revisions apart. And I don't even understand (and you didn't explain)
> how will you handle the situation I described above, where a single
> commit checked in ChangeLog changes for several unrelated commits in
> the same directory. Which commit clique will you assign the ChangeLog
> commit to? The devil is in the details, but you haven't provided any
> details about your plans in this matter. Would you please do that?

I see we are using the term "changeset" slightly differently, and this
has produced some confusion.

The uncoalesced changesets I am looking for are not defined by "all
share the same ChangeLog entry" (though usually that is the case). You
are quite right that attempting to coalesce all of those would produce
perverse results in cases of several unrelated commits.

Fortunately, most of the unresolved cliques are not like this.
The usual case, in this conversion as in others I've seen (such as
groff), is that an unresolved clique consists of one or several closely
related changes and one ChangeLog modification, without intervening
commits by others. This is what I think of as a changeset.

Normally tools such as parsecvs collect these into single changesets.
But these converters have a maximum coalescence window. If such a span
of commits took place over a longer period of time than the window, it
won't be coalesced.

The problem is that the default time windows on these converters are
set small in order to avoid false-positive matches. Experience has
taught me that this is a mostly imaginary problem; the window would
have been better set to infinity in almost every case I have seen.

The result of a too-small commit window is that some genuine
changesets (not the edge case you are pointing at) do not get
coalesced. In your edge case, the least bad thing to do is accept that
the ChangeLog entry must remain its own changeset; sometimes you can
get partial coalescence in the file changes.

When there is CVS in the history, a standard part of my cleanup is
basically to run a coalescence pass with a very long window.
Semi-automating this operation, so it (a) doesn't have to be done
manually, but (b) is easily checked by skilled human judgment, was one
of the purposes for which I originally wrote reposurgeon.

Fortunately the bad cases aren't actually very common.

> > > > 5. Unconverted .bzrignores (and possibly .cvsignores) in the history.
> > >
> > > Why is that a problem?
> >
> > See "seamless history browsing".
>
> Sorry, I don't understand. Please elaborate: what is the relation
> between these ignore files and history browsing?

In a properly done conversion, file ignores don't abruptly stop
working because you browsed back past the point of conversion and what
should be .gitignore files are now .bzrignores or .cvsignores.
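
To make the coalescence-window point above concrete, here is a minimal
sketch in Python. It is an illustration under stated assumptions, not
the algorithm parsecvs or reposurgeon actually uses: the FileCommit
record, the coalesce function, and the sample data are all
hypothetical, and the only grouping rule modelled is "same author, no
intervening commits by others, whole span within the window". Real
converters typically compare more than this (log text, for instance).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class FileCommit:
    """One per-file commit, as a CVS-style history records it (hypothetical type)."""
    author: str
    when: datetime
    path: str


def coalesce(commits: List[FileCommit],
             window: Optional[timedelta]) -> List[List[FileCommit]]:
    """Group per-file commits into changesets.

    A commit joins the current group when it has the same author as the
    group's last commit (so nobody else committed in between) and the span
    from the group's first commit stays within `window`.
    window=None models an infinite window.
    """
    groups: List[List[FileCommit]] = []
    for commit in sorted(commits, key=lambda c: c.when):
        if groups:
            current = groups[-1]
            same_author = current[-1].author == commit.author
            in_window = (window is None
                         or commit.when - current[0].when <= window)
            if same_author and in_window:
                current.append(commit)
                continue
        groups.append([commit])
    return groups


# Made-up sample: two file changes, a late ChangeLog entry, then another author.
SAMPLE = [
    FileCommit("alice", datetime(2005, 5, 29, 10, 0), "src/foo.c"),
    FileCommit("alice", datetime(2005, 5, 29, 10, 2), "src/bar.c"),
    FileCommit("alice", datetime(2005, 5, 29, 10, 20), "src/ChangeLog"),
    FileCommit("bob", datetime(2005, 5, 29, 11, 0), "lisp/baz.el"),
]

short = coalesce(SAMPLE, timedelta(minutes=5))  # small default-style window
unbounded = coalesce(SAMPLE, None)              # "infinite" window
print(len(short), len(unbounded))
```

With the five-minute window the ChangeLog commit, made twenty minutes
after the first file change, is left as its own changeset; with no
limit it joins the two file changes, and only the other author's
commit starts a new group.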

> > The way this is working is that I am building a reposurgeon script that
> > expresses a sequence of edits to Andreas's mirror. On conversion day
> > we will apply that script once, after which everyone can re-clone and
> > go on as before.
>
> Sorry, I don't see how this changes anything. You are still going to
> make deep changes to the existing mirror.

Yes, for arguable values of "deep". As Paul Eggert (I think) said, I'm
after a result that is stainless steel rather than earthenware. With
ugly cracks in it.

> > > Noble goals all of them, but I'm skeptical as to whether they can be
> > > achieved in practice. What's worse, we won't know whether some issues
> > > remained until much later.
> >
> > I know they can be achieved in practice because I have achieved them before,
> > many times. Most recently in the conversion of the groff history, but
> > you could check with the maintainers of NUT or Hercules or robotfindskitten
> > or Roundup as well. Or the Blender Foundation - blender is a big reposurgeon
> > conversion done by someone else.
>
> Sorry, been there done that. The CVS to bzr conversion also seemed
> flawless until much later.

There are several differences this time. One of the most important is
that the state of the art has advanced. My tools do things that would
have been impossible or impractical before they existed. I have
auditing capabilities you would probably have to work a bit to even
imagine.

As a relatively trivial example - if Stefan or some other person with
policy authority makes the call, I could reliably split elpa out into
its own repo with one short command in the reposurgeon DSL.

> > If we find any problems afterwards, I have the tools to fix them. Part of
> > my commitment is to do that.
>
> I don't think any of us can in good faith give such promises.

The span of my contributions to Emacs is measured in decades. I do not
think you need to fear that I will vanish before this job is done.
--
Eric S. Raymond