unofficial mirror of emacs-devel@gnu.org 
* Referring to revisions in the git future.
@ 2014-10-28 22:33 Alan Mackenzie
  2014-10-28 22:54 ` Óscar Fuentes
                   ` (3 more replies)
  0 siblings, 4 replies; 137+ messages in thread
From: Alan Mackenzie @ 2014-10-28 22:33 UTC (permalink / raw)
  To: emacs-devel

Hello, Emacs.

We are switching to git, soon.

git doesn't have revision numbers.  Instead it uses cryptic identifiers,
which are not very useful in day to day conversation.  A bit like in
George Orwell's "Newspeak", where linguists constantly removed words and
meanings so as to render certain notions literally inexpressible, we seem
to be faced with the same situation.

On this list, one quite often sees statements such as:

    "That was fixed in revision 118147, have you updated since then?"

or

    "The bug seems to have been introduced between 118230 and 118477.
    Maybe you could do a bisect to track it down.".

Is it going to be possible to express such ideas in our git world, in any
meaningful way?  If so, how?  Does git have a useable way of mapping its
cryptic revision identifiers to monotonically increasing natural numbers,
or some other useable scheme?

I have bad feelings about this.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-28 22:33 Referring to revisions in the git future Alan Mackenzie
@ 2014-10-28 22:54 ` Óscar Fuentes
  2014-10-28 23:05   ` Alan Mackenzie
  2014-10-29  0:49 ` Eric S. Raymond
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 137+ messages in thread
From: Óscar Fuentes @ 2014-10-28 22:54 UTC (permalink / raw)
  To: emacs-devel

Hello Alan,

Alan Mackenzie <acm@muc.de> writes:

> Hello, Emacs.
>
> We are switching to git, soon.
>
> git doesn't have revision numbers.  Instead it uses cryptic identifiers,
> which are not very useful in day to day conversation.  A bit like in
> George Orwell's "Newspeak", where linguists constantly removed words and
> meanings so as to render certain notions literally inexpressible, we seem
> to be faced with the same situation.
>
> On this list, one quite often sees statements such as:
>
>     "That was fixed in revision 118147, have you updated since then?"
>
> or
>
>     "The bug seems to have been introduced between 118230 and 118477.
>     Maybe you could do a bisect to track it down.".
>
> Is it going to be possible to express such ideas in our git world, in any
> meaningful way?  If so, how?  Does git have a useable way of mapping its
> cryptic revision identifiers to monotonically increasing natural numbers,
> or some other useable scheme?
>
> I have bad feelings about this.

Before switching to git myself, the lack of revision numbers was the
strongest perceived inconvenience. Afterwards, it wasn't that bad. First
of all, you need to realize the limitations of using revision numbers:
they are meaningful only in the context of a branch. As soon as you have
more than one branch and merge among them, revision numbers are an
inconvenience.

As you use Mercurial, which has revision numbers, the advice of the
Mercurial experts possibly has some weight for you:

http://mercurial.selenic.com/wiki/RevisionNumber

    Revision numbers referring to changesets are very likely to be
    different in another copy of a repository. Do not use them to talk
    about changesets with other people. Use the changeset ID instead.

OTOH, there was some discussion on this list about using some
tool-independent schema, using a combination of the author's e-mail and
a timestamp.





* Re: Referring to revisions in the git future.
  2014-10-28 22:54 ` Óscar Fuentes
@ 2014-10-28 23:05   ` Alan Mackenzie
  2014-10-28 23:24     ` Óscar Fuentes
  2014-10-31 22:47     ` Paul Eggert
  0 siblings, 2 replies; 137+ messages in thread
From: Alan Mackenzie @ 2014-10-28 23:05 UTC (permalink / raw)
  To: Óscar Fuentes; +Cc: emacs-devel

Hello, Óscar.

On Tue, Oct 28, 2014 at 11:54:19PM +0100, Óscar Fuentes wrote:
> Hello Alan,

> Alan Mackenzie <acm@muc.de> writes:

> > Hello, Emacs.

> > We are switching to git, soon.

> > git doesn't have revision numbers.  Instead it uses cryptic identifiers,
> > which are not very useful in day to day conversation.  A bit like in
> > George Orwell's "Newspeak", where linguists constantly removed words and
> > meanings so as to render certain notions literally inexpressible, we seem
> > to be faced with the same situation.

> > On this list, one quite often sees statements such as:

> >     "That was fixed in revision 118147, have you updated since then?"

> > or

> >     "The bug seems to have been introduced between 118230 and 118477.
> >     Maybe you could do a bisect to track it down.".

> > Is it going to be possible to express such ideas in our git world, in any
> > meaningful way?  If so, how?  Does git have a useable way of mapping its
> > cryptic revision identifiers to monotonically increasing natural numbers,
> > or some other useable scheme?

> > I have bad feelings about this.

> Before switching to git myself, the lack of revision numbers was the
> strongest perceived inconvenience. Afterwards, it wasn't that bad. First
> of all, you need to realize the limitations of using revision numbers:
> they are meaningful only in the context of a branch. As soon as you have
> more than one branch and merge among them, revision numbers are an
> inconvenience.

We've more than one branch in our Emacs repository, yet the bzr revision
numbers are not in the slightest inconvenient.

> As you use Mercurial, which has revision numbers, the advice of the
> Mercurial experts possibly has some weight for you:

> http://mercurial.selenic.com/wiki/RevisionNumber

>     Revision numbers referring to changesets are very likely to be
>     different in another copy of a repository. Do not use them to talk
>     about changesets with other people. Use the changeset ID instead.

That is a bit like saying, instead of saying "tomorrow at 8 o'clock",
which is horribly ambiguous, you should instead say at time 238707724383
(i.e. number of seconds after 1970-01-01, or whenever it was).  Changeset
IDs are good for some things, bad for others.

> OTOH, there was some discussion on this list about using some
> tool-independent schema, using a combination of the author's e-mail and
> a timestamp.

Are they going to enable the sort of conversation I exemplified above?

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Referring to revisions in the git future.
  2014-10-28 23:05   ` Alan Mackenzie
@ 2014-10-28 23:24     ` Óscar Fuentes
  2014-10-31 22:47     ` Paul Eggert
  1 sibling, 0 replies; 137+ messages in thread
From: Óscar Fuentes @ 2014-10-28 23:24 UTC (permalink / raw)
  To: emacs-devel

Alan Mackenzie <acm@muc.de> writes:

> We've more than one branch in our Emacs repository, yet the bzr revision
> numbers are not in the slightest inconvenient.

Expressions like this, taken from your original message:

    "That was fixed in revision 118147, have you updated since then?"

are not useful unless the branch is known. And then, if I wish to know
if that fix was merged into another branch, I'm forced to obtain the
message id.

Rev numbers are only useful when the community works with a CVS-like
workflow.

>> As you use Mercurial, which has revision numbers, the advice of the
>> Mercurial experts possibly has some weight for you:
>
>> http://mercurial.selenic.com/wiki/RevisionNumber
>
>>     Revision numbers referring to changesets are very likely to be
>>     different in another copy of a repository. Do not use them to talk
>>     about changesets with other people. Use the changeset ID instead.
>
> That is a bit like saying, instead of saying "tomorrow at 8 o'clock",
> which is horribly ambiguous, you should instead say at time 238707724383
> (i.e. number of seconds after 1970-01-01, or whenever it was).  Changeset
> IDs are good for some things, bad for others.

Yes, one of the inconveniences of changeset ids is that they are just
that: ids, without any other info. OTOH it is trivial to obtain any info
from the id alone (author, date, diff, branches that include it, etc)
with a simple Emacs trick. That does not apply to rev numbers.

>> OTOH, there was some discussion on this list about using some
>> tool-independent schema, using a combination of the author's e-mail and
>> a timestamp.
>
> Are they going to enable the sort of conversation I exemplified above?

As they would contain a human-readable timestamp, yes, essentially. But
the timestamp was, precisely, the trickiest part to get right.





* Re: Referring to revisions in the git future.
  2014-10-28 22:33 Referring to revisions in the git future Alan Mackenzie
  2014-10-28 22:54 ` Óscar Fuentes
@ 2014-10-29  0:49 ` Eric S. Raymond
  2014-10-29  3:38   ` Stephen J. Turnbull
  2014-10-29 14:52   ` Barry Warsaw
  2014-10-29  1:11 ` Stefan Monnier
  2014-10-29  8:50 ` David Kastrup
  3 siblings, 2 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-29  0:49 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

Alan Mackenzie <acm@muc.de>:
> On this list, one quite often sees statements such as:
> 
>     "That was fixed in revision 118147, have you updated since then?"
> 
> or
> 
>     "The bug seems to have been introduced between 118230 and 118477.
>     Maybe you could do a bisect to track it down.".
> 
> Is it going to be possible to express such ideas in our git world, in any
> meaningful way?  If so, how?  Does git have a useable way of mapping its
> cryptic revision identifiers to monotonically increasing natural numbers,
> or some other useable scheme?

Git does not have such a mapping.  This is not the git designers being
perverse; all other DVCSes have the same issue. A true DVCS is
designed for distributed operation in which there is no privileged node
to hand out the monotonically-increasing IDs.

I agree that git hashes make a terrible reference format.  In the git
version of the guidelines for working with the repository I will advise
never using them.  Instead, refer to a commit by its author and date,
or (even better) by quoting the summary line from its log comment.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>




* Re: Referring to revisions in the git future.
  2014-10-28 22:33 Referring to revisions in the git future Alan Mackenzie
  2014-10-28 22:54 ` Óscar Fuentes
  2014-10-29  0:49 ` Eric S. Raymond
@ 2014-10-29  1:11 ` Stefan Monnier
  2014-10-29  6:06   ` Werner LEMBERG
  2014-10-29  8:50 ` David Kastrup
  3 siblings, 1 reply; 137+ messages in thread
From: Stefan Monnier @ 2014-10-29  1:11 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> Is it going to be possible to express such ideas in our git world, in any
> meaningful way?

FWIW I have basically never used Bzr's revno in conversations.
There's no question that they're handy, and I've used them, but I think
that dates work well enough in practice as a replacement (with pretty
much the same downsides).  Just replace

    "That was fixed in revision 118147, have you updated since then?"
with
    "That was fixed on Sep 23, have you updated since then?"


        Stefan




* Re: Referring to revisions in the git future.
  2014-10-29  0:49 ` Eric S. Raymond
@ 2014-10-29  3:38   ` Stephen J. Turnbull
  2014-10-29 12:26     ` Stefan Monnier
  2014-10-29 14:52   ` Barry Warsaw
  1 sibling, 1 reply; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-10-29  3:38 UTC (permalink / raw)
  To: esr; +Cc: Alan Mackenzie, emacs-devel

Eric S. Raymond writes:

 > I agree that git hashes make a terrible reference format.

I agree, but only because so many people I need to communicate with
think so.  In a more perfect world, I'd go with the easily
recognizable and unambiguous IDs.  The main advantage is that you
never have conversations like

    E:  I believe that was fixed in r666042.
    S:  But how does a commit to Gnus fix vc-git?

and you do have conversations like

    E:  I fixed that in commit FACECAFE.
    S:  I just pulled, and there's no FACECAFE here.
    E:  OMG!!  ... Try it now.

If you live in Emacs, this is hardly inconvenient as long as you have
get-log-for-sha1-near-point (unimplemented :-) and
get-logs-for-sha1s-in-buffer (also unimplemented :-).  Heck, if we
agreed on this, I bet larsi would provide a zero-day exploit which
washes your message presentation buffer and provides mouse-over
tooltips containing the log for each SHA1 in the message.

Sure there are design issues (which repo, mainly), but these could be
handled.




* Re: Referring to revisions in the git future.
  2014-10-29  1:11 ` Stefan Monnier
@ 2014-10-29  6:06   ` Werner LEMBERG
  2014-10-29  9:01     ` David Kastrup
  0 siblings, 1 reply; 137+ messages in thread
From: Werner LEMBERG @ 2014-10-29  6:06 UTC (permalink / raw)
  To: monnier; +Cc: acm, emacs-devel


> FWIW I have basically never used Bzr's revno in conversations.
> There's no question that they're handy, and I've used them, but I
> think that dates work well enough in practice as a replacement (with
> pretty much the same downsides).  Just replace
> 
>     "That was fixed in revision 118147, have you updated since then?"
> with
>     "That was fixed on Sep 23, have you updated since then?"

`git describe' and/or `git describe --tags' returns sequential
numbers, more or less.  It's essentially a tool to do the opposite,
that is, showing the most recent tag that is reachable from a given
commit, but it contains the number of commits in its return value.


    Werner




* Re: Referring to revisions in the git future.
  2014-10-28 22:33 Referring to revisions in the git future Alan Mackenzie
                   ` (2 preceding siblings ...)
  2014-10-29  1:11 ` Stefan Monnier
@ 2014-10-29  8:50 ` David Kastrup
  2014-10-29  9:52   ` Eric S. Raymond
  2014-10-29 11:18   ` Alan Mackenzie
  3 siblings, 2 replies; 137+ messages in thread
From: David Kastrup @ 2014-10-29  8:50 UTC (permalink / raw)
  To: emacs-devel

Alan Mackenzie <acm@muc.de> writes:

> Hello, Emacs.
>
> We are switching to git, soon.
>
> git doesn't have revision numbers.  Instead it uses cryptic
> identifiers, which are not very useful in day to day conversation.  A
> bit like in George Orwell's "Newspeak", where linguists constantly
> removed words and meanings so as to render certain notions literally
> inexpressible, we seem to be faced with the same situation.
>
> On this list, one quite often sees statements such as:
>
>     "That was fixed in revision 118147, have you updated since then?"
>
> or
>
>     "The bug seems to have been introduced between 118230 and 118477.
>     Maybe you could do a bisect to track it down.".

So what are people going to do with this kind of information?
Copy&paste it into some command line.  A 40-letter string works just as
well as a 6 letter string for that.

If you were not talking about "on this list" but rather about "in a
typical developer meeting conversation", you'd have sort of a point,
assuming that there are developers who actually memorize revision ids
(which I somewhat doubt).  But mailing list?  Copy&paste.

> Is it going to be possible to express such ideas in our git world, in
> any meaningful way?  If so, how?

Just use the SHA1.

> Does git have a useable way of mapping its cryptic revision
> identifiers to monotonically increasing natural numbers, or some other
> useable scheme?

As long as you are not actually going to use those "monotonically
increasing natural numbers" in any manner sufficiently different from
"arbitrary digit string", and I don't see that you do here, I see no
advantage over cryptic unique strings.

> I have bad feelings about this.

I don't see what would substantiate them looking at the above.

-- 
David Kastrup





* Re: Referring to revisions in the git future.
  2014-10-29  6:06   ` Werner LEMBERG
@ 2014-10-29  9:01     ` David Kastrup
  0 siblings, 0 replies; 137+ messages in thread
From: David Kastrup @ 2014-10-29  9:01 UTC (permalink / raw)
  To: emacs-devel

Werner LEMBERG <wl@gnu.org> writes:

>> FWIW I have basically never used Bzr's revno in conversations.
>> There's no question that they're handy, and I've used them, but I
>> think that dates work well enough in practice as a replacement (with
>> pretty much the same downsides).  Just replace
>> 
>>     "That was fixed in revision 118147, have you updated since then?"
>> with
>>     "That was fixed on Sep 23, have you updated since then?"
>
> `git describe' and/or `git describe --tags' returns sequential
> numbers, more or less.  It's essentially a tool to do the opposite,
> that is, showing the most recent tag that is reachable from a given
> commit, but it contains the number of commits in its return value.

dak@lola:/usr/local/tmp/lilypond$ git describe
release/2.19.15-1-52-g0c59175

Seriously.  Ok, this is 52 commits after tag release/2.19.15-1
(obviously, a tag naming scheme that does not help readability), and it
turns out that the commit id is 0c59175f0867663196e77061786dc07708d69894
so that is where the g... part is from.

But I am still hard put to consider this more useful for any purpose.
It's either some hand-waving relation to the last tag, or you could have
been using the SHA1 in the first place.

One can actually feed such an id into git but it would appear that git
_only_ looks at the trailing g0c59175 string and that becomes invalid as
soon as there is another commit id starting with 0c59175.  While the
abbreviated commit id is guaranteed to be unique in the _current_
repository, it may no longer be once stuff is added to it.  Or even if
the person you send release/2.19.15-1-52-g0c59175 to happens to have a
commit id starting with 0c59175 in a private branch of his.
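
The three parts of such a string can be pulled apart mechanically.  What
follows is only a sketch, assuming the usual tag-count-gABBREV layout shown
above; the function name and the regexp are illustrative, not part of git.

```python
import re

# Assumed layout of a describe string: <tag>-<commits since tag>-g<abbreviated id>.
# The tag itself may contain hyphens, so the greedy first group has to backtrack.
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<abbrev>[0-9a-f]+)$")

def split_describe(text):
    """Split a describe string into (tag, commits_since_tag, abbreviated_id)."""
    m = DESCRIBE_RE.match(text)
    if m is None:
        raise ValueError(f"not in tag-count-gABBREV form: {text!r}")
    return m.group("tag"), int(m.group("count")), m.group("abbrev")

tag, count, abbrev = split_describe("release/2.19.15-1-52-g0c59175")
print(tag, count, abbrev)
```

Only the last field identifies the commit; the tag and the count merely
place it in a rough neighbourhood.  A commit that sits exactly on a tag is
described by the bare tag name, which this sketch rejects.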

Such an id is probably nice as a version number string since then
developers might be able to put the version into a rough ballpark
without looking more closely.

But for communicating about particular commits, just using the SHA1
seems much less problematic.

-- 
David Kastrup





* Re: Referring to revisions in the git future.
  2014-10-29  8:50 ` David Kastrup
@ 2014-10-29  9:52   ` Eric S. Raymond
  2014-10-29 11:00     ` David Kastrup
                       ` (2 more replies)
  2014-10-29 11:18   ` Alan Mackenzie
  1 sibling, 3 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-29  9:52 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

David Kastrup <dak@gnu.org>:
> Just use the SHA1.

Please *don't* use the SHA1.  It's an opaque blob, not portable to any 
future VCS we may need to move to someday.  It is better, and more human 
friendly, to refer to commits by their summary line, or by committer 
and date.

About summary lines, a reminder: Please don't write the traditional
GNUish run-on change comment with a semi-infinite number of bulleted
items in it any more. We're no longer in CVS-land, commits are cheap,
make them fine-grained.  

Git tools (and Mercurial tools, and it is a safe bet future DVCS tools
as well) like there to be a short, self-contained summary line
beginning the change comment.  This makes the history easier to read
in tools like gitk and hg view. 

If you can't express your intention in a short summary line, you should
break up the change into smaller commits until you can.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>




* Re: Referring to revisions in the git future.
  2014-10-29  9:52   ` Eric S. Raymond
@ 2014-10-29 11:00     ` David Kastrup
  2014-10-29 14:32       ` Eli Zaretskii
  2014-10-29 12:35     ` Stefan Monnier
  2014-10-29 13:08     ` Jan Djärv
  2 siblings, 1 reply; 137+ messages in thread
From: David Kastrup @ 2014-10-29 11:00 UTC (permalink / raw)
  To: emacs-devel

"Eric S. Raymond" <esr@thyrsus.com> writes:

> David Kastrup <dak@gnu.org>:
>> Just use the SHA1.
>
> Please *don't* use the SHA1.  It's an opaque blob, not portable to any
> future VCS we may need to move to someday.  It is better, and more
> human friendly, to refer to commits by their summary line, or by
> committer and date.

Shrug.  Not if the human actually wants to use it for any purpose.  It's
fine to be more explicit, like

commit 0c59175f0867663196e77061786dc07708d69894
Author: David Kastrup <dak@gnu.org>
Date:   Wed Jan 1 12:47:42 2014 +0100

    Some parser work, mostly unconvincing

But in the end, the one thing that is actually definitive is the commit
id.  And I don't see it as either more or less useful than a revision
number.  You don't want to type in either by hand, but at least the
commit id is reasonably safe against typos, given enough digits.

> About summary lines, a reminder: Please don't write the traditional
> GNUish run-on change comment with a semi-infinite number of bulleted
> items in it any more. We're no longer in CVS-land, commits are cheap,
> make them fine-grained.

Commits are awfully expensive since they should contain the ChangeLog
entries corresponding to each commit.

With regard to ChangeLog entries, we are still quite in CVS-land (though
CVS commits only allowed one file at a time, making it even worse to
keep track of the corresponding ChangeLog entry).

One thing that we really used ChangeLog for is distinguishing between
committer and author of a change, and we needed to keep track of the
latter for copyright and attribution reasons.  Fortunately, Git keeps
track of both.

At any rate, as long as ChangeLog entries are here to stay (and that's a
different discussion we had a few times), "commits are cheap" is not
matching reality.  Each commit tends to come with its own manual
conflict resolution for ChangeLog.

-- 
David Kastrup





* Re: Referring to revisions in the git future.
  2014-10-29  8:50 ` David Kastrup
  2014-10-29  9:52   ` Eric S. Raymond
@ 2014-10-29 11:18   ` Alan Mackenzie
  2014-10-29 11:37     ` David Kastrup
  1 sibling, 1 reply; 137+ messages in thread
From: Alan Mackenzie @ 2014-10-29 11:18 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

Hello, David.

On Wed, Oct 29, 2014 at 09:50:28AM +0100, David Kastrup wrote:
> Alan Mackenzie <acm@muc.de> writes:

> > Hello, Emacs.

> > We are switching to git, soon.

> > git doesn't have revision numbers.  Instead it uses cryptic
> > identifiers, which are not very useful in day to day conversation.  A
> > bit like in George Orwell's "Newspeak", where linguists constantly
> > removed words and meanings so as to render certain notions literally
> > inexpressible, we seem to be faced with the same situation.

> > On this list, one quite often sees statements such as:

> >     "That was fixed in revision 118147, have you updated since then?"

> > or

> >     "The bug seems to have been introduced between 118230 and 118477.
> >     Maybe you could do a bisect to track it down.".

> So what are people going to do with this kind of information?
> Copy&paste it into some command line.  A 40-letter string works just as
> well as a 6 letter string for that.

Copy&paste using a mouse is a tedious operation which interrupts
workflow.  A number like 118230 can be easily memorised and typed in to a
command line.

What else am I going to do with the information?  A revision number
contains useful meta-information: how old the revision is (more or less),
and whether it comes before or after another revision (more or less).  In
the above "fancy doing a bisect?" example, it's immediately clear that
the bisect operation is going to be taking around 7 or 8 repetitions,
that clarity being more immediate and subconscious than calculated.  One
can estimate, quasi subconsciously, whether the tedium involved in the
bisection would be well spent, or whether some other approach would be
better.
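
The "7 or 8 repetitions" estimate is a base-2 logarithm of the width of the
range.  A minimal sketch of that arithmetic, assuming a strictly linear
history in which every revision between the two numbers is a candidate (the
function name is made up for illustration, and merges would change the
count):

```python
import math

def bisect_steps(good_rev, bad_rev):
    """Rough number of halvings needed to isolate one revision in a linear range."""
    candidates = bad_rev - good_rev  # revisions that could have introduced the bug
    if candidates <= 1:
        return 0  # nothing left to halve
    return math.ceil(math.log2(candidates))

# The range from the example: 247 candidate revisions.
print(bisect_steps(118230, 118477))
```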

A revision number tells you how old the repository is.  With 118230, the
repository is clearly decades old.  With 729, it might be as young as a
few months.

With revision hashes, all that information is absent.  To get it, one is
forced to enter tedious command line commands, likely having to use a
mouse to cut and paste the hash - twice.  It is analogous to being able
to refer to people by their names, compared with having to use some sort
of random identifier.

Having revision numbers clearly works very well.  bzr and hg both have
them in addition to the universe-unique hashes.  git is missing this
useful feature.

> -- 
> David Kastrup

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Referring to revisions in the git future.
  2014-10-29 11:18   ` Alan Mackenzie
@ 2014-10-29 11:37     ` David Kastrup
  0 siblings, 0 replies; 137+ messages in thread
From: David Kastrup @ 2014-10-29 11:37 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

Alan Mackenzie <acm@muc.de> writes:

> Hello, David.
>
> On Wed, Oct 29, 2014 at 09:50:28AM +0100, David Kastrup wrote:
>> Alan Mackenzie <acm@muc.de> writes:
>
>> > Hello, Emacs.
>
>> > We are switching to git, soon.
>
>> > git doesn't have revision numbers.  Instead it uses cryptic
>> > identifiers, which are not very useful in day to day conversation.  A
>> > bit like in George Orwell's "Newspeak", where linguists constantly
>> > removed words and meanings so as to render certain notions literally
>> > inexpressible, we seem to be faced with the same situation.
>
>> > On this list, one quite often sees statements such as:
>
>> >     "That was fixed in revision 118147, have you updated since then?"
>
>> > or
>
>> >     "The bug seems to have been introduced between 118230 and 118477.
>> >     Maybe you could do a bisect to track it down.".
>
>> So what are people going to do with this kind of information?
>> Copy&paste it into some command line.  A 40-letter string works just as
>> well as a 6 letter string for that.
>
> Copy&paste using a mouse is a tedious operation which interrupts
> workflow.

So don't use a mouse.

> A number like 118230 can be easily memorised and typed in to a command
> line.

Frankly, memorizing something like revision ids is error prone and
dangerous.  If you memorize the first 6 characters of a commit id
instead, at least Git will complain if that identifies no unique commit.
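
The check in question amounts to a prefix lookup that refuses to guess.
Roughly, as a sketch (this is not git's code; the function name and the
second id below are invented for illustration):

```python
def resolve_prefix(prefix, known_ids):
    """Return the one full id starting with prefix; raise if none or several match."""
    matches = [full for full in known_ids if full.startswith(prefix)]
    if len(matches) != 1:
        raise LookupError(f"{prefix!r} matches {len(matches)} ids, need exactly 1")
    return matches[0]

# Hypothetical repository contents; only the first id appears in this thread.
ids = [
    "0c59175f0867663196e77061786dc07708d69894",
    "0c5917aa00000000000000000000000000000000",
]

print(resolve_prefix("0c59175", ids))   # seven characters single out one commit
try:
    resolve_prefix("0c5917", ids)       # six characters match both ids
except LookupError as err:
    print(err)
```

A mistyped revision number, by contrast, is very likely to name some other
valid revision, so nothing complains.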

> What else am I going to do with the information?  A revision number
> contains useful meta-information: how old the revision is (more or less),
> and whether it comes before or after another revision (more or less).

That's the kind of stuff I'd rather ask my version control system.

> In the above "fancy doing a bisect?" example, it's immediately clear
> that the bisect operation is going to be taking around 7 or 8
> repetitions, that clarity being more immediate and subconscious than
> calculated.  One can estimate, quasi subconsciously, whether the
> tedium involved in the bisection would be well spent, or whether some
> other approach would be better.

Well, feed those numbers to git bisect, and it will tell you right away
how many steps it will take.  Exactly.

> A revision number tells you how old the repository is.  With 118230,
> the repository is clearly decades old.  With 729, it might be as young
> as a few months.

I don't see you appending your birth date to your name, either.
Basically you are complaining that the commit SHA1 only provides the
commit SHA1 without additional literary value that one can approximately
deduce when brooding for hours over it rather than asking the version
control system.

But it is not intended to be a conversation piece.

> With revision hashes, all that information is absent.  To get it, one
> is forced to enter tedious command line commands, likely having to use
> a mouse to cut and paste the hash - twice.

Keyboard exists.

> It is analogous to being able to refer to people by their names,
> compared with having to use some sort of random identifier.

If you become acquainted with a particular revision number to a degree
that you develop a personal relation to it, chances are that this commit
was a bad idea.  Also chances are that you'll start recognizing the SHA1
after getting to see it for the twentieth time.

> Having revision numbers clearly works very well.

That reminds me of the rule-of-thumb for discovering errors in
mathematical proofs: just look for the first occurrence of any of
"clearly", "trivially", "obviously", "it can be easily shown".

> bzr and hg both have them in addition to the universe-unique hashes.
> git is missing this useful feature.

And git clearly works very well since it is being used in large-scale
non-trivial projects with thousands of developers.

Feel free to use "git describe" yourself for getting a "human-readable"
"revision number".  But don't expect many others to follow that practice
enthusedly.

-- 
David Kastrup




* Re: Referring to revisions in the git future.
  2014-10-29  3:38   ` Stephen J. Turnbull
@ 2014-10-29 12:26     ` Stefan Monnier
  2014-10-29 12:41       ` Alexander Baier
  0 siblings, 1 reply; 137+ messages in thread
From: Stefan Monnier @ 2014-10-29 12:26 UTC (permalink / raw)
  To: Stephen J. Turnbull; +Cc: esr, Alan Mackenzie, emacs-devel

> think so.  In a more perfect world, I'd go with the easily
> recognizable and unambiguous IDs.

Actually, if those revids were really mechanically usable, I think that
would be good enough.  But they're not "easily recognizable".

It's hard to write a regexp that recognizes these without too many false
positives, and even if you could do it, you'd still need to find out in
which repository to look it up (when they appear within a file within
that repository, it can be made to work, but when they appear in an
email it's much more difficult to link it to the appropriate
repository).
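
To illustrate the false-positive problem, here is a sketch using a naive
pattern (any standalone run of 7 to 40 lowercase hex digits); both the
pattern and the sample sentence are made up for the purpose:

```python
import re

# Naive pattern: a standalone run of 7 to 40 lowercase hex digits.
SHA_LIKE = re.compile(r"\b[0-9a-f]{7,40}\b")

text = "Fixed in 0c59175 on 20141029; the defaced page is bug 1234567."
print(SHA_LIKE.findall(text))
```

Of the four strings it finds, only the first is a commit id; the others are
a date, an English word and a bug number.  Telling them apart means asking a
repository, which brings back the question of which one.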


        Stefan




* Re: Referring to revisions in the git future.
  2014-10-29  9:52   ` Eric S. Raymond
  2014-10-29 11:00     ` David Kastrup
@ 2014-10-29 12:35     ` Stefan Monnier
  2014-10-29 13:00       ` Jose E. Marchesi
  2014-10-29 13:26       ` Referring to revisions in the git future Eric S. Raymond
  2014-10-29 13:08     ` Jan Djärv
  2 siblings, 2 replies; 137+ messages in thread
From: Stefan Monnier @ 2014-10-29 12:35 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: David Kastrup, emacs-devel

> About summary lines, a reminder: Please don't write the traditional
> GNUish run-on change comment with a semi-infinite number of bulleted
> items in it any more.

That's your opinion, but the convention we still use here (and don't
just recommend but *request* people to follow) is the GNU ChangeLog format.

> We're no longer in CVS-land, commits are cheap,

Don't know yet about Git, but I can assure you that in Bzr-land, commits
are not cheap (things like "bzr merge", "bzr annotate" and many others
have time complexities that depend on the number of commits).

> make them fine-grained.

Fine-grained or not is irrelevant.  What they should be is logical/coherent.
Your recent 20 or so single-line commits which all have the same summary
line is the perfect example of what should *not* be done.


        Stefan




* Re: Referring to revisions in the git future.
  2014-10-29 12:26     ` Stefan Monnier
@ 2014-10-29 12:41       ` Alexander Baier
  0 siblings, 0 replies; 137+ messages in thread
From: Alexander Baier @ 2014-10-29 12:41 UTC (permalink / raw)
  To: emacs-devel

On 2014-10-29 13:26 Stefan Monnier wrote:
>> think so.  In a more perfect world, I'd go with the easily
>> recognizable and unambiguous IDs.
>
> Actually, if those revids were really mechanically usable, I think that
> would be good enough.  But they're not "easily recognizable".
>
> It's hard to write a regexp that recognizes these without too many false
> positives, and even if you could do it, you'd still need to find out in
> which repository to look it up (when they appear within a file within
> that repository, it can be made to work, but when they appear in an
> email it's much more difficult to link it to the appropriate
> repository).

One could introduce a group/topic parameter that holds a list of
possible repositories to match. If more than one matches, just show all
matches in the tooltip.

Regards,
-- 
Alexander Baier




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 12:35     ` Stefan Monnier
@ 2014-10-29 13:00       ` Jose E. Marchesi
  2014-10-29 13:59         ` Stefan Monnier
  2014-10-29 14:04         ` utf8 and emacs text/string multibyte representation Camm Maguire
  2014-10-29 13:26       ` Referring to revisions in the git future Eric S. Raymond
  1 sibling, 2 replies; 137+ messages in thread
From: Jose E. Marchesi @ 2014-10-29 13:00 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eric S. Raymond, David Kastrup, emacs-devel


    > About summary lines, a reminder: Please don't write the traditional
    > GNUish run-on change comment with a semi-infinite number of bulleted
    > items in it any more.
    
    That's your opinion, but the convention we still use here (and don't
    just recommend but *request* people to follow) is the GNU ChangeLog
    format.

You can have both.  In other GNU projects using git (gdb, binutils) we
include a summary line for the benefit of `git log --oneline', followed
by an extended description in one or more separate paragraphs and finally the
ChangeLog entries.

This way `git log --oneline' will show you something like this:

e44528a New commands `enable probe' and `disable probe'.

While `git log' will give you the full description, including the
ChangeLog entries:

commit e44528a65592707466a9434434ed272dd3b13d9a
Author: Jose E. Marchesi <jose.marchesi@oracle.com>
Date:   Tue Oct 28 14:35:24 2014 +0100

    New commands `enable probe' and `disable probe'.
    
    This patch adds the above-mentioned commands to the generic probe
    abstraction implemented in probe.[ch].  The effects associated to
    enabling or disabling a probe depend on the type of probe being
    handled, and is triggered by invoking two back-end hooks in
    `probe_ops'.
    
    In case some particular probe type does not support the notion of
    enabling and/or disabling, the corresponding fields on `probe_ops' can
    be initialized to NULL.  This is the case of SystemTap probes.
    
    gdb/ChangeLog:
    
      2014-10-28  Jose E. Marchesi  <jose.marchesi@oracle.com>
    
        	* stap-probe.c (stap_probe_ops): Add NULLs in the static
        	stap_probe_ops for `enable_probe' and `disable_probe'.
        	* probe.c (enable_probes_command): New function.
        	(disable_probes_command): Likewise.
        	(_initialize_probe): Define the cli commands `enable probe' and
        	`disable probe'.
        	(parse_probe_linespec): New function.
        	(info_probes_for_ops): Use parse_probe_linespec.
        	* probe.h (probe_ops): New hooks `enable_probe' and
        	`disable_probe'.
    
    gdb/doc/ChangeLog:
    
      2014-10-28  Jose E. Marchesi  <jose.marchesi@oracle.com>
    
      	        * gdb.texinfo (Static Probe Points): Cover the `enable probe' and
                `disable probe' commands.
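
The relation between the two views can be sketched in a few lines of Python.  This is only an approximation: the function name `oneline` is made up here, the abbreviation length of 7 is an assumption (git chooses it so that the prefix stays unambiguous), and the summary is taken to be simply the first line of the message.

```python
def oneline(full_hash, message, abbrev=7):
    """Approximate one line of `git log --oneline' output.

    Assumes the summary is the first line of the commit message and that
    a fixed-length hash prefix is unambiguous in the repository.
    """
    summary = message.split("\n", 1)[0]
    return f"{full_hash[:abbrev]} {summary}"

message = (
    "New commands `enable probe' and `disable probe'.\n"
    "\n"
    "This patch adds the above-mentioned commands to the generic probe\n"
    "abstraction implemented in probe.[ch].\n"
)
print(oneline("e44528a65592707466a9434434ed272dd3b13d9a", message))
# e44528a New commands `enable probe' and `disable probe'.
```

Without a self-contained first line, the short view shows only the start of the first ChangeLog bullet.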





^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29  9:52   ` Eric S. Raymond
  2014-10-29 11:00     ` David Kastrup
  2014-10-29 12:35     ` Stefan Monnier
@ 2014-10-29 13:08     ` Jan Djärv
  2014-10-29 13:27       ` Eric S. Raymond
  2 siblings, 1 reply; 137+ messages in thread
From: Jan Djärv @ 2014-10-29 13:08 UTC (permalink / raw)
  To: esr@thyrsus.com; +Cc: David Kastrup, emacs-devel@gnu.org

Hi. 


> 29 okt 2014 kl. 10:52 skrev Eric S. Raymond <esr@thyrsus.com>:
> 
> About summary lines, a reminder: Please don't write the traditional
> GNUish run-on change comment with a semi-infinite number of bulleted
> items in it any more. We're no longer in CVS-land, commits are cheap,
> make them fine-grained.  
> 

When using vc-mode it is very easy to insert the ChangeLog entry. 
Why is that bad?

      Jan D. 



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 12:35     ` Stefan Monnier
  2014-10-29 13:00       ` Jose E. Marchesi
@ 2014-10-29 13:26       ` Eric S. Raymond
  2014-10-29 14:04         ` Stefan Monnier
  1 sibling, 1 reply; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-29 13:26 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: David Kastrup, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca>:
> > About summary lines, a reminder: Please don't write the traditional
> > GNUish run-on change comment with a semi-infinite number of bulleted
> > items in it any more.
> 
> That's your opinion, but the convention we still use here (and don't
> just recommend but *request* people to follow) is the GNU ChangeLog format.

That's fine - for ChangeLogs.  But if you write run-on text without summary
lines *in comments*, you will make it more difficult for other programmers
to get a view of what you are doing through the DVCS tools.  Look at
a gitk display to see what I mean.

In the DVCS world, comments that don't have proper summary lines
impose friction costs. You shouldn't do that to other developers.  If
GNU conventions "request" you to impose such friction costs, it's GNU
conventions that are the problem and must change.

> > We're no longer in CVS-land, commits are cheap,
> 
> Don't know yet about Git, but I can assure you that in Bzr-land, commits
> are not cheap (things like "bzr merge", "bzr annotate" and many others
> have time complexities that depend on the number of commits).

Yes, git commits are cheap.  This was a hard requirement for the
kernel-dev workflow, which is very merge-intensive.

> > make them fine-grained.
> 
> Fine-grained or not is irrelevant.  What they should be is logical/coherent.
> > Your recent 20 or so single-line commits, which all have the same summary
> > line, are the perfect example of what should *not* be done.

Heh.  And, of course, you don't understand that the exact reason I did
this was the ChangeLog conventions - I was trying to be a good soldier
and make changesets in which content changes were properly grouped
with their ChangeLog entries, and this meant in practice I could not
generally allow a commit to touch multiple directories containing
ChangeLogs.  Even if it was trivial in all of them.

The underlying problem here is that ChangeLog entries and changeset
comments contest each other for authority over the same kinds of
metadata.  Eventually, ChangeLogs are bound to lose, because they're a
half-assed simulation of changeset comments without the actual
changesets.  In the meantime, they're going to require a lot
of ceremony and duplicative effort and lead to avoidable mistakes.

ChangeLogs were a reasonable adaptation to file-oriented VCSes, but
their time is gone.  It's 2014. Changeset commit logs have been
a thing for a decade now; we should start acting like we know that.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 13:08     ` Jan Djärv
@ 2014-10-29 13:27       ` Eric S. Raymond
  2014-10-29 13:49         ` Eric S. Raymond
  0 siblings, 1 reply; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-29 13:27 UTC (permalink / raw)
  To: Jan Djärv; +Cc: David Kastrup, emacs-devel@gnu.org

Jan Djärv <jan.h.d@swipnet.se>:
> > 29 okt 2014 kl. 10:52 skrev Eric S. Raymond <esr@thyrsus.com>:
> > 
> > About summary lines, a reminder: Please don't write the traditional
> > GNUish run-on change comment with a semi-infinite number of bulleted
> > items in it any more. We're no longer in CVS-land, commits are cheap,
> > make them fine-grained.  
> > 
> 
> When using vc-mode it is very easy to insert the ChangeLog entry. 
> Why is that bad?

Look at a gitk listing. You'll see it.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 13:27       ` Eric S. Raymond
@ 2014-10-29 13:49         ` Eric S. Raymond
  2014-10-29 18:03           ` Jan Djärv
  0 siblings, 1 reply; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-29 13:49 UTC (permalink / raw)
  To: Jan Djärv; +Cc: David Kastrup, emacs-devel@gnu.org

Eric S. Raymond <esr@thyrsus.com>:
> Jan Djärv <jan.h.d@swipnet.se>:
> > > 29 okt 2014 kl. 10:52 skrev Eric S. Raymond <esr@thyrsus.com>:
> > > 
> > > About summary lines, a reminder: Please don't write the traditional
> > > GNUish run-on change comment with a semi-infinite number of bulleted
> > > items in it any more. We're no longer in CVS-land, commits are cheap,
> > > make them fine-grained.  
> > > 
> > 
> > When using vc-mode it is very easy to insert the ChangeLog entry. 
> > Why is that bad?
> 
> Look at a gitk listing. You'll see it.

Thinking about it, this was too short an answer.  Sorry.

A better answer is that it's fine to insert the ChangeLog entry *if you put
it after a proper self-contained summary line*.  That way, listings 
in gitk and git log -1 will make sense.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 13:00       ` Jose E. Marchesi
@ 2014-10-29 13:59         ` Stefan Monnier
  2014-10-29 14:39           ` Eric S. Raymond
  2014-10-29 14:04         ` utf8 and emacs text/string multibyte representation Camm Maguire
  1 sibling, 1 reply; 137+ messages in thread
From: Stefan Monnier @ 2014-10-29 13:59 UTC (permalink / raw)
  To: Jose E. Marchesi; +Cc: Eric S. Raymond, David Kastrup, emacs-devel

>> About summary lines, a reminder: Please don't write the traditional
>> GNUish run-on change comment with a semi-infinite number of bulleted
>> items in it any more.
>     That's your opinion, but the convention we still use here (and don't
>     just recommend but *request* people to follow) is the GNU ChangeLog
>     format.
> You can have both.  In other GNU projects using git (gdb, binutils) we
> include a summary line for the benefit of `git log --oneline', followed
> by an extended description in one or more separate paragraphs and finally the
> ChangeLog entries.

We do like to have a summary line as well, indeed (that's part of the
reason why 24.4 has now "Summary:" in the default content of the
*VC-Log* buffer where you write your commit message).  But Eric was
recommending not to include the full ChangeLog thingy.


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 13:26       ` Referring to revisions in the git future Eric S. Raymond
@ 2014-10-29 14:04         ` Stefan Monnier
  2014-10-29 14:49           ` Eric S. Raymond
  2014-10-30  2:43           ` Stephen J. Turnbull
  0 siblings, 2 replies; 137+ messages in thread
From: Stefan Monnier @ 2014-10-29 14:04 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: David Kastrup, emacs-devel

> That's fine - for ChangeLogs.

Forget the ChangeLog file.  Their content is just a redundant copy of the
commit message.

> But if you write run-on text without summary lines *in comments*,

I didn't say not to write a summary line.  I opposed your recommendation
"don't write the traditional GNUish run-on change comment".

> Yes, git commits are cheap.

The same was said of Bzr commits.  I'll see when I start using it
more extensively.

> Heh.  And, of course, you don't understand that the exact reason I did
> this was the ChangeLog conventions - I was trying to be a good soldier
> and make changesets in which content changes were properly grouped
> with their ChangeLog entries, and this meant in practice I could not
> generally allow a commit to touch multiple directories containing
> ChangeLogs.

There's no problem with a single commit that touches many ChangeLog files.
Don't ever decide how to split a patch based on what the ChangeLog files
should contain.  That's completely backassward.


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* utf8 and emacs text/string multibyte representation
  2014-10-29 13:00       ` Jose E. Marchesi
  2014-10-29 13:59         ` Stefan Monnier
@ 2014-10-29 14:04         ` Camm Maguire
  2014-10-29 14:51           ` Eli Zaretskii
                             ` (2 more replies)
  1 sibling, 3 replies; 137+ messages in thread
From: Camm Maguire @ 2014-10-29 14:04 UTC (permalink / raw)
  To: emacs-devel, gcl-devel

Greetings!  I've recently been considering supporting unicode in gcl by
representing strings internally in utf8.  It appears that emacs does the
same or similar.  Apart from the obvious memory footprint benefits, I'd
like to ask what other advantages/disadvantages have been discovered.
Much of the utf8 literature emphasizes that most algorithms can proceed
conventionally in byte-wise fashion, including lexicographical ordering
comparisons, given that almost all jobs are sequential, at least
initially.  A cached internal pointer storing the last referenced
codepoint offset makes access essentially O(1).  Yet setting string
elements can trigger reallocations/memmove operations.  While these can
be aggregated over the setting of multiple elements, operations like
nreverse look ridiculous if left in terms of calls to aref and aset.
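
The ordering claim is easy to check.  The sketch below, in Python, compares sorting by code point with sorting by encoded bytes; the helper name and the sample words are made up for illustration.  UTF-16 is included only as a contrast, since surrogate pairs break the correspondence there.

```python
def same_order_as_codepoints(strings, encoding):
    """True if sorting by encoded bytes matches sorting by code point."""
    by_codepoint = sorted(strings)  # Python compares str by code point
    by_bytes = sorted(strings, key=lambda s: s.encode(encoding))
    return by_codepoint == by_bytes

words = ["zebra", "éclair", "apple", "日本", "\uffff", "😀"]
print(same_order_as_codepoints(words, "utf-8"))      # True
print(same_order_as_codepoints(words, "utf-16-be"))  # False
```

In UTF-16 the emoji is stored as a surrogate pair starting with 0xD83D, which sorts before 0xFFFF even though its code point is larger.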

Thoughts, advice and experiences most appreciated.

Take care,
-- 
Camm Maguire			     		    camm@maguirefamily.org
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 11:00     ` David Kastrup
@ 2014-10-29 14:32       ` Eli Zaretskii
  2014-10-29 14:35         ` David Kastrup
  0 siblings, 1 reply; 137+ messages in thread
From: Eli Zaretskii @ 2014-10-29 14:32 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

> From: David Kastrup <dak@gnu.org>
> Date: Wed, 29 Oct 2014 12:00:18 +0100
> 
> Each commit tends to come with its own manual conflict resolution
> for ChangeLog.

Not if you have git-merge-changelog installed.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 14:32       ` Eli Zaretskii
@ 2014-10-29 14:35         ` David Kastrup
  2014-10-29 14:55           ` Eli Zaretskii
  0 siblings, 1 reply; 137+ messages in thread
From: David Kastrup @ 2014-10-29 14:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: David Kastrup <dak@gnu.org>
>> Date: Wed, 29 Oct 2014 12:00:18 +0100
>> 
>> Each commit tends to come with its own manual conflict resolution
>> for ChangeLog.
>
> Not if you have git-merge-changelog installed.

If it were part of the official workflow, it would be part of Emacs or
at least ELPA.

-- 
David Kastrup



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 13:59         ` Stefan Monnier
@ 2014-10-29 14:39           ` Eric S. Raymond
  2014-10-29 14:46             ` Rasmus
  2014-10-29 15:27             ` Stefan Monnier
  0 siblings, 2 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-29 14:39 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: David Kastrup, emacs-devel, Jose E. Marchesi

Stefan Monnier <monnier@iro.umontreal.ca>:
> >> About summary lines, a reminder: Please don't write the traditional
> >> GNUish run-on change comment with a semi-infinite number of bulleted
> >> items in it any more.
> >     That's your opinion, but the convention we still use here (and don't
> >     just recommend but *request* people to follow) is the GNU ChangeLog
> >     format.
> > You can have both.  In other GNU projects using git (gdb, binutils) we
> > include a summary line for the benefit of `git log --oneline', followed
> > by an extended description in one or more separate paragraphs and finally the
> > ChangeLog entries.
> 
> We do like to have a summary line as well, indeed (that's part of the
> reason why 24.4 has now "Summary:" in the default content of the
> *VC-Log* buffer where you write your commit message).  But Eric was
> recommending not to include the full ChangeLog thingy.

Eh?

No. As I just said to Jan Djärv, including the full ChangeLog thingy
is fine *as long as there's a proper summary line in front of it*.
In other words, we need to move from this (marking start and
end of comment with '-----'):

---------------------------------------------------------
* foo.c: Add a mutex around struct grobble so queue updates can be
thread-safe
* bar.c, baz.c: Assert mutex to avoid data scrambleage.
---------------------------------------------------------

to this:

---------------------------------------------------------
Properly mutex-lock the grobble structure.

* foo.c: Add a mutex around struct grobble so queue updates can be
thread-safe
* bar.c, baz.c: Assert mutex to avoid data scrambleage.
---------------------------------------------------------
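
The difference between the two forms can be checked mechanically.  What follows is a rough sketch in Python, not an existing hook or tool: the function name and the two rules it applies (the first line is not a "* " bullet, and a blank line follows it) are assumptions drawn from the examples above.

```python
def summary_problems(message):
    """Return a list of complaints about a commit message's summary line."""
    lines = message.strip("\n").split("\n")
    summary = lines[0].strip()
    problems = []
    if not summary:
        problems.append("empty summary line")
    if summary.startswith("* "):
        problems.append("first line is a ChangeLog bullet, not a summary")
    if len(lines) > 1 and lines[1].strip():
        problems.append("no blank line after the summary")
    return problems

before = (
    "* foo.c: Add a mutex around struct grobble so queue updates can be\n"
    "thread-safe\n"
    "* bar.c, baz.c: Assert mutex to avoid data scrambleage.\n"
)
after = "Properly mutex-lock the grobble structure.\n\n" + before

print(summary_problems(before))  # two complaints
print(summary_problems(after))   # []
```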

Now, in the longer term, I think changeset comment logs will and
should replace ChangeLog files entirely, but that's a different
conversation.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 14:39           ` Eric S. Raymond
@ 2014-10-29 14:46             ` Rasmus
  2014-10-29 14:52               ` Eric S. Raymond
  2014-10-30  0:58               ` Rob Browning
  2014-10-29 15:27             ` Stefan Monnier
  1 sibling, 2 replies; 137+ messages in thread
From: Rasmus @ 2014-10-29 14:46 UTC (permalink / raw)
  To: emacs-devel

"Eric S. Raymond" <esr@thyrsus.com> writes:

> Now, in the longer term, I think changeset comment logs will and
> should replace ChangeLog files entirely, but that's a different
> conversation.

AFAIK that's what Org does:

      http://orgmode.org/worg/org-contribute.html#unnumbered-10

-- 
The second rule of Fight Club is: You do not talk about Fight Club




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 14:04         ` Stefan Monnier
@ 2014-10-29 14:49           ` Eric S. Raymond
  2014-10-30  2:43           ` Stephen J. Turnbull
  1 sibling, 0 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-29 14:49 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: David Kastrup, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca>:
> > But if you write run-on text without summary lines *in comments*,
> 
> I didn't say not to write a summary line.  I opposed your recommendation
> "don't write the traditional GNUish run-on change comment".

Well, I still recommend that. But there are two distinct practice issues 
here:

(1) The traditional way of writing run-on comments *discourages* writing
proper summary lines - instead we get grab-bags of only quasi-related changes
in a bullet list, with only the first line of the first bullet item
accidentally visible as a summary.  We need to stop doing this.

(2) Summary lines aside, commits that *have* to be described with a 
bullet list are almost always overgrown lumps that should have been 
better partitioned, if only so the changes will be easier to read later.

DVCSes encourage lots of fine-grained commits that are one single
thought each.  Embrace this, it's good for you.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-29 14:04         ` utf8 and emacs text/string multibyte representation Camm Maguire
@ 2014-10-29 14:51           ` Eli Zaretskii
  2014-10-29 15:55             ` Camm Maguire
  2014-10-29 16:45             ` Stefan Monnier
  2014-10-29 15:56           ` Raymond Toy
  2014-10-30  3:08           ` Stephen J. Turnbull
  2 siblings, 2 replies; 137+ messages in thread
From: Eli Zaretskii @ 2014-10-29 14:51 UTC (permalink / raw)
  To: Camm Maguire; +Cc: gcl-devel, emacs-devel

> From: Camm Maguire <camm@maguirefamily.org>
> Date: Wed, 29 Oct 2014 10:04:58 -0400
> 
> Greetings!  I've recently been considering supporting unicode in gcl by
> representing strings internally in utf8.  It appears that emacs does the
> same or similar.

If you haven't already, you can find some basic description of what
Emacs does in the node "Text Representations" of the ELisp manual.

> Apart from the obvious memory footprint benefits, I'd
> like to ask what other advantages/disadvantages have been discovered.

You have basically said it yourself: memory footprint vs
addressability.  If you want to discuss this in more detail, I suggest
asking more specific questions about the specific aspects that bother you.

> A cached internal pointer storing the last referenced codepoint
> offset makes access essentially O(1).

We indeed maintain a cache for byte-to-character and character-to-byte
conversions.
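
As a rough illustration of the idea (not of the Emacs implementation, which is more elaborate and can also scan backward from a known position), here is a Python sketch of a UTF-8 string that remembers the last character-to-byte mapping it computed.  The class and method names are made up.

```python
class Utf8String:
    """UTF-8 bytes plus a one-entry cache of (character index, byte offset)."""

    def __init__(self, text):
        self.data = text.encode("utf-8")
        self._char = 0  # cached character index
        self._byte = 0  # byte offset of that character

    @staticmethod
    def _width(lead):
        """Length in bytes of the sequence that starts with this lead byte."""
        if lead < 0x80:
            return 1
        if lead < 0xE0:
            return 2
        if lead < 0xF0:
            return 3
        return 4

    def byte_offset(self, index):
        """Byte offset of character `index`, scanning on from the cache."""
        if index < self._char:  # target is behind the cache: restart
            self._char, self._byte = 0, 0
        while self._char < index:
            if self._byte >= len(self.data):
                raise IndexError(index)
            self._byte += self._width(self.data[self._byte])
            self._char += 1
        return self._byte

    def char_at(self, index):
        start = self.byte_offset(index)
        if start >= len(self.data):
            raise IndexError(index)
        end = start + self._width(self.data[start])
        return self.data[start:end].decode("utf-8")

s = Utf8String("aé日😀z")
print(s.char_at(3), s.byte_offset(4))  # 😀 10
```

Sequential access costs one step per call because each lookup continues from the cached position; a jump backward falls back to a scan from the start in this simplified version.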

> Yet setting string elements can trigger reallocations/memmove
> operations.

Emacs, as every editor, needs to handle this efficiently anyway,
because editing operations rarely leave the buffer size unchanged.  So
Emacs uses a gap to minimize reallocations.
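
For readers who have not met the technique, a gap buffer can be sketched as follows.  This is a toy Python version that stores characters in a list; Emacs buffers store bytes and differ in many details, and all names and the growth policy here are assumptions made for the example.

```python
class GapBuffer:
    """Text stored in one array with a movable hole (the gap)."""

    def __init__(self, text="", gap=16):
        self.buf = list(text) + [None] * gap
        self.gap_start = len(text)    # first slot of the gap
        self.gap_end = len(self.buf)  # first slot after the gap

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def _move_gap(self, pos):
        """Shift text across the gap until the gap starts at pos."""
        if not 0 <= pos <= len(self):
            raise IndexError(pos)
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def _grow(self):
        """Reallocate only when the gap is exhausted."""
        extra = max(16, len(self.buf))
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, pos, text):
        self._move_gap(pos)
        for ch in text:
            if self.gap_start == self.gap_end:
                self._grow()
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete(self, pos, count):
        """Delete by widening the gap over the following characters."""
        self._move_gap(pos)
        self.gap_end = min(self.gap_end + count, len(self.buf))

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])

g = GapBuffer("hello world", gap=2)
g.insert(5, ",")
g.insert(12, "!!!")
g.delete(0, 7)
print(g.text())  # world!!!
```

Repeated edits near the same position only fill or widen the gap, so no text has to move until the editing point changes or the gap runs out.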

> While these can be aggregated over the setting of multiple elements,
> operations like nreverse look ridiculous if left in terms of calls
> to aref and aset.

nreverse applied to a string is a rarity, IME.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 14:46             ` Rasmus
@ 2014-10-29 14:52               ` Eric S. Raymond
  2014-10-30  0:58               ` Rob Browning
  1 sibling, 0 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-29 14:52 UTC (permalink / raw)
  To: Rasmus; +Cc: emacs-devel

Rasmus <rasmus@gmx.us>:
> "Eric S. Raymond" <esr@thyrsus.com> writes:
> 
> > Now, in the longer term, I think changeset comment logs will and
> > should replace ChangeLog files entirely, but that's a different
> > conversation.
> 
> AFAIK that's what Org does:
> 
>       http://orgmode.org/worg/org-contribute.html#unnumbered-10

A clue!  A veritable clue!

/me dances the happy dance.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29  0:49 ` Eric S. Raymond
  2014-10-29  3:38   ` Stephen J. Turnbull
@ 2014-10-29 14:52   ` Barry Warsaw
  2014-10-29 15:01     ` David Kastrup
  1 sibling, 1 reply; 137+ messages in thread
From: Barry Warsaw @ 2014-10-29 14:52 UTC (permalink / raw)
  To: emacs-devel

On Oct 28, 2014, at 08:49 PM, Eric S. Raymond wrote:

>Git does not have such a mapping.  This is not the git designers being
>perverse; all other DVCSes have the same issue. A true DVCS is
>designed for distributed operation in which there is no privileged node
>to hand out the monotonically-increasing IDs.

This is certainly true in theory, but in practice you almost always have
sufficient context for monotonically increasing revision numbers[*] to make
sense.

For example, bzr has revision hashes for unique reference, but it also
has human-friendly revision numbers, which will generally be completely fine
to use in practice.  When used like this, the context is almost always for the
"trunk" branch in the "master" repository.

Yes, of course dvcs, democracy, and all, but I claim that most projects have a
canonical place for their source code.  If you ask their project leaders "how
do I get your code", they will answer with the url to this canonical location.
Thus, "your bug is fixed in r19801" has implied context for this url, and the
master, trunk, line-of-development (or whatever you call it) branch.

git really doesn't acknowledge this common development workflow, so it's
understandable that it doesn't in any way support human-readable revision ids.

It's also true that in e.g. bzr, if you really had to refer to a unique
revision id, you can use the hash.  It's just that in polite conversation
<wink>, it's rarely needed.

Cheers,
-Barry

[*] Although it's true that some bzr merge operations can "mess with" those
numbers, it's generally bad practice to use merge in such a way as to cause
this to happen.


^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 14:35         ` David Kastrup
@ 2014-10-29 14:55           ` Eli Zaretskii
  2014-10-30  4:44             ` Richard Stallman
  0 siblings, 1 reply; 137+ messages in thread
From: Eli Zaretskii @ 2014-10-29 14:55 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

> From: David Kastrup <dak@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Wed, 29 Oct 2014 15:35:48 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> From: David Kastrup <dak@gnu.org>
> >> Date: Wed, 29 Oct 2014 12:00:18 +0100
> >> 
> >> Each commit tends to come with its own manual conflict resolution
> >> for ChangeLog.
> >
> > Not if you have git-merge-changelog installed.
> 
> If it were part of the official workflow

We don't have "the official workflow" for git.  The Powers That Be
claim that it's unnecessary, and even harmful.

I personally find it invaluable, as I do the equivalent bzr plugin.

> it would be part of Emacs or at least ELPA.

It's a C program that is available as a Gnulib module.  Not sure how
that is relevant to ELPA.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 14:52   ` Barry Warsaw
@ 2014-10-29 15:01     ` David Kastrup
  2014-10-29 15:06       ` Eric S. Raymond
  0 siblings, 1 reply; 137+ messages in thread
From: David Kastrup @ 2014-10-29 15:01 UTC (permalink / raw)
  To: emacs-devel

Barry Warsaw <barry@python.org> writes:

> [*] Although it's true that some bzr merge operations can "mess with"
> those numbers, it's generally bad practice to use merge in such a way
> as to cause this to happen.

My asking-for-trouble meter just rang off the hook.

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 15:01     ` David Kastrup
@ 2014-10-29 15:06       ` Eric S. Raymond
  2014-10-29 18:12         ` Barry Warsaw
  0 siblings, 1 reply; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-29 15:06 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

David Kastrup <dak@gnu.org>:
> Barry Warsaw <barry@python.org> writes:
> 
> > [*] Although it's true that some bzr merge operations can "mess with"
> > those numbers, it's generally bad practice to use merge in such a way
> > as to cause this to happen.
> 
> My asking-for-trouble meter just rang off the hook.

*shudder* And rightly so!
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 14:39           ` Eric S. Raymond
  2014-10-29 14:46             ` Rasmus
@ 2014-10-29 15:27             ` Stefan Monnier
  1 sibling, 0 replies; 137+ messages in thread
From: Stefan Monnier @ 2014-10-29 15:27 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: David Kastrup, emacs-devel, Jose E. Marchesi

> No. As I just said to Jan Djärv, including the full ChangeLog thingy
> is fine *as long as there's a proper summary line in front of it*.
> In other words, we need to move from this (marking start and
> end of comment with '-----'):

Yes, we agree.  But what you wrote was different:

   About summary lines, a reminder: Please don't write the traditional
   GNUish run-on change comment with a semi-infinite number of bulleted
   items in it any more.

Hence my reaction to correct your statement so people don't start
committing changes without a proper ChangeLog-style commit message.


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-29 14:51           ` Eli Zaretskii
@ 2014-10-29 15:55             ` Camm Maguire
  2014-10-29 16:19               ` Eli Zaretskii
  2014-10-29 16:45             ` Stefan Monnier
  1 sibling, 1 reply; 137+ messages in thread
From: Camm Maguire @ 2014-10-29 15:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gcl-devel, emacs-devel

Greetings, and thanks so much for the feedback!

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Camm Maguire <camm@maguirefamily.org>
>> Date: Wed, 29 Oct 2014 10:04:58 -0400
>> 
> You have basically said it yourself: memory footprint vs
> addressability.  If you want to discuss this in more detail, I suggest
> asking more specific questions about the specific aspects that bother you.
>

I thought there would be a little more on the upside, say some benefit
from having the internal representation be the same as that used in many
external representations, at least on linux, and perhaps some algorithm
coalescing with straightforward byte-wise operations.  Does every string
access in emacs proceed through the utf8 decoder?

>> A cached internal pointer storing the last referenced codepoint
>> offset makes access essentially O(1).
>
> We indeed maintain a cache for byte-to-character and character-to-byte
> conversions.

How big is this cache?

>
>> Yet setting string elements can trigger reallocations/memmove
>> operations.
>
> Emacs, as every editor, needs to handle this efficiently anyway,
> because editing operations rarely leave the buffer size unchanged.  So
> Emacs uses a gap to minimize reallocations.
>

But no gap in strings, right (i.e. just buffers)?

>> While these can be aggregated over the setting of multiple elements,
>> operations like nreverse look ridiculous if left in terms of calls
>> to aref and aset.
>
> nreverse applied to a string is a rarity, IME.
>

This is the stuff I really need to get a handle on -- what are the
dominant string operations.

Take care,
-- 
Camm Maguire			     		    camm@maguirefamily.org
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-29 14:04         ` utf8 and emacs text/string multibyte representation Camm Maguire
  2014-10-29 14:51           ` Eli Zaretskii
@ 2014-10-29 15:56           ` Raymond Toy
  2014-10-30 14:16             ` Camm Maguire
  2014-10-30  3:08           ` Stephen J. Turnbull
  2 siblings, 1 reply; 137+ messages in thread
From: Raymond Toy @ 2014-10-29 15:56 UTC (permalink / raw)
  To: gcl-devel; +Cc: emacs-devel

>>>>> "Camm" == Camm Maguire <camm@maguirefamily.org> writes:

    Camm> Greetings!  I've recently been considering supporting unicode in gcl by
    Camm> representing strings internally in utf8.  It appears that emacs does the
    Camm> same or similar.  Apart from the obvious memory footprint benefits, I'd
    Camm> like to ask what other advantages/disadvantages have been discovered.
    Camm> Much of the utf8 literature emphasizes that most algorithms can proceed
    Camm> conventionally in byte-wise fashion, including lexicographical ordering
    Camm> comparisons, given that almost all jobs are sequential, at least
    Camm> initially.  A cached internal pointer storing the last referenced
    Camm> codepoint offset makes access essentially O(1).  Yet setting string
    Camm> elements can trigger reallocations/memmove operations.  While these can
    Camm> be aggregated over the setting of multiple elements, operations like
    Camm> nreverse look ridiculous if left in terms of calls to aref and aset.

    Camm> Thoughts, advice and experiences most appreciated.

Have you looked at what other Lisp implementations do? AFAIK, none use
utf-8. CCL and clisp use utf-32, cmucl and allegro use utf-16, sbcl
and ecl(?) have two string types: 8-bit base-string and 32-bit
strings.

As a one-man operation (unfortunately), I'd go with the easiest one to
get right and follow either ccl or cmucl.  The rest of the support for
unicode can be added with libraries like cl-unicode and/or babel, if
need be.

--
Ray

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-29 15:55             ` Camm Maguire
@ 2014-10-29 16:19               ` Eli Zaretskii
  2014-10-30 14:13                 ` Camm Maguire
  0 siblings, 1 reply; 137+ messages in thread
From: Eli Zaretskii @ 2014-10-29 16:19 UTC (permalink / raw)
  To: Camm Maguire; +Cc: gcl-devel, emacs-devel

> From: Camm Maguire <camm@maguirefamily.org>
> Cc: emacs-devel@gnu.org,  gcl-devel@gnu.org
> Date: Wed, 29 Oct 2014 11:55:13 -0400
> 
> I thought there would be a little more on the upside, say some benefit
> from having the internal representation be the same as that used in many
> external representations, at least on linux

Yes, that too.  Emacs originally used a very different internal
encoding (ISO-2022 based), and the switch to a UTF-8 based one was due
to the above.  In general, having a Unicode basis works better when you
want to support various Unicode defined features, like the UCA etc.

> and perhaps some algorithm coalescing with straightforward byte-wise
> operations.

Not sure what you mean here, please elaborate.  In general, many
operations with UTF-8 strings can use the usual string library
functions, as you probably know very well.

> Does every string access in emacs proceed through the utf8 decoder?

If you need to look at the character, yes.  E.g., if you need some
property of the character, you need to index the appropriate table by
that character's codepoint.  But in most operations that is not
needed.  You just need to recognize several specific characters, like
the null character, the slash, etc., most of which are ASCII.
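
As a rough illustration of why such checks need no decoding, here is a
minimal Python sketch.  It assumes standard UTF-8 rather than Emacs's
internal superset of it, and find_ascii is a hypothetical helper name.
Every byte of a multibyte sequence is 0x80 or above, so a plain byte
search for an ASCII character cannot produce a false match.

```python
def find_ascii(data: bytes, ch: str) -> int:
    """Return the byte offset of the first ASCII character ch in
    UTF-8 encoded data, or -1 if it does not occur."""
    code = ord(ch)
    if code >= 0x80:
        # Non-ASCII characters span several bytes; this helper does not
        # handle them.
        raise ValueError("only ASCII characters can be matched byte-wise here")
    # Lead and continuation bytes of multibyte sequences are all >= 0x80,
    # so a byte equal to `code` is always a real occurrence of `ch`.
    return data.find(bytes([code]))


path = "données/été/file.txt".encode("utf-8")
slash_offset = find_ascii(path, "/")
```

The result is a byte offset, not a character index; converting between
the two is where a position cache comes in.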

> >> A cached internal pointer storing the last referenced codepoint
> >> offset makes access essentially O(1).
> >
> > We indeed maintain a cache for byte-to-character and character-to-byte
> > conversions.
> 
> How big is this cache?

Its size is dynamic, and depends on how frequently the conversion is
needed in places that are far away.  The cache stores byte-to-char
correspondence in places that are far away, and Emacs uses binary
search in between them.
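
To make the idea concrete, here is a small Python sketch of a sparse
character-to-byte cache.  It is not Emacs's actual code: the class name,
the stride parameter and the forward scan from the nearest cached entry
are all assumptions made for illustration.

```python
import bisect


class PositionCache:
    """Sparse char-index -> byte-offset cache over a UTF-8 byte string."""

    def __init__(self, data: bytes, stride: int = 4):
        self.data = data
        # Parallel sorted lists of known (char index, byte offset) pairs.
        self.chars = [0]
        self.offsets = [0]
        # Hypothetical tuning knob: minimum distance before caching.
        self.stride = stride

    def char_to_byte(self, target: int) -> int:
        # Binary search for the nearest cached entry at or before target.
        i = bisect.bisect_right(self.chars, target) - 1
        char, byte = self.chars[i], self.offsets[i]
        # Scan forward one character at a time from that entry.
        while char < target:
            if byte >= len(self.data):
                raise IndexError("character index out of range")
            byte += 1
            # Skip continuation bytes, which look like 0b10xxxxxx.
            while byte < len(self.data) and (self.data[byte] & 0xC0) == 0x80:
                byte += 1
            char += 1
        # Cache the result only if it was far from the starting entry.
        if target - self.chars[i] >= self.stride:
            self.chars.insert(i + 1, target)
            self.offsets.insert(i + 1, byte)
        return byte


data = "aé漢字😀z".encode("utf-8")
cache = PositionCache(data, stride=2)
offset_of_z = cache.char_to_byte(5)
```

Each cache entry costs two integers, so the spacing between entries
trades memory against scan length.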

> >> Yet setting string elements can trigger reallocations/memmove
> >> operations.
> >
> > Emacs, as every editor, needs to handle this efficiently anyway,
> > because editing operations rarely leave the buffer size unchanged.  So
> > Emacs uses a gap to minimize reallocations.
> >
> 
> But no gap in strings, right (i.e. just buffers)?

Right.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-29 14:51           ` Eli Zaretskii
  2014-10-29 15:55             ` Camm Maguire
@ 2014-10-29 16:45             ` Stefan Monnier
  1 sibling, 0 replies; 137+ messages in thread
From: Stefan Monnier @ 2014-10-29 16:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Camm Maguire, gcl-devel, emacs-devel

>> Yet setting string elements can trigger reallocations/memmove
>> operations.
> Emacs, as every editor, needs to handle this efficiently anyway,
> because editing operations rarely leave the buffer size unchanged.  So
> Emacs uses a gap to minimize reallocations.

To clarify: Emacs handles modification of strings naively
(reallocate+memmove), but it doesn't matter much because these are
almost never used (my own local Emacs actually completely disallows
them, and I very rarely bump into problems because of it).  All "string
modifications" instead take place in buffers where we can efficiently
insert/delete/replace text.
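
For readers unfamiliar with the gap technique mentioned above, here is a
minimal Python sketch of a gap buffer.  It is an illustration only, not
Emacs's buffer implementation; the class, its method names and the
growth policy are assumptions.

```python
class GapBuffer:
    """Byte buffer with a movable gap, so edits near the gap are cheap."""

    def __init__(self, text: bytes = b"", gap: int = 16):
        self.buf = bytearray(text) + bytearray(gap)
        self.gap_start = len(text)      # first byte of the gap
        self.gap_end = len(self.buf)    # first byte after the gap

    def _move_gap(self, pos: int) -> None:
        if pos < self.gap_start:
            # Shift the text between pos and the gap to the gap's far end.
            n = self.gap_start - pos
            self.buf[self.gap_end - n:self.gap_end] = self.buf[pos:self.gap_start]
            self.gap_start = pos
            self.gap_end -= n
        elif pos > self.gap_start:
            # Shift the text just after the gap to the gap's near end.
            n = pos - self.gap_start
            self.buf[self.gap_start:self.gap_start + n] = \
                self.buf[self.gap_end:self.gap_end + n]
            self.gap_start += n
            self.gap_end += n

    def insert(self, pos: int, data: bytes) -> None:
        self._move_gap(pos)
        if len(data) > self.gap_end - self.gap_start:
            # Gap too small: enlarge it (the only reallocation point).
            grow = len(data) + 16
            self.buf[self.gap_end:self.gap_end] = bytearray(grow)
            self.gap_end += grow
        self.buf[self.gap_start:self.gap_start + len(data)] = data
        self.gap_start += len(data)

    def delete(self, pos: int, n: int) -> None:
        # Deleting just widens the gap over the next n bytes.
        self._move_gap(pos)
        self.gap_end = min(self.gap_end + n, len(self.buf))

    def contents(self) -> bytes:
        return bytes(self.buf[:self.gap_start] + self.buf[self.gap_end:])


g = GapBuffer(b"hello world", gap=4)
g.insert(5, b",")
g.insert(0, b">> ")
g.delete(3, 7)
```

Consecutive edits at the same position never move the gap, which is the
common case in interactive editing.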


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 13:49         ` Eric S. Raymond
@ 2014-10-29 18:03           ` Jan Djärv
  0 siblings, 0 replies; 137+ messages in thread
From: Jan Djärv @ 2014-10-29 18:03 UTC (permalink / raw)
  To: esr; +Cc: David Kastrup, emacs-devel@gnu.org

Hi.

> 29 okt 2014 kl. 14:49 skrev Eric S. Raymond <esr@thyrsus.com>:
> 
> Eric S. Raymond <esr@thyrsus.com>:
>> Jan Djärv <jan.h.d@swipnet.se>:
>>>> 29 okt 2014 kl. 10:52 skrev Eric S. Raymond <esr@thyrsus.com>:
>>>> 
>>>> About summary lines, a reminder: Please don't write the traditional
>>>> GNUish run-on change comment with a semi-infinite number of bulleted
>>>> items in it any more. We're no longer in CVS-land, commits are cheap,
>>>> make them fine-grained.  
>>>> 
>>> 
>>> When using vc-mode it is very easy to insert the ChangeLog entry. 
>>> Why is that bad?
>> 
>> Look at a gitk listing. You'll see it.
> 
> Thinking about it, this was too short an answer.  Sorry.
> 
> A better answer is that it's fine to insert the ChangeLog entry *if you put
> it after a proper self-contained summary line*.  That way, listings 
> in gitk and git log -1 will make sense.

That is the recommendation already, and vc-mode supports it.

	Jan D.




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 15:06       ` Eric S. Raymond
@ 2014-10-29 18:12         ` Barry Warsaw
  2014-10-29 22:09           ` Lars Magne Ingebrigtsen
  2014-10-30  3:32           ` Stephen J. Turnbull
  0 siblings, 2 replies; 137+ messages in thread
From: Barry Warsaw @ 2014-10-29 18:12 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 657 bytes --]

On Oct 29, 2014, at 11:06 AM, Eric S. Raymond wrote:

>David Kastrup <dak@gnu.org>:
>> Barry Warsaw <barry@python.org> writes:
>> 
>> > [*] Although it's true that some bzr merge operations can "mess with"
>> > those numbers, it's generally bad practice to use merge in such a way
>> > as to cause this to happen.
>> 
>> My asking-for-trouble meter just rang off the hook.
>
>*shudder* And rightly so!

Sure, but it's rare and only happens with some funky (and I'd argue incorrect
directionally) merges.  And besides, those sequential numbers are only for our
convenience, right? ;)  Revision ids of course are inviolate.

Cheers,
-Barry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 18:12         ` Barry Warsaw
@ 2014-10-29 22:09           ` Lars Magne Ingebrigtsen
  2014-10-29 22:29             ` Eric S. Raymond
  2014-10-30  3:32           ` Stephen J. Turnbull
  1 sibling, 1 reply; 137+ messages in thread
From: Lars Magne Ingebrigtsen @ 2014-10-29 22:09 UTC (permalink / raw)
  To: emacs-devel

Apparently Emacs is a project that certain people I won't name won't
name:

http://esr.ibiblio.org/?p=6485

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 22:09           ` Lars Magne Ingebrigtsen
@ 2014-10-29 22:29             ` Eric S. Raymond
  2014-10-29 23:31               ` Paul Eggert
  0 siblings, 1 reply; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-29 22:29 UTC (permalink / raw)
  To: emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org>:
> Apparently Emacs is a project that certain people I won't name won't
> name:

I admit nothing, I deny nothing.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 22:29             ` Eric S. Raymond
@ 2014-10-29 23:31               ` Paul Eggert
  2014-10-30  0:01                 ` Nic Ferrier
                                   ` (2 more replies)
  0 siblings, 3 replies; 137+ messages in thread
From: Paul Eggert @ 2014-10-29 23:31 UTC (permalink / raw)
  To: esr, emacs-devel

Eric S. Raymond wrote:
> I admit nothing, I deny nothing.

Gee, I'm older than you are, and I wish we had switched to Git years ago.

Emacs should switch to automatically-generated ChangeLogs, too.  This would save 
us time and effort.  Several other GNU projects have switched already, and in 
practice it's a win.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 23:31               ` Paul Eggert
@ 2014-10-30  0:01                 ` Nic Ferrier
  2014-10-30  1:53                 ` Stefan Monnier
  2014-10-30  6:46                 ` Jan Djärv
  2 siblings, 0 replies; 137+ messages in thread
From: Nic Ferrier @ 2014-10-30  0:01 UTC (permalink / raw)
  To: Paul Eggert; +Cc: esr, emacs-devel

Paul Eggert <eggert@cs.ucla.edu> writes:

> Eric S. Raymond wrote:
>> I admit nothing, I deny nothing.
>
> Gee, I'm older than you are, and I wish we had switched to Git years ago.
>
> Emacs should switch to automatically-generated ChangeLogs, too.  This
> would save us time and effort.  Several other GNU projects have
> switched already, and in practice it's a win.

I second this.

I don't understand why anyone would do anything else.

Sure, generate them with every change if you want, with a hook and then
commit the changed ChangeLog back into the repo.

It's easy to ignore the Changelog changes in the git history.


Nic



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 14:46             ` Rasmus
  2014-10-29 14:52               ` Eric S. Raymond
@ 2014-10-30  0:58               ` Rob Browning
  1 sibling, 0 replies; 137+ messages in thread
From: Rob Browning @ 2014-10-30  0:58 UTC (permalink / raw)
  To: Rasmus, emacs-devel

Rasmus <rasmus@gmx.us> writes:

> AFAIK that's what Org does:
>
>       http://orgmode.org/worg/org-contribute.html#unnumbered-10

And Guile (FWIW):

  http://git.savannah.gnu.org/cgit/guile.git/tree/ChangeLog

-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 23:31               ` Paul Eggert
  2014-10-30  0:01                 ` Nic Ferrier
@ 2014-10-30  1:53                 ` Stefan Monnier
  2014-10-30  2:10                   ` Eric S. Raymond
  2014-10-30  6:46                 ` Jan Djärv
  2 siblings, 1 reply; 137+ messages in thread
From: Stefan Monnier @ 2014-10-30  1:53 UTC (permalink / raw)
  To: Paul Eggert; +Cc: esr, emacs-devel

> Emacs should switch to automatically-generated ChangeLogs, too.

No disagreement from me either,


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  1:53                 ` Stefan Monnier
@ 2014-10-30  2:10                   ` Eric S. Raymond
  2014-10-30  2:13                     ` Paul Eggert
                                       ` (2 more replies)
  0 siblings, 3 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-30  2:10 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Paul Eggert, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca>:
> > Emacs should switch to automatically-generated ChangeLogs, too.
> 
> No disagreement from me either,

You surprise me, in a good way.  I am sketching the design of a tool 
to query git log in my head even as I write.   A good thing for me 
to do after the git transition is finished.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  2:10                   ` Eric S. Raymond
@ 2014-10-30  2:13                     ` Paul Eggert
  2014-10-30  2:48                       ` Eric S. Raymond
  2014-10-30  2:25                     ` Glenn Morris
  2014-10-30 13:02                     ` Stefan Monnier
  2 siblings, 1 reply; 137+ messages in thread
From: Paul Eggert @ 2014-10-30  2:13 UTC (permalink / raw)
  To: esr; +Cc: emacs-devel

Eric S. Raymond wrote:
> I am sketching the design of a tool
> to query git log in my head even as I write.

I suggest using Gnulib's gitlog-to-changelog script.  Or improve it, if it needs 
improvements.  Several other GNU projects use it.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  2:10                   ` Eric S. Raymond
  2014-10-30  2:13                     ` Paul Eggert
@ 2014-10-30  2:25                     ` Glenn Morris
  2014-10-30 10:10                       ` David Kastrup
  2014-10-30 13:02                     ` Stefan Monnier
  2 siblings, 1 reply; 137+ messages in thread
From: Glenn Morris @ 2014-10-30  2:25 UTC (permalink / raw)
  To: esr; +Cc: Paul Eggert, Stefan Monnier, emacs-devel

"Eric S. Raymond" wrote:

> Stefan Monnier <monnier@iro.umontreal.ca>:
>> > Emacs should switch to automatically-generated ChangeLogs, too.
>> 
>> No disagreement from me either,
>
> You surprise me, in a good way. 

Stefan's views on this are not a surprise to regular readers of emacs-devel.

http://lists.gnu.org/archive/html/emacs-devel/2013-03/msg00978.html

and several other instances. This is not a new discussion.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 14:04         ` Stefan Monnier
  2014-10-29 14:49           ` Eric S. Raymond
@ 2014-10-30  2:43           ` Stephen J. Turnbull
  1 sibling, 0 replies; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-10-30  2:43 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eric S. Raymond, David Kastrup, emacs-devel

Stefan Monnier writes:

 > > Yes, git commits are cheap.
 > 
 > The same was said of Bzr commits.  I'll see when I start using it
 > more extensively.

The bzr developers didn't care about speed until somebody complained,
and their designs for repos were not only complexified by their
requirements for human-friendliness (not to mention their attachment
to multi-layered APIs), but they repeatedly changed those underlying
data structures.

Git has always had only one repo structure: an object database whose
"large" structure is provided by what are basically conses (the
commits).  The database has been layered over "packs", it's true, but
those are actually a speed optimization for accessing commits and
other objects.

Historically, Git also perceived itself to be competing on speed
(specifically with Mercurial, thus the introduction of packs), and
addressed performance issues quickly and thoroughly.

I don't know if that makes you feel better, but it should. :-)



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  2:13                     ` Paul Eggert
@ 2014-10-30  2:48                       ` Eric S. Raymond
  0 siblings, 0 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-30  2:48 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

Paul Eggert <eggert@cs.ucla.edu>:
> Eric S. Raymond wrote:
> >I am sketching the design of a tool
> >to query git log in my head even as I write.
> 
> I suggest using Gnulib's gitlog-to-changelog script.  Or improve it,
> if it needs improvements.  Several other GNU projects use it.

I will investigate, thank you.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* utf8 and emacs text/string multibyte representation
  2014-10-29 14:04         ` utf8 and emacs text/string multibyte representation Camm Maguire
  2014-10-29 14:51           ` Eli Zaretskii
  2014-10-29 15:56           ` Raymond Toy
@ 2014-10-30  3:08           ` Stephen J. Turnbull
  2 siblings, 0 replies; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-10-30  3:08 UTC (permalink / raw)
  To: Camm Maguire; +Cc: gcl-devel, emacs-devel

Camm Maguire writes:

 > Greetings!  I've recently been considering supporting unicode in gcl by
 > representing strings internally in utf8.  It appears that emacs does the
 > same or similar.  Apart from the obvious memory footprint benefits,

If you need to *edit* large strings at arbitrary positions with high
performance, the memory footprint benefits are reduced by the need to
cache char position vs. memory position.  If you're on a 64-bit
architecture, those cache entries chew up memory 16 bytes at a time.

I think Emacs does a much better job of handling the position cache
than XEmacs does, so you're asking in the right place.  Just be aware
that it's possible to do it poorly. :-)

 > Yet setting string elements can trigger reallocations/memmove
 > operations.  While these can be aggregated over the setting of
 > multiple elements, operations like nreverse look ridiculous if left
 > in terms of calls to aref and aset.

How many of those operations are there, though?  At worst, nreverse
requires a few bytes of temporary storage to be implemented
efficiently.  If there are only a few of them, just implement them as
primitives.
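
One way such a primitive could look, sketched in Python over a bytearray
holding standard UTF-8 (the function name is hypothetical): reverse the
bytes inside each character, then reverse the whole buffer.  The only
temporary storage is one character's worth of bytes.

```python
def reverse_utf8_in_place(buf: bytearray) -> None:
    """Reverse the characters of a valid UTF-8 buffer in place."""
    n = len(buf)
    # Pass 1: reverse the bytes within each character.
    i = 0
    while i < n:
        j = i + 1
        # Continuation bytes look like 0b10xxxxxx.
        while j < n and (buf[j] & 0xC0) == 0x80:
            j += 1
        buf[i:j] = buf[i:j][::-1]   # at most 4 bytes of temporary storage
        i = j
    # Pass 2: reverse the whole buffer.  Each character's bytes are
    # flipped a second time and so end up in their original order.
    buf.reverse()


text = bytearray("aé漢😀".encode("utf-8"))
reverse_utf8_in_place(text)
```

No reallocation or memmove of the tail is needed, because the buffer
length never changes.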

Note that Python has chosen to use a "just big enough for the data"
fixed-width representation, and AFAIK the Python-licensed code is
GPL-compatible.
http://legacy.python.org/dev/peps/pep-0393/
This strategy has the advantage that manipulating strings internally
is always an array operation, so Python code can be efficient
(enough); you don't need to reimplement such operations as primitives,
and there are no gotchas for user code where the user code looks like
it's operating on an array (efficient) but is actually moving large
chunks of memory around all the time.
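
A toy Python sketch of that "just big enough" idea follows.  It mimics
the width selection described in PEP 393 (1, 2 or 4 bytes per character
depending on the largest code point) but is not CPython's
implementation; pack_fixed_width and char_at are hypothetical names.

```python
def pack_fixed_width(text: str) -> tuple[int, bytes]:
    """Store text with the narrowest fixed width that fits every character."""
    widest = max(map(ord, text), default=0)
    if widest < 0x100:
        width = 1
    elif widest < 0x10000:
        width = 2
    else:
        width = 4
    payload = b"".join(ord(c).to_bytes(width, "little") for c in text)
    return width, payload


def char_at(width: int, payload: bytes, index: int) -> int:
    """Return the code point at index: plain array arithmetic, O(1)."""
    start = index * width
    return int.from_bytes(payload[start:start + width], "little")


width, payload = pack_fixed_width("a😀")
second = char_at(width, payload, 1)
```

The cost is that a single wide character widens the whole string, and
storing a wider character into a narrow string forces a full copy.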




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 18:12         ` Barry Warsaw
  2014-10-29 22:09           ` Lars Magne Ingebrigtsen
@ 2014-10-30  3:32           ` Stephen J. Turnbull
  2014-10-30  4:35             ` Barry Warsaw
  2014-10-30 13:19             ` Stefan Monnier
  1 sibling, 2 replies; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-10-30  3:32 UTC (permalink / raw)
  To: Barry Warsaw; +Cc: emacs-devel

Barry Warsaw writes:

 > Sure, but [changing revision numbers is] rare and only happens with
 > some funky (and I'd argue incorrect directionally) merges.

Actually, it happens in *every* merge -- the local revnos are
obsoleted and turn into something meaningless in many cases (you get
the base, but who cares about that? -- it's whether the commit being
compared to is before or after the *merge* that matters).  And even on
the bzr list you'd occasionally see the "it's in my commit r666042"
"no, it's not, and that's not even your commit" "oops, push push must
push, try now" conversation.  Or worse "sorry, that's in the review
pipeline, wait for it" :-) conversation, but that was project-specific.

True, it only happens in the public repo if people merge public into
local and then push, and you can configure the public repo to refuse
such pushes (at least you can in bzr and hg, I'm not sure if git
supports this well even today).  But seriously, I'm frequently far
enough out of sync (say, 5 to 10 commits :-) that the revnos are just
as useless as SHA1s would be.  The overhead of pulling just to get the
commit under discussion swamps any difference in convenience as far as
I'm concerned.  YMMV, of course, that's just me.

 > And besides, those sequential numbers are only for our convenience,
 > right? ;)

Sure, and they're convenient mostly because you're used to them.  They
really don't have more content than SHA1s do, but they're easier to
read because they're decimal and relatively small.  I'm not going to
deny that, but I think everybody would be better off if some
infrastructure were created to make SHA1s easier to manipulate.

For example, in response to my earlier post, Stefan responded that
SHA1s aren't that easy to recognize and you'll get too many false
positives.  My initial rebuttal was "Eh?!", but a more constructive
response is, so we establish a convention of prefixing with "sha:" or
"SHA:".  The issue of "which repo" (especially for ELPA) is real, too,
but again context is likely to give you a very good first guess, and
after that you can configure some kind of variable to improve Emacs's
guessing.
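
A quick Python sketch of how a "sha:" prefix makes such references easy
to pick out of running text.  The 7-to-40 hex digit range and the
function name are assumptions, not an agreed convention.

```python
import re

# "sha:" or "SHA:" followed by an abbreviated or full hex identifier.
SHA_REF = re.compile(r"\bsha:([0-9a-f]{7,40})\b", re.IGNORECASE)


def find_sha_refs(text: str) -> list[str]:
    """Return the hex identifiers of all prefixed references in text."""
    return SHA_REF.findall(text)


message = (
    "Fixed in sha:1a2b3c4d, not in 118147 or deadbeef; "
    "see SHA:0123456789abcdef0123456789abcdef01234567."
)
refs = find_sha_refs(message)
```

Bare hex words such as "deadbeef" and old decimal revnos are ignored,
which is what removes the false positives.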

This ain't rocket science, and it's the kind of thing Emacsen are
*very* good at.




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  3:32           ` Stephen J. Turnbull
@ 2014-10-30  4:35             ` Barry Warsaw
  2014-10-30  5:24               ` Stephen J. Turnbull
                                 ` (2 more replies)
  2014-10-30 13:19             ` Stefan Monnier
  1 sibling, 3 replies; 137+ messages in thread
From: Barry Warsaw @ 2014-10-30  4:35 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 526 bytes --]

On Oct 30, 2014, at 12:32 PM, Stephen J. Turnbull wrote:

>Sure, and they're convenient mostly because you're used to them.  They
>really don't have more content than SHA1s do, but they're easier to
>read because they're decimal and relatively small.  I'm not going to
>deny that, but I think everybody would be better off if some
>infrastructure were created to make SHA1s easier to manipulate.

That's the point I'm really trying to make; SHAs are simply terrible to
communicate between humans.

Cheers,
-Barry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 14:55           ` Eli Zaretskii
@ 2014-10-30  4:44             ` Richard Stallman
  2014-10-30  8:32               ` Eric S. Raymond
  0 siblings, 1 reply; 137+ messages in thread
From: Richard Stallman @ 2014-10-30  4:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dak, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > We don't have "the official workflow" for git.  The Powers That Be
  > claim that it's unnecessary, and even harmful.

I've never used git, and I doubt I will use it for anything except Emacs.
I hope Emacs will advertise a simple recommended beginner's git workflow.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call.




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  4:35             ` Barry Warsaw
@ 2014-10-30  5:24               ` Stephen J. Turnbull
  2014-10-30 10:17               ` David Kastrup
  2014-10-30 13:42               ` Alex Bennée
  2 siblings, 0 replies; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-10-30  5:24 UTC (permalink / raw)
  To: Barry Warsaw; +Cc: emacs-devel

Barry Warsaw writes:

 > That's the point I'm really trying to make; SHAs are simply terrible to
 > communicate between humans.

OK, opinion noted (but mine differs, at least when revnos are offered
as an improvement).

Still, I believe the point is mostly moot.  That is, I suspect git
isn't going to give you revnos, at least not out of the box, and
longtime git users aren't going to give them to you even if git can.





^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-29 23:31               ` Paul Eggert
  2014-10-30  0:01                 ` Nic Ferrier
  2014-10-30  1:53                 ` Stefan Monnier
@ 2014-10-30  6:46                 ` Jan Djärv
  2014-10-30  7:36                   ` Ivan Shmakov
  2 siblings, 1 reply; 137+ messages in thread
From: Jan Djärv @ 2014-10-30  6:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: esr@thyrsus.com, emacs-devel@gnu.org

+1.  ChangeLogs require work for no real benefit compared to commit logs. 

       Jan D. 


> 30 okt 2014 kl. 00:31 skrev Paul Eggert <eggert@cs.ucla.edu>:
> 
> Eric S. Raymond wrote:
>> I admit nothing, I deny nothing.
> 
> Gee, I'm older than you are, and I wish we had switched to Git years ago.
> 
> Emacs should switch to automatically-generated ChangeLogs, too.  This would save us time and effort.  Several other GNU projects have switched already, and in practice it's a win.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30  6:46                 ` Jan Djärv
@ 2014-10-30  7:36                   ` Ivan Shmakov
  2014-10-30  8:09                     ` Jan Djärv
                                       ` (2 more replies)
  0 siblings, 3 replies; 137+ messages in thread
From: Ivan Shmakov @ 2014-10-30  7:36 UTC (permalink / raw)
  To: emacs-devel

>>>>> Jan Djärv <jan.h.d@swipnet.se> writes:
>>>>> 30 okt 2014 kl. 00:31 skrev Paul Eggert <eggert@cs.ucla.edu>:

 >> Emacs should switch to automatically-generated ChangeLogs, too.
 >> This would save us time and effort.  Several other GNU projects have
 >> switched already, and in practice it's a win.

 > +1.  ChangeLogs require work for no real benefit compared to commit
 > logs.

	… Except that it’s virtually impossible to fix a typo in a Git
	commit message, while it’s easy to fix one in a ChangeLog.

-- 
FSF associate member #7257  http://boycottsystemd.org/  … 3013 B6A0 230E 334A



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30  7:36                   ` Ivan Shmakov
@ 2014-10-30  8:09                     ` Jan Djärv
  2014-10-30  8:31                     ` Eric S. Raymond
  2014-10-30 15:52                     ` Eli Zaretskii
  2 siblings, 0 replies; 137+ messages in thread
From: Jan Djärv @ 2014-10-30  8:09 UTC (permalink / raw)
  To: Ivan Shmakov; +Cc: emacs-devel@gnu.org

Hi. 


30 okt 2014 kl. 08:36 skrev Ivan Shmakov <ivan@siamics.net>:

>>>>>> Jan Djärv <jan.h.d@swipnet.se> writes:
>>>>>> 30 okt 2014 kl. 00:31 skrev Paul Eggert <eggert@cs.ucla.edu>:
> 
>>> Emacs should switch to automatically-generated ChangeLogs, too.
>>> This would save us time and effort.  Several other GNU projects have
>>> switched already, and in practice it's a win.
> 
>> +1.  ChangeLogs require work for no real benefit compared to commit
>> logs.
> 
>    … Except that it’s virtually impossible to fix a typo in a Git
>    commit message, while it’s easy to fix one in a ChangeLog.

Correct, but still not worth it IMHO. 

      Jan D. 


^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30  7:36                   ` Ivan Shmakov
  2014-10-30  8:09                     ` Jan Djärv
@ 2014-10-30  8:31                     ` Eric S. Raymond
  2014-10-30  9:53                       ` Andreas Schwab
                                         ` (3 more replies)
  2014-10-30 15:52                     ` Eli Zaretskii
  2 siblings, 4 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-30  8:31 UTC (permalink / raw)
  To: Ivan Shmakov; +Cc: emacs-devel

Ivan Shmakov <ivan@siamics.net>:
> 	… Except that it’s virtually impossible to fix a typo in a Git
> 	commit message, while it’s easy to fix one in a ChangeLog.

I have a little script called "editcomment" that does this.  You can
only use it before the commit has been pushed, though.

The tension here is fundamental. You can have easy typo fixes, or you
can have a record that is both reliable and shared, but not both. I
think the latter is more important than the former.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  4:44             ` Richard Stallman
@ 2014-10-30  8:32               ` Eric S. Raymond
  2014-10-30 10:25                 ` David Kastrup
                                   ` (2 more replies)
  0 siblings, 3 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-30  8:32 UTC (permalink / raw)
  To: Richard Stallman; +Cc: Eli Zaretskii, dak, emacs-devel

Richard Stallman <rms@gnu.org>:
> I've never used git, and I doubt I will use it for anything except Emacs.
> I hope Emacs will advertise a simple recommended beginner's git workflow.

The wiki describes it in detail.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30  8:31                     ` Eric S. Raymond
@ 2014-10-30  9:53                       ` Andreas Schwab
  2014-10-30 10:13                         ` Eric S. Raymond
  2014-10-30 10:12                       ` David Kastrup
                                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 137+ messages in thread
From: Andreas Schwab @ 2014-10-30  9:53 UTC (permalink / raw)
  To: esr; +Cc: Ivan Shmakov, emacs-devel

"Eric S. Raymond" <esr@thyrsus.com> writes:

> Ivan Shmakov <ivan@siamics.net>:
>> 	… Except that it’s virtually impossible to fix a typo in a Git
>> 	commit message, while it’s easy to fix one in a ChangeLog.
>
> I have a little script called "editcomment" that does this.

How is that different from git commit --amend?

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  2:25                     ` Glenn Morris
@ 2014-10-30 10:10                       ` David Kastrup
  2014-10-30 13:03                         ` Stefan Monnier
  0 siblings, 1 reply; 137+ messages in thread
From: David Kastrup @ 2014-10-30 10:10 UTC (permalink / raw)
  To: emacs-devel

Glenn Morris <rgm@gnu.org> writes:

> "Eric S. Raymond" wrote:
>
>> Stefan Monnier <monnier@iro.umontreal.ca>:
>>> > Emacs should switch to automatically-generated ChangeLogs, too.
>>> 
>>> No disagreement from me either,
>>
>> You surprise me, in a good way. 
>
> Stefan's views on this are not a surprise to regular readers of emacs-devel.
>
> http://lists.gnu.org/archive/html/emacs-devel/2013-03/msg00978.html
>
> and several other instances. This is not a new discussion.

It's easier to call everybody a mossback rather than getting oneself up
to speed regarding the fine details of the discussion.  Exasperation at
not having used the features of Git yet seems a bit premature at a time
we have not even switched to Git.  But then who am I old mossback to
argue with an impetuous spriteful youth?

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30  8:31                     ` Eric S. Raymond
  2014-10-30  9:53                       ` Andreas Schwab
@ 2014-10-30 10:12                       ` David Kastrup
  2014-10-30 13:29                       ` Stefan Monnier
  2014-10-30 14:20                       ` Barry Warsaw
  3 siblings, 0 replies; 137+ messages in thread
From: David Kastrup @ 2014-10-30 10:12 UTC (permalink / raw)
  To: emacs-devel

"Eric S. Raymond" <esr@thyrsus.com> writes:

> Ivan Shmakov <ivan@siamics.net>:
>> 	… Except that it’s virtually impossible to fix a typo in a Git
>> 	commit message, while it’s easy to fix one in a ChangeLog.
>
> I have a little script called "editcomment" that does this.

Seriously?  git rebase -i works just fine for that kind of thing.

> You can only use it before the commit has been pushed, though.

Obviously.  Except when you do, but then you are not going to make
people overly happy.

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30  9:53                       ` Andreas Schwab
@ 2014-10-30 10:13                         ` Eric S. Raymond
  2014-10-30 10:32                           ` Andreas Schwab
  0 siblings, 1 reply; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-30 10:13 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Ivan Shmakov, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 356 bytes --]

Andreas Schwab <schwab@suse.de>:
> > I have a little script called "editcomment" that does this.
> 
> How is that different from git commit --amend?

It works on any revision, not just the current branch tip.  But it 
invalidates all the downstream hashes.  Copy enclosed for your inspection.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

[-- Attachment #2: editcomment --]
[-- Type: text/plain, Size: 1559 bytes --]

#!/bin/sh
# Give this a commit-ID specification.  It will edit the associated comment.
# Usual caveats apply; the edited one and all commits after will change IDs,
# and pushing them to a repo with the old commits will wreak havoc.
# Note also that this cavalierly overwrites refs/original.
#
# This script by Eric S. Raymond, March 2010, all rites perverted. It's based 
# on an idea by thiago from #git, but debugged and with a safety check.
# It contains porcelain and porcelain byproducts.

topdir=`git rev-parse --show-cdup`
test -n "$topdir" && cd "$topdir"

my_commit=`git rev-parse $1` || exit $?

# Run a safety check before edits that could hose remotes.
if test -n "`git branch -r --contains $my_commit`"
then
    echo -n "Commit has been pushed.  Really edit? "
    read yn
    if test "$yn" != 'y'
    then
	exit 0
    fi
fi

my_file=COMMIT_EDITMSG
test -d .git && my_file=.git/COMMIT_EDITMSG

# This effort to invoke the user's normal editor fails.
# the problem is that anything the editor writes to stdout on the
# controlling terminal becomes part of the commit message.  So
# the editor needs to actually run inside another window.
#test -z "$GIT_EDITOR" && GIT_EDITOR=$EDITOR
#test -z "$GIT_EDITOR" && GIT_EDITOR=vi
#my_editor=$GIT_EDITOR

# xterm -e vi should also work.
my_editor=emacsclient

export my_file my_commit my_editor

exec git filter-branch -f --tag-name-filter cat --msg-filter '
if test "$GIT_COMMIT" = "$my_commit"; then
    cat > $my_file;
    $my_editor $my_file >/dev/null;
    cat $my_file
else
    cat
fi' "$1~.."

# End

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  4:35             ` Barry Warsaw
  2014-10-30  5:24               ` Stephen J. Turnbull
@ 2014-10-30 10:17               ` David Kastrup
  2014-10-30 13:42               ` Alex Bennée
  2 siblings, 0 replies; 137+ messages in thread
From: David Kastrup @ 2014-10-30 10:17 UTC (permalink / raw)
  To: emacs-devel

Barry Warsaw <barry@python.org> writes:

> On Oct 30, 2014, at 12:32 PM, Stephen J. Turnbull wrote:
>
>>Sure, and they're convenient mostly because you're used to them.  They
>>really don't have more content than SHA1s do, but they're easier to
>>read because they're decimal and relatively small.  I'm not going to
>>deny that, but I think everybody would be better off if some
>>infrastructure were created to make SHA1s easier to manipulate.
>
> That's the point I'm really trying to make; SHAs are simply terrible to
> communicate between humans.

They are not intended for communication between humans.  Neither are
revision numbers.  The moment you recognize a revision number without
even referring to a computer is when that revision number has become
infamous.  And in that case, you are at least equally likely to
recognize the first digits of its SHA1 since it differs much more from
those of the neighboring commits than a revision number would.

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  8:32               ` Eric S. Raymond
@ 2014-10-30 10:25                 ` David Kastrup
  2014-10-30 11:51                   ` Eric S. Raymond
  2014-10-30 15:53                 ` Eli Zaretskii
  2014-10-31  7:47                 ` Richard Stallman
  2 siblings, 1 reply; 137+ messages in thread
From: David Kastrup @ 2014-10-30 10:25 UTC (permalink / raw)
  To: emacs-devel

"Eric S. Raymond" <esr@thyrsus.com> writes:

> Richard Stallman <rms@gnu.org>:
>> I've never used git, and I doubt I will use it for anything except Emacs.
>> I hope Emacs will advertise a simple recommended beginner's git workflow.
>
> The wiki describes it in detail.

I was of the opinion that the idea of a wiki caught on at some point of
time, so I am surprised to hear that there is still only one.

At any rate, that's not an on-site resource for somebody having
installed a full copy of Emacs.  One of the main points of a distributed
version control system is the ability to work offline.

"There is some instruction somewhere on the Internet" is not a
sufficient excuse for leaving Emacs without any resources pointing out
the desired workflow for Emacs developers.

If "the wiki" is an authoritative source for the workflow of Emacs
developers, it means that anybody who wants to can tell the Emacs
developers how to do their work.

Since not everybody agrees with the ways mossbacks keep track of the
copyright status of contributions (for example), this can easily lead to
such instructions getting out of synch with the legal requirements put
forth by the FSF and/or agreed upon by the Emacs developers.

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30 10:13                         ` Eric S. Raymond
@ 2014-10-30 10:32                           ` Andreas Schwab
  2014-10-30 11:13                             ` Nicolas Richard
  0 siblings, 1 reply; 137+ messages in thread
From: Andreas Schwab @ 2014-10-30 10:32 UTC (permalink / raw)
  To: esr; +Cc: Ivan Shmakov, emacs-devel

"Eric S. Raymond" <esr@thyrsus.com> writes:

> Andreas Schwab <schwab@suse.de>:
>> > I have a little script called "editcomment" that does this.
>> 
>> How is that different from git commit --amend?
>
> It works on any revision, not just the current branch tip.

So it's just a git rebase -i.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30 10:32                           ` Andreas Schwab
@ 2014-10-30 11:13                             ` Nicolas Richard
  0 siblings, 0 replies; 137+ messages in thread
From: Nicolas Richard @ 2014-10-30 11:13 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: esr, Ivan Shmakov, emacs-devel

Andreas Schwab <schwab@suse.de> writes:
> "Eric S. Raymond" <esr@thyrsus.com> writes:
>> Andreas Schwab <schwab@suse.de>:
>>> > I have a little script called "editcomment" that does this.
>>> 
>>> How is that different from git commit --amend?
>>
>> It works on any revision, not just the current branch tip.
>
> So it's just a git rebase -i.

Perhaps with --preserve-merges?

-- 
Nicolas Richard



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 10:25                 ` David Kastrup
@ 2014-10-30 11:51                   ` Eric S. Raymond
  2014-10-30 12:14                     ` David Kastrup
  0 siblings, 1 reply; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-30 11:51 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

David Kastrup <dak@gnu.org>:
> At any rate, that's not an on-site resource for somebody having
> installed a full copy of Emacs.  One of the main points of a distributed
> version control system is the ability to work offline.
> 
> "There is some instruction somewhere on the Internet" is not a
> sufficient excuse for leaving Emacs without any resources pointing out
> the desired workflow for Emacs developers.
> 
> If "the wiki" is an authoritative source for the workflow of Emacs
> developers, it means that anybody who wants to can tell the Emacs
> developers how to do their work.

You may be right. But this is not a git transition issue.

There is some in-tree workflow documentation in admin/notes/repo.  It
isn't very good.  My transition patch fixes it to refer to git commands
rather than bzr ones (and deletes some sections about things like
loggerhead that will no longer be relevant).  

It still isn't very good, but reworking it is out of scope for what
I'm trying to get done before conversion day.  After that, we can have
a *separate* conversation about the proper role of the in-tree notes
vs. the wiki, whether access to the wiki should be restricted in some
way, etc.

I have no particular opinions about those matters.  I do have both the
ability and the willingness to write good documentation, so I expect
I will end up doing a lot of the writing if we decide to reorganize.  

What I am not willing to do is dive into that matter *right now*.
The transition job is huge and it's not done yet. (Pull the repo
containing my transition machinery sometime and browse it.  You
should find the experience ... enlightening.)

It's all to the good that the transition is causing people to pay
attention to back issues that they normally ignore, but when these
come up please try to queue them for later rather than talking as
if the solution is a requirement for the transition itself.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 11:51                   ` Eric S. Raymond
@ 2014-10-30 12:14                     ` David Kastrup
  2014-10-30 15:01                       ` Eric S. Raymond
  0 siblings, 1 reply; 137+ messages in thread
From: David Kastrup @ 2014-10-30 12:14 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: emacs-devel

"Eric S. Raymond" <esr@thyrsus.com> writes:

> David Kastrup <dak@gnu.org>:
>> At any rate, that's not an on-site resource for somebody having
>> installed a full copy of Emacs.  One of the main points of a distributed
>> version control system is the ability to work offline.
>> 
>> "There is some instruction somewhere on the Internet" is not a
>> sufficient excuse for leaving Emacs without any resources pointing out
>> the desired workflow for Emacs developers.
>> 
>> If "the wiki" is an authoritative source for the workflow of Emacs
>> developers, it means that anybody who wants to can tell the Emacs
>> developers how to do their work.
>
> You may be right. But this is not a git transition issue.

[...]

Let me summarize this discussion:

Richard: I'd like to see problem X solved eventually
Eric: problem X is solved already
David: no, it isn't
Eric: well, but it can be solved eventually rather than now

To me it looks like you are prone to spend a lot of time and energy
chasing off straw men.  Part of that might come from considering
yourself to be the only person with any amount of intelligence around.
But that's not an actually workable hypothesis: if you were the only
person with any intelligence around, you'd not have sufficient time for
solving all of the world's problems anyway.

-- 
David Kastrup



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  2:10                   ` Eric S. Raymond
  2014-10-30  2:13                     ` Paul Eggert
  2014-10-30  2:25                     ` Glenn Morris
@ 2014-10-30 13:02                     ` Stefan Monnier
  2014-10-30 15:12                       ` Eric S. Raymond
  2 siblings, 1 reply; 137+ messages in thread
From: Stefan Monnier @ 2014-10-30 13:02 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: Paul Eggert, emacs-devel

>> > Emacs should switch to automatically-generated ChangeLogs, too.
>> No disagreement from me either,
> You surprise me, in a good way.

That doesn't surprise me.  Your write up showed you completely
misunderstood what I was saying.


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 10:10                       ` David Kastrup
@ 2014-10-30 13:03                         ` Stefan Monnier
  2014-10-30 13:40                           ` David Kastrup
  0 siblings, 1 reply; 137+ messages in thread
From: Stefan Monnier @ 2014-10-30 13:03 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

> to speed regarding the fine details of the discussion.  Exasperation at
> not having used the features of Git yet seems a bit premature at a time
> we have not even switched to Git.

FWIW, this has nothing to do with Git, since the exact same thing could
be done with Bzr.


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  3:32           ` Stephen J. Turnbull
  2014-10-30  4:35             ` Barry Warsaw
@ 2014-10-30 13:19             ` Stefan Monnier
  2014-10-31  6:36               ` Stephen J. Turnbull
  2014-10-31 19:42               ` David Kastrup
  1 sibling, 2 replies; 137+ messages in thread
From: Stefan Monnier @ 2014-10-30 13:19 UTC (permalink / raw)
  To: Stephen J. Turnbull; +Cc: Barry Warsaw, emacs-devel

> as useless as SHA1s would be.  The overhead of pulling just to get the
> commit under discussion swamps any difference in convenience as far as
> I'm concerned.  YMMV, of course, that's just me.

I largely agree, and for that reason I basically never use revnos in
discussions.  I find dates to work much better: humans can relate to
them very well, and since you'll have to add your own contextual info
(which branch is under discussion) and then go through a tool if you
need/want to get more details, you might as well use dates.
FWIW, I find this same argument makes me prefer dates over revids.

> For example, in response to my earlier post, Stefan responded that
> SHA1s aren't that easy to recognize and you'll get too many false
> positives.  My initial rebuttal was "Eh?!", but a more constructive
> response is, so we establish a convention of prefixing with "sha:" or
> "SHA:".

I'd rather go with "git:", but yes that's also the first obvious answer
that came to my mind.
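
A prefix convention like this is easy to act on mechanically.  As a
minimal sketch, assuming references are written as "git:" followed by
7 to 40 lowercase hex digits (that exact format is an assumption, the
thread has not settled one), a small filter can pull the revision ids
out of a message.  The helper name extract_git_refs and the sample
sentence are hypothetical, and grep -o is a common extension rather
than strict POSIX.

```sh
#!/bin/sh
# extract_git_refs is a hypothetical helper, not part of git.
# It reads text on standard input and prints every revision id that
# was written as "git:<hex>" (7 to 40 lowercase hex digits), one per
# line, with the "git:" prefix stripped.
extract_git_refs() {
    grep -oE 'git:[0-9a-f]{7,40}' | sed 's/^git://'
}

# Sample input; the sentence is made up for illustration.
printf '%s\n' 'That was fixed in git:a3d0e80, have you updated?' |
    extract_git_refs
```

Each printed id could then be handed to "git show" in whichever
repository the reader guesses from context.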

> The issue of "which repo" (especially for ELPA) is real, too, but
> again context is likely to give you a very good first guess, and after
> that you can configure some kind of variable to improve
> Emacs's guessing.

I guess that could work, but it requires a bit more setup than I like.

Especially since Git doesn't support shared repositories very well, so
I'll probably have to live with multiple Git repositories (compared to
the single Bzr repository I use now shared between 4-6 lightweight
checkouts).  This in turn means that the "auto-handle git revids" tool
would have to try all of those repositories.

And if I don't place the repositories for the various projects where
I care about revids at the same place on every host, then sharing
that config-setup work between those various hosts doesn't work as well.

> This ain't rocket science, and it's the kind of thing Emacsen are
> *very* good at.

Kind of.  The thing is, in 99% of the cases, the main thing I want to
know from a "revision identifier" is the rough age of that revision
(this is useful info because if it's very recent, then I can probably
guess exactly which commit this is referring to).  For that "date" is
the best description there is.


        Stefan




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30  8:31                     ` Eric S. Raymond
  2014-10-30  9:53                       ` Andreas Schwab
  2014-10-30 10:12                       ` David Kastrup
@ 2014-10-30 13:29                       ` Stefan Monnier
  2014-10-30 15:33                         ` DVCS design philosophy Eric S. Raymond
  2014-10-31 20:18                         ` Referring to revisions in the git future Nicolas Richard
  2014-10-30 14:20                       ` Barry Warsaw
  3 siblings, 2 replies; 137+ messages in thread
From: Stefan Monnier @ 2014-10-30 13:29 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: Ivan Shmakov, emacs-devel

> The tension here is fundamental.  You can have easy typo fixes, or you
> can have a record that is both reliable and shared, but not both.
> I think the latter is more important than the former.

I think you've just been brain-washed by the Bzr/Git/Hg crowd.
There's no reason why the commit message would need to be considered as
being part of the "immutable history".  IOW there's no technical reason
to include the commit message in the Git hash.
[ For that same reason, I think a DVCS like Git should not include the
  parents in the computation of the hash either, so you can later on
  change the history graph (which might not be a DAG any more).  ]

Furthermore, even now that those tools have already decided to keep the
commit message in the immutable history, we can still change the way
"git log" displays the commit message, by including "git-log-patches" in
subsequent commits.  I haven't seen such a tool for Git yet, sadly.


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 13:03                         ` Stefan Monnier
@ 2014-10-30 13:40                           ` David Kastrup
  2014-10-30 14:00                             ` Stefan Monnier
  0 siblings, 1 reply; 137+ messages in thread
From: David Kastrup @ 2014-10-30 13:40 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> to speed regarding the fine details of the discussion.  Exasperation at
>> not having used the features of Git yet seems a bit premature at a time
>> we have not even switched to Git.
>
> FWIW, this has nothing to do with Git, since the exact same thing could
> be done with Bzr.

Needing different tools and workflows, and Bzr was pretty much slated
for replacement when the discussion came up last time.  Regarding
readily available scripts, there was mention of a Git-based tool.

Deferring the ChangeLog maintenance to the version control system
requires version control to keep separate track of author and committer.
CVS doesn't.  Bzr does, Git does.  Being reasonably sure that any system
we might be using in the near future will do so as well is advisable for
switching, though of course once a ChangeLog can be generated in an
automated way, one could do so once before switching to a system
requiring manual ChangeLog maintenance.

At any rate: I certainly hope that C-x 4 a will get a _builtin_
equivalent in VC for working on commit messages.

When committing to GUILE (which wants ChangeLog style entries in its
commit messages), I currently do C-x 4 a at each source change, copy and
paste the relevant part from the resulting modified ChangeLog buffer
into the commit message, unindent it, undo the changes in the ChangeLog
buffer and kill it again, and then commit.

There must be something better than that.

-- 
David Kastrup



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  4:35             ` Barry Warsaw
  2014-10-30  5:24               ` Stephen J. Turnbull
  2014-10-30 10:17               ` David Kastrup
@ 2014-10-30 13:42               ` Alex Bennée
  2 siblings, 0 replies; 137+ messages in thread
From: Alex Bennée @ 2014-10-30 13:42 UTC (permalink / raw)
  To: Barry Warsaw; +Cc: emacs-devel


Barry Warsaw <barry@python.org> writes:

> On Oct 30, 2014, at 12:32 PM, Stephen J. Turnbull wrote:
>
>>Sure, and they're convenient mostly because you're used to them.  They
>>really don't have more content than SHA1s do, but they're easier to
>>read because they're decimal and relatively small.  I'm not going to
>>deny that, but I think everybody would be better off if some
>>infrastructure were created to make SHA1s easier to manipulate.
>
> That's the point I'm really trying to make; SHAs are simply terrible to
> communicate between humans.

git does provide the describe option:

3:39 alex@zen/x86_64 [emacs.git/emacs-24@origin] >git describe --tags
emacs-24.4-40-ga3d0e80

This tells me my current head is 40 commits above the emacs-24.4 tag
with a shortened commit-ish for unambiguous identity.
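
The three parts of that string can be split apart with plain shell
parameter expansion.  This is only a sketch: parse_describe and the
describe_* variables are hypothetical names, and it assumes the
TAG-COUNT-gHASH form shown above (when HEAD sits exactly on a tag,
"git describe" prints the bare tag unless --long is given).

```sh
#!/bin/sh
# parse_describe is a hypothetical helper, not a git command.
# It splits a "git describe" string of the form TAG-COUNT-gHASH into
# three shell variables.  The tag itself may contain hyphens, so the
# string is taken apart from the right-hand end.
parse_describe() {
    desc=$1
    hash_part=${desc##*-}          # last field, e.g. "ga3d0e80"
    rest=${desc%-*}                # everything before it
    describe_count=${rest##*-}     # commits since the tag
    describe_tag=${rest%-*}        # the tag name
    describe_hash=${hash_part#g}   # drop the leading "g"
}

parse_describe "emacs-24.4-40-ga3d0e80"
echo "tag=$describe_tag commits_since_tag=$describe_count hash=$describe_hash"
```

The abbreviated hash is what identifies the commit; the tag and count
are there for human orientation.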

>
> Cheers,
> -Barry

-- 
Alex Bennée



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 13:40                           ` David Kastrup
@ 2014-10-30 14:00                             ` Stefan Monnier
  0 siblings, 0 replies; 137+ messages in thread
From: Stefan Monnier @ 2014-10-30 14:00 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

> At any rate: I certainly hope that C-x 4 a will get a _builtin_
> equivalent in VC for working on commit messages.

I'm using a local hack that does something like that (used in GNU ELPA
where we don't use a ChangeLog file).  I don't like it very much, tho,
because you don't get to work on it "slowly" (i.e. you can't really
save it).

> When committing to GUILE (which wants ChangeLog style entries in its
> commit messages), I currently do C-x 4 a at each source change, copy and
> paste the relevant part from the resulting modified ChangeLog buffer
> into the commit message, unindent it, undo the changes in the ChangeLog
> buffer and kill it again, and then commit.
> There must be something better than that.

The thing I like better is:
- C-x 4 a as usual into the usual ChangeLog file.
- Let C-c C-a copy the ChangeLog contents back into the *VC-Log* buffer.
Just don't include the ChangeLog file in the VCS, so it's not committed,
not merged, not nothing: it's only used as a scratchpad to write your
commit message at your own pace.

But, I do wonder what people use to temporarily store the commit message
while they're working on it.  For GNU Arch, there was a standard file
name for that (which "tla commit" would then read).


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-29 16:19               ` Eli Zaretskii
@ 2014-10-30 14:13                 ` Camm Maguire
  2014-10-30 16:06                   ` Eli Zaretskii
  0 siblings, 1 reply; 137+ messages in thread
From: Camm Maguire @ 2014-10-30 14:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gcl-devel, emacs-devel

Greetings!

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Camm Maguire <camm@maguirefamily.org>
>> Cc: emacs-devel@gnu.org,  gcl-devel@gnu.org
>> Date: Wed, 29 Oct 2014 11:55:13 -0400
>>
>> Does every string access in emacs proceed through the utf8 decoder?
>
> If you need to look at the character, yes.  E.g., if you need some
> property of the character, you need to index the appropriate table by
> that character's codepoint.  But in most operations that is not
> needed.  You just need to recognize several specific characters, like
> the null character, the slash, etc., most of which are ASCII.
>

Do you allocate a fresh boxed character on each aref, or output an
integer referring to a fixed ~2^22 sized table?  Do you maintain such a
table in core?

>> >> A cached internal pointer storing the last referenced codepoint
>> >> offset makes access essentially O(1).
>> >
>> > We indeed maintain a cache for byte-to-character and character-to-byte
>> > conversions.
>> 
>> How big is this cache?
>
> Its size is dynamic, and depends on how frequently the conversion is
> needed in places that are far away.  The cache stores byte-to-char
> correspondence in places that are far away, and Emacs uses binary
> search in between them.
>

How far is 'far away'?

If you had this to do all over again, would you still opt for the
multibyte? 

While you have buffers to consider too, which probably relate to
strings, it seems to me that the dominant costs are always memory
allocation/gc related, making the memory footprint important but not at
the expense of allocating characters, and that the most frequent
operations are removals/pattern substitutions, which can proceed
bytewise with the same gc overhead.

GCL also supports regular expressions -- how is this modified for utf-8?

Take care,
-- 
Camm Maguire			     		    camm@maguirefamily.org
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-29 15:56           ` Raymond Toy
@ 2014-10-30 14:16             ` Camm Maguire
  2014-10-31 18:47               ` Sam Steingold
  2014-10-31 19:52               ` [Gcl-devel] " Stefan Monnier
  0 siblings, 2 replies; 137+ messages in thread
From: Camm Maguire @ 2014-10-30 14:16 UTC (permalink / raw)
  To: Raymond Toy; +Cc: gcl-devel, emacs-devel

Greetings!  Don't worry -- I'm not committed to this idea yet, just
exploring!

Do these other lisps allocate a fresh character on each aref?  Do they
maintain some ~2^21 sized table in core?  (And isn't emacs a "lisp"
:-)).

Take care,

Raymond Toy <toy.raymond@gmail.com> writes:

>>>>>> "Camm" == Camm Maguire <camm@maguirefamily.org> writes:
>
>     Camm> Greetings!  I've recently been considering supporting unicode in gcl by
>     Camm> representing strings internally in utf8.  It appears that emacs does the
>     Camm> same or similar.  Apart from the obvious memory footprint benefits, I'd
>     Camm> like to ask what other advantages/disadvantages have been discovered.
>     Camm> Much of the utf8 literature emphasizes that most algorithms can proceed
>     Camm> conventionally in byte-wise fashion, including lexicographical ordering
>     Camm> comparisons, given that almost all jobs are sequential, at least
>     Camm> initially.  A cached internal pointer storing the last referenced
>     Camm> codepoint offset makes access essentially O(1).  Yet setting string
>     Camm> elements can trigger reallocations/memmove operations.  While these can
>     Camm> be aggregated over the setting of multiple elements, operations like
>     Camm> nreverse look ridiculous if left in terms of calls to aref and aset.
>
>     Camm> Thoughts, advice and experiences most appreciated.
>
> Have you looked at what other Lisp implementations do? AFAIK, none use
> utf-8. CCL and clisp use utf-32, cmucl and allegro use utf-16, sbcl
> and ecl(?) have two string types: 8-bit base-string and 32-bit
> strings.
>
> As a one-man operation (unfortunately), I'd go with the easiest one to
> get right and follow either ccl or cmucl.  The rest of the support for
> unicode can be added with libraries like cl-unicode and/or babel, if
> need be.
>
> --
> Ray
>
>
> _______________________________________________
> Gcl-devel mailing list
> Gcl-devel@gnu.org
> https://lists.gnu.org/mailman/listinfo/gcl-devel
>
>
>
>

-- 
Camm Maguire			     		    camm@maguirefamily.org
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30  8:31                     ` Eric S. Raymond
                                         ` (2 preceding siblings ...)
  2014-10-30 13:29                       ` Stefan Monnier
@ 2014-10-30 14:20                       ` Barry Warsaw
  2014-11-01  1:23                         ` Stephen J. Turnbull
  3 siblings, 1 reply; 137+ messages in thread
From: Barry Warsaw @ 2014-10-30 14:20 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 694 bytes --]

On Oct 30, 2014, at 04:31 AM, Eric S. Raymond wrote:

>The tension here is fundamental. You can have easy typo fixes, or you
>can have a record that is both reliable and shared, but not both. I
>think the latter is more important than the former.

It also requires more care in writing commit messages, since often (although I
can't speak for Emacs) commit messages and changelogs are intended for
different audiences.  When committing changes in Mailman for example, I'll
include a lot more detail about a change so that future developers have a good
history spelunking experience, but a changelog/NEWS file will give only the
higher level user visible changes.

Cheers,
-Barry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 12:14                     ` David Kastrup
@ 2014-10-30 15:01                       ` Eric S. Raymond
  0 siblings, 0 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-30 15:01 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

David Kastrup <dak@gnu.org>:
> Let me summarize this discussion:

I am trying to solve problems, not trade insults.  So I am going to ignore
responses like these. When you have something to say that advances
*solving the problems*, I will nevertheless pay careful attention.
--
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 13:02                     ` Stefan Monnier
@ 2014-10-30 15:12                       ` Eric S. Raymond
  2014-10-30 16:49                         ` Stefan Monnier
  0 siblings, 1 reply; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-30 15:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Paul Eggert, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca>:
> > You surprise me, in a good way.
> 
> That doesn't surprise me.  Your write up showed you completely
> misunderstood what I was saying.

I apologise for misunderstanding you.

While acknowledging that the fault is mine, might I ask you to try to be
a bit less terse and elliptical?  It would probably help.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* DVCS design philosophy
  2014-10-30 13:29                       ` Stefan Monnier
@ 2014-10-30 15:33                         ` Eric S. Raymond
  2014-10-30 16:59                           ` Stefan Monnier
  2014-10-31 20:18                         ` Referring to revisions in the git future Nicolas Richard
  1 sibling, 1 reply; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-30 15:33 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Ivan Shmakov, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca>:
> > The tension here is fundamental.  You can have easy typo fixes, or you
> > can have a record that is both reliable and shared, but not both.
> > I think the latter is more important than the former.
> 
> I think you've just been brain-washed by the Bzr/Git/Hg crowd.
> There's no reason why the commit message would need to be considered as
> being part of the "immutable history".  IOW there's no technical reason
> to include the commit message in the Git hash.
> [ For that same reason, I think a DVCS like Git should not include the
>   parents in the computation of the hash either, so you can later on
>   change the history graph (which might not be a DAG any more).  ]

You raise two interesting points.

I will amend my statement.  *If* you think the expression of programmer's
intent should be part of a reliable shared record, *then* you can't have
easy typo fixes. 

I sometimes regret this, as I am quite prone to typos.  But I don't think
the Bzr/Git/Hg choice to make programmer's intent part of that record
is unreasonable, either.  One property it does guarantee is non-repudiability.  
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30  7:36                   ` Ivan Shmakov
  2014-10-30  8:09                     ` Jan Djärv
  2014-10-30  8:31                     ` Eric S. Raymond
@ 2014-10-30 15:52                     ` Eli Zaretskii
  2 siblings, 0 replies; 137+ messages in thread
From: Eli Zaretskii @ 2014-10-30 15:52 UTC (permalink / raw)
  To: Ivan Shmakov; +Cc: emacs-devel

> From: Ivan Shmakov <ivan@siamics.net>
> Date: Thu, 30 Oct 2014 07:36:29 +0000
> 
> 	… Except that it’s virtually impossible to fix a typo in a Git
> 	commit message, while it’s easy to fix one in a ChangeLog.

The Gnulib gitlog-to-changelog script has a facility to fix typos when
generating ChangeLog files from the commit log messages.




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  8:32               ` Eric S. Raymond
  2014-10-30 10:25                 ` David Kastrup
@ 2014-10-30 15:53                 ` Eli Zaretskii
  2014-10-30 15:56                   ` Eric S. Raymond
  2014-10-31  7:47                 ` Richard Stallman
  2 siblings, 1 reply; 137+ messages in thread
From: Eli Zaretskii @ 2014-10-30 15:53 UTC (permalink / raw)
  To: esr; +Cc: dak, rms, emacs-devel

> Date: Thu, 30 Oct 2014 04:32:58 -0400
> From: "Eric S. Raymond" <esr@thyrsus.com>
> Cc: Eli Zaretskii <eliz@gnu.org>, dak@gnu.org, emacs-devel@gnu.org
> 
> Richard Stallman <rms@gnu.org>:
> > I've never used git, and I doubt I will use it for anything except Emacs.
> > I hope Emacs will advertise a simple recommended beginner's git workflow.
> 
> The wiki describes it in detail.

Yes, it does.  And no, that's not the "workflow" I alluded to.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 15:53                 ` Eli Zaretskii
@ 2014-10-30 15:56                   ` Eric S. Raymond
  2014-10-30 16:44                     ` Eli Zaretskii
  0 siblings, 1 reply; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-30 15:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dak, rms, emacs-devel

Eli Zaretskii <eliz@gnu.org>:
> > Date: Thu, 30 Oct 2014 04:32:58 -0400
> > From: "Eric S. Raymond" <esr@thyrsus.com>
> > Cc: Eli Zaretskii <eliz@gnu.org>, dak@gnu.org, emacs-devel@gnu.org
> > 
> > Richard Stallman <rms@gnu.org>:
> > > I've never used git, and I doubt I will use it for anything except Emacs.
> > > I hope Emacs will advertise a simple recommended beginner's git workflow.
> > 
> > The wiki describes it in detail.
> 
> Yes, it does.  And no, that's not the "workflow" I alluded to.

Can you be more specific about what problem needs solving, then?
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-30 14:13                 ` Camm Maguire
@ 2014-10-30 16:06                   ` Eli Zaretskii
  2014-10-30 16:27                     ` Camm Maguire
  0 siblings, 1 reply; 137+ messages in thread
From: Eli Zaretskii @ 2014-10-30 16:06 UTC (permalink / raw)
  To: Camm Maguire; +Cc: gcl-devel, emacs-devel

> From: Camm Maguire <camm@maguirefamily.org>
> Cc: emacs-devel@gnu.org,  gcl-devel@gnu.org
> Date: Thu, 30 Oct 2014 10:13:20 -0400
> 
> >> Does every string access in emacs proceed through the utf8 decoder?
> >
> > If you need to look at the character, yes.  E.g., if you need some
> > property of the character, you need to index the appropriate table by
> > that character's codepoint.  But in most operations that is not
> > needed.  You just need to recognize several specific characters, like
> > the null character, the slash, etc., most of which are ASCII.
> >
> 
> Do you allocate a fresh boxed character on each aref, or output an
> integer referring to a fixed ~2^22 sized table?

I'm not sure what you mean by a "boxed character".  A character in
Emacs is just an int.

> Do you maintain such a table in core?

We have a lot of tables indexed by characters.  Their implementation
is memory efficient: it can store identical values for a range of
characters, and also store the default value with minimal overhead.

> >> > We indeed maintain a cache for byte-to-character and character-to-byte
> >> > conversions.
> >> 
> >> How big is this cache?
> >
> > Its size is dynamic, and depends on how frequently the conversion is
> > needed in places that are far away.  The cache stores byte-to-char
> > correspondence in places that are far away, and Emacs uses binary
> > search in between them.
> >
> 
> How far is 'far away'?

The current heuristic value is 5000 characters.
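
To make the mechanism concrete, here is a minimal Python sketch of a
position cache of this kind.  It is not the Emacs C code: the class
name, the method name, and the policy of recording a checkpoint at
every multiple of 5000 characters are assumptions made for
illustration.  The sketch binary-searches the cached checkpoints and
then scans forward over the UTF-8 bytes from the nearest one.

```python
import bisect

CHECKPOINT_SPACING = 5000  # characters; the heuristic value quoted above


class CharToByteCache:
    """Hypothetical char-position to byte-position cache over UTF-8 text."""

    def __init__(self, data: bytes):
        self.data = data
        # Parallel sorted lists of known (char position, byte position) pairs.
        self.char_pos = [0]
        self.byte_pos = [0]

    def char_to_byte(self, target: int) -> int:
        # Binary search for the nearest cached checkpoint at or before target.
        i = bisect.bisect_right(self.char_pos, target) - 1
        cpos, bpos = self.char_pos[i], self.byte_pos[i]
        # Scan forward one character at a time from that checkpoint.
        while cpos < target:
            if bpos >= len(self.data):
                raise IndexError("character position out of range")
            bpos += 1
            # Skip UTF-8 continuation bytes (bit pattern 10xxxxxx).
            while bpos < len(self.data) and (self.data[bpos] & 0xC0) == 0x80:
                bpos += 1
            cpos += 1
            # Remember far-apart positions so later lookups start closer.
            if cpos % CHECKPOINT_SPACING == 0 and cpos > self.char_pos[-1]:
                self.char_pos.append(cpos)
                self.byte_pos.append(bpos)
        return bpos


text = "a\u00e9\u20ac" * 4000          # 1-, 2- and 3-byte characters
cache = CharToByteCache(text.encode("utf-8"))
end = cache.char_to_byte(len(text))     # walks the text, filling the cache
```

After the first long lookup, a request near character 10000 starts
from the cached checkpoint instead of from the beginning of the text.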

> If you had this to do all over again, would you still opt for the
> multibyte? 

Yes, I think so.  I know of nobody who ever suggested switching.

> While you have buffers to consider too, which probably relate to
> strings, it seems to me that the dominant costs are always memory
> allocation/gc related, making the memory footprint important but not at
> the expense of allocating characters, and that the most frequent
> operations are removals/pattern substitutions, which can proceed
> bytewise with the same gc overhead.

We don't allocate characters, they are just integers.

As for strings, Emacs allocates small strings specially, to minimize
overhead.  And of course, there's GC that takes care of freeing
memory.

> GCL also supports regular expressions -- how is this modified for utf-8?

We use GNU regexp, slightly modified for Emacs.  I suggest taking a
look at the source.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-30 16:06                   ` Eli Zaretskii
@ 2014-10-30 16:27                     ` Camm Maguire
  2014-10-30 16:35                       ` Eli Zaretskii
  0 siblings, 1 reply; 137+ messages in thread
From: Camm Maguire @ 2014-10-30 16:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gcl-devel, emacs-devel

Greetings, and thanks so much for the feedback!  Almost done --

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Camm Maguire <camm@maguirefamily.org>
>> Cc: emacs-devel@gnu.org,  gcl-devel@gnu.org
>> Date: Thu, 30 Oct 2014 10:13:20 -0400
>> 
>> >> Does every string access in emacs proceed through the utf8 decoder?
>> >
>> > If you need to look at the character, yes.  E.g., if you need some
>> > property of the character, you need to index the appropriate table by
>> > that character's codepoint.  But in most operations that is not
>> > needed.  You just need to recognize several specific characters, like
>> > the null character, the slash, etc., most of which are ASCII.
>> >
>> 
>> Do you allocate a fresh boxed character on each aref, or output an
>> integer referring to a fixed ~2^22 sized table?
>
> I'm not sure what you mean by a "boxed character".  A character in
> Emacs is just an int.
>

Then how do you distinguish integers from characters at the lisp level?

Take care,
-- 
Camm Maguire			     		    camm@maguirefamily.org
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-30 16:27                     ` Camm Maguire
@ 2014-10-30 16:35                       ` Eli Zaretskii
  2014-10-31 18:05                         ` Camm Maguire
  2014-11-01  1:16                         ` Stephen J. Turnbull
  0 siblings, 2 replies; 137+ messages in thread
From: Eli Zaretskii @ 2014-10-30 16:35 UTC (permalink / raw)
  To: Camm Maguire; +Cc: gcl-devel, emacs-devel

> From: Camm Maguire <camm@maguirefamily.org>
> Cc: emacs-devel@gnu.org,  gcl-devel@gnu.org
> Date: Thu, 30 Oct 2014 12:27:58 -0400
> 
> > I'm not sure what you mean by a "boxed character".  A character in
> > Emacs is just an int.
> >
> 
> Then how do you distinguish integers from characters at the lisp level?

We don't -- except that a valid character's value must fit the Unicode
range.

There's no character data type in Emacs.  (XEmacs does have it.)



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 15:56                   ` Eric S. Raymond
@ 2014-10-30 16:44                     ` Eli Zaretskii
  0 siblings, 0 replies; 137+ messages in thread
From: Eli Zaretskii @ 2014-10-30 16:44 UTC (permalink / raw)
  To: esr; +Cc: dak, rms, emacs-devel

> Date: Thu, 30 Oct 2014 11:56:29 -0400
> From: "Eric S. Raymond" <esr@thyrsus.com>
> Cc: rms@gnu.org, dak@gnu.org, emacs-devel@gnu.org
> 
> Eli Zaretskii <eliz@gnu.org>:
> > > Date: Thu, 30 Oct 2014 04:32:58 -0400
> > > From: "Eric S. Raymond" <esr@thyrsus.com>
> > > Cc: Eli Zaretskii <eliz@gnu.org>, dak@gnu.org, emacs-devel@gnu.org
> > > 
> > > Richard Stallman <rms@gnu.org>:
> > > > I've never used git, and I doubt I will use it for anything except Emacs.
> > > > I hope Emacs will advertise a simple recommended beginner's git workflow.
> > > 
> > > The wiki describes it in detail.
> > 
> > Yes, it does.  And no, that's not the "workflow" I alluded to.
> 
> Can you be more specific about what problem needs solving, then?

I already did that, in the discussion that started here:

  http://lists.gnu.org/archive/html/emacs-devel/2014-01/msg00790.html

See especially the last bullet.

Given the negative (to say the least) reaction to what I said back
then, I have absolutely no motivation to talk about that any more.  I
guess we all will discover in due time "who is right and who is dead".



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 15:12                       ` Eric S. Raymond
@ 2014-10-30 16:49                         ` Stefan Monnier
  0 siblings, 0 replies; 137+ messages in thread
From: Stefan Monnier @ 2014-10-30 16:49 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: Paul Eggert, emacs-devel

> I apologise for misunderstanding you.

No need to apologize.  We all misunderstand each other all the time.


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: DVCS design philosophy
  2014-10-30 15:33                         ` DVCS design philosophy Eric S. Raymond
@ 2014-10-30 16:59                           ` Stefan Monnier
  2014-10-30 17:41                             ` Eric S. Raymond
  0 siblings, 1 reply; 137+ messages in thread
From: Stefan Monnier @ 2014-10-30 16:59 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: Ivan Shmakov, emacs-devel

> I sometimes regret this, as I am quite prone to typos.  But I don't
> think the Bzr/Git/Hg choice to make programmer's intent part of that
> record is unreasonable, either.  One property it does guarantee is
> non-repudiability.

[ This is getting very far from Emacs, but I find it interesting, so here
  I am.  Did I say it was my last message in this thread?  ]

My view on this is that we need to distinguish the tree from the ways to
get to it.  So, the revision would get a hash which only depends on the
files in it and not on the commit message, nor the author, nor the
parent revisions.  Then the arcs between revisions (i.e. the patches)
would have their own hash which would take into account the author, the
commit message, the date, and of course the hashes of the "from" and
"to" revisions.


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: DVCS design philosophy
  2014-10-30 16:59                           ` Stefan Monnier
@ 2014-10-30 17:41                             ` Eric S. Raymond
  0 siblings, 0 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-30 17:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Ivan Shmakov, emacs-devel

Stefan Monnier <monnier@IRO.UMontreal.CA>:
> My view on this is that we need to distinguish the tree from the ways to
> get to it.  So, the revision would get a hash which only depends on the
> files in it and not on the commit message, nor the author, nor the
> parent revisions.  Then the arcs between revisions (i.e. the patches)
> would have their own hash which would take into account the author, the
> commit message, the date, and of course the hashes of the "from" and
> "to" revisions.

This sounds rather like descriptions I have read of darcs.  It matches
up, anyway, with the darcs line on being "patch oriented".

It is too bad that project stalled out; I found their theoretical talk
about patch algebra and commutativity very interesting. Alas, the FAQ
suggests that they ran into a fundamental problem with the theory that
they were unable to solve.
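
For concreteness, the split Stefan proposes (one hash for the tree,
another for the arc between two trees) might look like the sketch
below.  This is neither git's nor darcs's actual format; the function
names, the field order, and the length-prefixed encoding are all
assumptions chosen only to illustrate the idea.

```python
import hashlib


def tree_hash(files: dict[str, bytes]) -> str:
    """Revision hash that depends only on file names and contents."""
    h = hashlib.sha1()
    for name in sorted(files):
        data = files[name]
        # Length-prefix each entry so adjacent fields cannot run together.
        h.update(f"{name}\0{len(data)}\0".encode("utf-8"))
        h.update(data)
    return h.hexdigest()


def arc_hash(from_rev: str, to_rev: str, author: str, date: str, message: str) -> str:
    """Hash of the arc (patch) between two revisions, metadata included."""
    h = hashlib.sha1()
    for field in (from_rev, to_rev, author, date, message):
        raw = field.encode("utf-8")
        h.update(f"{len(raw)}\0".encode("ascii"))
        h.update(raw)
    return h.hexdigest()


# Hypothetical history: one file changes, and the log message has a typo.
old = tree_hash({"README": b"hello\n"})
new = tree_hash({"README": b"hello, world\n"})
typo = arc_hash(old, new, "A. Hacker", "2014-10-30", "Fix teh greeting")
fixed = arc_hash(old, new, "A. Hacker", "2014-10-30", "Fix the greeting")
```

Correcting the message changes only the arc's hash; both revision
hashes stay the same, which is the property under discussion.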
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 13:19             ` Stefan Monnier
@ 2014-10-31  6:36               ` Stephen J. Turnbull
  2014-10-31 19:42               ` David Kastrup
  1 sibling, 0 replies; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-10-31  6:36 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Barry Warsaw, emacs-devel

Stefan Monnier writes:

 > Especially since Git doesn't support shared repositories very well,

You mean "it requires a bit more setup than you like".
GIT_ALTERNATE_OBJECT_DIRECTORIES, anyone?

 > And if I don't place the repositories for the various projects
 > where I care about revids at the same place on every host, then
 > sharing that config-setup work between those various hosts doesn't
 > work as well.

So put them in the same place: submodules are great for that.

 > Kind of.  The thing is, in 99% of the cases, the main thing I want to
 > know from a "revision identifier" is the rough age of that revision
 > (this is useful info because if it's very recent, then I can probably
 > guess exactly which commit this is referring to).  For that "date" is
 > the best description there is.

Heh.  I *wish* I had so many new bugs I could ignore the old ones!
The new ones are typically a *lot* easier.

There are things that git doesn't do very well, but rarely are they
the things that bzr users complain about.




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30  8:32               ` Eric S. Raymond
  2014-10-30 10:25                 ` David Kastrup
  2014-10-30 15:53                 ` Eli Zaretskii
@ 2014-10-31  7:47                 ` Richard Stallman
  2014-10-31  8:17                   ` Eli Zaretskii
  2014-10-31 10:21                   ` Eric S. Raymond
  2 siblings, 2 replies; 137+ messages in thread
From: Richard Stallman @ 2014-10-31  7:47 UTC (permalink / raw)
  To: esr; +Cc: eliz, dak, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > I've never used git, and I doubt I will use it for anything except Emacs.
  > > I hope Emacs will advertise a simple recommended beginner's git workflow.

  > The wiki describes it in detail.

What is the URL of that description?  I would like to take a look.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call.




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-31  7:47                 ` Richard Stallman
@ 2014-10-31  8:17                   ` Eli Zaretskii
  2014-10-31 10:21                   ` Eric S. Raymond
  1 sibling, 0 replies; 137+ messages in thread
From: Eli Zaretskii @ 2014-10-31  8:17 UTC (permalink / raw)
  To: rms; +Cc: esr, dak, emacs-devel

> Date: Fri, 31 Oct 2014 03:47:25 -0400
> From: Richard Stallman <rms@gnu.org>
> CC: eliz@gnu.org, dak@gnu.org, emacs-devel@gnu.org
> 
>   > The wiki describes it in detail.
> 
> What is the URL of that description?

http://www.emacswiki.org/emacs/GitQuickStartForEmacsDevs
http://www.emacswiki.org/emacs/GitForEmacsDevs



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
@ 2014-10-31  9:43 Eli Zaretskii
  0 siblings, 0 replies; 137+ messages in thread
From: Eli Zaretskii @ 2014-10-31  9:43 UTC (permalink / raw)
  To: emacs-devel

One of the changes Eric made is this:

  === modified file 'test/automated/thingatpt.el'
  --- test/automated/thingatpt.el 2014-01-01 07:43:34 +0000
  +++ test/automated/thingatpt.el 2014-10-31 08:58:37 +0000
  @@ -26,7 +26,6 @@
       ("http://2.gnu.org" 6 url "http://2.gnu.org")
       ("http://3.gnu.org" 19 url "http://3.gnu.org")
       ("https://4.gnu.org" 1  url "https://4.gnu.org")
  -    ("bzr://savannah.gnu.org" 1 url "bzr://savannah.gnu.org")
       ("A geo URI (geo:3.14159,-2.71828)." 12 url "geo:3.14159,-2.71828")
       ("Visit http://5.gnu.org now." 5 url nil)
       ("Visit http://6.gnu.org now." 7 url "http://6.gnu.org")

I'm not sure this is necessary.  First, the Emacs bzr repository on
Savannah will probably remain in place for some time to come.  And
second, if we want to replace it, we should use the URL of the git
repository instead.  I see no need to make our test coverage smaller
than what we have now.

Comments?



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-31  7:47                 ` Richard Stallman
  2014-10-31  8:17                   ` Eli Zaretskii
@ 2014-10-31 10:21                   ` Eric S. Raymond
  1 sibling, 0 replies; 137+ messages in thread
From: Eric S. Raymond @ 2014-10-31 10:21 UTC (permalink / raw)
  To: Richard Stallman; +Cc: eliz, dak, emacs-devel

Richard Stallman <rms@gnu.org>:
>   > > I've never used git, and I doubt I will use it for anything except Emacs.
>   > > I hope Emacs will advertise a simple recommended beginner's git workflow.
> 
>   > The wiki describes it in detail.
> 
> What is the URL of that description?  I would like to take a look.

http://www.emacswiki.org/emacs/GitQuickStartForEmacsDevs

http://www.emacswiki.org/emacs/GitForEmacsDevs
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-30 16:35                       ` Eli Zaretskii
@ 2014-10-31 18:05                         ` Camm Maguire
  2014-11-01  9:01                           ` Eli Zaretskii
  2014-11-01  1:16                         ` Stephen J. Turnbull
  1 sibling, 1 reply; 137+ messages in thread
From: Camm Maguire @ 2014-10-31 18:05 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gcl-devel, emacs-devel

Thanks so much!

Been discussing this elsewhere, and it's come to my attention that not
only do all unicode code-points not fit into UTF-16, but all unicode
characters don't fit into unicode code-points :-).  Presumably this is
why emacs expanded to 22bits?  In any case, it makes clear what one
correspondent said, that unicode must be processed sequentially, so
there is no real reason to struggle to get random O(1) access to unicode
characters. 

If this is indeed the case, all these encodings have the same problems
though varying in degree, and UTF-8 is clearly the smallest and most
ascii compatible.  The question then arises as to whether lisp
characters, which by definition do offer random access in strings, need
be the same as or close to unicode characters.  

Did you consider leaving aref, char-code and code-char alone and writing
unicode functions on top of these, i.e. unicode-length!=length, as
opposed to making aref itself do this translation under the hood,
thereby violating the expectation of O(1) access, (which is certainly
offered in other kinds of arrays, though it is questionable whether real
users actually expect this for strings)?  In doing so, one would then
know that aref is random-access, and unicode-??? is sequential only.

Take care,

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Camm Maguire <camm@maguirefamily.org>
>> Cc: emacs-devel@gnu.org,  gcl-devel@gnu.org
>> Date: Thu, 30 Oct 2014 12:27:58 -0400
>> 
>> > I'm not sure what you mean by a "boxed character".  A character in
>> > Emacs is just an int.
>> >
>> 
>> Then how do you distinguish integers from characters at the lisp level?
>
> We don't -- except that a valid character's value must fit the Unicode
> range.
>
> There's no character data type in Emacs.  (XEmacs does have it.)
>
>
>
>

-- 
Camm Maguire			     		    camm@maguirefamily.org
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-30 14:16             ` Camm Maguire
@ 2014-10-31 18:47               ` Sam Steingold
  2014-10-31 21:00                 ` Andreas Schwab
  2014-10-31 19:52               ` [Gcl-devel] " Stefan Monnier
  1 sibling, 1 reply; 137+ messages in thread
From: Sam Steingold @ 2014-10-31 18:47 UTC (permalink / raw)
  To: gcl-devel; +Cc: emacs-devel

> * Camm Maguire <pnzz@znthversnzvyl.bet> [2014-10-30 10:16:15 -0400]:
>
> Do these other lisps allocate a fresh character on each aref?

Of course not!
A character is an immediate object, a word.
Even a 32-bit system has words big enough to store all Unicode.

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.1343
http://www.childpsy.net/ http://www.dhimmitude.org http://honestreporting.com
http://www.memritv.org http://memri.org http://camera.org http://jihadwatch.org
It's not just a language, it's an adventure.  Common Lisp.

^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-30 13:19             ` Stefan Monnier
  2014-10-31  6:36               ` Stephen J. Turnbull
@ 2014-10-31 19:42               ` David Kastrup
  2014-11-01  1:34                 ` Stephen J. Turnbull
  1 sibling, 1 reply; 137+ messages in thread
From: David Kastrup @ 2014-10-31 19:42 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> For example, in response to my earlier post, Stefan responded that
>> SHA1s aren't that easy to recognize and you'll get too many false
>> positives.  My initial rebuttal was "Eh?!", but a more constructive
>> response is, so we establish a convention of prefixing with "sha:" or
>> "SHA:".
>
> I'd rather go with "git:", but yes that's also the first obvious answer
> that came to my mind.

git: indicates a transfer method.  40 hexadecimal digits are pretty
unambiguous on their own.
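
A small Python sketch of the difference between recognizing full and
abbreviated identifiers in running text.  The two regular expressions
and the sample sentence are assumptions made up for illustration, not
a proposed convention.

```python
import re

# Exactly 40 lowercase hex digits, delimited by word boundaries.
FULL_SHA1 = re.compile(r"\b[0-9a-f]{40}\b")
# Abbreviations of 7 to 39 hex digits are far easier to confuse with
# ordinary numbers.
ABBREV_SHA1 = re.compile(r"\b[0-9a-f]{7,39}\b")

text = ("Fixed in 04a4a930a63e7396976fc016661f8f466faa64e6, "
        "which touched 1234567 lines; see also 04a4a93.")

full = FULL_SHA1.findall(text)
short = ABBREV_SHA1.findall(text)
```

Here `full` holds only the real identifier, while `short` picks up
both the abbreviation and the unrelated number 1234567, which is the
kind of false positive a prefix convention would avoid.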

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: [Gcl-devel] utf8 and emacs text/string multibyte representation
  2014-10-30 14:16             ` Camm Maguire
  2014-10-31 18:47               ` Sam Steingold
@ 2014-10-31 19:52               ` Stefan Monnier
  1 sibling, 0 replies; 137+ messages in thread
From: Stefan Monnier @ 2014-10-31 19:52 UTC (permalink / raw)
  To: Camm Maguire; +Cc: Raymond Toy, gcl-devel, emacs-devel

> Do these other lisps allocate a fresh character on each aref?  Do they
> maintain some ~2^21 sized table in core?  (And isn't emacs a "lisp"
> :-)).

I'd expect that dynamically typed languages that have a character type
distinct from integers all use an "unboxed" representation for those
chars (i.e. reserve some tag-bit combination for the "character" type).
IIRC, a unicode char only needs 22bit, so that leaves a lot of space
for tagbits.


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30 13:29                       ` Stefan Monnier
  2014-10-30 15:33                         ` DVCS design philosophy Eric S. Raymond
@ 2014-10-31 20:18                         ` Nicolas Richard
  2014-10-31 21:11                           ` Stefan Monnier
  1 sibling, 1 reply; 137+ messages in thread
From: Nicolas Richard @ 2014-10-31 20:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eric S. Raymond, Ivan Shmakov, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:
> There's no reason why the commit message would need to be considered as
> being part of the "immutable history".  IOW there's no technical reason
> to include the commit message in the Git hash.

git has a separate hash for the tree. "git cat-file commit
<somecommitsha1>" will show you that.

(Sorry if this is known stuff.)
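
As a sketch of how the two hashes relate: git names an object by the
SHA-1 of a header ("<kind> <size>" plus a NUL byte) followed by the
object body, and a commit body lists its tree, its parents, the
author and committer, then the message.  The helper names and the
identity string below are made up for illustration.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>\\0' + body, the way git names objects."""
    header = f"{kind} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: list[str], ident: str, message: str) -> bytes:
    """Assemble the text of a commit object."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {ident}", f"committer {ident}", "", message]
    return "\n".join(lines).encode("utf-8")


EMPTY_TREE = git_object_id("tree", b"")   # id of a tree with no entries
ident = "A. Hacker <hacker@example.org> 1414700000 +0000"  # hypothetical

c1 = git_object_id("commit", commit_body(EMPTY_TREE, [], ident, "Fix teh typo\n"))
c2 = git_object_id("commit", commit_body(EMPTY_TREE, [], ident, "Fix the typo\n"))
```

Both commits point at the same tree id, yet `c1` and `c2` differ
because the message is part of the hashed commit body.  Since a child
records its parent's commit id, changing `c1` would also change the
id of every descendant.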

-- 
Nicolas Richard



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-31 18:47               ` Sam Steingold
@ 2014-10-31 21:00                 ` Andreas Schwab
  0 siblings, 0 replies; 137+ messages in thread
From: Andreas Schwab @ 2014-10-31 21:00 UTC (permalink / raw)
  To: Sam Steingold; +Cc: gcl-devel, emacs-devel

Sam Steingold <sds@gnu.org> writes:

>> * Camm Maguire <pnzz@znthversnzvyl.bet> [2014-10-30 10:16:15 -0400]:
>>
>> Do these other lisps allocate a fresh character on each aref?
>
> Of course not!
> A character is an immediate object, a word.
> Even a 32-bit system has words big enough to store all Unicode.

Or rather enough bits to store all Unicode values plus tag bits.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-31 20:18                         ` Referring to revisions in the git future Nicolas Richard
@ 2014-10-31 21:11                           ` Stefan Monnier
  2014-11-01  1:44                             ` Stephen J. Turnbull
  2014-11-01  7:58                             ` David Kastrup
  0 siblings, 2 replies; 137+ messages in thread
From: Stefan Monnier @ 2014-10-31 21:11 UTC (permalink / raw)
  To: Nicolas Richard; +Cc: Eric S. Raymond, Ivan Shmakov, emacs-devel

>> There's no reason why the commit message would need to be considered as
>> being part of the "immutable history".  IOW there's no technical reason
>> to include the commit message in the Git hash.
> git has a separate hash for the tree. "git cat-file commit
> <somecommitsha1>" will show you that.

I know, but the "parents" reference a "commit", not a "tree".


        Stefan



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-28 23:05   ` Alan Mackenzie
  2014-10-28 23:24     ` Óscar Fuentes
@ 2014-10-31 22:47     ` Paul Eggert
  1 sibling, 0 replies; 137+ messages in thread
From: Paul Eggert @ 2014-10-31 22:47 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

Alan Mackenzie wrote:

> We've more than one branch in our Emacs repository, yet the bzr revision
> numbers are not in the slightest inconvenient.

On more than one occasion they've been inconvenient to me, because I've 
mistakenly used a trunk bzr revno in a non-trunk branch, or vice versa.  It's a 
natural mistake to make.

>> there was some discussion on this list about using some
>> tool-independent schema, using a combination of the author's e-mail and
>> a timestamp.
>
> Are they going to enable the sort of conversation I exemplified above?

Sure, if they catch on.  If not, another common simplification is abbreviated 
hashes, as in the output of "git log --abbrev-commit".  That way, instead of 
writing "04a4a930a63e7396976fc016661f8f466faa64e6" one can write "04a4a93". 
Abbreviated hashes are not perfect -- they're not sequential and in theory they 
can collide -- but in practice they work well enough.
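
A minimal Python sketch of how an abbreviation can be kept
unambiguous within one repository: start at seven characters, as in
the example above, and lengthen the prefix only while another known
hash shares it.  The function name is an assumption, and the second
hash in the usage example is invented to force a collision.

```python
def shortest_unique_prefixes(hashes: list[str], minimum: int = 7) -> dict[str, str]:
    """Map each full hash to its shortest prefix that no other hash shares."""
    result = {}
    for h in hashes:
        n = minimum
        # Lengthen the prefix while some other hash starts the same way.
        while n < len(h) and any(o != h and o.startswith(h[:n]) for o in hashes):
            n += 1
        result[h] = h[:n]
    return result


real = "04a4a930a63e7396976fc016661f8f466faa64e6"
made_up = "04a4a93f" + "0" * 32   # hypothetical hash sharing seven characters

alone = shortest_unique_prefixes([real])
crowded = shortest_unique_prefixes([real, made_up])
```

With only the real hash present, "04a4a93" is enough; once the
colliding hash exists, both need eight characters.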



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-30 16:35                       ` Eli Zaretskii
  2014-10-31 18:05                         ` Camm Maguire
@ 2014-11-01  1:16                         ` Stephen J. Turnbull
  1 sibling, 0 replies; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-11-01  1:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Camm Maguire, gcl-devel, emacs-devel

Eli Zaretskii writes:

 > There's no character data type in Emacs.  (XEmacs does have it.)

IIRC Ken'ichi has been thinking about adding a character type to
Emacs.  But that would be a big change.  It was worth it for XEmacs,
but I doubt it would be worth it for Emacs any more.  If GCL has a
character type already, that should not change.

Regarding "boxing", in XEmacs currently we have two tag bits at the
low end of a word, bit patterns ending in ...1 are integers (x >> 1
gives the value), bit patterns ending in ...10 are characters (x >> 2
gives ord(c)), and bit patterns ending in ...00 are pointers to lrecord
types.  So yes, we always return a boxed character, but the
representation fits in a single word, and is not a pointer.
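
The tagging scheme just described can be sketched in a few lines of
Python.  This is an illustration of the bit layout only, not XEmacs
source; the function names are assumptions, and Python integers stand
in for machine words.

```python
def box_int(n: int) -> int:
    """Integers: value shifted left one bit, low bit set (...1)."""
    return (n << 1) | 1


def box_char(code: int) -> int:
    """Characters: code shifted left two bits, low bits 10 (...10)."""
    return (code << 2) | 0b10


def classify(word: int) -> tuple[str, int]:
    """Decode a tagged word into (type name, payload)."""
    if word & 1:
        return ("integer", word >> 1)
    if (word & 0b11) == 0b10:
        return ("character", word >> 2)
    # Low bits 00: an aligned pointer, returned unchanged here.
    return ("pointer", word)


as_int = classify(box_int(42))
as_char = classify(box_char(ord("\u00e9")))
as_ptr = classify(0x1000)
```

The integer 42 and the character with code 42 get different words (85
and 170), so the two types stay distinct without any allocation.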








^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-30 14:20                       ` Barry Warsaw
@ 2014-11-01  1:23                         ` Stephen J. Turnbull
  0 siblings, 0 replies; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-11-01  1:23 UTC (permalink / raw)
  To: Barry Warsaw; +Cc: emacs-devel

Barry Warsaw writes:

 > [Commit logs] include a lot more detail about a change so that
 > future developers have a good history spelunking experience, but a
 > changelog/NEWS file will give only the higher level user visible
 > changes.

ChangeLog != NEWS in Emacs.  Am I missing something?

Come to think of it, there could be a convention for creating and
marking NEWS entries in commit logs, then NEWS (or proto-NEWS, there
are ordering issues and it's not clear that committers could get them
right) could be auto-generated.






^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-10-31 19:42               ` David Kastrup
@ 2014-11-01  1:34                 ` Stephen J. Turnbull
  2014-11-01  7:05                   ` Tassilo Horn
  0 siblings, 1 reply; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-11-01  1:34 UTC (permalink / raw)
  To: David Kastrup; +Cc: emacs-devel

David Kastrup writes:
 > Stefan Monnier <monnier@iro.umontreal.ca> writes:
 > 
 > >> For example, in response to my earlier post, Stefan responded that
 > >> SHA1s aren't that easy to recognize and you'll get too many false
 > >> positives.  My initial rebuttal was "Eh?!", but a more constructive
 > >> response is, so we establish a convention of prefixing with
 > >> "sha:" or "SHA:".
 > >
 > > I'd rather go with "git:", but yes that's also the first obvious answer
 > > that came to my mind.
 > 
 > git: indicates a transfer method.

I don't have a problem with that interpretation, the full format
being:

    git://<repo>.<suborg>.hosting-provider.tld/<40 hex digits>

This would put some strain on Savannah orgs, but hardly impossible.
Oh, yeah, you're right -- it's incompatible with current usage by git
itself, which would be

    git://hosting-provider.tld/<path-to-repo>#<40 hex digits>

IIRC.  So then it would be the somewhat unintuitive "git:#".

 > 40 hexadecimal digits are pretty unambiguous on their own.

Sure, but I had in mind abbreviations as well.




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-31 21:11                           ` Stefan Monnier
@ 2014-11-01  1:44                             ` Stephen J. Turnbull
  2014-11-01  7:58                             ` David Kastrup
  1 sibling, 0 replies; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-11-01  1:44 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Eric S. Raymond, Nicolas Richard, Ivan Shmakov, emacs-devel

Stefan Monnier writes:

 > >> There's no reason why the commit message would need to be
 > >> considered as being part of the "immutable history".  IOW
 > >> there's no technical reason to include the commit message in the
 > >> Git hash.

There's no technical reason why anything needs to be considered part
of the immutable history, it's all convention.  And of course (up to
the finite set of SHA1s) both the typo and the fixed message are
already part of immutable history.  The integers aren't going to
change. ;-)

But more practically, Darcs, for example, does not have a DAG at all:
you can easily mix and match patches that would be on different
branches in a DAG-based VCS.  Effectively for our purposes there is no
history, and Darcs users don't seem to miss it.

As a possible way to address this issue, you can always attach "notes"
to a git commit.  vc.el could deal with them reliably even if git's
own UIs refuse to.

 > > git has a separate hash for the tree. "git cat-file commit
 > > <somecommitsha1>" will show you that.
 > 
 > I know, but the "parents" reference a "commit", not a "tree".

Double indirection is not a real problem.  It's just that (as you
validly point out) the tools haven't been written.

BTW, you need to deal with merges (something that you can't do
efficiently in Darcs, which doesn't know about parents anyway).




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01  1:34                 ` Stephen J. Turnbull
@ 2014-11-01  7:05                   ` Tassilo Horn
  2014-11-01  7:09                     ` Dima Kogan
                                       ` (2 more replies)
  0 siblings, 3 replies; 137+ messages in thread
From: Tassilo Horn @ 2014-11-01  7:05 UTC (permalink / raw)
  To: Stephen J. Turnbull; +Cc: David Kastrup, emacs-devel

"Stephen J. Turnbull" <stephen@xemacs.org> writes:

>  > 40 hexadecimal digits are pretty unambiguous on their own.
>
> Sure, but I had in mind abbreviations as well.

According to the Regex Dictionary at http://www.visca.com/regexdict/
there is no single English word matching the regex "^[a-f0-9]{7,}$" and
AFAICS an abbreviated Git SHA is 7 chars wide.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01  7:05                   ` Tassilo Horn
@ 2014-11-01  7:09                     ` Dima Kogan
  2014-11-01  7:28                     ` Paul Eggert
  2014-11-01  7:49                     ` David Kastrup
  2 siblings, 0 replies; 137+ messages in thread
From: Dima Kogan @ 2014-11-01  7:09 UTC (permalink / raw)
  To: emacs-devel

Tassilo Horn <tsdh@gnu.org> writes:

> "Stephen J. Turnbull" <stephen@xemacs.org> writes:
>
>>  > 40 hexadecimal digits are pretty unambiguous on their own.
>>
>> Sure, but I had in mind abbreviations as well.
>
> According to the Regex Dictionary at http://www.visca.com/regexdict/
> there is no single English word matching the regex "^[a-f0-9]{7,}$" and
> AFAICS an abbreviated Git SHA is 7 chars wide.

Almost:

$ egrep -i '^[a-f]{7,}$' /usr/share/dict/american-english-large

acceded
defaced
effaced
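
The same check can be sketched in Python.  The word list below is just
the three dictionary hits above plus a few extra strings picked for
illustration, and the "sha:" prefix is the convention floated earlier
in this thread, not something git itself defines.

```python
import re

# Bare pattern from the thread: seven or more hex characters.
ABBREV_SHA = re.compile(r"^[a-f0-9]{7,}$")

# Same pattern, but only accepted after an explicit "sha:" marker.
PREFIXED_SHA = re.compile(r"\bsha:([a-f0-9]{7,40})\b")

words = ["acceded", "defaced", "effaced", "defface", "67de23d", "revision"]

# Every word here except "revision" looks like an abbreviated hash.
look_like_hashes = [w for w in words if ABBREV_SHA.match(w)]

# With the prefix convention only the marked string is picked up.
text = "fixed in sha:67de23d, which defaced the defface form"
marked = PREFIXED_SHA.findall(text)
```

So a bare regex does give a handful of false positives, while a
prefix removes them at the cost of a few extra keystrokes.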



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01  7:05                   ` Tassilo Horn
  2014-11-01  7:09                     ` Dima Kogan
@ 2014-11-01  7:28                     ` Paul Eggert
  2014-11-01  7:49                     ` David Kastrup
  2 siblings, 0 replies; 137+ messages in thread
From: Paul Eggert @ 2014-11-01  7:28 UTC (permalink / raw)
  To: emacs-devel

Tassilo Horn wrote:
> According to the Regex Dictionary at http://www.visca.com/regexdict/
> there is no single English word matching the regex "^[a-f0-9]{7,}$"

What about 'defface'?  That's not in the dictionary but it's a popular word when 
talking about Emacs.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01  7:05                   ` Tassilo Horn
  2014-11-01  7:09                     ` Dima Kogan
  2014-11-01  7:28                     ` Paul Eggert
@ 2014-11-01  7:49                     ` David Kastrup
  2014-11-01  9:46                       ` Alan Mackenzie
  2 siblings, 1 reply; 137+ messages in thread
From: David Kastrup @ 2014-11-01  7:49 UTC (permalink / raw)
  To: Stephen J. Turnbull; +Cc: emacs-devel

Tassilo Horn <tsdh@gnu.org> writes:

> "Stephen J. Turnbull" <stephen@xemacs.org> writes:
>
>>  > 40 hexadecimal digits are pretty unambiguous on their own.
>>
>> Sure, but I had in mind abbreviations as well.
>
> According to the Regex Dictionary at http://www.visca.com/regexdict/
> there is no single English word matching the regex "^[a-f0-9]{7,}$" and
> AFAICS an abbreviated Git SHA is 7 chars wide.

It is at least 6 characters I think, more if 6 would not be unambiguous.
The problem is that "unambiguous" changes over time and repositories.  A
historically unambiguous SHA might become ambiguous over time and/or it
might be unambiguous in one person's repository but not another.  And
when garbage collection removes rebased or deleted branches,
"unambiguous" might even become shorter again.
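
To make the "changes over time and repositories" point concrete, here
is a small Python sketch.  It is only a model, not how git computes
abbreviations internally; shortest_unique_prefix is a made-up helper,
the minimum length is a parameter because people in this thread
disagree on it, and the third hash below is invented to force a
collision.

```python
def shortest_unique_prefix(target, known_hashes, minimum=7):
    """Shortest prefix of target (at least `minimum` long) that no other
    hash in known_hashes shares."""
    others = [h for h in known_hashes if h != target]
    for length in range(minimum, len(target) + 1):
        prefix = target[:length]
        if not any(h.startswith(prefix) for h in others):
            return prefix
    return target

target = "67de23ddb1ed5471e302f6a84fae7a9037a0d980"
other = "09898e7c3b040086e8addd4ef226548c9dce1460"
invented = "67de23d0" + "0" * 32   # hypothetical object, for illustration

repo_a = [target, other]             # one person's repository
repo_b = [target, other, invented]   # another repository, or the same one later

short_in_a = shortest_unique_prefix(target, repo_a)
short_in_b = shortest_unique_prefix(target, repo_b)
```

The same commit gets a seven-character abbreviation in the first
object set and needs eight in the second, which is why a pasted
abbreviation is only as good as the repository it was made in.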

It's not like anybody's going to want to type off "abbreviations" by
hand anyway: too error-prone.  Just paste the full thing.  Really.  I've
been developing for years with Git, and that's just what everybody does
most of the time.

-- 
David Kastrup



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-10-31 21:11                           ` Stefan Monnier
  2014-11-01  1:44                             ` Stephen J. Turnbull
@ 2014-11-01  7:58                             ` David Kastrup
  1 sibling, 0 replies; 137+ messages in thread
From: David Kastrup @ 2014-11-01  7:58 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> There's no reason why the commit message would need to be considered as
>>> being part of the "immutable history".  IOW there's no technical reason
>>> to include the commit message in the Git hash.
>> git has a separate hash for the tree. "git cat-file commit
>> <somecommitsha1>" will show you that.
>
> I know, but the "parents" reference a "commit", not a "tree".

I see the danger of companies or people having fun obfuscating/stripping
repositories where there are only isolated root commits without DAG.
Those are not helpful for development.  One idea of Git was to have
verifiable records, also with regard to attributions.  Being able to
silently render parts of history useless for work, possibly having it
spread out through push/pull eventually, is not really a comforting idea.

For that reason, I think that the current scheme is not the worst idea.
One can add "notes" after the fact, and it is conceivable to create
notes that are used for patching typos when creating a ChangeLog
automatically if it is really important to someone.

But I consider it reasonable that the default records are indelible.

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: utf8 and emacs text/string multibyte representation
  2014-10-31 18:05                         ` Camm Maguire
@ 2014-11-01  9:01                           ` Eli Zaretskii
  2014-11-01 18:32                             ` Stephen J. Turnbull
  0 siblings, 1 reply; 137+ messages in thread
From: Eli Zaretskii @ 2014-11-01  9:01 UTC (permalink / raw)
  To: Camm Maguire; +Cc: gcl-devel, emacs-devel

> From: Camm Maguire <camm@maguirefamily.org>
> Cc: emacs-devel@gnu.org,  gcl-devel@gnu.org
> Date: Fri, 31 Oct 2014 14:05:20 -0400
> 
> Been discussing this elsewhere, and it's come to my attention that not
> only do all unicode code-points not fit into UTF-16, but all unicode
> characters don't fit into unicode code-points :-).  Presumably this is
> why emacs expanded to 22bits?

Not sure what you mean here.  All Unicode characters do fit into the
Unicode codepoint space.  Emacs extends that codepoint space beyond 22
bits because it needs to support cultures which don't want unification
yet.

> If this is indeed the case, all these encodings have the same problems
> though varying in degree, and UTF-8 is clearly the smallest and most
> ascii compatible.  The question then arises as to whether lisp
> characters, which by definition do offer random access in strings, need
> be the same as or close to unicode characters.  

In Emacs, they are the same, yes.  Anything else means considerable
complications, AFAIR.

Random access to strings on the Lisp level is implemented as a
function on the C level, which simply walks the UTF-8 representation
one character at a time.  UTF-8 makes it easy to determine the number
of bytes by the first byte, so you compute that and move that many
bytes.

Emacs includes optimizations for a popular use case when each
character is a single byte (as in pure ASCII strings).  It also
records the last string used in aref and the last character and the
corresponding byte accessed in that string.  So if the Lisp program
accesses several characters of the same string that are close to each
other, the 2nd and subsequent calls to aref are much cheaper, because
they start from a closer starting point.
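
For readers who prefer code, the idea can be sketched in Python as
below.  This is a simplified model, not the Emacs C implementation: it
handles only standard UTF-8 sequences of 1 to 4 bytes, keeps the cache
per string object rather than for "the last string used", only walks
forward from the cached position, and the names utf8_len and
CachedString are made up for the example.

```python
def utf8_len(lead):
    """Length in bytes of the UTF-8 sequence that starts with `lead`."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")

class CachedString:
    def __init__(self, text):
        self.data = text.encode("utf-8")
        self.last_char = 0   # character index of the last lookup
        self.last_byte = 0   # byte offset of that character

    def char_to_byte(self, index):
        # Start from the cached position when it is not past the target,
        # otherwise fall back to the beginning of the string.
        if index >= self.last_char:
            char, byte = self.last_char, self.last_byte
        else:
            char, byte = 0, 0
        while char < index:
            byte += utf8_len(self.data[byte])   # skip one whole character
            char += 1
        self.last_char, self.last_byte = char, byte
        return byte

    def aref(self, index):
        start = self.char_to_byte(index)
        end = start + utf8_len(self.data[start])
        return self.data[start:end].decode("utf-8")

s = CachedString("a\u00e9\u20ac\U0001F600z")   # 1-, 2-, 3-, 4-, 1-byte chars
```

After s.aref(3), a call to s.aref(4) walks over a single character
instead of four, which is the effect of the cache described above.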

> Did you consider leaving aref, char-code and code-char alone and writing
> unicode functions on top of these, i.e. unicode-length!=length, as
> opposed to making aref itself do this translation under the hood,
> thereby violating the expectation of O(1) access, (which is certainly
> offered in other kinds of arrays, though it is questionable whether real
> users actually expect this for strings)?

What would be the benefit of having such byte-oriented aref?  Lisp
code needs to manipulate characters, not bytes.  Having byte-oriented
aref would just push the translation to characters to the Lisp level,
something no Lisp application wants or should want doing.

Internally, on the C level, Emacs does have access to individual
bytes, of course.  On that level, each string is indeed
byte-addressable at O(1) complexity.

> In doing so, one would then know that aref is random-access, and
> unicode-??? is sequential only.

As explained above, the access to characters is not really sequential
in Emacs, except for the first character of a string that was not
accessed yet.



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01  7:49                     ` David Kastrup
@ 2014-11-01  9:46                       ` Alan Mackenzie
  2014-11-01 10:13                         ` Eli Zaretskii
  2014-11-01 10:29                         ` David Kastrup
  0 siblings, 2 replies; 137+ messages in thread
From: Alan Mackenzie @ 2014-11-01  9:46 UTC (permalink / raw)
  To: David Kastrup; +Cc: Stephen J. Turnbull, emacs-devel

Good morning, David.

On Sat, Nov 01, 2014 at 08:49:34AM +0100, David Kastrup wrote:
> It's not like anybody's going to want to type off "abbreviations" by
> hand anyway: too error-prone.

I'm going to want to do this, that's why I started this thread.  Using a
computer to kill and yank such a number is going to be such a downer.  Do
you also kill and yank a variable name each time you need to type it in,
or do you just type it?

Likely, I'm not going to be able to do this.  Remembering and typing in a
revision number is trivial: two chunks of memory - one for the bit that
slowly changes "118" the other for the "220" at the end.  With 40 digit
hex strings, even abbreviated, it's going to be 6 or 7 chunks to
memorise.  As you say, this will be error-prone.

> Just paste the full thing.  Really.  I've been developing for years
> with Git, and that's just what everybody does most of the time.

Because they have to, not because it's their preferred way of working.

> -- 
> David Kastrup

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01  9:46                       ` Alan Mackenzie
@ 2014-11-01 10:13                         ` Eli Zaretskii
  2014-11-01 11:33                           ` Alan Mackenzie
  2014-11-01 10:29                         ` David Kastrup
  1 sibling, 1 reply; 137+ messages in thread
From: Eli Zaretskii @ 2014-11-01 10:13 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: stephen, dak, emacs-devel

> Date: Sat, 1 Nov 2014 09:46:04 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: "Stephen J. Turnbull" <stephen@xemacs.org>, emacs-devel@gnu.org
> 
> On Sat, Nov 01, 2014 at 08:49:34AM +0100, David Kastrup wrote:
> > It's not like anybody's going to want to type off "abbreviations" by
> > hand anyway: too error-prone.
> 
> I'm going to want to do this, that's why I started this thread.  Using a
> computer to kill and yank such a number is going to be such a downer.  Do
> you also kill and yank a variable name each time you need to type it in,
> or do you just type it?

I suggest "M-/" instead, I'm using that with bzr revno's as well.
It's way better than copy/paste, and usually is right on target after
you type the first 2 characters.  Sometimes I need to type M-/ twice.




^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01  9:46                       ` Alan Mackenzie
  2014-11-01 10:13                         ` Eli Zaretskii
@ 2014-11-01 10:29                         ` David Kastrup
  2014-11-01 11:29                           ` Alan Mackenzie
  1 sibling, 1 reply; 137+ messages in thread
From: David Kastrup @ 2014-11-01 10:29 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Stephen J. Turnbull, emacs-devel

Alan Mackenzie <acm@muc.de> writes:

> Good morning, David.
>
> On Sat, Nov 01, 2014 at 08:49:34AM +0100, David Kastrup wrote:
>> It's not like anybody's going to want to type off "abbreviations" by
>> hand anyway: too error-prone.
>
> I'm going to want to do this, that's why I started this thread.  Using a
> computer to kill and yank such a number is going to be such a downer.  Do
> you also kill and yank a variable name each time you need to type it in,
> or do you just type it?
>
> Likely, I'm not going to be able to do this.  Remembering and typing in a
> revision number is trivial: two chunks of memory - one for the bit that
> slowly changes "118" the other for the "220" at the end.  With 40 digit
> hex strings, even abbreviated, it's going to be 6 or 7 chunks to
> memorise.  As you say, this will be error-prone.
>
>> Just paste the full thing.  Really.  I've been developing for years
>> with Git, and that's just what everybody does most of the time.
>
> Because they have to, not because it's their preferred way of working.

Do you really think you know me better than I do?

If you do, you can just continue this discussion in your imagination and
not bother the list with it.  I've worked with sequential numbers for
decades.  Everybody did in the time of RCS and CVS.  I've worked with
Git and SHA-1 for quite some time, too.

You didn't apparently.  And yet you think yourself more qualified than I
am to tell people about _my_ motivations?

SHA1 is _great_ for mailing lists, by the way.  You plug in an SHA1 into
a mailing list search, and out fall all relevant mails.  Try that with a
sequence number.  You plug in an SHA1 of some commit with inscrutable
commit message missing an issue number into an issue tracker, and out
falls the message reporting the closing of the issue with a particular
commit.  It's a real life-saver for finding stuff again.  Neither
abbreviated SHA1 nor sequence numbers work for that.

-- 
David Kastrup



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01 10:29                         ` David Kastrup
@ 2014-11-01 11:29                           ` Alan Mackenzie
  2014-11-01 11:57                             ` David Kastrup
  0 siblings, 1 reply; 137+ messages in thread
From: Alan Mackenzie @ 2014-11-01 11:29 UTC (permalink / raw)
  To: David Kastrup; +Cc: Stephen J. Turnbull, emacs-devel

Hello, David.

On Sat, Nov 01, 2014 at 11:29:47AM +0100, David Kastrup wrote:
> Alan Mackenzie <acm@muc.de> writes:

> > On Sat, Nov 01, 2014 at 08:49:34AM +0100, David Kastrup wrote:
> >> It's not like anybody's going to want to type off "abbreviations" by
> >> hand anyway: too error-prone.

> > I'm going to want to do this, that's why I started this thread.  Using a
> > computer to kill and yank such a number is going to be such a downer.  Do
> > you also kill and yank a variable name each time you need to type it in,
> > or do you just type it?

> > Likely, I'm not going to be able to do this.  Remembering and typing in a
> > revision number is trivial: two chunks of memory - one for the bit that
> > slowly changes "118" the other for the "220" at the end.  With 40 digit
> > hex strings, even abbreviated, it's going to be 6 or 7 chunks to
> > memorise.  As you say, this will be error-prone.

> >> Just paste the full thing.  Really.  I've been developing for years
> >> with Git, and that's just what everybody does most of the time.

> > Because they have to, not because it's their preferred way of working.

> Do you really think you know me better than I do?

No, I don't, thankfully.  Just as you don't know what "anybody"'s going
to want to do, better than the anybody himself does.

> If you do, you can just continue this discussion in your imagination and
> not bother the list with it.  I've worked with sequential numbers for
> decades.  Everybody did in the time of RCS and CVS.  I've worked with
> Git and SHA-1 for quite some time, too.

I've been working with bzr, which uses hashes, for some while.  Funnily
enough, I can't remember anybody on emacs-devel referring to a revision
by that hash (except in discussions such as this one).  The revision
number has been used many, many times.

> You didn't apparently.  And yet you think yourself more qualified than I
> am to tell people about _my_ motivations?

Calm down, David!  Feel free to insert a "necessarily" into the last
sentence of my previous post.  It would clarify my meaning.

> SHA1 is _great_ for mailing lists, by the way.  You plug in an SHA1 into
> a mailing list search, and out fall all relevant mails.

OK, I'll believe you.  That's assuming you type in a short enough
abbreviation, or the person writing the email has typed in the full hash.
Presumably people using git do type in the full hash.  You wouldn't get
very far with that strategy on emacs-devel, though, at least not as yet.

Incidentally there have been five threads on emacs-devel which have
referred to version numbers 118xxx.  Clearly a lot of people find these
numbers convenient and useful.

> Try that with a sequence number.

Just done it (see above) with a regexp search.  It worked well, although
there were lots of false matches, of course.  The only regexp searches
you can do for hashes are for messages which include one at all, or for
single specific hashes.  If you know roughly a specific revision lies
between 33b57e7 and 16181a4, you've no way of searching an email list
for mentions of it.  On the other hand, if it's between 118204 and
118499, you'll find it easily.
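
A rough Python sketch of that kind of range search follows.  It
assumes revision numbers appear as standalone six-digit numbers, as in
the examples in this thread; revnos_in_range and the sample text are
made up for illustration.

```python
import re

def revnos_in_range(text, low, high):
    """Return the six-digit numbers in text that fall within [low, high]."""
    found = []
    for match in re.finditer(r"\b\d{6}\b", text):
        number = int(match.group())
        if low <= number <= high:
            found.append(number)
    return found

sample = ("That was fixed in revision 118147.  The bug seems to have been "
          "introduced between 118230 and 118477 (thread of 2014-10-28).")

hits = revnos_in_range(sample, 118204, 118499)
```

Nothing comparable is possible with hashes, since two hashes that are
numerically close say nothing about where the commits sit in history.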

> You plug in an SHA1 of some commit with inscrutable commit message
> missing an issue number into an issue tracker, and out falls the
> message reporting the closing of the issue with a particular commit.
> It's a real life-saver for finding stuff again.  Neither abbreviated
> SHA1 nor sequence numbers work for that.

Yes, that sounds like an excellent reason for using hashes.  But for many
uses, a revision number is better.  bzr has them both.  git doesn't.

> -- 
> David Kastrup

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01 10:13                         ` Eli Zaretskii
@ 2014-11-01 11:33                           ` Alan Mackenzie
  2014-11-01 13:06                             ` Eli Zaretskii
  0 siblings, 1 reply; 137+ messages in thread
From: Alan Mackenzie @ 2014-11-01 11:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: stephen, dak, emacs-devel

Hello, Eli.

On Sat, Nov 01, 2014 at 12:13:35PM +0200, Eli Zaretskii wrote:
> > Date: Sat, 1 Nov 2014 09:46:04 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: "Stephen J. Turnbull" <stephen@xemacs.org>, emacs-devel@gnu.org

> > On Sat, Nov 01, 2014 at 08:49:34AM +0100, David Kastrup wrote:
> > > It's not like anybody's going to want to type off "abbreviations" by
> > > hand anyway: too error-prone.

> > I'm going to want to do this, that's why I started this thread.  Using a
> > computer to kill and yank such a number is going to be such a downer.  Do
> > you also kill and yank a variable name each time you need to type it in,
> > or do you just type it?

> I suggest "M-/" instead, I'm using that with bzr revno's as well.
> It's way better than copy/paste, and usually is right on target after
> you type the first 2 characters.  Sometimes I need to type M-/ twice.

Thanks, I didn't know about M-/.  I'll give it a try.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01 11:29                           ` Alan Mackenzie
@ 2014-11-01 11:57                             ` David Kastrup
  2014-11-01 12:29                               ` Alan Mackenzie
  0 siblings, 1 reply; 137+ messages in thread
From: David Kastrup @ 2014-11-01 11:57 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Stephen J. Turnbull, emacs-devel

Alan Mackenzie <acm@muc.de> writes:

> Yes, that sounds like an excellent reason for using hashes.  But for many
> uses, a revision number is better.  bzr has them both.  git doesn't.

Git has "git describe" for getting a sequence-based description.  It's
used for informal version numbers, like for a self-compiled git as
opposed to an official release:

dak@lola:/usr/local/tmp/lilypond$ git --version
git version 1.9.1
dak@lola:/usr/local/tmp/lilypond$ ../git/git --version
git version 2.1.0.rc2.3.g67de23d.dirty
dak@lola:/usr/local/tmp/lilypond$ cd ../git
dak@lola:/usr/local/tmp/git$ git describe
v2.1.0-rc2-3-g67de23d

Nobody uses it in Email communication because there are no sufficient
upsides to it.  If you want to talk about a commit, you'll talk about

commit 67de23ddb1ed5471e302f6a84fae7a9037a0d980
Merge: f82887f 09898e7
Author: Junio C Hamano <gitster@pobox.com>
Date:   Sun Aug 10 11:03:03 2014 -0700

    Merge branch 'master' of git://ozlabs.org/~paulus/gitk
    
    * 'master' of git://ozlabs.org/~paulus/gitk:
      gitk: Updated Bulgarian translation (302t,0f,0u)
      gitk: Add keybinding to switch to parent commit

because that saves everybody the work of having to figure out the
details himself by asking his repository.  A serial number is not in any
manner more useful than a hash here.

I am not really interested in continuing this silliness since it is
totally irrelevant, anyway.  If you want to change Git's operation and
nomenclature, feel free to make your point on the Git developer list and
tell everybody there they are doing it all wrong.

-- 
David Kastrup



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01 11:57                             ` David Kastrup
@ 2014-11-01 12:29                               ` Alan Mackenzie
  2014-11-01 12:47                                 ` Ivan Shmakov
  2014-11-01 12:49                                 ` David Kastrup
  0 siblings, 2 replies; 137+ messages in thread
From: Alan Mackenzie @ 2014-11-01 12:29 UTC (permalink / raw)
  To: David Kastrup; +Cc: Stephen J. Turnbull, emacs-devel

Hello, David.

On Sat, Nov 01, 2014 at 12:57:13PM +0100, David Kastrup wrote:
> Alan Mackenzie <acm@muc.de> writes:

> > Yes, that sounds like an excellent reason for using hashes.  But for many
> > uses, a revision number is better.  bzr has them both.  git doesn't.

> Git has "git describe" for getting a sequence-based description.  It's
> used for informal version numbers, like for a self-compiled git as
> opposed to an official release:

> dak@lola:/usr/local/tmp/lilypond$ git --version
> git version 1.9.1
> dak@lola:/usr/local/tmp/lilypond$ ../git/git --version
> git version 2.1.0.rc2.3.g67de23d.dirty
> dak@lola:/usr/local/tmp/lilypond$ cd ../git
> dak@lola:/usr/local/tmp/git$ git describe
> v2.1.0-rc2-3-g67de23d

Maybe I'm blind, but I can't see anything like a sequential version
number in that string.

> Nobody uses it in Email communication because there are no sufficient
> upsides to it.

Is it of any use for anything?  Can you use it as input to a git command,
for example?

> If you want to talk about a commit, you'll talk about

[ .... ]

> I am not really interested in continuing this silliness since it is
> totally irrelevant, anyway.  If you want to change Git's operation and
> nomenclature, feel free to make your point on the Git developer list and
> tell everybody there they are doing it all wrong.

I've never been anywhere near that list, but I'd be willing to bet an
awful lot of money that this point has been raised many times on that
list, each time being dismissively dismissed with religious fervour.

There are certainly workarounds for the lack of version numbers in git,
as you and others have pointed out.  I expect their necessity will bring
me to hate the program.  Such is life.

> -- 
> David Kastrup

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future
  2014-11-01 12:29                               ` Alan Mackenzie
@ 2014-11-01 12:47                                 ` Ivan Shmakov
  2014-11-01 13:46                                   ` Alan Mackenzie
  2014-11-01 12:49                                 ` David Kastrup
  1 sibling, 1 reply; 137+ messages in thread
From: Ivan Shmakov @ 2014-11-01 12:47 UTC (permalink / raw)
  To: emacs-devel

>>>>> Alan Mackenzie <acm@muc.de> writes:
>>>>> On Sat, Nov 01, 2014 at 12:57:13PM +0100, David Kastrup wrote:

[…]

 >> dak@lola:/usr/local/tmp/git$ git describe 
 >> v2.1.0-rc2-3-g67de23d

 > Maybe I'm blind, but I can't see anything like a sequential version
 > number in that string.

	It is the 3 that comes in between the tag name (v2.1.0-rc2) and
	the Git revision proper (67de23d, prefixed with a ‘g’).

 >> Nobody uses it in Email communication because there are no
 >> sufficient upsides to it.

 > Is it of any use for anything?  Can you use it as input to a git
 > command, for example?

	The part of that string after the last ‘g’ (that is: 67de23d) is
	an abbreviated Git revision hash, which could be used with just
	about any Git command.
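
	Taking such a string apart can be sketched in Python as below.
	This assumes the three-part <tag>-<count>-g<hash> shape shown
	above; parse_describe is a made-up helper, and a description
	that is just a bare tag name is rejected with ValueError.

```python
def parse_describe(description):
    """Split '<tag>-<count>-g<hash>' into (tag, count, abbreviated hash).

    The tag itself may contain hyphens, so split from the right.
    """
    tag, count, revision = description.rsplit("-", 2)
    if not revision.startswith("g"):
        raise ValueError("expected a g<hash> suffix")
    return tag, int(count), revision[1:]

tag, commits_since_tag, short_hash = parse_describe("v2.1.0-rc2-3-g67de23d")
```

	Here commits_since_tag is the only sequence-like part, and it
	counts from the nearest tag, not from the start of history.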

[…]

-- 
FSF associate member #7257  http://boycottsystemd.org/  … 3013 B6A0 230E 334A



^ permalink raw reply	[flat|nested] 137+ messages in thread

* Re: Referring to revisions in the git future.
  2014-11-01 12:29                               ` Alan Mackenzie
  2014-11-01 12:47                                 ` Ivan Shmakov
@ 2014-11-01 12:49                                 ` David Kastrup
  1 sibling, 0 replies; 137+ messages in thread
From: David Kastrup @ 2014-11-01 12:49 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Stephen J. Turnbull, emacs-devel

Alan Mackenzie <acm@muc.de> writes:

> Hello, David.
>
> On Sat, Nov 01, 2014 at 12:57:13PM +0100, David Kastrup wrote:
>> Alan Mackenzie <acm@muc.de> writes:
>
>> > Yes, that sounds like an excellent reason for using hashes.  But for many
>> > uses, a revision number is better.  bzr has them both.  git doesn't.
>
>> Git has "git describe" for getting a sequence-based description.  It's
>> used for informal version numbers, like for a self-compiled git as
>> opposed to an official release:
>
>> dak@lola:/usr/local/tmp/lilypond$ git --version
>> git version 1.9.1
>> dak@lola:/usr/local/tmp/lilypond$ ../git/git --version
>> git version 2.1.0.rc2.3.g67de23d.dirty
>> dak@lola:/usr/local/tmp/lilypond$ cd ../git
>> dak@lola:/usr/local/tmp/git$ git describe
>> v2.1.0-rc2-3-g67de23d
>
> Maybe I'm blind, but I can't see anything like a sequential version
> number in that string.

It's the 3.  The tag v2.1.0-rc2 occurred 3 commits before the named one:

*   commit 67de23ddb1ed5471e302f6a84fae7a9037a0d980 (HEAD, master)
|\  Merge: f82887f 09898e7
| | Author: Junio C Hamano <gitster@pobox.com>
| | Date:   Sun Aug 10 11:03:03 2014 -0700
| | 
| |     Merge branch 'master' of git://ozlabs.org/~paulus/gitk
| |     
| |     * 'master' of git://ozlabs.org/~paulus/gitk:
| |       gitk: Updated Bulgarian translation (302t,0f,0u)
| |       gitk: Add keybinding to switch to parent commit
| |   
| * commit 09898e7c3b040086e8addd4ef226548c9dce1460
| | Author: Alexander Shopov <ash@kambanaria.org>
| | Date:   Sun Aug 3 15:36:43 2014 +0300
| | 
| |     gitk: Updated Bulgarian translation (302t,0f,0u)
| |     
| |     Signed-off-by: Alexander Shopov <ash@kambanaria.org>
| |     Signed-off-by: Paul Mackerras <paulus@samba.org>
| |   
| * commit d4ec30b24a8ad076771064ac71dbe5420512cc30
| | Author: Max Kirillov <max@max630.net>
| | Date:   Tue Jul 8 23:45:35 2014 +0300
| | 
| |     gitk: Add keybinding to switch to parent commit
| |     
| |     Signed-off-by: Max Kirillov <max@max630.net>
| |     Signed-off-by: Paul Mackerras <paulus@samba.org>
| |   
* | commit f82887f29010e1ec88ec1930a99ddc56b6438452 (tag: v2.1.0-rc2)
| | Author: Junio C Hamano <gitster@pobox.com>
| | Date:   Fri Aug 8 13:52:16 2014 -0700
| | 
| |     Git 2.1-rc2
| |     
| |     Signed-off-by: Junio C Hamano <gitster@pobox.com>


>> Nobody uses it in Email communication because there are no sufficient
>> upsides to it.
>
> Is it of any use for anything?  Can you use it as input to a git
> command, for example?

You can, but git just skims off the g67de23d as far as I can tell and
uses that.
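
To make the layout of that string concrete, here is a minimal sketch that
splits a `git describe` result into its three fields.  parse_describe is a
hypothetical helper written for illustration, not part of Git, and it assumes
the default <tag>-<count>-g<abbreviated hash> output format shown above.

```python
import re

# Pattern for the default `git describe` output: <tag>-<count>-g<abbreviated hash>.
# The tag itself may contain hyphens (e.g. "v2.1.0-rc2"), so the greedy first
# group leaves only the last two fields for the count and the hash.
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(text):
    """Split a `git describe` string into (tag, commits_since_tag, abbrev_sha).

    parse_describe is a hypothetical helper, not part of Git.
    """
    text = text.strip()
    match = DESCRIBE_RE.match(text)
    if match is None:
        # HEAD sits exactly on a tag: `git describe` prints the bare tag name.
        return text, 0, None
    return match.group("tag"), int(match.group("count")), match.group("sha")


tag, count, sha = parse_describe("v2.1.0-rc2-3-g67de23d")
print(tag, count, sha)  # prints: v2.1.0-rc2 3 67de23d
```

Only the last field identifies the commit; the tag and the count are there
for human readers.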

>> I am not really interested in continuing this silliness since it is
>> totally irrelevant, anyway.  If you want to change Git's operation
>> and nomenclature, feel free to make your point on the Git developer
>> list and tell everybody there they are doing it all wrong.
>
> I've never been anywhere near that list, but I'd be willing to bet an
> awful lot of money that this point has been raised many times on that
> list, each time being dismissively dismissed with religious fervour.

The irony, the irony.

> There are certainly workarounds for the lack of version numbers in
> git, as you and others have pointed out.  I expect their necessity
> will bring me to hate the program.  Such is life.

I have no doubt that you'll be able to indulge your preconceptions.

-- 
David Kastrup




* Re: Referring to revisions in the git future.
  2014-11-01 11:33                           ` Alan Mackenzie
@ 2014-11-01 13:06                             ` Eli Zaretskii
  2014-11-01 13:21                               ` Alan Mackenzie
  0 siblings, 1 reply; 137+ messages in thread
From: Eli Zaretskii @ 2014-11-01 13:06 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: stephen, dak, emacs-devel

> Date: Sat, 1 Nov 2014 11:33:38 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: stephen@xemacs.org, dak@gnu.org, emacs-devel@gnu.org
> 
> > I suggest "M-/" instead, I'm using that with bzr revno's as well.
> > It's way better than copy/paste, and usually is right on target after
> > you type the first 2 characters.  Sometimes I need to type M-/ twice.
> 
> Thanks, I didn't know about M-/.

Really?  So do you always type the names of your variables in their
entirety?  M-/ is a great savior of keystrokes for that.
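
For readers who have not met M-/ (dabbrev-expand), the idea can be sketched
roughly as follows.  This is a simplified model, not the Emacs
implementation: it only looks at text before point, whereas the real command
can also search after point and in other buffers.  The function name
expansions and the sample text are made up for illustration.

```python
import re


def expansions(text_before_point, prefix):
    """List words before point that start with `prefix`, nearest first.

    Hypothetical model of dynamic abbreviation expansion: each further
    press of the key would move on to the next entry of the result.
    """
    words = re.findall(r"[\w-]+", text_before_point)
    found = []
    for word in reversed(words):
        if word.startswith(prefix) and word != prefix and word not in found:
            found.append(word)
    return found


text = "fixed in 67de23d, then broke in 67aa901, see "
print(expansions(text, "67"))  # prints: ['67aa901', '67de23d']
```

The nearest match comes first, which is why the first expansion is usually
right for a revision identifier that was mentioned a few lines earlier.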




* Re: Referring to revisions in the git future.
  2014-11-01 13:06                             ` Eli Zaretskii
@ 2014-11-01 13:21                               ` Alan Mackenzie
  0 siblings, 0 replies; 137+ messages in thread
From: Alan Mackenzie @ 2014-11-01 13:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: stephen, dak, emacs-devel

Hello, Eli.

On Sat, Nov 01, 2014 at 03:06:40PM +0200, Eli Zaretskii wrote:
> > Date: Sat, 1 Nov 2014 11:33:38 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: stephen@xemacs.org, dak@gnu.org, emacs-devel@gnu.org

> > > I suggest "M-/" instead, I'm using that with bzr revno's as well.
> > > It's way better than copy/paste, and usually is right on target after
> > > you type the first 2 characters.  Sometimes I need to type M-/ twice.

> > Thanks, I didn't know about M-/.

> Really?  So do you always type the names of your variables in their
> entirety?  M-/ is a great savior of keystrokes for that.

Yes, and yes.  I'm a reasonably fast typist, so it doesn't bother me
most of the time.  But I will try M-/.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Referring to revisions in the git future
  2014-11-01 12:47                                 ` Ivan Shmakov
@ 2014-11-01 13:46                                   ` Alan Mackenzie
  2014-11-01 18:58                                     ` Stephen J. Turnbull
  0 siblings, 1 reply; 137+ messages in thread
From: Alan Mackenzie @ 2014-11-01 13:46 UTC (permalink / raw)
  To: Ivan Shmakov; +Cc: emacs-devel

Hello, Ivan.

On Sat, Nov 01, 2014 at 12:47:57PM +0000, Ivan Shmakov wrote:
> >>>>> Alan Mackenzie <acm@muc.de> writes:
> >>>>> On Sat, Nov 01, 2014 at 12:57:13PM +0100, David Kastrup wrote:

>  >> dak@lola:/usr/local/tmp/git$ git describe 
>  >> v2.1.0-rc2-3-g67de23d

>  > Maybe I'm blind, but I can't see anything like a sequential version
>  > number in that string.

> 	It comes in-between the tag name (v2.1.0-rc2) and the Git
> 	revision proper (g67de23d), prefixed with a ‘g’.

Ah, it's the "3".

>  >> Nobody uses it in Email communication because there are no
>  >> sufficient upsides to it.

>  > Is it of any use for anything?  Can you use it as input to a git
>  > command, for example?

By which I meant the sequential version number.  Clearly "3" is not going
to do anything at all.

> 	The part of that string after the last ‘g’ (that is: 67de23d) is
> 	an abbreviated Git revision hash, which could be used with just
> 	about any Git command.

So, in other words, you might as well just use the "67de23d" rather than
typing in the whole lot.

> -- 
> FSF associate member #7257  http://boycottsystemd.org/  … 3013 B6A0 230E 334A

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: utf8 and emacs text/string multibyte representation
  2014-11-01  9:01                           ` Eli Zaretskii
@ 2014-11-01 18:32                             ` Stephen J. Turnbull
  2014-11-01 18:41                               ` David Kastrup
  0 siblings, 1 reply; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-11-01 18:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Camm Maguire, gcl-devel, emacs-devel

Eli Zaretskii writes:

 > > Been discussing this elsewhere, and it's come to my attention that not
 > > only do all unicode code-points not fit into UTF-16, but all unicode
 > > characters don't fit into unicode code-points :-).  Presumably this is
 > > why emacs expanded to 22bits?
 > 
 > Not sure what you mean here.  All Unicode characters do fit into the
 > Unicode codepoint space.  Emacs extends that codepoint space beyond 22
 > bits because it needs to support cultures which don't want unification
 > yet.

I suppose he means grapheme complexes, such as various accented
characters that can be constructed from composing characters but do
not have precomposed forms in Unicode.  As you say, that's not why
Emacs extended the code space.
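
As a small illustration of such a grapheme, the sketch below uses Python's
standard unicodedata module.  The two sample strings are chosen for this
example: "e" plus a combining acute accent has a precomposed form, while "q"
plus a combining tilde does not.

```python
import unicodedata

# "e" + COMBINING ACUTE ACCENT has a precomposed form (U+00E9), so NFC
# normalization folds the pair into a single code point.
e_acute = unicodedata.normalize("NFC", "e\u0301")

# "q" + COMBINING TILDE has no precomposed form, so even after NFC the one
# visible character still occupies two code points.
q_tilde = unicodedata.normalize("NFC", "q\u0303")

print(len(e_acute), len(q_tilde))  # prints: 1 2
```

Both code points of the second string fit in the Unicode code space; it is
the user-visible character that needs more than one of them.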

 > > Did you consider leaving aref, char-code and code-char alone and writing
 > > unicode functions on top of these, i.e. unicode-length!=length, as
 > > opposed to making aref itself do this translation under the hood,
 > > thereby violating the expectation of O(1) access, (which is certainly
 > > offered in other kinds of arrays, though it is questionable whether real
 > > users actually expect this for strings)?

Actually, originally Emacs allowed you to treat text (buffers and
strings) either as sequences of characters or arrays of bytes, and
this was a real bug-breeder (and why XEmacs chose the pain of the
incompatible separation of integer type from character type).

I'm not sure if the feature is present in modern Emacs, but at the
very least the usage is so rare today that I'm unaware of any.

That's not what you asked, but it implies the answer "no, and you
shouldn't, either" to your question.  This is despite the fact that
yes, in many languages and applications users *do* expect O(1) access
to individual characters in text.
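
To show why character indexing into a variable-width encoding is not O(1),
here is a generic sketch over plain UTF-8 bytes.  It is not how Emacs
implements aref; char_at is a hypothetical helper and the sample string is
made up.

```python
def char_at(utf8_bytes, index):
    """Return the character at position `index` in UTF-8 encoded bytes.

    Hypothetical helper: it walks the bytes from the start, so the cost
    grows with `index` instead of being O(1).
    """
    seen = -1
    for offset, byte in enumerate(utf8_bytes):
        # Continuation bytes look like 0b10xxxxxx; any other byte starts
        # a new character.
        if (byte & 0xC0) != 0x80:
            seen += 1
            if seen == index:
                end = offset + 1
                while end < len(utf8_bytes) and (utf8_bytes[end] & 0xC0) == 0x80:
                    end += 1
                return utf8_bytes[offset:end].decode("utf-8")
    raise IndexError(index)


text = "na\u00efve \u20ac"           # 7 characters
data = text.encode("utf-8")          # 10 bytes
print(len(text), len(data))          # prints: 7 10
print(char_at(data, 2) == "\u00ef")  # prints: True
```

The character count and the byte count differ, which is exactly the
distinction that goes wrong when the same text is treated as both a
character sequence and a byte array.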




* Re: utf8 and emacs text/string multibyte representation
  2014-11-01 18:32                             ` Stephen J. Turnbull
@ 2014-11-01 18:41                               ` David Kastrup
  2014-11-01 19:09                                 ` Stephen J. Turnbull
  2014-11-02  0:56                                 ` Stefan Monnier
  0 siblings, 2 replies; 137+ messages in thread
From: David Kastrup @ 2014-11-01 18:41 UTC (permalink / raw)
  To: emacs-devel; +Cc: gcl-devel

"Stephen J. Turnbull" <stephen@xemacs.org> writes:

> Eli Zaretskii writes:
>
>  > > Been discussing this elsewhere, and it's come to my attention that not
>  > > only do all unicode code-points not fit into UTF-16, but all unicode
>  > > characters don't fit into unicode code-points :-).  Presumably this is
>  > > why emacs expanded to 22bits?
>  > 
>  > Not sure what you mean here.  All Unicode characters do fit into the
>  > Unicode codepoint space.  Emacs extends that codepoint space beyond 22
>  > bits because it needs to support cultures which don't want unification
>  > yet.
>
> I suppose he means grapheme complexes, such as various accented
> characters that can be constructed from composing characters but do
> not have precomposed forms in Unicode.  As you say, that's not why
> Emacs extended the code space.
>
>  > > Did you consider leaving aref, char-code and code-char alone and writing
>  > > unicode functions on top of these, i.e. unicode-length!=length, as
>  > > opposed to making aref itself do this translation under the hood,
>  > > thereby violating the expectation of O(1) access, (which is certainly
>  > > offered in other kinds of arrays, though it is questionable whether real
>  > > users actually expect this for strings)?
>
> Actually, originally Emacs allowed you to treat text (buffers and
> strings) either as sequences of characters or arrays of bytes, and
> this was a real bug-breeder (and why XEmacs chose the pain of the
> incompatible separation of integer type from character type).
>
> I'm not sure if the feature is present in modern Emacs, but at the
> very least the usage is so rare today that I'm unaware of any.

string-as-unibyte and string-as-multibyte most certainly are available
for going from one to the other.  But the commands working on either
unibyte or multibyte strings are the same.  Similar for buffers.  I have
no idea whether this is a problem vector for creating inconsistent
multibyte content.  I could imagine it to be, but so could be
user-created CCL programs for code conversion.

-- 
David Kastrup





* Re: Referring to revisions in the git future
  2014-11-01 13:46                                   ` Alan Mackenzie
@ 2014-11-01 18:58                                     ` Stephen J. Turnbull
  0 siblings, 0 replies; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-11-01 18:58 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Ivan Shmakov, emacs-devel

Alan Mackenzie writes:

 > >  >> dak@lola:/usr/local/tmp/git$ git describe 
 > >  >> v2.1.0-rc2-3-g67de23d
 > 
 > >  > Maybe I'm blind, but I can't see anything like a sequential version
 > >  > number in that string.

Unlike the other commenters, I will tell you that it is
"v2.1.0-rc2-3".  git describe works such that as long as you know
where the tags are relative to each other, you will be able to order
them.  It can't be used for computing bisects, of course, but
"v2.1.0-rc2-3" vs. "v2.0.9-4" tells you that these commits are close
to each other indeed.  That's useful, although I doubt you'll find it
terribly useful.

That's assuming they're on the same branch (although the usual
semantics of release versioning suggests that even on different
branches they're likely to be only a small patch apart).  Also there
may be un-orderable tags like "acm-forget-me-not".
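
A rough sketch of that ordering, under stated assumptions: TAG_ORDER is a
hand-maintained list of tags, oldest first, which Git does not supply by
itself; describe_key is a hypothetical helper; and the hash 0123abc is made
up.  The result is only meaningful for commits on the same branch.

```python
import re

# Hypothetical, hand-maintained list of release tags, oldest first.
# Git does not provide this ordering by itself; it is an assumption here.
TAG_ORDER = ["v2.0.9", "v2.1.0-rc2"]


def describe_key(text):
    """Sort key for `git describe` strings: (tag position, commits past tag)."""
    match = re.match(r"^(.+)-(\d+)-g[0-9a-f]+$", text)
    tag, count = (match.group(1), int(match.group(2))) if match else (text, 0)
    # Raises ValueError for tags outside TAG_ORDER, i.e. un-orderable ones.
    return TAG_ORDER.index(tag), count


names = ["v2.1.0-rc2-3-g67de23d", "v2.0.9-4-g0123abc", "v2.1.0-rc2"]
print(sorted(names, key=describe_key))
# prints: ['v2.0.9-4-g0123abc', 'v2.1.0-rc2', 'v2.1.0-rc2-3-g67de23d']
```

A tag such as "acm-forget-me-not" has no place in TAG_ORDER, so the key
function raises ValueError for it rather than guessing a position.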

The "not on same branch" issue applies to bzr, as well, of course, but
bzr encourages focus on the mainline.  With all due respect to Barry,
I disagree that it's the most natural way to work all of the time, and
I suspect that with the transition to git you will start seeing more
complex branching structure.  This will lead to more off-trunk
references and more cross-branch comparisons.

In any case, the git transition is a done deal, so you will really be
a lot happier if you focus on how you can make use of SHAs more
effective for you rather than on the essential wonderfulness of
revnos.





* Re: utf8 and emacs text/string multibyte representation
  2014-11-01 18:41                               ` David Kastrup
@ 2014-11-01 19:09                                 ` Stephen J. Turnbull
  2014-11-02  0:56                                 ` Stefan Monnier
  1 sibling, 0 replies; 137+ messages in thread
From: Stephen J. Turnbull @ 2014-11-01 19:09 UTC (permalink / raw)
  To: David Kastrup; +Cc: gcl-devel, emacs-devel

David Kastrup writes:

 > string-as-unibyte and string-as-multibyte most certainly are available
 > for going from one to the other.  But the commands working on either
 > unibyte or multibyte strings are the same.

True.  Thanks for pointing that out.

 > Similar for buffers.  I have no idea whether this is a problem
 > vector for creating inconsistent multibyte content.  I could
 > imagine it to be,

And indeed it was.

 > but so could be user-created CCL programs for code conversion.

True, but few users ever create CCL programs.  It's something only a
specialist would do, so as far as I'm concerned that's not an
attractive nuisance the way *-as-unibyte are.  YMMV etc.





* Re: utf8 and emacs text/string multibyte representation
  2014-11-01 18:41                               ` David Kastrup
  2014-11-01 19:09                                 ` Stephen J. Turnbull
@ 2014-11-02  0:56                                 ` Stefan Monnier
  1 sibling, 0 replies; 137+ messages in thread
From: Stefan Monnier @ 2014-11-02  0:56 UTC (permalink / raw)
  To: David Kastrup; +Cc: gcl-devel, emacs-devel

>> Actually, originally Emacs allowed you to treat text (buffers and
>> strings) either as sequences of characters or arrays of bytes, and

This was the case in Emacs-20.1, yes.  It was fixed by Emacs-20.3 or
maybe even 20.2 already.

> string-as-unibyte and string-as-multibyte most certainly are available
> for going from one to the other.

And they both suck.

> I have no idea whether this is a problem vector for creating
> inconsistent multibyte content.

It's mostly a problem in that it helps 8bit people stay stuck in the
confusion between bytes and characters.


        Stefan




end of thread, other threads:[~2014-11-02  0:56 UTC | newest]

Thread overview: 137+ messages
2014-10-28 22:33 Referring to revisions in the git future Alan Mackenzie
2014-10-28 22:54 ` Óscar Fuentes
2014-10-28 23:05   ` Alan Mackenzie
2014-10-28 23:24     ` Óscar Fuentes
2014-10-31 22:47     ` Paul Eggert
2014-10-29  0:49 ` Eric S. Raymond
2014-10-29  3:38   ` Stephen J. Turnbull
2014-10-29 12:26     ` Stefan Monnier
2014-10-29 12:41       ` Alexander Baier
2014-10-29 14:52   ` Barry Warsaw
2014-10-29 15:01     ` David Kastrup
2014-10-29 15:06       ` Eric S. Raymond
2014-10-29 18:12         ` Barry Warsaw
2014-10-29 22:09           ` Lars Magne Ingebrigtsen
2014-10-29 22:29             ` Eric S. Raymond
2014-10-29 23:31               ` Paul Eggert
2014-10-30  0:01                 ` Nic Ferrier
2014-10-30  1:53                 ` Stefan Monnier
2014-10-30  2:10                   ` Eric S. Raymond
2014-10-30  2:13                     ` Paul Eggert
2014-10-30  2:48                       ` Eric S. Raymond
2014-10-30  2:25                     ` Glenn Morris
2014-10-30 10:10                       ` David Kastrup
2014-10-30 13:03                         ` Stefan Monnier
2014-10-30 13:40                           ` David Kastrup
2014-10-30 14:00                             ` Stefan Monnier
2014-10-30 13:02                     ` Stefan Monnier
2014-10-30 15:12                       ` Eric S. Raymond
2014-10-30 16:49                         ` Stefan Monnier
2014-10-30  6:46                 ` Jan Djärv
2014-10-30  7:36                   ` Ivan Shmakov
2014-10-30  8:09                     ` Jan Djärv
2014-10-30  8:31                     ` Eric S. Raymond
2014-10-30  9:53                       ` Andreas Schwab
2014-10-30 10:13                         ` Eric S. Raymond
2014-10-30 10:32                           ` Andreas Schwab
2014-10-30 11:13                             ` Nicolas Richard
2014-10-30 10:12                       ` David Kastrup
2014-10-30 13:29                       ` Stefan Monnier
2014-10-30 15:33                         ` DVCS design philosophy Eric S. Raymond
2014-10-30 16:59                           ` Stefan Monnier
2014-10-30 17:41                             ` Eric S. Raymond
2014-10-31 20:18                         ` Referring to revisions in the git future Nicolas Richard
2014-10-31 21:11                           ` Stefan Monnier
2014-11-01  1:44                             ` Stephen J. Turnbull
2014-11-01  7:58                             ` David Kastrup
2014-10-30 14:20                       ` Barry Warsaw
2014-11-01  1:23                         ` Stephen J. Turnbull
2014-10-30 15:52                     ` Eli Zaretskii
2014-10-30  3:32           ` Stephen J. Turnbull
2014-10-30  4:35             ` Barry Warsaw
2014-10-30  5:24               ` Stephen J. Turnbull
2014-10-30 10:17               ` David Kastrup
2014-10-30 13:42               ` Alex Bennée
2014-10-30 13:19             ` Stefan Monnier
2014-10-31  6:36               ` Stephen J. Turnbull
2014-10-31 19:42               ` David Kastrup
2014-11-01  1:34                 ` Stephen J. Turnbull
2014-11-01  7:05                   ` Tassilo Horn
2014-11-01  7:09                     ` Dima Kogan
2014-11-01  7:28                     ` Paul Eggert
2014-11-01  7:49                     ` David Kastrup
2014-11-01  9:46                       ` Alan Mackenzie
2014-11-01 10:13                         ` Eli Zaretskii
2014-11-01 11:33                           ` Alan Mackenzie
2014-11-01 13:06                             ` Eli Zaretskii
2014-11-01 13:21                               ` Alan Mackenzie
2014-11-01 10:29                         ` David Kastrup
2014-11-01 11:29                           ` Alan Mackenzie
2014-11-01 11:57                             ` David Kastrup
2014-11-01 12:29                               ` Alan Mackenzie
2014-11-01 12:47                                 ` Ivan Shmakov
2014-11-01 13:46                                   ` Alan Mackenzie
2014-11-01 18:58                                     ` Stephen J. Turnbull
2014-11-01 12:49                                 ` David Kastrup
2014-10-29  1:11 ` Stefan Monnier
2014-10-29  6:06   ` Werner LEMBERG
2014-10-29  9:01     ` David Kastrup
2014-10-29  8:50 ` David Kastrup
2014-10-29  9:52   ` Eric S. Raymond
2014-10-29 11:00     ` David Kastrup
2014-10-29 14:32       ` Eli Zaretskii
2014-10-29 14:35         ` David Kastrup
2014-10-29 14:55           ` Eli Zaretskii
2014-10-30  4:44             ` Richard Stallman
2014-10-30  8:32               ` Eric S. Raymond
2014-10-30 10:25                 ` David Kastrup
2014-10-30 11:51                   ` Eric S. Raymond
2014-10-30 12:14                     ` David Kastrup
2014-10-30 15:01                       ` Eric S. Raymond
2014-10-30 15:53                 ` Eli Zaretskii
2014-10-30 15:56                   ` Eric S. Raymond
2014-10-30 16:44                     ` Eli Zaretskii
2014-10-31  7:47                 ` Richard Stallman
2014-10-31  8:17                   ` Eli Zaretskii
2014-10-31 10:21                   ` Eric S. Raymond
2014-10-29 12:35     ` Stefan Monnier
2014-10-29 13:00       ` Jose E. Marchesi
2014-10-29 13:59         ` Stefan Monnier
2014-10-29 14:39           ` Eric S. Raymond
2014-10-29 14:46             ` Rasmus
2014-10-29 14:52               ` Eric S. Raymond
2014-10-30  0:58               ` Rob Browning
2014-10-29 15:27             ` Stefan Monnier
2014-10-29 14:04         ` utf8 and emacs text/string multibyte representation Camm Maguire
2014-10-29 14:51           ` Eli Zaretskii
2014-10-29 15:55             ` Camm Maguire
2014-10-29 16:19               ` Eli Zaretskii
2014-10-30 14:13                 ` Camm Maguire
2014-10-30 16:06                   ` Eli Zaretskii
2014-10-30 16:27                     ` Camm Maguire
2014-10-30 16:35                       ` Eli Zaretskii
2014-10-31 18:05                         ` Camm Maguire
2014-11-01  9:01                           ` Eli Zaretskii
2014-11-01 18:32                             ` Stephen J. Turnbull
2014-11-01 18:41                               ` David Kastrup
2014-11-01 19:09                                 ` Stephen J. Turnbull
2014-11-02  0:56                                 ` Stefan Monnier
2014-11-01  1:16                         ` Stephen J. Turnbull
2014-10-29 16:45             ` Stefan Monnier
2014-10-29 15:56           ` Raymond Toy
2014-10-30 14:16             ` Camm Maguire
2014-10-31 18:47               ` Sam Steingold
2014-10-31 21:00                 ` Andreas Schwab
2014-10-31 19:52               ` [Gcl-devel] " Stefan Monnier
2014-10-30  3:08           ` Stephen J. Turnbull
2014-10-29 13:26       ` Referring to revisions in the git future Eric S. Raymond
2014-10-29 14:04         ` Stefan Monnier
2014-10-29 14:49           ` Eric S. Raymond
2014-10-30  2:43           ` Stephen J. Turnbull
2014-10-29 13:08     ` Jan Djärv
2014-10-29 13:27       ` Eric S. Raymond
2014-10-29 13:49         ` Eric S. Raymond
2014-10-29 18:03           ` Jan Djärv
2014-10-29 11:18   ` Alan Mackenzie
2014-10-29 11:37     ` David Kastrup
  -- strict thread matches above, loose matches on Subject: below --
2014-10-31  9:43 Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
