* Policy question - encoding to use in git repository?
From: Eric S. Raymond @ 2014-02-17 15:29 UTC
To: emacs-devel

While continuing to try to identify correct deletion points for attic
files, I have run across a minor problem: Latin-1 characters in
Changelog files. I have seen two, c-cedilla and something I can't
identify that renders as a backtick. There may be more. I can fix
them up.

I request a policy decision about what encoding the repository
content should use. I see three reasonable choices:

* Leave Latin-1 in place.

* Transcode to UTF-8. (I favor this as the best long-term solution.)

* Transcode to ASCII approximations - easy in the two cases I've
  found so far.
-- 
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

Nearly all men can stand adversity, but if you want to test a man's
character, give him power. -- Abraham Lincoln
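[Editorial sketch, not part of the original message.] The second and third choices above can be illustrated roughly as follows. This is a minimal sketch that assumes the input bytes really are Latin-1; the function names and the sample string are hypothetical, and nothing here is taken from the actual conversion scripts.

```python
import unicodedata


def to_utf8(raw: bytes) -> bytes:
    """Interpret the bytes as Latin-1 text and re-encode them as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


def to_ascii_approx(raw: bytes) -> bytes:
    """Approximate Latin-1 text in ASCII.

    NFKD splits an accented letter into its base letter plus a combining
    mark; encoding with errors="ignore" then drops the combining mark.
    """
    text = raw.decode("latin-1")
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore")


# Hypothetical sample: c-cedilla stored as the single Latin-1 byte 0xE7.
sample = b"Fran\xe7ois Pinard"

print(to_utf8(sample))
print(to_ascii_approx(sample))
```

Characters with no base-letter decomposition (a stand-alone accent mark, for instance) would not turn into anything useful this way and would need an explicit hand-written mapping.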
* Re: Policy question - encoding to use in git repository?
From: Juanma Barranquero @ 2014-02-17 15:41 UTC
To: Eric S. Raymond; +Cc: Emacs developers

On Mon, Feb 17, 2014 at 4:29 PM, Eric S. Raymond <esr@thyrsus.com> wrote:

> I have seen two, c-cedilla and something I can't
> identify that renders as a backtick. There may be more. I can fix
> them up.

Can you please show examples of these?

    J
* Re: Policy question - encoding to use in git repository?
From: Eric S. Raymond @ 2014-02-17 16:30 UTC
To: Juanma Barranquero; +Cc: Emacs developers

Juanma Barranquero <lekktu@gmail.com>:
> On Mon, Feb 17, 2014 at 4:29 PM, Eric S. Raymond <esr@thyrsus.com> wrote:
>
> > I have seen two, c-cedilla and something I can't
> > identify that renders as a backtick. There may be more. I can fix
> > them up.
>
> Can you please show examples of these?

Yes. Excuse the odd formatting, I have processed the ChangeLogs into a
big Python initializer so I can write code to mine them for the
deletion points of attic files.

In the first one, the accent acute before [delete] (I confused it with
the later backtick before). Probably this should just be edited to an
ASCII backtick.

In the second one, the c-cedilla in Franc,ois Pinard's name.

There may be others; I have not yet checked all Changelogs for these,
though I will do so now.

    ("2000-07-07T14:15:55Z!gerd@gnu.org",
     "lisp/ChangeLog",
     "refs/tags/emacs-pretest-21.0.90",
     r"""\
    2000-07-07 Gerd Moellmann <gerd@gnu.org>

        * bindings.el: Bind ´[delete]' to delete-char.

        * dired.el (dired-find-alternate-file): New function.
        (dired-mode-map): Bind `a' to dired-find-alternate-file.
        (toplevel): Require dired-aux when compiling.
        (dired-buffers): Move defvar within file to avoid compiler warning.

        * info.el (Info-last-search): Variable removed.
        (Info-search-history): New variable.
        (Info-search): New Info-search-history.

        * battery.el, info-look.el: Change author's mail address.
    """),

    ("2000-08-28T20:35:45Z!pbreton@attbi.com",
     "lisp/ChangeLog",
     "refs/tags/emacs-pretest-21.0.90",
     r"""\
    2000-08-28 Peter Breton <pbreton@ne.mediaone.net>

        * locate.el (locate): Cleaned up locate command's interactive prompting
        Thanks to François_Pinard <pinard@iro.umontreal.ca> for suggestions.

        * filecache.el (file-cache-case-fold-search): New variable
        (file-cache-assoc-function): New variable
        (file-cache-minibuffer-complete): Use file-cache-assoc-function.
        Use file-cache-case-fold-search variable
        (file-cache-add-file): Use file-cache-assoc-function
        (file-cache-delete-file): likewise
        (file-cache-directory-name): likewise
        (file-cache-debug-read-from-minibuffer): likewise
    """),

-- 
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
* Re: Policy question - encoding to use in git repository?
From: Andreas Schwab @ 2014-02-17 17:03 UTC
To: esr; +Cc: Juanma Barranquero, Emacs developers

"Eric S. Raymond" <esr@thyrsus.com> writes:

> ("2000-07-07T14:15:55Z!gerd@gnu.org",
> "lisp/ChangeLog",

The file was reencoded in
revid:lekktu@gmail.com-20080327114958-auavr50v7a90i6cw (git id c8ec82b).

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
* Re: Policy question - encoding to use in git repository?
From: Eric S. Raymond @ 2014-02-17 17:12 UTC
To: Andreas Schwab; +Cc: Juanma Barranquero, Emacs developers

Andreas Schwab <schwab@suse.de>:
> "Eric S. Raymond" <esr@thyrsus.com> writes:
>
> > ("2000-07-07T14:15:55Z!gerd@gnu.org",
> > "lisp/ChangeLog",
>
> The file was reencoded in
> revid:lekktu@gmail.com-20080327114958-auavr50v7a90i6cw (git id c8ec82b).

OK, that being the case my plan is not to touch the earlier versions.

That is, unless someone with decision authority tells me differently.
-- 
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
* Re: Policy question - encoding to use in git repository?
From: Karl Fogel @ 2014-02-17 16:39 UTC
To: Eric S. Raymond; +Cc: emacs-devel

esr@thyrsus.com (Eric S. Raymond) writes:
>While continuing to try to identify correct deletion points for attic
>files, I have run across a minor problem: Latin-1 characters in
>Changelog files. I have seen two, c-cedilla and something I can't
>identify that renders as a backtick. There may be more. I can fix
>them up.
>
>I request a policy decision about what encoding the repository
>content should use. I see three reasonable choices:
>
>* Leave Latin-1 in place.
>
>* Transcode to UTF-8. (I favor this as the best long-term solution.)
>
>* Transcode to ASCII approximations - easy in the two cases I've
> found so far.

+1 to UTF-8 !
* Re: Policy question - encoding to use in git repository?
From: Eli Zaretskii @ 2014-02-17 16:52 UTC
To: Karl Fogel; +Cc: esr, emacs-devel

> From: Karl Fogel <kfogel@red-bean.com>
> Date: Mon, 17 Feb 2014 10:39:39 -0600
> Cc: emacs-devel@gnu.org
>
> esr@thyrsus.com (Eric S. Raymond) writes:
> >While continuing to try to identify correct deletion points for attic
> >files, I have run across a minor problem: Latin-1 characters in
> >Changelog files. I have seen two, c-cedilla and something I can't
> >identify that renders as a backtick. There may be more. I can fix
> >them up.
> >
> >I request a policy decision about what encoding the repository
> >content should use. I see three reasonable choices:
> >
> >* Leave Latin-1 in place.
> >
> >* Transcode to UTF-8. (I favor this as the best long-term solution.)
> >
> >* Transcode to ASCII approximations - easy in the two cases I've
> > found so far.
>
> +1 to UTF-8 !

It's already UTF-8. This is a non-issue.
* Re: Policy question - encoding to use in git repository?
From: Eric S. Raymond @ 2014-02-17 17:01 UTC
To: Eli Zaretskii; +Cc: Karl Fogel, emacs-devel

Eli Zaretskii <eliz@gnu.org>:
> It's already UTF-8. This is a non-issue.

Do you mean the policy is already to use UTF-8? Or that you believe there
are no non-UTF-8 characters in the Changelogs?

Python doesn't think the latter is true when I try to interpret string
initializers mined from the Changelogs.
-- 
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
* Re: Policy question - encoding to use in git repository?
From: Andreas Schwab @ 2014-02-17 17:04 UTC
To: esr; +Cc: Karl Fogel, Eli Zaretskii, emacs-devel

"Eric S. Raymond" <esr@thyrsus.com> writes:

> Python doesn't think the latter is true when I try to interpret string
> initializers mined from the Changelogs.

The file featured a couple of encodings in its history.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
* Re: Policy question - encoding to use in git repository?
From: Eli Zaretskii @ 2014-02-17 17:24 UTC
To: esr; +Cc: kfogel, emacs-devel

> Date: Mon, 17 Feb 2014 12:01:28 -0500
> From: "Eric S. Raymond" <esr@thyrsus.com>
> Cc: Karl Fogel <kfogel@red-bean.com>, emacs-devel@gnu.org
>
> Eli Zaretskii <eliz@gnu.org>:
> > It's already UTF-8. This is a non-issue.
>
> Do you mean the policy is already to use UTF-8? Or that you believe there
> are no non-UTF-8 characters in the Changelogs?

Both. Each ChangeLog file has this file-local variable at the end:

    ;; Local Variables:
    ;; coding: utf-8
    ;; End:

> Python doesn't think the latter is true when I try to interpret string
> initializers mined from the Changelogs.

Then something is probably wrong with the mining process, because I
look in the file ChangeLog.9 and see a 2-byte UTF-8 sequence \303\247
there for the Latin-1 character ç in "François_Pinard".
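[Editorial sketch, not part of the original message.] The byte-level point above can be checked along these lines: ç is the two bytes \303\247 in UTF-8 but the single byte \347 in Latin-1, and only the latter fails to decode as UTF-8. The helper name `find_non_utf8` and the two sample byte strings are assumptions made for illustration, not anything taken from the mining scripts discussed in the thread.

```python
def find_non_utf8(raw: bytes):
    """Return (offset, offending_byte) for the first byte that cannot be
    decoded as UTF-8, or None if the whole input is valid UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, raw[err.start:err.start + 1]
    return None


# Octal escapes, matching the notation used in the message above.
utf8_name = b"Fran\303\247ois_Pinard"   # c-cedilla as the UTF-8 pair C3 A7
latin1_name = b"Fran\347ois_Pinard"     # c-cedilla as the Latin-1 byte E7

print(find_non_utf8(utf8_name))
print(find_non_utf8(latin1_name))
```

Running a check like this over each historical revision of a file, rather than only the current head, would show whether the complaint comes from an older revision that predates a recoding.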
* Re: Policy question - encoding to use in git repository?
From: Ivan Kanis @ 2014-02-17 19:22 UTC
To: Eric S. Raymond; +Cc: emacs-devel

February, 17 at 10:29 Eric S. Raymond wrote:

> While continuing to try to identify correct deletion points for attic
> files, I have run across a minor problem: Latin-1 characters in
> Changelog files. I have seen two, c-cedilla and something I can't
> identify that renders as a backtick. There may be more. I can fix
> them up.
>
> I request a policy decision about what encoding the repository
> content should use. I see three reasonable choices:
>
> * Leave Latin-1 in place.
>
> * Transcode to UTF-8. (I favor this as the best long-term solution.)
>
> * Transcode to ASCII approximations - easy in the two cases I've
> found so far.

Transcoding to UTF-8 would be best, I think.
-- 
A faith is something you die for; a doctrine is something you kill
for: there is all the difference in the world.
    -- Tony Benn
* Re: Policy question - encoding to use in git repository?
From: Paul Eggert @ 2014-02-17 21:47 UTC
To: Eric S. Raymond; +Cc: emacs-devel

The closest thing we have to an encoding policy is given in the "Source
file encoding" section of admin/notes/unicode.

The history is that there's been fitful recoding to UTF-8 over the
years. About the time I wrote "Source file encoding" (March 2013), I
recoded several files. Juanma recoded a bunch of ChangeLogs in March
2008 -- these are the most-relevant to the issue of what should appear
in the repository. I assume there are other recodings as well; I
haven't kept track.

While we're on the topic of normalization, is it the intent to
normalize spelling of author and committer names in the repository?
E.g., replace "François_Pinard" with "François Pinard" (no
underscore)? Or replace "Richard M. Stallman" and "Richard M
Stallman" with "Richard Stallman"? How about email addresses? Perhaps
you've already addressed this point but if so I'm afraid I forgot what
you wrote.
* Re: Policy question - encoding to use in git repository?
From: Eric S. Raymond @ 2014-02-17 22:08 UTC
To: Paul Eggert; +Cc: emacs-devel

Paul Eggert <eggert@cs.ucla.edu>:
> While we're on the topic of normalization, is it the intent to
> normalize spelling of author and committer names in the repository?
> E.g., replace "François_Pinard" with "François Pinard" (no
> underscore)? Or replace "Richard M. Stallman" and "Richard M
> Stallman" with "Richard Stallman"? How about email addresses?

That's the first request I've had for this sort of change. It would be
easy enough to do.

All interested parties may email me normalization requests to be
incorporated into the lift script. I would prefer to get them from the
owner of the name or email address in question.
-- 
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
* Re: Policy question - encoding to use in git repository?
From: Glenn Morris @ 2014-02-18 0:50 UTC
To: emacs-devel

Paul Eggert wrote:

> While we're on the topic of normalization, is it the intent to
> normalize spelling of author and committer names in the repository?

I wondered when the revisionism would get to that stage...
Remember to fix all typos as well!
And don't forget to remove all trailing whitespace!
And use US spelling throughout!
And two spaces after all full stops!
And ...
* Re: Policy question - encoding to use in git repository?
From: Stefan Monnier @ 2014-02-18 1:07 UTC
To: Glenn Morris; +Cc: emacs-devel

>> While we're on the topic of normalization, is it the intent to
>> normalize spelling of author and committer names in the repository?
> I wondered when the revisionism would get to that stage...
> Remember to fix all typos as well!
> And don't forget to remove all trailing whitespace!
> And use US spelling throughout!
> And two spaces after all full stops!
> And ...

Please Glenn, stop this nonsense! Fixing those cosmetic issues is
a waste of time. We should focus on fixing actual bugs. Could someone
please help Eric collect a list of the various bug-fixes that should be
back-ported to older revisions? If you could find to which revision it
should apply it's better, but if not, I'm confident Eric will find
a neat heuristic to add to his reposurgeon to automatically find the
right revision.

    Stefan
* Re: Policy question - encoding to use in git repository?
From: Karl Fogel @ 2014-02-18 16:30 UTC
To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:
>>> While we're on the topic of normalization, is it the intent to
>>> normalize spelling of author and committer names in the repository?
>> I wondered when the revisionism would get to that stage...
>> Remember to fix all typos as well!
>> And don't forget to remove all trailing whitespace!
>> And use US spelling throughout!
>> And two spaces after all full stops!
>> And ...
>
>Please Glenn, stop this nonsense! Fixing those cosmetic issues is
>a waste of time. We should focus on fixing actual bugs. Could someone
>please help Eric collect a list of the various bug-fixes that should be
>back-ported to older revisions? If you could find to which revision it
>should apply it's better, but if not, I'm confident Eric will find
>a neat heuristic to add to his reposurgeon to automatically find the
>right revision.

Hah! It seems the farther we get in this process, the closer we get to
http://bzr.savannah.gnu.org/lh/emacs/trunk/annotate/head:/etc/future-bug
becoming reality, yikes...
* Re: Policy question - encoding to use in git repository?
From: David Kastrup @ 2014-02-18 6:40 UTC
To: emacs-devel

Paul Eggert <eggert@cs.ucla.edu> writes:

> While we're on the topic of normalization, is it the intent to
> normalize spelling of author and committer names in the repository?
> E.g., replace "François_Pinard" with "François Pinard" (no
> underscore)? Or replace "Richard M. Stallman" and "Richard M
> Stallman" with "Richard Stallman"? How about email addresses? Perhaps
> you've already addressed this point but if so I'm afraid I forgot what
> you wrote.

Normalization of names and mail addresses is done by putting all
respective mail addresses with proper names into a .mailmap file in the
top directory of the repository. Git will consult this file for its
various operations requiring unification of names, like git shortlog.

-- 
David Kastrup
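[Editorial sketch, not part of the original message.] As a rough illustration of the .mailmap approach, the snippet below builds entries in the "Proper Name <proper@email> Commit Name <commit@email>" form that Git's mailmap format accepts. The helper name `mailmap_entry` is hypothetical, and the example.org addresses are placeholders, since the thread gives no real addresses for these name variants.

```python
def mailmap_entry(proper_name, proper_email, commit_name, commit_email):
    """Format one .mailmap line mapping a name/address pair as recorded
    in commits onto the canonical name and address."""
    return f"{proper_name} <{proper_email}> {commit_name} <{commit_email}>"


# Name variants mentioned in the thread; the address is a placeholder.
VARIANTS = [
    ("Richard M. Stallman", "rms@example.org"),
    ("Richard M Stallman", "rms@example.org"),
]

lines = [
    mailmap_entry("Richard Stallman", "rms@example.org", name, email)
    for name, email in VARIANTS
]

# The result would be written to a file named .mailmap at the top of
# the repository; here it is only printed.
print("\n".join(lines))
```

The appeal of this design is that the history itself stays untouched: the mapping is applied when names are displayed or aggregated, so it can be corrected later without rewriting any commits.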