* EOL: unix/dos/mac
From: Per Starbäck @ 2013-03-25 13:34 UTC
To: emacs-devel

The end-of-line indicators for coding systems are unix, dos, and mac. I suggest they are replaced with lf, crlf, and cr.

I think the current indicators are misleading for those who don't know about this. Mac OS X has been preloaded on all Macs since 2002. Mac owners who find how to click on the colon on the mode line might think that they should have "Mac" there. (And other users might think they are doing their Mac user friends a favor if they convert a text file to "Mac" before sending them a file.)

"DOS" might also lead to confusion, since Microsoft Windows isn't really DOS (anymore). Wouldn't users who see that think that this must be something old in some obsolete DOS encoding?

It's not a big problem, but it will (very) slowly get bigger and bigger with time, so that even people who do know about this stuff can be confused. So I think it would be best if Emacs used designations that aren't about what systems (used to) use the different codings, but instead just "(CR)", "(LF)", "(CRLF)".

* Re: EOL: unix/dos/mac
From: Xue Fuqiao @ 2013-03-25 13:56 UTC
To: Per Starbäck; +Cc: emacs-devel

On Mon, 25 Mar 2013 14:34:04 +0100 Per Starbäck <per.starback@gmail.com> wrote:

> The end-of-line indicators for coding systems are unix, dos, and mac.
> I suggest they are replaced with lf, crlf, and cr.

+1

--
Xue Fuqiao
http://www.gnu.org/software/emacs/

* Re: EOL: unix/dos/mac
From: Richard Stallman @ 2013-03-25 22:41 UTC
To: Xue Fuqiao; +Cc: per.starback, emacs-devel

> The end-of-line indicators for coding systems are unix, dos, and mac.
> I suggest they are replaced with lf, crlf, and cr.

Someone needs to check how this would affect non-wizard users. It might be confusing for them. It might also be good for them. The point is, we are not like them and we can't tell how this would affect them.

--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call

* Re: EOL: unix/dos/mac 2013-03-25 22:41 ` Richard Stallman @ 2013-03-26 2:11 ` Stephen J. Turnbull 0 siblings, 0 replies; 27+ messages in thread From: Stephen J. Turnbull @ 2013-03-26 2:11 UTC (permalink / raw) To: rms; +Cc: Xue Fuqiao, per.starback, emacs-devel Richard Stallman writes: > > The end-of-line indicators for coding systems are unix, dos, and > > mac. I suggest they are replaced with lf, crlf, and cr. > > Someone needs to check how this would affect non-wizard users. I don't see why it would. Non-wizards rarely want to see it at all, and usually have a very incomplete understanding of what it means. IME that's what it means to be a non-wizard. Even in Japan, where users encounter 4 or 5 (!!) encodings *every* *day* (ISO-2022-JP in mail headers, EUC-JP and Shift JIS in text files from older *nix and Micros*ft environments, UTF-8 in text files from modern environments, and UTF-16 in file names on NT file systems), younger users don't even realize that they're there. They just call coding problems "mojibake" and ask for corrected data. I think a better way to present this information would be to put it in a separate "troubleshoot this buffer" function. Perhaps adding it to C-u C-x =, or a separate function on C-h = (both with the nuance "troubleshoot around point"). Caveat: I have no empirical evidence for the feeling that this would be better, just introspection and experience with helping users who are not much helped by the current UI. The idea is that ordinarily, Emacs just Does The Right Thing, so there's no need to know what the EOL suffix (or for that matter the EOL modeline indicator) means, and many users forget or never learn. If they *do* run into trouble like "stair-stepping" or "^M" in a buffer, they can use C-u C-x = to find out "what's different about this linebreak". ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac
From: Eli Zaretskii @ 2013-03-25 14:21 UTC
To: Per Starbäck; +Cc: emacs-devel

> Date: Mon, 25 Mar 2013 14:34:04 +0100
> From: Per Starbäck <per.starback@gmail.com>
>
> The end-of-line indicators for coding systems are unix, dos, and mac.
> I suggest they are replaced with lf, crlf, and cr.

I have customized my Emacsen long ago to show /, \, and : instead. Never looked back, and I will certainly keep those customizations if your suggestion is accepted as the default.

The current indicators are shown only if the EOL format is _not_ the native one on the underlying platform. That was done a long time ago, to draw users' attention to the fact that the file has unusual line endings. I think the need to draw attention to that has passed. But that's me.

* Re: EOL: unix/dos/mac
From: Dani Moncayo @ 2013-03-25 17:28 UTC
To: Eli Zaretskii; +Cc: Per Starbäck, emacs-devel

>> The end-of-line indicators for coding systems are unix, dos, and mac.
>> I suggest they are replaced with lf, crlf, and cr.
>
> I have customized my Emacsen long ago to show /, \, and : instead.
> Never looked back, and I will certainly keep those customizations if
> your suggestion is accepted as the default.

Me too. I don't want these strings to take more than one character in the modeline, which is sometimes too short.

> The current indicators are shown only if the EOL format is _not_ the
> native one on the underlying platform. That was done a long time ago,
> to draw users' attention to the fact that the file has unusual line
> endings. I think the need to draw attention to that has passed. But
> that's me.

I also prefer a consistent notation across all platforms. I don't think that this information (EOL-style) deserves so much attention.

--
Dani Moncayo

* Re: EOL: unix/dos/mac 2013-03-25 13:34 EOL: unix/dos/mac Per Starbäck 2013-03-25 13:56 ` Xue Fuqiao 2013-03-25 14:21 ` Eli Zaretskii @ 2013-03-25 19:17 ` Stefan Monnier 2013-03-26 1:42 ` Stephen J. Turnbull 2013-03-26 7:53 ` Ulrich Mueller 2 siblings, 2 replies; 27+ messages in thread From: Stefan Monnier @ 2013-03-25 19:17 UTC (permalink / raw) To: Per Starbäck; +Cc: emacs-devel > The end-of-line indicators for coding systems are unix, dos, and mac. > I suggest they are replaced with lf, crlf, and cr. I do not like cr/lf/crlf as I expect many users will have no idea what they mean. > Mac OS X has been preloaded on all Macs since 2002. The "Mac" indicator is indeed a very poor choice nowadays. > "DOS" might also lead to confusion, since Microsoft Windows isn't > really DOS (anymore). "DOS" is not a great choice either, indeed, tho it's definitely not as bad as "Mac" since the heir of DOS still uses the same system. > I have customized my Emacsen long ago to show /, \, and : instead. I also like this representation, since it happens to correlate rather well (although most Mac OS X users never see the `/', just like most Mac OS users never saw the `:' separator). > The current indicators are shown only if the EOL format is _not_ the > native one on the underlying platform. That was done a long time ago, > to draw users' attention to the fact that the file has unusual line > endings. I think the need to draw attention to that has passed. But > that's me. I actually disagree that this need has passed. For that reason, I actually like to see "(DOS)" in the modeline, since a simple change from "/" to "\" would definitely go unnoticed (in my case at least). So I'm OK with "updating" the indicators, tho I'm not sure what we should use instead. To replace "Mac", maybe we could use "MacOS9", which is longish but hopefully such files are rare nowadays. But DOS files are not rare, so we need something sufficiently concise. 
BTW, in this same area, it would be good to detect and indicate prominently "Unix with some CRLFs", also known as "mixed-line-ending", which is often misunderstood as "my Emacs fails to recognize my CRLF file".

Stefan

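The "Unix with some CRLFs" case mentioned above can be detected by counting each end-of-line form in the raw bytes. What follows is only a minimal Python sketch of that idea, not Emacs code; the names `count_eols` and `classify_eol` and the returned label strings are hypothetical and chosen for illustration.

```python
import re
from collections import Counter

# Match CRLF first so that it is not counted as a CR followed by an LF.
_EOL_RE = re.compile(rb"\r\n|\r|\n")

_NAMES = {b"\r\n": "crlf", b"\r": "cr", b"\n": "lf"}


def count_eols(data: bytes) -> Counter:
    """Count how often each end-of-line form occurs in data."""
    return Counter(_NAMES[m.group()] for m in _EOL_RE.finditer(data))


def classify_eol(data: bytes) -> str:
    """Return 'lf', 'crlf', 'cr', 'mixed', or 'none' (no line ending found)."""
    counts = count_eols(data)
    if not counts:
        return "none"
    if len(counts) == 1:
        return next(iter(counts))
    return "mixed"
```

A file that is mostly LF with a few stray CRLFs is classified as "mixed" rather than silently treated as "lf", which is the distinction a mode-line indicator would need to show prominently.
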
* Re: EOL: unix/dos/mac 2013-03-25 19:17 ` Stefan Monnier @ 2013-03-26 1:42 ` Stephen J. Turnbull 2013-03-26 6:28 ` Eli Zaretskii ` (2 more replies) 2013-03-26 7:53 ` Ulrich Mueller 1 sibling, 3 replies; 27+ messages in thread From: Stephen J. Turnbull @ 2013-03-26 1:42 UTC (permalink / raw) To: Stefan Monnier; +Cc: Per Starbäck, emacs-devel Stefan Monnier writes: > BTW, in this same area, it would be good to detect and indicate > prominently "Unix with some CRLFs", also known as "mixed-line-ending", > which is often misunderstood as "my Emacs fails to recognize my CRLF > file". Unicode doesn't care, you know: it considers all ASCII line breaks and terminators to be the same thing (NEW LINE FUNCTION). I haven't read that part of the standard in a long time, but IIRC, although many people interpolate "according to platform", Unicode doesn't care about that, it just says "all of these sequences when encountered in text purporting to conform to this standard should be treated in the same way." Emacsen should do the same. The question then is how to deal with file comparison. We'd like to avoid creating spurious diffs based on "fixing" random different line endings, so if the user doesn't edit those positions (lines?), the line ending should be written as read. I guess one could attach a text property to newlines differing from the file's autodetected EOL convention. I've also considered switching the internal representation of newline to U+2028 LINE SEPARATOR, but that's not at all pressing. Steve ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac 2013-03-26 1:42 ` Stephen J. Turnbull @ 2013-03-26 6:28 ` Eli Zaretskii 2013-03-26 7:45 ` Stephen J. Turnbull 2013-03-26 12:51 ` Stefan Monnier 2013-03-26 14:02 ` Alan Mackenzie 2 siblings, 1 reply; 27+ messages in thread From: Eli Zaretskii @ 2013-03-26 6:28 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: per.starback, monnier, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Date: Tue, 26 Mar 2013 10:42:38 +0900 > Cc: Per Starbäck <per.starback@gmail.com>, emacs-devel@gnu.org > > Stefan Monnier writes: > > > BTW, in this same area, it would be good to detect and indicate > > prominently "Unix with some CRLFs", also known as "mixed-line-ending", > > which is often misunderstood as "my Emacs fails to recognize my CRLF > > file". > > Unicode doesn't care, you know: it considers all ASCII line breaks and > terminators to be the same thing (NEW LINE FUNCTION). I haven't read > that part of the standard in a long time, but IIRC, although many > people interpolate "according to platform", Unicode doesn't care about > that, it just says "all of these sequences when encountered in text > purporting to conform to this standard should be treated in the same > way." Emacsen should do the same. That would require Emacs to store all the possible EOL sequences in the buffer, and treat them all identically. That's doable, but is a non-trivial job; volunteers are welcome. > The question then is how to deal with file comparison. We'd like to > avoid creating spurious diffs based on "fixing" random different line > endings If Emacs is to support different EOL formats in the same file, it should not convert them at all. Anything else _will_ introduce spurious modifications, and could even corrupt some files, if the exact EOL sequence here or there matters. > I guess one could attach a text property to newlines differing from > the file's autodetected EOL convention. Not sure how a text property should help here. 
> I've also considered switching the internal representation of newline
> to U+2028 LINE SEPARATOR

What good would that be?

* Re: EOL: unix/dos/mac 2013-03-26 6:28 ` Eli Zaretskii @ 2013-03-26 7:45 ` Stephen J. Turnbull 2013-03-26 8:42 ` Eli Zaretskii 0 siblings, 1 reply; 27+ messages in thread From: Stephen J. Turnbull @ 2013-03-26 7:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: per.starback, monnier, emacs-devel Eli Zaretskii writes: > > From: "Stephen J. Turnbull" <stephen@xemacs.org> > > [Unicode] just says "all of these sequences when encountered in > > text purporting to conform to this standard should be treated in > > the same way." Emacsen should do the same. > > That would require Emacs to store all the possible EOL sequences in > the buffer, and treat them all identically. That's doable, but is a > non-trivial job; volunteers are welcome. I don't know what you mean by "all the possible EOL sequences". It's well-defined (in Unicode TR#13 or section 5.8 of Unicode 6.2) what an NLF is: it's the first of CRLF, LF, CR, or NL (U+0085) that matches when parsing a line. In the buffer, they would all be converted to Emacs' representation (ie, LF). Ensuring that C-x C-f file RET C-x C-w file RET is the identity requires marking non-default EOL sequences somehow, that's all. > > The question then is how to deal with file comparison. We'd like to > > avoid creating spurious diffs based on "fixing" random different line > > endings > > If Emacs is to support different EOL formats in the same file, it > should not convert them at all. Of course it should convert them. Trying to support multiple EOL codings in the buffer is craziness. Two decades ago, I had to live that madness at the coding system level, it was called "Nihongo Emacs" (or "The Japanese Patch" in other programs). Richard (and every other upstream maintainer) rightly (with all due respect to the developers of those patches) rejected that patch for application to the mainstream project. Doing it only for EOLs would be much less painful, but it's not worth it. 
> Anything else _will_ introduce spurious modifications, and could
> even corrupt some files, if the exact EOL sequence here or there
> matters.

No, it need not, any more than any ambiguous encoding need do so. Of course it will be fragile if (for example) Emacs crashes and you have to recover an autosave file.

> > I guess one could attach a text property to newlines differing from
> > the file's autodetected EOL convention.
>
> Not sure how a text property should help here.

It would mark non-default EOL sequences for correct output.

> > I've also considered switching the internal representation of newline
> > to U+2028 LINE SEPARATOR
>
> What good would that be?

Unicode correctness; no confusion between Emacs internal representation and the actual encoding of EOL on any given platform; no long-lines ambiguity (LS would be considered a "soft newline" in applications that automatically rewrap, and U+2029 PARAGRAPH SEPARATOR would unambiguously demark paragraphs). As I wrote, it's not urgent.

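The NLF rule described in this message (take the first of CRLF, LF, CR, or NL (U+0085) that matches, and store a single LF in the buffer) can be sketched in a few lines of Python. This is an illustration only, under the assumption that the text is already decoded to characters; `NLF_RE`, `split_nlf`, and `normalize` are hypothetical names, not Emacs functions.

```python
import re
from typing import List, Tuple

# Alternation order matters: CRLF must be tried before a lone CR.
NLF_RE = re.compile("\r\n|\n|\r|\x85")


def split_nlf(text: str) -> List[Tuple[str, str]]:
    """Split text into (line, nlf) pairs; nlf is '' for an unterminated last line."""
    pairs = []
    pos = 0
    for m in NLF_RE.finditer(text):
        pairs.append((text[pos:m.start()], m.group()))
        pos = m.end()
    if pos < len(text):
        pairs.append((text[pos:], ""))
    return pairs


def normalize(text: str) -> str:
    """Replace every NLF with a single LF, as the buffer would store it."""
    return NLF_RE.sub("\n", text)
```

`normalize` alone loses the original sequences; keeping the second element of each pair from `split_nlf` is what would allow the file to be written back unchanged.
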
* Re: EOL: unix/dos/mac 2013-03-26 7:45 ` Stephen J. Turnbull @ 2013-03-26 8:42 ` Eli Zaretskii 2013-03-26 11:47 ` Stephen J. Turnbull 0 siblings, 1 reply; 27+ messages in thread From: Eli Zaretskii @ 2013-03-26 8:42 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: per.starback, monnier, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Cc: per.starback@gmail.com, > monnier@iro.umontreal.ca, > emacs-devel@gnu.org > Date: Tue, 26 Mar 2013 16:45:30 +0900 > > Eli Zaretskii writes: > > > From: "Stephen J. Turnbull" <stephen@xemacs.org> > > > > [Unicode] just says "all of these sequences when encountered in > > > text purporting to conform to this standard should be treated in > > > the same way." Emacsen should do the same. > > > > That would require Emacs to store all the possible EOL sequences in > > the buffer, and treat them all identically. That's doable, but is a > > non-trivial job; volunteers are welcome. > > I don't know what you mean by "all the possible EOL sequences". It's > well-defined (in Unicode TR#13 or section 5.8 of Unicode 6.2) what an > NLF is: it's the first of CRLF, LF, CR, or NL (U+0085) that matches > when parsing a line. That's what I meant: any of the possible NLF. > > > The question then is how to deal with file comparison. We'd like to > > > avoid creating spurious diffs based on "fixing" random different line > > > endings > > > > If Emacs is to support different EOL formats in the same file, it > > should not convert them at all. > > Of course it should convert them. > > Trying to support multiple EOL codings in the buffer is craziness. But it's the only way to be 100% sure you don't introduce spurious changes into files. And since newlines, unlike characters, are not displayed, there's no issues with fonts etc. here. So "craziness" sounds like exaggeration to me, although I do agree that making this happen is not a trivial job. > Doing it only for EOLs would be much less painful, but it's not > worth it. 
Please explain why do you think it isn't worth it. Surely, going again through the pain of inadvertent changes to user files is a movie we don't want to be part of again. > > Anything else _will_ introduce spurious modifications, and could > > even corrupt some files, if the exact EOL sequence here or there > > matters. > > No, it need not, any more than any ambiguous encoding need do so. Of > course it will be fragile if (for example) Emacs crashes and you have > to recover an autosave file. It will be fragile, and subtle bugs will tend to break quite a bit. > > > I guess one could attach a text property to newlines differing from > > > the file's autodetected EOL convention. > > > > Not sure how a text property should help here. > > It would mark non-default EOL sequences for correct output. And when text properties are removed by some operation on a buffer, what then? I don't think this is reliable enough to ensure we don't change user files where the user didn't edit them. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac 2013-03-26 8:42 ` Eli Zaretskii @ 2013-03-26 11:47 ` Stephen J. Turnbull 2013-03-26 13:07 ` Eli Zaretskii 0 siblings, 1 reply; 27+ messages in thread From: Stephen J. Turnbull @ 2013-03-26 11:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: per.starback, monnier, emacs-devel Eli Zaretskii writes: > > From: "Stephen J. Turnbull" <stephen@xemacs.org> > > Of course it should convert them. > > > > Trying to support multiple EOL codings in the buffer is craziness. > > But it's the only way to be 100% sure you don't introduce spurious > changes into files. And since newlines, unlike characters, are not > displayed, there's no issues with fonts etc. here. Currently NLFs *are* displayed, if they don't match the default for the buffer. Some even appear as glyphs (^M in -unix buffers). Sure, there's no issue with fonts. There are worse things than getting the wrong font, though. > So "craziness" sounds like exaggeration to me, although I do agree > that making this happen is not a trivial job. > > > Doing it only for EOLs would be much less painful, but it's not > > worth it. > > Please explain why do you think it isn't worth it. Because you have to fix pretty much everything, and new syntax will be required for stuff like zap-to-char and nearly required for regexps. Code will be massively uglified with tests for variable-length sequences instead of single characters, everything from motion to insdel will have to be modified. Any code handling old-style hidden lines (with CR marking "invisible" lines) will have to be changed. It's not obvious to me that there are no counterintuitive implications. Opposed to that, there are very few text files with mixed line endings, and in many cases the user would actually like to have them regularized (at a time of their choosing, so they can have a commit with only whitespace changes, for example). 
> Surely, going again through the pain of inadvertent changes to user
> files is a movie we don't want to be part of again.

What pain of inadvertent changes? Sure, there will likely be bugs in the first draft of such code, what else is new? If you're talking specifically about the \201 regression, that's a completely different issue AFAICT -- that was about buffer-as-unibyte exposing the *internal* representation to Lisp, which was a "Mr. Foot, may I introduce to you Mr. Bullet" kind of idea from Day 1.

> > > Anything else _will_ introduce spurious modifications, and could
> > > even corrupt some files, if the exact EOL sequence here or there
> > > matters.
> >
> > No, it need not, any more than any ambiguous encoding need do so. Of
> > course it will be fragile if (for example) Emacs crashes and you have
> > to recover an autosave file.
>
> It will be fragile, and subtle bugs will tend to break quite a bit.

I don't think so. It can be implemented as two functions, one run just after decoding text from an external encoding, and one run just before encoding text to an external encoding. Done efficiently it can probably be applied to saving autosave files as well, removing the fragility. For maximum safety the information about non-default NLFs could be kept in "no-see-'um" properties accessed by separate APIs so that users and programs don't accidentally delete the information.

> > > > I guess one could attach a text property to newlines differing from
> > > > the file's autodetected EOL convention.
> > >
> > > Not sure how a text property should help here.
> >
> > It would mark non-default EOL sequences for correct output.
>
> And when text properties are removed by some operation on a buffer,
> what then? I don't think this is reliable enough to ensure we don't
> change user files where the user didn't edit them.

I think you're hearing monsters in the closet. Sure, that *could* happen but code that does so is buggy IMO.
If that's not a good enough answer, "no-see-'um" properties as described above would do the trick. I suspect that operations that change properties are rare enough that putting a check for a "don't change me" flag into the normal text property operations would not be an efficiency hit.

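The two-function scheme proposed in this message (one pass just after decoding, one just before encoding) can be sketched as follows. This is a hedged Python illustration, not an Emacs implementation: `after_decode` and `before_encode` are hypothetical names, and the dictionary keyed by line index is a simplified stand-in for the text properties discussed above. Unlike a text property, it is only valid as long as no lines are inserted or deleted.

```python
import re
from typing import Dict, Tuple

NLF_RE = re.compile("\r\n|\n|\r|\x85")


def after_decode(raw: str, default: str = "\n") -> Tuple[str, Dict[int, str]]:
    """Convert every NLF to LF; remember the non-default ones by line index."""
    exceptions: Dict[int, str] = {}
    lines = []
    pos = 0
    for index, m in enumerate(NLF_RE.finditer(raw)):
        lines.append(raw[pos:m.start()])
        if m.group() != default:
            exceptions[index] = m.group()
        pos = m.end()
    lines.append(raw[pos:])  # text after the last NLF (may be empty)
    return "\n".join(lines), exceptions


def before_encode(buffer: str, exceptions: Dict[int, str], default: str = "\n") -> str:
    """Write each LF back as the NLF recorded for that line, else the default."""
    lines = buffer.split("\n")
    out = []
    for index, line in enumerate(lines[:-1]):
        out.append(line + exceptions.get(index, default))
    out.append(lines[-1])
    return "".join(out)
```

For an unedited buffer, `before_encode(*after_decode(raw))` returns `raw` unchanged, which is the "C-x C-f then C-x C-w is the identity" requirement; what happens to the recorded information when lines are edited is exactly the point under dispute in this thread.
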
* Re: EOL: unix/dos/mac 2013-03-26 11:47 ` Stephen J. Turnbull @ 2013-03-26 13:07 ` Eli Zaretskii 2013-03-26 18:12 ` Stephen J. Turnbull 0 siblings, 1 reply; 27+ messages in thread From: Eli Zaretskii @ 2013-03-26 13:07 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: per.starback, monnier, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Cc: per.starback@gmail.com, > monnier@iro.umontreal.ca, > emacs-devel@gnu.org > Date: Tue, 26 Mar 2013 20:47:33 +0900 > > > > Trying to support multiple EOL codings in the buffer is craziness. > > > > But it's the only way to be 100% sure you don't introduce spurious > > changes into files. And since newlines, unlike characters, are not > > displayed, there's no issues with fonts etc. here. > > Currently NLFs *are* displayed, if they don't match the default for > the buffer. No, they are displayed because nothing other than a single LF is treated like NLF by the Emacs internals. EOL conversion is a layer on top of that; the buffer maintenance and the display engine know absolutely nothing about it. Once these byte sequences are recognized as NLFs, they will not be displayed, because that's how the Emacs display works. > > > Doing it only for EOLs would be much less painful, but it's not > > > worth it. > > > > Please explain why do you think it isn't worth it. > > Because you have to fix pretty much everything I'm probably missing something important, because things I think will need fixing are nowhere near "pretty much everything". How about posting a long enough list of things to fix to convince me that "pretty much everything" is close to the truth? > new syntax will be required for stuff like zap-to-char Why? > and nearly required for regexps. For $ we will need to get regex.c support the additional NLFs, and that's all. If you mean a literal \n in regexps, then yes, something will have to be done with that. 
But it would be a good thing on its own right, because Emacs will come closer to supporting Unicode standard annexes. > Code will be massively uglified with tests for variable-length > sequences instead of single characters The code is already replete with that, ever since Emacs started using a multi-byte representation for characters in buffers. We have a set of macros to fetch and examine multi-byte sequences, for that reason. I see nothing hard or "ugly" here, sorry. > everything from motion to insdel will have to be modified Why? > Any code handling old-style hidden lines (with CR marking > "invisible" lines) will have to be changed. First, we want to deprecate and remove this feature anyway (there's already an implemented alternative). And second, we already handle this today so that we don't display ^M there; the same method can be used for the other NLFs. > It's not obvious to me that there are no counterintuitive > implications. Opposed to that, there are very few text files with > mixed line endings, and in many cases the user would actually like to > have them regularized (at a time of their choosing, so they can have a > commit with only whitespace changes, for example). We should be consistent: either there is a problem with mixed line endings and with Unicode NLFs that aren't treated as EOL at all, or there isn't. If the problem is insignificant, perhaps nothing should be changed at all. If the problem _is_ significant, we might as well solve it The Right Way, instead of applying more and more band-aid. Conversion of NLFs to a single LF is a kludge, same as emptying the kettle when you already have a procedure for preparing a kettle of boiled water starting with an empty one. You cannot do such conversion efficiently if you need to discover the EOL format for every line. Dispensing with the conversion altogether solves both problems in one go. 
What it adds doesn't seem so frightening to me, certainly less so than, say, adding bidi support ;-)

> > Surely, going again through the pain of inadvertent changes to user
> > files is a movie we don't want to be part of again.
>
> What pain of inadvertent changes? Sure, there will likely be bugs in
> the first draft of such code, what else is new? If you're talking
> specifically about the \201 regression, that's a completely different
> issue AFAICT -- that was about buffer-as-unibyte exposing the
> *internal* representation to Lisp, which was a "Mr. Foot, may I
> introduce to you Mr. Bullet" kind of idea from Day 1.

The internal representation is still exposed, so nothing's changed in that department.

> > > > Anything else _will_ introduce spurious modifications, and could
> > > > even corrupt some files, if the exact EOL sequence here or there
> > > > matters.
> > >
> > > No, it need not, any more than any ambiguous encoding need do so. Of
> > > course it will be fragile if (for example) Emacs crashes and you have
> > > to recover an autosave file.
> >
> > It will be fragile, and subtle bugs will tend to break quite a bit.
>
> I don't think so.

Well, then we will have to agree to disagree.

> I think you're hearing monsters in the closet.

And I think _you_ are hearing them. Or maybe you will show me such a large list of things that will become broken by keeping NLFs that I will change my mind.

* Re: EOL: unix/dos/mac
From: Stephen J. Turnbull @ 2013-03-26 18:12 UTC
To: Eli Zaretskii; +Cc: per.starback, monnier, emacs-devel

Eli Zaretskii writes:
> > From: "Stephen J. Turnbull" <stephen@xemacs.org>
> > Currently NLFs *are* displayed, if they don't match the default for
> > the buffer.
>
> No, they are displayed because nothing other than a single LF is
> treated like NLF by the Emacs internals.

Emacs doesn't get to define NLF; it's a Unicode concept. You'll get in trouble if you get confused about that. Those *are* NLFs, and in the "CR in *-unix buffer" form they *are* displayed as "^M"s, while in the "bare LF in *-dos buffer" form they *do* appear as stair-stepping lines. That does bother some users, including some who understand why it happens.

> > Because you have to fix pretty much everything
>
> I'm probably missing something important, because things I think will
> need fixing are nowhere near "pretty much everything". How about
> posting a long enough list of things to fix to convince me that
> "pretty much everything" is close to the truth?

"Everything" is of course an exaggeration.

At a minimum, you need to change delete and motion commands to handle the fact that EOL doesn't have a constant width in characters. Should users be able to move *into* a CRLF in -unix buffer? How about a -dos buffer? Should forward-char-command move into or *over* a CRLF? Does it matter what the EOL convention is for that buffer? What are we going to do for the occasional user who wants the less usual behavior for some reason?

You need to decide what (insert "\015") means in a -dos buffer, and you can be pretty sure that some users will be confused whichever you choose. Ditto (insert "\012") in a -mac buffer.
You may very well want those to mean something different from the commands that self-insert either or both of those characters. Until now, skip-chars-forward and regexps would find EOL if the string defining the target contained "\n". Is that going to continue to be true? How do you propose to find a bare LF -- are we going to make users use octal or hex escapes, or do we define new string syntax? > > Code will be massively uglified with tests for variable-length > > sequences instead of single characters > > The code is already replete with that, ever since Emacs started using > a multi-byte representation for characters in buffers. We have a set > of macros to fetch and examine multi-byte sequences, for that reason. > I see nothing hard or "ugly" here, sorry. Ah, but this is completely a different story. Those there are C macros, and not visible to Lisp programs, which know that a line break is represented by a single character, U+000A. That's no longer true for NLF, which by definition is composed of one or more *characters*, not code units. It's *Lisp* code that has to deal with this. > > Any code handling old-style hidden lines (with CR marking > > "invisible" lines) will have to be changed. > > First, we want to deprecate and remove this feature anyway (there's > already an implemented alternative). And second, we already handle > this today so that we don't display ^M there; the same method can be > used for the other NLFs. Sorry, that breaks immediately. That ^M is now an NLF, and you either treat it that way and not as an invisibility marker, or the meaning of the buffer changes when you switch that mode on and off in a very delicate way. I'm pretty sure it will corrupt the buffer unless you mark preexisting ^Ms as NLFs or convert them to something else. Which is what I'm proposing, of course. So you can fall back on deprecation. Has the feature actually been scheduled for deprecation and eventual removal? 
If not, you're looking at 5-10 years before it gets removed. > If the problem _is_ significant, we might as well solve it The > Right Way, instead of applying more and more band-aid. Conversion > of NLFs to a single LF is a kludge, Not to mention a close approximation to the right way to handle them according to the Unicode standard under many circumstances. (The truly correct way to handle them is to substitute LINE SEPARATOR, as I mentioned earlier.) > You cannot do such conversion efficiently if you need to discover > the EOL format for every line. Of course you can. You don't need to "discover" the EOL format; you know that an EOL is anything that matches "\r\n\|\r\|\n\|\205" as you move forward through the buffer. It's only a tiny bit more expensive than current conversion for -dos or -mac, and those are hardly prohibitive, especially when compared to I/O itself. > What it adds doesn't seem so frightening to me, certainly less so > than, say, adding bidi support ;-) Agreed, but irrelevant. bidi is a new feature necessary to support some languages currently used by millions of people, and the hairiness is mandated by UAX #9 -- an alternative implementation is not going to make conformance much easier. What we're talking about here are alternative implementations of a much smaller feature, NLF, and which one is going to be more efficient and more natural for Emacs. > The internal representation is still exposed, so nothing's changed in > that department. I know, and taking advantage of that exposure still falls in the class of "Kids, these stunts are performed by trained professionals. Don't try this at home!" Can you deny that? > > I think you're hearing monsters in the closet. > > And I think _you_ are hearing them. Well, yes, I am. But I've worked with implementations of coding systems in both XEmacs and Python, and I know that what I'm talking about will work and be efficient, and buffers and strings will continue to conform to the Emacs model. 
I know that what you're talking about will break some invariants for character motion and editing at line end, and that worries me. Proof? You're right, I have none. By the same token, you don't either. What worries me is that while I can prove (or perhaps disprove) my point with a small set of unit tests and benchmarks, you will have to hand that version of Emacs to real users for a year or three to find out if anybody really cares that the model broke. > Or maybe you will show me such a large list of things that will > become broken by keeping NLFs that I will change my mind. I can't; I gave you my list already, and I grant that it's not all that long and several of the potential problems can't be confirmed at this point. But if you decide to keep NLFs in the buffer rather than conforming to the tried and true Emacs/Mule model of converting them to a one-character representation, I predict you will find plenty of breakage over years, just as the \201 bug regressed multiple times over something like a decade. It's true that keeping NLFs in the buffer will bring Emacs's internal representation into closer conformance with the Unicode Standard, but both the benefits and the costs of that are unclear to me. Sure, it makes it conceptually straightforward to support Unicode handling of NLF in regexps, but you can already do that by simply avoiding EOL conversion when you need highly accurate Unicode conformance. On the other hand, when you are treating NLFs as NLFs, you will be breaking the 40-year-old Emacs model of a linebreak marked by a single character. I don't know what trouble that will cause, but there's no easy workaround for it that preserves those NLFs. ^ permalink raw reply [flat|nested] 27+ messages in thread
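A rough illustration of the conversion described in the message above, written in Python rather than Emacs Lisp or C. It is only a sketch under stated assumptions: the pattern is the alternation quoted in the message ("\r\n", "\r", "\n", and octal 205, i.e. U+0085), and the name normalize_nlf is hypothetical, not anything in Emacs or XEmacs.

```python
import re

# Same alternation as the pattern quoted in the message. CRLF comes first
# so that it is consumed as one unit instead of as a CR followed by an LF.
# "\x85" is U+0085 NEXT LINE (octal 205).
NLF = re.compile("\r\n|\r|\n|\x85")


def normalize_nlf(text: str) -> str:
    """Replace every newline function with a single LF (hypothetical helper)."""
    return NLF.sub("\n", text)


sample = "one\r\ntwo\rthree\nfour\x85five"
print(normalize_nlf(sample).split("\n"))
```

The scan is a single forward pass and never has to know a per-line or per-buffer EOL format in advance, which is the point being argued. The original sequences are lost, though, which is the trade-off the two sides disagree about.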
* Re: EOL: unix/dos/mac 2013-03-26 18:12 ` Stephen J. Turnbull @ 2013-03-26 18:44 ` Eli Zaretskii 2013-03-27 5:10 ` Stephen J. Turnbull 0 siblings, 1 reply; 27+ messages in thread From: Eli Zaretskii @ 2013-03-26 18:44 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: per.starback, monnier, emacs-devel > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Cc: per.starback@gmail.com, > monnier@iro.umontreal.ca, > emacs-devel@gnu.org > Date: Wed, 27 Mar 2013 03:12:11 +0900 > > Eli Zaretskii writes: > > > From: "Stephen J. Turnbull" <stephen@xemacs.org> > > > > Currently NLFs *are* displayed, if they don't match the default for > > > the buffer. > > > > No, they are displayed because nothing other than a single LF is > > treated like NLF by the Emacs internals. > > Emacs doesn't get to define NLF; it's a Unicode concept. Can we be less pedantic, please, just to have the water less muddy? OK, let me rephrase: they are displayed because nothing other than a single LF character is currently treated by Emacs as an end of line. An end of line is never displayed by Emacs or sent to the screen, not even on a TTY; it is acted upon by moving the display to the next line (a.k.a. "new-line function"). > Those *are* NLFs, and in > the "CR in *-unix buffer" form they *are* displayed as "^M"s, while in > the "bare LF in *-doc buffer" form they *do* appear as stair-stepping > lines. I guess you meant "-dos", not "-doc". Anyway, there are no stair-stepping lines in Emacs because of this, because Emacs never outputs the EOL sequences to the screen. That is why the -unix or -dos variants are meaningless in terminal-coding-system. > "Everything" is of course an exaggeration. At a minimum, you need to > change delete and motion commands to handle the fact that EOL doesn't > have a constant width in characters. Should users be able to move > *into* a CRLF in -unix buffer? How about a -dos buffer? No and no (and there won't be any -unix and -dos buffers under this mode of operation). 
> Should forward-char-command move into or *over* a CRLF? No. > Does it matter what the EOL convention is for that buffer? No. > What are we going to do for the occasional user who wants the less > usual behavior for some reason? What "less usual behavior"? > You need to decide what (insert "\015") means in a -dos buffer No decision required: it will insert a CR, like it does today. If that CR happens to precede a newline, it will become invisible when inserted. > Until now, skip-chars-forward and regexps would find EOL if the > string defining the target contained "\n". Is that going to > continue to be true? How do you propose to find a bare LF -- are we > going to make users use octal or hex escapes, or do we define new > string syntax? I see no serious problems with this, sorry to disappoint you. > Ah, but this is completely a different story. Those there are C > macros, and not visible to Lisp programs, which know that a line break > is represented by a single character, U+000A. That's no longer true > for NLF, which by definition is composed of one or more *characters*, > not code units. It's *Lisp* code that has to deal with this. Lisp code already needs to deal with similar complications, e.g. when it moves across invisible text or text covered by a 'display' property or overlay string. > > > Any code handling old-style hidden lines (with CR marking > > > "invisible" lines) will have to be changed. > > > > First, we want to deprecate and remove this feature anyway (there's > > already an implemented alternative). And second, we already handle > > this today so that we don't display ^M there; the same method can be > > used for the other NLFs. > > Sorry, that breaks immediately. That ^M is now an NLF, and you either > treat it that way and not as an invisibility marker, or the meaning of > the buffer changes when you switch that mode on and off in a very > delicate way. No, it doesn't break, like it doesn't today. 
When selective display is in effect, a buffer-local variable says that, so you can treat ^M accordingly. > So you can fall back on deprecation. Has the feature actually been > scheduled for deprecation and eventual removal? Yes, long ago. > > What it adds doesn't seem so frightening to me, certainly less so > > than, say, adding bidi support ;-) > > Agreed, but irrelevant. bidi is a new feature necessary to support > some languages currently used by millions of people, and the hairiness > is mandated by UAX #9 -- an alternative implementation is not going to > make conformance much easier. You are missing my point, which was about implications _on_Emacs_ of adding bidi support. UAX#9 cannot (and didn't) help making design decisions in that regard. > > The internal representation is still exposed, so nothing's changed in > > that department. > > I know, and taking advantage of that exposure still falls in the class > of "Kids, these stunts are performed by trained professionals. Don't > try this at home!" Can you deny that? No. But I'm saying that given that exposure, the abstraction _will_ leak, and when it does, users will be unhappy again. > I know that what you're talking about will break some invariants for > character motion and editing at line end, and that worries me. > Proof? You're right, I have none. You don't need a proof, because I agree. But we already have quite a few features that introduce peculiar effects into character motion, and they didn't cause any catastrophes. I don't see why this one is any different. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac 2013-03-26 18:44 ` Eli Zaretskii @ 2013-03-27 5:10 ` Stephen J. Turnbull 0 siblings, 0 replies; 27+ messages in thread From: Stephen J. Turnbull @ 2013-03-27 5:10 UTC (permalink / raw) To: Eli Zaretskii; +Cc: per.starback, monnier, emacs-devel Eli Zaretskii writes: > You don't need a proof, because I agree. But we already have quite a > few features that introduce peculiar effects into character motion, > and they didn't cause any catastrophes. I don't see why this one is > any different. If your standard is "catastrophes", then (a) no, this one is no different, and (b) I have no contribution to make, because the contribution I want to make requires concern with problems that are less than catastrophic. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac 2013-03-26 1:42 ` Stephen J. Turnbull 2013-03-26 6:28 ` Eli Zaretskii @ 2013-03-26 12:51 ` Stefan Monnier 2013-03-26 13:10 ` Eli Zaretskii 2013-03-26 16:16 ` Stephen J. Turnbull 2013-03-26 14:02 ` Alan Mackenzie 2 siblings, 2 replies; 27+ messages in thread From: Stefan Monnier @ 2013-03-26 12:51 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: Per Starbäck, emacs-devel > Unicode doesn't care, you know: it considers all ASCII line breaks and > terminators to be the same thing (NEW LINE FUNCTION). But when saving the file, which line ends would we use? For pre-existing line-ends, we could reproduce what was there before, but what about new lines? Stefan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac 2013-03-26 12:51 ` Stefan Monnier @ 2013-03-26 13:10 ` Eli Zaretskii 2013-03-26 17:16 ` Stefan Monnier 2013-03-26 16:16 ` Stephen J. Turnbull 1 sibling, 1 reply; 27+ messages in thread From: Eli Zaretskii @ 2013-03-26 13:10 UTC (permalink / raw) To: Stefan Monnier; +Cc: per.starback, stephen, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Tue, 26 Mar 2013 08:51:45 -0400 > Cc: Per Starbäck <per.starback@gmail.com>, > emacs-devel@gnu.org > > But when saving the file, which line ends would we use? > For pre-existing line-ends, we could reproduce what was there before, > but what about new lines? User preference and some heuristics, I guess, as always. E.g., if all the lines used the same NLF, use that for new lines; otherwise look at some user option for guidance. ^ permalink raw reply [flat|nested] 27+ messages in thread
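The heuristic suggested above (reuse the file's NLF for new lines when all existing lines agree, otherwise consult a user option) could be sketched roughly as follows. This is Python for illustration only; eol_for_new_lines and its user_default parameter are hypothetical names standing in for "some user option", not an existing Emacs interface.

```python
import re
from collections import Counter

# Newline functions considered here: CRLF, CR, LF and U+0085 NEXT LINE.
NLF = re.compile("\r\n|\r|\n|\x85")


def eol_for_new_lines(text: str, user_default: str = "\n") -> str:
    """Pick the line ending to use for lines added to TEXT.

    If every existing line break uses the same sequence, reuse it.
    Otherwise (mixed endings, or no line break at all) fall back to
    USER_DEFAULT, which stands in for a user preference.
    """
    kinds = Counter(NLF.findall(text))
    if len(kinds) == 1:
        return next(iter(kinds))
    return user_default


print(repr(eol_for_new_lines("a\r\nb\r\n")))        # consistent CRLF file
print(repr(eol_for_new_lines("a\nb\r\n", "\n")))    # mixed: user option wins
```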
* Re: EOL: unix/dos/mac 2013-03-26 13:10 ` Eli Zaretskii @ 2013-03-26 17:16 ` Stefan Monnier 2013-03-26 17:47 ` Eli Zaretskii 0 siblings, 1 reply; 27+ messages in thread From: Stefan Monnier @ 2013-03-26 17:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: per.starback, stephen, emacs-devel >> But when saving the file, which line ends would we use? >> For pre-existing line-ends, we could reproduce what was there before, >> but what about new lines? > User preference and some heuristics, I guess, as always. E.g., if all > the lines used the same NLF, use that for new lines; otherwise look at > some user option for guidance. So for files that use a consistent style, that means the same behavior as what we now have. The only difference is for mixed-style files, and AFAIK the only mixed-style files that occur often enough to care about are of the LF-vs-CRLF kind, where I think the most important thing is to make it clear that the extra CRs displayed are due to the presence of this mixed-style (so maybe we should check which style is more prominent and either highlight the few extra CRs or on the contrary hide the CRs and highlight the few missing CRs). Stefan ^ permalink raw reply [flat|nested] 27+ messages in thread
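The "check which style is more prominent" idea for LF-vs-CRLF files might look roughly like this. It is a Python sketch under assumptions: minority_line_ends is a hypothetical name, only LF and CRLF are considered, and a tie is arbitrarily resolved in favor of LF.

```python
import re

# Only the LF-vs-CRLF case is considered; CRLF is tried first so the LF
# inside a CRLF pair is not counted a second time.
LINE_END = re.compile("\r\n|\n")


def minority_line_ends(text: str):
    """Return (dominant ending, offsets of line ends that deviate from it)."""
    ends = [(m.start(), m.group()) for m in LINE_END.finditer(text)]
    crlf = sum(1 for _, end in ends if end == "\r\n")
    lf = len(ends) - crlf
    # Assumption: on a tie, treat LF as the dominant style.
    dominant = "\r\n" if crlf > lf else "\n"
    odd = [pos for pos, end in ends if end != dominant]
    return dominant, odd


# Mostly-LF text with one stray CRLF: the stray one is what would be highlighted.
print(minority_line_ends("a\nb\r\nc\n"))
```

The returned offsets are the places a display layer could highlight: the few extra CRs in a mostly-LF file, or the few line ends missing a CR in a mostly-CRLF one.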
* Re: EOL: unix/dos/mac 2013-03-26 17:16 ` Stefan Monnier @ 2013-03-26 17:47 ` Eli Zaretskii 2013-03-26 18:41 ` Stephen J. Turnbull 0 siblings, 1 reply; 27+ messages in thread From: Eli Zaretskii @ 2013-03-26 17:47 UTC (permalink / raw) To: Stefan Monnier; +Cc: per.starback, stephen, emacs-devel > From: Stefan Monnier <monnier@IRO.UMontreal.CA> > Cc: per.starback@gmail.com, stephen@xemacs.org, emacs-devel@gnu.org > Date: Tue, 26 Mar 2013 13:16:00 -0400 > > >> But when saving the file, which line ends would we use? > >> For pre-existing line-ends, we could reproduce what was there before, > >> but what about new lines? > > User preference and some heuristics, I guess, as always. E.g., if all > > the lines used the same NLF, use that for new lines; otherwise look at > > some user option for guidance. > > So for files that use a consistent style, that means same behavior as > what we now have. The suggestion was to support _all_ Unicode NLFs, which are more than the 3 EOL formats we support now. Other than that, yes, for consistent style the behavior visible to the user will be the same. Note that my take on this is that if we extend EOL format to all the Unicode NLFs, we should not convert them to newline and back on I/O, but rather keep them verbatim in the buffers and strings (Stephen disagrees). If we go that way, there will be another user-visible change: the character position could jump by more than one when you move into the next line. > The only difference is for mixed-style files, and > AFAIK the only mixed-style files that occur often enough to care are of > the LF-vs-CRLF kind, where I think the most important thing is to make > it clear that the extra CRs displayed are due to the presence of this > mixed-style (so maybe we should check which style is more prominent and > either highlight the few extra CRs or on the contrary hide the CRs and > highlight the few missing CRs). 
If we want to continue with a clear indication of mixed style, then perhaps no changes are needed at all, as we do that now. The only change in that case might be a mode-line indication of the mixed style, since the offending CR characters might not be visible in the displayed portion of the file. I rather thought the suggestion was to stop paying attention to what exactly is used as EOL, including if they are mixed-style. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac 2013-03-26 17:47 ` Eli Zaretskii @ 2013-03-26 18:41 ` Stephen J. Turnbull 0 siblings, 0 replies; 27+ messages in thread From: Stephen J. Turnbull @ 2013-03-26 18:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: per.starback, Stefan Monnier, emacs-devel Eli Zaretskii writes: > If we want to continue with a clear indication of mixed style, then > perhaps no changes are needed at all, as we do that now. The only > change in that case might be a mode-line indication of the mixed > style, since the offending CR characters might not be visible in the > displayed portion of the file. > > I rather thought the suggestion was to stop paying attention to what > exactly is used as EOL, including if they are mixed-style. That's what I have in mind. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac 2013-03-26 12:51 ` Stefan Monnier 2013-03-26 13:10 ` Eli Zaretskii @ 2013-03-26 16:16 ` Stephen J. Turnbull 1 sibling, 0 replies; 27+ messages in thread From: Stephen J. Turnbull @ 2013-03-26 16:16 UTC (permalink / raw) To: Stefan Monnier; +Cc: Per Starbäck, emacs-devel Stefan Monnier writes: > > Unicode doesn't care, you know: it considers all ASCII line breaks and > > terminators to be the same thing (NEW LINE FUNCTION). > > But when saving the file, which line ends would we use? > For pre-existing line-ends, we could reproduce what was there before, > but what about new lines? Basically, what Eli said. To remind you how flexible this is: The file coding system, including EOL convention, would be determined as it is currently: a specific argument to write-file or the binding of buffer-file-coding-system, in that order. The last would be determined as currently: user's explicit setting, various settings based on alists, and finally heuristic autodetection based on file contents and platform convention for new/empty files. We'd need an additional control variable: whether to automatically convert variant NLFs to the EOL convention for writing. Or perhaps this should be done on reading. And a command to do it at the user's convenience. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac 2013-03-26 1:42 ` Stephen J. Turnbull 2013-03-26 6:28 ` Eli Zaretskii 2013-03-26 12:51 ` Stefan Monnier @ 2013-03-26 14:02 ` Alan Mackenzie 2013-03-26 14:19 ` Eli Zaretskii 2013-03-26 18:34 ` Stephen J. Turnbull 2 siblings, 2 replies; 27+ messages in thread From: Alan Mackenzie @ 2013-03-26 14:02 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: Per Starbäck, Stefan Monnier, emacs-devel Hi, Stephen. On Tue, Mar 26, 2013 at 10:42:38AM +0900, Stephen J. Turnbull wrote: > Stefan Monnier writes: > > BTW, in this same area, it would be good to detect and indicate > > prominently "Unix with some CRLFs", also known as "mixed-line-ending", > > which is often misunderstood as "my Emacs fails to recognize my CRLF > > file". > Unicode doesn't care, you know: it considers all ASCII line breaks and > terminators to be the same thing (NEW LINE FUNCTION). I haven't read > that part of the standard in a long time, but IIRC, although many > people interpolate "according to platform", Unicode doesn't care about > that, it just says "all of these sequences when encountered in text > purporting to conform to this standard should be treated in the same > way." Emacsen should do the same. This is a little confusing to poor old me. ASCII doesn't care about line breaks either; only particular use cases care. If you write a script (whether bash, sed, ....) on a *nix system and it has CRLF line ends, it will fail (with an obscure error message) regardless of whether that script is nominally in UTF-8 or ASCII or whatever. In what sense does Unicode "not care"? > Steve ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac 2013-03-26 14:02 ` Alan Mackenzie @ 2013-03-26 14:19 ` Eli Zaretskii 2013-03-26 18:34 ` Stephen J. Turnbull 1 sibling, 0 replies; 27+ messages in thread From: Eli Zaretskii @ 2013-03-26 14:19 UTC (permalink / raw) To: Alan Mackenzie; +Cc: per.starback, stephen, monnier, emacs-devel > Date: Tue, 26 Mar 2013 14:02:47 +0000 > From: Alan Mackenzie <acm@muc.de> > Cc: Per Starbäck <per.starback@gmail.com>, > Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org > > This is a little confusing to poor old me. ASCII doesn't care about line > breaks either; only particular use cases care. If you write a script > (whether bash, sed, ....) on a *nix system and it has CRLF line ends, it > will fail (with an obscure error message) regardless of whether that > script is nominally in UTF-8 or ASCII or whatever. > > In what sense does Unicode "not care"? In the sense that the shell script with CR-LF EOLs should not have failed, if Bash supported Unicode line-breaking features. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac 2013-03-26 14:02 ` Alan Mackenzie 2013-03-26 14:19 ` Eli Zaretskii @ 2013-03-26 18:34 ` Stephen J. Turnbull 1 sibling, 0 replies; 27+ messages in thread From: Stephen J. Turnbull @ 2013-03-26 18:34 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Per Starbäck, Stefan Monnier, emacs-devel Alan Mackenzie writes: > This is a little confusing to poor old me. ASCII doesn't care about line > breaks either; only particular use cases care. True. ASCII is a coded character set. It does not have a way to represent an abstract line break in a single character; whatever you do, then, is outside of the ASCII standard. > If you write a script (whether bash, sed, ....) on a *nix system > and it has CRLF line ends, it will fail (with an obscure error > message) regardless of whether that script is nominally in UTF-8 or > ASCII or whatever. Python, at least, is not in your ellipsis. Not by default, and not on any supported platform. I wouldn't be surprised if Perl and Ruby have adopted "universal newlines", too. > In what sense does Unicode "not care"? In the sense that Unicode is more than a character set; it prescribes all kinds of algorithms for text processing as well. Here, section 5.8 of the Unicode Standard v6.2 prescribes that any of LF, CR, CRLF, and ISO 6429 NEXT LINE (U+0085) should be considered to be a single line (or paragraph) break in legacy text. It says nothing about how they should be represented internally, though. Unusually for the Unicode Standard, it allows you to guess what the user wants, and in some cases even alter the input stream before outputting it. "Legacy" text means it uses ASCII (or C1) control characters to represent line and/or paragraph breaks, rather than the characters prescribed by Unicode (U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR). ^ permalink raw reply [flat|nested] 27+ messages in thread
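For readers unfamiliar with the "universal newlines" behavior mentioned above, here is a small Python sketch. It uses only standard-library calls (io.TextIOWrapper with newline=None, and str.splitlines); the sample data is made up. Note that universal-newlines reading translates only LF, CR and CRLF, whereas str.splitlines also breaks on U+0085, U+2028 and U+2029.

```python
import io

# Bytes with three different line-end conventions in one stream.
raw = b"one\r\ntwo\rthree\nfour"

# newline=None enables universal newlines on reading: "\r\n" and "\r"
# are both translated to "\n" before the text reaches the caller.
reader = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None)
translated = reader.read()
print(translated.split("\n"))

# str.splitlines recognizes a wider set of boundaries, including
# U+0085 NEXT LINE and U+2028 LINE SEPARATOR.
wide = "alpha\x85beta\u2028gamma"
print(wide.splitlines())
```

In both cases the caller sees logical lines without having to care which sequence ended each one.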
* Re: EOL: unix/dos/mac 2013-03-25 19:17 ` Stefan Monnier 2013-03-26 1:42 ` Stephen J. Turnbull @ 2013-03-26 7:53 ` Ulrich Mueller 2013-03-26 12:53 ` Stefan Monnier 1 sibling, 1 reply; 27+ messages in thread From: Ulrich Mueller @ 2013-03-26 7:53 UTC (permalink / raw) To: Stefan Monnier; +Cc: Per Starbäck, emacs-devel >>>>> On Mon, 25 Mar 2013, Stefan Monnier wrote: > So I'm OK with "updating" the indicators, tho I'm not sure what we > should use instead. Currently, the indicators coincide with the naming of coding systems, like utf-8-{unix,dos,mac}. Wouldn't it be confusing to use different notations? Or are the coding systems to be changed too? > To replace "Mac", maybe we could use "MacOS9", which is longish but > hopefully such files are rare nowadays. You could use "OS9". There are both OS-9 (by Microware) and Mac OS 9, but they agree on using CR as a line ending. Ulrich ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: EOL: unix/dos/mac 2013-03-26 7:53 ` Ulrich Mueller @ 2013-03-26 12:53 ` Stefan Monnier 0 siblings, 0 replies; 27+ messages in thread From: Stefan Monnier @ 2013-03-26 12:53 UTC (permalink / raw) To: Ulrich Mueller; +Cc: Per Starbäck, emacs-devel > You could use "OS9". There are both OS-9 (by Microware) and Mac OS 9, > but they agree on using CR as a line ending. Ah, I didn't know OS-9 also used CR as line-ending, so indeed "OS9" sounds like an attractive replacement for "Mac". Stefan ^ permalink raw reply [flat|nested] 27+ messages in thread