* Understanding how to specify UTF-8
From: Will Parsons @ 2017-04-07 23:43 UTC
To: help-gnu-emacs

I want to always use Unicode/UTF-8 unless otherwise specified. I've noticed
that I've attempted to do this in my .emacs file in two separate ways on two
separate platforms:

1) (setq-default buffer-file-coding-system 'utf-8-unix)

2) (set-language-environment "UTF-8")

Both seem to work, but I'm wondering if there are subtle differences between
the two that I should be aware of.

--
Will

* Re: Understanding how to specify UTF-8
From: Eli Zaretskii @ 2017-04-08 7:29 UTC
To: help-gnu-emacs

> From: Will Parsons <wbp@nodomain.invalid>
> Date: 7 Apr 2017 23:43:55 GMT
>
> I want to always use Unicode/UTF-8 unless otherwise specified.

This doesn't say exactly what you want to happen. The above basically
says "I want to use UTF-8 except when I don't", and doesn't say a word
about those "I don't" cases. So please elaborate, to make the responses
more accurate and correct. For example, what about files you edit that
were encoded in something other than UTF-8 before? What about
responding to email encoded in something other than UTF-8? And so on.

> I've noticed that I've attempted to do this in my .emacs file in two
> separate ways on two separate platforms:
>
> 1) (setq-default buffer-file-coding-system 'utf-8-unix)
>
> 2) (set-language-environment "UTF-8")
>
> Both seem to work, but I'm wondering if there are subtle differences between
> the two that I should be aware of.

The second one is better, as it leaves Emacs more leeway where UTF-8
might not be appropriate. But it's difficult to know what to say
without the additional information.

* Re: Understanding how to specify UTF-8
From: B. T. Raven @ 2017-04-13 5:09 UTC
To: help-gnu-emacs

Hi Will. I decided to respond because of this observation in the latest
posting:
"They used to say emacs and vi are religions; these days they are
starting to seem like latin."

On 4/7/2017 18:43, Will Parsons wrote:
> I want to always use Unicode/UTF-8 unless otherwise specified. I've noticed
> that I've attempted to do this in my .emacs file in two separate ways on two
> separate platforms:
>
> 1) (setq-default buffer-file-coding-system 'utf-8-unix)
>
> 2) (set-language-environment "UTF-8")
>
> Both seem to work, but I'm wondering if there are subtle differences between
> the two that I should be aware of.

I can't help with any subtleties but can only recommend that you add this
cookie to the beginning of the buffer:

;; -*- coding: utf-8 -*-

I think it may be enough to save and reload the file into a new buffer
before adding exotic characters.

I also have these lines in my .emacs:

(set-locale-environment "utf-8")
(set-language-environment 'utf-8)
(set-default-coding-systems 'utf-8)
(setq file-name-coding-system 'utf-8)
(setq buffer-file-coding-system 'utf-8)
(setq coding-system-for-write 'utf-8)
(set-keyboard-coding-system 'utf-8)
(set-terminal-coding-system 'utf-8)
(prefer-coding-system 'utf-8)
;; (set-buffer-process-coding-system 'utf-8 'utf-8)
(modify-coding-system-alist 'process
 "[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)

The line commented out caused a problem but I don't remember what it
was. My os w64 vers. 7

Ed

* Re: Understanding how to specify UTF-8
From: Eli Zaretskii @ 2017-04-13 7:18 UTC
To: help-gnu-emacs

[Resending with the correct Subject.]

> From: "B. T. Raven" <btraven@nihilo.net>
> Date: Thu, 13 Apr 2017 00:09:51 -0500
>
> I also have these lines in my .emacs:
>
> (set-locale-environment "utf-8")
> (set-language-environment 'utf-8)
> (set-default-coding-systems 'utf-8)
> (setq file-name-coding-system 'utf-8)
> (setq buffer-file-coding-system 'utf-8)
> (setq coding-system-for-write 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (prefer-coding-system 'utf-8)
> ;; (set-buffer-process-coding-system 'utf-8 'utf-8)
> (modify-coding-system-alist 'process
> "[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)
>
>
> The line commented out caused a problem but I don't remember what it
> was. My os w64 vers. 7

Some of the above are not recommended, and some are downright
dangerous (a.k.a. "shooting yourself in the foot"). Especially on
MS-Windows, UTF-8 should be used with extra care, because Windows only
partially supports this encoding in its APIs.

Specifically:

> (set-locale-environment "utf-8")

Don't do this on Windows, as Windows locales cannot use UTF-8 as their
encoding.

> (set-language-environment 'utf-8)
> (set-default-coding-systems 'utf-8)

Redundant as long as you have the prefer-coding-system call below.

> (setq file-name-coding-system 'utf-8)

This is a no-op: Emacs on Windows ignores the value of this variable,
except if you are on Windows 9X, and file names cannot be encoded in
UTF-8 on Windows anyway. Starting with Emacs 24.4, Emacs on Windows
uses Unicode APIs to deal with file names, so it supports non-ASCII
file names with all Unicode characters, and you don't need to do
anything to get this support.

> (setq buffer-file-coding-system 'utf-8)

Dangerous. Also redundant with prefer-coding-system below.

> (setq coding-system-for-write 'utf-8)

This is dangerous: it will produce subtle issues with some commands,
notably when invoking subprocesses with non-ASCII strings in
command-line arguments. This variable exists so that Lisp programs
could force specific encoding where appropriate, so leave it to that
and don't globally set it.

> (set-keyboard-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)

These are wrong, and will get in the way when you work in -nw
sessions. Emacs on MS-Windows doesn't fully support UTF-8 encoding of
keyboard input and console output, even if you tweak your system's
codepage to be 65001 (did you?).

> (prefer-coding-system 'utf-8)

This is the only setting that you should have if you want to use UTF-8
wherever possible and reasonable.

> ;; (set-buffer-process-coding-system 'utf-8 'utf-8)
> (modify-coding-system-alist 'process
> "[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)

This is wrong: Emacs on MS-Windows doesn't support UTF-8 encoding of
program command-line arguments for subprocesses, and most Windows
programs will NOT talk UTF-8 in their standard streams.
prefer-coding-system should take care of those situations where this
is possible/actually happens; the rest should be left alone, or you
will have subtle problems with non-ASCII I/O vis-a-vis subprocesses.

HTH

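The advice in the message above reduces to a very small init file. What follows is a minimal sketch of that reading, not a definitive configuration: only the `prefer-coding-system` call comes from the thread. The helper `my/write-region-as-utf-8` is a hypothetical name, added only to illustrate "leave `coding-system-for-write` to Lisp programs" by binding it locally rather than setting it globally.

```elisp
;; Prefer UTF-8 whenever Emacs has to guess or choose a coding system.
;; Detection of files that are in some other encoding stays intact.
(prefer-coding-system 'utf-8)

;; Hypothetical helper (not from the thread): when Lisp code really must
;; force an encoding, bind `coding-system-for-write' with `let' so the
;; override lasts only for this one write.
(defun my/write-region-as-utf-8 (start end file)
  "Write the text between START and END to FILE, encoded as UTF-8."
  (let ((coding-system-for-write 'utf-8))
    (write-region start end file)))
```

Because the `let` binding is dynamic, it affects the `write-region` call inside it and nothing else, which avoids the subprocess problems described for a global `setq`.
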
* Re: Understanding how to specify UTF-8
From: hector @ 2017-04-13 9:42 UTC
To: help-gnu-emacs

@Eli:

Thank you. Everything works better when you know what you're doing.

* Re: Understanding how to specify UTF-8
From: Will Parsons @ 2017-04-14 23:37 UTC
To: help-gnu-emacs

B. T. Raven wrote:
> Hi Will. I decided to respond because of this observation in the latest
> posting:
> "They used to say emacs and vi are religions; these days they are
> starting to seem like latin."

Not completely - "Emacs" should be spelt "Emax" first ;)
(And the plural, I suppose should be "emaces" rather than "emacsen".)

> On 4/7/2017 18:43, Will Parsons wrote:
>> I want to always use Unicode/UTF-8 unless otherwise specified. I've noticed
>> that I've attempted to do this in my .emacs file in two separate ways on two
>> separate platforms:
>>
>> 1) (setq-default buffer-file-coding-system 'utf-8-unix)
>>
>> 2) (set-language-environment "UTF-8")
>>
>> Both seem to work, but I'm wondering if there are subtle differences between
>> the two that I should be aware of.
>
> I can't help with any subtleties but can only recommend that you add this
> cookie to the beginning of the buffer:
>
> ;; -*- coding: utf-8 -*-

Yes, I've employed that too. (Incidentally, I've been programming a lot in
Ruby for some years now, and I was surprised to find that after inserting a
copyright symbol (©) into one of my Ruby source files, Emacs ruby-mode
inserted a line containing '# coding: utf-8' at the top when the file was
saved.)

> I think it may be enough to save and reload the file into a new buffer
> before adding exotic characters.
> I also have these lines in my .emacs:
>
> (set-locale-environment "utf-8")
> (set-language-environment 'utf-8)
> (set-default-coding-systems 'utf-8)
> (setq file-name-coding-system 'utf-8)
> (setq buffer-file-coding-system 'utf-8)
> (setq coding-system-for-write 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (prefer-coding-system 'utf-8)
> ;; (set-buffer-process-coding-system 'utf-8 'utf-8)
> (modify-coding-system-alist 'process
> "[cC][mM][dD][pP][rR][oO][xX][yY]" 'utf-8-dos)
>
> The line commented out caused a problem but I don't remember what it
> was. My os w64 vers. 7

Wow. I should think that should cover all possibilities. I prefer to be a
bit more minimalist than that though...

Anyway, thanks - Vale Edwarde!

--
Will

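The ruby-mode behaviour mentioned above (a `# coding: utf-8` line appearing on save) is controlled by a user option in ruby-mode. A small sketch, assuming a reasonably recent ruby-mode that provides `ruby-insert-encoding-magic-comment`. Whether to turn it off is a matter of taste, and the thread itself does not suggest doing so.

```elisp
;; Assumption: the installed ruby-mode defines this option (it is part
;; of the ruby-mode.el bundled with recent Emacs releases).
;; A nil value stops ruby-mode from adding the encoding comment when a
;; buffer containing non-ASCII characters is saved.
(with-eval-after-load 'ruby-mode
  (setq ruby-insert-encoding-magic-comment nil))
```
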
* Re: Understanding how to specify UTF-8
From: Jason Rumney @ 2017-04-21 9:28 UTC
To: help-gnu-emacs

On Saturday, 8 April 2017 07:43:58 UTC+8, Will Parsons wrote:
> I want to always use Unicode/UTF-8 unless otherwise specified. I've noticed
> that I've attempted to do this in my .emacs file in two separate ways on two
> separate platforms:
>
> 1) (setq-default buffer-file-coding-system 'utf-8-unix)
>
> 2) (set-language-environment "UTF-8")
>
> Both seem to work, but I'm wondering if there are subtle differences between
> the two that I should be aware of.

The first only sets the default coding system for files.

The second sets it for everything, including system clipboard, file
names, process I/O ...

On modern GNU/Linux, Mac or other POSIX-based OSes, you probably want
everything in UTF-8, so the latter is correct.

On Windows, the system itself does not support UTF-8 fully, so the
former is safer. For clipboard and file names on Windows, the latest
versions of Emacs will use Unicode regardless of what you specify for
the coding system. It is really only process I/O that is the problem:
Cygwin and Mingw apps may support UTF-8 I/O, but native Windows apps
(including the cmd.exe shell) can have severe difficulties with it.

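The per-platform distinction drawn in the message above can be written as one conditional in a shared init file. This is only a sketch of that suggestion. The `system-type` test is an addition for illustration, and Eli Zaretskii's message earlier in the thread recommends `(prefer-coding-system 'utf-8)` alone as the simpler choice.

```elisp
;; `system-type' is the symbol `windows-nt' in native MS-Windows builds
;; of Emacs (Cygwin builds report `cygwin' and take the second branch).
(if (eq system-type 'windows-nt)
    ;; Windows: only make UTF-8 with Unix line endings the default for
    ;; files, and leave clipboard, file names and process I/O alone.
    (setq-default buffer-file-coding-system 'utf-8-unix)
  ;; GNU/Linux, macOS and other POSIX systems: UTF-8 for everything.
  (set-language-environment "UTF-8"))
```
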
* Re: Understanding how to specify UTF-8
From: Eli Zaretskii @ 2017-04-21 10:54 UTC
To: help-gnu-emacs

> Date: Fri, 21 Apr 2017 02:28:45 -0700 (PDT)
> From: Jason Rumney <jasonrumney@gmail.com>
>
> On Windows, the system itself does not support UTF-8 fully, so the
> former is safer. For clipboard and file names on Windows, the latest
> versions of Emacs will use Unicode regardless of what you specify for
> the coding system, it is really only process I/O that is the problem -
> Cygwin and Mingw apps may support UTF-8 I/O, but native Windows apps
> (including the cmd.exe shell) can have severe difficulties with it.

MinGW apps are native apps, so they don't support UTF-8. I think you
meant MSYS, not MinGW (and then only MSYS2 apps support UTF-8).

* Re: Understanding how to specify UTF-8
From: Will Parsons @ 2017-04-21 17:36 UTC
To: help-gnu-emacs

Jason Rumney wrote:
> On Saturday, 8 April 2017 07:43:58 UTC+8, Will Parsons wrote:
>> I want to always use Unicode/UTF-8 unless otherwise specified. I've noticed
>> that I've attempted to do this in my .emacs file in two separate ways on two
>> separate platforms:
>>
>> 1) (setq-default buffer-file-coding-system 'utf-8-unix)
>>
>> 2) (set-language-environment "UTF-8")
>>
>> Both seem to work, but I'm wondering if there are subtle differences between
>> the two that I should be aware of.
>
> The first only sets the default coding system for files.
>
> The second sets it for everything, including system clipboard, file
> names, process I/O ...
>
> On modern GNU/Linux, Mac or other POSIX-based OSes, you probably want
> everything in UTF-8, so the latter is correct.
>
> On Windows, the system itself does not support UTF-8 fully, so the
> former is safer. For clipboard and file names on Windows, the latest
> versions of Emacs will use Unicode regardless of what you specify for
> the coding system, it is really only process I/O that is the problem -
> Cygwin and Mingw apps may support UTF-8 I/O, but native Windows apps
> (including the cmd.exe shell) can have severe difficulties with it.

Thank you for this detailed answer. Interestingly enough, I have them
reversed in my Unix vs Windows configurations.

--
Will

* Understanding cross version problem
From: Francis Belliveau @ 2017-05-29 15:16 UTC
To: help-gnu-emacs

I have encountered something that does not make sense to me.

I am normally running version 23.1 but my OS command line binds to 22.1

I have the following line in my .emacs file

(if (boundp tool-bar-mode) (tool-bar-mode -1))

I put that there to eliminate an error from 22.1 about the missing variable.
However, when I -debug-init I am still being told:
"void-variable tool-bar-mode"
I thought that is what "boundp" was checking?

What have I missed?

Fran

* RE: Understanding cross version problem
From: Drew Adams @ 2017-05-29 16:38 UTC
To: Francis Belliveau, help-gnu-emacs

> (if (boundp tool-bar-mode) (tool-bar-mode -1))

Change (boundp tool-bar-mode) to (boundp 'tool-bar-mode).

> I put that there to eliminate an error from 22.1 about the missing variable.
> However, when I -debug-init I am still being told:
> "void-variable tool-bar-mode"
> I thought that is what "boundp" was checking?
>
> What have I missed?

See above.

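The reason the quote matters: `boundp` is an ordinary function, so Emacs evaluates its argument before calling it. Written without the quote, `tool-bar-mode` is looked up as a variable first, and that lookup is what signals `void-variable`. The sketch below shows the corrected line, plus an alternative using `fboundp`. The alternative is an addition, not something proposed in the thread.

```elisp
;; Quoted: `boundp' receives the symbol itself and can safely report
;; whether a variable of that name exists.
(if (boundp 'tool-bar-mode) (tool-bar-mode -1))

;; Alternative sketch: the body calls `tool-bar-mode' as a function,
;; so testing the function binding is arguably the more direct guard.
(when (fboundp 'tool-bar-mode)
  (tool-bar-mode -1))
```

Either form is skipped without error on a build where the symbol has no binding of the tested kind.
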
* Re: Understanding how to specify UTF-8
From: Stefan Monnier @ 2017-04-21 18:30 UTC
To: help-gnu-emacs

> I want to always use Unicode/UTF-8 unless otherwise specified.

If your locale is using utf-8 (which it should nowadays in most cases
under GNU/Linux, especially if you "want to always use Unicode/UTF-8"),
then Emacs should already do that automatically.

        Stefan

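One way to check whether Emacs has already picked up UTF-8 from the locale is to inspect a few standard variables before adding anything to the init file. A small sketch. The values printed depend entirely on the local system, so no expected output is shown.

```elisp
;; Evaluate with M-: or in the *scratch* buffer, then read *Messages*.
(message "Language environment: %s" current-language-environment)
(message "Locale coding system: %s" locale-coding-system)
(message "Default coding system for new files: %s"
         (default-value 'buffer-file-coding-system))
```
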