* bug#27505: LC_CTYPE affects tutorial language
@ 2017-06-27 14:48 ` Leonard Lausen
  2017-06-27 15:05 ` Eli Zaretskii
  ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Leonard Lausen @ 2017-06-27 14:48 UTC (permalink / raw)
To: 27505

Dear all,

as far as I know the environment variable LC_CTYPE applies to
classification and conversion of characters, and to multibyte and wide
characters. So setting it should not influence the interface language,
correct?

However, with the following locale:

LANG=en_US.UTF-8
LC_CTYPE=zh_CN.UTF-8
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

I find that the emacs tutorial (C-h t) is displayed in Chinese.

Is this expected behavior or a bug?

This may or may not be related to bug#27312 where I reported that I
can't activate fcitx even though env is set up correctly.

(I.e. the following is the first line of the displayed tutorial:
Emacs 快速指南.(查看版权声明请至本文末尾))

>
> In GNU Emacs 25.2.1 (x86_64-pc-linux-gnu, GTK+ Version 3.22.15)
> of 2017-06-10 built on leonard-xps13
> Windowing system distributor 'The X.Org Foundation', version 11.0.11903000
> Configured using:
> 'configure --prefix=/usr --build=x86_64-pc-linux-gnu
> --host=x86_64-pc-linux-gnu --mandir=/usr/share/man
> --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc
> --localstatedir=/var/lib --disable-dependency-tracking
> --disable-silent-rules --docdir=/usr/share/doc/emacs-25.2
> --htmldir=/usr/share/doc/emacs-25.2/html --libdir=/usr/lib64
> --program-suffix=-emacs-25 --infodir=/usr/share/info/emacs-25
> --localstatedir=/var
> --enable-locallisppath=/etc/emacs:/usr/share/emacs/site-lisp
> --with-gameuser=:gamestat --without-compress-install
> --with-file-notification=inotify --enable-acl --with-dbus
> --with-modules --with-gpm --without-hesiod --without-kerberos
> --without-kerberos5 --with-xml2 --without-selinux --with-gnutls
> --without-wide-int --with-zlib --with-sound=alsa --with-x --without-ns
> --with-gconf --with-gsettings --without-toolkit-scroll-bars --with-gif
> --with-jpeg --with-png --with-rsvg --with-tiff --with-xpm
> --with-imagemagick --with-xft --without-cairo --with-libotf
> --with-m17n-flt --with-x-toolkit=gtk3 --without-xwidgets
> GENTOO_PACKAGE=app-editors/emacs-25.2 'CFLAGS=-march=native
> -mtune=native -O2 -pipe' CPPFLAGS= 'LDFLAGS=-Wl,-O1 -Wl,--as-needed''
>
> Configured features:
> XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GCONF GSETTINGS
> NOTIFY ACL GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB GTK3 X11
> MODULES
>
> Important settings:
> value of $LC_COLLATE: C
> value of $LC_CTYPE: zh_CN.UTF-8
> value of $LANG: en_US.UTF-8
> value of $XMODIFIERS: @im=fcitx
> locale-coding-system: utf-8-unix
>
> Major mode: Lisp Interaction
>
> Minor modes in effect:
> tooltip-mode: t
> global-eldoc-mode: t
> electric-indent-mode: t
> mouse-wheel-mode: t
> tool-bar-mode: t
> menu-bar-mode: t
> file-name-shadow-mode: t
> global-font-lock-mode: t
> font-lock-mode: t
> blink-cursor-mode: t
> auto-composition-mode: t
> auto-encryption-mode: t
> auto-compression-mode: t
> line-number-mode: t
> transient-mark-mode: t
>
> Recent messages:
> For information about GNU Emacs and the GNU system, type C-h C-a.
> Making completion list... [2 times]
> delete-backward-char: Text is read-only [3 times]
> Making completion list...
>
> Load-path shadows:
> None found.
>
> Features:
> (shadow sort mail-extr emacsbug message dired format-spec rfc822 mml
> mml-sec password-cache epg epg-config gnus-util mm-decode mm-bodies
> mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail
> rfc2047 rfc2045 ietf-drums mm-util help-fns help-mode easymenu
> cl-loaddefs pcase cl-lib mail-prsvr mail-utils time-date mule-util
> china-util tooltip eldoc electric uniquify ediff-hook vc-hooks
> lisp-float-type mwheel x-win term/common-win x-dnd tool-bar dnd fontset
> image regexp-opt fringe tabulated-list newcomment elisp-mode lisp-mode
> prog-mode register page menu-bar rfn-eshadow timer select scroll-bar
> mouse jit-lock font-lock syntax facemenu font-core frame cl-generic cham
> georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
> korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
> european ethiopic indian cyrillic chinese charscript case-table epa-hook
> jka-cmpr-hook help simple abbrev minibuffer cl-preloaded nadvice
> loaddefs button faces cus-face macroexp files text-properties overlay
> sha1 md5 base64 format env code-pages mule custom widget
> hashtable-print-readable backquote dbusbind inotify dynamic-setting
> system-font-setting font-render-setting move-toolbar gtk x-toolkit x
> multi-tty make-network-process emacs)
>
> Memory information:
> ((conses 16 86605 6233)
> (symbols 48 19787 0)
> (miscs 40 46 96)
> (strings 32 14398 4574)
> (string-bytes 1 414247)
> (vectors 16 12192)
> (vector-slots 8 484142 16017)
> (floats 8 167 9)
> (intervals 56 279 0)
> (buffers 976 19)
> (heap 1024 16015 1078)

^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#27505: LC_CTYPE affects tutorial language
From: Eli Zaretskii @ 2017-06-27 15:05 UTC
To: Leonard Lausen; +Cc: 27505

> From: Leonard Lausen <leonard@lausen.nl>
> Date: Tue, 27 Jun 2017 23:48:41 +0900
>
> as far as I know the environment variable LC_CTYPE applies to
> classification and conversion of characters, and to multibyte and wide
> characters. So setting it should not influence the interface language,
> correct?
>
> However, with the following locale:
> LANG=en_US.UTF-8
> LC_CTYPE=zh_CN.UTF-8
> LC_NUMERIC="en_US.UTF-8"
> LC_TIME="en_US.UTF-8"
> LC_COLLATE=C
> LC_MONETARY="en_US.UTF-8"
> LC_MESSAGES="en_US.UTF-8"
> LC_PAPER="en_US.UTF-8"
> LC_NAME="en_US.UTF-8"
> LC_ADDRESS="en_US.UTF-8"
> LC_TELEPHONE="en_US.UTF-8"
> LC_MEASUREMENT="en_US.UTF-8"
> LC_IDENTIFICATION="en_US.UTF-8"
> LC_ALL=
>
> I find that the emacs tutorial (C-h t) is displayed in Chinese.
>
> Is this expected behavior or a bug?

It's the intended behavior: LC_CTYPE affects the language environment
which Emacs sets up by default. From the Emacs manual:

     Some operating systems let you specify the character-set locale you
  are using by setting the locale environment variables ‘LC_ALL’,
  ‘LC_CTYPE’, or ‘LANG’. (If more than one of these is set, the first
  one that is nonempty specifies your locale for this purpose.) During
  startup, Emacs looks up your character-set locale’s name in the system
  locale alias table, matches its canonical name against entries in the
  value of the variables ‘locale-charset-language-names’ and
  ‘locale-language-names’ (the former overrides the latter), and selects
  the corresponding language environment if a match is found. It also
  adjusts the display table and terminal coding system, the locale
  coding system, the preferred coding system as needed for the locale,
  and--last but not least--the way Emacs decodes non-ASCII characters
  sent by your keyboard.

And the language environment includes a setting for the default
tutorial.
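Editorial illustration: the precedence rule in the manual excerpt above (the first nonempty of LC_ALL, LC_CTYPE, LANG names the character-set locale) can be sketched roughly as follows. The helper name `charset_locale` is made up for this note and is not an Emacs function; the sketch models only the lookup order, not the later matching against `locale-language-names`. The environment dictionary lists only the variables that appear to be set explicitly in the original report.

```python
def charset_locale(env):
    """Return the first nonempty value among LC_ALL, LC_CTYPE, LANG.

    Hypothetical helper: it mirrors only the lookup order described in
    the quoted manual text, not Emacs's actual startup code.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:  # an empty string counts as "not set"
            return value
    return None


# Variables set explicitly in the report (LC_ALL is present but empty).
reported_env = {
    "LANG": "en_US.UTF-8",
    "LC_CTYPE": "zh_CN.UTF-8",
    "LC_COLLATE": "C",
    "LC_ALL": "",
}

print(charset_locale(reported_env))
```

Under this rule the reporter's environment resolves to the Chinese locale, because LC_CTYPE is consulted before LANG; that would be consistent with the Chinese tutorial described in the report.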
* bug#27505: LC_CTYPE affects tutorial language
From: Andreas Schwab @ 2017-06-27 15:13 UTC
To: Leonard Lausen; +Cc: 27505

On Jun 27 2017, Leonard Lausen <leonard@lausen.nl> wrote:

> as far as I know the environment variable LC_CTYPE applies to
> classification and conversion of characters, and to multibyte and wide
> characters. So setting it should not influence the interface language,
> correct?

current-language-environment is set from LC_CTYPE, which also controls
the tutorial language.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
From: Leonard Lausen @ 2017-08-05 1:54 UTC
To: 27505

Please reopen this bug. Unfortunately my previous reply was only sent to
Andreas, but not to the bug list. I am attaching it below. A short
summary is that emacs is assuming the language I occasionally need to
input is also the language I want to read by default, which is a wrong
assumption. Note that it's not possible to input Chinese characters in
emacs without setting LC_CTYPE to zh_CN.

Thanks Andreas and Eli for the prompt reply.

In that case though I believe the intended emacs behavior does not make
sense. That I need to set LC_CTYPE=zh_CN.UTF-8 just to make it
possible to use system input methods for Chinese characters
doesn't mean I want to actually use a Chinese language interface.

Or concretely, I am learning Chinese and am comfortable typing it or
having daily conversations, however I don't feel comfortable reading the
emacs manual in Chinese. For my language learning I also tend to keep
some notes in Chinese which I would like to edit with emacs.

Shouldn't there be a way to allow people to input Chinese (or other
non-European languages) without affecting the language environment? The
current behavior seems to discriminate against language learners.

What do you think? Thanks!

Best regards
Leonard
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
From: npostavs @ 2017-08-05 2:06 UTC
To: Leonard Lausen; +Cc: 27505

reopen 27505
tags 27505 - notabug
quit

Leonard Lausen <leonard@lausen.nl> writes:

> Please reopen this bug. Unfortunately my previous reply was only sent to
> Andreas, but not to the bug list. I am attaching it below. A short
> summary is that emacs is assuming the language I occasionally need to
> input is also the language I want to read by default, which is a wrong
> assumption. Note that it's not possible to input Chinese characters in
> emacs without setting LC_CTYPE to zh_CN.

Does setting LC_ALL=en_US.UTF-8 and LC_CTYPE=zh_CN.UTF-8 work?

     Some operating systems let you specify the character-set locale you
  are using by setting the locale environment variables ‘LC_ALL’,
  ‘LC_CTYPE’, or ‘LANG’. (If more than one of these is set, the first
  one that is nonempty specifies your locale for this purpose.)

Or should LANG take precedence over LC_CTYPE perhaps?
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
From: Leonard Lausen @ 2017-08-05 5:59 UTC
To: npostavs; +Cc: 27505

> Does setting LC_ALL=en_US.UTF-8 and LC_CTYPE=zh_CN.UTF-8 work?

Due to bug 27312 unfortunately I can't test if setting
LC_ALL=en_US.UTF-8 and LC_CTYPE=zh_CN.UTF-8 works (i.e. still allows
using the X input method). But I would expect it not to work, as LC_ALL
is supposed to override LC_CTYPE (see also below). Arguably #10867
should be fixed directly and the X input method should work
independently of the setting of LC_CTYPE.

> Or should LANG take precedence over LC_CTYPE perhaps?

Using LANG to decide the language of the tutorial should also work. But
would changing the precedence order to set current-language-environment
based on LANG not interfere with bug #10867? I.e. does emacs directly
check LC_CTYPE to decide if it supports using the X input method
(#10867), or does it check current-language-environment? In the latter
case simply changing the precedence order wouldn't fix the problem, as
it would still require me to have current-language-environment set to
Chinese just to input Chinese characters.
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
From: Eli Zaretskii @ 2017-08-05 7:10 UTC
To: npostavs; +Cc: leonard, 27505

> From: npostavs@users.sourceforge.net
> Date: Fri, 04 Aug 2017 22:06:30 -0400
> Cc: 27505@debbugs.gnu.org
>
> Or should LANG take precedence over LC_CTYPE perhaps?

That'd go against the POSIX semantics of these variables, so we
shouldn't do that, because it might not be what is expected by users
who set both LANG and other LC_* variables.

As I wrote previously, I don't really understand the exact problem we
are asked to solve here. I don't think we should be discussing
solutions before we understand the actual problem. Right now, I
believe that Emacs already provides features to resolve any such
problems.
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
From: Eli Zaretskii @ 2017-08-05 7:06 UTC
To: Leonard Lausen; +Cc: 27505

> From: Leonard Lausen <leonard@lausen.nl>
> Date: Sat, 5 Aug 2017 10:54:37 +0900
>
> Please reopen this bug.

Continuing the discussions doesn't require reopening the bug, as long
as we don't intend to make any changes for it.

> In that case though I believe the intended emacs behavior does not make
> sense. That I need to set LC_CTYPE=zh_CN.UTF-8 just to make it
> possible to use system input methods for Chinese characters
> doesn't mean I want to actually use a Chinese language interface.
>
> Or concretely, I am learning Chinese and am comfortable typing it or
> having daily conversations, however I don't feel comfortable reading the
> emacs manual in Chinese. For my language learning I also tend to keep
> some notes in Chinese which I would like to edit with emacs.
>
> Shouldn't there be a way to allow people to input Chinese (or other
> non-European languages) without affecting the language environment? The
> current behavior seems to discriminate against language learners.

Yes, there should be such a way, and in fact it is already, and always
was, implemented in Emacs. The values of LC_CTYPE etc. environment
variables are only used to set up the _defaults_; users can use
commands and options to override those defaults in many ways. For
example, "C-h t" can be invoked with a numeric argument ("C-u C-h t")
in which case Emacs will ask you in what language to display the
tutorial. As another example, an input method of your choosing can be
invoked at any moment with "C-u C-\"; then you can switch it back off
as soon as you've finished typing characters that are not directly
accessible from your system keyboard. Finally, the language
environment of your choosing can be set with "C-x RET l", and doing
that will set many other defaults according to the language
environment you select.

Given all these facilities, I'm not sure I understand what exactly is
your problem. The original report was about the tutorial language,
but you never explained why you set LC_CTYPE to the value that
specified Chinese. If you did that for some reason other than for
using Chinese in your programs, then perhaps you shouldn't set
LC_CTYPE, and instead should use the above-mentioned, more focused,
Emacs features to specify Chinese where you want it?
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
From: Leonard Lausen @ 2017-08-05 8:17 UTC
To: Eli Zaretskii; +Cc: 27505

Hey Eli,

thanks for your reply.

> Yes, there should be such a way, and in fact it is already, and
> always was, implemented in Emacs. The values of LC_CTYPE etc.
> environment variables are only used to set up the _defaults_; users
> can use commands and options to override those defaults in many ways.
> For example, "C-h t" can be invoked with a numeric argument ("C-u C-h
> t") in which case Emacs will ask you in what language to display the
> tutorial. As another example, an input method of your choosing can be
> invoked at any moment with "C-u C-\"; then you can switch it back
> off as soon as you've finished typing characters that are not
> directly accessible from your system keyboard. Finally, the
> language environment of your choosing can be set with "C-x RET l",
> and doing that will set many other defaults according to the
> language environment you select.

I was not aware of the feature to change the tutorial language via
"C-u C-h t". Thanks for pointing that out.

> Given all these facilities, I'm not sure I understand what exactly
> is your problem. The original report was about the tutorial
> language, but you never explained why you set LC_CTYPE to the
> value that specified Chinese. If you did that for some reason other
> than for using Chinese in your programs, then perhaps you shouldn't
> set LC_CTYPE, and instead should use the above-mentioned, more
> focused, Emacs features to specify Chinese where you want it?

Sorry for not being clear about it. To input Chinese, Japanese or
Korean (CJK) on Linux people usually rely on tools such as fcitx or
ibus, which allow inputting CJK characters in any application. They are
also supported by emacs via the X Input Method (XIM) protocol.
Unfortunately XIM is only supported in emacs when LC_CTYPE is set to a
CJK locale (#10867: must export LC_CTYPE to zh_CN.UTF-8 or similar CJK
locale to use X input method).

Compared to using emacs input methods, fcitx provides the same
experience for all desktop applications and arguably better statistical
matching methods to match the user input (Latin characters) to the
target CJK characters, so it is preferable to the emacs input methods
("C-u C-\").

I would be more than happy to not set LC_CTYPE to Chinese, if #10867
gets fixed. Until then it seems the only way to get XIM working. If I
remember correctly though, #10867 is intended behavior and won't be
fixed (which is not sensible IMO).

My problem is that wanting to use XIM doesn't mean
that I would like to see any of the emacs interface in the LC_CTYPE
language. So given that #10867 seems to be intended behavior, emacs at
least shouldn't rely on LC_CTYPE to change the
interface language in any user-visible way. From my perspective it would
make more sense to fix #10867 though.

> That'd go against the POSIX semantics of these variables, so we
> shouldn't do that, because it might not be what is expected by users
> who set both LANG and other LC_* variables.
>
> As I wrote previously, I don't really understand the exact problem
> we are asked to solve here. I don't think we should be discussing
> solutions before we understand the actual problem. Right now, I
> believe that Emacs already provides features to resolve any such
> problems.

As far as I understand, the current behavior of emacs to change the
interface language based on LC_CTYPE is application-defined behavior
that is not part of POSIX. POSIX only says:

> This variable determines the locale category for character handling
> functions, such as tolower(), toupper() and isalpha(). This
> environment variable determines the interpretation of sequences of
> bytes of text data as characters (for example, single- as opposed to
> multi-byte characters), the classification of characters (for
> example, alpha, digit, graph) and the behaviour of character classes.
> Additional semantics of this variable, if any, are
> implementation-dependent.

So I see no problem with LANG defining the interface language and
LC_CTYPE taking care of the character handling.

Best regards
Leonard

PS: Besides emacs bug 10867 there is also an Ubuntu bug from 2009
https://bugs.launchpad.net/ubuntu/+source/emacs-snapshot/+bug/434730
Or in Chinese forums https://emacs-china.org/t/emacs-gui/1271
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
From: Eli Zaretskii @ 2017-08-05 9:17 UTC
To: Leonard Lausen; +Cc: 27505

> Cc: 27505@debbugs.gnu.org
> From: Leonard Lausen <leonard@lausen.nl>
> Date: Sat, 5 Aug 2017 17:17:47 +0900
>
> I would be more than happy to not set LC_CTYPE to Chinese, if #10867
> gets fixed. Until then it seems the only way to get XIM working. If I
> remember correctly though, #10867 is intended behavior and won't be
> fixed (which is not sensible IMO).
>
> My problem is that wanting to use XIM doesn't mean
> that I would like to see any of the emacs interface in the LC_CTYPE
> language. So given that #10867 seems to be intended behavior, emacs at
> least shouldn't rely on LC_CTYPE to change the
> interface language in any user-visible way. From my perspective it would
> make more sense to fix #10867 though.

I don't see any experts we have who could fix that, unfortunately.
But I don't see why that would be a problem for you: if you don't want
the Emacs language environment to be Chinese when you use XIM, you
should be able to invoke set-language-environment inside Emacs after
starting it, to set the language environment to something other than
Chinese. Does that work for you?

> As far as I understand, the current behavior of emacs to change the
> interface language based on LC_CTYPE is application-defined behavior
> that is not part of POSIX. POSIX only says:
>
> > This variable determines the locale category for character handling
> > functions, such as tolower(), toupper() and isalpha(). This
> > environment variable determines the interpretation of sequences of
> > bytes of text data as characters (for example, single- as opposed to
> > multi-byte characters), the classification of characters (for
> > example, alpha, digit, graph) and the behaviour of character classes.
> > Additional semantics of this variable, if any, are
> > implementation-dependent.

See the "interpretation of sequences of bytes of text data as
characters" part: that's what causes Emacs to use LC_CTYPE to set up
the language environment. So we do follow POSIX, AFAIU.
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
From: Leonard Lausen @ 2017-08-05 9:52 UTC
To: Eli Zaretskii; +Cc: 27505

> I don't see any experts we have who could fix that, unfortunately.
> But I don't see why that would be a problem for you: if you don't want
> the Emacs language environment to be Chinese when you use XIM, you
> should be able to invoke set-language-environment inside Emacs after
> starting it, to set the language environment to something other than
> Chinese. Does that work for you?

That is a good workaround. I created this bug report, as I would expect
this as default behavior though.

Unfortunately XIM currently does not work for me at all. So I can't
confirm that calling set-language-environment won't stop XIM from
working. (XIM did work for me before I switched from a Debian-based
distribution to Gentoo; see
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=27312 )

>> As far as I understand, the current behavior of emacs to change the
>> interface language based on LC_CTYPE is application-defined behavior
>> that is not part of POSIX. POSIX only says:
>>
>>> This variable determines the locale category for character handling
>>> functions, such as tolower(), toupper() and isalpha(). This
>>> environment variable determines the interpretation of sequences of
>>> bytes of text data as characters (for example, single- as opposed to
>>> multi-byte characters), the classification of characters (for
>>> example, alpha, digit, graph) and the behaviour of character classes.
>>> Additional semantics of this variable, if any, are
>>> implementation-dependent.
>
> See the "interpretation of sequences of bytes of text data as
> characters" part: that's what causes Emacs to use LC_CTYPE to set up
> the language environment. So we do follow POSIX, AFAIU.

Hm, as long as LANG and LC_CTYPE both are UTF-8 locales, the
interpretation of bytes would be the same. In principle the interface
language is independent from the interpretation of bytes, right? One
could just parse the first part of LANG (e.g. "en_US") to decide the
display language but follow LC_CTYPE for the interpretation of bytes.
This seems also to be what the majority of applications are doing, given
that I set LC_CTYPE to Chinese system wide, but only emacs (and Dropbox)
are changing their interface language (more specifically the tutorial
language).
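Editorial illustration: the observation above, that two UTF-8 locales share the byte interpretation and differ only in language, can be seen by splitting the locale names in the conventional form language[_territory][.codeset][@modifier]. This is only a sketch; `split_locale` and the regular expression are hypothetical helpers written for this note and do not correspond to anything in Emacs or libc.

```python
import re

# Conventional locale-name layout: language[_territory][.codeset][@modifier]
LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def split_locale(name):
    """Split a locale name into its parts; absent parts are None."""
    match = LOCALE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized locale name: {name!r}")
    return match.groupdict()


lang = split_locale("en_US.UTF-8")   # value of LANG in the report
ctype = split_locale("zh_CN.UTF-8")  # value of LC_CTYPE in the report

# Same codeset (so the same interpretation of bytes), different language.
print(lang["codeset"] == ctype["codeset"], lang["language"], ctype["language"])
```

An application following the suggestion in the message would take the language part from one variable and the codeset from another; whether that is appropriate for Emacs is exactly what the thread is debating.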
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
From: Eli Zaretskii @ 2017-08-05 10:15 UTC
To: Leonard Lausen; +Cc: 27505

> Cc: 27505@debbugs.gnu.org
> From: Leonard Lausen <leonard@lausen.nl>
> Date: Sat, 5 Aug 2017 18:52:38 +0900
>
> > But I don't see why that would be a problem for you: if you don't want
> > the Emacs language environment to be Chinese when you use XIM, you
> > should be able to invoke set-language-environment inside Emacs after
> > starting it, to set the language environment to something other than
> > Chinese. Does that work for you?
>
> That is a good workaround. I created this bug report, as I would expect
> this as default behavior though.

The default behavior is very unlikely to change, sorry. It took us
many years to arrive at the current behavior, so changing that for a
single use case, even if it's deemed important, makes little sense to
me.

> > See the "interpretation of sequences of bytes of text data as
> > characters" part: that's what causes Emacs to use LC_CTYPE to set up
> > the language environment. So we do follow POSIX, AFAIU.
>
> Hm, as long as LANG and LC_CTYPE both are UTF-8 locales, the
> interpretation of bytes would be the same.

Yes, but LANG is the fallback in case LC_* are not defined, so I don't
see how LANG set to a different language than LC_CTYPE could be
according to POSIX.

> In principle the interface
> language is independent from the interpretation of bytes, right? One
> could just parse the first part of LANG (e.g. "en_US") to decide the
> display language but follow LC_CTYPE for the interpretation of bytes.
> This seems also to be what the majority of applications are doing, given
> that I set LC_CTYPE to Chinese system wide, but only emacs (and Dropbox)
> are changing their interface language (more specifically the tutorial
> language).

In Emacs, "display language" is just one aspect of the multi-lingual
environment. So I'm afraid if the default is not to your liking, you
will have to customize the individual aspects of the language
environment separately, as you see fit. That's why those variables
exist in the first place -- to tailor the Emacs operation to even rare
and non-typical use cases.
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
From: Leonard Lausen @ 2017-08-05 10:50 UTC
To: Eli Zaretskii; +Cc: 27505

>>> See the "interpretation of sequences of bytes of text data as
>>> characters" part: that's what causes Emacs to use LC_CTYPE to set up
>>> the language environment. So we do follow POSIX, AFAIU.
>>
>> Hm, as long as LANG and LC_CTYPE both are UTF-8 locales, the
>> interpretation of bytes would be the same.
>
> Yes, but LANG is the fallback in case LC_* are not defined, so I don't
> see how LANG set to a different language than LC_CTYPE could be
> according to POSIX.

Well, it's a fallback for the things that the respectively undefined
LC_* variable would define. So the argument here is that LC_CTYPE
according to POSIX does not define the interface language. The current
behavior of emacs can only be justified by the "Additional semantics of
this variable, if any, are implementation-dependent." clause for the
LC_CTYPE variable.

Note though that, besides Dropbox, I have not found a single program
which uses LC_CTYPE to set the interface language. Instead those other
programs rely on LANG. You may try "LANG=zh_CN.utf8 vim" compared to
"LC_CTYPE=zh_CN.utf8 vim" as an example.

Also the name of LANG compared to LC_CTYPE does somewhat suggest to me
that it should define the interface language whereas CTYPE should define
the "character types" (?) ;)

So I agree with the previous comment that LANG should take precedence
over LC_CTYPE with regards to the interface language. Not sure if the
current emacs implementation allows that change without affecting the
settings where LC_CTYPE does take precedence over LANG.

But never mind if you prefer to keep the current behavior. You taught me
how to overwrite the language variable manually, so while I still am
unhappy about emacs behaving differently to most applications, my
immediate concern is resolved ;)
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
From: Andreas Schwab @ 2017-08-05 11:09 UTC
To: Leonard Lausen; +Cc: 27505

On Aug 05 2017, Leonard Lausen <leonard@lausen.nl> wrote:

> So I agree with the previous comment that LANG should take precedence
> over LC_CTYPE with regards to the interface language. Not sure if the
> current emacs implementation allows that change without affecting the
> settings where LC_CTYPE does take precedence over LANG.

LANG never takes precedence over other LC_* values, it only serves as
the default for them. An interface that uses LC_CTYPE must ignore LANG
when LC_CTYPE is set.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
  2017-08-05 11:09 ` Andreas Schwab
@ 2017-08-05 11:20 ` Leonard Lausen
  2017-08-05 11:22 ` Leonard Lausen
  0 siblings, 1 reply; 18+ messages in thread
From: Leonard Lausen @ 2017-08-05 11:20 UTC
To: Andreas Schwab; +Cc: 27505

>> So I agree with the previous comment that LANG should take precedence
>> over LC_CTYPE with regards to the interface language. Not sure if the
>> current emacs implementation allows that change without affecting the
>> settings where LC_CTYPE does change precedence over LANG.
>
> LANG never takes precedence over other LC_* values, it only serves as
> the default for them. An interface that uses LC_CTYPE must ignore LANG
> when LC_CTYPE is set.

I agree that LC_CTYPE always takes precedence for the things that
LC_CTYPE defines according to the POSIX standard. However, as far as I
understand, the display language is not defined by LC_CTYPE. LC_CTYPE
defines "Character classification and case conversion".

The closest would be LC_MESSAGES ("Formats of informative and diagnostic
messages and interactive responses."). I just tried, and for example vim
indeed uses LC_MESSAGES to decide on the interface language. So do
Chromium and KDE applications such as okular.
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
  2017-08-05 11:20 ` Leonard Lausen
@ 2017-08-05 11:22 ` Leonard Lausen
  0 siblings, 0 replies; 18+ messages in thread
From: Leonard Lausen @ 2017-08-05 11:22 UTC
To: Andreas Schwab; +Cc: 27505

On 08/05/2017 08:20 PM, Leonard Lausen wrote:
>>> So I agree with the previous comment that LANG should take precedence
>>> over LC_CTYPE with regards to the interface language. Not sure if the
>>> current emacs implementation allows that change without affecting the
>>> settings where LC_CTYPE does change precedence over LANG.
>>
>> LANG never takes precedence over other LC_* values, it only serves as
>> the default for them. An interface that uses LC_CTYPE must ignore LANG
>> when LC_CTYPE is set.
>
> I agree that LC_CTYPE always takes precedence for the things that
> LC_CTYPE defines according to the POSIX standard. However, as far as I
> understand, the display language is not defined by LC_CTYPE. LC_CTYPE
> defines "Character classification and case conversion".
>
> The closest would be LC_MESSAGES ("Formats of informative and diagnostic
> messages and interactive responses."). I just tried, and for example vim
> indeed uses LC_MESSAGES to decide on the interface language. So do
> Chromium and KDE applications such as okular.

So what I mean to say is that LANG should take precedence over LC_CTYPE
with respect to the interface language, at least as long as LC_MESSAGES
is not defined.

Of course you can argue about whether the interface language is the same
as "Formats of informative and diagnostic messages and interactive
responses.". But if you disagree with that, then LANG should always take
precedence.
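
[Illustrative sketch, not part of the original thread.] The lookup
proposed in the message above (interface language from LC_MESSAGES, else
LANG, never from LC_CTYPE) resembles what gettext-style programs do.
The sketch below mirrors the variable order that Python's standard
gettext.find() consults when no explicit language list is passed:
LANGUAGE, LC_ALL, LC_MESSAGES, LANG. The function name
message_languages is hypothetical, and whether vim, Chromium or okular
follow exactly this order is an assumption, not something the thread
establishes.

```python
def message_languages(env):
    """Return the preferred message languages for a gettext-style program.

    Sketch only: the first non-empty variable wins, and its value may be
    a colon-separated priority list (as with LANGUAGE). LC_CTYPE is
    deliberately never consulted.
    """
    for name in ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG"):
        value = env.get(name)
        if value:
            return value.split(":")
    # No relevant variable set: fall back to untranslated messages.
    return ["C"]


# Environment from the original bug report.
reporter_env = {
    "LANG": "en_US.UTF-8",
    "LC_CTYPE": "zh_CN.UTF-8",
    "LC_ALL": "",
}

print(message_languages(reporter_env))
print(message_languages({"LC_CTYPE": "zh_CN.UTF-8"}))
print(message_languages({"LANG": "en_US.UTF-8", "LC_MESSAGES": "zh_CN.UTF-8"}))
```

With this order, setting only LC_CTYPE to a Chinese locale leaves the
message language untouched, which matches the vim comparison described
earlier in the thread.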
* bug#27505: LC_CTYPE affects tutorial language
  2017-08-05 10:15 ` Eli Zaretskii
  2017-08-05 10:50 ` Leonard Lausen
@ 2022-04-17 19:44 ` Lars Ingebrigtsen
  1 sibling, 0 replies; 18+ messages in thread
From: Lars Ingebrigtsen @ 2022-04-17 19:44 UTC
To: Eli Zaretskii; +Cc: Leonard Lausen, 27505

Eli Zaretskii <eliz@gnu.org> writes:

> The default behavior is very unlikely to change, sorry. It took us
> many years to arrive at the current behavior, so changing that for a
> single use case, even if it's deemed important, makes little sense to
> me.

(I'm going through old bug reports that unfortunately weren't resolved
at the time.)

Skimming this bug report, I think the conclusion was that we don't want
to change how these variables work in Emacs at this point, so I'm
therefore closing this bug report.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#27505: acknowledged by developer (Re: bug#27505: LC_CTYPE affects tutorial language)
  2017-08-05 7:06 ` Eli Zaretskii
  2017-08-05 8:17 ` Leonard Lausen
@ 2017-08-05 8:18 ` Leonard Lausen
  1 sibling, 0 replies; 18+ messages in thread
From: Leonard Lausen @ 2017-08-05 8:18 UTC
To: Eli Zaretskii; +Cc: 27505

Hey Eli,

thanks for your reply.

> Yes, there should be such a way, and in fact it is already, and
> always was, implemented in Emacs. The values of LC_CTYPE etc.
> environment variables are only used to set up the _defaults_; users
> can use commands and options to override those defaults in many ways.
> For example, "C-h t" can be invoked with a numeric argument ("C-u C-h
> t") in which case Emacs will ask you in what language to display the
> tutorial. As another example, input method of your choosing can be
> invoked at any moment with "C-u C-\"; then you can switch it back
> off as soon as you've finished typing characters that are not
> directly accessible from your system keyboard. Finally, the
> language environment of your choosing can be set with "C-x RET l",
> and doing that will set many other defaults according to the
> language environment you select.

I was not aware of the feature to change the tutorial language via
"C-u C-h t". Thanks for pointing that out.

> Given all these facilities, I'm not sure I understand what exactly
> is your problem. The original report was about the tutorial
> language, but you never explained why did you set LC_CTYPE to the
> value that specified Chinese. If you did that for some reason other
> than for using Chinese in your programs, then perhaps you shouldn't
> set LC_CTYPE, and instead should use the above-mentioned, more
> focused, Emacs features to specify Chinese where you want it?

Sorry for not being clear about it. To input Chinese, Japanese or Korean
(CJK) on Linux, people usually rely on tools such as fcitx or ibus, which
allow inputting CJK characters in any application. They are also
supported by emacs via the X Input Method (XIM) protocol. Unfortunately,
XIM is only supported in emacs when LC_CTYPE is set to a CJK locale
(#10867: must export LC_CTYPE to zh_CN.UTF-8 or similar CJK locale to use
X input method).

Compared to using emacs input methods, fcitx provides the same
experience across all desktop applications and arguably better
statistical methods for matching the user's input (Latin characters) to
the target CJK characters, so it is preferable to the emacs input
methods ("C-u C-\").

I would be more than happy to not set LC_CTYPE to Chinese if #10867 gets
fixed. Until then it seems to be the only way to get XIM working. If I
remember correctly though, #10867 is intended behavior and won't be
fixed (which is not sensible IMO).

My problem is that wanting to use XIM doesn't mean that I would like to
see any of the emacs interface in the LC_CTYPE language. So, given that
#10867 seems to be intended behavior, emacs should at least not rely on
LC_CTYPE to change the interface language in any user-visible way. From
my perspective it would make more sense to fix #10867 though.

> That'd go against the Posix semantics of these variables, so we
> shouldn't do that, because it might not be what is expected by users
> who set both LANG and other LC_* variables.
>
> As I wrote previously, I don't really understand the exact problem
> we are asked to solve here. I don't think we should be discussing
> solutions before we understand the actual problem. Right now, I
> believe that Emacs already provides features to resolve any such
> problems.

As far as I understand, emacs changing the interface language based on
LC_CTYPE is application-defined behavior that is not part of Posix. At
least according to
http://pubs.opengroup.org/onlinepubs/7908799/xbd/envvar.html :

> This variable determines the locale category for character handling
> functions, such as tolower(), toupper() and isalpha(). This
> environment variable determines the interpretation of sequences of
> bytes of text data as characters (for example, single- as opposed to
> multi-byte characters), the classification of characters (for
> example, alpha, digit, graph) and the behaviour of character classes.
> Additional semantics of this variable, if any, are
> implementation-dependent.

So I see no problem with LANG defining the interface language and
LC_CTYPE taking care of the character handling.

Best regards
Leonard

PS: Besides emacs bug 10867, there is also an Ubuntu bug from 2009:
https://bugs.launchpad.net/ubuntu/+source/emacs-snapshot/+bug/434730
and a thread on a Chinese forum:
https://emacs-china.org/t/emacs-gui/1271