* per-buffer language environments
@ 2010-12-11 15:25 Werner LEMBERG
  2010-12-11 19:00 ` Eli Zaretskii
  0 siblings, 1 reply; 27+ messages in thread
From: Werner LEMBERG @ 2010-12-11 15:25 UTC
To: emacs-devel

According to the documentation, set-language-environment acts globally.
However, at least for CJK documents, it would be very helpful if this
could be controlled on a per-buffer basis[1]. For example, on my
GNU/Linux box with latin-1 as the default language environment, while
editing some Japanese text, I see the katakana glyphs from the
`simsun' font which look particularly ugly. Assuming that I edit a
Chinese text in parallel, a Japanese font from a Japanese language
environment would miss most of the Chinese characters, causing
fallback character substitution which looks ugly again...

    Werner

[1] An extension of this would be enriched text which supports
    multiple language environments within a buffer.
* Re: per-buffer language environments
From: Eli Zaretskii @ 2010-12-11 19:00 UTC
To: Werner LEMBERG; +Cc: emacs-devel

> Date: Sat, 11 Dec 2010 16:25:03 +0100 (CET)
> From: Werner LEMBERG <wl@gnu.org>
>
> According to the documentation, set-language-environment acts globally.
> However, at least for CJK documents, it would be very helpful if this
> could be controlled on a per-buffer basis[1]. For example, on my
> GNU/Linux box with latin-1 as the default language environment, while
> editing some Japanese text, I see the katakana glyphs from the
> `simsun' font which look particularly ugly. Assuming that I edit a
> Chinese text in parallel, a Japanese font from a Japanese language
> environment would miss most of the Chinese characters, causing
> fallback character substitution which looks ugly again...

But font selection is just one part of the language environment. Are
there any other aspects of the language environment that would make
sense to have on per-buffer basis?

If font selection is the only part, then doesn't the fontset
definition feature (see "(emacs)Defining Fontsets") do what you want?
* Re: per-buffer language environments
From: Werner LEMBERG @ 2010-12-12 6:25 UTC
To: eliz; +Cc: emacs-devel

>> According to the documentation, set-language-environment acts
>> globally. However, at least for CJK documents, it would be very
>> helpful if this could be controlled on a per-buffer basis[1]. For
>> example, on my GNU/Linux box with latin-1 as the default language
>> environment, while editing some Japanese text, I see the katakana
>> glyphs from the `simsun' font which look particularly ugly.
>> Assuming that I edit a Chinese text in parallel, a Japanese font
>> from a Japanese language environment would miss most of the Chinese
>> characters, causing fallback character substitution which looks
>> ugly again...
>
> But font selection is just one part of the language environment. Are
> there any other aspects of the language environment that would make
> sense to have on per-buffer basis?

For CJK language environments, I'm not aware of other aspects, but
probably Ken'ichi-san knows more.

> If font selection is the only part, then doesn't the fontset
> definition feature (see "(emacs)Defining Fontsets") do what you
> want?

If you tell me how to do that, this would be fine. Note that the
`CHARSET:FONT' feature within a fontset is not appropriate since it
helps only if there are different charsets. However, in the discussed
problem all buffer encodings are using Unicode.

On the other hand, I think it is not the right solution to specify a
fontset as a file variable. I really want to say that file `foo'
contains Chinese; Emacs parses this information somehow and then
forwards this information to the font selection engine.

    Werner
* Re: per-buffer language environments
From: Kenichi Handa @ 2010-12-13 7:56 UTC
To: Werner LEMBERG; +Cc: eliz, emacs-devel

In article <20101212.072550.527160732.wl@gnu.org>, Werner LEMBERG <wl@gnu.org> writes:

> > But font selection is just one part of the language environment. Are
> > there any other aspects of the language environment that would make
> > sense to have on per-buffer basis?

> For CJK language environments, I'm not aware of other aspects, but
> probably Ken'ichi-san knows more.

* Which input method to turn on by C-\.

* Which coding system to use on writing when the current
  buffer contains a character that can't be encoded by
  buffer-file-coding-system.

* Which coding systems have higher priority when inserting a
  file in the current buffer.

* The locale of the program invoked by shell-command-on-region.

---
Kenichi Handa
handa@m17n.org
* Re: per-buffer language environments
From: Werner LEMBERG @ 2010-12-13 9:27 UTC
To: handa; +Cc: eliz, emacs-devel

>> > Are there any other aspects of the language environment that
>> > would make sense to have on per-buffer basis?
>
>> For CJK language environments, I'm not aware of other aspects, but
>> probably Ken'ichi-san knows more.
>
> * Which input method to turn on by C-\.
>
> * Which coding system to use on writing when the current
>   buffer contains a character that can't be encoded by
>   buffer-file-coding-system.
>
> * Which coding systems have higher priority when inserting a
>   file in the current buffer.
>
> * The locale of the program invoked by shell-command-on-region.

Thanks for the list. IMHO, this adds more arguments for per-buffer
language environments.

    Werner
* Re: per-buffer language environments
From: Kenichi Handa @ 2010-12-13 10:59 UTC
To: Werner LEMBERG; +Cc: eliz, emacs-devel

In article <20101213.102709.409649500.wl@gnu.org>, Werner LEMBERG <wl@gnu.org> writes:

> > * Which input method to turn on by C-\.
> >
> > * Which coding system to use on writing when the current
> >   buffer contains a character that can't be encoded by
> >   buffer-file-coding-system.
> >
> > * Which coding systems have higher priority when inserting a
> >   file in the current buffer.
> >
> > * The locale of the program invoked by shell-command-on-region.

> Thanks for the list. IMHO, this adds more arguments for per-buffer
> language environments.

Yes, but deciding exactly how they should work is not that
straightforward. For instance, how should the command
prefer-coding-system work when invoked in a buffer for which you
locally changed the language environment? Should it change the
preference globally, or for the current buffer only, or for all
buffers that have the same language environment?

---
Kenichi Handa
handa@m17n.org
* Re: per-buffer language environments
From: Werner LEMBERG @ 2010-12-13 12:15 UTC
To: handa; +Cc: eliz, emacs-devel

> [...] deciding exactly how they should work is not that
> straightforward. For instance, how should the command
> prefer-coding-system work when invoked in a buffer for which you
> locally changed the language environment? Should it change the
> preference globally, or for the current buffer only, or for all
> buffers that have the same language environment?

Perhaps we should start with items which are agreed on, that is, the
possibility to set a language environment buffer-wise so that Emacs
can benefit by looking up the right font in case the buffer encoding
is Unicode. The same for the default input encoding.

    Werner
* Re: per-buffer language environments
From: Eli Zaretskii @ 2010-12-13 11:47 UTC
To: Kenichi Handa; +Cc: emacs-devel

> From: Kenichi Handa <handa@m17n.org>
> Cc: eliz@gnu.org, emacs-devel@gnu.org
> Date: Mon, 13 Dec 2010 16:56:08 +0900
>
> In article <20101212.072550.527160732.wl@gnu.org>, Werner LEMBERG <wl@gnu.org> writes:
>
> > > But font selection is just one part of the language environment. Are
> > > there any other aspects of the language environment that would make
> > > sense to have on per-buffer basis?
>
> > For CJK language environments, I'm not aware of other aspects, but
> > probably Ken'ichi-san knows more.
>
> * Which input method to turn on by C-\.
>
> * Which coding system to use on writing when the current
>   buffer contains a character that can't be encoded by
>   buffer-file-coding-system.
>
> * Which coding systems have higher priority when inserting a
>   file in the current buffer.

I could understand how the font selection and the default input method
are related to the language, but what do encodings have to do with
that? The preferred encoding is generally an attribute of a locale,
not of a language. The fact that we mix them is because Emacs had
language environments before it had locale environments. It's high
time to make the distinction, IMO.

The language environment should be derived from the language(s) of the
text we are editing, and is internal to Emacs, in the sense that it is
defined by internal Emacs logic for its purposes. The locale
environment is derived from the environment outside Emacs, and
expresses the preferences of the outside world.

> * The locale of the program invoked by shell-command-on-region.

This is _definitely_ not related to the language.
It may be the case that to force an external program DTRT for a
certain language, you need to set some LC_* variable in the
environment of that program, but that's an implementation detail, IMO.
* Re: per-buffer language environments
From: Stephen J. Turnbull @ 2010-12-14 11:38 UTC
To: Eli Zaretskii; +Cc: emacs-devel, Kenichi Handa

Eli Zaretskii writes:

> > * Which coding systems have higher priority when inserting a
> >   file in the current buffer.
>
> I could understand how the font selection and the default input method
> are related to the language, but what do encodings have to do with
> that? The preferred encoding is generally an attribute of a locale,
> not of a language.

Note the word "insert", which implies "read". It is certainly true
that a locale may specify an encoding. However, if the person is
Japanese, they may specify ja_JP.UTF-8 for their locale and strongly
prefer that files be written with that encoding, yet still need to
read files in other encodings. The locale encoding of UTF-8 is no
help in distinguishing an EUC-JP file from an ISO-8859-1 file, let
alone an EUC-CN file. OTOH, somebody with a Hebrew language
environment and a locale specifying UTF-8 as the encoding almost
certainly prefers that a file containing 8-bit-set octets inconsistent
with UTF-8 be recognized as ISO-8859-8 rather than EUC-JP, no?

> The fact that we mix them is because Emacs had
> language environments before it had locale environments.

What's a "locale environment"? AFAIK Emacsen use the locale as a
heuristic for determining the language environment unless otherwise
specified, but it seems like you mean something else.
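The reading-side ambiguity described in the message above can be made
concrete with a small model. This is a minimal sketch in Python, not
Emacs's detection code: `detect_encoding' and both priority lists are
hypothetical names introduced here, and "the first codec that decodes
the bytes without error wins" is an assumed simplification of how a
priority list resolves ambiguity.

```python
from typing import Optional, Sequence


def detect_encoding(data: bytes, priority: Sequence[str]) -> Optional[str]:
    """Return the first codec in `priority` that decodes `data` cleanly."""
    for codec in priority:
        try:
            data.decode(codec)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this codec; try the next
        return codec
    return None  # nothing in the list fits; a real editor would ask the user


# Three kanji (the word for "Japanese language"), stored as EUC-JP bytes.
sample = "\u65e5\u672c\u8a9e".encode("euc_jp")

# Hypothetical list for a Japanese user with a UTF-8 locale:
# EUC-JP is tried before the catch-all Latin-1.
japanese_priority = ["utf-8", "euc_jp", "latin-1"]

# Hypothetical list where Latin-1 comes early. Latin-1 accepts every
# byte sequence, so nothing listed after it is ever reached.
western_priority = ["utf-8", "latin-1", "euc_jp"]

print(detect_encoding(sample, japanese_priority))
print(detect_encoding(sample, western_priority))
```

In this model the UTF-8 locale contributes only the first entry of each
list; once the bytes turn out not to be UTF-8, the result depends
entirely on the order of the remaining entries.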
* Re: per-buffer language environments
From: Eli Zaretskii @ 2010-12-14 15:14 UTC
To: Stephen J. Turnbull; +Cc: emacs-devel, handa

> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Cc: Kenichi Handa <handa@m17n.org>,
>     emacs-devel@gnu.org
> Date: Tue, 14 Dec 2010 20:38:43 +0900
>
> Eli Zaretskii writes:
>
> > > * Which coding systems have higher priority when inserting a
> > >   file in the current buffer.
> >
> > I could understand how the font selection and the default input method
> > are related to the language, but what do encodings have to do with
> > that? The preferred encoding is generally an attribute of a locale,
> > not of a language.
>
> Note the word "insert", which implies "read". It is certainly true
> that a locale may specify an encoding. However, if the person is
> Japanese, they may specify ja_JP.UTF-8 for their locale and strongly
> prefer that files be written with that encoding, yet still need to
> read files in other encodings. The locale encoding of UTF-8 is no
> help in distinguishing an EUC-JP file from an ISO-8859-1 file, let
> alone an EUC-CN file. OTOH, somebody with a Hebrew language
> environment and a locale specifying UTF-8 as the encoding almost
> certainly prefers that a file containing 8-bit-set octets inconsistent
> with UTF-8 be recognized as ISO-8859-8 rather than EUC-JP, no?

Those are all valid concerns, but they are just the tip of an
iceberg. There's an almost infinite number of combinations of a
language and the preferred encoding, and it's impossible to fold them
all, or even their significant fraction, in a reasonably usable
user-level interface. We shouldn't even try, IMO; we already have
prefer-coding-system, the coding: cookies, the .dir_locals meta-data, etc.
to cover the situations where the user knows what encoding should
be preferred/used, even though her language and locale say otherwise.

set-language-environment accepts a single string, which should be a
language name, as its argument. (There are some "languages" that we
recognize, such as "Chinese-GB18030", which sneak in the encoding as
well, but that's an anomaly, I think, which goes back to when Emacs
didn't have locale environments to express that. Now that we do, we
could get rid of that, at least in principle.) Therefore, a language
environment should set the defaults suitable for the language, and
that doesn't include the encoding, or at least does not have to fit
each minor cultural variant of the language.

> > The fact that we mix them is because Emacs had
> > language environments before it had locale environments.
>
> What's a "locale environment"?

See set-locale-environment.

> AFAIK Emacsen use the locale as a heuristic for determining the
> language environment

There's no heuristic involved, AFAIR. Emacs has a database of
languages _and_encodings_ suitable for the known locale names.
set-locale-environment uses that database to get the language and the
preferred encoding(s), then calls set-language-environment with the
language, and sets the priorities of the encodings according to the
encoding preferences.
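The lookup described in the last paragraph above (locale name in,
language plus encoding priorities out) can be sketched as a plain table
lookup. This is an illustrative Python model, not the Emacs
implementation: `LOCALE_TABLE' and `locale_to_settings' are hypothetical
names, the three table rows are toy entries rather than the contents of
the real database, and putting the locale's own codeset first is an
assumption.

```python
import re

# Toy table: (pattern matched against the start of the locale name,
#             language name, default codecs for that language).
LOCALE_TABLE = [
    (r"ja", "Japanese", ["euc_jp"]),
    (r"he", "Hebrew", ["iso8859_8"]),
    (r"zh_CN", "Chinese-GB", ["gb2312"]),
]


def locale_to_settings(locale_name):
    """Map a locale name such as 'ja_JP.UTF-8' to (language, priority list)."""
    base, _, codeset = locale_name.partition(".")
    # An explicit codeset in the locale name is assumed to rank first.
    explicit = [codeset.lower()] if codeset else []
    for pattern, language, defaults in LOCALE_TABLE:
        if re.match(pattern, base):
            rest = [c for c in defaults if c not in explicit]
            return language, explicit + rest
    return None, explicit  # unknown locale: no language, codeset only


print(locale_to_settings("ja_JP.UTF-8"))
print(locale_to_settings("he_IL"))
```

The model is deterministic in the sense argued above: the same locale
name always yields the same language and the same priority list.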
* Re: per-buffer language environments
From: Stephen J. Turnbull @ 2010-12-15 4:51 UTC
To: Eli Zaretskii; +Cc: handa, emacs-devel

Eli Zaretskii writes:

> > From: "Stephen J. Turnbull" <stephen@xemacs.org>
> > Cc: Kenichi Handa <handa@m17n.org>,
> >     emacs-devel@gnu.org
> > Date: Tue, 14 Dec 2010 20:38:43 +0900
> >
> > Eli Zaretskii writes:
> >
> > > > * Which coding systems have higher priority when inserting a
> > > >   file in the current buffer.
> > >
> > > I could understand how the font selection and the default input method
> > > are related to the language, but what do encodings have to do with
> > > that? The preferred encoding is generally an attribute of a locale,
> > > not of a language.
> >
> > Note the word "insert", which implies "read". It is certainly true
> > that a locale may specify an encoding. However, if the person is
> > Japanese, they may specify ja_JP.UTF-8 for their locale and strongly
> > prefer that files be written with that encoding, yet still need to
> > read files in other encodings.

[more examples snipped]

> Those are all valid concerns, but they are just the tip of an
> iceberg.

No, they *are* the iceberg, at least as far as the autopilot is
concerned. After that, you *must* ask the user.

> There's an almost infinite number of combinations of a language and
> the preferred encoding

Sure, but given a language and the set of encoding features Emacs
knows how to detect *when reading from a stream*, there remains
substantial ambiguity. Setting the priority list can remove almost
all of that ambiguity, leaving what's left for the user. That is what
the priority lists are for, and it is a useful feature of the language
environment.
All problems with the language environment that I know of stem from
its global nature applying to all buffers and the application itself,
not from appropriate use in a given buffer. IOW, it's just the
defects of the POSIX_ME_HARDER locale mirrored into Emacs itself.

The preferred encoding, OTOH, is a heuristic for the encoding of files
read, and the default for the encoding of files written. These two
are independent in principle, but of course "preferred encoding for
writing" = "highest priority encoding for reading" is a very valuable
heuristic.

> , and it's impossible to fold them all, or even their significant
> fraction,

Of course a significant fraction is possible. That's precisely what
the priority lists have been achieving since the early 1990s. If your
complaint is that we should do better, "patches welcome" is the only
thing I can think of to say. But it does a pretty damn good job
already, and buffer-local language environments should cut current
damage by 80% or more; your work is cut out for you.

> in a reasonably usable user-level interface. We shouldn't even
> try, IMO; we already have prefer-coding-system

Huh? prefer-coding-system has two effects: it promotes a certain
coding-system to highest priority in its category, and it promotes
that category to highest priority in case of ambiguity. IOW, it's a
user override of the priority setting that comes from the language
environment. A completely different purpose (handling exceptions)
from the language environment itself (handling the unmarked case).

Are you sure you have any idea what you're talking about? (That's an
honest question; the way you are going, I have to wonder. If you say
"yes", I'll trust you, but I'd appreciate an explanation of what
you're talking about that refers to real bugs in the current system,
rather than general features that offend your sense of design.)

> , the coding: cookies, the .dir_locals meta-data,

Speaking of *my* sense of design, two features that are an offense
against Man and a stench in the nostrils of God. But I digress.

> etc. to cover the situations where the user knows what encoding should
> be preferred/used, even though her language and locale say otherwise.
>
> set-language-environment accepts a single string, which should be a
> language name, as its argument. (There are some "languages" that we
> recognize, such as "Chinese-GB18030", which sneak in the encoding as
> well, but that's an anomaly, I think, which goes back to when Emacs
> didn't have locale environments to express that. Now that we do, we
> could get rid of that, at least in principle.) Therefore, a language
> environment should set the defaults suitable for the language, and
> that doesn't include the encoding, or at least does not have to fit
> each minor cultural variant of the language.

That's not what coding priority settings are for. They are to remove
ambiguities like "we have EUC, but which one?" and "we have
Windows-125x, but which one?" and "since ISO-8859-1 allows all 256
bytes, if we want to give priority to Chinese or Japanese, that had
better come late in the list!"

> > > The fact that we mix them is because Emacs had
> > > language environments before it had locale environments.
> >
> > What's a "locale environment"?
>
> See set-locale-environment.

"[No match]" YAGNI, apparently. (For values of "you" == "me",
obviously. YMMV. :-)

> > AFAIK Emacsen use the locale as a heuristic for determining the
> > language environment
>
> There's no heuristic involved, AFAIR. Emacs has a database of
> languages _and_encodings_ suitable for the known locale names.

You're confusing "algorithmic" with "non-heuristic". Of course it's
possible to have a heuristic algorithm. And of course in this case,
locale is a heuristic.
*Emacs is a multilingual* (well, technically, multiscript)
*application*, and any setting of the language environment that
doesn't take into account the current text we're working with is
surely heuristic.

> set-locale-environment uses that database to get the language and the
> preferred encoding(s), then calls set-language-environment with the
> language, and sets the priorities of the encodings according to the
> encoding preferences.

That's an unnecessary API, ISTM. (set-language-environment nil)
should do that. Perhaps there should be a `set-locale' command to
override the POSIX_ME_HARDER locale taken from the environment, but
the POSIX_ME_HARDER locale is an abomination in a multilingual
application and should be buried as deeply as we can manage. It is,
of course, a useful heuristic for the user's preferred language
environment for *scratch*, but that's about as far as we can take that.
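The two effects attributed to prefer-coding-system in the message above
(promote a coding system within its category, then promote that
category) can be modelled in a few lines. This is a Python sketch of
that description only, not of the Emacs source: the function name, the
category names, and the idea of storing each category as an ordered
list are all assumptions made for illustration.

```python
def prefer_coding_system(coding, categories, category_order):
    """Return updated (categories, category_order) with `coding` promoted.

    categories:     dict mapping a category name to its codecs, best first
    category_order: list of category names, best first
    """
    for category, members in categories.items():
        if coding in members:
            # Effect 1: the coding system moves to the head of its category.
            new_members = [coding] + [m for m in members if m != coding]
            new_categories = dict(categories)
            new_categories[category] = new_members
            # Effect 2: that category moves to the head of the category list.
            new_order = [category] + [c for c in category_order if c != category]
            return new_categories, new_order
    raise ValueError("unknown coding system: %s" % coding)


# Hypothetical starting state, as a language environment might set it up.
categories = {
    "utf-8": ["utf-8"],
    "euc": ["euc_kr", "euc_jp", "gb2312"],
    "8bit": ["latin-1", "iso8859_8"],
}
category_order = ["utf-8", "8bit", "euc"]

new_categories, new_order = prefer_coding_system("euc_jp", categories, category_order)
print(new_categories["euc"])
print(new_order)
```

Note that a single call only ever touches the head of each list, which
is why it reads as an override on top of whatever ordering the language
environment established.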
* Re: per-buffer language environments
From: Eli Zaretskii @ 2010-12-15 6:47 UTC
To: Stephen J. Turnbull; +Cc: handa, emacs-devel

> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Cc: emacs-devel@gnu.org,
>     handa@m17n.org
> Date: Wed, 15 Dec 2010 13:51:40 +0900
>
> > Those are all valid concerns, but they are just the tip of an
> > iceberg.
>
> No, they *are* the iceberg, at least as far as the autopilot is
> concerned. After that, you *must* ask the user.

As long as we agree that there _is_ an iceberg, I won't argue.

> > There's an almost infinite number of combinations of a language and
> > the preferred encoding
>
> Sure, but given a language and the set of encoding features Emacs
> knows how to detect *when reading from a stream*, there remains
> substantial ambiguity.

The emphasis on *reading* takes what I originally wrote out of its
context. I didn't comment on reading alone, I commented on the entire
issue of coding-systems being tied up to the language:

> > * Which coding system to use on writing when the current
> >   buffer contains a character that can't be encoded by
> >   buffer-file-coding-system.
> >
> > * Which coding systems have higher priority when inserting a
> >   file in the current buffer.
>
> I could understand how the font selection and the default input method
> are related to the language, but what do encodings have to do with
> that? The preferred encoding is generally an attribute of a locale,
> not of a language.

If the ambiguity you are talking about is that there are more settings
than just for reading, then I was originally talking about those, too.
If the ambiguity is about something else, please tell what that is.

> All problems with the language environment that I know
> of stem from its global nature applying to all buffers and the
> application itself, not from appropriate use in a given buffer.

I agree that it would be useful to have a language as per-buffer
setting. This discussion is about what that should include.

> IOW, it's just the defects of the POSIX_ME_HARDER locale mirrored
> into Emacs itself.

I also stated quite clearly (I think) that I think we should
distinguish between the locale and the language, as far as their
effects on Emacs are concerned.

> > , and it's impossible to fold them all, or even their significant
> > fraction,
>
> Of course a significant fraction is possible. That's precisely what
> the priority lists have been achieving since the early 1990s.

Evidently, your examples try to show that the fraction is not
significant enough.

> If your complaint is that we should do better, "patches welcome" is
> the only thing I can think of to say.

No, I'm saying we shouldn't try to do better _automatically_. Users
have enough facilities to affect the defaults according to their
specific use-cases.

> > in a reasonably usable user-level interface. We shouldn't even
> > try, IMO; we already have prefer-coding-system
>
> Huh? prefer-coding-system has two effects: it promotes a certain
> coding-system to highest priority in its category, and it promotes
> that category to highest priority in case of ambiguity. IOW, it's a
> user override of the priority setting that comes from the language
> environment.

Exactly my point: the user can override the automated selections if
she needs. So the current automation doesn't need to do better.

> A completely different purpose (handling exceptions)
> from the language environment itself (handling the unmarked case).

Except that set-language-environment calls prefer-coding-system under
the hood to do most of its job...

> Are you sure you have any idea what you're talking about?

I think I do.
I'm not sure we are talking about the same thing, though.

> That's an honest question; the way you are going, I have to wonder.

Knowing me for as long as you do, I wonder how can such a question be
honest. But I digress.

> If you say "yes", I'll trust you, but I'd appreciate an explanation
> of what you're talking about that refers to real bugs in the current
> system, rather than general features that offend your sense of
> design.

I wasn't talking about any bugs at all. Werner suggested to add a new
_feature_; I was talking about what that feature should and shouldn't
include.

> [coding priority settings] are to remove ambiguities like "we have
> EUC, but which one?" and "we have Windows-125x, but which one?" and
> "since ISO-8859-1 allows all 256 bytes, if we want to give priority
> to Chinese or Japanese, that had better come late in the list!"

I don't think I said anything to the contrary. I would add, though,
that the priority settings also deal with "we have some encoding that
uses 8-bit bytes, but which encoding is that?"

> > > AFAIK Emacsen use the locale as a heuristic for determining the
> > > language environment
> >
> > There's no heuristic involved, AFAIR. Emacs has a database of
> > languages _and_encodings_ suitable for the known locale names.
>
> You're confusing "algorithmic" with "non-heuristic".

Please take a look at the database. I stand by what I wrote: there's
no heuristic anywhere in sight.

> And of course in this case, locale is a heuristic. *Emacs is a
> multilingual* (well, technically, multiscript) *application*, and any
> setting of the language environment that doesn't take into account the
> current text we're working with is surely heuristic.

If so, it's a heuristic that is external to Emacs. Emacs just abides
by it, because users expect that. Anyway, this aspect is entirely
unrelated to the issue at hand.

> > set-locale-environment uses that database to get the language and the
> > preferred encoding(s), then calls set-language-environment with the
> > language, and sets the priorities of the encodings according to the
> > encoding preferences.
>
> That's an unnecessary API, ISTM. (set-language-environment nil)
> should do that.

So we basically agree: the (not entirely complete) equivalence between
these 2 APIs is not TRT and it should go away. We may disagree which
API should be dropped and which one retained, but that's just a naming
issue (and maybe a consequence of the fact that you didn't know about
set-locale-environment before).

But this is not the main issue I wanted to discuss. The main issue is
what constitutes a "language environment" as far as Emacs is
concerned, after we factor out the effects of the locale? If we are
going to implement per-buffer language environments, we need to decide
that first and foremost.

Perhaps a useful starting point would be to ask: what exactly is a
"language name" string? should it specify only a language, or should
it also try to specify the preferred encodings?

> the POSIX_ME_HARDER locale is an abomination in a multilingual
> application and should be buried as deeply as we can manage. It is,
> of course, a useful heuristic for the user's preferred language
> environment for *scratch*, but that's about as far as we can take that.

I'm not sure it's as black and white as you make it sound. For
example, users of the same language on GNU/Linux and on MS-Windows
might very well disagree wrt the preferred encodings. So some aspects
of the locale still affect language-specific choices.

But again, I think talking about the locale just muddies the waters in
this discussion.
* Re: per-buffer language environments
From: Werner LEMBERG @ 2010-12-15 7:45 UTC
To: eliz; +Cc: stephen, emacs-devel, handa

> I wasn't talking about any bugs at all. Werner suggested to add a
> new _feature_; I was talking about what that feature should and
> shouldn't include.

Perhaps it makes sense to provide typical use cases instead of
theorizing a priori. Hopefully, others provide real-life scenarios
too.

My case:

  file language: Chinese or Japanese or Korean
  file encoding: UTF-8 (or any other flavour of Unicode)

Wish:

  Emacs should select a proper font based on a file language tag.
  The fonts should be specified by the user, to be configured as a
  preference list in `.emacs'.

Reason:

  It is not possible to automatically decide whether a given font
  like SimSun is really suitable for a given language; this concept
  is missing in the OpenType specification, contrary to, say,
  CID-keyed fonts. A hint might be the presence of a specific script
  and language tag in the font's OpenType tables (`HANI' and `CHN',
  respectively, for SimSun), but there are many TrueType fonts which
  don't have advanced OpenType features. Since SimSun contains
  Katakana, Hiragana, and CJK glyphs -- this might be deduced from
  the OS/2 table, and FontConfig checks that also -- it *can* be used
  for Japanese, but it doesn't *suit*.

This problem is really important for CJK fonts, however, even European
languages can be affected. For example, the right way in Romanian is
to write `ş' (s with cedilla), but it should be displayed as `ș' (s
with comma below). Recent OpenType fonts often contain proper
language tags so that a language specific mapping can be done, but
many, many Type 1 fonts don't; they contain the glyph name `scedilla',
but the real glyph displayed is s with comma below.

    Werner
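The wish in the message above (a user-configured font preference list,
selected by a per-file language tag) amounts to a keyed lookup followed
by an availability check. The following Python sketch illustrates only
that idea and is not an Emacs interface: `FONT_PREFERENCES', `pick_font'
and the language tags are hypothetical, and every font family except
SimSun is a placeholder name chosen for the example.

```python
# Hypothetical user configuration: language tag -> font families, best first.
FONT_PREFERENCES = {
    "ja": ["IPAMincho", "Kochi Mincho"],
    "zh": ["SimSun"],
    "ko": ["Baekmuk Batang"],
}


def pick_font(language_tag, installed_families):
    """Return the first preferred family for `language_tag` that is installed.

    Returns None when the user listed nothing usable, leaving the
    decision to whatever default font selection would otherwise do.
    """
    for family in FONT_PREFERENCES.get(language_tag, []):
        if family in installed_families:
            return family
    return None


installed = {"SimSun", "IPAMincho"}
print(pick_font("ja", installed))
print(pick_font("zh", installed))
```

The point of keying on the language tag rather than on glyph coverage
is the one made above: SimSun covers kana, so a coverage test alone
would accept it for Japanese, while the user's list never offers it
for "ja".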
* Re: per-buffer language environments
From: Stephen J. Turnbull @ 2010-12-16 21:10 UTC
To: Eli Zaretskii; +Cc: emacs-devel, handa

Eli Zaretskii writes:

> The emphasis on *reading* takes what I originally wrote out of its
> context. I didn't comment on reading alone, I commented on the entire
> issue of coding-systems being tied up to the language:

I know you were talking about something else, but I can't figure out
what or why. You said, "don't associate coding priorities with
language," I gave a good reason why there should be coding priorities
associated with language. The rest of what you write is irrelevant,
since none of it points out a real problem with that association.

> If the ambiguity you are talking about is that there are more settings
> than just for reading,

Of course that's not the ambiguity I'm talking about. The ambiguity
I'm talking about is in the reading, and that is sufficient reason to
associate priorities with language.

> I agree that it would be useful to have a language as per-buffer
> setting. This discussion is about what that should include.

It should include priorities for encoding detection.

> > Of course a significant fraction is possible. That's precisely what
> > the priority lists have been achieving since the early 1990s.
>
> Evidently, your examples try to show that the fraction is not
> significant enough.

No, my examples show what you will lose by removing the association of
encoding priority with language environment.

> > If your complaint is that we should do better, "patches welcome" is
> > the only thing I can think of to say.
>
> No, I'm saying we shouldn't try to do better _automatically_. Users
> have enough facilities to affect the defaults according to their
> specific use-cases.
Handa-san was not talking about trying to do better. He was talking about how we achieve the success rates we currently get. Removing the association of language with encoding priority would drastically decrease that for anybody who needs to deal with multiple languages and multiple associated encodings in their environment. > Exactly my point: the user can override the automated selections if > she needs. So the current automation doesn't need to do better. Well, your point is just plain wrong, then, because nobody is proposing a change w.r.t. the current automation. All that has been suggested is that we keep doing the same things we've been doing to achieve a reasonable degree of automatic recognition for people in environments with multiple encodings. > > A completely different purpose (handling exceptions) > > from the language environment itself (handling the unmarked case). > > Except that set-language-environment calls prefer-coding-system under > the hood to do most of its job... Yes, this works for Europeans, Arabs, and Israelis, because basically what you need to do is disambiguate ISO-8859-X, and just putting the right ISO coding system (or perhaps a Windows-125x coding system) at the head of the list (ie, just using prefer-coding-system) does what you need. It's not good enough for Han users because they need to disambiguate EUC from each other and from 8-bit ISO, and among Microsoft bogus encodings (Shift JIS and Big5). That means manipulating the priority lists at positions other than head of list. I'm not sure about Cyrillic users. > > That's an honest question; the way you are going, I have to wonder. > > Knowing me for as long as you do, I wonder how can such a question be > honest. But I digress. Usually you don't miss a point like "nobody is proposing anything new here for how language environments work". (All that is being proposed is making them buffer-local.) 
Since you did miss it, I have to wonder if you know anything about how encoding detection works internally. > I wasn't talking about any bugs at all. Werner suggested to add a new > _feature_; I was talking about what that feature should and shouldn't > include. Well, you're wrong about manipulating the coding priorities. It is not new, and it is needed. > > And of course in this case, locale is a heuristic. *Emacs is a > > multilingual* (well, technically, multiscript) *application*, and any > > setting of the language environment that doesn't take into account the > > current text we're working with is surely heuristic. > > If so, it's a heuristic that is external to Emacs. Emacs just abides > by it, because users expect that. Anyway, this aspect is entirely > unrelated to the issue at hand. Of course it's not unrelated. Referring to the locale is an external heuristic and therefore unreliable. If the user sets a language environment, that is surely better information than what you get from the locale. However, it's probably a good idea to merge information from the new language environment with that from the old one, giving precedence to the new. > But this is not the main issue I wanted to discuss. The main issue is > what constitutes a "language environment" as far as Emacs is > concerned, after we factor out the effects of the locale? What are you talking about, "factor out"? If the user sets a language environment, that will override the locale on all points where it specifies behavior. > Perhaps a useful starting point would be to ask: what exactly is a > "language name" string? should it specify only a language, or should > it also try to specify the preferred encodings? It should specify only the language, IMO. Determining the preferred encodings is complex but fairly mature at this point. 
If the user doesn't want the default priorities associated with a language, I don't see why they shouldn't use prefer-coding-system or set-coding-priority-list rather than piggyback on the language environment itself. > I'm not sure it's as black and white as you make it sound. For > example, users of the same language on GNU/Linux and on MS-Windows > might very well disagree wrt to the preferred encodings. So some > aspects of the locale still affect language-specific choices. Huh? That's not "locale", that's system convention. Locale is something else entirely. It's true that you can override that heuristic via locale, but (at least in XEmacs) we take the system type into account when computing the startup priorities, even if the locale specifies an encoding. I would imagine Emacs does the same. > But again, I think talking about the locale just muddies the waters > in this discussion. Then why do you keep talking about it? Can we agree that it's a good heuristic for (1) the initial language environment for *scratch* and (2) when an encoding is specified in the locale, it should be prefer-coding-system'd, and (3) after doing (1) and (2) we don't care about the locale any more? ^ permalink raw reply [flat|nested] 27+ messages in thread
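Editorial sketch of the point argued above, that a per-language priority list is what resolves ambiguous byte sequences. This is only a toy model in Python, not how Emacs or XEmacs implements detection: the `PRIORITIES` table, the `detect` function and the choice of codecs are all assumptions made for illustration, and the codec names are Python's, not Emacs coding systems.

```python
from typing import Dict, List, Optional

# Hypothetical per-language priority lists. The names are Python codec
# names chosen for illustration; they are not Emacs coding systems.
PRIORITIES: Dict[str, List[str]] = {
    "Japanese": ["euc_jp", "shift_jis", "utf_8"],
    "Korean": ["euc_kr", "utf_8"],
    "Chinese-GB": ["gb2312", "big5", "utf_8"],
}


def detect(data: bytes, language: str) -> Optional[str]:
    """Return the first codec in the language's list that decodes DATA."""
    for codec in PRIORITIES[language]:
        try:
            data.decode(codec)
        except UnicodeDecodeError:
            continue  # not valid in this codec, try the next candidate
        return codec
    return None  # nothing in the list accepts these bytes


# The same EUC-JP bytes are structurally valid in more than one EUC
# encoding, so the answer depends on which list is consulted.
sample = "日本語".encode("euc_jp")
print(detect(sample, "Japanese"))
print(detect(sample, "Korean"))
```

The bytes alone do not say which EUC variant was meant; only the order of the list does. That is the sense in which the ambiguity "is in the reading".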
* Re: per-buffer language environments 2010-12-16 21:10 ` Stephen J. Turnbull @ 2010-12-17 11:51 ` Eli Zaretskii 2010-12-18 6:29 ` Werner LEMBERG 2010-12-18 9:30 ` Stephen J. Turnbull 0 siblings, 2 replies; 27+ messages in thread From: Eli Zaretskii @ 2010-12-17 11:51 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: emacs-devel, handa > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Cc: handa@m17n.org, > emacs-devel@gnu.org > Date: Fri, 17 Dec 2010 06:10:44 +0900 > > I know you were talking about something else, but I can't figure out > what or why. Sorry for not making myself clear. Let me try again: . The issue is what it means to have a separate buffer-local "language environment". . The current machinery of language environments was invented and evolved to its current form as a global session-wide setting. I'm not sure the same set of heuristics, or even the extent of what "language environment" means and what settings it affects, are still correct for a buffer-local setting. . There's any number of possible use-cases for needing this kind of feature. They are all quite rare (if they weren't, we would have many complaints about not having such a feature, which we don't). The current heuristics encoded in the global language environment do not cover rare and marginal use-cases well, being just that -- a set of heuristics. It is therefore quite probable that just making the language environment buffer-local and keeping all the rest of its machinery and semantics would do the wrong thing for a large portion of the use-cases which need such a buffer-local feature. . IMO, the way we set priorities for selecting an encoding based on the language runs the highest risk of being inappropriate for this kind of buffer-local "language environment". That's because selection of an appropriate encoding depends on factors that have nothing to do with the language, for those languages which have several alternative encodings.
These factors include the locale, the filesystem on which the buffer's file lives (which could be local or remote), the purpose of the text that is edited (it could be a text file, or a program source, or an email message meant to be sent, or text to be sent to a subsidiary program or copy/pasted through a selection), and possibly some more. Setting the language can surely identify a small set of appropriate encodings, but I very much doubt that it can correctly select The Right One. . Therefore, I think that buffer-local "language environments" should not automatically select the encodings given just the language name, but instead let the user specify them separately when she selects the buffer-local language. > > > That's an honest question; the way you are going, I have to wonder. > > > > Knowing me for as long as you do, I wonder how can such a question be > > honest. But I digress. > > Usually you don't miss a point like "nobody is proposing anything new > here for how language environments work". (All that is being proposed > is making them buffer-local.) Since you did miss it, I have to wonder > if you know anything about how encoding detection works internally. Since you have the logs to get you straight about the degree of my knowledge in that issue, you should rather wonder whether I'm missing your point because I misunderstood what you are saying or because you failed to explain it clearly. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: per-buffer language environments 2010-12-17 11:51 ` Eli Zaretskii @ 2010-12-18 6:29 ` Werner LEMBERG 2010-12-18 9:30 ` Stephen J. Turnbull 1 sibling, 0 replies; 27+ messages in thread From: Werner LEMBERG @ 2010-12-18 6:29 UTC (permalink / raw) To: eliz; +Cc: stephen, handa, emacs-devel > . There's any number of possible use-cases for needing this kind > of feature. They are all quite rare (if they weren't, we would > have many complaints about not having such a feature, which we > don't). I would rather say that these use-cases are all quite new. Previously, Emacs didn't do too well with Unicode, and only recently have free CJK fonts (especially for Japanese) become available as TrueType fonts, so that the user can choose between well-crafted fonts. > The current heuristics encoded in the global language environment > do not cover rare and marginal use-cases well, being just that > -- a set of heuristics. It is therefore quite probable that just > making the language environment buffer-local and keeping all the > rest of its machinery and semantics would do the wrong thing for a > large portion of the use-cases which need such a buffer-local > feature. I don't think so; however, people on this list should submit more use-cases if possible so that we can decide this issue more easily. > Setting the language can surely identify a small set of > appropriate encodings, but I very much doubt that it can > correctly select The Right One. Note that I'm specifically talking about Unicode. IMHO, handling of all other encodings should stay as-is since they will be extinct soon anyway. Werner ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: per-buffer language environments 2010-12-17 11:51 ` Eli Zaretskii 2010-12-18 6:29 ` Werner LEMBERG @ 2010-12-18 9:30 ` Stephen J. Turnbull 2010-12-21 18:39 ` Eli Zaretskii 2010-12-21 21:16 ` Werner LEMBERG 1 sibling, 2 replies; 27+ messages in thread From: Stephen J. Turnbull @ 2010-12-18 9:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, handa Eli Zaretskii writes: > . The issue is what it means to have a separate buffer-local > "language environment". Please, let's postpone this until Handa-san has some time to work on it. I have two comments to make to try to avoid misunderstanding later. First, please note that what Werner needs has little or nothing to do with this discussion of modifying the coding priority list. Werner is in the fairly small set of users for whom encoding selection is a solved problem. If Emacs gets it wrong, he knows what to do about it. *His* problems are harder, and more deeply tied to language itself. The current language environment mechanism is good at what it does and will be somewhat improved by being made buffer-local, but to be really useful to Werner a number of additional attributes need to be added, as well as some functionality that I don't yet really know how to implement (eg, his Romanian s-with-comma-below vs. s-cedilla issue). Second, you wrote: > Since you have the logs to get you straight about the degree of my > knowledge in that issue, you should rather wonder whether I'm missing > your point because I misunderstood what you are saying or because you > failed to explain it clearly. *sigh* OK, then, let me make things perfectly clear. I have so wondered, but the words you have written make me believe that you have a fundamental misunderstanding of how buffer-file-coding-system gets set in Emacsen, and specifically you do not understand the role of the priority lists. Since the details matter (and I believe now differ across the Emacsen), I recommend you look at the source rather than have me explain. 
But I'll tell you how the general scheme works in XEmacs if you want, and later Handa-san can clarify any differences. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: per-buffer language environments 2010-12-18 9:30 ` Stephen J. Turnbull @ 2010-12-21 18:39 ` Eli Zaretskii 2010-12-21 21:16 ` Werner LEMBERG 1 sibling, 0 replies; 27+ messages in thread From: Eli Zaretskii @ 2010-12-21 18:39 UTC (permalink / raw) To: Stephen J. Turnbull; +Cc: emacs-devel, handa > From: "Stephen J. Turnbull" <stephen@xemacs.org> > Cc: handa@m17n.org, > emacs-devel@gnu.org > Date: Sat, 18 Dec 2010 18:30:39 +0900 > > the words you have written make me believe that you have > a fundamental misunderstanding of how buffer-file-coding-system gets > set in Emacsen, and specifically you do not understand the role of the > priority lists. You are wrong. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: per-buffer language environments 2010-12-18 9:30 ` Stephen J. Turnbull 2010-12-21 18:39 ` Eli Zaretskii @ 2010-12-21 21:16 ` Werner LEMBERG 2010-12-22 6:52 ` Stephen J. Turnbull 1 sibling, 1 reply; 27+ messages in thread From: Werner LEMBERG @ 2010-12-21 21:16 UTC (permalink / raw) To: stephen; +Cc: eliz, handa, emacs-devel > *His* problems are harder, and more deeply tied to language itself. > The current language environment mechanism is good at what it does > and will be somewhat improved by being made buffer-local, but to be > really useful to Werner a number of additional attributes need to be > added, as well as some functionality that I don't yet really know > how to implement (eg, his Romanian s-with-comma-below vs. s-cedilla > issue). IMHO, the Romanian functionality is nothing Emacs should take care at all. It should simply forward a `language environment' to the font library which has to take care of using the proper glyph. Today, most of the good multilingual OpenType fonts have support for that mechanism. However, for CJK stuff, the situation is very different. Virtually *no* font supports different glyphs for Chinese, Japanese, and Korean. Ken Lunde from Adobe has analyzed the problem in detail, and according to him, it would be necessary to add about 40% more glyphs, making huge fonts even larger. Werner ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: per-buffer language environments 2010-12-21 21:16 ` Werner LEMBERG @ 2010-12-22 6:52 ` Stephen J. Turnbull 2010-12-22 7:42 ` Werner LEMBERG 0 siblings, 1 reply; 27+ messages in thread From: Stephen J. Turnbull @ 2010-12-22 6:52 UTC (permalink / raw) To: Werner LEMBERG; +Cc: eliz, emacs-devel, handa Werner LEMBERG writes: > IMHO, the Romanian functionality is nothing Emacs should take care at > all. It should simply forward a `language environment' to the font > library which has to take care of using the proper glyph. Today, most > of the good multilingual OpenType fonts have support for that > mechanism. It's not obvious to me that that is a generally correct solution (see below for why I don't think it appropriate for CJK), but if it does work for European (and probably many other) languages, that's great. BTW, did you mean to say good *free* multilingual OpenType fonts, and just assume freedom, or was the omission prompted by reality? Freedom matters to Emacsen, of course. > However, for CJK stuff, the situation is very different. Virtually > *no* font supports different glyphs for Chinese, Japanese, and > Korean. It's not obvious to me that they should. If you look at the multiple Chinese languages, Japanese, Korean, and Vietnamese, you see that there are clearly Chinese styles (and I suspect differences among Taiwanese, Cantonese, and Mandarin styles), clearly Japanese styles, etc. with respect to stroke endings, attitude of slanted strokes, contact points, and extensions at join points. I don't think that people from different East Asian culture/languages would find compromise fonts acceptable, except perhaps in the very simplest of Gothic and Maru Gothic faces (Japanese names for font styles basically equivalent to sans-serif upright faces for Latin characters). 
Eg, in Emacs, even as one who learned Japanese late in life, I've gotten used to distinguishing Chinese spam from Japanese spam via such stylistic differences (strictly speaking, it's unnecessary as the presence of kana is normally decisive). I have to wonder if such stylistic fine points might not be very important to the comfort level of someone who is bilingual in Chinese and Japanese. But as a practical matter, today if Emacs wants to display Chinese attractively (maybe even "correctly"), it cannot use a Japanese font and compromise fonts with multilingual support basically don't exist. So even if support in Emacs for choosing appropriate fonts based on language is not needed for Romanian, it is needed for Han-based languages. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: per-buffer language environments 2010-12-22 6:52 ` Stephen J. Turnbull @ 2010-12-22 7:42 ` Werner LEMBERG 0 siblings, 0 replies; 27+ messages in thread From: Werner LEMBERG @ 2010-12-22 7:42 UTC (permalink / raw) To: stephen; +Cc: eliz, emacs-devel, handa > BTW, did you mean to say good *free* multilingual OpenType fonts, > and just assume freedom, or was the omission prompted by reality? > Freedom matters to Emacsen, of course. Today, virtually any new multilingual font, be it `free' or not, supports that. For example, the TeX Gyre project provides extended glyph sets for the URW clones of the 35 standard PS fonts, and of course the Romanian case is handled by its OpenType tables (script `latn', language `ROM', table `locl'). > > However, for CJK stuff, the situation is very different. > > Virtually *no* font supports different glyphs for Chinese, > > Japanese, and Korean. > > It's not obvious to me that they should. If you look at the > multiple Chinese languages, Japanese, Korean, and Vietnamese, you > see that there are clearly Chinese styles (and I suspect differences > among Taiwanese, Cantonese, and Mandarin styles), clearly Japanese > styles, etc. with respect to stroke endings, attitude of slanted > strokes, contact points, and extensions at join points. I don't > think that people from different East Asian culture/languages would > find compromise fonts acceptable, except perhaps in the very > simplest of Gothic and Maru Gothic faces (Japanese names for font > styles basically equivalent to sans-serif upright faces for Latin > characters). Eg, in Emacs, even as one who learned Japanese late in > life, I've gotten used to distinguishing Chinese spam from Japanese > spam via such stylistic differences (strictly speaking, it's > unnecessary as the presence of kana is normally decisive). I have > to wonder if such stylistic fine points might not be very important > to the comfort level of someone who is bilingual in Chinese and > Japanese. 
You've probably misunderstood me. The idea of the script/language tags within OpenType fonts is that you can map the input character codes to script or language specific glyphs. If you do so for CJK fonts, you need about 60% more glyphs *to get locale specific correct shapes*, as Ken Lunde has analyzed (unfortunately, his presentation is not available on the net). A great number of glyphs (about 40%), however, *can* be shared among the CJK locales; just think of the characters 一二三四五 which always have the same shape. In other words, the technical problems of having a single font with support for multiple CJK locales have been solved, but there is no such font (neither free nor non-free, AFAIK) which incorporates this technique. > But as a practical matter, today if Emacs wants to display Chinese > attractively (maybe even "correctly"), it cannot use a Japanese font > and compromise fonts with multilingual support basically don't > exist. So even if support in Emacs for choosing appropriate fonts > based on language is not needed for Romanian, it is needed for > Han-based languages. By `no support in Emacs needed' I mean that the burden of handling the issue can be transferred to a library (e.g. libotf). In case the library says "sorry, I can't provide a locale specific font", Emacs should do nothing for Romanian (since the user can easily install a font with proper support), but should actually handle the case for CJK. Werner ^ permalink raw reply [flat|nested] 27+ messages in thread
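Editorial sketch of the decision rule described in the message above: delegate to the font library when the font carries language-specific forms, do nothing for Romanian otherwise, and fall back to Emacs-side handling for CJK. The function name, the language codes and the returned labels are hypothetical; no real libotf or Emacs interface is being described.

```python
# Hypothetical language codes for the Han-based cases discussed above.
CJK_LANGUAGES = frozenset({"zh", "ja", "ko"})


def glyph_strategy(language: str, font_has_localized_forms: bool) -> str:
    """Decide who is responsible for picking language-specific glyphs."""
    if font_has_localized_forms:
        # The font library can map characters to localized glyphs itself.
        return "delegate-to-font-library"
    if language in CJK_LANGUAGES:
        # No single font covers all CJK locales, so the editor must
        # choose a font per language.
        return "select-language-specific-font"
    # E.g. Romanian: the user can install a font with proper support.
    return "use-default-glyphs"


print(glyph_strategy("ro", False))
print(glyph_strategy("ja", False))
```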
* Re: per-buffer language environments 2010-12-15 6:47 ` Eli Zaretskii 2010-12-15 7:45 ` Werner LEMBERG 2010-12-16 21:10 ` Stephen J. Turnbull @ 2010-12-17 0:51 ` Kenichi Handa 2010-12-17 2:48 ` Stephen J. Turnbull 2010-12-17 11:05 ` Eli Zaretskii 2 siblings, 2 replies; 27+ messages in thread From: Kenichi Handa @ 2010-12-17 0:51 UTC (permalink / raw) To: Eli Zaretskii; +Cc: stephen, emacs-devel I should join this discussion, but sorry, I don't have time at the moment. I'd like to say one thing: In article <E1PSl9O-0001wu-GB@fencepost.gnu.org>, Eli Zaretskii <eliz@gnu.org> writes: > Perhaps a useful starting point would be to ask: what exactly is a > "language name" string? should it specify only a language, or should > it also try to specify the preferred encodings? The reason why I chose the term "language environment" instead of simple "language" was to make it provide a set of various good settings (i.e. environment). So my intention of "language environment" name was not a language name but an environment name. This concept was close to "locale" but when I made it, it was not clear what the system locale can do. Anyway, for that, "language" is just one aspect, and thus there are variants of Chinese-* (which specify both language and encoding) and variants of Latin-* (which specify only encoding). A while ago, I proposed a more dynamic way of specifying a language environment, which allows the user to freely name an environment by any combination of language and encoding, with Emacs automatically generating a proper "language environment" object associated with the specified name. The name can have this syntax: LANGUAGE-[ENCODING[-CHARSET[-INPUT_METHOD]]] or any other convenient syntax (e.g. keyword-value pairs). But that idea was rejected because it's overkill. --- Kenichi Handa handa@m17n.org ^ permalink raw reply [flat|nested] 27+ messages in thread
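Editorial sketch of how a name following the LANGUAGE-[ENCODING[-CHARSET[-INPUT_METHOD]]] syntax could be split into its parts. This is a guess at the shape of the rejected proposal, not its actual design: `EnvironmentSpec` and `parse_environment_name` are hypothetical names, and the sketch assumes that no component contains a hyphen of its own (so an encoding written as "iso-8859-1" would need a different separator or the keyword-value form).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EnvironmentSpec:
    """Components of a composed environment name; only language is required."""
    language: str
    encoding: Optional[str] = None
    charset: Optional[str] = None
    input_method: Optional[str] = None


def parse_environment_name(name: str) -> EnvironmentSpec:
    """Split NAME on hyphens into at most four non-empty components."""
    parts = name.split("-")
    if len(parts) > 4 or any(not part for part in parts):
        raise ValueError(f"malformed environment name: {name!r}")
    # Missing trailing components keep their default of None.
    return EnvironmentSpec(*parts)


print(parse_environment_name("Japanese"))
print(parse_environment_name("Chinese-BIG5"))
```

Because each later component is only meaningful when the earlier ones are present, positional parsing is enough here; a keyword-value syntax would lift that restriction at the cost of longer names.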
* Re: per-buffer language environments 2010-12-17 0:51 ` Kenichi Handa @ 2010-12-17 2:48 ` Stephen J. Turnbull 2010-12-17 11:05 ` Eli Zaretskii 1 sibling, 0 replies; 27+ messages in thread From: Stephen J. Turnbull @ 2010-12-17 2:48 UTC (permalink / raw) To: Kenichi Handa; +Cc: Eli Zaretskii, emacs-devel Kenichi Handa writes: > I should join this discussion, but sorry, I don't have a > time at the moment. I'd like to say one thing: Well, it *is* shiwasu, after all.[1] I think it's best if Eli and I defer the discussion until you can join, then. Footnotes: [1] To let everybody in on the joke, it's a Japanese name for December which literally means "[even] teachers run". ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: per-buffer language environments 2010-12-17 0:51 ` Kenichi Handa 2010-12-17 2:48 ` Stephen J. Turnbull @ 2010-12-17 11:05 ` Eli Zaretskii 1 sibling, 0 replies; 27+ messages in thread From: Eli Zaretskii @ 2010-12-17 11:05 UTC (permalink / raw) To: Kenichi Handa; +Cc: stephen, emacs-devel > From: Kenichi Handa <handa@m17n.org> > Cc: stephen@xemacs.org, emacs-devel@gnu.org > Date: Fri, 17 Dec 2010 09:51:23 +0900 > > The reason why I chose the term "language environment" > instead of simple "language" was to make it provide a set of > various good settings (i.e. environment). So my intention > of "language environment" name was not a language name but > an environment name. That's okay, but it's not clear whether what Werner wants still fits this model, which was originally invented to be global for the entire Emacs session. Since now the issue (as I understand it) is to provide a more fine-grained feature, we should analyze again whether it is still appropriate to specify the whole thing (language, input method, default encodings for all of their varieties, etc.) with a single string argument, or indeed with a single API/UI. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: per-buffer language environments 2010-12-13 7:56 ` Kenichi Handa 2010-12-13 9:27 ` Werner LEMBERG 2010-12-13 11:47 ` Eli Zaretskii @ 2010-12-18 17:03 ` Per Starbäck 2010-12-19 13:54 ` Stefan Monnier 2 siblings, 1 reply; 27+ messages in thread From: Per Starbäck @ 2010-12-18 17:03 UTC (permalink / raw) To: Kenichi Handa; +Cc: eliz, emacs-devel I've "always" wanted Emacs to know what natural language a buffer is in, at least text buffers, maybe as a minor mode. Here are two things I think haven't been mentioned in the thread yet, which I would like: * automatically ispell-change-dictionary * have language-specific abbrevs If there's a hook for changing from and to specific languages, that could be used for other things as well, like changing the values of some sentence-end-* variables. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: per-buffer language environments 2010-12-18 17:03 ` Per Starbäck @ 2010-12-19 13:54 ` Stefan Monnier 2010-12-19 21:05 ` Dimitri Fontaine 0 siblings, 1 reply; 27+ messages in thread From: Stefan Monnier @ 2010-12-19 13:54 UTC (permalink / raw) To: Per Starbäck; +Cc: eliz, emacs-devel, Kenichi Handa > * automatically ispell-change-dictionary Yes, that would be nice. It'd have to work "on the fly" to be useful for me (typically I want/need it for email messages, which start empty). I was thinking that (fly|i)spell could detect "too many misspellings" and trigger an auto-detect. Stefan ^ permalink raw reply [flat|nested] 27+ messages in thread
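Editorial sketch of the idea in the message above: keep the current dictionary while the misspelling rate is tolerable, and re-detect the language once it is exceeded. Everything here is an assumption made for illustration (the function names, the 0.5 threshold, the tiny word sets); it does not describe how ispell or flyspell behave.

```python
from typing import Dict, List, Set


def misspelling_rate(words: List[str], dictionary: Set[str]) -> float:
    """Fraction of WORDS that are not found in DICTIONARY."""
    if not words:
        return 0.0  # an empty buffer gives no evidence either way
    misses = sum(1 for word in words if word.lower() not in dictionary)
    return misses / len(words)


def guess_language(words: List[str], dictionaries: Dict[str, Set[str]],
                   current: str, threshold: float = 0.5) -> str:
    """Keep CURRENT unless its misspelling rate exceeds THRESHOLD."""
    if misspelling_rate(words, dictionaries[current]) <= threshold:
        return current
    # "Too many misspellings": pick the dictionary that fits best.
    return min(dictionaries,
               key=lambda lang: misspelling_rate(words, dictionaries[lang]))


# Toy dictionaries, far too small for real use.
DICTIONARIES = {
    "en": {"the", "cat", "is", "on", "table"},
    "fr": {"le", "chat", "est", "sur", "la", "table"},
}
print(guess_language("le chat est sur la table".split(), DICTIONARIES, "en"))
```

Checking the current language first keeps the common case cheap and avoids flapping between dictionaries on a buffer that merely contains a few foreign words.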
* Re: per-buffer language environments 2010-12-19 13:54 ` Stefan Monnier @ 2010-12-19 21:05 ` Dimitri Fontaine 0 siblings, 0 replies; 27+ messages in thread From: Dimitri Fontaine @ 2010-12-19 21:05 UTC (permalink / raw) To: Stefan Monnier; +Cc: Per Starbäck, eliz, Kenichi Handa, emacs-devel Stefan Monnier <monnier@iro.umontreal.ca> writes: >> * automatically ispell-change-dictionary > > Yes, that would be nice. It'd have to work "on the fly" to be useful > for me (typically I want/need it for email messages, which start empty). > I was thinking that (fly|i)spell could detect "too many misspellings" and > trigger an auto-detect. See: http://git.naquadah.org/?p=flyguess.git;a=summary http://www.emacswiki.org/emacs-fr/CategorySpelling Regards, -- dim ^ permalink raw reply [flat|nested] 27+ messages in thread