From: Stefan Monnier
Newsgroups: gmane.emacs.help
Subject: Re: word syntax/umlauts emacs 23 vs 22
Date: Wed, 20 Oct 2010 21:27:20 -0400
To: help-gnu-emacs@gnu.org

> | I cannot reproduce it.  What is your LANG/LC_ALL setting?
> LANG=de_DE.UTF-8
> | > I use unibyte in regular use.
> | How do you tell Emacs to use unibyte?
> I just recognized that I had set EMACS_UNIBYTE in the environment.
> If I unset this and start /usr/bin/emacs -Q, I get correct word-movement
> on Umlauts inserted on a German keyboard.

Great.

> Now we still have basically all of our files in unibyte encoding, and

"unibyte encoding" is a term that makes sense here, but searching for it
won't put you on the right track, I'm afraid ;-)

> they show as M\374ller, with the single-byte Umlauts as escape sequences,

Your "unibyte encoding" is most likely latin-1 or latin-9, so your
problem now is that Emacs for some reason does not try latin-1 for those
files that don't use utf-8.

   C-x RET r latin-1 RET

should cause the file to be re-read as a latin-1 file, and it should
then be displayed properly.

Now, the question is why didn't Emacs recognize the file as a latin-1
file.  If you do

   emacs23 -Q ~/tmp/foo.txt

where foo.txt is a file encoded in latin-1 that contains Müller and some
more ASCII text, Emacs should properly recognize the file as latin-1 (as
indicated in the leftmost part of the mode-line by "-1:") and the ü
should be recognized and displayed fine.  At least it works for me (and
many more people).  So if that doesn't work for you, there's something
more going on (maybe you'll want to try it with different files, because
it may be a problem in the file's encoding).

> and word-movement stops at the non-ascii char.  I found that if I
> customize the latin1-display Variable, they show up as Umlauts, and
> word-movement also behaves properly.  Is setting latin1-display the
> Right Thing to work with the unibyte files?

No, the "latin1-display" thingy, as the name implies, deals with display
and hence just works around the problem, just like your reliance on
EMACS_UNIBYTE did.


        Stefan
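
To illustrate why such a file shows up as M\374ller when it is not
decoded as latin-1, here is a minimal sketch in plain Python.  It is only
an illustration of the byte values involved, not how Emacs itself detects
coding systems; the variable names are made up for the example.

```python
# Illustration only: plain Python, not Emacs's own detection logic.
raw = b"M\374ller"  # octal 374 == 0xFC, the single-byte u-umlaut

# Decoded as latin-1, byte 0xFC maps directly to U+00FC (u-umlaut).
as_latin1 = raw.decode("latin-1")

# Decoded as UTF-8, a lone 0xFC byte is not a valid sequence.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# The same text stored as UTF-8 needs two bytes for the umlaut.
as_utf8_bytes = as_latin1.encode("utf-8")

print(valid_utf8, as_utf8_bytes)
```

So the byte is perfectly good latin-1 but cannot be UTF-8, which is why
re-reading the file with an explicit latin-1 coding system should make the
umlaut display properly.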