From: Peter Dyballa
Newsgroups: gmane.emacs.help
Subject: Re: Trying to input Unicode via GNU Emacs 21.3.1
Date: Sat, 12 Feb 2005 14:29:48 +0100
Message-ID: <829b69a91977297f238f652bca4d03da@Web.DE>
References: <74cf5544e5f816d57ce9bcb40de1cbc9@norvelle.org>
Cc: help-gnu-emacs@gnu.org

On 11.02.2005 at 22:00, List account wrote:

> For instance, I need to be able to display the typical accented
> Spanish, Italian and French characters. As an example, I can input
> "Alarcón" in Emacs and it looks fine, but it displays in my browser
> (Camino 0.82 on Mac OS X) as "AlarcÃ³n". The odd thing is that I
> basically copied and modified this text from a page that actually
> works just fine.

Camino is not clever at guessing an HTML file's encoding: I can tell it
the right encoding ten times and more, and when I return to that page
it is back to the default encoding from the preferences. So better not
rely on guessing, and start your HTML file by declaring its charset in
the header. All charset names are defined here:
http://www.iana.org/assignments/character-sets.

The two characters Ã³ show that what you typed in GNU Emacs was
correctly encoded as UTF-8. Character Palette (in Mac OS X) tells me
that ó is "C3 B3" in UTF-8, i.e. Ã followed by ³. Camino should be able
to display these two characters as one ó if you VIEW the page in UTF-8.
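
A small Python sketch of the same effect, in case it helps to see it outside of Emacs and Camino. It assumes the browser falls back to ISO-8859-1 when no charset is declared (that is a guess about the default in the preferences), and the variable names are made up for illustration:

```python
# Illustration only; variable names are made up for this sketch.
# Assumption: the browser falls back to ISO-8859-1 when no charset is declared.

text = "Alarc\u00f3n"  # "Alarcon" with o-acute (U+00F3), as typed in Emacs

utf8_bytes = text.encode("utf-8")          # what gets saved with a UTF-8 coding system
misread = utf8_bytes.decode("iso-8859-1")  # the same bytes viewed as Latin-1
reread = utf8_bytes.decode("utf-8")        # the same bytes viewed as UTF-8

# The single character U+00F3 becomes the two bytes C3 B3 in UTF-8.
o_acute_hex = " ".join(f"{b:02X}" for b in "\u00f3".encode("utf-8"))
print(o_acute_hex)                           # C3 B3

# Read as Latin-1, C3 and B3 are two separate characters: U+00C3 and U+00B3.
print([hex(ord(c)) for c in misread[5:7]])   # ['0xc3', '0xb3']

# Read as UTF-8, the original text comes back unchanged.
print(reread == text)                        # True
```

So the file itself is fine; only the encoding used for viewing it is wrong.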

Defining the charset in the HTML source's header should make Camino,
and other browsers, switch automatically to the correct character
set -- and maybe you should also have set a font that covers Unicode!

>
> I have the following lines in my .emacs:
> (setq locale-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-selection-coding-system 'utf-8)
> (prefer-coding-system 'utf-8)

It has been said a few times that this is too much; at least
set-keyboard-coding-system is incorrect. Usually your keyboard will
work in some Latin mode, i.e. produce only *one* byte on hitting or
releasing a key (a character in UTF-8 takes one, two, three, or up to
four bytes). It might be more helpful to set LANG to some (Spanish?
French?) UTF-8 locale (see man locale).

>
> I have also tried the technique of hitting [C-q] and entering the
> Unicode string, but it chokes on the codes for accented characters and
> instead of inserting the accented "a" character (0x00E1) by typing C-q
> 0 0 E 1 it produces "^@e1".

As far as I know the C-q syntax supports only *octal* values. So the
input ends when you type something outside the octal range 0...7; e is
one such terminating item, RET another. So you see ASCII NUL, which is
represented in Emacs as ^@, followed by e and 1, which are unchanged.

--
Greetings

  Pete

Basic, n.: A programming language. Related to certain social
diseases in that those who have it will not admit it in polite company.