unofficial mirror of help-gnu-emacs@gnu.org
* Trying to input Unicode via GNU Emacs 21.3.1
@ 2005-02-11 21:00 List account
  2005-02-12 13:29 ` Peter Dyballa
  0 siblings, 1 reply; 9+ messages in thread
From: List account @ 2005-02-11 21:00 UTC (permalink / raw)




Greetings...

My apologies if this question has already been answered before, but I 
couldn't find a relevant answer in the archives.

I am trying to use GNU Emacs 21.3.1 on FreeBSD (5.3) to edit web pages 
(I'm accessing my FreeBSD machine via Terminal.App on a Mac, with 
TERM=xterm-color).  I need to input Unicode characters and have them 
appear properly in web browsers.  Currently, I have gotten Emacs to use 
"Unicode" mode (i.e. the two or three little "u"'s appear at the bottom 
left), and I am able to enter characters that look just fine in Emacs, 
but they display as gibberish in browsers.

For instance, I need to be able to display the typical accented 
Spanish, Italian and French characters.  As an example, I can input 
"Alarcón" in Emacs and it looks fine, but it displays in my browser 
(Camino 0.82 on Mac OS X) as "AlarcÃ³n".  The odd thing is that I 
basically copied and modified this text from a page that actually works 
just fine.

I have the following lines in my .emacs:
(setq locale-coding-system 'utf-8)
(set-terminal-coding-system 'utf-8)
(set-keyboard-coding-system 'utf-8)
(set-selection-coding-system 'utf-8)
(prefer-coding-system 'utf-8)

I have also tried the technique of hitting [C-q] and entering the 
Unicode string, but it chokes on the codes for accented characters and 
instead of inserting the accented "a" character (0x00E1) by typing C-q 
0 0 E 1 it produces "^@e1".

Any suggestions?

Thanks a lot!

-Erik Norvelle
erik (at) norvelle (dot) org
Facultad de Filosofía y Letras
Universidad de Navarra
Pamplona, Navarra, España


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Trying to input Unicode via GNU Emacs 21.3.1
       [not found] <mailman.1808.1108157331.2841.help-gnu-emacs@gnu.org>
@ 2005-02-11 21:31 ` David Kastrup
  2005-02-12  3:06   ` August
       [not found]   ` <mailman.1838.1108178666.2841.help-gnu-emacs@gnu.org>
  0 siblings, 2 replies; 9+ messages in thread
From: David Kastrup @ 2005-02-11 21:31 UTC (permalink / raw)


List account <lists@norvelle.org> writes:

> I am trying to use GNU Emacs 21.3.1 on FreeBSD (5.3) to edit web pages
> (I'm accessing my FreeBSD machine via Terminal.App on a Mac, with
> TERM=xterm-color).  I need to input Unicode characters and have them
> appear properly in web browsers.  Currently, I have gotten Emacs to
> use "Unicode" mode (i.e. the two or three little "u"'s appear at the
> bottom left), and I am able to enter characters that look just fine in
> Emacs, but they display as gibberish in browsers.
>
> For instance, I need to be able to display the typical accented
> Spanish, Italian and French characters.  As an example, I can input
> "Alarcón" in Emacs and it looks fine, but it displays in my browser
> (Camino 0.82 on Mac OS X) as "AlarcÃ³n".  The odd thing is that I
> basically copied and modified this text from a page that actually
> works just fine.
>
> I have the following lines in my .emacs:
> (setq locale-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-selection-coding-system 'utf-8)
> (prefer-coding-system 'utf-8)

It would appear that the browser is of the opinion that the selection
is in latin-1, your system default.  You are explicitly telling Emacs
to ignore the system default.

Also with your other settings you tell Emacs that everything the
locale appears to be is wrong.  The easiest thing probably would be if
you not only told your Emacs that all of your environment is utf-8,
but if you just configured your environment to actually be so, in
which case you would not have to tell all of those lies to Emacs.

It may be that in a Latin-1 locale, Emacs-21.3 does not have a way to
tell the browser "Everything in the selection is utf-8".  I believe
that the development version of Emacs _has_ had some changes, due to
some X conventions that were introduced or became commonplace
only after Emacs 21.3 was released, so it might fare better with
passing Unicode characters over a selection that is principally
Latin-1, at least when the other program also knows about those
conventions.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Trying to input Unicode via GNU Emacs 21.3.1
  2005-02-11 21:31 ` Trying to input Unicode via GNU Emacs 21.3.1 David Kastrup
@ 2005-02-12  3:06   ` August
  2005-02-12 16:15     ` August
       [not found]   ` <mailman.1838.1108178666.2841.help-gnu-emacs@gnu.org>
  1 sibling, 1 reply; 9+ messages in thread
From: August @ 2005-02-12  3:06 UTC (permalink / raw)


On Fri, 2005-02-11 at 22:31 +0100, David Kastrup wrote:
> List account <lists@norvelle.org> writes:
> 
> > I am trying to use GNU Emacs 21.3.1 on FreeBSD (5.3) to edit web pages
> > (I'm accessing my FreeBSD machine via Terminal.App on a Mac, with
> > TERM=xterm-color).  I need to input Unicode characters and have them
> > appear properly in web browsers.  Currently, I have gotten Emacs to
> > use "Unicode" mode (i.e. the two or three little "u"'s appear at the
> > bottom left), and I am able to enter characters that look just fine in
> > Emacs, but they display as gibberish in browsers.
> >
> > For instance, I need to be able to display the typical accented
> > Spanish, Italian and French characters.  As an example, I can input
> > "Alarcón" in Emacs and it looks fine, but it displays in my browser
> > (Camino 0.82 on Mac OS X) as "AlarcÃ³n".  The odd thing is that I
> > basically copied and modified this text from a page that actually
> > works just fine.
> >
> > I have the following lines in my .emacs:
> > (setq locale-coding-system 'utf-8)
> > (set-terminal-coding-system 'utf-8)
> > (set-keyboard-coding-system 'utf-8)
> > (set-selection-coding-system 'utf-8)
> > (prefer-coding-system 'utf-8)
> 
> It would appear that the browser is of the opinion that the selection
> is in latin-1, your system default.  You are explicitly telling Emacs
> to ignore the system default.
> 
> Also with your other settings you tell Emacs that everything the
> locale appears to be is wrong.  The easiest thing probably would be if
> you not only told your Emacs that all of your environment is utf-8,
> but if you just configured your environment to actually be so, in
> which case you would not have to tell all of those lies to Emacs.
> 
> It may be that in a Latin-1 locale, Emacs-21.3 does not have a way to
> tell the browser "Everything in the selection is utf-8".  I believe
> that the development version of Emacs _has_ had some changes, due to
> some X conventions that were introduced or became commonplace
> only after Emacs 21.3 was released, so it might fare better with
> passing Unicode characters over a selection that is principally
> Latin-1, at least when the other program also knows about those
> conventions.

I'm not sure it's the settings that cause the problem. I run Emacs on
Fedora Core 3 and have no coding system settings in my `.emacs'. All new
buffers have coding system utf-8 by default, but Mozilla Firefox does
not display the letters `å', `ä', `ö', `Å', `Ä' or `Ö' correctly when I
view my own HTML pages. If I choose Latin-1 in Emacs, they work with
Mozilla.

-- 
August

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Trying to input Unicode via GNU Emacs 21.3.1
       [not found]   ` <mailman.1838.1108178666.2841.help-gnu-emacs@gnu.org>
@ 2005-02-12 10:47     ` David Kastrup
  2005-02-14  1:35       ` Stefan Monnier
  0 siblings, 1 reply; 9+ messages in thread
From: David Kastrup @ 2005-02-12 10:47 UTC (permalink / raw)


August <fusionfive@comhem.se> writes:

> I'm not sure it's the settings that cause the problem. I run Emacs
> on Fedora Core 3 and have no coding system settings in my
> `.emacs'. All new buffers have coding system utf-8 by default, but
> Mozilla Firefox does not display the letters `å', `ä', `ö', `Å', `Ä'
> or `Ö' correctly when I view my own HTML pages. If I choose Latin-1
> in Emacs, they work with Mozilla.

Sounds like a setting in the source of your HTML pages, or a Mozilla
setting.


-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Trying to input Unicode via GNU Emacs 21.3.1
  2005-02-11 21:00 List account
@ 2005-02-12 13:29 ` Peter Dyballa
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Dyballa @ 2005-02-12 13:29 UTC (permalink / raw)
  Cc: help-gnu-emacs


On 11.02.2005 at 22:00, List account wrote:

> For instance, I need to be able to display the typical accented 
> Spanish, Italian and French characters.  As an example, I can input 
> "Alarcón" in Emacs and it looks fine, but it displays in my browser 
> (Camino 0.82 on Mac OS X) as "AlarcÃ³n".  The odd thing is that I 
> basically copied and modified this text from a page that actually 
> works just fine.

Camino is not clever at guessing an HTML file's encoding: I can tell it 
the right encoding ten times or more, and when I return to that page 
it is back to the default encoding from the preferences. So don't rely 
on the browser guessing; start your HTML file this way:

<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
   <!-- ... other things ... -->
</head>

Here all charset names are defined: 
http://www.iana.org/assignments/character-sets.

The two characters Ã³ show that what you typed in GNU Emacs was 
correctly encoded as UTF-8. Character Palette (in Mac OS X) tells me 
that ó is "C3 B3" in UTF-8, i.e. Ã followed by ³ when those two bytes 
are read as Latin-1. Camino should be able to display these two bytes 
as one ó if you VIEW the page in UTF-8. Declaring the charset in the 
HTML source's header should make Camino, and other browsers, switch 
automatically to the correct character set -- and maybe you should 
also have chosen a font that covers Unicode!
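
To see the same effect outside the browser, here is a minimal Python
sketch. The function name misread_as_latin1 is only illustrative and
does not come from any tool mentioned in this thread. It encodes a
string as UTF-8 and then decodes the resulting bytes as Latin-1, which
is the mistake the browser appears to be making:

```python
def misread_as_latin1(text):
    """Encode text as UTF-8, then decode those bytes as Latin-1.

    This mimics a viewer that receives UTF-8 bytes but assumes
    ISO-8859-1 (Latin-1).
    """
    return text.encode("utf-8").decode("latin-1")


# The UTF-8 encoding of o-acute is the two bytes C3 B3.
utf8_bytes = "ó".encode("utf-8")
print(" ".join(f"{b:02X}" for b in utf8_bytes))

# Each byte is then shown as its own Latin-1 character.
print(misread_as_latin1("Alarcón"))
```

If the second line printed matches what the browser shows, the file on
disk is most likely valid UTF-8 and only the viewer's assumption is
wrong.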

>
> I have the following lines in my .emacs:
> (setq locale-coding-system 'utf-8)
> (set-terminal-coding-system 'utf-8)
> (set-keyboard-coding-system 'utf-8)
> (set-selection-coding-system 'utf-8)
> (prefer-coding-system 'utf-8)

It has been said a few times that this is too much; at least 
set-keyboard-coding-system is incorrect. Usually your keyboard will 
work in some Latin mode, i.e. produce only *one* byte per character 
when you hit or release a key, whereas UTF-8 uses one, two, three, or 
four bytes per character depending on the code point. It might be more 
helpful to set LANG to some (Spanish? French?) UTF-8 locale (man 
locale).
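
For reference, the number of bytes UTF-8 needs for a given character
can be checked with a short Python sketch. The helper name utf8_length
and the sample characters are my own choices for illustration:

```python
def utf8_length(ch):
    """Return how many bytes the UTF-8 encoding of ch occupies."""
    return len(ch.encode("utf-8"))


samples = [
    "a",           # ASCII letter
    "\u00f3",      # o with acute accent
    "\u20ac",      # euro sign
    "\U0001D11E",  # musical G clef, outside the Basic Multilingual Plane
]

for ch in samples:
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s) in UTF-8")
```

The accented Latin letters discussed in this thread all fall in the
two-byte range.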

>
> I have also tried the technique of hitting [C-q] and entering the 
> Unicode string, but it chokes on the codes for accented characters and 
> instead of inserting the accented "a" character (0x00E1) by typing C-q 
> 0 0 E 1 it produces "^@e1".

As far as I know, the C-q syntax supports only *octal* values by 
default. So the input ends when you type something outside the octal 
range 0...7; e is that terminating item, RET is another. So you see 
ASCII NUL, which Emacs displays as ^@, followed by e and 1, which are 
inserted unchanged.
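
The digit handling can be modelled with a small Python sketch. This is
a simplified model of my own, not Emacs's actual implementation: it
only covers the case where the input starts with digits, and it does
not model RET being discarded as a terminator. The radix parameter
stands in for the Emacs variable read-quoted-char-radix, which I
believe defaults to 8:

```python
def read_quoted_char(keys, radix=8):
    """Accumulate leading digits valid in `radix`.

    Returns (code, leftover), where leftover holds the keys that were
    not consumed and would be inserted as ordinary characters.
    """
    digits = "0123456789abcdefghijklmnopqrstuvwxyz"[:radix]
    code = 0
    consumed = 0
    for key in keys:
        if key.lower() not in digits:
            break  # first key that is not a digit in this radix ends the number
        code = code * radix + digits.index(key.lower())
        consumed += 1
    return code, keys[consumed:]


print(read_quoted_char("00e1"))            # octal: stops at "e"
print(read_quoted_char("341"))             # octal 341 is decimal 225
print(read_quoted_char("00e1", radix=16))  # hexadecimal reading
```

In the octal case the model yields code 0 (NUL) with "e1" left over,
which matches the "^@e1" you saw. Typing the octal value 341 instead
gives code 225, i.e. 0xE1. How Emacs 21 maps that code to a character
in a UTF-8 buffer is a separate question I have not checked.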

--
Greetings

   Pete

   Basic, n.:
	A programming language.  Related to certain social diseases in
that those who have it will not admit it in polite company.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Trying to input Unicode via GNU Emacs 21.3.1
  2005-02-12  3:06   ` August
@ 2005-02-12 16:15     ` August
  2005-02-12 17:27       ` Erik Norvelle
  0 siblings, 1 reply; 9+ messages in thread
From: August @ 2005-02-12 16:15 UTC (permalink / raw)


On Sat, 2005-02-12 at 04:06 +0100, August wrote:
> On Fri, 2005-02-11 at 22:31 +0100, David Kastrup wrote:
> > List account <lists@norvelle.org> writes:
> > 
> > > I am trying to use GNU Emacs 21.3.1 on FreeBSD (5.3) to edit web pages
> > > (I'm accessing my FreeBSD machine via Terminal.App on a Mac, with
> > > TERM=xterm-color).  I need to input Unicode characters and have them
> > > appear properly in web browsers.  Currently, I have gotten Emacs to
> > > use "Unicode" mode (i.e. the two or three little "u"'s appear at the
> > > bottom left), and I am able to enter characters that look just fine in
> > > Emacs, but they display as gibberish in browsers.
> > >
> > > For instance, I need to be able to display the typical accented
> > > Spanish, Italian and French characters.  As an example, I can input
> > > "Alarcón" in Emacs and it looks fine, but it displays in my browser
> > > (Camino 0.82 on Mac OS X) as "AlarcÃ³n".  The odd thing is that I
> > > basically copied and modified this text from a page that actually
> > > works just fine.
> > >
> > > I have the following lines in my .emacs:
> > > (setq locale-coding-system 'utf-8)
> > > (set-terminal-coding-system 'utf-8)
> > > (set-keyboard-coding-system 'utf-8)
> > > (set-selection-coding-system 'utf-8)
> > > (prefer-coding-system 'utf-8)
> > 
> > It would appear that the browser is of the opinion that the selection
> > is in latin-1, your system default.  You are explicitly telling Emacs
> > to ignore the system default.
> > 
> > Also with your other settings you tell Emacs that everything the
> > locale appears to be is wrong.  The easiest thing probably would be if
> > you not only told your Emacs that all of your environment is utf-8,
> > but if you just configured your environment to actually be so, in
> > which case you would not have to tell all of those lies to Emacs.
> > 
> > It may be that in a Latin-1 locale, Emacs-21.3 does not have a way to
> > tell the browser "Everything in the selection is utf-8".  I believe
> > that the development version of Emacs _has_ had some changes, due to
> > some X conventions that were introduced or became commonplace
> > only after Emacs 21.3 was released, so it might fare better with
> > passing Unicode characters over a selection that is principally
> > Latin-1, at least when the other program also knows about those
> > conventions.
> 
> I'm not sure it's the settings that cause the problem. I run Emacs on
> Fedora Core 3 and have no coding system settings in my `.emacs'. All new
> buffers have coding system utf-8 by default, but Mozilla Firefox does
> not display the letters `å', `ä', `ö', `Å', `Ä' or `Ö' correctly when I
> view my own HTML pages. If I choose Latin-1 in Emacs, they work with
> Mozilla.

In my case the problem turned out to be caused by a combination of the
settings in Mozilla Firefox and in the Tidy HTML validation tool. In
Mozilla I changed `Edit -> Preferences -> General -> Languages ->
Character Encoding' from the default `Western (ISO-8859-1)' to `Unicode
(UTF-8)', and in the tidy command I added `-utf8'. Now it works.

-- 
August

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Trying to input Unicode via GNU Emacs 21.3.1
  2005-02-12 16:15     ` August
@ 2005-02-12 17:27       ` Erik Norvelle
  2005-02-12 19:06         ` Peter Dyballa
  0 siblings, 1 reply; 9+ messages in thread
From: Erik Norvelle @ 2005-02-12 17:27 UTC (permalink / raw)
  Cc: Peter Dyballa

Peter Dyballa & August,

Thanks to both of you for your suggestions and clarifications.  Knowing 
that the Unicode characters were correctly entered in Emacs helped me 
to trace the issue to the browser, which, as you both suspected, was 
not responding to the fact that the file was in Unicode.

As it turns out, if I change the text encoding of the page from 
`Western (ISO-8859-1)' to `Unicode (UTF-8)' everything does in fact 
appear perfectly.

So this is no longer really an Emacs question, but perhaps one of you 
might know the answer anyhow:  Why don't my browsers (both Camino and 
Safari) respond to the fact that my HTML contains the following:

<html lang="en-US"><head>
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
...

Feel free to reply directly to me if you don't want to clutter the 
Emacs list with HTML questions.

Cheers,
-Erik


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Trying to input Unicode via GNU Emacs 21.3.1
  2005-02-12 17:27       ` Erik Norvelle
@ 2005-02-12 19:06         ` Peter Dyballa
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Dyballa @ 2005-02-12 19:06 UTC (permalink / raw)
  Cc: help-gnu-emacs


On 12.02.2005 at 18:27, Erik Norvelle wrote:

> <html lang="en-US">

English, particularly US English, does not contain any UTF-8 
characters. It is 'standard' 7 bit US-ASCII and nothing else. Not even 
Latin from old Europe.

--
Greetings

   Pete       (:
         _    / __    -    -
       _/ \__/_/        -     -
      (´`)      (´`)   -    -
       `´        `´

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Trying to input Unicode via GNU Emacs 21.3.1
  2005-02-12 10:47     ` David Kastrup
@ 2005-02-14  1:35       ` Stefan Monnier
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Monnier @ 2005-02-14  1:35 UTC (permalink / raw)


> Sounds like a setting in the source of your HTML pages, or a Mozilla
> setting.

Or the web server.
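
That matters because a charset named in the server's HTTP Content-Type
header takes precedence over a <meta> declaration inside the page. A
minimal Python sketch for pulling the charset out of such a header
value follows. The function name and the sample header strings are
made up for illustration; the real value would be whatever your server
actually sends:

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lowercases the value and returns None
    # when the header carries no charset parameter.
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=ISO-8859-1"))
print(charset_from_content_type("text/html; charset=UTF-8"))
print(charset_from_content_type("text/html"))
```

If the server reports a Latin-1 charset, the browser has good reason
to ignore the UTF-8 <meta> line in the file.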


        Stefan

^ permalink raw reply	[flat|nested] 9+ messages in thread
