unofficial mirror of help-gnu-emacs@gnu.org
* Re: set UTF-8 for a file (HTML)
       [not found] <mailman.7237.1202608828.18990.help-gnu-emacs@gnu.org>
@ 2008-02-10  4:04 ` Tim X
  2008-02-10  9:04 ` Harald Hanche-Olsen
  1 sibling, 0 replies; 6+ messages in thread
From: Tim X @ 2008-02-10  4:04 UTC (permalink / raw)
  To: help-gnu-emacs

ken <gebser@speakeasy.net> writes:

> I'm editing an HTML file (in emacs, of course) and want to preserve the
> utf-8 encoding when the file is opened in subsequent sessions.  I know I
> can put a line at the top of the file which will set a variable in emacs
> whenever the file is opened.  So what should this line say to specify that
> the file is encoded in utf-8?
>

What version of emacs are you running? Emacs 22 has much better utf-8
support than earlier versions. 

The coding used/preferred by emacs is influenced by your locale
setting. If you're not running a utf-8 locale, you will probably need to
set some variables (easiest via customize). Which variables depends on
the version of emacs you're running.

In emacs 22, if you have a utf-8 locale set, emacs will use that as its
preferred coding and you shouldn't need to do anything unless the file
has already been created in another coding system. Emacs works quite
hard not to change the coding used on any file it edits. Creating a new
file will use the default/preferred coding and, if you haven't set that
manually, that will default to what your locale setting is.

I do have the following in my .emacs 

 '(current-language-environment "UTF-8")

and a locale setting of en_AU.utf-8
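
For anyone who prefers to write this by hand instead of going through
customize, a minimal sketch of what could go in .emacs follows. It
assumes Emacs 22 or later, and that utf-8 is wanted as the default for
new files; it is not taken from the configuration quoted above.

```elisp
;; Select the UTF-8 language environment, which is what the
;; Customize entry for current-language-environment does.
(set-language-environment "UTF-8")

;; Put utf-8 at the front of the coding-system priority list, so it
;; is tried first when Emacs has to guess, and used for new files.
(prefer-coding-system 'utf-8)
```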

Note that in emacs22 you can tell if emacs is using a utf-8 encoding by
the existence of a 'u' in the mode line (left hand side, second
character).
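
If the mode line is hard to read, the coding system can also be asked
for directly. This is only a sketch: evaluate it with M-: in the buffer
visiting the file. The exact name reported varies between Emacs
versions, so no particular output is promised here.

```elisp
;; Show the coding system Emacs will use when saving this buffer.
(message "Coding system: %s" buffer-file-coding-system)
```

C-h C RET (describe-coding-system) gives a longer description of the
same information.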

Tim


-- 
tcross (at) rapttech dot com dot au



* Re: set UTF-8 for a file (HTML)
       [not found] <mailman.7237.1202608828.18990.help-gnu-emacs@gnu.org>
  2008-02-10  4:04 ` set UTF-8 for a file (HTML) Tim X
@ 2008-02-10  9:04 ` Harald Hanche-Olsen
  2008-02-10 20:09   ` display of ancient Greek chars (after: Re: set UTF-8 for a file (HTML)) ken
  1 sibling, 1 reply; 6+ messages in thread
From: Harald Hanche-Olsen @ 2008-02-10  9:04 UTC (permalink / raw)
  To: help-gnu-emacs

+ ken <gebser@speakeasy.net>:

> I'm editing an HTML file (in emacs, of course) and want to preserve
> the utf-8 encoding when the file is opened in subsequent sessions.  I
> know I can put a line at the top of the file which will set a variable
> in emacs whenever the file is opened.  So what should this line say to
> specify that the file is encoded in utf-8?

If you're using HTML mode, just specifying

  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

in the head element of the file should do it.

Otherwise, a generic method for specifying the coding to emacs is
having -*- coding: utf-8 -*- in the first line of the file.  You should
typically protect that by putting in a comment, as follows:

<!-- -*- coding: utf-8 -*- -->
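
If you would rather not touch the files at all, another possibility is
to tell emacs in .emacs to treat every HTML file as utf-8. This is a
sketch only; the file name pattern is an assumption, and a coding
cookie or meta element inside a file still takes precedence over it.

```elisp
;; Read and write files whose names end in .htm or .html as utf-8,
;; unless the file itself specifies a coding system.
(modify-coding-system-alist 'file "\\.html?\\'" 'utf-8)
```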

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- It is undesirable to believe a proposition
  when there is no ground whatsoever for supposing it is true.
  -- Bertrand Russell



* display of ancient Greek chars (after: Re: set UTF-8 for a file (HTML))
  2008-02-10  9:04 ` Harald Hanche-Olsen
@ 2008-02-10 20:09   ` ken
  2008-02-10 23:11     ` Peter Dyballa
  0 siblings, 1 reply; 6+ messages in thread
From: ken @ 2008-02-10 20:09 UTC (permalink / raw)
  To: GNU Emacs List

On 02/10/2008 04:04 AM Harald Hanche-Olsen wrote:
> + ken <gebser@speakeasy.net>:
> 
>> I'm editing an HTML file (in emacs, of course) and want to preserve
>> the utf-8 encoding when the file is opened in subsequent sessions.  I
>> know I can put a line at the top of the file which will set a variable
>> in emacs whenever the file is opened.  So what should this line say to
>> specify that the file is encoded in utf-8?
> 
> If you're using HTML mode, just specifying
> 
>   <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> 
> in the head element of the file should do it.
> 
> Otherwise, a generic method for specifying the coding to emacs is
> having -*- coding: utf-8 -*- in the first line of the file.  You should
> typically protect that by putting in a comment, as follows:
> 
> <!-- -*- coding: utf-8 -*- -->
> 

Thanks, Harald and Peter,

The above is perfect.  On to the subsequent issue....

Prior to doing the above I somehow managed to figure out how to insert a 
word in Greek into my HTML file and have it display properly both in 
emacs and in the web browser.  (This required (1) setting the keyboard 
for inputting Greek and (2) setting some emacs display variable, which I 
no longer recall, also to Greek.)  Though both the Greek and the English 
displayed correctly in emacs when first typed in, after reloading the 
file specifying utf-8, the Greek characters now display in emacs as a 
series of little rectangles.  (They still display fine in the web browser.)

Is there another variable:value pair I can include in the first line 
(specified above) to make the Greek characters display correctly in emacs?


Much appreciated.


-- 
The significant problems we face cannot be solved at the
same level of thinking we were at when we created them.
	-- Albert Einstein






* Re: display of ancient Greek chars (after: Re: set UTF-8 for a file (HTML))
  2008-02-10 20:09   ` display of ancient Greek chars (after: Re: set UTF-8 for a file (HTML)) ken
@ 2008-02-10 23:11     ` Peter Dyballa
  2008-02-12  0:17       ` ken
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Dyballa @ 2008-02-10 23:11 UTC (permalink / raw)
  To: ken; +Cc: GNU Emacs List


On 10.02.2008 at 21:09, ken wrote:

> Is there another variable:value pair I can include in the first  
> line (specified above) to make the Greek characters display  
> correctly in emacs?

No, not that easily (there were rich or augmented text modes
mentioned on this list, but I don't remember the details). The problem
you have in GNU Emacs is that you need one of two things. Either you
need a (mono-spaced) font that has both Latin and Greek glyphs, or you
need to set up a fontset in which you combine font A to serve for Latin
and font B to serve for Greek (and font C for Indic ...).

To find a suitable font: xfd can display a font's contents; you could
also set a sampleTextUCS resource for xfontsel that combines Greek and
some Latin, to show instantly whether the chosen font is able to
display it; and you can use fc-list to display only fonts that have
Greek support: 'fc-list :lang=el'. (The "word" ``el´´ is from
RFC-3066/ISO 639, which you should know as an HTML programmer.)

For the fontset, something like this might work (for one font size):


    (create-fontset-from-fontset-spec
     "-adobe-courier-medium-r-*-*-9-*-*-*-*-*-fontset-09pt_adobe_courier"
     t 'noerror)
    (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-1  '("adobe-courier" . "iso8859-1"))
    (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-2  '("adobe-courier" . "iso8859-2"))
    (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-3  '("adobe-courier" . "iso8859-3"))
    (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-4  '("adobe-courier" . "iso8859-4"))
    (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-9  '("adobe-courier" . "iso8859-9"))
    (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-14 '("adobe-courier" . "iso8859-14"))
    (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-15 '("adobe-courier" . "iso8859-15"))
    (set-fontset-font "fontset-09pt_adobe_courier" 'mule-unicode-0100-24ff '("adobe-courier" . "iso10646-1"))
    (set-fontset-font "fontset-09pt_adobe_courier" 'mule-unicode-2500-33ff '("adobe-courier" . "iso10646-1"))
    (set-fontset-font "fontset-09pt_adobe_courier" 'mule-unicode-e000-ffff '("adobe-courier" . "iso10646-1"))
    (set-fontset-font "fontset-09pt_adobe_courier"
                      (cons (decode-char 'ucs #x0370) (decode-char 'ucs #x03cf))
                      '("courier new" . "iso10646-1"))            ; Greek
    (set-fontset-font "fontset-09pt_adobe_courier"
                      (cons (decode-char 'ucs #x03d0) (decode-char 'ucs #x03ff))
                      '("lucida sans typewriter" . "iso10646-1")) ; Coptic
    (set-fontset-font "fontset-09pt_adobe_courier"
                      (cons (decode-char 'ucs #x0400) (decode-char 'ucs #x04ff))
                      '("lucida sans typewriter" . "iso10646-1")) ; Cyrillic
    (set-fontset-font "fontset-09pt_adobe_courier"
                      (cons (decode-char 'ucs #x0500) (decode-char 'ucs #x052f))
                      '("lucida sans typewriter" . "iso10646-1")) ; Cyrillic Suppl
    (set-fontset-font "fontset-09pt_adobe_courier"
                      (cons (decode-char 'ucs #x0530) (decode-char 'ucs #x058f))
                      '("aramian unicode" . "iso10646-1"))        ; Armenian (sylfaen
    (set-fontset-font "fontset-09pt_adobe_courier"
                      (cons (decode-char 'ucs #x0590) (decode-char 'ucs #x05ff))
                      '("courier new" . "iso10646-1"))            ; Hebrew
    (set-fontset-font "fontset-09pt_adobe_courier"
                      (cons (decode-char 'ucs #x0600) (decode-char 'ucs #x06ff))
                      '("lucida sans typewriter" . "iso10646-1")) ; Arabic
    (set-fontset-font "fontset-09pt_adobe_courier"
                      (cons (decode-char 'ucs #x0700) (decode-char 'ucs #x074f))
                      '("courier new" . "iso10646-1"))            ; Syriac
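
To see whether a given setup helps at all, one rough check is to ask
emacs whether the selected frame can display a Greek letter. This is a
sketch, assuming Emacs 22 or later on a graphical frame; evaluate each
form with M-: or C-x C-e. The second form is there because accented
(polytonic) ancient Greek letters live in the Greek Extended block,
U+1F00 to U+1FFF, which the ranges above do not list separately.

```elisp
;; Non-nil means the selected frame has some font for
;; GREEK SMALL LETTER ALPHA (U+03B1); nil means it cannot show it.
(char-displayable-p (decode-char 'ucs #x03b1))

;; Same check for U+1F00, the first character of Greek Extended.
(char-displayable-p (decode-char 'ucs #x1f00))
```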


--
Greetings

   Pete

Our enemies are innovative and resourceful, and so are we. They never  
stop thinking about new ways to harm our country and our people, and  
neither do we.
				– Georges W. Bush








* Re: display of ancient Greek chars (after: Re: set UTF-8 for a file (HTML))
  2008-02-10 23:11     ` Peter Dyballa
@ 2008-02-12  0:17       ` ken
  2008-02-12  9:25         ` Peter Dyballa
  0 siblings, 1 reply; 6+ messages in thread
From: ken @ 2008-02-12  0:17 UTC (permalink / raw)
  To: GNU Emacs List

On 02/10/2008 06:11 PM Peter Dyballa wrote:
> 
> On 10.02.2008 at 21:09, ken wrote:
> 
>> Is there another variable:value pair I can include in the first line 
>> (specified above) to make the Greek characters display correctly in 
>> emacs?
> 
> No, not that easily (there were rich or augmented text modes mentioned 
> on this list, but I don't remember). The problem you have in GNU Emacs 
> is that you either need a (mono-spaced) font that has Latin and Greek 
> glyphs (xfd can display a font's contents, you could also set for 
> xfontsel a sampleTextUCS resource that combines Greek and some Latin to 
> instantly show whether the chosen font is able to display it, you also 
> can use fc-list to display only fonts that have Greek support: 'fc-list 
> :lang=el', the "word" ``el´´ is from RFC-3066/ISO 639, which you should 
> know as HTML programmer) or you need to setup a fontset in which you 
> combine font A to serve for Latin and font B to serve for Greek (and 
> font C for Indic ...). Something like this might work (for one font size):
> 
> 
>     (create-fontset-from-fontset-spec 
> "-adobe-courier-medium-r-*-*-9-*-*-*-*-*-fontset-09pt_adobe_courier" t 
> 'noerror)
>     (set-fontset-font "fontset-09pt_adobe_courier"       
> 'latin-iso8859-1  '("adobe-courier" . "iso8859-1"))
> ....

Peter,

Thanks, but that didn't work.

I cut-n-pasted the code you provided into its own file.  Then I ran

emacs -q -l emacs.d/.emacs-with-multi-langs &

(emacs.d/.emacs-with-multi-langs is the file your code went into.)

Then I opened up (visited) my file with the Greek chars in it and they 
showed up as what I can only describe as "garbage" characters, not even 
the blocks I had before and certainly nothing like Greek.


On an optimistic note, running "fc-list :lang=el" returned 86 lines of 
fonts, many with multiple styles.  And

$ fc-list :lang=el|grep -i "courier new"
Courier 
New:style=Regular,Normal,obyčejné,Standard,Κανονικά,Normaali,Normál,Normale,Standaard,Normalny,Обычный,Normálne,Navadno,thường,Arrunta
Courier New:style=Bold Italic,Negreta cursiva,tučné kurzíva,fed 
kursiv,Fett Kursiv,Έντονα Πλάγια,Negrita Cursiva,Lihavoitu Kursivoi,Gras 
Italique,Félkövér dőlt,Grassetto Corsivo,Vet Cursief,Halvfet 
Kursiv,Pogrubiona kursywa,Negrito Itálico,Полужирный Курсив,Tučná 
kurzíva,Fet Kursiv,Kalın İtalik,Krepko poševno,Lodi etzana
Courier 
New:style=Italic,Cursiva,kurzíva,kursiv,Πλάγια,Kursivoitu,Italique,Dőlt,Corsivo,Cursief,Kursywa,Itálico,Курсив,İtalik,Poševno,nghiêng,Etzana
Courier 
New:style=Bold,Negreta,tučné,fed,Fett,Έντονα,Negrita,Lihavoitu,Gras,Félkövér,Grassetto,Vet,Halvfet,Pogrubiony,Negrito,Полужирный,Fet,Kalın,Krepko,đậm,Lodia

(Line wrap broke the lines above between "Courier" and "New".)

So would it fix things to change somehow this line in your code:

(set-fontset-font "fontset-09pt_adobe_courier" (cons (decode-char 'ucs #x0370) (decode-char 'ucs #x03cf)) '("courier new" . "iso10646-1"))    ; Greek

??

Thanks again.

-- 
The significant problems we face cannot be solved at the
same level of thinking we were at when we created them.
	-- Albert Einstein






* Re: display of ancient Greek chars (after: Re: set UTF-8 for a file (HTML))
  2008-02-12  0:17       ` ken
@ 2008-02-12  9:25         ` Peter Dyballa
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Dyballa @ 2008-02-12  9:25 UTC (permalink / raw)
  To: ken; +Cc: GNU Emacs List


On 12.02.2008 at 01:17, ken wrote:

> So would it fix things to change somehow this line in your code:
>
> (set-fontset-font "fontset-09pt_adobe_courier" (cons (decode-char  
> 'ucs #x0370) (decode-char 'ucs #x03cf)) '("courier new" .  
> "iso10646-1"))    ; Greek

It might. Check with xfd (or fontforge or ...) that the font contains
what it claims! The important thing is that you need to make the
frame with its buffers use that fontset. If the fontset merely exists
as an option, nothing changes for you. You can also use a different
font size. Some possible settings:

	(setq initial-frame-alist '(
	  (border-color     . "#4e3832")
	  (foreground-color . "grey10")
	  (background-color . "AliceBlue")
	  (active-alpha     . 0.875)
	  (inactive-alpha   . 0.75)
	  (font . "fontset-10pt_lucidatypewriter")
	  (top . 5) (left . 500) (width . 106) (height . 50)
	  )
	)

There is also default-frame-alist.
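
As a sketch of how that could look with the fontset from the earlier
message: it assumes "fontset-09pt_adobe_courier" has already been
created by the code posted before, and that these forms are evaluated
after it.

```elisp
;; Use the fontset for every frame created from now on,
;; not only the initial one.
(add-to-list 'default-frame-alist
             '(font . "fontset-09pt_adobe_courier"))

;; Switch the currently selected frame right away.
(set-frame-font "fontset-09pt_adobe_courier")
```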


--
Greetings

   Pete

Bigamy is having one wife too many. Monogamy is the same.
				– Oscar Wilde







