coding system

unofficial mirror of help-gnu-emacs@gnu.org

* coding system
@ 2003-05-26  9:58 Stein A. Stromme
  2003-05-26 11:05 ` lawrence mitchell
  0 siblings, 1 reply; 17+ messages in thread
From: Stein A. Stromme @ 2003-05-26  9:58 UTC (permalink / raw)



I use iso-latin-1 as my default coding system (Emacs 21.3.50 from CVS.)

If I save the following contents (buffer in text mode) to a file:

    <?xml   ?>    ø

it gets saved just fine, with -1:-- in the mode line.

If I change it to

    <?xml   ?>

I'm able to save it with "nil" (?) encoding, with --:-- in the mode
line (using C-x RET f RET).

Now I change this to 
 
    <?xml "  ?>

and it still saves to --:--  .

Then the puzzle:  if I put back the ø, like this:

    <?xml "  ?>  ø

and try to save, Emacs asks:

    Selected encoding iso-latin-1 disagrees with utf-8 specified by
    file contents.  Really save (else edit coding cookies and try
    again)? (y or n)

I answer y and the file gets saved with -1:-- , which is fine, but I
get this question each time the file gets modified and needs saving
again. 

What is going on here?  

Thanks for your time,
Stein 
-- 
Stein Arild Strømme            +47 55584825, +47 95801887
Universitetet i Bergen                  Fax: +47 55589672     
Matematisk institutt                www.mi.uib.no/stromme         
Johs Brunsg 12, N-5008 BERGEN           stromme@mi.uib.no


* Re: coding system
  2003-05-26  9:58 Stein A. Stromme
@ 2003-05-26 11:05 ` lawrence mitchell
  2003-05-26 11:52   ` Stein A. Stromme
  0 siblings, 1 reply; 17+ messages in thread
From: lawrence mitchell @ 2003-05-26 11:05 UTC (permalink / raw)

Stein A. Stromme wrote:

[...]

>     <?xml "  ?>

>     Selected encoding iso-latin-1 disagrees with utf-8 specified by
>     file contents.  Really save (else edit coding cookies and try
>     again)? (y or n)

> I answer y and the file gets saved with -1:-- , which is fine, but I
> get this question each time the file gets modified and needs saving
> again.

> What is going on here?

Emacs thinks the buffer is an XML file.  The XML spec states
that any XML document which does not explicitly specify its
coding system in its opening line, e.g.
<?xml version="1.0" encoding="iso-8859-1" ?> for latin-1, should
be taken to be encoded as UTF-8.  As such, Emacs tries to do the
right thing, and ensure that the document is saved as UTF-8.

To avoid the questioning regarding the coding system, either
specify the coding system in the <?xml ?> declaration, or save
the file as UTF-8.
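
As a rough illustration of that rule (a sketch, not what Emacs itself runs), the declared encoding can be read off the opening line like this. `my-xml-declared-encoding` is a hypothetical helper, and it ignores byte-order marks and UTF-16 entirely:

```elisp
(defun my-xml-declared-encoding (text)
  "Return the encoding named in TEXT's XML declaration, or \"utf-8\" if none.
Hypothetical helper for illustration; not part of Emacs."
  (if (string-match
       ;; Look for encoding="..." inside a declaration at the very start.
       "\\`<\\?xml[^>]*?encoding=[\"']\\([A-Za-z0-9._-]+\\)[\"']" text)
      (downcase (match-string 1 text))
    ;; No encoding attribute: fall back to the UTF-8 default.
    "utf-8"))

(my-xml-declared-encoding "<?xml version=\"1.0\" encoding=\"iso-8859-1\" ?>")
(my-xml-declared-encoding "<?xml version=\"1.0\" ?>")
```

The first call should return "iso-8859-1" and the second "utf-8", which is why a declaration without an encoding attribute clashes with a buffer saved as iso-latin-1.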

-- 
lawrence mitchell <wence@gmx.li>


* Re: coding system
  2003-05-26 11:05 ` lawrence mitchell
@ 2003-05-26 11:52   ` Stein A. Stromme
  2003-05-26 11:58     ` Stein A. Stromme
                       ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Stein A. Stromme @ 2003-05-26 11:52 UTC (permalink / raw)


[lawrence mitchell]

| Emacs thinks the buffer is an XML file.  The XML spec states
| that any XML document which does not explicitly specify its
| coding system in its opening line, e.g.
| <?xml version="1.0" encoding="iso-8859-1" ?> for latin-1, should
| be taken to be encoded as UTF-8.  As such, Emacs tries to do the
| right thing, and ensure that the document is saved as UTF-8.
| 
| To avoid the questioning regarding the coding system, either
| specify the coding system in the <?xml ?> declaration, or save
| the file as UTF-8.

Thanks for the explanation.  In reality, the file is automatically
generated (by blogmax.el), so I'll have to go digging for the right
spot!

Btw, is the time (and Emacs) ripe now for going utf-8 as default?
What are the pros and cons?  (Apologies if this is off-topic.)

Stein 


* Re: coding system
  2003-05-26 11:52   ` Stein A. Stromme
@ 2003-05-26 11:58     ` Stein A. Stromme
  2003-05-26 13:47     ` Oliver Scholz
  2003-05-26 13:55     ` Kai Großjohann
  2 siblings, 0 replies; 17+ messages in thread
From: Stein A. Stromme @ 2003-05-26 11:58 UTC (permalink / raw)


[Stein A. Stromme]

| Thanks for the explanation.  In reality, the file is automatically
| generated (by blogmax.el), so I'll have to go digging for the right
| spot!

That was easy enough.  Thanks again.

Stein 


* Re: coding system
  2003-05-26 11:52   ` Stein A. Stromme
  2003-05-26 11:58     ` Stein A. Stromme
@ 2003-05-26 13:47     ` Oliver Scholz
  2003-05-26 13:55     ` Kai Großjohann
  2 siblings, 0 replies; 17+ messages in thread
From: Oliver Scholz @ 2003-05-26 13:47 UTC (permalink / raw)

stromme@mi.uib.no (Stein A. Stromme) writes:

> [lawrence mitchell]
>
> | Emacs thinks the buffer is an XML file.  The XML spec states
> | that any XML document which does not explicitly specify its
> | coding system in its opening line, e.g.
> | <?xml version="1.0" encoding="iso-8859-1" ?> for latin-1, should
> | be taken to be encoded as UTF-8.  As such, Emacs tries to do the
> | right thing, and ensure that the document is saved as UTF-8.
> | 
> | To avoid the questioning regarding the coding system, either
> | specify the coding system in the <?xml ?> declaration, or save
> | the file as UTF-8.
>
> Thanks for the explanation.  In reality, the file is automatically
> generated (by blogmax.el), so I'll have to go digging for the right
> spot!
>
> Btw, is the time (and Emacs) ripe now for going utf-8 as default?
> What are the pros and cons?  (Apologies if this is off-topic.)
[...]

In my experience it is fine, although I hear that there are some
issues with CJK. But I've been using UTF-8 as default for several
months now and I don't think I encountered real problems. Except maybe
(as I seem to recall) that my Gnus did not send proper UTF-8 two or
three times, probably because of clashes with the encoding of the
message to which I replied (I don't quite remember); but I suspect
that even in this regard I had fewer problems than if I were using
Latin-n (with n > 1).

Besides the Latin-1 repertoire I mostly use "Greek and Coptic",
"Greek Extended", "General Punctuation" and now and then "Box
Drawing" and "Miscellaneous Symbols". I won't switch back, at least
not for my local files.
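
For anyone wanting to try the same setup, a minimal init-file sketch follows. It assumes a reasonably current Emacs; `prefer-coding-system` and `set-language-environment` are standard functions, but how well they serve CJK text in a given release is a separate question.

```elisp
;; Prefer UTF-8 whenever Emacs has to guess or choose a coding system.
(prefer-coding-system 'utf-8)
;; Make UTF-8 the default of the language environment as well.
(set-language-environment "UTF-8")
```

Files that carry their own coding cookie or are recognizably in another encoding should still be read in that encoding; these settings mainly affect new files and ambiguous cases.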

    Oliver
-- 
7 Prairial an 211 de la Révolution
Liberté, Egalité, Fraternité!


* Re: coding system
  2003-05-26 11:52   ` Stein A. Stromme
  2003-05-26 11:58     ` Stein A. Stromme
  2003-05-26 13:47     ` Oliver Scholz
@ 2003-05-26 13:55     ` Kai Großjohann
  2 siblings, 0 replies; 17+ messages in thread
From: Kai Großjohann @ 2003-05-26 13:55 UTC (permalink / raw)


stromme@mi.uib.no (Stein A. Stromme) writes:

> Btw, is the time (and Emacs) ripe now for going utf-8 as default?
> What are the pros and cons?  (Apologies if this is off-topic.)

In Emacs 21.3, CJK support for UTF-8 is still limited.  With Emacs
from CVS, it's much better.
-- 
This line is not blank.


* coding system
@ 2005-03-22 12:47 Olive
  2005-03-22 13:57 ` Joe Corneli
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Olive @ 2005-03-22 12:47 UTC (permalink / raw)


I am confused about Emacs and coding systems. If I evaluate the following 
form

(read-event "Press a key: ")

and press the é key (e acute), I see 2281 in the echo area. If I want to 
rebind the é key, the command which works is

(global-set-key [2281] 'foo)

the command

(global-set-key "é" 'foo)

does not work.

However, it seems that the coding system for keyboard input is latin-1. 
This is a unibyte coding system; why does Emacs see a multibyte character 
when I press é? What does this 2281 correspond to?

Beginning of the output of (describe-coding-system):

Coding system for saving this buffer:
   Not set locally, use the default.
Default coding system (for new files):
   1 -- iso-latin-1 (alias: iso-8859-1 latin-1)
Coding system for keyboard input:
   1 -- latin-1 (alias of iso-latin-1)
Coding system for terminal output:
   1 -- latin-1 (alias of iso-latin-1)
Defaults for subprocess I/O:
   decoding: 1 -- iso-latin-1 (alias: iso-8859-1 latin-1)
   encoding: 1 -- iso-latin-1 (alias: iso-8859-1 latin-1)


* Re: coding system
  2005-03-22 12:47 coding system Olive
@ 2005-03-22 13:57 ` Joe Corneli
       [not found] ` <mailman.4714.1111501237.32256.help-gnu-emacs@gnu.org>
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Joe Corneli @ 2005-03-22 13:57 UTC (permalink / raw)



   rebind the é key the command which works is

   (global-set-key [2281] 'foo)

   the command

   (global-set-key "é" 'foo)

   does not work.


Since you have a working command, I guess this question is partly out
of curiosity... so I don't feel bad about suggesting something that
I'm not sure will help, but - out of curiosity, what does

 (kbd "?")

return?  

BTW, here, in a buffer with coding system described by
(describe-current-coding-system) to be

Coding system for saving this buffer:
  Not set locally, use the default.
Default coding system (for new files):
  nil
Coding system for keyboard input:
  nil
Coding system for terminal output:
  nil
Defaults for subprocess I/O:
  decoding: - -- undecided

  encoding: 1 -- iso-latin-1 (alias: iso-8859-1 latin-1)

a *local* binding to "?" works fine.


* Re: coding system
       [not found] ` <mailman.4714.1111501237.32256.help-gnu-emacs@gnu.org>
@ 2005-03-22 14:32   ` Olive
  2005-03-23  0:43     ` Miles Bader
  0 siblings, 1 reply; 17+ messages in thread
From: Olive @ 2005-03-22 14:32 UTC (permalink / raw)


Joe Corneli wrote:
>    rebind the é key the command which works is
> 
>    (global-set-key [2281] 'foo)
> 
>    the command
> 
>    (global-set-key "é" 'foo)
> 
>    does not work.
> 
> 
> Since you have a working command, I guess this question is partly out
> of curiosity... so I don't feel bad about suggesting something that
> I'm not sure will help, but - out of curiosity, what does
> 
>  (kbd "?")
> 
> return?  
> 
> BTW, here, in a buffer with coding system described by
> (describe-current-coding-system) to be
> 
> Coding system for saving this buffer:
>   Not set locally, use the default.
> Default coding system (for new files):
>   nil
> Coding system for keyboard input:
>   nil
> Coding system for terminal output:
>   nil
> Defaults for subprocess I/O:
>   decoding: - -- undecided
> 
>   encoding: 1 -- iso-latin-1 (alias: iso-8859-1 latin-1)
> 
> a *local* binding to "?" works fine.
> 
> 

(kbd "?") returns "?". All works as expected since "?" is an ASCII 
character. The reason for my question is that I want to understand Emacs 
and the coding systems (so yes, it is out of curiosity). What confused me is 
that everything seems to be set to Latin-1 and nevertheless é is 2281, and I do 
not know where it comes from.

Olive


* Re: coding system
  2005-03-22 12:47 coding system Olive
  2005-03-22 13:57 ` Joe Corneli
       [not found] ` <mailman.4714.1111501237.32256.help-gnu-emacs@gnu.org>
@ 2005-03-22 19:08 ` Peter Dyballa
  2005-03-26 23:46 ` Stefan Monnier
  3 siblings, 0 replies; 17+ messages in thread
From: Peter Dyballa @ 2005-03-22 19:08 UTC (permalink / raw)
  Cc: help-gnu-emacs


Am 22.03.2005 um 13:47 schrieb Olive:

> I am confused about emacs and coding system. If I evaluate the 
> following function
>
> (read-event "Press a key: ")
>
> and press the é key (e acute); I see 2281 in the echo area. If I want 
> to rebind the é key the command which works is
>
> (global-set-key [2281] 'foo)
>
> the command
>
> (global-set-key "é" 'foo)
>
> does not work.

Can it be that you too are confused about the difference between an event 
and a character representation? I think your keyboard does not deliver 
glyphs or characters, it's just an event that has to be transformed 
into some other property. So your two global-set-key calls are distinct.

If you want to know a character's representation, position the cursor 
on it and type C-u C-x =.

For me (kbd "é") is 2281 ...
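
One way to sidestep the string-versus-event question is to bind the key as a vector holding the character itself, so that no number has to be hard-coded. This is only a sketch: `my-foo` is a placeholder command, and the character literal should evaluate to 2281 in the Emacs discussed here but to 233 in a Unicode-based Emacs.

```elisp
(defun my-foo ()
  "Placeholder command for illustration."
  (interactive)
  (message "e-acute pressed"))

;; ?é is read as whatever integer this Emacs uses internally for é,
;; so the binding does not depend on the internal numbering.
(global-set-key (vector ?é) #'my-foo)
```

Note that this replaces the normal self-inserting behaviour of the é key globally, so it is meant for experimenting rather than for everyday use.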

--
Greetings

   Pete

"Eternity is a terrible thought. I mean, where's it going to end?"
                                                    - Tom Stoppard


* Re: coding system
  2005-03-22 14:32   ` Olive
@ 2005-03-23  0:43     ` Miles Bader
  0 siblings, 0 replies; 17+ messages in thread
From: Miles Bader @ 2005-03-23  0:43 UTC (permalink / raw)


Olive <olive.lin@versateladsl.be> writes:
> What confused me is that everything seems to be set to Latin-1 and nevertheless
> é is 2281, and I do not know where it comes from.

2281 is Emacs' internal representation for a latin-1 é.  If you do
`M-: (insert 2281) RET' in a buffer, it should insert a é.

-Miles
-- 
`Life is a boundless sea of bitterness'


* Re: coding system
  2005-03-22 12:47 coding system Olive
                   ` (2 preceding siblings ...)
  2005-03-22 19:08 ` Peter Dyballa
@ 2005-03-26 23:46 ` Stefan Monnier
  2005-03-27  6:56   ` B.T. Raven
  3 siblings, 1 reply; 17+ messages in thread
From: Stefan Monnier @ 2005-03-26 23:46 UTC (permalink / raw)


> However it seems that the coding system for keyboard input is latin-1.
> This is a unibyte coding system; why does Emacs see a multibyte character
> when I press é? What does this 2281 correspond to?

Inside Emacs, there's no such thing as unibyte characters and
multibyte characters.   There are just characters, which are represented
by integers.  When loading/saving a file, characters are decoded/encoded
into sequences of bytes which can be unibyte or multibyte.  This same "é"
can be represented in some files with a single byte (e.g. if it's a latin-1
file) or as two bytes (e.g. if it's a utf-8 file), or ...
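
The file-side half of this can be checked from Lisp. The sketch below assumes a Unicode-based Emacs, where "\u00e9" reads as the single character é; `encode-coding-string` is the standard function for turning characters into bytes.

```elisp
(let ((e-acute "\u00e9"))               ; the single character é
  (list
   ;; Bytes written to a latin-1 file: one byte, 0xE9.
   (append (encode-coding-string e-acute 'iso-latin-1) nil)
   ;; Bytes written to a utf-8 file: two bytes, 0xC3 0xA9.
   (append (encode-coding-string e-acute 'utf-8) nil)))
;; => ((233) (195 169))
```

The same character gives different byte sequences depending only on the coding system chosen at save time; the integer Emacs uses for it inside the buffer is a separate matter.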


        Stefan


* Re: coding system
  2005-03-26 23:46 ` Stefan Monnier
@ 2005-03-27  6:56   ` B.T. Raven
  2005-03-27 10:50     ` Eli Zaretskii
                       ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: B.T. Raven @ 2005-03-27  6:56 UTC (permalink / raw)



"Stefan Monnier" <monnier@iro.umontreal.ca> wrote in message
news:877jju9dog.fsf-monnier+gnu.emacs.help@gnu.org...
> > However it seems that the coding system for keyboard input is latin-1.
> > This is a unibyte coding system; why does Emacs see a multibyte character
> > when I press é? What does this 2281 correspond to?
>
> Inside Emacs, there's no such thing as unibyte characters and
> multibyte characters.   There are just characters, which are represented
> by integers.  When loading/saving a file, characters are decoded/encoded
> into sequences of bytes which can be unibyte or multibyte.  This same "é"
> can be represented in some files with a single byte (e.g. if it's a latin-1
> file) or as two bytes (e.g. if it's a utf-8 file), or ...
>
>
>         Stefan

That "or ..." is pregnant with meaning.  It seems that the same
character can be represented in the same buffer itself with 3 or more
different byte sequences. Here is the C-u C-x = report for three e with
acute and two e with macron:
(Sorry about the munged characters. I don't know how to use gnus under
w32 so I have to copy and paste from emacs to Outlook.)

Notice that the e with macron expands from a 2-byte to a 4-byte
representation in the buffer after being saved and then reloaded. Also
the part of the font it uses seems to be different. Even if unification
on decoding were working, could it overcome this great a difference in
the representation of the characters?

Ed.


éééēē


  character: é (04351, 2281, 0x8e9)
    charset: latin-iso8859-1 (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100)
 code point: 105
     syntax: word
   category: l:Latin
buffer code: 0x81 0xE9
  file code: E9 (encoded by coding system iso-latin-1-dos)
       font: -outline-Arial Unicode MS-normal-r-normal-normal-14-105-96-96-p-60-iso8859-1

  character: é (04551, 2409, 0x969)
    charset: latin-iso8859-2 (Right-Hand Part of Latin Alphabet 2 (ISO/IEC 8859-2): ISO-IR-101)
 code point: 105
     syntax: word
   category: l:Latin
buffer code: 0x82 0xE9
  file code: 0xC3 0xA9 (encoded by coding system mule-utf-8-dos)
       font: -outline-Arial Unicode MS-normal-r-normal-normal-14-105-96-96-p-60-iso8859-2

  character: é (05151, 2665, 0xa69)
    charset: latin-iso8859-4 (Right-Hand Part of Latin Alphabet 4 (ISO/IEC 8859-4): ISO-IR-110)
 code point: 105
     syntax: word
   category: l:Latin
buffer code: 0x84 0xE9
  file code: E9 (encoded by coding system iso-latin-1-dos)
       font: -outline-Arial Unicode MS-normal-r-normal-normal-14-105-96-96-p-60-iso8859-4

  character: ē (05072, 2618, 0xa3a)
    charset: latin-iso8859-4 (Right-Hand Part of Latin Alphabet 4 (ISO/IEC 8859-4): ISO-IR-110)
 code point: 58
     syntax: word
   category: l:Latin
buffer code: 0x84 0xBA
  file code: 0xC4 0x93 (encoded by coding system utf-8-dos)
       font: -outline-Arial Unicode MS-normal-r-normal-normal-14-105-96-96-p-60-iso8859-4

  character: ē (01210063, 331827, 0x51033)
    charset: mule-unicode-0100-24ff (Unicode characters of the range U+0100..U+24FF.)
 code point: 32 51
     syntax: word
   category: l:Latin
buffer code: 0x9C 0xF4 0xA0 0xB3
  file code: 0xC4 0x93 (encoded by coding system mule-utf-8-dos)
       font: -outline-Arial Unicode MS-normal-r-normal-normal-14-105-96-96-p-60-iso10646-1


* Re: coding system
  2005-03-27  6:56   ` B.T. Raven
@ 2005-03-27 10:50     ` Eli Zaretskii
       [not found]     ` <mailman.321.1111924417.28103.help-gnu-emacs@gnu.org>
  2005-03-29 14:54     ` Stefan Monnier
  2 siblings, 0 replies; 17+ messages in thread
From: Eli Zaretskii @ 2005-03-27 10:50 UTC (permalink / raw)


> From: "B.T. Raven" <ejmn@cpinternet.com>
> Date: Sun, 27 Mar 2005 00:56:37 -0600
> 
> That "or ..." is pregnant with meaning.  It seems that the same
> character can be represented in the same buffer itself with 3 or more
> different byte sequences.

That is precisely so: Emacs 21.x treats Latin-N character sets as
disjoint, so they are represented by different codes internally.

The CVS version introduces features (unification on en- and decoding)
that make this distinction less visible, and work is under way on a
Unicode-based Emacs where the distinction will go away entirely.
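
The numbers in the C-u C-x = reports earlier in the thread fit a simple pattern for the Latin-N charsets: subtract 0x70 from the first buffer-code byte, shift left by 7 bits, and add the code point. The helper below is hypothetical and the formula is inferred from those reports, so treat it as a sketch of the Emacs 21 scheme rather than a documented interface. It does not cover the four-byte mule-unicode case.

```elisp
(defun my-emacs21-char-code (leading-byte code-point)
  "Return the Emacs 21 character code implied by LEADING-BYTE and CODE-POINT.
Hypothetical helper; formula inferred from the reports in this thread."
  (logior (ash (- leading-byte #x70) 7) code-point))

(list (my-emacs21-char-code #x81 105)   ; é from Latin-1
      (my-emacs21-char-code #x82 105)   ; é from Latin-2
      (my-emacs21-char-code #x84 105)   ; é from Latin-4
      (my-emacs21-char-code #x84 58))   ; e with macron from Latin-4
;; => (2281 2409 2665 2618)
```

The three é values differ only in the charset part, which is what "disjoint" means in practice: searching for one of them does not find the others unless unification translates them.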


* Re: coding system
       [not found]     ` <mailman.321.1111924417.28103.help-gnu-emacs@gnu.org>
@ 2005-03-27 15:54       ` Reiner Steib
  2005-03-27 20:03         ` B.T. Raven
  0 siblings, 1 reply; 17+ messages in thread
From: Reiner Steib @ 2005-03-27 15:54 UTC (permalink / raw)

On Sun, Mar 27 2005, Eli Zaretskii wrote:

> That is precisely so: Emacs 21.x treats Latin-N character sets as
> disjoint, so they are represented by different codes internally.
>
> The CVS version introduces features (unification on en- and decoding)
> that make this distinction less visible, and work is under way on a
> Unicode-based Emacs where the distinction will go away entirely.

In case you refer to `unify-8859-on-{en,de}coding-mode', those are
already included in Emacs 21.3 and 21.4 (but not in 21.1 and 21.2):

,----[ NEWS ]
| ** Translation tables are available between equivalent characters in
| different Emacs charsets -- for instance `e with acute' coming from the
| Latin-1 and Latin-2 charsets.  User options `unify-8859-on-encoding-mode'
| and `unify-8859-on-decoding-mode' respectively turn on translation
| between ISO 8859 character sets (`unification') on encoding
| (e.g. writing a file) and decoding (e.g. reading a file).  Note that
| `unify-8859-on-encoding-mode' is useful and safe, but
| `unify-8859-on-decoding-mode' can cause text to change when you read
| it and write it out again without edits, so it is not generally advisable.
| By default `unify-8859-on-encoding-mode' is turned on.
`----

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: coding system
  2005-03-27 15:54       ` Reiner Steib
@ 2005-03-27 20:03         ` B.T. Raven
  0 siblings, 0 replies; 17+ messages in thread
From: B.T. Raven @ 2005-03-27 20:03 UTC (permalink / raw)

"Reiner Steib" <reinersteib+from-uce@imap.cc> wrote in message
news:v9eke1krsl.fsf@marauder.physik.uni-ulm.de...
> On Sun, Mar 27 2005, Eli Zaretskii wrote:
>
> > That is precisely so: Emacs 21.x treats Latin-N character sets as
> > disjoint, so they are represented by different codes internally.
> >
> > The CVS version introduces features (unification on en- and decoding)
> > that make this distinction less visible, and work is under way on a
> > Unicode-based Emacs where the distinction will go away entirely.
>
> In case you refer to `unify-8859-on-{en,de}coding-mode', those are
> already included in Emacs 21.3 and 21.4 (but not in 21.1 and 21.2):
>
> ,----[ NEWS ]
> | ** Translation tables are available between equivalent characters in
> | different Emacs charsets -- for instance `e with acute' coming from the
> | Latin-1 and Latin-2 charsets.  User options `unify-8859-on-encoding-mode'
> | and `unify-8859-on-decoding-mode' respectively turn on translation
> | between ISO 8859 character sets (`unification') on encoding
> | (e.g. writing a file) and decoding (e.g. reading a file).  Note that
> | `unify-8859-on-encoding-mode' is useful and safe, but
> | `unify-8859-on-decoding-mode' can cause text to change when you read
> | it and write it out again without edits, so it is not generally advisable.
> | By default `unify-8859-on-encoding-mode' is turned on.
> `----
>
> Bye, Reiner.

Thanks Eli and Reiner.  My problem is that the unify directives don't
work on the NT build of emacs 21.3, at least not for those 4-byte buffer
representations of Unicode characters outside the Latin-1 range. Once I
have saved the file as utf-8, these characters are invisible to isearch,
replace, etc.  I tried to follow someone's (Stefan's?) suggestion to
Google around for a CVS binary for w32 but I couldn't find anything. I
downloaded WinCVS in the hope that I might be able to figure out how to
make an NT build from sources but that program just locked up my
machine. It's probably for the best since I suspect that I'm out of my
depth trying to use gcc with mingw libraries. "Hello, world" is more my
speed.

I read somewhere that a collation scheme and algorithm have been
developed by the Unicode consortium (version 4.0) that would allow
sorting of even Latin, Arabic, and Mandarin characters in the same file.
I don't need that but it would be nice to have it work with Latin,
Greek, and Hebrew.

Btw, (kibitzing anent the "newbie friendly Emacs" thread) although I
agree that the editor shouldn't be crippled or dumbed down to
accommodate those whose sensibilities have been perverted by Mr. Gates'
OS, it would be nice if the CUA mode actually worked. Maybe C-c x, C-c
c, C-c v could be dedicated temporarily for newbie mode. It might not be
going too far even to provide a couple of "wean" modes that would enable
new users to graduate from the MS bindings to the more sensible Emacs'
ones. Even more useful would be an Emacs "typing tutor" program that
exercises the Ctl and Alt (Meta)* combinations. People who have worn
deep grooves into their brains by using MS programs need to hear more
than "get used to it" if they are going to make the switch to Emacs
permanent.

That RMS wants to give emacs more word processor capabilities is a
welcome development and a shrewd move but if he wants to alienate the
affections of wiser heads for their Windows applications, he should
realize that the potential of emacs for those who are not programmers is
great enough that considerations affecting (hopefully former) MS'
program users should be more than an after-thought.

Thanks again,

Ed

* What's the best keyboard to use with Emacs? Is there one with Super,
Hyper, etc. keys that could be made to work with the w32 version of
Emacs? Is there a keyboard whose layout is more perfectly symmetrical?
There is one useless Win key between my left Ctl and Alt and two of them
between my right Ctl and Alt.  Anyone here using a Dvorak layout?


* Re: coding system
  2005-03-27  6:56   ` B.T. Raven
  2005-03-27 10:50     ` Eli Zaretskii
       [not found]     ` <mailman.321.1111924417.28103.help-gnu-emacs@gnu.org>
@ 2005-03-29 14:54     ` Stefan Monnier
  2 siblings, 0 replies; 17+ messages in thread
From: Stefan Monnier @ 2005-03-29 14:54 UTC (permalink / raw)


>> into sequences of bytes which can be unibyte or multibyte.  This same "é"
>> can be represented in some files with a single byte (e.g. if it's
>> a latin-1 file) or as two bytes (e.g. if it's a utf-8 file), or ...

> That "or ..." is pregnant with meaning.  It seems that the same
> character can be represented in the same buffer itself with 3 or more
> different byte sequences.

While what you say is true, this is not that to which I was referring.
I was not talking about byte-sequences in buffers but in files.


        Stefan
