unofficial mirror of emacs-devel@gnu.org 
* Adding cp858?
@ 2006-09-05 15:08 Reiner Steib
  2006-09-06  1:01 ` Kenichi Handa
  0 siblings, 1 reply; 6+ messages in thread
From: Reiner Steib @ 2006-09-05 15:08 UTC (permalink / raw)


Hi,

I just opened the LaTeX file <http://home.vr-web.de/was/x/bad.tex>[1]
in Emacs.  To my surprise, the non-ASCII characters were not displayed
correctly.  The file is encoded in cp858; cp858 is missing in
`latex-inputenc-coding-alist':

,----
|    ;; ("cp858" . undecided) ; IBM code page 850 but with a euro symbol
`----
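
For readers who want to see the lookup idea outside Emacs, here is a
minimal Python sketch.  It is not how latexenc.el is implemented: the
dictionary `INPUTENC_TO_CODEC', the regular expression and the function
`guess_codec' are hypothetical names made up for illustration, and the
values are Python codec names rather than Emacs coding systems.

```python
import re
import unicodedata

# Hypothetical subset of an inputenc-option -> codec mapping.  The values
# are Python codec names, not Emacs coding systems.
INPUTENC_TO_CODEC = {
    "cp437": "cp437",
    "cp850": "cp850",
    "cp852": "cp852",
    "cp858": "cp858",
    "cp865": "cp865",
}

# Matches \usepackage[OPTION]{inputenc}; OPTION is captured as bytes.
INPUTENC_RE = re.compile(rb"\\usepackage\[([^\]]+)\]\{inputenc\}")


def guess_codec(tex_bytes):
    """Return the codec named by the first inputenc option, or None."""
    match = INPUTENC_RE.search(tex_bytes)
    if match is None:
        return None
    option = match.group(1).decode("ascii", errors="replace").strip()
    return INPUTENC_TO_CODEC.get(option)


header = (
    b"\\documentclass[a4paper,twoside]{article}\n"
    b"\\usepackage[cp858]{inputenc}\n"
)
codec = guess_codec(header)
print(codec)
# Byte 213 (0xD5) is the position where cp858 carries the euro sign.
print(unicodedata.name(b"\xd5".decode(codec)))
```

The sketch does not skip commented-out \usepackage lines, so it is only
a rough illustration of the name-to-coding-system step.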

According to the LaTeX *.def files, the only difference between cp850
and cp858 is the `€' (EURO SIGN, U+20AC) replacing `ı' (LATIN SMALL
LETTER DOTLESS I, U+0131). [2]

May I add the following change?  (The "(cp-make-coding-system cp858
...)" part is a copy of the corresponding cp850 entry with dotless i
replaced by the EUR sign.)

I don't know if a similar change to codepage.el should be done as
well.

2006-09-05  Reiner Steib  <Reiner.Steib@gmx.de>

	* international/latexenc.el (latex-inputenc-coding-alist): Add cp858.

	* international/code-pages.el: Add cp858.

--8<---------------cut here---------------start------------->8---
--- international/latexenc.el	9 Aug 2006 01:11:44 -0000	1.16
+++ international/latexenc.el	5 Sep 2006 14:54:04 -0000
@@ -63,7 +63,7 @@
     ("cp437" . cp437) ; IBM code page 437: 225 is \beta
     ("cp850" . cp850) ; IBM code page 850
     ("cp852" . cp852) ; IBM code page 852
-    ;; ("cp858" . undecided) ; IBM code page 850 but with a euro symbol
+    ("cp858" . cp858) ; IBM code page 850 but with a euro symbol
     ("cp865" . cp865) ; IBM code page 865
     ;; The DECMultinational charaterset used by the OpenVMS system
     ;; ("decmulti" . undecided)

--- international/code-pages.el	19 May 2006 04:24:00 -0000	1.35
+++ international/code-pages.el	5 Sep 2006 14:54:04 -0000
@@ -1273,6 +1273,138 @@
   ?\■
   ?\ ])
 
+;;;###autoload(autoload-coding-system 'cp858 '(require 'code-pages))
+(cp-make-coding-system
+ cp858
+ [?\Ç
+  ?\ü
+  ?\é
+  ?\â
+  ?\ä
+  ?\à
+  ?\å
+  ?\ç
+  ?\ê
+  ?\ë
+  ?\è
+  ?\ï
+  ?\î
+  ?\ì
+  ?\Ä
+  ?\Å
+  ?\É
+  ?\æ
+  ?\Æ
+  ?\ô
+  ?\ö
+  ?\ò
+  ?\û
+  ?\ù
+  ?\ÿ
+  ?\Ö
+  ?\Ü
+  ?\ø
+  ?\£
+  ?\Ø
+  ?\×
+  ?\ƒ
+  ?\á
+  ?\í
+  ?\ó
+  ?\ú
+  ?\ñ
+  ?\Ñ
+  ?\ª
+  ?\º
+  ?\¿
+  ?\®
+  ?\¬
+  ?\½
+  ?\¼
+  ?\¡
+  ?\«
+  ?\»
+  ?\░
+  ?\▒
+  ?\▓
+  ?\│
+  ?\┤
+  ?\Á
+  ?\Â
+  ?\À
+  ?\©
+  ?\╣
+  ?\║
+  ?\╗
+  ?\╝
+  ?\¢
+  ?\¥
+  ?\┐
+  ?\└
+  ?\┴
+  ?\┬
+  ?\├
+  ?\─
+  ?\┼
+  ?\ã
+  ?\Ã
+  ?\╚
+  ?\╔
+  ?\╩
+  ?\╦
+  ?\╠
+  ?\═
+  ?\╬
+  ?\¤
+  ?\ð
+  ?\Ð
+  ?\Ê
+  ?\Ë
+  ?\È
+  ?\€
+  ?\Í
+  ?\Î
+  ?\Ï
+  ?\┘
+  ?\┌
+  ?\█
+  ?\▄
+  ?\¦
+  ?\Ì
+  ?\▀
+  ?\Ó
+  ?\ß
+  ?\Ô
+  ?\Ò
+  ?\õ
+  ?\Õ
+  ?\µ
+  ?\þ
+  ?\Þ
+  ?\Ú
+  ?\Û
+  ?\Ù
+  ?\ý
+  ?\Ý
+  ?\¯
+  ?\´
+  ?\­
+  ?\±
+  ?\‗
+  ?\¾
+  ?\¶
+  ?\§
+  ?\÷
+  ?\¸
+  ?\°
+  ?\¨
+  ?\·
+  ?\¹
+  ?\³
+  ?\²
+  ?\■
+  ?\ ])
+
 ;;;###autoload(autoload-coding-system 'cp860 '(require 'code-pages))
 (cp-make-coding-system
  cp860
--8<---------------cut here---------------end--------------->8---

Bye, Reiner.

[1]
,----[ http://home.vr-web.de/was/x/bad.tex ]
| \documentclass[a4paper,twoside]{article}
| \usepackage[cp858]{inputenc}              % OS/2 (sic!)
| [...]
`----

[2] From TeXlive 2005:

,----[ diff -U0 texmf-dist/tex/latex/base/{cp850,cp858}.def ]
| --- texmf-dist/tex/latex/base/cp850.def 2004-03-02 00:28:48.000000000 +0100
| +++ texmf-dist/tex/latex/base/cp858.def 2004-03-02 00:28:48.000000000 +0100
| @@ -2 +2 @@
| -%% This is file `cp850.def',
| +%% This is file `cp858.def',
| @@ -7 +7 @@
| -%% inputenc.dtx  (with options: `cp850')
| +%% inputenc.dtx  (with options: `cp858')
| @@ -54 +54 @@
| -  \ProvidesFile{cp850.def}
| +  \ProvidesFile{cp858.def}
| @@ -56,3 +55,0 @@
| -%%
| -%% If you need a euro symbol, try cp858 instead.
| -%%
| @@ -137 +134 @@
| -\DeclareInputText{213}{\i}
| +\DeclareInputText{213}{\texteuro}
| @@ -180 +177 @@
| -%% End of file `cp850.def'.
| +%% End of file `cp858.def'.
`----
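
The single changed \DeclareInputText line above (position 213, i.e.
0xD5) can be cross-checked independently of LaTeX and Emacs.  The
following Python sketch assumes that Python's bundled `cp850' and
`cp858' codecs are a fair stand-in for the code pages in question; it
simply lists every byte that the two decode differently.

```python
import unicodedata

# Collect every byte value that the two codecs decode to different characters.
differences = []
for byte in range(256):
    old = bytes([byte]).decode("cp850")
    new = bytes([byte]).decode("cp858")
    if old != new:
        differences.append((byte, old, new))

for byte, old, new in differences:
    print(byte, hex(byte), unicodedata.name(old), "->", unicodedata.name(new))
```
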
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: Adding cp858?
  2006-09-05 15:08 Adding cp858? Reiner Steib
@ 2006-09-06  1:01 ` Kenichi Handa
  2006-09-06  9:00   ` Reiner Steib
  0 siblings, 1 reply; 6+ messages in thread
From: Kenichi Handa @ 2006-09-06  1:01 UTC (permalink / raw)
  Cc: emacs-devel

In article <v9hczmfj0n.fsf@marauder.physik.uni-ulm.de>, Reiner Steib <reinersteib+gmane@imap.cc> writes:

> I just opened the LaTeX file <http://home.vr-web.de/was/x/bad.tex>[1]
> in Emacs.  To my surprise, the non-ASCII characters were not displayed
> correctly.  The file is encoded in cp858; cp858 is missing in
> `latex-inputenc-coding-alist':

> ,----
> |    ;; ("cp858" . undecided) ; IBM code page 850 but with a euro symbol
> `----

> According to the LaTeX *.def files, the only difference between cp850
> and cp858 is the `€' (EURO SIGN, U+20AC) replacing `ı' (LATIN SMALL
> LETTER DOTLESS I, U+0131). [2]

> May I add the following change?  (The "(cp-make-coding-system cp858
> ...)" part is a copy of the corresponding cp850 entry with dotless i
> replaced by the EUR sign.)

> I don't know if a similar change to codepage.el should be done as
> well.

> 2006-09-05  Reiner Steib  <Reiner.Steib@gmx.de>

> 	* international/latexenc.el (latex-inputenc-coding-alist): Add cp858.

> 	* international/code-pages.el: Add cp858.

As the changes are straightforward, I agree with installing
them.  For codepage.el, if cp858 can also be used on DOS, I
think a similar change should be installed.

---
Kenichi Handa
handa@m17n.org



* Re: Adding cp858?
  2006-09-06  1:01 ` Kenichi Handa
@ 2006-09-06  9:00   ` Reiner Steib
  2006-09-06 18:17     ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Reiner Steib @ 2006-09-06  9:00 UTC (permalink / raw)
  Cc: Eli Zaretskii, emacs-devel

On Wed, Sep 06 2006, Kenichi Handa wrote:

> Reiner Steib <reinersteib+gmane@imap.cc> writes:
[...]
>> According to the LaTeX *.def files, the only difference between cp850
>> and cp858 is the `€' (EURO SIGN, U+20AC) replacing `ı' (LATIN SMALL
>> LETTER DOTLESS I, U+0131). [2]
>
>> May I add the following change?  (The "(cp-make-coding-system cp858
>> ...)" part is a copy of the corresponding cp850 entry with dotless i
>> replaced by the EUR sign.)
>
>> I don't know if a similar change to codepage.el should be done as
>> well.
[...]
> As the changes are straightforward, I agree with installing
> them.

Okay, I will install it if nobody objects.

> For codepage.el, if cp858 can also be used on DOS, I think a
> similar change should be installed.

I don't know if it's used on DOS; the author of the LaTeX file
mentioned OS/2.

If I understand it correctly, `cp850-decode-table' in `codepage.el' is
derived from Latin-1 (iso-8859-1).  As Latin-1 doesn't include the EUR
sign, cp858 can't be derived from Latin-1.  But `codepage.el' doesn't
seem to support Latin-9 (iso-8859-15) which includes the EUR (and some
other changes to Latin-1).  I didn't find anything related to Latin-9
there.  I'm not sure if `codepage.el' provides any codepages that
include the EUR sign:

;; Support for the Windows 12xx series of codepages that MS has
;; butchered from the ISO-8859 specs. This does not add support for
;; the extended characters that MS has added in the 128 - 159 coding
;; range, only translates those characters that can be expressed in
;; the corresponding iso-8859 charset.

If it's not possible to provide full support of cp858 (including the
EUR) within `codepage.el', we should probably treat it as cp850 (in
the same way as windows-1252 is treated as Latin-1, if I understand
correctly).
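
To make the Latin-1 versus Latin-9 point concrete, here is a small
Python sketch.  It assumes Python's `latin-1', `iso8859_15', `cp850'
and `cp858' codecs match the character sets meant here; the helper
names `encodable' and `missing_from_cp858' are invented for this
example.  It reports which of these character sets can represent the
euro sign, and which upper-half characters of each ISO set have no
position in cp858.

```python
def encodable(char, codec):
    """Return True if CHAR has a byte representation in CODEC."""
    try:
        char.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


euro = "\u20ac"
euro_support = {
    codec: encodable(euro, codec)
    for codec in ("latin-1", "iso8859_15", "cp850", "cp858")
}
print(euro_support)


def missing_from_cp858(iso_codec):
    """List characters in bytes 0xA0..0xFF of ISO_CODEC that cp858 lacks."""
    chars = bytes(range(0xA0, 0x100)).decode(iso_codec)
    return [c for c in chars if not encodable(c, "cp858")]


print([hex(ord(c)) for c in missing_from_cp858("latin-1")])
print([hex(ord(c)) for c in missing_from_cp858("iso8859_15")])
```

Under these assumptions neither ISO set is an exact match: Latin-1 has
no euro sign, while Latin-9 has it but also contains a few letters that
cp858 has no position for.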

Eli?

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: Adding cp858?
  2006-09-06  9:00   ` Reiner Steib
@ 2006-09-06 18:17     ` Eli Zaretskii
  2006-09-07 12:54       ` Reiner Steib
  0 siblings, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2006-09-06 18:17 UTC (permalink / raw)
  Cc: handa

> From: Reiner Steib <reinersteib+gmane@imap.cc>
> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
> Date: Wed, 06 Sep 2006 11:00:25 +0200
> 
> > For codepage.el, if cp858 can also be used on DOS, I think a
> > similar change should be installed.
> 
> I don't know if it's used on DOS; the author of the LaTeX file
> mentioned OS/2.

I don't know if there's a DOS version that supports cp858 (maybe the
just-released FreeDOS will?), but I see no harm in adding this
support.

> If I understand it correctly, `cp850-decode-table' in `codepage.el' is
> derived from Latin-1 (iso-8859-1).  As Latin-1 doesn't include the EUR
> sign, cp858 can't be derived from Latin-1.  But `codepage.el' doesn't
> seem to support Latin-9 (iso-8859-15) which includes the EUR (and some
> other changes to Latin-1).  I didn't find anything related to Latin-9
> there.  I'm not sure if `codepage.el' provides any codepages that
> include the EUR sign:

codepage.el can support _any_ ISO-8859 character set.  The target
character set is given by the `charset' property of the symbol that
holds the decoding table.  Here's an example (watch the call to
setplist):

  (defvar cp855-decode-table
    [
     255 133 129 131 135 137 139 141 143 145 147 149 151 240 153 155
     161 163 236 173 167 169 234 244 184 190 199 209 211 213 215 221
     226 228 230 232 171 182 165 252 246 250 159 242 238 248 157 224
     160 162 235 172 166 168 233 243 183 189 198 208 210 212 214 216
     225 227 229 231 170 181 164 251 245 249 158 241 237 247 156 222
     239 132 128 130 134 136 138 140 142 144 146 148 150 253 152 154]
    "Table for converting ISO-8859-5 characters into codepage 855 glyphs.")
  (setplist 'cp855-decode-table
	    '(charset cyrillic-iso8859-5 language "Cyrillic-ISO" offset 160))
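
One way to read such a table, going by its docstring and the `offset'
property: entry N holds the codepage 855 byte for the ISO-8859-5 byte
N + 160.  The Python sketch below is only an illustration of that
reading, not code from codepage.el; it assumes Python's `iso8859_5' and
`cp855' codecs agree with the charsets used in Emacs, and `build_table'
is a made-up helper.  It rebuilds the table from the two codecs and
compares its first row with the first row posted above.

```python
OFFSET = 160  # mirrors the `offset' property in the setplist call

# First row of cp855-decode-table as posted in the message.
POSTED_FIRST_ROW = [255, 133, 129, 131, 135, 137, 139, 141,
                    143, 145, 147, 149, 151, 240, 153, 155]


def build_table(iso_codec, codepage_codec, offset=OFFSET):
    """Map each ISO byte >= OFFSET to the codepage byte of the same character."""
    table = []
    for iso_byte in range(offset, 256):
        char = bytes([iso_byte]).decode(iso_codec)
        table.append(char.encode(codepage_codec)[0])
    return table


table = build_table("iso8859_5", "cp855")
print(len(table))
print(table[:16] == POSTED_FIRST_ROW)
```

The same construction would raise an error for a character the
codepage cannot express, which is where a cp858 table would need a
deliberate choice of target charset.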


* Re: Adding cp858?
  2006-09-06 18:17     ` Eli Zaretskii
@ 2006-09-07 12:54       ` Reiner Steib
  2006-09-09 12:19         ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Reiner Steib @ 2006-09-07 12:54 UTC (permalink / raw)
  Cc: handa, emacs-devel

On Wed, Sep 06 2006, Eli Zaretskii wrote:

>> From: Reiner Steib <reinersteib+gmane@imap.cc>
>> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
>> Date: Wed, 06 Sep 2006 11:00:25 +0200
>> 
>> > For codepage.el, if cp858 can also be used on DOS, I think a
>> > similar change should be installed.
>> 
>> I don't know if it's used on DOS; the author of the LaTeX file
>> mentioned OS/2.
>
> I don't know if there's a DOS version that supports cp858 (maybe the
> just-released FreeDOS will?), but I see no harm in adding this
> support.

I've added it to latexenc.el and code-pages.el.

> codepage.el can support _any_ ISO-8859 character set.  The target
> character set is given by the `charset' property of the symbol that
> holds the decoding table.  

Sorry, I don't understand how this has to be done for cp858.  Could
you please add it to codepage.el as well?

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/


* Re: Adding cp858?
  2006-09-07 12:54       ` Reiner Steib
@ 2006-09-09 12:19         ` Eli Zaretskii
  0 siblings, 0 replies; 6+ messages in thread
From: Eli Zaretskii @ 2006-09-09 12:19 UTC (permalink / raw)


> Cc: emacs-devel@gnu.org, handa@m17n.org
> From: Reiner Steib <reinersteib+gmane@imap.cc>
> Date: Thu, 07 Sep 2006 14:54:58 +0200
> 
> > codepage.el can support _any_ ISO-8859 character set.  The target
> > character set is given by the `charset' property of the symbol that
> > holds the decoding table.  
> 
> Sorry, I don't understand how this has to be done for cp858.  Could
> you please add it to codepage.el as well?

Done.

Of course, I couldn't test it, since I don't have access to a PC
running DOS with cp858 as its system codepage.
