unofficial mirror of emacs-devel@gnu.org 
* coding problem char \217 etc
@ 2018-11-16 13:57 Uwe Brauer
  2018-11-16 14:28 ` Andreas Schwab
  2018-11-16 15:05 ` Eli Zaretskii
  0 siblings, 2 replies; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 13:57 UTC (permalink / raw)
  To: emacs-devel


Hi 

From time to time I obtain documents which look like

\usepackage[latin1]{inputenc} 
 \usepackage[T1]{fontenc}      % Police contenant les caractres franais
\usepackage{geometry}         % DŽfinir les marges
%  \usepackage[francais]{babel}  % Placez ici une liste de langues, la
                              % dernire Žtant la langue principale

% \pagestyle{headings}        % Pour mettre des enttes avec les titres
                              % des subsections en haut de page


Since I am not sure that the encoding survives, I also write it with the
offending bytes shown as octal escapes:

\usepackage[latin1]{inputenc} 
 \usepackage[T1]{fontenc}      % Police contenant les caract\217res fran\215ais
\usepackage{geometry}         % D\216finir les marges
%  \usepackage[francais]{babel}  % Placez ici une liste de langues, la
                              % derni\217re \216tant la langue principale

% \pagestyle{headings}        % Pour mettre des ent\220tes avec les titres
                              % des subsections en haut de page


And I simply don't know how to deal with it. I have tried saving it as
latin-1 or as utf-8, and I have run encode-coding-region with various
coding systems. Nothing works.

For example, C-x = on \217 shows

Char:  (143, #o217, #x8f, file ...) point=546 of 1416 (38%) <57-1417> column=37

Which is not very helpful.

Any ideas what to do? Is there any way Emacs could deal with this
automatically?

Uwe Brauer 





* Re: coding problem char \217 etc
  2018-11-16 13:57 coding problem char \217 etc Uwe Brauer
@ 2018-11-16 14:28 ` Andreas Schwab
  2018-11-16 15:36   ` Uwe Brauer
  2018-11-16 15:05 ` Eli Zaretskii
  1 sibling, 1 reply; 16+ messages in thread
From: Andreas Schwab @ 2018-11-16 14:28 UTC (permalink / raw)
  To: emacs-devel

On Nov 16 2018, Uwe Brauer <oub@mat.ucm.es> wrote:

> And I simply don't know how to deal with it. I have tried saving it as
> latin-1 or as utf-8, and I have run encode-coding-region with various
> coding systems. Nothing works.

Define "works".  What is your goal?
Note that \217 is not a valid ISO 8859-1 character.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."




* Re: coding problem char \217 etc
  2018-11-16 13:57 coding problem char \217 etc Uwe Brauer
  2018-11-16 14:28 ` Andreas Schwab
@ 2018-11-16 15:05 ` Eli Zaretskii
  2018-11-16 15:22   ` Eli Zaretskii
                     ` (2 more replies)
  1 sibling, 3 replies; 16+ messages in thread
From: Eli Zaretskii @ 2018-11-16 15:05 UTC (permalink / raw)
  To: Uwe Brauer; +Cc: emacs-devel

> From: Uwe Brauer <oub@mat.ucm.es>
> Date: Fri, 16 Nov 2018 14:57:36 +0100
> 
> From time to time I obtain documents which look like
> 
> \usepackage[latin1]{inputenc} 
>  \usepackage[T1]{fontenc}      % Police contenant les caractres franais
> \usepackage{geometry}         % DŽfinir les marges
> %  \usepackage[francais]{babel}  % Placez ici une liste de langues, la
>                               % dernire Žtant la langue principale
> 
> % \pagestyle{headings}        % Pour mettre des enttes avec les titres
>                               % des subsections en haut de page
> 
> 
> Since I am not sure that the coding survives I write that as
> 
> \usepackage[latin1]{inputenc} 
>  \usepackage[T1]{fontenc}      % Police contenant les caract\217res fran\215ais
> \usepackage{geometry}         % D\216finir les marges
> %  \usepackage[francais]{babel}  % Placez ici une liste de langues, la
>                               % derni\217re \216tant la langue principale
> 
> % \pagestyle{headings}        % Pour mettre des ent\220tes avec les titres
>                               % des subsections en haut de page
> 
> 
> And I simply don't know how to deal with it

You need to tell Emacs to read the file with the correct decoding.  In
this case, I think this will do the trick:

  C-x RET c mac-roman RET C-x C-f FILE-NAME RET

> Any possibility how emacs could deal with this automatically?

Maybe.  It depends on your locale defaults and on whether you
customized those defaults (with the likes of prefer-coding-system).
Not every encoding can be reliably decoded, if the defaults defeat
that.

One way of avoiding the manual specification of the encoding is to use
the coding: tag inside the file, either on the first line or in the
file-local variables.




* Re: coding problem char \217 etc
  2018-11-16 15:05 ` Eli Zaretskii
@ 2018-11-16 15:22   ` Eli Zaretskii
  2018-11-16 15:33   ` Stefan Monnier
  2018-11-16 15:36   ` Uwe Brauer
  2 siblings, 0 replies; 16+ messages in thread
From: Eli Zaretskii @ 2018-11-16 15:22 UTC (permalink / raw)
  To: oub; +Cc: emacs-devel

> Date: Fri, 16 Nov 2018 17:05:22 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> Not every encoding can be reliably decoded
                                     ^^^^^^^
I meant "detected", not "decoded".

Sorry.




* Re: coding problem char \217 etc
  2018-11-16 15:05 ` Eli Zaretskii
  2018-11-16 15:22   ` Eli Zaretskii
@ 2018-11-16 15:33   ` Stefan Monnier
  2018-11-16 15:58     ` Uwe Brauer
  2018-11-16 15:36   ` Uwe Brauer
  2 siblings, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2018-11-16 15:33 UTC (permalink / raw)
  To: emacs-devel

> One way of avoiding the manual specification of the encoding is to use
> the coding: tag inside the file, either on the first line or in the
> file-local variables.

Note that in his examples he has:

    \usepackage[latin1]{inputenc} 

which Emacs normally recognizes as indicating that the file uses latin-1
(which is actually a lie in this case, though that lie might only affect
comments, so it's not a complete lie).

Nowadays, the better bet is to use utf-8 (which is much easier to
auto-detect and doesn't have umpteen extensions like latin-1 has).


        Stefan





* Re: coding problem char \217 etc
  2018-11-16 15:05 ` Eli Zaretskii
  2018-11-16 15:22   ` Eli Zaretskii
  2018-11-16 15:33   ` Stefan Monnier
@ 2018-11-16 15:36   ` Uwe Brauer
  2018-11-16 15:53     ` Eli Zaretskii
  2 siblings, 1 reply; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 15:36 UTC (permalink / raw)
  To: emacs-devel




   > You need to tell Emacs to read the file with the correct decoding.  In
   > this case, I think this will do the trick:

   >   C-x RET c mac-roman RET C-x C-f FILE-NAME RET

That worked, thank you!

    1. How did you guess mac-roman? How can I find that out myself in
       the future?

    2. Is there any faster possibility? I tried
       (set-buffer-file-coding-system 'mac nil), but this did not work,
       and I did not find mac-roman as a coding system there either.

   > Maybe.  It depends on your locale defaults and on whether you
   > customized those defaults (with the likes of prefer-coding-system).
   > Not every encoding can be reliably decoded, if the defaults defeat
   > that.

   > One way of avoiding the manual specification of the encoding is to use
   > the coding: tag inside the file, either on the first line or in the
   > file-local variables.

That is what I usually do: I put a tag on the first line. But if I
receive a file whose coding I don't know, and I don't know how to find
it out, I am sort of stuck.




* Re: coding problem char \217 etc
  2018-11-16 14:28 ` Andreas Schwab
@ 2018-11-16 15:36   ` Uwe Brauer
  2018-11-16 15:46     ` Andreas Schwab
  0 siblings, 1 reply; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 15:36 UTC (permalink / raw)
  To: emacs-devel


>>> "Andreas" == Andreas Schwab <schwab@linux-m68k.org> writes:

   > On Nov 16 2018, Uwe Brauer <oub@mat.ucm.es> wrote:
   >> And I simply don't know how to deal with it. I have tried saving it as
   >> latin-1 or as utf-8, and I have run encode-coding-region with various
   >> coding systems. Nothing works.

   > Define "works".  What is your goal?

The steps Eli provided did it: I obtained a file with correctly encoded chars.


   > Note that \217 is not a valid ISO 8859-1 character.

   > Andreas.


* Re: coding problem char \217 etc
  2018-11-16 15:36   ` Uwe Brauer
@ 2018-11-16 15:46     ` Andreas Schwab
  0 siblings, 0 replies; 16+ messages in thread
From: Andreas Schwab @ 2018-11-16 15:46 UTC (permalink / raw)
  To: emacs-devel

On Nov 16 2018, Uwe Brauer <oub@mat.ucm.es> wrote:

>>>> "Andreas" == Andreas Schwab <schwab@linux-m68k.org> writes:
>
>    > On Nov 16 2018, Uwe Brauer <oub@mat.ucm.es> wrote:
>    >> And I simply don't know how to deal with it. I have tried saving it as
>    >> latin-1 or as utf-8, and I have run encode-coding-region with various
>    >> coding systems. Nothing works.
>
>    > Define "works".  What is your goal?
>
> The steps Eli provided did it: I obtained a file with correctly encoded chars.

Which you didn't.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."




* Re: coding problem char \217 etc
  2018-11-16 15:36   ` Uwe Brauer
@ 2018-11-16 15:53     ` Eli Zaretskii
  2018-11-16 16:00       ` Uwe Brauer
  0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2018-11-16 15:53 UTC (permalink / raw)
  To: Uwe Brauer; +Cc: emacs-devel

> From: Uwe Brauer <oub@mat.ucm.es>
> Date: Fri, 16 Nov 2018 16:36:01 +0100
> 
>    >   C-x RET c mac-roman RET C-x C-f FILE-NAME RET
> 
> That worked thank you!
> 
>     1. How did you guess mac-roman? How can I find that out myself in the future?

The "fran\215ais" was supposed to be français, obviously, so I
searched the Internet for an encoding where ç is encoded as 215 octal.
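
As a rough illustration of that kind of search (a sketch in Python, not
something anyone in this thread ran; the candidate list is an arbitrary
assumption and codecs_mapping is a hypothetical helper name), one can
probe a few single-byte codecs for the one that decodes octal 215 to ç:

```python
# Candidate single-byte codecs to probe. This selection is an assumption
# made for illustration; it is not exhaustive.
CANDIDATES = ["latin_1", "iso8859_15", "cp1252", "cp437", "cp850", "mac_roman"]


def codecs_mapping(byte_value, expected_char, candidates=CANDIDATES):
    """Return the candidate codecs that decode byte_value to expected_char."""
    matches = []
    for name in candidates:
        try:
            decoded = bytes([byte_value]).decode(name)
        except UnicodeDecodeError:
            # The byte is undefined in this codec, so it cannot be the one.
            continue
        if decoded == expected_char:
            matches.append(name)
    return matches


# "fran\215ais" should read "français", so octal 215 must decode to ç.
print(codecs_mapping(0o215, "ç"))
```

With this candidate list only mac_roman matches, and the same codec maps
\217, \216 and \220 to è, é and ê, which fits the other damaged words.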

>     2. Is there any faster possibility?

Faster than what?




* Re: coding problem char \217 etc
  2018-11-16 15:33   ` Stefan Monnier
@ 2018-11-16 15:58     ` Uwe Brauer
  0 siblings, 0 replies; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 15:58 UTC (permalink / raw)
  To: emacs-devel


>>> "Stefan" == Stefan Monnier <monnier@iro.umontreal.ca> writes:

   >> One way of avoiding the manual specification of the encoding is to use
   >> the coding: tag inside the file, either on the first line or in the
   >> file-local variables.

   > Note that in his examples he has:

   >     \usepackage[latin1]{inputenc} 

   > which Emacs normally recognizes as indicating that the file uses latin-1
   > (which is actually a lie in this case, though that lie might only affect
   > comments, so it's not a complete lie).

   > Nowadays, the better bet is to use utf-8 (which is much easier to
   > auto-detect and doesn't have umpteen extensions like latin-1 has).

Yes, I know. I tried deleting this line and reopening the file, but it
did not help. That is why I asked for the coding. Eli guessed it correctly
and I still don't know how he did it....

Uwe 


* Re: coding problem char \217 etc
  2018-11-16 15:53     ` Eli Zaretskii
@ 2018-11-16 16:00       ` Uwe Brauer
  2018-11-16 16:08         ` Eli Zaretskii
  2018-11-20 19:30         ` Charles A. Roelli
  0 siblings, 2 replies; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 16:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Uwe Brauer, emacs-devel


>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:

   >> From: Uwe Brauer <oub@mat.ucm.es>
   >> Date: Fri, 16 Nov 2018 16:36:01 +0100
   >> 
   >> >   C-x RET c mac-roman RET C-x C-f FILE-NAME RET
   >> 
   >> That worked thank you!
   >> 
   >> 1. How did you guess mac-roman? How can I find that out myself in the future?

   > The "fran\215ais" was supposed to be français, obviously, so I
   > searched the Internet for an encoding where ç is encoded as 215 octal.

Ah, ok. 
   >> 2. Is there any faster possibility?

   > Faster than what?

Than typing
C-x RET c mac-roman RET C-x C-f FILE-NAME RET

I mean open the file, mark the region and recode the text to the correct
coding. 


* Re: coding problem char \217 etc
  2018-11-16 16:00       ` Uwe Brauer
@ 2018-11-16 16:08         ` Eli Zaretskii
  2018-11-16 16:31           ` Uwe Brauer
  2018-11-16 16:35           ` Uwe Brauer
  2018-11-20 19:30         ` Charles A. Roelli
  1 sibling, 2 replies; 16+ messages in thread
From: Eli Zaretskii @ 2018-11-16 16:08 UTC (permalink / raw)
  To: Uwe Brauer; +Cc: emacs-devel

> From: Uwe Brauer <oub@mat.ucm.es>
> Cc: Uwe Brauer <oub@mat.ucm.es>, emacs-devel@gnu.org
> Date: Fri, 16 Nov 2018 17:00:40 +0100
> 
>    >> 2. Is there any faster possibility?
> 
>    > Faster than what?
> 
> Than typing
> C-x RET c mac-roman RET C-x C-f FILE-NAME RET
> 
> I mean open the file, mark the region and recode the text to the correct
> coding. 

If you already have the file visited in Emacs, then

  C-x RET r mac-roman RET

will revert the buffer with that encoding.  Is this what you wanted?




* Re: coding problem char \217 etc
  2018-11-16 16:08         ` Eli Zaretskii
@ 2018-11-16 16:31           ` Uwe Brauer
  2018-11-16 16:35           ` Uwe Brauer
  1 sibling, 0 replies; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 16:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Uwe Brauer, emacs-devel




   > If you already have the file visited in Emacs, then

   >   C-x RET r mac-roman RET

   > will revert the buffer with that encoding.  Is this what you wanted?

Exactly! Thanks 


* Re: coding problem char \217 etc
  2018-11-16 16:08         ` Eli Zaretskii
  2018-11-16 16:31           ` Uwe Brauer
@ 2018-11-16 16:35           ` Uwe Brauer
  2018-11-16 17:23             ` Eli Zaretskii
  1 sibling, 1 reply; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 16:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Uwe Brauer, emacs-devel


>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:

   >> From: Uwe Brauer <oub@mat.ucm.es>
   >> Cc: Uwe Brauer <oub@mat.ucm.es>, emacs-devel@gnu.org
   >> Date: Fri, 16 Nov 2018 17:00:40 +0100
   >> 
   >> >> 2. Is there any faster possibility?
   >> 
   >> > Faster than what?
   >> 
   >> Than typing
   >> C-x RET c mac-roman RET C-x C-f FILE-NAME RET
   >> 
   >> I mean open the file, mark the region and recode the text to the correct
   >> coding. 

   > If you already have the file visited in Emacs, then

   >   C-x RET r mac-roman RET

   > will revert the buffer with that encoding.  Is this what you wanted?

But I am still wondering why

(decode-coding-region (region-beginning) (region-end) 'mac-roman)

does not work.


* Re: coding problem char \217 etc
  2018-11-16 16:35           ` Uwe Brauer
@ 2018-11-16 17:23             ` Eli Zaretskii
  0 siblings, 0 replies; 16+ messages in thread
From: Eli Zaretskii @ 2018-11-16 17:23 UTC (permalink / raw)
  To: Uwe Brauer; +Cc: emacs-devel

> From: Uwe Brauer <oub@mat.ucm.es>
> Cc: Uwe Brauer <oub@mat.ucm.es>, emacs-devel@gnu.org
> Date: Fri, 16 Nov 2018 17:35:03 +0100
> 
> But I am still wondering why
> 
> (decode-coding-region (region-beginning) (region-end) 'mac-roman)
> 
> does not work.

In "emacs -Q" or in your customized Emacs?

Besides, you should invoke this after visiting the file literally,
with find-file-literally.  After "C-x C-f", it's too late, because the
file is already decoded.
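
The difference between raw octets and already-decoded text can be
illustrated outside Emacs. The following is only an analogy in Python
under stated assumptions (decode_literal is a hypothetical name, and
latin-1 is assumed to be the wrong decoding that was applied); it says
nothing about how Emacs implements this internally:

```python
def decode_literal(raw, coding="mac_roman"):
    """Decode raw octets, as read literally from disk, with the right codec."""
    return raw.decode(coding)


# The octets \217 and \215 from the example, still undecoded.
raw = b"caract\x8fres fran\x8dais"

# Decoding the raw octets directly gives the intended text.
print(decode_literal(raw))

# After a latin-1 decoding the data is text, not octets: the former bytes
# have become C1 control characters, which is what C-x = reported (#x8f).
wrong = raw.decode("latin_1")
print([hex(ord(c)) for c in wrong if ord(c) > 0x7F])
```

At that point there are no octets left to decode, so the wrong decoding
has to be undone first or the file has to be read again.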




* Re: coding problem char \217 etc
  2018-11-16 16:00       ` Uwe Brauer
  2018-11-16 16:08         ` Eli Zaretskii
@ 2018-11-20 19:30         ` Charles A. Roelli
  1 sibling, 0 replies; 16+ messages in thread
From: Charles A. Roelli @ 2018-11-20 19:30 UTC (permalink / raw)
  To: Uwe Brauer; +Cc: oub, eliz, emacs-devel

> From: Uwe Brauer <oub@mat.ucm.es>
> Date: Fri, 16 Nov 2018 17:00:40 +0100

>    >> 2. Is there any faster possibility?
> 
>    > Faster than what?
> 
> Than typing
> C-x RET c mac-roman RET C-x C-f FILE-NAME RET
> 
> I mean open the file, mark the region and recode the text to the correct
> coding. 

Try the command 'recode-region' if you haven't already done so.
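
The idea behind recoding text that was decoded with the wrong coding can
be sketched as a round trip: encode back with the coding that was wrongly
applied, then decode with the real one. This is a Python analogy, not the
Emacs implementation; the function name and its defaults (latin-1 as the
wrong coding, mac-roman as the real one) are assumptions taken from this
thread's example:

```python
def recode(text, was_decoded_as="latin_1", real_coding="mac_roman"):
    """Undo a wrong decoding by going back to octets, then decode correctly."""
    return text.encode(was_decoded_as).decode(real_coding)


# What a latin-1 decoding makes of the octet \220 in "ent\220tes".
garbled = "ent\x90tes"
print(recode(garbled))
```

The round trip only works when the wrong decoding was lossless, as it is
for latin-1, which maps every octet to exactly one character.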


