* coding problem char \217 etc
@ 2018-11-16 13:57 Uwe Brauer
2018-11-16 14:28 ` Andreas Schwab
2018-11-16 15:05 ` Eli Zaretskii
0 siblings, 2 replies; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 13:57 UTC (permalink / raw)
To: emacs-devel
Hi
From time to time I obtain documents which look like
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc} % Police contenant les caractres franais
\usepackage{geometry} % Dfinir les marges
% \usepackage[francais]{babel} % Placez ici une liste de langues, la
% dernire tant la langue principale
% \pagestyle{headings} % Pour mettre des enttes avec les titres
% des subsections en haut de page
Since I am not sure that the encoding survives in mail, I write that as
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc} % Police contenant les caract\217res fran\215ais
\usepackage{geometry} % D\216finir les marges
% \usepackage[francais]{babel} % Placez ici une liste de langues, la
% derni\217re \216tant la langue principale
% \pagestyle{headings} % Pour mettre des ent\220tes avec les titres
% des subsections en haut de page
And I simply don't know how to deal with it. I tried to save it as latin-1
or as utf-8, and I ran encode-coding-region with various coding systems.
Nothing works.
C-x = shows me for example for \217
Char: (143, #o217, #x8f, file ...) point=546 of 1416 (38%) <57-1417> column=37
Which is not very helpful.
Any ideas what to do? Is there any way Emacs could deal with this
automatically?
Uwe Brauer
* Re: coding problem char \217 etc
2018-11-16 13:57 coding problem char \217 etc Uwe Brauer
@ 2018-11-16 14:28 ` Andreas Schwab
2018-11-16 15:36 ` Uwe Brauer
2018-11-16 15:05 ` Eli Zaretskii
1 sibling, 1 reply; 16+ messages in thread
From: Andreas Schwab @ 2018-11-16 14:28 UTC (permalink / raw)
To: emacs-devel
On Nov 16 2018, Uwe Brauer <oub@mat.ucm.es> wrote:
> And I simply don't know how to deal with it. I tried to save it as latin-1
> or as utf-8, and I ran encode-coding-region with various coding systems.
> Nothing works.
Define "works". What is your goal?
Note that \217 is not a valid ISO 8859-1 character.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
* Re: coding problem char \217 etc
2018-11-16 13:57 coding problem char \217 etc Uwe Brauer
2018-11-16 14:28 ` Andreas Schwab
@ 2018-11-16 15:05 ` Eli Zaretskii
2018-11-16 15:22 ` Eli Zaretskii
` (2 more replies)
1 sibling, 3 replies; 16+ messages in thread
From: Eli Zaretskii @ 2018-11-16 15:05 UTC (permalink / raw)
To: Uwe Brauer; +Cc: emacs-devel
> From: Uwe Brauer <oub@mat.ucm.es>
> Date: Fri, 16 Nov 2018 14:57:36 +0100
>
> From time to time I obtain documents which look like
>
> \usepackage[latin1]{inputenc}
> \usepackage[T1]{fontenc} % Police contenant les caractres franais
> \usepackage{geometry} % Dfinir les marges
> % \usepackage[francais]{babel} % Placez ici une liste de langues, la
> % dernire tant la langue principale
>
> % \pagestyle{headings} % Pour mettre des enttes avec les titres
> % des subsections en haut de page
>
>
> Since I am not sure that the coding survives I write that as
>
> \usepackage[latin1]{inputenc}
> \usepackage[T1]{fontenc} % Police contenant les caract\217res fran\215ais
> \usepackage{geometry} % D\216finir les marges
> % \usepackage[francais]{babel} % Placez ici une liste de langues, la
> % derni\217re \216tant la langue principale
>
> % \pagestyle{headings} % Pour mettre des ent\220tes avec les titres
> % des subsections en haut de page
>
>
> And I simply don't know how to deal with it
You need to tell Emacs to read the file with the correct decoding. In
this case, I think this will do the trick:
C-x RET c mac-roman RET C-x C-f FILE-NAME RET
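The same thing can be done from Lisp. What follows is only a sketch: the
function name my-find-file-as-mac-roman is made up for illustration, and
it relies on the standard variable coding-system-for-read, which is what
C-x RET c binds for the next command.

```elisp
;; Hypothetical helper; the name is an assumption, not part of Emacs.
(defun my-find-file-as-mac-roman (file)
  "Visit FILE, decoding its contents as mac-roman."
  (interactive "fFile to visit as mac-roman: ")
  ;; Binding `coding-system-for-read' overrides the usual detection
  ;; for any file read inside the `let'.
  (let ((coding-system-for-read 'mac-roman))
    (find-file file)))
```

Call it with M-x my-find-file-as-mac-roman, or from Lisp with a file
name as the argument.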
> Any possibility how emacs could deal with this automatically?
Maybe. It depends on your locale defaults and on whether you
customized those defaults (with the likes of prefer-coding-system).
Not every encoding can be reliably decoded, if the defaults defeat
that.
One way of avoiding the manual specification of the encoding is to use
the coding: tag inside the file, either on the first line or in the
file-local variables.
* Re: coding problem char \217 etc
2018-11-16 15:05 ` Eli Zaretskii
@ 2018-11-16 15:22 ` Eli Zaretskii
2018-11-16 15:33 ` Stefan Monnier
2018-11-16 15:36 ` Uwe Brauer
2 siblings, 0 replies; 16+ messages in thread
From: Eli Zaretskii @ 2018-11-16 15:22 UTC (permalink / raw)
To: oub; +Cc: emacs-devel
> Date: Fri, 16 Nov 2018 17:05:22 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
>
> Not every encoding can be reliably decoded
^^^^^^^
I meant "detected", not "decoded".
Sorry.
* Re: coding problem char \217 etc
2018-11-16 15:05 ` Eli Zaretskii
2018-11-16 15:22 ` Eli Zaretskii
@ 2018-11-16 15:33 ` Stefan Monnier
2018-11-16 15:58 ` Uwe Brauer
2018-11-16 15:36 ` Uwe Brauer
2 siblings, 1 reply; 16+ messages in thread
From: Stefan Monnier @ 2018-11-16 15:33 UTC (permalink / raw)
To: emacs-devel
> One way of avoiding the manual specification of the encoding is to use
> the coding: tag inside the file, either on the first line or in the
> file-local variables.
Note that in his examples he has:
\usepackage[latin1]{inputenc}
which Emacs normally recognizes as an indication that the file uses latin-1
(which is actually a lie in this case, though that lie might only affect
comments, so it's not a complete lie).
Nowadays, the better bet is to use utf-8 (which is much easier to
auto-detect and doesn't have umpteen extensions like latin-1 has).
Stefan
* Re: coding problem char \217 etc
2018-11-16 15:05 ` Eli Zaretskii
2018-11-16 15:22 ` Eli Zaretskii
2018-11-16 15:33 ` Stefan Monnier
@ 2018-11-16 15:36 ` Uwe Brauer
2018-11-16 15:53 ` Eli Zaretskii
2 siblings, 1 reply; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 15:36 UTC (permalink / raw)
To: emacs-devel
> You need to tell Emacs to read the file with the correct decoding. In
> this case, I think this will do the trick:
> C-x RET c mac-roman RET C-x C-f FILE-NAME RET
That worked, thank you!
1. How did you guess mac-roman? How can I find out myself in the future?
2. Is there any faster way? I tried
(set-buffer-file-coding-system 'mac nil) but this did not work;
also, I did not find mac-roman offered as a coding system there.
> Maybe. It depends on your locale defaults and on whether you
> customized those defaults (with the likes of prefer-coding-system).
> Not every encoding can be reliably decoded, if the defaults defeat
> that.
> One way of avoiding the manual specification of the encoding is to use
> the coding: tag inside the file, either on the first line or in the
> file-local variables.
That is what I usually do: I put a tag in the first line. But if I receive
a file whose coding I don't know and cannot work out, I am
sort of stuck.
* Re: coding problem char \217 etc
2018-11-16 14:28 ` Andreas Schwab
@ 2018-11-16 15:36 ` Uwe Brauer
2018-11-16 15:46 ` Andreas Schwab
0 siblings, 1 reply; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 15:36 UTC (permalink / raw)
To: emacs-devel
>>> "Andreas" == Andreas Schwab <schwab@linux-m68k.org> writes:
> On Nov 16 2018, Uwe Brauer <oub@mat.ucm.es> wrote:
>> And I simply don't know how to deal with it. I tried to save it as latin-1
>> or as utf-8, and I ran encode-coding-region with various coding systems.
>> Nothing works.
> Define "works". What is your goal?
The steps Eli provided: I obtained a file with correctly encoded chars.
> Note that \217 is not a valid ISO 8859-1 character.
> Andreas.
* Re: coding problem char \217 etc
2018-11-16 15:36 ` Uwe Brauer
@ 2018-11-16 15:46 ` Andreas Schwab
0 siblings, 0 replies; 16+ messages in thread
From: Andreas Schwab @ 2018-11-16 15:46 UTC (permalink / raw)
To: emacs-devel
On Nov 16 2018, Uwe Brauer <oub@mat.ucm.es> wrote:
>>>> "Andreas" == Andreas Schwab <schwab@linux-m68k.org> writes:
>
> > On Nov 16 2018, Uwe Brauer <oub@mat.ucm.es> wrote:
> >> And I simply don't know how to deal with it. I tried to save it as latin-1
> >> or as utf-8, and I ran encode-coding-region with various coding systems.
> >> Nothing works.
>
> > Define "works". What is your goal?
>
> The steps Eli provided: I obtained a file with correctly encoded chars.
Which you didn't.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
* Re: coding problem char \217 etc
2018-11-16 15:36 ` Uwe Brauer
@ 2018-11-16 15:53 ` Eli Zaretskii
2018-11-16 16:00 ` Uwe Brauer
0 siblings, 1 reply; 16+ messages in thread
From: Eli Zaretskii @ 2018-11-16 15:53 UTC (permalink / raw)
To: Uwe Brauer; +Cc: emacs-devel
> From: Uwe Brauer <oub@mat.ucm.es>
> Date: Fri, 16 Nov 2018 16:36:01 +0100
>
> > C-x RET c mac-roman RET C-x C-f FILE-NAME RET
>
> That worked thank you!
>
> 1. How did you guess mac-roman? How can I find out myself in the future?
The "fran\215ais" was supposed to be français, obviously, so I
searched the Internet for an encoding where ç is encoded as 215 octal.
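The same kind of search can be done inside Emacs. A minimal sketch,
assuming a hand-picked list of candidate coding systems (the list is an
assumption for illustration; extend it as needed): decode the single
byte \215 under each candidate and keep the ones that yield ç.

```elisp
;; Try a few candidate coding systems on the byte \215 (0x8D) and
;; collect the ones that decode it to "ç".
(let ((byte (unibyte-string #o215))
      (candidates '(iso-latin-1 windows-1252 cp437 cp850 mac-roman))
      hits)
  (dolist (cs candidates)
    (when (equal (decode-coding-string byte cs) "ç")
      (push cs hits)))
  ;; Return the matches in the order they were tried.
  (nreverse hits))
```

With this candidate list, only mac-roman should match. Once a candidate
is found, (decode-coding-string "fran\215ais" 'mac-roman) is a quick
check on a whole word.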
> 2. Is there any faster possibility?
Faster than what?
* Re: coding problem char \217 etc
2018-11-16 15:33 ` Stefan Monnier
@ 2018-11-16 15:58 ` Uwe Brauer
0 siblings, 0 replies; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 15:58 UTC (permalink / raw)
To: emacs-devel
>>> "Stefan" == Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> One way of avoiding the manual specification of the encoding is to use
>> the coding: tag inside the file, either on the first line or in the
>> file-local variables.
> Note that in his examples he has:
> \usepackage[latin1]{inputenc}
> which Emacs normally recognizes as an indication that the file uses latin-1
> (which is actually a lie in this case, though that lie might only affect
> comments, so it's not a complete lie).
> Nowadays, the better bet is to use utf-8 (which is much easier to
> auto-detect and doesn't have umpteen extensions like latin-1 has).
Yes, I know. I tried deleting this line and reopening the file, but it did
not help. That is why I asked for the coding. Eli guessed it correctly
and I still don't know how he did it...
Uwe
* Re: coding problem char \217 etc
2018-11-16 15:53 ` Eli Zaretskii
@ 2018-11-16 16:00 ` Uwe Brauer
2018-11-16 16:08 ` Eli Zaretskii
2018-11-20 19:30 ` Charles A. Roelli
0 siblings, 2 replies; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 16:00 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Uwe Brauer, emacs-devel
>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:
>> From: Uwe Brauer <oub@mat.ucm.es>
>> Date: Fri, 16 Nov 2018 16:36:01 +0100
>>
>> > C-x RET c mac-roman RET C-x C-f FILE-NAME RET
>>
>> That worked thank you!
>>
>> 1. How did you guess mac-roman? How can I find out myself in the future?
> The "fran\215ais" was supposed to be français, obviously, so I
> searched the Internet for an encoding where ç is encoded as 215 octal.
Ah, ok.
>> 2. Is there any faster possibility?
> Faster than what?
Than typing
C-x RET c mac-roman RET C-x C-f FILE-NAME RET
I mean open the file, mark the region and recode the text to the correct
coding.
* Re: coding problem char \217 etc
2018-11-16 16:00 ` Uwe Brauer
@ 2018-11-16 16:08 ` Eli Zaretskii
2018-11-16 16:31 ` Uwe Brauer
2018-11-16 16:35 ` Uwe Brauer
2018-11-20 19:30 ` Charles A. Roelli
1 sibling, 2 replies; 16+ messages in thread
From: Eli Zaretskii @ 2018-11-16 16:08 UTC (permalink / raw)
To: Uwe Brauer; +Cc: emacs-devel
> From: Uwe Brauer <oub@mat.ucm.es>
> Cc: Uwe Brauer <oub@mat.ucm.es>, emacs-devel@gnu.org
> Date: Fri, 16 Nov 2018 17:00:40 +0100
>
> >> 2. Is there any faster possibility?
>
> > Faster than what?
>
> Than typing
> C-x RET c mac-roman RET C-x C-f FILE-NAME RET
>
> I mean open the file, mark the region and recode the text to the correct
> coding.
If you already have the file visited in Emacs, then
C-x RET r mac-roman RET
will revert the buffer with that encoding. Is this what you wanted?
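C-x RET r runs the command revert-buffer-with-coding-system, so it can
be wrapped in a command of your own and bound to a key. A sketch, where
the command name and the key binding are assumptions chosen only for
illustration:

```elisp
;; Hypothetical convenience command; the name is an assumption.
(defun my-revert-as-mac-roman ()
  "Re-read the visited file from disk, decoding it as mac-roman."
  (interactive)
  (revert-buffer-with-coding-system 'mac-roman))

;; Hypothetical binding; pick any key that is free in your setup.
(global-set-key (kbd "C-c m") #'my-revert-as-mac-roman)
```

As with C-x RET r itself, this re-reads the file from disk, so unsaved
changes in the buffer are at risk.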
* Re: coding problem char \217 etc
2018-11-16 16:08 ` Eli Zaretskii
@ 2018-11-16 16:31 ` Uwe Brauer
2018-11-16 16:35 ` Uwe Brauer
1 sibling, 0 replies; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 16:31 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Uwe Brauer, emacs-devel
> If you already have the file visited in Emacs, then
> C-x RET r mac-roman RET
> will revert the buffer with that encoding. Is this what you wanted?
Exactly! Thanks
* Re: coding problem char \217 etc
2018-11-16 16:08 ` Eli Zaretskii
2018-11-16 16:31 ` Uwe Brauer
@ 2018-11-16 16:35 ` Uwe Brauer
2018-11-16 17:23 ` Eli Zaretskii
1 sibling, 1 reply; 16+ messages in thread
From: Uwe Brauer @ 2018-11-16 16:35 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Uwe Brauer, emacs-devel
>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:
>> From: Uwe Brauer <oub@mat.ucm.es>
>> Cc: Uwe Brauer <oub@mat.ucm.es>, emacs-devel@gnu.org
>> Date: Fri, 16 Nov 2018 17:00:40 +0100
>>
>> >> 2. Is there any faster possibility?
>>
>> > Faster than what?
>>
>> Than typing
>> C-x RET c mac-roman RET C-x C-f FILE-NAME RET
>>
>> I mean open the file, mark the region and recode the text to the correct
>> coding.
> If you already have the file visited in Emacs, then
> C-x RET r mac-roman RET
> will revert the buffer with that encoding. Is this what you wanted?
But I am still wondering why
(decode-coding-region (region-beginning) (region-end) 'mac-roman)
does not work.
* Re: coding problem char \217 etc
2018-11-16 16:35 ` Uwe Brauer
@ 2018-11-16 17:23 ` Eli Zaretskii
0 siblings, 0 replies; 16+ messages in thread
From: Eli Zaretskii @ 2018-11-16 17:23 UTC (permalink / raw)
To: Uwe Brauer; +Cc: emacs-devel
> From: Uwe Brauer <oub@mat.ucm.es>
> Cc: Uwe Brauer <oub@mat.ucm.es>, emacs-devel@gnu.org
> Date: Fri, 16 Nov 2018 17:35:03 +0100
>
> But still wondering why
>
> (decode-coding-region (region-beginning) (region-end) 'mac-roman)
>
> Does not work.
In "emacs -Q" or in your customized Emacs?
Besides, you should invoke this after visiting the file literally,
with find-file-literally. After "C-x C-f", it's too late, because the
file is already decoded.
* Re: coding problem char \217 etc
2018-11-16 16:00 ` Uwe Brauer
2018-11-16 16:08 ` Eli Zaretskii
@ 2018-11-20 19:30 ` Charles A. Roelli
1 sibling, 0 replies; 16+ messages in thread
From: Charles A. Roelli @ 2018-11-20 19:30 UTC (permalink / raw)
To: Uwe Brauer; +Cc: oub, eliz, emacs-devel
> From: Uwe Brauer <oub@mat.ucm.es>
> Date: Fri, 16 Nov 2018 17:00:40 +0100
> >> 2. Is there any faster possibility?
>
> > Faster than what?
>
> Than typing
> C-x RET c mac-roman RET C-x C-f FILE-NAME RET
>
> I mean open the file, mark the region and recode the text to the correct
> coding.
Try the command 'recode-region' if you haven't already done so.
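From Lisp, a call might look like the sketch below. It assumes the text
was wrongly decoded as iso-latin-1 (a guess based on the latin1
declaration in the file; C-h C shows what Emacs actually used) and that
it is really mac-roman. The temporary buffer only simulates the
mis-decoded text; in real use you would call recode-region on the
region of the visited buffer.

```elisp
(with-temp-buffer
  ;; Simulate the mis-decoded text: byte \215 read as Latin-1 gives U+008D.
  (insert (decode-coding-string "fran\215ais" 'iso-latin-1))
  ;; Arguments: START END NEW-CODING CODING, i.e. first the coding the
  ;; text is really in, then the one it was wrongly decoded with.
  (recode-region (point-min) (point-max) 'mac-roman 'iso-latin-1)
  (buffer-string))
```

This should return "français". Interactively, M-x recode-region prompts
for the same two coding systems in the same order.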