unofficial mirror of emacs-devel@gnu.org 
* Email text that confuses charset recognition in emacs
@ 2013-04-16 16:27 Giorgos Keramidas
  2013-04-17  4:37 ` Paul Eggert
  0 siblings, 1 reply; 4+ messages in thread
From: Giorgos Keramidas @ 2013-04-16 16:27 UTC (permalink / raw)
  To: emacs-devel

Hi everyone,

I just noticed that the attached email message confuses the charset
detection machinery of Emacs, and it starts interpreting all text as
Japanese text -- even though most of the contents of the file are plain
us-ascii text.

I first noticed the problem when I received the attached message from
the `freebsd-current' mailing list, and the text looked Japanese in a
Gnus article buffer.  Saving the email text (using `C-u M-g' to get at
the raw article) and re-opening the saved file in Emacs still shows
Japanese text every time.

If I open the article text with `M-x find-file-literally', I can read
all of the English text, minus a few parts of the attribution line of
the email text.

The problem seems to start near this text in the email body:

| >>> In message <18DF99B0-6E66-4906-A233-7778451B8A92@felyko.com>, Rui Paulo
| >>> writes:
| >>>> 2013/04/15 9:55^[$B!"^[(BCy Schubert <Cy.Schubert@komquats.com> ^[$B$N%a%C%;!<%8^[
| >> (B:

I think it's because one of the intermediate mail relays (or even Gmail,
which got the final delivery for me) wrapped a line of text in the
middle of the byte sequence "%8^[(B", one line before the last one.

In fact, joining the two last lines, and removing the bogusly inserted
text of "\n>> " restores the sequence to something that decodes properly
in Emacs:

    >>>> 2013/04/15 9:55^[$A!"^[(BCy Schubert <Cy.Schubert@komquats.com> ^[$A$N%a%C%;^[$B!<^[$A%8^[(B:
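
The manual repair described above can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions: the
helper name rejoin_split_escapes is hypothetical, and the pattern only
covers the one case seen in this message, where the line break and a
">> " style quote prefix landed between ESC and the final "(B" (or
"(J") bytes.

```python
import re

ESC = b"\x1b"

# Assumption: the relay broke the line right after ESC, and the next line
# starts with a quote prefix such as ">> " before the final bytes "(B".
BROKEN_ESCAPE = re.compile(rb"\x1b\n>+ ?(?=\([BJ])")


def rejoin_split_escapes(raw: bytes) -> bytes:
    """Remove a line break plus quote prefix that landed inside ESC ( B."""
    return BROKEN_ESCAPE.sub(ESC, raw)


# The tail of the attribution line from the message, with ESC as \x1b.
broken = b"\x1b$B$N%a%C%;!<%8\x1b\n>> (B:\n"
fixed = rejoin_split_escapes(broken)

# With the escape sequence restored, a stock ISO-2022-JP decoder can
# read the text again.
print(fixed.decode("iso2022_jp"))
```

A split at any other position inside an escape sequence would need a
different pattern, so this is not a general repair.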

Since this is not a problem in what Emacs does with charset decoding of
the entire buffer, is it something we should try to fix in Gnus?  Is
there any way we can detect this sort of decoding problem at all?
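
One possible way to detect this kind of damage, sketched in Python
rather than Emacs Lisp: scan the raw bytes for an ESC that is not
followed by one of the four designations that RFC 1468 defines for
ISO-2022-JP.  The function name find_broken_escapes is hypothetical
and is not part of Gnus or Emacs; treating every other ESC as
"broken" is an assumption that only makes sense for text expected to
be ISO-2022-JP.

```python
# Escape sequences used by ISO-2022-JP (RFC 1468):
# ESC ( B  ASCII, ESC ( J  JIS X 0201-Roman,
# ESC $ @  JIS X 0208-1978, ESC $ B  JIS X 0208-1983.
KNOWN_DESIGNATIONS = (b"\x1b(B", b"\x1b(J", b"\x1b$@", b"\x1b$B")


def find_broken_escapes(raw: bytes) -> list[int]:
    """Return offsets of ESC bytes that do not start a known designation."""
    bad = []
    pos = raw.find(b"\x1b")
    while pos != -1:
        if not raw.startswith(KNOWN_DESIGNATIONS, pos):
            bad.append(pos)
        pos = raw.find(b"\x1b", pos + 1)
    return bad


damaged = b"a\x1b$B$N\x1b\n>> (B"
intact = b"a\x1b$B$N\x1b(Bz"
print(find_broken_escapes(damaged))
print(find_broken_escapes(intact))
```

A non-empty result would be a hint that some relay rewrapped the
text in the middle of an escape sequence.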

I am running an Emacs snapshot built earlier this morning from this
version, by the way:

| 0416 18:10 saturn:~/git/emacs$ git log -1
| commit 3e440d19d12bc740010d9e98958d529260eea321
| Author: Michael Albinus <michael.albinus@gmx.de>
| Date:   Tue Apr 16 10:11:56 2013 +0200
|
|     * tramp.texi (Frequently Asked Questions): Precise, how to define
|     an own ControlPath.

Here's also the uuencoded "broken" text of the email/article buffer:

begin 644 gnus-article-confusing-charset.txt
M1&5L:79E<F5D+51O.B!G:V5R86UI9&%S0&=M86EL+F-O;0I296-E:79E9#H@
M8GD@,3`N-C@N,S8N.3<@=VET:"!33510(&ED('`Q8W-P,30W-3%P8FH["B`@
M("`@("`@36]N+"`Q-2!!<'(@,C`Q,R`Q,SHS-SHT-R`M,#<P,"`H4$14*0I8
M+5)E8V5I=F5D.B!B>2`Q,"XQ-"XQ.#(N-S(@=VET:"!33510(&ED(&XT.&UR
M.#$Y.38W-V5E;2XS+C$S-C8P-3@R-C8U,C@["B`@("`@("`@36]N+"`Q-2!!
M<'(@,C`Q,R`Q,SHS-SHT-B`M,#<P,"`H4$14*0I2971U<FXM4&%T:#H@/&]W
M;F5R+69R965B<V0M8W5R<F5N=$!F<F5E8G-D+F]R9SX*4F5C96EV960Z(&9R
M;VT@<&]S96ED;VXN8V5I9"YU<&%T<F%S+F=R("AP;W-E:61O;BYC96ED+G5P
M871R87,N9W(N(%LQ-3`N,30P+C$T,2XQ-CE=*0H@("`@("`@(&)Y(&UX+F=O
M;V=L92YC;VT@=VET:"!%4TU44"!I9"!M-#%S:3,Q,C(U,C,X965N+C$U-RXR
M,#$S+C`T+C$U+C$S+C,W+C0U.PH@("`@("`@($UO;BP@,34@07!R(#(P,3,@
M,3,Z,S<Z-#8@+3`W,#`@*%!$5"D*4F5C96EV960M4U!&.B!S;V9T9F%I;"`H
M9V]O9VQE+F-O;3H@9&]M86EN(&]F('1R86YS:71I;VYI;F<@;W=N97(M9G)E
M96)S9"UC=7)R96YT0&9R965B<V0N;W)G(&1O97,@;F]T(&1E<VEG;F%T92`Q
M-3`N,30P+C$T,2XQ-CD@87,@<&5R;6ET=&5D('-E;F1E<BD@8VQI96YT+6EP
M/3$U,"XQ-#`N,30Q+C$V.3L*075T:&5N=&EC871I;VXM4F5S=6QT<SH@;7@N
M9V]O9VQE+F-O;3L*("`@("`@('-P9CUS;V9T9F%I;"`H9V]O9VQE+F-O;3H@
M9&]M86EN(&]F('1R86YS:71I;VYI;F<@;W=N97(M9G)E96)S9"UC=7)R96YT
M0&9R965B<V0N;W)G(&1O97,@;F]T(&1E<VEG;F%T92`Q-3`N,30P+C$T,2XQ
M-CD@87,@<&5R;6ET=&5D('-E;F1E<BD@<VUT<"YM86EL/6]W;F5R+69R965B
M<V0M8W5R<F5N=$!F<F5E8G-D+F]R9SL*("`@("`@(&1K:6T];F5U=')A;"`H
M8F%D(&9O<FUA="D@:&5A9&5R+FD]0'EA:&]O+F-O;0I296-E:79E9#H@9G)O
M;2!M86EL+F-E:60N=7!A=')A<RYG<B`H975R;W!A+F-E:60N=7!A=')A<RYG
M<B!;,3`N,2XP+C$T,UTI"@EB>2!P;W-E:61O;BYC96ED+G5P871R87,N9W(@
M*%!O<W1F:7@I('=I=&@@15--5%`@:60@,D)%1$(X,C4V00H)9F]R(#QG:V5R
M86UI9&%S0&=M86EL+F-O;3X[($UO;BP@,34@07!R(#(P,3,@,C,Z,S`Z,34@
M*S`S,#`@*$5%4U0I"E)E8V5I=F5D.B!B>2!M86EL+F-E:60N=7!A=')A<RYG
M<B`H4&]S=&9I>"D*"6ED(#DT.30T.3%#049#-SL@36]N+"`Q-2!!<'(@,C`Q
M,R`R,SHS-SHT-2`K,#,P,"`H14535"D*1&5L:79E<F5D+51O.B!K97)A;6ED
M84!C96ED+G5P871R87,N9W(*4F5C96EV960Z(&9R;VT@;&]C86QH;W-T("AL
M;V-A;&AO<W0@6S$R-RXP+C`N,5TI"@EB>2!M86EL+F-E:60N=7!A=')A<RYG
M<B`H4&]S=&9I>"D@=VET:"!%4TU44"!I9"`Y,#)&-3DQ0T%&0S8*"69O<B`\
M:V5R86UI9&%`8V5I9"YU<&%T<F%S+F=R/CL@36]N+"`Q-2!!<'(@,C`Q,R`R
M,SHS-SHT-2`K,#,P,"`H14535"D*6"U6:7)U<RU38V%N;F5D.B!A;6%V:7-D
M+6YE=R!A="!C96ED+G5P871R87,N9W(*4F5C96EV960Z(&9R;VT@;6%I;"YC
M96ED+G5P871R87,N9W(@*%LQ,C<N,"XP+C%=*0H)8GD@;&]C86QH;W-T("AE
M=7)O<&$N8V5I9"YU<&%T<F%S+F=R(%LQ,C<N,"XP+C%=*2`H86UA=FES9"UN
M97<L('!O<G0@,3`P,C0I"@EW:71H($533510(&ED(%954V-W<DA92U1V;"!F
M;W(@/&ME<F%M:61A0&-E:60N=7!A=')A<RYG<CX["@E-;VXL(#$U($%P<B`R
M,#$S(#(S.C,W.C0U("LP,S`P("A%15-4*0I296-E:79E9#H@9G)O;2!P;W-E
M:61O;BYC96ED+G5P871R87,N9W(@*'!O<V5I9&]N+F-E:60N=7!A=')A<RYG
M<B!;,3`N,2XP+C$V.5TI"@EB>2!M86EL+F-E:60N=7!A=')A<RYG<B`H4&]S
M=&9I>"D@=VET:"!%4TU44"!I9"`W,#1",CDQ0T%&0S4*"69O<B`\:V5R86UI
M9&%`8V5I9"YU<&%T<F%S+F=R/CL@36]N+"`Q-2!!<'(@,C`Q,R`R,SHS-SHT
M-2`K,#,P,"`H14535"D*4F5C96EV960Z(&)Y('!O<V5I9&]N+F-E:60N=7!A
M=')A<RYG<B`H4&]S=&9I>"P@9G)O;2!U<V5R:60@-#DW*0H):60@,#`V-#!!
M,3,S1#L@36]N+"`Q-2!!<'(@,C`Q,R`R,SHS,#HQ-"`K,#,P,"`H14535"D*
M6"U3<&%M+4-H96-K97(M5F5R<VEO;CH@4W!A;4%S<V%S<VEN(#,N,RXQ("@R
M,#$P+3`S+3$V*2!O;@H)<&]S96ED;VXN8V5I9"YU<&%T<F%S+F=R"E@M4W!A
M;2U,979E;#H@*@I8+5-P86TM4W1A='5S.B!.;RP@<V-O<F4],2XQ(')E<75I
M<F5D/30N-2!T97-T<SU$2TE-7T%$4U!?0U535$]-7TU%1#TP+C`P,2P*"41+
M24U?4TE'3D5$/3`N,2Q&3U)'141?64%(3T]?4D-61#TQ+C`R,BQ&4D5%34%)
M3%]&4D]-/3`N,#`Q+`H)5%]$2TE-7TE.5D%,240],"XP,2!A=71O;&5A<FX]
M9&ES86)L960@=F5R<VEO;CTS+C,N,0I296-E:79E9#H@9G)O;2!M>#(N9G)E
M96)S9"YO<F<@*&UX,BY&<F5E0E-$+F]R9R!;."XX+C$W."XQ,39=*0H)8GD@
M<&]S96ED;VXN8V5I9"YU<&%T<F%S+F=R("A0;W-T9FEX*2!W:71H($533510
M(&ED(#!#-#5%.49%1C8*"69O<B`\:V5R86UI9&%`8V5I9"YU<&%T<F%S+F=R
M/CL@36]N+"`Q-2!!<'(@,C`Q,R`R,SHS,#HQ,R`K,#,P,"`H14535"D*4F5C
M96EV960Z(&9R;VT@:'5B+F9R965B<V0N;W)G("AH=6(N9G)E96)S9"YO<F<@
M6TE0=C8Z,C`P,3HQ.3`P.C(R-30Z,C`V8SHZ,38Z.#A=*0H)8GD@;7@R+F9R
M965B<V0N;W)G("A0;W-T9FEX*2!W:71H($533510(&ED($4Y1C)"-44X,CL*
M"4UO;BP@,34@07!R(#(P,3,@,C`Z,S<Z-#`@*S`P,#`@*%540RD*4F5C96EV
M960Z(&9R;VT@:'5B+F9R965B<V0N;W)G("AH=6(N9G)E96)S9"YO<F<@6TE0
M=C8Z,C`P,3HQ.3`P.C(R-30Z,C`V8SHZ,38Z.#A=*0H)8GD@:'5B+F9R965B
M<V0N;W)G("A0;W-T9FEX*2!W:71H($533510(&ED(#9$-T8T.#(P.PH)36]N
M+"`Q-2!!<'(@,C`Q,R`R,#HS-SHS.2`K,#`P,"`H551#*0H)*&5N=F5L;W!E
M+69R;VT@;W=N97(M9G)E96)S9"UC=7)R96YT0&9R965B<V0N;W)G*0I$96QI
M=F5R960M5&\Z(&-U<G)E;G1`9G)E96)S9"YO<F<*4F5C96EV960Z(&9R;VT@
M;7@Q+F9R965B<V0N;W)G("AM>#$N1G)E94)31"YO<F<@6S@N."XQ-S@N,3$U
M72D*(&)Y(&AU8BYF<F5E8G-D+F]R9R`H4&]S=&9I>"D@=VET:"!%4TU44"!I
M9"!&,D4P1D5#1@H@9F]R(#QC=7)R96YT0&9R965B<V0N;W)G/CL@36]N+"`Q
M-2!!<'(@,C`Q,R`Q.3HS.3HR,2`K,#`P,"`H551#*0H@*&5N=F5L;W!E+69R
M;VT@<V-O='0T;&]N9T!Y86AO;RYC;VTI"E)E8V5I=F5D.B!F<F]M(&YM,34M
M=FTQ+F)U;&QE="YM86EL+F=Q,2YY86AO;RYC;VT*("AN;3$U+79M,2YB=6QL
M970N;6%I;"YG<3$N>6%H;V\N8V]M(%LY."XQ,S<N,3<V+C<S72D*(&)Y(&UX
M,2YF<F5E8G-D+F]R9R`H4&]S=&9I>"D@=VET:"!33510(&ED($,S,#DX,38U
M,@H@9F]R(#QC=7)R96YT0&9R965B<V0N;W)G/CL@36]N+"`Q-2!!<'(@,C`Q
M,R`Q.3HS.3HR,2`K,#`P,"`H551#*0I296-E:79E9#H@9G)O;2!;.3@N,3,W
M+C$R+C4Y72!B>2!N;3$U+F)U;&QE="YM86EL+F=Q,2YY86AO;RYC;VT@=VET
M:"!.3D9-4#L*(#$U($%P<B`R,#$S(#$Y.C,R.C0Y("TP,#`P"E)E8V5I=F5D
M.B!F<F]M(%LR,#@N-S$N-#(N,3DR72!B>2!T;30N8G5L;&5T+FUA:6PN9W$Q
M+GEA:&]O+F-O;2!W:71H($Y.1DU0.PH@,34@07!R(#(P,3,@,3DZ,S(Z-#D@
M+3`P,#`*4F5C96EV960Z(&9R;VT@6S$R-RXP+C`N,5T@8GD@<VUT<#(P,RYM
M86EL+F=Q,2YY86AO;RYC;VT@=VET:"!.3D9-4#L*(#$U($%P<B`R,#$S(#$Y
M.C,R.C0Y("TP,#`P"D1+24TM4VEG;F%T=7)E.B!V/3$[(&$]<G-A+7-H83(U
M-CL@8SUR96QA>&5D+W)E;&%X960[(&0]>6%H;V\N8V]M.R!S/7,Q,#(T.PH@
M=#TQ,S8V,#4T,S8Y.R!B:#U+:S=M,VY,5D1-,$UK3FE5:&DO9&98351:=C)-
M:3!%04E"-$UZ<G)F>D1W/3L*(&@]6"U986AO;RU.97=M86XM260Z6"U986AO
M;RU.97=M86XM4')O<&5R='DZ6"U936%I;"U/4T<Z6"U986AO;RU33510.E@M
M4F]C:V5T+5)E8V5I=F5D.D-O;G1E;G0M5'EP93I-:6UE+59E<G-I;VXZ4W5B
M:F5C=#I&<F]M.DEN+5)E<&QY+51O.D1A=&4Z0V,Z0V]N=&5N="U4<F%N<V9E
M<BU%;F-O9&EN9SI-97-S86=E+4ED.E)E9F5R96YC97,Z5&\Z6"U-86EL97([
M"B!B/6@T87<S<CE9.&)9=$8Y,#AF;U!05E!J16EC9DQP2#5G=F1!,%`Y;DYV
M;UDU;V=L2W!76FPP94Y%2D5P9D]:<6UU5V]J3VE3=3!&6D=046TT-#-%,GEP
M;%,S0FEE:'!D0W)M4&XV5T=S.%!/561G4UAM.6A58G!U;#0U85$O060S9&%/
M8FE$37=V1U!0,TLX2EAN3F932%AU=$=H5VMQ<7=-=FEE8F=A<F)B03T*6"U9
M86AO;RU.97=M86XM260Z(#0P-S<S-RXU-#<S,BYB;4!S;71P,C`S+FUA:6PN
M9W$Q+GEA:&]O+F-O;0I8+5EA:&]O+4YE=VUA;BU0<F]P97)T>3H@>6UA:6PM
M,PI8+5E-86EL+4]31SH@34UG+CE.,%9-,6YT2$-2>&QY:S=&-W9D0C1E5'5.
M96-356Y03'II:EI#2&U4:V$*(#!J67=O,DYD3&E.1'I926-%=7HN,69D<$=3
M0T-">'AS:%%X<&AM0V93:6Y8;'=N66A72#!0,'%&6FE8<`H@-W)"<W!':%!!
M8VQ'14YH2W<P-7E$3W-.1TLY5U!#>G1V,55784=O7UEV1U):;G5F;E8N;FTS
M;5HR8DE("B!O-65(:T%V:TY&6&AR6EA#1$E#24(R<T=H-3=?6E1?;3)"9%-M
M.#)Y>3E"83A/<%=I0W$N04UL6#-&<%\*(&E#4G-83V1,27DP55I$46=J94\P
M8W4Y4&=Y>4-8=5!15%0Q54QV2F\R>7DU9T5B9%]Y3V,V>#=T5VTQ5`H@=G,P
M=%EL>&IT4#!&44E?,V-036%T131(3WHQ:W%O>GIX:%]Y4'-!0VE-95%->C1D
M=TU#86IO1'8W=SE&"B!*-$]&841S2U!!>E]C65!Q,'%N44U#1E%":&,X<UAU
M=U$X3DE8,V,Y;%-(;DAF-TUX8D\S=5%M5TIU3E@*($$U3&\T=F%#,S%B84]?
M4W!F=6QT0G%Q6$Y,1V8V-U1-2$UZ=G1);#4S6CA"9$5F4'$Q85=G<65G2&Y4
M1`H@8T-T5SEK;%EQ9C!L44UP37<V5$]C,W-W86Y2;4%-+G4R<$PX44]'=F5M
M0S-2-W1Z=GA162T*6"U986AO;RU33510.B!C;&A!0G`N<W="0C=F<RY,=TE*
M<'8S:FM79V\R3E4X+0I8+5)O8VME="U296-E:79E9#H@9G)O;2!;,3DR+C$V
M."XR-30N,C`V72`H<V-O='0T;&]N9T`Q-C@N,3`S+C@U+C4W('=I=&@@<&QA
M:6XI"B!B>2!S;71P,C`S+FUA:6PN9W$Q+GEA:&]O+F-O;2!W:71H(%--5%`[
M(#$U($%P<B`R,#$S(#$R.C,R.C0Y("TP-S`P(%!$5`I-:6UE+59E<G-I;VXZ
M(#$N,"`H36%C($]3(%@@36%I;"`V+C,@7"@Q-3`S7"DI"E-U8FIE8W0Z(%)E
M.B!I<&9I;'1E<B@T*2!N965D<R!M86EN=&%I;F5R"D9R;VTZ(%-C;W1T($QO
M;F<@/'-C;W1T-&QO;F=`>6%H;V\N8V]M/@I);BU297!L>2U4;SH@/#(P,3,P
M-#$U,3DR-RYR,T9*4G1Q.3`P,C<Y.4!S;&EP<'DN8W=S96YT+F-O;3X*1&%T
M93H@36]N+"`Q-2!!<'(@,C`Q,R`Q,SHS,CHT."`M,#8P,`I-97-S86=E+4ED
M.B`\,SA&,S$U1D8M,#=$,RTT-3%#+3DX,3$M1#1#1C@Y13E%0CA%0'EA:&]O
M+F-O;3X*4F5F97)E;F-E<SH@/#(P,3,P-#$U,3DR-RYR,T9*4G1Q.3`P,C<Y
M.4!S;&EP<'DN8W=S96YT+F-O;3X*5&\Z($-Y(%-C:'5B97)T(#Q#>2Y38VAU
M8F5R=$!K;VUQ=6%T<RYC;VT^"E@M36%I;&5R.B!!<'!L92!-86EL("@R+C$U
M,#,I"E@M36%I;&UA;BU!<'!R;W9E9"U!=#H@36]N+"`Q-2!!<'(@,C`Q,R`R
M,#HS-SHS-R`K,#`P,`I#8SH@5V%R<F5N($)L;V-K(#QW8FQO8VM`=V]N:VET
M>2YC;VT^+`H@(F-U<G)E;G1`9G)E96)S9"YO<F<B(#QC=7)R96YT0&9R965B
M<V0N;W)G/BP@0VAR:7,@4F5E<R`\8W)E97-`9G)E96)S9"YO<F<^+`H@4G5I
M(%!A=6QO(#QR<&%U;&]`9F5L>6MO+F-O;3XL(")N971`9G)E96)S9"YO<F<B
M(#QN971`9G)E96)S9"YO<F<^+`H@9&%R<F5N<D!F<F5E8G-D+F]R9RP@(F-P
M971`<V1F+F]R9R(@/&-P971`<V1F+F]R9SX*6"U"965N5&AE<F4Z(&9R965B
M<V0M8W5R<F5N=$!F<F5E8G-D+F]R9PI8+4UA:6QM86XM5F5R<VEO;CH@,BXQ
M+C$T"E!R96-E9&5N8V4Z(&QI<W0*3&ES="U)9#H@1&ES8W5S<VEO;G,@86)O
M=70@=&AE('5S92!O9B!&<F5E0E-$+6-U<G)E;G0*(#QF<F5E8G-D+6-U<G)E
M;G0N9G)E96)S9"YO<F<^"DQI<W0M56YS=6)S8W)I8F4Z(#QH='1P.B\O;&ES
M=',N9G)E96)S9"YO<F<O;6%I;&UA;B]O<'1I;VYS+V9R965B<V0M8W5R<F5N
M=#XL(`H@/&UA:6QT;SIF<F5E8G-D+6-U<G)E;G0M<F5Q=65S=$!F<F5E8G-D
M+F]R9S]S=6)J96-T/75N<W5B<V-R:6)E/@I,:7-T+4%R8VAI=F4Z(#QH='1P
M.B\O;&ES=',N9G)E96)S9"YO<F<O<&EP97)M86EL+V9R965B<V0M8W5R<F5N
M=#X*3&ES="U0;W-T.B`\;6%I;'1O.F9R965B<V0M8W5R<F5N=$!F<F5E8G-D
M+F]R9SX*3&ES="U(96QP.B`\;6%I;'1O.F9R965B<V0M8W5R<F5N="UR97%U
M97-T0&9R965B<V0N;W)G/W-U8FIE8W0]:&5L<#X*3&ES="U3=6)S8W)I8F4Z
M(#QH='1P.B\O;&ES=',N9G)E96)S9"YO<F<O;6%I;&UA;B]L:7-T:6YF;R]F
M<F5E8G-D+6-U<G)E;G0^+`H@/&UA:6QT;SIF<F5E8G-D+6-U<G)E;G0M<F5Q
M=65S=$!F<F5E8G-D+F]R9S]S=6)J96-T/7-U8G-C<FEB93X*0V]N=&5N="U4
M>7!E.B!T97AT+W!L86EN.R!C:&%R<V5T/2)U<RUA<V-I:2(*0V]N=&5N="U4
M<F%N<V9E<BU%;F-O9&EN9SH@-V)I=`I%<G)O<G,M5&\Z(&]W;F5R+69R965B
M<V0M8W5R<F5N=$!F<F5E8G-D+F]R9PI396YD97(Z(&]W;F5R+69R965B<V0M
M8W5R<F5N=$!F<F5E8G-D+F]R9PH*"D]N($%P<B`Q-2P@,C`Q,RP@870@,3HR
M-R!032P@0WD@4V-H=6)E<G0@/$-Y+E-C:'5B97)T0&MO;7%U871S+F-O;3X@
M=W)O=&4Z"@H^($EN(&UE<W-A9V4@/$$R-#4P,S8Q+40Y13DM-#DX1BU!1#0T
M+3@T-C4V,T5&,#1#0D!Y86AO;RYC;VT^+"!38V]T="!,;VYG(`H^('=R:71E
M<SH*/CX@"CX^($]N($%P<B`Q-2P@,C`Q,RP@870@,3$Z-#@@04TL($-Y(%-C
M:'5B97)T(#Q#>2Y38VAU8F5R=$!K;VUQ=6%T<RYC;VT^('=R;W1E.@H^/B`*
M/CX^($EN(&UE<W-A9V4@/#$X1$8Y.4(P+39%-C8M-#DP-BU!,C,S+3<W-S@T
M-3%".$$Y,D!F96QY:V\N8V]M/BP@4G5I(%!A=6QO(`H^/CX@=W)I=&5S.@H^
M/CX^(#(P,3,O,#0O,34@.3HU-1LD0B$B&RA"0WD@4V-H=6)E<G0@/$-Y+E-C
M:'5B97)T0&MO;7%U871S+F-O;3X@&R1")$XE825#)3LA/"4X&PH^/B`H0CH*
M/CX^/B`*/CX^/CX@22=V92!B965N('!L86YN:6YG(&]N('1A:VEN9R!O;B!)
M4"!&:6QT97(@9F]R('%U:71E('-O;64@=&EM92X@"CX^/CX^(%5N9F]R='5N
M871E;'D@22=V92!L969T(&UY('-R8R!C;VUM:70@8FET(&QA<'-E("AM>2!P
M;W)T<R!C;VUM:70@8FET(&ES(`H^/CX^/B!A;&EV92!A;F0@=V5L;"!T:&]U
M9V@I('1H=7,@22=M(&QO;VMI;F<@9F]R(&$@;65N=&]R+B!);B!A9&1I=&EO
M;B!))VT@"CX^/CX^('=O<FMI;F<@;VX@86X@04-%4B!734DO04-022!K;&0N
M($]N92!M96YT;W(@=V]U;&0@8F4@<')E9F5R<F5D(&)U="!T=V\@"CX^/CX^
M('=O=6QD(&)E(&9I;F4@=&]O+@H^/CX^(`H^/CX^(%=H870@87)E('EO=7(@
M<&QA;G,@<F5G87)D:6YG(&EP9FEL=&5R/R!)(')E;6%I;B!U;F-O;G9I;F-E
M9"!T:&%T(&ET('-H;W5L"CX^(&0@8@H^/CX^(&4@:6X@=&AE(&)A<V4@<WES
M=&5M+B!097)H87!S('EO=2!C86X@=V]R:R!O;B!I="!A<R!A('!O<G0_"CX^
M/B`*/CX^(%1H92!I;FET:6%L('!L86X@=V%S('1O(&EM<&]R="!)4"!&:6QT
M97(@-2XQ+C(@:6YT;R!(14%$+B!D87)R96YR0"!H861N)W0@"CX^/B!D;VYE
M(&UU8V@@=VET:"!)4$8@=VAI;&4@96UP;&]Y960@=VET:"!3=6XN(%-I;F-E
M('1H96X@=&AE<F4@:&%S(&)E96X@<V]M92`*/CX^(&1E=F5L;W!M96YT('1H
M870@:7,@;&]N9R!O=F5R9'5E(&9O<B!(14%$+@H^/CX@"CX^/B!))VT@;F]T
M('-U<F4@:68@22=D($U&0R!I="!I;G1O(#D@;W(@;F]T+@H^/CX@"CX^/B!)
M(&1I9"!C;VYS:61E<B!A('!O<G0@8G5T(&=I=F5N(&ET('=O=6QD(&AA<R!T
M;R!T;W5C:"!B:71S(&%N9"!P:65C97,@;V8@"CX^/B!T:&4@<V]U<F-E('1R
M964@*"]U<W(O<W)C*2P@82!P;W)T('=O=6QD(&)E(&UE<W-Y(&%N9"!T:&4@
M9&5C:7-I;VX@=V%S(&UA9&4*/CX@"CX^/B!T;R!W;W)K(&]N(&EM<&]R=&EN
M9R!I="!I;G1O(&)A<V4N"CX^/B`*/CX^/B`*/CX^/B!7:'D@9&\@>6]U('=A
M;G0@=&\@=V]R:R!O;B!S;VUE=&AI;F<@=&AA="!P96]P;&4@:&%V92!B965N
M('1R>6EN9R!T;R!R96UO=@H^/B!E(',*/CX^/B!I;F-E(#(P,#4_"CX^/B`*
M/CX^($D@86YD(&]T:&5R<R!H879E(&)E96X@=7-I;F<@:70@:6X@1G)E94)3
M1"!F;W(@;W9E<B!D96-A9&4N($9O<B!T:&4@;&]N9V5S=`H^/B`*/CX^(&]F
M('1I;64@=V4G9"!U<V4@82!C;VUM;VX@<V5T(&]F(')U;&5S(&%C<F]S<R!A
M($9R965"4T0@86YD(%-O;&%R:7,@9F%R;2`*/CX^("AU<VEN9R!I<&9M971A
M+"!M86ME9FEL97,L(')S>6YC+"!R9&ES="P@86YD(&$@;&]C86P@0U93(')E
M<&\I+B`*/CX^($EN=&5R;W!E<F%B:6QI='D@=VET:"!O=&AE<B!S>7-T96US
M('=H:6-H('5S92!)4"!&:6QT97(@:7,@82!P;'5S+B!)9B`*/CX^('1H97)E
M)W,@82!M86EN=&%I;F5R+"!I="!O;FQY(&UA:V5S($9R965"4T0@<FEC:&5R
M+B!,;W-I;F<@25`@1FEL=&5R('=O=6QD(`H^/CX@8F4@82!L;W-S+@H^/CX@
M"CX^(`H^/B`*/CX@268@>6]U)W)E(&-O;6UI='1E9"!T;R!M86EN=&%I;FEN
M9R!)4$9I;'1E<BP@=&AA="=S(&=R96%T+B`@2&]W979E<BP@:70@8V%N)W0*
M/CX@8F4*/CX@;&5F="!T;R!S=&%G9V5R(&%L;VYG(&EN(&$@('IO;6)I92!S
M=&%T92!W:71H(&YO=&AI;F<@;6]R92!T:&%N(&=O;V0@:6YT96YT:6\*/CX@
M;G,*/CX@9G)O;2!W96QL(&UE86YI;F<@<&5O<&QE+B`@5VAA="!I<R!Y;W5R
M('1I;65L:6YE(&9O<B!G971T:6YG(&ET(&)A8VL@:6YT;R!S:&$*/CX@<&4*
M/CX@86YD(')E+6EN=&5G<F%T:6YG('EO=7)S96QF(&EN=&\@=&AE(&-O;6UI
M='1E<B!C;VUM=6YI='D_"CX@"CX@22!W;W5L9"!T:&EN:R!T:&ES('=O=6QD
M(&)E(&UY('1O<"!P<FEO<FET>2!R:6=H="!N;W<N($DG9"!L:6ME('1O('-E
M92!I="`*/B!A="!T:&4@;&%T97-T(&QE=F5L(&EN($A%040N($D@=V]U;&0@
M;&EK92!T;R!-1D,@=&\@.2U35$%"3$4@870@<V]M92!P;VEN="X*/B`*/B!'
M:79E;B!T:&%T($E01B!A;')E861Y(&QI=F5S(&EN('-R8R]C;VYT<FEB(&%N
M9"!S<F,O<WES+V-O;G1R:6(L('=O=6QD('1H92`*/B!C:&%N9V4@:6X@3&EC
M96YS92!F<F]M($1A<G)E;B!2965D)W,@;W=N(&YO="!S;R!"4T0@9G)I96YD
M;'D@25!&(&QI8V5N<V4@=&\@"CX@1U!,=C(@8F4@;V8@8V]N8V5R;BX@22!R
M96-A;&P@=&AE<F4@=V%S(&$@;&]T(&]F(&-O;F-E<FX@;W9E<B!)4$8G<R!L
M:6-E;G-E(`H^(&-H86YG92!A="!T:&4@=&EM92X@*$9R965"4T0@;6]V960@
M:70@=&\@8V]N=')I8B!W:&EL92!/<&5N0E-$(')E;6]V960@:70@"CX@8V]M
M<&QE=&5L>2!A;F0@=W)O=&4@4$8@+2T@22=M(&YO="!S=7)E('=H870@3F5T
M0E-$(&1I9"DN"CX@"@H*22!W;W5L9"!A<W-U;64@=&AA="!T:&4@;&EC96YS
M92!C:&%N9V4@=V]U;&0@8F4@3TLL(&5S<&5C:6%L;'D@<VEN8V4@=&AE(&]T
M:&5R"F]P=&EO;B!I<R!T;R!R96UO=F4@:70@*&]R(&QE="!I="!C;VYT:6YU
M92!T;R!R;W0@86YD(&)E(&%N(&5Y97-O<F4I(&)U="!))VQL(&1E9F5R('1O
M('1H;W-E(&QI:V4*1VQE8B!A;F0@4G5I('=I=&@@82!M;W)E('9E<W1E9"!I
M;G1E<F5S="!I;B!I="X*"E-C;W1T"@I?7U]?7U]?7U]?7U]?7U]?7U]?7U]?
M7U]?7U]?7U]?7U]?7U]?7U]?7U]?7U]?7PIF<F5E8G-D+6-U<G)E;G1`9G)E
M96)S9"YO<F<@;6%I;&EN9R!L:7-T"FAT='`Z+R]L:7-T<RYF<F5E8G-D+F]R
M9R]M86EL;6%N+VQI<W1I;F9O+V9R965B<V0M8W5R<F5N=`I4;R!U;G-U8G-C
M<FEB92P@<V5N9"!A;GD@;6%I;"!T;R`B9G)E96)S9"UC=7)R96YT+75N<W5B
4<V-R:6)E0&9R965B<V0N;W)G(@H`
`
end




* Re: Email text that confuses charset recognition in emacs
  2013-04-16 16:27 Email text that confuses charset recognition in emacs Giorgos Keramidas
@ 2013-04-17  4:37 ` Paul Eggert
  2013-04-24 15:13   ` Kenichi Handa
  0 siblings, 1 reply; 4+ messages in thread
From: Paul Eggert @ 2013-04-17  4:37 UTC (permalink / raw)
  To: Giorgos Keramidas; +Cc: emacs-devel

On 04/16/2013 09:27 AM, Giorgos Keramidas wrote:
> the attached email message confuses the charset
> detection machinery of Emacs, and it starts interpreting all text as
> Japanese text -- even though most of the contents of the file are plain
> us-ascii text.

Although the text is US-ASCII, it contains a valid ISO-2022-7bit
coding sequence (the two things are not incompatible),
which Emacs is properly detecting and converting.  The problem is that
the text later contains the invalid escape sequence

   ESC LF > > SP ( B

This text was intended to switch out of a Japanese charset (the immediately
preceding text is valid ISO-2022-7bit Japanese), but a mailer that
*thought* that the text was ASCII inserted LF > > SP after the ESC
and before the ( B, causing the ESC ( B to be corrupted, so Emacs remains
in Japanese mode until the end of the input.

Perhaps when Emacs is decoding ISO-2022-7bit and sees an invalid
escape sequence, it should switch back to ASCII.  That would have
fixed your problem, and wouldn't break the decoding of any valid
ISO-2022-7bit sequence.
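
The recovery rule suggested above can be sketched as a small state
machine.  This is a Python illustration, not the Emacs decoder: the
function split_by_charset is hypothetical, it only tracks which
character set is active instead of decoding characters, and dropping
the stray ESC byte is an assumption of the sketch.

```python
ESC = 0x1B
# Designations from RFC 1468; True means a two-byte (Japanese) set.
DESIGNATIONS = {b"(B": False, b"(J": False, b"$@": True, b"$B": True}


def split_by_charset(raw: bytes) -> list[tuple[str, bytes]]:
    """Split raw bytes into ("ascii" | "jis", payload) runs."""
    runs: list[tuple[str, bytes]] = []
    two_byte = False
    buf = bytearray()

    def flush() -> None:
        # Emit the pending bytes, labelled with the set active so far.
        if buf:
            runs.append(("jis" if two_byte else "ascii", bytes(buf)))
            buf.clear()

    i = 0
    while i < len(raw):
        if raw[i] == ESC:
            flush()
            final = raw[i + 1:i + 3]
            if final in DESIGNATIONS:
                two_byte = DESIGNATIONS[final]
                i += 3
            else:
                # Invalid escape sequence: fall back to ASCII and
                # drop the ESC byte (an assumption of this sketch).
                two_byte = False
                i += 1
            continue
        buf.append(raw[i])
        i += 1
    flush()
    return runs


sample = b"A\x1b$B$N\x1b\n>> (B: tail"
for charset, payload in split_by_charset(sample):
    print(charset, payload)
```

With this rule the leftover "(B" shows up as two literal ASCII
characters, but everything after the damaged sequence is read as
ASCII again.  Without the fallback branch, the rest of the input
would stay in the two-byte set until the end.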





* Re: Email text that confuses charset recognition in emacs
  2013-04-17  4:37 ` Paul Eggert
@ 2013-04-24 15:13   ` Kenichi Handa
  2013-04-24 20:34     ` Giorgos Keramidas
  0 siblings, 1 reply; 4+ messages in thread
From: Kenichi Handa @ 2013-04-24 15:13 UTC (permalink / raw)
  To: Paul Eggert; +Cc: keramida, emacs-devel

In article <516E26F4.1020303@cs.ucla.edu>, Paul Eggert <eggert@cs.ucla.edu> writes:

> Perhaps when Emacs is decoding ISO-2022-7bit and sees an invalid
> escape sequence, it should switch back to ASCII.  That would have
> fixed your problem, and wouldn't break the decoding of any valid
> ISO-2022-7bit sequence.

It seems like a good idea.  I've just installed such a
change to the trunk.

---
Kenichi Handa
handa@gnu.org





* Re: Email text that confuses charset recognition in emacs
  2013-04-24 15:13   ` Kenichi Handa
@ 2013-04-24 20:34     ` Giorgos Keramidas
  0 siblings, 0 replies; 4+ messages in thread
From: Giorgos Keramidas @ 2013-04-24 20:34 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: Paul Eggert, emacs-devel

Thanks!  I'll try this right away...


On Wed, Apr 24, 2013 at 5:13 PM, Kenichi Handa <handa@gnu.org> wrote:

> In article <516E26F4.1020303@cs.ucla.edu>, Paul Eggert <eggert@cs.ucla.edu>
> writes:
>
> > Perhaps when Emacs is decoding ISO-2022-7bit and sees an invalid
> > escape sequence, it should switch back to ASCII.  That would have
> > fixed your problem, and wouldn't break the decoding of any valid
> > ISO-2022-7bit sequence.
>
> It seems like a good idea.  I've just installed such a
> change to the trunk.
>
> ---
> Kenichi Handa
> handa@gnu.org
>
>


-- 
Giorgos Keramidas; keramida@ceid.upatras.gr
