From: Peter Dyballa
Newsgroups: gmane.emacs.help
Subject: Re: Polish characters in emacs
Date: Fri, 12 Oct 2007 12:20:22 +0200
Message-ID: <79F3E108-4C73-44C9-9926-BDA8C8C2AC7C@Web.DE>
References: <1192136997.140783.256650@d55g2000hsg.googlegroups.com>
To: Wojtek
Cc: help-gnu-emacs@gnu.org

On 11.10.2007 at 23:09, Wojtek wrote:

> Could someone point me to an explanation of settings so that Polish
> characters are displayed correctly in an emacs buffer and whether this
> has to do with the environment outside of emacs.

Five encodings support Polish: the three 8-bit ISO Latin encodings
ISO 8859-2, ISO 8859-13, and ISO 8859-16; the 8-bit MS encoding Code
Page 1250; and finally UTF-8. They all have
ÓóĄąĆćĘęŁłŃńŚśŹźŻż.

> When I open up a connection to my account and run emacs from Fedora 7
> the characters do not show up when viewing a mail message encoded as
> utf-8.

*How* does this happen? Do you only see empty boxes? Then you're using
a font that does not have the Polish characters. Change, for example,
to Lucida Sans Typewriter from the Java SDK!

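The list of encodings above can be cross-checked outside of Emacs. The
following is only a minimal sketch in Python, which nobody in this
thread uses; the codec names are Python's own aliases for the five
encodings, and the helper name missing_letters is made up for
illustration.

```python
# Sketch: check which of the five encodings can represent every Polish letter.
# The codec names are Python's standard aliases; missing_letters is a
# hypothetical helper introduced only for this illustration.
POLISH = "ÓóĄąĆćĘęŁłŃńŚśŹźŻż"
CODECS = ["iso8859_2", "iso8859_13", "iso8859_16", "cp1250", "utf_8"]


def missing_letters(codec):
    """Return the Polish letters that `codec` cannot encode."""
    missing = []
    for ch in POLISH:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return "".join(missing)


# Map each codec to the letters it lacks (an empty string means full coverage).
coverage = {codec: missing_letters(codec) for codec in CODECS}

for codec, missing in coverage.items():
    print(f"{codec:12} missing: {missing or 'none'}")
```

Running missing_letters("iso8859_1") instead shows which letters
Latin 1 lacks. Note that this only tests the encodings; whether a
given font has the glyphs is a separate question.
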
> However I can toggle the input method to polish-slash and
> enter polish characters and they do show up.

This can be something completely different. (I never use an "input
method." At least not consciously.) And it proves nothing except that
this GNU Emacs can display the chosen item properly. So there might be
some misunderstanding coming from the eMail client used to retrieve
the input data for that buffer.

> When connecting to the same account from a Windows machine using
> Cygwin-X, the characters in
> the mail message show up without a problem.

Ahh! So you have been writing the whole time about eMails and their
textual presentation? Which eMail client do you use to read the
eMails? Can you make some of the header lines of the eMails appear in
your eMail client, particularly those that describe how the message
was encoded for transport through the Internet? The eMail client can
have its own ideas about representing an eMail's contents ...

> Since the emacs I am running is starting up with the same
> parameters, what controls the display of characters?

It's definitely the encoding used in the buffer. It's indicated at the
beginning of the mode-line (left-most characters).

	-*:	for MS CP1250 or CP1252
	-2:	for ISO 8859-2 (Latin 2)
	-l:	for ISO 8859-13 (Latin 7)
	-r:	for ISO 8859-16 (Latin 10)
	-u:	for UTF-8

BTW, you can click on that character with the mouse, and a *Help*
buffer with an explanation opens.

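Why the buffer's encoding matters so much can be seen with a few
bytes. The following is a minimal sketch in Python, not something
Emacs itself runs; the six byte values are the ISO 8859-2 positions of
Ą, ą, Ł, ł, Ó, ó, and the helper name views is made up for
illustration.

```python
# Sketch: the same six bytes, interpreted under different 8-bit encodings.
# The byte values are the ISO 8859-2 code positions of Ą ą Ł ł Ó ó.
RAW = bytes([0xA1, 0xB1, 0xA3, 0xB3, 0xD3, 0xF3])

CODECS = ["iso8859_2", "iso8859_13", "iso8859_16", "cp1250"]


def views(raw):
    """Return {codec: decoded text} for the same raw bytes (hypothetical helper)."""
    return {codec: raw.decode(codec) for codec in CODECS}


for codec, text in views(RAW).items():
    print(f"{codec:12} {text}")
```

The bytes never change; only the interpretation does. Under ISO 8859-2
they read ĄąŁłÓó, while under ISO 8859-16 the second byte already
becomes ±.
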
A good method to check where the error can come from is to use a
"neutral", simple and pure text file like this one:

;;; -*- mode: Text; coding: iso-8859-2; -*-
;
; Time-stamp: <2005-05-11 23:52:49 pete>
;
; Central and Eastern European Glyphs (Latin 2)
;
;   oct   dec   hex  UCS2     UTF-8
;=====================================
  = 240 = 160 = A0 = U+00A0 = C2 A0 : NO-BREAK SPACE
Ą = 241 = 161 = A1 = U+0104 = C4 84 : LATIN CAPITAL LETTER A WITH OGONEK
˘ = 242 = 162 = A2 = U+02D8 = CB 98 : BREVE
Ł = 243 = 163 = A3 = U+0141 = C5 81 : LATIN CAPITAL LETTER L WITH STROKE
¤ = 244 = 164 = A4 = U+00A4 = C2 A4 : CURRENCY SIGN
Ľ = 245 = 165 = A5 = U+013D = C4 BD : LATIN CAPITAL LETTER L WITH CARON
Ś = 246 = 166 = A6 = U+015A = C5 9A : LATIN CAPITAL LETTER S WITH ACUTE
§ = 247 = 167 = A7 = U+00A7 = C2 A7 : SECTION SIGN
¨ = 250 = 168 = A8 = U+00A8 = C2 A8 : DIAERESIS
Š = 251 = 169 = A9 = U+0160 = C5 A0 : LATIN CAPITAL LETTER S WITH CARON
Ş = 252 = 170 = AA = U+015E = C5 9E : LATIN CAPITAL LETTER S WITH CEDILLA
Ť = 253 = 171 = AB = U+0164 = C5 A4 : LATIN CAPITAL LETTER T WITH CARON
Ź = 254 = 172 = AC = U+0179 = C5 B9 : LATIN CAPITAL LETTER Z WITH ACUTE
- = 255 = 173 = AD = U+00AD = C2 AD : SOFT HYPHEN
Ž = 256 = 174 = AE = U+017D = C5 BD : LATIN CAPITAL LETTER Z WITH CARON
Ż = 257 = 175 = AF = U+017B = C5 BB : LATIN CAPITAL LETTER Z WITH DOT ABOVE
° = 260 = 176 = B0 = U+00B0 = C2 B0 : DEGREE SIGN
ą = 261 = 177 = B1 = U+0105 = C4 85 : LATIN SMALL LETTER A WITH OGONEK
˛ = 262 = 178 = B2 = U+02DB = CB 9B : OGONEK
ł = 263 = 179 = B3 = U+0142 = C5 82 : LATIN SMALL LETTER L WITH STROKE
´ = 264 = 180 = B4 = U+00B4 = C2 B4 : ACUTE ACCENT
ľ = 265 = 181 = B5 = U+013E = C4 BE : LATIN SMALL LETTER L WITH CARON
ś = 266 = 182 = B6 = U+015B = C5 9B : LATIN SMALL LETTER S WITH ACUTE
ˇ = 267 = 183 = B7 = U+02C7 = CB 87 : CARON
¸ = 270 = 184 = B8 = U+00B8 = C2 B8 : CEDILLA
š = 271 = 185 = B9 = U+0161 = C5 A1 : LATIN SMALL LETTER S WITH CARON
ş = 272 = 186 = BA = U+015F = C5 9F : LATIN SMALL LETTER S WITH CEDILLA
ť = 273 = 187 = BB = U+0165 = C5 A5 : LATIN SMALL LETTER T WITH CARON
ź = 274 = 188 = BC = U+017A = C5 BA : LATIN SMALL LETTER Z WITH ACUTE
˝ = 275 = 189 = BD = U+02DD = CB 9D : DOUBLE ACUTE ACCENT
ž = 276 = 190 = BE = U+017E = C5 BE : LATIN SMALL LETTER Z WITH CARON
ż = 277 = 191 = BF = U+017C = C5 BC : LATIN SMALL LETTER Z WITH DOT ABOVE
Ŕ = 300 = 192 = C0 = U+0154 = C5 94 : LATIN CAPITAL LETTER R WITH ACUTE
Á = 301 = 193 = C1 = U+00C1 = C3 81 : LATIN CAPITAL LETTER A WITH ACUTE
Â = 302 = 194 = C2 = U+00C2 = C3 82 : LATIN CAPITAL LETTER A WITH CIRCUMFLEX
Ă = 303 = 195 = C3 = U+0102 = C4 82 : LATIN CAPITAL LETTER A WITH BREVE
Ä = 304 = 196 = C4 = U+00C4 = C3 84 : LATIN CAPITAL LETTER A WITH DIAERESIS
Ĺ = 305 = 197 = C5 = U+0139 = C4 B9 : LATIN CAPITAL LETTER L WITH ACUTE
Ć = 306 = 198 = C6 = U+0106 = C4 86 : LATIN CAPITAL LETTER C WITH ACUTE
Ç = 307 = 199 = C7 = U+00C7 = C3 87 : LATIN CAPITAL LETTER C WITH CEDILLA
Č = 310 = 200 = C8 = U+010C = C4 8C : LATIN CAPITAL LETTER C WITH CARON
É = 311 = 201 = C9 = U+00C9 = C3 89 : LATIN CAPITAL LETTER E WITH ACUTE
Ę = 312 = 202 = CA = U+0118 = C4 98 : LATIN CAPITAL LETTER E WITH OGONEK
Ë = 313 = 203 = CB = U+00CB = C3 8B : LATIN CAPITAL LETTER E WITH DIAERESIS
Ě = 314 = 204 = CC = U+011A = C4 9A : LATIN CAPITAL LETTER E WITH CARON
Í = 315 = 205 = CD = U+00CD = C3 8D : LATIN CAPITAL LETTER I WITH ACUTE
Î = 316 = 206 = CE = U+00CE = C3 8E : LATIN CAPITAL LETTER I WITH CIRCUMFLEX
Ď = 317 = 207 = CF = U+010E = C4 8E : LATIN CAPITAL LETTER D WITH CARON
Đ = 320 = 208 = D0 = U+0110 = C4 90 : LATIN CAPITAL LETTER D WITH STROKE
Ń = 321 = 209 = D1 = U+0143 = C5 83 : LATIN CAPITAL LETTER N WITH ACUTE
Ň = 322 = 210 = D2 = U+0147 = C5 87 : LATIN CAPITAL LETTER N WITH CARON
Ó = 323 = 211 = D3 = U+00D3 = C3 93 : LATIN CAPITAL LETTER O WITH ACUTE
Ô = 324 = 212 = D4 = U+00D4 = C3 94 : LATIN CAPITAL LETTER O WITH CIRCUMFLEX
Ő = 325 = 213 = D5 = U+0150 = C5 90 : LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
Ö = 326 = 214 = D6 = U+00D6 = C3 96 : LATIN CAPITAL LETTER O WITH DIAERESIS
× = 327 = 215 = D7 = U+00D7 = C3 97 : MULTIPLICATION SIGN
Ř = 330 = 216 = D8 = U+0158 = C5 98 : LATIN CAPITAL LETTER R WITH CARON
Ů = 331 = 217 = D9 = U+016E = C5 AE : LATIN CAPITAL LETTER U WITH RING ABOVE
Ú = 332 = 218 = DA = U+00DA = C3 9A : LATIN CAPITAL LETTER U WITH ACUTE
Ű = 333 = 219 = DB = U+0170 = C5 B0 : LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
Ü = 334 = 220 = DC = U+00DC = C3 9C : LATIN CAPITAL LETTER U WITH DIAERESIS
Ý = 335 = 221 = DD = U+00DD = C3 9D : LATIN CAPITAL LETTER Y WITH ACUTE
Ţ = 336 = 222 = DE = U+0162 = C5 A2 : LATIN CAPITAL LETTER T WITH CEDILLA
ß = 337 = 223 = DF = U+00DF = C3 9F : LATIN SMALL LETTER SHARP S
ŕ = 340 = 224 = E0 = U+0155 = C5 95 : LATIN SMALL LETTER R WITH ACUTE
á = 341 = 225 = E1 = U+00E1 = C3 A1 : LATIN SMALL LETTER A WITH ACUTE
â = 342 = 226 = E2 = U+00E2 = C3 A2 : LATIN SMALL LETTER A WITH CIRCUMFLEX
ă = 343 = 227 = E3 = U+0103 = C4 83 : LATIN SMALL LETTER A WITH BREVE
ä = 344 = 228 = E4 = U+00E4 = C3 A4 : LATIN SMALL LETTER A WITH DIAERESIS
ĺ = 345 = 229 = E5 = U+013A = C4 BA : LATIN SMALL LETTER L WITH ACUTE
ć = 346 = 230 = E6 = U+0107 = C4 87 : LATIN SMALL LETTER C WITH ACUTE
ç = 347 = 231 = E7 = U+00E7 = C3 A7 : LATIN SMALL LETTER C WITH CEDILLA
č = 350 = 232 = E8 = U+010D = C4 8D : LATIN SMALL LETTER C WITH CARON
é = 351 = 233 = E9 = U+00E9 = C3 A9 : LATIN SMALL LETTER E WITH ACUTE
ę = 352 = 234 = EA = U+0119 = C4 99 : LATIN SMALL LETTER E WITH OGONEK
ë = 353 = 235 = EB = U+00EB = C3 AB : LATIN SMALL LETTER E WITH DIAERESIS
ě = 354 = 236 = EC = U+011B = C4 9B : LATIN SMALL LETTER E WITH CARON
í = 355 = 237 = ED = U+00ED = C3 AD : LATIN SMALL LETTER I WITH ACUTE
î = 356 = 238 = EE = U+00EE = C3 AE : LATIN SMALL LETTER I WITH CIRCUMFLEX
ď = 357 = 239 = EF = U+010F = C4 8F : LATIN SMALL LETTER D WITH CARON
đ = 360 = 240 = F0 = U+0111 = C4 91 : LATIN SMALL LETTER D WITH STROKE
ń = 361 = 241 = F1 = U+0144 = C5 84 : LATIN SMALL LETTER N WITH ACUTE
ň = 362 = 242 = F2 = U+0148 = C5 88 : LATIN SMALL LETTER N WITH CARON
ó = 363 = 243 = F3 = U+00F3 = C3 B3 : LATIN SMALL LETTER O WITH ACUTE
ô = 364 = 244 = F4 = U+00F4 = C3 B4 : LATIN SMALL LETTER O WITH CIRCUMFLEX
ő = 365 = 245 = F5 = U+0151 = C5 91 : LATIN SMALL LETTER O WITH DOUBLE ACUTE
ö = 366 = 246 = F6 = U+00F6 = C3 B6 : LATIN SMALL LETTER O WITH DIAERESIS
÷ = 367 = 247 = F7 = U+00F7 = C3 B7 : DIVISION SIGN
ř = 370 = 248 = F8 = U+0159 = C5 99 : LATIN SMALL LETTER R WITH CARON
ů = 371 = 249 = F9 = U+016F = C5 AF : LATIN SMALL LETTER U WITH RING ABOVE
ú = 372 = 250 = FA = U+00FA = C3 BA : LATIN SMALL LETTER U WITH ACUTE
ű = 373 = 251 = FB = U+0171 = C5 B1 : LATIN SMALL LETTER U WITH DOUBLE ACUTE
ü = 374 = 252 = FC = U+00FC = C3 BC : LATIN SMALL LETTER U WITH DIAERESIS
ý = 375 = 253 = FD = U+00FD = C3 BD : LATIN SMALL LETTER Y WITH ACUTE
ţ = 376 = 254 = FE = U+0163 = C5 A3 : LATIN SMALL LETTER T WITH CEDILLA
˙ = 377 = 255 = FF = U+02D9 = CB 99 : DOT ABOVE

and open it in both Emacsen. Run them at the same time and compare
mode-lines and other details (encodings, fonts used: C-u C-x = on a
glyph, ...). In your user init file you can prepare sections that
depend on the emacs-major-version or window-system variables.

If you want to have some fun, then change this file's first line from
iso-8859-2 to, let's say, iso-8859-16 *outside* of GNU Emacs, by for
example,

	cat | sed -e s/iso-8859-2/iso-8859-16/ > .

This changes only the coding declaration in the first line (the 2
becomes 16) and none of the bytes in the table, but the first column
will be totally different in GNU Emacs, and the descriptive text will
become untrue for most characters. The lesson is that the contents lie
somewhere underneath and you only ever get some *presentation* of
these contents. (As in real life, you can't see the reality outside
your head.)

--
Greetings

   Pete

Time flies like an error -- but fruit flies like a banana!
                 (almost Groucho Marx)