all messages for Emacs-related lists mirrored at yhetil.org
From: Peter Dyballa <Peter_Dyballa@Web.DE>
To: Wojtek <wnkltd@gmail.com>
Cc: help-gnu-emacs@gnu.org
Subject: Re: Polish characters in emacs
Date: Fri, 12 Oct 2007 12:20:22 +0200
Message-ID: <79F3E108-4C73-44C9-9926-BDA8C8C2AC7C@Web.DE>
In-Reply-To: <1192136997.140783.256650@d55g2000hsg.googlegroups.com>


On 11.10.2007 at 23:09, Wojtek wrote:

> Could someone point me to an explanation of settings so that Polish
> characters are displayed correctly in an emacs buffer and whether this
> has to do with the environment outside of emacs.

There are three 8 bit ISO Latin encodings that support Polish: ISO  
8859-2, ISO 8859-13, and ISO 8859-16. There is also the 8 bit MS  
encoding Code Page 1250, and finally UTF-8. They all have  
ÓóĄąĆćĘꣳŃńŚśŹźŻż.
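
The claim above can be checked outside of Emacs. What follows is only a small sketch in Python, not anything Emacs does itself: it tries to encode the Polish letters with Python's own codecs for the encodings named above. ISO 8859-1 is added as a contrast (my addition, not mentioned in the original text); the names POLISH, CODECS and covers_polish are made up for this illustration.

```python
# Polish letters beyond ASCII, upper and lower case.
POLISH = "ĄąĆćĘęŁłŃńÓóŚśŹźŻż"

# Python codec names for the encodings named above,
# plus Latin-1 (iso8859_1) as a contrast that lacks most of the letters.
CODECS = ["iso8859_2", "iso8859_13", "iso8859_16", "cp1250", "utf_8", "iso8859_1"]


def covers_polish(codec: str) -> bool:
    """Return True if every Polish letter can be encoded with `codec`."""
    try:
        POLISH.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Map each codec name to whether it can represent all Polish letters.
coverage = {codec: covers_polish(codec) for codec in CODECS}

if __name__ == "__main__":
    for codec, ok in coverage.items():
        print(f"{codec:12} {'ok' if ok else 'missing letters'}")
```

An encoding that passes this check can store the text; whether the glyphs then show up on screen is still a matter of the font.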

> When I open up a connection to my account and run emacs from Fedora 7
> the characters do not show up when viewing a mail message encoded as
> utf-8.

*How* exactly do they fail to show up? Do you only see empty boxes?  
Then you're using a font that does not have the Polish characters.  
Change, for example, to Lucida Sans Typewriter from the Java SDK!

> However I can toggle the input method to polish-slash and
> enter polish characters and they do show up.

This can be something completely different. (I never use an "input  
method." At least not consciously.) And it's no proof, except that  
this GNU Emacs can display the chosen item properly. So there might  
be some misunderstanding coming from the eMail client used to retrieve  
the input data for that buffer.

> When connecting to the same account from a Windows machine using  
> Cygwin-X, the characters in
> the mail message show up without a problem.

Ahh! So you are writing the whole time about eMails and their textual  
presentation? Which eMail client do you use to read the eMails? Can  
you make some of the header lines of the eMails appear in your eMail  
client, particularly those that describe the way the message was  
encoded for the transport through the Internet? The eMail client can  
have its own ideas of representing an eMail's contents ...
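
As an illustration of which header lines are meant, here is a minimal sketch that reads them with Python's standard email package. The message in RAW is entirely made up (addresses, subject and body are hypothetical), and it assumes a simple, non-multipart text/plain mail; a real message may be multipart and carry these headers per part.

```python
from email import message_from_bytes
from email.policy import default

# Hypothetical sample message; addresses, subject and body are invented.
RAW = (
    b"From: someone@example.org\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    + "Zażółć gęślą jaźń\r\n".encode("utf-8")
)

msg = message_from_bytes(RAW, policy=default)

# The two headers that describe how the body was encoded for transport.
charset = msg.get_content_charset()          # charset parameter of Content-Type
transfer = msg["Content-Transfer-Encoding"]  # e.g. 8bit, quoted-printable, base64

# Body decoded according to those headers.
body = msg.get_content()

if __name__ == "__main__":
    print("charset: ", charset)
    print("transfer:", transfer)
    print("body:    ", body.strip())
```

If the charset announced here does not match the bytes actually sent, or the mail client ignores it, the text can come out wrong before the Emacs buffer is even involved.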

> Since the emacs I am running is starting up with the same  
> parameters, what controls the display of characters?

It's definitely the encoding used in the buffer. It's indicated at  
the beginning of the mode-line (left-most characters).

	-*: for MS CP1250 or CP1252
	-2: for ISO 8859-2   (Latin  2)
	-l: for ISO 8859-13  (Latin  7)
	-r: for ISO 8859-16  (Latin 10)
	-u: for UTF-8

BTW, if you click that character with the mouse, a *Help* buffer  
with an explanation opens.


A good method to check where the error can come from is to use a  
"neutral" simple and pure text file like this one:

;;; -*- mode: Text; coding: iso-8859-2; -*-
;
;	Time-stamp: <2005-05-11 23:52:49 pete>
;
;   Central and Eastern European Glyphs (Latin 2)
;
;   oct   dec   hex    UCS2    UTF-8
;=====================================
   = 240 = 160 = A0 = U+00A0 =    C2 A0 : NO-BREAK SPACE
Ą = 241 = 161 = A1 = U+0104 =    C4 84 : LATIN CAPITAL LETTER A WITH OGONEK
˘ = 242 = 162 = A2 = U+02D8 =    CB 98 : BREVE
Ł = 243 = 163 = A3 = U+0141 =    C5 81 : LATIN CAPITAL LETTER L WITH STROKE
¤ = 244 = 164 = A4 = U+00A4 =    C2 A4 : CURRENCY SIGN
Ľ = 245 = 165 = A5 = U+013D =    C4 BD : LATIN CAPITAL LETTER L WITH CARON
Ś = 246 = 166 = A6 = U+015A =    C5 9A : LATIN CAPITAL LETTER S WITH ACUTE
§ = 247 = 167 = A7 = U+00A7 =    C2 A7 : SECTION SIGN
¨ = 250 = 168 = A8 = U+00A8 =    C2 A8 : DIAERESIS
Š = 251 = 169 = A9 = U+0160 =    C5 A0 : LATIN CAPITAL LETTER S WITH CARON
Ş = 252 = 170 = AA = U+015E =    C5 9E : LATIN CAPITAL LETTER S WITH CEDILLA
Ť = 253 = 171 = AB = U+0164 =    C5 A4 : LATIN CAPITAL LETTER T WITH CARON
Ź = 254 = 172 = AC = U+0179 =    C5 B9 : LATIN CAPITAL LETTER Z WITH ACUTE
- = 255 = 173 = AD = U+00AD =    C2 AD : SOFT HYPHEN
Ž = 256 = 174 = AE = U+017D =    C5 BD : LATIN CAPITAL LETTER Z WITH CARON
Ż = 257 = 175 = AF = U+017B =    C5 BB : LATIN CAPITAL LETTER Z WITH DOT ABOVE
° = 260 = 176 = B0 = U+00B0 =    C2 B0 : DEGREE SIGN
ą = 261 = 177 = B1 = U+0105 =    C4 85 : LATIN SMALL LETTER A WITH OGONEK
˛ = 262 = 178 = B2 = U+02DB =    CB 9B : OGONEK
ł = 263 = 179 = B3 = U+0142 =    C5 82 : LATIN SMALL LETTER L WITH STROKE
´ = 264 = 180 = B4 = U+00B4 =    C2 B4 : ACUTE ACCENT
ľ = 265 = 181 = B5 = U+013E =    C4 BE : LATIN SMALL LETTER L WITH CARON
ś = 266 = 182 = B6 = U+015B =    C5 9B : LATIN SMALL LETTER S WITH ACUTE
ˇ = 267 = 183 = B7 = U+02C7 =    CB 87 : CARON
¸ = 270 = 184 = B8 = U+00B8 =    C2 B8 : CEDILLA
š = 271 = 185 = B9 = U+0161 =    C5 A1 : LATIN SMALL LETTER S WITH CARON
ş = 272 = 186 = BA = U+015F =    C5 9F : LATIN SMALL LETTER S WITH CEDILLA
ť = 273 = 187 = BB = U+0165 =    C5 A5 : LATIN SMALL LETTER T WITH CARON
ź = 274 = 188 = BC = U+017A =    C5 BA : LATIN SMALL LETTER Z WITH ACUTE
˝ = 275 = 189 = BD = U+02DD =    CB 9D : DOUBLE ACUTE ACCENT
ž = 276 = 190 = BE = U+017E =    C5 BE : LATIN SMALL LETTER Z WITH CARON
ż = 277 = 191 = BF = U+017C =    C5 BC : LATIN SMALL LETTER Z WITH DOT ABOVE
Ŕ = 300 = 192 = C0 = U+0154 =    C5 94 : LATIN CAPITAL LETTER R WITH ACUTE
Á = 301 = 193 = C1 = U+00C1 =    C3 81 : LATIN CAPITAL LETTER A WITH ACUTE
Â = 302 = 194 = C2 = U+00C2 =    C3 82 : LATIN CAPITAL LETTER A WITH CIRCUMFLEX
Ă = 303 = 195 = C3 = U+0102 =    C4 82 : LATIN CAPITAL LETTER A WITH BREVE
Ä = 304 = 196 = C4 = U+00C4 =    C3 84 : LATIN CAPITAL LETTER A WITH DIAERESIS
Ĺ = 305 = 197 = C5 = U+0139 =    C4 B9 : LATIN CAPITAL LETTER L WITH ACUTE
Ć = 306 = 198 = C6 = U+0106 =    C4 86 : LATIN CAPITAL LETTER C WITH ACUTE
Ç = 307 = 199 = C7 = U+00C7 =    C3 87 : LATIN CAPITAL LETTER C WITH CEDILLA
Č = 310 = 200 = C8 = U+010C =    C4 8C : LATIN CAPITAL LETTER C WITH CARON
É = 311 = 201 = C9 = U+00C9 =    C3 89 : LATIN CAPITAL LETTER E WITH ACUTE
Ę = 312 = 202 = CA = U+0118 =    C4 98 : LATIN CAPITAL LETTER E WITH OGONEK
Ë = 313 = 203 = CB = U+00CB =    C3 8B : LATIN CAPITAL LETTER E WITH DIAERESIS
Ě = 314 = 204 = CC = U+011A =    C4 9A : LATIN CAPITAL LETTER E WITH CARON
Í = 315 = 205 = CD = U+00CD =    C3 8D : LATIN CAPITAL LETTER I WITH ACUTE
Î = 316 = 206 = CE = U+00CE =    C3 8E : LATIN CAPITAL LETTER I WITH CIRCUMFLEX
Ď = 317 = 207 = CF = U+010E =    C4 8E : LATIN CAPITAL LETTER D WITH CARON
Đ = 320 = 208 = D0 = U+0110 =    C4 90 : LATIN CAPITAL LETTER D WITH STROKE
Ń = 321 = 209 = D1 = U+0143 =    C5 83 : LATIN CAPITAL LETTER N WITH ACUTE
Ň = 322 = 210 = D2 = U+0147 =    C5 87 : LATIN CAPITAL LETTER N WITH CARON
Ó = 323 = 211 = D3 = U+00D3 =    C3 93 : LATIN CAPITAL LETTER O WITH ACUTE
Ô = 324 = 212 = D4 = U+00D4 =    C3 94 : LATIN CAPITAL LETTER O WITH CIRCUMFLEX
Ő = 325 = 213 = D5 = U+0150 =    C5 90 : LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
Ö = 326 = 214 = D6 = U+00D6 =    C3 96 : LATIN CAPITAL LETTER O WITH DIAERESIS
× = 327 = 215 = D7 = U+00D7 =    C3 97 : MULTIPLICATION SIGN
Ř = 330 = 216 = D8 = U+0158 =    C5 98 : LATIN CAPITAL LETTER R WITH CARON
Ů = 331 = 217 = D9 = U+016E =    C5 AE : LATIN CAPITAL LETTER U WITH RING ABOVE
Ú = 332 = 218 = DA = U+00DA =    C3 9A : LATIN CAPITAL LETTER U WITH ACUTE
Ű = 333 = 219 = DB = U+0170 =    C5 B0 : LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
Ü = 334 = 220 = DC = U+00DC =    C3 9C : LATIN CAPITAL LETTER U WITH DIAERESIS
Ý = 335 = 221 = DD = U+00DD =    C3 9D : LATIN CAPITAL LETTER Y WITH ACUTE
Ţ = 336 = 222 = DE = U+0162 =    C5 A2 : LATIN CAPITAL LETTER T WITH CEDILLA
ß = 337 = 223 = DF = U+00DF =    C3 9F : LATIN SMALL LETTER SHARP S
ŕ = 340 = 224 = E0 = U+0155 =    C5 95 : LATIN SMALL LETTER R WITH ACUTE
á = 341 = 225 = E1 = U+00E1 =    C3 A1 : LATIN SMALL LETTER A WITH ACUTE
â = 342 = 226 = E2 = U+00E2 =    C3 A2 : LATIN SMALL LETTER A WITH CIRCUMFLEX
ă = 343 = 227 = E3 = U+0103 =    C4 83 : LATIN SMALL LETTER A WITH BREVE
ä = 344 = 228 = E4 = U+00E4 =    C3 A4 : LATIN SMALL LETTER A WITH DIAERESIS
ĺ = 345 = 229 = E5 = U+013A =    C4 BA : LATIN SMALL LETTER L WITH ACUTE
ć = 346 = 230 = E6 = U+0107 =    C4 87 : LATIN SMALL LETTER C WITH ACUTE
ç = 347 = 231 = E7 = U+00E7 =    C3 A7 : LATIN SMALL LETTER C WITH CEDILLA
č = 350 = 232 = E8 = U+010D =    C4 8D : LATIN SMALL LETTER C WITH CARON
é = 351 = 233 = E9 = U+00E9 =    C3 A9 : LATIN SMALL LETTER E WITH ACUTE
ę = 352 = 234 = EA = U+0119 =    C4 99 : LATIN SMALL LETTER E WITH OGONEK
ë = 353 = 235 = EB = U+00EB =    C3 AB : LATIN SMALL LETTER E WITH DIAERESIS
ě = 354 = 236 = EC = U+011B =    C4 9B : LATIN SMALL LETTER E WITH CARON
í = 355 = 237 = ED = U+00ED =    C3 AD : LATIN SMALL LETTER I WITH ACUTE
î = 356 = 238 = EE = U+00EE =    C3 AE : LATIN SMALL LETTER I WITH CIRCUMFLEX
ď = 357 = 239 = EF = U+010F =    C4 8F : LATIN SMALL LETTER D WITH CARON
đ = 360 = 240 = F0 = U+0111 =    C4 91 : LATIN SMALL LETTER D WITH STROKE
ń = 361 = 241 = F1 = U+0144 =    C5 84 : LATIN SMALL LETTER N WITH ACUTE
ň = 362 = 242 = F2 = U+0148 =    C5 88 : LATIN SMALL LETTER N WITH CARON
ó = 363 = 243 = F3 = U+00F3 =    C3 B3 : LATIN SMALL LETTER O WITH ACUTE
ô = 364 = 244 = F4 = U+00F4 =    C3 B4 : LATIN SMALL LETTER O WITH CIRCUMFLEX
ő = 365 = 245 = F5 = U+0151 =    C5 91 : LATIN SMALL LETTER O WITH DOUBLE ACUTE
ö = 366 = 246 = F6 = U+00F6 =    C3 B6 : LATIN SMALL LETTER O WITH DIAERESIS
÷ = 367 = 247 = F7 = U+00F7 =    C3 B7 : DIVISION SIGN
ř = 370 = 248 = F8 = U+0159 =    C5 99 : LATIN SMALL LETTER R WITH CARON
ů = 371 = 249 = F9 = U+016F =    C5 AF : LATIN SMALL LETTER U WITH RING ABOVE
ú = 372 = 250 = FA = U+00FA =    C3 BA : LATIN SMALL LETTER U WITH ACUTE
ű = 373 = 251 = FB = U+0171 =    C5 B1 : LATIN SMALL LETTER U WITH DOUBLE ACUTE
ü = 374 = 252 = FC = U+00FC =    C3 BC : LATIN SMALL LETTER U WITH DIAERESIS
ý = 375 = 253 = FD = U+00FD =    C3 BD : LATIN SMALL LETTER Y WITH ACUTE
ţ = 376 = 254 = FE = U+0163 =    C5 A3 : LATIN SMALL LETTER T WITH CEDILLA
˙ = 377 = 255 = FF = U+02D9 =    CB 99 : DOT ABOVE

and open it in both Emacsen. Run them at the same time and compare  
mode-lines and other details (encodings, fonts used: C-u C-x = on a  
glyph, ...). In your user init file you can prepare sections that  
depend on the emacs-major-version or window-system variables.
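
Rows of the test file above can also be regenerated for comparison. This is only a sketch using Python's codecs and unicodedata module, not the way the file was originally produced; the function name table_row and its codec parameter are made up for this illustration.

```python
import unicodedata


def table_row(byte_value: int, codec: str = "iso8859_2") -> str:
    """Format one row: glyph, octal, decimal, hex, code point, UTF-8 bytes, name."""
    # Interpret the single byte under the given 8 bit encoding.
    char = bytes([byte_value]).decode(codec)
    # The same character as UTF-8, shown as space-separated hex bytes.
    utf8 = " ".join(f"{b:02X}" for b in char.encode("utf-8"))
    # Official Unicode name; fall back to a marker for unnamed characters.
    name = unicodedata.name(char, "<unnamed>")
    return (
        f"{char} = {byte_value:o} = {byte_value} = {byte_value:02X} = "
        f"U+{ord(char):04X} = {utf8:>8} : {name}"
    )


if __name__ == "__main__":
    # Print the upper half of the code page, 0xA0 through 0xFF.
    for value in range(0xA0, 0x100):
        print(table_row(value))
```

Passing another codec name, for example "iso8859_16", gives the rows as they would have to read for that encoding.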


If you want to have some fun, then change this file's first line from  
iso-8859-2 to, let's say, iso-8859-16 *outside* of GNU Emacs, for  
example with cat <file> | sed -e s/iso-8859-2/iso-8859-16/ > <other  
file>. This changes nothing but the coding declaration (the 2 becomes  
16), yet the first column will look quite different in GNU Emacs,  
and the descriptive text will become untrue for many characters.  
It shows that the content lies somewhere below and that you only ever  
get some *presentation* of this content. (As in real life you can't  
see the reality outside your head.)
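
The same experiment can be sketched without Emacs. The following Python lines are only an illustration under the assumption that Python's iso8859_2 and iso8859_16 codecs match the encodings Emacs uses; RAW and present are names invented here.

```python
# Two bytes as they would be stored in the file; they never change below.
RAW = bytes([0xA1, 0xB1])


def present(raw: bytes, declared_coding: str) -> str:
    """Decode the same bytes under the coding named in the file's first line."""
    return raw.decode(declared_coding)


# Declared as Latin 2: the bytes are shown as the two letters from the table.
as_latin2 = present(RAW, "iso8859_2")

# Declared as Latin 10: identical bytes, but not the same characters.
as_latin10 = present(RAW, "iso8859_16")

if __name__ == "__main__":
    print("bytes:      ", RAW.hex(" "))
    print("iso-8859-2: ", as_latin2)
    print("iso-8859-16:", as_latin10)
```

Only the label changes between the two calls; the stored bytes are the same in both.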

--
Greetings

   Pete

Time flies like an error -- but fruit flies like a banana!
                              (almost Groucho Marx)

