* query-replace?
From: B. T. Raven @ 2006-01-07 19:04 UTC

i. I have a character \234 in a file that should be displayed as an oe
ligature. I have tried M-% C-q (0)234, but this doesn't work for getting it
into the pattern to be replaced. How do I refer to this character?

ii. Is it possible to include newlines in regexps? I have a five-line
header on each page that begins with the same characters on the first line
and ends with the same characters on the last line. What is the most
automated method of deleting just these lines? .*\n can't work, of course.
Since some of the lines are blank, I inserted C-q C-j into the string, but
that produced a literal ^M. ??

Thanks
* Re: query-replace?
From: Eli Zaretskii @ 2006-01-08 4:20 UTC

> From: "B. T. Raven" <ecinmn@peoplepc.com>
> Date: Sat, 07 Jan 2006 19:04:29 GMT
>
> I have a character \234 in a file that should be displayed as an oe
> ligature. I have tried M-% C-q (0)234 but this doesn't work for getting it
> into the pattern to be replaced. How do I refer to this character?

Not as \234. Internally, non-ASCII characters are encoded differently
inside Emacs buffers; \234 is that character's _external_ encoding, in
a file.

To see the internal codepoint, go to that character and type
"C-u C-x =". In the buffer that Emacs pops up, look for the number
labeled "buffer code".

> Is it possible to include newlines in regexps?

Yes, use "C-q C-j".

> .*\n can't work of course. Since some of the lines are blank, I inserted
> C-q C-j into the string but that produced a literal ^M. ??

C-q C-j should produce a literal newline, not ^M.

Btw, in the future I suggest not asking several unrelated questions in
the same message, but splitting them into several messages instead.
That would allow you to give a meaningful Subject line to each message,
and readers of this forum would be able to tell in advance, just by
reading the Subject lines, whether they can help you with some of the
problems.
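The "C-u C-x =" lookup described above can also be done from Lisp. The following is a minimal sketch, not part of the original thread: the function name `my-describe-char-code` is hypothetical, and the expected value in the comment assumes Emacs 23 or later, where buffer characters are Unicode code points (the Emacs of 2006 used different internal codes, as the follow-up message shows).

```elisp
(defun my-describe-char-code (&optional pos)
  "Return the code of the character at POS (default: point) as a string.
The code is shown in octal, decimal and hexadecimal.  Hypothetical helper."
  (let ((ch (char-after (or pos (point)))))
    (if ch
        (format "octal %o, decimal %d, hex %X" ch ch ch)
      "no character at this position")))

;; Usage: put point on the character and evaluate
;;   (my-describe-char-code)
;; For U+0153 (oe ligature) in a Unicode-based Emacs this gives
;;   "octal 523, decimal 339, hex 153"
```

The numbers reported this way are the ones that C-q expects, in whatever radix `read-quoted-char-radix` is set to.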
* Re: query-replace?
From: B. T. Raven @ 2006-01-08 5:38 UTC

"Eli Zaretskii" <eliz@gnu.org> wrote in message
news:mailman.301.1136694143.26925.help-gnu-emacs@gnu.org...
> > From: "B. T. Raven" <ecinmn@peoplepc.com>
> > Date: Sat, 07 Jan 2006 19:04:29 GMT
> >
> > I have a character \234 in a file that should be displayed as an oe
> > ligature. I have tried M-% C-q (0)234 but this doesn't work for
> > getting it into the pattern to be replaced. How do I refer to this
> > character?
>
> Not as \234. Internally, non-ASCII characters are encoded differently
> inside Emacs buffers; \234 is that character's _external_ encoding, in
> a file.
>
> To see what is the internal codepoint, go to that character and type
> "C-u C-x =". In the buffer that Emacs pops up, look for the number
> labeled "buffer code".
>
> > Is it possible to include newlines in regexps?
>
> Yes, use "C-q C-j".
>
> > .*\n can't work of course. Since some of the lines are blank, I
> > inserted C-q C-j into the string but that produced a literal ^M. ??
>
> C-q C-j should produce a literal newline, not ^M.
>
> Btw, in the future I suggest not to ask several different unrelated
> questions in the same message, but instead split it into several
> messages. That would allow you to give meaningful Subject lines to
> each message, and readers of this forum will be able to tell in
> advance, by just reading the Subject lines, whether they can help you
> with some of the problems.

Thanks, Eli. Actually, \234 (dec. 156, hex 9c) was the internal
representation of the char. It was the only one with diacriticals that
showed up that way (clicking anywhere in the string put the cursor at the
backslash). The actual oe ligature is 01210163, 331891, 0x51073, but
wherever \234 appeared, it needed to be 'oe' from context.

I had left read-quoted-char-radix at 16 and thought I could override it by
typing C-q 0234, but that is also read as a hex number, I guess. I used
query-replace for the subject line because I saw both questions as related
to that function. I suppose one was really a question about regexp syntax.
Anyway, I understand both procedures a little better now (until next time).
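The radix confusion in this message is easy to check with a little arithmetic. This is an illustrative sketch only, using standard Emacs Lisp functions; the variable name `my-digits` is hypothetical. It shows that the same digits "234" denote three different numbers depending on the radix, and that the three spellings of the oe-ligature code quoted above agree with each other.

```elisp
;; The digit string typed after C-q, read in three different radices.
(let ((my-digits "234"))
  (list (string-to-number my-digits 8)     ; octal reading:   156
        (string-to-number my-digits 10)    ; decimal reading: 234
        (string-to-number my-digits 16)))  ; hex reading:     564

;; \234 octal, 156 decimal and 9c hex are the same number.
(= #o234 156 #x9c)

;; So are the three forms of the internal code quoted in the message.
(= #o1210163 331891 #x51073)
```

With `read-quoted-char-radix` set to 16, typing C-q 0 2 3 4 is therefore read as hex 234 (decimal 564), not as octal 234; a leading zero does not switch the radix.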
* Re: query-replace?
From: Peter Dyballa @ 2006-01-08 12:03 UTC
Cc: help-gnu-emacs

On 07.01.2006 at 19:04, B. T. Raven wrote:

> i.
> I have a character \234 in a file that should be displayed as an oe
> ligature. I have tried M-% C-q (0)234 but this doesn't work for
> getting it into the pattern to be replaced. How do I refer to this
> character?

With read-quoted-char-radix at its default of 8, it's C-q 2 3 4 <RET>.
Hex input would be C-q 9 C <RET>, and decimal C-q 1 5 6 <RET>. I don't
remember this failing in any case ...

A different approach would be to bind some function key to enter œ.

> ii. Is it possible to include newlines in regexps? I have a five line
> header on each page that begins with same characters on 1st line
> and ends with same on last line. What is the most automated method of
> deleting just these lines?

C-q C-j.

--
Greetings

Pete

There's no place like ~
(UNIX Guru)
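Both suggestions above can be put into an init file. The following is a sketch under stated assumptions rather than anything from the original message: the command name `my-insert-oe-ligature` and the choice of the F9 key are hypothetical, and the code point #x153 assumes a Unicode-based Emacs (23 or later).

```elisp
;; Read the digits typed after C-q as decimal instead of the default octal.
;; Valid values for this variable are 8, 10 and 16.
(setq read-quoted-char-radix 10)

(defun my-insert-oe-ligature ()
  "Insert the oe ligature (U+0153) at point.  Hypothetical helper."
  (interactive)
  (insert #x153))

;; F9 is an arbitrary choice; any free function key would do.
(global-set-key (kbd "<f9>") #'my-insert-oe-ligature)
```

Because the command simply inserts a character, it should also work while M-% is prompting in the minibuffer, which avoids having to remember the numeric code at all.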
* Re: query-replace?
From: Lennart Borgman @ 2006-01-08 12:11 UTC
Cc: B. T. Raven, help-gnu-emacs

>> ii. Is it possible to include newlines in regexps? I have a five line
>> header on each page that begins with same characters on 1st line and
>> ends with same on last line. What is the most automated method of
>> deleting just these lines?
>
> C-q C-j.

Maybe the question was about multiple lines, not the newline character?
Then perhaps

    http://www.emacswiki.org/cgi-bin/wiki/MultilineRegexp

can help?
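For the five-line header in the original question, a multi-line regexp can be built by spelling out each line, since "." never matches a newline in Emacs regexps. The function below is a minimal sketch, not the method from the wiki page: the name `my-delete-page-headers` and its two prefix arguments are hypothetical, and it assumes every header is exactly five lines long.

```elisp
(defun my-delete-page-headers (first-prefix last-prefix)
  "Delete every five-line page header in the current buffer.
A header starts with a line beginning with FIRST-PREFIX and ends, four
lines later, with a line beginning with LAST-PREFIX.  The three lines
in between may be blank.  Return the number of headers deleted."
  (let ((regexp (concat "^" (regexp-quote first-prefix) ".*\n"
                        ;; exactly three arbitrary (possibly empty) lines
                        "\\(?:.*\n\\)\\{3\\}"
                        (regexp-quote last-prefix) ".*\n?"))
        (count 0))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward regexp nil t)
        (replace-match "" t t)
        (setq count (1+ count))))
    count))
```

Interactively, the same pattern can be typed into C-M-% (query-replace-regexp), entering each newline with C-q C-j. If the number of lines between the first and last header line varies, the fixed repeat count would have to be replaced by something looser.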
* Re: query-replace?
From: Peter Dyballa @ 2006-01-08 12:40 UTC
Cc: B. T. Raven, help-gnu-emacs

On 08.01.2006 at 13:11, Lennart Borgman wrote:

> Maybe the question was about multiple lines, not the newline
> character? Then perhaps
>
> http://www.emacswiki.org/cgi-bin/wiki/MultilineRegexp
>
> can help?

Oh yes: \n. The general approach. It could be that in Losedows
C-q C-m C-q C-j (CR LF) is needed ...

Nice site anyway.

--
Greetings

Pete

$ sumascii BILL GATES
 B   I   L   L   G   A   T   E   S
66+ 73+ 76+ 76+ 71+ 65+ 84+ 69+ 83 = 663
and add 3 because he's Bill Gates the third.
* Re: query-replace? 2006-01-08 12:40 ` query-replace? Peter Dyballa @ 2006-01-08 12:48 ` Lennart Borgman 2006-01-08 19:41 ` query-replace? Eli Zaretskii 1 sibling, 0 replies; 14+ messages in thread From: Lennart Borgman @ 2006-01-08 12:48 UTC (permalink / raw) Cc: B. T. Raven, help-gnu-emacs Peter Dyballa wrote: > > Am 08.01.2006 um 13:11 schrieb Lennart Borgman: > >> Maybe the question was about multiple lines, not the newline >> character? Then perhaps >> >> http://www.emacswiki.org/cgi-bin/wiki/MultilineRegexp >> >> can help? > > > Oh yes: \n. The general approach. Could be in Losedows C-q C-m C-q C- > j (CR LF) is needed ... > > Nice site anyway. Those unlucky dwelling in Losedows (like me) and even those more fortunate may need to handle files from both hell and heaven. Fortunately the mighty Emacs then can help because it can see from where the file arosed. But you must ask Emacs to do so with \n. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: query-replace?
From: Eli Zaretskii @ 2006-01-08 19:41 UTC

> From: Peter Dyballa <Peter_Dyballa@Web.DE>
> Date: Sun, 8 Jan 2006 13:40:32 +0100
> Cc: "B. T. Raven" <ecinmn@peoplepc.com>, help-gnu-emacs@gnu.org
>
> Could be in Losedows C-q C-m C-q C-j (CR LF) is needed ...

There're no CR characters in the buffer after the CR-LF file is read
into it, so no, C-m is not needed on Windows more than it is needed on
other platforms. Especially since Emacs gives the same treatment to
files with DOS CR-LF EOLs on all platforms.
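The end-of-line conversion described above can be observed directly by decoding a string the way a DOS-style file would be decoded. This is an illustrative sketch only; the variable name `my-on-disk` is hypothetical, and `iso-latin-1-dos` is just one of the coding systems whose name ends in "-dos".

```elisp
;; Bytes as they would sit in a CR-LF file on disk.
(let* ((my-on-disk "line one\r\nline two\r\n")
       ;; Decoding with a -dos coding system turns each CR LF into a
       ;; single newline, which is what ends up in the buffer.
       (in-buffer (decode-coding-string my-on-disk 'iso-latin-1-dos)))
  (list in-buffer
        ;; nil here means no CR character survived the decoding
        (string-match-p "\r" in-buffer)))
```

This is why a regexp only needs \n (or C-q C-j) to match a line break, regardless of the platform the file came from.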
* imput methods (was Re: query-replace?)
From: B. T. Raven @ 2006-01-08 16:17 UTC

"Peter Dyballa" <Peter_Dyballa@Web.DE> wrote in message
news:mailman.315.1136721918.26925.help-gnu-emacs@gnu.org...

> On 07.01.2006 at 19:04, B. T. Raven wrote:
>
> > i.
> > I have a character \234 in a file that should be displayed as an oe
> > ligature. I have tried M-% C-q (0)234 but this doesn't work for
> > getting it into the pattern to be replaced. How do I refer to this
> > character?
>
> With default read-quoted-char-radix being 8 it's C-q 2 3 4 <RET>. HEX
> input would be C-q 9 C <RET>, and decimal C-q 1 5 6 <RET>. I don't
> remember that this failed in any case ...
>
> A different approach would be to bind some function key to enter œ.
>
> > ii. Is it possible to include newlines in regexps? I have a five line
> > header on each page that begins with same characters on 1st line
> > and ends with same on last line. What is the most automated method of
> > deleting just these lines?
>
> C-q C-j.

Thanks, Peter and Lennart. For me it's easier to use leim, since I use
latin-4-postfix most of the time anyway. With this, the oe ligature is O&
or o&, which seems mnemonic, since the first component of the ampersand is
a Greek e (e-t). This brings up another question: when I type C-h I
(describe-input-method) and then "latin-3-postfix", all I see are the
empty rectangles again. This happens while looking at them with the .ttf
font arialuni, which certainly has the glyphs for Turkish characters. ???

Ed
* Re: imput methods (was Re: query-replace?)
From: Peter Dyballa @ 2006-01-08 17:05 UTC
Cc: help-gnu-emacs

On 08.01.2006 at 16:17, B. T. Raven wrote:

> This brings up another question: When I type C-h I
> (describe input method) and then "latin-3-postfix" all I see are
> the empty rectangles again. This happen while looking at them with
> the .ttf font arialuni, which certainly has the glyphs for Turkish
> characters. ???

Arial Unicode looks to be very complete in the many Latin and
Latin-Extended areas. Probably you just need to create fontsets:

    (message "Neue fontsets für X11")   ; "New fontsets for X11"
    (if (fboundp 'new-fontset)
        (progn
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;; Adobe Courier - Unicode encoded OpenType font, version 1.020, 374 glyphs
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
          (create-fontset-from-fontset-spec
           "-adobe-courier-medium-r-*-*-9-*-*-*-*-*-fontset-09pt_adobe_courier" t 'noerror)
          (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-1 '("adobe-courier" . "iso8859-1"))
          (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-2 '("adobe-courier" . "iso8859-2"))
          (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-3 '("adobe-courier" . "iso8859-3"))
          (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-4 '("adobe-courier" . "iso8859-4"))
          (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-9 '("adobe-courier" . "iso8859-9"))
          (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-14 '("adobe-courier" . "iso8859-14"))
          (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-15 '("adobe-courier" . "iso8859-15"))
    ;     (set-fontset-font "fontset-09pt_adobe_courier" 'latin-iso8859-16 '("adobe-courier" . "iso8859-16"))
          (set-fontset-font "fontset-09pt_adobe_courier" 'mule-unicode-0100-24ff '("adobe-courier" . "iso10646-1"))
          (set-fontset-font "fontset-09pt_adobe_courier" 'mule-unicode-2500-33ff '("adobe-courier" . "iso10646-1"))
          (set-fontset-font "fontset-09pt_adobe_courier" 'mule-unicode-e000-ffff '("adobe-courier" . "iso10646-1"))
          (set-fontset-font "fontset-09pt_adobe_courier"
                            (cons (decode-char 'ucs #x0370) (decode-char 'ucs #x03cf))
                            '("courier new" . "iso10646-1"))             ; Greek
          (set-fontset-font "fontset-09pt_adobe_courier"
                            (cons (decode-char 'ucs #x03d0) (decode-char 'ucs #x03ff))
                            '("lucida sans typewriter" . "iso10646-1"))  ; Coptic
          (set-fontset-font "fontset-09pt_adobe_courier"
                            (cons (decode-char 'ucs #x0400) (decode-char 'ucs #x04ff))
                            '("lucida sans typewriter" . "iso10646-1"))  ; Cyrillic
          (set-fontset-font "fontset-09pt_adobe_courier"
                            (cons (decode-char 'ucs #x0500) (decode-char 'ucs #x052f))
                            '("lucida sans typewriter" . "iso10646-1"))  ; Cyrillic Suppl.
          (set-fontset-font "fontset-09pt_adobe_courier"
                            (cons (decode-char 'ucs #x0530) (decode-char 'ucs #x058f))
                            '("aramian unicode" . "iso10646-1"))         ; Armenian (sylfaen
          (set-fontset-font "fontset-09pt_adobe_courier"
                            (cons (decode-char 'ucs #x0590) (decode-char 'ucs #x05ff))
                            '("courier new" . "iso10646-1"))             ; Hebrew
          (set-fontset-font "fontset-09pt_adobe_courier"
                            (cons (decode-char 'ucs #x0600) (decode-char 'ucs #x06ff))
                            '("lucida sans typewriter" . "iso10646-1"))  ; Arabic
          (set-fontset-font "fontset-09pt_adobe_courier"
                            (cons (decode-char 'ucs #x0700) (decode-char 'ucs #x074f))
                            '("courier new" . "iso10646-1"))             ; Syriac
          (set-fontset-font "fontset-09pt_adobe_courier"
                            (cons (decode-char 'ucs #x0780) (decode-char 'ucs #x07bf))
                            '("courier new" . "iso10646-1"))             ; Thaana
          (set-fontset-font "fontset-09pt_adobe_courier"
                            (cons (decode-char 'ucs #x0900) (decode-char 'ucs #x097f))
                            '("courier new" . "iso10646-1"))             ; Devanagari
          ))
    (provide 'site-fontsets-x11)

This is one template; the full version defines some more regions of
Unicode for this font, and of course some more sizes. Turkish is ISO
Latin-5 or ISO 8859-9, so the definition above should work for your case.

This works in X11. I don't know whether it works in Losedows or whether
it is necessary there at all ...

Try to find out which fonts GNU Emacs sees: M-x set-frame-font RET TAB
TAB RET, change to the *Completions* buffer and save it under a name you
have chosen beforehand! If you try to expand a partial file name, it will
erase the *Completions* buffer ...

--
Greetings

Pete

Mac OS X is like a wigwam: no fences, no gates, but an apache inside.
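Saving the *Completions* buffer is one way to see which fonts Emacs knows about; the list can also be queried from Lisp. The helper below is a hedged sketch, not part of the original advice: the name `my-matching-font-families` is hypothetical, and it relies on `font-family-list`, which is available in Emacs 23 and later and only returns useful data on a graphical display.

```elisp
(defun my-matching-font-families (regexp &optional families)
  "Return the sorted font family names that match REGEXP, ignoring case.
FAMILIES defaults to the families Emacs reports for the selected frame.
Hypothetical helper."
  (let ((case-fold-search t)
        (result '()))
    (dolist (family (or families (font-family-list)))
      (when (string-match-p regexp family)
        (push family result)))
    (sort result #'string<)))

;; Usage on a graphical frame:
;;   (my-matching-font-families "arial")
```

Passing an explicit list as the second argument makes it possible to try the function out in a terminal session, where no font families are reported.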
* fontsets: (was Re: query-replace?)
From: B. T. Raven @ 2006-01-08 22:18 UTC

"Peter Dyballa" <Peter_Dyballa@Web.DE> wrote in message
news:mailman.349.1136740073.26925.help-gnu-emacs@gnu.org...

> Arial Unicode looks to be very complete in the many Latin and
> Latin-Extended areas. Probably you just need to create fontsets:
>
> [...]
>
> Turkish is ISO Latin-5 or ISO 8859-9, so the definition above should
> work for your case.
>
> This works in X11. I don't know whether it works Losedows or whether
> this is necessary at all ...
>
> Try to find out which fonts GNU Emacs sees: M-x set-frame-font RET TAB
> TAB RET, change to *Completions* buffer and save it to a name you have
> determined before! If you try to expand a partial file name it will
> erase the *Completions* buffer ...

The part of interest is here:

    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-*-#130
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-big5
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-gb2312
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-iso10646-1
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-iso8859-1
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-iso8859-13
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-iso8859-2
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-iso8859-4
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-iso8859-5
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-iso8859-6
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-iso8859-7
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-iso8859-8
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-iso8859-9
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-jisx0201-katakana
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-jisx0201-latin
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-jisx0208-sjis
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-koi8-r
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-ksc5601.1987
    -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-tis620

So iso8859-3 isn't even there! How do I get it there? Is this related to
the codepage *.nls files again? I normally have w32-use-w32-font-dialog
set to t, so that I get the standard MS Windows font selection dialog box.
When I set it to nil I get what everyone else sees, I guess, which says
misc, courier, fontsets, and under these Lucida, Terminal, etc., but not
nearly as many as are in the \windows\fonts subdirectory. Right now all I
have relating to fonts in my .emacs is:

    (custom-set-faces
      ;; custom-set-faces was added by Custom -- don't edit or cut/paste it!
      ;; Your init file should contain only one such instance.
     '(default ((t (:stipple nil :background "ghostwhite" :foreground "black"
                    :inverse-video nil :box nil :strike-through nil
                    :overline nil :underline nil :slant normal
                    :weight normal :height 108 :width normal
                    :family "outline-arial unicode ms")))))

So that's what Emacs starts up with.

Since this font covers such a large swath of Unicode, I would rather stick
with the Losedows interface for now. I still don't understand fontsets.
Right now the dialog box shows fonts, styles, point size (8-72) and script
all in one place. If I went the fontset route and wanted only sizes 9-12
and only four different font styles, wouldn't I have to produce about 16
times as much Lisp code as you show above to get the same Unicode
coverage?

Ed
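On the question of code volume: the template need not be copied once per size, because the per-size part can be generated in a loop. The following is a rough sketch under stated assumptions, not code from the thread. The names `my-courier-charsets`, `my-fontset-name`, `my-fontset-spec` and `my-define-courier-fontsets` are hypothetical, only a few of the template's charset entries are repeated, and whether the fonts actually resolve depends on the system.

```elisp
(defvar my-courier-charsets
  '((latin-iso8859-1 . "iso8859-1")
    (latin-iso8859-2 . "iso8859-2")
    (latin-iso8859-9 . "iso8859-9")
    (mule-unicode-0100-24ff . "iso10646-1"))
  "Charset-to-registry pairs, a shortened copy of the template's list.")

(defun my-fontset-name (size)
  "Return the fontset name used for SIZE, e.g. fontset-09pt_adobe_courier."
  (format "fontset-%02dpt_adobe_courier" size))

(defun my-fontset-spec (size)
  "Return the fontset spec string for SIZE, following the template's pattern."
  (format "-adobe-courier-medium-r-*-*-%d-*-*-*-*-*-%s"
          size (my-fontset-name size)))

(defun my-define-courier-fontsets (sizes)
  "Create one fontset per element of SIZES and fill in its charsets."
  (dolist (size sizes)
    (create-fontset-from-fontset-spec (my-fontset-spec size) t 'noerror)
    ;; Only touch the fontset if it really exists now.
    (when (query-fontset (my-fontset-name size))
      (dolist (entry my-courier-charsets)
        (set-fontset-font (my-fontset-name size) (car entry)
                          (cons "adobe-courier" (cdr entry)))))))

;; Fontsets only make sense on a graphical display.
(when (display-graphic-p)
  (my-define-courier-fontsets '(9 10 11 12)))
```

With this shape, adding a size means adding one number to the list, and adding a script means adding one pair to `my-courier-charsets`.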
* Re: fontsets: (was Re: query-replace?)
From: Peter Dyballa @ 2006-01-09 11:53 UTC
Cc: help-gnu-emacs

On 08.01.2006 at 22:18, B. T. Raven wrote:

> Since this font covers such a large swath of Unicode I would rather
> stick with the Losedows interface for now. I still don't understand
> fontsets yet. Now the dialog box shows fonts, styles, point size (8-72)
> and script all in one place. If I went the the fontset route and I
> wanted only sizes 9-12 and only four different font styles, wouldn't I
> have to produced about 16 times as much lisp code as you show above to
> get the same Unicode coverage?

The font variants (italic, bold, bold-italic) are chosen automatically,
so four sets for 9, 10, 11, and 12 would suffice.

Fontsets are actually necessary when you handle texts that have more
different characters than the small MS or ISO encodings provide. GNU
Emacs then needs to create a table that maps code points (characters) to
members of fonts (glyphs), and its own choice can look ugly. You help GNU
Emacs when you construct a fontset.

IMO it would be enough to create a fontset like this for 9 pt and three
others for 10, 11, and 12 pt:

    (create-fontset-from-fontset-spec
     "-outline-Arial Unicode MS-normal-r-*-*-9-*-*-*-*-*-fontset-09pt_arial_UC" t 'noerror)
    (set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-1 '("Arial Unicode MS" . "iso8859-1"))
    (set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-2 '("Arial Unicode MS" . "iso8859-2"))
    (set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-4 '("Arial Unicode MS" . "iso8859-4"))
    (set-fontset-font "fontset-09pt_arial_UC" 'cyrillic-iso8859-5 '("Arial Unicode MS" . "iso8859-5"))
    (set-fontset-font "fontset-09pt_arial_UC" 'arabic-iso8859-6 '("Arial Unicode MS" . "iso8859-6"))
    (set-fontset-font "fontset-09pt_arial_UC" 'greek-iso8859-7 '("Arial Unicode MS" . "iso8859-7"))
    (set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-8 '("Arial Unicode MS" . "iso8859-8"))
    (set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-9 '("Arial Unicode MS" . "iso8859-9"))
    (set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-13 '("Arial Unicode MS" . "iso8859-13"))
    (set-fontset-font "fontset-09pt_arial_UC" 'mule-unicode-0100-24ff '("Arial Unicode MS" . "iso10646-1"))
    (set-fontset-font "fontset-09pt_arial_UC" 'mule-unicode-2500-33ff '("Arial Unicode MS" . "iso10646-1"))
    (set-fontset-font "fontset-09pt_arial_UC" 'mule-unicode-e000-ffff '("Arial Unicode MS" . "iso10646-1"))

What about Lucida Console (666 glyphs, 714 mappings)? It can display ISO
8859-3 completely in X11 (Arial Unicode MS has 51,180 glyphs and 38,933
mappings, and in X11 it has an ISO 8859-3 encoding!). It's even
monospaced. There is another monospaced font on the Web: Lucida Sans
Typewriter (1,376 glyphs, 1,425 mappings). It's part of the Java SDKs
(starting with Java 1.4 the Lucida fonts were reduced in variants, so it's
worth retrieving JDK 1.3 first and updating some of these fonts with the
1.4 and/or 1.5 fonts). The JDKs also have Lucida Sans (2,929 glyphs, 2,410
mappings).

Probably you need some ISO 8859-3 encoding file. *I* have no idea where
in MS Losedows this would be needed. Somewhere in the machinery that
creates partial, specifically named encodings from a Unicode-encoded
font?

If it does not work in a *partial* encoding: would *complete* Unicode
succeed?!

    ;;; -*- mode: Text; coding: utf-8; -*-

First open the file in ISO 8859-3, then choose to save it in UTF-8:
conversion done!

    http://aspell.net/charsets/
    http://www.slovo.info/unifonts.htm
    http://www.cs.tut.fi/%7Ejkorpela/chars.html
    http://www.topology.org/soft/alpha.html
    http://www.i18nguy.com/
    http://web.archive.org/web/20030622083607/www.diffuse.org/chars.html

How do you declare ISO Latin-3 or ISO 8859-3?
This is meant for Southern European, Maltese, and Esperanto glyphs, very
exotic! Or do you live on Malta? Here is my test file for this encoding,
starting with a hint for GNU Emacs:

;;; -*- mode: Text; coding: iso-8859-3; -*-
;
; Time-stamp: <2005-07-15 14:20:24 pete>
;
; Southern European, Maltese and Esperanto Glyphs (Latin 3)
;
;   oct   dec   hex   UCS2     UTF-8
;=====================================
  = 240 = 160 = A0 = U+00A0 = C2 A0 : NO-BREAK SPACE
Ħ = 241 = 161 = A1 = U+0126 = C4 A6 : LATIN CAPITAL LETTER H WITH STROKE
˘ = 242 = 162 = A2 = U+02D8 = CB 98 : BREVE
£ = 243 = 163 = A3 = U+00A3 = C2 A3 : POUND SIGN
¤ = 244 = 164 = A4 = U+00A4 = C2 A4 : CURRENCY SIGN
Ĥ = 246 = 166 = A6 = U+0124 = C4 A4 : LATIN CAPITAL LETTER H WITH CIRCUMFLEX
§ = 247 = 167 = A7 = U+00A7 = C2 A7 : SECTION SIGN
¨ = 250 = 168 = A8 = U+00A8 = C2 A8 : DIAERESIS
İ = 251 = 169 = A9 = U+0130 = C4 B0 : LATIN CAPITAL LETTER I WITH DOT ABOVE
Ş = 252 = 170 = AA = U+015E = C5 9E : LATIN CAPITAL LETTER S WITH CEDILLA
Ğ = 253 = 171 = AB = U+011E = C4 9E : LATIN CAPITAL LETTER G WITH BREVE
Ĵ = 254 = 172 = AC = U+0134 = C4 B4 : LATIN CAPITAL LETTER J WITH CIRCUMFLEX
­ = 255 = 173 = AD = U+00AD = C2 AD : SOFT HYPHEN
Ż = 257 = 175 = AF = U+017B = C5 BB : LATIN CAPITAL LETTER Z WITH DOT ABOVE
° = 260 = 176 = B0 = U+00B0 = C2 B0 : DEGREE SIGN
ħ = 261 = 177 = B1 = U+0127 = C4 A7 : LATIN SMALL LETTER H WITH STROKE
² = 262 = 178 = B2 = U+00B2 = C2 B2 : SUPERSCRIPT TWO
³ = 263 = 179 = B3 = U+00B3 = C2 B3 : SUPERSCRIPT THREE
´ = 264 = 180 = B4 = U+00B4 = C2 B4 : ACUTE ACCENT
µ = 265 = 181 = B5 = U+00B5 = C2 B5 : MICRO SIGN
ĥ = 266 = 182 = B6 = U+0125 = C4 A5 : LATIN SMALL LETTER H WITH CIRCUMFLEX
· = 267 = 183 = B7 = U+00B7 = C2 B7 : MIDDLE DOT
¸ = 270 = 184 = B8 = U+00B8 = C2 B8 : CEDILLA
ı = 271 = 185 = B9 = U+0131 = C4 B1 : LATIN SMALL LETTER DOTLESS I
ş = 272 = 186 = BA = U+015F = C5 9F : LATIN SMALL LETTER S WITH CEDILLA
ğ = 273 = 187 = BB = U+011F = C4 9F : LATIN SMALL LETTER G WITH BREVE
ĵ = 274 = 188 = BC = U+0135 = C4 B5 : LATIN SMALL LETTER J WITH CIRCUMFLEX
½ = 275 = 189 = BD = U+00BD = C2 BD : VULGAR FRACTION ONE HALF
ż = 277 = 191 = BF = U+017C = C5 BC : LATIN SMALL LETTER Z WITH DOT ABOVE
À = 300 = 192 = C0 = U+00C0 = C3 80 : LATIN CAPITAL LETTER A WITH GRAVE
Á = 301 = 193 = C1 = U+00C1 = C3 81 : LATIN CAPITAL LETTER A WITH ACUTE
Â = 302 = 194 = C2 = U+00C2 = C3 82 : LATIN CAPITAL LETTER A WITH CIRCUMFLEX
Ä = 304 = 196 = C4 = U+00C4 = C3 84 : LATIN CAPITAL LETTER A WITH DIAERESIS
Ċ = 305 = 197 = C5 = U+010A = C4 8A : LATIN CAPITAL LETTER C WITH DOT ABOVE
Ĉ = 306 = 198 = C6 = U+0108 = C4 88 : LATIN CAPITAL LETTER C WITH CIRCUMFLEX
Ç = 307 = 199 = C7 = U+00C7 = C3 87 : LATIN CAPITAL LETTER C WITH CEDILLA
È = 310 = 200 = C8 = U+00C8 = C3 88 : LATIN CAPITAL LETTER E WITH GRAVE
É = 311 = 201 = C9 = U+00C9 = C3 89 : LATIN CAPITAL LETTER E WITH ACUTE
Ê = 312 = 202 = CA = U+00CA = C3 8A : LATIN CAPITAL LETTER E WITH CIRCUMFLEX
Ë = 313 = 203 = CB = U+00CB = C3 8B : LATIN CAPITAL LETTER E WITH DIAERESIS
Ì = 314 = 204 = CC = U+00CC = C3 8C : LATIN CAPITAL LETTER I WITH GRAVE
Í = 315 = 205 = CD = U+00CD = C3 8D : LATIN CAPITAL LETTER I WITH ACUTE
Î = 316 = 206 = CE = U+00CE = C3 8E : LATIN CAPITAL LETTER I WITH CIRCUMFLEX
Ï = 317 = 207 = CF = U+00CF = C3 8F : LATIN CAPITAL LETTER I WITH DIAERESIS
Ñ = 321 = 209 = D1 = U+00D1 = C3 91 : LATIN CAPITAL LETTER N WITH TILDE
Ò = 322 = 210 = D2 = U+00D2 = C3 92 : LATIN CAPITAL LETTER O WITH GRAVE
Ó = 323 = 211 = D3 = U+00D3 = C3 93 : LATIN CAPITAL LETTER O WITH ACUTE
Ô = 324 = 212 = D4 = U+00D4 = C3 94 : LATIN CAPITAL LETTER O WITH CIRCUMFLEX
Ġ = 325 = 213 = D5 = U+0120 = C4 A0 : LATIN CAPITAL LETTER G WITH DOT ABOVE
Ö = 326 = 214 = D6 = U+00D6 = C3 96 : LATIN CAPITAL LETTER O WITH DIAERESIS
× = 327 = 215 = D7 = U+00D7 = C3 97 : MULTIPLICATION SIGN
Ĝ = 330 = 216 = D8 = U+011C = C4 9C : LATIN CAPITAL LETTER G WITH CIRCUMFLEX
Ù = 331 = 217 = D9 = U+00D9 = C3 99 : LATIN CAPITAL LETTER U WITH GRAVE
Ú = 332 = 218 = DA = U+00DA = C3 9A : LATIN CAPITAL LETTER U WITH ACUTE
Û = 333 = 219 = DB = U+00DB = C3 9B : LATIN CAPITAL LETTER U WITH CIRCUMFLEX
Ü = 334 = 220 = DC = U+00DC = C3 9C : LATIN CAPITAL LETTER U WITH DIAERESIS
Ŭ = 335 = 221 = DD = U+016C = C5 AC : LATIN CAPITAL LETTER U WITH BREVE
Ŝ = 336 = 222 = DE = U+015C = C5 9C : LATIN CAPITAL LETTER S WITH CIRCUMFLEX
ß = 337 = 223 = DF = U+00DF = C3 9F : LATIN SMALL LETTER SHARP S
à = 340 = 224 = E0 = U+00E0 = C3 A0 : LATIN SMALL LETTER A WITH GRAVE
á = 341 = 225 = E1 = U+00E1 = C3 A1 : LATIN SMALL LETTER A WITH ACUTE
â = 342 = 226 = E2 = U+00E2 = C3 A2 : LATIN SMALL LETTER A WITH CIRCUMFLEX
ä = 344 = 228 = E4 = U+00E4 = C3 A4 : LATIN SMALL LETTER A WITH DIAERESIS
ċ = 345 = 229 = E5 = U+010B = C4 8B : LATIN SMALL LETTER C WITH DOT ABOVE
ĉ = 346 = 230 = E6 = U+0109 = C4 89 : LATIN SMALL LETTER C WITH CIRCUMFLEX
ç = 347 = 231 = E7 = U+00E7 = C3 A7 : LATIN SMALL LETTER C WITH CEDILLA
è = 350 = 232 = E8 = U+00E8 = C3 A8 : LATIN SMALL LETTER E WITH GRAVE
é = 351 = 233 = E9 = U+00E9 = C3 A9 : LATIN SMALL LETTER E WITH ACUTE
ê = 352 = 234 = EA = U+00EA = C3 AA : LATIN SMALL LETTER E WITH CIRCUMFLEX
ë = 353 = 235 = EB = U+00EB = C3 AB : LATIN SMALL LETTER E WITH DIAERESIS
ì = 354 = 236 = EC = U+00EC = C3 AC : LATIN SMALL LETTER I WITH GRAVE
í = 355 = 237 = ED = U+00ED = C3 AD : LATIN SMALL LETTER I WITH ACUTE
î = 356 = 238 = EE = U+00EE = C3 AE : LATIN SMALL LETTER I WITH CIRCUMFLEX
ï = 357 = 239 = EF = U+00EF = C3 AF : LATIN SMALL LETTER I WITH DIAERESIS
ñ = 361 = 241 = F1 = U+00F1 = C3 B1 : LATIN SMALL LETTER N WITH TILDE
ò = 362 = 242 = F2 = U+00F2 = C3 B2 : LATIN SMALL LETTER O WITH GRAVE
ó = 363 = 243 = F3 = U+00F3 = C3 B3 : LATIN SMALL LETTER O WITH ACUTE
ô = 364 = 244 = F4 = U+00F4 = C3 B4 : LATIN SMALL LETTER O WITH CIRCUMFLEX
ġ = 365 = 245 = F5 = U+0121 = C4 A1 : LATIN SMALL LETTER G WITH DOT ABOVE
ö = 366 = 246 = F6 = U+00F6 = C3 B6 : LATIN SMALL LETTER O WITH DIAERESIS
÷ = 367 = 247 = F7 = U+00F7 = C3 B7 : DIVISION SIGN
ĝ = 370 = 248 = F8 = U+011D = C4 9D : LATIN SMALL LETTER G WITH CIRCUMFLEX
ù = 371 = 249 = F9 = U+00F9 = C3 B9 : LATIN SMALL LETTER U WITH GRAVE
ú = 372 = 250 = FA = U+00FA = C3 BA : LATIN SMALL LETTER U WITH ACUTE
û = 373 = 251 = FB = U+00FB = C3 BB : LATIN SMALL LETTER U WITH CIRCUMFLEX
ü = 374 = 252 = FC = U+00FC = C3 BC : LATIN SMALL LETTER U WITH DIAERESIS
ŭ = 375 = 253 = FD = U+016D = C5 AD : LATIN SMALL LETTER U WITH BREVE
ŝ = 376 = 254 = FE = U+015D = C5 9D : LATIN SMALL LETTER S WITH CIRCUMFLEX
˙ = 377 = 255 = FF = U+02D9 = CB 99 : DOT ABOVE

--
Greetings

Pete

The human brain operates at only 10% of its capacity. The rest is
overhead for the operating system.
* Re: fontsets: (was Re: query-replace?)
From: B. T. Raven @ 2006-01-10 6:12 UTC

"Peter Dyballa" <Peter_Dyballa@Web.DE> wrote in message news:mailman.445.1136812310.26925.help-gnu-emacs@gnu.org...

On 08.01.2006 at 22:18, B. T. Raven wrote:

> Since this font covers such a large swath of Unicode I would rather stick
> with the Losedows interface for now. I still don't understand fontsets
> yet. Now the dialog box shows fonts, styles, point size (8-72) and script
> all in one place. If I went the fontset route and I wanted only sizes
> 9-12 and only four different font styles, wouldn't I have to produce
> about 16 times as much lisp code as you show above to get the same Unicode
> coverage?

The font variants (italic, bold, bold-italic) are automatically chosen, so four sets for 9, 10, 11, and 12 would suffice.

Fontsets actually are necessary when you handle texts that have more different characters than the small MS or ISO encodings provide. Then GNU Emacs needs to create a table that maps code points (characters) to members in fonts (glyphs) and this choice can look ugly. You help GNU Emacs when you construct a fontset. IMO it would be enough to create a fontset like this for 9 pt and three others for 10, 11, and 12 pt:

(create-fontset-from-fontset-spec
 "-outline-Arial Unicode MS-normal-r-*-*-9-*-*-*-*-*-fontset-09pt_arial_UC"
 t 'noerror)
(set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-1 '("Arial Unicode MS" . "iso8859-1"))
(set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-2 '("Arial Unicode MS" . "iso8859-2"))
(set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-4 '("Arial Unicode MS" . "iso8859-4"))
(set-fontset-font "fontset-09pt_arial_UC" 'cyrillic-iso8859-5 '("Arial Unicode MS" . "iso8859-5"))
(set-fontset-font "fontset-09pt_arial_UC" 'arabic-iso8859-6 '("Arial Unicode MS" . "iso8859-6"))
(set-fontset-font "fontset-09pt_arial_UC" 'greek-iso8859-7 '("Arial Unicode MS" . "iso8859-7"))
(set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-8 '("Arial Unicode MS" . "iso8859-8"))
(set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-9 '("Arial Unicode MS" . "iso8859-9"))
(set-fontset-font "fontset-09pt_arial_UC" 'latin-iso8859-13 '("Arial Unicode MS" . "iso8859-13"))
(set-fontset-font "fontset-09pt_arial_UC" 'mule-unicode-0100-24ff '("Arial Unicode MS" . "iso10646-1"))
(set-fontset-font "fontset-09pt_arial_UC" 'mule-unicode-2500-33ff '("Arial Unicode MS" . "iso10646-1"))
(set-fontset-font "fontset-09pt_arial_UC" 'mule-unicode-e000-ffff '("Arial Unicode MS" . "iso10646-1"))

The example most similar to this at the Gnu Windows FAQ shows a long string in the (create-fontset-from-fontset-spec "....") function. Is the above to be evaluated as 15 forms or does another paren go on the end? Is each fontset definition in a file somewhere or does all this go into the .emacs? Anyway, I saved it to a file.

What about Lucida Console (666 glyphs, 714 mappings)? It can display ISO 8859-3 in X11 completely (Arial Unicode MS has 51,180 glyphs and 38,933 mappings -- and in X11 it has an ISO 8859-3 encoding!). It's even monospaced. There is another monospaced font on the Web: Lucida Sans Typewriter (1,376 glyphs, 1,425 mappings). It's part of the Java SDKs (starting with Java 1.4 the Lucida fonts were reduced in variants, so it's worth retrieving JDK 1.3 first and updating some of these fonts with 1.4 and/or 1.5 fonts). The JDKs too have Lucida Sans (2,929 glyphs, 2,410 mappings).

I think Windows has those encodings too. I don't know why I can't see the glyphs in emacs. In Eli Z.'s codepage.el the Latin-3 encoding is associated with cp857 and I have a cp_857.nls file in \windows\system (win98). Also, I see all the characters in your table below.
I can get them into emacs, but by a roundabout method. Describe input method latin-3-postfix shows only empty rectangles.

Probably you need some ISO 8859-3 encoding file. *I* have no idea where in MS Losedows this would be needed, somewhere in the machinery that creates partial, specifically named encodings from a Unicode encoded font? If it does not work in a *partial* encoding: would *complete* Unicode succeed?!

This is what I think the locale file cp_857.nls does. I had to add some of these files to get emacs i18n working to the degree I have now.

;;; -*- mode: Text; coding: utf-8; -*-

First open in ISO 8859-3, then select to save in UTF-8 -- conversion done!

But remember that some of us are lost in Dozeland. I edit with emacs, almost everything else is done with other programs. I can get the table below into emacs but only by copy-pasting into Open Office and then saving as a utf-8 encoded text file. I am in Outlook Express here. Don't know how to use Gnus, Rmail, most other things.

http://aspell.net/charsets/
http://www.slovo.info/unifonts.htm
http://www.cs.tut.fi/%7Ejkorpela/chars.html
http://www.topology.org/soft/alpha.html
http://www.i18nguy.com/
http://web.archive.org/web/20030622083607/www.diffuse.org/chars.html

Thanks for these. The i18nguy has some very impressive stuff (and useful links). Just this one: http://www.i18nguy.com/unicode/codepages.html is a treasure trove.

How do you declare ISO Latin-3 or ISO 8859-3? This is meant for Southern European, Maltese, and Esperanto Glyphs, very exotic! Or do you live on Malta? Here is my test file for this encoding, starting with a hint for GNU Emacs:

No, in U.S. In Eli's codepage.el this block of chars (or glyphs) is headed by ;; Turkish. This goes with DOS code page 857.
;;; -*- mode: Text; coding: iso-8859-3; -*-
;
; Time-stamp: <2005-07-15 14:20:24 pete>
;
; Southern European, Maltese and Esperanto Glyphs (Latin 3)
;
;     oct   dec   hex   UCS2     UTF-8
;=====================================
  = 240 = 160 = A0 = U+00A0 = C2 A0 : NO-BREAK SPACE
Ħ = 241 = 161 = A1 = U+0126 = C4 A6 : LATIN CAPITAL LETTER H WITH STROKE
˘ = 242 = 162 = A2 = U+02D8 = CB 98 : BREVE
£ = 243 = 163 = A3 = U+00A3 = C2 A3 : POUND SIGN
¤ = 244 = 164 = A4 = U+00A4 = C2 A4 : CURRENCY SIGN
Ĥ = 246 = 166 = A6 = U+0124 = C4 A4 : LATIN CAPITAL LETTER H WITH CIRCUMFLEX
§ = 247 = 167 = A7 = U+00A7 = C2 A7 : SECTION SIGN
. . . etc.

Apparently your emacs is using Unicode as its internal representation. I can't do that with 21.3. Thanks for all the food for thought.

Ed
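The disagreement above about whether this block is "Latin-3" or "Turkish / code page 857" can be checked outside Emacs. The following is only a rough sketch in Python, not something taken from the thread: the function names `interpretations` and `latin3_to_utf8` are made up for illustration, while the codec names `iso8859_3`, `iso8859_9` and `cp857` are the ones Python's standard library ships. It decodes the same 8-bit codes under each encoding, and then does the "open as ISO 8859-3, save as UTF-8" conversion on raw bytes.

```python
# Hypothetical helper names; the codec names are Python's standard ones.
CODECS = ("iso8859_3", "iso8859_9", "cp857")


def interpretations(raw):
    """Decode the same 8-bit bytes under each candidate encoding."""
    return {codec: raw.decode(codec) for codec in CODECS}


def latin3_to_utf8(raw):
    """Re-encode ISO 8859-3 bytes as UTF-8 (open as Latin-3, save as UTF-8)."""
    return raw.decode("iso8859_3").encode("utf-8")


if __name__ == "__main__":
    # Octal 241 and 246 from the test file, i.e. bytes 0xA1 and 0xA6.
    for codec, text in interpretations(b"\xa1\xa6").items():
        print(f"{codec:10} -> {text}")
    # The UTF-8 bytes for code 0xA1 read as Latin-3.
    print(latin3_to_utf8(b"\xa1").hex(" "))
```

Run on a terminal that can show the glyphs, this should make it visible that the three encodings assign different characters to the same byte, so a file only displays correctly when the declared coding matches the one it was written in.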
* Re: fontsets: (was Re: query-replace?)
From: Peter Dyballa @ 2006-01-10 10:29 UTC
Cc: help-gnu-emacs

On 10.01.2006 at 06:12, B. T. Raven wrote:

> No, in U.S. In Eli's codepage.el this block of chars (or glyphs) is headed
> by ;; Turkish. This goes with DOS code page 857.

I don't know so much about DOS code pages, I stick to standards:

ISO 8859-1: Western European Glyphs (Latin 1)
ISO 8859-2: Central and Eastern European Glyphs (Latin 2)
ISO 8859-3: Southern European, Maltese, and Esperanto Glyphs (Latin 3)
ISO 8859-4: Northern European Glyphs (Latin 4)
ISO 8859-5: Cyrillic Glyphs
ISO 8859-6: Arabic Glyphs
ISO 8859-7: Modern Greek Glyphs (ELOT928)
ISO 8859-8: Hebrew Glyphs
ISO 8859-9: Turkish Glyphs (Latin 5)
ISO 8859-10: New Nordic Glyphs: Saami, Inuit, Icelandic (Latin 6)
ISO 8859-11: Thai Glyphs
ISO 8859-13: Baltic Glyphs (Latin 7)
ISO 8859-14: Celtic Glyphs (Latin 8)
ISO 8859-15: Western European Glyphs with € (Latin 9, Latin 0)
ISO 8859-16: South-Eastern European Glyphs with €, Romanian (Latin 10)

If you want I can send you my 8 bit Turkish test file in ISO Latin-5. Which should work because you have -outline-Arial Unicode MS-normal-r-normal-normal-*-*-96-96-p-*-iso8859-5.

> Apparently your emacs is using Unicode as its internal representation. I
> can't do that with 21.3. Thanks for all the food for thought.

No. I took the excerpt off GNU Emacs 22.0.50, the one that is coming closer to Unicode every day. The important thing is that the file has the right code values in the leftmost column. With the encoding line Emacs tries to present these codes as characters belonging to this set. Then I have the right fontset defined so that the right glyphs are chosen from the font(s). The codes in the leftmost column are still 8 bit!
Maybe you know printf, a function in C that is used in UNIX as a modern substitute for echo and is also used in script languages like awk or Perl. In Perl

    printf "%c = %d = %o = %x\n", 234, 234, 234, 234;

would create the left columns of one line (in a loop it can become more lines); the others come from a programme that can translate ISO to UTF-8 representation or look up the Unicode position for that character. The descriptions, I think, are taken from a file in the Kermit distribution. In GNU Emacs they all fused together.

-- 
Greetings

Pete

Eat the rich -- the poor are tough and stringy.
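For readers who want to regenerate such a table in one step, the printf idea above can be sketched in Python. This is an illustration under stated assumptions, not the script Pete used: the function name `latin3_row` is hypothetical, the character names come from Python's `unicodedata` module rather than from the Kermit distribution, and the column order (octal, decimal, hex) follows the test file rather than the Perl one-liner.

```python
import unicodedata


def latin3_row(code):
    """Build one table row for an 8-bit ISO 8859-3 code, or None if unassigned."""
    try:
        char = bytes([code]).decode("iso8859_3")
    except UnicodeDecodeError:
        # ISO 8859-3 leaves a few positions (0xA5, for instance) undefined.
        return None
    # UTF-8 representation as space-separated upper-case hex bytes.
    utf8 = " ".join(f"{byte:02X}" for byte in char.encode("utf-8"))
    # Control characters have no Unicode name, hence the fallback string.
    name = unicodedata.name(char, "<no name>")
    return (f"{char} = {code:o} = {code:d} = {code:02X} = "
            f"U+{ord(char):04X} = {utf8} : {name}")


if __name__ == "__main__":
    # Upper half of the 8-bit range, where Latin-3 differs from ASCII.
    for code in range(0xA0, 0x100):
        row = latin3_row(code)
        if row is not None:
            print(row)
```

Skipping the undefined positions is a design choice that matches the gaps in the table (there is no row for octal 245, for example); replacing `"iso8859_3"` with another codec name would give the corresponding table for a different 8-bit set.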