* bug#7786: 23.2; Encoding of PostScript files
From: Peter Dyballa @ 2011-01-05  0:18 UTC
To: 7786

Hello!

When I open a PostScript file it's opened "(encoded by coding system
undecided-unix)" – as the *Help* buffer explains after invocation of
C-u C-x =.

This is incorrect, because, as the PLRM, The PostScript® Language
Reference manual, explains in a footnote near the end, on encodings:

	3. The ISOLatin1Encoding encoding vector deviates from the ISO
	8859-1 standard in one respect: the character at position 140 is
	quoteleft, whereas the ISO standard specifies grave. A PostScript
	program needing to conform exactly to the ISO standard should
	create a modified encoding vector with this entry changed.

So what is displayed in the buffer as

	character: ` (96, #o140, #x60)

is in reality, printed on some medium or on screen,

	character: ‘ (8216, #o20030, #x2018)

or: instead of /grave the character /quoteleft is encoded here.

IMO GNU Emacs should open a PostScript file in
adobe-standard-encoding, except it sees in the file that the font(s)
used is (are) re-encoded in ISOLatin1Encoding (which is *not* the same
as ISO 8859-1), CE Encoding, or whatever.


In GNU Emacs 23.2.1 (powerpc-apple-darwin9.8.0, X toolkit, Xaw3d scroll bars)
 of 2010-08-01 on Latsche.local
Windowing system distributor `The X.Org Foundation', version 11.0.10903000
configured using `configure '--without-sound' '--without-dbus'
 '--without-pop' '--without-gconf' '--with-x-toolkit=athena'
 '--x-libraries=/usr/X11/lib' '--x-includes=/usr/X11/include'
 '--enable-locallisppath=/Library/Application Support/Emacs/calendar23:/Library/Application Support/Emacs'
 'CFLAGS=-H -Wno-pointer-sign -pipe -fPIC -fno-common -mcpu=7450 -mtune=7450 -faltivec -fast'
 'CPPFLAGS=' 'LDFLAGS=' 'CC=gcc-4.2' 'CPP=cpp-4.2''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: de_DE.UTF-8
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: de_DE.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: PostScript

Minor modes in effect:
  doc-view-minor-mode: t
  tooltip-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t
  view-mode: t

Recent input:
<down-mouse-1> <mouse-1> C-x d <return> <escape> < s
<down> <down> <down> <down> <down> <down> <down> <down>
v <end> <escape> > <prior> <prior> M-x d e s c r i
b <tab> e n c o <tab> <backspace> <backspace> c <tab>
<backspace> <backspace> <backspace> <tab> c h a r <tab>
a <tab> <return> C-g M-x d e s c r i b e - <tab> c
o d <tab> <return> <return> <help-echo> <prior> <prior>
<prior> <prior> <prior> <prior> <next> <next> <down>
<down> <right> <right> <right> <right> <right> <right>
<right> <right> <right> <right> <right> C-u C-x = <help-echo>
<help-echo> <menu-bar> <PostScript> <Cookbook> <ISOLatin1Extended>
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo>
<help-echo> <help-echo> <help-echo> <menu-bar> <help-menu>
<send-emacs-bug-report>

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Mark set
Type C-c C-c to toggle between editing or viewing the document.
View mode: type C-h for help, h for commands, q to quit.
Mark set
Making completion list...
Quit
Making completion list...
Char: ` (96, #o140, #x60) point=36185 of 39534 (92%) column=11
call-interactively: Buffer is read-only: #<buffer man_ascii.ps>

--
With peaceful greetings

  Pete

People designing completely foolproof things commonly make the
mistake of underestimating the ingenuity of complete fools.
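The numbers in the report can be checked mechanically. The following is a minimal Python sketch (Python is not used anywhere in this thread, and the variable names are purely illustrative) confirming that "position 140" in the quoted footnote is octal, and that the two characters the report contrasts are U+0060 and U+2018.

```python
# "Position 140" in the quoted PLRM footnote is an octal position.
grave_pos = 0o140

# What a byte at that position becomes when read as ASCII / ISO 8859-1,
# which is what the *Help* buffer in the report shows.
as_iso = bytes([grave_pos]).decode("latin-1")

# What the report says is actually rendered there (/quoteleft).
quoteleft = "\u2018"

for ch in (as_iso, quoteleft):
    # Same three notations that C-u C-x = prints: decimal, octal, hex.
    print(f"{ch!r}: {ord(ch)}, #o{ord(ch):o}, #x{ord(ch):x}")
```

The gap between the two code points is the whole issue: the byte on disk is identical, only its interpretation differs.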
* bug#7786: 23.2; Encoding of PostScript files
From: Lars Ingebrigtsen @ 2021-01-20 18:02 UTC
To: Peter Dyballa; +Cc: 7786

Peter Dyballa <Peter_Dyballa@Freenet.DE> writes:

> When I open a PostScript file it's opened "(encoded by coding system
> undecided-unix)" – as the *Help* buffer explains after invocation of
> C-u C-x =.
>
> This is incorrect, because, as the PLRM, The PostScript® Language
> Reference manual, explains in a footnote near the end, on encodings:
>
> 	3. The ISOLatin1Encoding encoding vector deviates from the ISO
> 	8859-1 standard in one respect: the character at position 140 is
> 	quoteleft, whereas the ISO standard specifies grave. A PostScript
> 	program needing to conform exactly to the ISO standard should
> 	create a modified encoding vector with this entry changed.

[...]

> IMO GNU Emacs should open a PostScript file in
> adobe-standard-encoding, except it sees in the file that the font(s)
> used is (are) re-encoded in ISOLatin1Encoding (which is *not* the same
> as ISO 8859-1), CE Encoding, or whatever.

(I'm going through old bug reports that unfortunately got no response at
the time.)

I'm not quite sure I understand the final paragraph there, but the
suggestion is that .ps files should be opened with
`adobe-standard-encoding' and not `iso-latin-1' if there are non-ASCII
characters in the file?

Anybody got any comments on that?

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#7786: 23.2; Encoding of PostScript files
From: Lars Ingebrigtsen @ 2021-06-02  8:39 UTC
To: Peter Dyballa; +Cc: 7786

Peter Dyballa <Peter_Dyballa@Freenet.DE> writes:

> IMO GNU Emacs should open a PostScript file in
> adobe-standard-encoding, except it sees in the file that the font(s)
> used is (are) re-encoded in ISOLatin1Encoding (which is *not* the same
> as ISO 8859-1), CE Encoding, or whatever.

I took a first stab at this, but this is obviously not correct.  I'm not
sure how to detect whether it's an ISOLatin1Encoding file?  And... this
will probably make the file opened like this be saved in utf-8,
which isn't what we want...

diff --git a/lisp/international/mule-conf.el b/lisp/international/mule-conf.el
index 2d36dab632..dc936ba2c2 100644
--- a/lisp/international/mule-conf.el
+++ b/lisp/international/mule-conf.el
@@ -1637,6 +1637,7 @@ 'utf-7-imap
 	("\\.el\\'" . prefer-utf-8)
 	("\\.utf\\(-8\\)?\\'" . utf-8)
 	("\\.xml\\'" . xml-find-file-coding-system)
+	("\\.ps\\'" . ps-find-file-coding-system)
 	;; We use raw-text for reading loaddefs.el so that if it
 	;; happens to have DOS or Mac EOLs, they are converted to
 	;; newlines.  This is required to make the special treatment
diff --git a/lisp/international/mule.el b/lisp/international/mule.el
index 9cd38afd8b..6efdaba6e8 100644
--- a/lisp/international/mule.el
+++ b/lisp/international/mule.el
@@ -2511,6 +2511,17 @@ sgml-html-meta-auto-coding-function
 	      (message "Warning: unknown coding system \"%s\"" match)
 	      nil)))))
 
+(defun ps-find-file-coding-system (args)
+  (if (not (eq (car args) 'insert-file-contents))
+      'undecided
+    (let ((coding-system
+           (coding-system-base
+            (detect-coding-region (point-min) (point-max) t))))
+      ;; If it's an ASCII file, then interpret ` specially.
+      (if (eq coding-system 'undecided)
+          'adobe-standard-encoding
+        coding-system))))
+
 (defun xml-find-file-coding-system (args)
   "Determine the coding system of an XML file without a declaration.
 Strictly speaking, the file should be utf-8, but mistakes are

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#7786: 23.2; Encoding of PostScript files
From: Peter Dyballa @ 2021-06-02 16:37 UTC
To: Lars Ingebrigtsen; +Cc: 7786

> On 2 June 2021, at 10:39, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
> I took a first stab at this, but this is obviously not correct.  I'm not
> sure how to detect whether it's an ISOLatin1Encoding file?  And... this
> will probably make the file opened like this be saved in utf-8,
> which isn't what we want...

This looks to me like the proper solution. (Although I cannot remember
how I would have tested this…)

--
Greetings

  Pete

Almost anything is easier to get into than out of.
				– Allen's Law
* bug#7786: 23.2; Encoding of PostScript files
From: Lars Ingebrigtsen @ 2021-10-13 12:49 UTC
To: Peter Dyballa; +Cc: 7786

[-- Attachment #1: Type: text/plain, Size: 3260 bytes --]

Lars Ingebrigtsen <larsi@gnus.org> writes:

>> IMO GNU Emacs should open a PostScript file in
>> adobe-standard-encoding, except it sees in the file that the font(s)
>> used is (are) re-encoded in ISOLatin1Encoding (which is *not* the same
>> as ISO 8859-1), CE Encoding, or whatever.
>
> I took a first stab at this, but this is obviously not correct.  I'm not
> sure how to detect whether it's an ISOLatin1Encoding file?  And... this
> will probably make the file opened like this be saved in utf-8,
> which isn't what we want...

I tested this a bit more now, and it doesn't work.  First of all, saving
the file with adobe-standard-encoding means that all the newlines are
stripped from the file.

So I tried the patch below, but 1) it didn't display non-ascii chars
correctly, and 2) when saving, I got:

These default coding systems were tried to encode
the following problematic characters in the buffer ‘a.ps’:
  Coding System                 Pos  Codepoint  Char
  adobe-standard-encoding-unix    1  #xA
                                  4  #xA
  ...
  utf-8-unix                    328  #x3FFFF3

I.e., it's complaining about the newlines, as well as the non-ASCII
char.

So it seems like the adobe coding system doesn't actually work, and I
wonder whether anybody's ever tried using it before?  Possibly not?

I've never actually tried working with the coding system stuff before on
this level, so I'm probably missing something really simple.

The work-in-progress patch is below, as well as a .ps test file.
Anybody see immediately what's wrong here?

diff --git a/lisp/international/mule-conf.el b/lisp/international/mule-conf.el
index 9a68fce2e8..1fe4b5c55a 100644
--- a/lisp/international/mule-conf.el
+++ b/lisp/international/mule-conf.el
@@ -1637,6 +1637,7 @@ 'utf-7-imap
 	("\\.el\\'" . prefer-utf-8)
 	("\\.utf\\(-8\\)?\\'" . utf-8)
 	("\\.xml\\'" . xml-find-file-coding-system)
+	("\\.ps\\'" . ps-find-file-coding-system)
 	;; We use raw-text for reading loaddefs.el so that if it
 	;; happens to have DOS or Mac EOLs, they are converted to
 	;; newlines.  This is required to make the special treatment
diff --git a/lisp/international/mule.el b/lisp/international/mule.el
index 5022a17db5..b2945bbbf3 100644
--- a/lisp/international/mule.el
+++ b/lisp/international/mule.el
@@ -2526,6 +2526,17 @@ sgml-html-meta-auto-coding-function
 	      (message "Warning: unknown coding system \"%s\"" match)
 	      nil)))))
 
+(defun ps-find-file-coding-system (args)
+  (if (not (eq (car args) 'insert-file-contents))
+      'undecided
+    (let ((coding-system
+           (coding-system-base
+            (detect-coding-region (point-min) (point-max) t))))
+      ;; If it's an ASCII file, then interpret ` specially.
+      (if (memq coding-system '(undecided iso-latin-1))
+          'adobe-standard-encoding-unix
+        coding-system))))
+
 (defun xml-find-file-coding-system (args)
   "Determine the coding system of an XML file without a declaration.
 Strictly speaking, the file should be utf-8, but mistakes are

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

[-- Attachment #2: a.ps --]
[-- Type: application/postscript, Size: 331 bytes --]
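The save-time complaint about #xA is what one would expect from a mapping table that only covers printable characters. The following is a small Python sketch of that failure mode, not of Emacs's actual encoder: `PRINTABLE_ONLY` is a hypothetical stand-in for a charset map with no control characters (it ignores the quote differences discussed elsewhere in the thread), and `find_unencodable` is an illustrative helper.

```python
# Hypothetical stand-in for a charset map that covers printable ASCII only.
PRINTABLE_ONLY = {chr(code): code for code in range(0x20, 0x7F)}

# Same map with the C0 control characters (including newline) added.
WITH_CONTROLS = dict(PRINTABLE_ONLY)
WITH_CONTROLS.update({chr(code): code for code in range(0x20)})


def find_unencodable(text, encode_table):
    """Return (1-based position, code point) for each character that the
    table cannot encode, in the style of the listing quoted above."""
    return [
        (pos, ord(ch))
        for pos, ch in enumerate(text, start=1)
        if ch not in encode_table
    ]


sample = "%!PS\n(a) show\n"

for pos, codepoint in find_unencodable(sample, PRINTABLE_ONLY):
    print(f"{pos:4d}  #x{codepoint:X}")

# With control characters in the table, nothing is reported.
print(find_unencodable(sample, WITH_CONTROLS))
```

Whether the Emacs charset should itself carry the control range, or whether the coding system should pass it through some other way, is left open here; the sketch only shows why a newline can turn up as a "problematic character".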
* bug#7786: 23.2; Encoding of PostScript files
From: Lars Ingebrigtsen @ 2021-10-13 13:12 UTC
To: Peter Dyballa; +Cc: 7786

Aha!

;; To make a coding system with this, a pre-write-conversion should
;; account for the commented-out multi-valued code points in
;; stdenc.map.
(define-charset 'adobe-standard-encoding

And this hasn't been done?  The stdenc.map file is missing a whole bunch
of characters...

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
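To make the "multi-valued code points" remark concrete: in a byte-to-Unicode table, several Unicode characters can share one byte, so a step before writing has to fold the alternates onto the one character the table actually knows. The sketch below shows that idea in Python under loud assumptions: the pairs in `ALTERNATES` are recalled from Unicode's stdenc.txt and should be checked against the file, which member of each pair Emacs's stdenc.map treats as primary is not stated in the thread, and `pre_write_normalize` is a hypothetical name, not an Emacs function.

```python
# ASSUMPTION: alternate -> primary pairs as recalled from stdenc.txt.
# Each pair shares a single StandardEncoding byte; verify before relying on it.
ALTERNATES = {
    "\u00a0": "\u0020",  # NO-BREAK SPACE         -> SPACE           (byte 0x20)
    "\u00ad": "\u002d",  # SOFT HYPHEN            -> HYPHEN-MINUS    (byte 0x2D)
    "\u2215": "\u2044",  # DIVISION SLASH         -> FRACTION SLASH  (byte 0xA4)
    "\u2219": "\u00b7",  # BULLET OPERATOR        -> MIDDLE DOT      (byte 0xB4)
    "\u02c9": "\u00af",  # MODIFIER LETTER MACRON -> MACRON          (byte 0xC5)
}

# Build the translation table once; str.translate applies it per character.
_TRANSLATION = str.maketrans(ALTERNATES)


def pre_write_normalize(text):
    """Fold alternate code points onto the primary one before encoding."""
    return text.translate(_TRANSLATION)


print(repr(pre_write_normalize("a\u00a0b\u00adc")))
```

The fold is lossy by design: after a round trip through the byte encoding, the alternates come back as their primaries.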
* bug#7786: 23.2; Encoding of PostScript files
From: Lars Ingebrigtsen @ 2021-10-13 13:51 UTC
To: Peter Dyballa; +Cc: 7786

Peter Dyballa <Peter_Dyballa@Freenet.DE> writes:

> This is incorrect, because, as the PLRM, The PostScript® Language
> Reference manual, explains in a footnote near the end, on encodings:
>
> 	3. The ISOLatin1Encoding encoding vector deviates from the ISO
> 	8859-1 standard in one respect: the character at position 140 is
> 	quoteleft, whereas the ISO standard specifies grave. A PostScript
> 	program needing to conform exactly to the ISO standard should
> 	create a modified encoding vector with this entry changed.

This seems to be incorrect.

https://en.wikipedia.org/wiki/PostScript_Latin_1_Encoding
https://en.wikipedia.org/wiki/ISO/IEC_8859-1

differ in a whole bunch of places.

In addition, ISOLatin1Encoding is not the same as stdenc (which is what
adobe-standard-encoding uses):

https://unicode.org/Public/MAPPINGS/VENDORS/ADOBE/stdenc.txt

Emacs doesn't seem to have any support for ISOLatin1Encoding:

"In 1995, IBM assigned code page 1277 (CCSID 1277) to this character set."

Unless we have it under some other name.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#7786: 23.2; Encoding of PostScript files
From: Eli Zaretskii @ 2021-10-13 15:41 UTC
To: Lars Ingebrigtsen; +Cc: Peter_Dyballa, 7786

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Wed, 13 Oct 2021 15:51:48 +0200
> Cc: 7786@debbugs.gnu.org
>
> Emacs doesn't seem to have any support for ISOLatin1Encoding:
>
> "In 1995, IBM assigned code page 1277 (CCSID 1277) to this character set."
>
> Unless we have it under some other name.

I think you are right.  But we could create such an encoding, see
etc/charsets/ and the coding-system definitions to go with them.
* bug#7786: 23.2; Encoding of PostScript files
From: Lars Ingebrigtsen @ 2021-10-13 16:05 UTC
To: Eli Zaretskii; +Cc: Peter_Dyballa, 7786

[-- Attachment #1: Type: text/plain, Size: 964 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

> I think you are right.  But we could create such an encoding, see
> etc/charsets/ and the coding-system definitions to go with them.

We could, but unfortunately, I'm not able to find any quality source for
the charset.  The closest I've been able to find is the file from IBM
(attached), but it doesn't map to Unicode code points, of course:

...
90  LI610000  i Dotless Small
91  SD130000  Grave Accent
92  SD110000  Acute Accent

glibc doesn't seem to have this, and I can't find it on the Unicode web
site, either.

So we'd have to maintain this by hand (and the easiest way is probably
to copy the table from Wikipedia and massage it).

But... it seems like an awful lot of work for something like this, so I
think I'll bow out.  If somebody else wants to implement this, that's
totally OK, though.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

[-- Attachment #2: CP01277.txt --]
[-- Type: text/plain, Size: 7854 bytes --]

* ----------------------------------------------------------------------
* Copyright IBM Corporation 1995. All rights reserved.
* C-H 3-3220-050 : REGISTRY, Graphic Character Sets and Code Pages
* Code Page (CPGID) : 01277
* Common Name : Adobe (PostScript) Latin 1
* Registration Date : 1995
* Last Revision Date :
* Default Encoding : 4105
* Code : MS Windows (ISO 8 variant)
* Maximal Character
* Set (GCSGID) : 01427
* Other GCSGIDs :
* ----------------------------------------------------------------------
*- GCGID --------- GCGID Name ------------------------------------------
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
20  SP010000  Space
21  SP020000  Exclamation Point
22  SP040000  Quotation Marks
23  SM010000  Number Sign
24  SC030000  Dollar Sign
25  SM020000  Percent Sign
26  SM030000  Ampersand
27  SP200000  Right Single Quote
28  SP060000  Left Parenthesis
29  SP070000  Right Parenthesis
2A  SM040000  Asterisk
2B  SA010000  Plus Sign
2C  SP080000  Comma
2D  SP100000  Hyphen/Minus Sign
2E  SP110000  Period/Full Stop
2F  SP120000  Slash
30  ND100000  Zero
31  ND010000  One
32  ND020000  Two
33  ND030000  Three
34  ND040000  Four
35  ND050000  Five
36  ND060000  Six
37  ND070000  Seven
38  ND080000  Eight
39  ND090000  Nine
3A  SP130000  Colon
3B  SP140000  Semicolon
3C  SA030000  Less Than Sign/Greater Than Sign (Arabic)
3D  SA040000  Equal Sign
3E  SA050000  Greater Than Sign/Less Than Sign (Arabic)
3F  SP150000  Question Mark
40  SM050000  At Sign
41  LA020000  A Capital
42  LB020000  B Capital
43  LC020000  C Capital
44  LD020000  D Capital
45  LE020000  E Capital
46  LF020000  F Capital
47  LG020000  G Capital
48  LH020000  H Capital
49  LI020000  I Capital
4A  LJ020000  J Capital
4B  LK020000  K Capital
4C  LL020000  L Capital
4D  LM020000  M Capital
4E  LN020000  N Capital
4F  LO020000  O Capital
50  LP020000  P Capital
51  LQ020000  Q Capital
52  LR020000  R Capital
53  LS020000  S Capital
54  LT020000  T Capital
55  LU020000  U Capital
56  LV020000  V Capital
57  LW020000  W Capital
58  LX020000  X Capital
59  LY020000  Y Capital
5A  LZ020000  Z Capital
5B  SM060000  Left Bracket
5C  SM070000  Backslash
5D  SM080000  Right Bracket
5E  SD150000  Circumflex Accent
5F  SP090000  Underline/Continuous Underscore
60  SP190000  Left Single Quote
61  LA010000  a Small
62  LB010000  b Small
63  LC010000  c Small
64  LD010000  d Small
65  LE010000  e Small
66  LF010000  f Small
67  LG010000  g Small
68  LH010000  h Small
69  LI010000  i Small
6A  LJ010000  j Small
6B  LK010000  k Small
6C  LL010000  l Small
6D  LM010000  m Small
6E  LN010000  n Small
6F  LO010000  o Small
70  LP010000  p Small
71  LQ010000  q Small
72  LR010000  r Small
73  LS010000  s Small
74  LT010000  t Small
75  LU010000  u Small
76  LV010000  v Small
77  LW010000  w Small
78  LX010000  x Small
79  LY010000  y Small
7A  LZ010000  z Small
7B  SM110000  Left Brace
7C  SM130000  Vertical Line/Logical OR
7D  SM140000  Right Brace
7E  SD190000  Tilde Accent
7F
80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F
90  LI610000  i Dotless Small
91  SD130000  Grave Accent
92  SD110000  Acute Accent
93  SD150100  Circumflex Accent (Over Small Alphabetics Without Ascenders)
94  SD190100  Tilde Accent (Over Small Alphabetics Without Ascenders)
95  SD310000  Macron Accent
96  SD230000  Breve Accent
97  SD290000  Overdot Accent
98  SD170000  Diaeresis/Umlaut Accent
99
9A  SD270000  Overcircle Accent
9B  SD410000  Cedilla or Sedila Accent
9C
9D  SD250000  Double Acute Accent
9E  SD430000  Ogonek Accent
9F  SD210000  Caron Accent
A0  SP300000  Required Space
A1  SP030000  Exclamation Point, Inverted
A2  SC040000  Cent Sign
A3  SC020000  Pound Sterling Sign
A4  SC010000  International Currency Symbol
A5  SC050000  Yen Sign
A6  SM650000  Vertical Line, Broken
A7  SM240000  Section Symbol (USA)/Paragraph Symbol (Europe)
A8  SD170000  Diaeresis/Umlaut Accent
A9  SM520000  Copyright Symbol
AA  SM210000  Ordinal Indicator, Feminine
AB  SP170000  Left Angle Quotes
AC  SM660000  Logical NOT/End Of Line Symbol
AD  SP320000  Syllable Hyphen
AE  SM530000  Registered Trademark Symbol
AF  SD310000  Macron Accent
B0  SM190000  Degree Symbol
B1  SA020000  Plus or Minus Sign
B2  ND021000  Two Superscript
B3  ND031000  Three Superscript
B4  SD110000  Acute Accent
B5  SM170000  Micro Symbol
B6  SM250000  Paragraph Symbol (USA)
B7  SD630000  Middle Dot
B8  SD410000  Cedilla or Sedila Accent
B9  ND011000  One Superscript
BA  SM200000  Ordinal Indicator, Masculine
BB  SP180000  Right Angle Quotes
BC  NF040000  One Quarter
BD  NF010000  One Half
BE  NF050000  Three Quarters
BF  SP160000  Question Mark, Inverted
C0  LA140000  A Grave Capital
C1  LA120000  A Acute Capital
C2  LA160000  A Circumflex Capital
C3  LA200000  A Tilde Capital
C4  LA180000  A Diaeresis Capital
C5  LA280000  A Overcircle Capital
C6  LA520000  ae Diphthong Capital
C7  LC420000  C Cedilla Capital
C8  LE140000  E Grave Capital
C9  LE120000  E Acute Capital
CA  LE160000  E Circumflex Capital
CB  LE180000  E Diaeresis Capital
CC  LI140000  I Grave Capital
CD  LI120000  I Acute Capital
CE  LI160000  I Circumflex Capital
CF  LI180000  I Diaeresis Capital
D0  LD620000  D Stroke Capital/Eth Icelandic Capital
D1  LN200000  N Tilde Capital
D2  LO140000  O Grave Capital
D3  LO120000  O Acute Capital
D4  LO160000  O Circumflex Capital
D5  LO200000  O Tilde Capital
D6  LO180000  O Diaeresis Capital
D7  SA070000  Multiply Sign
D8  LO620000  O Slash Capital
D9  LU140000  U Grave Capital
DA  LU120000  U Acute Capital
DB  LU160000  U Circumflex Capital
DC  LU180000  U Diaeresis Capital
DD  LY120000  Y Acute Capital
DE  LT640000  Thorn Icelandic Capital
DF  LS610000  Sharp s Small
E0  LA130000  a Grave Small
E1  LA110000  a Acute Small
E2  LA150000  a Circumflex Small
E3  LA190000  a Tilde Small
E4  LA170000  a Diaeresis Small
E5  LA270000  a Overcircle Small
E6  LA510000  ae Diphthong Small
E7  LC410000  c Cedilla Small
E8  LE130000  e Grave Small
E9  LE110000  e Acute Small
EA  LE150000  e Circumflex Small
EB  LE170000  e Diaeresis Small
EC  LI130000  i Grave Small
ED  LI110000  i Acute Small
EE  LI150000  i Circumflex Small
EF  LI170000  i Diaeresis Small
F0  LD630000  eth Icelandic Small
F1  LN190000  n Tilde Small
F2  LO130000  o Grave Small
F3  LO110000  o Acute Small
F4  LO150000  o Circumflex Small
F5  LO190000  o Tilde Small
F6  LO170000  o Diaeresis Small
F7  SA060000  Divide Sign
F8  LO610000  o Slash Small
F9  LU130000  u Grave Small
FA  LU110000  u Acute Small
FB  LU150000  u Circumflex Small
FC  LU170000  u Diaeresis Small
FD  LY110000  y Acute Small
FE  LT630000  Thorn Icelandic Small
FF  LY170000  y Diaeresis Small
/* END of table --------------------------------------------------------
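Because the IBM listing carries glyph names rather than code points, turning it into a charset map means assigning Unicode values by hand and then checking where the result departs from ISO 8859-1. The following Python sketch does that for a handful of rows only. The Unicode values in `PS_LATIN1_SAMPLE` are assumptions filled in from the GCGID names (the IBM file does not supply them), and `differences` is an illustrative helper, not part of any library.

```python
import unicodedata

# A few rows from the CP01277 listing above.
# ASSUMPTION: the Unicode code points are hand-assigned from the names.
PS_LATIN1_SAMPLE = {
    0x27: "\u2019",  # Right Single Quote
    0x60: "\u2018",  # Left Single Quote
    0x90: "\u0131",  # i Dotless Small
    0x91: "\u0060",  # Grave Accent
    0x92: "\u00b4",  # Acute Accent
    0xE9: "\u00e9",  # e Acute Small (same as ISO 8859-1)
}


def differences(table):
    """Return (byte, iso_char, ps_char) wherever the table disagrees
    with a plain ISO 8859-1 decoding of the same byte."""
    diffs = []
    for byte, ps_char in sorted(table.items()):
        iso_char = bytes([byte]).decode("latin-1")
        if iso_char != ps_char:
            diffs.append((byte, iso_char, ps_char))
    return diffs


for byte, iso_char, ps_char in differences(PS_LATIN1_SAMPLE):
    print(
        f"0x{byte:02X}: ISO 8859-1 U+{ord(iso_char):04X}, "
        f"PostScript Latin 1 U+{ord(ps_char):04X} "
        f"({unicodedata.name(ps_char)})"
    )
```

Even this small sample disagrees at five of its six positions, consistent with the observation earlier in the thread that the two encodings differ in more than one place.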
* bug#7786: 23.2; Encoding of PostScript files
From: Eli Zaretskii @ 2021-10-13 16:18 UTC
To: Lars Ingebrigtsen; +Cc: Peter_Dyballa, 7786

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Peter_Dyballa@Freenet.DE, 7786@debbugs.gnu.org
> Date: Wed, 13 Oct 2021 18:05:07 +0200
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > I think you are right.  But we could create such an encoding, see
> > etc/charsets/ and the coding-system definitions to go with them.
>
> We could, but unfortunately, I'm not able to find any quality source for
> the charset.  The closest I've been able to find is the file from IBM
> (attached), but it doesn't map to Unicode code points, of course:

What's wrong with this:

  https://en.wikipedia.org/wiki/PostScript_Latin_1_Encoding

It shows the Unicode codepoint for each character in the codepage.  Or
what am I missing?
* bug#7786: 23.2; Encoding of PostScript files
From: Lars Ingebrigtsen @ 2021-10-13 16:20 UTC
To: Eli Zaretskii; +Cc: Peter_Dyballa, 7786

Eli Zaretskii <eliz@gnu.org> writes:

> What's wrong with this:
>
>   https://en.wikipedia.org/wiki/PostScript_Latin_1_Encoding
>
> It shows the Unicode codepoint for each character in the codepage.  Or
> what am I missing?

Yes, that's what I suggested using as the source -- somebody would have
to transcribe that into a machine readable file.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#7786: 23.2; Encoding of PostScript files 2021-10-13 16:20 ` Lars Ingebrigtsen @ 2021-10-13 16:23 ` Peter Dyballa 2021-10-13 16:28 ` Lars Ingebrigtsen 2021-10-13 16:45 ` Eli Zaretskii 2021-10-13 16:43 ` Eli Zaretskii 1 sibling, 2 replies; 37+ messages in thread From: Peter Dyballa @ 2021-10-13 16:23 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: 7786 > Am 13.10.2021 um 18:20 schrieb Lars Ingebrigtsen <larsi@gnus.org>: > > Eli Zaretskii <eliz@gnu.org> writes: > >> What's wrong with this: >> >> https://en.wikipedia.org/wiki/PostScript_Latin_1_Encoding >> >> It shows the Unicode codepoint for each character in the codepage. Or >> what am I missing? > > Yes, that's what I suggested using as the source -- somebody would have > to transcribe that into a machine readable file. Is this OK and usable? ;;; -*- mode: Text; coding: utf-8; -*- ; ; Time-stamp: <2011-01-05 10:52:40 pete> ; ; Standard PostScript Glyphs (Adobe) ; ; oct dec hex UTF8 ;=================================== = 40 = 32 = 20 = 20 = U+0020 : SPACE ! = 41 = 33 = 21 = 21 = U+0021 : EXCLAMATION MARK " = 42 = 34 = 22 = 22 = U+0022 : QUOTATION MARK # = 43 = 35 = 23 = 23 = U+0023 : NUMBER SIGN $ = 44 = 36 = 24 = 24 = U+0024 : DOLLAR SIGN % = 45 = 37 = 25 = 25 = U+0025 : PERCENT SIGN & = 46 = 38 = 26 = 26 = U+0026 : AMPERSAND ' = 47 = 39 = 27 = 27 = U+2019 : RIGHT SINGLE QUOTATION MARK ( = 50 = 40 = 28 = 28 = U+0028 : LEFT PARENTHESIS ) = 51 = 41 = 29 = 29 = U+0029 : RIGHT PARENTHESIS * = 52 = 42 = 2A = 2A = U+002A : ASTERISK + = 53 = 43 = 2B = 2B = U+002B : PLUS SIGN , = 54 = 44 = 2C = 2C = U+002C : COMMA - = 55 = 45 = 2D = 2D = U+002D : HYPHEN-MINUS . 
= 56 = 46 = 2E = 2E = U+002E : FULL STOP / = 57 = 47 = 2F = 2F = U+002F : SOLIDUS 0 = 60 = 48 = 30 = 30 = U+0030 : DIGIT ZERO 1 = 61 = 49 = 31 = 31 = U+0031 : DIGIT ONE 2 = 62 = 50 = 32 = 32 = U+0032 : DIGIT TWO 3 = 63 = 51 = 33 = 33 = U+0033 : DIGIT THREE 4 = 64 = 52 = 34 = 34 = U+0034 : DIGIT FOUR 5 = 65 = 53 = 35 = 35 = U+0035 : DIGIT FIVE 6 = 66 = 54 = 36 = 36 = U+0036 : DIGIT SIX 7 = 67 = 55 = 37 = 37 = U+0037 : DIGIT SEVEN 8 = 70 = 56 = 38 = 38 = U+0038 : DIGIT EIGHT 9 = 71 = 57 = 39 = 39 = U+0039 : DIGIT NINE : = 72 = 58 = 3A = 3A = U+003A : COLON ; = 73 = 59 = 3B = 3B = U+003B : SEMICOLON < = 74 = 60 = 3C = 3C = U+003C : LESS-THAN SIGN = = 75 = 61 = 3D = 3D = U+003D : EQUALS SIGN > = 76 = 62 = 3E = 3E = U+003E : GREATER-THAN SIGN ? = 77 = 63 = 3F = 3F = U+003F : QUESTION MARK @ = 100 = 64 = 40 = 40 = U+0040 : COMMERCIAL AT A = 101 = 65 = 41 = 41 = U+0041 : LATIN CAPITAL LETTER A B = 102 = 66 = 42 = 42 = U+0042 : LATIN CAPITAL LETTER B C = 103 = 67 = 43 = 43 = U+0043 : LATIN CAPITAL LETTER C D = 104 = 68 = 44 = 44 = U+0044 : LATIN CAPITAL LETTER D E = 105 = 69 = 45 = 45 = U+0045 : LATIN CAPITAL LETTER E F = 106 = 70 = 46 = 46 = U+0046 : LATIN CAPITAL LETTER F G = 107 = 71 = 47 = 47 = U+0047 : LATIN CAPITAL LETTER G H = 110 = 72 = 48 = 48 = U+0048 : LATIN CAPITAL LETTER H I = 111 = 73 = 49 = 49 = U+0049 : LATIN CAPITAL LETTER I J = 112 = 74 = 4A = 4A = U+004A : LATIN CAPITAL LETTER J K = 113 = 75 = 4B = 4B = U+004B : LATIN CAPITAL LETTER K L = 114 = 76 = 4C = 4C = U+004C : LATIN CAPITAL LETTER L M = 115 = 77 = 4D = 4D = U+004D : LATIN CAPITAL LETTER M N = 116 = 78 = 4E = 4E = U+004E : LATIN CAPITAL LETTER N O = 117 = 79 = 4F = 4F = U+004F : LATIN CAPITAL LETTER O P = 120 = 80 = 50 = 50 = U+0050 : LATIN CAPITAL LETTER P Q = 121 = 81 = 51 = 51 = U+0051 : LATIN CAPITAL LETTER Q R = 122 = 82 = 52 = 52 = U+0052 : LATIN CAPITAL LETTER R S = 123 = 83 = 53 = 53 = U+0053 : LATIN CAPITAL LETTER S T = 124 = 84 = 54 = 54 = U+0054 : LATIN CAPITAL LETTER T U = 125 = 85 = 
55 = 55 = U+0055 : LATIN CAPITAL LETTER U V = 126 = 86 = 56 = 56 = U+0056 : LATIN CAPITAL LETTER V W = 127 = 87 = 57 = 57 = U+0057 : LATIN CAPITAL LETTER W X = 130 = 88 = 58 = 58 = U+0058 : LATIN CAPITAL LETTER X Y = 131 = 89 = 59 = 59 = U+0059 : LATIN CAPITAL LETTER Y Z = 132 = 90 = 5A = 5A = U+005A : LATIN CAPITAL LETTER Z [ = 133 = 91 = 5B = 5B = U+005B : LEFT SQUARE BRACKET \ = 134 = 92 = 5C = 5C = U+005C : REVERSE SOLIDUS ] = 135 = 93 = 5D = 5D = U+005D : RIGHT SQUARE BRACKET ^ = 136 = 94 = 5E = 5E = U+005E : CIRCUMFLEX ACCENT _ = 137 = 95 = 5F = 5F = U+005F : LOW LINE ` = 140 = 96 = 60 = 60 = U+2018 : LEFT SINGLE QUOTATION MARK a = 141 = 97 = 61 = 61 = U+0061 : LATIN SMALL LETTER A b = 142 = 98 = 62 = 62 = U+0062 : LATIN SMALL LETTER B c = 143 = 99 = 63 = 63 = U+0063 : LATIN SMALL LETTER C d = 144 = 100 = 64 = 64 = U+0064 : LATIN SMALL LETTER D e = 145 = 101 = 65 = 65 = U+0065 : LATIN SMALL LETTER E f = 146 = 102 = 66 = 66 = U+0066 : LATIN SMALL LETTER F g = 147 = 103 = 67 = 67 = U+0067 : LATIN SMALL LETTER G h = 150 = 104 = 68 = 68 = U+0068 : LATIN SMALL LETTER H i = 151 = 105 = 69 = 69 = U+0069 : LATIN SMALL LETTER I j = 152 = 106 = 6A = 6A = U+006A : LATIN SMALL LETTER J k = 153 = 107 = 6B = 6B = U+006B : LATIN SMALL LETTER K l = 154 = 108 = 6C = 6C = U+006C : LATIN SMALL LETTER L m = 155 = 109 = 6D = 6D = U+006D : LATIN SMALL LETTER M n = 156 = 110 = 6E = 6E = U+006E : LATIN SMALL LETTER N o = 157 = 111 = 6F = 6F = U+006F : LATIN SMALL LETTER O p = 160 = 112 = 70 = 70 = U+0070 : LATIN SMALL LETTER P q = 161 = 113 = 71 = 71 = U+0071 : LATIN SMALL LETTER Q r = 162 = 114 = 72 = 72 = U+0072 : LATIN SMALL LETTER R s = 163 = 115 = 73 = 73 = U+0073 : LATIN SMALL LETTER S t = 164 = 116 = 74 = 74 = U+0074 : LATIN SMALL LETTER T u = 165 = 117 = 75 = 75 = U+0075 : LATIN SMALL LETTER U v = 166 = 118 = 76 = 76 = U+0076 : LATIN SMALL LETTER V w = 167 = 119 = 77 = 77 = U+0077 : LATIN SMALL LETTER W x = 170 = 120 = 78 = 78 = U+0078 : LATIN SMALL LETTER X y = 171 = 121 = 
79 = 79 = U+0079 : LATIN SMALL LETTER Y
z = 172 = 122 = 7A = 7A = U+007A : LATIN SMALL LETTER Z
{ = 173 = 123 = 7B = 7B = U+007B : LEFT CURLY BRACKET
| = 174 = 124 = 7C = 7C = U+007C : VERTICAL LINE
} = 175 = 125 = 7D = 7D = U+007D : RIGHT CURLY BRACKET
~ = 176 = 126 = 7E = 7E = U+007E : TILDE
¡ = 241 = 161 = A1 = C2A1 = U+00A1 : INVERTED EXCLAMATION MARK
¢ = 242 = 162 = A2 = C2A2 = U+00A2 : CENT SIGN
£ = 243 = 163 = A3 = C2A3 = U+00A3 : POUND SIGN
⁄ = 244 = 164 = A4 = E28184 = U+2044 : FRACTION SLASH
¥ = 245 = 165 = A5 = C2A5 = U+00A5 : YEN SIGN
ƒ = 246 = 166 = A6 = C692 = U+0192 : LATIN SMALL LETTER F WITH HOOK
§ = 247 = 167 = A7 = C2A7 = U+00A7 : SECTION SIGN
¤ = 250 = 168 = A8 = C2A4 = U+00A4 : CURRENCY SIGN
' = 251 = 169 = A9 = 27 = U+0027 : APOSTROPHE
“ = 252 = 170 = AA = E2809C = U+201C : LEFT DOUBLE QUOTATION MARK
« = 253 = 171 = AB = C2AB = U+00AB : LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
‹ = 254 = 172 = AC = E280B9 = U+2039 : SINGLE LEFT-POINTING ANGLE QUOTATION MARK
› = 255 = 173 = AD = E280BA = U+203A : SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
ﬁ = 256 = 174 = AE = EFAC81 = U+FB01 : LATIN SMALL LIGATURE FI
ﬂ = 257 = 175 = AF = EFAC82 = U+FB02 : LATIN SMALL LIGATURE FL
– = 261 = 177 = B1 = E28093 = U+2013 : EN DASH
† = 262 = 178 = B2 = E280A0 = U+2020 : DAGGER
‡ = 263 = 179 = B3 = E280A1 = U+2021 : DOUBLE DAGGER
· = 264 = 180 = B4 = C2B7 = U+00B7 : MIDDLE DOT
¶ = 266 = 182 = B6 = C2B6 = U+00B6 : PILCROW SIGN
• = 267 = 183 = B7 = E280A2 = U+2022 : BULLET
‚ = 270 = 184 = B8 = E2809A = U+201A : SINGLE LOW-9 QUOTATION MARK
„ = 271 = 185 = B9 = E2809E = U+201E : DOUBLE LOW-9 QUOTATION MARK
” = 272 = 186 = BA = E2809D = U+201D : RIGHT DOUBLE QUOTATION MARK
» = 273 = 187 = BB = C2BB = U+00BB : RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
… = 274 = 188 = BC = E280A6 = U+2026 : HORIZONTAL ELLIPSIS
‰ = 275 = 189 = BD = E280B0 = U+2030 : PER MILLE SIGN
¿ = 277 = 191 = BF = C2BF = U+00BF : INVERTED QUESTION MARK
` = 301 = 193 = C1 = 60 = U+0060 : GRAVE ACCENT
´ = 302 = 194 = C2 = C2B4 = U+00B4 : ACUTE ACCENT
ˆ = 303 = 195 = C3 = CB86 = U+02C6 : MODIFIER LETTER CIRCUMFLEX ACCENT
˜ = 304 = 196 = C4 = CB9C = U+02DC : SMALL TILDE
¯ = 305 = 197 = C5 = C2AF = U+00AF : MACRON
˘ = 306 = 198 = C6 = CB98 = U+02D8 : BREVE
˙ = 307 = 199 = C7 = CB99 = U+02D9 : DOT ABOVE
¨ = 310 = 200 = C8 = C2A8 = U+00A8 : DIAERESIS
˚ = 312 = 202 = CA = CB9A = U+02DA : RING ABOVE
¸ = 313 = 203 = CB = C2B8 = U+00B8 : CEDILLA
˝ = 315 = 205 = CD = CB9D = U+02DD : DOUBLE ACUTE ACCENT
˛ = 316 = 206 = CE = CB9B = U+02DB : OGONEK
ˇ = 317 = 207 = CF = CB87 = U+02C7 : CARON
— = 320 = 208 = D0 = E28094 = U+2014 : EM DASH
Æ = 341 = 225 = E1 = C386 = U+00C6 : LATIN CAPITAL LETTER AE
ª = 343 = 227 = E3 = C2AA = U+00AA : FEMININE ORDINAL INDICATOR
Ł = 350 = 232 = E8 = C581 = U+0141 : LATIN CAPITAL LETTER L WITH STROKE
Ø = 351 = 233 = E9 = C398 = U+00D8 : LATIN CAPITAL LETTER O WITH STROKE
Œ = 352 = 234 = EA = C592 = U+0152 : LATIN CAPITAL LIGATURE OE
º = 353 = 235 = EB = C2BA = U+00BA : MASCULINE ORDINAL INDICATOR
æ = 361 = 241 = F1 = C3A6 = U+00E6 : LATIN SMALL LETTER AE
ı = 365 = 245 = F5 = C4B1 = U+0131 : LATIN SMALL LETTER DOTLESS I
ł = 370 = 248 = F8 = C582 = U+0142 : LATIN SMALL LETTER L WITH STROKE
ø = 371 = 249 = F9 = C3B8 = U+00F8 : LATIN SMALL LETTER O WITH STROKE
œ = 372 = 250 = FA = C593 = U+0153 : LATIN SMALL LIGATURE OE
ß = 373 = 251 = FB = C39F = U+00DF : LATIN SMALL LETTER SHARP S

--
Greetings

Pete

Film is a dog: the head is commerce, the tail is art. And only rarely does the tail wag the dog.
– Joseph Losey

^ permalink raw reply [flat|nested] 37+ messages in thread
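A row of the table above can be checked mechanically. The following is a minimal sketch, assuming the columns are glyph, octal code, decimal code, hexadecimal code, UTF-8 bytes in hex, Unicode code point, and Unicode name (this reading is inferred from the values, not stated in the message). The function name `check_row` is made up for the illustration.

```python
import unicodedata


def check_row(row: str) -> bool:
    """Return True when the numeric columns of one table row agree with each other."""
    columns, _, name = row.partition(" : ")
    # rsplit from the right keeps the glyph intact even if the glyph is "=".
    _glyph, octal, decimal, hexa, utf8_hex, code_point = columns.rsplit(" = ", 5)
    byte = int(decimal)
    char = chr(int(code_point[2:], 16))  # strip the leading "U+"
    return (
        int(octal, 8) == byte
        and int(hexa, 16) == byte
        and char.encode("utf-8").hex().upper() == utf8_hex
        and unicodedata.name(char, "") == name.strip()
    )


if __name__ == "__main__":
    rows = [
        "⁄ = 244 = 164 = A4 = E28184 = U+2044 : FRACTION SLASH",
        "ƒ = 246 = 166 = A6 = C692 = U+0192 : LATIN SMALL LETTER F WITH HOOK",
        "ß = 373 = 251 = FB = C39F = U+00DF : LATIN SMALL LETTER SHARP S",
    ]
    for row in rows:
        print(check_row(row), row)
```

The glyph column is deliberately not compared, since a mail archive may normalise characters such as ligatures.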
* bug#7786: 23.2; Encoding of PostScript files 2021-10-13 16:23 ` Peter Dyballa @ 2021-10-13 16:28 ` Lars Ingebrigtsen 2021-10-13 16:43 ` Peter Dyballa 2021-10-13 16:45 ` Eli Zaretskii 1 sibling, 1 reply; 37+ messages in thread From: Lars Ingebrigtsen @ 2021-10-13 16:28 UTC (permalink / raw) To: Peter Dyballa; +Cc: 7786 Peter Dyballa <Peter_Dyballa@Freenet.DE> writes: > Is this OK and usable? > > ;;; -*- mode: Text; coding: utf-8; -*- > ; > ; Time-stamp: <2011-01-05 10:52:40 pete> > ; > ; Standard PostScript Glyphs (Adobe) Where is this from? [...] > } = 175 = 125 = 7D = 7D = U+007D : RIGHT CURLY BRACKET > ~ = 176 = 126 = 7E = 7E = U+007E : TILDE > ¡ = 241 = 161 = A1 = C2A1 = U+00A1 : INVERTED EXCLAMATION MARK > ¢ = 242 = 162 = A2 = C2A2 = U+00A2 : CENT SIGN > £ = 243 = 163 = A3 = C2A3 = U+00A3 : POUND SIGN > ⁄ = 244 = 164 = A4 = E28184 = U+2044 : FRACTION SLASH But this doesn't seem to correspond to the table on Wikipedia (or the IBM document) -- it doesn't have any of the mappings in the 0x90-0xA0 range, for instance. (I didn't check the rest.) -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#7786: 23.2; Encoding of PostScript files 2021-10-13 16:28 ` Lars Ingebrigtsen @ 2021-10-13 16:43 ` Peter Dyballa 0 siblings, 0 replies; 37+ messages in thread
From: Peter Dyballa @ 2021-10-13 16:43 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: 7786

> On 13.10.2021 at 18:28, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
> Where is this from?

I am quite sure that I took it from PLRM2, the PostScript Language Reference Manual, December 1990, page 604.

--
Greetings

Pete

Time is an illusion. Lunchtime, doubly so.

^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#7786: 23.2; Encoding of PostScript files 2021-10-13 16:23 ` Peter Dyballa 2021-10-13 16:28 ` Lars Ingebrigtsen @ 2021-10-13 16:45 ` Eli Zaretskii 2021-10-13 17:35 ` Peter Dyballa 1 sibling, 1 reply; 37+ messages in thread From: Eli Zaretskii @ 2021-10-13 16:45 UTC (permalink / raw) To: Peter Dyballa; +Cc: larsi, 7786 > From: Peter Dyballa <Peter_Dyballa@Freenet.DE> > Date: Wed, 13 Oct 2021 18:23:50 +0200 > Cc: Eli Zaretskii <eliz@gnu.org>, > 7786@debbugs.gnu.org > > >> https://en.wikipedia.org/wiki/PostScript_Latin_1_Encoding > >> > >> It shows the Unicode codepoint for each character in the codepage. Or > >> what am I missing? > > > > Yes, that's what I suggested using as the source -- somebody would have > > to transcribe that into a machine readable file. > > Is this OK and usable? AFAICT, that's very different from the Wikipedia data. ^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#7786: 23.2; Encoding of PostScript files 2021-10-13 16:45 ` Eli Zaretskii @ 2021-10-13 17:35 ` Peter Dyballa 0 siblings, 0 replies; 37+ messages in thread
From: Peter Dyballa @ 2021-10-13 17:35 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: larsi, 7786

> On 13.10.2021 at 18:45, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> From: Peter Dyballa <Peter_Dyballa@Freenet.DE>
>> Date: Wed, 13 Oct 2021 18:23:50 +0200
>> Cc: Eli Zaretskii <eliz@gnu.org>,
>> 7786@debbugs.gnu.org
>>
>>>> https://en.wikipedia.org/wiki/PostScript_Latin_1_Encoding
>>>>
>>>> It shows the Unicode codepoint for each character in the codepage. Or
>>>> what am I missing?
>>>
>>> Yes, that's what I suggested using as the source -- somebody would have
>>> to transcribe that into a machine readable file.
>>
>> Is this OK and usable?
>
> AFAICT, that's very different from the Wikipedia data.

I created the file ten years ago. It could be that I later missed some standardisation by ISO… which the Wikipedia author knew about? It could also be that we are mixing things up: my encoding is the encoding vector of a PostScript font.

Anyway, would it help you to retrieve these files?

stdenc.txt # Name: Adobe Standard Encoding to Unicode # Unicode version: 2.0 # Table version: 0.2 # Date: 30 March 1999 # # Copyright (c) 1991-1999 Unicode, Inc. All Rights reserved. # # This file is provided as-is by Unicode, Inc. (The Unicode Consortium). No # claims are made as to fitness for any particular purpose. No warranties of # any kind are expressed or implied. The recipient agrees to determine # applicability of information provided. If this file has been provided on # magnetic media by Unicode, Inc., the sole remedy for any claim will be # exchange of defective media within 90 days of receipt. # # Recipient is granted the right to make copies in any form for internal # distribution and to freely use the information supplied in the creation of # products supporting Unicode. Unicode, Inc.
specifically excludes the right # to re-distribute this file directly to third parties or other organizations # whether for profit or not. # # Format: 4 tab-delimited fields: # # (1) The Unicode value (in hexadecimal) # (2) The Adobe Standard Encoding code point (in hexadecimal) # (3) # Unicode name # (4) # PostScript character name # # General Notes: # # The Unicode values in this table were produced as the result of applying # the algorithm described in the section "Populating a Unicode space" in the # document "Unicode and Glyph Names," at # http://partners.adobe.com/asn/developer/typeforum/unicodegn.html # to the characters encoded in Adobe Standard Encoding. Note that some # Standard Encoding characters, such as "space", are mapped to 2 Unicode # values. Refer to the above document for more details. # # Revision History: # # [v0.2, 30 March 1999] # Different algorithm to produce Unicode values (see notes above) results in # some character codes being mapped to 2 Unicode values. Updated Unicode # names to Unicode 2.0 names. # # [v0.1, 5 May 1995] First release. # # Contact <unicode-inc@unicode.org> with any questions or comments. # symbol.txt # # Name: Adobe Symbol Encoding to Unicode # Unicode version: 2.0 # Table version: 0.2 # Date: 30 March 1999 # # Copyright (c) 1991-1999 Unicode, Inc. All Rights reserved. # # This file is provided as-is by Unicode, Inc. (The Unicode Consortium). No # claims are made as to fitness for any particular purpose. No warranties of # any kind are expressed or implied. The recipient agrees to determine # applicability of information provided. If this file has been provided on # magnetic media by Unicode, Inc., the sole remedy for any claim will be # exchange of defective media within 90 days of receipt. # # Recipient is granted the right to make copies in any form for internal # distribution and to freely use the information supplied in the creation of # products supporting Unicode. Unicode, Inc. 
specifically excludes the right # to re-distribute this file directly to third parties or other organizations # whether for profit or not. # # Format: 4 tab-delimited fields: # # (1) The Unicode value (in hexadecimal) # (2) The Symbol Encoding code point (in hexadecimal) # (3) # Unicode name # (4) # PostScript character name # # General Notes: # # The Unicode values in this table were produced as the result of applying # the algorithm described in the section "Populating a Unicode space" in the # document "Unicode and Glyph Names," at # http://partners.adobe.com/asn/developer/typeforum/unicodegn.html # to the characters in Symbol. Note that some characters, such as "space", # are mapped to 2 Unicode values. 29 characters have assignments in the # Corporate Use Subarea; these are indicated by "(CUS)" in field 4. Refer to # the above document for more details. # # Revision History: # # [v0.2, 30 March 1999] # Different algorithm to produce Unicode values (see notes above) results in # some character codes being mapped to 2 Unicode values; use of Corporate # Use subarea values; addition of the euro character; changed assignments of # some characters such as the COPYRIGHT SIGNs and RADICAL EXTENDER. Updated # Unicode names to Unicode 2.0 names. # # [v0.1, 5 May 1995] First release. # # Contact <unicode-inc@unicode.org> with any questions or comments. #
zdingbat.txt # # Name: Adobe Zapf Dingbats Encoding to Unicode # Unicode version: 2.0 # Table version: 0.2 # Date: 30 March 1999 # # Copyright (c) 1991-1999 Unicode, Inc. All Rights reserved. # # This file is provided as-is by Unicode, Inc. (The Unicode Consortium). No # claims are made as to fitness for any particular purpose. No warranties of # any kind are expressed or implied. The recipient agrees to determine # applicability of information provided.
If this file has been provided on # magnetic media by Unicode, Inc., the sole remedy for any claim will be # exchange of defective media within 90 days of receipt. # # Recipient is granted the right to make copies in any form for internal # distribution and to freely use the information supplied in the creation of # products supporting Unicode. Unicode, Inc. specifically excludes the right # to re-distribute this file directly to third parties or other organizations # whether for profit or not. # # Format: Three tab-delimited fields: # # (1) The Unicode value (in hexadecimal) # (2) The Zapf Dingbats Encoding code point (in hexadecimal) # (3) # Unicode 2.0 name # (4) # PostScript character name # # General Notes: # # The Unicode values in this table were produced as the result of # applying the algorithm described in the section "Populating a Unicode # space" in the document "Unicode and Glyph Names," at # http://partners.adobe.com/asn/developer/typeforum/unicodegn.html # to the characters in Zapf Dingbats. Note that some characters, such as # "space", are mapped to 2 Unicode values. 14 characters have assignments in # the Corporate Use Subarea; these are indicated by "(CUS)" in field 4. # Refer to the above document for more details. # # Revision History: # # [v0.2, 30 March 1999] Different algorithm to produce Unicode values (see # notes above) results in some character codes being mapped to 2 Unicode # values; use of Corporate Use subarea values; included BLACK CIRCLE and # RIGHT HALF BLACK CIRCLE. Updated Unicode names to Unicode 2.0 names. # # [v0.1, 5 May 1995] First release. # # Contact <unicode-inc@unicode.org> with any questions or comments. # These were just the file headers. -- Greetings Pete "To infinity and beyond!" – Captain Buzz Lightyear ^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#7786: 23.2; Encoding of PostScript files 2021-10-13 16:20 ` Lars Ingebrigtsen 2021-10-13 16:23 ` Peter Dyballa @ 2021-10-13 16:43 ` Eli Zaretskii 2021-10-13 18:55 ` Lars Ingebrigtsen 1 sibling, 1 reply; 37+ messages in thread
From: Eli Zaretskii @ 2021-10-13 16:43 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: Peter_Dyballa, 7786

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Peter_Dyballa@Freenet.DE, 7786@debbugs.gnu.org
> Date: Wed, 13 Oct 2021 18:20:34 +0200
>
> > https://en.wikipedia.org/wiki/PostScript_Latin_1_Encoding
> >
> > It shows the Unicode codepoint for each character in the codepage. Or
> > what am I missing?
>
> Yes, that's what I suggested using as the source -- somebody would have
> to transcribe that into a machine readable file.

Will the below do?

# Generated from https://en.wikipedia.org/wiki/PostScript_Latin_1_Encoding
0x00-0x5F 0x0000
0x60 0x2018
0x61-0x7E 0x0061
0x90 0x0131
0x91 0x0050
0x92 0x00B4
0x93 0x02C5
0x94 0x02DC
0x95 0x02C9
0x96-0x97 0x02D8
0x98 0x00A8
0x9A 0x02DA
0x9B 0x00B8
0x9D 0x02DD
0x9E 0x02DB
0x9F 0x02C7
0xA0-0xFF 0x00A0

^ permalink raw reply [flat|nested] 37+ messages in thread
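To see what such a map file implies for decoding, here is a minimal sketch that expands the listing into a byte-to-code-point table. It assumes that a range line such as `0x96-0x97 0x02D8` assigns consecutive code points starting at the given value; that reading is an assumption, and `load_map` and `decode` are names invented for this illustration, not Emacs functions.

```python
MAP_TEXT = """\
# Generated from https://en.wikipedia.org/wiki/PostScript_Latin_1_Encoding
0x00-0x5F 0x0000
0x60 0x2018
0x61-0x7E 0x0061
0x90 0x0131
0x91 0x0050
0x92 0x00B4
0x93 0x02C5
0x94 0x02DC
0x95 0x02C9
0x96-0x97 0x02D8
0x98 0x00A8
0x9A 0x02DA
0x9B 0x00B8
0x9D 0x02DD
0x9E 0x02DB
0x9F 0x02C7
0xA0-0xFF 0x00A0
"""


def load_map(text: str) -> dict[int, int]:
    """Expand 'byte[-byte] codepoint' lines into a {byte: code point} table."""
    table: dict[int, int] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        codes, first_code_point = line.split()
        low, _, high = codes.partition("-")
        start = int(low, 16)
        end = int(high, 16) if high else start
        base = int(first_code_point, 16)
        for offset, byte in enumerate(range(start, end + 1)):
            table[byte] = base + offset  # assumed: ranges map consecutively
    return table


def decode(data: bytes, table: dict[int, int]) -> str:
    """Decode bytes with the table; unmapped bytes become U+FFFD."""
    return "".join(chr(table.get(byte, 0xFFFD)) for byte in data)


if __name__ == "__main__":
    table = load_map(MAP_TEXT)
    print(decode(b"`foo'", table))
```

Under this listing byte 0x60 decodes to U+2018, while 0x27 stays a plain apostrophe because it falls inside the identity range `0x00-0x5F`.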
* bug#7786: 23.2; Encoding of PostScript files 2021-10-13 16:43 ` Eli Zaretskii @ 2021-10-13 18:55 ` Lars Ingebrigtsen 2021-10-13 19:05 ` Eli Zaretskii 2021-10-13 19:07 ` Peter Dyballa 0 siblings, 2 replies; 37+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-13 18:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Peter_Dyballa, 7786

Eli Zaretskii <eliz@gnu.org> writes:

> Will the below do?

Looks good, but...

Peter Dyballa <Peter_Dyballa@Freenet.DE> writes:

>> On 13.10.2021 at 18:28, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>>
>> Where is this from?
>
> I am quite sure that I took it from PLRM2, the PostScript Language
> Reference Manual, December 1990, page 604.

... I'm not sure whether that IBM document the Wikipedia page has sourced the table from is authoritative. There seem to be many versions of the Adobe PostScript ISO-8859-1-alike code page.

Peter Dyballa <Peter_Dyballa@Freenet.DE> writes:

> Could be we mix things. My encoding stands for the encoding vector of
> a PostScript font.

Hm... I don't think that's what we need here -- we need the encoding of text files, not fonts.

> Anyway would it help you to retrieve these files?
>
> stdenc.txt
> # Name: Adobe Standard Encoding to Unicode
> # Unicode version: 2.0
> # Table version: 0.2
> # Date: 30 March 1999

That's the one we have in Emacs today as adobe-standard-encoding, but it seems very odd. I mean, both stdenc.txt itself, as well as our interpretation of it, because stdenc.map leaves most 8-bit chars undefined.

So I'm not sure what we should do here, if anything. Is there some Adobe printing expert we could reach out to? :-)

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#7786: 23.2; Encoding of PostScript files 2021-10-13 18:55 ` Lars Ingebrigtsen @ 2021-10-13 19:05 ` Eli Zaretskii 2021-10-13 19:07 ` Peter Dyballa 1 sibling, 0 replies; 37+ messages in thread
From: Eli Zaretskii @ 2021-10-13 19:05 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: Peter_Dyballa, 7786

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Peter_Dyballa@Freenet.DE, 7786@debbugs.gnu.org
> Date: Wed, 13 Oct 2021 20:55:25 +0200
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > Will the below do?
>
> Looks good, but...
>
> Peter Dyballa <Peter_Dyballa@Freenet.DE> writes:
>
> >> On 13.10.2021 at 18:28, Lars Ingebrigtsen <larsi@gnus.org> wrote:
> >>
> >> Where is this from?
> >
> > I am quite sure that I took it from PLRM2, the PostScript Language
> > Reference Manual, December 1990, page 604.
>
> ... I'm not sure whether that IBM document the Wikipedia page has
> sourced the table from is authoritative. There seem to be many versions
> of the Adobe PostScript ISO-8859-1-alike code page.

The IBM PDF document is consistent 1:1 with Wikipedia, FWIW.

^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#7786: 23.2; Encoding of PostScript files 2021-10-13 18:55 ` Lars Ingebrigtsen 2021-10-13 19:05 ` Eli Zaretskii @ 2021-10-13 19:07 ` Peter Dyballa 1 sibling, 0 replies; 37+ messages in thread
From: Peter Dyballa @ 2021-10-13 19:07 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: 7786

> On 13.10.2021 at 20:55, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
> I mean, both stdenc.txt itself, as well as our
> interpretation of it, because stdenc.map leaves most 8-bit chars
> undefined.

Because it is also the font encoding! "map" stands for the character mapping of a font.

--
Greetings

Pete

If it hurts, that means it is doing you good

^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#7786: 23.2; Encoding of PostScript files 2021-10-13 13:51 ` Lars Ingebrigtsen 2021-10-13 15:41 ` Eli Zaretskii @ 2021-10-13 21:02 ` Peter Dyballa 2021-10-14 6:42 ` Eli Zaretskii 2021-10-13 21:55 ` Peter Dyballa 2 siblings, 1 reply; 37+ messages in thread
From: Peter Dyballa @ 2021-10-13 21:02 UTC (permalink / raw)
To: Lars Ingebrigtsen, Eli Zaretskii; +Cc: 7786

Maybe this leads to an Adobe ISO Latin-1 encoding for GNU Emacs…

I copied the encoding from page 605 of the PLRM and pasted it into the *scratch* buffer. In rectangular editing mode I reconstructed this table:

octal  0  1  2  3  4  5  6  7
------------------------------
\04x      !  "  #  $  %  &  ’
\05x   (  )  *  +  ,  -  .  /
\06x   0  1  2  3  4  5  6  7
\07x   8  9  :  ;  <  =  >  ?
\10x   @  A  B  C  D  E  F  G
\11x   H  I  J  K  L  M  N  O
\12x   P  Q  R  S  T  U  V  W
\13x   X  Y  Z  [  \  ]  ^  _
\14x   ‘  a  b  c  d  e  f  g
\15x   h  i  j  k  l  m  n  o
\16x   p  q  r  s  t  u  v  w
\17x   x  y  z  {  |  }  ~
\20x
\21x
\22x   ı  `   ́  ˆ   ̃   ̄   ̆   ̇
\23x    ̈      ̊   ̧      ̋   ̨  ˇ
\24x      ¡  ¢  £  ¤  ¥  ¦  §
\25x    ̈  ©  ª  «  ¬  -  ®   ̄
\26x   °  ±  ²  ³   ́  μ  ¶  ·
\27x    ̧  ¹  º  »  ¼  ½  ¾  ¿
\30x   À  Á  Â  Ã  Ä  Å  Æ  Ç
\31x   È  É  Ê  Ë  Ì  Í  Î  Ï
\32x   Ð  Ñ  Ò  Ó  Ô  Õ  Ö  ×
\33x   Ø  Ù  Ú  Û  Ü  Ý  Þ  ß
\34x   à  á  â  ã  ä  å  æ  ç
\35x   è  é  ê  ë  ì  í  î  ï
\36x   ð  ñ  ò  ó  ô  õ  ö  ÷
\37x   ø  ù  ú  û  ü  ý  þ  ÿ

This was easy: each column is 27 lines high, so I could easily move, cut, and paste.

In Wikipedia I went to "edit" the table, so I could copy the source code and paste it into the *scratch* buffer. The lines started with some rubbish which ended in the regexp "[lr]l|", so I could replace "^.*[rl]l|" → "". I sorted the lines in order to remove all lines containing no content. What was left I could sort with sort-regexp-fields, keyed on "|[0-9]+}}$", so the list was sorted. In column 64 I added "• " in rectangular mode, because "•" is not encoded. A copy of the table above was stripped of its left-most column, and all TABs were converted into LINEFEEDs. By removing the empty lines I had a "vector" which I could easily move to the right of "• ".
The result is: 0020|[[space character|SP]]|32|040}} • 0021|[[Exclamation mark|!]]|33|041}} • ! 0022|[[Quotation mark|"]] |34|042}} • " 0023|[[Number sign|#]]|35|043}} • # 0024|[[Dollar sign|$]]|36|044}} • $ 0025|[[Percent sign|%]]|37|045}} • % 0026|[[Ampersand|&]]|38|046}} • & 2019|[[Quotation mark|’]]|39|047}} • ’ 0028|[[Bracket|(]]|40|050}} • ( 0029|[[Bracket|)]]|41|051}} • ) 002A|[[Asterisk|*]]|42|052}} • * 002B|[[Plus and minus signs|+]]|43|053}} • + 002C|[[Comma (punctuation)|,]] |44|054}} • , 002D|[[Hyphen-minus|-]]|45|055}} • - 002E|[[Full stop|.]]|46|056}} • . 002F|[[Slash (punctuation)|/]] |47|057}} • / 0030|[[0 (number)|0]]|48|060}} • 0 0031|[[1 (number)|1]]|49|061}} • 1 0032|[[2 (number)|2]]|50|062}} • 2 0033|[[3 (number)|3]]|51|063}} • 3 0034|[[4 (number)|4]]|52|064}} • 4 0035|[[5 (number)|5]]|53|065}} • 5 0036|[[6 (number)|6]]|54|066}} • 6 0037|[[7 (number)|7]]|55|067}} • 7 0038|[[8 (number)|8]]|56|070}} • 8 0039|[[9 (number)|9]]|57|071}} • 9 003A|[[colon (punctuation)|:]]|58|072}} • : 003B|[[semicolon|;]]|59|073}} • ; 003C|[[less-than sign|<]]|60|074}} • < 003D|[[equal sign|{{=}}]]|61|075}} • = 003E|[[greater-than sign|>]]|62|076}} • > 003F|[[question mark|?]]|63|077}} • ? 
0040|[[@]]|64|100}} • @ 0041|[[A]]|65|101}} • A 0042|[[B]]|66|102}} • B 0043|[[C]]|67|103}} • C 0044|[[D]]|68|104}} • D 0045|[[E]]|69|105}} • E 0046|[[F]]|70|106}} • F 0047|[[G]]|71|107}} • G 0048|[[H]]|72|110}} • H 0049|[[I]]|73|111}} • I 004A|[[J]]|74|112}} • J 004B|[[K]]|75|113}} • K 004C|[[L]]|76|114}} • L 004D|[[M]]|77|115}} • M 004E|[[N]]|78|116}} • N 004F|[[O]]|79|117}} • O 0050|[[P]]|80|120}} • P 0051|[[Q]]|81|121}} • Q 0052|[[R]]|82|122}} • R 0053|[[S]]|83|123}} • S 0054|[[T]]|84|124}} • T 0055|[[U]]|85|125}} • U 0056|[[V]]|86|126}} • V 0057|[[W]]|87|127}} • W 0058|[[X]]|88|130}} • X 0059|[[Y]]|89|131}} • Y 005A|[[Z]]|90|132}} • Z 005B|[[Square brackets|[]]|91|133}} • [ 005C|[[Backslash|\]]|92|134}} • \ 005D|[[Square brackets|]]]|93|135}} • ] 005E|[[Circumflex|^]]|94|136}} • ^ 005F|[[Underscore|_]]|95|137}} • _ 2018|[[Quotation mark|‘]]|96|140}} • ‘ 0061|[[a]]|97|141}} • a 0062|[[b]]|98|142}} • b 0063|[[c]]|99|143}} • c 0064|[[d]]|100|144}} • d 0065|[[e]]|101|145}} • e 0066|[[f]]|102|146}} • f 0067|[[g]]|103|147}} • g 0068|[[h]]|104|150}} • h 0069|[[i]]|105|151}} • i 006A|[[j]]|106|152}} • j 006B|[[k]]|107|153}} • k 006C|[[l]]|108|154}} • l 006D|[[m]]|109|155}} • m 006E|[[n]]|110|156}} • n 006F|[[o]]|111|157}} • o 0070|[[p]]|112|160}} • p 0071|[[q]]|113|161}} • q 0072|[[r]]|114|162}} • r 0073|[[s]]|115|163}} • s 0074|[[t]]|116|164}} • t 0075|[[u]]|117|165}} • u 0076|[[v]]|118|166}} • v 0077|[[w]]|119|167}} • w 0078|[[x]]|120|170}} • x 0079|[[y]]|121|171}} • y 007A|[[z]]|122|172}} • z 007B|[[Braces (punctuation)|{]]|123|173}} • { 007C|[[Vertical bar|{{pipe}}]]|124|174}} • | 007D|[[Braces (punctuation)|}]]|125|175}} • } 007E|[[Tilde|~]]|126|176}} • ~ 0131|[[ı]]|144|220}} • ı 0060|[[`]]|145|221}} • ` 00B4|[[´]]|146|222}} • ́ 02C6|[[ˆ]]|147|223}} • ˆ 02DC|[[˜]]|148|224}} • ̃ 02C9|[[ˉ]]|149|225}} • ̄ 02D8|[[˘]]|150|226}} • ̆ 02D9|[[˙]]|151|227}} • ̇ 00A8|[[¨]]|152|230}} • ̈ 02DA|[[˚]]|154|232}} • ̊ 00B8|[[¸]]|155|233}} • ̧ 02DD|[[˝]]|157|235}} • ̋ 
02DB|[[˛]]|158|236}} • ̨ 02C7|[[ˇ]]|159|237}} • ˇ 00A0|[[Non-breaking space|NBSP]]|160|240}} • 00A1|[[Inverted question and exclamation marks|¡]]|161|241}} • ¡ 00A2|[[Cent (currency)#Symbol|¢]]|162|242}} • ¢ 00A3|[[Pound sign|£]]|163|243}} • £ 00A4|[[Currency (typography)|¤]]|164|244}} • ¤ 00A5|[[¥]]|165|245}} • ¥ 00A6|[[Vertical bar|¦]]|166|246}} • ¦ 00A7|[[Section sign|§]]|167|247}} • § 00A8|[[¨]]|168|250}} • ̈ 00A9|[[Copyright symbol|©]]|169|251}} • © 00AA|[[Ordinal indicator|ª]]|170|252}} • ª 00AB|[[Guillemet|«]]|171|253}} • « 00AC|[[Negation|¬]]|172|254}} • ¬ 00AD|[[Soft hyphen|SHY]]|173|255}} • - 00AE|[[Registered trademark symbol|®]]|174|256}} • ® 00AF|[[Macron (diacritic)|¯]]|175|257}} • ̄ 00B0|[[Degree symbol|°]]|176|260}} • ° 00B1|[[Plus-minus sign|±]]|177|261}} • ± 00B2|[[Square (algebra)|²]]|178|262}} • ² 00B3|[[Cube (algebra)|³]]|179|263}} • ³ 00B4|[[Acute accent|´]]|180|264}} • ́ 00B5|[[Micro sign|µ]]|181|265}} • μ 00B6|[[Pilcrow|¶]]|182|266}} • ¶ 00B7|[[Interpunct|·]]|183|267}} • · 00B8|[[Cedilla|¸]]|184|270}} • ̧ 00B9|[[Unicode subscripts and superscripts|¹]]|185|271}} • ¹ 00BA|[[Ordinal indicator|º]]|186|272}} • º 00BB|[[Guillemet|»]]|187|273}} • » 00BC|[[1/4 (disambiguation)|¼]]|188|274}} • ¼ 00BD|[[1/2 (disambiguation)|½]]|189|275}} • ½ 00BE|[[3/4 (disambiguation)|¾]]|190|276}} • ¾ 00BF|[[Inverted question mark|¿]]|191|277}} • ¿ 00C0|[[À]]|192|300}} • À 00C1|[[Á]]|193|301}} • Á 00C2|[[Â]]|194|302}} • Â 00C3|[[Ã]]|195|303}} • Ã 00C4|[[Ä]]|196|304}} • Ä 00C5|[[Å]]|197|305}} • Å 00C6|[[Æ]]|198|306}} • Æ 00C7|[[Ç]]|199|307}} • Ç 00C8|[[È]]|200|310}} • È 00C9|[[É]]|201|311}} • É 00CA|[[Ê]]|202|312}} • Ê 00CB|[[Ë]]|203|313}} • Ë 00CC|[[Ì]]|204|314}} • Ì 00CD|[[Í]]|205|315}} • Í 00CE|[[Î]]|206|316}} • Î 00CF|[[Ï]]|207|317}} • Ï 00D0|[[Eth|Ð]]|208|320}} • Ð 00D1|[[Ñ]]|209|321}} • Ñ 00D2|[[Ò]]|210|322}} • Ò 00D3|[[Ó]]|211|323}} • Ó 00D4|[[Ô]]|212|324}} • Ô 00D5|[[Õ]]|213|325}} • Õ 00D6|[[Ö]]|214|326}} • Ö 00D7|[[Multiplication sign|×]]|215|327}} • × 
00D8|[[Ø]]|216|330}} • Ø 00D9|[[Ù]]|217|331}} • Ù 00DA|[[Ú]]|218|332}} • Ú 00DB|[[Û]]|219|333}} • Û 00DC|[[Ü]]|220|334}} • Ü 00DD|[[Ý]]|221|335}} • Ý 00DE|[[Thorn (letter)|Þ]]|222|336}} • Þ 00DF|[[ß]]|223|337}} • ß 00E0|[[à]]|224|340}} • à 00E1|[[á]]|225|341}} • á 00E2|[[â]]|226|342}} • â 00E3|[[ã]]|227|343}} • ã 00E4|[[ä]]|228|344}} • ä 00E5|[[å]]|229|345}} • å 00E6|[[æ]]|230|346}} • æ 00E7|[[ç]]|231|347}} • ç 00E8|[[è]]|232|350}} • è 00E9|[[é]]|233|351}} • é 00EA|[[ê]]|234|352}} • ê 00EB|[[ë]]|235|353}} • ë 00EC|[[ì]]|236|354}} • ì 00ED|[[í]]|237|355}} • í 00EE|[[î]]|238|356}} • î 00EF|[[ï]]|239|357}} • ï 00F0|[[Eth|ð]]|240|360}} • ð 00F1|[[ñ]]|241|361}} • ñ 00F2|[[ò]]|242|362}} • ò 00F3|[[ó]]|243|363}} • ó 00F4|[[ô]]|244|364}} • ô 00F5|[[õ]]|245|365}} • õ 00F6|[[ö]]|246|366}} • ö 00F7|[[Obelus|÷]]|247|367}} • ÷ 00F8|[[ø]]|248|370}} • ø 00F9|[[ù]]|249|371}} • ù 00FA|[[ú]]|250|372}} • ú 00FB|[[û]]|251|373}} • û 00FC|[[ü]]|252|374}} • ü 00FD|[[ý]]|253|375}} • ý 00FE|[[Thorn (letter)|þ]]|254|376}} • þ 00FF|[[ÿ]]|255|377}} • ÿ It looks as if both tables (PLRM and Wikipedia) describe the same encoding. To have a proof I took a copy of this table and cut the "vector" at the right edge and put it below the table, between them a separating line of plus signs. From the remainder of the table I could delete all from "]]" to "$", and then all from"^" to "|". The remaining "[[" pairs could be removed. I split the window into two and invoked compare-windows. Of course it choked a few times because some characters are in HTML notation, but it proved that both encoding vectors are actually the same. So I presume the data above can be used for the Adobe Standard ISO Latin-1 encoding. (After some further editing.) -- Greetings Pete It is so hot in some places that the people there have to live in other places. ^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#7786: 23.2; Encoding of PostScript files 2021-10-13 21:02 ` Peter Dyballa @ 2021-10-14 6:42 ` Eli Zaretskii 2021-10-15 12:47 ` Lars Ingebrigtsen 0 siblings, 1 reply; 37+ messages in thread From: Eli Zaretskii @ 2021-10-14 6:42 UTC (permalink / raw) To: Peter Dyballa; +Cc: larsi, 7786 > From: Peter Dyballa <Peter_Dyballa@Freenet.DE> > Date: Wed, 13 Oct 2021 23:02:29 +0200 > Cc: 7786@debbugs.gnu.org > > Maybe this leads to an Adobe ISO Latin-1 encoding for GNU Emacs… > > I copied off PLRM the encoding from page 605 and pasted into *scratch* buffer. In rectangular editing mode I reconstructed this table: This seems to be the same as the CP1277.map file I threw together and posted here yesterday. So I think it is ready to be used, and we should just define the additional coding-system using it. ^ permalink raw reply [flat|nested] 37+ messages in thread
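Whether two such listings agree can also be checked mechanically. The sketch below compares a few Wikipedia rows quoted in this thread with the corresponding entries of the map listing as it appears in this archive. `wiki_table` and `differences` are names made up for the illustration, only a handful of rows is included, and the entry for 0x27 is derived from the range line `0x00-0x5F 0x0000`.

```python
import re

# Rows copied from the Wikipedia-derived list quoted in this thread:
# code point | wiki link | decimal | octal
WIKI_ROWS = """\
2019|[[Quotation mark|’]]|39|047}}
2018|[[Quotation mark|‘]]|96|140}}
0131|[[ı]]|144|220}}
0060|[[`]]|145|221}}
00B4|[[´]]|146|222}}
02C6|[[ˆ]]|147|223}}
02DC|[[˜]]|148|224}}
"""

# Matching entries of the map listing as it appears in this archive.
POSTED_MAP = {
    0x27: 0x0027,  # implied by the range line "0x00-0x5F 0x0000"
    0x60: 0x2018,
    0x90: 0x0131,
    0x91: 0x0050,
    0x92: 0x00B4,
    0x93: 0x02C5,
    0x94: 0x02DC,
}

ROW = re.compile(r"([0-9A-F]{4})\|.*\|(\d+)\|([0-7]{3})\}\}")


def wiki_table(text: str) -> dict[int, int]:
    """Parse the rows into {byte: code point}, checking decimal against octal."""
    table: dict[int, int] = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue
        code_point, decimal, octal = match.groups()
        byte = int(decimal)
        if int(octal, 8) != byte:
            raise ValueError(f"decimal and octal disagree in row: {line}")
        table[byte] = int(code_point, 16)
    return table


def differences(wiki: dict[int, int], posted: dict[int, int]) -> list[tuple[int, int, int]]:
    """List (byte, wiki code point, posted code point) where both exist and differ."""
    shared = sorted(wiki.keys() & posted.keys())
    return [(b, wiki[b], posted[b]) for b in shared if wiki[b] != posted[b]]


if __name__ == "__main__":
    for byte, wiki_cp, posted_cp in differences(wiki_table(WIKI_ROWS), POSTED_MAP):
        print(f"0x{byte:02X}: wiki U+{wiki_cp:04X}, map U+{posted_cp:04X}")
```

For this sample the two sources differ at 0x27, 0x91 and 0x93. These may simply be transcription slips in one listing or in the archive; the sketch only reports them and does not decide which side is right.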
* bug#7786: 23.2; Encoding of PostScript files 2021-10-14 6:42 ` Eli Zaretskii @ 2021-10-15 12:47 ` Lars Ingebrigtsen 2021-10-15 15:59 ` Peter Dyballa 0 siblings, 1 reply; 37+ messages in thread From: Lars Ingebrigtsen @ 2021-10-15 12:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Peter Dyballa, 7786 Eli Zaretskii <eliz@gnu.org> writes: > This seems to be the same as the CP1277.map file I threw together and > posted here yesterday. So I think it is ready to be used, and we > should just define the additional coding-system using it. Right. But it might be nice to have some real-world PS files to test with. Peter, do you have any (preferably smallish -- I mean, not multi-gigabyte) PostScript files that use this encoding? Two or three would be cool. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#7786: 23.2; Encoding of PostScript files 2021-10-15 12:47 ` Lars Ingebrigtsen @ 2021-10-15 15:59 ` Peter Dyballa 2021-10-18 7:09 ` Lars Ingebrigtsen 0 siblings, 1 reply; 37+ messages in thread
From: Peter Dyballa @ 2021-10-15 15:59 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: 7786

[-- Attachment #1: Type: text/plain, Size: 1579 bytes --]

> Peter, do you have any (preferably smallish -- I mean, not
> multi-gigabyte) PostScript files that use this encoding? Two or three
> would be cool.

I have no real test files at hand. What I still have is a set of files with ISO 8859-X encodings. I once used a2ps to create PS files from them. Both kinds are in the tar file. a2ps changed the real characters into their octal representations for portability.

I took one such PS file, from ISO Latin-1 encoding, and added the real characters, taken from the encoding TXT file, to these octal codes. PS-Test-1.ps displays OK in X11 with Ghostscript 9.54.0. I can see "character MINUS character" at the left, followed by their description/explanation.

You could use any text file and convert it into PostScript. It should not matter whether you use a2ps or enscript or something else. The PS output file produced should be in ISOLatin1Encoding, presumably using octal representations for 8-bit characters.

You might take one such file and convert it to PDF. You could take the same file, change it, and save it under a new name in ISOLatin1Encoding. Convert it to PDF. Change the new file in ISOLatin1Encoding, undo the previous edit, and save it as a newer file in ISO Latin-1 (or -15) text encoding. Convert this PS file to PDF, too. Are there differences visible in the PDF output? This could be a way to test the ISOLatin1Encoding encoding.

--
With peaceful greetings

Pete

To most people solutions mean finding the answers. But to chemists solutions are things that are still all mixed up.
[-- Attachment #2: ISO-Latin-encodings.tar.xz --] [-- Type: application/x-xz, Size: 19696 bytes --] [-- Attachment #3: PS-Test-1.ps --] [-- Type: application/postscript, Size: 23146 bytes --] ^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#7786: 23.2; Encoding of PostScript files 2021-10-15 15:59 ` Peter Dyballa @ 2021-10-18 7:09 ` Lars Ingebrigtsen 2021-10-18 12:25 ` Eli Zaretskii 0 siblings, 1 reply; 37+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-18 7:09 UTC (permalink / raw)
To: Peter Dyballa; +Cc: 7786

Peter Dyballa <Peter_Dyballa@Freenet.DE> writes:

> You could use any text file and convert it into PostScript. It should
> not matter whether you use a2ps or enscript or something else.

Well, the problem is that I can't find anything that actually generates codes that match the Wikipedia listing.

With a text file like this:

This is a sentence with `foo'.

a2ps gives me

(This is a sentence with `foo'.) p n

Note that the ` is 0x60, not 0x2018, like Wikipedia says it should be.

Perhaps the reason no software out there actually supports this encoding is that it's not actually used in nature.

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no

^ permalink raw reply [flat|nested] 37+ messages in thread
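The point about 0x60 can be made concrete on the string literal itself. The following is a rough sketch, not a PostScript parser: it turns the body of a PostScript string (the text between the parentheses, including backslash and octal escapes of the kind a2ps is said to write) into bytes, then reads those bytes once as plain ISO 8859-1 and once with the two quote positions taken from the Wikipedia rows quoted in this thread (0x27 as U+2019, 0x60 as U+2018). The names `ps_string_bytes`, `read_as` and `ISO_LATIN1_QUOTES` are invented for the illustration, and the input is assumed to be ASCII.

```python
import re

# Single-character escapes that PostScript strings define.
_ESCAPES = {"n": 10, "r": 13, "t": 9, "b": 8, "f": 12, "\\": 92, "(": 40, ")": 41}
_TOKEN = re.compile(r"\\([0-7]{1,3})|\\(.)|(.)", re.DOTALL)

# Assumed overrides relative to ISO 8859-1, taken from the rows quoted in the thread.
ISO_LATIN1_QUOTES = {0x27: 0x2019, 0x60: 0x2018}


def ps_string_bytes(body: str) -> bytes:
    """Convert the body of a PostScript string literal (ASCII input) to bytes."""
    out = bytearray()
    for octal, escaped, plain in _TOKEN.findall(body):
        if octal:
            out.append(int(octal, 8) & 0xFF)  # \ddd octal escape
        elif escaped == "\n":
            continue  # backslash-newline is a line continuation
        elif escaped:
            out.append(_ESCAPES.get(escaped, ord(escaped)))
        else:
            out.append(ord(plain))
    return bytes(out)


def read_as(data: bytes, overrides: dict[int, int]) -> str:
    """Read bytes as ISO 8859-1, except where an override gives another code point."""
    return "".join(chr(overrides.get(byte, byte)) for byte in data)


if __name__ == "__main__":
    data = ps_string_bytes("This is a sentence with `foo'.")
    print(read_as(data, {}))                 # plain ISO 8859-1 reading
    print(read_as(data, ISO_LATIN1_QUOTES))  # reading with curly quotes
```

The bytes in the file are the same in both cases; only the interpretation differs, and which interpretation applies depends on the encoding vector of the font selected in the PostScript program.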
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-18 7:09 ` Lars Ingebrigtsen
@ 2021-10-18 12:25 ` Eli Zaretskii
  2021-10-18 13:17 ` Lars Ingebrigtsen
  2021-10-18 15:51 ` Peter Dyballa
  0 siblings, 2 replies; 37+ messages in thread
From: Eli Zaretskii @ 2021-10-18 12:25 UTC
To: Lars Ingebrigtsen; +Cc: Peter_Dyballa, 7786

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: Eli Zaretskii <eliz@gnu.org>, 7786@debbugs.gnu.org
> Date: Mon, 18 Oct 2021 09:09:28 +0200
>
> Peter Dyballa <Peter_Dyballa@Freenet.DE> writes:
>
> > You could use any text file and convert it into PostScript. It should
> > not matter whether you use a2ps or enscript or something else.
>
> Well, the problem is that I can't find anything that actually generates
> codes that match the Wikipedia listing.
>
> With a text file like this:
>
>     This is a sentence with `foo'.
>
> a2ps gives me
>
>     (This is a sentence with `foo'.) p n
>
> Note that the ` is 0x60, not 0x2018, like Wikipedia says it should be.

And what a2ps produces prints correctly on a PS printer?
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-18 12:25 ` Eli Zaretskii
@ 2021-10-18 13:17 ` Lars Ingebrigtsen
  2021-10-18 15:51 ` Peter Dyballa
  1 sibling, 0 replies; 37+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-18 13:17 UTC
To: Eli Zaretskii; +Cc: Peter_Dyballa, 7786

[-- Attachment #1: Type: text/plain, Size: 148 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

> And what a2ps produces prints correctly on a PS printer?

ghostview displays the file perfectly, at least:

[-- Attachment #2: Type: image/png, Size: 1856 bytes --]

[-- Attachment #3: Type: text/plain, Size: 140 bytes --]

I don't have a PS printer, though.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-18 12:25 ` Eli Zaretskii
  2021-10-18 13:17 ` Lars Ingebrigtsen
@ 2021-10-18 15:51 ` Peter Dyballa
  2021-10-18 16:00 ` Eli Zaretskii
  1 sibling, 1 reply; 37+ messages in thread
From: Peter Dyballa @ 2021-10-18 15:51 UTC
To: Eli Zaretskii; +Cc: Lars Ingebrigtsen, 7786

> On 18.10.2021 at 14:25, Eli Zaretskii <eliz@gnu.org> wrote:
>
> And what a2ps produces prints correctly on a PS printer?

Yes. The output was correct on Epson EPL-5800 PS (PostScript 3) and HP
LaserJet 2100 TN.

I was not aware of an error 0x60 vs. 0x2018, i.e. ` vs. ‘. Which file is
faulty? (Grep does not show a reasonable result; it finds only comments.)

--
Greetings

  Pete

Real Time, adj.:
	Here and now, as opposed to fake time, which only occurs there
	and then.
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-18 15:51 ` Peter Dyballa
@ 2021-10-18 16:00 ` Eli Zaretskii
  2021-10-19 5:49 ` Peter Dyballa
  0 siblings, 1 reply; 37+ messages in thread
From: Eli Zaretskii @ 2021-10-18 16:00 UTC
To: Peter Dyballa; +Cc: larsi, 7786

> From: Peter Dyballa <Peter_Dyballa@Freenet.DE>
> Date: Mon, 18 Oct 2021 17:51:47 +0200
> Cc: Lars Ingebrigtsen <larsi@gnus.org>,
>  7786@debbugs.gnu.org
>
> > On 18.10.2021 at 14:25, Eli Zaretskii <eliz@gnu.org> wrote:
> >
> > And what a2ps produces prints correctly on a PS printer?
>
> Yes. The output was correct on Epson EPL-5800 PS (PostScript 3) and HP
> LaserJet 2100 TN.

Then maybe we should simply use Latin-1. AFAIR, that's what ps-mule.el
is doing.

> I was not aware of an error 0x60 vs. 0x2018, i.e. ` vs. ‘. Which file
> is faulty? (Grep does not show a reasonable result, finds only
> comments.)

It's the only difference between Latin-1 and that special encoding of PS
files, according to Wikipedia.
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-18 16:00 ` Eli Zaretskii
@ 2021-10-19 5:49 ` Peter Dyballa
  2021-10-19 11:59 ` Eli Zaretskii
  0 siblings, 1 reply; 37+ messages in thread
From: Peter Dyballa @ 2021-10-19 5:49 UTC
To: Eli Zaretskii; +Cc: larsi, 7786

PostScript is (was?) meant to produce good-looking text print-outs.
Therefore it uses typographical quotes instead of those from ASCII. This
design decision should be honoured.

--
Greetings

  Pete

"If I can't dance to it, it's not my revolution."
				-- A t-shirt designed by Jack Frager
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-19 5:49 ` Peter Dyballa
@ 2021-10-19 11:59 ` Eli Zaretskii
  2021-10-19 13:47 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 37+ messages in thread
From: Eli Zaretskii @ 2021-10-19 11:59 UTC
To: Peter Dyballa; +Cc: larsi, 7786

> From: Peter Dyballa <Peter_Dyballa@Freenet.DE>
> Date: Tue, 19 Oct 2021 07:49:34 +0200
> Cc: larsi@gnus.org,
>  7786@debbugs.gnu.org
>
> PostScript is (was?) meant to produce good looking text
> print-outs. Therefore it uses typographical quotes instead of those
> from ASCII. This design decision should be honoured.

So why a2ps doesn't?
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-19 11:59 ` Eli Zaretskii
@ 2021-10-19 13:47 ` Lars Ingebrigtsen
  2021-10-20 5:39 ` Peter Dyballa
  0 siblings, 1 reply; 37+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-19 13:47 UTC
To: Eli Zaretskii; +Cc: Peter Dyballa, 7786

Eli Zaretskii <eliz@gnu.org> writes:

>> PostScript is (was?) meant to produce good looking text
>> print-outs. Therefore it uses typographical quotes instead of those
>> from ASCII. This design decision should be honoured.

My screenshot showed that the actual output used proper typographical
quotes, but that may be down to the font used.

> So why a2ps doesn't?

I think the conclusion here is that we shouldn't do anything. Adobe
created two encodings -- the "standard" one (which is ASCII with some
alterations), and the ISOLatin1Encoding (which is 8859-1 with some
alterations). But we can't really detect these simply: a2ps, for
instance, uses 8859-1 instead,

    %%BeginResource: encoding ISO-8859-1Encoding

enscript does the same, but in a different way:

    %%BeginResource: procset Enscript-Encoding-88591 1.6.5 90

None of the .ps files I can find on this laptop uses ISOLatin1Encoding
(or the "standard encoding"), as far as I can see.

So 1) these encodings went out of fashion decades ago and, 2) even if we
wanted to support them, Emacs can't auto-detect when they're used.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
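For readers following along, here is a minimal sketch of what scanning for the two `%%BeginResource` comments quoted in the message above could look like. It is only an illustration in Python, not anything from Emacs or from this thread: the helper name `declared_encodings` and the "contains the word encoding" heuristic are assumptions. It also does not change the conclusion above, since a file may re-encode its fonts without any such comment.

```python
import re

# Matches DSC comments of the form "%%BeginResource: <type> <name> ...".
RESOURCE_RE = re.compile(r"^%%BeginResource:\s*(\S+)\s+(\S+)", re.MULTILINE)


def declared_encodings(ps_text):
    """Return resource names whose type or name mentions 'encoding'.

    Hypothetical helper: it only reports what the DSC comments declare,
    it does not inspect the encoding vectors themselves.
    """
    found = []
    for kind, name in RESOURCE_RE.findall(ps_text):
        if "encoding" in kind.lower() or "encoding" in name.lower():
            found.append(name)
    return found


# The two comment lines are the ones quoted in the message above;
# the surrounding lines are made up for the example.
sample = (
    "%!PS-Adobe-3.0\n"
    "%%BeginResource: encoding ISO-8859-1Encoding\n"
    "%%EndResource\n"
    "%%BeginResource: procset Enscript-Encoding-88591 1.6.5 90\n"
    "%%EndResource\n"
)
print(declared_encodings(sample))
```

A heuristic like this would find the a2ps and enscript declarations, but it says nothing about files that set up an encoding vector without a matching comment.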
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-19 13:47 ` Lars Ingebrigtsen
@ 2021-10-20 5:39 ` Peter Dyballa
  2021-10-20 5:45 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 37+ messages in thread
From: Peter Dyballa @ 2021-10-20 5:39 UTC
To: Lars Ingebrigtsen; +Cc: 7786

> On 19.10.2021 at 15:47, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
> So 1) these encodings went out of fashion decades ago and, 2) even if we
> wanted to support them, Emacs can't auto-detect when they're used.

Can't there be a default binding of ISOLatin1Encoding to files with
extension .ps or that are otherwise found or set to be PostScript files?

--
Greetings

  Pete

A census taker is a man who goes from house to house increasing the
population.
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-20 5:39 ` Peter Dyballa
@ 2021-10-20 5:45 ` Lars Ingebrigtsen
  2021-10-20 6:18 ` Lars Ingebrigtsen
  2021-10-20 16:34 ` Peter Dyballa
  0 siblings, 2 replies; 37+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-20 5:45 UTC
To: Peter Dyballa; +Cc: 7786

Peter Dyballa <Peter_Dyballa@Freenet.DE> writes:

> Can't there be a default binding of ISOLatin1Encoding to files with
> extension .ps or that are otherwise found or set to be PostScript
> files?

None of the .ps files we've found have been in ISOLatin1Encoding, so I'm
not sure I understand what you mean?

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-20 5:45 ` Lars Ingebrigtsen
@ 2021-10-20 6:18 ` Lars Ingebrigtsen
  2021-10-20 16:34 ` Peter Dyballa
  1 sibling, 0 replies; 37+ messages in thread
From: Lars Ingebrigtsen @ 2021-10-20 6:18 UTC
To: Peter Dyballa; +Cc: 7786

Lars Ingebrigtsen <larsi@gnus.org> writes:

> None of the .ps files we've found have been in ISOLatin1Encoding, so I'm
> not sure I understand what you mean?

Er. My analysis of these .ps files is wrong -- ` is indeed interpreted
as quoteright instead of grave.

    /ISO-8859-1Encoding [
    [...]
    /space /exclam /quotedbl /numbersign /dollar /percent /ampersand /quoteright

Which is what this bug report was originally about.

However, none of them adhere to the encoding found on the Wikipedia page
(which claims to document ISOLatin1Encoding) in the 0x9x area:

    /x /y /z /braceleft /bar /braceright /asciitilde /.notdef
    /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
    /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
    /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
    /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
    /space /exclamdown /cent /sterling /currency /yen /brokenbar /section

Like iso-8859-1, the 0x9x area is blank instead of having dotless i and
all the diacritics.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
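To make the consequence of such an encoding vector concrete, here is a minimal Python sketch of decoding PostScript text bytes so that the two ASCII quote positions display as typographical quotes. It rests on stated assumptions: everything else is treated as plain ISO 8859-1, 0x60 is taken as quoteleft (per the PLRM footnote cited in the original report) and 0x27 as quoteright (per the vector quoted above). The names `QUOTE_FIXUPS` and `decode_ps_text` are hypothetical, and this is not how Emacs handles these files.

```python
# Assumption: only these two positions differ from ISO 8859-1 for the
# purpose of this illustration; a real encoding vector may differ elsewhere.
QUOTE_FIXUPS = {
    0x60: "\u2018",  # quoteleft  -> LEFT SINGLE QUOTATION MARK
    0x27: "\u2019",  # quoteright -> RIGHT SINGLE QUOTATION MARK
}


def decode_ps_text(raw: bytes) -> str:
    """Decode bytes as Latin-1, then remap the two quote positions."""
    return raw.decode("latin-1").translate(QUOTE_FIXUPS)


# The a2ps string from earlier in the thread, as raw bytes.
print(decode_ps_text(b"This is a sentence with `foo'."))
```

This also shows why a blanket mapping would be awkward for the PostScript code itself: the same bytes outside of string literals are ordinary ASCII syntax, so the remapping would only make sense for displayed text.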
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-20 5:45 ` Lars Ingebrigtsen
  2021-10-20 6:18 ` Lars Ingebrigtsen
@ 2021-10-20 16:34 ` Peter Dyballa
  1 sibling, 0 replies; 37+ messages in thread
From: Peter Dyballa @ 2021-10-20 16:34 UTC
To: Lars Ingebrigtsen; +Cc: 7786

> On 20.10.2021 at 07:45, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
>> Can't there be a default binding of ISOLatin1Encoding to files with
>> extension .ps or that are otherwise found or set to be PostScript
>> files?
>
> None of the .ps files we've found have been in ISOLatin1Encoding, so I'm
> not sure I understand what you mean?

Isn't this the default encoding of a PostScript (text) file using
standard encoded fonts? The situation can be different once you
re-encode the font; then the PS file's text encoding has to follow. If
the font is re-encoded in ISO Latin-1, then the PS code has to use it
too. The same goes for every other 8-bit font re-encoding (I have no
idea how it works with CJK).

The characters that are allowed to be used as PostScript code are taken
from US-ASCII, so the code is independent of the text encoding. Care
has to be taken when text in an 8-bit encoding is to be output or
printed, usually enclosed in parentheses. The text encoding has to match
the font encoding, or the font encoding (of the user-defined font) has
to be prepared for the text encoding used below it.

--
Greetings

  Pete

The box said "Use Windows 95 or better," so I got a Macintosh.
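Since the thread mentions both parenthesized strings and the octal representations that a2ps writes for 8-bit characters, a small Python sketch of turning the body of a PostScript string literal back into raw bytes may help. It is a simplified illustration under assumptions: it handles the backslash escapes defined for PostScript strings (`\ddd` octal, `\n`, `\r`, `\t`, `\b`, `\f`, `\\`, `\(`, `\)`, and backslash-newline), it does not parse nested parentheses or hex strings, and the name `ps_string_bytes` is hypothetical.

```python
import re

# Single-character escapes in PostScript string literals.
_SIMPLE = {
    b"n": b"\n", b"r": b"\r", b"t": b"\t", b"b": b"\b", b"f": b"\f",
    b"\\": b"\\", b"(": b"(", b")": b")",
}
# A backslash followed by 1-3 octal digits, or by any single byte.
_ESCAPE_RE = re.compile(rb"\\([0-7]{1,3}|.)", re.DOTALL)


def ps_string_bytes(body: bytes) -> bytes:
    """Expand backslash escapes in the body of a (...) string literal."""

    def expand(match):
        token = match.group(1)
        if token[:1] in b"01234567":
            # Octal character code; high-order overflow is dropped.
            return bytes([int(token.decode("ascii"), 8) & 0xFF])
        if token == b"\n":
            return b""  # backslash-newline is a line continuation
        # Unknown escapes: the backslash is ignored, the byte is kept.
        return _SIMPLE.get(token, token)

    return _ESCAPE_RE.sub(expand, body)


raw = ps_string_bytes(rb"caf\351")
print(raw.decode("latin-1"))
```

The final `decode("latin-1")` is exactly the step under discussion in this thread: which text the bytes stand for depends on the encoding vector of the font in use, so Latin-1 here is only one possible choice.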
* bug#7786: 23.2; Encoding of PostScript files
  2021-10-13 13:51 ` Lars Ingebrigtsen
  2021-10-13 15:41 ` Eli Zaretskii
  2021-10-13 21:02 ` Peter Dyballa
@ 2021-10-13 21:55 ` Peter Dyballa
  2 siblings, 0 replies; 37+ messages in thread
From: Peter Dyballa @ 2021-10-13 21:55 UTC
To: Eli Zaretskii; +Cc: 7786

> On 13.10.2021 at 15:51, Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
> https://unicode.org/Public/MAPPINGS/VENDORS/ADOBE/stdenc.txt

The above is just the Adobe Standard Font Encoding in terms of Unicode
names and code points. Quite similar to this:

;;; -*- mode: Text; coding: utf-8; -*-
;
; Time-stamp: <2011-01-05 10:52:40 pete>
;
; Standard PostScript Glyphs (Adobe)
;
; glyph = oct = dec = hex = UTF8 = code point : name
;===================================
  = 40 = 32 = 20 = 20 = U+0020 : SPACE
! = 41 = 33 = 21 = 21 = U+0021 : EXCLAMATION MARK
" = 42 = 34 = 22 = 22 = U+0022 : QUOTATION MARK
# = 43 = 35 = 23 = 23 = U+0023 : NUMBER SIGN
$ = 44 = 36 = 24 = 24 = U+0024 : DOLLAR SIGN
% = 45 = 37 = 25 = 25 = U+0025 : PERCENT SIGN
& = 46 = 38 = 26 = 26 = U+0026 : AMPERSAND
' = 47 = 39 = 27 = 27 = U+2019 : RIGHT SINGLE QUOTATION MARK
( = 50 = 40 = 28 = 28 = U+0028 : LEFT PARENTHESIS
) = 51 = 41 = 29 = 29 = U+0029 : RIGHT PARENTHESIS
* = 52 = 42 = 2A = 2A = U+002A : ASTERISK
+ = 53 = 43 = 2B = 2B = U+002B : PLUS SIGN
, = 54 = 44 = 2C = 2C = U+002C : COMMA
- = 55 = 45 = 2D = 2D = U+002D : HYPHEN-MINUS
. = 56 = 46 = 2E = 2E = U+002E : FULL STOP
/ = 57 = 47 = 2F = 2F = U+002F : SOLIDUS
0 = 60 = 48 = 30 = 30 = U+0030 : DIGIT ZERO
1 = 61 = 49 = 31 = 31 = U+0031 : DIGIT ONE
2 = 62 = 50 = 32 = 32 = U+0032 : DIGIT TWO
3 = 63 = 51 = 33 = 33 = U+0033 : DIGIT THREE
4 = 64 = 52 = 34 = 34 = U+0034 : DIGIT FOUR
5 = 65 = 53 = 35 = 35 = U+0035 : DIGIT FIVE
6 = 66 = 54 = 36 = 36 = U+0036 : DIGIT SIX
7 = 67 = 55 = 37 = 37 = U+0037 : DIGIT SEVEN
8 = 70 = 56 = 38 = 38 = U+0038 : DIGIT EIGHT
9 = 71 = 57 = 39 = 39 = U+0039 : DIGIT NINE
: = 72 = 58 = 3A = 3A = U+003A : COLON
; = 73 = 59 = 3B = 3B = U+003B : SEMICOLON
< = 74 = 60 = 3C = 3C = U+003C : LESS-THAN SIGN
= = 75 = 61 = 3D = 3D = U+003D : EQUALS SIGN
> = 76 = 62 = 3E = 3E = U+003E : GREATER-THAN SIGN
? = 77 = 63 = 3F = 3F = U+003F : QUESTION MARK
@ = 100 = 64 = 40 = 40 = U+0040 : COMMERCIAL AT
A = 101 = 65 = 41 = 41 = U+0041 : LATIN CAPITAL LETTER A
B = 102 = 66 = 42 = 42 = U+0042 : LATIN CAPITAL LETTER B
C = 103 = 67 = 43 = 43 = U+0043 : LATIN CAPITAL LETTER C
D = 104 = 68 = 44 = 44 = U+0044 : LATIN CAPITAL LETTER D
E = 105 = 69 = 45 = 45 = U+0045 : LATIN CAPITAL LETTER E
F = 106 = 70 = 46 = 46 = U+0046 : LATIN CAPITAL LETTER F
G = 107 = 71 = 47 = 47 = U+0047 : LATIN CAPITAL LETTER G
H = 110 = 72 = 48 = 48 = U+0048 : LATIN CAPITAL LETTER H
I = 111 = 73 = 49 = 49 = U+0049 : LATIN CAPITAL LETTER I
J = 112 = 74 = 4A = 4A = U+004A : LATIN CAPITAL LETTER J
K = 113 = 75 = 4B = 4B = U+004B : LATIN CAPITAL LETTER K
L = 114 = 76 = 4C = 4C = U+004C : LATIN CAPITAL LETTER L
M = 115 = 77 = 4D = 4D = U+004D : LATIN CAPITAL LETTER M
N = 116 = 78 = 4E = 4E = U+004E : LATIN CAPITAL LETTER N
O = 117 = 79 = 4F = 4F = U+004F : LATIN CAPITAL LETTER O
P = 120 = 80 = 50 = 50 = U+0050 : LATIN CAPITAL LETTER P
Q = 121 = 81 = 51 = 51 = U+0051 : LATIN CAPITAL LETTER Q
R = 122 = 82 = 52 = 52 = U+0052 : LATIN CAPITAL LETTER R
S = 123 = 83 = 53 = 53 = U+0053 : LATIN CAPITAL LETTER S
T = 124 = 84 = 54 = 54 = U+0054 : LATIN CAPITAL LETTER T
U = 125 = 85 = 55 = 55 = U+0055 : LATIN CAPITAL LETTER U
V = 126 = 86 = 56 = 56 = U+0056 : LATIN CAPITAL LETTER V
W = 127 = 87 = 57 = 57 = U+0057 : LATIN CAPITAL LETTER W
X = 130 = 88 = 58 = 58 = U+0058 : LATIN CAPITAL LETTER X
Y = 131 = 89 = 59 = 59 = U+0059 : LATIN CAPITAL LETTER Y
Z = 132 = 90 = 5A = 5A = U+005A : LATIN CAPITAL LETTER Z
[ = 133 = 91 = 5B = 5B = U+005B : LEFT SQUARE BRACKET
\ = 134 = 92 = 5C = 5C = U+005C : REVERSE SOLIDUS
] = 135 = 93 = 5D = 5D = U+005D : RIGHT SQUARE BRACKET
^ = 136 = 94 = 5E = 5E = U+005E : CIRCUMFLEX ACCENT
_ = 137 = 95 = 5F = 5F = U+005F : LOW LINE
` = 140 = 96 = 60 = 60 = U+2018 : LEFT SINGLE QUOTATION MARK
a = 141 = 97 = 61 = 61 = U+0061 : LATIN SMALL LETTER A
b = 142 = 98 = 62 = 62 = U+0062 : LATIN SMALL LETTER B
c = 143 = 99 = 63 = 63 = U+0063 : LATIN SMALL LETTER C
d = 144 = 100 = 64 = 64 = U+0064 : LATIN SMALL LETTER D
e = 145 = 101 = 65 = 65 = U+0065 : LATIN SMALL LETTER E
f = 146 = 102 = 66 = 66 = U+0066 : LATIN SMALL LETTER F
g = 147 = 103 = 67 = 67 = U+0067 : LATIN SMALL LETTER G
h = 150 = 104 = 68 = 68 = U+0068 : LATIN SMALL LETTER H
i = 151 = 105 = 69 = 69 = U+0069 : LATIN SMALL LETTER I
j = 152 = 106 = 6A = 6A = U+006A : LATIN SMALL LETTER J
k = 153 = 107 = 6B = 6B = U+006B : LATIN SMALL LETTER K
l = 154 = 108 = 6C = 6C = U+006C : LATIN SMALL LETTER L
m = 155 = 109 = 6D = 6D = U+006D : LATIN SMALL LETTER M
n = 156 = 110 = 6E = 6E = U+006E : LATIN SMALL LETTER N
o = 157 = 111 = 6F = 6F = U+006F : LATIN SMALL LETTER O
p = 160 = 112 = 70 = 70 = U+0070 : LATIN SMALL LETTER P
q = 161 = 113 = 71 = 71 = U+0071 : LATIN SMALL LETTER Q
r = 162 = 114 = 72 = 72 = U+0072 : LATIN SMALL LETTER R
s = 163 = 115 = 73 = 73 = U+0073 : LATIN SMALL LETTER S
t = 164 = 116 = 74 = 74 = U+0074 : LATIN SMALL LETTER T
u = 165 = 117 = 75 = 75 = U+0075 : LATIN SMALL LETTER U
v = 166 = 118 = 76 = 76 = U+0076 : LATIN SMALL LETTER V
w = 167 = 119 = 77 = 77 = U+0077 : LATIN SMALL LETTER W
x = 170 = 120 = 78 = 78 = U+0078 : LATIN SMALL LETTER X
y = 171 = 121 = 79 = 79 = U+0079 : LATIN SMALL LETTER Y
z = 172 = 122 = 7A = 7A = U+007A : LATIN SMALL LETTER Z
{ = 173 = 123 = 7B = 7B = U+007B : LEFT CURLY BRACKET
| = 174 = 124 = 7C = 7C = U+007C : VERTICAL LINE
} = 175 = 125 = 7D = 7D = U+007D : RIGHT CURLY BRACKET
~ = 176 = 126 = 7E = 7E = U+007E : TILDE
¡ = 241 = 161 = A1 = C2A1 = U+00A1 : INVERTED EXCLAMATION MARK
¢ = 242 = 162 = A2 = C2A2 = U+00A2 : CENT SIGN
£ = 243 = 163 = A3 = C2A3 = U+00A3 : POUND SIGN
⁄ = 244 = 164 = A4 = E28184 = U+2044 : FRACTION SLASH
¥ = 245 = 165 = A5 = C2A5 = U+00A5 : YEN SIGN
ƒ = 246 = 166 = A6 = C692 = U+0192 : LATIN SMALL LETTER F WITH HOOK
§ = 247 = 167 = A7 = C2A7 = U+00A7 : SECTION SIGN
¤ = 250 = 168 = A8 = C2A4 = U+00A4 : CURRENCY SIGN
' = 251 = 169 = A9 = 27 = U+0027 : APOSTROPHE
“ = 252 = 170 = AA = E2809C = U+201C : LEFT DOUBLE QUOTATION MARK
« = 253 = 171 = AB = C2AB = U+00AB : LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
‹ = 254 = 172 = AC = E280B9 = U+2039 : SINGLE LEFT-POINTING ANGLE QUOTATION MARK
› = 255 = 173 = AD = E280BA = U+203A : SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
ﬁ = 256 = 174 = AE = EFAC81 = U+FB01 : LATIN SMALL LIGATURE FI
ﬂ = 257 = 175 = AF = EFAC82 = U+FB02 : LATIN SMALL LIGATURE FL
– = 261 = 177 = B1 = E28093 = U+2013 : EN DASH
† = 262 = 178 = B2 = E280A0 = U+2020 : DAGGER
‡ = 263 = 179 = B3 = E280A1 = U+2021 : DOUBLE DAGGER
· = 264 = 180 = B4 = C2B7 = U+00B7 : MIDDLE DOT
¶ = 266 = 182 = B6 = C2B6 = U+00B6 : PILCROW SIGN
• = 267 = 183 = B7 = E280A2 = U+2022 : BULLET
‚ = 270 = 184 = B8 = E2809A = U+201A : SINGLE LOW-9 QUOTATION MARK
„ = 271 = 185 = B9 = E2809E = U+201E : DOUBLE LOW-9 QUOTATION MARK
” = 272 = 186 = BA = E2809D = U+201D : RIGHT DOUBLE QUOTATION MARK
» = 273 = 187 = BB = C2BB = U+00BB : RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
… = 274 = 188 = BC = E280A6 = U+2026 : HORIZONTAL ELLIPSIS
‰ = 275 = 189 = BD = E280B0 = U+2030 : PER MILLE SIGN
¿ = 277 = 191 = BF = C2BF = U+00BF : INVERTED QUESTION MARK
` = 301 = 193 = C1 = 60 = U+0060 : GRAVE ACCENT
´ = 302 = 194 = C2 = C2B4 = U+00B4 : ACUTE ACCENT
ˆ = 303 = 195 = C3 = CB86 = U+02C6 : MODIFIER LETTER CIRCUMFLEX ACCENT
˜ = 304 = 196 = C4 = CB9C = U+02DC : SMALL TILDE
¯ = 305 = 197 = C5 = C2AF = U+00AF : MACRON
˘ = 306 = 198 = C6 = CB98 = U+02D8 : BREVE
˙ = 307 = 199 = C7 = CB99 = U+02D9 : DOT ABOVE
¨ = 310 = 200 = C8 = C2A8 = U+00A8 : DIAERESIS
˚ = 312 = 202 = CA = CB9A = U+02DA : RING ABOVE
¸ = 313 = 203 = CB = C2B8 = U+00B8 : CEDILLA
˝ = 315 = 205 = CD = CB9D = U+02DD : DOUBLE ACUTE ACCENT
˛ = 316 = 206 = CE = CB9B = U+02DB : OGONEK
ˇ = 317 = 207 = CF = CB87 = U+02C7 : CARON
— = 320 = 208 = D0 = E28094 = U+2014 : EM DASH
Æ = 341 = 225 = E1 = C386 = U+00C6 : LATIN CAPITAL LETTER AE
ª = 343 = 227 = E3 = C2AA = U+00AA : FEMININE ORDINAL INDICATOR
Ł = 350 = 232 = E8 = C581 = U+0141 : LATIN CAPITAL LETTER L WITH STROKE
Ø = 351 = 233 = E9 = C398 = U+00D8 : LATIN CAPITAL LETTER O WITH STROKE
Œ = 352 = 234 = EA = C592 = U+0152 : LATIN CAPITAL LIGATURE OE
º = 353 = 235 = EB = C2BA = U+00BA : MASCULINE ORDINAL INDICATOR
æ = 361 = 241 = F1 = C3A6 = U+00E6 : LATIN SMALL LETTER AE
ı = 365 = 245 = F5 = C4B1 = U+0131 : LATIN SMALL LETTER DOTLESS I
ł = 370 = 248 = F8 = C582 = U+0142 : LATIN SMALL LETTER L WITH STROKE
ø = 371 = 249 = F9 = C3B8 = U+00F8 : LATIN SMALL LETTER O WITH STROKE
œ = 372 = 250 = FA = C593 = U+0153 : LATIN SMALL LIGATURE OE
ß = 373 = 251 = FB = C39F = U+00DF : LATIN SMALL LETTER SHARP S

--
Greetings

  Pete

Encryption, n.:
	A powerful algorithmic encoding technique employed in the
	creation of computer manuals.
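As an aside for anyone who wants to use a listing like the one in this message programmatically, the following Python sketch parses rows of the form `glyph = oct = dec = hex = UTF8 = U+XXXX : NAME` into a byte-to-character map and checks that the three numeric columns agree. The function name `parse_rows` and the sample rows chosen are assumptions for illustration; the rows themselves are copied from the listing.

```python
import re

# glyph = octal = decimal = hex = UTF-8 bytes = U+code point : name
ROW_RE = re.compile(
    r"^(.*?) = ([0-7]+) = (\d+) = ([0-9A-F]+) = ([0-9A-F]+)"
    r" = U\+([0-9A-F]{4,6}) : (.+)$"
)


def parse_rows(lines):
    """Map each byte value to the Unicode character named in its row.

    Lines that do not match the row pattern (such as ';' comment lines)
    are skipped. Matching first, rather than skipping on a leading ';',
    keeps the row for the semicolon glyph itself.
    """
    table = {}
    for line in lines:
        match = ROW_RE.match(line)
        if match is None:
            continue
        _glyph, octal, dec, hexa, _utf8, codepoint, _name = match.groups()
        code = int(dec)
        if int(octal, 8) != code or int(hexa, 16) != code:
            raise ValueError(f"inconsistent numeric columns: {line!r}")
        table[code] = chr(int(codepoint, 16))
    return table


rows = [
    "; Standard PostScript Glyphs (Adobe)",
    "' = 47 = 39 = 27 = 27 = U+2019 : RIGHT SINGLE QUOTATION MARK",
    "; = 73 = 59 = 3B = 3B = U+003B : SEMICOLON",
    "= = 75 = 61 = 3D = 3D = U+003D : EQUALS SIGN",
    "` = 140 = 96 = 60 = 60 = U+2018 : LEFT SINGLE QUOTATION MARK",
    "¤ = 250 = 168 = A8 = C2A4 = U+00A4 : CURRENCY SIGN",
]
table = parse_rows(rows)
print(sorted(hex(code) for code in table))
```

A map built this way could then be applied to the bytes of a string literal, with the caveat raised elsewhere in the thread that a file may re-encode its fonts and so use a different vector.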