* prettify symbols question
@ 2020-11-11 17:01 Alfred M. Szmidt
2020-11-12 14:59 ` Eli Zaretskii
0 siblings, 1 reply; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-11 17:01 UTC (permalink / raw)
To: emacs-devel
[-- Attachment #1: Type: text/plain, Size: 343 bytes --]
Not sure if this is better for help-gnu-emacs or here.
What would be the proper way to handle, say, #o210 in prettify-symbols?
I've attached a simple test; I would expect to see the #o210 sequence
in the file to be shown as a unicode lambda, but nothing changes -- I
suspect it is due to some encoding mismatch between the buffer and the
string.
[-- Attachment #2: prettify-test.el --]
[-- Type: application/emacs-lisp, Size: 282 bytes --]
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-11 17:01 prettify symbols question Alfred M. Szmidt
@ 2020-11-12 14:59 ` Eli Zaretskii
2020-11-12 15:17 ` Alfred M. Szmidt
0 siblings, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-12 14:59 UTC (permalink / raw)
To: Alfred M. Szmidt; +Cc: emacs-devel
> From: "Alfred M. Szmidt" <ams@gnu.org>
> Date: Wed, 11 Nov 2020 12:01:37 -0500
>
> What would be the proper way to handle, say, #o210 in prettify-symbols?
>
> I've attached a simple test; I would expect to see the #o210 sequence
> in the file to be shown as a unicode lambda, but nothing changes -- I
> suspect it is due to some encoding mismatch between the buffer and the
> string.
prettify-symbols-mode doesn't act on text in comments, see
'prettify-symbols-default-compose-p'. If you move your #o210 out of
the comment, it should get displayed as you expect.
You can replace 'prettify-symbols-default-compose-p' with your own
function, and set up 'prettify-symbols-compose-predicate' to use it
instead of the default predicate, if you want to prettify stuff in
comments.
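A minimal sketch of such a replacement predicate, assuming composition is wanted everywhere, including comments and strings. The names my-prettify-compose-always-p and my-prettify-setup are made up for illustration, and the "lambda" entry is only an example mapping, not the one from the attached test:

```elisp
(require 'prog-mode)

(defun my-prettify-compose-always-p (_start _end _match)
  "Hypothetical predicate: allow composition of every match.
Unlike `prettify-symbols-default-compose-p', it does not reject
matches inside comments or strings, or next to symbol characters."
  t)

(defun my-prettify-setup ()
  "Hypothetical hook function enabling the loose predicate."
  ;; Example mapping only; substitute the sequence you actually need.
  (setq-local prettify-symbols-alist '(("lambda" . #x3BB)))
  (setq-local prettify-symbols-compose-predicate
              #'my-prettify-compose-always-p)
  (prettify-symbols-mode 1))

(add-hook 'emacs-lisp-mode-hook #'my-prettify-setup)
```

The predicate is called with the match start, the match end, and the matched string; returning non-nil lets the composition happen.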
If the above doesn't work, then maybe it _is_ related to encoding.
What does the mode line say about 'buffer-file-coding-system' when you
visit this file?
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-12 14:59 ` Eli Zaretskii
@ 2020-11-12 15:17 ` Alfred M. Szmidt
2020-11-12 15:38 ` Eli Zaretskii
0 siblings, 1 reply; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-12 15:17 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
> What would be the proper way to handle, say, #o210 in prettify-symbols?
>
> I've attached a simple test; I would expect to see the #o210 sequence
> in the file to be shown as a unicode lambda, but nothing changes -- I
> suspect it is due to some encoding mismatch between the buffer and the
> string.
prettify-symbols-mode doesn't act on text in comments, see
'prettify-symbols-default-compose-p'. If you move your #o210 out of
the comment, it should get displayed as you expect.
Ah, that explains some.
You can replace 'prettify-symbols-default-compose-p' with your own
function, and set up 'prettify-symbols-compose-predicate' to use it
instead of the default predicate, if you want to prettify stuff in
comments.
Thank you for the tip, that will be useful (I need this to act on all
sequences even in symbols).
If the above doesn't work, then maybe it _is_ related to encoding.
What does the mode line say about 'buffer-file-coding-system' when you
visit this file?
So when the buffer-file-coding-system is utf-8-unix everything works
(where also the sequence is not acted on in comments). But when the
buffer is raw-text-unix, it does not work for #o210, but works for say
#o10. Some multi-byte thing going on?
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-12 15:17 ` Alfred M. Szmidt
@ 2020-11-12 15:38 ` Eli Zaretskii
2020-11-12 16:14 ` Eli Zaretskii
2020-11-13 8:27 ` prettify symbols question Alfred M. Szmidt
0 siblings, 2 replies; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-12 15:38 UTC (permalink / raw)
To: Alfred M. Szmidt; +Cc: emacs-devel
> From: "Alfred M. Szmidt" <ams@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Thu, 12 Nov 2020 10:17:05 -0500
>
> If the above doesn't work, then maybe it _is_ related to encoding.
> What does the mode line say about 'buffer-file-coding-system' when you
> visit this file?
>
> So when the buffer-file-coding-system is utf-8-unix everything works
> (where also the sequence is not acted on in comments). But when the
> buffer is raw-text-unix, it does not work for #o210, but works for say
> #o10. Some multi-byte thing going on?
Yes, raw-text means the buffer includes raw bytes, not characters.
Emacs doesn't do anything useful with raw bytes above 127, and in
particular doesn't interpret them as characters.
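A small sketch of why #o10 behaves differently from #o210 here, using only built-in conversion functions. It can be evaluated with M-: or in *scratch*; the variable name raw is just for illustration:

```elisp
;; Bytes below 128 are ordinary ASCII characters, so #o10 is character 8.
(unibyte-char-to-multibyte #o10)

;; A byte of 128 or above becomes a raw-byte character instead.
(let ((raw (unibyte-char-to-multibyte #o210)))
  (list raw                               ; 4194184, i.e. #x3FFF88
        (multibyte-char-to-unibyte raw)   ; back to the byte, 136
        (eq raw #x3BB)))                  ; nil: it is not U+03BB
```

So an alist entry keyed on a real character never matches the raw byte unless the buffer was decoded with a coding system that maps that byte to a character.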
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-12 15:38 ` Eli Zaretskii
@ 2020-11-12 16:14 ` Eli Zaretskii
2020-11-12 20:53 ` Alfred M. Szmidt
2020-11-13 8:27 ` prettify symbols question Alfred M. Szmidt
1 sibling, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-12 16:14 UTC (permalink / raw)
To: ams; +Cc: emacs-devel
> Date: Thu, 12 Nov 2020 17:38:12 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
>
> > So when the buffer-file-coding-system is utf-8-unix everything works
> > (where also the sequence is not acted on in comments). But when the
> > buffer is raw-text-unix, it does not work for #o210, but works for say
> > #o10. Some multi-byte thing going on?
>
> Yes, raw-text means the buffer includes raw bytes, not characters.
> Emacs doesn't do anything useful with raw bytes above 127, and in
> particular doesn't interpret them as characters.
Btw, in what encoding does \210 stand for GREEK SMALL LETTER LAMBDA?
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-12 16:14 ` Eli Zaretskii
@ 2020-11-12 20:53 ` Alfred M. Szmidt
2020-11-12 21:12 ` Basil L. Contovounesios
2020-11-13 7:24 ` Eli Zaretskii
0 siblings, 2 replies; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-12 20:53 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
> > So when the buffer-file-coding-system is utf-8-unix everything works
> > (where also the sequence is not acted on in comments). But when the
> > buffer is raw-text-unix, it does not work for #o210, but works for say
> > #o10. Some multi-byte thing going on?
>
> Yes, raw-text means the buffer includes raw bytes, not characters.
> Emacs doesn't do anything useful with raw bytes above 127, and in
> particular doesn't interpret them as characters.
Btw, in what encoding does \210 stand for GREEK SMALL LETTER LAMBDA?
The Lisp Machine character set -- there is a long story that I could
tell about why, if anyone is curious, but it is very much a tangent.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-12 20:53 ` Alfred M. Szmidt
@ 2020-11-12 21:12 ` Basil L. Contovounesios
2020-11-12 21:25 ` Drew Adams
2020-11-13 7:44 ` Eli Zaretskii
2020-11-13 7:24 ` Eli Zaretskii
1 sibling, 2 replies; 29+ messages in thread
From: Basil L. Contovounesios @ 2020-11-12 21:12 UTC (permalink / raw)
To: Alfred M. Szmidt; +Cc: Eli Zaretskii, emacs-devel
"Alfred M. Szmidt" <ams@gnu.org> writes:
> Btw, in what encoding does \210 stand for GREEK SMALL LETTER LAMBDA?
>
> The Lisp Machine character set -- there is a long story that I could
> tell about why, if anyone is curious, but it is very much a tangent.
Feel free to CC me if you end up going on that tangent. :)
--
Basil
^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: prettify symbols question
2020-11-12 21:12 ` Basil L. Contovounesios
@ 2020-11-12 21:25 ` Drew Adams
2020-11-13 7:44 ` Eli Zaretskii
1 sibling, 0 replies; 29+ messages in thread
From: Drew Adams @ 2020-11-12 21:25 UTC (permalink / raw)
To: Basil L. Contovounesios, Alfred M. Szmidt; +Cc: Eli Zaretskii, emacs-devel
> > Btw, in what encoding does \210 stand for GREEK SMALL LETTER
> LAMBDA?
> >
> > The Lisp Machine character set -- there is a long story that I could
> > tell about why, if anyone is curious, but it is very much a tangent.
>
> Feel free to CC me if you end up going on that tangent. :)
+1
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-12 21:12 ` Basil L. Contovounesios
2020-11-12 21:25 ` Drew Adams
@ 2020-11-13 7:44 ` Eli Zaretskii
1 sibling, 0 replies; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-13 7:44 UTC (permalink / raw)
To: Basil L. Contovounesios; +Cc: ams, emacs-devel
> From: "Basil L. Contovounesios" <contovob@tcd.ie>
> Date: Thu, 12 Nov 2020 21:12:31 +0000
> Cc: Eli Zaretskii <eliz@gnu.org>, emacs-devel@gnu.org
>
> "Alfred M. Szmidt" <ams@gnu.org> writes:
>
> > Btw, in what encoding does \210 stand for GREEK SMALL LETTER LAMBDA?
> >
> > The Lisp Machine character set -- there is a long story that I could
> > tell about why, if anyone is curious, but it is very much a tangent.
>
> Feel free to CC me if you end up going on that tangent. :)
There's emacs-tangents@gnu.org for such tangents, personal email is
not necessarily necessary.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-12 20:53 ` Alfred M. Szmidt
2020-11-12 21:12 ` Basil L. Contovounesios
@ 2020-11-13 7:24 ` Eli Zaretskii
2020-11-13 10:15 ` Alfred M. Szmidt
2020-11-13 11:17 ` Alfred M. Szmidt
1 sibling, 2 replies; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-13 7:24 UTC (permalink / raw)
To: Alfred M. Szmidt; +Cc: emacs-devel
> From: "Alfred M. Szmidt" <ams@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Thu, 12 Nov 2020 15:53:05 -0500
>
> > > So when the buffer-file-coding-system is utf-8-unix everything works
> > > (where also the sequence is not acted on in comments). But when the
> > > buffer is raw-text-unix, it does not work for #o210, but works for say
> > > #o10. Some multi-byte thing going on?
> >
> > Yes, raw-text means the buffer includes raw bytes, not characters.
> > Emacs doesn't do anything useful with raw bytes above 127, and in
> > particular doesn't interpret them as characters.
>
> Btw, in what encoding does \210 stand for GREEK SMALL LETTER LAMBDA?
>
> The Lisp Machine character set
Emacs doesn't support such an encoding/charset, does it? Maybe it
should? Is this character set documented somewhere? The Lisp Machine
Manual I have seems to say that \210 is BS or Overstrike, not LAMBDA
(https://tumbleweed.nu/r/lm-3/uv/chinual.html#The-Character-Set).
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-13 7:24 ` Eli Zaretskii
@ 2020-11-13 10:15 ` Alfred M. Szmidt
2020-11-13 11:17 ` Alfred M. Szmidt
1 sibling, 0 replies; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-13 10:15 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
> The Lisp Machine character set
Emacs doesn't support such an encoding/charset, does it? Maybe it
should?
No, not yet. I can see about doing that.
Is this character set documented somewhere? The Lisp Machine
Manual I have seems to say that \210 is BS or Overstrike, not LAMBDA
(https://tumbleweed.nu/r/lm-3/uv/chinual.html#The-Character-Set).
That is how the Lisp Machine sees things (there is an implicit
conversion of the files from the host to the Lisp Machine when read
over Chaosnet); e.g, newline is #o215, but when files are stored on a
Unix host they have been translated so that newline #o215 becomes
#o12, similar for tab, etc so things are viewable on ASCII systems.
So there are two encodings, one is native to the Lisp Machine (where
#o215 is left as is), and the other one for UNIX (where #o215, etc,
are translated).
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-13 7:24 ` Eli Zaretskii
2020-11-13 10:15 ` Alfred M. Szmidt
@ 2020-11-13 11:17 ` Alfred M. Szmidt
2020-11-13 12:22 ` Eli Zaretskii
1 sibling, 1 reply; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-13 11:17 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
> The Lisp Machine character set
Emacs doesn't support such an encoding/charset, does it? Maybe it
should? Is this character set documented somewhere? The Lisp Machine
Manual I have seems to say that \210 is BS or Overstrike, not LAMBDA
(https://tumbleweed.nu/r/lm-3/uv/chinual.html#The-Character-Set).
That now contains both the Unix stored files, and the native one (also
attached).
I'm slightly confused as to how to add a new coding system. Do I need to
first add a charset (the converted one would have :ascii-compatible-p
t, and the native one nil)? I found the manual slightly sparse on this
front.
===File ~/lispm-charset.text================================
000 center-dot              040 space  100 @  140 `
001 down arrow              041 !      101 A  141 a
002 alpha                   042 "      102 B  142 b
003 beta                    043 #      103 C  143 c
004 and-sign                044 $      104 D  144 d
005 not-sign                045 %      105 E  145 e
006 epsilon                 046 &      106 F  146 f
007 pi                      047 '      107 G  147 g
010 lambda                  050 (      110 H  150 h
011 gamma                   051 )      111 I  151 i
012 delta                   052 *      112 J  152 j
013 uparrow                 053 +      113 K  153 k
014 plus-minus              054 ,      114 L  154 l
015 circle-plus             055 -      115 M  155 m
016 infinity                056 .      116 N  156 n
017 partial delta           057 /      117 O  157 o
020 left horseshoe          060 0      120 P  160 p
021 right horseshoe         061 1      121 Q  161 q
022 up horseshoe            062 2      122 R  162 r
023 down horseshoe          063 3      123 S  163 s
024 universal quantifier    064 4      124 T  164 t
025 existential quantifier  065 5      125 U  165 u
026 circle-X                066 6      126 V  166 v
027 double-arrow            067 7      127 W  167 w
030 left arrow              070 8      130 X  170 x
031 right arrow             071 9      131 Y  171 y
032 not-equals              072 :      132 Z  172 z
033 diamond (altmode)       073 ;      133 [  173 {
034 less-or-equal           074 <      134 \  174 |
035 greater-or-equal        075 =      135 ]  175 }
036 equivalence             076 >      136 ^  176 ~
037 or                      077 ?      137 _  177 @ref{ctl-qm}

200 Null character   210 Overstrike   220 Stop-output  230 Roman-iv
201 Break            211 Tab          221 Abort        231 Hand-up
202 Clear            212 Line         222 Resume       232 Hand-down
203 Call             213 Delete       223 Status       233 Hand-left
204 Terminal escape  214 Page         224 End          234 Hand-right
205 Macro/backnext   215 Return       225 Roman-i      235 System
206 Help             216 Quote        226 Roman-ii     236 Network
207 Rubout           217 Hold-output  227 Roman-iii
237-377 reserved for the future
The Lisp Machine Character Set
(all numbers in octal)
\f
000 center-dot              040 space  100 @  140 `
001 down arrow              041 !      101 A  141 a
002 alpha                   042 "      102 B  142 b
003 beta                    043 #      103 C  143 c
004 and-sign                044 $      104 D  144 d
005 not-sign                045 %      105 E  145 e
006 epsilon                 046 &      106 F  146 f
007 pi                      047 '      107 G  147 g
210 lambda                  050 (      110 H  150 h
211 gamma                   051 )      111 I  151 i
212 delta                   052 *      112 J  152 j
213 uparrow                 053 +      113 K  153 k
214 plus-minus              054 ,      114 L  154 l
215 circle-plus             055 -      115 M  155 m
016 infinity                056 .      116 N  156 n
017 partial delta           057 /      117 O  157 o
020 left horseshoe          060 0      120 P  160 p
021 right horseshoe         061 1      121 Q  161 q
022 up horseshoe            062 2      122 R  162 r
023 down horseshoe          063 3      123 S  163 s
024 universal quantifier    064 4      124 T  164 t
025 existential quantifier  065 5      125 U  165 u
026 circle-X                066 6      126 V  166 v
027 double-arrow            067 7      127 W  167 w
030 left arrow              070 8      130 X  170 x
031 right arrow             071 9      131 Y  171 y
032 not-equals              072 :      132 Z  172 z
033 diamond (altmode)       073 ;      133 [  173 {
034 less-or-equal           074 <      134 \  174 |
035 greater-or-equal        075 =      135 ]  175 }
036 equivalence             076 >      136 ^  176 ~
037 or                      077 ?      137 _  177 @ref{ctl-qm}

200 Null character    10 Overstrike   220 Stop-output  230 Roman-iv
201 Break             11 Tab          221 Abort        231 Hand-up
202 Clear             15 Line         222 Resume       232 Hand-down
203 Call              13 Delete       223 Status       233 Hand-left
204 Terminal escape   14 Page         224 End          234 Hand-right
205 Macro/backnext    12 Return       225 Roman-i      235 System
206 Help             216 Quote        226 Roman-ii     236 Network
207 Rubout           217 Hold-output  227 Roman-iii
237-377 reserved for the future
The Lisp Machine Character Set
as stored on UNIX
(all numbers in octal)
============================================================
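Comparing the two tables, the Unix-stored form differs from the native one only in that codes #o010-#o015 and #o210-#o215 trade places. A rough sketch of the native-to-Unix direction follows; the names my-lispm-native-to-unix-alist and my-lispm-native-to-unix are hypothetical, and the pairs are simply read off the tables above:

```elisp
;; Each entry is (NATIVE . UNIX), all codes in octal.
;; Any code not listed is assumed to be stored unchanged.
(defconst my-lispm-native-to-unix-alist
  '((#o010 . #o210) (#o011 . #o211) (#o012 . #o212)   ; lambda gamma delta
    (#o013 . #o213) (#o014 . #o214) (#o015 . #o215)   ; uparrow plus-minus circle-plus
    (#o210 . #o010) (#o211 . #o011) (#o212 . #o015)   ; Overstrike Tab Line
    (#o213 . #o013) (#o214 . #o014) (#o215 . #o012))  ; Delete Page Return
  "Hypothetical native-to-Unix translation of Lisp Machine codes.")

(defun my-lispm-native-to-unix (byte)
  "Return the Unix-stored code for the native Lisp Machine code BYTE."
  (alist-get byte my-lispm-native-to-unix-alist byte))

;; Return, lambda, and the letter A:
(mapcar #'my-lispm-native-to-unix '(#o215 #o010 #o101))
;; => (10 136 65), i.e. (#o12 #o210 #o101)
```

Note that Line and Return do not swap with the same-numbered graphic codes: native #o212 (Line) is stored as #o15, and native #o215 (Return) as #o12.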
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-13 11:17 ` Alfred M. Szmidt
@ 2020-11-13 12:22 ` Eli Zaretskii
2020-11-13 13:31 ` Alfred M. Szmidt
0 siblings, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-13 12:22 UTC (permalink / raw)
To: Alfred M. Szmidt; +Cc: emacs-devel
> From: "Alfred M. Szmidt" <ams@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Fri, 13 Nov 2020 06:17:36 -0500
>
> I'm slightly confused as to how to add a new coding system; do I need to
> first add a charset (the converted one would be an :ascii-compatible-p
> t, and the native nil?)?
Yes. You will also need to prepare a mapping file, see below. See
the example of how we define, for example, coding-systems for
MS-Windows codepages:
  (define-charset 'windows-1250
    "WINDOWS-1250 (Central Europe)"
    :short-name "WINDOWS-1250"
    :ascii-compatible-p t
    :code-space [0 255]
    :map "CP1250")

  (define-coding-system 'windows-1250
    "windows-1250 (Central European) encoding (MIME: WINDOWS-1250)"
    :coding-type 'charset
    :mnemonic ?*
    :charset-list '(windows-1250)
    :mime-charset 'windows-1250)
(The mapping files are in etc/charsets; the :map attribute of the
charset names the mapping file to use.)
> I found the manual slightly sparse on this front.
That's on purpose. The ELisp manual says:
How to define a coding system is an arcane matter, and is not
documented here.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-13 12:22 ` Eli Zaretskii
@ 2020-11-13 13:31 ` Alfred M. Szmidt
2020-11-13 13:47 ` Eli Zaretskii
0 siblings, 1 reply; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-13 13:31 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
> I'm slightly confused as to how to add a new coding system; do I need to
> first add a charset (the converted one would be an :ascii-compatible-p
> t, and the native nil?)?
Yes. You will also need to prepare a mapping file, see below. See
the example of how we define, for example, coding-systems for
MS-Windows codepages:
(The mapping files are in etc/charsets; the :map attribute of the
charset names the mapping file to use.)
Which are generated from the admin/charsets files, which in turn are
sometimes pulled in from glibc. So the easiest route is to add a
LISPM-like charset mapping following glibc (and then also see if it
can be included there). And then do:
  (define-charset 'lispm
    "LISPM"
    :short-name "LISPM"
    :ascii-compatible-p nil
    :code-space [0 255]
    :map "LISPM")

  (define-coding-system 'lispm
    "Lisp Machine encoding"
    :coding-type 'charset
    :mnemonic ?L
    :charset-list '(lispm))
So that sorts it out for the native one, but what should be done for
the Unix friendly mapping? LISPM-ASCII, and similar as above?
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-13 13:31 ` Alfred M. Szmidt
@ 2020-11-13 13:47 ` Eli Zaretskii
2020-11-13 14:47 ` new coding system (was: Re: prettify symbols question) Alfred M. Szmidt
0 siblings, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-13 13:47 UTC (permalink / raw)
To: Alfred M. Szmidt; +Cc: emacs-devel
> From: "Alfred M. Szmidt" <ams@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Fri, 13 Nov 2020 08:31:42 -0500
>
> (define-charset 'lispm
> "LISPM"
> :short-name "LISPM"
> :ascii-compatible-p nil
> :code-space [0 255]
> :map "LISPM")
>
> (define-coding-system 'lispm
> "Lisp Machine encoding"
> :coding-type 'charset
> :mnemonic ?L
> :charset-list '(lispm))
>
> So that sorts it out for the native one, but what should be done for
> the Unix friendly mapping? LISPM-ASCII, and similar as above?
Something like that. Although I'm not sure about the name. But why
do you need the native variant? If we only need one charset, for how
it is seen on Unix, we could call that 'lispm'.
^ permalink raw reply [flat|nested] 29+ messages in thread
* new coding system (was: Re: prettify symbols question)
2020-11-13 13:47 ` Eli Zaretskii
@ 2020-11-13 14:47 ` Alfred M. Szmidt
2020-11-13 14:59 ` Eli Zaretskii
2020-11-13 17:32 ` new coding system Andreas Schwab
0 siblings, 2 replies; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-13 14:47 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
[-- Attachment #1: Type: text/plain, Size: 8156 bytes --]
> (define-charset 'lispm
> "LISPM"
> :short-name "LISPM"
> :ascii-compatible-p nil
> :code-space [0 255]
> :map "LISPM")
>
> (define-coding-system 'lispm
> "Lisp Machine encoding"
> :coding-type 'charset
> :mnemonic ?L
> :charset-list '(lispm))
>
> So that sorts it out for the native one, but what should be done for
> the Unix friendly mapping? LISPM-ASCII, and similar as above?
Something like that. Although I'm not sure about the name. But why
do you need the native variant? If we only need one charset, for how
it is seen on Unix, we could call that 'lispm'.
Right, it is easy enough to convert if one has native files.
So I've created a LISPM charmap, and a LISPM charset map based on
that. After calling define-charset and define-coding-system, if I now
try to open a Lisp Machine file in the lispm coding system, it seems to
be unable to handle various characters, e.g., #o210.
These default coding systems were tried to encode text
in the buffer ‘lispm-char-test.text’:
(lispm-unix (1 . 0) (59 . 1) (117 . 2) (175 . 3) (233 . 4) (291 . 5)
(349 . 6) (407 . 7) (465 . 4194184) (523 . 4194185) (581 . 4194186))
However, each of them encountered characters it couldn’t encode:
....
Is there something that I forgot to do?
===File ~/emacs/admin/charsets/glibc/LISPM.gz===============
<code_set_name> LISPM
<comment_char> %
<escape_char> /
% version: 1.0
% source: The Lisp Machine Manual, 6th ed.
CHARMAP
<U00B7> /x00 MIDDLE DOT
<U2193> /x01 DOWNWARDS ARROW
<U03B1> /x02 GREEK SMALL LETTER ALPHA
<U03B2> /x03 GREEK SMALL LETTER BETA
<U2227> /x04 LOGICAL AND
<U00AC> /x05 NOT SIGN
<U03B5> /x06 GREEK SMALL LETTER EPSILON
<U03C0> /x07 GREEK SMALL LETTER PI
<U03BB> /x88 GREEK SMALL LETTER LAMDA
<U03B3> /x89 GREEK SMALL LETTER GAMMA
<U03B4> /x8a GREEK SMALL LETTER DELTA
<U2191> /x8b UPWARDS ARROW
<U00B1> /x8c PLUS-MINUS SIGN
<U2295> /x8d CIRCLED PLUS
<U221E> /x0e INFINITY
<U2202> /x0f PARTIAL DIFFERENTIAL
<U2282> /x10 SUBSET OF
<U2283> /x11 SUPERSET OF
<U2229> /x12 INTERSECTION
<U222A> /x13 UNION
<U2200> /x14 FOR ALL
<U2203> /x15 THERE EXISTS
<U2297> /x16 CIRCLED TIMES
<U2194> /x17 LEFT RIGHT ARROW
<U2190> /x18 LEFTWARDS ARROW
<U2192> /x19 RIGHTWARDS ARROW
<U2260> /x1a NOT EQUAL TO
<U25CA> /x1b LOZENGE
<U2264> /x1c LESS-THAN OR EQUAL TO
<U2265> /x1d GREATER-THAN OR EQUAL TO
<U2261> /x1e IDENTICAL TO
<U2228> /x1f LOGICAL OR
<U0020> /x20 SPACE
<U0021> /x21 EXCLAMATION MARK
<U0022> /x22 QUOTATION MARK
<U0023> /x23 NUMBER SIGN
<U0024> /x24 DOLLAR SIGN
<U0025> /x25 PERCENT SIGN
<U0026> /x26 AMPERSAND
<U0027> /x27 APOSTROPHE
<U0028> /x28 LEFT PARENTHESIS
<U0029> /x29 RIGHT PARENTHESIS
<U002A> /x2a ASTERISK
<U002B> /x2b PLUS SIGN
<U002C> /x2c COMMA
<U002D> /x2d HYPHEN-MINUS
<U002E> /x2e FULL STOP
<U002F> /x2f SOLIDUS
<U0030> /x30 DIGIT ZERO
<U0031> /x31 DIGIT ONE
<U0032> /x32 DIGIT TWO
<U0033> /x33 DIGIT THREE
<U0034> /x34 DIGIT FOUR
<U0035> /x35 DIGIT FIVE
<U0036> /x36 DIGIT SIX
<U0037> /x37 DIGIT SEVEN
<U0038> /x38 DIGIT EIGHT
<U0039> /x39 DIGIT NINE
<U003A> /x3a COLON
<U003B> /x3b SEMICOLON
<U003C> /x3c LESS-THAN SIGN
<U003D> /x3d EQUALS SIGN
<U003E> /x3e GREATER-THAN SIGN
<U003F> /x3f QUESTION MARK
<U0040> /x40 COMMERCIAL AT
<U0041> /x41 LATIN CAPITAL LETTER A
<U0042> /x42 LATIN CAPITAL LETTER B
<U0043> /x43 LATIN CAPITAL LETTER C
<U0044> /x44 LATIN CAPITAL LETTER D
<U0045> /x45 LATIN CAPITAL LETTER E
<U0046> /x46 LATIN CAPITAL LETTER F
<U0047> /x47 LATIN CAPITAL LETTER G
<U0048> /x48 LATIN CAPITAL LETTER H
<U0049> /x49 LATIN CAPITAL LETTER I
<U004A> /x4a LATIN CAPITAL LETTER J
<U004B> /x4b LATIN CAPITAL LETTER K
<U004C> /x4c LATIN CAPITAL LETTER L
<U004D> /x4d LATIN CAPITAL LETTER M
<U004E> /x4e LATIN CAPITAL LETTER N
<U004F> /x4f LATIN CAPITAL LETTER O
<U0050> /x50 LATIN CAPITAL LETTER P
<U0051> /x51 LATIN CAPITAL LETTER Q
<U0052> /x52 LATIN CAPITAL LETTER R
<U0053> /x53 LATIN CAPITAL LETTER S
<U0054> /x54 LATIN CAPITAL LETTER T
<U0055> /x55 LATIN CAPITAL LETTER U
<U0056> /x56 LATIN CAPITAL LETTER V
<U0057> /x57 LATIN CAPITAL LETTER W
<U0058> /x58 LATIN CAPITAL LETTER X
<U0059> /x59 LATIN CAPITAL LETTER Y
<U005A> /x5a LATIN CAPITAL LETTER Z
<U005B> /x5b LEFT SQUARE BRACKET
<U005C> /x5c REVERSE SOLIDUS
<U005D> /x5d RIGHT SQUARE BRACKET
<U005E> /x5e CIRCUMFLEX ACCENT
<U005F> /x5f LOW LINE
<U0060> /x60 GRAVE ACCENT
<U0061> /x61 LATIN SMALL LETTER A
<U0062> /x62 LATIN SMALL LETTER B
<U0063> /x63 LATIN SMALL LETTER C
<U0064> /x64 LATIN SMALL LETTER D
<U0065> /x65 LATIN SMALL LETTER E
<U0066> /x66 LATIN SMALL LETTER F
<U0067> /x67 LATIN SMALL LETTER G
<U0068> /x68 LATIN SMALL LETTER H
<U0069> /x69 LATIN SMALL LETTER I
<U006A> /x6a LATIN SMALL LETTER J
<U006B> /x6b LATIN SMALL LETTER K
<U006C> /x6c LATIN SMALL LETTER L
<U006D> /x6d LATIN SMALL LETTER M
<U006E> /x6e LATIN SMALL LETTER N
<U006F> /x6f LATIN SMALL LETTER O
<U0070> /x70 LATIN SMALL LETTER P
<U0071> /x71 LATIN SMALL LETTER Q
<U0072> /x72 LATIN SMALL LETTER R
<U0073> /x73 LATIN SMALL LETTER S
<U0074> /x74 LATIN SMALL LETTER T
<U0075> /x75 LATIN SMALL LETTER U
<U0076> /x76 LATIN SMALL LETTER V
<U0077> /x77 LATIN SMALL LETTER W
<U0078> /x78 LATIN SMALL LETTER X
<U0079> /x79 LATIN SMALL LETTER Y
<U007A> /x7a LATIN SMALL LETTER Z
<U007B> /x7b LEFT CURLY BRACKET
<U007C> /x7c VERTICAL LINE
<U007D> /x7d RIGHT CURLY BRACKET
<U007E> /x7e TILDE
% 177 ctl-qm
% 200 Null character
% 201 Break
% 202 Clear
% 203 Call
% 204 Terminal escape
% 205 Macro/backnext
% 206 Help
% 207 Rubout
<U0008> /x08 BACKSPACE (BS) / Overstrike
<U0009> /x09 CHARACTER TABULATION (HT) / Tab
<U000D> /x0d CARRIAGE RETURN (CR) / Line
<U000B> /x0b LINE TABULATION (VT) / Delete
<U000C> /x0c FORM FEED (FF) / Page
<U000A> /x0a LINE FEED (LF) / Return
% 216 Quote
% 217 Hold-output
% 220 Stop-output
% 221 Abort
% 222 Resume
% 223 Status
% 224 End
% 225 Roman-i
% 226 Roman-ii
% 227 Roman-iii
% 230 Roman-iv
% 231 Hand-up
% 232 Hand-down
% 233 Hand-left
% 234 Hand-right
% 235 System
% 236 Network
% 237-377 reserved for the future
END CHARMAP
============================================================
===File ~/emacs/etc/charsets/LISPM.map======================
# Generated from LISPM in localedata/charmaps of glibc
0x00 0x00B7
0x01 0x2193
0x02-0x03 0x03B1
0x04 0x2227
0x05 0x00AC
0x06 0x03B5
0x07 0x03C0
0x08-0x0D 0x0008
0x0E 0x221E
0x0F 0x2202
0x10-0x11 0x2282
0x12-0x13 0x2229
0x14 0x2200
0x15 0x2203
0x16 0x2297
0x17 0x2194
0x18 0x2190
0x19 0x2192
0x1A 0x2260
0x1B 0x25CA
0x1C-0x1D 0x2264
0x1E 0x2261
0x1F 0x2228
0x20-0x7E 0x0020
0x88 0x03BB
0x89-0x8A 0x03B3
0x8B 0x2191
0x8C 0x00B1
0x8D 0x2295
============================================================
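For reading the map file above: a line of the form "0xAA-0xBB 0xCCCC" assigns consecutive code points, starting at 0xCCCC, to the codes 0xAA through 0xBB. A small sketch that expands one line into explicit pairs (the function name my-charset-map-expand-line is hypothetical, and it only handles the two line shapes that appear in this file):

```elisp
(defun my-charset-map-expand-line (line)
  "Expand one charset map LINE into a list of (CODE . CODEPOINT) pairs.
Return nil for lines that do not look like a mapping, such as comments."
  (when (string-match
         (concat "\\`0x\\([0-9A-Fa-f]+\\)"
                 "\\(?:-0x\\([0-9A-Fa-f]+\\)\\)?"
                 "[ \t]+0x\\([0-9A-Fa-f]+\\)")
         line)
    (let* ((from (string-to-number (match-string 1 line) 16))
           (to (if (match-string 2 line)
                   (string-to-number (match-string 2 line) 16)
                 from))
           (first-char (string-to-number (match-string 3 line) 16))
           (pairs '()))
      ;; Walk the range, pairing each code with the next code point.
      (dotimes (i (1+ (- to from)))
        (push (cons (+ from i) (+ first-char i)) pairs))
      (nreverse pairs))))

(my-charset-map-expand-line "0x89-0x8A 0x03B3")
;; => ((137 . 947) (138 . 948)), i.e. gamma and delta
```

Expanding every line this way makes it easy to compare the generated map against the charmap entry by entry.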
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: new coding system (was: Re: prettify symbols question)
2020-11-13 14:47 ` new coding system (was: Re: prettify symbols question) Alfred M. Szmidt
@ 2020-11-13 14:59 ` Eli Zaretskii
2020-11-13 17:11 ` Alfred M. Szmidt
2020-11-13 17:11 ` Alfred M. Szmidt
2020-11-13 17:32 ` new coding system Andreas Schwab
1 sibling, 2 replies; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-13 14:59 UTC (permalink / raw)
To: Alfred M. Szmidt; +Cc: emacs-devel
> From: "Alfred M. Szmidt" <ams@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Fri, 13 Nov 2020 09:47:16 -0500
>
> So I've created a LISPM charmap, and a LISPM charset map based on
> that. Then calling define-charset and define-coding-system, if I now
> try to open a Lisp machine file in the lispm coding it seems to be
> unable to handle the various characters; e.g., #o210.
>
> These default coding systems were tried to encode text
> in the buffer ‘lispm-char-test.text’:
> (lispm-unix (1 . 0) (59 . 1) (117 . 2) (175 . 3) (233 . 4) (291 . 5)
> (349 . 6) (407 . 7) (465 . 4194184) (523 . 4194185) (581 . 4194186))
> However, each of them encountered characters it couldn’t encode:
> ....
What does "M-x describe-character-set RET lispm RET" show?
And what was shown where you show the ellipsis?
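Besides describe-character-set, the mapping can be probed directly with decode-char and encode-char. A minimal sketch, with my-charset-roundtrip-p as a made-up helper name; the lispm charset is assumed to have been defined as earlier in this thread, so that call is guarded:

```elisp
(defun my-charset-roundtrip-p (charset code)
  "Return non-nil if CODE decodes in CHARSET and encodes back to CODE."
  (let ((ch (decode-char charset code)))
    (and ch (eql (encode-char ch charset) code))))

;; Sanity check against a charset that ships with Emacs.
(my-charset-roundtrip-p 'iso-8859-1 #xE9)

;; Only meaningful once a `lispm' charset has actually been defined.
(when (charsetp 'lispm)
  (list (decode-char 'lispm #o210)
        (my-charset-roundtrip-p 'lispm #o210)))
```

If decode-char returns nil for #o210, the map was not loaded as intended, which would point at the map file rather than at the coding system definition.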
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: new coding system (was: Re: prettify symbols question)
2020-11-13 14:59 ` Eli Zaretskii
@ 2020-11-13 17:11 ` Alfred M. Szmidt
2020-11-14 14:24 ` Eli Zaretskii
2020-11-13 17:11 ` Alfred M. Szmidt
1 sibling, 1 reply; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-13 17:11 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
[-- Attachment #1: Type: text/plain, Size: 1302 bytes --]
> So I've created a LISPM charmap, and a LISPM charset map based on
> that. Then calling define-charset and define-coding-system, if I now
> try to open a Lisp machine file in the lispm coding it seems to be
> unable to handle the various characters; e.g., #o210.
>
> These default coding systems were tried to encode text
> in the buffer ‘lispm-char-test.text’:
> (lispm-unix (1 . 0) (59 . 1) (117 . 2) (175 . 3) (233 . 4) (291 . 5)
> (349 . 6) (407 . 7) (465 . 4194184) (523 . 4194185) (581 . 4194186))
> However, each of them encountered characters it couldn’t encode:
> ....
What does "M-x describe-character-set RET lispm RET" show?
It says:
Character set: lispm
LISPM
Number of contained characters: 256
Map file: LISPM
Code space: [0 255]
And what was shown where you show the ellipsis?
These default coding systems were tried to encode text
in the buffer `lispm-char-test.text':
(lispm-unix (1 . 0) (59 . 1) (117 . 2) (175 . 3) (233 . 4) (291 . 5)
(349 . 6) (407 . 7) (465 . 4194184) (523 . 4194185) (581 . 4194186))
However, each of them encountered characters it couldn't encode:
lispm-unix cannot encode these: ^@ ^A ^B ^C ^D ^E ^F ^G \210 \211 ...
(where ^@ etc are #o0, #o1, etc and #o210 ...)
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: new coding system (was: Re: prettify symbols question)
2020-11-13 17:11 ` Alfred M. Szmidt
@ 2020-11-14 14:24 ` Eli Zaretskii
2020-11-14 15:29 ` Alfred M. Szmidt
0 siblings, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-14 14:24 UTC (permalink / raw)
To: Alfred M. Szmidt; +Cc: emacs-devel
> From: "Alfred M. Szmidt" <ams@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Fri, 13 Nov 2020 12:11:18 -0500
>
> These default coding systems were tried to encode text
> in the buffer `lispm-char-test.text':
> (lispm-unix (1 . 0) (59 . 1) (117 . 2) (175 . 3) (233 . 4) (291 . 5)
> (349 . 6) (407 . 7) (465 . 4194184) (523 . 4194185) (581 . 4194186))
> However, each of them encountered characters it couldn't encode:
> lispm-unix cannot encode these: ^@ ^A ^B ^C ^D ^E ^F ^G \210 \211 ...
>
> (where ^@ etc are #o0, #o1, etc and #o210 ...)
Is this the encoding on Unix systems? If so, maybe try without
mapping characters below ASCII 128, I'm not sure this is supported in
an ASCII-compatible encoding.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: new coding system (was: Re: prettify symbols question)
2020-11-14 14:24 ` Eli Zaretskii
@ 2020-11-14 15:29 ` Alfred M. Szmidt
2020-11-14 16:19 ` Eli Zaretskii
0 siblings, 1 reply; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-14 15:29 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
> From: "Alfred M. Szmidt" <ams@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Fri, 13 Nov 2020 12:11:18 -0500
>
> These default coding systems were tried to encode text
> in the buffer `lispm-char-test.text':
> (lispm-unix (1 . 0) (59 . 1) (117 . 2) (175 . 3) (233 . 4) (291 . 5)
> (349 . 6) (407 . 7) (465 . 4194184) (523 . 4194185) (581 . 4194186))
> However, each of them encountered characters it couldn't encode:
> lispm-unix cannot encode these: ^@ ^A ^B ^C ^D ^E ^F ^G \210 \211 ...
>
> (where ^@ etc are #o0, #o1, etc and #o210 ...)
Is this the encoding on Unix systems? If so, maybe try without
mapping characters below ASCII 128, I'm not sure this is supported in
an ASCII-compatible encoding.
I am not sure I understand. On Unix, #o0 maps to MIDDLE DOT, #o1
to DOWNWARDS ARROW, etc. The Lisp Machine character set isn't
compatible with ASCII -- the control characters have an entirely
different function. As I understood it, the charmap/charset is a
mapping from UCS-4 Unicode to whatever is on the target?
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: new coding system (was: Re: prettify symbols question)
2020-11-14 15:29 ` Alfred M. Szmidt
@ 2020-11-14 16:19 ` Eli Zaretskii
2020-11-23 20:40 ` Alfred M. Szmidt
0 siblings, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-14 16:19 UTC (permalink / raw)
To: Alfred M. Szmidt; +Cc: emacs-devel
> From: "Alfred M. Szmidt" <ams@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Sat, 14 Nov 2020 10:29:15 -0500
>
> Is this the encoding on Unix systems? If so, maybe try without
> mapping characters below ASCII 128, I'm not sure this is supported in
> an ASCII-compatible encoding.
>
> I am not sure I understand. On unix #o0 maps to the MIDDLE DOT, #o1
> to DOWNWARDS ARROW, etc.
If the low codes aren't identical to ASCII, then I think
ascii-compatible should be nil, and I think the relevant example to
follow is that of EBCDIC. I'd suggest to construct a map file by
hand, using EBCDIC maps as example, and see if that works.
If it doesn't work, we might need to bring Kenichi Handa on board of
the discussion.
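To illustrate what "not ASCII-compatible" means in practice, here is a small Python sketch. It uses Python's bundled cp037 codec (an EBCDIC US variant) purely as a stand-in; it is not the Emacs EBCDIC map, and the helper name is_ascii_compatible is hypothetical.

```python
def is_ascii_compatible(codec, sample=b"Az09 "):
    """True if every sample byte decodes to the character ASCII assigns to it."""
    return sample.decode(codec) == sample.decode("ascii")


# Latin-1 keeps the low 128 codes identical to ASCII; EBCDIC does not.
print("latin-1:", is_ascii_compatible("latin-1"))
print("cp037:  ", is_ascii_compatible("cp037"))

# In EBCDIC, "A" is not stored as byte 0x41.
print("A in cp037:", "A".encode("cp037").hex())
```

A character set that reassigns the low codes, as described for the Lisp Machine set, would fail the same check.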
> As I understood it, the charmap/charset is a mapping from UCS-4
> Unicode to whatever is on the target?
Not UCS-4, but Unicode codepoints (which is the same thing in
practice, but just so we get our terminology right.)
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: new coding system (was: Re: prettify symbols question)
2020-11-14 16:19 ` Eli Zaretskii
@ 2020-11-23 20:40 ` Alfred M. Szmidt
2020-11-23 20:49 ` Eli Zaretskii
0 siblings, 1 reply; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-23 20:40 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
> Is this the encoding on Unix systems? If so, maybe try without
> mapping characters below ASCII 128, I'm not sure this is supported in
> an ASCII-compatible encoding.
>
> I am not sure I understand. On unix #o0 maps to the MIDDLE DOT, #o1
> to DOWNWARDS ARROW, etc.
   If the low codes aren't identical to ASCII, then I think
   ascii-compatible should be nil, and I think the relevant example to
   follow is that of EBCDIC. I'd suggest to construct a map file by
   hand, using EBCDIC maps as example, and see if that works.
It didn't, I took the EBCDIC-US map, and replaced the first entry,
<U0000> /x00 NULL (NUL)
with
<U00B7> /x00 MIDDLE DOT
   If it doesn't work, we might need to bring Kenichi Handa on board of
   the discussion.
If Kenichi Handa can help, that would be very nice -- it isn't a very
important one but it would be useful for me to get this working.
   > As I understood it, the charmap/charset is a mapping from UCS-4
   > Unicode to whatever is on the target?
   Not UCS-4, but Unicode codepoints (which is the same thing in
   practice, but just so we get our terminology right.)
Are you sure? According to the glibc manual (and a quick glance at the
source, glibc/locale/program/charmap.c), the Unicode entry is supposed
to be a UCS-4 name.
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: new coding system (was: Re: prettify symbols question)
2020-11-23 20:40 ` Alfred M. Szmidt
@ 2020-11-23 20:49 ` Eli Zaretskii
2020-11-28 17:27 ` Alfred M. Szmidt
0 siblings, 1 reply; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-23 20:49 UTC (permalink / raw)
To: Alfred M. Szmidt, Kenichi Handa; +Cc: emacs-devel
> From: "Alfred M. Szmidt" <ams@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Mon, 23 Nov 2020 15:40:24 -0500
>
> If it doesn't work, we might need to bring Kenichi Handa on board of
> the discussion.
>
> If Kenichi Handa can help, that would be very nice -- it isn't a very
> important one but it would be useful for me to get this working.
I've CC'ed him, let's hope he responds soon.
> > As I understood it, the charmap/charset is a mapping from UCS-4
> > Unicode to whatever is on the target?
>
> Not UCS-4, but Unicode codepoints (which is the same thing in
> practice, but just so we get our terminology right.)
>
> Are you sure? According to the glibc manual (and a quick glance at the
> source, glibc/locale/program/charmap.c), the Unicode entry is supposed
> to be a UCS-4 name.
There's no difference between them. UCS-4 comes from ISO, the Unicode
codepoints from the Unicode Consortium, but the values are identical.
I prefer not to use UCS-4, because it's confusing nowadays. The Emacs
manuals use the "Unicode codepoint" terminology.
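The equivalence can be checked numerically. The following Python sketch (the helper name ucs4_value is made up for illustration) reads back the 32-bit integer stored in the UCS-4/UTF-32 form of a character and compares it with the character's Unicode codepoint.

```python
import struct


def ucs4_value(ch):
    """Return the 32-bit integer stored in the big-endian UCS-4/UTF-32 form of ch."""
    return struct.unpack(">I", ch.encode("utf-32-be"))[0]


# MIDDLE DOT and DOWNWARDS ARROW, the two characters named in this thread.
for ch in ("\u00b7", "\u2193"):
    print(f"U+{ord(ch):04X}", ch.encode("utf-32-be").hex(), ucs4_value(ch) == ord(ch))
```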
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: new coding system (was: Re: prettify symbols question)
2020-11-23 20:49 ` Eli Zaretskii
@ 2020-11-28 17:27 ` Alfred M. Szmidt
0 siblings, 0 replies; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-28 17:27 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: handa, emacs-devel
> If Kenichi Handa can help, that would be very nice -- it isn't a very
> important one but it would be useful for me to get this working.
   I've CC'ed him, let's hope he responds soon.
Thank you.
   > Are you sure? According to the glibc manual (and a quick glance at the
   > source, glibc/locale/program/charmap.c), the Unicode entry is supposed
   > to be a UCS-4 name.
   There's no difference between them. UCS-4 comes from ISO, the
   Unicode codepoints from the Unicode Consortium, but the values are
   identical. I prefer not to use UCS-4, because it's confusing
   nowadays. The Emacs manuals use the "Unicode codepoint"
   terminology.
That makes sense; double thanks!
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: new coding system (was: Re: prettify symbols question)
2020-11-13 14:59 ` Eli Zaretskii
2020-11-13 17:11 ` Alfred M. Szmidt
@ 2020-11-13 17:11 ` Alfred M. Szmidt
1 sibling, 0 replies; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-13 17:11 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
Seems that it doesn't like it when you have:
<U00B7> /x00 MIDDLE DOT
but
<U0000> /x00 MIDDLE DOT
works. According to the glibc manual, this should accept a UCS-4
value which U00B7 is. Not sure what gives...
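For checking what such lines are meant to express, independent of any particular tool, here is a minimal Python sketch that reads entries of the form shown above into a byte-to-codepoint table. The function name parse_charmap and the second sample line are assumptions for illustration (the second line follows the "#o1 to DOWNWARDS ARROW" description elsewhere in this thread); this is not how Emacs or glibc read the file.

```python
import re

# One charmap entry: <Uxxxx> /xNN DESCRIPTION
CHARMAP_LINE = re.compile(r"^<U([0-9A-Fa-f]{4,8})>\s+/x([0-9A-Fa-f]{2})\s+(.*)$")


def parse_charmap(lines):
    """Build a {byte: codepoint} table from glibc-style charmap entries."""
    table = {}
    for line in lines:
        match = CHARMAP_LINE.match(line.strip())
        if match:  # skip comments, headers and blank lines
            table[int(match.group(2), 16)] = int(match.group(1), 16)
    return table


sample = [
    "<U00B7> /x00 MIDDLE DOT",
    "<U2193> /x01 DOWNWARDS ARROW",  # hypothetical entry, not from the posted map
]
table = parse_charmap(sample)
print({hex(byte): f"U+{cp:04X}" for byte, cp in table.items()})
```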
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: new coding system
2020-11-13 14:47 ` new coding system (was: Re: prettify symbols question) Alfred M. Szmidt
2020-11-13 14:59 ` Eli Zaretskii
@ 2020-11-13 17:32 ` Andreas Schwab
2020-11-13 17:36 ` Alfred M. Szmidt
1 sibling, 1 reply; 29+ messages in thread
From: Andreas Schwab @ 2020-11-13 17:32 UTC (permalink / raw)
To: Alfred M. Szmidt; +Cc: Eli Zaretskii, emacs-devel
On Nov 13 2020, Alfred M. Szmidt wrote:
> % 200 Null character
Why isn't that mapped to <U0000>?
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-12 15:38 ` Eli Zaretskii
2020-11-12 16:14 ` Eli Zaretskii
@ 2020-11-13 8:27 ` Alfred M. Szmidt
2020-11-13 8:40 ` Eli Zaretskii
1 sibling, 1 reply; 29+ messages in thread
From: Alfred M. Szmidt @ 2020-11-13 8:27 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
> If the above doesn't work, then maybe it _is_ related to encoding.
> What does the mode line say about 'buffer-file-coding-system when' you
> visit this file?
>
> So when the buffer-file-coding-system is utf-8-unix everything works
> (where also the sequence is not acted on in comments). But when the
> buffer is raw-text-unix, it does not work for #o210, but works for say
> #o10. Some multi-byte thing going on?
   Yes, raw-text means the buffer includes raw bytes, not characters.
   Emacs doesn't do anything useful with raw bytes above 127, and in
   particular doesn't interpret them as characters.
Do you have any ideas on what a good coding system would be for this?
utf-8 is obviously wrong. The char. set is just 8-bit, or should I
write a coding system?
^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: prettify symbols question
2020-11-13 8:27 ` prettify symbols question Alfred M. Szmidt
@ 2020-11-13 8:40 ` Eli Zaretskii
0 siblings, 0 replies; 29+ messages in thread
From: Eli Zaretskii @ 2020-11-13 8:40 UTC (permalink / raw)
To: Alfred M. Szmidt; +Cc: emacs-devel
> From: "Alfred M. Szmidt" <ams@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Fri, 13 Nov 2020 03:27:17 -0500
>
> Yes, raw-text means the buffer includes raw bytes, not characters.
> Emacs doesn't do anything useful with raw bytes above 127, and in
> particular doesn't interpret them as characters.
>
> Do you have any ideas on what a good coding system would be for this?
> utf-8 is obviously wrong. The char. set is just 8-bit, or should I
> write a coding system?
The latter, IMO. More accurately, define a charset, and then defining
a coding-system for it should be almost trivial.
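The reason the second step is nearly mechanical can be seen outside Emacs as well: once the byte-to-codepoint table (the charset) exists, decoding and encoding are just lookups in it. A minimal Python sketch under that assumption follows; make_codec and the two-entry table are hypothetical, with the entries taken from the #o0 and #o1 mappings mentioned in this thread, and nothing here is Emacs API.

```python
def make_codec(byte_to_codepoint):
    """Derive decode/encode functions from a single-byte charset table."""
    codepoint_to_byte = {cp: b for b, cp in byte_to_codepoint.items()}

    def decode(data):
        # Each byte becomes the character the table assigns to it.
        return "".join(chr(byte_to_codepoint[b]) for b in data)

    def encode(text):
        # Characters missing from the table cannot be encoded (KeyError).
        return bytes(codepoint_to_byte[ord(ch)] for ch in text)

    return decode, encode


# Hypothetical partial table: #o0 -> MIDDLE DOT, #o1 -> DOWNWARDS ARROW.
LISPM_PARTIAL = {0o0: 0x00B7, 0o1: 0x2193}
decode, encode = make_codec(LISPM_PARTIAL)

text = decode(b"\x00\x01")
print(text, encode(text))
```

A character absent from the table raises KeyError in encode, which is the sketch's analogue of a "cannot encode these" report.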
^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2020-11-28 17:27 UTC | newest]
Thread overview: 29+ messages
2020-11-11 17:01 prettify symbols question Alfred M. Szmidt
2020-11-12 14:59 ` Eli Zaretskii
2020-11-12 15:17 ` Alfred M. Szmidt
2020-11-12 15:38 ` Eli Zaretskii
2020-11-12 16:14 ` Eli Zaretskii
2020-11-12 20:53 ` Alfred M. Szmidt
2020-11-12 21:12 ` Basil L. Contovounesios
2020-11-12 21:25 ` Drew Adams
2020-11-13 7:44 ` Eli Zaretskii
2020-11-13 7:24 ` Eli Zaretskii
2020-11-13 10:15 ` Alfred M. Szmidt
2020-11-13 11:17 ` Alfred M. Szmidt
2020-11-13 12:22 ` Eli Zaretskii
2020-11-13 13:31 ` Alfred M. Szmidt
2020-11-13 13:47 ` Eli Zaretskii
2020-11-13 14:47 ` new coding system (was: Re: prettify symbols question) Alfred M. Szmidt
2020-11-13 14:59 ` Eli Zaretskii
2020-11-13 17:11 ` Alfred M. Szmidt
2020-11-14 14:24 ` Eli Zaretskii
2020-11-14 15:29 ` Alfred M. Szmidt
2020-11-14 16:19 ` Eli Zaretskii
2020-11-23 20:40 ` Alfred M. Szmidt
2020-11-23 20:49 ` Eli Zaretskii
2020-11-28 17:27 ` Alfred M. Szmidt
2020-11-13 17:11 ` Alfred M. Szmidt
2020-11-13 17:32 ` new coding system Andreas Schwab
2020-11-13 17:36 ` Alfred M. Szmidt
2020-11-13 8:27 ` prettify symbols question Alfred M. Szmidt
2020-11-13 8:40 ` Eli Zaretskii