* Re: Typing raw bytes
2013-01-20 20:10 Typing raw bytes Eli Zaretskii
@ 2013-01-20 20:50 ` Ivan Andrus
2013-01-20 20:55 ` Michael Welsh Duggan
` (5 subsequent siblings)
6 siblings, 0 replies; 23+ messages in thread
From: Ivan Andrus @ 2013-01-20 20:50 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel@gnu.org discussions
On Jan 20, 2013, at 9:10 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> Suppose I want to create a file whose contents is a series of certain
> bytes. How would I go about that?
>
> I tried "M-x hexl-mode RET" in a new buffer, but it evidently doesn't
> let you insert bytes, only edit existing bytes.
I for one would love it if hexl-mode let you insert bytes.
-Ivan
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: Typing raw bytes
2013-01-20 20:10 Typing raw bytes Eli Zaretskii
2013-01-20 20:50 ` Ivan Andrus
@ 2013-01-20 20:55 ` Michael Welsh Duggan
2013-01-20 21:33 ` Eli Zaretskii
2013-01-20 20:59 ` Benjamin Riefenstahl
` (4 subsequent siblings)
6 siblings, 1 reply; 23+ messages in thread
From: Michael Welsh Duggan @ 2013-01-20 20:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
> Suppose I want to create a file whose contents is a series of certain
> bytes. How would I go about that?
>
> I tried "M-x hexl-mode RET" in a new buffer, but it evidently doesn't
> let you insert bytes, only edit existing bytes.
>
> Next I tried "C-x RET f raw-text RET" in a new buffer followed by
> "C-q NNN" etc., but the data written thereafter to disk is more bytes
> than I typed, because, I guess, "C-q NNN" inserts windows-1255
> characters (this is on Windows, where keyboard-coding-system is
> windows-1255-unix), and what winds up in the file is their UTF-8
> encoding.
Could you give an example that fails to do what is expected? I tried
what you referenced above, using "C-x RET f raw-text RET" and C-q insert
bytes. Saving the file worked just fine, resulting in the expected file
length. "M-x find-file-literally" on the resulting file also seemed to
work just fine. Note, however, that I am running Emacs under GNU/Linux.
--
Michael Welsh Duggan
(md5i@md5i.com)
* Re: Typing raw bytes
2013-01-20 20:55 ` Michael Welsh Duggan
@ 2013-01-20 21:33 ` Eli Zaretskii
2013-01-20 21:43 ` Michael Welsh Duggan
0 siblings, 1 reply; 23+ messages in thread
From: Eli Zaretskii @ 2013-01-20 21:33 UTC (permalink / raw)
To: Michael Welsh Duggan; +Cc: emacs-devel
> From: Michael Welsh Duggan <mwd@md5i.com>
> Cc: emacs-devel@gnu.org
> Date: Sun, 20 Jan 2013 15:55:13 -0500
>
> Could you give an example that fails to do what is expected? I tried
> what you referenced above, using "C-x RET f raw-text RET" and C-q insert
> bytes. Saving the file worked just fine, resulting in the expected file
> length. "M-x find-file-literally" on the resulting file also seemed to
> work just fine. Note, however, that I am running Emacs under GNU/Linux.
If that was a GUI session, then on GNU/Linux there's no decoding of
keyboard input. Try in a TTY session instead.
* Re: Typing raw bytes
2013-01-20 21:33 ` Eli Zaretskii
@ 2013-01-20 21:43 ` Michael Welsh Duggan
0 siblings, 0 replies; 23+ messages in thread
From: Michael Welsh Duggan @ 2013-01-20 21:43 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
>> From: Michael Welsh Duggan <mwd@md5i.com>
>> Cc: emacs-devel@gnu.org
>> Date: Sun, 20 Jan 2013 15:55:13 -0500
>>
>> Could you give an example that fails to do what is expected? I tried
>> what you referenced above, using "C-x RET f raw-text RET" and C-q insert
>> bytes. Saving the file worked just fine, resulting in the expected file
>> length. "M-x find-file-literally" on the resulting file also seemed to
>> work just fine. Note, however, that I am running Emacs under GNU/Linux.
>
> If that was a GUI session, then on GNU/Linux there's no decoding of
> keyboard input. Try in a TTY session instead.
Ah, ha! I see. Yes, I would have to call that a bug. Regardless of
whether the extraneous \302 characters I am getting are incorrect (and I
agree with you that they are), the fact that the results are different
in a GUI and a terminal session is definitely a bug.
--
Michael Welsh Duggan
(md5i@md5i.com)
* Re: Typing raw bytes
2013-01-20 20:10 Typing raw bytes Eli Zaretskii
2013-01-20 20:50 ` Ivan Andrus
2013-01-20 20:55 ` Michael Welsh Duggan
@ 2013-01-20 20:59 ` Benjamin Riefenstahl
2013-01-20 21:35 ` Eli Zaretskii
2013-01-20 20:59 ` Andreas Schwab
` (3 subsequent siblings)
6 siblings, 1 reply; 23+ messages in thread
From: Benjamin Riefenstahl @ 2013-01-20 20:59 UTC (permalink / raw)
To: emacs-devel
Hi Eli,
> Suppose I want to create a file whose contents is a series of certain
> bytes. How would I go about that?
I tried M-x set-buffer-coding-system RET iso-8859-1 RET and it seems to
work. I also had to set read-quoted-char-radix to 10 to be able to use
codepoints in the form that I know them ;-).
The trick here is that iso-8859-1 uses the same codepoints as Unicode,
but encodes them directly as bytes.
I think this should work with the no-conversion/binary coding system,
but it doesn't. I do not know if the current behaviour of
no-conversion/binary is useful to somebody.
benny
* Re: Typing raw bytes
2013-01-20 20:59 ` Benjamin Riefenstahl
@ 2013-01-20 21:35 ` Eli Zaretskii
2013-01-21 14:47 ` Benjamin Riefenstahl
0 siblings, 1 reply; 23+ messages in thread
From: Eli Zaretskii @ 2013-01-20 21:35 UTC (permalink / raw)
To: Benjamin Riefenstahl; +Cc: emacs-devel
> From: Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net>
> Date: Sun, 20 Jan 2013 21:59:44 +0100
>
> > Suppose I want to create a file whose contents is a series of certain
> > bytes. How would I go about that?
>
> I tried M-x set-buffer-coding-system RET iso-8859-1 RET and it seems to
> work.
iso-8859-1 doesn't cover the entire range of 8-bit bytes from 0x80 to
0xff, so not every byte can be typed with this trick.
* Re: Typing raw bytes
2013-01-20 21:35 ` Eli Zaretskii
@ 2013-01-21 14:47 ` Benjamin Riefenstahl
2013-01-21 17:26 ` Eli Zaretskii
0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Riefenstahl @ 2013-01-21 14:47 UTC (permalink / raw)
To: emacs-devel
> iso-8859-1 doesn't cover the entire range of 8-bit bytes from 0x80 to
> 0xff, so not every byte can be typed with this trick.
I just tried 0x80, 0x90, 0x85, 0xff and 0x7f which all worked. Last
time I had also tried 0x00, which also was ok. What is missing? AFAIK,
while the range 0x80-0x9F is reserved for non-printable control
characters, those characters are not actually invalid.
benny
* Re: Typing raw bytes
2013-01-21 14:47 ` Benjamin Riefenstahl
@ 2013-01-21 17:26 ` Eli Zaretskii
2013-01-21 19:38 ` Benjamin Riefenstahl
0 siblings, 1 reply; 23+ messages in thread
From: Eli Zaretskii @ 2013-01-21 17:26 UTC (permalink / raw)
To: Benjamin Riefenstahl; +Cc: emacs-devel
> From: Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net>
> Date: Mon, 21 Jan 2013 15:47:33 +0100
>
> > iso-8859-1 doesn't cover the entire range of 8-bit bytes from 0x80 to
> > 0xff, so not every byte can be typed with this trick.
>
> I just tried 0x80, 0x90, 0x85, 0xff and 0x7f which all worked.
What do you mean by "worked"? Did you try to save them?
Inserting raw bytes into a multibyte buffer is playing with fire.
* Re: Typing raw bytes
2013-01-21 17:26 ` Eli Zaretskii
@ 2013-01-21 19:38 ` Benjamin Riefenstahl
2013-01-21 20:02 ` Eli Zaretskii
0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Riefenstahl @ 2013-01-21 19:38 UTC (permalink / raw)
To: emacs-devel
>> I just tried 0x80, 0x90, 0x85, 0xff and 0x7f which all worked.
>
> What do you mean by "worked"? Did you try to save them?
I set the coding system to iso-8859-1, set read-quoted-char-radix to 16,
inserted the characters with C-q 80 RET etc., saved the file and the
result was this:
$ hexdump -C test.bin
00000000 80 90 85 ff 7f |.....|
00000005
$
> Inserting raw bytes into a multibyte buffer is playing with fire.
I inserted ISO8859-1 characters with the knowledge that ISO8859-1 is a
one-to-one mapping from Unicode codepoints to bytes. This is
effectively how the Unicode codepoints U+0000 to U+00FF are defined, so
I expect that that coding system should work that way.
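The one-to-one mapping described above can be checked outside Emacs. The following is a minimal Python sketch, not anything Emacs itself runs: it assumes Python's `iso-8859-1` codec as a stand-in for the Emacs coding system of the same name, and the variable names are illustrative only.

```python
# Sketch: ISO-8859-1 maps code points U+0000..U+00FF to single bytes 1:1.
# Python's codec is used as a stand-in for the Emacs coding system (assumption).

# The values tried in this message, in the order shown in the hexdump.
codepoints = [0x80, 0x90, 0x85, 0xFF, 0x7F]

# Build a string of those code points, then encode it as ISO-8859-1.
text = "".join(chr(cp) for cp in codepoints)
data = text.encode("iso-8859-1")
print(data.hex(" "))

# Every possible byte value survives a decode/encode round trip.
all_bytes = bytes(range(256))
round_trip = all_bytes.decode("iso-8859-1").encode("iso-8859-1")
print(round_trip == all_bytes)
```

Each code point becomes exactly one byte with the same numeric value, which matches the five bytes in the hexdump.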
For the record, for testing I am using a GTK-based Emacs 24.2.92 on X11
(GNU/Linux), running "emacs -Q", my locale is UTF-8, even
keyboard-coding-system is utf-8-unix. I tried the same with "emacs -Q
-nw" and this also works the same.
benny
* Re: Typing raw bytes
2013-01-21 19:38 ` Benjamin Riefenstahl
@ 2013-01-21 20:02 ` Eli Zaretskii
0 siblings, 0 replies; 23+ messages in thread
From: Eli Zaretskii @ 2013-01-21 20:02 UTC (permalink / raw)
To: Benjamin Riefenstahl; +Cc: emacs-devel
> From: Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net>
> Date: Mon, 21 Jan 2013 20:38:01 +0100
>
> I inserted ISO8859-1 characters with the knowledge that ISO8859-1 is a
> one-to-one mapping from Unicode codepoints to bytes. This is
> effectively how the Unicode codepoints U+0000 to U+00FF are defined, so
> I expect that that coding system should work that way.
But Emacs doesn't represent characters as Unicode code points
internally. It uses a variant of UTF-8, where all codes above U+007F
are represented by more than one byte.
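To illustrate the size difference, here is a small Python sketch that uses standard UTF-8 as a stand-in for Emacs's internal variant. That substitution is an assumption: the internal format differs in some respects (raw bytes, for example), but the behaviour for U+0080 to U+00FF shown here is ordinary UTF-8. The helper name `utf8_len` is hypothetical.

```python
# Sketch: in UTF-8, code points above U+007F need more than one byte.
# Standard UTF-8 is used here as an approximation of Emacs's internal
# representation (assumption; the two are not identical).

def utf8_len(codepoint: int) -> int:
    """Return the number of bytes UTF-8 uses for one code point."""
    return len(chr(codepoint).encode("utf-8"))

for cp in (0x41, 0x7F, 0x80, 0xA0, 0xFF):
    encoded = chr(cp).encode("utf-8")
    print(f"U+{cp:04X} -> {encoded.hex(' ')} ({len(encoded)} byte(s))")

# The lead byte for U+0080..U+00BF is 0xC2, which is \302 in octal.
lead_byte = chr(0x80).encode("utf-8")[0]
print(oct(lead_byte))
```

The 0xC2 lead byte is consistent with the extraneous \302 bytes reported elsewhere in this thread when 8-bit codes end up saved as UTF-8.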
* Re: Typing raw bytes
2013-01-20 20:10 Typing raw bytes Eli Zaretskii
` (2 preceding siblings ...)
2013-01-20 20:59 ` Benjamin Riefenstahl
@ 2013-01-20 20:59 ` Andreas Schwab
2013-01-20 21:31 ` Eli Zaretskii
2013-01-20 23:22 ` Kenichi Handa
` (2 subsequent siblings)
6 siblings, 1 reply; 23+ messages in thread
From: Andreas Schwab @ 2013-01-20 20:59 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
Eli Zaretskii <eliz@gnu.org> writes:
> Next I tried "C-x RET f raw-text RET" in a new buffer followed by
> "C-q NNN" etc., but the data written thereafter to disk is more bytes
> than I typed, because, I guess, "C-q NNN" inserts windows-1255
> characters (this is on Windows, where keyboard-coding-system is
> windows-1255-unix), and what winds up in the file is their UTF-8
> encoding.
FWIW, raw bytes are in the range #x3fff80 - #x3fffff.
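As a rough illustration of that range, the Python sketch below maps the 128 byte values 0x80 to 0xFF onto #x3fff80 to #x3fffff with a fixed offset. The offset is inferred from the range endpoints quoted above, and the function names are hypothetical, not Emacs API.

```python
# Sketch: map raw bytes 0x80..0xFF to the character-code range
# #x3fff80..#x3fffff. The linear offset is inferred from the range
# endpoints; the function names are illustrative, not Emacs functions.

RAW_BYTE_OFFSET = 0x3FFF00

def raw_byte_to_char_code(byte: int) -> int:
    """Return the character code that stands for a raw 8-bit byte."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("only bytes 0x80..0xFF have a raw-byte code")
    return byte + RAW_BYTE_OFFSET

def char_code_to_raw_byte(code: int) -> int:
    """Inverse of raw_byte_to_char_code."""
    if not 0x3FFF80 <= code <= 0x3FFFFF:
        raise ValueError("not in the raw-byte range")
    return code - RAW_BYTE_OFFSET

print(hex(raw_byte_to_char_code(0x80)))
print(hex(raw_byte_to_char_code(0xFF)))
```

Because these codes lie far above the Latin-1 range, a raw byte 0xFF and the character U+00FF are distinct in a multibyte buffer.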
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Re: Typing raw bytes
2013-01-20 20:10 Typing raw bytes Eli Zaretskii
` (3 preceding siblings ...)
2013-01-20 20:59 ` Andreas Schwab
@ 2013-01-20 23:22 ` Kenichi Handa
2013-01-21 3:50 ` Eli Zaretskii
2013-01-21 1:40 ` Stefan Monnier
2013-01-21 14:19 ` Richard Stallman
6 siblings, 1 reply; 23+ messages in thread
From: Kenichi Handa @ 2013-01-20 23:22 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
In article <83hambpqds.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
> Suppose I want to create a file whose contents is a series of certain
> bytes. How would I go about that?
> I tried "M-x hexl-mode RET" in a new buffer, but it evidently doesn't
> let you insert bytes, only edit existing bytes.
> Next I tried "C-x RET f raw-text RET" in a new buffer followed by
> "C-q NNN" etc., but the data written thereafter to disk is more bytes
> than I typed, because, I guess, "C-q NNN" inserts windows-1255
> characters (this is on Windows, where keyboard-coding-system is
> windows-1255-unix), and what winds up in the file is their UTF-8
> encoding.
Please use C-q in a unibyte buffer. For instance,
M-x find-file-literally RET _FILE_NAME_ RET
C-q 3 7 7 RET
C-x C-s
or
C-x b _NEW_BUFFER_NAME_ RET
M-x toggle-enable-multibyte-characters RET
C-q 3 7 7 RET
C-x C-s
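The digits typed after C-q in the recipes above are octal by default, so "3 7 7" denotes the byte 0xFF. The Python sketch below models only that digit interpretation under the radix values mentioned in this thread (8, 10 and 16, selected via read-quoted-char-radix). The function name `quoted_char_code` is hypothetical and nothing here models Emacs's actual input handling.

```python
# Sketch: how the digits typed after C-q map to a byte value under
# different radixes. Models only the arithmetic, not Emacs's input code.

def quoted_char_code(digits: str, radix: int = 8) -> int:
    """Interpret the typed digits in the given radix (8, 10 or 16)."""
    if radix not in (8, 10, 16):
        raise ValueError("radix must be 8, 10 or 16")
    return int(digits, radix)

# Default octal input, as in "C-q 3 7 7 RET".
print(hex(quoted_char_code("377")))

# The same byte typed with the radix set to 10 or 16.
print(hex(quoted_char_code("255", 10)))
print(hex(quoted_char_code("ff", 16)))
```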
---
Kenichi Handa
handa@gnu.org
* Re: Typing raw bytes
2013-01-20 23:22 ` Kenichi Handa
@ 2013-01-21 3:50 ` Eli Zaretskii
2013-01-21 17:39 ` Eli Zaretskii
0 siblings, 1 reply; 23+ messages in thread
From: Eli Zaretskii @ 2013-01-21 3:50 UTC (permalink / raw)
To: Kenichi Handa; +Cc: emacs-devel
> From: Kenichi Handa <handa@gnu.org>
> Cc: emacs-devel@gnu.org
> Date: Mon, 21 Jan 2013 08:22:17 +0900
>
> In article <83hambpqds.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
>
> > Suppose I want to create a file whose contents is a series of certain
> > bytes. How would I go about that?
>
> > I tried "M-x hexl-mode RET" in a new buffer, but it evidently doesn't
> > let you insert bytes, only edit existing bytes.
>
> > Next I tried "C-x RET f raw-text RET" in a new buffer followed by
> > "C-q NNN" etc., but the data written thereafter to disk is more bytes
> > than I typed, because, I guess, "C-q NNN" inserts windows-1255
> > characters (this is on Windows, where keyboard-coding-system is
> > windows-1255-unix), and what winds up in the file is their UTF-8
> > encoding.
>
> Please use C-q in a unibyte buffer.
But "C-x RET f raw-text RET" does just that. And it still didn't work
for me, at least not on Windows. Does it work for you in a TTY
session on Unix?
* Re: Typing raw bytes
2013-01-21 3:50 ` Eli Zaretskii
@ 2013-01-21 17:39 ` Eli Zaretskii
0 siblings, 0 replies; 23+ messages in thread
From: Eli Zaretskii @ 2013-01-21 17:39 UTC (permalink / raw)
To: handa; +Cc: emacs-devel
> Date: Mon, 21 Jan 2013 05:50:31 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
>
> > From: Kenichi Handa <handa@gnu.org>
> > Cc: emacs-devel@gnu.org
> > Date: Mon, 21 Jan 2013 08:22:17 +0900
> >
> > In article <83hambpqds.fsf@gnu.org>, Eli Zaretskii <eliz@gnu.org> writes:
> >
> > > Suppose I want to create a file whose contents is a series of certain
> > > bytes. How would I go about that?
> >
> > > I tried "M-x hexl-mode RET" in a new buffer, but it evidently doesn't
> > > let you insert bytes, only edit existing bytes.
> >
> > > Next I tried "C-x RET f raw-text RET" in a new buffer followed by
> > > "C-q NNN" etc., but the data written thereafter to disk is more bytes
> > > than I typed, because, I guess, "C-q NNN" inserts windows-1255
> > > characters (this is on Windows, where keyboard-coding-system is
> > > windows-1255-unix), and what winds up in the file is their UTF-8
> > > encoding.
> >
> > Please use C-q in a unibyte buffer.
>
> But "C-x RET f raw-text RET" does just that.
Ignore me: raw-text doesn't make the buffer unibyte.
find-file-literally is the way. Thanks.
* Re: Typing raw bytes
2013-01-20 20:10 Typing raw bytes Eli Zaretskii
` (4 preceding siblings ...)
2013-01-20 23:22 ` Kenichi Handa
@ 2013-01-21 1:40 ` Stefan Monnier
2013-01-21 3:51 ` Eli Zaretskii
2013-01-21 14:19 ` Richard Stallman
6 siblings, 1 reply; 23+ messages in thread
From: Stefan Monnier @ 2013-01-21 1:40 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
> I tried "M-x hexl-mode RET" in a new buffer, but it evidently doesn't
> let you insert bytes, only edit existing bytes.
nhexl-mode should let you do that.
> Next I tried "C-x RET f raw-text RET" in a new buffer followed by
> "C-q NNN" etc., but the data written thereafter to disk is more bytes
> than I typed, because, I guess, "C-q NNN" inserts windows-1255
> characters (this is on Windows, where keyboard-coding-system is
> windows-1255-unix), and what winds up in the file is their UTF-8
> encoding.
Subsequent messages explain it seems to be due to the C-q NNN char being
passed through keyboard-coding-system: that seems to be a bug.
Another problem might be that C-x RET f raw-text RET should try to put
the buffer in unibyte mode.
Stefan
* Re: Typing raw bytes
2013-01-21 1:40 ` Stefan Monnier
@ 2013-01-21 3:51 ` Eli Zaretskii
2013-01-21 15:38 ` Stefan Monnier
0 siblings, 1 reply; 23+ messages in thread
From: Eli Zaretskii @ 2013-01-21 3:51 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: emacs-devel@gnu.org
> Date: Sun, 20 Jan 2013 20:40:57 -0500
>
> Subsequent messages explain it seems to be due to the C-q NNN char being
> passed through keyboard-coding-system: that seems to be a bug.
How is that a bug? Keyboard input decoding is not the issue here; how
the result is inserted into a buffer is, IMO.
> Another problem might be that C-x RET f raw-text RET should try to put
> the buffer in unibyte mode.
It did.
* Re: Typing raw bytes
2013-01-21 3:51 ` Eli Zaretskii
@ 2013-01-21 15:38 ` Stefan Monnier
2013-01-21 17:38 ` Eli Zaretskii
0 siblings, 1 reply; 23+ messages in thread
From: Stefan Monnier @ 2013-01-21 15:38 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
>> Subsequent messages explain it seems to be due to the C-q NNN char being
>> passed through keyboard-coding-system: that seems to be a bug.
> How is that a bug?
AFAIK C-q NNN should insert the char whose internal character code is
NNN (in Emacs's own encoding, not in the keyboard's), so the
keyboard-coding-system should not have any influence.
> Keyboard input decoding is not the issue here; how
> the result is inserted into a buffer is, IMO.
In a unibyte buffer, a character code NNN (where NNN is less than 256)
should be inserted as that byte without any loss.
Stefan
* Re: Typing raw bytes
2013-01-21 15:38 ` Stefan Monnier
@ 2013-01-21 17:38 ` Eli Zaretskii
2013-01-22 0:59 ` Stefan Monnier
0 siblings, 1 reply; 23+ messages in thread
From: Eli Zaretskii @ 2013-01-21 17:38 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: emacs-devel@gnu.org
> Date: Mon, 21 Jan 2013 10:38:20 -0500
>
> >> Subsequent messages explain it seems to be due to the C-q NNN char being
> >> passed through keyboard-coding-system: that seems to be a bug.
> > How is that a bug?
>
> AFAIK C-q NNN should insert the char whose internal character code is
> NNN (in Emacs's own encoding, not in the keyboard's), so the
> keyboard-coding-system should not have any influence.
It doesn't, indeed. I was wrong. The problem was the insertion of
8-bit codes into a multibyte buffer, which silently converts them to
multibyte characters as appropriate to the current locale.
> > Keyboard input decoding is not the issue here; how
> > the result is inserted into a buffer is, IMO.
>
> In a unibyte buffer, a character code NNN (where NNN is less than 256)
> should be inserted as that byte without any loss.
I was wrong about the buffer being unibyte: raw-text doesn't do that.
(I think it used to do so in some old version of Emacs, and the memory
stuck.)
* Re: Typing raw bytes
2013-01-20 20:10 Typing raw bytes Eli Zaretskii
` (5 preceding siblings ...)
2013-01-21 1:40 ` Stefan Monnier
@ 2013-01-21 14:19 ` Richard Stallman
6 siblings, 0 replies; 23+ messages in thread
From: Richard Stallman @ 2013-01-21 14:19 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
I tried "M-x hexl-mode RET" in a new buffer, but it evidently doesn't
let you insert bytes, only edit existing bytes.
Insertion and deletion would be a good feature to add.
Next I tried "C-x RET f raw-text RET" in a new buffer followed by
"C-q NNN" etc., but the data written thereafter to disk is more bytes
than I typed, because, I guess, "C-q NNN" inserts windows-1255
characters (this is on Windows, where keyboard-coding-system is
windows-1255-unix), and what winds up in the file is their UTF-8
encoding.
Perhaps find-file-literally should set keyboard-coding-system so that
this does the right thing.
--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call