* bug#10919: emacs-mule/utf-8 difference
@ 2012-03-01 15:39 Tiphaine Turpin
  2012-03-01 15:48 ` Tiphaine Turpin
  2012-03-01 17:54 ` Eli Zaretskii
  0 siblings, 2 replies; 4+ messages in thread
From: Tiphaine Turpin @ 2012-03-01 15:39 UTC
  To: 10919

Hi,

I have a problem regarding coding systems:

I'm using process-send-string to send substrings of a buffer through a 
socket, after setting the process encoding and decoding systems to 
emacs-mule.
I expect the number of bytes written to match the byte-length of the 
substring as obtained by position-bytes, since the specification of 
position-bytes in emacs-devel is to always work with the emacs-mule 
encoding. From emacs-devel:

"The byte sequence of a buffer after decoded is always in emacs-mule (in 
emacs-unicode-2 branch, it's utf-8).  So, changing 
buffer-file-coding-system or any other coding-system-related variables 
doesn't affects position-bytes."

However, this is not the case with 3-byte UTF-8 characters:
position-bytes counts them as 3 bytes, but process-send-string writes 4
bytes.

Setting the process coding systems for the socket to utf-8 solves the
problem, but I don't think this would work with other coding systems,
even if I used buffer-file-coding-system instead, since position-bytes
does not depend on it.

What is the real expected behavior of these functions, and how can I
make this correct?

Regards,

Tiphaine Turpin

* bug#10919: emacs-mule/utf-8 difference
  2012-03-01 15:39 bug#10919: emacs-mule/utf-8 difference Tiphaine Turpin
@ 2012-03-01 15:48 ` Tiphaine Turpin
  2012-03-01 17:45   ` Stefan Monnier
  2012-03-01 17:54 ` Eli Zaretskii
  1 sibling, 1 reply; 4+ messages in thread
From: Tiphaine Turpin @ 2012-03-01 15:48 UTC
  To: 10919

I just found a solution which seems to work: using emacs-internal 
instead of emacs-mule. So it seems to be just a documentation problem 
(or a problem with my reading of it).

Tiphaine

* bug#10919: emacs-mule/utf-8 difference
  2012-03-01 15:48 ` Tiphaine Turpin
@ 2012-03-01 17:45   ` Stefan Monnier
  0 siblings, 0 replies; 4+ messages in thread
From: Stefan Monnier @ 2012-03-01 17:45 UTC
  To: Tiphaine Turpin; +Cc: 10919

> I just found a solution which seems to work: using emacs-internal instead of
> emacs-mule. So it seems to be just a documentation problem (or a problem
> with my reading of it).

emacs-mule was the internal representation in Emacs<23; the internal
representation is now a variant of utf-8.  So position-bytes in Emacs<23
should be consistent with emacs-mule, but in Emacs≥23 it is only
consistent with emacs-internal (or utf-8).


        Stefan
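
For illustration only: if the internal representation is (a variant of)
UTF-8, a byte position for ordinary Unicode text can be modelled by
counting UTF-8 bytes. The Python sketch below is an analogy, not Emacs
code. `utf8_byte_offset` is a hypothetical helper, and it is 0-based
whereas position-bytes counts from 1.

```python
def utf8_byte_offset(text: str, char_index: int) -> int:
    """Return how many UTF-8 bytes the first `char_index` characters occupy.

    Hypothetical helper: a 0-based analogue of a byte position in a
    UTF-8-backed buffer. It does not call into Emacs.
    """
    return len(text[:char_index].encode("utf-8"))


# "a" and "b" take 1 byte each in UTF-8; the euro sign takes 3.
sample = "a\u20acb"
offsets = [utf8_byte_offset(sample, i) for i in range(len(sample) + 1)]
print(offsets)  # [0, 1, 4, 5]
```

The jump from 1 to 4 corresponds to the 3-byte character mentioned in
the original report.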

* bug#10919: emacs-mule/utf-8 difference
  2012-03-01 15:39 bug#10919: emacs-mule/utf-8 difference Tiphaine Turpin
  2012-03-01 15:48 ` Tiphaine Turpin
@ 2012-03-01 17:54 ` Eli Zaretskii
  1 sibling, 0 replies; 4+ messages in thread
From: Eli Zaretskii @ 2012-03-01 17:54 UTC
  To: Tiphaine Turpin; +Cc: 10919-done

> Date: Thu, 01 Mar 2012 16:39:57 +0100
> From: Tiphaine Turpin <tiphaine.turpin@inria.fr>
> 
> From emacs-devel:
> 
> "The byte sequence of a buffer after decoded is always in emacs-mule (in 
> emacs-unicode-2 branch, it's utf-8).

This is very old info.  The emacs-unicode-2 branch was merged with the
mainline when Emacs 23.1 was released.

> So, changing 
> buffer-file-coding-system or any other coding-system-related variables 
> doesn't affects position-bytes."
> 
> However, this is not the case with 3-byte UTF-8 characters:
> position-bytes counts them as 3 bytes, but process-send-string writes 4
> bytes.

process-send-string _encodes_ the string, it does not send the
internal representation of the string in the buffer.  Using
process-send-string is like writing the string to a disk file: Emacs
encodes it before sending or writing.

Therefore, buffer-file-coding-system _does_ affect what is being sent.

I'm closing this non-bug.
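
A small sketch of the distinction drawn above, written in Python as a
stand-in for Emacs: the number of bytes that reach the wire depends on
the encoding applied at send time, not on the byte length of the
in-memory representation. Python has no emacs-mule codec, so the
encodings below are substitutes chosen only because their lengths
differ; `encoded_length` is a hypothetical helper.

```python
def encoded_length(text: str, encoding: str) -> int:
    """Return the number of bytes `text` occupies once encoded."""
    return len(text.encode(encoding))


sample = "\u20ac"  # euro sign

# Stand-in for the internal (UTF-8-like) byte length.
internal_length = encoded_length(sample, "utf-8")

# Stand-ins for different coding systems applied when sending.
wire_lengths = {
    name: encoded_length(sample, name)
    for name in ("utf-8", "utf-16-le", "utf-32-le", "iso-8859-15")
}

print(internal_length)  # 3
print(wire_lengths)
# {'utf-8': 3, 'utf-16-le': 2, 'utf-32-le': 4, 'iso-8859-15': 1}
```

Only when the sending encoding matches the internal one do the two
counts agree, which is consistent with the reporter's observation that
utf-8 or emacs-internal made the lengths match.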
