unofficial mirror of emacs-devel@gnu.org 
* Re: copying to clipboard silently discards lines
       [not found] ` <1515.220.255.172.231.1109259901.squirrel@220.255.172.231>
@ 2005-02-25 15:09   ` Chong Yidong
  2005-02-25 17:24     ` Benjamin Riefenstahl
  0 siblings, 1 reply; 6+ messages in thread
From: Chong Yidong @ 2005-02-25 15:09 UTC


I sent an email to emacs-pretest-bug earlier about NTEmacs silently
discarding lines copied onto the clipboard. I've identified the problem,
but I don't know enough about coding systems to come up with a fix. Maybe
someone can help me out here.

The bug is in the Lisp_Object QUNICODE, which specifies the "utf-16le-dos"
coding system used by `selection-coding-system'. When a coding_system is
extracted from QUNICODE, the parameter
coding->spec.ccl.encoder.buf_magnification is 1. This causes a call to
encoding_buffer_size (in convert_to_handle_as_coded, w32select.c:247) to
return a buffer size that is too small for the encoded string.

I'm guessing the magnification has to be set to 2, because `abc...' gets
encoded to `a\0b\0c\0...' (encoding_buffer_size multiplies this again by
2, because of the CRLF issue.)
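
To make the doubling concrete, here is a minimal C sketch. It is not Emacs
code, and the name encode_ascii_utf16le is made up for illustration. It
assumes ASCII-only input and shows that each input byte becomes two output
bytes, so an output buffer sized with a factor of 1 is too small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Emacs.  Encode the ASCII-only text
   SRC (SRC_LEN bytes) as UTF-16LE into DST, which holds DST_CAP bytes.
   Store the number of bytes written in *OUT_LEN.
   Return 1 on success, 0 if DST is too small.  */
int
encode_ascii_utf16le (const char *src, size_t src_len,
                      unsigned char *dst, size_t dst_cap,
                      size_t *out_len)
{
  size_t i;

  /* Every ASCII byte needs two bytes in UTF-16.  */
  if (dst_cap < 2 * src_len)
    return 0;

  for (i = 0; i < src_len; i++)
    {
      assert ((unsigned char) src[i] < 0x80);   /* ASCII only */
      dst[2 * i] = (unsigned char) src[i];      /* low byte first */
      dst[2 * i + 1] = 0;                       /* high byte is zero */
    }

  *out_len = 2 * src_len;
  return 1;
}
```

Encoding "abc" with this helper needs a 6-byte buffer; with a 3-byte buffer
(a factor of 1) the call reports failure instead of truncating.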

This bug did not exist prior to the unicode support patch, because Emacs
used the iso2022 coding system for selections.

Trouble is, I can't find where the `spec.ccl...', or indeed any of the
other parameters of QUNICODE are initialized. The only place I can see
that sets anything in QUNICODE is w32select.c:1074, which is just

  QUNICODE = intern ("utf-16le-dos");

Could someone with knowledge of coding systems and/or w32select.c help?


* Re: copying to clipboard silently discards lines
  2005-02-25 15:09   ` copying to clipboard silently discards lines Chong Yidong
@ 2005-02-25 17:24     ` Benjamin Riefenstahl
  2005-02-26  2:53       ` Chong Yidong
  0 siblings, 1 reply; 6+ messages in thread
From: Benjamin Riefenstahl @ 2005-02-25 17:24 UTC
  Cc: emacs-devel

Hi Chong Yidong,


"Chong Yidong" writes:
> The bug is in the Lisp_Object QUNICODE, which specifies the
> "utf-16le-dos" coding system used by `selection-coding-system'.

QUNICODE is just a symbol, a unique string.  It is used for looking up
the actual coding-system.  setup_coding_system() in src/coding.c does
that lookup and fills in the struct coding_system.

> When a coding_system is extracted from QUNICODE, the parameter
> coding->spec.ccl.encoder.buf_magnification is 1. This causes a call to
> encoding_buffer_size (in convert_to_handle_as_coded, w32select.c:247) to
> return a buffer size that is too small for the encoded string.
>
> I'm guessing the magnification has to be set to 2, because `abc...'
> gets encoded to `a\0b\0c\0...' (encoding_buffer_size multiplies this
> again by 2, because of the CRLF issue.)

That would indicate that the coding-system "utf-16le-dos" itself has a
bug.  The source code description comment for encoding_buffer_size()
says

 Return maximum size (bytes) of a buffer enough for encoding SRC_BYTES
 of text to CODING.

which matches its usage in w32select.c, I think.
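
As a rough illustration of the arithmetic, here is a C sketch of the size
calculation as it is described in this thread: source bytes times the
magnification, doubled for CRLF, plus a fixed extra room (256 is the value
quoted elsewhere in the thread). The function names are hypothetical and
this is not the code from src/coding.c; the input is assumed to be
ASCII-only.

```c
#include <assert.h>
#include <stddef.h>

/* Extra room added to every buffer; 256 is the value quoted in the
   thread for CONVERSION_BUFFER_EXTRA_ROOM.  */
#define SKETCH_EXTRA_ROOM 256

/* Hypothetical stand-in for encoding_buffer_size (), following the
   description in the thread rather than the real source.  */
size_t
sketch_encoding_buffer_size (size_t src_bytes, int magnification,
                             int eol_is_crlf)
{
  size_t factor;

  assert (magnification > 0);
  factor = (size_t) magnification;
  if (eol_is_crlf)
    factor *= 2;                /* the doubling "because of CRLF" */
  return src_bytes * factor + SKETCH_EXTRA_ROOM;
}

/* Bytes actually needed to encode ASCII-only SRC as UTF-16 with DOS
   line endings: 2 bytes per character, and every newline becomes the
   two characters CR LF.  */
size_t
sketch_utf16_dos_bytes (const char *src, size_t src_bytes)
{
  size_t newlines = 0, i;

  for (i = 0; i < src_bytes; i++)
    if (src[i] == '\n')
      newlines++;
  return 2 * (src_bytes + newlines);
}
```

Under these assumptions a magnification of 1 is only sufficient while the
text contains at most 128 newlines, because the 256 extra bytes absorb the
shortfall of 2 bytes per newline. Longer multi-line text overflows, which
would be consistent with lines being dropped. A magnification of 2 covers
even the worst case of text consisting only of newlines.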

> Trouble is, I can't find where the `spec.ccl...', or indeed any of
> the other parameters of QUNICODE are initialized.

Grep is our friend ;-).  The coding-system (and its CCL programs) is
created in lisp/international/ccl.el using (define-ccl-program ...).
The parameter BUFFER_MAGNIFICATION for that function is set as 1 there
for the encoders (ccl-encode-mule-utf-16le, ccl-encode-mule-utf-16be,
ccl-encode-mule-utf-16le-with-signature,
ccl-encode-mule-utf-16be-with-signature), so that might well be the
problem.

Can you try and replace that 1 with 2 and see if your problem goes
away?


benny


* Re: copying to clipboard silently discards lines
  2005-02-25 17:24     ` Benjamin Riefenstahl
@ 2005-02-26  2:53       ` Chong Yidong
  2005-02-28  7:20         ` Kenichi Handa
  0 siblings, 1 reply; 6+ messages in thread
From: Chong Yidong @ 2005-02-26  2:53 UTC
  Cc: Benjamin Riefenstahl

"Benjamin Riefenstahl" <b.riefenstahl@turtle-trading.net> wrote:
> Grep is our friend ;-).  The coding-system (and its CCL programs) is
> created in lisp/international/ccl.el using (define-ccl-program ...).
> The parameter BUFFER_MAGNIFICATION for that function is set as 1 there
> for the encoders (ccl-encode-mule-utf-16le, ccl-encode-mule-utf-16be,
> ccl-encode-mule-utf-16le-with-signature,
> ccl-encode-mule-utf-16be-with-signature), so that might well be the
> problem.
>
> Can you try and replace that 1 with 2 and see if your problem goes
> away?

Yes, that did the trick. Can someone verify that this is the correct fix?

Thanks, Ben.

*** emacs/lisp/international/utf-16.el~	Sat Feb 26 10:46:46 2005
--- emacs/lisp/international/utf-16.el	Sat Feb 26 10:47:03 2005
***************
*** 391,397 ****


  (define-ccl-program ccl-encode-mule-utf-16le
!   `(1
      ,utf-16le-encode-loop)
    "Encode to UTF-16LE (little endian without signature).
  Characters from the charsets ascii, eight-bit-control,
--- 391,397 ----


  (define-ccl-program ccl-encode-mule-utf-16le
!   `(2
      ,utf-16le-encode-loop)
    "Encode to UTF-16LE (little endian without signature).
  Characters from the charsets ascii, eight-bit-control,
***************
*** 401,407 ****
  Others are encoded as U+FFFD.")

  (define-ccl-program ccl-encode-mule-utf-16be
!   `(1
      ,utf-16be-encode-loop)
    "Encode to UTF-16BE (big endian without signature).
  Characters from the charsets ascii, eight-bit-control,
--- 401,407 ----
  Others are encoded as U+FFFD.")

  (define-ccl-program ccl-encode-mule-utf-16be
!   `(2
      ,utf-16be-encode-loop)
    "Encode to UTF-16BE (big endian without signature).
  Characters from the charsets ascii, eight-bit-control,
***************
*** 411,417 ****
  Others are encoded as U+FFFD.")

  (define-ccl-program ccl-encode-mule-utf-16le-with-signature
!   `(1
      ((write #xFF)
       (write #xFE)
       ,@utf-16le-encode-loop))
--- 411,417 ----
  Others are encoded as U+FFFD.")

  (define-ccl-program ccl-encode-mule-utf-16le-with-signature
!   `(2
      ((write #xFF)
       (write #xFE)
       ,@utf-16le-encode-loop))
***************
*** 423,429 ****
  Others are encoded as U+FFFD.")

  (define-ccl-program ccl-encode-mule-utf-16be-with-signature
!   `(1
      ((write #xFE)
       (write #xFF)
       ,@utf-16be-encode-loop))
--- 423,429 ----
  Others are encoded as U+FFFD.")

  (define-ccl-program ccl-encode-mule-utf-16be-with-signature
!   `(2
      ((write #xFE)
       (write #xFF)
       ,@utf-16be-encode-loop))


* Re: copying to clipboard silently discards lines
  2005-02-26  2:53       ` Chong Yidong
@ 2005-02-28  7:20         ` Kenichi Handa
  2005-02-28 19:13           ` Benjamin Riefenstahl
  0 siblings, 1 reply; 6+ messages in thread
From: Kenichi Handa @ 2005-02-28  7:20 UTC
  Cc: b.riefenstahl, emacs-devel

In article <1401.220.255.172.231.1109386419.squirrel@www.stupidchicken.com>, "Chong Yidong" <cyd@stupidchicken.com> writes:

> "Benjamin Riefenstahl" <b.riefenstahl@turtle-trading.net> wrote:
>>  Grep is our friend ;-).  The coding-system (and its CCL programs) is
>>  created in lisp/international/ccl.el using (define-ccl-program ...).
>>  The parameter BUFFER_MAGNIFICATION for that function is set as 1 there
>>  for the encoders (ccl-encode-mule-utf-16le, ccl-encode-mule-utf-16be,
>>  ccl-encode-mule-utf-16le-with-signature,
>>  ccl-encode-mule-utf-16be-with-signature), so that might well be the
>>  problem.
>> 
>>  Can you try and replace that 1 with 2 and see if your problem goes
>>  away?

> Yes, that did the trick. Can someone verify that this is the correct fix?

Thank you for finding this bug.  The first two fixes are
correct, but we must set 4 for XXX-with-signature.  I've
just installed these fixes.


2005-02-28  Chong Yidong  <cyd@stupidchicken.com>  (tiny change)

	* international/utf-16.el (ccl-encode-mule-utf-16le): Fix
	BUFFER_MAGNIFICATION to 2.
	(ccl-encode-mule-utf-16be): Likewise.

2005-02-28  Kenichi Handa  <handa@m17n.org>

	* international/utf-16.el (ccl-encode-mule-utf-16le-with-signature):
	Fix BUFFER_MAGNIFICATION to 4.
	(ccl-encode-mule-utf-16be-with-signature): Likewise.

---
Ken'ichi HANDA
handa@m17n.org


* Re: copying to clipboard silently discards lines
  2005-02-28  7:20         ` Kenichi Handa
@ 2005-02-28 19:13           ` Benjamin Riefenstahl
  2005-03-08  9:10             ` Kenichi Handa
  0 siblings, 1 reply; 6+ messages in thread
From: Benjamin Riefenstahl @ 2005-02-28 19:13 UTC
  Cc: Chong Yidong, emacs-devel

Hi Kenichi Handa,

Kenichi Handa writes:
> I've just installed these fixes.

Thanks. 

> but we must set 4 for XXX-with-signature. 

Really?  I thought XXX-with-signature just adds 2 additional bytes,
regardless of how long the string is.  That should be covered by "+
CONVERSION_BUFFER_EXTRA_ROOM" (== 256) in encoding_buffer_size() and
similar extra space in other places.  I may be missing something, of
course.
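
The arithmetic behind that question can be sketched in C as follows. This
is only an illustration with hypothetical names, assuming ASCII-only input,
DOS line endings, a BUFFER_MAGNIFICATION of 2, and that the 256 extra bytes
are actually available to the encoder, which is the open point here.

```c
#include <assert.h>
#include <stddef.h>

/* Value of CONVERSION_BUFFER_EXTRA_ROOM as quoted in this message.  */
#define SKETCH_EXTRA_ROOM 256

/* Size of the signature (byte order mark) written before the text.  */
#define SKETCH_SIGNATURE_BYTES 2

/* Hypothetical worst case for ASCII input with DOS line endings:
   every source byte is a newline and becomes CR LF in UTF-16
   (4 bytes), preceded once by the 2-byte signature.  */
size_t
sketch_worst_case_with_signature (size_t src_bytes)
{
  return SKETCH_SIGNATURE_BYTES + 4 * src_bytes;
}

/* Capacity when the magnification is 2 and the result is doubled
   again for CRLF, plus the extra room.  */
size_t
sketch_capacity_magnification_2 (size_t src_bytes)
{
  size_t capacity = src_bytes * 2 * 2 + SKETCH_EXTRA_ROOM;

  assert (capacity >= src_bytes);       /* guard against overflow */
  return capacity;
}
```

In this model the capacity exceeds the worst case by a constant 254 bytes
for every input length, so the signature costs a fixed amount rather than a
larger multiplier.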

benny


* Re: copying to clipboard silently discards lines
  2005-02-28 19:13           ` Benjamin Riefenstahl
@ 2005-03-08  9:10             ` Kenichi Handa
  0 siblings, 0 replies; 6+ messages in thread
From: Kenichi Handa @ 2005-03-08  9:10 UTC
  Cc: cyd, emacs-devel

In article <m3fyzgxztb.fsf@seneca.benny.turtle-trading.net>, Benjamin Riefenstahl <b.riefenstahl@turtle-trading.net> writes:
>>  but we must set 4 for XXX-with-signature. 

> Really?  I thought XXX-with-signature just adds 2 additional bytes,
> regardless how long the string is.  That should be covered by "+
> CONVERSION_BUFFER_EXTRA_ROOM" (== 256) in encoding_buffer_size() and
> similar extra space in other places.  I may be missing something, of
> course.

Those extra 256 bytes were intended for internal use only, for
instance to hold an error message produced by a CCL program.
But since requiring an output buffer 4 times bigger is too much
for UTF-16 encoding, I documented those extra 256 bytes in the
docstring of define-ccl-program, and changed
ccl-encode-mule-utf-16le/be-with-signature to require an output
buffer only 2 times bigger.

---
Ken'ichi HANDA
handa@m17n.org


