CCL_WRITE_CHAR and CCL_WRITE_MULTIBYTE_CHAR

unofficial mirror of emacs-devel@gnu.org 

* CCL_WRITE_CHAR and CCL_WRITE_MULTIBYTE_CHAR
From: YAMAMOTO Mitsuharu @ 2008-01-17  2:37 UTC (permalink / raw)
  To: emacs-devel

I suspect the boundary checking in CCL_WRITE_CHAR and
CCL_WRITE_MULTIBYTE_CHAR can be relaxed by 1.  I mean,

    else if (dst + bytes + extra_bytes <= (dst_bytes ? dst_end : src))	\

instead of 

    else if (dst + bytes + extra_bytes < (dst_bytes ? dst_end : src))	\

I have a situation where the destination buffer size is tight and the
last byte is not filled.

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp
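
The off-by-one described above can be reproduced outside Emacs.  The
following is only a minimal sketch: `fill` is a hypothetical helper,
not code from ccl.c, and it handles one-byte items only.  It compares
the remaining room with the item size, which is equivalent to testing
`dst + bytes < dst_end` (strict) or `dst + bytes <= dst_end`
(inclusive).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from ccl.c.  Copy up to N bytes from
   SRC into [DST, DST_END).  When INCLUSIVE is nonzero the bound is
   tested as "bytes <= room", otherwise as "bytes < room".  Return the
   number of bytes stored.  */
size_t
fill (unsigned char *dst, unsigned char *dst_end,
      const unsigned char *src, size_t n, int inclusive)
{
  unsigned char *start = dst;
  size_t bytes = 1;             /* every item is one byte in this sketch */
  size_t i;

  assert (dst <= dst_end);
  for (i = 0; i < n; i++)
    {
      size_t room = (size_t) (dst_end - dst);
      int fits = inclusive ? bytes <= room : bytes < room;

      if (!fits)
        break;
      *dst++ = src[i];
    }
  return (size_t) (dst - start);
}
```

With a 4-byte destination and 4 bytes of input, the strict test stores
only 3 bytes and leaves the last byte unused, while the inclusive test
stores all 4.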


* Re: CCL_WRITE_CHAR and CCL_WRITE_MULTIBYTE_CHAR
From: Kenichi Handa @ 2008-01-31 11:35 UTC (permalink / raw)
  To: YAMAMOTO Mitsuharu; +Cc: emacs-devel

In article <wlejchrxch.wl%mituharu@math.s.chiba-u.ac.jp>, YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp> writes:

> I suspect the boundary checking in CCL_WRITE_CHAR and
> CCL_WRITE_MULTIBYTE_CHAR can be relaxed by 1.  I mean,

>     else if (dst + bytes + extra_bytes <= (dst_bytes ? dst_end : src))	\

> instead of 

>     else if (dst + bytes + extra_bytes < (dst_bytes ? dst_end : src))	\

At least, in CCL_WRITE_CHAR, that change is not safe because
extra_bytes will be incremented after that check.  I've just
installed the attached change to the main trunk.  I dared
not install that change to EMACS_22_BASE because I'm still
not that confident about the change.  In addition the
problem has not been revealed so long, and it can be avoided
by giving a bigger BUFFER_MAGNIFICATION in
define-ccl-program.

---
Kenichi Handa
handa@ni.aist.go.jp

Index: ccl.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/ccl.c,v
retrieving revision 1.102
retrieving revision 1.103
diff -u -r1.102 -r1.103
--- ccl.c	8 Jan 2008 20:44:20 -0000	1.102
+++ ccl.c	31 Jan 2008 11:27:46 -0000	1.103
@@ -748,16 +748,13 @@
     int bytes = SINGLE_BYTE_CHAR_P (ch) ? 1: CHAR_BYTES (ch);		\
     if (!dst)								\
       CCL_INVALID_CMD;							\
-    else if (dst + bytes + extra_bytes < (dst_bytes ? dst_end : src))	\
+    if (ccl->eight_bit_control						\
+	&& bytes == 1 && (ch) >= 0x80 && (ch) < 0xA0)			\
+      extra_bytes++;							\
+    if (dst + bytes + extra_bytes <= (dst_bytes ? dst_end : src))	\
       {									\
 	if (bytes == 1)							\
-	  {								\
-	    *dst++ = (ch);						\
-	    if (extra_bytes && (ch) >= 0x80 && (ch) < 0xA0)		\
-	      /* We may have to convert this eight-bit char to		\
-		 multibyte form later.  */				\
-	      extra_bytes++;						\
-	  }								\
+	  *dst++ = (ch);						\
 	else if (CHAR_VALID_P (ch, 0))					\
 	  dst += CHAR_STRING (ch, dst);					\
 	else								\
@@ -775,7 +772,7 @@
     int bytes = CHAR_BYTES (ch);					\
     if (!dst)								\
       CCL_INVALID_CMD;							\
-    else if (dst + bytes + extra_bytes < (dst_bytes ? dst_end : src))	\
+    else if (dst + bytes + extra_bytes <= (dst_bytes ? dst_end : src))	\
       {									\
 	if (CHAR_VALID_P ((ch), 0))					\
 	  dst += CHAR_STRING ((ch), dst);				\
@@ -919,7 +916,7 @@
      each of them will be converted to multibyte form of 2-byte
      sequence.  For that conversion, we remember how many more bytes
      we must keep in DESTINATION in this variable.  */
-  int extra_bytes = ccl->eight_bit_control;
+  int extra_bytes = 0;
   int eof_ic = ccl->eof_ic;
   int eof_hit = 0;
 
> I have a situation where the destination buffer size is tight and the
> last byte is not filled.





* Re: CCL_WRITE_CHAR and CCL_WRITE_MULTIBYTE_CHAR
From: YAMAMOTO Mitsuharu @ 2008-01-31 12:37 UTC (permalink / raw)
  To: handa; +Cc: emacs-devel

[-- Attachment #1: Type: Text/Plain, Size: 2810 bytes --]

>>>>> On Thu, 31 Jan 2008 20:35:34 +0900, Kenichi Handa <handa@ni.aist.go.jp> said:

> At least, in CCL_WRITE_CHAR, that change is not safe because
> extra_bytes will be incremented after that check.

But in the original code, extra_bytes is initialized to 1, not 0, in
the case that increment occurs.
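
The argument can be checked with a small model.  This is only a sketch
under stated assumptions: `model_write` is a hypothetical function that
mimics the revision 1.102 bookkeeping for single-byte characters, with
`<=` substituted for `<`; it is not the real macro.  Because
extra_bytes starts at 1 whenever it can be incremented, the check made
before a write already reserves the byte that the increment after the
write will claim.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of the revision 1.102 CCL_WRITE_CHAR bookkeeping,
   single-byte characters only.  EIGHT_BIT_CONTROL stands in for
   ccl->eight_bit_control (0 or 1).  Return the number of bytes stored
   and set *EXPANSION to the number of stored bytes in 0x80..0x9F,
   each of which needs one more byte when converted to multibyte form
   later.  */
size_t
model_write (const unsigned char *src, size_t n, size_t capacity,
             int eight_bit_control, size_t *expansion)
{
  size_t dst = 0;               /* offset of the next free byte */
  size_t extra_bytes = eight_bit_control ? 1 : 0;
  size_t controls = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      /* The relaxed check proposed in the thread: <= instead of <.  */
      if (!(dst + 1 + extra_bytes <= capacity))
        break;
      dst++;
      if (extra_bytes && src[i] >= 0x80 && src[i] < 0xA0)
        {
          extra_bytes++;
          controls++;
        }
      /* The initial 1 already paid for this character's possible
         growth, so the later conversion still fits.  */
      assert (dst + controls <= capacity);
    }
  *expansion = controls;
  return dst;
}
```

For the input 0x41 0x80 0x42 0x9F with a capacity of 6 and
eight_bit_control set, the model stores all 4 bytes and reports 2
pending expansion bytes, which fills the 6-byte buffer exactly.  With
the strict `<` test the last byte would be rejected.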

> I've just installed the attached change to the main trunk.  I dared
> not install that change to EMACS_22_BASE because I'm still not that
> confident about the change.  In addition the problem has not been
> revealed so long, and it can be avoided by giving a bigger
> BUFFER_MAGNIFICATION in define-ccl-program.

I'm planning to add some event handlers to the Carbon port after the
Emacs 22.2 release so we can look up a word pointed to by the mouse in
dictionaries using Command-Control-D.  That handler requires us to fill
a given storage with a specified range of buffer text in UTF-16.  (I'm
thinking about BMP-only case.)  The size of the storage in bytes is
exactly twice as large as the length of the range.  That's the "tight
situation" I mentioned in the previous mail.
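
To make the sizing concrete, here is a minimal sketch of the BMP-only
case.  `store_bmp_utf16` is a hypothetical stand-in for the encoder,
not the Carbon port code: it writes one 16-bit unit per character in
host byte order and stops as soon as a unit no longer fits, using the
inclusive bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper.  Store up to N code points from CHARS as
   UTF-16 in host byte order into STORAGE, which holds STORAGE_BYTES
   bytes.  Only BMP characters outside the surrogate range are
   accepted.  Return the number of characters stored.  */
size_t
store_bmp_utf16 (const uint32_t *chars, size_t n,
                 uint16_t *storage, size_t storage_bytes)
{
  unsigned char *dst = (unsigned char *) storage;
  unsigned char *dst_end = dst + storage_bytes;
  size_t i;

  for (i = 0; i < n; i++)
    {
      uint16_t unit;
      size_t room = (size_t) (dst_end - dst);

      if (chars[i] > 0xFFFF || (chars[i] >= 0xD800 && chars[i] <= 0xDFFF))
        break;                  /* outside the BMP-only assumption */
      if (!(sizeof unit <= room))
        break;                  /* inclusive bound: exact fit is allowed */
      unit = (uint16_t) chars[i];
      memcpy (dst, &unit, sizeof unit);
      dst += sizeof unit;
    }
  return i;
}
```

Three BMP characters fit exactly into a storage of 3 * sizeof
(uint16_t) bytes.  A writer that used a strict bound here would stop
after two characters even though room for the third exists.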

				     YAMAMOTO Mitsuharu
				mituharu@math.s.chiba-u.ac.jp

/* Store the text of the buffer BUF from START to END as Unicode
   characters in CHARACTERS.  Return non-zero if successful.  */

int
mac_store_buffer_text_to_unicode_chars (buf, start, end, characters)
     struct buffer *buf;
     int start, end;
     UniChar *characters;
{
  int start_byte, end_byte, char_count, byte_count;
  struct coding_system coding;
  unsigned char *dst = (unsigned char *) characters;

  start_byte = buf_charpos_to_bytepos (buf, start);
  end_byte = buf_charpos_to_bytepos (buf, end);
  char_count = end - start;
  byte_count = end_byte - start_byte;

  if (setup_coding_system (
#ifdef WORDS_BIG_ENDIAN
			   intern ("utf-16be")
#else
			   intern ("utf-16le")
#endif
			   , &coding) < 0)
    return 0;

  coding.src_multibyte = !NILP (buf->enable_multibyte_characters);
  coding.dst_multibyte = 0;
  coding.mode |= CODING_MODE_LAST_BLOCK;
  coding.composing = COMPOSITION_DISABLED;

  if (BUF_GPT_BYTE (buf) <= start_byte || end_byte <= BUF_GPT_BYTE (buf))
    encode_coding (&coding, BUF_BYTE_ADDRESS (buf, start_byte), dst,
		   byte_count, char_count * sizeof (UniChar));
  else
    {
      int first_byte_count = BUF_GPT_BYTE (buf) - start_byte;

      encode_coding (&coding, BUF_BYTE_ADDRESS (buf, start_byte), dst,
		     first_byte_count, char_count * sizeof (UniChar));
      if (coding.result == CODING_FINISH_NORMAL)
	encode_coding (&coding,
		       BUF_BYTE_ADDRESS (buf, start_byte + first_byte_count),
		       dst + coding.produced,
		       byte_count - first_byte_count,
		       char_count * sizeof (UniChar) - coding.produced);
    }

  if (coding.result != CODING_FINISH_NORMAL)
    return 0;

  return 1;
}

[-- Attachment #2: image.png --]
[-- Type: Image/Png, Size: 26160 bytes --]


* Re: CCL_WRITE_CHAR and CCL_WRITE_MULTIBYTE_CHAR
From: Kenichi Handa @ 2008-02-01  1:21 UTC (permalink / raw)
  To: YAMAMOTO Mitsuharu; +Cc: emacs-devel

In article <20080131.213724.217838043.mituharu@math.s.chiba-u.ac.jp>, YAMAMOTO Mitsuharu <mituharu@math.s.chiba-u.ac.jp> writes:

> [1  <text/plain; us-ascii (7bit)>]
>>>>>> On Thu, 31 Jan 2008 20:35:34 +0900, Kenichi Handa <handa@ni.aist.go.jp> said:

> > At least, in CCL_WRITE_CHAR, that change is not safe because
> > extra_bytes will be incremented after that check.

> But in the original code, extra_bytes is initialized to 1, not 0, in
> the case that increment occurs.

Ah, I recalled why I used that tricky logic.  OK, I've just
installed your change in EMACS_22_BASE and cancelled the last
change in the trunk.  So, the change in EMACS_22_BASE will
propagate to the trunk eventually.

---
Kenichi Handa
handa@ni.aist.go.jp
