* bug#10701: 24.0.93; Crash while decoding input with DOS EOLs
@ 2012-02-02 18:15 Eli Zaretskii
  2012-02-03  8:56 ` Andreas Schwab
  0 siblings, 1 reply; 5+ messages in thread
From: Eli Zaretskii @ 2012-02-02 18:15 UTC
  To: 10701

I see this with today's trunk and with the 24.0.93 pretest, on both
GNU/Linux and MS-Windows.

To reproduce:

 emacs -Q
 C-x b foo RET
 M-: (set-buffer-multibyte nil) RET
 C-x RET c undecided-dos RET C-u M-! gunzip -c emacs-24.0.93.tar.gz RET

(It must be the tarball of Emacs 24.0.93, because the bug is
data-dependent.  It doesn't have to be .tar.gz, as long as you use the
correct decompressor: bunzip2 for .tar.bz2, xz for .tar.xz, etc.  You
can even do this with an uncompressed tarball and cat.  The important
part is that Emacs gets the byte stream of that tarball, and that it
gets it from a subprocess.)

This crashes somewhere in the middle of reading the output from the
subprocess.  The immediate reason for the crash can be seen from this
fragment of the backtrace:

  #0  w32_abort () at w32fns.c:7196
  #1  0x012eea83 in temp_set_point_both (buffer=0x10137600, charpos=45817604,
      bytepos=45817605) at intervals.c:1870
  #2  0x01135816 in Fcall_process (nargs=6, args=0x82f644) at callproc.c:846

As you can see, temp_set_point_both gets a character position and a
byte position that differ by one, which cannot happen in a unibyte
buffer (as can be seen above, the recipe makes the buffer `foo' a
unibyte one).  An assertion inside temp_set_point_both detects the
mismatch and aborts.
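
The invariant at stake can be written down as a small check.  This is
only a sketch: `toy_buffer' and `positions_consistent' are
hypothetical names invented for illustration, not the Emacs buffer
structure or the actual assertion in intervals.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer; NOT Emacs's struct buffer.  */
struct toy_buffer
{
  bool multibyte;
};

/* Return true if CHARPOS and BYTEPOS can describe the same position
   in buffer B.  In a unibyte buffer every character is exactly one
   byte, so the two must be equal.  In a multibyte buffer a character
   takes one or more bytes, so the byte position can only be equal or
   larger.  */
static bool
positions_consistent (const struct toy_buffer *b,
                      ptrdiff_t charpos, ptrdiff_t bytepos)
{
  assert (b != NULL);
  if (!b->multibyte)
    return charpos == bytepos;
  return charpos <= bytepos;
}
```

With the values from the backtrace (charpos=45817604,
bytepos=45817605) this check fails for a unibyte buffer, while it
would pass for a multibyte one.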

The call to temp_set_point_both is in call-process:

		  TEMP_SET_PT_BOTH (PT + process_coding.produced_char,
				    PT_BYTE + process_coding.produced);
		  carryover = process_coding.carryover_bytes;
		  if (carryover > 0)
		    memcpy (buf, process_coding.carryover,
			    process_coding.carryover_bytes);

The crash happens at the point in the input byte stream where the last
byte in the chunk we read from the pipe is \r.  Since the stream is
decoded with the raw-text-dos coding-system, this last \r is left as a
"carryover", in case the next chunk begins with \n.  However,
process_coding.produced does not account for this single unprocessed
byte, and ends up one larger than it should be.
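
To illustrate the accounting, here is a toy model of chunked CRLF
decoding with a held-back \r.  It is a sketch under assumptions:
`toy_decoder' and `toy_decode_chunk' are hypothetical names, not
functions from coding.c, and the model handles only CRLF-to-LF
conversion, with no flush at end of input.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; names are hypothetical, not the Emacs coding API.  */
struct toy_decoder
{
  int pending_cr;   /* 1 if the previous byte was a held-back '\r' */
};

/* Decode the N bytes at SRC into DST, turning "\r\n" into "\n".
   DST must have room for N + 1 bytes, because a '\r' held back from
   an earlier chunk may be emitted here.  Return the number of bytes
   actually written to DST.  A trailing '\r' is not written; it is
   remembered in D until the next byte is seen.  */
static size_t
toy_decode_chunk (struct toy_decoder *d, const unsigned char *src,
                  size_t n, unsigned char *dst)
{
  size_t produced = 0;

  assert (d != NULL && dst != NULL);
  for (size_t i = 0; i < n; i++)
    {
      unsigned char c = src[i];

      if (d->pending_cr)
        {
          d->pending_cr = 0;
          if (c != '\n')
            dst[produced++] = '\r';   /* lone CR: emit it unchanged */
        }
      if (c == '\r')
        d->pending_cr = 1;            /* hold back until the next byte */
      else
        dst[produced++] = c;
    }
  return produced;
}
```

Feeding the two chunks "a\r" and "\nb" to this model writes one byte
for the first chunk and two for the second.  The point is that the
count reported for the first chunk must be 1, not 2, because the \r
has not been written to the destination yet.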

As far as I could see, the problematic code that sets
process_coding.produced to an incorrect value is in decode_coding,
around line 7176:

      else
	{
	  /* Record unprocessed bytes in coding->carryover.  We are
	     sure that the number of data is less than the size of
	     coding->carryover.  */
	  unsigned char *p = coding->carryover;

	  if (nbytes > sizeof coding->carryover)
	    nbytes = sizeof coding->carryover;
	  coding->carryover_bytes = nbytes;
	  while (nbytes-- > 0)
	    *p++ = *src++;
	}
      coding->consumed = coding->src_bytes; <<<<<<<<<<<<<<<<<<<

This last assignment then causes produce_chars to set
coding->produced to an incorrect value:

      /* Source characters are at coding->source.  */
      const unsigned char *src = coding->source;
      const unsigned char *src_end = src + coding->consumed; <<<<<<<<<<<<
      ...
	  produced_chars = coding->consumed_char;
	  while (src < src_end)
	    *dst++ = *src++;
	}
    }

  produced = dst - (coding->destination + coding->produced);  <<<<<<<<<<<
  if (BUFFERP (coding->dst_object) && produced_chars > 0)
    insert_from_gap (produced_chars, produced);
  coding->produced += produced; <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
  coding->produced_char += produced_chars;

I don't understand the logic of "carryover" in decode_coding well
enough to decide how to fix it.

In GNU Emacs 24.0.93.1 (i386-mingw-nt5.1.2600)
 of 2012-02-02 on HOME-C4E4A596F7
Windowing system distributor `Microsoft Corp.', version 5.1.2600
Configured using:
 `configure --with-gcc (3.4) --no-opt'

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: ENU
  value of $XMODIFIERS: nil
  locale-coding-system: cp1255
  default enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
M-x r e p o r t - e m <tab> <return>

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr message format-spec rfc822 mml easymenu
mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045
ietf-drums mm-util mail-prsvr mailabbrev mail-utils gmm-utils mailheader
emacsbug time-date tooltip ediff-hook vc-hooks lisp-float-type mwheel
dos-w32 disp-table ls-lisp w32-win w32-vars tool-bar dnd fontset image
fringe lisp-mode register page menu-bar rfn-eshadow timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame cham
georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese hebrew greek romanian slovak czech european ethiopic
indian cyrillic chinese case-table epa-hook jka-cmpr-hook help simple
abbrev minibuffer loaddefs button faces cus-face files text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process multi-tty emacs)






* bug#10701: 24.0.93; Crash while decoding input with DOS EOLs
  2012-02-02 18:15 bug#10701: 24.0.93; Crash while decoding input with DOS EOLs Eli Zaretskii
@ 2012-02-03  8:56 ` Andreas Schwab
  2012-02-03 15:41   ` Leo
  2012-02-08  8:33   ` Kenichi Handa
  0 siblings, 2 replies; 5+ messages in thread
From: Andreas Schwab @ 2012-02-03  8:56 UTC
  To: Eli Zaretskii; +Cc: 10701

(with-temp-buffer
  (let ((coding-system-for-read 'undecided-dos))
    (set-buffer-multibyte nil)
    (shell-command "yes 'a\r'" t)))

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."






* bug#10701: 24.0.93; Crash while decoding input with DOS EOLs
  2012-02-03  8:56 ` Andreas Schwab
@ 2012-02-03 15:41   ` Leo
  2012-02-08  8:33   ` Kenichi Handa
  1 sibling, 0 replies; 5+ messages in thread
From: Leo @ 2012-02-03 15:41 UTC
  To: 10701

On 2012-02-03 16:56 +0800, Andreas Schwab wrote:
> (with-temp-buffer
>   (let ((coding-system-for-read 'undecided-dos))
>     (set-buffer-multibyte nil)
>     (shell-command "yes 'a\r'" t)))

This crashes Emacs 23.4 as well!

Leo







* bug#10701: 24.0.93; Crash while decoding input with DOS EOLs
  2012-02-03  8:56 ` Andreas Schwab
  2012-02-03 15:41   ` Leo
@ 2012-02-08  8:33   ` Kenichi Handa
  2012-02-10 11:05     ` Eli Zaretskii
  1 sibling, 1 reply; 5+ messages in thread
From: Kenichi Handa @ 2012-02-08  8:33 UTC
  To: Andreas Schwab; +Cc: 10701

In article <m2liokcn21.fsf@igel.home>, Andreas Schwab <schwab@linux-m68k.org> writes:

> (with-temp-buffer
>   (let ((coding-system-for-read 'undecided-dos))
>     (set-buffer-multibyte nil)
>     (shell-command "yes 'a\r'" t)))

I've just installed a fix on the emacs-23 branch.

---
Kenichi Handa
handa@m17n.org






* bug#10701: 24.0.93; Crash while decoding input with DOS EOLs
  2012-02-08  8:33   ` Kenichi Handa
@ 2012-02-10 11:05     ` Eli Zaretskii
  0 siblings, 0 replies; 5+ messages in thread
From: Eli Zaretskii @ 2012-02-10 11:05 UTC
  To: Kenichi Handa; +Cc: 10701, schwab

> From: Kenichi Handa <handa@m17n.org>
> Cc: eliz@gnu.org, 10701@debbugs.gnu.org
> Date: Wed, 08 Feb 2012 17:33:39 +0900
> 
> In article <m2liokcn21.fsf@igel.home>, Andreas Schwab <schwab@linux-m68k.org> writes:
> 
> > (with-temp-buffer
> >   (let ((coding-system-for-read 'undecided-dos))
> >     (set-buffer-multibyte nil)
> >     (shell-command "yes 'a\r'" t)))
> 
> I've just installed a fix to emacs-23 branch.

Thanks, confirmed.




