* Bug in regexp matching in data readed in binary mode
From: Alex Ott @ 2003-01-14 14:42 UTC


[-- Attachment #1: Type: text/plain, Size: 2793 bytes --]

This bug report will be sent to the Free Software Foundation,
not to your local site managers!
Please write in English, because the Emacs maintainers do not have
translators to read other languages for them.

Your bug report will be posted to the bug-gnu-emacs@gnu.org mailing list,
and to the gnu.emacs.bug news group.

In GNU Emacs 21.2.1 (i686-pc-linux-gnu, GNU/LessTif Version 1.2 Release 0.92.32)
 of 2002-03-19 on seal.service.jet.msk.su
configured using `configure  --prefix=/usr --with-x-toolkit=motif'
Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: ru_RU.koi8r
  value of $LC_CTYPE: ru_RU.koi8r
  value of $LC_MESSAGES: ru_RU.koi8r
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: C
  value of $LC_TIME: C
  value of $LANG: ru_RU.koi8r
  locale-coding-system: cyrillic-koi8
  default-enable-multibyte-characters: t

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

I have hit this error many times when reading my mail with Gnus via
POP3. Gnus reads data from the POP3 server in binary mode, and
sometimes this bug leads to the loss of my letters.

To reproduce this, I wrote a test case (attached). To reproduce, open
test.el in Emacs and eval it twice (I can't understand why it does not
work on the first pass). The file emacs-test-mail.txt must be in the
same directory as test.el; it is my test mail letter. After the eval
you get two buffers: *test-case* with the command output, and
*test-result* with the (buggy) results of the regexp match.

This error arises only when the lowercase Russian letter 'r' is
present and the character '.' is at the end of the string, so
"^\\.\r\r" will match the full string.


Recent input:
<backspace> <backspace> <backspace> <backspace> <backspace> 
<backspace> <backspace> <backspace> <backspace> t e 
s <tab> <return> <down-mouse-1> <mouse-1> <menu-bar> 
<emacs-lisp> <eval-buffer> <down-mouse-1> <mouse-1> 
C-x b * t <tab> <tab> c <tab> <return> C-x k <return> 
C-x b * t e <tab> <return> C-x k <return> <menu-bar> 
<emacs-lisp> <eval-buffer> C-x b * t e <tab> r e <tab> 
<return> C-x k <return> C-x b * t e <tab> <return> 
C-x k <return> C-x k <return> <up> <down> <left> <down> 
<left> <left> M-g m e m <backspace> <backspace> C-x 
k <return> y <help-echo> <help-echo> <help-echo> <help-echo> 
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo> 
<menu-bar> <help-menu> <report-emacs-bug>

Recent messages:
No more unread newsgroups
nnml: Reading incoming mail from pop... [3 times]
nnml: Reading incoming mail (no new mail)...done
nnml: Reading incoming mail from pop... [3 times]
nnml: Reading incoming mail (no new mail)...done
Making completion list...
nnml: Reading incoming mail from pop... [3 times]
nnml: Reading incoming mail (no new mail)...done
Parsing /home/ott/.mailrc... done
Loading emacsbug...done


[-- Attachment #2: emacs-error.tar.gz --]
[-- Type: application/x-gzip, Size: 812 bytes --]

[-- Attachment #3: Type: text/plain, Size: 55 bytes --]



-- 
With best wishes, Alex Ott
					Jet Infosystems



* Re: Bug in regexp matching in data readed in binary mode
From: ShengHuo ZHU @ 2003-01-15 18:00 UTC
  Cc: handa

Alex Ott <ott@jet.msk.su> writes:

[...]

> I have hit this error many times when reading my mail with Gnus via
> POP3. Gnus reads data from the POP3 server in binary mode, and
> sometimes this bug leads to the loss of my letters.
>
> To reproduce this, I wrote a test case (attached). To reproduce, open
> test.el in Emacs and eval it twice (I can't understand why it does not
> work on the first pass). The file emacs-test-mail.txt must be in the
> same directory as test.el; it is my test mail letter. After the eval
> you get two buffers: *test-case* with the command output, and
> *test-result* with the (buggy) results of the regexp match.
>
> This error arises only when the lowercase Russian letter 'r' is
> present and the character '.' is at the end of the string, so
> "^\\.\r\r" will match the full string.

I can reproduce the bug.  However, I think it is probably related to
the decoding part of the read_process_output function, which I am not
familiar with.

I hope that the following code (based on Alex's code) illustrates the
bug better.  The first piece of code returns t, which is obviously
wrong: the "." follows the byte \305, so it is not at the beginning
of a line and "^" should not match.  But if we insert a space and
delete it, as in the second piece, it returns nil.

(let ((coding-system-for-read 'binary)   ; read process output as raw bytes
      (coding-system-for-write 'binary)
      (buf (get-buffer-create "*test-case*"))
      process)
  (with-current-buffer buf
    (erase-buffer)
    (setq process (start-process "test" buf "echo" "-e" "\r\n\305.\r\n"))
    (accept-process-output process)
    (goto-char (point-min))
    (search-forward ".\r")              ; point is now just after ".\r"
    (forward-char -2)                   ; back up to just before "."
    ;; "." follows \305, not a line start, so this should be nil.
    (looking-at "^\\.\r")))


(let ((coding-system-for-read 'binary)   ; read process output as raw bytes
      (coding-system-for-write 'binary)
      (buf (get-buffer-create "*test-case*"))
      process)
  (with-current-buffer buf
    (erase-buffer)
    (setq process (start-process "test" buf "echo" "-e" "\r\n\305.\r\n"))
    (accept-process-output process)
    (goto-char (point-min))
    (search-forward ".\r")              ; point is now just after ".\r"
    (forward-char -3)                   ; point is just before \305
    ;; Insert a space and delete it again; the buffer text is unchanged.
    (insert " ")
    (delete-backward-char 1)
    (forward-char 1)                    ; point is just before "."
    (looking-at "^\\.\r")))

ShengHuo


* Re: Bug in regexp matching in data readed in binary mode
From: Kenichi Handa @ 2003-01-16  0:31 UTC
  Cc: monnier

In article <2nvg0qqomo.fsf@zsh.cs.rochester.edu>, ShengHuo ZHU <zsh@cs.rochester.edu> writes:
> I can reproduce the bug, however I think the bug is probably related
> to the decoding part of the read_process_output function, which I am
> not familiar with.

> I hope that the following code (based on Alex's code) may illustrate
> the bug in a better way.  The first piece of code returns t, which is
> obviously wrong. But if we insert a space and delete it, it returns
> nil in the second piece.

Thank you for the test case.  The bug was in regex.c.  The
attached patch will fix it.  Please try it.  It also fixes
a bug in backward searching for an eight-bit-graphic char.

Stefan, it seems that you are maintaining regex.c.  What
should I do with this change?  Can I install it directly in
HEAD (and perhaps in RC)?

---
Ken'ichi HANDA
handa@m17n.org

2003-01-16  Kenichi Handa  <handa@m17n.org>

	* regex.c (GET_CHAR_BEFORE_2): Fix for the case that the previous
	char is eight-bit-graphic.
	(re_search_2): Likewise.

*** regex.c.~1.183.~	Wed Dec  4 17:26:49 2002
--- regex.c	Thu Jan 16 09:27:28 2003
***************
*** 157,164 ****
         {						    		\
  	 re_char *dtemp = (p) == (str2) ? (end1) : (p);		    	\
  	 re_char *dlimit = ((p) > (str2) && (p) <= (end2)) ? (str2) : (str1); \
  	 while (dtemp-- > dlimit && !CHAR_HEAD_P (*dtemp));		\
! 	 c = STRING_CHAR (dtemp, (p) - dtemp);				\
         }						    		\
       else						    		\
         (c = ((p) == (str2) ? (end1) : (p))[-1]);			\
--- 157,168 ----
         {						    		\
  	 re_char *dtemp = (p) == (str2) ? (end1) : (p);		    	\
  	 re_char *dlimit = ((p) > (str2) && (p) <= (end2)) ? (str2) : (str1); \
+ 	 re_char *d0 = dtemp;						\
  	 while (dtemp-- > dlimit && !CHAR_HEAD_P (*dtemp));		\
! 	 if (MULTIBYTE_FORM_LENGTH (dtemp, d0 - dtemp) == d0 - dtemp)	\
! 	   c = STRING_CHAR (dtemp, d0 - dtemp);				\
! 	 else								\
! 	   c = d0[-1];							\
         }						    		\
       else						    		\
         (c = ((p) == (str2) ? (end1) : (p))[-1]);			\
***************
*** 4307,4324 ****
  		p--, len++;
  
  	      /* Adjust it. */
- #if 0				/* XXX */
  	      if (MULTIBYTE_FORM_LENGTH (p, len + 1) != (len + 1))
! 		;
! 	      else
! #endif
! 		{
! 		  range += len;
! 		  if (range > 0)
! 		    break;
  
! 		  startpos -= len;
! 		}
  	    }
  	}
      }
--- 4311,4326 ----
  		p--, len++;
  
  	      /* Adjust it. */
  	      if (MULTIBYTE_FORM_LENGTH (p, len + 1) != (len + 1))
! 		/* The previous character is eight-bit-graphic which
! 		   is represented by one byte even in a multibyte
! 		   buffer/string.  */
! 		len = 0;
! 	      range += len;
! 	      if (range > 0)
! 		break;
  
! 	      startpos -= len;
  	    }
  	}
      }
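
The GET_CHAR_BEFORE_2 hunk above can be read as follows: after scanning
backward to the nearest byte that can start a character, the fixed macro
decodes from that byte only if the character form found there accounts for
every byte that was skipped; otherwise it takes the single byte just before
the position.  The C sketch below illustrates that check in isolation.  It
is a toy model, not code from regex.c: the byte classes are simplified
assumptions rather than Emacs's real multibyte tables, and char_head_p,
form_length, decode and char_before are hypothetical names introduced only
for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy byte classes (an assumption, NOT Emacs's real tables):
   - a byte below 0xA0 can start a character;
   - a byte in 0x80..0x9F followed by a byte >= 0xA0 is a two-byte form;
   - a byte >= 0xA0 with no such leader is a raw eight-bit byte that
     stands alone.  */
static int char_head_p(unsigned char b) { return b < 0xA0; }

/* Length of the character form starting at p, using at most avail bytes. */
static size_t form_length(const unsigned char *p, size_t avail)
{
    if (avail >= 2 && p[0] >= 0x80 && p[0] < 0xA0 && p[1] >= 0xA0)
        return 2;
    return 1;
}

/* Decode the form at p into an int code (toy encoding of the result). */
static int decode(const unsigned char *p, size_t avail)
{
    if (form_length(p, avail) == 2)
        return (p[0] << 8) | p[1];
    return p[0];
}

/* Return the character before buf[pos].  With check_length == 0 this
   mimics the pre-patch logic; with check_length != 0 it mimics the
   patched logic.  */
static int char_before(const unsigned char *buf, size_t pos, int check_length)
{
    size_t head = pos;
    size_t span;

    assert(pos > 0);
    /* Scan backward to the nearest byte that can start a character. */
    while (head > 0 && !char_head_p(buf[--head]))
        ;
    span = pos - head;

    /* Patched logic: if the form at head does not cover all skipped
       bytes, the previous character is a lone eight-bit byte.  */
    if (check_length && form_length(buf + head, span) != span)
        return buf[pos - 1];
    return decode(buf + head, span);
}
```

With the bytes '\n', 0xC5, '.' and pos = 2, the unchecked path scans back
past 0xC5 to the newline and returns '\n', which would explain how "^"
could match in front of the "."; the checked path returns 0xC5 instead.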
