From mboxrd@z Thu Jan 1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Kenichi Handa
Newsgroups: gmane.emacs.bugs
Subject: Re: Bug in regexp matching in data readed in binary mode
Date: Thu, 16 Jan 2003 09:31:41 +0900 (JST)
Sender: bug-gnu-emacs-bounces+gnu-bug-gnu-emacs=m.gmane.org@gnu.org
Message-ID: <200301160031.JAA10750@etlken.m17n.org>
References: <2nvg0qqomo.fsf@zsh.cs.rochester.edu>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=US-ASCII
X-Trace: main.gmane.org 1042677464 18148 80.91.224.249 (16 Jan 2003 00:37:44 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Thu, 16 Jan 2003 00:37:44 +0000 (UTC)
Cc: monnier@cs.yale.edu
Return-path:
Original-Received: from monty-python.gnu.org ([199.232.76.173])
	by main.gmane.org with esmtp (Exim 3.35 #1 (Debian))
	id 18Yy2Q-0004iP-00
	for ; Thu, 16 Jan 2003 01:37:42 +0100
Original-Received: from localhost ([127.0.0.1] helo=monty-python.gnu.org)
	by monty-python.gnu.org with esmtp (Exim 4.10.13)
	id 18Yy3d-0008Fv-01
	for gnu-bug-gnu-emacs@m.gmane.org; Wed, 15 Jan 2003 19:38:57 -0500
Original-Received: from list by monty-python.gnu.org with tmda-scanned (Exim 4.10.13)
	id 18Yy2n-0007fo-00
	for bug-gnu-emacs@gnu.org; Wed, 15 Jan 2003 19:38:05 -0500
Original-Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.10.13)
	id 18Yy2i-0007aV-00
	for bug-gnu-emacs@gnu.org; Wed, 15 Jan 2003 19:38:00 -0500
Original-Received: from tsukuba.m17n.org ([192.47.44.130])
	by monty-python.gnu.org with esmtp (Exim 4.10.13)
	id 18Yxwq-0004Jl-00
	for bug-gnu-emacs@gnu.org; Wed, 15 Jan 2003 19:31:56 -0500
Original-Received: from fs.m17n.org (fs.m17n.org [192.47.44.2])h0G0Vgk13050;
	Thu, 16 Jan 2003 09:31:42 +0900 (JST)
	(envelope-from handa@m17n.org)
Original-Received: from etlken.m17n.org (etlken.m17n.org [192.47.44.125])
	h0G0VfR10287; Thu, 16 Jan 2003 09:31:42 +0900 (JST)
Original-Received: (from handa@localhost) by etlken.m17n.org
	(8.8.8+Sun/3.7W-2001040620) id JAA10750;
	Thu, 16 Jan 2003 09:31:41 +0900 (JST)
Original-To: zsh@cs.rochester.edu
In-reply-to: <2nvg0qqomo.fsf@zsh.cs.rochester.edu> (message from ShengHuo ZHU
	on Wed, 15 Jan 2003 13:00:47 -0500)
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2
	Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
Original-cc: ott@jet.msk.su
Original-cc: bug-gnu-emacs@gnu.org
X-BeenThere: bug-gnu-emacs@gnu.org
X-Mailman-Version: 2.1b5
Precedence: list
List-Id: Bug reports for GNU Emacs, the Swiss army knife of text editors
List-Help:
List-Post:
List-Subscribe: ,
List-Archive:
List-Unsubscribe: ,
Errors-To: bug-gnu-emacs-bounces+gnu-bug-gnu-emacs=m.gmane.org@gnu.org
Xref: main.gmane.org gmane.emacs.bugs:4210
X-Report-Spam: http://spam.gmane.org/gmane.emacs.bugs:4210

In article <2nvg0qqomo.fsf@zsh.cs.rochester.edu>, ShengHuo ZHU writes:
> I can reproduce the bug, however I think the bug is probably related
> to the decoding part of the read_process_output function, which I am
> not familiar with.

> I hope that the following code (based on Alex's code) may illustrate
> the bug in a better way. The first piece of code returns t, which is
> obviously wrong. But if we insert a space and delete it, it returns
> nil in the second piece.

Thank you for the test case. The bug was in regex.c. The
attached patch will fix it. Please try it. It also fixes
the bug of backward searching of eight-bit-graphic char.

Stefan, it seems that you are maintaining regex.c. What
should I do for this change? Can I directly install it in
HEAD (and perhaps in RC)?

---
Ken'ichi HANDA
handa@m17n.org

2003-01-16  Kenichi Handa

	* regex.c (GET_CHAR_BEFORE_2): Fix for the case that the previous
	char is eight-bit-graphic.
	(re_search_2): Likewise.

*** regex.c.~1.183.~	Wed Dec 4 17:26:49 2002
--- regex.c	Thu Jan 16 09:27:28 2003
***************
*** 157,164 ****
      {                                                                 \
        re_char *dtemp = (p) == (str2) ? (end1) : (p);                  \
        re_char *dlimit = ((p) > (str2) && (p) <= (end2)) ? (str2) : (str1); \
        while (dtemp-- > dlimit && !CHAR_HEAD_P (*dtemp));              \
!       c = STRING_CHAR (dtemp, (p) - dtemp);                           \
      }                                                                 \
    else                                                                \
      (c = ((p) == (str2) ? (end1) : (p))[-1]);                         \
--- 157,168 ----
      {                                                                 \
        re_char *dtemp = (p) == (str2) ? (end1) : (p);                  \
        re_char *dlimit = ((p) > (str2) && (p) <= (end2)) ? (str2) : (str1); \
+       re_char *d0 = dtemp;                                            \
        while (dtemp-- > dlimit && !CHAR_HEAD_P (*dtemp));              \
!       if (MULTIBYTE_FORM_LENGTH (dtemp, d0 - dtemp) == d0 - dtemp)    \
!         c = STRING_CHAR (dtemp, d0 - dtemp);                          \
!       else                                                            \
!         c = d0[-1];                                                   \
      }                                                                 \
    else                                                                \
      (c = ((p) == (str2) ? (end1) : (p))[-1]);                         \
***************
*** 4307,4324 ****
                      p--, len++;
  
                    /* Adjust it. */
- #if 0                           /* XXX */
                    if (MULTIBYTE_FORM_LENGTH (p, len + 1) != (len + 1))
!                     ;
!                   else
! #endif
!                     {
!                       range += len;
!                       if (range > 0)
!                         break;
  
!                       startpos -= len;
!                     }
                  }
              }
          }
--- 4311,4326 ----
                      p--, len++;
  
                    /* Adjust it. */
                    if (MULTIBYTE_FORM_LENGTH (p, len + 1) != (len + 1))
!                     /* The previous character is eight-bit-graphic which
!                        is represented by one byte even in a multibyte
!                        buffer/string. */
!                     len = 0;
!                   range += len;
!                   if (range > 0)
!                     break;
  
!                   startpos -= len;
                  }
              }
          }
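
For readers who do not have regex.c at hand, the idea behind the GET_CHAR_BEFORE_2 change can be pictured with a small standalone sketch. This is not the Emacs code: the byte classification below is a toy model assumed only for illustration, and the names is_char_head, declared_length and char_before are hypothetical. It scans back to the nearest character head, then accepts the multibyte form only if its declared length exactly covers the bytes scanned; otherwise the byte just before the position is taken as a lone eight-bit character.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (an assumption, not Emacs's real tables): bytes below 0xA0
   start a character; 0x80..0x8F announce a 2-byte form, 0x90..0x9F a
   3-byte form; bytes 0xA0..0xFF are trailing bytes or a lone raw byte. */
static int is_char_head(unsigned char b)
{
    return b < 0xA0;
}

/* Length, in bytes, that a head byte claims for its character. */
static size_t declared_length(unsigned char head)
{
    if (head < 0x80)
        return 1;
    if (head < 0x90)
        return 2;
    return 3;
}

/* Return the character that ends just before p, never reading before
   limit.  Multibyte characters are returned as their bytes packed into
   one integer, which is enough for illustration. */
static unsigned long char_before(const unsigned char *limit,
                                 const unsigned char *p)
{
    assert(p > limit);

    const unsigned char *d0 = p;      /* remember where we started */
    const unsigned char *d = p - 1;

    /* Walk back to the nearest character head. */
    while (d > limit && !is_char_head(*d))
        d--;

    size_t span = (size_t)(d0 - d);
    if (is_char_head(*d) && declared_length(*d) == span) {
        /* The form found spans exactly the bytes up to p: decode it. */
        unsigned long c = 0;
        for (const unsigned char *q = d; q < d0; q++)
            c = (c << 8) | *q;
        return c;
    }

    /* Otherwise the previous byte stands alone as a raw 8-bit char. */
    return d0[-1];
}
```

With the toy buffer { 'a', 0x81, 0xA5, 0xE9 }, asking for the character before the end yields the lone byte 0xE9 rather than a bogus three-byte decode starting at 0x81, which is the kind of case the first hunk of the patch addresses.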