From mboxrd@z Thu Jan 1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Wolfgang Scherer
Newsgroups: gmane.emacs.bugs
Subject: replace-match problem
Date: Fri, 3 May 2002 17:44:13 +0200
Sender: bug-gnu-emacs-admin@gnu.org
Message-ID: <15570.45133.389840.832342@farmer.simul.de>
NNTP-Posting-Host: localhost.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Trace: main.gmane.org 1020443061 17448 127.0.0.1 (3 May 2002 16:24:21 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Fri, 3 May 2002 16:24:21 +0000 (UTC)
Return-path:
Original-Received: from fencepost.gnu.org ([199.232.76.164])
	by main.gmane.org with esmtp (Exim 3.33 #1 (Debian))
	id 173fr2-0004XJ-00
	for ; Fri, 03 May 2002 18:24:20 +0200
Original-Received: from localhost ([127.0.0.1] helo=fencepost.gnu.org)
	by fencepost.gnu.org with esmtp (Exim 3.34 #1 (Debian))
	id 173fqz-0006WH-00; Fri, 03 May 2002 12:24:17 -0400
Original-Received: from mailout01.sul.t-online.com ([194.25.134.80])
	by fencepost.gnu.org with esmtp (Exim 3.34 #1 (Debian))
	id 173fpp-0006NS-00
	for ; Fri, 03 May 2002 12:23:05 -0400
Original-Received: from fwd03.sul.t-online.de
	by mailout01.sul.t-online.com with smtp
	id 173fFe-0007Ny-08; Fri, 03 May 2002 17:45:42 +0200
Original-Received: from farmer.simul.de (520043676698-0001@[80.133.254.130])
	by fmrl03.sul.t-online.com with esmtp
	id 173fFY-1TnrKiC; Fri, 3 May 2002 17:45:36 +0200
Original-Received: (from ws@localhost)
	by farmer.simul.de (8.11.6/8.10.2/SuSE Linux 8.10.0-0.3)
	id g43FiRZ23767; Fri, 3 May 2002 17:44:27 +0200
Original-To: bug-gnu-emacs@gnu.org
X-Mailer: VM 7.04 under Emacs 21.1.1
X-Sender: 520043676698-0001@t-dialin.net
Errors-To: bug-gnu-emacs-admin@gnu.org
X-BeenThere: bug-gnu-emacs@gnu.org
X-Mailman-Version: 2.0.9
Precedence: bulk
List-Help:
List-Post:
List-Subscribe: ,
List-Id: Bug reports for GNU Emacs, the Swiss army knife of text editors
List-Unsubscribe: ,
List-Archive:
Xref: main.gmane.org
 gmane.emacs.bugs:1139
X-Report-Spam: http://spam.gmane.org/gmane.emacs.bugs:1139

This bug report will be sent to the Free Software Foundation,
not to your local site managers!
Please write in English, because the Emacs maintainers do not have
translators to read other languages for them.

Your bug report will be posted to the bug-gnu-emacs@gnu.org mailing list,
and to the gnu.emacs.bug news group.

In GNU Emacs 21.1.1 (i386-suse-linux, X toolkit, Xaw3d scroll bars)
 of 2002-03-25 on stephens
configured using `configure --with-gcc --with-pop --with-system-malloc
 --prefix=/usr --exec-prefix=/usr --infodir=/usr/share/info
 --mandir=/usr/share/man --sharedstatedir=/var/lib --libexecdir=/usr/lib
 --with-x --with-xpm --with-jpeg --with-tiff --with-gif --with-png
 --with-x-toolkit=lucid --x-includes=/usr/X11R6/include
 --x-libraries=/usr/X11R6/lib i386-suse-linux CC=gcc
 'CFLAGS=-O2 -march=i486 -mcpu=i686 -pipe -DSYSTEM_PURESIZE_EXTRA=25000
 -DSITELOAD_PURESIZE_EXTRA=10000 -D_GNU_SOURCE ' LDFLAGS=-s
 build_alias=i386-suse-linux host_alias=i386-suse-linux
 target_alias=i386-suse-linux'
Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: POSIX
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: german
  locale-coding-system: iso-latin-1
  default-enable-multibyte-characters: nil

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

REPLACE-MATCH PROBLEM
=====================

The built-in function `replace-match' seems to behave inconsistently.
Specifically, I have a problem with the semantics of "words" and
"newtext".

>From the documentation of `replace-match':

    Otherwise maybe capitalize the whole text, or maybe just word
    initials, based on the replaced text.

    [1] If the replaced text has only capital letters and has at
        least one multiletter word, convert NEWTEXT to all caps.
    [2] If the replaced text has at least one word starting with a
        capital letter, then capitalize each word in NEWTEXT.

1. The lower case and upper case examples in lines 1, 2, 6, 7, 11, 12,
   16, 17 could suggest that "\\&" is subject to case conversion.
   Lines 4, 5, 9, 10, 14, 15, 19, 20 show that this is not the case.

   (I think a clarification would be nice, e.g. "Case conversion is
   done before any special sequences are expanded.")

2. The lower case and upper case examples also suggest that the
   number of non-word-constituent characters between words makes no
   difference.

   The examples for mixed-case replaced text in lines 3, 8, 13, 18
   show that the number of non-word-constituent characters does in
   fact make a difference.

   This is a consequence of replace-match in search.c not checking
   the syntax code of the current character. As a result, the second
   and any further separator characters are treated as if they were
   the initial characters of a word.

3. The test examples for mixed-case replaced text in lines 4, 5, 9,
   10, 14, 15, 19, 20 show that description [2] is plainly wrong. It
   should state that capitalization is only done when ALL words in
   the replaced text are capitalized.

   At least the code in search.c says so:

      /* Capitalize each word, if the old text has all capitalized words.  */

TEST CASE
=========

The following table was generated with a test expression that copies
INPUT with fixed case ("\\& => \\&") and then replaces the copy of
INPUT with case conversion (e.g. "\\& : your-string").

    INPUT          \& REPL        STRING-REPL
  1 my-string   => my-string   : your--string
  2 MY-STRING   => MY-STRING   : YOUR--STRING
  3 My-String   => My-String   : Your--String
  4 My-string   => My-string   : your--string
  5 my-String   => my-String   : your--string
  6 my--string  => my--string  : your--string
  7 MY--STRING  => MY--STRING  : YOUR--STRING
  8 My--String  => My--String  : your--string
  9 My--string  => My--string  : your--string
 10 my--String  => my--String  : your--string
 11 my string   => my string   : your string
 12 MY STRING   => MY STRING   : YOUR STRING
 13 My String   => My String   : Your String
 14 My string   => My string   : your string
 15 my String   => my String   : your string
 16 my  string  => my  string  : your string
 17 MY  STRING  => MY  STRING  : YOUR STRING
 18 My  String  => My  String  : your string
 19 My  string  => My  string  : your string
 20 my  String  => my  String  : your string

EMACS search.c (no difference between 20.7, 21.1 and 21.2)
==============

>>        if (LOWERCASEP (c))
>>          {
>>            /* Cannot be all caps if any original char is lower case */
>>
>>            some_lowercase = 1;
>>            if (SYNTAX (prevc) != Sword)
>>              some_nonuppercase_initial = 1;
>>            else
>>              some_multiletter_word = 1;
>>          }
>>        else if (!NOCASEP (c))
>>          {
>>            some_uppercase = 1;
>>            if (SYNTAX (prevc) != Sword)
>>              ;
>>            else
>>              some_multiletter_word = 1;
>>          }
>>        else
>>          {
>>            /* If the initial is a caseless word constituent,
>>               treat that like a lowercase initial.  */
>>            if (SYNTAX (prevc) != Sword)
>>              some_nonuppercase_initial = 1;
>>          }

I think the following would be more correct (lines without ">>" are
my additions):

      if (SYNTAX (c) == Sword)
        {
>>        if (LOWERCASEP (c))
>>          {
>>            /* Cannot be all caps if any original char is lower case */
>>
>>            some_lowercase = 1;
>>            if (SYNTAX (prevc) != Sword)
>>              some_nonuppercase_initial = 1;
>>            else
>>              some_multiletter_word = 1;
>>          }
>>        else if (!NOCASEP (c))
>>          {
>>            some_uppercase = 1;
>>            if (SYNTAX (prevc) != Sword)
>>              ;
>>            else
>>              some_multiletter_word = 1;
>>          }
>>        else
>>          {
>>            /* If the initial is a caseless word constituent,
>>               treat that like a lowercase initial.  */
>>            if (SYNTAX (prevc) != Sword)
>>              some_nonuppercase_initial = 1;
>>          }
        }

Or:

>>        if (LOWERCASEP (c))
>>          {
>>            /* Cannot be all caps if any original char is lower case */
>>
>>            some_lowercase = 1;
>>            if (SYNTAX (prevc) != Sword)
>>              some_nonuppercase_initial = 1;
>>            else
>>              some_multiletter_word = 1;
>>          }
>>        else if (!NOCASEP (c))
>>          {
>>            some_uppercase = 1;
>>            if (SYNTAX (prevc) != Sword)
>>              ;
>>            else
>>              some_multiletter_word = 1;
>>          }
          else if (SYNTAX (c) == Sword)
>>          {
>>            /* If the initial is a caseless word constituent,
>>               treat that like a lowercase initial.  */
>>            if (SYNTAX (prevc) != Sword)
>>              some_nonuppercase_initial = 1;
>>          }

TEST EXPRESSION (JUST FOR REFERENCE)
====================================

;; |:debug:|
(let ((case-fold-search t)
      (case-replace nil)
      (str-wid 13)
      (line-no 1)
      (r-s (function
            (lambda (SEARCH REPL)
              (while (search-forward SEARCH nil t)
                ;; Duplicate SEARCH (with FIXEDCASE == t)
                (replace-match
                 (format
                  (format "%%-%ss => %%s"
                          (- str-wid (length (match-string 0))))
                  "\\&" "\\&") t nil)
                (goto-char (match-beginning 0))
                ;; Find copy of SEARCH
                (search-forward SEARCH nil t 2)
                ;; Replace SEARCH (with FIXEDCASE == nil) by "\\& : REPL"
                (replace-match
                 (format
                  (format "%%-%ss : %%s"
                          (- str-wid (length (match-string 0))))
                  "\\&" REPL) nil nil)
                ;; Add line number
                (beginning-of-line)
                (insert (format "%3d " line-no))
                (end-of-line)
                (setq line-no (1+ line-no)))))))
  (save-excursion
    (funcall r-s "my-string" "your--string")
    (funcall r-s "my--string" "your--string")
    (funcall r-s "my string" "your string")
    (funcall r-s "my  string" "your string")
    ))
;; |:debug:|

Recent input:
C-x C-f m a i l C-u C-c u d f C-x k

Recent messages:
emacs-replace-match-bug.el has auto save data; consider M-x recover-file
Scanning buffer for index ( 0%)
Scanning buffer for index (100%)
call-interactively: Quit
Wrote /usr/people/ws/emacs-init/replace-match/emacs-replace-match-bug.el [3 times]
Mark set [4 times]
Wrote /usr/people/ws/emacs-init/replace-match/emacs-replace-match-bug.el [2 times]
Mark set
(New file)
Loading emacsbug...done
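
STAND-ALONE SKETCH (NOT EMACS CODE)
===================================

The effect of the first proposed change can be tried outside Emacs
with the small C sketch below. It is only an approximation under
stated assumptions: `classify', `word_p' and `enum case_action' are
hypothetical names, <ctype.h> (ASCII letters and digits) stands in
for the syntax table, and the decision step at the end is a
simplification modelled on rule [1] and on the search.c comment
quoted under point 3. It is not the actual decision code in
replace-match.

```c
#include <ctype.h>

/* Hypothetical result type for this sketch.  */
enum case_action { NOCHANGE, CAP_INITIAL, ALL_CAPS };

/* Stand-in for SYNTAX (c) == Sword.  Assumption: only ASCII letters
   and digits are word constituents; Emacs asks the syntax table.  */
static int
word_p (int c)
{
  return isalnum (c) != 0;
}

/* Scan OLD the way the quoted loop does.  If CHECK_SYNTAX is nonzero,
   characters that are not word constituents are skipped, as in the
   first proposed change.  some_uppercase from the quoted code is
   left out because the simplified decision below does not use it.  */
enum case_action
classify (const char *old, int check_syntax)
{
  int some_multiletter_word = 0;
  int some_lowercase = 0;
  int some_nonuppercase_initial = 0;
  int prevc = '\n';
  const unsigned char *p;

  for (p = (const unsigned char *) old; *p; p++)
    {
      int c = *p;

      if (!check_syntax || word_p (c))
        {
          if (islower (c))
            {
              /* Cannot be all caps if any original char is lower case.  */
              some_lowercase = 1;
              if (!word_p (prevc))
                some_nonuppercase_initial = 1;
              else
                some_multiletter_word = 1;
            }
          else if (isupper (c))
            {
              if (word_p (prevc))
                some_multiletter_word = 1;
            }
          else
            {
              /* Caseless character: counts as a lowercase initial
                 whenever the previous character is not a word
                 constituent.  Without the syntax check this also
                 catches the second of two separators.  */
              if (!word_p (prevc))
                some_nonuppercase_initial = 1;
            }
        }
      prevc = c;
    }

  /* Assumed, simplified decision step.  */
  if (!some_lowercase && some_multiletter_word)
    return ALL_CAPS;
  if (!some_nonuppercase_initial && some_multiletter_word)
    return CAP_INITIAL;
  return NOCHANGE;
}
```

Without the check, classify ("My--String", 0) yields NOCHANGE, which
corresponds to line 8 of the table, while classify ("My-String", 0)
yields CAP_INITIAL as in line 3. With the check enabled,
classify ("My--String", 1) also yields CAP_INITIAL.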