From: "Herbert Euler"
Newsgroups: gmane.emacs.devel
Subject: Re: A prototype of intelligent replace for Emacs
Date: Sat, 05 May 2007 10:30:27 +0800
To: rms@gnu.org
Cc: emacs-devel@gnu.org

Sorry for the delay. I have been traveling for the past few days and could not connect to the net.

>The concept of "classes" and "blocks" sounds rather complex. In order
>to install such a feature, we would have to document it. At present,
>I don't understand it myself.
>
>Could you give an example to explain what that means?

Sure. The reason I did not write comments or documentation in the attached source code is that the user interface in the prototype is still unstable, and it may change a great deal before it satisfies most people. While that is happening, any written comments and documentation would have to be updated as well, and keeping them consistent with the code is, in my opinion, not easy. So I did not want to maintain both the code and its documentation while they are unstable, and I decided to postpone the documentation until the prototype is stable. I am sorry for this.

Now I will try to explain the concepts of "blocks" and "classes". These concepts are defined in the search phase of a search/replace process. When a word is searched for in a document, many matches may be found. "Blocks" and "classes" are defined in terms of these matches. To define "block" and "class", the concept of "feature" has to be introduced first.

A _feature_ of a match is a value computed from the context of the match according to a predefined rule. The one restriction is that features computed from the same rule must be comparable with one another. Features can be strings or integers.

For example, rule A could be "the feature of a match is the shortest word sequence that contains the match". Now suppose the string "at" is searched for in a document, and three matches are found: the first inside "status", the second inside "match", and the third inside "status" again. Under rule A, the feature of the first and the third match is "status", since the shortest word sequence that contains the match "at" is the single word "status". Similarly, the feature of the second match is the word "match". In this example, every word sequence consists of only one word. If instead the string "a b" is searched for under rule A, the word sequences may consist of more than one word.

If there are several different rules, several features can be computed for each match. Because features computed from the same rule can be compared, matches can be classified, or grouped together, by their features. The grouping is based on similarity among the matches.

Now the concepts of "block" and "class" can be defined. A _block_ is a match plus its features. A _class_ is a set of blocks that all have the same features. Continuing the previous example, since the first and the third block have the same feature "status", they are in one class. The second block forms a class by itself.

Now let us go back to the search/replace topic. When the user wants to replace A with B in a document, they can invoke `replace-string' to replace all matches of A with B, or invoke `query-replace' to replace matches of A with B one by one, by answering `y' or `n' at each match. As described in the paper Cluster-Based Find and Replace by Robert C. Miller and Alisa M.
Marshall (1), another approach, replacing several matches at once by similarity, is faster and more reliable, provided the predefined rules are carefully defined.

The prototype "ireplace" tries to implement such a search/replace mechanism for Emacs, using the concepts of "blocks" and "classes". Currently, the prototype computes only one feature for every match: the shortest word sequence that contains the match. In the ireplace buffer, the feature shared by the blocks in a class is displayed after the class number (i.e. [Class m of n]).

Defining proper rules is important future work. For example, another rule for computing features of matches in program source code could be the section (separated by ^L) in which a match appears, or the type (variable, function, and so on) of a match.

I hope I have explained this clearly. In fact, because English is not my native language, I chose the words "feature", "block", and "class" somewhat arbitrarily. I am not sure what they should be called in English. But since you found them complex, perhaps I should change them to clearer names. What names do you think would be better? Thanks.

Both the user interface and the concepts need to be reviewed now.

Regards,
Guanpeng Xu

(1) Web link: http://graphics.csail.mit.edu/~rcm/chi04.pdf
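
As a rough illustration of the "feature", "block", and "class" definitions above, here is a minimal sketch in Python. It is not the ireplace prototype, which is written for Emacs. All function names here are hypothetical, and the sketch assumes that words are delimited by whitespace, which the description of rule A does not specify.

```python
def find_matches(text, needle):
    """Return (start, end) spans of every non-overlapping occurrence."""
    if not needle:
        raise ValueError("needle must not be empty")
    spans = []
    pos = text.find(needle)
    while pos != -1:
        spans.append((pos, pos + len(needle)))
        pos = text.find(needle, pos + len(needle))
    return spans


def word_sequence_feature(text, start, end):
    """Rule A: the shortest word sequence that contains the match.

    Assumption: a word is a maximal run of non-whitespace characters.
    """
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    while end < len(text) and not text[end].isspace():
        end += 1
    return text[start:end]


def build_blocks(text, needle, rules):
    """A block is a match plus its features, one feature per rule."""
    return [
        {"span": span, "features": tuple(rule(text, *span) for rule in rules)}
        for span in find_matches(text, needle)
    ]


def group_into_classes(blocks):
    """A class is the set of blocks that share the same features."""
    classes = {}
    for block in blocks:
        classes.setdefault(block["features"], []).append(block)
    return classes


# The example from the text: searching for "at" under rule A.
blocks = build_blocks("status match status", "at", [word_sequence_feature])
classes = group_into_classes(blocks)
for number, (features, members) in enumerate(classes.items(), start=1):
    print(f"[Class {number} of {len(classes)}] {features[0]}: "
          f"{len(members)} block(s)")
```

With more rules in the list passed to build_blocks, each block simply carries more features, and two blocks fall into the same class only if all of their features are equal.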