From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ted Zlatanov
Newsgroups: gmane.emacs.help
Subject: Re: Negative occur
Date: Thu, 29 Nov 2007 09:58:11 -0600
Message-ID: <8663zl2gdo.fsf@lifelogs.com>
To: help-gnu-emacs@gnu.org

On Wed, 28 Nov 2007 14:52:15 -0800 "Drew Adams" wrote:

>> >> > You could try running "occur" with the pattern "^" (which matches
>> >> > every line), then prune the results with M-x delete-matching-lines
DA> RET

DA> [spamfilteraccount suggested that Emacs should have this as part of
DA> `occur'...]

DA> I realize that your suggestion is that this be added to
DA> Emacs. I agree. FYI - In Icicles, just do this: C-' foobar C-~
DA> That shows and lets you visit all lines that do not match the
DA> regexp "foobar".

>> >> Both solutions will be slower on a large buffer than they should be.

DA> What does "slower than they should be" mean? How slow should they be? How
DA> slow are they in fact? How large is a large buffer? How do you judge that
DA> "they" (two totally different approaches and implementations) are slower
DA> than they should be?

I'm certain that creating an *occur* buffer on every line of a 100+ MB
buffer and then removing most of them compares poorly in memory usage
and CPU usage to just matching what you need from it. It's a very
suboptimal approach whose only advantage is that it doesn't require
changes to any internal logic.
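To make the difference concrete, here is a rough sketch in Python (not
Emacs Lisp, and not the actual occur-1/occur-engine code). The function
names and the `invert' parameter are made up for illustration; the only
point is that the first version copies every line before pruning, while
the second tests each line once and never keeps what it doesn't need.

```python
import re
from typing import Iterable, Iterator, List, Tuple

Line = Tuple[int, str]  # (1-based line number, line text)


def occur_then_prune(lines: Iterable[str], regexp: str) -> List[Line]:
    """Filter-later: collect every line, then delete the matching ones.

    Mimics running occur on "^" followed by delete-matching-lines.
    """
    pattern = re.compile(regexp)
    # Full copy of the input, like an *occur* buffer holding every line.
    everything = [(n, text) for n, text in enumerate(lines, start=1)]
    return [(n, text) for n, text in everything if not pattern.search(text)]


def occur_inverted(lines: Iterable[str], regexp: str,
                   invert: bool = True) -> Iterator[Line]:
    """Filter-early: decide per line, with a hypothetical inversion flag.

    With invert=True only non-matching lines are yielded; with
    invert=False this behaves like an ordinary match listing.
    """
    pattern = re.compile(regexp)
    for n, text in enumerate(lines, start=1):
        matched = pattern.search(text) is not None
        if matched != invert:
            yield (n, text)


sample = ["alpha", "foobar one", "beta", "two foobar", "gamma"]
early = list(occur_inverted(sample, "foobar"))
late = occur_then_prune(sample, "foobar")
```

Both calls return the same lines for the sample; they differ only in
how much gets copied along the way.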
A parallel would be (using `sort' instead of `cat' to account for
Emacs' memory usage):

sort file | grep x
grep x file | sort

>> A real inversion parameter, either as a predicate function or a variable,
>> passed lexically or as a parameter to the occur-engine function call, is
>> necessary.

DA> Necessary? For what? Why necessary? These are generalizations that don't
DA> help.

Necessary to implement the solution in such a way that it will satisfy
both the OP and future needs for tuning the occur results. I'm not
talking about Icicles (that's why I mentioned occur-1 and occur-engine
originally), sorry if I didn't state that clearly. I just thought that
since you recommended the filter-later approach, Icicles didn't support
predicates, so it made sense to follow up to you.

DA> Your statements are vague, but I'm guessing that what you're really trying
DA> to say is that it is often more efficient to apply a predicate earlier
DA> rather than later (filter promotion), which is true.

Sure. Reduce the search results as early as possible, as in my earlier
example of sort/grep usage.

DA> The Icicles approach is designed for interactive use, which is why it
DA> emphasises changing search patterns (and predicates) on the fly. It works
DA> fine with any buffers I've ever used, some of which are pretty darn big.
DA> (How big is big? I just searched a 19MB buffer with no effect on
DA> interactivity.)

DA> As always, the usefulness of a tool depends on what you use it for. If you
DA> want to search a 5 terabyte file, then interactivity might suffer with some
DA> approaches (depending on your hardware... and, especially, depending on your
DA> regexp). But, as always, the devil is in the details.
I can see that when choosing between an O(n) and an O(n log(n))
algorithm for small data sets, but when the difference is that one
approach copies every line and the other doesn't, while they achieve
the same result, it literally bothers me to recommend the former
approach just because the API doesn't support the latter. So I'll
propose the API change to emacs-devel.

As for hardware, I maintain an Emacs Maemo port, which is for the Nokia
770/800/810 tablets that run GNU/Linux. There is little memory
available and the CPU is slow, so copying a large buffer unnecessarily
would be terrible for the user experience.

Ted