From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: joakim@verona.se Newsgroups: gmane.emacs.devel Subject: Re: Pushing the `gnus-range-*' functions down into the C layer Date: Fri, 10 Sep 2010 15:07:05 +0200 Message-ID: References: NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: dough.gmane.org 1284124048 2158 80.91.229.12 (10 Sep 2010 13:07:28 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Fri, 10 Sep 2010 13:07:28 +0000 (UTC) To: emacs-devel@gnu.org Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Fri Sep 10 15:07:26 2010 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([199.232.76.165]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1Ou3KA-00024k-5D for ged-emacs-devel@m.gmane.org; Fri, 10 Sep 2010 15:07:26 +0200 Original-Received: from localhost ([127.0.0.1]:44197 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Ou3K9-0003QO-Gh for ged-emacs-devel@m.gmane.org; Fri, 10 Sep 2010 09:07:25 -0400 Original-Received: from [140.186.70.92] (port=60324 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Ou3Jz-0003QI-7b for emacs-devel@gnu.org; Fri, 10 Sep 2010 09:07:19 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1Ou3Jt-0000XB-MO for emacs-devel@gnu.org; Fri, 10 Sep 2010 09:07:15 -0400 Original-Received: from smtprelay-h32.telenor.se ([213.150.131.5]:57835) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Ou3Jt-0000Ws-DP for emacs-devel@gnu.org; Fri, 10 Sep 2010 09:07:09 -0400 Original-Received: from ipb4.telenor.se (ipb4.telenor.se [195.54.127.167]) by smtprelay-h32.telenor.se (Postfix) with ESMTP id DCBC5C9B5 for ; Fri, 10 Sep 2010 15:07:07 +0200 (CEST) X-SENDER-IP: [83.227.138.150] X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqwxABbKiUxT44qWPGdsb2JhbACHa5MjhjMMAQEBAR4XCyK8bIU9BIoggxM X-IronPort-AV: E=Sophos;i="4.56,346,1280700000"; d="scan'208";a="1669534627" Original-Received: from ua-83-227-138-150.cust.bredbandsbolaget.se (HELO www.verona.se) ([83.227.138.150]) by ipb4.telenor.se with ESMTP; 10 Sep 2010 15:07:07 +0200 Original-Received: from localhost.localdomain (unknown [192.168.201.6]) by www.verona.se (Postfix) with ESMTP id 004F271631C for ; Fri, 10 Sep 2010 15:07:05 +0200 (CEST) In-Reply-To: (Lars Magne Ingebrigtsen's message of "Fri, 10 Sep 2010 14:35:31 +0200") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux) X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:129890 Archived-At: Lars Magne Ingebrigtsen writes: > Stefan Monnier writes: > >> I wouldn't rule it out, but I'm reluctant to move Elisp code to C for >> the same kinds of reason Andy just gave. Especially for such a thing as >> range/interval algorithms where the best algorithm depends a fair bit on >> the particular shape of your data. >> OTOH, speeding up Elisp is not something I expect to happen soon >> (although moving to lexical bindings is a step in the right direction). > > Actually, I think Emacs could do with moving more things down into the C > layer. I mean, having en elisp implementation of sha1 is quite an > achievement, but linking with libgcrypt would have given you that and > much more. Just to take a random example. > > I don't think the C layer should be considered "second class". I mean, > C isn't as nice as Lisp, but it's not like the C style in Emacs is less > maintainable than most Emacs-Lisp functions out there. :-) > > Anyway, back to the range functions. I think it's slightly generally > useful. My guess would be that if there was a faster > sorted-integer-list interface, it would be used by others with similar > needs. `range-memq' would be much faster than `memq' on a list of > integers in most use cases. > >> Maybe once we have dyn-loading in place, Gnus can provide its own >> C functions outside of Emacs's own C code. > > Would that help much, though? Anyway, that reminds me -- I think the > module support in XEmacs hurt XEmacs more than it helped. You have to > program so very defensively because you don't know what packages are > installed by the user. In Emacs, it's so much more productive > programming, because you can rely on all its bits being present. Interesting, I havent actually used Xemacs so I didnt know this. Would you say that this would still be true with a dependency management mechanism that downloaded and installed required dependencies automatically? When I code Clojure, which has a Maven based dependency resolver, I find that dependencies just get downloaded automatically and I dont need to spend more time thinking about those kind of issues. But perhaps I'm missing understanding of some use-case. > >> What algorithm does Gnus use currently? I guess you call `sort' and >> then scan the list looking for contiguous points? That would make it >> O(n log n), which isn't too bad, but it also means that the C part of >> the code is O(n log n) and the Elisp part is O(n), so the speed of Elisp >> should not be a big issue. > > In general, the ranges come from the .newsrc.eld file, and are then > uncompressed before being worked on. The lists are generally kept > sorted, so it's not necessary to re-sort them before compressing. > > But I think you kinda underestimate how slow Emacs Lisp is compared to > C. An O(n) function in Emacs Lisp can be slow enough to make it > unpleasant for the user. > >> Ah, so you don't just convert to/from lists of points to intervals, but >> you also work directly on intervals. > > At present, they're compressed and uncompressed, because memq is much > faster in the general use cases than the Lisp range-member functions. > But if range-memq was present in C, then I'd never uncompress. > >> What algorithms do you use there? >> These should be O(n), right? What do you know about the usual shape of >> your data? E.g. what's the expected difference between the smallest and >> the largest element (if it's not too large, we could store them in >> bitvectors and then provide bitwise or/and on bitvectors)? > > It varies a bit. The "read" range is usually on the form > `(1 . 24254214)', and this is (of course) never uncompressed. But you'd > usually have additional ranges for, say, ticked, and these would be > quite sparse, like `(45 (90 . 96) (200 . 259))' Or something like that. -- Joakim Verona