* Pushing the `gnus-range-*' functions down into the C layer
@ 2010-09-09 15:16 Lars Magne Ingebrigtsen
2010-09-09 19:01 ` Ted Zlatanov
` (3 more replies)
0 siblings, 4 replies; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-09 15:16 UTC (permalink / raw)
To: emacs-devel
Would anyone mind if I implemented the `gnus-range-*' functions in C for
Emacs 24? Gnus spends significant time on group entry/exit doing range
handling/compressing/etc, and I think the functions are (slightly)
generally useful.
Gnus addresses articles by their number, so you have things like the
"marked" list of articles that looks like
(3 4 5 6 7 11 14 15 16 17 18)
(or something). To save loading/saving time, these are stored in the
.newsrc.eld file as range structures, which look like this:
((3 . 7) 11 (14 . 18))
To handle this, Gnus has a set of functions for computing
intersections/differences/etc on the range structures. However, Emacs
Lisp isn't really fast at doing this, so if you exit from a group that
needs lots (I mean *lots*) of range handling, you get an annoying wait
while it sits there computing.
So I'm proposing implementing a set of useful functions in C, like
`range-memq', `range-add', `range-intersection' and stuff.
Thoughts?
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-09 15:16 Pushing the `gnus-range-*' functions down into the C layer Lars Magne Ingebrigtsen
@ 2010-09-09 19:01 ` Ted Zlatanov
2010-09-09 21:58 ` Lars Magne Ingebrigtsen
2010-09-09 22:21 ` Wojciech Meyer
` (2 subsequent siblings)
3 siblings, 1 reply; 48+ messages in thread
From: Ted Zlatanov @ 2010-09-09 19:01 UTC (permalink / raw)
To: emacs-devel
On Thu, 09 Sep 2010 17:16:56 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
LMI> Would anyone mind if I implemented the `gnus-range-*' functions in C for
LMI> Emacs 24? Gnus spends significant time on group entry/exit doing range
LMI> handling/compressing/etc, and I think the functions are (slightly)
LMI> generally useful.
LMI> Gnus addresses articles by their number, so you have things like the
LMI> "marked" list of articles that looks like
LMI> (3 4 5 6 7 11 14 15 16 17 18)
LMI> (or something). To save loading/saving time, these are stored in the
LMI> .newsrc.eld file as range structures, which look like this:
LMI> ((3 . 7) 11 (14 . 18))
LMI> To handle this, Gnus has a set of functions for computing
LMI> intersections/differences/etc on the range structures. However, Emacs
LMI> Lisp isn't really fast at doing this, so if you exit from a group that
LMI> needs lots (I mean *lots*) of range handling, you get an annoying wait
LMI> while it sits there computing.
LMI> So I'm proposing implementing a set of useful functions in C, like
LMI> `range-memq', `range-add', `range-intersection' and stuff.
I think it's a good idea. We've seen IMAP article numbers overflow
native Emacs ints at least once and it's been discussed a few times, e.g.:
http://article.gmane.org/gmane.emacs.gnus.general/65390/match=overflow
so maybe this API needs to support true 32-bit ints (perhaps through
doubles) as per RFC 3501.
Ted
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-09 19:01 ` Ted Zlatanov
@ 2010-09-09 21:58 ` Lars Magne Ingebrigtsen
2010-09-09 22:25 ` Ted Zlatanov
2010-09-10 3:24 ` Stephen J. Turnbull
0 siblings, 2 replies; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-09 21:58 UTC (permalink / raw)
To: emacs-devel
Ted Zlatanov <tzz@lifelogs.com> writes:
> I think it's a good idea. We've seen IMAP article numbers overflow
> native Emacs ints at least once and it's been discussed a few times, e.g.:
>
> http://article.gmane.org/gmane.emacs.gnus.general/65390/match=overflow
>
> so maybe this API needs to support true 32-bit ints (perhaps through
> doubles) as per RFC 3501.
Well, I was thinking that the range things would just have normal Elisp
ints in them (like they do now), so having the ranges in C wouldn't
actually solve that problem.
But aren't everybody on 64-bit systems now? :-)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-09 15:16 Pushing the `gnus-range-*' functions down into the C layer Lars Magne Ingebrigtsen
2010-09-09 19:01 ` Ted Zlatanov
@ 2010-09-09 22:21 ` Wojciech Meyer
2010-09-09 23:48 ` Ted Zlatanov
2010-09-10 8:06 ` Andy Wingo
2010-09-10 10:43 ` Stefan Monnier
3 siblings, 1 reply; 48+ messages in thread
From: Wojciech Meyer @ 2010-09-09 22:21 UTC (permalink / raw)
To: emacs-devel
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Would anyone mind if I implemented the `gnus-range-*' functions in C for
> Emacs 24? Gnus spends significant time on group entry/exit doing range
> handling/compressing/etc, and I think the functions are (slightly)
> generally useful.
>
> Gnus addresses articles by their number, so you have things like the
> "marked" list of articles that looks like
>
> (3 4 5 6 7 11 14 15 16 17 18)
>
> (or something). To save loading/saving time, these are stored in the
> .newsrc.eld file as range structures, which look like this:
>
> ((3 . 7) 11 (14 . 18))
>
> To handle this, Gnus has a set of functions for computing
> intersections/differences/etc on the range structures. However, Emacs
> Lisp isn't really fast at doing this, so if you exit from a group that
> needs lots (I mean *lots*) of range handling, you get an annoying wait
> while it sits there computing.
>
> So I'm proposing implementing a set of useful functions in C, like
> `range-memq', `range-add', `range-intersection' and stuff.
>
> Thoughts?
Hi,
In my opinion your algorithm might need to change, but that's maybe I
that I don't understand the problem domain correctly :). You don't
really need to care about ranges or intervals. You load them into array
of chars, each char index representing one article index and set to 0 or
1, if article was read. (ideally a bit array) Then at exit perform
operation at array in the same way as you loaded, and save it back to
ranges. It has linear complexity. You trade off space complexity for
computational complexity. As a bonus you get the minimal representation
in intervals. So I don't think you would need a specific core support
(but that's my personal opinion). If you really want a clean solution,
then there are algorithms. (I know they exist, but unfortunately I don't
remember the name of it).
Just mine two cents,
Wojciech
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-09 21:58 ` Lars Magne Ingebrigtsen
@ 2010-09-09 22:25 ` Ted Zlatanov
2010-09-10 3:24 ` Stephen J. Turnbull
1 sibling, 0 replies; 48+ messages in thread
From: Ted Zlatanov @ 2010-09-09 22:25 UTC (permalink / raw)
To: emacs-devel
On Thu, 09 Sep 2010 23:58:01 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
LMI> Ted Zlatanov <tzz@lifelogs.com> writes:
>> I think it's a good idea. We've seen IMAP article numbers overflow
>> native Emacs ints at least once and it's been discussed a few times, e.g.:
>>
>> http://article.gmane.org/gmane.emacs.gnus.general/65390/match=overflow
>>
>> so maybe this API needs to support true 32-bit ints (perhaps through
>> doubles) as per RFC 3501.
LMI> Well, I was thinking that the range things would just have normal Elisp
LMI> ints in them (like they do now), so having the ranges in C wouldn't
LMI> actually solve that problem.
I meant "while you're in there..." :) but it's not a significant need
compared to the other issues with Gnus you're targeting. So never mind.
Ted
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-09 22:21 ` Wojciech Meyer
@ 2010-09-09 23:48 ` Ted Zlatanov
2010-09-09 23:56 ` Wojciech Meyer
0 siblings, 1 reply; 48+ messages in thread
From: Ted Zlatanov @ 2010-09-09 23:48 UTC (permalink / raw)
To: emacs-devel
On Thu, 09 Sep 2010 23:21:30 +0100 Wojciech Meyer <wojciech.meyer@googlemail.com> wrote:
WM> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>> Gnus addresses articles by their number, so you have things like the
>> "marked" list of articles that looks like
>>
>> (3 4 5 6 7 11 14 15 16 17 18)
WM> In my opinion your algorithm might need to change, but that's maybe I
WM> that I don't understand the problem domain correctly :). You don't
WM> really need to care about ranges or intervals. You load them into array
WM> of chars, each char index representing one article index and set to 0 or
WM> 1, if article was read. (ideally a bit array)
You mean a bool-vector, I think. Emacs has those built-in, see
(info "(elisp) Bool-Vectors")
but they are not good at representing sparse mutable lists, which Gnus
needs to represent article ranges.
Ted
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-09 23:48 ` Ted Zlatanov
@ 2010-09-09 23:56 ` Wojciech Meyer
2010-09-10 0:07 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 48+ messages in thread
From: Wojciech Meyer @ 2010-09-09 23:56 UTC (permalink / raw)
To: Ted Zlatanov; +Cc: emacs-devel
Ted Zlatanov <tzz@lifelogs.com> writes:
>
> You mean a bool-vector, I think. Emacs has those built-in, see
>
> (info "(elisp) Bool-Vectors")
>
> but they are not good at representing sparse mutable lists, which Gnus
> needs to represent article ranges.
>
> Ted
Hi,
I think they are best to solve your performance problems. The idea is
to keep just one array for all of the articles, means one per
group. That would be just the internal representation, you still
save-read from lists with article ranges.
Wojciech
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-09 23:56 ` Wojciech Meyer
@ 2010-09-10 0:07 ` Lars Magne Ingebrigtsen
2010-09-10 0:17 ` Wojciech Meyer
0 siblings, 1 reply; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-10 0:07 UTC (permalink / raw)
To: emacs-devel
Wojciech Meyer <wojciech.meyer@googlemail.com> writes:
> I think they are best to solve your performance problems. The idea is
> to keep just one array for all of the articles, means one per
> group. That would be just the internal representation, you still
> save-read from lists with article ranges.
The performance issue is converting between FOO and ranges.
Representing FOO as arrays instead of lists doesn't help with the
conversion problem.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 0:07 ` Lars Magne Ingebrigtsen
@ 2010-09-10 0:17 ` Wojciech Meyer
0 siblings, 0 replies; 48+ messages in thread
From: Wojciech Meyer @ 2010-09-10 0:17 UTC (permalink / raw)
To: emacs-devel
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Wojciech Meyer <wojciech.meyer@googlemail.com> writes:
>
>> I think they are best to solve your performance problems. The idea is
>> to keep just one array for all of the articles, means one per
>> group. That would be just the internal representation, you still
>> save-read from lists with article ranges.
>
> The performance issue is converting between FOO and ranges.
> Representing FOO as arrays instead of lists doesn't help with the
> conversion problem.
Yes, I understand that, but one could eliminate ranges completely, but I
am not an expert on nntp. Plus it has a linear complexity, as you it can
be computed with just simple state machine. How long it will take to
traverse 10000 articles consing pair of integers? (I bet not that
long). (maybe just reading the articles on the fly, then you have only
additional cost of skipping - proportional to number of them (0(n)).)
As I said, I am not entirely in the problem domain so I can't say with
100% sure, but I feel it should not be that complex to cause delays in
Gnus, and BTW: I am very happy user of Gnus, and I didn't notice this
problem :).
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-09 21:58 ` Lars Magne Ingebrigtsen
2010-09-09 22:25 ` Ted Zlatanov
@ 2010-09-10 3:24 ` Stephen J. Turnbull
2010-09-10 13:53 ` Ted Zlatanov
2010-09-13 11:45 ` Eli Zaretskii
1 sibling, 2 replies; 48+ messages in thread
From: Stephen J. Turnbull @ 2010-09-10 3:24 UTC (permalink / raw)
To: Lars Magne Ingebrigtsen; +Cc: emacs-devel
Lars Magne Ingebrigtsen writes:
> But aren't everybody on 64-bit systems now? :-)
Everybody using Microsoft or Apple products, yes; they have no choice.
*We* on the other hand can keep using our beloved antiques (anything
manufactured before last January ;-) indefinitely.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-09 15:16 Pushing the `gnus-range-*' functions down into the C layer Lars Magne Ingebrigtsen
2010-09-09 19:01 ` Ted Zlatanov
2010-09-09 22:21 ` Wojciech Meyer
@ 2010-09-10 8:06 ` Andy Wingo
2010-09-10 10:20 ` Wojciech Meyer
2010-09-10 10:43 ` Stefan Monnier
3 siblings, 1 reply; 48+ messages in thread
From: Andy Wingo @ 2010-09-10 8:06 UTC (permalink / raw)
To: emacs-devel
Hi,
On Thu 09 Sep 2010 17:16, Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Would anyone mind if I implemented the `gnus-range-*' functions in C for
> Emacs 24?
Speaking with my Guile maintainer hat on, more C is an evil. Guile was a
slow Scheme so lots of stuff got moved to C, but we still weren't quite
fast enough, and by the time we exhausted our possibilities for C
optimization, we were left with a brittle thing.
Instead the real fix was to improve the language implementation, so it
wasn't so slow. The same thing will be the real fix for Emacs. And now
that things are better in Guile, we are rolling back on the
"C-for-speed", because in many cases it's not faster.
I should say that sometimes C is a necessary evil; but it's an evil
nonetheless. If you can fix this particular case algorithmically, that
would be loads better.
Just my 2 eurocents,
Andy
--
http://wingolog.org/
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 8:06 ` Andy Wingo
@ 2010-09-10 10:20 ` Wojciech Meyer
0 siblings, 0 replies; 48+ messages in thread
From: Wojciech Meyer @ 2010-09-10 10:20 UTC (permalink / raw)
To: Andy Wingo; +Cc: emacs-devel
> I should say that sometimes C is a necessary evil; but it's an evil
> nonetheless. If you can fix this particular case algorithmically, that
> would be loads better.
Yes. I would say that you can bootstrap a compiler from minimal
interpreter. Starting from a very basic primitives (especially Lisp is
well suited for it). You can start from assembly, implement Forth, then
go for Lisp, and then you can actually do anything, However at the end
you probably re-implement half of the language again in gradual rise of
the abstraction barrier. The key point here is to use C in project
likes Emacs, not only for basic language primitives or things that
cannot be done on the Lisp level but for all the critical functions as
well. Deciding when the function should stay in the core is however
another complex subject. So you trade efficiency for having a clean and
minimal kernel. So yes, getting rid of C is good but some of stuff
should stay in the core. In this case I think the problem is different,
the functions proposed are very specific, and besides efficiency (which
I think would not increase the performance that significantly, because
of a nature of data structures here, and cost of allocation), there
would be a minimal reason for doing it (however it is mine *PERSONAL*
opinion), and also I believe (but maybe I am wrong), it could be fixed
by replacing the algorithm. Nevertheless, I am curious how much it would
increase the performance.
Mine two (euro)cents,
Wojciech
>
> Just my 2 eurocents,
>
> Andy
> --
> http://wingolog.org/
>
>
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-09 15:16 Pushing the `gnus-range-*' functions down into the C layer Lars Magne Ingebrigtsen
` (2 preceding siblings ...)
2010-09-10 8:06 ` Andy Wingo
@ 2010-09-10 10:43 ` Stefan Monnier
2010-09-10 12:35 ` Lars Magne Ingebrigtsen
3 siblings, 1 reply; 48+ messages in thread
From: Stefan Monnier @ 2010-09-10 10:43 UTC (permalink / raw)
To: emacs-devel
> Would anyone mind if I implemented the `gnus-range-*' functions in C for
> Emacs 24? Gnus spends significant time on group entry/exit doing range
> handling/compressing/etc, and I think the functions are (slightly)
> generally useful.
I wouldn't rule it out, but I'm reluctant to move Elisp code to C for
the same kinds of reason Andy just gave. Especially for such a thing as
range/interval algorithms where the best algorithm depends a fair bit on
the particular shape of your data.
OTOH, speeding up Elisp is not something I expect to happen soon
(although moving to lexical bindings is a step in the right direction).
Maybe once we have dyn-loading in place, Gnus can provide its own
C functions outside of Emacs's own C code.
> Gnus addresses articles by their number, so you have things like the
> "marked" list of articles that looks like
> (3 4 5 6 7 11 14 15 16 17 18)
What algorithm does Gnus use currently? I guess you call `sort' and
then scan the list looking for contiguous points? That would make it
O(n log n), which isn't too bad, but it also means that the C part of
the code is O(n log n) and the Elisp part is O(n), so the speed of Elisp
should not be a big issue.
> (or something). To save loading/saving time, these are stored in the
> .newsrc.eld file as range structures, which look like this:
> ((3 . 7) 11 (14 . 18))
> To handle this, Gnus has a set of functions for computing
> intersections/differences/etc on the range structures.
Ah, so you don't just convert to/from lists of points to intervals, but
you also work directly on intervals. What algorithms do you use there?
These should be O(n), right? What do you know about the usual shape of
your data? E.g. what's the expected difference between the smallest and
the largest element (if it's not too large, we could store them in
bitvectors and then provide bitwise or/and on bitvectors)?
> However, Emacs Lisp isn't really fast at doing this,
It's not fast at other things either ;-)
Stefan
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 10:43 ` Stefan Monnier
@ 2010-09-10 12:35 ` Lars Magne Ingebrigtsen
2010-09-10 12:49 ` rfc2047-decode-string in C? Lars Magne Ingebrigtsen
` (3 more replies)
0 siblings, 4 replies; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-10 12:35 UTC (permalink / raw)
To: emacs-devel
Stefan Monnier <monnier@iro.umontreal.ca> writes:
> I wouldn't rule it out, but I'm reluctant to move Elisp code to C for
> the same kinds of reason Andy just gave. Especially for such a thing as
> range/interval algorithms where the best algorithm depends a fair bit on
> the particular shape of your data.
> OTOH, speeding up Elisp is not something I expect to happen soon
> (although moving to lexical bindings is a step in the right direction).
Actually, I think Emacs could do with moving more things down into the C
layer. I mean, having en elisp implementation of sha1 is quite an
achievement, but linking with libgcrypt would have given you that and
much more. Just to take a random example.
I don't think the C layer should be considered "second class". I mean,
C isn't as nice as Lisp, but it's not like the C style in Emacs is less
maintainable than most Emacs-Lisp functions out there. :-)
Anyway, back to the range functions. I think it's slightly generally
useful. My guess would be that if there was a faster
sorted-integer-list interface, it would be used by others with similar
needs. `range-memq' would be much faster than `memq' on a list of
integers in most use cases.
> Maybe once we have dyn-loading in place, Gnus can provide its own
> C functions outside of Emacs's own C code.
Would that help much, though? Anyway, that reminds me -- I think the
module support in XEmacs hurt XEmacs more than it helped. You have to
program so very defensively because you don't know what packages are
installed by the user. In Emacs, it's so much more productive
programming, because you can rely on all its bits being present.
> What algorithm does Gnus use currently? I guess you call `sort' and
> then scan the list looking for contiguous points? That would make it
> O(n log n), which isn't too bad, but it also means that the C part of
> the code is O(n log n) and the Elisp part is O(n), so the speed of Elisp
> should not be a big issue.
In general, the ranges come from the .newsrc.eld file, and are then
uncompressed before being worked on. The lists are generally kept
sorted, so it's not necessary to re-sort them before compressing.
But I think you kinda underestimate how slow Emacs Lisp is compared to
C. An O(n) function in Emacs Lisp can be slow enough to make it
unpleasant for the user.
> Ah, so you don't just convert to/from lists of points to intervals, but
> you also work directly on intervals.
At present, they're compressed and uncompressed, because memq is much
faster in the general use cases than the Lisp range-member functions.
But if range-memq was present in C, then I'd never uncompress.
> What algorithms do you use there?
> These should be O(n), right? What do you know about the usual shape of
> your data? E.g. what's the expected difference between the smallest and
> the largest element (if it's not too large, we could store them in
> bitvectors and then provide bitwise or/and on bitvectors)?
It varies a bit. The "read" range is usually on the form
`(1 . 24254214)', and this is (of course) never uncompressed. But you'd
usually have additional ranges for, say, ticked, and these would be
quite sparse, like `(45 (90 . 96) (200 . 259))' Or something like that.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* rfc2047-decode-string in C?
2010-09-10 12:35 ` Lars Magne Ingebrigtsen
@ 2010-09-10 12:49 ` Lars Magne Ingebrigtsen
2010-09-10 13:47 ` Stefan Monnier
2010-09-10 13:07 ` Pushing the `gnus-range-*' functions down into the C layer joakim
` (2 subsequent siblings)
3 siblings, 1 reply; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-10 12:49 UTC (permalink / raw)
To: emacs-devel
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Actually, I think Emacs could do with moving more things down into the C
> layer.
Speaking of which -- I've been elp-ing Gnus group entry a bit, and Gnus
spends quite a but of time doing rfc2047 decoding. If I read the
results correctly, that's a quite significant portion of the time.
rfc2047-encoding looks like this:
=?iso-8859-1?q?fo=F2o?=
There are two encodings -- b and q. The b (base64) is already in C
(was I the one that put it there? Hm.), but the q isn't
(quoted-printable), so I'd propose two new C functions:
`quoted-printable-decode-string' and `rfc2047-decode-string'.
I can implement them and do some benchmarking?
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 12:35 ` Lars Magne Ingebrigtsen
2010-09-10 12:49 ` rfc2047-decode-string in C? Lars Magne Ingebrigtsen
@ 2010-09-10 13:07 ` joakim
2010-09-10 13:22 ` Lars Magne Ingebrigtsen
2010-09-10 13:45 ` Stefan Monnier
2010-09-11 3:18 ` Daniel Pittman
3 siblings, 1 reply; 48+ messages in thread
From: joakim @ 2010-09-10 13:07 UTC (permalink / raw)
To: emacs-devel
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>> I wouldn't rule it out, but I'm reluctant to move Elisp code to C for
>> the same kinds of reason Andy just gave. Especially for such a thing as
>> range/interval algorithms where the best algorithm depends a fair bit on
>> the particular shape of your data.
>> OTOH, speeding up Elisp is not something I expect to happen soon
>> (although moving to lexical bindings is a step in the right direction).
>
> Actually, I think Emacs could do with moving more things down into the C
> layer. I mean, having en elisp implementation of sha1 is quite an
> achievement, but linking with libgcrypt would have given you that and
> much more. Just to take a random example.
>
> I don't think the C layer should be considered "second class". I mean,
> C isn't as nice as Lisp, but it's not like the C style in Emacs is less
> maintainable than most Emacs-Lisp functions out there. :-)
>
> Anyway, back to the range functions. I think it's slightly generally
> useful. My guess would be that if there was a faster
> sorted-integer-list interface, it would be used by others with similar
> needs. `range-memq' would be much faster than `memq' on a list of
> integers in most use cases.
>
>> Maybe once we have dyn-loading in place, Gnus can provide its own
>> C functions outside of Emacs's own C code.
>
> Would that help much, though? Anyway, that reminds me -- I think the
> module support in XEmacs hurt XEmacs more than it helped. You have to
> program so very defensively because you don't know what packages are
> installed by the user. In Emacs, it's so much more productive
> programming, because you can rely on all its bits being present.
Interesting, I havent actually used Xemacs so I didnt know this.
Would you say that this would still be true with a dependency management
mechanism that downloaded and installed required dependencies
automatically?
When I code Clojure, which has a Maven based dependency resolver, I find
that dependencies just get downloaded automatically and I dont need to
spend more time thinking about those kind of issues. But perhaps I'm
missing understanding of some use-case.
>
>> What algorithm does Gnus use currently? I guess you call `sort' and
>> then scan the list looking for contiguous points? That would make it
>> O(n log n), which isn't too bad, but it also means that the C part of
>> the code is O(n log n) and the Elisp part is O(n), so the speed of Elisp
>> should not be a big issue.
>
> In general, the ranges come from the .newsrc.eld file, and are then
> uncompressed before being worked on. The lists are generally kept
> sorted, so it's not necessary to re-sort them before compressing.
>
> But I think you kinda underestimate how slow Emacs Lisp is compared to
> C. An O(n) function in Emacs Lisp can be slow enough to make it
> unpleasant for the user.
>
>> Ah, so you don't just convert to/from lists of points to intervals, but
>> you also work directly on intervals.
>
> At present, they're compressed and uncompressed, because memq is much
> faster in the general use cases than the Lisp range-member functions.
> But if range-memq was present in C, then I'd never uncompress.
>
>> What algorithms do you use there?
>> These should be O(n), right? What do you know about the usual shape of
>> your data? E.g. what's the expected difference between the smallest and
>> the largest element (if it's not too large, we could store them in
>> bitvectors and then provide bitwise or/and on bitvectors)?
>
> It varies a bit. The "read" range is usually on the form
> `(1 . 24254214)', and this is (of course) never uncompressed. But you'd
> usually have additional ranges for, say, ticked, and these would be
> quite sparse, like `(45 (90 . 96) (200 . 259))' Or something like that.
--
Joakim Verona
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 13:07 ` Pushing the `gnus-range-*' functions down into the C layer joakim
@ 2010-09-10 13:22 ` Lars Magne Ingebrigtsen
0 siblings, 0 replies; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-10 13:22 UTC (permalink / raw)
To: emacs-devel
joakim@verona.se writes:
> Would you say that this would still be true with a dependency management
> mechanism that downloaded and installed required dependencies
> automatically?
Yes. Well, unless everything depends on everything else.
To take a trivial example, I was tinkering with the Gnus code to jump to
a group. Somebody suggested providing ido completion as an option, and
that sounds reasonable. If I can assume that ido exists in Emacs, I can
just write the code. If not, I have to test and stuff.
I mean, it's not like it's "OMG I HAVE TO TEST" hassle to do that, but
when you have to test for so many things, the code gets really boring to
write, hard to read, and difficult to debug for interactions between
different combinations of things that may or may not exist in the Emacs.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 12:35 ` Lars Magne Ingebrigtsen
2010-09-10 12:49 ` rfc2047-decode-string in C? Lars Magne Ingebrigtsen
2010-09-10 13:07 ` Pushing the `gnus-range-*' functions down into the C layer joakim
@ 2010-09-10 13:45 ` Stefan Monnier
2010-09-10 14:01 ` Ted Zlatanov
2010-09-10 14:06 ` Lars Magne Ingebrigtsen
2010-09-11 3:18 ` Daniel Pittman
3 siblings, 2 replies; 48+ messages in thread
From: Stefan Monnier @ 2010-09-10 13:45 UTC (permalink / raw)
To: emacs-devel
> Actually, I think Emacs could do with moving more things down into the C
> layer. I mean, having en elisp implementation of sha1 is quite an
> achievement, but linking with libgcrypt would have given you that and
> much more. Just to take a random example.
I won't try to argue that sha1 in Elisp is a good idea, indeed.
There's no hard rule here, it's just a general preference for Elisp code.
> I don't think the C layer should be considered "second class".
It can't be modified from the .emacs, so it is necessarily second class.
But of course for functions like sha1 where speed matters and where the
functionality and API is stable, C is a better choice.
> Anyway, back to the range functions. I think it's slightly generally
> useful.
I wouldn't dream of arguing that intervals are not generally useful.
Hell, we use them for text-properties. But I think that if we want to
provide something like that from C, we should try and make them better
than just "sorted lists of cons cells". If the code can be shared with
the one for text-properties (which uses a balanced binary tree), that's
even better.
> But I think you kinda underestimate how slow Emacs Lisp is compared to
> C. An O(n) function in Emacs Lisp can be slow enough to make it
> unpleasant for the user.
I know how slow Elisp is, yes. AFAIC it's the only reason why I'd ever
consider moving to another language.
>> Ah, so you don't just convert to/from lists of points to intervals, but
>> you also work directly on intervals.
> At present, they're compressed and uncompressed, because memq is much
> faster in the general use cases than the Lisp range-member functions.
> But if range-memq was present in C, then I'd never uncompress.
Hmm... so how 'bout installing a C version of range-memq and keep
everything else in Elisp? Would that be good enough as a "quick fix"?
Stefan
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: rfc2047-decode-string in C?
2010-09-10 12:49 ` rfc2047-decode-string in C? Lars Magne Ingebrigtsen
@ 2010-09-10 13:47 ` Stefan Monnier
2010-09-10 13:51 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 48+ messages in thread
From: Stefan Monnier @ 2010-09-10 13:47 UTC (permalink / raw)
To: emacs-devel
>> Actually, I think Emacs could do with moving more things down into
>> the C layer.
> Speaking of which -- I've been elp-ing Gnus group entry a bit, and Gnus
> spends quite a but of time doing rfc2047 decoding. If I read the
> results correctly, that's a quite significant portion of the time.
> rfc2047-encoding looks like this:
> =?iso-8859-1?q?fo=F2o?=
> There are two encodings -- b and q. The b (base64) is already in C
> (was I the one that put it there? Hm.), but the q isn't
> (quoted-printable), so I'd propose two new C functions:
> `quoted-printable-decode-string' and `rfc2047-decode-string'.
> I can implement them and do some benchmarking?
A version of quoted-printable-decode-string written in C would be
fine, yes. For rfc2047-decode-string, I'd need some convincing evidence
that there is a large benefit.
Stefan "who uses Gnus and sometimes finds it quite slow, tho the main
slowness is when I do gnus-group-get-new-news over IMAP"
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: rfc2047-decode-string in C?
2010-09-10 13:47 ` Stefan Monnier
@ 2010-09-10 13:51 ` Lars Magne Ingebrigtsen
2010-09-11 16:10 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-10 13:51 UTC (permalink / raw)
To: emacs-devel
Stefan Monnier <monnier@iro.umontreal.ca> writes:
> A version of quoted-printable-decode-string written in C would be
> fine, yes. For rfc2047-decode-string, I'd need some convincing evidence
> that there is a large benefit.
Ok; I'll do some benchmarking over the weekend. (If I find the time
after reimplementing nnimap.el.)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 3:24 ` Stephen J. Turnbull
@ 2010-09-10 13:53 ` Ted Zlatanov
2010-09-10 14:01 ` Lars Magne Ingebrigtsen
2010-09-13 11:45 ` Eli Zaretskii
1 sibling, 1 reply; 48+ messages in thread
From: Ted Zlatanov @ 2010-09-10 13:53 UTC (permalink / raw)
To: emacs-devel
On Fri, 10 Sep 2010 12:24:26 +0900 "Stephen J. Turnbull" <stephen@xemacs.org> wrote:
SJT> Lars Magne Ingebrigtsen writes:
>> But aren't everybody on 64-bit systems now? :-)
SJT> Everybody using Microsoft or Apple products, yes; they have no choice.
SJT> *We* on the other hand can keep using our beloved antiques (anything
SJT> manufactured before last January ;-) indefinitely.
64-bit int support in Emacs should not require a 64-bit system. If they
are natively available it's faster, that's all. It's not like you'd be
packing and unpacking the bits directly.
Ted
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 13:53 ` Ted Zlatanov
@ 2010-09-10 14:01 ` Lars Magne Ingebrigtsen
2010-09-10 14:08 ` Leo
` (2 more replies)
0 siblings, 3 replies; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-10 14:01 UTC (permalink / raw)
To: emacs-devel
Ted Zlatanov <tzz@lifelogs.com> writes:
> 64-bit int support in Emacs should not require a 64-bit system. If they
> are natively available it's faster, that's all. It's not like you'd be
> packing and unpacking the bits directly.
Yeah. Bignum support in Emacs would totally rock. No more manual
conversions to floating point numbers to do safe-ish computations on
numbers that might be too big would be needed.
But I suspect that it's a kinda big project. :-)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 13:45 ` Stefan Monnier
@ 2010-09-10 14:01 ` Ted Zlatanov
2010-09-10 14:09 ` Lars Magne Ingebrigtsen
2010-09-11 9:36 ` Stefan Monnier
2010-09-10 14:06 ` Lars Magne Ingebrigtsen
1 sibling, 2 replies; 48+ messages in thread
From: Ted Zlatanov @ 2010-09-10 14:01 UTC (permalink / raw)
To: emacs-devel
On Fri, 10 Sep 2010 15:45:23 +0200 Stefan Monnier <monnier@iro.umontreal.ca> wrote:
SM> Hmm... so how 'bout installing a C version of range-memq and keep
SM> everything else in Elisp? Would that be good enough as a "quick
SM> fix"?
Lars, how about the performance of the set operations (difference,
union, subset/superset, intersection)? Were you planning to do those in
C as well? They'll benefit from a fast range-memq but they would still
do a lot of ELisp work.
Can ranges be an opaque type (with rangep, make-range, etc.) so the
implementation is irrelevant to the user? That seems the best
compromise between performance and maintainability. As long as you can
read and print ranges in a consistent format, that is.
Depending on the expected size and performance, the internals can
convert between bool-vectors, binary trees, inversion lists, plain
lists, whatever.
Ted
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 13:45 ` Stefan Monnier
2010-09-10 14:01 ` Ted Zlatanov
@ 2010-09-10 14:06 ` Lars Magne Ingebrigtsen
1 sibling, 0 replies; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-10 14:06 UTC (permalink / raw)
To: emacs-devel
Stefan Monnier <monnier@iro.umontreal.ca> writes:
> I wouldn't dream of arguing that intervals are not generally useful.
> Hell, we use them for text-properties. But I think that if we want to
> provide something like that from C, we should try and make them better
> than just "sorted lists of cons cells". If the code can be shared with
> the one for text-properties (which uses a balanced binary tree), that's
> even better.
Yeah, that would really be nice. A `range-memq' on a balanced binary
tree would be really fast. Hm... perhaps I should have a peek at the
text property code?
> Hmm... so how 'bout installing a C version of range-memq and keep
> everything else in Elisp? Would that be good enough as a "quick fix"?
It would do away with the compressing/decompressing needs that Gnus has,
and it might be enough, but I'd have to do some more benchmarking
first, I think.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 14:01 ` Lars Magne Ingebrigtsen
@ 2010-09-10 14:08 ` Leo
2010-09-10 14:15 ` Lars Magne Ingebrigtsen
2010-09-11 9:44 ` Stefan Monnier
2010-09-10 14:18 ` Ted Zlatanov
2010-09-11 5:52 ` Stephen J. Turnbull
2 siblings, 2 replies; 48+ messages in thread
From: Leo @ 2010-09-10 14:08 UTC (permalink / raw)
To: emacs-devel
On 2010-09-10 15:01 +0100, Lars Magne Ingebrigtsen wrote:
> Yeah. Bignum support in Emacs would totally rock. No more manual
> conversions to floating point numbers to do safe-ish computations on
> numbers that might be too big would be needed.
>
> But I suspect that it's a kinda big project. :-)
GNU already has http://gmplib.org/. Would love to see it linked in too ;)
Leo
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 14:01 ` Ted Zlatanov
@ 2010-09-10 14:09 ` Lars Magne Ingebrigtsen
2010-09-11 9:36 ` Stefan Monnier
1 sibling, 0 replies; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-10 14:09 UTC (permalink / raw)
To: emacs-devel
Ted Zlatanov <tzz@lifelogs.com> writes:
> Lars, how about the performance of the set operations (difference,
> union, subset/superset, intersection)? Were you planning to do those in
> C as well? They'll benefit from a fast range-memq but they would still
> do a lot of ELisp work.
I was thinking about that, but I don't have any hard numbers to back up
my feeling that those functions would provide significant usage speedup
if implemented in C.
> Can ranges be an opaque type (with rangep, make-range, etc.) so the
> implementation is irrelevant to the user? That seems the best
> compromise between performance and maintainability. As long as you can
> read and print ranges in a consistent format, that is.
Like Stefan said, an implementation based on a balanced binary tree
would be more interesting than just a cons cell approach, and I agree.
They could still print as that, though.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 14:08 ` Leo
@ 2010-09-10 14:15 ` Lars Magne Ingebrigtsen
2010-09-10 14:19 ` Wojciech Meyer
2010-09-11 9:44 ` Stefan Monnier
1 sibling, 1 reply; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-10 14:15 UTC (permalink / raw)
To: emacs-devel
Leo <sdl.web@gmail.com> writes:
> On 2010-09-10 15:01 +0100, Lars Magne Ingebrigtsen wrote:
>> Yeah. Bignum support in Emacs would totally rock. No more manual
>> conversions to floating point numbers to do safe-ish computations on
>> numbers that might be too big would be needed.
>>
>> But I suspect that it's a kinda big project. :-)
>
> GNU already has http://gmplib.org/. Would love to see it linked in too ;)
Let me just quote a random line from the Emacs sources:
int nbytes = XINT (length);
And then think about what you'd have to do to make Emacs support random
precision arithmetic. It makes me swoon just to contemplate. :-)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 14:01 ` Lars Magne Ingebrigtsen
2010-09-10 14:08 ` Leo
@ 2010-09-10 14:18 ` Ted Zlatanov
2010-09-10 14:28 ` Lars Magne Ingebrigtsen
2010-09-10 19:28 ` Tom Tromey
2010-09-11 5:52 ` Stephen J. Turnbull
2 siblings, 2 replies; 48+ messages in thread
From: Ted Zlatanov @ 2010-09-10 14:18 UTC (permalink / raw)
To: emacs-devel
On Fri, 10 Sep 2010 16:01:41 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
LMI> Ted Zlatanov <tzz@lifelogs.com> writes:
>> 64-bit int support in Emacs should not require a 64-bit system. If they
>> are natively available it's faster, that's all. It's not like you'd be
>> packing and unpacking the bits directly.
LMI> Yeah. Bignum support in Emacs would totally rock. No more manual
LMI> conversions to floating point numbers to do safe-ish computations on
LMI> numbers that might be too big would be needed.
LMI> But I suspect that it's a kinda big project. :-)
If everything was inside a num64-* namespace and a num64.el package,
it's a pretty easy implementation. I'm not an expert on numeric
algorithms[1] but I'd guess the internals can be pretty easily
implemented in ELisp as a list of up to 3 native 28-bit ints. In fact
I'm surprised that doesn't exist yet; maybe I've missed it.
Of course everyone will want 64-bit numbers native from the very start.
I say that would delay the availability of the feature; just make people
do (featurep 'num64) and get it done ASAP.
Specifically for Gnus purposes, if range-* knew about the num64 API, it
would be sufficient in order to use it.
Ted
[1] but I know enough not to assume anything at that level is trivial
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 14:15 ` Lars Magne Ingebrigtsen
@ 2010-09-10 14:19 ` Wojciech Meyer
0 siblings, 0 replies; 48+ messages in thread
From: Wojciech Meyer @ 2010-09-10 14:19 UTC (permalink / raw)
To: emacs-devel
On Fri, Sep 10, 2010 at 3:15 PM, Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
> Leo <sdl.web@gmail.com> writes:
>
>> On 2010-09-10 15:01 +0100, Lars Magne Ingebrigtsen wrote:
>>> Yeah. Bignum support in Emacs would totally rock. No more manual
>>> conversions to floating point numbers to do safe-ish computations on
>>> numbers that might be too big would be needed.
>>>
>>> But I suspect that it's a kinda big project. :-)
>>
>> GNU already has http://gmplib.org/. Would love to see it linked in too ;)
>
> Let me just quote a random line from the Emacs sources:
>
> int nbytes = XINT (length);
>
> And then think about what you'd have to do to make Emacs support random
> precision arithmetic. It makes me swoon just to contemplate. :-)
You have a type system. You don't need to make functions to accept bignums as
a replacement for ints. They have quite limited usage, mostly for numerical
computations. I tend to agree it would be nice to have them though.
>
> --
> (domestic pets only, the antidote for overdose, milk.)
> larsi@gnus.org * Lars Magne Ingebrigtsen
>
>
>
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 14:18 ` Ted Zlatanov
@ 2010-09-10 14:28 ` Lars Magne Ingebrigtsen
2010-09-10 14:38 ` bignums (was: Pushing the `gnus-range-*' functions down into the C layer) Ted Zlatanov
2010-09-10 15:16 ` Pushing the `gnus-range-*' functions down into the C layer Andreas Schwab
2010-09-10 19:28 ` Tom Tromey
1 sibling, 2 replies; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-10 14:28 UTC (permalink / raw)
To: emacs-devel
Ted Zlatanov <tzz@lifelogs.com> writes:
> If everything was inside a num64-* namespace and a num64.el package,
> it's a pretty easy implementation.
I think for bignums to be interesting, they'd have to be native in
Emacs. It'd suck if you had a number, but had to say
`(num64+ num1 num2)'
if it's a bignum, and
`(+ num1 num2)'
if not. To take a random example.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* bignums (was: Pushing the `gnus-range-*' functions down into the C layer)
2010-09-10 14:28 ` Lars Magne Ingebrigtsen
@ 2010-09-10 14:38 ` Ted Zlatanov
2010-09-10 15:16 ` Pushing the `gnus-range-*' functions down into the C layer Andreas Schwab
1 sibling, 0 replies; 48+ messages in thread
From: Ted Zlatanov @ 2010-09-10 14:38 UTC (permalink / raw)
To: emacs-devel
On Fri, 10 Sep 2010 16:28:36 +0200 Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
LMI> Ted Zlatanov <tzz@lifelogs.com> writes:
>> If everything was inside a num64-* namespace and a num64.el package,
>> it's a pretty easy implementation.
LMI> I think for bignums to be interesting, they'd have to be native in
LMI> Emacs. It'd suck if you had a number, but had to say
LMI> `(num64+ num1 num2)'
LMI> if it's a bignum, and
LMI> `(+ num1 num2)'
LMI> if not. To take a random example.
I understand. But because of this argument we still don't have bignums,
because it's such a big project to do at once.
IMHO it's easier to get them in somehow and then adapt the C built-ins
like + and > to use them. It won't break existing code either, which
native support probably will as you mentioned.
Ted
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 14:28 ` Lars Magne Ingebrigtsen
2010-09-10 14:38 ` bignums (was: Pushing the `gnus-range-*' functions down into the C layer) Ted Zlatanov
@ 2010-09-10 15:16 ` Andreas Schwab
2010-09-10 15:22 ` David Kastrup
1 sibling, 1 reply; 48+ messages in thread
From: Andreas Schwab @ 2010-09-10 15:16 UTC (permalink / raw)
To: emacs-devel
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Ted Zlatanov <tzz@lifelogs.com> writes:
>
>> If everything was inside a num64-* namespace and a num64.el package,
>> it's a pretty easy implementation.
>
> I think for bignums to be interesting, they'd have to be native in
> Emacs. It'd suck if you had a number, but had to say
>
> `(num64+ num1 num2)'
>
> if it's a bignum, and
>
> `(+ num1 num2)'
>
> if not. To take a random example.
+ already supports more than one type of number, so it would not be a
problem to add support for yet another one. But that does not require
that Emacs supports bignums everywhere, just like there are already many
places that accept integers but not floats.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 15:16 ` Pushing the `gnus-range-*' functions down into the C layer Andreas Schwab
@ 2010-09-10 15:22 ` David Kastrup
2010-09-10 15:26 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 48+ messages in thread
From: David Kastrup @ 2010-09-10 15:22 UTC (permalink / raw)
To: emacs-devel
Andreas Schwab <schwab@linux-m68k.org> writes:
> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>
>> Ted Zlatanov <tzz@lifelogs.com> writes:
>>
>>> If everything was inside a num64-* namespace and a num64.el package,
>>> it's a pretty easy implementation.
>>
>> I think for bignums to be interesting, they'd have to be native in
>> Emacs. It'd suck if you had a number, but had to say
>>
>> `(num64+ num1 num2)'
>>
>> if it's a bignum, and
>>
>> `(+ num1 num2)'
>>
>> if not. To take a random example.
>
> + already supports more than one type of number, so it would not be a
> problem to add support for yet another one. But that does not require
> that Emacs supports bignums everywhere, just like there are already many
> places that accept integers but not floats.
Integers are not transparently promoted to floats depending on size.
Bignums are somewhat pointless if that isn't the case. But then
integers of equal value are eq (and it is easy enough to need to make
use of that by using assq and similar). Equal bignums, being variable
size, can't handily be eq unless they are kept behind hashes making sure
that at any given time, all bignums of equal value are kept in the same
storage location.
--
David Kastrup
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 15:22 ` David Kastrup
@ 2010-09-10 15:26 ` Lars Magne Ingebrigtsen
2010-09-11 9:48 ` Stefan Monnier
0 siblings, 1 reply; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-10 15:26 UTC (permalink / raw)
To: emacs-devel
David Kastrup <dak@gnu.org> writes:
> Integers are not transparently promoted to floats depending on size.
> Bignums are somewhat pointless if that isn't the case.
Yup.
> But then integers of equal value are eq (and it is easy enough to need
> to make use of that by using assq and similar). Equal bignums, being
> variable size, can't handily be eq unless they are kept behind hashes
> making sure that at any given time, all bignums of equal value are
> kept in the same storage location.
Well, that's the case in Common Lisp, but it needn't be in Emacs Lisp.
`eq' could just use `=' as the comparison function for bignums. Even
though that would be a non-traditional way of doing it.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 14:18 ` Ted Zlatanov
2010-09-10 14:28 ` Lars Magne Ingebrigtsen
@ 2010-09-10 19:28 ` Tom Tromey
2010-09-14 15:52 ` Ted Zlatanov
1 sibling, 1 reply; 48+ messages in thread
From: Tom Tromey @ 2010-09-10 19:28 UTC (permalink / raw)
To: Ted Zlatanov; +Cc: emacs-devel
>>>>> "Ted" == Ted Zlatanov <tzz@lifelogs.com> writes:
Ted> If everything was inside a num64-* namespace and a num64.el package,
Ted> it's a pretty easy implementation. I'm not an expert on numeric
Ted> algorithms[1] but I'd guess the internals can be pretty easily
Ted> implemented in ELisp as a list of up to 3 native 28-bit ints. In fact
Ted> I'm surprised that doesn't exist yet; maybe I've missed it.
I thought there was one in Calc.
Tom
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 12:35 ` Lars Magne Ingebrigtsen
` (2 preceding siblings ...)
2010-09-10 13:45 ` Stefan Monnier
@ 2010-09-11 3:18 ` Daniel Pittman
3 siblings, 0 replies; 48+ messages in thread
From: Daniel Pittman @ 2010-09-11 3:18 UTC (permalink / raw)
To: emacs-devel
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Stefan Monnier <monnier@iro.umontreal.ca> writes:
[...]
>> What algorithms do you use there? These should be O(n), right? What do
>> you know about the usual shape of your data? E.g. what's the expected
>> difference between the smallest and the largest element (if it's not too
>> large, we could store them in bitvectors and then provide bitwise or/and on
>> bitvectors)?
>
> It varies a bit. The "read" range is usually on the form `(1 . 24254214)',
> and this is (of course) never uncompressed. But you'd usually have
> additional ranges for, say, ticked, and these would be quite sparse, like
> `(45 (90 . 96) (200 . 259))' Or something like that.
Oh, wouldn't it be lovely if there were all so polite. One of the problems my
IMAP server delivers is that I get this data for long-lived folders with
sparse email as is available here — because it is 8K in size.
http://daniel.rimspace.net/gnus.el
This is the nice version, incidentally, where I use a copy of the source into
Dovecot because it returns a nicer set of UID values. The original spaces,
for me, many email UIDs by ~ 100, ensuring that there are *no* contiguous
numbers available in the sparse set that Gnus is working with.
(It also leads to mailboxes with ~ 1000 numbered items across a range from 1
through 750,000 or so.)
Daniel
--
✣ Daniel Pittman ✉ daniel@rimspace.net ☎ +61 401 155 707
♽ made with 100 percent post-consumer electrons
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 14:01 ` Lars Magne Ingebrigtsen
2010-09-10 14:08 ` Leo
2010-09-10 14:18 ` Ted Zlatanov
@ 2010-09-11 5:52 ` Stephen J. Turnbull
2 siblings, 0 replies; 48+ messages in thread
From: Stephen J. Turnbull @ 2010-09-11 5:52 UTC (permalink / raw)
To: Lars Magne Ingebrigtsen; +Cc: emacs-devel
Lars Magne Ingebrigtsen writes:
> Ted Zlatanov <tzz@lifelogs.com> writes:
> > 64-bit int support in Emacs should not require a 64-bit system.
> > If they are natively available it's faster, that's all. It's not
> > like you'd be packing and unpacking the bits directly.
> Yeah. Bignum support in Emacs would totally rock.
Indeed, it rocks in XEmacs and SXEmacs. :-)
> No more manual conversions to floating point numbers to do safe-ish
> computations on numbers that might be too big would be needed.
>
> But I suspect that it's a kinda big project. :-)
Not really. Somewhat bigger than "linking in libxml2", but not that
much. For all computations except buffer and string support, it's
already been done in XEmacs and SXEmacs, and the original
implementation has needed very little in the way of bugfixing -- I
suspect it's a real SMOP. The hard/tedious part would be large buffer
support on 32-bit systems, because for efficiency you probably don't
want to use bignums, but rather 64-bit fixnums. (But nobody has
actually tried, because the real problem with big buffers is
variable-width representations of characters.)
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 14:01 ` Ted Zlatanov
2010-09-10 14:09 ` Lars Magne Ingebrigtsen
@ 2010-09-11 9:36 ` Stefan Monnier
1 sibling, 0 replies; 48+ messages in thread
From: Stefan Monnier @ 2010-09-11 9:36 UTC (permalink / raw)
To: Ted Zlatanov; +Cc: emacs-devel
SM> Hmm... so how 'bout installing a C version of range-memq and keep
SM> everything else in Elisp? Would that be good enough as a "quick
SM> fix"?
> Lars, how about the performance of the set operations (difference,
> union, subset/superset, intersection)?
The hope would be that when operating on compressed sets those
operations would be sufficiently fast even when coded in Elisp.
> They'll benefit from a fast range-memq but they would still do a lot
> of ELisp work.
Set intersection and difference should not use range-memq, so it would
only gain from working on compressed data. And Elisp is slow, so it
would still be slow. But hopefully these operations would gain enough
from working on compressed data, and are called sufficiently rarely for
it not to be a problem.
> Can ranges be an opaque type (with rangep, make-range, etc.) so the
> implementation is irrelevant to the user? That seems the best
> compromise between performance and maintainability. As long as you can
> read and print ranges in a consistent format, that is.
That would probably be the way to go, indeed.
Stefan
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 14:08 ` Leo
2010-09-10 14:15 ` Lars Magne Ingebrigtsen
@ 2010-09-11 9:44 ` Stefan Monnier
1 sibling, 0 replies; 48+ messages in thread
From: Stefan Monnier @ 2010-09-11 9:44 UTC (permalink / raw)
To: Leo; +Cc: emacs-devel
> GNU already has http://gmplib.org/. Would love to see it linked in too ;)
Patches welcome (actually, you can start from
http://bzr.savannah.gnu.org/r/emacs/other-branches/gerd_big/ which has
an old patch for it). Note that it's not nearly as simple as it sounds:
you have to add a new type, and then make primitives accept it.
But you don't have to update all primitives at once to make it useful,
so it can be made incrementally.
Stefan
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 15:26 ` Lars Magne Ingebrigtsen
@ 2010-09-11 9:48 ` Stefan Monnier
2010-09-11 11:57 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 48+ messages in thread
From: Stefan Monnier @ 2010-09-11 9:48 UTC (permalink / raw)
To: emacs-devel
> Well, that's the case in Common Lisp, but it needn't be in Emacs Lisp.
> `eq' could just use `=' as the comparison function for bignums. Even
> though that would be a non-traditional way of doing it.
You're basically suggesting to replace `eq' with `eql', which has
non-trivial performance implications.
Stefan
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-11 9:48 ` Stefan Monnier
@ 2010-09-11 11:57 ` Lars Magne Ingebrigtsen
2010-09-11 15:36 ` Stephen J. Turnbull
0 siblings, 1 reply; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-11 11:57 UTC (permalink / raw)
To: emacs-devel
Stefan Monnier <monnier@iro.umontreal.ca> writes:
>> Well, that's the case in Common Lisp, but it needn't be in Emacs Lisp.
>> `eq' could just use `=' as the comparison function for bignums. Even
>> though that would be a non-traditional way of doing it.
>
> You're basically suggesting to replace `eq' with `eql', which has
> non-trivial performance implications.
Well, only for the new bignum type.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-11 11:57 ` Lars Magne Ingebrigtsen
@ 2010-09-11 15:36 ` Stephen J. Turnbull
2010-09-11 15:51 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 48+ messages in thread
From: Stephen J. Turnbull @ 2010-09-11 15:36 UTC (permalink / raw)
To: Lars Magne Ingebrigtsen; +Cc: emacs-devel
Lars Magne Ingebrigtsen writes:
> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
> >> Well, that's the case in Common Lisp, but it needn't be in Emacs
> >> Lisp. `eq' could just use `=' as the comparison function for
> >> bignums. Even though that would be a non-traditional way of
> >> doing it.
> >
> > You're basically suggesting to replace `eq' with `eql', which has
> > non-trivial performance implications.
>
> Well, only for the new bignum type.
No, for all types. First you have to check whether something is a
bignum, then if not you do a bit-pattern comparison. Worst case this
could double the cost of doing EQ(x,y) in C, and in Lisp `eq' will
cost a perceptible amount more.
FWIW nobody has complained that for bignums (= x y) doesn't imply (eq
x y) in XEmacs. You might want to ask the SXEmacs people what their
experience is.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-11 15:36 ` Stephen J. Turnbull
@ 2010-09-11 15:51 ` Lars Magne Ingebrigtsen
2010-09-11 16:15 ` Wojciech Meyer
0 siblings, 1 reply; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-11 15:51 UTC (permalink / raw)
To: emacs-devel
"Stephen J. Turnbull" <stephen@xemacs.org> writes:
> No, for all types. First you have to check whether something is a
> bignum, then if not you do a bit-pattern comparison. Worst case this
> could double the cost of doing EQ(x,y) in C, and in Lisp `eq' will
> cost a perceptible amount more.
Right. Is EQ in C just implemented as a memory check, and absolutely
nothing else? It doesn't touch the data at all in any way?
*etag around a bit*
#define EQ(x, y) (XHASH (x) == XHASH (y))
which is
/* Return a perfect hash of the Lisp_Object representation. */
#define XHASH(a) (a)
So, yeah, stupid idea. Never mind.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: rfc2047-decode-string in C?
2010-09-10 13:51 ` Lars Magne Ingebrigtsen
@ 2010-09-11 16:10 ` Lars Magne Ingebrigtsen
0 siblings, 0 replies; 48+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-11 16:10 UTC (permalink / raw)
To: emacs-devel
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Ok; I'll do some benchmarking over the weekend. (If I find the time
> after reimplementing nnimap.el.)
I've done some simple benchmarking by just disabling the
rfc2047-decoding in one Emacs and comparing group entry with decoding
versus group entry without decoding. (The assumption is that a
C-implemented rfc2047-decode-string would take zero time. :-)
I tried various groups, and selected 10K articles. Entry time was
between 5s and 10s with decoding. With the decoding disabled, group
entry was 15-20% faster.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-11 15:51 ` Lars Magne Ingebrigtsen
@ 2010-09-11 16:15 ` Wojciech Meyer
2010-09-12 9:57 ` Stefan Monnier
0 siblings, 1 reply; 48+ messages in thread
From: Wojciech Meyer @ 2010-09-11 16:15 UTC (permalink / raw)
To: emacs-devel
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> "Stephen J. Turnbull" <stephen@xemacs.org> writes:
>
>> No, for all types. First you have to check whether something is a
>> bignum, then if not you do a bit-pattern comparison. Worst case this
>> could double the cost of doing EQ(x,y) in C, and in Lisp `eq' will
>> cost a perceptible amount more.
>
> Right. Is EQ in C just implemented as a memory check, and absolutely
> nothing else? It doesn't touch the data at all in any way?
>
> *etag around a bit*
>
> #define EQ(x, y) (XHASH (x) == XHASH (y))
>
> which is
>
> /* Return a perfect hash of the Lisp_Object representation. */
> #define XHASH(a) (a)
>
> So, yeah, stupid idea. Never mind.
Yes, because it is physical equivalence, and pointers are unique. In
this case I agree, it should not be called XHASH because that implies it
is used for structural equivalence, which is misleading. Also, hashing
object should behave like deep copying - follow pointers.
Wojciech
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-11 16:15 ` Wojciech Meyer
@ 2010-09-12 9:57 ` Stefan Monnier
0 siblings, 0 replies; 48+ messages in thread
From: Stefan Monnier @ 2010-09-12 9:57 UTC (permalink / raw)
To: Wojciech Meyer; +Cc: emacs-devel
>> #define EQ(x, y) (XHASH (x) == XHASH (y))
Not only that, but also
#define NILP(x) EQ (x, Qnil)
So EQ is really used extensively in C and slowing it down is likely to
make a perceptible difference.
>> /* Return a perfect hash of the Lisp_Object representation. */
>> #define XHASH(a) (a)
> this case I agree, it should not be called XHASH because that implies it
> is used for structural equivalence, which is misleading. Also, hashing
> object should behave like deep copying - follow pointers.
Nothing says that hashing an object should behave like deep copying.
That's why make-hash-table has a :test parameter which can be `eq' or
`equal'.
As for why we have XHASH: because there's another definition of it for
when we use a union rather than an int for Lisp_Object:
#define XHASH(a) ((a).i)
-- Stefan
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 3:24 ` Stephen J. Turnbull
2010-09-10 13:53 ` Ted Zlatanov
@ 2010-09-13 11:45 ` Eli Zaretskii
1 sibling, 0 replies; 48+ messages in thread
From: Eli Zaretskii @ 2010-09-13 11:45 UTC (permalink / raw)
To: Stephen J. Turnbull; +Cc: larsi, emacs-devel
> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Date: Fri, 10 Sep 2010 12:24:26 +0900
> Cc: emacs-devel@gnu.org
>
> Lars Magne Ingebrigtsen writes:
>
> > But aren't everybody on 64-bit systems now? :-)
>
> Everybody using Microsoft or Apple products, yes; they have no choice.
The Windows port of Emacs does not yet support a 64-bit build, AFAIK.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Pushing the `gnus-range-*' functions down into the C layer
2010-09-10 19:28 ` Tom Tromey
@ 2010-09-14 15:52 ` Ted Zlatanov
0 siblings, 0 replies; 48+ messages in thread
From: Ted Zlatanov @ 2010-09-14 15:52 UTC (permalink / raw)
To: emacs-devel
On Fri, 10 Sep 2010 13:28:28 -0600 Tom Tromey <tromey@redhat.com> wrote:
>>>>>> "Ted" == Ted Zlatanov <tzz@lifelogs.com> writes:
Ted> If everything was inside a num64-* namespace and a num64.el package,
Ted> it's a pretty easy implementation. I'm not an expert on numeric
Ted> algorithms[1] but I'd guess the internals can be pretty easily
Ted> implemented in ELisp as a list of up to 3 native 28-bit ints. In fact
Ted> I'm surprised that doesn't exist yet; maybe I've missed it.
Tom> I thought there was one in Calc.
Yeah, looks like calc-math.el:math-* could work. It's all ELisp and
could use some C love, especially if 64-bit is native. So Lars can use
the math-* functions to store and retrieve large numbers (they fall back
to native ELisp numbers) and he'll need a special range-memq to work
with the math-* bignums in a range.
Ted
^ permalink raw reply [flat|nested] 48+ messages in thread
end of thread, other threads:[~2010-09-14 15:52 UTC | newest]
Thread overview: 48+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-09 15:16 Pushing the `gnus-range-*' functions down into the C layer Lars Magne Ingebrigtsen
2010-09-09 19:01 ` Ted Zlatanov
2010-09-09 21:58 ` Lars Magne Ingebrigtsen
2010-09-09 22:25 ` Ted Zlatanov
2010-09-10 3:24 ` Stephen J. Turnbull
2010-09-10 13:53 ` Ted Zlatanov
2010-09-10 14:01 ` Lars Magne Ingebrigtsen
2010-09-10 14:08 ` Leo
2010-09-10 14:15 ` Lars Magne Ingebrigtsen
2010-09-10 14:19 ` Wojciech Meyer
2010-09-11 9:44 ` Stefan Monnier
2010-09-10 14:18 ` Ted Zlatanov
2010-09-10 14:28 ` Lars Magne Ingebrigtsen
2010-09-10 14:38 ` bignums (was: Pushing the `gnus-range-*' functions down into the C layer) Ted Zlatanov
2010-09-10 15:16 ` Pushing the `gnus-range-*' functions down into the C layer Andreas Schwab
2010-09-10 15:22 ` David Kastrup
2010-09-10 15:26 ` Lars Magne Ingebrigtsen
2010-09-11 9:48 ` Stefan Monnier
2010-09-11 11:57 ` Lars Magne Ingebrigtsen
2010-09-11 15:36 ` Stephen J. Turnbull
2010-09-11 15:51 ` Lars Magne Ingebrigtsen
2010-09-11 16:15 ` Wojciech Meyer
2010-09-12 9:57 ` Stefan Monnier
2010-09-10 19:28 ` Tom Tromey
2010-09-14 15:52 ` Ted Zlatanov
2010-09-11 5:52 ` Stephen J. Turnbull
2010-09-13 11:45 ` Eli Zaretskii
2010-09-09 22:21 ` Wojciech Meyer
2010-09-09 23:48 ` Ted Zlatanov
2010-09-09 23:56 ` Wojciech Meyer
2010-09-10 0:07 ` Lars Magne Ingebrigtsen
2010-09-10 0:17 ` Wojciech Meyer
2010-09-10 8:06 ` Andy Wingo
2010-09-10 10:20 ` Wojciech Meyer
2010-09-10 10:43 ` Stefan Monnier
2010-09-10 12:35 ` Lars Magne Ingebrigtsen
2010-09-10 12:49 ` rfc2047-decode-string in C? Lars Magne Ingebrigtsen
2010-09-10 13:47 ` Stefan Monnier
2010-09-10 13:51 ` Lars Magne Ingebrigtsen
2010-09-11 16:10 ` Lars Magne Ingebrigtsen
2010-09-10 13:07 ` Pushing the `gnus-range-*' functions down into the C layer joakim
2010-09-10 13:22 ` Lars Magne Ingebrigtsen
2010-09-10 13:45 ` Stefan Monnier
2010-09-10 14:01 ` Ted Zlatanov
2010-09-10 14:09 ` Lars Magne Ingebrigtsen
2010-09-11 9:36 ` Stefan Monnier
2010-09-10 14:06 ` Lars Magne Ingebrigtsen
2010-09-11 3:18 ` Daniel Pittman
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).