unofficial mirror of help-gnu-emacs@gnu.org
 help / color / mirror / Atom feed
* how to access a large datastructure efficiently?
@ 2010-03-04  1:50 Christian Wittern
  2010-03-04  6:59 ` Thierry Volpiatto
  0 siblings, 1 reply; 14+ messages in thread
From: Christian Wittern @ 2010-03-04  1:50 UTC (permalink / raw)
  To: help-gnu-emacs

Hi there,

Here is the problem I am trying to solve:

I have a large list of items which I want to access.  The items are in 
sequential order, but many are missing in between, like:

(1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
with this, but I took it out for simplicity]

Now when I am trying to access with a key that is not in the list, I 
want to have the one with the closest smaller key returned, so for 6 
and 7 this would be 1, but for 8 and 9 this would be 8.

Since the list will have thousands of elements, I do not want to simply 
loop through it but am looking for better ways to do this in Emacs lisp.  
Any ideas how to achieve this?

Christian





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
  2010-03-04  1:50 Christian Wittern
@ 2010-03-04  6:59 ` Thierry Volpiatto
  2010-03-04  7:25   ` Thierry Volpiatto
  0 siblings, 1 reply; 14+ messages in thread
From: Thierry Volpiatto @ 2010-03-04  6:59 UTC (permalink / raw)
  To: help-gnu-emacs

Hi,

Christian Wittern <cwittern@gmail.com> writes:

> Hi there,
>
> Here is the problem I am trying to solve:
>
> I have a large list of items which I want to access.  The items are in 
> sequential order, but many are missing in between, like:
>
> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
> with this, but I took it out for simplicity]
>
> Now when I am trying to access with a key that is not in the list, I 
> want to have the one with the closest smaller key returned, so for 6 
> and 7 this would be 1, but for 8 and 9 this would be 8.
>
> Since the list will have thousands of elements, I do not want to simply 
> loop through it but am looking for better ways to do this in Emacs lisp.  
> Any ideas how to achieve this?

,----
| (defun closest-elm-in-seq (n seq)
|   (let ((pair (loop with elm = n with last-elm
|                  for i in seq
|                  if (and last-elm (< last-elm elm) (> i elm)) return (list last-elm i)
|                  do (setq last-elm i))))
|     (if (< (- n (car pair)) (- (cadr pair) n))
|         (car pair) (cadr pair))))
`----

That return the closest, but not the smaller closest, but it should be
easy to adapt.

-- 
Thierry Volpiatto
Gpg key: http://pgp.mit.edu/





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
  2010-03-04  6:59 ` Thierry Volpiatto
@ 2010-03-04  7:25   ` Thierry Volpiatto
  2010-03-04  8:13     ` Andreas Röhler
  2010-03-04  8:15     ` Christian Wittern
  0 siblings, 2 replies; 14+ messages in thread
From: Thierry Volpiatto @ 2010-03-04  7:25 UTC (permalink / raw)
  To: help-gnu-emacs

Thierry Volpiatto <thierry.volpiatto@gmail.com> writes:

> Hi,
>
> Christian Wittern <cwittern@gmail.com> writes:
>
>> Hi there,
>>
>> Here is the problem I am trying to solve:
>>
>> I have a large list of items which I want to access.  The items are in 
>> sequential order, but many are missing in between, like:
>>
>> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
>> with this, but I took it out for simplicity]
>>
>> Now when I am trying to access with a key that is not in the list, I 
>> want to have the one with the closest smaller key returned, so for 6 
>> and 7 this would be 1, but for 8 and 9 this would be 8.
>>
>> Since the list will have thousands of elements, I do not want to simply 
>> loop through it but am looking for better ways to do this in Emacs lisp.  
>> Any ideas how to achieve this?
>
> ,----
> | (defun closest-elm-in-seq (n seq)
> |   (let ((pair (loop with elm = n with last-elm
> |                  for i in seq
> |                  if (and last-elm (< last-elm elm) (> i elm)) return (list last-elm i)
> |                  do (setq last-elm i))))
> |     (if (< (- n (car pair)) (- (cadr pair) n))
> |         (car pair) (cadr pair))))
> `----
>
> That return the closest, but not the smaller closest, but it should be
> easy to adapt.

Case where your element is member of list, return it:

,----
| (defun closest-elm-in-seq (n seq)
|   (let ((pair (loop with elm = n with last-elm
|                  for i in seq
|                  if (eq i elm) return (list i)
|                  else if (and last-elm (< last-elm elm) (> i elm)) return (list last-elm i)
|                  do (setq last-elm i))))
|     (if (> (length pair) 1)
|         (if (< (- n (car pair)) (- (cadr pair) n))
|             (car pair) (cadr pair))
|         (car pair))))
`----
For the smallest just return the car...

-- 
Thierry Volpiatto
Gpg key: http://pgp.mit.edu/





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
  2010-03-04  7:25   ` Thierry Volpiatto
@ 2010-03-04  8:13     ` Andreas Röhler
  2010-03-04 11:00       ` Thierry Volpiatto
  2010-03-04  8:15     ` Christian Wittern
  1 sibling, 1 reply; 14+ messages in thread
From: Andreas Röhler @ 2010-03-04  8:13 UTC (permalink / raw)
  To: help-gnu-emacs

Thierry Volpiatto wrote:
> Thierry Volpiatto <thierry.volpiatto@gmail.com> writes:
> 
>> Hi,
>>
>> Christian Wittern <cwittern@gmail.com> writes:
>>
>>> Hi there,
>>>
>>> Here is the problem I am trying to solve:
>>>
>>> I have a large list of items which I want to access.  The items are in 
>>> sequential order, but many are missing in between, like:
>>>
>>> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
>>> with this, but I took it out for simplicity]
>>>
>>> Now when I am trying to access with a key that is not in the list, I 
>>> want to have the one with the closest smaller key returned, so for 6 
>>> and 7 this would be 1, but for 8 and 9 this would be 8.
>>>
>>> Since the list will have thousands of elements, I do not want to simply 
>>> loop through it but am looking for better ways to do this in Emacs lisp.  
>>> Any ideas how to achieve this?
>> ,----
>> | (defun closest-elm-in-seq (n seq)
>> |   (let ((pair (loop with elm = n with last-elm
>> |                  for i in seq
>> |                  if (and last-elm (< last-elm elm) (> i elm)) return (list last-elm i)
>> |                  do (setq last-elm i))))
>> |     (if (< (- n (car pair)) (- (cadr pair) n))
>> |         (car pair) (cadr pair))))
>> `----
>>
>> That return the closest, but not the smaller closest, but it should be
>> easy to adapt.
> 
> Case where your element is member of list, return it:
> 
> ,----
> | (defun closest-elm-in-seq (n seq)
> |   (let ((pair (loop with elm = n with last-elm
> |                  for i in seq
> |                  if (eq i elm) return (list i)
> |                  else if (and last-elm (< last-elm elm) (> i elm)) return (list last-elm i)
> |                  do (setq last-elm i))))
> |     (if (> (length pair) 1)
> |         (if (< (- n (car pair)) (- (cadr pair) n))
> |             (car pair) (cadr pair))
> |         (car pair))))
> `----
> For the smallest just return the car...
> 

if n is member of the seq, maybe equal-operator too

(<= last-elm elm)

is correct?

Thanks BTW, very interesting

Andreas




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
  2010-03-04  7:25   ` Thierry Volpiatto
  2010-03-04  8:13     ` Andreas Röhler
@ 2010-03-04  8:15     ` Christian Wittern
  2010-03-04 10:24       ` Thierry Volpiatto
                         ` (2 more replies)
  1 sibling, 3 replies; 14+ messages in thread
From: Christian Wittern @ 2010-03-04  8:15 UTC (permalink / raw)
  To: help-gnu-emacs

Thierry Volpiatto <thierry.volpiatto <at> gmail.com> writes:

> >> I have a large list of items which I want to access.  The items are in 
> >> sequential order, but many are missing in between, like:
> >>
> >> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
> >> with this, but I took it out for simplicity]
> 
> ,----
> | (defun closest-elm-in-seq (n seq)
> |   (let ((pair (loop with elm = n with last-elm
> |                  for i in seq
> |                  if (eq i elm) return (list i)
> |                  else if (and last-elm (< last-elm elm) (> i elm)) return
(list last-elm i)
> |                  do (setq last-elm i))))
> |     (if (> (length pair) 1)
> |         (if (< (- n (car pair)) (- (cadr pair) n))
> |             (car pair) (cadr pair))
> |         (car pair))))
> `----
> For the smallest just return the car...
> 

This seems to do what I need, thanks!  Now I have to see how that 
performs on the real data.  I was hoping there would be a method 
that did not involve loops, but some kind of binary search.  Would it be
possible to use a hash-table here?

Christian 







^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
  2010-03-04  8:15     ` Christian Wittern
@ 2010-03-04 10:24       ` Thierry Volpiatto
  2010-03-04 15:01       ` Thien-Thi Nguyen
  2010-03-04 16:49       ` Andreas Politz
  2 siblings, 0 replies; 14+ messages in thread
From: Thierry Volpiatto @ 2010-03-04 10:24 UTC (permalink / raw)
  To: help-gnu-emacs

Christian Wittern <cwittern@gmail.com> writes:

> Thierry Volpiatto <thierry.volpiatto <at> gmail.com> writes:
>
>> >> I have a large list of items which I want to access.  The items are in 
>> >> sequential order, but many are missing in between, like:
>> >>
>> >> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
>> >> with this, but I took it out for simplicity]
>> 
>> ,----
>> | (defun closest-elm-in-seq (n seq)
>> |   (let ((pair (loop with elm = n with last-elm
>> |                  for i in seq
>> |                  if (eq i elm) return (list i)
>> |                  else if (and last-elm (< last-elm elm) (> i elm)) return
> (list last-elm i)
>> |                  do (setq last-elm i))))
>> |     (if (> (length pair) 1)
>> |         (if (< (- n (car pair)) (- (cadr pair) n))
>> |             (car pair) (cadr pair))
>> |         (car pair))))
>> `----
>> For the smallest just return the car...
>> 
>
> This seems to do what I need, thanks!  Now I have to see how that 
> performs on the real data.  I was hoping there would be a method 
> that did not involve loops, but some kind of binary search.  Would it be
> possible to use a hash-table here?

Yes.
Use loop like this:

,----
| ELISP> (setq A (make-hash-table))
| #s(hash-table size 65 test eql rehash-size 1.5 rehash-threshold 0.8 data
|               ())
| ELISP> A
| #s(hash-table size 65 test eql rehash-size 1.5 rehash-threshold 0.8 data
|               (1 "a" 2 "b" 3 "c"))
| 
| ELISP> (puthash 1 "a" A)
| "a"
| ELISP> (puthash 2 "b" A)
| "b"
| ELISP> (puthash 3 "c" A)
| "c"
| ELISP> (loop for k being the hash-keys in A
|           collect k)
| (1 2 3)
| 
| ELISP> (loop for v being the hash-values in A
|           collect v)
| ("a" "b" "c")
| 
| ELISP> (loop for v being the hash-values in A using (hash-key k)
|           collect (list k v))
| ((1 "a")
|  (2 "b")
|  (3 "c"))
`----


-- 
Thierry Volpiatto
Gpg key: http://pgp.mit.edu/





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
       [not found] <mailman.2247.1267667449.14305.help-gnu-emacs@gnu.org>
@ 2010-03-04 10:36 ` Alan Mackenzie
  2010-03-04 20:22 ` Pascal J. Bourguignon
  2010-03-05  0:29 ` Stefan Monnier
  2 siblings, 0 replies; 14+ messages in thread
From: Alan Mackenzie @ 2010-03-04 10:36 UTC (permalink / raw)
  To: help-gnu-emacs

Hi, Christian,

Christian Wittern <cwittern@gmail.com> wrote:
> Hi there,

> Here is the problem I am trying to solve:

> I have a large list of items which I want to access.  The items are in 
> sequential order, but many are missing in between, like:

> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
> with this, but I took it out for simplicity]

> Now when I am trying to access with a key that is not in the list, I 
> want to have the one with the closest smaller key returned, so for 6 
> and 7 this would be 1, but for 8 and 9 this would be 8.

> Since the list will have thousands of elements, I do not want to simply 
> loop through it but am looking for better ways to do this in Emacs lisp.  
> Any ideas how to achieve this?

You may want to use some sort of optimised tree structure, such as a
B-tree or an AVL-tree.  I am not an expert on such things.  There is an
article on the AVL-tree on the emacs wiki at
<http://www.emacswiki.org/emacs/AVLtrees>.

This may be what you want.  But I would try it first with a normal flat
list, and only go to the more complex data structure when the first try
really, really isn't fast enough.

> Christian

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
  2010-03-04  8:13     ` Andreas Röhler
@ 2010-03-04 11:00       ` Thierry Volpiatto
  2010-03-04 15:49         ` Andreas Röhler
  0 siblings, 1 reply; 14+ messages in thread
From: Thierry Volpiatto @ 2010-03-04 11:00 UTC (permalink / raw)
  To: help-gnu-emacs

Andreas Röhler <andreas.roehler@easy-emacs.de> writes:

> Thierry Volpiatto wrote:
>> Thierry Volpiatto <thierry.volpiatto@gmail.com> writes:
>> 
>>> Hi,
>>>
>>> Christian Wittern <cwittern@gmail.com> writes:
>>>
>>>> Hi there,
>>>>
>>>> Here is the problem I am trying to solve:
>>>>
>>>> I have a large list of items which I want to access.  The items are in 
>>>> sequential order, but many are missing in between, like:
>>>>
>>>> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
>>>> with this, but I took it out for simplicity]
>>>>
>>>> Now when I am trying to access with a key that is not in the list, I 
>>>> want to have the one with the closest smaller key returned, so for 6 
>>>> and 7 this would be 1, but for 8 and 9 this would be 8.
>>>>
>>>> Since the list will have thousands of elements, I do not want to simply 
>>>> loop through it but am looking for better ways to do this in Emacs lisp.  
>>>> Any ideas how to achieve this?
>>> ,----
>>> | (defun closest-elm-in-seq (n seq)
>>> |   (let ((pair (loop with elm = n with last-elm
>>> |                  for i in seq
>>> |                  if (and last-elm (< last-elm elm) (> i elm)) return (list last-elm i)
>>> |                  do (setq last-elm i))))
>>> |     (if (< (- n (car pair)) (- (cadr pair) n))
>>> |         (car pair) (cadr pair))))
>>> `----
>>>
>>> That return the closest, but not the smaller closest, but it should be
>>> easy to adapt.
>> 
>> Case where your element is member of list, return it:
>> 
>> ,----
>> | (defun closest-elm-in-seq (n seq)
>> |   (let ((pair (loop with elm = n with last-elm
>> |                  for i in seq
>> |                  if (eq i elm) return (list i)
>> |                  else if (and last-elm (< last-elm elm) (> i elm)) return (list last-elm i)
>> |                  do (setq last-elm i))))
>> |     (if (> (length pair) 1)
>> |         (if (< (- n (car pair)) (- (cadr pair) n))
>> |             (car pair) (cadr pair))
>> |         (car pair))))
>> `----
>> For the smallest just return the car...
>> 
>
> if n is member of the seq, maybe equal-operator too
>
> (<= last-elm elm)
>
> is correct?

No, in this case:

if (eq i elm) return (list i) ==> (i) ; which is n

and finally (car pair) ==> n

> Thanks BTW, very interesting
>
> Andreas
>
>
>

-- 
Thierry Volpiatto
Gpg key: http://pgp.mit.edu/





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
  2010-03-04  8:15     ` Christian Wittern
  2010-03-04 10:24       ` Thierry Volpiatto
@ 2010-03-04 15:01       ` Thien-Thi Nguyen
  2010-03-04 16:49       ` Andreas Politz
  2 siblings, 0 replies; 14+ messages in thread
From: Thien-Thi Nguyen @ 2010-03-04 15:01 UTC (permalink / raw)
  To: help-gnu-emacs

() Christian Wittern <cwittern@gmail.com>
() Thu, 4 Mar 2010 08:15:47 +0000 (UTC)

   some kind of binary search

How do you binary search a singly-linked list?

thi




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
  2010-03-04 11:00       ` Thierry Volpiatto
@ 2010-03-04 15:49         ` Andreas Röhler
  2010-03-04 16:09           ` Thierry Volpiatto
  0 siblings, 1 reply; 14+ messages in thread
From: Andreas Röhler @ 2010-03-04 15:49 UTC (permalink / raw)
  To: help-gnu-emacs

Thierry Volpiatto wrote:
> Andreas Röhler <andreas.roehler@easy-emacs.de> writes:
> 
>> Thierry Volpiatto wrote:
>>> Thierry Volpiatto <thierry.volpiatto@gmail.com> writes:
>>>
>>>> Hi,
>>>>
>>>> Christian Wittern <cwittern@gmail.com> writes:
>>>>
>>>>> Hi there,
>>>>>
>>>>> Here is the problem I am trying to solve:
>>>>>
>>>>> I have a large list of items which I want to access.  The items are in 
>>>>> sequential order, but many are missing in between, like:
>>>>>
>>>>> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
>>>>> with this, but I took it out for simplicity]
>>>>>
>>>>> Now when I am trying to access with a key that is not in the list, I 
>>>>> want to have the one with the closest smaller key returned, so for 6 
>>>>> and 7 this would be 1, but for 8 and 9 this would be 8.
>>>>>
>>>>> Since the list will have thousands of elements, I do not want to simply 
>>>>> loop through it but am looking for better ways to do this in Emacs lisp.  
>>>>> Any ideas how to achieve this?
>>>> ,----
>>>> | (defun closest-elm-in-seq (n seq)
>>>> |   (let ((pair (loop with elm = n with last-elm
>>>> |                  for i in seq
>>>> |                  if (and last-elm (< last-elm elm) (> i elm)) return (list last-elm i)
>>>> |                  do (setq last-elm i))))
>>>> |     (if (< (- n (car pair)) (- (cadr pair) n))
>>>> |         (car pair) (cadr pair))))
>>>> `----
>>>>
>>>> That return the closest, but not the smaller closest, but it should be
>>>> easy to adapt.
>>> Case where your element is member of list, return it:
>>>
>>> ,----
>>> | (defun closest-elm-in-seq (n seq)
>>> |   (let ((pair (loop with elm = n with last-elm
>>> |                  for i in seq
>>> |                  if (eq i elm) return (list i)
>>> |                  else if (and last-elm (< last-elm elm) (> i elm)) return (list last-elm i)
>>> |                  do (setq last-elm i))))
>>> |     (if (> (length pair) 1)
>>> |         (if (< (- n (car pair)) (- (cadr pair) n))
>>> |             (car pair) (cadr pair))
>>> |         (car pair))))
>>> `----
>>> For the smallest just return the car...
>>>
>> if n is member of the seq, maybe equal-operator too
>>
>> (<= last-elm elm)
>>
>> is correct?
> 
> No, in this case:
> 
> if (eq i elm) return (list i) ==> (i) ; which is n
> 
> and finally (car pair) ==> n
> 

Hmm, sorry being the imprecise,
aimed at the first form, whose result equals the the second form once implemented this "="

Andreas

>> Thanks BTW, very interesting
>>
>> Andreas
>>
>>
>>
> 





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
  2010-03-04 15:49         ` Andreas Röhler
@ 2010-03-04 16:09           ` Thierry Volpiatto
  0 siblings, 0 replies; 14+ messages in thread
From: Thierry Volpiatto @ 2010-03-04 16:09 UTC (permalink / raw)
  To: help-gnu-emacs

Andreas Röhler <andreas.roehler@easy-emacs.de> writes:

> Thierry Volpiatto wrote:
>> Andreas Röhler <andreas.roehler@easy-emacs.de> writes:
>> 
>>> Thierry Volpiatto wrote:
>>>> Thierry Volpiatto <thierry.volpiatto@gmail.com> writes:
>>>>
>>>>> Hi,
>>>>>
>>>>> Christian Wittern <cwittern@gmail.com> writes:
>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> Here is the problem I am trying to solve:
>>>>>>
>>>>>> I have a large list of items which I want to access.  The items are in 
>>>>>> sequential order, but many are missing in between, like:
>>>>>>
>>>>>> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
>>>>>> with this, but I took it out for simplicity]
>>>>>>
>>>>>> Now when I am trying to access with a key that is not in the list, I 
>>>>>> want to have the one with the closest smaller key returned, so for 6 
>>>>>> and 7 this would be 1, but for 8 and 9 this would be 8.
>>>>>>
>>>>>> Since the list will have thousands of elements, I do not want to simply 
>>>>>> loop through it but am looking for better ways to do this in Emacs lisp.  
>>>>>> Any ideas how to achieve this?
>>>>> ,----
>>>>> | (defun closest-elm-in-seq (n seq)
>>>>> |   (let ((pair (loop with elm = n with last-elm
>>>>> |                  for i in seq
>>>>> |                  if (and last-elm (< last-elm elm) (> i elm)) return (list last-elm i)
>>>>> |                  do (setq last-elm i))))
>>>>> |     (if (< (- n (car pair)) (- (cadr pair) n))
>>>>> |         (car pair) (cadr pair))))
>>>>> `----
>>>>>
>>>>> That return the closest, but not the smaller closest, but it should be
>>>>> easy to adapt.
>>>> Case where your element is member of list, return it:
>>>>
>>>> ,----
>>>> | (defun closest-elm-in-seq (n seq)
>>>> |   (let ((pair (loop with elm = n with last-elm
>>>> |                  for i in seq
>>>> |                  if (eq i elm) return (list i)
>>>> |                  else if (and last-elm (< last-elm elm) (> i elm)) return (list last-elm i)
>>>> |                  do (setq last-elm i))))
>>>> |     (if (> (length pair) 1)
>>>> |         (if (< (- n (car pair)) (- (cadr pair) n))
>>>> |             (car pair) (cadr pair))
>>>> |         (car pair))))
>>>> `----
>>>> For the smallest just return the car...
>>>>
>>> if n is member of the seq, maybe equal-operator too
>>>
>>> (<= last-elm elm)
>>>
>>> is correct?
>> 
>> No, in this case:
>> 
>> if (eq i elm) return (list i) ==> (i) ; which is n
>> 
>> and finally (car pair) ==> n
>> 
>
> Hmm, sorry being the imprecise,
> aimed at the first form, whose result equals the the second form once implemented this "="

Ok, i understand, yes, we can do what you say and it's more elegant, i
just notice also i forget to remove a unuseful else:

,----
| (defun closest-elm-in-seq (n seq)
|   (let ((pair (loop with elm = n with last-elm
|                  for i in seq
|                  if (and last-elm (<= last-elm elm) (> i elm)) return (list last-elm i)
|                  do (setq last-elm i))))
|     (if (< (- n (car pair)) (- (cadr pair) n))
|         (car pair) (cadr pair))))
`----


That should work the same.
Thanks. ;-)


> Andreas
>
>>> Thanks BTW, very interesting
>>>
>>> Andreas
>>>
>>>
>>>
>> 
>
>
>
>

-- 
Thierry Volpiatto
Gpg key: http://pgp.mit.edu/





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
  2010-03-04  8:15     ` Christian Wittern
  2010-03-04 10:24       ` Thierry Volpiatto
  2010-03-04 15:01       ` Thien-Thi Nguyen
@ 2010-03-04 16:49       ` Andreas Politz
  2 siblings, 0 replies; 14+ messages in thread
From: Andreas Politz @ 2010-03-04 16:49 UTC (permalink / raw)
  To: help-gnu-emacs

Christian Wittern <cwittern@gmail.com> writes:

> Thierry Volpiatto <thierry.volpiatto <at> gmail.com> writes:
>
>> >> I have a large list of items which I want to access.  The items are in 
>> >> sequential order, but many are missing in between, like:
>> >>
>> >> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
>> >> with this, but I took it out for simplicity]
>> 
>> ,----
>> | (defun closest-elm-in-seq (n seq)
>> |   (let ((pair (loop with elm = n with last-elm
>> |                  for i in seq
>> |                  if (eq i elm) return (list i)
>> |                  else if (and last-elm (< last-elm elm) (> i elm)) return
> (list last-elm i)
>> |                  do (setq last-elm i))))
>> |     (if (> (length pair) 1)
>> |         (if (< (- n (car pair)) (- (cadr pair) n))
>> |             (car pair) (cadr pair))
>> |         (car pair))))
>> `----
>> For the smallest just return the car...
>> 
>
> This seems to do what I need, thanks!  Now I have to see how that 
> performs on the real data.  I was hoping there would be a method 
> that did not involve loops, but some kind of binary search.  Would it be
> possible to use a hash-table here?
>
> Christian 


I don't know how hash-table could help in this case, but maybe you want
to consider binary trees, as implemented in the avl-tree.el package.
Though, there is no function for finding the closest member with respect
to some data and a distance-function, but see below.

Finding a (closest) member should be constraint in logarithmic time.

(require 'avl-tree)

(setq avl (avl-tree-create '<))

(dotimes (i 2000)
  (when (= 0 (% i 4))
    (avl-tree-enter avl i)))

(avl-tree-member avl 80)
=> 80
(avl-tree-member avl 70)
=> nil


(defun avl-tree-closest-member (tree data delta-fn)
  ;; delta-fn : data x data -> Z
  (flet ((comp-delta (node)
           (funcall delta-fn data
                    (avl-tree--node-data node))))
    (let* ((node (avl-tree--root tree))
           closest
           (delta most-positive-fixnum)
           (compare-function (avl-tree--cmpfun tree))
           found)
      (while (and node
                  (not found))
        (when (< (comp-delta node) delta)
          (setq delta (comp-delta node)
                closest node))
        (cond
         ((funcall compare-function data (avl-tree--node-data node))
          (setq node (avl-tree--node-left node)))
         ((funcall compare-function (avl-tree--node-data node) data)
          (setq node (avl-tree--node-right node)))
         (t
          (setq found t))))
      (if closest
          (avl-tree--node-data closest)
        nil))))



(mapcar 
 (lambda (data)
   (avl-tree-closest-member
    avl data (lambda (n1 n2)
               (abs (- n1 n2)))))
 '(1001 1002 1003 1004))

=> (1000 1004 1004 1004)


-ap





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
       [not found] <mailman.2247.1267667449.14305.help-gnu-emacs@gnu.org>
  2010-03-04 10:36 ` how to access a large datastructure efficiently? Alan Mackenzie
@ 2010-03-04 20:22 ` Pascal J. Bourguignon
  2010-03-05  0:29 ` Stefan Monnier
  2 siblings, 0 replies; 14+ messages in thread
From: Pascal J. Bourguignon @ 2010-03-04 20:22 UTC (permalink / raw)
  To: help-gnu-emacs

Christian Wittern <cwittern@gmail.com> writes:

> Hi there,
>
> Here is the problem I am trying to solve:
>
> I have a large list of items which I want to access.  The items are in 
> sequential order, but many are missing in between, like:
>
> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
> with this, but I took it out for simplicity]
>
> Now when I am trying to access with a key that is not in the list, I 
> want to have the one with the closest smaller key returned, so for 6 
> and 7 this would be 1, but for 8 and 9 this would be 8.
>
> Since the list will have thousands of elements, I do not want to simply 
> loop through it but am looking for better ways to do this in Emacs lisp.  
> Any ideas how to achieve this?

Why do you have a list?

If you had a vector, you could do a dichotomy, and find it in O(log(n)).

Or, if you need to insert or remove elements between searches, as the
others have advised, use a binary tree, preferablemente a balanced
binary tree.  I prefer the left-leaning red-black trees over the avl
trees. 



-- 
__Pascal Bourguignon__
http://www.informatimago.com


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: how to access a large datastructure efficiently?
       [not found] <mailman.2247.1267667449.14305.help-gnu-emacs@gnu.org>
  2010-03-04 10:36 ` how to access a large datastructure efficiently? Alan Mackenzie
  2010-03-04 20:22 ` Pascal J. Bourguignon
@ 2010-03-05  0:29 ` Stefan Monnier
  2 siblings, 0 replies; 14+ messages in thread
From: Stefan Monnier @ 2010-03-05  0:29 UTC (permalink / raw)
  To: help-gnu-emacs

> I have a large list of items which I want to access.  The items are in 
> sequential order, but many are missing in between, like:

> (1 8 17 23 25 34 45 47 50)  [in reality, there is a value associated 
> with this, but I took it out for simplicity]

If those integer-valued keys are really positions in a buffer, then
a good solution might be to store this "list" as a text property.


        Stefan


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-03-05  0:29 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <mailman.2247.1267667449.14305.help-gnu-emacs@gnu.org>
2010-03-04 10:36 ` how to access a large datastructure efficiently? Alan Mackenzie
2010-03-04 20:22 ` Pascal J. Bourguignon
2010-03-05  0:29 ` Stefan Monnier
2010-03-04  1:50 Christian Wittern
2010-03-04  6:59 ` Thierry Volpiatto
2010-03-04  7:25   ` Thierry Volpiatto
2010-03-04  8:13     ` Andreas Röhler
2010-03-04 11:00       ` Thierry Volpiatto
2010-03-04 15:49         ` Andreas Röhler
2010-03-04 16:09           ` Thierry Volpiatto
2010-03-04  8:15     ` Christian Wittern
2010-03-04 10:24       ` Thierry Volpiatto
2010-03-04 15:01       ` Thien-Thi Nguyen
2010-03-04 16:49       ` Andreas Politz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).