On Wed, Oct 31, 2012 at 5:53 PM, Nick Dokos <nicholas.dokos@hp.com> wrote:
John Hendy <jw.hendy@gmail.com> wrote:

> On Wed, Oct 31, 2012 at 3:12 PM, <cberry@tajo.ucsd.edu> wrote:
>
> > John Hendy <jw.hendy@gmail.com> writes:
> >
> > > On Wed, Oct 31, 2012 at 11:41 AM, <cberry@tajo.ucsd.edu> wrote:
> > >
> > > > John Hendy <jw.hendy@gmail.com> writes:
> > > >
> > > > > I edited the subject to be more concise/clear. I let orgmode chug away
> > > > > on reading in some ~10-30mb csv files for nearly 30min.
> > > >
> > > > [rest deleted]
> > > >
> > > > You need an ECM.
> > >
> > > I did my best to provide one, other than the file, which I offered to
> > > provide if others requested that I upload it somewhere. Since you have
> > > done so, so have I:
> > >
> > > - https://docs.google.com/open?id=0BzQupOSnvw08WHdabHh5VVczRGM
> > >
> > > Let me know if that doesn't work. I put it on Google docs and sometimes
> > > have issues with the sharing settings...
> >
> > Not an ECM in my book, but ...
>
> What else would you like? I provided:
> - the config
> - the data
> - how to [attempt to] reproduce
> - the org-mode text

Smaller set of data I'd guess :-) But it does not seem to be the
size of the data that matters.

> > On my 4 year old MacBook:
> >
> > ,----
> > |
> > | #+PROPERTY: session *R*
> > |
> > | #+name: bigcsv
> > | #+begin_src R
> > | bigcsv <- Sys.glob("~/Downloads/*.csv")
> > | #+end_src
> > |
> > | #+RESULTS: bigcsv
> > | : /Users/cberry/Downloads/test-file.csv
> > |
> > | #+name: readbig
> > | #+begin_src R :results output
> > | system.time(
> > |   tmp <- read.csv(bigcsv)
> > | )
> > |
> > | #+end_src
> > |
> > | #+RESULTS: readbig
> > | :    user  system elapsed
> > | :   5.679   0.306   6.002
> > |
> > `----
> >
> > About the same as running from ESS.
>
> Not sure what to say. Looking for ways to troubleshoot or confirm. Since
> you can't confirm, any suggestions on where I should look for my issue? I
> can't explain it! All I know is that org chugs and chugs and the direct
> execution in ESS session is lightning fast.

A few things to try in no particular order:
o run top (or whatever equivalent is available on your OS) and see
  whether the CPU (or one of the CPUs) gets pegged at 100% utilization
  and stays there. If yes, that's an indication of an infinite loop
  somewhere.

o run vmstat (or equivalent) and see if any of the counters are out of
  whack. That requires some experience though.

o use elp-instrument-package to instrument org and run the test, getting
  a profile. I'm not sure whether the results will be useful, since you
  are going to interrupt the test when you run out of patience, but it
  cannot hurt and it might tell you something useful.

o run your ECM on a different computer/OS/emacs installation. Being able
  to compare things side by side is often very useful.

o halve your file and run the test on each half (but that's probably not
  the problem given Chuck's results).

o reinstall org from scratch - you might have some corruption in one of
  the compiled files that's causing it to go into an infinite loop.

o turn on debug-on-quit, start your test, wait a bit and then interrupt
  it. Check the backtrace. Do it again and check whether the backtrace
  looks the same. If it does, that's often an indication of an infinite
  loop (inferring an infinite loop from a two element sample is
  statistically suspect of course, but surprisingly effective
  nevertheless). The point here is that the infinite loop is in emacs
  and the backtrace tells you something about the parties involved.

These are obviously not independent, and the results of one experiment
will have to guide you in what you try next.
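
For the two Emacs-side suggestions (elp-instrument-package and
debug-on-quit), a minimal sketch of what to evaluate might look like the
following. It assumes you evaluate the forms one at a time (for example
with M-: or C-x C-e) and run the slow source block in between, rather
than loading everything at once. The "org" prefix is an assumption about
which functions are worth profiling; it also covers the org-babel-*
functions because they share that prefix.

```emacs-lisp
;; --- Profiling with ELP ---------------------------------------------
(require 'elp)

;; Instrument every function whose name starts with "org"
;; (this includes the org-babel-* functions).
(elp-instrument-package "org")

;; ... now execute the slow source block (C-c C-c on it) and, if it
;; never finishes, interrupt it with C-g ...

;; Show call counts and elapsed times in *ELP Profiling Results*.
(elp-results)

;; Remove the instrumentation again when you are done.
(elp-restore-all)

;; --- Backtrace on interrupt -----------------------------------------
;; Enter the debugger whenever C-g interrupts Emacs.
(setq debug-on-quit t)

;; ... execute the slow block, wait a bit, press C-g, and read the
;; *Backtrace* buffer; repeat and compare the two backtraces ...

;; Switch it off again afterwards.
(setq debug-on-quit nil)
```

If the same org function sits at the top of the ELP results and also
shows up in both backtraces, that is a reasonable place to start looking.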
Good luck,
Nick