From: Eric Schulte <eric.schulte@gmx.com>
To: "Thomas S. Dye" <tsd@tsdye.com>
Cc: Michael Hannon <jm_hannon@yahoo.com>,
Org-Mode List <emacs-orgmode@gnu.org>
Subject: Re: Babel: communicating irregular data to R source-code block
Date: Sun, 22 Apr 2012 11:58:40 -0400
Message-ID: <87ipgrn4by.fsf@gmx.com>
In-Reply-To: m1397w1toe.fsf@tsdye.com
tsd@tsdye.com (Thomas S. Dye) writes:
> Aloha Michael,
>
> Michael Hannon <jm_hannon@yahoo.com> writes:
>
>> Greetings. I'm sitting in on a weekly, informal, "brown-bag" seminar on data
>> technologies in statistics. There are more people attending the seminar than
>> there are weeks in which to give talks, so I may get by with being my usual,
>> passive-slug self.
>>
>> But I thought it might be useful to have a contingency plan and decided that
>> giving a brief talk about Babel might be useful/instructive. I thought (and
>> think) that mushing together (with attribution) some of the content of the
>> paper [1] by The Gang of Four and the content of Eric's talk [2] might be a
>> good approach. (BTW, if this isn't legal, desirable, permissible, etc., this
>> would be a good time to tell me.)
>>
I would be happy for you to re-use these materials.
>>
>> I liked the Pascal's Triangle example (which morphed from elisp to Python, or
>> vice versa, in the two references), but I was afraid that the elisp routine
>> "pst-check", used as a check on the correctness of the previously-generated
>> Pascal's triangle, might be too esoteric for this audience, not to mention me.
>> (The recursive Fibonacci function is virtually identical in all languages,
>> but the second part is more obscure.)
>>
I was giving a presentation to a local lisp/scheme user group, so I
figured I'd spare them the pain of trying to read python code :).
>>
>> I thought it should be possible to use R to do the same sanity check, as R
>> would be much more-familiar to this audience (and its use would still
>> demonstrate the meta-language feature of Babel).
>>
>> Unfortunately, I haven't been able to find a way to communicate the output of
>> the Pascal's Triangle example to an R source-code block. The gist of the
>> problem seems to be that regardless of how I try to grab the data (scan,
>> readLines, etc.) Babel always ends up trying to read a data frame (table) and
>> I get an error similar to:
>>
I present some options below specific to Tom's discussion, but another
option may be to use the ":results output" option on a python code block
which prints the table to STDOUT, and then use something like readLines
to read from the resulting string into R.
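A minimal sketch of the Python side of the ":results output" route, assuming the block body simply prints one row per line. The helper name format_rows is made up for illustration, and an iterative construction is swapped in for the recursive one quoted in this thread:

```python
def pascals_triangle(n):
    """Return rows 0..n of Pascal's triangle as a ragged list of lists."""
    rows = [[1]]
    for _ in range(n):
        prev = rows[-1]
        # Each entry is the sum of the two entries above it.
        rows.append([a + b for a, b in zip([0] + prev, prev + [0])])
    return rows


def format_rows(rows):
    """Render each row as one space-separated line (hypothetical helper)."""
    return "\n".join(" ".join(str(x) for x in row) for row in rows)


# With ":results output", whatever is printed becomes the block's result.
print(format_rows(pascals_triangle(5)))
```

On the R side, the resulting string could then presumably be read line by line (for example with readLines on a textConnection) and each line split with strsplit, which avoids read.table and its requirement that every row have the same number of fields.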
>>
>> <<<<<<
>>> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
>>> : line 1 did not have 5 elements
>>
>> Enter a frame number, or 0 to exit
>>
>> 1: read.table("/tmp/babel-3780tje/R-import-3780Akj", header = FALSE, row.names
>> = NULL, sep = "
>>>>>>>>
>>
>> If I construct a table "by hand" with all of the cells occupied, everything
>> goes OK. For instance:
>>
>> <<<<<<
>> #+TBLNAME: some-junk
>> | 1 | 0 | 0 | 0 |
>> | 1 | 1 | 0 | 0 |
>> | 1 | 2 | 1 | 0 |
>> | 1 | 3 | 3 | 1 |
>>
>> #+NAME: read-some-junk(sj_input=some-junk)
>> #+BEGIN_SRC R
>>
>> rowSums(sj_input)
>>
>> #+END_SRC
>>
>> #+RESULTS: read-some-junk
>> | 1 |
>> | 2 |
>> | 4 |
>> | 8 |
>>>>>>>>
>>
>> But the following gives the kind of error I described above:
>>
>> <<<<<<
>> #+name: pascals_triangle
>> #+begin_src python :var n=5 :exports none :return pascals_triangle(5)
>> def pascals_triangle(n):
>>     if n == 0:
>>         return [[1]]
>>     prev_triangle = pascals_triangle(n-1)
>>     prev_row = prev_triangle[n-1]
>>     this_row = map(sum, zip([0] + prev_row, prev_row + [0]))
>>     return prev_triangle + [this_row]
>>
>> pascals_triangle(n)
>> #+end_src
>
> A few things are wrong at this point. It seems the JSS article has
> an error in the header of the pascals_triangle source block. AFAIK
> there is no header argument :return. I don't know how :return
> pascals_triangle(5) got there, but am fairly certain it shouldn't be.
>
The :return header argument *is* a supported header argument of python
code blocks and is not an error. The python code block should run w/o
error and without the extra "return pascals_triangle(n)" at the bottom.
The following works for me.
#+name: pascals_triangle
#+begin_src python :var n=5 :exports none :return pascals_triangle(5)
def pascals_triangle(n):
    if n == 0:
        return [[1]]
    prev_triangle = pascals_triangle(n-1)
    prev_row = prev_triangle[n-1]
    this_row = map(sum, zip([0] + prev_row, prev_row + [0]))
    return prev_triangle + [this_row]
#+end_src
#+RESULTS: pascals_triangle
| 1 | | | | | |
| 1 | 1 | | | | |
| 1 | 2 | 1 | | | |
| 1 | 3 | 3 | 1 | | |
| 1 | 4 | 6 | 4 | 1 | |
| 1 | 5 | 10 | 10 | 5 | 1 |
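The block above relies on map returning a list, which holds for Python 2. Under Python 3, map returns an iterator, so the list concatenation would raise a TypeError once n reaches 2. A sketch of the same function with the map wrapped in list(), assuming nothing else about the Babel setup changes:

```python
def pascals_triangle(n):
    """Recursively build rows 0..n of Pascal's triangle."""
    if n == 0:
        return [[1]]
    prev_triangle = pascals_triangle(n - 1)
    prev_row = prev_triangle[n - 1]
    # list(...) forces the map so the next recursive call can
    # concatenate this row with [0] on either side.
    this_row = list(map(sum, zip([0] + prev_row, prev_row + [0])))
    return prev_triangle + [this_row]
```

The list() call is harmless under Python 2 as well, so this form should run under either interpreter.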
[...]
>
> I vaguely remember that it once was possible to pass variables in
> through the name line, but I couldn't find this syntax in some fairly
> recent documentation.
This style of passing arguments is still supported, but not necessarily
encouraged by the documentation.
> It does appear to work still using a recent Org-mode. If I rename the
> results and then pass that to the source code block, all is well.
>
> #+RESULTS: pascals-tri
> | 1 | | | | | |
> | 1 | 1 | | | | |
> | 1 | 2 | 1 | | | |
> | 1 | 3 | 3 | 1 | | |
> | 1 | 4 | 6 | 4 | 1 | |
> | 1 | 5 | 10 | 10 | 5 | 1 |
>
>
> #+name: pst-checkR(p=pascals-tri)
> #+BEGIN_SRC R
> p
> #+END_SRC
>
> #+RESULTS: pst-checkR
>
> | 1 | nil | nil | nil | nil | nil |
> | 1 | 1 | nil | nil | nil | nil |
> | 1 | 2 | 1 | nil | nil | nil |
> | 1 | 3 | 3 | 1 | nil | nil |
> | 1 | 4 | 6 | 4 | 1 | nil |
> | 1 | 5 | 10 | 10 | 5 | 1 |
>
> This looks like a bug to me, but Eric S. will know better what might be
> going on.
The above is due to the inability of R (or at least of the read.table
function) to read in tables with different row length. The process of
writing to an Org-mode table and *then* referencing that table as Tom
suggests above has the side effect of filling in blank spots in the
final exported table, turning what would otherwise be something like
1
1 1
1 2 1
into something like
1 "" ""
1 1 ""
1 2 1
You could also use a function like the following to explicitly fill in
these missing lines.
#+name: padded_pascals_triangle
#+begin_src emacs-lisp :var data=pascals_triangle
(let ((max-length (apply #'max (mapcar #'length data))))
  (mapcar (lambda (row)
            (append row (make-vector (- max-length (length row)) "") nil))
          data))
#+end_src
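For readers more at home in Python than in Emacs Lisp, the same padding idea can be sketched as follows. The name pad_rows and its fill parameter are made up for illustration, and the sketch assumes the input has at least one row:

```python
def pad_rows(rows, fill=""):
    """Right-pad every row with `fill` so all rows match the longest one."""
    max_length = max(len(row) for row in rows)
    return [list(row) + [fill] * (max_length - len(row)) for row in rows]


ragged = [[1], [1, 1], [1, 2, 1]]
print(pad_rows(ragged))
# prints [[1, '', ''], [1, 1, ''], [1, 2, 1]]
```

Padding with empty strings mirrors the emacs-lisp version; a numeric fill such as 0 could be passed instead if the consumer expects a purely numeric table.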
> I can't do much more than this, but I'm optimistic things will be
> sorted out before your turn to speak at the seminar rolls around.
>
> Thanks for bringing the error in the JSS article to light.
>
> All the best,
> Tom
>
I often have to explicitly convert data read into R code blocks as a
table into some other data structure like a vector or a matrix. I run
into this myself when trying to use the statistical functions of R. It
generally takes a while to look up the function to do the conversion,
but I imagine that there is a reason why people who know more R than I
do chose to make tables the default data type for data read into R
blocks.
Best,
Combining the examples above yields the following,
[-- Attachment #2: example.org --]
#+name: pascals_triangle
#+begin_src python :var n=5 :exports none :return pascals_triangle(5) :results vector
def pascals_triangle(n):
    if n == 0:
        return [[1]]
    prev_triangle = pascals_triangle(n-1)
    prev_row = prev_triangle[n-1]
    this_row = map(sum, zip([0] + prev_row, prev_row + [0]))
    return prev_triangle + [this_row]
#+end_src
#+name: padded_pascals_triangle
#+begin_src emacs-lisp :var data=pascals_triangle
(let ((max-length (apply #'max (mapcar #'length data))))
  (mapcar (lambda (row)
            (append row (make-vector (- max-length (length row)) "") nil))
          data))
#+end_src
#+begin_src R :var data=padded_pascals_triangle
data
#+end_src
#+RESULTS:
| 1 | nil | nil | nil | nil | nil |
| 1 | 1 | nil | nil | nil | nil |
| 1 | 2 | 1 | nil | nil | nil |
| 1 | 3 | 3 | 1 | nil | nil |
| 1 | 4 | 6 | 4 | 1 | nil |
| 1 | 5 | 10 | 10 | 5 | 1 |
>
>>>>>>>>
>>
>> Note that I don't really want to do rowSums in this case. I'm just trying to
>> demonstrate the error.
>>
>> Of course, it's clear that the first line does NOT contain five elements, nor
>> does the second, etc., as all of the above-diagonal elements are blanks.
>>
>> But I've been unable to find an R input function that doesn't end up treating
>> the source data as a table, i.e., in the context of Babel source blocks -- R
>> is "happy" to read a lower-diagonal structure. See the appendix for an
>> example.
>>
>> Any suggestions? Note that I'm happy to acknowledge that my own ignorance of
>> R and/or Babel might be the source of the problem. If so, please enlighten
>> me.
>>
>> Thanks.
>>
>> -- Mike
>>
>> [1] http://www.jstatsoft.org/v46/i03
>> [2] https://github.com/eschulte/babel-presentation
>>
>> <<<<<<
>> Appendix
>> --------
>>
>>
>> $ cat pascal.dat
>> 1
>> 1 1
>> 1 2 1
>> 1 3 3 1
>> 1 4 6 4 1
>>
>> $ R --vanilla < pascal.R
>>
>> R version 2.15.0 (2012-03-30)
>> Copyright (C) 2012 The R Foundation for Statistical Computing
>> ISBN 3-900051-07-0
>> Platform: x86_64-redhat-linux-gnu (64-bit)
>> .
>> .
>> .
>>
>>> x <- readLines("pascal.dat")
>>> x
>> [1] "1" "1 1" "1 2 1" "1 3 3 1" "1 4 6 4 1"
>>> str(x)
>> chr [1:5] "1" "1 1" "1 2 1" "1 3 3 1" "1 4 6 4 1"
>>>
>>> y <- scan("pascal.dat")
>> Read 15 items
>>> y
>> [1] 1 1 1 1 2 1 1 3 3 1 1 4 6 4 1
>>> str(y)
>> num [1:15] 1 1 1 1 2 1 1 3 3 1 ...
>>>
>>> z <- read.table("pascal.dat", header=FALSE)
>> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>> line 1 did not have 5 elements
>> Calls: read.table -> scan
>> Execution halted
>>
>>
--
Eric Schulte
http://cs.unm.edu/~eschulte/