emacs-orgmode@gnu.org archives
* [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
@ 2023-08-15 23:46 Jack Kamm
  2023-08-16  9:32 ` Ihor Radchenko
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Jack Kamm @ 2023-08-15 23:46 UTC (permalink / raw)
  To: emacs-orgmode; +Cc: Ihor Radchenko, Liu Hui

[-- Attachment #1: Type: text/plain, Size: 2350 bytes --]

Following up on a discussion from last month [1], I am reviving my
proposal from a couple years ago [2] to improve ob-python results
handling. Since it's a relatively large change, I am sending it to the
list for review before applying the patch.

The patch changes how ob-python handles the following types of
results:

- Dictionaries
- Numpy arrays
- Pandas dataframes and series
- Matplotlib figures

Starting with dicts: these are no longer mangled. The current behavior
(before patch) is like so:

#+begin_src python
  return {"a": 1, "b": 2}
#+end_src

#+RESULTS:
| a | : | 1 | b | : | 2 |

But after the patch they appear like so:

#+begin_src python
  return {"a": 1, "b": 2}
#+end_src

#+RESULTS:
: {'a': 1, 'b': 2}
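
The single quotes come from Python's str() of the returned dict; the
Elisp side then leaves any result text starting with "{" untouched.
A minimal Python sketch of that rule follows. The function name is
made up for illustration; the real check lives in Elisp, in
org-babel-python-table-or-string.

```python
def keep_as_string(results: str) -> bool:
    """Hypothetical stand-in for the Elisp check: text that looks like
    a dict is kept verbatim instead of being parsed into a table."""
    # startswith() is safe on an empty string, unlike indexing results[0].
    # A set literal also starts with "{", so it would be kept verbatim too.
    return results.startswith("{")


# What the block's return value looks like once written out with str().
text = str({"a": 1, "b": 2})
print(text)
print(keep_as_string(text), keep_as_string("[1, 2, 3]"))
```

Lists and tuples still go through the usual table conversion, since
their text does not start with "{".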

Next, for numpy arrays and pandas dataframes/series: these are
converted to tables, for example:

#+begin_src python
  import pandas as pd
  import numpy as np

  return pd.DataFrame(np.array([[1,2,3],[4,5,6]]),
                      columns=['a','b','c'])
#+end_src

#+RESULTS:
|   | a | b | c |
|---+---+---+---|
| 0 | 1 | 2 | 3 |
| 1 | 4 | 5 | 6 |
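
To make the row layout concrete, here is a small sketch that builds
the same nested list without needing pandas installed. The function
name and its arguments are hypothetical; the patch itself reads the
columns and rows from the DataFrame. As far as I understand, the None
entry is what ob-python turns into the horizontal rule by default.

```python
def frame_to_table(columns, index, rows):
    """Lay out tabular data in the shape used for a DataFrame result.

    The first row is the header with an empty corner cell, None marks
    the rule under the header, and each later row is the index label
    followed by that row's values.
    """
    header = [""] + list(columns)
    body = [[label] + list(row) for label, row in zip(index, rows)]
    return [header, None] + body


table = frame_to_table(["a", "b", "c"], [0, 1], [[1, 2, 3], [4, 5, 6]])
print(table)
```

The printed list is what gets written to the results file with str()
and is then parsed back into an Org table on the Elisp side.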

To avoid conversion, you can specify "raw", "verbatim", "scalar", or
"output" in the ":results" header argument.

Finally, for plots: ob-python now supports the ":results graphics"
header arg. The behavior depends on whether output or value results
are used. For output results, the current figure (pyplot.gcf()) is
cleared before evaluation, and then saved afterwards. For value
results, the block is expected to return a matplotlib Figure, which
is saved. To set the figure size, do it from within Python.

Here is an example of how to plot:

#+begin_src python :results output graphics file :file boxplot.svg
  import matplotlib.pyplot as plt
  import seaborn as sns
  plt.figure(figsize=(5, 5))
  tips = sns.load_dataset("tips")
  sns.boxplot(x="day", y="tip", data=tips)
#+end_src
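
For output results, the wrapping amounts to roughly the following.
This sketch only builds the wrapped source text, so it runs without
matplotlib. The template text is taken from the patch; the Python
names around it are made up for illustration.

```python
# Same text as the format string in the patch: clear the current
# figure, run the user's code, then save whatever was drawn.
OUTPUT_GRAPHICS_TEMPLATE = """\
import matplotlib.pyplot
matplotlib.pyplot.gcf().clear()
%s
matplotlib.pyplot.savefig('%s')"""


def wrap_for_graphics(body, graphics_file):
    """Return BODY wrapped so that its figure ends up in GRAPHICS_FILE."""
    # Assumption: the file name contains no single quote, since it is
    # pasted into a Python string literal as-is.
    return OUTPUT_GRAPHICS_TEMPLATE % (body, graphics_file)


wrapped = wrap_for_graphics("matplotlib.pyplot.plot([1, 2, 3])", "boxplot.svg")
print(wrapped)
```

For value results no wrapper is used; the formatting helper calls
savefig() on the returned Figure instead.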

Compared to the original version of this patch [2], I tried to
simplify and streamline things as much as possible, since this is a
relatively large and complex change. For example, the handling for
dict objects is much simpler now. There are also other miscellaneous
changes to the code structure, which I hope improve the clarity a
bit.

[1] https://list.orgmode.org/CAOQTW-N9rE7fDRM1APMO8X5LRZmJfn_ZjhT3rvaF4X+s5M_jZw@mail.gmail.com/
[2] https://list.orgmode.org/87eenpfe77.fsf@gmail.com/


[-- Attachment #2: 0001-ob-python-Results-handling-for-dicts-dataframes-arra.patch --]
[-- Type: text/x-patch, Size: 16691 bytes --]

From 468eeaa69660a18d8b0503e5a68c275301d6e6ae Mon Sep 17 00:00:00 2001
From: Jack Kamm <jackkamm@gmail.com>
Date: Mon, 7 Sep 2020 09:58:30 -0700
Subject: [PATCH] ob-python: Results handling for dicts, dataframes, arrays,
 plots

* lisp/ob-python.el (org-babel-execute:python): Parse graphics-file
from params, and pass it to `org-babel-python-evaluate'.
(org-babel-python-table-or-string): Prevent `org-babel-script-escape'
from mangling dict results.
(org-babel-python--def-format-value): Python code for formatting
value results before returning.
(org-babel-python-wrapper-method): Removed.  Instead use part of the
string directly in `org-babel-python-evaluate-external-process'.
(org-babel-python-pp-wrapper-method): Removed.  Pretty printing is now
handled by `org-babel-python--def-format-value'.
(org-babel-python--output-graphics-wrapper): New constant.  Python
code to save graphical output.
(org-babel-python--exec-tmpfile): Removed.  Instead use the raw string
directly in `org-babel-python-evaluate-session'.
(org-babel-python--def-format-value): New constant.  Python function
to format and save value results to file.  Includes handling for
graphics, dataframes, and arrays.
(org-babel-python-format-session-value): Updated to use
`org-babel-python--def-format-value' for formatting value result.
(org-babel-python-evaluate): New parameter graphics-file.  Pass
graphics-file onto downstream helper functions.
(org-babel-python-evaluate-external-process): New parameter
graphics-file.  Use `org-babel-python--output-graphics-wrapper' for
graphical output.  For value result, use
`org-babel-python--def-format-value'.
(org-babel-python-evaluate-session): New parameter graphics-file.  Use
`org-babel-python--output-graphics-wrapper' for graphical output.
Replace the removed constant `org-babel-python--exec-tmpfile' with the
string directly.  Rename local variable tmp-results-file to
results-file, which may take the value of graphics-file when provided.
(org-babel-python-async-evaluate-session): New parameter
graphics-file.  Use `org-babel-python--output-graphics-wrapper' for
graphical output.  Rename local variable tmp-results-file to
results-file, which may take the value of graphics-file when provided.
---
 etc/ORG-NEWS      |  19 +++++-
 lisp/ob-python.el | 164 ++++++++++++++++++++++++++++------------------
 2 files changed, 119 insertions(+), 64 deletions(-)

diff --git a/etc/ORG-NEWS b/etc/ORG-NEWS
index 11fdf2825..2630554ae 100644
--- a/etc/ORG-NEWS
+++ b/etc/ORG-NEWS
@@ -576,6 +576,21 @@ of all relational operators (~<*~, ~=*~, ~!=*~, etc.) that work like
 the regular, unstarred operators but match a headline only if the
 tested property is actually present.
 
+*** =ob-python.el=: Support for more result types and plotting
+
+=ob-python= now recognizes numpy arrays, and pandas dataframes/series,
+and will convert them to org-mode tables when appropriate.
+
+In addition, dict results are now returned in appropriate string form,
+instead of being mangled as they were previously.
+
+When the header argument =:results graphics= is set, =ob-python= will
+use matplotlib to save graphics. The behavior depends on whether value
+or output results are used. For value results, the last line should
+return a matplotlib Figure object to plot. For output results, the
+current figure (as returned by =pyplot.gcf()=) is cleared before
+evaluation, and then plotted afterwards.
+
 ** New functions and changes in function arguments
 *** =TYPES= argument in ~org-element-lineage~ can now be a symbol
 
@@ -2041,8 +2056,8 @@ to switch to the new signature.
 *** Python session return values must be top-level expression statements
 
 Python blocks with ~:session :results value~ header arguments now only
-return a value if the last line is a top-level expression statement.
-Also, when a None value is returned, "None" will be printed under
+return a value if the last line is a top-level expression statement,
+otherwise the result is None. Also, None will now show up under
 "#+RESULTS:", as it already did with ~:results value~ for non-session
 blocks.
 
diff --git a/lisp/ob-python.el b/lisp/ob-python.el
index c15d45b96..35a82afc0 100644
--- a/lisp/ob-python.el
+++ b/lisp/ob-python.el
@@ -70,6 +70,8 @@ (defun org-babel-execute:python (body params)
 	      org-babel-python-command))
 	 (session (org-babel-python-initiate-session
 		   (cdr (assq :session params))))
+	 (graphics-file (and (member "graphics" (assq :result-params params))
+			     (org-babel-graphical-output-file params)))
          (result-params (cdr (assq :result-params params)))
          (result-type (cdr (assq :result-type params)))
 	 (return-val (when (eq result-type 'value)
@@ -85,7 +87,7 @@ (defun org-babel-execute:python (body params)
 	     (format (if session "\n%s" "\nreturn %s") return-val))))
          (result (org-babel-python-evaluate
 		  session full-body result-type
-		  result-params preamble async)))
+		  result-params preamble async graphics-file)))
     (org-babel-reassemble-table
      result
      (org-babel-pick-name (cdr (assq :colname-names params))
@@ -142,7 +144,9 @@ (defun org-babel-python-table-or-string (results)
   "Convert RESULTS into an appropriate elisp value.
 If the results look like a list or tuple, then convert them into an
 Emacs-lisp table, otherwise return the results as a string."
-  (let ((res (org-babel-script-escape results)))
+  (let ((res (if (string-equal "{" (substring results 0 1))
+                 results ;don't covert dicts to elisp
+               (org-babel-script-escape results))))
     (if (listp res)
         (mapcar (lambda (el) (if (eq el 'None)
                                  org-babel-python-None-to el))
@@ -218,32 +222,51 @@ (defun org-babel-python-initiate-session (&optional session _params)
 (defvar org-babel-python-eoe-indicator "org_babel_python_eoe"
   "A string to indicate that evaluation has completed.")
 
-(defconst org-babel-python-wrapper-method
-  "
-def main():
-%s
-
-open('%s', 'w').write( str(main()) )")
-(defconst org-babel-python-pp-wrapper-method
-  "
-import pprint
-def main():
+(defconst org-babel-python--output-graphics-wrapper "\
+import matplotlib.pyplot
+matplotlib.pyplot.gcf().clear()
 %s
-
-open('%s', 'w').write( pprint.pformat(main()) )")
-
-(defconst org-babel-python--exec-tmpfile "\
-with open('%s') as __org_babel_python_tmpfile:
-    exec(compile(__org_babel_python_tmpfile.read(), __org_babel_python_tmpfile.name, 'exec'))"
-  "Template for Python session command with output results.
-
-Has a single %s escape, the tempfile containing the source code
-to evaluate.")
+matplotlib.pyplot.savefig('%s')"
+  "Format string for saving Python graphical output.
+Has two %s escapes, for the Python code to be evaluated, and the
+file to save the graphics to.")
+
+(defconst org-babel-python--def-format-value "\
+def __org_babel_python_format_value(result, result_file, result_params):
+    with open(result_file, 'w') as f:
+        if 'graphics' in result_params:
+            result.savefig(result_file)
+        elif 'pp' in result_params:
+            import pprint
+            f.write(pprint.pformat(result))
+        else:
+            if not set(result_params).intersection(\
+['scalar', 'verbatim', 'raw']):
+                try:
+                    import pandas
+                except ImportError:
+                    pass
+                else:
+                    if isinstance(result, pandas.DataFrame):
+                        result = [[''] + list(result.columns), None] + \
+[[i] + list(row) for i, row in result.iterrows()]
+                    elif isinstance(result, pandas.Series):
+                        result = list(result.items())
+                try:
+                    import numpy
+                except ImportError:
+                    pass
+                else:
+                    if isinstance(result, numpy.ndarray):
+                        result = result.tolist()
+            f.write(str(result))"
+  "Python function to format value result and save it to file.")
 
 (defun org-babel-python-format-session-value
     (src-file result-file result-params)
   "Return Python code to evaluate SRC-FILE and write result to RESULT-FILE."
-  (format "\
+  (concat org-babel-python--def-format-value
+	  (format "
 import ast
 with open('%s') as __org_babel_python_tmpfile:
     __org_babel_python_ast = ast.parse(__org_babel_python_tmpfile.read())
@@ -253,30 +276,25 @@ (defun org-babel-python-format-session-value
     exec(compile(__org_babel_python_ast, '<string>', 'exec'))
     __org_babel_python_final = eval(compile(ast.Expression(
         __org_babel_python_final.value), '<string>', 'eval'))
-    with open('%s', 'w') as __org_babel_python_tmpfile:
-        if %s:
-            import pprint
-            __org_babel_python_tmpfile.write(pprint.pformat(__org_babel_python_final))
-        else:
-            __org_babel_python_tmpfile.write(str(__org_babel_python_final))
 else:
     exec(compile(__org_babel_python_ast, '<string>', 'exec'))
-    __org_babel_python_final = None"
-	  (org-babel-process-file-name src-file 'noquote)
-	  (org-babel-process-file-name result-file 'noquote)
-	  (if (member "pp" result-params) "True" "False")))
+    __org_babel_python_final = None
+__org_babel_python_format_value(__org_babel_python_final, '%s', %s)"
+		  (org-babel-process-file-name src-file 'noquote)
+		  (org-babel-process-file-name result-file 'noquote)
+		  (org-babel-python-var-to-python result-params))))
 
 (defun org-babel-python-evaluate
-    (session body &optional result-type result-params preamble async)
+    (session body &optional result-type result-params preamble async graphics-file)
   "Evaluate BODY as Python code."
   (if session
       (if async
 	  (org-babel-python-async-evaluate-session
-	   session body result-type result-params)
+	   session body result-type result-params graphics-file)
 	(org-babel-python-evaluate-session
-	 session body result-type result-params))
+	 session body result-type result-params graphics-file))
     (org-babel-python-evaluate-external-process
-     body result-type result-params preamble)))
+     body result-type result-params preamble graphics-file)))
 
 (defun org-babel-python--shift-right (body &optional count)
   (with-temp-buffer
@@ -292,28 +310,36 @@ (defun org-babel-python--shift-right (body &optional count)
     (buffer-string)))
 
 (defun org-babel-python-evaluate-external-process
-    (body &optional result-type result-params preamble)
+    (body &optional result-type result-params preamble graphics-file)
   "Evaluate BODY in external python process.
 If RESULT-TYPE equals `output' then return standard output as a
-string.  If RESULT-TYPE equals `value' then return the value of the
-last statement in BODY, as elisp."
+string.  If RESULT-TYPE equals `value' then return the value of
+the last statement in BODY, as elisp.  If GRAPHICS-FILE is
+non-nil, then save graphical results to that file instead."
   (let ((raw
          (pcase result-type
            (`output (org-babel-eval org-babel-python-command
 				    (concat preamble (and preamble "\n")
-					    body)))
-           (`value (let ((tmp-file (org-babel-temp-file "python-")))
+                                            (if graphics-file
+                                                (format org-babel-python--output-graphics-wrapper
+                                                        body graphics-file)
+                                              body))))
+           (`value (let ((results-file (or graphics-file
+				           (org-babel-temp-file "python-"))))
 		     (org-babel-eval
 		      org-babel-python-command
 		      (concat
 		       preamble (and preamble "\n")
 		       (format
-			(if (member "pp" result-params)
-			    org-babel-python-pp-wrapper-method
-			  org-babel-python-wrapper-method)
-			(org-babel-python--shift-right body)
-			(org-babel-process-file-name tmp-file 'noquote))))
-		     (org-babel-eval-read-file tmp-file))))))
+			(concat org-babel-python--def-format-value "
+def main():
+%s
+
+__org_babel_python_format_value(main(), '%s', %s)")
+                        (org-babel-python--shift-right body)
+			(org-babel-process-file-name results-file 'noquote)
+			(org-babel-python-var-to-python result-params))))
+		     (org-babel-eval-read-file results-file))))))
     (org-babel-result-cond result-params
       raw
       (org-babel-python-table-or-string (org-trim raw)))))
@@ -347,28 +373,36 @@ (defun org-babel-python-send-string (session body)
       (org-babel-chomp (substring string-buffer 0 (match-beginning 0))))))
 
 (defun org-babel-python-evaluate-session
-    (session body &optional result-type result-params)
+    (session body &optional result-type result-params graphics-file)
   "Pass BODY to the Python process in SESSION.
 If RESULT-TYPE equals `output' then return standard output as a
-string.  If RESULT-TYPE equals `value' then return the value of the
-last statement in BODY, as elisp."
+string.  If RESULT-TYPE equals `value' then return the value of
+the last statement in BODY, as elisp.  If GRAPHICS-FILE is
+non-nil, then save graphical results to that file instead."
   (let* ((tmp-src-file (org-babel-temp-file "python-"))
          (results
 	  (progn
-	    (with-temp-file tmp-src-file (insert body))
+	    (with-temp-file tmp-src-file
+              (insert (if (and graphics-file (eq result-type 'output))
+                          (format org-babel-python--output-graphics-wrapper
+                                  body graphics-file)
+                        body)))
             (pcase result-type
 	      (`output
-	       (let ((body (format org-babel-python--exec-tmpfile
+	       (let ((body (format "\
+with open('%s') as f:
+    exec(compile(f.read(), f.name, 'exec'))"
 				   (org-babel-process-file-name
 				    tmp-src-file 'noquote))))
 		 (org-babel-python-send-string session body)))
               (`value
-               (let* ((tmp-results-file (org-babel-temp-file "python-"))
+               (let* ((results-file (or graphics-file
+					(org-babel-temp-file "python-")))
 		      (body (org-babel-python-format-session-value
-			     tmp-src-file tmp-results-file result-params)))
+			     tmp-src-file results-file result-params)))
 		 (org-babel-python-send-string session body)
 		 (sleep-for 0 10)
-		 (org-babel-eval-read-file tmp-results-file)))))))
+		 (org-babel-eval-read-file results-file)))))))
     (org-babel-result-cond result-params
       results
       (org-babel-python-table-or-string results))))
@@ -392,7 +426,7 @@ (defun org-babel-python-async-value-callback (params tmp-file)
       (org-babel-python-table-or-string results))))
 
 (defun org-babel-python-async-evaluate-session
-    (session body &optional result-type result-params)
+    (session body &optional result-type result-params graphics-file)
   "Asynchronously evaluate BODY in SESSION.
 Returns a placeholder string for insertion, to later be replaced
 by `org-babel-comint-async-filter'."
@@ -406,7 +440,10 @@ (defun org-babel-python-async-evaluate-session
        (with-temp-buffer
          (insert (format org-babel-python-async-indicator "start" uuid))
          (insert "\n")
-         (insert body)
+         (insert (if graphics-file
+                     (format org-babel-python--output-graphics-wrapper
+                             body graphics-file)
+                   body))
          (insert "\n")
          (insert (format org-babel-python-async-indicator "end" uuid))
          (let ((python-shell-buffer-name
@@ -414,17 +451,20 @@ (defun org-babel-python-async-evaluate-session
            (python-shell-send-buffer)))
        uuid))
     (`value
-     (let ((tmp-results-file (org-babel-temp-file "python-"))
+     (let ((results-file (or graphics-file
+			     (org-babel-temp-file "python-")))
            (tmp-src-file (org-babel-temp-file "python-")))
        (with-temp-file tmp-src-file (insert body))
        (with-temp-buffer
-         (insert (org-babel-python-format-session-value tmp-src-file tmp-results-file result-params))
+         (insert (org-babel-python-format-session-value
+                  tmp-src-file results-file result-params))
          (insert "\n")
-         (insert (format org-babel-python-async-indicator "file" tmp-results-file))
+         (unless graphics-file
+           (insert (format org-babel-python-async-indicator "file" results-file)))
          (let ((python-shell-buffer-name
                 (org-babel-python-without-earmuffs session)))
            (python-shell-send-buffer)))
-       tmp-results-file))))
+       results-file))))
 
 (provide 'ob-python)
 
-- 
2.41.0



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-15 23:46 [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots Jack Kamm
@ 2023-08-16  9:32 ` Ihor Radchenko
  2023-08-17  4:04   ` Jack Kamm
  2023-08-17  5:35 ` Liu Hui
  2023-08-17 11:57 ` Ihor Radchenko
  2 siblings, 1 reply; 26+ messages in thread
From: Ihor Radchenko @ 2023-08-16  9:32 UTC (permalink / raw)
  To: Jack Kamm; +Cc: emacs-orgmode, Liu Hui

Jack Kamm <jackkamm@gmail.com> writes:

> Starting with dicts: these are no longer mangled. The current behavior
> (before patch) is like so:
>
> #+begin_src python
>   return {"a": 1, "b": 2}
> #+end_src
>
> #+RESULTS:
> | a | : | 1 | b | : | 2 |
>
> But after the patch they appear like so:
>
> #+begin_src python
>   return {"a": 1, "b": 2}
> #+end_src
>
> #+RESULTS:
> : {'a': 1, 'b': 2}

What about 

 #+begin_src python :results table
   return {"a": 1, "b": 2}
 #+end_src

 #+RESULTS:
 | a | 1 |
 | b | 2 |

or 

 #+begin_src python :results list
   return {"a": 1, "b": 2}
 #+end_src

 #+RESULTS:
 - a :: 1
 - b :: 2

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-16  9:32 ` Ihor Radchenko
@ 2023-08-17  4:04   ` Jack Kamm
  2023-08-17  9:14     ` gerard.vermeulen
  2023-08-17 12:07     ` Ihor Radchenko
  0 siblings, 2 replies; 26+ messages in thread
From: Jack Kamm @ 2023-08-17  4:04 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode, Liu Hui

[-- Attachment #1: Type: text/plain, Size: 961 bytes --]

Ihor Radchenko <yantar92@posteo.net> writes:

> What about 
>
>  #+begin_src python :results table
>    return {"a": 1, "b": 2}
>  #+end_src
>
>  #+RESULTS:
>  | a | 1 |
>  | b | 2 |

I attach a 2nd patch implementing this. It also makes ":results table"
the default return type for dict. (Use ":results verbatim" to get the
dict as a string instead).
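
As a standalone illustration of the conversion, here is the helper
from the attached patch run outside of Org (only the isinstance check
is written slightly more compactly):

```python
def dict2table(res):
    """Recursively turn dicts into lists of (key, value) pairs."""
    if isinstance(res, dict):
        return [(k, dict2table(v)) for k, v in res.items()]
    elif isinstance(res, (list, tuple)):
        # Containers are walked too, so nested dicts are converted.
        return [dict2table(x) for x in res]
    else:
        return res


flat = dict2table({"a": 1, "b": 2})
nested = dict2table({"a": {"x": 1}, "b": [2, {"y": 3}]})
print(flat)
print(nested)
```

Each (key, value) pair becomes one table row. Note that tuples come
back as lists, and nested dicts become nested lists of pairs.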

I am also putting a branch with these changes here:
https://github.com/jackkamm/org-mode/tree/python-results-revisited-2023

>
> or 
>
>  #+begin_src python :results list
>    return {"a": 1, "b": 2}
>  #+end_src
>
>  #+RESULTS:
>  - a :: 1
>  - b :: 2

This seems harder, and may require more widespread changes beyond
ob-python. In particular, I think we'd need to change
`org-babel-insert-result' so that it can call `org-list-to-org' with a
list of type "descriptive" instead of "unordered" here:

https://git.sr.ht/~bzg/org-mode/tree/cc435cba71a99ee7b12676be3b6e1211a9cb7285/item/lisp/ob-core.el#L2535


[-- Attachment #2: 0002-ob-python-Convert-dicts-to-tables.patch --]
[-- Type: text/x-patch, Size: 2598 bytes --]

From c24d2eeb3b8613df9b9c23583a4b26a6c0934931 Mon Sep 17 00:00:00 2001
From: Jack Kamm <jackkamm@gmail.com>
Date: Wed, 16 Aug 2023 20:27:10 -0700
Subject: [PATCH 2/2] ob-python: Convert dicts to tables

This commit to be squashed with its parent before applying
---
 etc/ORG-NEWS      |  8 +++-----
 lisp/ob-python.el | 12 +++++++++---
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/etc/ORG-NEWS b/etc/ORG-NEWS
index 2630554ae..509011737 100644
--- a/etc/ORG-NEWS
+++ b/etc/ORG-NEWS
@@ -578,11 +578,9 @@ tested property is actually present.
 
 *** =ob-python.el=: Support for more result types and plotting
 
-=ob-python= now recognizes numpy arrays, and pandas dataframes/series,
-and will convert them to org-mode tables when appropriate.
-
-In addition, dict results are now returned in appropriate string form,
-instead of being mangled as they were previously.
+=ob-python= now recognizes dictionaries, numpy arrays, and pandas
+dataframes/series, and will convert them to org-mode tables when
+appropriate.
 
 When the header argument =:results graphics= is set, =ob-python= will
 use matplotlib to save graphics. The behavior depends on whether value
diff --git a/lisp/ob-python.el b/lisp/ob-python.el
index 35a82afc0..3d987da2f 100644
--- a/lisp/ob-python.el
+++ b/lisp/ob-python.el
@@ -144,9 +144,7 @@ (defun org-babel-python-table-or-string (results)
   "Convert RESULTS into an appropriate elisp value.
 If the results look like a list or tuple, then convert them into an
 Emacs-lisp table, otherwise return the results as a string."
-  (let ((res (if (string-equal "{" (substring results 0 1))
-                 results ;don't covert dicts to elisp
-               (org-babel-script-escape results))))
+  (let ((res (org-babel-script-escape results)))
     (if (listp res)
         (mapcar (lambda (el) (if (eq el 'None)
                                  org-babel-python-None-to el))
@@ -242,6 +240,14 @@ (defconst org-babel-python--def-format-value "\
         else:
             if not set(result_params).intersection(\
 ['scalar', 'verbatim', 'raw']):
+                def dict2table(res):
+                    if isinstance(res, dict):
+                        return [(k, dict2table(v)) for k, v in res.items()]
+                    elif isinstance(res, list) or isinstance(res, tuple):
+                        return [dict2table(x) for x in res]
+                    else:
+                        return res
+                result = dict2table(result)
                 try:
                     import pandas
                 except ImportError:
-- 
2.41.0



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-15 23:46 [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots Jack Kamm
  2023-08-16  9:32 ` Ihor Radchenko
@ 2023-08-17  5:35 ` Liu Hui
  2023-08-18 23:09   ` Jack Kamm
  2023-08-17 11:57 ` Ihor Radchenko
  2 siblings, 1 reply; 26+ messages in thread
From: Liu Hui @ 2023-08-17  5:35 UTC (permalink / raw)
  To: Jack Kamm; +Cc: emacs-orgmode, Ihor Radchenko

Hi,

Thank you for the patch!

> Next, for numpy arrays and pandas dataframes/series: these are
> converted to tables, for example:
>
> #+begin_src python
>   import pandas as pd
>   import numpy as np
>
>   return pd.DataFrame(np.array([[1,2,3],[4,5,6]]),
>                       columns=['a','b','c'])
> #+end_src
>
> #+RESULTS:
> |   | a | b | c |
> |---+---+---+---|
> | 0 | 1 | 2 | 3 |
> | 1 | 4 | 5 | 6 |
>
> To avoid conversion, you can specify "raw", "verbatim", "scalar", or
> "output" in the ":results" header argument.

Do we need to limit the table/list size by default, or convert them
only with a relevant result type (e.g. `table/list')? Dataframes and
arrays are often large. The following results were previously
truncated by default, and the truncation can be tweaked via
np.set_printoptions and pd.set_option.

#+begin_src python
import numpy as np
return np.random.randint(10, size=(30,40))
#+end_src

#+begin_src python
import numpy as np
return np.random.rand(20,3,4,5)
#+end_src

#+begin_src python
import pandas as pd
import numpy as np

d = {'col1': np.random.rand(100), 'col2': np.random.rand(100)}
return pd.DataFrame(d)
#+end_src
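
One possible shape for such a limit, as a rough sketch only: nothing
like this is in the patch, and the function name, the default of 10
rows and the "..." marker are all assumptions made for illustration.

```python
def truncate_rows(table, max_rows=10, marker="..."):
    """Keep the first and last rows of TABLE when it exceeds MAX_ROWS.

    TABLE is assumed to be a list of equal-length row lists.  A single
    marker row is placed where rows were dropped.
    """
    if max_rows < 2 or len(table) <= max_rows:
        return list(table)
    head = max_rows // 2
    tail = max_rows - head
    width = len(table[0])
    return list(table[:head]) + [[marker] * width] + list(table[-tail:])


rows = [[i, i * i] for i in range(100)]
short = truncate_rows(rows, max_rows=6)
print(short)
```

This mirrors what the numpy and pandas reprs do for large objects,
but applied to the nested list before it is written out.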

> +def __org_babel_python_format_value(result, result_file, result_params):
> +    with open(result_file, 'w') as f:
> +        if 'graphics' in result_params:
> +            result.savefig(result_file)
> +        elif 'pp' in result_params:
> +            import pprint
> +            f.write(pprint.pformat(result))
> +        else:
> +            if not set(result_params).intersection(\
> +['scalar', 'verbatim', 'raw']):
> +                try:
> +                    import pandas
> +                except ImportError:
> +                    pass
> +                else:
> +                    if isinstance(result, pandas.DataFrame):
> +                        result = [[''] + list(result.columns), None] + \

Here we can use '{}'.format(df.index.name) to show the name of the index.
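
One detail to watch: an unnamed index has name None, and formatting
None gives the text "None" rather than an empty cell. A small sketch
of a guard (the helper name is hypothetical):

```python
def index_header(index_name):
    """Return the corner cell text for a table header row."""
    # Keep the corner cell empty when the index has no name.
    return "" if index_name is None else "{}".format(index_name)


print(repr("{}".format(None)))
print(repr(index_header(None)), repr(index_header("day")))
```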

>  (defun org-babel-python-format-session-value
>      (src-file result-file result-params)
>    "Return Python code to evaluate SRC-FILE and write result to RESULT-FILE."
> -  (format "\
> +  (concat org-babel-python--def-format-value
> +      (format "

Maybe `org-babel-python--def-format-value' can be evaluated only once
in session mode? It would shorten the string sent to the python
shell, where temp files are used for long strings.
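
The bookkeeping for that could look something like the sketch below,
which models the session as a plain dict. The helper body is a
stand-in and the Python names are hypothetical; where the check would
actually live (Elisp or Python) is left open.

```python
HELPER_NAME = "__org_babel_python_format_value"

# Stand-in for the real helper source; only the name matters here.
HELPER_SOURCE = (
    "def __org_babel_python_format_value(result, result_file, result_params):\n"
    "    pass\n"
)


def session_prelude(session_namespace):
    """Return the code to send before a block.

    The helper definition is sent the first time only; afterwards the
    prelude is empty because the session already has the function.
    """
    if HELPER_NAME in session_namespace:
        return ""
    return HELPER_SOURCE


session = {}
first = session_prelude(session)
exec(first, session)          # first block: helper gets defined
second = session_prelude(session)
print(len(first), len(second))
```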



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-17  4:04   ` Jack Kamm
@ 2023-08-17  9:14     ` gerard.vermeulen
  2023-08-17 12:10       ` Ihor Radchenko
  2023-08-18 23:30       ` Jack Kamm
  2023-08-17 12:07     ` Ihor Radchenko
  1 sibling, 2 replies; 26+ messages in thread
From: gerard.vermeulen @ 2023-08-17  9:14 UTC (permalink / raw)
  To: Jack Kamm
  Cc: Ihor Radchenko, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net



On 17.08.2023 06:04, Jack Kamm wrote:

> I attach a 2nd patch implementing this. It also makes ":results table"
> the default return type for dict. (Use ":results verbatim" to get the
> dict as a string instead).
> 
> I am also putting a branch with these changes here:
> https://github.com/jackkamm/org-mode/tree/python-results-revisited-2023
> 
Happy to see that ob-python gets so much love!

Your patches allow anyone to change org-babel-python--def-format-value.
For instance, I want to use black to "pretty-print" certain tree-like
structures and I have now in my init.el:

(with-eval-after-load 'ob-python
   (setq org-babel-python--def-format-value "\
def __org_babel_python_format_value(result, result_file, result_params):
     with open(result_file, 'w') as f:
         if 'graphics' in result_params:
             result.savefig(result_file)
         elif 'pp' in result_params:
             import black
             f.write(black.format_str(repr(result), mode=black.Mode()))
         else:
             if not set(result_params).intersection(\
['scalar', 'verbatim', 'raw']):
                 try:
                     import pandas
                 except ImportError:
                     pass
                 else:
                     if isinstance(result, pandas.DataFrame):
                         result = [[''] + list(result.columns), None] + \
[[i] + list(row) for i, row in result.iterrows()]
                     elif isinstance(result, pandas.Series):
                         result = list(result.items())
                 try:
                     import numpy
                 except ImportError:
                     pass
                 else:
                     if isinstance(result, numpy.ndarray):
                         result = result.tolist()
             f.write(str(result))"))

Without your patches I use advice to override
org-babel-python-format-session-value, which is worse IMO.

This also allows anyone to format for instance AstroPy tables
(https://docs.astropy.org/en/stable/table/).

I do not know how much this "abuse" of defconst is frowned
upon (elisp manual says defconst is advisory), but maybe it
can be advertised as a feature.

Best regards -- Gerard




* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-15 23:46 [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots Jack Kamm
  2023-08-16  9:32 ` Ihor Radchenko
  2023-08-17  5:35 ` Liu Hui
@ 2023-08-17 11:57 ` Ihor Radchenko
  2023-08-18 23:18   ` Jack Kamm
  2 siblings, 1 reply; 26+ messages in thread
From: Ihor Radchenko @ 2023-08-17 11:57 UTC (permalink / raw)
  To: Jack Kamm; +Cc: emacs-orgmode, Liu Hui

Jack Kamm <jackkamm@gmail.com> writes:

> Following up on a discussion from last month [1], I am reviving my
> proposal from a couple years ago [2] to improve ob-python results
> handling. Since it's a relatively large change, I am sending it to the
> list for review before applying the patch.

Some comments on the patch itself.

> @@ -2041,8 +2056,8 @@ to switch to the new signature.
>  *** Python session return values must be top-level expression statements
>  
>  Python blocks with ~:session :results value~ header arguments now only
> -return a value if the last line is a top-level expression statement.
> -Also, when a None value is returned, "None" will be printed under
> +return a value if the last line is a top-level expression statement,
> +otherwise the result is None. Also, None will now show up under
>  "#+RESULTS:", as it already did with ~:results value~ for non-session
>  blocks.

This is an ORG-NEWS entry for Version 9.4. Is it an intentional change?
  
> @@ -142,7 +144,9 @@ (defun org-babel-python-table-or-string (results)
>    "Convert RESULTS into an appropriate elisp value.
>  If the results look like a list or tuple, then convert them into an
>  Emacs-lisp table, otherwise return the results as a string."
> -  (let ((res (org-babel-script-escape results)))
> +  (let ((res (if (string-equal "{" (substring results 0 1))
> +                 results ;don't convert dicts to elisp
> +               (org-babel-script-escape results))))

You may also need to update the docstring for
`org-babel-python-table-or-string' after this change.

> -					    body)))
> -           (`value (let ((tmp-file (org-babel-temp-file "python-")))
> +                                            (if graphics-file
> +                                                (format org-babel-python--output-graphics-wrapper
> +                                                        body graphics-file)
> +                                              body))))
> +           (`value (let ((results-file (or graphics-file
> +				           (org-babel-temp-file "python-"))))

What about :results graphics file ?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-17  4:04   ` Jack Kamm
  2023-08-17  9:14     ` gerard.vermeulen
@ 2023-08-17 12:07     ` Ihor Radchenko
  2023-08-18 22:49       ` Jack Kamm
  1 sibling, 1 reply; 26+ messages in thread
From: Ihor Radchenko @ 2023-08-17 12:07 UTC (permalink / raw)
  To: Jack Kamm; +Cc: emacs-orgmode, Liu Hui

Jack Kamm <jackkamm@gmail.com> writes:

> I attach a 2nd patch implementing this. It also makes ":results table"
> the default return type for dict. (Use ":results verbatim" to get the
> dict as a string instead).

Thanks!

>>  #+begin_src python :results list
>>    return {"a": 1, "b": 2}
>>  #+end_src
>>
>>  #+RESULTS:
>>  - a :: 1
>>  - b :: 2
>
> This seems harder, and may require more widespread changes beyond
> ob-python. In particular, I think we'd need to change
> `org-babel-insert-result' so that it can call `org-list-to-org' with a
> list of type "descriptive" instead of "unordered" here:
>
> https://git.sr.ht/~bzg/org-mode/tree/cc435cba71a99ee7b12676be3b6e1211a9cb7285/item/lisp/ob-core.el#L2535

Actually, (org-list-to-org '(unordered ("a :: b") ("c :: d")))
will just work.

We do not support nested lists when transforming output anyway. So,
unordered/descriptive does not matter in practice.
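
On the Python side, this could be as small as the following sketch. The
helper name `dict_to_list_items` is hypothetical and not part of
ob-python; it only shows each dict entry becoming a plain "key :: value"
string, which could then be passed through as an ordinary unordered list.

```python
def dict_to_list_items(mapping):
    """Return one "key :: value" string per dict entry, in dict order."""
    return [f"{key} :: {value}" for key, value in mapping.items()]


# Hypothetical usage: the strings are what a ":results list" block would
# hand back to Org for insertion as list items.
items = dict_to_list_items({"a": 1, "b": 2})
print(items)
```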


* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-17  9:14     ` gerard.vermeulen
@ 2023-08-17 12:10       ` Ihor Radchenko
  2023-08-18  4:37         ` gerard.vermeulen
  2023-08-18 23:30       ` Jack Kamm
  1 sibling, 1 reply; 26+ messages in thread
From: Ihor Radchenko @ 2023-08-17 12:10 UTC (permalink / raw)
  To: gerard.vermeulen
  Cc: Jack Kamm, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net

gerard.vermeulen@posteo.net writes:

> Your patches allow anyone to change org-babel-python--def-format-value.
> For instance, I want to use black to "pretty-print" certain tree-like
> structures

Could you simply add extra code to transform the output as needed?


* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-17 12:10       ` Ihor Radchenko
@ 2023-08-18  4:37         ` gerard.vermeulen
  2023-08-18  6:01           ` gerard.vermeulen
  0 siblings, 1 reply; 26+ messages in thread
From: gerard.vermeulen @ 2023-08-18  4:37 UTC (permalink / raw)
  To: Ihor Radchenko
  Cc: Jack Kamm, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net



On 17.08.2023 14:10, Ihor Radchenko wrote:
> gerard.vermeulen@posteo.net writes:
> 
>> Your patches allow anyone to change 
>> org-babel-python--def-format-value.
>> For instance, I want to use black to "pretty-print" certain tree-like
>> structures
> 
> Could you simply add extra code to transform the output as needed?

Yes, it is a way to switch between Jack's first and second set of
patches, if one would like.  Or to add code to transform other Python
data structures.





* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-18  4:37         ` gerard.vermeulen
@ 2023-08-18  6:01           ` gerard.vermeulen
  0 siblings, 0 replies; 26+ messages in thread
From: gerard.vermeulen @ 2023-08-18  6:01 UTC (permalink / raw)
  To: Ihor Radchenko
  Cc: Jack Kamm, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net



On 18.08.2023 06:37, gerard.vermeulen@posteo.net wrote:
> On 17.08.2023 14:10, Ihor Radchenko wrote:
>> gerard.vermeulen@posteo.net writes:
>> 
>>> Your patches allow anyone to change 
>>> org-babel-python--def-format-value.
>>> For instance, I want to use black to "pretty-print" certain tree-like
>>> structures
>> 
>> Could you simply add extra code to transform the output as needed?
> 
> Yes, it is a way to switch between Jack's first and second set of
> patches, if one would like.  Or to add code to transform other Python
> data structures.

I take back the switching between Jack's first and second set of
patches, but I stand by "to add code to transform other Python data
structures".




* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-17 12:07     ` Ihor Radchenko
@ 2023-08-18 22:49       ` Jack Kamm
  0 siblings, 0 replies; 26+ messages in thread
From: Jack Kamm @ 2023-08-18 22:49 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode, Liu Hui

Ihor Radchenko <yantar92@posteo.net> writes:

>>>  #+begin_src python :results list
>>>    return {"a": 1, "b": 2}
>>>  #+end_src
>>>
>>>  #+RESULTS:
>>>  - a :: 1
>>>  - b :: 2
>>
>> This seems harder, and may require more widespread changes beyond
>> ob-python. In particular, I think we'd need to change
>> `org-babel-insert-result' so that it can call `org-list-to-org' with a
>> list of type "descriptive" instead of "unordered" here:
>
> Actually, (org-list-to-org '(unordered ("a :: b") ("c :: d")))
> will just work.
>
> We do not support nested lists when transforming output anyway. So,
> unordered/descriptive does not matter in practice.

You're right, thanks for the suggestion.

I've added it now to
https://github.com/jackkamm/org-mode/tree/python-results-revisited-2023

More specifically, here:
https://github.com/jackkamm/org-mode/commit/0440caa3326b867a3a15d5f92a6f99cbf94c14d5



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-17  5:35 ` Liu Hui
@ 2023-08-18 23:09   ` Jack Kamm
  2023-08-20 12:13     ` Liu Hui
  0 siblings, 1 reply; 26+ messages in thread
From: Jack Kamm @ 2023-08-18 23:09 UTC (permalink / raw)
  To: Liu Hui; +Cc: emacs-orgmode, Ihor Radchenko

Liu Hui <liuhui1610@gmail.com> writes:

> Hi,
>
> Thank you for the patch!

Thanks for your feedback, I've incorporated it into
https://github.com/jackkamm/org-mode/tree/python-results-revisited-2023

More specifically, here:
https://github.com/jackkamm/org-mode/commit/af1d18314073446045395ff7a3d1de0303e92586

> Do we need to limit the table/list size by default, or handle them
> only with relevant result type (e.g. `table/list')? Dataframe/array
> are often large.

I've updated the patch so that DataFrames and arrays are converted to
tables only when ":results table" is explicitly set. Otherwise, they
are returned as strings by default.

So code blocks that return large dataframes/arrays can continue to be
safely run.

Note that I also changed the default behavior for NumPy arrays:
previously, they were returned as a table, but got mangled when they
were very large, e.g.:

  #+begin_src python
  import numpy as np
  return np.zeros((30,40))
  #+end_src
  
  #+RESULTS:
  | (0 0 0 ... 0 0 0) | (0 0 0 ... 0 0 0) | (0 0 0 ... 0 0 0) | ... | (0 0 0 ... 0 0 0) | (0 0 0 ... 0 0 0) | (0 0 0 ... 0 0 0) |

But now, a NumPy array is returned in string form by default, in the
same format as in Jupyter:

  #+begin_src python
  import numpy as np
  return np.zeros((30,40))
  #+end_src
  
  #+RESULTS:
  : array([[0., 0., 0., ..., 0., 0., 0.],
  :        [0., 0., 0., ..., 0., 0., 0.],
  :        [0., 0., 0., ..., 0., 0., 0.],
  :        ...,
  :        [0., 0., 0., ..., 0., 0., 0.],
  :        [0., 0., 0., ..., 0., 0., 0.],
  :        [0., 0., 0., ..., 0., 0., 0.]])


>> +                    if isinstance(result, pandas.DataFrame):
>> +                        result = [[''] + list(result.columns), None] + \
>
> Here we can use '{}'.format(df.index.name) to show the name of index

Patch has been updated to print the index name when it is non-None.
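
As a rough illustration of that table layout (a sketch only, not the code
in the patch), the helper below builds the same list-of-lists shape from
plain Python data, putting the index name in the top-left cell when it is
not None. The name `table_with_index` and its parameters are hypothetical.
The None entry after the header mirrors the quoted patch line, where it
appears to become the horizontal rule of the Org table.

```python
def table_with_index(columns, index, rows, index_name=None):
    """Build a list-of-lists table: header row, None marker, body rows."""
    # Top-left cell: empty unless the index has a name.
    corner = "" if index_name is None else str(index_name)
    header = [corner] + list(columns)
    # Prefix every row with its index label.
    body = [[label] + list(row) for label, row in zip(index, rows)]
    return [header, None] + body


# With a DataFrame `df`, one might call (an assumption, not the patch itself):
#   table_with_index(df.columns, df.index, df.values.tolist(), df.index.name)
table = table_with_index(["a", "b", "c"], [0, 1], [[1, 2, 3], [4, 5, 6]], "id")
print(table)
```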

> Maybe `org-babel-python--def-format-value' can be evaluated only once
> in the session mode? It would shorten the string sent to the python
> shell, where temp files are used for long strings.

Patch has been updated to evaluate `org-babel-python--def-format-value'
once per session.



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-17 11:57 ` Ihor Radchenko
@ 2023-08-18 23:18   ` Jack Kamm
  2023-08-19  8:54     ` Ihor Radchenko
  0 siblings, 1 reply; 26+ messages in thread
From: Jack Kamm @ 2023-08-18 23:18 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: emacs-orgmode, Liu Hui

Ihor Radchenko <yantar92@posteo.net> writes:

> This is an ORG-NEWS entry for Version 9.4. Is it an intentional change?

Sorry, that was an accident. I've reverted it now:
https://github.com/jackkamm/org-mode/commit/f12a695d67bc5c06013d9fbe0af844c9739e347a

>> @@ -142,7 +144,9 @@ (defun org-babel-python-table-or-string (results)
>>    "Convert RESULTS into an appropriate elisp value.
>>  If the results look like a list or tuple, then convert them into an
>>  Emacs-lisp table, otherwise return the results as a string."
>> -  (let ((res (org-babel-script-escape results)))
>> +  (let ((res (if (string-equal "{" (substring results 0 1))
>> +                 results ;don't convert dicts to elisp
>> +               (org-babel-script-escape results))))
>
> You may also need to update the docstring for
> `org-babel-python-table-or-string' after this change.

That change was reverted in a subsequent update, when I changed dicts
to be returned as a table by default instead of a string. So there's no
need to update the docstring anymore.

>> -					    body)))
>> -           (`value (let ((tmp-file (org-babel-temp-file "python-")))
>> +                                            (if graphics-file
>> +                                                (format org-babel-python--output-graphics-wrapper
>> +                                                        body graphics-file)
>> +                                              body))))
>> +           (`value (let ((results-file (or graphics-file
>> +				           (org-babel-temp-file "python-"))))
>
> What about :results graphics file ?

Not entirely sure what you mean here.

With ":results graphics file", graphics-file will be non-nil --
org-babel-execute:python passes graphics-file on to
org-babel-python-evaluate and then to
org-babel-python-evaluate-external-process. In the case of ":results
graphics file output", org-babel-python--output-graphics-wrapper is used
to save pyplot.gcf(). With ":results graphics file value",
org-babel-python--def-format-value saves the result with
Figure.savefig().



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-17  9:14     ` gerard.vermeulen
  2023-08-17 12:10       ` Ihor Radchenko
@ 2023-08-18 23:30       ` Jack Kamm
  2023-08-19  8:50         ` Ihor Radchenko
  2023-08-19  8:58         ` Ihor Radchenko
  1 sibling, 2 replies; 26+ messages in thread
From: Jack Kamm @ 2023-08-18 23:30 UTC (permalink / raw)
  To: gerard.vermeulen
  Cc: Ihor Radchenko, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net

gerard.vermeulen@posteo.net writes:

> I do not know how much this "abuse" of defconst is frowned
> upon (elisp manual says defconst is advisory), but maybe it
> can be advertised as a feature.

org-babel-python--def-format-value is a "private" variable (it has
double dash "--" in its name).  Therefore it's not generally recommended
to modify it.

Of course, elisp doesn't have true private variables or functions, and
you are free to change things as you wish -- this is one of the perks of
Emacs :) But you've been warned, since this is a private variable, we
make no guarantees, and may break things in backward-incompatible ways
in the future.

As to the broader point, I agree there are many more features that would
be nice to add to ob-python results handling. But making ob-python too
complex will make it difficult to maintain, especially since the Python
code is all in quoted strings without proper linting.

So I am thinking now about how we could make this more extensible in
future. One idea is to create a Python package for interfacing with Org
Babel, and release it on PyPI. If we detect the package is installed,
then we can delegate to it for results formatting. And the community
could contribute results handling for all sorts of Python objects to
that package.

That is just one idea for improving extensibility -- I'm not sure it's
the best, and am open to other suggestions as well.
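
To make the package idea concrete, here is a purely hypothetical sketch of
what such a formatter registry might look like. No such package exists in
this thread, and the names `register_formatter` and `format_result` are
invented for illustration. Formatters are matched by type, in registration
order, with `str()` as the fallback.

```python
_FORMATTERS = []  # (type, function) pairs, checked in registration order


def register_formatter(cls, func):
    """Register FUNC to format results that are instances of CLS."""
    _FORMATTERS.append((cls, func))


def format_result(obj):
    """Return OBJ converted by the first matching formatter, else str(OBJ)."""
    for cls, func in _FORMATTERS:
        if isinstance(obj, cls):
            return func(obj)
    return str(obj)


# Example registrations; third-party code could add its own types.
register_formatter(dict, lambda d: [[k, v] for k, v in d.items()])
register_formatter(set, sorted)

print(format_result({"a": 1, "b": 2}))
print(format_result({3, 1, 2}))
print(format_result(42))
```

The design choice here is that ob-python itself would stay small, while
handling for new object types would live in the registry.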



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-18 23:30       ` Jack Kamm
@ 2023-08-19  8:50         ` Ihor Radchenko
  2023-08-20 18:01           ` Jack Kamm
  2023-08-19  8:58         ` Ihor Radchenko
  1 sibling, 1 reply; 26+ messages in thread
From: Ihor Radchenko @ 2023-08-19  8:50 UTC (permalink / raw)
  To: Jack Kamm
  Cc: gerard.vermeulen, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net

Jack Kamm <jackkamm@gmail.com> writes:

> So I am thinking now about how we could make this more extensible in
> future. One idea is to create a Python package for interfacing with Org
> Babel, and release it on PyPI. If we detect the package is installed,
> then we can delegate to it for results formatting. And the community
> could contribute results handling for all sorts of Python objects to
> that package.
>
> That is just one idea for improving extensibility -- I'm not sure it's
> the best, and am open to other suggestions as well.

Similar to the existing LaTeX formatters, one may write a Python package
that will pretty-print Org markup as text. Not just for Org babel - it
might be useful in general.

And we do not need to support such formatters explicitly in ob-python.
Users can simply arrange to call the formatters in their code blocks by
the usual means.


* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-18 23:18   ` Jack Kamm
@ 2023-08-19  8:54     ` Ihor Radchenko
  0 siblings, 0 replies; 26+ messages in thread
From: Ihor Radchenko @ 2023-08-19  8:54 UTC (permalink / raw)
  To: Jack Kamm; +Cc: emacs-orgmode, Liu Hui

Jack Kamm <jackkamm@gmail.com> writes:

>> What about :results graphics file ?
>
> Not entirely sure what you mean here.

Never mind. I was mixing the meaning of header args in my mind after all
the previous discussions.


* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-18 23:30       ` Jack Kamm
  2023-08-19  8:50         ` Ihor Radchenko
@ 2023-08-19  8:58         ` Ihor Radchenko
  2023-08-20 18:13           ` Jack Kamm
  1 sibling, 1 reply; 26+ messages in thread
From: Ihor Radchenko @ 2023-08-19  8:58 UTC (permalink / raw)
  To: Jack Kamm
  Cc: gerard.vermeulen, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net

Jack Kamm <jackkamm@gmail.com> writes:

> As to the broader point, I agree there are many more features that would
> be nice to add to ob-python results handling. But making ob-python too
> complex will make it difficult to maintain, especially since the Python
> code is all in quoted strings without proper linting.

We might add the code into a separate proper python file. Then, we can
use the contents of that file to retrieve the variable value.

We already do the same thing for CSL style files and odt schema/style.


* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-18 23:09   ` Jack Kamm
@ 2023-08-20 12:13     ` Liu Hui
  2023-08-20 18:31       ` Jack Kamm
  0 siblings, 1 reply; 26+ messages in thread
From: Liu Hui @ 2023-08-20 12:13 UTC (permalink / raw)
  To: Jack Kamm; +Cc: emacs-orgmode, Ihor Radchenko

> > Here we can use '{}'.format(df.index.name) to show the name of index
>
> Patch has been updated to print the index name when it is non-None.

Thanks! It would be nice to also support MultiIndex names using
`result.index.names', e.g.

#+begin_src python :results table
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
    "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
    "C": np.random.randn(8),
    "D": np.random.randn(8)})
return df.groupby(["A", "B"]).agg('sum').round(3)
#+end_src

Another problem is the display of objects like datetime, e.g.

#+begin_src python :results table
import pandas as pd
s = pd.Series(range(3), index=pd.date_range("2000", freq="D", periods=3))
return s.to_frame()
#+end_src

#+RESULTS:
|           | 0                             |   |
|-----------+-------------------------------+---|
| Timestamp | (2000-01-01 00:00:00 freq= D) | 0 |
| Timestamp | (2000-01-02 00:00:00 freq= D) | 1 |
| Timestamp | (2000-01-03 00:00:00 freq= D) | 2 |

#+begin_src python
from pathlib import Path
import numpy as np

return {'a': 1, 'path': Path('/'), 'array': np.zeros(3)}
#+end_src

#+RESULTS:
| a     | 1         |           |
| path  | PosixPath | (/)       |
| array | array     | ((0 0 0)) |

I think these objects need to be shown in a single column rather than
two. Besides, if the Python code eventually becomes too complex, I think
maintaining it outside ob-python.el, as suggested by Ihor, is a good
idea.
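
Until then, one possible stopgap inside a code block is to turn such
objects into plain strings before returning them. The sketch below uses
hypothetical helpers (`cell_to_text`, `dict_to_rows`) and standard-library
types only. Whether the resulting strings survive Org's conversion as a
single cell in every case is an assumption that has not been verified
here.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Values of these types are passed through unchanged.
SIMPLE_TYPES = (str, int, float, bool, type(None))


def cell_to_text(value):
    """Keep simple scalars; replace any other object by its str() form."""
    return value if isinstance(value, SIMPLE_TYPES) else str(value)


def dict_to_rows(mapping):
    """Return one [key, cell] row per dict entry."""
    return [[key, cell_to_text(value)] for key, value in mapping.items()]


rows = dict_to_rows(
    {"a": 1, "path": PurePosixPath("/"), "when": datetime(2000, 1, 1)}
)
print(rows)
```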



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-19  8:50         ` Ihor Radchenko
@ 2023-08-20 18:01           ` Jack Kamm
  2023-08-20 18:21             ` Ihor Radchenko
  0 siblings, 1 reply; 26+ messages in thread
From: Jack Kamm @ 2023-08-20 18:01 UTC (permalink / raw)
  To: Ihor Radchenko
  Cc: gerard.vermeulen, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net

Ihor Radchenko <yantar92@posteo.net> writes:

> Similar to the existing LaTeX formatters, one may write a Python package
> that will pretty-print Org markup as text.

This sounds interesting -- are these LaTeX formatters external to Org?
Could you provide a link/reference?



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-19  8:58         ` Ihor Radchenko
@ 2023-08-20 18:13           ` Jack Kamm
  2023-08-20 18:25             ` Ihor Radchenko
  0 siblings, 1 reply; 26+ messages in thread
From: Jack Kamm @ 2023-08-20 18:13 UTC (permalink / raw)
  To: Ihor Radchenko
  Cc: gerard.vermeulen, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net

Ihor Radchenko <yantar92@posteo.net> writes:

> We might add the code into a separate proper python file. Then, we can
> use the contents of that file to retrieve the variable value.
>
> We already do the same thing for CSL style files and odt schema/style.

Thanks, I think this is a good idea, and will make the python code
easier to maintain.

And thanks also for the pointer to oc-csl and ox-odt -- I think I should
be able to implement this by following their example.

It seems like there will be an extra logistical step, to make sure the
extra python file is added to emacs as well. I'm not familiar with the
details of how we sync Org into Emacs, but will start to look into it.

In the meantime, I'm planning to squash and apply my patch as is. Then
afterwards, I can start working on a followup patch to move some Python
code into a separate file (and coordinate with emacs-devel if
necessary).



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-20 18:01           ` Jack Kamm
@ 2023-08-20 18:21             ` Ihor Radchenko
  0 siblings, 0 replies; 26+ messages in thread
From: Ihor Radchenko @ 2023-08-20 18:21 UTC (permalink / raw)
  To: Jack Kamm
  Cc: gerard.vermeulen, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net

Jack Kamm <jackkamm@gmail.com> writes:

> Ihor Radchenko <yantar92@posteo.net> writes:
>
>> Similar to the existing LaTeX formatters, one may write a Python package
>> that will pretty-print Org markup as text.
>
> This sounds interesting -- are these LaTeX formatters external to Org?
> Could you provide a link/reference?

https://docs.sympy.org/latest/tutorials/intro-tutorial/printing.html
https://jeltef.github.io/PyLaTeX/current/examples/basic.html


* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-20 18:13           ` Jack Kamm
@ 2023-08-20 18:25             ` Ihor Radchenko
  2023-08-22 23:37               ` Jack Kamm
  0 siblings, 1 reply; 26+ messages in thread
From: Ihor Radchenko @ 2023-08-20 18:25 UTC (permalink / raw)
  To: Jack Kamm
  Cc: gerard.vermeulen, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net

Jack Kamm <jackkamm@gmail.com> writes:

> In the meantime, I'm planning to squash and apply my patch as is. Then
> afterwards, I can start working on a followup patch to move some Python
> code into a separate file (and coordinate with emacs-devel if
> necessary).

+1
Don't forget to update
https://orgmode.org/worg/org-contrib/babel/languages/ob-doc-python.html
(note how the docs already have an example of org formatting from python)


* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-20 12:13     ` Liu Hui
@ 2023-08-20 18:31       ` Jack Kamm
  2023-08-21  6:21         ` Liu Hui
  2023-08-22 23:44         ` Jack Kamm
  0 siblings, 2 replies; 26+ messages in thread
From: Jack Kamm @ 2023-08-20 18:31 UTC (permalink / raw)
  To: Liu Hui; +Cc: emacs-orgmode, Ihor Radchenko

Liu Hui <liuhui1610@gmail.com> writes:

> I think these objects need to be shown in a single column rather than
> two. Besides, if the Python code eventually becomes too complex, I think
> maintaining it outside ob-python.el, as suggested by Ihor, is a good
> idea.

Thanks for reporting these misbehaving examples. I think the root of the
problem is `org-babel-script-escape', which is too aggressive in
recursively converting strings to lists. We may need to rewrite our own
implementation for ob-python.

Also, I agree that moving the python code to an external file will be
helpful in handling these more complex cases.

I may leave these tasks for future patches. In the meantime, we may have
to recommend ":results verbatim" for these more complex cases that
":results table" doesn't fully handle yet.



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-20 18:31       ` Jack Kamm
@ 2023-08-21  6:21         ` Liu Hui
  2023-08-22 23:44         ` Jack Kamm
  1 sibling, 0 replies; 26+ messages in thread
From: Liu Hui @ 2023-08-21  6:21 UTC (permalink / raw)
  To: Jack Kamm; +Cc: emacs-orgmode, Ihor Radchenko

> Thanks for reporting these misbehaving examples. I think the root of the
> problem is `org-babel-script-escape', which is too aggressive in
> recursively converting strings to lists. We may need to rewrite our own
> implementation for ob-python.
>
> Also, I agree that moving the python code to an external file will be
> helpful in handling these more complex cases.
>
> I may leave these tasks for future patches. In the meantime, we may have
> to recommend ":results verbatim" for these more complex cases that
> ":results table" doesn't fully handle yet.

Understood. Thanks again for your work!



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-20 18:25             ` Ihor Radchenko
@ 2023-08-22 23:37               ` Jack Kamm
  0 siblings, 0 replies; 26+ messages in thread
From: Jack Kamm @ 2023-08-22 23:37 UTC (permalink / raw)
  To: Ihor Radchenko
  Cc: gerard.vermeulen, emacs-orgmode, Liu Hui,
	emacs-orgmode-bounces+gerard.vermeulen=posteo.net

Ihor Radchenko <yantar92@posteo.net> writes:

> +1
> Don't forget to update
> https://orgmode.org/worg/org-contrib/babel/languages/ob-doc-python.html
> (note how the docs already have an example of org formatting from python)

Thanks! Done now:
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=579e8c572345c42ad581d3ddf0f484567d55a787

And updated Worg as well:
https://git.sr.ht/~bzg/worg/commit/7c7d352be72271ae73f31ddffa0f48d225b34259



* Re: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
  2023-08-20 18:31       ` Jack Kamm
  2023-08-21  6:21         ` Liu Hui
@ 2023-08-22 23:44         ` Jack Kamm
  1 sibling, 0 replies; 26+ messages in thread
From: Jack Kamm @ 2023-08-22 23:44 UTC (permalink / raw)
  To: Liu Hui; +Cc: emacs-orgmode, Ihor Radchenko

Jack Kamm <jackkamm@gmail.com> writes:

> Liu Hui <liuhui1610@gmail.com> writes:
>
>> I think these objects need to be shown in a single column rather than
>> two. Besides, if the Python code eventually becomes too complex, I think
>> maintaining it outside ob-python.el, as suggested by Ihor, is a good
>> idea.
>
> Thanks for reporting these misbehaving examples. I think the root of the
> problem is `org-babel-script-escape', which is too aggressive in
> recursively converting strings to lists. We may need to rewrite our own
> implementation for ob-python.
>
> Also, I agree that moving the python code to an external file will be
> helpful in handling these more complex cases.
>
> I may leave these tasks for future patches. In the meantime, we may have
> to recommend ":results verbatim" for these more complex cases that
> ":results table" doesn't fully handle yet.

Pushed the patch now, with one final change: I decided to leave dicts as
strings by default, converting to a table only when ":results table" is
explicitly set. I think it's better this way for now, because of the
misbehaving examples you pointed out -- table conversion is not yet
fully robust for complex dicts containing complicated objects or
structures.
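
The resulting default can be pictured with the sketch below. It is not the
code that was pushed: `format_dict_result` and the `result_params` list of
header-argument strings are hypothetical names used only to show the
decision between the string form and the table form.

```python
def format_dict_result(result, result_params):
    """Return table rows for a dict only if "table" was requested."""
    if isinstance(result, dict) and "table" in result_params:
        # One [key, value] row per entry.
        return [[key, value] for key, value in result.items()]
    # Default: leave the value in its plain string form.
    return str(result)


print(format_dict_result({"a": 1, "b": 2}, ["value"]))
print(format_dict_result({"a": 1, "b": 2}, ["value", "table"]))
```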

