From: "Sebastien Vauban" <wxhgmqzgwmuf-geNee64TY+gS+FvcfC7Uqw@public.gmane.org>
To: emacs-orgmode-mXXj517/zsQ@public.gmane.org
Subject: Re: org babel support for tcl and awk
Date: Wed, 25 May 2011 14:30:01 +0200
Message-ID: <80aaeb2cae.fsf@somewhere.org>
In-Reply-To: 87lixvd5ei.fsf@gmail.com
Hi Eric,
Eric Schulte wrote:
> "Sebastien Vauban" <wxhgmqzgwmuf-geNee64TY+gS+FvcfC7Uqw@public.gmane.org> writes:
>> Eric Schulte wrote:
>>> Eric S Fraga <e.fraga-hclig2XLE9Zaa/9Udqfwiw@public.gmane.org> writes:
>>>
>>> I've made a quick change so that
>>> any variable named "stdin" is treated specially, in that, rather than
>>> using its value to replace strings of $stdin in the text of the awk code,
>>> the value of the stdin variable is saved into the file processed by awk.
>>> This allows awk to operate over Org-mode references.
>>>
>>> If babel code block supported a pipe or an actual stdin header argument,
>>> that would be the ideal way to add this behavior, but currently nothing of
>>> that nature exists.
>>>
>>> Please let me know if this misses part of your suggestion, or more
>>> generally what else may be advisable before we add this to the core.
>>
>> Could this be implemented for sh as well?
>>
>> AFAI understand, this is exactly the missing piece for me to be able to:
>
> Unfortunately this simple hack for ob-awk does not address the need you link
> to below -- which I am aware of and which is on my list of larger
> longer-term Babel development items. I think that a future piping
> implementation will be the ultimate solution to the issues you address.
Glad to hear you understand my wish. It's not always easy to express myself
clearly, as English is not my mother tongue, especially when trying to tackle
difficult subjects.
> Such an implementation -- allowing data to flow between concurrently
> executing blocks utilizing posix pipes -- will require more sophisticated
> processes interaction and possibly some form of multi-threaded elisp
> execution.
Just for the sake of clarity, I don't need concurrent or multi-threaded
execution of any kind.
My goal is twofold:
1. to cut a shell script into small parts, and explain what each part does,
with a runnable example (=C-c C-v C-e=).
2. to tangle the executable script out of the Babel document, by concatenating
all its parts (=C-c C-v C-t=).
A quite "dumb" example follows. I've made it as _minimal_ and as _complete_ as
possible, to be able to _express my point_, for further reference.
* Abstract
This script "americanizes" a European CSV file.
* Sample data
The following is a sample CSV file:
#+results: sample-csv
#+begin_example
Date;Amount;Account
28-05-2010;-6.806,25;999-1974050-30
04-06-2009;420,00;999-1500974-23
24-02-2009;-54,93;999-1974050-30
#+end_example
* Script
What the script must do is:
** Load the data
Read the raw contents of the input file.
#+srcname: load-data
#+begin_src sh :var data=sample-csv :results output :exports both
echo "$data"
#+end_src
#+results: load-data
#+begin_example
Date;Amount;Account
28-05-2010;-6.806,25;999-1974050-30
04-06-2009;420,00;999-1500974-23
24-02-2009;-54,93;999-1974050-30
#+end_example
** Convert the date to American format
Convert the date to =MM/DD/YYYY= format. The second =sed= is meant for dates
written as =DD/MM/YY=; it is anchored on the following =;= so that four-digit
years are left alone.
#+srcname: convert-date
#+begin_src sh :var data=load-data :results output :exports both
echo "$data" |\
sed -r 's/^([[:digit:]]{2})-([[:digit:]]{2})-([[:digit:]]{4})/\2\/\1\/\3/g' |\
sed -r 's/^([[:digit:]]{2})\/([[:digit:]]{2})\/([[:digit:]]{2});/\2\/\1\/20\3;/g'
#+end_src
#+results: convert-date
#+begin_example
Date;Amount;Account
05/28/2010;-6.806,25;999-1974050-30
06/04/2009;420,00;999-1500974-23
02/24/2009;-54,93;999-1974050-30
#+end_example
** Convert the separators
Apply the following operations in order to "americanize" the CSV file received
from the bank:
- remove the dot used as thousands separator (=.= -> ==)
- replace the comma used as decimal separator with a dot (=,= -> =.=)
- replace any other comma with a dot (=,= -> =.=)
- replace the semicolon used as field separator with a comma (=;= -> =,=)
#+srcname: convert-separators
#+begin_src sh :var data=convert-date :results output :exports both
echo "$data" |\
sed -r 's/([[:digit:]])\.([[:digit:]]{3})/\1\2/g' |\
sed -r 's/([[:digit:]]),([[:digit:]]{2})/\1.\2/g' |\
sed -r 's/,/./g' |\
sed -r 's/;/,/g'
#+end_src
#+results: convert-separators
#+begin_example
Date,Amount,Account
05/28/2010,-6806.25,999-1974050-30
06/04/2009,420.00,999-1500974-23
02/24/2009,-54.93,999-1974050-30
#+end_example
* Full code
The script is then:
#+begin_src sh :tangle americanize-csv.sh :noweb yes
#!/bin/bash
# americanize-csv.sh -- Convert CSV file to American format
# Usage: americanize-csv FILE.CSV
cat "$1" |\
<<convert-date>> |\
<<convert-separators>>
exit 0
# americanize-csv.sh ends here
#+end_src
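As you can see, the tangled script no longer works, because I've been forced
to start every individual code block with an =echo "$data"= command. In the
stand-alone script, =$data= is not set, and =echo= ignores whatever is piped
into it, so the contents of the input file never reach =sed=.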
#+begin_src sh
#!/bin/bash
# americanize-csv.sh -- Convert CSV file to American format
# Usage: americanize-csv FILE.CSV
cat "$1" |\
echo "$data" |\
sed -r 's/^([[:digit:]]{2})-([[:digit:]]{2})-([[:digit:]]{4})/\2\/\1\/\3/g' |\
sed -r 's/^([[:digit:]]{2})\/([[:digit:]]{2})\/([[:digit:]]{2});/\2\/\1\/20\3;/g' |\
echo "$data" |\
sed -r 's/([[:digit:]])\.([[:digit:]]{3})/\1\2/g' |\
sed -r 's/([[:digit:]]),([[:digit:]]{2})/\1.\2/g' |\
sed -r 's/,/./g' |\
sed -r 's/;/,/g'
exit 0
# americanize-csv.sh ends here
#+end_src
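To make the failure concrete, here is a minimal sketch of what happens when
=echo= sits in the middle of a pipeline. The variable names =piped= and =lost=
are only there for illustration; they are not part of the script.

```shell
# echo never reads its standard input, so whatever the previous
# pipeline stage wrote is silently discarded.
piped=$(printf 'from the pipe\n' | echo "from the argument")
echo "$piped"

# With $data unset, as in the tangled script, echo emits only an
# empty line and the CSV header coming through the pipe is lost.
unset data
lost=$(printf 'Date;Amount;Account\n' | echo "$data")
echo "[$lost]"
```

The first =echo= prints only its own argument, and the second prints an empty
pair of brackets.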
If I could play with =stdin=, I could "hide" that first line, and assume that
all the code I'm writing is executed against whatever is read on =stdin=, in
the Org buffer as well as in the stand-alone shell script. Right?
#+begin_src sh
#!/bin/bash
# americanize-csv.sh -- Convert CSV file to American format
# Usage: americanize-csv FILE.CSV
cat "$1" |\
sed -r 's/^([[:digit:]]{2})-([[:digit:]]{2})-([[:digit:]]{4})/\2\/\1\/\3/g' |\
sed -r 's/^([[:digit:]]{2})\/([[:digit:]]{2})\/([[:digit:]]{2});/\2\/\1\/20\3;/g' |\
sed -r 's/([[:digit:]])\.([[:digit:]]{3})/\1\2/g' |\
sed -r 's/([[:digit:]]),([[:digit:]]{2})/\1.\2/g' |\
sed -r 's/,/./g' |\
sed -r 's/;/,/g'
exit 0
# americanize-csv.sh ends here
#+end_src
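Outside of Babel, the same idea can be sketched with plain shell functions
that each read =stdin= and write =stdout=, so that every part can be tried on
its own and the parts still chain into one pipeline. This is only an
illustration under my own assumptions: the function names =convert_date= and
=convert_separators= and the =sample= variable are hypothetical, only the
four-digit-year date rule is kept, and I use =sed -E=, which GNU sed accepts
as a synonym of =-r=.

```shell
# Each stage reads stdin and writes stdout; nothing depends on $data.
convert_date() {
    # DD-MM-YYYY at the start of the line becomes MM/DD/YYYY.
    sed -E 's/^([[:digit:]]{2})-([[:digit:]]{2})-([[:digit:]]{4})/\2\/\1\/\3/'
}

convert_separators() {
    # Drop the thousands dot, turn decimal (and other) commas into dots,
    # then turn the field-separating semicolons into commas.
    sed -E 's/([[:digit:]])\.([[:digit:]]{3})/\1\2/g' |
    sed -E 's/([[:digit:]]),([[:digit:]]{2})/\1.\2/g' |
    sed -E 's/,/./g' |
    sed -E 's/;/,/g'
}

# Two lines taken from the sample CSV above.
sample='Date;Amount;Account
28-05-2010;-6.806,25;999-1974050-30'

# A single stage can be run alone, or all stages can be chained.
result=$(printf '%s\n' "$sample" | convert_date | convert_separators)
echo "$result"
```

Replacing the =printf= with =cat "$1"= would give the stand-alone script,
without any =echo "$data"= in between.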
* Conclusions
As you can see, I did not really mean any concurrent execution. I simply want
to be able to execute parts of the code in situ, in the Org buffer, to document
(and test) what I'm writing, and then to assemble all the parts into one single
script file, by means of literate programming.
Best regards,
Seb
--
Sébastien Vauban