unofficial mirror of guile-user@gnu.org 
* salutations and web scraping
@ 2011-12-30 22:58 Catonano
  2012-01-10 21:46 ` Andy Wingo
  0 siblings, 1 reply; 16+ messages in thread
From: Catonano @ 2011-12-30 22:58 UTC (permalink / raw)
  To: guile-user


Hello people,

Happy New Year.

I'm a beginner: I have never written a single line of Lisp or Scheme in my
life, and I'm here to ask for directions and suggestions.

I'm mulling over a pet project. I would like to scrape the web site of a
community radio station and grab the Flash-streamed content they publish.
The material is published under a Creative Commons license, so what I'm
planning is not illegal.

The reason they chose such an obtuse solution is that they are obtuse. They
started the station in the 70s and now they don't get this new digital
thing.

I read the web stuff. The client chapter suggests adopting an architecture
similar to that of the server for parallel scrapers, and closes by floating
the idea of threads and futures.

I don't see how I could use threads or futures (I'm not even sure what they
are), and I'm bold enough to ask you to write an example code skeleton for
me.
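
A minimal sketch of what futures look like in Guile, assuming the `(ice-9 futures)` module is available. `slow-square` is a hypothetical stand-in for a slow job such as fetching one page; it is not part of any library.

```scheme
(use-modules (ice-9 futures))  ; future, touch

;; Hypothetical stand-in for a slow job such as fetching one page.
(define (slow-square n)
  (usleep 100000)              ; pretend to wait on the network
  (* n n))

;; Each `future` hands its expression to Guile's pool of worker threads,
;; so the two computations may run at the same time.
(define a (future (slow-square 3)))
(define b (future (slow-square 4)))

;; `touch` waits for a future to finish and returns its value.
(display (+ (touch a) (touch b)))
(newline)
```

A future is a promise to compute a value in the background; `touch` is the point where the program actually needs the value and is willing to wait for it.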

Also, I was thinking of writing a scraper in Guile Scheme. The scraper
would parse the HTML source for the relevant bits and then delegate the
Flash stuff to a Unix command: wget, curl or something similar.
Is this reasonable? Is there any architectural glitch I'm missing here?
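
The architecture described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the regular expression assumes the media files appear as plain `href="....flv"` or `href="....mp3"` links, which may not match the real site, and `fetch-page`, `extract-media-links` and `download` are hypothetical names. A regular expression is used for brevity; it is not a real HTML parser.

```scheme
(use-modules (web client)      ; http-get
             (web uri)         ; string->uri
             (ice-9 receive)   ; receive
             (ice-9 regex))    ; list-matches, match:substring

;; Assumed link shape: href="...something.flv" or href="...something.mp3".
(define media-link-rx
  (make-regexp "href=\"([^\"]+\\.(flv|mp3))\""
               regexp/extended regexp/icase))

(define (fetch-page url)
  "Return the body of the page at URL.  The body is a string for textual
content types; other content types may come back as a bytevector."
  (receive (response body) (http-get (string->uri url))
    body))

(define (extract-media-links html)
  "Return the list of media URLs found in the string HTML."
  (map (lambda (m) (match:substring m 1))
       (list-matches media-link-rx html)))

(define (download url)
  "Hand URL to the external wget command; return #t if wget succeeded."
  (zero? (status:exit-val (system* "wget" "--continue" url))))
```

With these in place, `(for-each download (extract-media-links (fetch-page some-url)))` would fetch one page and pass each link it finds to wget, one at a time.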

Don't worry, people: I know that the server setup and its internet
connection are not very strong, and I don't want to be hostile to the
server, so I guess at most 2 parallel connections are going to run.
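
One way to cap the work at two parallel connections, sketched with plain threads from `(ice-9 threads)`: deal the jobs into two lists and give each list to its own worker, so no more than two jobs ever run at once. `split-in-two` and `for-each-with-two-workers` are hypothetical helper names, and the procedure passed in would be whatever does the actual download.

```scheme
(use-modules (ice-9 threads)   ; call-with-new-thread, join-thread
             (ice-9 receive))  ; receive

(define (split-in-two lst)
  "Deal the elements of LST alternately into two lists; return both."
  (let loop ((rest lst) (i 0) (evens '()) (odds '()))
    (cond ((null? rest)
           (values (reverse evens) (reverse odds)))
          ((even? i)
           (loop (cdr rest) (+ i 1) (cons (car rest) evens) odds))
          (else
           (loop (cdr rest) (+ i 1) evens (cons (car rest) odds))))))

(define (for-each-with-two-workers proc items)
  "Apply PROC to every element of ITEMS using exactly two worker threads.
Each worker walks its own half sequentially, so at most two calls to PROC
are in progress at any moment."
  (receive (evens odds) (split-in-two items)
    (let ((t1 (call-with-new-thread (lambda () (for-each proc evens))))
          (t2 (call-with-new-thread (lambda () (for-each proc odds)))))
      ;; Wait for both workers before returning.
      (join-thread t1)
      (join-thread t2))))
```

The split is static, so if one half happens to contain all the large files the other worker finishes early and sits idle; a shared queue would balance better, at the cost of some locking.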

Or, I was dreaming that I could try to integrate the thing with the GNOME
environment and make it available from the GNOME Shell JavaScript, so the
people in the community could use it to grab the footage themselves. I
don't know.

Thanks so much for ANY hint
Cato



Thread overview: 16+ messages
2011-12-30 22:58 salutations and web scraping Catonano
2012-01-10 21:46 ` Andy Wingo
2012-01-16 20:06   ` Catonano
2012-01-24 12:47   ` Catonano
2012-01-24 13:07     ` Andy Wingo
2012-01-24 14:17       ` Catonano
2012-01-25  1:41         ` Catonano
2012-01-25  3:56           ` Daniel Hartwig
2012-01-25  4:57             ` Catonano
2012-01-25  9:07             ` Andy Wingo
2012-01-25 17:23               ` Catonano
2012-01-27 12:18                 ` Catonano
2013-01-07 22:23                   ` Andy Wingo
2013-01-30 13:48                     ` Catonano
2012-01-25  8:57           ` Andy Wingo
2012-01-29 14:23             ` Catonano
