salutations and web scraping

unofficial mirror of guile-user@gnu.org 

* salutations and web scraping
@ 2011-12-30 22:58 Catonano
  2012-01-10 21:46 ` Andy Wingo
  0 siblings, 1 reply; 16+ messages in thread
From: Catonano @ 2011-12-30 22:58 UTC (permalink / raw)
  To: guile-user


Hello people,

Happy New Year.

I'm a beginner: I have never written a single line of Lisp or Scheme in my
life, and I'm here to ask for directions and suggestions.

I'm mulling over a pet project. I would like to scrape the web site of a
community radio station and grab the Flash-streamed content they
publish. The material is published under a Creative Commons license, so
what I'm planning is not illegal.

They chose such an obtuse solution because they are obtuse. They started
the station in the 70s and now they don't get this new digital thing.

I read the web stuff. The client chapter suggests adopting an architecture
similar to that of the server for parallel scrapers, and closes by floating
the idea of threads and futures.

I don't see how I could use threads or futures (I'm not even sure what they
are), and my boldness is such that I'd ask you to write some example
skeleton code for me.

Also, I was thinking of writing a scraper in Guile Scheme; the scraper
would parse the HTML source for the relevant bits and then delegate
the Flash stuff to a Unix command: wget, curl or something similar.
Is this reasonable? Is there any architectural glitch I'm missing here?
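
A minimal sketch of that delegation step, assuming wget is installed and on
the PATH. The names download-command and download are made up for
illustration; system* and status:exit-val are Guile built-ins, and -O is
wget's option for naming the output file.

```scheme
;; Build the argument list for wget.  Passing the arguments as a list
;; to system* (rather than one shell string) avoids shell quoting
;; problems with odd characters in URLs.
(define (download-command url output-file)
  (list "wget" "-O" output-file url))

;; Run wget and return #t only if it exited with status 0.
(define (download url output-file)
  (let ((status (apply system* (download-command url output-file))))
    (eqv? 0 (status:exit-val status))))
```

The scraper would then call something like (download url "show.flv") for
each URL it extracts from the page.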

Don't worry, people: I know that their server setup and internet
connection are not so strong, and I don't want to be hostile to the server,
so I guess a maximum of 2 parallel connections will run.

Or, I was dreaming I could try to integrate the thing with the GNOME
environment and make it available from the GNOME Shell JavaScript, so the
people in the community could use it to grab the footage themselves. I
don't know.

Thanks so much for ANY hint
Cato


* Re: salutations and web scraping
  2011-12-30 22:58 salutations and web scraping Catonano
@ 2012-01-10 21:46 ` Andy Wingo
  2012-01-16 20:06   ` Catonano
  2012-01-24 12:47   ` Catonano
  0 siblings, 2 replies; 16+ messages in thread
From: Andy Wingo @ 2012-01-10 21:46 UTC (permalink / raw)
  To: Catonano; +Cc: guile-user

Hi Catonano,

On Fri 30 Dec 2011 23:58, Catonano <catonano@gmail.com> writes:

> I'm a beginner, I never wrote a single line of LISP or Scheme in my life
> and I'm here for asking for directions and suggestions.

Welcome! :-)

> I'm mumbling about a pet project. I would like to scrape the web site of
> a community radio station and grab the flash streamed content they
> publish. The license the material is published under is Creative Commons
> so what I'm planning is not illegal.

Sounds like fun.

> my boldness is such that I'd ask you to write for me an example
> skeleton code.

Hey, it's fair, I think; that is a new part of Guile, and there is not a
lot of example code.

Generally, we figure out how to solve problems at the REPL, so fire up
your Guile:

  $ guile
  ...
  scheme@(guile-user)> 

(Here I'm assuming you have guile 2.0.3.)

Use the web modules.  Let's assume we're grabbing
http://www.gnu.org/software/guile/, for simplicity:

  > (use-modules (web client) (web uri))
  > (http-get (string->uri "http://www.gnu.org/software/guile/"))
  [here the text of the web page gets printed out]

Actually there are two return values: the response object, corresponding
to the headers, and the body.  If you scroll your terminal up, you'll
see that they get labels like $1 and $2.
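
Outside the REPL there are no $1 and $2 labels, so a script has to bind both
values explicitly. A minimal sketch using receive from (ice-9 receive);
fake-http-get is a made-up stand-in so the snippet runs without a network,
and fetch-body is a hypothetical helper name.

```scheme
(use-modules (ice-9 receive))

;; Hypothetical stand-in for http-get: like the real procedure, it
;; returns two values, a response and a body.
(define (fake-http-get uri)
  (values (list 'response-for uri) "<html><title>hi</title></html>"))

;; Call GETTER on URI, bind both return values, and keep only the body.
(define (fetch-body getter uri)
  (receive (response body) (getter uri)
    body))
```

With the real client this would be called as
(fetch-body http-get (string->uri "http://www.gnu.org/software/guile/"))
after loading (web client) and (web uri).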

Now you need to parse the HTML.  The best way to do this is with the
pragmatic HTML parser, htmlprag.  It's part of guile-lib.  So download
and install guile-lib (it's at http://www.nongnu.org/guile-lib/), and
then, assuming the html is in $2:

  > (use-modules (htmlprag))
  > (define the-web-page (html->sxml $2))

That parses the web page to s-expressions.  You can print the result
nicely:

  > ,pretty-print the-web-page

Now you need to get something out of the web page.  The hackiest way to
do it is just to match against the entire page.  Maybe someone else can
come up with an example, but I'm short on time, so I'll proceed to The
Right Thing -- the problem is that whitespace is significant, and maybe
all you want is the contents of "the <title> in the <head> in the
<html>."

So in XML you'd use XPATH.  In SXML you'd use SXPATH.  It's hard to use
right now; we really need to steal
http://www.neilvandyke.org/webscraperhelper/ from Neil van Dyke.  But
you can see from his docs that the thing would be

  > (use-modules (sxml xpath))
  > (define matcher (sxpath '(// html head title)))
  > (matcher the-web-page)
  $3 = ((title "GNU Guile (About Guile)"))

Et voila.
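
The same approach reaches attributes, which is usually what a scraper is
after. A minimal sketch, assuming only the (sxml xpath) module used above;
sample-page is a hand-written SXML tree standing in for a parsed page, and
link-targets is a hypothetical name.

```scheme
(use-modules (sxml xpath))

;; A tiny hand-written SXML tree standing in for a parsed page.
(define sample-page
  '(html (head (title "Example"))
         (body (a (@ (href "http://example.org/one.flv")) "one")
               (a (@ (href "http://example.org/two.flv")) "two"))))

;; In SXML, attributes live under an (@ ...) child, so the path walks
;; a -> @ -> href and then selects the attribute's text.
(define link-targets (sxpath '(// a @ href *text*)))

(link-targets sample-page)
```

The result is a list of the href strings found anywhere in the tree.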

I don't do much web scraping these days, but I know some others do.  So
if those others would chime in with better ways to do things, that is
very welcome.
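
For the threads-and-futures part of the question, a minimal sketch, assuming
the (ice-9 futures) module that ships with Guile 2.0. map-in-batches is a
made-up helper name: it works through a list at most batch-size items at a
time, which is one simple way to honor a two-connection limit. How many
futures really run in parallel depends on Guile's worker pool.

```scheme
(use-modules (ice-9 futures) (srfi srfi-1))

;; Apply PROC to every element of ITEMS, starting at most BATCH-SIZE
;; futures at a time and waiting for each batch before the next one.
;; Results come back in the same order as ITEMS.
(define (map-in-batches proc items batch-size)
  (let loop ((items items) (results '()))
    (if (null? items)
        (reverse results)
        (let* ((n (min batch-size (length items)))
               (batch (take items n))
               ;; Each future may run PROC in a worker thread.
               (futures (map (lambda (item) (future (proc item))) batch)))
          (loop (drop items n)
                ;; touch blocks until the future's value is ready.
                (append (reverse (map touch futures)) results))))))
```

A scraper could call (map-in-batches fetch-one list-of-uris 2), where
fetch-one is whatever procedure downloads a single page.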

Happy hacking,

Andy
-- 
http://wingolog.org/


* Re: salutations and web scraping
  2012-01-10 21:46 ` Andy Wingo
@ 2012-01-16 20:06   ` Catonano
  2012-01-24 12:47   ` Catonano
  1 sibling, 0 replies; 16+ messages in thread
From: Catonano @ 2012-01-16 20:06 UTC (permalink / raw)
  To: guile-user


Andy,

On 10 January 2012 22:46, Andy Wingo <wingo@pobox.com> wrote:

> Hi Catonano,
>
> On Fri 30 Dec 2011 23:58, Catonano <catonano@gmail.com> writes:
>
> > I'm a beginner, I never wrote a single line of LISP or Scheme in my life
> > and I'm here for asking for directions and suggestions.
>
> Welcome! :-)
>

thank you so much for your reply. I had been eagerly waiting for a signal
from the list and I missed it! I'm sorry.

The Gmail learning mechanism still hasn't learned enough about my interest
in this issue, so it didn't promptly report your reply. I had to
dig through the folder structure I had laid out in order to discover it.
As for me, I haven't learned enough about the woes of the Gmail learning
mechanism. I guess we're both learning, now.

Well, I was attempting a joke ;-)



> > my boldness is such that I'd ask you to write for me an example
> > skeleton code.
>
>
> Hey, it's fair, I think; that is a new part of Guile, and there is not a
> lot of example code.
>
>
Thanks, Andy, I'm grateful for this. Actually I managed to set up Geiser,
load a file and get delivered to a prompt in which that file is loaded.
Cool ;-) But there were still some things I didn't know that your post made
clear.


> Generally, we figure out how to solve problems at the REPL, so fire up
> your Guile:
>
>  $ guile
>  ...
>  scheme@(guile-user)>
>
> (Here I'm assuming you have guile 2.0.3.)
>


> Use the web modules.  Let's assume we're grabbing
> http://www.gnu.org/software/guile/, for simplicity:
>
>  > (use-modules (web client) (web uri))
>  > (http-get (string->uri "http://www.gnu.org/software/guile/"))
>  [here the text of the web page gets printed out]
>

OK, I had managed to get this far (thanks to the help I received in the
Guile channel on IRC)

>
> Actually there are two return values: the response object, corresponding
> to the headers, and the body.  If you scroll your terminal up, you'll
> see that they get labels like $1 and $2.
>

I didn't know there were 2 values, thanks

>
> Now you need to parse the HTML.  The best way to do this is with the
> pragmatic HTML parser, htmlprag.  It's part of guile-lib.  So download
> and install guile-lib (it's at http://www.nongnu.org/guile-lib/), and
> then, assuming the html is in $2:
>

I had seen those $i things, but I hadn't understood that stuff was "inside"
them and that I could use them, so I was using a lot of (define this that).
And this is probably why I missed the two values returned by http-get.
Thanks!



>   > (use-modules (htmlprag))
>  > (define the-web-page (html->sxml $2))
>


And I didn't know about htmlprag, thanks


>
> That parses the web page to s-expressions.  You can print the result
> nicely:
>
>  > ,pretty-print the-web-page
>

thanks, I didn't know this, either


>
> Now you need to get something out of the web page.  The hackiest way to
> do it is just to match against the entire page.  Maybe someone else can
> come up with an example, but I'm short on time, so I'll proceed to The
> Right Thing -- the problem is that whitespace is significant, and maybe
> all you want is the contents of "the <title> in the <head> in the
> <html>."
>
> So in XML you'd use XPATH.  In SXML you'd use SXPATH.  It's hard to use
> right now; we really need to steal
> http://www.neilvandyke.org/webscraperhelper/ from Neil van Dyke.  But
> you can see from his docs that the thing would be
>
>  > (use-modules (sxml xpath))
>  > (define matcher (sxpath '(// html head title)))
>  > (matcher the-web-page)
>  $3 = ((title "GNU Guile (About Guile)"))
>
>
I was going to attempt something along these lines

(sxml-match (xml->sxml page) [(div (@ (id "real_player") (rel ,url))) (str

but I'm going to explore your lines too. I wasn't there yet; I had
stumbled on something I thought was a bug, but I also had something else
to do (this is a pet project), so this had to wait.

But I'll surely let you know

Thanks again for your help
Bye
Cato


* Re: salutations and web scraping
  2012-01-10 21:46 ` Andy Wingo
  2012-01-16 20:06   ` Catonano
@ 2012-01-24 12:47   ` Catonano
  2012-01-24 13:07     ` Andy Wingo
  1 sibling, 1 reply; 16+ messages in thread
From: Catonano @ 2012-01-24 12:47 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-user


Andy,

I'm back on this thing.

I'll never thank you enough; your suggestions were helpful and insightful.

Now, I think I have to point out a potential problem.

On 10 January 2012 22:46, Andy Wingo <wingo@pobox.com> wrote:

> > (use-modules (web client) (web uri))
>  > (http-get (string->uri "http://www.gnu.org/software/guile/"))
>  [here the text of the web page gets printed out]
>

I tried with www.gnu.org/software/guile and it worked

I tried with www.friendfeed.com and I got back what follows:

 $11 = "<html>\r
<head><title>301 Moved Permanently</title></head>\r
<body bgcolor=\"white\">\r
<center><h1>301 Moved Permanently</h1></center>\r
<hr><center>nginx/0.6.31</center>\r
</body>\r
</html>\r
"

that is, a "Moved Permanently" message. That's not what I get if I attempt
to go to that address with Firefox.

Is this a bug, or am I missing something? Does this have anything to do with
chunked responses? I don't know.

Thanks again
Catonano


* Re: salutations and web scraping
  2012-01-24 12:47   ` Catonano
@ 2012-01-24 13:07     ` Andy Wingo
  2012-01-24 14:17       ` Catonano
  0 siblings, 1 reply; 16+ messages in thread
From: Andy Wingo @ 2012-01-24 13:07 UTC (permalink / raw)
  To: Catonano; +Cc: guile-user

On Tue 24 Jan 2012 13:47, Catonano <catonano@gmail.com> writes:

> I tried with www.friendfeed.com and I got back what follows:
>
>  $11 = "<html>\r
> <head><title>301 Moved Permanently</title></head>\r
> <body bgcolor=\"white\">\r
> <center><h1>301 Moved Permanently</h1></center>\r
> <hr><center>nginx/0.6.31</center>\r
> </body>\r
> </html>\r
> "

I suspect here in $10 you got a response with the response-code of 301.
It's a redirect.  Guile's HTTP client (unlike Firefox's) doesn't
automatically handle redirects for you; you need to grub around for
where it wants to send you.

In this case, try checking the response-code, and if it's a redirect
code, then look at the response-location.  See response-location in the
manual.
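
A minimal sketch of that check, assuming the Location header holds an
absolute URI (a relative one would need resolving first). http-get/follow
and redirect-code? are hypothetical names; http-get, response-code and
response-location are the procedures from (web client) and (web response).

```scheme
(use-modules (web client) (web response) (web uri) (ice-9 receive))

;; Status codes treated as redirects in this sketch.
(define (redirect-code? code)
  (and (memv code '(301 302 303 307 308)) #t))

;; Fetch URI; while the server answers with a redirect that carries a
;; Location header, fetch that location instead.  MAX-HOPS guards
;; against redirect loops.
(define* (http-get/follow uri #:optional (max-hops 5))
  (receive (response body) (http-get uri)
    (let ((location (response-location response)))
      (if (and (redirect-code? (response-code response))
               location
               (> max-hops 0))
          (http-get/follow location (- max-hops 1))
          (values response body)))))
```

Like http-get, it returns two values, so the caller binds the final
response and body together.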

Cheers,

Andy
-- 
http://wingolog.org/


* Re: salutations and web scraping
  2012-01-24 13:07     ` Andy Wingo
@ 2012-01-24 14:17       ` Catonano
  2012-01-25  1:41         ` Catonano
  0 siblings, 1 reply; 16+ messages in thread
From: Catonano @ 2012-01-24 14:17 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-user


Andy,

again, thank you so much.

On 24 January 2012 14:07, Andy Wingo <wingo@pobox.com> wrote:

> In this case, try checking the response-code, and if it's a redirect
> code, then look at the response-location.  See response-location in the
> manual.
>
>
yes, it was a simple redirect from www.friendfeed.com to friendfeed.com

I'm so used to this happening automatically that I hadn't noticed it.

I'll let you know how I'm gonna proceed with this idea

Cheers
Catonano


* Re: salutations and web scraping
  2012-01-24 14:17       ` Catonano
@ 2012-01-25  1:41         ` Catonano
  2012-01-25  3:56           ` Daniel Hartwig
  2012-01-25  8:57           ` Andy Wingo
  0 siblings, 2 replies; 16+ messages in thread
From: Catonano @ 2012-01-25  1:41 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-user


On 24 January 2012 15:17, Catonano <catonano@gmail.com> wrote:

> I'll let you know how I'm gonna proceed with this idea

so here I am to let you know.

It happens that the response I get from the server of the radio station is
chunked (that is, (transfer-encoding (chunked))),

so when I issue the (http-get uri) instruction I get

$2 = #<<response> etc.
$3 = #f

that is no page source
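
One way to confirm that diagnosis from the REPL is to inspect the
Transfer-Encoding header on the response object. A minimal sketch;
chunked-response? is a hypothetical helper name, while
response-transfer-encoding comes from (web response).

```scheme
(use-modules (web response))

;; The parsed Transfer-Encoding header is a list such as ((chunked)),
;; or the empty list when the header is absent.
(define (chunked-response? response)
  (and (assq 'chunked (response-transfer-encoding response)) #t))
```

Calling (chunked-response? $2) on the response above would show whether the
missing body coincides with a chunked reply.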

Is there a workaround? Or shall I wait for this issue to be fixed? I could
try to take a look at the code myself, but I'm an absolute Scheme beginner.

Thanks for any hint
Catonano


* Re: salutations and web scraping
  2012-01-25  1:41         ` Catonano
@ 2012-01-25  3:56           ` Daniel Hartwig
  2012-01-25  4:57             ` Catonano
  2012-01-25  9:07             ` Andy Wingo
  2012-01-25  8:57           ` Andy Wingo
  1 sibling, 2 replies; 16+ messages in thread
From: Daniel Hartwig @ 2012-01-25  3:56 UTC (permalink / raw)
  To: Catonano; +Cc: guile-user

Hi there

On 25 January 2012 09:41, Catonano <catonano@gmail.com> wrote:
>
> It happens that the response I get from the server of the radio station is
> chunked (that is, (transfer-encoding (chunked))),
>
> so when I issue the (http-get uri) instruction I get
>
> $2 = #<<response> etc.
> $3 = #f
>
> that is no page source
>
> Is there a workaround? Or shall I wait for this issue to be fixed? I could
> try to take a look at the code myself, but I'm an absolute Scheme beginner
>

Chunked encoding was introduced in HTTP 1.1 so you can try using 1.0
for your request:

(http-get uri #:version '(1 . 0) ...)

or

(build-request uri #:version '(1 . 0) ...)

Otherwise, Ian Price recently posted a patch adding support for
chunked encoding which you might like to try:

http://thread.gmane.org/gmane.lisp.guile.user/8931/focus=8935


Regards




* Re: salutations and web scraping
  2012-01-25  3:56           ` Daniel Hartwig
@ 2012-01-25  4:57             ` Catonano
  2012-01-25  9:07             ` Andy Wingo
  1 sibling, 0 replies; 16+ messages in thread
From: Catonano @ 2012-01-25  4:57 UTC (permalink / raw)
  To: guile-user


Daniel,

On 25 January 2012 04:56, Daniel Hartwig <mandyke@gmail.com> wrote:

> Hi there
>

thank you for your suggestions. I'll try with them

> Otherwise, Ian Price recently posted a patch adding support for
> chunked encoding which you might like to try:
>
> http://thread.gmane.org/gmane.lisp.guile.user/8931/focus=8935
>
>
yes, I knew about Ian's patch; I just wanted to raise the issue in a more
general fashion. I wanted to see if the community would step in to move
Ian's solution towards a more solid state.

Some days ago I tried Ian's patch and had some problems. But I need
some more time to decide what my next step should be.

Thanks, anyway
Bye
Catonano


* Re: salutations and web scraping
  2012-01-25  1:41         ` Catonano
  2012-01-25  3:56           ` Daniel Hartwig
@ 2012-01-25  8:57           ` Andy Wingo
  2012-01-29 14:23             ` Catonano
  1 sibling, 1 reply; 16+ messages in thread
From: Andy Wingo @ 2012-01-25  8:57 UTC (permalink / raw)
  To: Catonano; +Cc: guile-user

On Wed 25 Jan 2012 02:41, Catonano <catonano@gmail.com> writes:

> It happens that the response I get from the server of the radio
> station is chunked (that is, (transfer-encoding (chunked))),
>
> so when I issue the (http-get uri) instruction I get
>
> $2 = #<<response> etc.
> $3 = #f
>
> that is no page source
>
> Is there a workaround? Or shall I wait for this issue to be fixed? I
> could try to take a look at the code myself, but I'm an absolute
> Scheme beginner

There is no workaround in the source, no.

Ian Price has a workaround:

  https://lists.gnu.org/archive/html/guile-user/2011-11/msg00011.html

We need to get that into Guile.

Cheers,

Andy
-- 
http://wingolog.org/




* Re: salutations and web scraping
  2012-01-25  3:56           ` Daniel Hartwig
  2012-01-25  4:57             ` Catonano
@ 2012-01-25  9:07             ` Andy Wingo
  2012-01-25 17:23               ` Catonano
  1 sibling, 1 reply; 16+ messages in thread
From: Andy Wingo @ 2012-01-25  9:07 UTC (permalink / raw)
  To: Daniel Hartwig; +Cc: guile-user

On Wed 25 Jan 2012 04:56, Daniel Hartwig <mandyke@gmail.com> writes:

> Chunked encoding was introduced in HTTP 1.1 so you can try using 1.0
> for your request:
>
> (http-get uri #:version '(1 . 0) ...)

Good point!

Andy
-- 
http://wingolog.org/




* Re: salutations and web scraping
  2012-01-25  9:07             ` Andy Wingo
@ 2012-01-25 17:23               ` Catonano
  2012-01-27 12:18                 ` Catonano
  0 siblings, 1 reply; 16+ messages in thread
From: Catonano @ 2012-01-25 17:23 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-user


On 25 January 2012 10:07, Andy Wingo <wingo@pobox.com> wrote:

> On Wed 25 Jan 2012 04:56, Daniel Hartwig <mandyke@gmail.com> writes:
>
> > (http-get uri #:version '(1 . 0) ...)
>
> Good point!
>

yes, good point. I tried it, and it works. Thanks.  This could have been a
showstopper for me.


* Re: salutations and web scraping
  2012-01-25 17:23               ` Catonano
@ 2012-01-27 12:18                 ` Catonano
  2013-01-07 22:23                   ` Andy Wingo
  0 siblings, 1 reply; 16+ messages in thread
From: Catonano @ 2012-01-27 12:18 UTC (permalink / raw)
  To: guile-user


I have some updates about this issue, should anyone be interested.

On 25 January 2012 18:23, Catonano <catonano@gmail.com> wrote:

>
>
> On 25 January 2012 10:07, Andy Wingo <wingo@pobox.com> wrote:
>
>> On Wed 25 Jan 2012 04:56, Daniel Hartwig <mandyke@gmail.com> writes:
>>
>> > (http-get uri #:version '(1 . 0) ...)
>>
>> Good point!
>>
>
> yes, good point. I tried, it works. Thanks.  This could have been a show
> stopper to me.
>

It seems the show stopped anyway.

If you try

(define uri (string->uri "http://www.ubuntu.com"))
(http-get uri #:version '(1 . 0))

you'll get a correct result. But if you try with "http://friendfeed.com"
you'll get

 Unable to connect to database server

and that's what happens with my radio station site too. Interestingly, I
tried with

curl --http1.0 http://friendfeed.com and got a different result.

Also, the version indicated in the response from friendfeed is 1.1 while
it's 1.0 in the response from www.ubuntu.com

So it seems to me that this workaround of requesting HTTP 1.0
makes the servers behave too unpredictably; I'm probably running
into a not-so-common case, so some shoddiness in server configurations is
emerging.

As for curl, I'm not even sure it is fulfilling my wish to use an HTTP 1.0
request; there was such a bug some time ago. I don't wanna know. I didn't
try with wget because I couldn't find the right switch. Again, I'm not sure
I wanna know.

So this pet project will probably have to wait some more, at least until
the common HTTP 1.1 cases are covered.

What do you think?

Thank you all, people, anyway
Bye
Catonano


* Re: salutations and web scraping
  2012-01-25  8:57           ` Andy Wingo
@ 2012-01-29 14:23             ` Catonano
  0 siblings, 0 replies; 16+ messages in thread
From: Catonano @ 2012-01-29 14:23 UTC (permalink / raw)
  To: guile-user


Andy,

On 25 January 2012 09:57, Andy Wingo <wingo@pobox.com> wrote:

> There is no workaround in the source, no.
>
> Ian Price has a workaround:
>
>  https://lists.gnu.org/archive/html/guile-user/2011-11/msg00011.html
>
> We need to get that into Guile.
>

Should I want to try to follow this issue and possibly contribute some
patches, which path am I supposed to take?

I checked out from git, but the thing I ended up with doesn't build.

So which node exactly should I branch from?

I was thinking about branching, applying Ian's patch to my branch,
reproducing the issues I ran into, reporting to the list and seeing from
there where the discussion leads.

Should I move to the devel list?


* Re: salutations and web scraping
  2012-01-27 12:18                 ` Catonano
@ 2013-01-07 22:23                   ` Andy Wingo
  2013-01-30 13:48                     ` Catonano
  0 siblings, 1 reply; 16+ messages in thread
From: Andy Wingo @ 2013-01-07 22:23 UTC (permalink / raw)
  To: Catonano; +Cc: guile-user

On Fri 27 Jan 2012 13:18, Catonano <catonano@gmail.com> writes:

> So it seems to me that this workaround of indicating a http 1.0 request
> introduces too much unpredictability from the servers; I'm probably
> running in a not so common case so some shoddiness in servers
> configurations is emerging

I know it's a year late, but I tracked this one down to the http-get
code shutting down the write end of the socket for non-keep-alive
connections.  This appears to stop the bytes being written from
reaching their destination!  Perhaps we would need to uncork the socket
or something in the future; surely someone performance-oriented will
come along later and fix it properly.  Anyway, fetching friendfeed.com/
now works properly.

Cheers,

Andy
-- 
http://wingolog.org/


* Re: salutations and web scraping
  2013-01-07 22:23                   ` Andy Wingo
@ 2013-01-30 13:48                     ` Catonano
  0 siblings, 0 replies; 16+ messages in thread
From: Catonano @ 2013-01-30 13:48 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-user


2013/1/7 Andy Wingo <wingo@pobox.com>

> I know it's a year late, but I tracked this one down
>
> Anyway, fetching friendfeed.com/
> now works properly.

Thank you, Andy ;-)
