From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: filebat Mark
Newsgroups: gmane.emacs.help
Subject: Re: Retrieve a web page into buffer and insert some text into it.
Date: Sat, 31 Jul 2010 01:00:34 +0800
References: <4C504E73.6050207@mousecar.com> <4C50C15D.6000808@mousecar.com> <4C51DE14.2060102@mousecar.com> <4C52B62E.9060405@mousecar.com>
To: gebser@mousecar.com
Cc: GNU Emacs List
List-Id: Users list for the GNU Emacs text editor
Xref: news.gmane.org gmane.emacs.help:74357
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=000e0cd3291c0a020b048c9dcaf8

--000e0cd3291c0a020b048c9dcaf8
Content-Type: text/plain; charset=ISO-8859-1

Below is the revised version. A little ugly, but more precise.
   (re-search-forward "\\(<[[:blank:]]*\n*[[:blank:]]*body[[:blank:]]*\n*[[:blank:]]*fgcolor[[:blank:]]*=\".*\"[[:blank:]]*\n*[[:blank:]]*>\\)" nil t 1)
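
The pattern above only matches a body tag that carries an fgcolor attribute. If the tag may have other attributes, or none, something along these lines might be easier to live with. This is only a sketch: the function name is made up for illustration, and it assumes no attribute value contains a literal ">".

```elisp
;; Hypothetical helper, not from the thread: find the end of the opening
;; <body ...> tag regardless of case, attributes, or line breaks.
(defun my-goto-end-of-body-tag ()
  "Move point just past the opening <body ...> tag in the current buffer.
Return the new position, or nil (with point at the buffer start) if
no such tag is found."
  (let ((case-fold-search t))           ; bound locally instead of setq
    (goto-char (point-min))
    ;; "<", optional whitespace or newlines, "body", then either ">"
    ;; right away or whitespace followed by any run of non-">" characters.
    (re-search-forward "<[ \t\r\n]*body\\(?:[ \t\r\n][^>]*\\)?>" nil t)))
```

Because `[^>]` also matches newlines, a tag spread over several lines is covered without spelling out each `\n*`. Binding `case-fold-search` with `let` keeps the setting from leaking into whatever buffer happens to be current.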


On 7/31/10, filebat Mark <filebat.mark@gmail.com> wrote:
Does the below solve your problem?

(defun match-web-body()
  (interactive)
  (setq case-fold-search t);;Make searches case insensitive
  (goto-char 0)
  (re-search-forward "\\(< *\n* *body\n* +fgcolor=\".*\" *\n*>\\)" nil t 1)
  (setq match_str (match-string 1))
  (message match_str)
)



On Sat, Jul 31, 2010 at 12:32 AM, filebat Mark <filebat.mark@gmail.com> wrote:
If you (setq case-fold-search t), then re-search-forward is case-insensitive.

So that others can build on your effort, could you post your code showing how you passed the args parameter?
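
For anyone following the CBARGS question, here is a rough sketch of how the URL can be handed to the callback. It is not ken's final code (that was not posted), and all the `my-` names are invented for illustration. The one documented fact it relies on is that `url-retrieve` calls the callback as (apply CALLBACK STATUS CBARGS), so CBARGS has to be a list, and the callback has to accept STATUS plus one parameter per list element. A "wrong number of arguments" error usually points to a mismatch there.

```elisp
(require 'url)

;; All `my-' names are hypothetical; a sketch of the CBARGS mechanism only.
(defun my-add-source-link (url)
  "Insert a link to URL after the opening <body> tag of the current buffer."
  (goto-char (point-min))
  (let ((case-fold-search t))
    (when (re-search-forward "<body[^>]*>" nil t)
      (insert "\n<p>From: <a href=\"" url "\">" url "</a></p>\n"))))

(defun my-retrieve-callback (status url)
  "Callback for `url-retrieve'.
STATUS is supplied by `url-retrieve'; URL arrives through CBARGS."
  (ignore status)
  (switch-to-buffer (current-buffer))
  (my-add-source-link url))

(defun my-edit-web-page (url)
  "Retrieve URL asynchronously, then annotate the response buffer."
  (interactive "sLoad URL: ")
  ;; CBARGS is a list; its elements follow STATUS in the callback's arguments.
  (url-retrieve url #'my-retrieve-callback (list url)))
```

Note that the buffer passed to the callback also holds the HTTP response headers ahead of the page itself, so a real command would probably want to skip or delete those before editing.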


On Fri, Jul 30, 2010 at 7:23 PM, ken <gebser@mousecar.com> wrote:
Thanks, Denny.  I got the args working right, so it's basically working.
So now I'm on to another issue: the re-search-forward function.  Again,
it's a syntax thing.

As the code suggests, I'm looking for the html opening body tag.  That
text could be as simple as "<body>", but could also be:

<
 BodY
 fgcolor="hazel"
>

I thought this would work:

(re-search-forward "<\s*[bB][oO][dD][yY]\s.*>" nil t)

but apparently "\s" is being treated as a literal and not as
representing [whitespace].  Also, it doesn't seem that elisp has a
function for doing a case-insensitive RE search.
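
A note on what is probably happening here: inside a Lisp string, "\s" is read as a plain space character, so the regexp engine never sees a backslash at all. The forms below illustrate that, using a made-up sample variable modeled on the multi-line tag above. Spelling out the whitespace characters and let-binding `case-fold-search` is one way around both problems, though only a sketch.

```elisp
;; In a Lisp string, "\s" is read as a plain space character.
(string= "\s" " ")                       ; evaluates to t

;; Hypothetical sample text, modeled on the multi-line tag above.
(defvar my-sample-tag "<\n BodY\n fgcolor=\"hazel\"\n>")

;; The original pattern is therefore "< *[bB]...": it allows only spaces
;; after "<", so it fails on the sample, where a newline follows "<".
(string-match "<\s*[bB][oO][dD][yY]\s.*>" my-sample-tag)     ; evaluates to nil

;; Listing the whitespace characters and binding `case-fold-search'
;; handles both the line breaks and the mixed case.
(let ((case-fold-search t))
  (string-match "<[ \t\n]*body[ \t\n][^>]*>" my-sample-tag)) ; evaluates to 0
```

The regexp construct for whitespace syntax is written "\\s-" in a Lisp string, but whether it matches a newline depends on the buffer's syntax table, which is why the explicit character list is used here.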



On 07/29/2010 10:37 PM filebat Mark wrote:
> Hi Ken
>
> Yes, the CBARGS parameter of the url-retrieve function is confusing.
> I also tried this, but it complains of "wrong number of arguments".
>
> Let's wait and see whether others have any comments. I will spend some
> time on it when I'm free.
>
>
> On Fri, Jul 30, 2010 at 4:01 AM, ken <gebser@mousecar.com> wrote:
>
>
>     On 07/29/2010 11:22 AM filebat Mark wrote:
>     > Hi Ken
>
>     Hi, Denny.  Thanks for replying
>
>
>     > Where do you set the value of url in the second function?
>
>     It needs to be passed from the first defun, but I don't know the syntax
>     for doing that.  The url-retrieve function is a little complex in terms
>     of its arguments.  I think it's stumped even the experts on this list.
>
>
>     > One big enhancement shall be setting the coding system of the temp
>     > buffer, based on the charset of the html page.
>
>     Thanks for bringing that up.  Yeah, I followed that on your thread.  I
>     might have to deal with that too.  If, after I get the main part of my
>     code working how I want it, I find it's having that problem, I'll review
>     your emails (which I still have) and perhaps get back to you for
>     some tips.
>
>
>     > Regards,
>     > Denny
>
>     Back at ya!
>     ken
>
>
>     >
>     > On Thu, Jul 29, 2010 at 7:46 AM, ken <gebser@mousecar.com> wrote:
>     >
>     >     Lennart suggested I use a different defun, url-copy-file.  I
>     >     tried that instead, but it didn't work.  But then I went back
>     >     to my original code, moved a single parenthesis and... it
>     >     worked... mostly.  Here's the code:
>     >
>     >     ------------------------ start ---------------------------
>     >     load url.el
>     >
>     >     (defun www-edit-web-page (url)
>     >       "Retrieve web page and load into new buffer for editing.
>     >     Automatically insert after <body> tag URL, appropriately
>     >     html-tagged URL."
>     >       (interactive "sLoad URL: ")
>     >       (with-temp-buffer (url-retrieve url 'edit-web-page)))
>     >
>     >
>     >     (defun edit-web-page (status)
>     >       "Switch to the buffer returned by `url-retrieve'.
>     >     The buffer should contain the web page sent by the server."
>     >       (switch-to-buffer (current-buffer))
>     >       (goto-char 0)
>     >       (re-search-forward "<body.*>" nil t) ;go to end of <body ...> tag.
>     >       ;insert URL into page
>     >       (insert "\n<p>From: <a href=\"" url "\">" url "</a>\n</p>\n\n"))
>     >
>     >     ------------------------ ende ---------------------------
>     >
>     >     This properly fetches the web page and loads it into a new, unsaved
>     >     buffer (exactly what I want), but the last line in the second defun
>     >     doesn't execute.  The error messages are telling me that edit-web-page
>     >     doesn't know the value of "url".  So how do I pass this variable-- with
>     >     its assignment from www-edit-web-page to edit-web-page?  (I have a
>     >     guess, but i'm more a C/bash/blah/blah/blah guy, so elisp is a bit
>     >     mysterious.)
>     >
>     >
>     > --
>     > Thanks & Regards
>     >
>     > Denny Zhang
>     >
>
>
>
>
> --
> Thanks & Regards
>
> Denny Zhang
>




--
Thanks & Regards

Denny Zhang
--000e0cd3291c0a020b048c9dcaf8--