From: filebat Mark <filebat.mark@gmail.com>
To: gebser@mousecar.com
Cc: GNU Emacs List <help-gnu-emacs@gnu.org>
Subject: Re: Retrieve a web page into buffer and insert some text into it.
Date: Sat, 31 Jul 2010 01:00:34 +0800
Message-ID: <AANLkTimR7VZvbE_rqHa_RWBvesX+JpR5KnQys-coZv8u@mail.gmail.com>
In-Reply-To: <AANLkTinhFqNUD_Phk5NfBXgL0hPiTSsgeih3LLe-SxBX@mail.gmail.com>
Below is the revised version. A little ugly, but more precise.
(re-search-forward
"\\(<[[:blank:]]*\n*[[:blank:]]*body[[:blank:]]*\n*[[:blank:]]*fgcolor[[:blank:]]*=\".*\"[[:blank:]]*\n*[[:blank:]]*>\\)"
nil t 1)
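
In case it helps, here is a rough sketch of a more tolerant variant, not a tested solution for your page. It accepts a bare <body> as well as a tag with any attributes (not only fgcolor), and it binds case-fold-search with let so the setting does not leak outside the search. The name my-find-body-tag is made up for this example. One known limitation: an attribute value that itself contains ">" would end the match too early.

```elisp
(defun my-find-body-tag ()
  "Move point past the opening <body ...> tag and return the tag text.
Return nil, with point at the start of the buffer, if no tag is found.
The name `my-find-body-tag' is hypothetical, chosen for this sketch."
  (let ((case-fold-search t))          ; case-insensitive for this search only
    (goto-char (point-min))
    ;; "<", optional whitespace or newlines, "body", then either ">"
    ;; directly or one whitespace character followed by any run of
    ;; non-">" characters (the attributes), then ">".
    (when (re-search-forward
           "<[[:space:]\n]*body\\(?:[[:space:]\n][^>]*\\)?>" nil t)
      (match-string 0))))
```

Called inside the buffer that holds the page, it should leave point just after the tag, so a following insert lands right where the body starts.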
On 7/31/10, filebat Mark <filebat.mark@gmail.com> wrote:
>
> Does the below solve your problem?
>
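> (defun match-web-body ()
>   "Find the opening <body ...> tag and show it in the echo area."
>   (interactive)
>   (let ((case-fold-search t))   ; make the search case-insensitive
>     (goto-char (point-min))
>     (when (re-search-forward
>            "\\(< *\n* *body\n* +fgcolor=\".*\" *\n*>\\)" nil t 1)
>       ;; Pass the match through "%s" so a literal % in it is not
>       ;; treated as a format directive.
>       (message "%s" (match-string 1)))))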
>
>
>
> On Sat, Jul 31, 2010 at 12:32 AM, filebat Mark <filebat.mark@gmail.com> wrote:
>
>> (setq case-fold-search t), then re-search-forward is case insensitive.
>>
>> So that others can build on your effort, can you post your code showing
>> a sample of the args parameter?
>>
>>
>> On Fri, Jul 30, 2010 at 7:23 PM, ken <gebser@mousecar.com> wrote:
>>
>>> Thanks, Denny. I got the args working right, so it's basically working.
>>> So now I'm on to another issue: the re-search-forward function. Again,
>>> it's a syntax thing.
>>>
>>> As the code suggests, I'm looking for the html opening body tag. That
>>> text could be as simple as "<body>", but could also be:
>>>
>>> <
>>> BodY
>>> fgcolor="hazel"
>>> >
>>>
>>> I thought this would work:
>>>
>>> (re-search-forward "<\s*[bB][oO][dD][yY]\s.*>" nil t)
>>>
>>> but apparently "\s" is being treated as a literal and not as
>>> representing [whitespace]. Also, it doesn't seem that elisp has a
>>> function for doing a case-insensitive RE search.
>>>
>>>
>>>
>>> On 07/29/2010 10:37 PM filebat Mark wrote:
>>> > Hi Ken
>>> >
>>> > Yes, the parameter of CBARGS in url-retrieve function is confusing.
>>> > I also tried this, but it complains of "wrong number of arguments".
>>> >
>>> > Let's wait to see whether others have any comment. I will spend some
>>> > time, when I'm free.
>>> >
>>> >
>>> > On Fri, Jul 30, 2010 at 4:01 AM, ken <gebser@mousecar.com> wrote:
>>> >
>>> >
>>> > On 07/29/2010 11:22 AM filebat Mark wrote:
>>> > > Hi Ken
>>> >
>>> > Hi, Denny. Thanks for replying
>>> >
>>> >
>>> > > Where do you set the value of url in the second function?
>>> >
>>> > It needs to be passed from the first defun, but I don't know the
>>> > syntax for doing that. The url-retrieve function is a little complex
>>> > in terms of its arguments. I think it's stumped even the experts on
>>> > this list.
>>> >
>>> >
>>> > > One big enhancement shall be setting the coding system of the temp
>>> > > buffer, based on the charset of the html page.
>>> >
>>> > Thanks for bringing that up. Yeah, I followed that on your thread. I
>>> > might have to deal with that too. If, after I get the main part of my
>>> > code working how I want it, I find it's having that problem, I'll
>>> > review your emails (which I still have) and perhaps get back to you
>>> > for some tips.
>>> >
>>> >
>>> > > Regards,
>>> > > Denny
>>> >
>>> > Back at ya!
>>> > ken
>>> >
>>> >
>>> > >
>>> > > On Thu, Jul 29, 2010 at 7:46 AM, ken <gebser@mousecar.com> wrote:
>>> > >
>>> > > Lennart suggested I use a different defun, url-copy-file. I tried
>>> > > that instead, but it didn't work. But then I went back to my original
>>> > > code, moved a single parenthesis and... it worked... mostly. Here's
>>> > > the code:
>>> > >
>>> > > ------------------------ start ---------------------------
>>> > > (require 'url) ; load url.el
>>> > >
>>> > > (defun www-edit-web-page (url)
>>> > >   "Retrieve web page and load into new buffer for editing.
>>> > > Automatically insert after <body> tag URL, appropriately html-tagged
>>> > > URL."
>>> > >   (interactive "sLoad URL: ")
>>> > >   (with-temp-buffer (url-retrieve url 'edit-web-page)))
>>> > >
>>> > >
>>> > > (defun edit-web-page (status)
>>> > >   "Switch to the buffer returned by `url-retrieve'.
>>> > > The buffer should contain the web page sent by the server."
>>> > >   (switch-to-buffer (current-buffer))
>>> > >   (goto-char 0)
>>> > >   (re-search-forward "<body.*>" nil t) ; go to end of <body ...> tag.
>>> > >   ;; insert URL into page
>>> > >   (insert "\n<p>From: <a href=\"" url "\">" url "</a>\n</p>\n\n"))
>>> > >
>>> > > ------------------------ end ---------------------------
>>> > >
>>> > > This properly fetches the web page and loads it into a new, unsaved
>>> > > buffer (exactly what I want), but the last line in the second defun
>>> > > doesn't execute. The error messages are telling me that edit-web-page
>>> > > doesn't know the value of "url". So how do I pass this variable, with
>>> > > its assignment, from www-edit-web-page to edit-web-page? (I have a
>>> > > guess, but I'm more a C/bash/blah/blah/blah guy, so elisp is a bit
>>> > > mysterious.)
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > --
>>> > > Thanks & Regards
>>> > >
>>> > > Denny Zhang
>>> > >
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Thanks & Regards
>>> >
>>> > Denny Zhang
>>> >
>>>
>>>
>>
>>
>> --
>> Thanks & Regards
>>
>> Denny Zhang
>>
>>
>
>
> --
> Thanks & Regards
>
> Denny Zhang
>
>
--
Thanks & Regards
Denny Zhang