From: filebat Mark <filebat.mark@gmail.com>
To: Thamer Mahmoud <thamer.mahmoud@gmail.com>
Cc: help-gnu-emacs@gnu.org
Subject: Re: How to get title of web page by url?
Date: Wed, 28 Jul 2010 21:44:49 +0800
Message-ID: <AANLkTin+3_2bt+Umn6cYt3b=rtOJb+RO=X180Vj3AVcs@mail.gmail.com>
In-Reply-To: <87vd802nx4.fsf@zemblan.newkuwait.org>

Thanks, Thamer. It works.

The code snippet is below.

However, I still have an encoding problem. When I fetch the title of
"http://www.baidu.com", the result is displayed as unreadable characters.

I tried to convert it with "(setq web_title_str
(encode-coding-string web_title_str 'utf-8-dos))", but that does not help.
Since I am new to Emacs coding systems, could you please help me find out
what the problem is?

;; -------------------------- separator --------------------------
(defun get-page-title ()
  "Insert the title of the web page whose URL is on the current line."
  (interactive)
  (let* (;; Get the URL from the current line, leaving the kill ring alone
         (url (buffer-substring-no-properties
               (line-beginning-position) (line-end-position)))
         ;; Get the title of the web page, with the help of url.el
         (web_title_str
          (with-current-buffer (url-retrieve-synchronously url)
            (goto-char (point-min))
            (when (re-search-forward "<title>\\(.*\\)</title>" nil t)
              (match-string 1)))))
    (unless web_title_str
      (error "No <title> found in %s" url))
    ;; This is the conversion attempt that does not fix the display
    (setq web_title_str (encode-coding-string web_title_str 'utf-8-dos))
    ;; Insert the title on the next line
    (end-of-line)
    (reindent-then-newline-and-indent)
    (insert web_title_str)))
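
A possible direction, offered as a sketch rather than a confirmed
diagnosis: the buffer returned by url-retrieve-synchronously normally holds
the raw, undecoded bytes of the response, so the title may need
decode-coding-string (bytes to characters) instead of encode-coding-string
(characters to bytes). The helper names my-extract-title and my-page-title
are hypothetical, and the coding system has to match whatever charset the
page actually declares; nothing here detects it automatically.

```elisp
;; Hypothetical helpers for illustration; they are not part of url.el.

(defun my-extract-title (html-bytes coding-system)
  "Return the <title> text found in HTML-BYTES, decoded with CODING-SYSTEM.
HTML-BYTES is a string of raw bytes, for example the contents of the
buffer returned by `url-retrieve-synchronously'.  Return nil if no
title element is found."
  (let ((case-fold-search t))           ; also match <TITLE>
    (when (string-match "<title[^>]*>\\([^<]*\\)</title>" html-bytes)
      ;; Decode (bytes -> characters); encoding goes the other way.
      (decode-coding-string (match-string 1 html-bytes) coding-system))))

(defun my-page-title (url coding-system)
  "Fetch URL and return its title decoded with CODING-SYSTEM, or nil.
CODING-SYSTEM is supplied by the caller; it is an assumption about the
page, not something this function works out."
  (let ((buffer (url-retrieve-synchronously url)))
    (when buffer
      (unwind-protect
          (with-current-buffer buffer
            (my-extract-title (buffer-string) coding-system))
        ;; Always discard the temporary response buffer.
        (kill-buffer buffer)))))
```

For example, (my-page-title "http://www.baidu.com" 'gb2312) would apply if
that page is served as GB2312; that charset is an assumption here and
should be checked against the Content-Type header or the page's meta tag.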


On 7/28/10, Thamer Mahmoud <thamer.mahmoud@gmail.com> wrote:
>
> filebat Mark <filebat.mark@gmail.com> writes:
>
> > Such as, given "http://www.emacswiki.org/emacs/Git", we will get the
> > title of this web page, which is "EmacsWiki: Git:".
> >
> > Function of w3m-current-title is quite close, but a standalone lisp
> > function is much preferred.
>
>
> Using the url.el package,
>
> (defun www-get-page-title (url)
>   (with-current-buffer (url-retrieve-synchronously url)
>     (goto-char 0)
>     (re-search-forward "<title>\\(.*\\)<[/]title>" nil t 1)
>     (match-string 1)))
>
> (www-get-page-title "http://www.emacswiki.org/emacs/Git")
> => "EmacsWiki: Git"
>
> hth,
>
> Thamer
>
>
>


-- 
Thanks & Regards

Denny Zhang
