From: Seweryn Kokot <sewkokot@gmail.com>
To: help-gnu-emacs@gnu.org
Subject: encoding problem with url library
Date: Wed, 29 Oct 2008 21:07:18 +0100
Message-ID: <874p2vw4g9.fsf@poczta.po.opole.pl>

[-- Attachment #1: Type: text/plain, Size: 607 bytes --]

Hello,

I wrote a function that looks up the word at point on the megaslownik.pl
website (for example,
http://megaslownik.pl/slownik/angielsko_polski/137151,kludge).

The function retrieves the HTML source and then does some text
processing to remove the redundant parts of the page.

I'm wondering what is wrong with the `url-insert-file-contents'
function: when I use it, I get encoding problems, which can be seen in
the upper part of the attached screenshot. Using w3m-retrieve instead
works fine. To see the difference, comment or uncomment the 9th and
10th lines of the function below.

Is it a bug in `url-insert-file-contents'?


[-- Attachment #2: emacs_compare.png --]
[-- Type: image/png, Size: 6396 bytes --]

[-- Attachment #3: Type: text/plain, Size: 1536 bytes --]


--8<---------------cut here---------------start------------->8---
(defun my-word-lookup-megaslownik ()
  "Look up a word under point with megaslownik."
  (interactive)
  (let ((url-adres
		 (concat "http://megaslownik.pl/slownik/angielsko_polski/"
				 (thing-at-point 'word)))
		(filename (make-temp-file "url" nil ".html")))
	(with-temp-file filename
	  (url-insert-file-contents url-adres)   ; 1. works but with encoding problems
;;;	  (w3m-retrieve url-adres)				 ; 2. works ok
	  (goto-char (point-min))
	  (search-forward "<body>" nil t)
	  (forward-line 1)
	  (delete-region (point) 
					 (progn 
					   (search-forward "<div id=\"content\">" nil t)
					   (beginning-of-line)
					   (point)))
	  (delete-region (progn 
					   (search-forward "<div id=\"content\">" nil t)
					   (forward-line 1)
					   (point))
					 (progn 
					   (search-forward "<div id=\"word\">" nil t)
					   (beginning-of-line)
					   (point)))
	  (delete-region (progn 
					   (search-forward "<div class=\"ikony\">" nil t)
					   (beginning-of-line)
					   (point))
					 (progn 
					   (search-forward "<div id=\"word2\">" nil t)
					   (beginning-of-line)
					   (point)))
	  (delete-region (progn 
					   (search-forward "<div class=\"clearing\">" nil t)
					   (beginning-of-line)
					   (point))
					 (progn 
					   (search-forward "body>" nil t)
					   (forward-line -1)
					   (point))))
	(w3m (concat "file://" filename))))
--8<---------------cut here---------------end--------------->8---
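
For illustration only: garbled characters of this kind are often the
result of bytes being decoded with the wrong coding system. The sketch
below is a minimal, self-contained demonstration of that effect. It
assumes (nothing above confirms this) that the page is served as UTF-8
and ends up decoded as ISO-8859-2; the helper name
`my-demo-decode-mismatch' is hypothetical and is not part of the url
library.

```elisp
;; Hypothetical helper, for demonstration only.
;; Assumption: the text is UTF-8 on the wire and is wrongly decoded
;; as ISO-8859-2.  The real charsets involved are not known here.
(defun my-demo-decode-mismatch (text)
  "Return a list (WRONG RIGHT) for TEXT.
WRONG is TEXT's UTF-8 bytes decoded as ISO-8859-2; RIGHT is the same
bytes decoded as UTF-8."
  (let ((raw (encode-coding-string text 'utf-8))) ; unibyte string of raw bytes
    (list (decode-coding-string raw 'iso-8859-2)  ; mismatched decoding
          (decode-coding-string raw 'utf-8))))    ; matching decoding

;; Usage: the first element differs from the input, the second equals it.
(my-demo-decode-mismatch "słownik")
```

Whether this is what happens inside `url-insert-file-contents' is
exactly the open question; the sketch only shows what a coding-system
mismatch looks like in isolation.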

Thanks in advance,
Seweryn

