* encoding problem with url library
@ 2008-10-29 20:07 Seweryn Kokot
From: Seweryn Kokot @ 2008-10-29 20:07 UTC
  To: help-gnu-emacs

[-- Attachment #1: Type: text/plain, Size: 607 bytes --]

Hello,

I wrote a function that looks up the word at point on the megaslownik.pl
website (example page:
http://megaslownik.pl/slownik/angielsko_polski/137151,kludge).

The function retrieves the HTML source and then does some text
processing to remove the redundant parts of the page.

I'm wondering what is wrong with the `url-insert-file-contents'
function: when I use it, I get encoding problems, which can be seen in
the upper part of the attached screenshot. With `w3m-retrieve' the
result is fine. To see the difference, comment or uncomment the 9th and
10th lines of the function below.

Is it a bug in `url-insert-file-contents'?


[-- Attachment #2: emacs_compare.png --]
[-- Type: image/png, Size: 6396 bytes --]

[-- Attachment #3: Type: text/plain, Size: 1536 bytes --]


--8<---------------cut here---------------start------------->8---
(defun my-word-lookup-megaslownik ()
  "Look up a word under point with megaslownik."
  (interactive)
  (let ((url-adres
		 (concat "http://megaslownik.pl/slownik/angielsko_polski/"
				 (thing-at-point 'word)))
		(filename (make-temp-file "url" nil ".html")))
	(with-temp-file filename
	  (url-insert-file-contents url-adres)   ; 1. works but with encoding problems
;;;	  (w3m-retrieve url-adres)				 ; 2. works ok
	  (goto-char (point-min))
	  (search-forward "<body>" nil t)
	  (forward-line 1)
	  (delete-region (point) 
					 (progn 
					   (search-forward "<div id=\"content\">" nil t)
					   (beginning-of-line)
					   (point)))
	  (delete-region (progn 
					   (search-forward "<div id=\"content\">" nil t)
					   (forward-line 1)
					   (point))
					 (progn 
					   (search-forward "<div id=\"word\">" nil t)
					   (beginning-of-line)
					   (point)))
	  (delete-region (progn 
					   (search-forward "<div class=\"ikony\">" nil t)
					   (beginning-of-line)
					   (point))
					 (progn 
					   (search-forward "<div id=\"word2\">" nil t)
					   (beginning-of-line)
					   (point)))
	  (delete-region (progn 
					   (search-forward "<div class=\"clearing\">" nil t)
					   (beginning-of-line)
					   (point))
					 (progn 
					   (search-forward "body>" nil t)
					   (forward-line -1)
					   (point))))
	(w3m (concat "file://" filename))))
--8<---------------cut here---------------end--------------->8---
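
One way to narrow this down might be to take charset detection out of
the picture: fetch the raw bytes and decode them with an explicitly
chosen coding system. The following is only a sketch under stated
assumptions. `my-decode-http-body' and `my-url-fetch-decoded' are
hypothetical helper names, and the `utf-8' coding system in the usage
note is a guess, since the charset the site actually sends is not
confirmed here.

```elisp
(require 'url)

(defun my-decode-http-body (raw coding)
  "Return the body of the raw HTTP response RAW decoded with CODING.
RAW is a unibyte string holding headers, a blank line and the body.
Hypothetical helper, for illustration only."
  ;; The headers end at the first blank line; if none is found,
  ;; treat the whole string as the body.
  (let ((body-start (if (string-match "\r?\n\r?\n" raw)
                        (match-end 0)
                      0)))
    (decode-coding-string (substring raw body-start) coding)))

(defun my-url-fetch-decoded (url coding)
  "Fetch URL and return its body decoded with CODING.
CODING is a guess supplied by the caller, e.g. `utf-8'."
  (let ((buf (url-retrieve-synchronously url)))
    (unless buf
      (error "Could not retrieve %s" url))
    (unwind-protect
        (with-current-buffer buf
          ;; The response buffer holds undecoded bytes, so the
          ;; decoding step is fully under our control here.
          (my-decode-http-body
           (buffer-substring-no-properties (point-min) (point-max))
           coding))
      (kill-buffer buf))))
```

For example, (insert (my-url-fetch-decoded url-adres 'utf-8)) could
stand in for the `url-insert-file-contents' call. If the text then
looks right, the problem is probably in how the charset is detected
rather than in the retrieval itself.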

Thanks in advance,
Seweryn

