From: "T. V. Raman"
Newsgroups: gmane.emacs.devel
Subject: Re: Embedding Html in Lisp
Date: Tue, 24 Jun 2008 06:36:13 -0700
Message-ID: <18528.63565.99370.735383@gargle.gargle.HOWL>
References: <12840917.385031214230114585.JavaMail.www@wwinf4611>
Reply-To: raman@users.sf.net
To: alinsoar@voila.fr
Cc: tomas@tuxteam.de, emacs-devel@gnu.org

A simpler way to do this is to hand off the grungy details of
dealing with broken HTML to existing software, and get back
something clean. Here are some options:

0) libxslt from the GNOME project. It can clean up bad HTML, turn
   it into something well formed, and give you something easy to
   parse and handle.

1) Connecting to a browser --- at present I use a Firefox
   extension called mozrepl to talk to Firefox over a socket from
   Emacs. You can send JavaScript to Firefox, and get back the
   relevant portions of the DOM as cleaned up HTML.

2) It might be worthwhile to build something similar to the
   above, but with WebKit running headless -- since over time
   that would give a lighter-weight solution.

A truly ambitious thing to try would be to link the HTML
parser/renderer bits of WebKit into Emacs to provide a truly
integrated experience.

>>>>> "A" == A Soare writes:

    >> On Mon, Jun 23, 2008 at 03:21:07PM +0200, A Soare wrote:
    >> >
    >> > >
    >> > > Html is lisp.
    >> > >
    >> > > You dignify html a lot more than it deserves!
    >> >
    >> > With the classical definition of HTML , yes.
    >> >
    >> > With the new definition, they can be compared.
    >>
    >> But only if you squint hard .-)
    >>
    >> Of course, at the formal level you are right. But in the
    >> details, *ML lose big time. See for example the quoting
    >> business in tag attributes, for one of my pet peeves. Or
    >> white space handling.
    >>
    >> Those folks must have been on acids. Bad ones.
    >>
    A>
    A> I realised that there are many cases to treat in order to
    A> correctly transform html -> lisp, and that is why I do not
    A> want to write a complete browser, just a minimum for what
    A> I need.
    A>
    A> (enough grumbling)
    A>
    A>
    A> Regards.

-- 
Best Regards,
--raman

Email:  raman@users.sf.net
WWW:    http://emacspeak.sf.net/raman/
AIM:    emacspeak
GTalk:  tv.raman.tv@gmail.com
PGP:    http://emacspeak.sf.net/raman/raman-almaden.asc
Google: tv+raman
IRC:    irc://irc.freenode.net/#emacs
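
To illustrate the "hand it off and get back something clean" idea
from option 0 above, here is a minimal sketch. It is not libxslt:
it assumes Python's standard html.parser module as a stand-in
lenient parser, and the names SexpBuilder, to_sexp and
html_to_sexp are made up for this example. It turns tag soup into
an s-expression that a Lisp reader could consume.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SexpBuilder(HTMLParser):
    """Build a nested-list tree [tag, attrs, child, ...] from tag soup.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ["document", {}]
        self.stack = [self.root]  # currently open elements, root first

    def handle_starttag(self, tag, attrs):
        node = [tag, dict(attrs)]
        self.stack[-1].append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, which also
        # closes anything left open inside it. Stray end tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Simplification: surrounding whitespace is trimmed and
        # whitespace-only text is dropped.
        text = data.strip()
        if text:
            self.stack[-1].append(text)


def to_sexp(node):
    """Render a tree node as an s-expression string."""
    if isinstance(node, str):
        return '"' + node.replace("\\", "\\\\").replace('"', '\\"') + '"'
    tag, attrs, *children = node
    attr_part = " ".join(
        "({} . {})".format(name, to_sexp(value if value is not None else ""))
        for name, value in attrs.items())
    parts = [tag, "(" + attr_part + ")"] + [to_sexp(c) for c in children]
    return "(" + " ".join(parts) + ")"


def html_to_sexp(html):
    """Parse possibly broken HTML and return it as one s-expression."""
    builder = SexpBuilder()
    builder.feed(html)
    builder.close()
    return to_sexp(builder.root)
```

For example, html_to_sexp('<p>Hello <b>world</p>') returns
(document () (p () "Hello" (b () "world"))), with the unclosed <b>
closed along with its parent. A real setup would more likely shell
out to an existing cleaner and read its output from Emacs; the
point here is only the shape of the result.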