From: Chad Brown
Newsgroups: gmane.emacs.devel
Subject: Re: Linking Emacs with libxml2
Date: Mon, 6 Sep 2010 18:40:47 -0700
Message-ID: <4E31CBF2-095D-449C-B97A-9E35A6412263@mit.edu>
To: Lars Magne Ingebrigtsen
Cc: emacs-devel@gnu.org

I was assuming that people would object to adding a dependency on
libxml2. It seems that I was the only one making that assumption,
especially since it's pulled in for librsvg already. My apologies for not
saying that up-front.

On Sep 6, 2010, at 2:17 PM, Lars Magne Ingebrigtsen wrote:

>> - parsing HTML is the easy part, rendering it in Emacs is a lot
>> more difficult.
>
> Well, parsing real work HTML is quite tricky, but you're right in that
> the major part of this work wouldn't be hooking libxml2 into Emacs
> (probably a day's work for somebody who knows what they're doing, and
> three days for me?), but writing an HTML renderer. I've been looking to
> see whether there are any C libraries for rendering HTML, but I haven't
> found anything.
> (Well, except Gecko and Webkit, but 1) we probably
> don't want to make Emacs dependent on those very large libraries, and 2)
> they're oriented towards more graphical environments than Emacs.)
>
> But I'm kinda unsure how much work writing an HTML renderer would be, if
> you had access to a sensible parse tree. My guess would be that you
> could have something that rendered 80% of pages very nicely with one
> week's worth of work. And I take those numbers out of the air, but
> that's the vague feeling I have...

You might want to take a look at w3m (MIT License) or links 2 (GPL) for some
examples of text-based rendering with Emacs-like image support. I don't know
that either will be preferable to rendering in elisp, but they might at least
suggest where to expect difficulty. I would personally expect the troubles to
pop up around tables, CSS, and JavaScript.

http://sourceforge.net/projects/w3m/
http://links.twibright.com/

*Chad