From: Chad Brown
Newsgroups: gmane.emacs.devel
Subject: Re: Linking Emacs with libxml2
Date: Mon, 6 Sep 2010 11:56:16 -0700
Message-ID: <8A20526E-44B3-4434-9D40-54A36F976CD6@mit.edu>
To: Lennart Borgman
Cc: emacs-devel@gnu.org
Xref: news.gmane.org gmane.emacs.devel:129717

On Sep 6, 2010, at 11:44 AM, Lennart Borgman wrote:

> On Mon, Sep 6, 2010 at 5:21 PM, Lars Magne Ingebrigtsen wrote:
>> Apparently libxml2 comes with a parser for "real world" HTML, which is
>> very intriguing:
>>
>> http://www.xmlsoft.org/html/libxml-HTMLparser.html
>>
>> If Emacs provided a native interface to this function, we could say
>>
>> (parse-html "file.html")
>> => (:html (:head ...) (:body ...))
>>
>> and get a nice parse tree out very fast. (Parsing HTML from Emacs Lisp
>> is rather slow.)
>>
>> Has this been discussed before and rejected? It seems like an obvious
>> idea, and would enable both easier extraction of data from HTML files,
>> as well as writing a (simple) HTML renderer in Emacs Lisp.
>
> It was discussed before here:
>
> http://lists.gnu.org/archive/html/emacs-devel/2007-06/msg01147.html
>
> Wasn't there a problem with linking to external libraries at that time?

Yes, there was. The FSF [lawyers] recently determined that it would be
possible to use external libraries with some explicit marking of legal status,
along the lines of what is used in GCC. Looking back through the mail
archives, it seems that practical implementation is stuck waiting on an FFI
design/implementation. I thought that one had been sketched out, but I'm
not finding it in the archives, so perhaps I am confused.

*Chad