From mboxrd@z Thu Jan 1 00:00:00 1970
From: Rasmus
Newsgroups: gmane.emacs.devel
Subject: Re: "Readability" feature in eww
Date: Mon, 03 Nov 2014 12:15:00 +0100
Message-ID: <877fzcfscr.fsf@gmx.us>
References: <7820496.BS1QHyORAs@descartes>
In-Reply-To: <7820496.BS1QHyORAs@descartes>
To: ruediger@c-plusplus.de
Cc: larsi@gnus.org, emacs-devel@gnu.org

Rüdiger Sonderfeld writes:

> On Monday 03 November 2014 01:41:14 Lars Magne Ingebrigtsen wrote:
>> This is a heuristic, of course, so it can be tweaked endlessly.
>> The current algorithm just gives most words a positive score, HTML
>> markup a negative score, and words inside tags a negative score.  For
>> such a simple algorithm, it seems to give pretty good results.
>>
>> But tweaking is necessary for it to be ... better.  If anybody has ideas
>> for tweaks or better algorithms, please be my guest and have at it.
>
> HTML5 has introduced tags such as
> [...] and [...], which can be used to identify the important parts.
> I'm not sure how widespread their use thus far is
>
> (I think org-mode supports it already if one sets the HTML5 export option).

Indeed, but html5 is not default.  As far as I remember you'd have to
wrap your article part in #+begin_article ⋯ #+end_article.  There was a
discussion at some point, and there were some good html5-reasons why the
body is not wrapped in article by default.

Rasmus

-- 
. . . The proofs are technical in nature and provides no real understanding
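
For readers following the thread: the scoring heuristic quoted above
(words score positive, markup scores negative, words inside tags score
negative) can be illustrated with a rough sketch.  This is not the eww
code, which is written in Emacs Lisp; it is a stand-alone Python
approximation.  The names Node, TreeBuilder, MARKUP_PENALTY and
find_readable are hypothetical, the penalty weight is arbitrary, and
reading "words inside tags" as "words inside link (<a>) elements" is an
assumption on my part.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

# Hypothetical weight: how many points each element costs.
MARKUP_PENALTY = 2


class Node:
    """One element of the parsed page."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.words = 0   # words in text directly inside this element
        self.score = 0   # filled in by score()


class TreeBuilder(HTMLParser):
    """Build a minimal element tree, counting words per element."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, if there is one.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        self.current.words += len(data.split())


def score(node, in_link=False):
    """Words count +1 each (-1 inside links); every element costs a penalty."""
    in_link = in_link or node.tag == "a"
    total = (-node.words if in_link else node.words) - MARKUP_PENALTY
    for child in node.children:
        total += score(child, in_link)
    node.score = total
    return total


def find_readable(html_text):
    """Return the highest-scoring element, or None for an empty page."""
    builder = TreeBuilder()
    builder.feed(html_text)
    builder.close()
    score(builder.root)
    best = None
    stack = list(builder.root.children)
    while stack:
        node = stack.pop()
        if best is None or node.score > best.score:
            best = node
        stack.extend(node.children)
    return best
```

Calling find_readable on a page with a link-heavy navigation block and a
text-heavy content block should select the content block, because the
links drag the navigation block's score below zero.  An explicit HTML5
content tag, where present, could be given a bonus in score(); that
would be one of the tweaks discussed in this thread, not something the
sketch does.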