From mboxrd@z Thu Jan 1 00:00:00 1970
From: Lars Magne Ingebrigtsen
Newsgroups: gmane.emacs.devel
Subject: Re: "Readability" feature in eww
Date: Mon, 03 Nov 2014 23:51:26 +0100
References: <87mw88artr.fsf@engster.org>
To: emacs-devel@gnu.org
In-Reply-To: <87mw88artr.fsf@engster.org> (David Engster's message of "Mon, 03 Nov 2014 22:37:36 +0100")
User-Agent: Gnus/5.130012 (Ma Gnus v0.12) Emacs/25.0.50 (gnu/linux)
Xref: news.gmane.org gmane.emacs.devel:176323

David Engster writes:

> It'd be great if you could make this extraction method flexible, similar
> to the 'washing' feature from Gnus, so that users could hook their own
> methods for extracting the main content into eww. The user would provide
> an extraction function and the corresponding regexp that matches against
> the URL, or optionally also against the source to match things like the
> 'generator' meta-tag.

Well, the best is if we find a solution that works out of the box,
because then all users can just, like, use it.

>"?

The current heuristics are probably too simple, but I've been going
through bunches of pages, and it already seems to basically do what
you'd expect it to do, which surprises me, sort of.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no