From: Daniel Rubin
Newsgroups: gmane.emacs.help
Subject: Re: getting rid of ^M displayed by emacs-w3m
Date: Sun, 25 Mar 2007 11:48:19 +0200
Message-ID: <46064563.9010609@warum-ada.de>
References: <87r6re1gur.fsf@localhorst.mine.nu> <87mz2216zf.fsf@localhorst.mine.nu>
In-Reply-To: <87mz2216zf.fsf@localhorst.mine.nu>
To: help-gnu-emacs@gnu.org

David Hansen wrote:
> X-Post to the emacs-w3m mailing list.
>
> [ summary for the w3m devels: some html page includes the string
> " " and emacs-w3m inserts a raw carriage return into the
> buffer, which of course looks kind of ugly ]
>
> On Sun, 25 Mar 2007 10:58:37 +1100 Alexey Pustyntsev wrote:
>
>> David Hansen writes:
>>
>>> To me this looks like the page explicitly asked to display a
>>> carriage return. So I think what emacs w3m does here is reasonable.
>>> But maybe this " " is some html trick I don't know...
>> Thanks David.
>>
>> What I don't understand here is why w3m doesn't display
>> ^M (or, perhaps, something else) in xterm when the page explicitly
>> asks to do so.
>
> Some of the rendering is done by w3m and some within emacs. The
> translation of entities to characters is one of the things that
> happens in emacs.
>
>>> How do you think should emacs-w3m render a carriage return?
>> I consider ^M to be garbage in the rendered html so it should not be
>> displayed by default unless, of course, specifically requested.
>
> If the html source includes the entity " " it explicitly
> requested the display of a carriage return (whatever this means), at
> least in my opinion. But again, this might be some html "feature" I
> don't know about.
>
> IMHO the right thing to do here is to read up in the HTML specs how
> whitespaces encoded with html entities should be treated.

Could it be that the HTML file contains _both_: some line endings
indicated by CR, and others by newline or CR/LF? So maybe Emacs is
somehow tricked into believing it's displaying a -unix or -dos encoded
file and refuses to recognise the discrete ^Ms as newlines, while the
terminal unconditionally displays anything that looks like it might be
a newline as such.

Just a thought.

Have fun
----Daniel

-- 
Daniel Rubin daniel warum-ada de
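
To make the mixed-line-ending guess above concrete, here is a rough sketch in Python (not Emacs Lisp). It only illustrates the hypothesis: the function names `guess_eol_style` and `visible_carriage_returns` and the sample bytes are made up for this example, and the rule "inconsistent endings fall back to unix" is an assumption about the behaviour being guessed at, not a description of what Emacs or emacs-w3m actually do.

```python
def guess_eol_style(data: bytes) -> str:
    """Guess one end-of-line style for a whole buffer (hypothetical rule)."""
    crlf = data.count(b"\r\n")
    lone_lf = data.count(b"\n") - crlf   # LF not preceded by CR
    lone_cr = data.count(b"\r") - crlf   # CR not followed by LF
    if crlf and not lone_lf and not lone_cr:
        return "dos"                     # every line ends in CR/LF
    if lone_cr and not lone_lf and not crlf:
        return "mac"                     # every line ends in a bare CR
    return "unix"                        # assumed fallback, also for mixed input


def visible_carriage_returns(data: bytes) -> int:
    """Count the CRs that would be left over and shown as ^M."""
    style = guess_eol_style(data)
    if style == "dos":
        # CR/LF pairs are consumed as newlines; only stray CRs remain.
        return data.replace(b"\r\n", b"\n").count(b"\r")
    if style == "mac":
        return 0                         # every CR is treated as a newline
    return data.count(b"\r")             # no CR is treated as a newline


# A page that mostly uses LF but has one bare CR in the middle of a line.
page = b"<p>one</p>\n<p>two\rthree</p>\n"
print(guess_eol_style(page), visible_carriage_returns(page))  # prints: unix 1
```

Under these assumptions a single stray CR in an otherwise LF-terminated page is enough to leave one ^M on screen, whereas a terminal that treats every CR as cursor movement would show nothing unusual.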