unofficial mirror of help-gnu-emacs@gnu.org
From: sebyte <sdt133@netscape.net>
Subject: Re: When is a text file not a text file?
Date: Fri, 09 Jan 2004 18:17:23 +0000
Message-ID: <RgCLb.12536$tQ6.306982@wards.force9.net>
In-Reply-To: <uvfnlihii.fsf@ID-87814.user.dfncis.de>


> What "tags" are these? I don't know the actual program you are
> using. But I seem to recall that I once had a program "html2text" or
> "htmltotxt" or whatever that produced a text file as output
> *containing ANSI escape sequences for colours*.
> 
> Could that be the case? It would explain why dumping such a file on
> the tty---unlike visiting it with a text editor---would display it
> correctly.
> 
> If this *is* the case, then you probably could do something with
> ansi-color.el, though I don't know offhand how exactly.
> 
>     Oliver

Hi Oliver,

Thanks for your time.  Here's an example of html2text's output, displayed in an 
Emacs buffer:


  C^HCo^Hop^Hpy^Hyr^Hri^Hig^Hgh^Hht^Ht n^Hno^Hot^Hti^Hic^Hce^He:^H: All reader-contributed material on freshmeat.net is the
  property and responsibility of its author; for reprint rights, please contact
  the author directly.
  -----------------------------------------------------------------------------
  Let me repeat that: OS X is not Unix.
  Consider the following: all of Apple.com's _^Hm_^Ha_^Hr_^Hk_^He_^Ht_^Hi_^Hn_^Hg_^H _^Hp_^Ha_^Hg_^He_^Hs on the subject of
  their darling new operating system are extremely careful to note that OS X is
  "_^HU_^HN_^HI_^HX_^H-_^Hb_^Ha_^Hs_^He_^Hd".


Here is how it looks on a tty or in an Emacs *shell* buffer:


  Copyright notice: All reader-contributed material on freshmeat.net is the
  property and responsibility of its author; for reprint rights, please contact
  the author directly.
  -----------------------------------------------------------------------------
  Let me repeat that: OS X is not Unix.
  Consider the following: all of Apple.com's marketing pages on the subject of
  their darling new operating system are extremely careful to note that OS X is
  "UNIX-based".


I had thought they might be remnants of HTML tags (I must admit I didn't look
very closely), but I have since found out that they are actually 'backspace
control sequences', used to preserve things like underlining and boldface: a
character, ^H, then the same character again for boldface, and an underscore,
^H, then the character for underlining.  The html2text option '-nobs' gets rid
of them.
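
For a file that has already been generated with the backspaces in it, the same cleanup can be done after the fact. What follows is only a minimal sketch in Python, not anything that html2text or Emacs provides: the function name strip_overstrikes is made up for illustration, and it assumes the simple form seen in the sample above, where each backspace (0x08, displayed as ^H) is preceded by exactly one character that should be discarded.

```python
import re

# One character followed by a backspace (0x08, shown as ^H in Emacs).
# Assumption: backspaces never occur back to back, as in the sample above.
_OVERSTRIKE = re.compile(r".\x08")


def strip_overstrikes(text: str) -> str:
    """Remove each 'character + backspace' pair, keeping the character after it."""
    return _OVERSTRIKE.sub("", text)


if __name__ == "__main__":
    # Bold "Copy" (C^HC ...) followed by underlined "UNIX" (_^HU ...).
    sample = "C\x08Co\x08op\x08py\x08y _\x08U_\x08N_\x08I_\x08X"
    print(strip_overstrikes(sample))  # prints "Copy UNIX"
```

This throws the bold and underline information away, which is also what '-nobs' does at the source; it makes no attempt to turn the sequences into faces or markup.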

(After days spent looking for information, I discovered that html2text comes 
with a manpage and all was revealed.  DOH!)


sebyte

