From: MON KEY <monkey@sandpframing.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 6283@debbugs.gnu.org
Subject: bug#6283: doc/lispref/searching.texi reference to octal code `0377' correct?
Date: Fri, 28 May 2010 19:20:18 -0400
Message-ID: <AANLkTinMI4zqAvw8hdS85tyzUy2bjrQvE-D6Ga8fbbSW@mail.gmail.com>
In-Reply-To: <83sk5cmr8k.fsf@gnu.org>

On Fri, May 28, 2010 at 3:15 AM, Eli Zaretskii <eliz@gnu.org> wrote:
> Sorry, I don't see the relevance.  The manual talks about the
> _numeric_ code of characters, not about their read syntax.

I must be misunderstanding something.
What is the numeric code of \255 ?

> It uses "octal 0377" to present values because octal notation of
> single-byte characters is something many people are familiar with,

Where is this convention detailed/discussed in the manual?
I don't find it mentioned in the (info "(elisp)Conventions").

Should it be, especially as 0377 is not a representation exposed by
the Emacs user-level interface (at least not one that I'm aware of)?

> After all, that is the codepoint of the character.

Of which character?

0377 doesn't correspond to any character that I'm aware of.
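
To make the arithmetic behind the question concrete, here is a minimal
sketch. Python is used purely for illustration (the thread itself is
about Emacs Lisp), and the helper names `describe` and `utf8_or_none`
are hypothetical. It shows only that octal 0377 is the largest
single-byte value and that which character it denotes, if any, depends
on the encoding assumed. It says nothing about how Emacs represents
such a byte internally.

```python
# Octal literals as plain integers.
TOP_BYTE = 0o377   # 255 decimal, 0xFF: the largest value one byte can hold
BYTE_255 = 0o255   # 173 decimal, 0xAD


def describe(code: int) -> str:
    """Show one numeric code in octal, decimal and hexadecimal."""
    return f"octal {code:o} = decimal {code} = hex {code:X}"


def utf8_or_none(raw: bytes):
    """Return the UTF-8 decoding of raw, or None if raw is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


print(describe(TOP_BYTE))
print(describe(BYTE_255))

# Read as Latin-1, byte 0xFF is U+00FF; read as UTF-8, it is not a
# character at all, because 0xFF never occurs in valid UTF-8.
print(repr(bytes([TOP_BYTE]).decode("latin-1")))
print(utf8_or_none(bytes([TOP_BYTE])))
```

So the same octal number names a character under one encoding and an
undecodable byte under another.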

> This is explained in "Non-ASCII Characters".  But we generally try not


But this is my point: that section (being the most relevant to
non-ASCII notation) tends to use the #<radix> notation.

> to advertise this issue too much, because there should be no good
> reason for a Lisp program to create raw bytes.  Emacs is a text
> editor, while raw bytes are not text

That's just silly. Emacs accommodates noodling with raw bytes because
it is necessary to edit them on occasion. Heck, Emacs on w32 is
distributed with a dedicated executable just to edit binary data in
hexadecimal form.

>> whenever I need to manually revert some raw-bytes or improperly
>> encoded bit-rotted text using regexps.
>
> It's hard to believe Emacs couldn't handle any such text in some other
> way.

It generally can. However, sometimes file encodings get out of whack
over time and once they are more than a generation away from
rightedness Emacs isn't always able to revert them.

The good thing is Emacs can do this and I'm very glad it does :)

Besides, it's my prerogative how I choose to abuse Emacs into abusing
my data.

> What "improper encoding" was that which Emacs couldn't handle?

The "mixed bag encoding". Not all of my files originated in Emacs. Not
all of them get read into an Emacs buffer without problems.

GIGO c'est la vie.

FWIW, I have entire SQL databases of multi-lingual, multi-encoding data
that was improperly uploaded via a misconfigured PHP script with a
funky encoding declaration. That script in turn got its input from a
certain legacy proprietary w32 web browser that understood (read:
willfully misinterpreted) UTF-8 according to its own whims. I can
assure you that encodings don't translate perfectly, nor are the
mistranslations always easily caught or corrected.

Stuff like this can sometimes happen with system locales too.
Transitioning files from vfat will clobber file names too if you're
not careful.

Sometimes I need to find the raw bytes and replace them with their
character equivalents.
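
As a rough illustration of that kind of repair, here is a minimal
sketch in Python rather than Emacs Lisp. It rests on two assumptions
that may not hold for real data: the input is mostly valid UTF-8, and
any stray bytes are Latin-1. The function name `repair_mixed_bytes` and
its `fallback` parameter are hypothetical.

```python
import re

# With errors="surrogateescape", Python turns each undecodable byte 0xNN
# into the lone surrogate U+DCNN, so the raw bytes stay findable in the text.
_ESCAPED_BYTE = re.compile("[\udc80-\udcff]")


def repair_mixed_bytes(data: bytes, fallback: str = "latin-1") -> str:
    """Decode data as UTF-8, mapping each leftover raw byte through fallback."""
    text = data.decode("utf-8", errors="surrogateescape")

    def to_char(match):
        # Recover the original byte value and decode it with the fallback.
        raw = bytes([ord(match.group(0)) - 0xDC00])
        return raw.decode(fallback)

    return _ESCAPED_BYTE.sub(to_char, text)


# "caf\u00e9 " is proper UTF-8; the 0xEF in "na\xefve" is a stray Latin-1 byte.
mixed = "caf\u00e9 ".encode("utf-8") + b"na\xefve"
print(repair_mixed_bytes(mixed))
```

This only covers the easy case where each stray byte has an obvious
single-byte reading. Data that went through several wrong conversions
usually needs case-by-case inspection.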

> Could it be that you simply gave up too early and tried to solve the
> problem by treating text as bytes, while it really wasn't?

Nope. I'm usually pretty good about _not_ approaching these problems
with this type of hammer unless it is a last resort.

--
/s_P\