From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: MON KEY
Newsgroups: gmane.emacs.bugs
Subject: bug#6283: doc/lispref/searching.texi reference to octal code `0377' correct?
Date: Fri, 28 May 2010 19:20:18 -0400
References: <83vda9md09.fsf@gnu.org> <83sk5cmr8k.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: 6283@debbugs.gnu.org
To: Eli Zaretskii
In-Reply-To: <83sk5cmr8k.fsf@gnu.org>
X-GNU-PR-Message: followup 6283
X-GNU-PR-Package: emacs
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:37376

On Fri, May 28, 2010 at 3:15 AM, Eli Zaretskii wrote:
> Sorry, I don't
> see the relevance. The manual talks about the
> _numeric_ code of characters, not about their read syntax.

I must be misunderstanding something.
What is the numeric code of \255 ?

> It uses "octal 0377" to present values because octal notation of
> single-byte characters is something many people are familiar with,

Where is this convention detailed/discussed in the manual?
I don't find it mentioned in (info "(elisp)Conventions").
Should it be, esp. as 0377 is not a representation exposed by the
Emacs user-level interface (at least none that I'm aware of)?

> After all, that is the codepoint of the character.

Of which character? 0377 doesn't have a character that I'm aware of.

> This is explained in "Non-ASCII Characters". But we generally try not

But this is my point: that section (being the most relevant to
non-ASCII notation) tends to use the # notation.

> to advertise this issue too much, because there should be no good
> reason for a Lisp program to create raw bytes. Emacs is a text
> editor, while raw bytes are not text

That's just silly. Emacs accommodates noodling w/ raw-bytes because it
is necessary to edit them on occasion. Heck, Emacs w32 distributes
with a dedicated executable just to edit binary data in hexadecimal
form.

>> whenever I need to manually revert some raw-bytes or improperly
>> encoded bit-rotted text using regexps.
>
> It's hard to believe Emacs couldn't handle any such text in some other
> way.

It generally can. However, sometimes file encodings get out of whack
over time, and once they are more than a generation away from
rightedness Emacs isn't always able to revert them.

The good thing is Emacs can do this and I'm very glad it does :)

Besides, it's my prerogative how I choose to abuse Emacs into abusing
my data.

> What "improper encoding" was that which Emacs couldn't handle?

The "mixed bag encoding". Not all of my files originated in Emacs.
Not all of them get read into an Emacs buffer without problems.
GIGO c'est la vie.

FWIW I have entire SQL databases of multi-lingual, multi-encoding data
that was improperly uploaded into them via a misconfigured PHP script
with a funky encoding declaration, which itself got its input from a
certain legacy proprietary w32 web-browser that understood (read:
willfully mis-interpreted) UTF-8 according to its own whims. I can
assure you that encodings don't translate perfectly, nor are the
mis-translations always easily caught or corrected.

Stuff like this can sometimes happen with system locales too.
Transitioning files from vfat will clobber file names too if you're
not careful.

Sometimes I need to find the raw-bytes and replace them with their
character equivalent.

> Could it be that you simply gave up too early and tried to solve the
> problem by treating text as bytes, while it really wasn't?

Nope. I'm usually pretty good about _not_ approaching these problems
with this type of hammer unless it is a last resort.

--
/s_P\
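
As a rough illustration of the two points argued above (what number "octal 0377" denotes, and what "find the raw-bytes and replace them with their character equivalent" can look like), here is a minimal sketch in Python rather than Emacs Lisp. The function name `repair_mixed_bytes` and the Latin-1 fallback are assumptions made for this example only; nothing in the thread says which legacy encoding was involved, and this is not how Emacs itself decodes buffers.

```python
# Octal 0377, decimal 255 and hex 0xFF are three spellings of the same
# byte value; the octal escape \255 denotes code 173 (hex 0xAD).
BYTE_MAX = 0o377
ESCAPE_255 = ord("\255")


def repair_mixed_bytes(data: bytes, fallback: str = "latin-1") -> str:
    """Decode `data` as UTF-8, decoding stray raw bytes with `fallback`.

    `fallback` is an assumed legacy encoding (hypothetical default).
    """
    pieces = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode everything that is left as UTF-8.
            pieces.append(data[pos:].decode("utf-8"))
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are offsets into the slice data[pos:].
            good = data[pos:pos + err.start]
            bad = data[pos + err.start:pos + err.end]
            pieces.append(good.decode("utf-8"))
            # Replace the raw byte(s) with their assumed character equivalent.
            pieces.append(bad.decode(fallback, errors="replace"))
            pos += err.end
    return "".join(pieces)


if __name__ == "__main__":
    # "caf" + a lone Latin-1 byte 0xE9 + valid UTF-8 for " naïve"
    mixed = b"caf\xe9 na\xc3\xafve"
    print(repair_mixed_bytes(mixed))
```

The loop only falls back for the bytes that UTF-8 decoding rejects, so text that is already valid is left alone; whether Latin-1 is the right guess for the stray bytes depends entirely on where the data came from.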