From: PJ Weisberg
Newsgroups: gmane.emacs.help
Subject: Re: those funny non-ASCII characters
Date: Thu, 31 May 2012 08:59:39 -0700
To: "Buchs, Kevin"
Cc: "help-gnu-emacs@gnu.org"

On Wednesday, May 30, 2012, Buchs, Kevin <buchs.kevin@mayo.edu> wrote:

> What about opening an ASCII coded file? Can emacs
> properly detect it or does it come up as UTF-8?

Emacs attempts to determine the correct coding system when it opens a file, so you shouldn't have to worry about this.

The 128 characters that make up ASCII have the exact same representation in UTF-8. "Converting" an ASCII file to UTF-8 is a no-op. Therefore, treating an ASCII file as UTF-8 should cause no problems.
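
If you want to check this outside Emacs, here is a minimal Python sketch (Python is used only for illustration; nothing in it is Emacs-specific). It encodes all 128 ASCII code points once as ASCII and once as UTF-8 and compares the resulting bytes.

```python
# Every ASCII code point (0 through 127) as a single string.
ascii_text = "".join(chr(cp) for cp in range(128))

as_ascii = ascii_text.encode("ascii")
as_utf8 = ascii_text.encode("utf-8")

# The two encodings produce byte-for-byte identical output,
# so reading ASCII bytes as UTF-8 gives back the same text.
print(as_ascii == as_utf8)                     # True
print(as_ascii.decode("utf-8") == ascii_text)  # True
```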

> I assume that if my lisp library files are encoded utf-8, then I can
> paste that UTF-8 character from the web page into my call to
> (replace-string ...) in order to substitute the longer dash of Unicode
> U+2013 with an ASCII hyphen or double hyphen. But, how does that really
> work? If the lisp file is encoded utf-8, then how can I put an ASCII
> character in the replacement string? Or do I need to encode the hex
> value of the ASCII character(s)?

A = A. The hyphen-minus is a hyphen-minus whether it's in an ASCII file as 00101101 or a UTF-16 file as 0000000000101101. So, just type it with your keyboard.
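
To make that concrete, here is a small Python sketch (illustrative only; in Emacs you would simply type the hyphen in the replacement string). It prints the hyphen-minus bit patterns quoted above, assuming big-endian UTF-16 without a byte-order mark, and then replaces U+2013 with a plain hyphen-minus. The helper `bits` and the sample text are made up for this example.

```python
HYPHEN_MINUS = "-"   # U+002D, typed straight from the keyboard
EN_DASH = "\u2013"   # U+2013, the "longer dash" from the web page

def bits(data: bytes) -> str:
    """Render bytes as a string of binary digits."""
    return "".join(f"{byte:08b}" for byte in data)

print(bits(HYPHEN_MINUS.encode("ascii")))      # 00101101
print(bits(HYPHEN_MINUS.encode("utf-8")))      # 00101101
print(bits(HYPHEN_MINUS.encode("utf-16-be")))  # 0000000000101101

# Hypothetical sample text containing an en dash.
text = "pages 10\u201320"
fixed = text.replace(EN_DASH, HYPHEN_MINUS)
print(fixed)                  # pages 10-20
print(fixed.encode("utf-8"))  # b'pages 10-20'
```

The replacement works on characters, not bytes, so the encoding of the file only matters when the text is read or written.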

BTW, I don't know how Xah intended it, but when he said to "embrace unicode," I interpreted it to mean, "Why don't you just leave em-dashes as em-dashes instead of replacing them with two hyphen-minuses?"

--
-PJ

Gehm's Corollary to Clark's Law: Any technology distinguishable from
magic is insufficiently advanced.