From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: converting between charsets
Date: Mon, 08 May 2006 10:30:48 -0400
To: emacs-devel@gnu.org
In-Reply-To: <84hd40am8t.fsf@vinci.loc> (Alexander Kotelnikov's message of "Mon, 08 May 2006 13:39:46 +0400")
Xref: news.gmane.org gmane.emacs.devel:54096

> It fails. Its default value contains element
> ("^\\(fido7\\|relcom\\)\\.[^,]*\\(,[ \n]*\\(fido7\\|relcom\\)\\.[^,]*\\)*$" koi8-r
> (koi8-r))
> and my post to fido7 hierarchy go in utf-8 anyway.

[ I don't know if your setting is correct and supposed to work, but I'll
  assume it is. ]
Then please report this as a bug with M-x report-emacs-bug (or directly to
the Gnus guys, but be sure to include the kind of information included in
M-x report-emacs-bug).

> Let's first talk about encoding regions. Why does not it work with
> encode-coding-region?

It works.  Any evidence that it doesn't?
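
As a minimal sketch of encoding and decoding by hand (the sample text and
the variable names are illustrative assumptions, not taken from the original
report), one could evaluate something like this in *scratch*:

```elisp
;; Sketch only: `text', `bytes' and `back' are hypothetical names.
(let* ((text "привет")
       ;; Encode the internal representation into KOI8-R bytes.
       (bytes (encode-coding-string text 'koi8-r))
       ;; Decode those bytes back into Emacs's internal representation.
       (back (decode-coding-string bytes 'koi8-r)))
  ;; Return the encoded length and whether the round trip gave back
  ;; the same string.  Compare rather than assume equality: the result
  ;; of decoding need not be identical to the original text.
  (list (length bytes) (string= text back)))
```

The same two steps are available on buffer text as encode-coding-region and
decode-coding-region, which take the region boundaries and a coding system.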

> What about garbage, if encoding/decoding works I can always decode
> into internal representation and encode into desired charset in
> send-hook.

Not always: decoding+encoding can't always be exact inverses of each other.

> I would be happy to get an answer on question: "How do I decode and
> encode in Emacs?"

It's not the right question.  The question you seem to want to ask is "how
do I change the way Emacs's package FOO encodes/decodes my object BAR?"

SM> So one way to do it is to take care of the encoding yourself, which may
SM> amount to doing the whole "send" yourself (i.e. the NIH approach).
SM> Or the

> NIH?

Not Invented Here: the typical reaction of reinventing your own wheel rather
than trying to adapt the ones you're already using (but which you haven't
built yourself).

SM> other way is to figure out how to tell the code that already does the
SM> encoding to use koi8 rather than utf-8.

> There is no such code right now, and, probably, I will write it.

You complain that it uses utf-8, so somewhere a piece of code encodes the
text into utf-8.

>>> 1. Paste into Emacs frame works strange:
SM> What text did you paste?  Where does it come from?

> I type some Russian text in xterm and paste it into Emacs, have a look
> at the attached screenshot.

Oh, I see.  I don't know enough of how this works to help you much further.
If you hit C-u C-x = on the various chars (especially on two similar chars
displayed with different fonts), you'll see that they come from different
charsets (one is probably something like iso-8859-5 and the other may be
unicode).  Emacs-22 doesn't unify them by default.
You can try to put (unify-8859-on-decoding-mode 1) in your .emacs.
And you can also try to play with utf-fragment-on-decoding.
And ask someone more knowledgeable about such problems.
You could even M-x report-emacs-bug about it, since maybe the default config
in a cyrillic locale should already take care of it.
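
A possible .emacs fragment for trying the two settings mentioned above,
offered as a sketch under assumptions: it targets Emacs 22 only, the guards
are there because later Emacs versions changed the internal representation
and treat these settings as obsolete, and using customize-set-variable
(rather than setq) for utf-fragment-on-decoding is an assumption that the
variable needs its Custom setter to take effect.

```elisp
;; ~/.emacs -- sketch for Emacs 22; try ONE of the two alternatives.

;; Alternative 1: unify the iso-8859 charsets with unicode when decoding.
(when (fboundp 'unify-8859-on-decoding-mode)
  (unify-8859-on-decoding-mode 1))

;; Alternative 2 (the opposite direction): decode the relevant Cyrillic
;; and Greek characters of UTF-8 text into the iso-8859 charsets.
;; (when (boundp 'utf-fragment-on-decoding)
;;   (customize-set-variable 'utf-fragment-on-decoding t))
```

After restarting Emacs and pasting the text again, C-u C-x = on two
similar-looking characters shows whether they now belong to the same
charset.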

>>> Cyrillic input in emacs -nw in xterm still does not work, if I just
>>> change X keyboard layout.
SM>
SM> That doesn't give us much to go on, does it?  What does it do, other than
SM> "not work"?

> It beeps.

What does C-h l show after hitting a particular key?


        Stefan