From mboxrd@z Thu Jan  1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Kenichi Handa <handa@m17n.org>
Newsgroups: gmane.emacs.bugs
Subject: Re: non-ASCII TAGS
Date: Wed, 23 Apr 2003 20:01:04 +0900 (JST)
Sender: bug-gnu-emacs-bounces+gnu-bug-gnu-emacs=m.gmane.org@gnu.org
Message-ID: <200304231101.UAA04310@etlken.m17n.org>
References: <rzqu1dgiz0i.fsf@albion.dl.ac.uk>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=US-ASCII
X-Trace: main.gmane.org 1051098171 1666 80.91.224.249 (23 Apr 2003 11:42:51 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Wed, 23 Apr 2003 11:42:51 +0000 (UTC)
Cc: handa@m17n.org
Original-X-From: bug-gnu-emacs-bounces+gnu-bug-gnu-emacs=m.gmane.org@gnu.org Wed Apr 23 13:42:48 2003
Return-path: <bug-gnu-emacs-bounces+gnu-bug-gnu-emacs=m.gmane.org@gnu.org>
Original-Received: from monty-python.gnu.org ([199.232.76.173])
	by main.gmane.org with esmtp (Exim 3.35 #1 (Debian))
	id 198IeG-0000QY-00
	for <gnu-bug-gnu-emacs@m.gmane.org>; Wed, 23 Apr 2003 13:42:48 +0200
Original-Received: from localhost ([127.0.0.1] helo=monty-python.gnu.org)
	by monty-python.gnu.org with esmtp (Exim 4.10.13)
	id 198IZc-0004hC-07
	for gnu-bug-gnu-emacs@m.gmane.org; Wed, 23 Apr 2003 07:38:00 -0400
Original-Received: from list by monty-python.gnu.org with tmda-scanned (Exim 4.10.13)
	id 198IWQ-0003Cy-00
	for bug-gnu-emacs@gnu.org; Wed, 23 Apr 2003 07:34:42 -0400
Original-Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.10.13)
	id 198I27-0005x8-00
	for bug-gnu-emacs@gnu.org; Wed, 23 Apr 2003 07:03:24 -0400
Original-Received: from tsukuba.m17n.org ([192.47.44.130])
	by monty-python.gnu.org with esmtp (Exim 4.10.13)
	id 198I01-0005iH-00
	for bug-gnu-emacs@gnu.org; Wed, 23 Apr 2003 07:01:13 -0400
Original-Received: from fs.m17n.org (fs.m17n.org [192.47.44.2])h3NB14o11582;
	Wed, 23 Apr 2003 20:01:04 +0900 (JST)	(envelope-from handa@m17n.org)
Original-Received: from etlken.m17n.org (etlken.m17n.org [192.47.44.125])
	h3NB14A19898;	Wed, 23 Apr 2003 20:01:04 +0900 (JST)
Original-Received: (from handa@localhost)
	by etlken.m17n.org (8.8.8+Sun/3.7W-2001040620) id UAA04310;
	Wed, 23 Apr 2003 20:01:04 +0900 (JST)
Original-To: d.love@dl.ac.uk
In-reply-to: <rzqu1dgiz0i.fsf@albion.dl.ac.uk> (message from Dave Love on 02
	Apr 2003 18:34:53 +0100)
User-Agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2
	Emacs/21.2.92 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)
Original-cc: bug-gnu-emacs@gnu.org
X-BeenThere: bug-gnu-emacs@gnu.org
X-Mailman-Version: 2.1b5
Precedence: list
List-Id: Bug reports for GNU Emacs, the Swiss army knife of text editors
 <bug-gnu-emacs.gnu.org>
List-Help: <mailto:bug-gnu-emacs-request@gnu.org?subject=help>
List-Post: <mailto:bug-gnu-emacs@gnu.org>
List-Subscribe: <http://mail.gnu.org/mailman/listinfo/bug-gnu-emacs>,
	<mailto:bug-gnu-emacs-request@gnu.org?subject=subscribe>
List-Archive: <http://mail.gnu.org/pipermail/bug-gnu-emacs>
List-Unsubscribe: <http://mail.gnu.org/mailman/listinfo/bug-gnu-emacs>,
	<mailto:bug-gnu-emacs-request@gnu.org?subject=unsubscribe>
Errors-To: bug-gnu-emacs-bounces+gnu-bug-gnu-emacs=m.gmane.org@gnu.org
Xref: main.gmane.org gmane.emacs.bugs:4878
X-Report-Spam: http://spam.gmane.org/gmane.emacs.bugs:4878

Sorry for the late response on this matter.

In article <rzqu1dgiz0i.fsf@albion.dl.ac.uk>, Dave Love <d.love@dl.ac.uk> writes:
> Probably the worst problem with using non-ASCII programming
> identifiers is etags.  It isn't aware of encoding issues and fixing
> the issues is non-trivial, so this is mainly raising a flag and hoping
> someone can work on it.  I think sorting it out requires not only
> extending the TAGS format, but probably also generating it with Emacs.
> I don't have time to work on this, but here's the problem and some
> ideas.

What do you think about the following proposal, which I
think works in most cases without extending the TAGS format?
In cases where it doesn't work, we can still ask people to
specify the coding system explicitly with
C-x RET c __CODING__ RET ESC . .

Of course, as you wrote, recording coding systems in TAGS is
ideal, but it requires more work just to cover rare cases.

Kenichi Handa <handa@m17n.org> writes:
> In article <shpto57wa4.fsf_-_@tux.gnu.franken.de>, Karl Eichwalder <keichwa@gmx.net> writes:
>>  Now the next one: `tags-query-replace' does not work properly when file
>>  names are UTF-8 encoded.  First run `etags *' on the files and then
>>  call `tags-query-replace'.

> This is the same type of bug (but more difficult) as the one I
> posted to emacs-devel under the subject "bad interaction with
> C-x RET c and vc-cvs-registered".

> A tag file contains file names plus parts of source code.
> The former must be decoded by file-name-coding-system, but
> the latter must be decoded by the coding system of each
> file.  It's very hard to decide on a coding system for the
> latter without actually reading the file.

> Perhaps, a tag file must be read as raw-text (thus in a
> unibyte buffer), and if one gives a non-ASCII TAGNAME to
> `find-tag', it must be encoded by the
> buffer-file-coding-system of the current buffer.

And the reply from Richard is as follows:

> That seems like a good approach.  Would someone like to implement it?
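
To make the idea concrete, here is a minimal sketch of the lookup.  It
is written in Python rather than Emacs Lisp purely for illustration, and
it is not an implementation of find-tag.  The function name
find_tag_bytes, its parameters, and the sample data are all
hypothetical.  It assumes the usual TAGS layout: each section starts
with a form feed and newline, then a "FILE,SIZE" line, then entries of
the form "TEXT<DEL>LINE,OFFSET" or "TEXT<DEL>NAME<SOH>LINE,OFFSET".
The tag file is kept as bytes (the "unibyte buffer"), and the tag name
is encoded with the coding system of the current buffer before
searching.

```python
# Sketch only: a Python stand-in for the proposed lookup, not Emacs code.
# All names here are hypothetical.

def find_tag_bytes(tags_data, tagname, buffer_coding):
    """Return (file_name_bytes, line_number) for entries matching tagname.

    tags_data     -- the TAGS file contents, read as raw bytes
    tagname       -- the tag the user typed, as a character string
    buffer_coding -- coding system of the buffer the user is visiting
    """
    # Encode the tag name the way the current buffer's file is encoded.
    needle = tagname.encode(buffer_coding)
    hits = []
    # Each section of a TAGS file starts with form feed + newline.
    for section in tags_data.split(b"\x0c\n")[1:]:
        header, _, body = section.partition(b"\n")
        # Header is "FILE,SIZE"; keep the file name as undecoded bytes.
        file_name = header.rsplit(b",", 1)[0]
        for entry in body.split(b"\n"):
            text, sep, rest = entry.partition(b"\x7f")
            if not sep:
                continue  # not a tag entry
            # rest is b"LINE,OFFSET" or b"NAME\x01LINE,OFFSET".
            name, _, position = rest.rpartition(b"\x01")
            if needle in text or needle in name:
                line_field = position.split(b",")[0]
                line_no = int(line_field) if line_field else None
                hits.append((file_name, line_no))
    return hits


# Hypothetical sample: a UTF-8 file name, Latin-1 source text.
tags = (b"\x0c\n"
        + "caf\u00e9.c".encode("utf-8") + b",30\n"
        + "int na\u00efve(".encode("latin-1") + b"\x7f3,20\n")

print(find_tag_bytes(tags, "na\u00efve", "latin-1"))
print(find_tag_bytes(tags, "na\u00efve", "utf-8"))
```

The second call finds nothing because the tag name is encoded
differently from the source text; that is the case where the user would
fall back on C-x RET c.  The file name stays as bytes in the result and
would be decoded separately by file-name-coding-system.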

---
Ken'ichi HANDA
handa@m17n.org