From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: etags test is broken on MS-Windows
Date: Fri, 22 May 2015 23:05:42 +0300
Message-ID: <83vbfk2x9l.fsf@gnu.org>
In-reply-to: <87fv6oo0h3.fsf@igel.home>
To: Andreas Schwab
Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org

> From: Andreas Schwab
> Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
> Date: Fri, 22 May 2015 21:50:48 +0200
>
> Eli Zaretskii writes:
>
> > Or maybe you mean the use case where a Latin-1 file is read into an
> > Emacs buffer, and each non-ASCII character is expanded into a UTF-8
> > sequence. Indeed, that will make the byte counts inaccurate (and
> > etags.el will have to compensate by searching around the specified
> > place). One more reason not to change anything, I guess.
>
> ??? It's exactly the counter argument. The indices in the tag file must
> be file offsets, everything else will lead to wrong offsets.

If by "file offsets" you mean counting bytes in the file, then those
will also be wrong after decoding non-ASCII characters, unless the
file was encoded in UTF-8 to begin with, right?

And if you mean counting characters in the file, then etags will be
unable to do that, unless it grows the capability to detect the
encoding of the file, or rely on the locale and assume that the file
is encoded in the locale's codeset. Right?

Or am I again missing something?
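
To make the two readings of "offset" concrete, here is a small Python sketch. It is not etags or etags.el code; the sample line and the helper name `char_offset` are made up for illustration. It encodes the same line of text as Latin-1 and as UTF-8, then compares where one tag lands when counting bytes versus counting characters.

```python
# Hypothetical one-line "source file" with one non-ASCII character (e-acute).
text = "caf\u00e9 = 1; foo = 2\n"

# The same text as it would sit on disk in two different encodings.
latin1_bytes = text.encode("latin-1")   # e-acute is 1 byte
utf8_bytes = text.encode("utf-8")       # e-acute is 2 bytes

# Position of the tag "foo" under each way of counting.
char_off = text.index("foo")              # counting characters
latin1_off = latin1_bytes.index(b"foo")   # counting bytes in the Latin-1 file
utf8_off = utf8_bytes.index(b"foo")       # counting bytes in the UTF-8 form


def char_offset(data: bytes, byte_off: int, encoding: str) -> int:
    """Convert a byte offset into a character offset.

    This only works if the caller knows the encoding of `data`,
    which is exactly the knowledge a byte-counting tool lacks.
    """
    return len(data[:byte_off].decode(encoding))


print("character offset:      ", char_off)
print("byte offset (Latin-1): ", latin1_off)
print("byte offset (UTF-8):   ", utf8_off)
print("recovered from Latin-1:", char_offset(latin1_bytes, latin1_off, "latin-1"))
print("recovered from UTF-8:  ", char_offset(utf8_bytes, utf8_off, "utf-8"))
```

In this sketch the byte offset in the Latin-1 data equals the character offset, while the byte offset in the UTF-8 form is one larger, because the single non-ASCII character before the tag grew to two bytes. Mapping a byte offset back to a character position needs the encoding as an input, which is the point both paragraphs above are circling.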