* Re: etags test is broken on MS-Windows
From: Eli Zaretskii @ 2015-05-19 15:27 UTC
To: Paul Eggert; +Cc: emacs-devel

[Sorry, I didn't mean to discuss this in private, I just forgot to CC
the list. Adding it now, and repeating my original message.]

I wrote:

> > Commit e0117b1 changed the new etags test suite in a way that makes it
> > always be skipped on MS-Windows (and in general on any platform that
> > doesn't have the 'locale' command or doesn't have a UTF-8 locale
> > installed).
> >
> > I don't understand why a test suite needs to use UTF-8, but I don't
> > really mind as long as the tests can run on all supported platforms.
> > Can we fix the test to not require these features, please?

And Paul answered:

> Date: Mon, 18 May 2015 18:14:10 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
>
> > I don't
> > really mind as long as the tests can run on all supported platforms.
>
> Without that patch, the tests failed on my GNU/Linux host due to encoding
> problems. See attached file

I don't think it's due to an encoding problem. (AFAIK, etags doesn't
regard its input as characters, but as a stream of bytes.)

I think it's due to DOS CR-LF EOL format of some files in the test
suite. For example, the first file whose tags were different in your
testing is dostorture.c, which has DOS EOLs, the second file, c.C, has
a lone ^M character at the end of one of its lines, and so on.

Could you please verify that this is indeed the source of the problem?

(There's also an unrelated problem with the gzip-compressed file in
f-src, which seems to be some Windows-specific glitch; I will look
into it separately.)

> > Can we fix the test to not require these features, please?
>
> I don't know what will work on MS-Windows, but I checked in a stab
> at it.

Thanks, it works now, but I have the same problems due to EOL format,
and in the same files, just in reverse.

If we agree that the problem is due to EOL format, we could try
thinking about a solution. The root cause for the problem is that on
Windows, etags accounts for the stripped CR characters, while on Unix
it treats them as part of the contents, so the byte counts are offset
by the number of the preceding lines.

> If this fails, I suggest removing all the non-ASCII characters from
> these test files and then regenerating the "good" data to match.

I don't see this as necessary, not yet.

Thanks.

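The offset described above can be made concrete with a small sketch. The function below is hypothetical (it is not taken from etags.c); it only illustrates how the start offset of a line differs depending on whether a CR that precedes an LF is counted as a byte of the contents or treated as already stripped.

```c
#include <stddef.h>

/* Hypothetical helper, not part of etags: return the byte offset at
   which 1-based line LINE_NO starts in BUF (LEN bytes long).
   If COUNT_CR is nonzero, a CR immediately followed by LF is counted,
   as when the file is treated as raw bytes.  Otherwise such a CR is
   skipped, as if CRLF had already been converted to LF.  */
size_t
line_start_offset (const char *buf, size_t len, size_t line_no, int count_cr)
{
  size_t offset = 0;   /* bytes counted so far */
  size_t line = 1;     /* current 1-based line number */

  for (size_t i = 0; i < len && line < line_no; i++)
    {
      int cr_before_lf = (buf[i] == '\r' && i + 1 < len
                          && buf[i + 1] == '\n');
      if (count_cr || !cr_before_lf)
        offset++;
      if (buf[i] == '\n')
        line++;
    }
  return offset;
}
```

For a file whose lines all end in CRLF, the two results for line N differ by N - 1, which is the "number of the preceding lines" mentioned in the message.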
* Re: etags test is broken on MS-Windows
From: Paul Eggert @ 2015-05-19 17:57 UTC
To: Eli Zaretskii; +Cc: emacs-devel

On 05/19/2015 08:27 AM, Eli Zaretskii wrote:
> I think it's due to DOS CR-LF EOL format of some files in the test suite.

You're right, I misdiagnosed the porting problem. Sorry about that.

> If we agree that the problem is due to EOL format, we could try
> thinking about a solution. The root cause for the problem is that on
> Windows, etags accounts for the stripped CR characters, while on Unix
> it treats them as part of the contents, so the byte counts are offset
> by the number of the preceding lines.

That sounds like a problem, but not a problem that the test case is
trying to detect. A simple way that should cajole the tests into
passing is to remove the trailing CRs from the test data, so I installed
a patch to do that. If we ever want to make ctags output portable among
Unix vs DOS conventions we can bring back test cases involving CRs.

* Re: etags test is broken on MS-Windows
From: Eli Zaretskii @ 2015-05-19 18:26 UTC
To: Paul Eggert, Francesco Potortì; +Cc: emacs-devel

> Date: Tue, 19 May 2015 10:57:57 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> Cc: emacs-devel@gnu.org
>
> On 05/19/2015 08:27 AM, Eli Zaretskii wrote:
> > I think it's due to DOS CR-LF EOL format of some files in the test suite.
>
> You're right, I misdiagnosed the porting problem. Sorry about that.

Well, I should have thought about that (and tested it) before
committing the test suite in the first place.

> > If we agree that the problem is due to EOL format, we could try
> > thinking about a solution. The root cause for the problem is that on
> > Windows, etags accounts for the stripped CR characters, while on Unix
> > it treats them as part of the contents, so the byte counts are offset
> > by the number of the preceding lines.
>
> That sounds like a problem, but not a problem that the test case is
> trying to detect. A simple way that should cajole the tests into
> passing is to remove the trailing CRs from the test data, so I installed
> a patch to do that.

Thanks. I'm not sure the test suite wasn't trying to test this,
though: dostorture.c seems to be an exact copy of torture.c, except
for the EOL format.

Francesco, can you please comment on this? Given that the Unix build
of etags does not remove the CR characters from DOS CR-LF EOLs, what
was the purpose of including files with DOS EOLs in the test suite?

* Re: etags test is broken on MS-Windows
From: Eli Zaretskii @ 2015-05-20 15:38 UTC
To: Paul Eggert, Francesco Potortì; +Cc: emacs-devel

> Date: Tue, 19 May 2015 18:27:44 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
>
> (There's also an unrelated problem with the gzip-compressed file in
> f-src, which seems to be some Windows-specific glitch; I will look
> into it separately.)

I found the reason for this: etags calls 'rewind' on a FILE stream
that was created by 'popen', which is non-portable, AFAIK. On
Windows, that caused the initial portions of the input to be skipped
by etags, i.e. some symbols were not tagged.

There are a few comments about that in the source, like this:

      /* We rewind here, even if inf may be a pipe. We fail if the
         length of the first line is longer than the pipe block size,
         which is unlikely. */
      rewind (inf);

These comments notwithstanding, it sounds like etags expects this to
work satisfactorily at least on GNU/Linux, and at least when "length
of the first line is not longer than the pipe block size", otherwise I
don't understand why the test suite includes gzip-compressed files
(Francesco?).

So, on the assumption that this does work on Posix hosts, at least
those that use glibc, I hacked etags to provide a Windows-specific
replacement for 'rewind' that supports this expectation, assuming the
stuff read and buffered before the call to 'rewind' is less than a
full buffer of the FILE object. Then the Windows build no longer
misses symbols in the first part of the compressed files.

However, now I see something strange in the ETAGS.good files, which
AFAIU were produced by Paul on a Posix host. Please look at this
excerpt from ETAGS.good_1:

f-src/entry.for,172
 LOGICAL FUNCTION PRTPKG ^?3,75
 ENTRY SETPRT ^?194,3866
 ENTRY MSGSEL ^?395,8478
 & intensity1(^?577,12231
 character*(*) function foo(^?579,12307
^L
f-src/entry.strange_suffix,172
 LOGICAL FUNCTION PRTPKG ^?3,75
 ENTRY SETPRT ^?194,3866
 ENTRY MSGSEL ^?395,8478
 & intensity1(^?577,12231
 character*(*) function foo(^?579,12307
^L
f-src/entry.strange,171
 LOGICAL FUNCTION PRTPKG ^?2,2
 ENTRY SETPRT ^?193,3793
 ENTRY MSGSEL ^?394,8405
 & intensity1(^?576,12158
 character*(*) function foo(^?578,12234

Now, these 3 files have exactly identical contents, and the _only_
difference between the first 2 and the 3rd is that the latter is
gzip-compressed. So that should be the only reason why all its line
counts are off by 1, and its byte counts are all off by 73, which just
happens to be the length of the first line of the (uncompressed) file.

So could it be that rewinding a 'popen'-created stream doesn't work
correctly on GNU/Linux as well? If so, we will have to make changes
in etags to not do that, I think, and instead reuse the already-read
stuff as needed.

* Re: etags test is broken on MS-Windows
From: Paul Eggert @ 2015-05-21 5:05 UTC
To: Eli Zaretskii, Francesco Potortì; +Cc: emacs-devel

Eli Zaretskii wrote:
> So could it be that rewinding a 'popen'-created stream doesn't work
> correctly on GNU/Linux as well? If so, we will have to make changes
> in etags to not do that, I think

I think you're right. The behavior of rewind on pipes is
implementation-defined and etags shouldn't rely on it.

* Re: etags test is broken on MS-Windows
From: Francesco Potortì @ 2015-05-21 13:24 UTC
To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel

>> (There's also an unrelated problem with the gzip-compressed file in
>> f-src, which seems to be some Windows-specific glitch; I will look
>> into it separately.)
>
>I found the reason for this: etags calls 'rewind' on a FILE stream
>that was created by 'popen', which is non-portable, AFAIK. On
>Windows, that caused the initial portions of the input to be skipped
>by etags, i.e. some symbols were not tagged.
>
>There are a few comments about that in the source, like this:
>
>      /* We rewind here, even if inf may be a pipe. We fail if the
>         length of the first line is longer than the pipe block size,
>         which is unlikely. */
>      rewind (inf);
>
>These comments notwithstanding, it sounds like etags expects this to
>work satisfactorily at least on GNU/Linux, and at least when "length
>of the first line is not longer than the pipe block size", otherwise I
>don't understand why the test suite includes gzip-compressed files
>(Francesco?).

Yes, that's it. I implemented the gzip feature myself, and I included
tests for it. A source file should produce the same tags whether
compressed or not.

>So, on the assumption that this does work on Posix hosts, at least
>those that use glibc, I hacked etags to provide a Windows-specific
>replacement for 'rewind' that supports this expectation, assuming the
>stuff read and buffered before the call to 'rewind' is less than a
>full buffer of the FILE object. Then the Windows build no longer
>misses symbols in the first part of the compressed files.
>
>However, now I see something strange in the ETAGS.good files, which
>AFAIU were produced by Paul on a Posix host. Please look at this
>excerpt from ETAGS.good_1:
>
>f-src/entry.for,172
> LOGICAL FUNCTION PRTPKG ^?3,75
> ENTRY SETPRT ^?194,3866
> ENTRY MSGSEL ^?395,8478
> & intensity1(^?577,12231
> character*(*) function foo(^?579,12307
>^L
>f-src/entry.strange_suffix,172
> LOGICAL FUNCTION PRTPKG ^?3,75
> ENTRY SETPRT ^?194,3866
> ENTRY MSGSEL ^?395,8478
> & intensity1(^?577,12231
> character*(*) function foo(^?579,12307
>^L
>f-src/entry.strange,171
> LOGICAL FUNCTION PRTPKG ^?2,2
> ENTRY SETPRT ^?193,3793
> ENTRY MSGSEL ^?394,8405
> & intensity1(^?576,12158
> character*(*) function foo(^?578,12234
>
>Now, these 3 files have exactly identical contents, and the _only_
>difference between the first 2 and the 3rd is that the latter is
>gzip-compressed. So that should be the only reason why all its line
>counts are off by 1, and its byte counts are all off by 73, which just
>happens to be the length of the first line of the (uncompressed) file.

This is a bug.

>So could it be that rewinding a 'popen'-created stream doesn't work
>correctly on GNU/Linux as well? If so, we will have to make changes
>in etags to not do that, I think, and instead reuse the already-read
>stuff as needed.

It could well be. It may have happened that, when I checked that the
TAGS files were the same, I just looked at them without running diff and
I missed this discrepancy.

* Re: etags test is broken on MS-Windows
From: Eli Zaretskii @ 2015-05-21 16:49 UTC
To: Francesco Potortì; +Cc: eggert, emacs-devel

> Date: Thu, 21 May 2015 15:24:44 +0200
> From: Francesco Potortì <pot@gnu.org>
> Cc: emacs-devel@gnu.org, Paul Eggert <eggert@cs.ucla.edu>
>
> >f-src/entry.for,172
> > LOGICAL FUNCTION PRTPKG ^?3,75
> > ENTRY SETPRT ^?194,3866
> > ENTRY MSGSEL ^?395,8478
> > & intensity1(^?577,12231
> > character*(*) function foo(^?579,12307
> >^L
> >f-src/entry.strange_suffix,172
> > LOGICAL FUNCTION PRTPKG ^?3,75
> > ENTRY SETPRT ^?194,3866
> > ENTRY MSGSEL ^?395,8478
> > & intensity1(^?577,12231
> > character*(*) function foo(^?579,12307
> >^L
> >f-src/entry.strange,171
> > LOGICAL FUNCTION PRTPKG ^?2,2
> > ENTRY SETPRT ^?193,3793
> > ENTRY MSGSEL ^?394,8405
> > & intensity1(^?576,12158
> > character*(*) function foo(^?578,12234
> >
> >Now, these 3 files have exactly identical contents, and the _only_
> >difference between the first 2 and the 3rd is that the latter is
> >gzip-compressed. So that should be the only reason why all its line
> >counts are off by 1, and its byte counts are all off by 73, which just
> >happens to be the length of the first line of the (uncompressed) file.
>
> This is a bug.
>
> >So could it be that rewinding a 'popen'-created stream doesn't work
> >correctly on GNU/Linux as well? If so, we will have to make changes
> >in etags to not do that, I think, and instead reuse the already-read
> >stuff as needed.
>
> It could well be. It may have happened that, when I checked that the
> TAGS files were the same, I just looked at them without running diff and
> I missed this discrepancy.

After thinking a bit about the alternative solution, I concluded that
the simplest will be to decompress to a temporary file and read from
there. Does the patch below look OK? Or can someone think about a
more elegant way of solving this?

diff --git a/lib-src/etags.c b/lib-src/etags.c
index 0a308c1..28729da 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -116,6 +116,7 @@ CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 # undef HAVE_NTGUI
 # undef DOS_NT
 # define DOS_NT
+# define O_CLOEXEC O_NOINHERIT
 #endif /* WINDOWSNT */
 
 #include <unistd.h>
@@ -125,6 +126,7 @@ CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 #include <sysstdio.h>
 #include <ctype.h>
 #include <errno.h>
+#include <fcntl.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <binary-io.h>
@@ -336,6 +338,7 @@ CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 static char *absolute_dirname (char *, char *);
 static bool filename_is_absolute (char *f);
 static void canonicalize_filename (char *);
+static char *etags_mktmp (void);
 static void linebuffer_init (linebuffer *);
 static void linebuffer_setlen (linebuffer *, int);
 static void *xmalloc (size_t);
@@ -1437,7 +1440,7 @@ C code are parsed as C code (use --help --lang=c --lang=yacc\n\
   fdesc *fdp;
   compressor *compr;
   char *compressed_name, *uncompressed_name;
-  char *ext, *real_name;
+  char *ext, *real_name, *tmp_name;
   int retval;
 
   canonicalize_filename (file);
@@ -1522,9 +1525,20 @@ C code are parsed as C code (use --help --lang=c --lang=yacc\n\
     }
   if (real_name == compressed_name)
     {
-      char *cmd = concat (compr->command, " ", real_name);
-      inf = popen (cmd, "r" FOPEN_BINARY);
-      free (cmd);
+      tmp_name = etags_mktmp ();
+      if (!tmp_name)
+        inf = NULL;
+      else
+        {
+          char *cmd1 = concat (compr->command, " ", real_name);
+          char *cmd = concat (cmd1, " > ", tmp_name);
+          free (cmd1);
+          if (system (cmd) == -1)
+            inf = NULL;
+          else
+            inf = fopen (tmp_name, "r" FOPEN_BINARY);
+          free (cmd);
+        }
     }
   else
     inf = fopen (real_name, "r" FOPEN_BINARY);
@@ -1536,10 +1550,12 @@ C code are parsed as C code (use --help --lang=c --lang=yacc\n\
 
   process_file (inf, uncompressed_name, lang);
 
+  retval = fclose (inf);
   if (real_name == compressed_name)
-    retval = pclose (inf);
-  else
-    retval = fclose (inf);
+    {
+      remove (tmp_name);
+      free (tmp_name);
+    }
   if (retval < 0)
     pfatal (file);
 
@@ -1707,9 +1723,6 @@ C code are parsed as C code (use --help --lang=c --lang=yacc\n\
         }
     }
 
-  /* We rewind here, even if inf may be a pipe. We fail if the
-     length of the first line is longer than the pipe block size,
-     which is unlikely. */
   rewind (inf);
 
   /* Else try to guess the language given the case insensitive file name. */
@@ -1734,8 +1747,6 @@ C code are parsed as C code (use --help --lang=c --lang=yacc\n\
       if (old_last_node == last_node)
         /* No Fortran entries found. Try C. */
         {
-          /* We do not tag if rewind fails.
-             Only the file name will be recorded in the tags file. */
           rewind (inf);
           curfdp->lang = get_language_from_langname (cplusplus ? "c++" : "c");
           find_entries (inf);
@@ -5015,8 +5026,6 @@ enum, 0, st_C_enum
       TEX_opgrp = '<';
       TEX_clgrp = '>';
     }
-  /* If the input file is compressed, inf is a pipe, and rewind may fail.
-     No attempt is made to correct the situation. */
   rewind (inf);
 }
 
@@ -6344,6 +6353,51 @@ enum, 0, st_C_enum
   return path;
 }
 
+/* Return a newly allocated string containing a name of a temporary file. */
+static char *
+etags_mktmp (void)
+{
+  const char *tmpdir = getenv ("TMPDIR");
+  const char *slash = "/";
+
+#if MSDOS || defined (DOS_NT)
+  if (!tmpdir)
+    tmpdir = getenv ("TEMP");
+  if (!tmpdir)
+    tmpdir = getenv ("TMP");
+  if (!tmpdir)
+    tmpdir = ".";
+  if (tmpdir[strlen (tmpdir) - 1] == '/'
+      || tmpdir[strlen (tmpdir) - 1] == '\\')
+    slash = "";
+#else
+  if (!tmpdir)
+    tmpdir = "/tmp";
+  if (tmpdir[strlen (tmpdir) - 1] == '/')
+    slash = "";
+#endif
+
+  char *templt = concat (tmpdir, slash, "etXXXXXX");
+  int fd = mkostemp (templt, O_CLOEXEC);
+  if (fd < 0)
+    {
+      free (templt);
+      templt = NULL;
+    }
+  else
+    close (fd);
+
+#if defined (DOS_NT)
+  /* The file name will be used in shell redirection, so it needs to have
+     DOS-style backslashes, or else the Windows shell will barf. */
+  char *p;
+  for (p = templt; *p; p++)
+    if (*p == '/')
+      *p = '\\';
+#endif
+  return templt;
+}
+
 /* Return a newly allocated string containing the file name of FILE relative to
    the absolute directory DIR (which should end with a slash). */
 static char *

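The idea behind the patch (read from a regular file rather than from a pipe, so that 'rewind' is reliable) can be sketched in isolation. This is not the code from the patch: the patch creates a named file with mkostemp and fills it through shell redirection, whereas the hypothetical helper below uses the standard tmpfile function and copies from an already open stream.

```c
#include <stdio.h>

/* Hypothetical helper, not taken from etags.c: copy everything
   readable from IN into an anonymous temporary file and return that
   file positioned at its start, or NULL on failure.  The result is a
   regular file, so rewinding it does not depend on pipe behavior.  */
FILE *
spool_to_tmpfile (FILE *in)
{
  FILE *tmp = tmpfile ();
  if (!tmp)
    return NULL;

  char buf[4096];
  size_t n;
  while ((n = fread (buf, 1, sizeof buf, in)) > 0)
    if (fwrite (buf, 1, n, tmp) != n)
      {
        fclose (tmp);
        return NULL;
      }

  /* Report read errors on IN and write errors on TMP as failure.  */
  if (ferror (in) || fflush (tmp) != 0)
    {
      fclose (tmp);
      return NULL;
    }
  rewind (tmp);
  return tmp;
}
```

On a POSIX host the argument could be a stream obtained from popen running a decompressor; the caller would then pclose the pipe and keep reading (and rewinding) the returned stream.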
* Re: etags test is broken on MS-Windows
From: Eli Zaretskii @ 2015-05-23 8:46 UTC
To: pot; +Cc: eggert, emacs-devel

> Date: Thu, 21 May 2015 19:49:22 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org
>
> > >Now, these 3 files have exactly identical contents, and the _only_
> > >difference between the first 2 and the 3rd is that the latter is
> > >gzip-compressed. So that should be the only reason why all its line
> > >counts are off by 1, and its byte counts are all off by 73, which just
> > >happens to be the length of the first line of the (uncompressed) file.
> >
> > This is a bug.
> >
> > >So could it be that rewinding a 'popen'-created stream doesn't work
> > >correctly on GNU/Linux as well? If so, we will have to make changes
> > >in etags to not do that, I think, and instead reuse the already-read
> > >stuff as needed.
> >
> > It could well be. It may have happened that, when I checked that the
> > TAGS files were the same, I just looked at them without running diff and
> > I missed this discrepancy.
>
> After thinking a bit about the alternative solution, I concluded that
> the simplest will be to decompress to a temporary file and read from
> there. Does the patch below look OK? Or can someone think about a
> more elegant way of solving this?

No comments, so I pushed these changes.

* Re: etags test is broken on MS-Windows
From: Francesco Potortì @ 2015-05-21 13:16 UTC
To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel

>> Without that patch, the tests failed on my GNU/Linux host due to encoding
>> problems. See attached file
>
>I don't think it's due to an encoding problem. (AFAIK, etags doesn't
>regard its input as characters, but as a stream of bytes.)

Yes, etags has no notion of character sets.

>I think it's due to DOS CR-LF EOL format of some files in the test
>suite. For example, the first file whose tags were different in your
>testing is dostorture.c, which has DOS EOLs, the second file, c.C, has
>a lone ^M character at the end of one of its lines, and so on.
>
>Could you please verify that this is indeed the source of the problem?

Those files were put there to test the behaviour of etags with different
EOL styles. However, few tests were in fact done for etags running on
DOS systems, so in fact there may be undetected regressions on etags for
DOS.

About utf-8, etags' behaviour should be independent of locale...

* Re: etags test is broken on MS-Windows
From: Eli Zaretskii @ 2015-05-21 16:31 UTC
To: Francesco Potortì; +Cc: eggert, emacs-devel

> Date: Thu, 21 May 2015 15:16:01 +0200
> From: Francesco Potortì <pot@gnu.org>
> Cc: emacs-devel@gnu.org, Paul Eggert <eggert@cs.ucla.edu>
>
> >I think it's due to DOS CR-LF EOL format of some files in the test
> >suite. For example, the first file whose tags were different in your
> >testing is dostorture.c, which has DOS EOLs, the second file, c.C, has
> >a lone ^M character at the end of one of its lines, and so on.
> >
> >Could you please verify that this is indeed the source of the problem?
>
> Those files were put there to test the behaviour of etags with different
> EOL styles. However, few tests were in fact done for etags running on
> DOS systems, so in fact there may be undetected regressions on etags for
> DOS.

There's no problem with etags on DOS and Windows, it behaves exactly
as designed and implemented.

The problem is on Unix: because etags on Unix does not strip the CR
characters, its character counts are wrong, because Emacs will strip
them when it reads the source file. IOW, what was at some point only
done by Emacs on DOS and Windows, is now done by default on all
platforms.

So I think etags should use the same code in Unix as well. I mean
this fragment:

          if (c == '\n')
            {
              if (p > buffer && p[-1] == '\r')
                {
                  p -= 1;
#ifdef DOS_NT
                 /* Assume CRLF->LF translation will be performed by Emacs
                    when loading this file, so CRs won't appear in the buffer.
                    It would be cleaner to compensate within Emacs;
                    however, Emacs does not know how many CRs were deleted
                    before any given point in the file. */
                  chars_deleted = 1;
#else
                  chars_deleted = 2;
#endif
                }

* Re: etags test is broken on MS-Windows
From: Paul Eggert @ 2015-05-21 16:37 UTC
To: Eli Zaretskii, Francesco Potortì; +Cc: emacs-devel

On 05/21/2015 09:31 AM, Eli Zaretskii wrote:
> I think etags should use the
> same code in Unix as well. I mean this fragment:
>
>           if (c == '\n')
>             {
>               if (p > buffer && p[-1] == '\r')
>                 {
>                   p -= 1;
> #ifdef DOS_NT
>                  /* Assume CRLF->LF translation will be performed by Emacs
>                     when loading this file, so CRs won't appear in the buffer.
>                     It would be cleaner to compensate within Emacs;
>                     however, Emacs does not know how many CRs were deleted
>                     before any given point in the file. */
>                   chars_deleted = 1;
> #else
>                   chars_deleted = 2;
> #endif
>                 }

Sorry, I'm a little lost. Would it actually work with an Emacs on a
GNUish host if we simply set chars_deleted = 1 here?

If etags is locale-agnostic, its output files must contain byte counts
and not character counts. This is because etags doesn't even know where
the characters are. And if the output files contain byte counts, surely
they need to count the CR bytes as well as the LF bytes, at least on a
GNU or POSIXish host.

* Re: etags test is broken on MS-Windows
From: Eli Zaretskii @ 2015-05-21 16:55 UTC
To: Paul Eggert; +Cc: pot, emacs-devel

> Date: Thu, 21 May 2015 09:37:02 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: emacs-devel@gnu.org
>
> >           if (c == '\n')
> >             {
> >               if (p > buffer && p[-1] == '\r')
> >                 {
> >                   p -= 1;
> > #ifdef DOS_NT
> >                  /* Assume CRLF->LF translation will be performed by Emacs
> >                     when loading this file, so CRs won't appear in the buffer.
> >                     It would be cleaner to compensate within Emacs;
> >                     however, Emacs does not know how many CRs were deleted
> >                     before any given point in the file. */
> >                   chars_deleted = 1;
> > #else
> >                   chars_deleted = 2;
> > #endif
> >                 }
>
> Sorry, I'm a little lost. Would it actually work with an Emacs on a
> GNUish host if we simply set chars_deleted = 1 here?

I think it will, and that's what I was suggesting: remove the #ifdef
and use the code currently conditioned by DOS_NT.

> If etags is locale-agnostic, its output files must contain byte counts
> and not character counts.

Yes, they are called "character counts", but are actually byte counts.

> And if the output files contain byte counts, surely they need to
> count the CR bytes as well as the LF bytes, at least on a GNU or
> POSIXish host.

I think CRs don't need to be counted, because they will not be in the
Emacs buffer when a DOS-ish file is visited, due to EOL decoding.
IOW, the "CRLF->LF translation" that the comment mentions is done on
all platforms. Or am I missing something?

* Re: etags test is broken on MS-Windows
From: Paul Eggert @ 2015-05-21 19:03 UTC
To: Eli Zaretskii; +Cc: pot, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 575 bytes --]

On 05/21/2015 09:55 AM, Eli Zaretskii wrote:
> IOW, the "CRLF->LF translation" that the comment mentions is done on
> all platforms. Or am I missing something?

I was thinking about the case where a source file has mostly lines with
LF but a few lines end in CRLF. E.g., the attached file has a CR at the
end of the second line. In that case, Emacs doesn't strip the trailing
CRs on GNU/Linux. Wouldn't the byte counts get messed up then?

Come to think of it, one of the etags test cases did that before I
removed the CR (and perhaps that was part of the test...).

[-- Attachment #2: xx.c --]
[-- Type: text/plain, Size: 23 bytes --]

int x;
char y;
int z;

* Re: etags test is broken on MS-Windows
From: Eli Zaretskii @ 2015-05-21 19:54 UTC
To: Paul Eggert; +Cc: pot, emacs-devel

> Date: Thu, 21 May 2015 12:03:44 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: pot@gnu.org, emacs-devel@gnu.org
>
> On 05/21/2015 09:55 AM, Eli Zaretskii wrote:
> > IOW, the "CRLF->LF translation" that the comment mentions is done on
> > all platforms. Or am I missing something?
>
> I was thinking about the case where a source file has mostly lines with
> LF but a few lines end in CRLF. E.g., the attached file has a CR at the
> end of the second line. In that case, Emacs doesn't strip the trailing
> CRs on GNU/Linux. Wouldn't the byte counts get messed up then?

Yes, they would, but it's not fatal, since etags.el searches around
the position for the pattern stated on the tag line.

And of course, in the case you present, the byte counts will be
slightly off on Windows as well.

But the way etags works currently, a file with all of its lines ending
in CRLF will _always_ have all of its byte counts messed up. Not a
catastrophe, either, but still worse than under my suggestion.

> Come to think of it, one of the etags test cases did that before I
> removed the CR (and perhaps that was part of the test...).

Yes, one of the files has a single line with CRLF (I thought it was
part of the test as well).

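Why a slightly wrong byte count is "not fatal" can be illustrated with a sketch of a search around an approximate position. This is not the algorithm used by etags.el, which is written in Emacs Lisp; the function name, the window parameter and the closest-match rule are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration, not etags.el's actual search: return the
   offset of the occurrence of PAT in BUF (LEN bytes) that starts
   within WINDOW bytes of GUESS and is closest to it, or (size_t) -1
   if there is no such occurrence.  */
size_t
find_near (const char *buf, size_t len, const char *pat,
           size_t guess, size_t window)
{
  size_t plen = strlen (pat);
  if (plen == 0 || plen > len)
    return (size_t) -1;

  size_t last = len - plen;   /* last offset where PAT could start */
  size_t lo = guess > window ? guess - window : 0;
  size_t hi = (guess > last || window > last - guess) ? last : guess + window;
  size_t best = (size_t) -1;
  size_t best_dist = 0;

  for (size_t i = lo; i <= hi; i++)
    if (memcmp (buf + i, pat, plen) == 0)
      {
        size_t dist = i > guess ? i - guess : guess - i;
        if (best == (size_t) -1 || dist < best_dist)
          {
            best = i;
            best_dist = dist;
          }
      }
  return best;
}
```

With a recorded position that is off by a few bytes because of uncounted or overcounted CRs, a search of this kind still lands on the right line as long as the error stays within the window.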
* Re: etags test is broken on MS-Windows
From: Paul Eggert @ 2015-05-21 23:28 UTC
To: Eli Zaretskii; +Cc: pot, emacs-devel

On 05/21/2015 12:54 PM, Eli Zaretskii wrote:
> Yes, they would, but it's not fatal, since etags.el searches around
> the position for the pattern stated on the tag line.
>
> And of course, in the case you present, the byte counts will be
> slightly off on Windows as well.
>
> But the way etags works currently, a file with all of its lines ending
> in CRLF will _always_ have all of its byte counts messed up. Not a
> catastrophe, either, but still worse than under my suggestion.

I don't see why it's worth our trouble to substitute one incorrect
solution for another, if it's OK that the solutions are approximate.

If it's important to fix this, how about the following idea instead.
Have etags always compute byte offsets the POSIX way, counting any CRs,
and put POSIX-oriented byte counts into the TAGS file (the way it
already does on GNU hosts). When Emacs starts up, if the source file is
in DOS mode (with CRLF replaced by LF internally), Emacs subtracts the
line count from the POSIX byte count, and uses the resulting byte count
instead. That way, we don't need to change how etags works on GNU
platforms, nor do we need to tell GNU users to regenerate their TAGS
files.

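The arithmetic in this proposal is small enough to write down. The helper below is hypothetical (the thread does not show any such code in Emacs or etags) and assumes that the recorded offset is that of the start of the tagged line and that every line before it ended in CRLF.

```c
#include <stddef.h>

/* Hypothetical helper illustrating the proposal: LINE is the 1-based
   line number recorded for a tag and RAW_OFFSET is its byte offset
   with CRs counted.  Assuming every preceding line ended in CRLF,
   return the offset in a buffer where each CRLF has become LF.  */
size_t
crlf_adjusted_offset (size_t line, size_t raw_offset)
{
  size_t preceding = line > 0 ? line - 1 : 0;   /* lines before the tag */

  /* Each preceding line contributed one CR that the buffer lacks.  */
  return raw_offset >= preceding ? raw_offset - preceding : 0;
}
```

For example, a tag on line 3 of an all-CRLF file whose first two lines are 8 bytes each has a raw offset of 16 and an adjusted offset of 14. A file with mixed line endings would not satisfy the stated assumption.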
* Re: etags test is broken on MS-Windows
From: Eli Zaretskii @ 2015-05-22 8:32 UTC
To: Paul Eggert; +Cc: pot, emacs-devel

> Date: Thu, 21 May 2015 16:28:21 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: pot@gnu.org, emacs-devel@gnu.org
>
> I don't see why it's worth our trouble to substitute one incorrect
> solution for another, if it's OK that the solutions are approximate.

It's OK if we don't want to include back in the test suite the files
with DOS EOLs that caused the trouble in the first place. If we don't
care about that subtle feature, I'm OK with the current code. After
all, it worked nicely until now.

* Re: etags test is broken on MS-Windows 2015-05-22 8:32 ` Eli Zaretskii @ 2015-05-22 13:08 ` Francesco Potortì 2015-05-22 13:19 ` Eli Zaretskii 0 siblings, 1 reply; 51+ messages in thread From: Francesco Potortì @ 2015-05-22 13:08 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel > >> Date: Thu, 21 May 2015 16:28:21 -0700 >> From: Paul Eggert <eggert@cs.ucla.edu> >> CC: pot@gnu.org, emacs-devel@gnu.org >> >> I don't see why it's worth our trouble to substitute one incorrect >> solution for another, if it's OK that the solutions are approximate. > >It's OK if we don't want to include back in the test suite the files >with DOS EOLs that caused the trouble in the first place. If we don't >care about that subtle feature, I'm OK with the current code. After >all, it worked nicely until now. However, if in fact Emacs works the same on all platforms, maybe there is no reason for Etags to compensate for differences that do not exist (any more?). In fact, as far back as I can go with the etags.c sources, I see that code has always been there, so, unless I'm mistaken, it's very, very old. If what I write is correct, I'd go with removing the different treatment of crlf on dos and unix. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-22 13:08 ` Francesco Potortì @ 2015-05-22 13:19 ` Eli Zaretskii 2015-05-22 18:23 ` Paul Eggert 0 siblings, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-22 13:19 UTC (permalink / raw) To: Francesco Potortì; +Cc: eggert, emacs-devel > Date: Fri, 22 May 2015 15:08:28 +0200 > From: Francesco Potortì <pot@gnu.org> > Cc: emacs-devel@gnu.org, Paul Eggert <eggert@cs.ucla.edu> > > However, if in fact Emacs works the same on all platforms, maybe there > is no reason for Etags to compensate for differences that do not exist > (any more?). In fact, as far back as I can go with the etags.c sources, I > see that code has always been there, so unless I'm mistaken it's very > very old. Yes, the code is very old. > If what I write is correct, I'd go with removing the different > treatment of crlf on dos and unix. I agree, but it sounds like Paul doesn't. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-22 13:19 ` Eli Zaretskii @ 2015-05-22 18:23 ` Paul Eggert 2015-05-22 19:08 ` Eli Zaretskii 0 siblings, 1 reply; 51+ messages in thread From: Paul Eggert @ 2015-05-22 18:23 UTC (permalink / raw) To: Eli Zaretskii, Francesco Potortì; +Cc: emacs-devel On 05/22/2015 06:19 AM, Eli Zaretskii wrote: >> I'd go with removing the different >> treatment of crlf on dos and unix. > I agree, but it sounds like Paul doesn't. Yes and no. My understanding is that the code now works without glitches for files with a few stray CRLFs but has some glitches for files that consistently use CRLF, and that the change proposed in <http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00637.html> would introduce glitches in the former case while removing glitches in the latter. I'd rather not trade one bug for another. That is, if we're going to change this code at all, let's do it in a way that doesn't introduce glitches for the stray CRLF case. One possible way to do that is suggested in the last paragraph of <http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00657.html>. This approach does remove the different treatment of CRLF on MS-Windows and Unix (as Francesco suggested); but it does so in a different way, by using the Unix convention everywhere, and it suggests an approach that should let Emacs do the right thing on both Unix and MS-Windows, without any glitches on either platform. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-22 18:23 ` Paul Eggert @ 2015-05-22 19:08 ` Eli Zaretskii 2015-05-22 19:25 ` Andreas Schwab 0 siblings, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-22 19:08 UTC (permalink / raw) To: Paul Eggert; +Cc: pot, emacs-devel > Date: Fri, 22 May 2015 11:23:09 -0700 > From: Paul Eggert <eggert@cs.ucla.edu> > CC: emacs-devel@gnu.org > > One possible way to do that is suggested in the last paragraph of > <http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00657.html>. > This approach does remove the different treatment of CRLF on MS-Windows > and Unix (as Francesco suggested); but it does so in a different way, by > using the Unix convention everywhere, and it suggests an approach that > should let Emacs do the right thing on both Unix and MS-Windows, without > any glitches on either platform. These byte counts are not file byte counts (if they were, then what you suggest would have been TRT). They are buffer byte counts, i.e. etags needs to compute the byte counts that Emacs will see when the file is visited in an Emacs buffer. So each CRLF EOL needs to be counted as 1 byte, not 2. Therefore, the DOS_NT code does TRT in this case, for both Windows and Posix hosts, as amazing as it sounds. It is, of course possible to let etags count file bytes, and then have etags.el correct those to get buffer bytes instead. But that doesn't sound right to me: first "break" the perfectly correct code, and then "unbreak" the result in Emacs. To say nothing of the fact that visiting a TAGS table will be slower that way. However, the issue is minor, and I really don't want to waste any more time arguing about it. ^ permalink raw reply [flat|nested] 51+ messages in thread
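The difference between "file byte counts" and "buffer byte counts" in the message above comes down to how many bytes each CRLF line ending is charged. A minimal C sketch of both conventions follows. The function and parameter names are hypothetical and this is not the code in etags.c; a cost of 2 stands for counting the bytes as stored in the file, and a cost of 1 for counting them as seen after CRLF has been decoded to LF.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration, not the etags.c implementation.
   Return the 0-origin offset of the start of TARGET_LINE (1-origin)
   in TEXT, charging CRLF_COST bytes for every CRLF line ending:
     2 = file bytes, both CR and LF counted;
     1 = buffer bytes, CRLF already decoded to a single LF.
   A bare LF always costs 1, and a lone CR is an ordinary byte.  */
size_t
line_start_offset (const char *text, size_t len,
                   size_t target_line, int crlf_cost)
{
  assert (text != NULL);
  assert (crlf_cost == 1 || crlf_cost == 2);

  size_t offset = 0;
  size_t line = 1;
  size_t i = 0;
  while (i < len && line < target_line)
    {
      if (text[i] == '\r' && i + 1 < len && text[i + 1] == '\n')
        {
          offset += (size_t) crlf_cost;
          i += 2;
          line++;
        }
      else
        {
          if (text[i] == '\n')
            line++;
          offset++;
          i++;
        }
    }
  return offset;
}
```

For "ab\r\ncd\r\nef", line 3 starts at offset 8 under the file convention and at offset 6 under the buffer convention. The two differ by one byte per preceding CRLF, which is the drift discussed in this thread. For a file with bare LF endings the two conventions agree.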
* Re: etags test is broken on MS-Windows 2015-05-22 19:08 ` Eli Zaretskii @ 2015-05-22 19:25 ` Andreas Schwab 2015-05-22 19:38 ` Eli Zaretskii 0 siblings, 1 reply; 51+ messages in thread From: Andreas Schwab @ 2015-05-22 19:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pot, Paul Eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > These byte counts are not file byte counts (if they were, then what > you suggest would have been TRT). They are buffer byte counts, > i.e. etags needs to compute the byte counts that Emacs will see when > the file is visited in an Emacs buffer. So each CRLF EOL needs to be > counted as 1 byte, not 2. What do you do about non-ASCII characters? Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-22 19:25 ` Andreas Schwab @ 2015-05-22 19:38 ` Eli Zaretskii 2015-05-22 19:41 ` Andreas Schwab 2015-05-22 19:42 ` Eli Zaretskii 0 siblings, 2 replies; 51+ messages in thread From: Eli Zaretskii @ 2015-05-22 19:38 UTC (permalink / raw) To: Andreas Schwab; +Cc: pot, eggert, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: Paul Eggert <eggert@cs.ucla.edu>, pot@gnu.org, emacs-devel@gnu.org > Date: Fri, 22 May 2015 21:25:59 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > These byte counts are not file byte counts (if they were, then what > > you suggest would have been TRT). They are buffer byte counts, > > i.e. etags needs to compute the byte counts that Emacs will see when > > the file is visited in an Emacs buffer. So each CRLF EOL needs to be > > counted as 1 byte, not 2. > > What do you do about non-ASCII characters? Etags counts bytes, not characters, so it doesn't matter. Or maybe I misunderstand the question. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-22 19:38 ` Eli Zaretskii @ 2015-05-22 19:41 ` Andreas Schwab 2015-05-22 19:42 ` Eli Zaretskii 1 sibling, 0 replies; 51+ messages in thread From: Andreas Schwab @ 2015-05-22 19:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pot, eggert, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Cc: Paul Eggert <eggert@cs.ucla.edu>, pot@gnu.org, emacs-devel@gnu.org
>> Date: Fri, 22 May 2015 21:25:59 +0200
>>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>> > These byte counts are not file byte counts (if they were, then what
>> > you suggest would have been TRT). They are buffer byte counts,
                                                ^^^^^^^^^^^^^^^^^^
>> > i.e. etags needs to compute the byte counts that Emacs will see when
>> > the file is visited in an Emacs buffer. So each CRLF EOL needs to be
>> > counted as 1 byte, not 2.
>>
>> What do you do about non-ASCII characters?
>
> Etags counts bytes, not characters, so it doesn't matter.

See above. How does etags know how many bytes the characters will count
in an Emacs buffer?

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-22 19:38 ` Eli Zaretskii 2015-05-22 19:41 ` Andreas Schwab @ 2015-05-22 19:42 ` Eli Zaretskii 2015-05-22 19:50 ` Andreas Schwab 1 sibling, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-22 19:42 UTC (permalink / raw) To: schwab; +Cc: pot, eggert, emacs-devel > Date: Fri, 22 May 2015 22:38:15 +0300 > From: Eli Zaretskii <eliz@gnu.org> > Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > > > From: Andreas Schwab <schwab@linux-m68k.org> > > Cc: Paul Eggert <eggert@cs.ucla.edu>, pot@gnu.org, emacs-devel@gnu.org > > Date: Fri, 22 May 2015 21:25:59 +0200 > > > > Eli Zaretskii <eliz@gnu.org> writes: > > > > > These byte counts are not file byte counts (if they were, then what > > > you suggest would have been TRT). They are buffer byte counts, > > > i.e. etags needs to compute the byte counts that Emacs will see when > > > the file is visited in an Emacs buffer. So each CRLF EOL needs to be > > > counted as 1 byte, not 2. > > > > What do you do about non-ASCII characters? > > Etags counts bytes, not characters, so it doesn't matter. Or maybe I > misunderstand the question. Or maybe you mean the use case where a Latin-1 file is read into an Emacs buffer, and each non-ASCII character is expanded into a UTF-8 sequence. Indeed, that will make the byte counts inaccurate (and etags.el will have to compensate by searching around the specified place). One more reason not to change anything, I guess. ^ permalink raw reply [flat|nested] 51+ messages in thread
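The Latin-1 case mentioned above can be made concrete with a small C sketch. It assumes, as the message does, that each non-ASCII Latin-1 byte becomes a multibyte UTF-8 sequence once decoded; for Latin-1 that sequence is always 2 bytes long. The function name is hypothetical and nothing like it exists in etags.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration.  S holds LEN bytes of Latin-1 text.
   Return how many bytes the first FILE_OFFSET bytes of S occupy
   once every byte >= 0x80 is re-encoded as a 2-byte UTF-8
   sequence; ASCII bytes keep their 1-byte encoding.  */
size_t
latin1_prefix_utf8_bytes (const unsigned char *s, size_t len,
                          size_t file_offset)
{
  assert (s != NULL);
  assert (file_offset <= len);

  size_t n = 0;
  for (size_t i = 0; i < file_offset; i++)
    n += (s[i] < 0x80) ? 1 : 2;
  return n;
}
```

In the 6-byte Latin-1 text "caf\xe9 x", the letter x sits at file offset 5, but 6 bytes precede it after re-encoding. A byte count taken from the file therefore drifts by one for every non-ASCII character before the tag, and etags cannot know this without knowing the encoding of the file.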
* Re: etags test is broken on MS-Windows 2015-05-22 19:42 ` Eli Zaretskii @ 2015-05-22 19:50 ` Andreas Schwab 2015-05-22 20:05 ` Eli Zaretskii 0 siblings, 1 reply; 51+ messages in thread From: Andreas Schwab @ 2015-05-22 19:50 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pot, eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Or maybe you mean the use case where a Latin-1 file is read into an > Emacs buffer, and each non-ASCII character is expanded into a UTF-8 > sequence. Indeed, that will make the byte counts inaccurate (and > etags.el will have to compensate by searching around the specified > place). One more reason not to change anything, I guess. ??? It's exactly the counter argument. The indices in the tag file must be file offsets, everything else will lead to wrong offsets. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-22 19:50 ` Andreas Schwab @ 2015-05-22 20:05 ` Eli Zaretskii 2015-05-22 20:30 ` Andreas Schwab 0 siblings, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-22 20:05 UTC (permalink / raw) To: Andreas Schwab; +Cc: pot, eggert, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Fri, 22 May 2015 21:50:48 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Or maybe you mean the use case where a Latin-1 file is read into an > > Emacs buffer, and each non-ASCII character is expanded into a UTF-8 > > sequence. Indeed, that will make the byte counts inaccurate (and > > etags.el will have to compensate by searching around the specified > > place). One more reason not to change anything, I guess. > > ??? It's exactly the counter argument. The indices in the tag file must > be file offsets, everything else will lead to wrong offsets. If by "file offsets" you mean counting bytes in the file, then those will also be wrong after decoding non-ASCII characters, unless the file was encoded in UTF-8 to begin with, right? And if you mean counting characters in the file, then etags will be unable to do that, unless it grows the capability to detect the encoding of the file, or rely on the locale and assume that the file is encoded in locale's codeset. Right? Or am I again missing something? ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-22 20:05 ` Eli Zaretskii @ 2015-05-22 20:30 ` Andreas Schwab 2015-05-22 21:26 ` Paul Eggert 2015-05-23 6:39 ` Eli Zaretskii 0 siblings, 2 replies; 51+ messages in thread From: Andreas Schwab @ 2015-05-22 20:30 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pot, eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > If by "file offsets" you mean counting bytes in the file, Of course! > then those will also be wrong after decoding non-ASCII characters, > unless the file was encoded in UTF-8 to begin with, right? Yes, of course. Emacs will have to cope. > And if you mean counting characters in the file, This is impossible to do. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-22 20:30 ` Andreas Schwab @ 2015-05-22 21:26 ` Paul Eggert 2015-05-23 6:40 ` Eli Zaretskii 2015-05-23 6:39 ` Eli Zaretskii 1 sibling, 1 reply; 51+ messages in thread From: Paul Eggert @ 2015-05-22 21:26 UTC (permalink / raw) To: Andreas Schwab, Eli Zaretskii; +Cc: pot, emacs-devel On 05/22/2015 01:30 PM, Andreas Schwab wrote: >> then those will also be wrong after decoding non-ASCII characters, >> unless the file was encoded in UTF-8 to begin with, right? > Yes, of course. Emacs will have to cope. Andreas is right, as usual. TAGS should contain hard info about file contents, not guesswork about what Emacs's internal encoding might be, as the latter depends on user input. If the input file is UTF-8 and isn't munged by CRLF removal etc., file byte offsets should equal buffer byte offsets. If not, it's up to Emacs to map the hard info to its internal representation. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-22 21:26 ` Paul Eggert @ 2015-05-23 6:40 ` Eli Zaretskii 0 siblings, 0 replies; 51+ messages in thread From: Eli Zaretskii @ 2015-05-23 6:40 UTC (permalink / raw) To: Paul Eggert; +Cc: pot, schwab, emacs-devel > Date: Fri, 22 May 2015 14:26:27 -0700 > From: Paul Eggert <eggert@cs.ucla.edu> > CC: pot@gnu.org, emacs-devel@gnu.org > > On 05/22/2015 01:30 PM, Andreas Schwab wrote: > >> then those will also be wrong after decoding non-ASCII characters, > >> unless the file was encoded in UTF-8 to begin with, right? > > Yes, of course. Emacs will have to cope. > > > > Andreas is right, as usual. TAGS should contain hard info about file > contents, not guesswork about what Emacs's internal encoding might be, > as the latter depends on user input. If the input file is UTF-8 and > isn't munged by CRLF removal etc., file byte offsets should equal buffer > byte offsets. If not, it's up to Emacs to map the hard info to its > internal representation. I don't see how this is better than what we have already, but I don't mind such a change. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-22 20:30 ` Andreas Schwab 2015-05-22 21:26 ` Paul Eggert @ 2015-05-23 6:39 ` Eli Zaretskii 2015-05-23 8:02 ` Andreas Schwab 1 sibling, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-23 6:39 UTC (permalink / raw) To: Andreas Schwab; +Cc: pot, eggert, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Fri, 22 May 2015 22:30:57 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > If by "file offsets" you mean counting bytes in the file, > > Of course! > > > then those will also be wrong after decoding non-ASCII characters, > > unless the file was encoded in UTF-8 to begin with, right? > > Yes, of course. Emacs will have to cope. OK, then how do you go from "byte offsets will be wrong" and "Emacs will have to cope" to this: > ??? It's exactly the counter argument. The indices in the tag file must > be file offsets, everything else will lead to wrong offsets. This seems to say that such byte offsets are the only "right" ones. But we have just established that all of the byte offsets discussed here, including the ones currently produced by etags, are wrong in some sense. What makes this "wrong" be "the only right one"? ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 6:39 ` Eli Zaretskii @ 2015-05-23 8:02 ` Andreas Schwab 2015-05-23 8:27 ` Eli Zaretskii 0 siblings, 1 reply; 51+ messages in thread From: Andreas Schwab @ 2015-05-23 8:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pot, eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > This seems to say that such byte offsets are the only "right" ones. Right. > But we have just established that all of the byte offsets discussed > here, including the ones currently produced by etags, are wrong in ??? etags _does_ produce byte offsets, currently. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 8:02 ` Andreas Schwab @ 2015-05-23 8:27 ` Eli Zaretskii 2015-05-23 9:41 ` Andreas Schwab 0 siblings, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-23 8:27 UTC (permalink / raw) To: Andreas Schwab; +Cc: pot, eggert, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Sat, 23 May 2015 10:02:52 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > This seems to say that such byte offsets are the only "right" ones. > > Right. > > > But we have just established that all of the byte offsets discussed > > here, including the ones currently produced by etags, are wrong in > > ??? etags _does_ produce byte offsets, currently. The issue is their accuracy, not their existence or being byte offsets. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 8:27 ` Eli Zaretskii @ 2015-05-23 9:41 ` Andreas Schwab 2015-05-23 9:49 ` Eli Zaretskii 0 siblings, 1 reply; 51+ messages in thread From: Andreas Schwab @ 2015-05-23 9:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pot, eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > The issue is their accuracy, not their existence or being byte > offsets. Byte offsets are always accurate. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 9:41 ` Andreas Schwab @ 2015-05-23 9:49 ` Eli Zaretskii 2015-05-23 9:59 ` Andreas Schwab 0 siblings, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-23 9:49 UTC (permalink / raw) To: Andreas Schwab; +Cc: pot, eggert, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Sat, 23 May 2015 11:41:45 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > The issue is their accuracy, not their existence or being byte > > offsets. > > Byte offsets are always accurate. Not if they are supposed to be byte counts in an Emacs buffer. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 9:49 ` Eli Zaretskii @ 2015-05-23 9:59 ` Andreas Schwab 2015-05-23 10:20 ` Eli Zaretskii 0 siblings, 1 reply; 51+ messages in thread From: Andreas Schwab @ 2015-05-23 9:59 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pot, eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Not if they are supposed to be byte counts in an Emacs buffer. This concept doesn't make sense. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 9:59 ` Andreas Schwab @ 2015-05-23 10:20 ` Eli Zaretskii 2015-05-23 10:54 ` Andreas Schwab 0 siblings, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-23 10:20 UTC (permalink / raw) To: Andreas Schwab; +Cc: pot, eggert, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Sat, 23 May 2015 11:59:24 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Not if they are supposed to be byte counts in an Emacs buffer. > > This concept doesn't make sense. Nonetheless, that's what etags attempts to compute. Its byte counts are prepared for consumption by etags.el, not by humans. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 10:20 ` Eli Zaretskii @ 2015-05-23 10:54 ` Andreas Schwab 2015-05-23 11:31 ` Eli Zaretskii 0 siblings, 1 reply; 51+ messages in thread From: Andreas Schwab @ 2015-05-23 10:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pot, eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > Nonetheless, that's what etags attempts to compute. That's why it fails. Emacs has removed this API for good reason. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 10:54 ` Andreas Schwab @ 2015-05-23 11:31 ` Eli Zaretskii 2015-05-23 12:10 ` Andreas Schwab 0 siblings, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-23 11:31 UTC (permalink / raw) To: Andreas Schwab; +Cc: pot, eggert, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Sat, 23 May 2015 12:54:34 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Nonetheless, that's what etags attempts to compute. > > That's why it fails. It doesn't fail. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 11:31 ` Eli Zaretskii @ 2015-05-23 12:10 ` Andreas Schwab 2015-05-23 13:46 ` Eli Zaretskii 0 siblings, 1 reply; 51+ messages in thread From: Andreas Schwab @ 2015-05-23 12:10 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pot, eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> From: Andreas Schwab <schwab@linux-m68k.org> >> Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org >> Date: Sat, 23 May 2015 12:54:34 +0200 >> >> Eli Zaretskii <eliz@gnu.org> writes: >> >> > Nonetheless, that's what etags attempts to compute. >> >> That's why it fails. > > It doesn't fail. In which way does it not fail? Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 12:10 ` Andreas Schwab @ 2015-05-23 13:46 ` Eli Zaretskii 2015-05-23 17:27 ` Andreas Schwab 2015-05-23 19:01 ` Paul Eggert 0 siblings, 2 replies; 51+ messages in thread From: Eli Zaretskii @ 2015-05-23 13:46 UTC (permalink / raw) To: Andreas Schwab; +Cc: pot, eggert, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Sat, 23 May 2015 14:10:27 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> From: Andreas Schwab <schwab@linux-m68k.org> > >> Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > >> Date: Sat, 23 May 2015 12:54:34 +0200 > >> > >> Eli Zaretskii <eliz@gnu.org> writes: > >> > >> > Nonetheless, that's what etags attempts to compute. > >> > >> That's why it fails. > > > > It doesn't fail. > > In which way does it not fail? There was never a requirement for etags to compute precise byte positions for Emacs, mainly because source files get changed, and we don't want users to have to re-run etags upon every change. The function 'etags-goto-tag-location' will look around that position in a progressively-expanding window. You will also see there that the position in TAGS is interpreted as a character position, which already introduces inaccuracies. So etags is only required to produce approximately correct byte positions, and it does that well enough. ^ permalink raw reply [flat|nested] 51+ messages in thread
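The "progressively-expanding window" lookup described above can be sketched in C as follows. This is not a transcription of 'etags-goto-tag-location': the function name, the plain substring match, the growth factor of 3, and the caller-supplied starting window are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration of searching around a possibly stale
   position.  Look for PAT in TEXT (LEN bytes) inside the window
   [HINT - WINDOW, HINT + WINDOW], tripling WINDOW after each miss
   until the window covers the whole text.  Return the 0-origin
   offset of the first match inside the window that succeeds,
   or -1 if PAT does not occur.  */
long
find_near (const char *text, size_t len, const char *pat,
           size_t hint, size_t window)
{
  assert (text != NULL && pat != NULL);

  size_t plen = strlen (pat);
  if (plen == 0 || plen > len || window == 0)
    return -1;
  if (hint > len)
    hint = len;

  for (;;)
    {
      size_t lo = hint > window ? hint - window : 0;
      size_t hi = (len - hint > window) ? hint + window : len;

      for (size_t i = lo; i + plen <= hi; i++)
        if (memcmp (text + i, pat, plen) == 0)
          return (long) i;

      if (lo == 0 && hi == len)
        return -1;              /* whole text searched, no match */
      window *= 3;
    }
}
```

With the text "int foo;\nint bar;\nint baz;\n", the pattern "int baz", a stale hint of 0 and a starting window of 4, the first two windows miss and the third covers the whole text and finds the match at offset 18. A recorded position that is off by a few bytes, for example by one byte per CRLF, only costs extra search iterations and does not cause a wrong result as long as the pattern is found.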
* Re: etags test is broken on MS-Windows 2015-05-23 13:46 ` Eli Zaretskii @ 2015-05-23 17:27 ` Andreas Schwab 2015-05-23 17:37 ` Eli Zaretskii 2015-05-23 19:01 ` Paul Eggert 1 sibling, 1 reply; 51+ messages in thread From: Andreas Schwab @ 2015-05-23 17:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pot, eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > So etags is only required to produce approximately correct byte > positions, and it does that well enough. If you want approximate positions then a line-column pair would be better. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 17:27 ` Andreas Schwab @ 2015-05-23 17:37 ` Eli Zaretskii 2015-05-23 18:46 ` Andreas Schwab 0 siblings, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-23 17:37 UTC (permalink / raw) To: Andreas Schwab; +Cc: pot, eggert, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Sat, 23 May 2015 19:27:43 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > So etags is only required to produce approximately correct byte > > positions, and it does that well enough. > > If you want approximate positions then a line-column pair would be > better. How do you define a column count? Doesn't that require counting characters, rather than bytes? ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 17:37 ` Eli Zaretskii @ 2015-05-23 18:46 ` Andreas Schwab 2015-05-23 19:04 ` Eli Zaretskii 0 siblings, 1 reply; 51+ messages in thread From: Andreas Schwab @ 2015-05-23 18:46 UTC (permalink / raw) To: Eli Zaretskii; +Cc: pot, eggert, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
>> Date: Sat, 23 May 2015 19:27:43 +0200
>>
>> Eli Zaretskii <eliz@gnu.org> writes:
>>
>> > So etags is only required to produce approximately correct byte
>> > positions, and it does that well enough.
>>
>> If you want approximate positions then a line-column pair would be
               ^^^^^^^^^^^
>> better.
>
> How do you define a column count?

It doesn't matter.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 18:46 ` Andreas Schwab @ 2015-05-23 19:04 ` Eli Zaretskii 2015-05-25 12:33 ` Francesco Potortì 0 siblings, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-23 19:04 UTC (permalink / raw) To: Andreas Schwab; +Cc: pot, eggert, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Sat, 23 May 2015 20:46:04 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> From: Andreas Schwab <schwab@linux-m68k.org> > >> Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org > >> Date: Sat, 23 May 2015 19:27:43 +0200 > >> > >> Eli Zaretskii <eliz@gnu.org> writes: > >> > >> > So etags is only required to produce approximately correct byte > >> > positions, and it does that well enough. > >> > >> If you want approximate positions then a line-column pair would be > ^^^^^^^^^^^ > >> better. > > > > How do you define a column count? > > It doesn't matter. If you count bytes from the beginning of line, then yes, the accuracy should be better that way. However, I wonder whether the format of TAGS, which assumes file offsets, is a de-facto standard by now. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 19:04 ` Eli Zaretskii @ 2015-05-25 12:33 ` Francesco Potortì 0 siblings, 0 replies; 51+ messages in thread From: Francesco Potortì @ 2015-05-25 12:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, Andreas Schwab, emacs-devel Eli Zaretskii: >> From: Andreas Schwab <schwab@linux-m68k.org> >> Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org >> Date: Sat, 23 May 2015 20:46:04 +0200 >> >> Eli Zaretskii <eliz@gnu.org> writes: >> >> >> From: Andreas Schwab <schwab@linux-m68k.org> >> >> Cc: pot@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org >> >> Date: Sat, 23 May 2015 19:27:43 +0200 >> >> >> >> Eli Zaretskii <eliz@gnu.org> writes: >> >> >> >> > So etags is only required to produce approximately correct byte >> >> > positions, and it does that well enough. >> >> >> >> If you want approximate positions then a line-column pair would be >> ^^^^^^^^^^^ >> >> better. >> > >> > How do you define a column count? >> >> It doesn't matter. > >If you count bytes from the beginning of line, then yes, the accuracy >should be better that way. However, I wonder whether the format of >TAGS, which assumes file offsets, is a de-facto standard by now. In fact it is. The format has not changed in at least 20 years, etags has always been available for all platforms and it could be used outside of Emacs. For a format that old, changing it for no other reason than elegance is probably not a good idea. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 13:46 ` Eli Zaretskii 2015-05-23 17:27 ` Andreas Schwab @ 2015-05-23 19:01 ` Paul Eggert 2015-05-23 19:27 ` Eli Zaretskii 1 sibling, 1 reply; 51+ messages in thread From: Paul Eggert @ 2015-05-23 19:01 UTC (permalink / raw) To: Eli Zaretskii, Andreas Schwab; +Cc: pot, emacs-devel Eli Zaretskii wrote: > etags is only required to produce approximately correct byte > positions, and it does that well enough. OK, but in that case the method I proposed in <http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00657.html> will do a better job (on all platforms) than what we have now, right? So it'd be an improvement, even if it's not perfect. Although Andreas's suggestion of switching to a byte column count would also be an improvement over what we have now, it'd require a change to the tags file format whereas the method I proposed would leave the file format unchanged. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows 2015-05-23 19:01 ` Paul Eggert @ 2015-05-23 19:27 ` Eli Zaretskii 2015-05-25 16:44 ` Paul Eggert 0 siblings, 1 reply; 51+ messages in thread From: Eli Zaretskii @ 2015-05-23 19:27 UTC (permalink / raw) To: Paul Eggert; +Cc: pot, schwab, emacs-devel > Date: Sat, 23 May 2015 12:01:04 -0700 > From: Paul Eggert <eggert@cs.ucla.edu> > CC: pot@gnu.org, emacs-devel@gnu.org > > Eli Zaretskii wrote: > > etags is only required to produce approximately correct byte > > positions, and it does that well enough. > > OK, but in that case the method I proposed in > <http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00657.html> will do a > better job (on all platforms) than what we have now, right? So it'd be an > improvement, even if it's not perfect. I already said I didn't mind, although I don't think it's an improvement. ^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: etags test is broken on MS-Windows
  2015-05-23 19:27 ` Eli Zaretskii
@ 2015-05-25 16:44 ` Paul Eggert
  2015-05-25 19:33 ` Eli Zaretskii
From: Paul Eggert @ 2015-05-25 16:44 UTC
To: Eli Zaretskii; +Cc: pot, schwab, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 363 bytes --]

Eli Zaretskii wrote:
> I already said I didn't mind, although I don't think it's an
> improvement.

Isn't it an improvement in the sense that it makes TAGS files portable between
MS-Windows and other platforms? That is, with the attached patch one can create
a TAGS file on GNU/Linux and use it on MS-Windows and vice versa, and it works
the same either way.

[-- Attachment #2: 0001-Make-TAGS-files-more-portable-to-MS-Windows.patch --]
[-- Type: text/x-patch, Size: 2603 bytes --]

From 247e3bf4aa06b5b2dab9f70556292458751f0445 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 25 May 2015 09:40:45 -0700
Subject: [PATCH] Make TAGS files more portable to MS-Windows

* etc/NEWS: Document this.
* lib-src/etags.c (readline_internal) [DOS_NT]:
Don't treat CRs differently from GNUish hosts.
* lisp/progmodes/etags.el (etags-goto-tag-location):
Adjust STARTPOS to account for the skipped CRs in dos-style files.
---
 etc/NEWS                | 3 +++
 lib-src/etags.c         | 9 ---------
 lisp/progmodes/etags.el | 8 ++++++--
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index b922a27..9f861b2 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -992,6 +992,9 @@ of Windows starting with Windows 9X.
 +++
 ** Emacs running on MS-Windows now supports the daemon mode.
 
+** The byte counts in etags-generated TAGS files are now the same on
+MS-Windows as they are on other platforms.
+
 ** OS X 10.5 or older is no longer supported.
 
 ** OS X on PowerPC is no longer supported.
diff --git a/lib-src/etags.c b/lib-src/etags.c
index f124d29..8b7f53c 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -6075,16 +6075,7 @@ readline_internal (linebuffer *lbp, FILE *stream, char const *filename)
 	  if (p > buffer && p[-1] == '\r')
 	    {
 	      p -= 1;
-#ifdef DOS_NT
-	     /* Assume CRLF->LF translation will be performed by Emacs
-		when loading this file, so CRs won't appear in the buffer.
-		It would be cleaner to compensate within Emacs;
-		however, Emacs does not know how many CRs were deleted
-		before any given point in the file. */
-	      chars_deleted = 1;
-#else
 	      chars_deleted = 2;
-#endif
 	    }
 	  else
 	    {
diff --git a/lisp/progmodes/etags.el b/lisp/progmodes/etags.el
index 60ea456..d99db8b 100644
--- a/lisp/progmodes/etags.el
+++ b/lisp/progmodes/etags.el
@@ -1355,9 +1355,13 @@ hits the start of file."
 	    pat (concat (if (eq selective-display t)
 			    "\\(^\\|\^m\\)" "^")
 			(regexp-quote (car tag-info))))
-      ;; The character position in the tags table is 0-origin.
+      ;; The character position in the tags table is 0-origin and counts CRs.
       ;; Convert it to a 1-origin Emacs character position.
-      (if startpos (setq startpos (1+ startpos)))
+      (when startpos
+	(setq startpos (1+ startpos))
+	(when (and line
+		   (eq 1 (coding-system-eol-type buffer-file-coding-system)))
+	  (setq startpos (- startpos (1- line)))))
       ;; If no char pos was given, try the given line number.
       (or startpos
 	  (if line
-- 
2.1.0

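As a rough illustration of the arithmetic in the etags.el hunk above, the following C sketch converts a TAGS byte offset that counts CRs into a 1-origin position in a buffer whose CRLF line endings were decoded to LF. The names `line_start_offset` and `tags_offset_to_buffer_pos` and the `dos_eol` flag are hypothetical and are not part of etags.c or etags.el; the sketch also assumes that every line before the tag ends in CRLF when `dos_eol` is set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset (0-origin, every CR and LF counted)
   at which the 1-origin line LINE starts in BUF.  Returns -1 if BUF
   contains fewer than LINE - 1 newlines.  */
long
line_start_offset (const char *buf, size_t len, long line)
{
  long current = 1;

  assert (line >= 1);
  if (line == 1)
    return 0;
  for (size_t i = 0; i < len; i++)
    if (buf[i] == '\n' && ++current == line)
      return (long) (i + 1);
  return -1;
}

/* Hypothetical helper mirroring the adjustment in the etags.el hunk:
   turn a 0-origin byte offset that counts CRs into a 1-origin position
   in a buffer where each CRLF was decoded to a single LF.  DOS_EOL is
   nonzero when the file was decoded with DOS line endings.  */
long
tags_offset_to_buffer_pos (long offset, long line, int dos_eol)
{
  long pos;

  assert (offset >= 0 && line >= 1);
  pos = offset + 1;        /* 0-origin -> 1-origin */
  if (dos_eol)
    pos -= line - 1;       /* one CR was dropped per preceding line */
  return pos;
}
```

For the three-line file `int x;`, `char y;`, `int z;` stored with CRLF endings, line 3 starts at byte offset 17; with `dos_eol` set the sketch yields position 16, which is the same position an LF-only copy of the file gives from offset 15.
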
* Re: etags test is broken on MS-Windows
  2015-05-25 16:44 ` Paul Eggert
@ 2015-05-25 19:33 ` Eli Zaretskii
  2015-05-25 20:29 ` Paul Eggert
From: Eli Zaretskii @ 2015-05-25 19:33 UTC
To: Paul Eggert; +Cc: pot, schwab, emacs-devel

> Date: Mon, 25 May 2015 09:44:00 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: schwab@linux-m68k.org, pot@gnu.org, emacs-devel@gnu.org
>
> Eli Zaretskii wrote:
> > I already said I didn't mind, although I don't think it's an
> > improvement.
>
> Isn't it an improvement in the sense that it makes TAGS files portable between
> MS-Windows and other platforms?

Yes, it is, which is why I said I didn't mind.

> That is, with the attached patch one can create a TAGS file on
> GNU/Linux and use it on MS-Windows and vice versa, and it works the
> same either way.

We'll get the same result -- platform-independence of TAGS files -- if
we use the Windows variant of the code, with the added advantage that
the Lisp part will not be needed, and reading the data from TAGS will
be slightly faster.

But again, I don't mind it either way.

* Re: etags test is broken on MS-Windows
  2015-05-25 19:33 ` Eli Zaretskii
@ 2015-05-25 20:29 ` Paul Eggert
From: Paul Eggert @ 2015-05-25 20:29 UTC
To: Eli Zaretskii; +Cc: pot, schwab, emacs-devel

Eli Zaretskii wrote:
> I said I didn't mind.

OK, thanks, I installed the patch.

* Re: etags test is broken on MS-Windows
  2015-05-21 19:03 ` Paul Eggert
  2015-05-21 19:54 ` Eli Zaretskii
@ 2015-05-22 12:40 ` Francesco Potortì
From: Francesco Potortì @ 2015-05-22 12:40 UTC
To: Paul Eggert; +Cc: Eli Zaretskii, emacs-devel

>On 05/21/2015 09:55 AM, Eli Zaretskii wrote:
>> IOW, the "CRLF->LF translation" that the comment mentions is done on
>> all platforms. Or am I missing something?
>
>I was thinking about the case where a source file has mostly lines with
>LF but a few lines end in CRLF. E.g., the attached file has a CR at the
>end of the second line. In that case, Emacs doesn't strip the trailing
>CRs on GNU/Linux. Wouldn't the byte counts get messed up then?
>
>Come to think of it, one of the etags test cases did that before I
>removed the CR (and perhaps that was part of the test...).
>
>int x;
>char y;
>int z;

It was definitely part of the test :)

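The mixed-EOL case discussed above can be made concrete with a small C sketch that counts how many lines end in CRLF and how many end in a bare LF. It is only a simplified classification for illustration, not the EOL detection Emacs actually performs; the `eol_style` enum and `classify_eol` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a file's line endings.  */
enum eol_style { EOL_NONE, EOL_LF, EOL_CRLF, EOL_MIXED };

/* Count lines ending in CRLF versus bare LF and classify BUF.
   A file with both kinds is EOL_MIXED, which is the case where a
   per-line CR adjustment cannot be applied uniformly.  */
enum eol_style
classify_eol (const char *buf, size_t len)
{
  size_t lf = 0, crlf = 0;

  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    if (buf[i] == '\n')
      {
        if (i > 0 && buf[i - 1] == '\r')
          crlf++;
        else
          lf++;
      }

  if (lf == 0 && crlf == 0)
    return EOL_NONE;
  if (crlf == 0)
    return EOL_LF;
  if (lf == 0)
    return EOL_CRLF;
  return EOL_MIXED;
}
```

The quoted three-line file, with a CR only at the end of its second line, comes out as `EOL_MIXED` under this sketch.
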