all messages for Emacs-related lists mirrored at yhetil.org
* etags confused with uppercase filenames (on Windows)
@ 2002-03-30  1:56 Stavros Macrakis
  2002-03-30  8:37 ` Eli Zaretskii
  0 siblings, 1 reply; 14+ messages in thread
From: Stavros Macrakis @ 2002-03-30  1:56 UTC (permalink / raw)


etags copyright 98 distributed with Emacs 20.7.1 (i386-*-nt5.0.2195)
running on Windows 2000

Here's a funny little bug....

Take the file below, call it foo.el.

Run the following command line:

 > etags foo.el FOO.EL

You get the tags file shown below, which is correct for foo.el and incorrect
for FOO.EL.  Same error if the command includes only FOO.EL.

Note that on Windows, case is ignored in dereferencing filenames, so these
two filenames refer to the same file, and in fact *.el finds FOO.EL.  If the
file name is all-caps in the directory, you get the same problem.  .EL works
fine everywhere else in Emacs as far as I can tell.

This happened to me because in some transfer from one filesystem to another,
some piece of software decided to canonicalize filenames as all-caps....
They worked fine, except for this glitch in etags.

       -s

-----------foo.el---------
(defun zoo2 (n) (delete-region 3 4))

(defun sdfsdf ()
    ;; comment 1
    ;; comment 2
    (let ((sdf 0))
      ;; comment 3
      ;; comment 4
      ))

----------TAGS---------
^L
foo.el,49
(defun zoo2 \x7fzoo2\x011,0
(defun sdfsdf \x7fsdfsdf\x013,38
^L
FOO.EL,45
    ;; comment \x7f4,55
      ;; comment \x7f7,108


* Re: etags confused with uppercase filenames (on Windows)
  2002-03-30  1:56 etags confused with uppercase filenames (on Windows) Stavros Macrakis
@ 2002-03-30  8:37 ` Eli Zaretskii
  0 siblings, 0 replies; 14+ messages in thread
From: Eli Zaretskii @ 2002-03-30  8:37 UTC (permalink / raw)
  Cc: bug-gnu-emacs

> Reply-To: <macrakis@alum.mit.edu>
> Date: Fri, 29 Mar 2002 20:56:20 -0500
> 
> Run the following command line:
> 
>  > etags foo.el FOO.EL
> 
> You get the tags file shown below, which is correct for foo.el and incorrect
> for FOO.EL.  Same error if the command includes only FOO.EL.
> 
> Note that on Windows, case is ignored in dereferencing filenames, so these
> two filenames refer to the same file, and in fact *.el finds FOO.EL.

This particular problem can be solved by adding "EL" to the
Lisp_suffixes array in etags.c.  However, I don't think etags can be
made case-insensitive to file names in general, since foo.C needs to
be processed as C++ code, while foo.c should be processed as C code.
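
The point about case can be illustrated with a minimal sketch of an exact,
case-sensitive suffix lookup.  The table contents and the names
suffix_table and language_for_suffix are assumptions made for illustration;
they are not the actual data structures in etags.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical suffix table.  The real etags.c keeps per-language suffix
   arrays such as Lisp_suffixes; their exact contents are not shown here. */
struct suffix_entry { const char *suffix; const char *language; };

const struct suffix_entry suffix_table[] = {
  { "el", "lisp" },
  { "c",  "c"    },
  { "C",  "c++"  },
};

/* Return the language for SUFFIX using an exact, case-sensitive match,
   or NULL if the suffix is not in the table. */
const char *
language_for_suffix (const char *suffix)
{
  assert (suffix != NULL);
  for (size_t i = 0; i < sizeof suffix_table / sizeof suffix_table[0]; i++)
    if (strcmp (suffix, suffix_table[i].suffix) == 0)
      return suffix_table[i].language;
  return NULL;
}
```

With such a table, "EL" is not found at all, while "c" and "C" map to
different languages, so folding case everywhere would lose that distinction.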

So I'd suggest keeping your file names in their proper letter-case.
Windows is indeed case-insensitive, but it does preserve the letter-case
in file names.

> This happened to me because in some transfer from one filesystem to another,
> some piece of software decided to canonicalize filenames as all-caps....

If you have a program that generates UPCASED file names, try to
replace it with some other program, which does not.


* Re: etags confused with uppercase filenames (on Windows)
@ 2002-04-02 15:19 Francesco Potorti`
  2002-04-02 15:46 ` Stavros Macrakis
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Francesco Potorti` @ 2002-04-02 15:19 UTC (permalink / raw)
  Cc: Emacs developers, Stavros Macrakis

   This particular problem can be solved by adding "EL" to the
   Lisp_suffixes array in etags.c.  

Indeed.

By the way, Stavros can work around his problem by putting the option
--language=lisp before the file names on the command line.

   However, I don't think etags can be made case-insensitive to file
   names in general, since foo.C needs to be processed as C++ code,
   while foo.c should be processed as C code.

That would not be a big problem, because etags can distinguish C and C++
by looking at the file contents.  But in general, I agree that making
etags case insensitive on file names means losing information, and
is not the right thing to do.

   > This happened to me because in some transfer from one filesystem to
   > another, some piece of software decided to canonicalize filenames as
   > all-caps....

I have an idea, and would like to hear if anyone has anything against
it.

To determine a file's language, currently etags does the following:

1) if the user specified a language, use that
2) else, guess it from the file name
3) else, look for #!
4) ... (other heuristics)

I think that I could add:

2bis) else, if the file name is all upcase, upcase the builtin file name
      suffixes and retry
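
A rough sketch of what step 2bis could look like, assuming two helper
functions whose names (name_is_all_upcase, matches_upcased_suffix) are
hypothetical and do not come from etags.c:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if NAME contains at least one letter and no lowercase
   letters, e.g. "FOO.EL" but not "Foo.EL". */
bool
name_is_all_upcase (const char *name)
{
  assert (name != NULL);
  bool saw_letter = false;
  for (const unsigned char *p = (const unsigned char *) name; *p != '\0'; p++)
    {
      if (islower (*p))
        return false;
      if (isupper (*p))
        saw_letter = true;
    }
  return saw_letter;
}

/* Return true if SUFFIX equals the upcased form of the builtin suffix
   BUILTIN, e.g. "EL" against "el". */
bool
matches_upcased_suffix (const char *suffix, const char *builtin)
{
  assert (suffix != NULL && builtin != NULL);
  size_t i = 0;
  for (; builtin[i] != '\0'; i++)
    if (suffix[i] != (char) toupper ((unsigned char) builtin[i]))
      return false;
  return suffix[i] == '\0';
}
```

The retry would then run only when name_is_all_upcase holds for the file
name, comparing its suffix against each builtin suffix with
matches_upcased_suffix.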


* RE: etags confused with uppercase filenames (on Windows)
  2002-04-02 15:19 Francesco Potorti`
@ 2002-04-02 15:46 ` Stavros Macrakis
  2002-04-02 16:02 ` Stefan Monnier
  2002-04-03  7:54 ` Eli Zaretskii
  2 siblings, 0 replies; 14+ messages in thread
From: Stavros Macrakis @ 2002-04-02 15:46 UTC (permalink / raw)
  Cc: Emacs developers

Thanks, Francesco and Eli, for your thoughts on etags *.el.

I wouldn't want you to spend too much time thinking about this, though!
Mainly, it was a surprise to me.  I had transferred some files a long time
ago, and they happened to have been upcased somewhere, and I never noticed
that they weren't being indexed by etags.

One of the confusing things in all this is that the wildcard "*.el" in
Windows matches "xxx.EL", too.  Not much we can do about that.

One simple solution might be to give a warning message in etags for
unrecognized extensions, and perhaps files with zero tag entries.  If etags
had told me that .EL was unrecognized, I would have just renamed the files.

By the way, what's the purpose of generating a tag entry for every line of
an unrecognized file type?

        -s


* Re: etags confused with uppercase filenames (on Windows)
  2002-04-02 15:19 Francesco Potorti`
  2002-04-02 15:46 ` Stavros Macrakis
@ 2002-04-02 16:02 ` Stefan Monnier
  2002-04-03  7:57   ` Eli Zaretskii
  2002-04-03  7:54 ` Eli Zaretskii
  2 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2002-04-02 16:02 UTC (permalink / raw)
  Cc: Eli Zaretskii, Emacs developers, Stavros Macrakis

> To determine a file's language, currently etags does the following:
> 
> 1) if the user specified a language, use that
> 2) else, guess it from the file name
> 3) else, look for #!
> 4) ... (other heuristics)
> 
> I think that I could add:
> 
> 2bis) else, if the file name is all upcase, upcase the builtin file name
>       suffixes and retry

How about

	4) try to guess from the filename but ignore case this time.

I don't think that the "all upcase" condition is necessary.
But I do think that #! should take precedence since I'd rather
not change the existing behavior on POSIX systems.


	Stefan


* Re: etags confused with uppercase filenames (on Windows)
  2002-04-02 15:19 Francesco Potorti`
  2002-04-02 15:46 ` Stavros Macrakis
  2002-04-02 16:02 ` Stefan Monnier
@ 2002-04-03  7:54 ` Eli Zaretskii
  2002-04-03 21:40   ` Stavros Macrakis
  2 siblings, 1 reply; 14+ messages in thread
From: Eli Zaretskii @ 2002-04-03  7:54 UTC (permalink / raw)
  Cc: emacs-devel, stavros.macrakis

> From: Francesco Potorti` <pot@gnu.org>
> Date: Tue, 02 Apr 2002 17:19:09 +0200
> 
> I think that I could add:
> 
> 2bis) else, if the file name is all upcase, upcase the builtin file name
>       suffixes and retry

It would probably be better to apply somewhat finer checks to the
file name.  For example, I suspect that only file names that fit into
the DOS 8+3 limitations are upcased like that.  Stavros, can you
confirm that in your case?
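
One possible form of such a check is sketched below.  It assumes the name
has already been stripped of any directory part, and it tests only the
8+3 lengths, not which characters DOS permits; the function name
fits_dos_8_3 is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if NAME (a base name without directory part) has 1 to 8
   characters before its single dot and at most 3 after it.  A name
   without a dot must be 1 to 8 characters long. */
bool
fits_dos_8_3 (const char *name)
{
  assert (name != NULL);
  const char *dot = strchr (name, '.');
  if (dot == NULL)
    {
      size_t len = strlen (name);
      return len >= 1 && len <= 8;
    }
  if (strchr (dot + 1, '.') != NULL)
    return false;               /* more than one dot */
  size_t base_len = (size_t) (dot - name);
  size_t ext_len = strlen (dot + 1);
  return base_len >= 1 && base_len <= 8 && ext_len <= 3;
}
```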


* Re: etags confused with uppercase filenames (on Windows)
  2002-04-02 16:02 ` Stefan Monnier
@ 2002-04-03  7:57   ` Eli Zaretskii
  2002-04-03  8:43     ` Francesco Potorti`
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Eli Zaretskii @ 2002-04-03  7:57 UTC (permalink / raw)
  Cc: pot, emacs-devel, stavros.macrakis

> From: "Stefan Monnier" <monnier+gnu/emacs@RUM.cs.yale.edu>
> Date: Tue, 02 Apr 2002 11:02:49 -0500
> 
> But I do think that #! should take precedence since I'd rather
> not change the existing behavior on POSIX systems.

Does it really make sense to have etags behavior be different on
different platforms?  Especially given the fact that some file you are
working on can well come from a Windows system that exports its
filesystem?

More specifically, what bad effects could be caused by treating
FOO.EL on Unix and GNU systems as an Emacs Lisp file for the purposes
of etags?


* Re: etags confused with uppercase filenames (on Windows)
  2002-04-03  7:57   ` Eli Zaretskii
@ 2002-04-03  8:43     ` Francesco Potorti`
  2002-04-03 15:42       ` Stefan Monnier
  2002-04-03 15:22     ` Stefan Monnier
  2002-04-04 17:36     ` Richard Stallman
  2 siblings, 1 reply; 14+ messages in thread
From: Francesco Potorti` @ 2002-04-03  8:43 UTC (permalink / raw)
  Cc: stavros.macrakis, emacs-devel, monnier+gnu/emacs

   Does it really make sense to have etags behavior be different on
   different platforms?  Especially given the fact that some file you are
   working on can well come from a Windows system that exports its
   filesystem?

I partly agree with Stefan's observations.  I am inclined towards
implementing the following behaviour for etags when determining
languages.  Each line is considered only if the previous ones did not
yield any match.

1) use the explicitly given language, if any
2) guess it from the file name (usually from the suffix)
3) guess it from the #! interpreter
4) if file name is all upcased, guess it from the file name without
   regard to case (usually from the suffix)
5) try Fortran and give a warning if succeeded
6) try C/C++ and give a warning (always succeeds)

The differences from the current behaviour are that 4) does not
currently exist, and 5) and 6) do not currently elicit a warning.
   
Note that 3) is currently used for Perl only, even if other languages
using #! may be added in the future.

Note also that this behaviour is independent of the platform.
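
The ordering described in this message can be sketched roughly as follows.
This is only an illustration under stated assumptions: the suffix table,
the function names, and the way #! is inspected are all hypothetical, and
step 5 (trying Fortran) is left out because it needs a real parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct suffix_entry { const char *suffix; const char *language; };

/* Hypothetical table, for illustration only. */
static const struct suffix_entry suffix_table[] = {
  { "el", "lisp" }, { "c", "c" }, { "C", "c++" }, { "pl", "perl" },
};
#define N_SUFFIXES (sizeof suffix_table / sizeof suffix_table[0])

/* Compare A and B, ignoring letter-case if FOLD is true. */
static bool
suffix_equal (const char *a, const char *b, bool fold)
{
  for (; *a != '\0' && *b != '\0'; a++, b++)
    {
      int ca = (unsigned char) *a, cb = (unsigned char) *b;
      if (fold ? tolower (ca) != tolower (cb) : ca != cb)
        return false;
    }
  return *a == *b;              /* true only if both strings ended */
}

/* Return the language of the first table entry matching SUFFIX. */
static const char *
lookup_suffix (const char *suffix, bool fold)
{
  for (size_t i = 0; i < N_SUFFIXES; i++)
    if (suffix_equal (suffix, suffix_table[i].suffix, fold))
      return suffix_table[i].language;
  return NULL;
}

/* Return true if NAME has at least one letter and no lowercase letters. */
static bool
all_upcased (const char *name)
{
  bool saw_letter = false;
  for (; *name != '\0'; name++)
    {
      if (islower ((unsigned char) *name))
        return false;
      if (isupper ((unsigned char) *name))
        saw_letter = true;
    }
  return saw_letter;
}

/* Guess a language; *WARN is set when the final fallback was used. */
const char *
guess_language (const char *explicit_lang, const char *file_name,
                const char *first_line, bool *warn)
{
  assert (file_name != NULL && first_line != NULL && warn != NULL);
  *warn = false;
  const char *dot = strrchr (file_name, '.');
  const char *suffix = dot != NULL ? dot + 1 : "";
  const char *lang;

  if (explicit_lang != NULL)                            /* step 1 */
    return explicit_lang;
  if ((lang = lookup_suffix (suffix, false)) != NULL)   /* step 2 */
    return lang;
  if (strncmp (first_line, "#!", 2) == 0
      && strstr (first_line, "perl") != NULL)           /* step 3 */
    return "perl";
  if (all_upcased (file_name)
      && (lang = lookup_suffix (suffix, true)) != NULL) /* step 4 */
    return lang;
  /* Step 5 (try Fortran) is omitted from this sketch. */
  *warn = true;                                         /* step 6 */
  return "c";
}
```

Because the exact match in step 2 runs first, FOO.C is still taken as C++
in this sketch, and only names that failed every earlier step reach the
case-folding lookup.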


* Re: etags confused with uppercase filenames (on Windows)
  2002-04-03  7:57   ` Eli Zaretskii
  2002-04-03  8:43     ` Francesco Potorti`
@ 2002-04-03 15:22     ` Stefan Monnier
  2002-04-04 17:36     ` Richard Stallman
  2 siblings, 0 replies; 14+ messages in thread
From: Stefan Monnier @ 2002-04-03 15:22 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, pot, emacs-devel, stavros.macrakis

> > From: "Stefan Monnier" <monnier+gnu/emacs@RUM.cs.yale.edu>
> > Date: Tue, 02 Apr 2002 11:02:49 -0500
> > 
> > But I do think that #! should take precedence since I'd rather
> > not change the existing behavior on POSIX systems.
> 
> Does it really make sense to have etags behavior be different on
> different platforms?  Especially given the fact that some file you are
> working on can well come from a Windows system that exports its
> filesystem?

I didn't say the behavior should be different.  Just that the current
behavior works fine on the free systems that we care about, so we
should make this change lower down the precedence rather than
higher up, unless we think it's also a desirable (rather than just
harmless) change for the case-sensitive systems, of course.


	Stefan


* Re: etags confused with uppercase filenames (on Windows)
  2002-04-03  8:43     ` Francesco Potorti`
@ 2002-04-03 15:42       ` Stefan Monnier
  2002-04-03 21:23         ` Francesco Potorti`
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2002-04-03 15:42 UTC (permalink / raw)
  Cc: Eli Zaretskii, stavros.macrakis, emacs-devel, monnier+gnu/emacs

> 4) if file name is all upcased, guess it from the file name without
>    regard to case (usually from the suffix)

Why bother to check that it's all upcased ?
What harm can it do if we don't do this check ?


	Stefan


* Re: etags confused with uppercase filenames (on Windows)
  2002-04-03 15:42       ` Stefan Monnier
@ 2002-04-03 21:23         ` Francesco Potorti`
  0 siblings, 0 replies; 14+ messages in thread
From: Francesco Potorti` @ 2002-04-03 21:23 UTC (permalink / raw)
  Cc: Eli Zaretskii, stavros.macrakis, emacs-devel, monnier+gnu/emacs

Stefan Monnier <monnier+gnu/emacs@RUM.cs.yale.edu> writes:
   > 4) if file name is all upcased, guess it from the file name without
   >    regard to case (usually from the suffix)
   
   Why bother to check that it's all upcased ?

In fact, you are right.  It did make sense the way I had initially
proposed, but once I agreed that this point should be moved after the
#! check, then just making a case-insensitive match is the right thing
to do.


* RE: etags confused with uppercase filenames (on Windows)
  2002-04-03  7:54 ` Eli Zaretskii
@ 2002-04-03 21:40   ` Stavros Macrakis
  0 siblings, 0 replies; 14+ messages in thread
From: Stavros Macrakis @ 2002-04-03 21:40 UTC (permalink / raw)
  Cc: emacs-devel

> > 2bis) else, if the file name is all upcase, upcase the builtin file name
> >       suffixes and retry
>
> It would probably be better to apply somewhat finer checks to the
> file name.  For example, I suspect that only file names that fit into
> the DOS 8+3 limitations are upcased like that.  Stavros, can you
> confirm that in your case?

That's probably true, although I've renamed them now.  The main reason I
suggested that a warning would be appropriate is that it took me a few
*years* since I copied over these files to notice that they weren't being
indexed by etags!

On Windows, Emacs seems to accept *.EL as equivalent to *.el, so etags
should, too (although a warning might be a good idea).  In glancing at the
code, I'm not 100% sure that Emacs is completely consistent about this
(depending on the setting of case-fold-search) though....  I have not taken
the time to check thoroughly.

But without investigating this further (I don't think it deserves it!), I
agree with Francesco's suggestions, except for (4), which I would make:

4) if suffix is all upcased, try downcasing, and give a warning if that
succeeds

This is a good idea because .EL is not the standard suffix, and may not work
in all cases (e.g. when the files get moved to another system).  On systems
which only support uppercase filenames (does etags support any such?), of
course there should be no warning.
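
As a rough sketch of this variant of (4): the function name, the list of
known suffixes, and the wording of the warning are all hypothetical, and
the sketch assumes it is called only after the exact suffix match failed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical list of known lowercase suffixes, for illustration only. */
static const char *const known_suffixes[] = { "el", "c", "pl" };

/* If SUFFIX contains no lowercase letters and its downcased form is a
   known suffix, print a warning to stderr and return the known suffix;
   otherwise return NULL. */
const char *
downcased_suffix_match (const char *file_name, const char *suffix)
{
  assert (file_name != NULL && suffix != NULL);
  char lowered[16];
  size_t len = strlen (suffix);
  if (len == 0 || len >= sizeof lowered)
    return NULL;
  for (size_t i = 0; i < len; i++)
    {
      unsigned char ch = (unsigned char) suffix[i];
      if (islower (ch))
        return NULL;            /* not an all-upcased suffix */
      lowered[i] = (char) tolower (ch);
    }
  lowered[len] = '\0';
  for (size_t i = 0; i < sizeof known_suffixes / sizeof known_suffixes[0]; i++)
    if (strcmp (lowered, known_suffixes[i]) == 0)
      {
        fprintf (stderr, "%s: treating suffix .%s as .%s\n",
                 file_name, suffix, known_suffixes[i]);
        return known_suffixes[i];
      }
  return NULL;
}
```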

      -s


* RE: etags confused with uppercase filenames (on Windows)
@ 2002-04-03 22:01 Stavros Macrakis
  0 siblings, 0 replies; 14+ messages in thread
From: Stavros Macrakis @ 2002-04-03 22:01 UTC (permalink / raw)
  Cc: emacs-devel

Just to make things weirder, if you have a file called xx.EL, then currently

  etags xx.el

treats it as Emacs-Lisp, while

  etags xx.EL xx*.el xx*.EL

treat it as C.  Conversely, if you have a file called yy.el, then currently
etags treats yy.EL as C, and the other cases as Emacs-lisp.

Apparently etags is not checking the real name of the file, just what it was
named on the command line.

      -s


* Re: etags confused with uppercase filenames (on Windows)
  2002-04-03  7:57   ` Eli Zaretskii
  2002-04-03  8:43     ` Francesco Potorti`
  2002-04-03 15:22     ` Stefan Monnier
@ 2002-04-04 17:36     ` Richard Stallman
  2 siblings, 0 replies; 14+ messages in thread
From: Richard Stallman @ 2002-04-04 17:36 UTC (permalink / raw)
  Cc: monnier+gnu/emacs, pot, emacs-devel, stavros.macrakis

    More specifically, what bad effects could be caused by treating
    FOO.EL on Unix and GNU systems as an Emacs Lisp file for the purposes
    of etags?

In the case of .EL, probably no harm.
There are some situations where case matters, though:
foo.c is a C file, whereas FOO.C is a C++ file.


