From: Dmitry Gutov <dgutov@yandex.ru>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 22241@debbugs.gnu.org
Subject: bug#22241: 25.0.50; etags Ruby parser problems
Date: Sat, 23 Jan 2016 21:23:57 +0300
Message-ID: <56A3C53D.1050408@yandex.ru>
In-Reply-To: <83si1o45g1.fsf@gnu.org>
On 01/23/2016 07:38 PM, Eli Zaretskii wrote:
> I don't speak Ruby. So please give a more detailed spec for the
> features you want added. I wrote some questions below, but I'm quite
> sure there are more questions I should ask, but don't know about. So
> please provide as complete specification for each feature as you
> possibly can, TIA.
There's no actual up-to-date language spec, and when in doubt, I fire up
the REPL and try things out (and forget many of the results afterwards).
So there's no "detailed spec" in my head. Let me just try my best
answering your questions, for now.
>> - Constants are not indexed.
>
> What is the full syntax of a "constant"? Is it just
>
> IDENTIFIER "=" INTEGER-NUMBER
Pretty much. IDENTIFIER should be ALL_CAPS, or CamelCase, with
underscores allowed.
INTEGER-NUMBER should be just EXPRESSION, because it can be any
expression, possibly a multiline one.
CamelCase constants usually are assigned some "anonymous class" value,
like in the following example:
SpecialError = Class.new(StandardError)
(Which is a metaprogramming-y way to define the class SpecialError).
But you probably shouldn't worry about ALL_CAPS vs CamelCase distinction
here, and just treat them the same.
> ? Is whitespace significant? What about newlines?
No spaces around "=" is fine. Spaces can also be replaced by tabs. A
newline before "=" is not allowed.
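A rough sketch of that rule as a single regular expression, in case it helps. The names CONSTANT_ASSIGNMENT and constant_name are made up for illustration, the pattern only looks at the line where the assignment starts (so a multiline expression is recognized by its first line), and it does not try to handle namespaced assignments such as A::B = 1:

```ruby
# Hypothetical pattern: an identifier starting with an uppercase letter,
# optional spaces or tabs, then a single "=" (not "==", "=~" or "=>").
CONSTANT_ASSIGNMENT = /\A[ \t]*([A-Z][A-Za-z0-9_]*)[ \t]*=(?![=~>])/

# Returns the constant's name, or nil when the line is not an assignment.
def constant_name(line)
  match = CONSTANT_ASSIGNMENT.match(line)
  match && match[1]
end

constant_name("MAX_SIZE = 10")                          # "MAX_SIZE"
constant_name("SpecialError=Class.new(StandardError)")  # "SpecialError"
constant_name("\tLIMIT\t= compute(")                    # "LIMIT"
constant_name("foo = 1")                                # nil (a local variable)
constant_name("FOO == 1")                               # nil (a comparison)
```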
>> - Class methods (def self.foo) are given the wrong name ("self."
>> shouldn't be included).
>
> Is it enough to remove a single "self.", case-sensitive, at the
> beginning of an identifier? Can there be more than one, like
> "self.self.SOMETHING"?
Only one "self." is allowed. When you remove it, you should record that
SOMETHING is a method defined on the current class (or module). In Java
terms, say, it would be like a "static" method.
The upshot is, it can be called on the class itself, but not on its
instance:
irb(main):001:0> class C
irb(main):002:1> def self.foo
irb(main):003:2> 3
irb(main):004:2> end
irb(main):005:1> end
=> nil
irb(main):006:0> C.foo
=> 3
irb(main):007:0> C.new.foo
NoMethodError: undefined method `foo' for #<C:0x000000020141e8>
So the qualified name of that method should be "C.foo", as opposed to
"C#foo" for an instance method.
> Your other example, i.e.
>
> def ModuleExample.singleton_module_method
>
> indicates that anything up to and including the period should be
> removed, is that correct?
More or less. This is an "explicit syntax", which is equivalent to using
"self.". These two declarations are equivalent:
module ModuleExample
def ModuleExample.foo
end
end
module ModuleExample
def self.foo
end
end
> Is there only one, or can there be many?
There can be only one dot there. There could be a method resolution
operator (::) in there, I suppose, but I'm not sure if you want to add
support for that right now, or ever.
> Should they all be removed for an unqualified name?
Yes.
>> - "class << self" blocks are given a separate entry.
>
> What should be done instead? Can't a class be named "<<"?
A class cannot be named "<<". You should not add that line to the index,
but record that the method definitions inside the following scope are
defined on the current class or module. These are equivalent:
class C
def self.foo
end
end
class C
class << self
def foo
end
end
end
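If you want to check the equivalence yourself, something like this should do. The class names WithSelfDot and WithSingletonBlock are placeholders I made up; singleton_methods and instance_methods are standard Ruby:

```ruby
# Class method defined with the "self." prefix.
class WithSelfDot
  def self.foo
    :foo
  end
end

# The same class method defined inside a "class << self" block.
class WithSingletonBlock
  class << self
    def foo
      :foo
    end
  end
end

# Both forms put foo on the class itself, not on its instances.
WithSelfDot.singleton_methods(false)         # [:foo]
WithSingletonBlock.singleton_methods(false)  # [:foo]
WithSingletonBlock.instance_methods(false)   # []
```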
>> - Qualified tag names are never generated.
>
> (Etags never promised qualified names except for C and derived
> languages, and also in Java.)
OK, that would be a nice bonus, but we can live without it. ctags
doesn't define qualified names either.
Without qualified names, I suppose you should treat
def self.foo
end
and
def foo
end
and
def Class.foo
end
the same. Only record those as "foo".
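Here is a minimal sketch of that normalization, not a proposal for the actual etags code. METHOD_DEF and unqualified_method_name are hypothetical names; the pattern assumes the receiver is either "self" or a constant name (optionally with "::" segments), and it ignores operator methods such as == or <<:

```ruby
# Hypothetical pattern: "def", an optional receiver followed by one dot,
# then the method name with an optional ?, ! or = suffix.
METHOD_DEF = /\A[ \t]*def[ \t]+(?:(?:self|[A-Z]\w*(?:::[A-Z]\w*)*)\.)?([A-Za-z_]\w*[?!=]?)/

# Returns the unqualified method name, or nil for lines that are not a def.
def unqualified_method_name(line)
  match = METHOD_DEF.match(line)
  match && match[1]
end

unqualified_method_name("def self.foo")      # "foo"
unqualified_method_name("def foo")           # "foo"
unqualified_method_name("def Class.foo")     # "foo"
unqualified_method_name("  def bar?(x)")     # "bar?"
unqualified_method_name("def qux=(val)")     # "qux="
```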
> How to know when a module's or a class's scope ends? Is it enough to
> count "end" lines?
Hmm, maybe? I'm guessing etags doesn't really handle heredoc syntax, or
multiline strings defined with percent literals (examples here:
https://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#.22Here_document.22_notation),
so an 'end' inside one of those would be miscounted.
The result shouldn't be too bad if you do that, anyway. Except:
> Can I assume that "end" will always appear by
> itself on a line?
Unfortunately, no. It can also be on the same line, after a semicolon
(or on any other line, I suppose, but nobody writes Ruby like that).
Examples:
class SpecialError < StandardError; end
or
class MyStruct < Struct.new(:a, :b, :c); end
(One could also stick a method definition inside that, but I haven't
seen that in practice yet). So, either:
- 'end' is on a separate line (after ^[ \t]*).
- class/module Name[< ]...; end$
'end' can also be followed by "# some comment" in both cases.
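The two cases could be sketched as a pair of patterns. END_ALONE, ONE_LINE_SCOPE and classify_end are names I invented for this example, and the one-line pattern assumes the class or module name starts with an uppercase letter:

```ruby
# 'end' on a line of its own, optionally followed by a comment.
END_ALONE = /\A[ \t]*end[ \t]*(?:#.*)?\z/
# A class/module opened and closed on the same line, after a semicolon.
ONE_LINE_SCOPE = /\A[ \t]*(?:class|module)[ \t]+[A-Z].*;[ \t]*end[ \t]*(?:#.*)?\z/

# Returns :end_alone, :one_line_scope, or nil for any other line.
def classify_end(line)
  line = line.chomp
  return :one_line_scope if ONE_LINE_SCOPE.match?(line)
  return :end_alone if END_ALONE.match?(line)
  nil
end

classify_end("  end # some comment")                         # :end_alone
classify_end("class SpecialError < StandardError; end")      # :one_line_scope
classify_end("class MyStruct < Struct.new(:a, :b, :c); end") # :one_line_scope
classify_end("endpoint = 1")                                 # nil
```

A line classified as :one_line_scope opens and closes its scope at once, so it should leave the nesting depth unchanged.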
> Can I disregard indentation of "end" (and of
> everything else) when I determine where a scope begins and ends?
Probably, yes.
Indentation is not significant in Ruby, but heredocs can mess up the
detection of 'end' keywords, so we could use indentation as a way to
detect where each scope ends. But if etags doesn't normally do that,
let's not go there now.
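To illustrate how a heredoc can fool a counter that only looks for 'end' lines, here is a small made-up example. SOURCE and naive_end_count are hypothetical, and the counter is deliberately simplistic:

```ruby
# Sample Ruby source held as a string. The inner heredoc body contains a
# line that consists only of the word "end".
SOURCE = <<~'RUBY'
  class Report
    def footer
      <<~TEXT
        The
        end
      TEXT
    end
  end
RUBY

# Counts every line whose only content is "end", ignoring indentation.
def naive_end_count(source)
  source.lines.count { |line| line.strip == "end" }
end

naive_end_count(SOURCE)  # 3, although only two of them close a scope
```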
>> A
>> A::B
>> A::B::ABC
>> A::B#foo!
>> A::B.bar?
>> A::B.qux=
>
> Why did 'foo!' get a '#' instead of a '.', as for '_bar'?
It's common to use '#' in the qualified names of instance methods, in
Java, Ruby and JS docstrings. '.' is used for class methods (static
methods, in Java), or methods defined on other singleton objects.
Examples:
http://usejsdoc.org/tags-inline-link.html (search for '#' there)
http://stackoverflow.com/questions/5915992/javadoc-writing-links-to-methods
http://docs.ruby-lang.org/en/2.1.0/RDoc/Markup.html#class-RDoc::Markup-label-Links
(the documentation also says to use ":: for class methods", but let's
not do that)
> Why doesn't
> "class << self" count as a class scope, and add something to qualified
> names?
It just served to turn 'qux=' into a class (static) method.
>> should become (the unqualified version):
>>
>> A
>> foo
>> bar=
>> tee
>> tee=
>> qux
>>
>> All attr_* methods can take a variable number of arguments. The parser
>> should take each argument, check that it's a symbol and not a variable
>> (starts with :), and if so, record the corresponding method name.
>
> Why did 'bar' and 'tee' get a '=' appended?
Because 'attr_writer :bar' effectively expands to
def bar=(val)
@bar = val
end
and 'attr_accessor :tee' expands into
def tee
@tee
end
def tee=(val)
@tee = val
end
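Put together, the attr_* handling might look roughly like this. attr_method_names is a hypothetical helper; it assumes the whole call sits on one line without parentheses or a trailing comment, and it skips any argument that is not a plain symbol:

```ruby
# Returns the method names that an attr_reader/attr_writer/attr_accessor
# line defines, or an empty array for any other line.
def attr_method_names(line)
  match = /\A[ \t]*attr_(reader|writer|accessor)[ \t]+(.*)/.match(line)
  return [] unless match

  kind = match[1]
  # Keep only arguments that look like symbols (":name").
  symbols = match[2].split(",").map(&:strip).grep(/\A:[A-Za-z_]\w*\z/)
  symbols.flat_map do |sym|
    name = sym[1..-1]
    case kind
    when "reader" then [name]
    when "writer" then ["#{name}="]
    else [name, "#{name}="]
    end
  end
end

attr_method_names("attr_reader :foo")       # ["foo"]
attr_method_names("attr_writer :bar")       # ["bar="]
attr_method_names("attr_accessor :tee")     # ["tee", "tee="]
attr_method_names("attr_reader :a, name")   # ["a"] (the variable is skipped)
```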
> Are there any other such "append rules"?
There are other macros (any code can define a macro), but let's not
worry about them now.