* Guidelines for the "symbol" syntax class
@ 2016-01-03 5:09 Dmitry Gutov
2016-01-03 22:56 ` John Wiegley
0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Gutov @ 2016-01-03 5:09 UTC (permalink / raw)
To: emacs-devel; +Cc: Stefan Monnier
Hi all and Stefan,
I intend to make some changes to the syntax of `:' in ruby-mode, and I'm
wondering how far that change should go. I can remove it from the syntax
table, but still apply it via syntax-propertize-function in other cases,
see below.
Do we have any solid guidelines for that?
Context: my two main uses of the notion of symbol are 1) "all symbols in
all buffers" completion candidates, 2) filtering the results of
xref-find-references by checking that the match begins and ends at a
symbol boundary. Currently, neither of these features works well in
ruby-mode.
First, "M::C" is interpreted as one symbol. If I just search for
references to "C", this won't match. And vice versa, this qualified name
usually corresponds to a definition like this:

  module M
    class C
    end
  end

so if I search for references to "M::C", this won't match either. So `:'
should simply become "punctuation". Then the simplest approach will
lead to false positives, but no false negatives.
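To make the symbol-boundary problem concrete, here is a minimal sketch in
Ruby rather than Emacs Lisp. The helper symbols_in is hypothetical: it only
imitates a syntax table by taking the extra symbol-constituent characters as
an argument, and it is not how xref or ruby-mode actually scan a buffer.

```ruby
# Hypothetical stand-in for a syntax table: `extra` lists the characters
# treated as symbol constituents in addition to word characters.
def symbols_in(text, extra)
  text.scan(/[\w#{Regexp.escape(extra)}]+/)
end

line = "x = M::C.new"

# `:' as a symbol constituent: "M::C" comes out as a single symbol,
# so a search for the symbol "C" finds nothing on this line.
p symbols_in(line, ":")

# `:' as punctuation: "M" and "C" come out as separate symbols.
p symbols_in(line, "")
```

With `:' as punctuation, a search for "C" matches here, at the cost of also
matching any other C in an unrelated namespace.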
There is another way `:' is used in Ruby: Ruby Symbols (I'm going to
mention those only using a capital S, to distinguish). They are like a
weird syntax for interned strings, but they're often used to refer to
method names: for introspection, or when defining a method dynamically,
or to dispatch a call dynamically. Examples:

  class C
    def foo
    end
  end

  C.instance_method(:foo) # => #<UnboundMethod: C#foo>

  class C
    define_method(:foo) do
      3
    end
  end

  C.new.send(:foo) # => 3

Consequently, if somewhere in my Ruby program there's a method foo_bar,
it might be beneficial to be able to complete a Symbol :fo to :foo_bar
as well, or for xref-find-references, when looking for references to
this method, to include the usages of the Symbol :foo_bar.
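As a small illustration of why method names and Symbols overlap, the sketch
below shows that Ruby itself reports method names as Symbols. The class
Widget, the method foo_bar and the helper complete are made up for this
example; complete is only a toy model of prefix completion, not what
company-dabbrev-code does.

```ruby
# Widget and foo_bar are made-up names for illustration.
class Widget
  def foo_bar
  end
end

# Ruby reports the names of methods defined in Widget as Symbols.
METHOD_NAMES = Widget.instance_methods(false)

# Hypothetical completion helper: match on the bare name, so the same
# candidate serves both `fo' and `:fo'.
def complete(prefix, names)
  bare = prefix.delete_prefix(":")
  names.map(&:to_s).select { |name| name.start_with?(bare) }
end

p complete("fo", METHOD_NAMES)
p complete(":fo", METHOD_NAMES)
```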
Or take this example:

  class C
    # attr_reader is a macro, kinda.
    # Define a method C#foo that simply returns the value
    # of the instance variable with the same name:
    attr_reader :foo

    def initialize(foo)
      # Assign that instance variable.
      @foo = foo
    end

    def do_something
      # Call the previously defined method (parens are optional)
      # and then call a method on the returned value:
      foo.do_something_amazing
    end
  end

After writing the attr_reader call, it would be handy if I could use the
name of the symbol in completion when writing the name of the argument,
and the name of the variable (so there's also a question of whether @
should have the "symbol" syntax; it currently doesn't). And then later,
when calling the method.
Another argument in favor of not having `:' be a symbol constituent in
Symbol literals is that we have two ways to write a Hash (associative
array) literal with Symbol keys:
{:key => value} and {key: value},
where the latter is syntactic sugar for the former. If `:' is not a
symbol constituent, we won't have two superficially "different" symbols
in the buffer, and the "find references" search will easily find both.
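A quick check of the equivalence, for readers less familiar with Ruby. The
constant names OLD_STYLE and NEW_STYLE are just labels for this example.

```ruby
OLD_STYLE = { :key => 1 }
NEW_STYLE = { key: 1 }

# Both literals build the same Hash.
p OLD_STYLE == NEW_STYLE

# Both keys are the very same Symbol object.
p OLD_STYLE.keys.first.equal?(NEW_STYLE.keys.first)

# The Symbol's name does not include the colon.
p :key.to_s
```

So the colon belongs to the literal syntax rather than to the name, whichever
side of the name it is written on.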
Or, should I stop trying to make the simplest general approaches work in
ruby-mode, and write a dedicated xref backend for Ruby? One that would
use etags and Grep, but use a bit smarter filtering.
What should company-dabbrev-code do? Should it use
dabbrev-abbrev-char-regexp, which ruby-mode will then set?
Should both company-dabbrev-code and ruby-mode make use of
dabbrev-abbrev-skip-leading-regexp? Note that it still won't help to
avoid making {:key and {key: look like different symbols.
And if I do all that, what *will* be the purpose of making `:' remain
a symbol constituent inside Symbol literals?
Thanks all,
especially to those who've read all this ;-)
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Guidelines for the "symbol" syntax class
2016-01-03 5:09 Guidelines for the "symbol" syntax class Dmitry Gutov
@ 2016-01-03 22:56 ` John Wiegley
2016-01-04 0:46 ` Dmitry Gutov
0 siblings, 1 reply; 13+ messages in thread
From: John Wiegley @ 2016-01-03 22:56 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: Stefan Monnier, emacs-devel
>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:
> Or, should I stop trying to make the simplest general approaches work in
> ruby-mode, and write a dedicated xref backend for Ruby? One that would use
> etags and Grep, but use a bit smarter filtering.
Does removing ':' from the symbol class for ruby solve all of your problems,
and create no new ones? :)
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2
* Re: Guidelines for the "symbol" syntax class
2016-01-03 22:56 ` John Wiegley
@ 2016-01-04 0:46 ` Dmitry Gutov
2016-01-04 0:51 ` Stefan Monnier
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Dmitry Gutov @ 2016-01-04 0:46 UTC (permalink / raw)
To: emacs-devel, Stefan Monnier
On 01/04/2016 12:56 AM, John Wiegley wrote:
> Does removing ':' from the symbol class for ruby solve all of your problems,
> and create no new ones? :)
I wish.
With fewer (hopefully none) false negatives returned by
xref-collect-references, there will come more false positive matches.
One false negative comes to mind: if `:' is not a symbol constituent,
and the user searches for all references to `:foo', using the current
implementation, they will get none. The default input suggested by
xref-find-references will be `foo', though (when point is on `:foo').
In completion, if we continue to simply collect all symbols from all
buffers, the user will start typing a method name, try completion and
will get offered the names of all methods and Symbols they ever typed
anywhere in the code. While Symbols are used to refer to method names,
they're also used for method keyword arguments (in Ruby 2.0+), and you
even often see them in business logic. For instance, in most web
applications there will be Symbols :username, :password, :account_id,
and so on, referring to the HTTP request parameters. So, there will be a
lot of false positives here as well.
I don't know how to fight that, except by using a smarter program, one
that loads the application and/or parses the code, etc, but there will
always be some use cases that are not handled by "smart" logic already
written, and being able to write a quick-and-simple solution is often handy.
Further, I'm sure there are a lot of third-party packages out there,
some of them language-agnostic, which deal with source code and use the
notion of a symbol.
One example that I do use is `easy-kill'. If my cursor is at the end of
a Symbol :foo, currently calling this command will select `:foo',
including the colon, which is handy to be able to copy and paste, or
kill, that value as a whole. If `:' is no longer a symbol constituent,
either I'll have to live with always additionally typing or deleting
these colons in that kind of situation, or will have to provide the
"boundary of thing" info to easy-kill additionally somehow.
That's why I'm asking if there are any existing guidelines, formal or
informal, that I can take into consideration. That might also inform
changes to xref-collect-references and company-dabbrev-code; not just
ruby-mode.
Would anyone care for another wall of text?
* Re: Guidelines for the "symbol" syntax class
2016-01-04 0:46 ` Dmitry Gutov
@ 2016-01-04 0:51 ` Stefan Monnier
2016-01-04 0:58 ` Dmitry Gutov
2016-01-04 1:13 ` John Yates
2016-01-04 0:55 ` John Wiegley
[not found] ` <CAJnXXogonsWpqadNpX0BijzoiztorYP1d=b31seBfvGVBwwT_Q@mail.gmail.com>
2 siblings, 2 replies; 13+ messages in thread
From: Stefan Monnier @ 2016-01-04 0:51 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: emacs-devel
With things like Foo:Bar (or Foo.bar or what have you), there are indeed
conflicting definitions of "symbol" and there's usually not a single one
that works everywhere. IOW you can't expect Emacs's notion of "symbol"
to cover all the use cases. More specifically, Emacs's notion of symbol
can only be used as a stepping stone on which to construct the things
you need, on a case by case basis.
Stefan
* Re: Guidelines for the "symbol" syntax class
2016-01-04 0:46 ` Dmitry Gutov
2016-01-04 0:51 ` Stefan Monnier
@ 2016-01-04 0:55 ` John Wiegley
2016-01-04 1:14 ` Dmitry Gutov
[not found] ` <CAJnXXogonsWpqadNpX0BijzoiztorYP1d=b31seBfvGVBwwT_Q@mail.gmail.com>
2 siblings, 1 reply; 13+ messages in thread
From: John Wiegley @ 2016-01-04 0:55 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: Stefan Monnier, emacs-devel
>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:
> That's why I'm asking if there are any existing guidelines, formal or
> informal, that I can take into consideration. That might also inform changes
> to xref-collect-references and company-dabbrev-code; not just ruby-mode.
> Would anyone care for another wall of text?
I suppose my informal guideline is to implement a strategy that works best for
the mode you want to derive information from, and to not expect syntax classes
to be a capable enough interface. I'd expect Ruby symbols to include ":",
personally. A::B is the qualified name of a symbol -- although "B" is
technically an unqualified symbol in its own right within that qualified name.
Better yet, define a more general API that all modes can use, since many modes
struggle with these same issues (imenu, thing-at-pt, dabbrev, etc). This
echoes back to our long IDE thread. Perhaps we need layered, semantically-
defined classes, such that a given text position might occur within many such
layers (for example, selection might choose B, A::B, or A::B.foo, depending on
how many times I smash the "select current" key).
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2
* Re: Guidelines for the "symbol" syntax class
2016-01-04 0:51 ` Stefan Monnier
@ 2016-01-04 0:58 ` Dmitry Gutov
2016-01-04 1:13 ` John Yates
1 sibling, 0 replies; 13+ messages in thread
From: Dmitry Gutov @ 2016-01-04 0:58 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
On 01/04/2016 02:51 AM, Stefan Monnier wrote:
> With things like Foo:Bar (or Foo.bar or what have you), there are indeed
> conflicting definitions of "symbol" and there's usually not a single one
> that works everywhere.
But as a major mode author, I *do* have to pick one. That's the main
choice I'm asking about.
Splitting Foo::Bar into two symbols is a done decision (`::' is a scope
resolution operator anyway). Still undecided: whether the symbol-at-point
at an instance variable (@var) should include the @ sign, and whether the
symbol-at-point at a Symbol literal (:symbol) should include the colon.
And what to do about syntax-sugared Symbol literals (symbol: value).
> IOW you can't expect Emacs's notion of "symbol"
> to cover all the use cases. More specifically, Emacs's notion of symbol
> can only be used as a stepping stone on which to construct the things
> you need, on a case by case basis.
Naturally, the use cases left suboptimal by the eventual choice would
need to be handled specially.
* Re: Guidelines for the "symbol" syntax class
2016-01-04 0:51 ` Stefan Monnier
2016-01-04 0:58 ` Dmitry Gutov
@ 2016-01-04 1:13 ` John Yates
2016-01-04 1:18 ` Dmitry Gutov
1 sibling, 1 reply; 13+ messages in thread
From: John Yates @ 2016-01-04 1:13 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel, Dmitry Gutov
On Sun, Jan 3, 2016 at 7:51 PM, Stefan Monnier <monnier@iro.umontreal.ca>
wrote:
> IOW you can't expect Emacs's notion of "symbol"
> to cover all the use cases. More specifically, Emacs's notion of symbol
> can only be used as a stepping stone on which to construct the things
> you need, on a case by case basis.
>
I interpret this as "Emacs supplies only a basic notion of symbol".
Since xref inches closer to understanding the semantics of the user's
programming language, it might want to introduce some new abstraction
for a qualified name. These come in two flavors:
- object qualified
- namespace or package qualified
Trying to jigger emacs' symbol notion to cover qualified names as provided
in contemporary languages is likely to be a source of continuing complaints
and frustration.
/john
* Re: Guidelines for the "symbol" syntax class
2016-01-04 0:55 ` John Wiegley
@ 2016-01-04 1:14 ` Dmitry Gutov
2016-01-04 2:56 ` Stefan Monnier
0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Gutov @ 2016-01-04 1:14 UTC (permalink / raw)
To: emacs-devel, Stefan Monnier
On 01/04/2016 02:55 AM, John Wiegley wrote:
> I suppose my informal guideline is to implement a strategy that works best for
> the mode you want to derive information from, and to not expect syntax classes
> to be a capable enough interface. I'd expect Ruby symbols to include ":",
> personally.
In c++-mode, `std::cout' is two separate symbols, so I'm going to follow
that model.
> A::B is the qualified name of a symbol
Yes, and methods have qualified names like A::B#foo or A::B.bar, but we
don't make `#' or `.' symbol constituents.
> -- although "B" is
> technically an unqualified symbol in its own right within that qualified name.
Yup. "B" is the name of a constant set on the module/class A. This is
relevant because we can reference A::B from code lexically inside A (or
even inside A::C) by its base name (B). And it's impossible to know
whether the referenced constant (classes are constants, BTW) is B, A::B
or A::C::B without runtime information, or parsing the whole project and
its dependencies.
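A small sketch of that ambiguity. The modules and values below are
illustrative only, and const_set stands in for "some other file or dependency
defines the constant later".

```ruby
module A
  B = :a_b

  module C
    # The bare name B is resolved through the lexical scopes A::C, then A.
    def self.lookup
      B
    end
  end
end

# A::C has no B of its own yet, so the bare name resolves to A::B.
BEFORE = A::C.lookup

# Something elsewhere in the program defines A::C::B at run time.
A::C.const_set(:B, :a_c_b)

# The same source text now resolves to A::C::B.
AFTER = A::C.lookup

p [BEFORE, AFTER]
```

Nothing in the text of the lookup method changed between the two calls, which
is why a purely syntactic tool cannot tell which constant the bare B names.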
> Better yet, define a more general API that all modes can use, since many modes
> struggle with these same issues (imenu, thing-at-pt, dabbrev, etc). This
> echoes back to our long IDE thread.
Ouch. We do need to release 25.1 sometime. And I want
xref-find-references to work okay-ish in ruby-mode by then.
There *are* some variables already in Emacs that I might have to use,
and maybe I'm missing some of them. E.g.
dabbrev-abbrev-skip-leading-regexp and find-tag-default-function (should
xref-collect-references use find-tag-default-function?).
> Perhaps we need layered, semantically-
> defined classes, such that a given text position might occur within many such
> layers (for example, selection might choose B, A::B, or A::B.foo, depending on
> how many times I smash the "select current" key).
easy-kill defines a hierarchy of things (though a simplistic one), which
works like you describe. How to apply that idea to dabbrev-expand and
xref-find-references is not immediately obvious to me.
* Re: Guidelines for the "symbol" syntax class
2016-01-04 1:13 ` John Yates
@ 2016-01-04 1:18 ` Dmitry Gutov
[not found] ` <CAJnXXog5fO_h5UNnVR67EJtT+u7+G-BVMFV3FnJgK=weGj0m_w@mail.gmail.com>
0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Gutov @ 2016-01-04 1:18 UTC (permalink / raw)
To: John Yates, Stefan Monnier; +Cc: emacs-devel
On 01/04/2016 03:13 AM, John Yates wrote:
> Trying to jigger emacs' symbol notion to cover qualified names as provided
> in contemporary languages is likely to be a source of continuing complaints
> and frustration.
I'm not trying to cover qualified names here. In many languages, it's
impossible to find out the qualified name of the type or method at point
without parsing the whole application with its dependencies (and
sometimes you have to run it anyway).
As far as xref is concerned, qualified symbol names are currently an
implementation detail: some backends might operate on them under the
covers, but the API stays ignorant.
* Re: Guidelines for the "symbol" syntax class
[not found] ` <CAJnXXog5fO_h5UNnVR67EJtT+u7+G-BVMFV3FnJgK=weGj0m_w@mail.gmail.com>
@ 2016-01-04 2:01 ` Dmitry Gutov
0 siblings, 0 replies; 13+ messages in thread
From: Dmitry Gutov @ 2016-01-04 2:01 UTC (permalink / raw)
To: John Yates; +Cc: emacs-devel
(Cc-ing emacs-devel)
On 01/04/2016 03:46 AM, John Yates wrote:
> I think that you are confusing issues of syntax and symbol resolution.
I'd put the question this way: should the symbol correspond more to an
atomic expression in a given language, or should it be the "name" of the
identifier or atom denoted by the expression?
To give a distant example: in Perl and PHP, you usually declare and use a
variable by prefixing its name with $. Should $ be a symbol constituent?
Both perl-mode and cperl-mode say no.
> emacs' notion of symbol is purely syntactic.
And that's the model I'm trying to work in. Again, I'm not trying to
determine qualified names.
> Starting with the
> current symbol collection framework you could build a purely syntactic
> model of qualified names that should cover a very large set of
> contemporary languages.
I'm not sure how you think I could do that.
* Re: Guidelines for the "symbol" syntax class
[not found] ` <CAJnXXojy1b6LUdXcC+cDVPYT-OJMXCE8m8yqObE9oUYwU_PGbg@mail.gmail.com>
@ 2016-01-04 2:34 ` Dmitry Gutov
0 siblings, 0 replies; 13+ messages in thread
From: Dmitry Gutov @ 2016-01-04 2:34 UTC (permalink / raw)
To: John Yates; +Cc: emacs-devel
On 01/04/2016 04:21 AM, John Yates wrote:
> On Sun, Jan 3, 2016 at 8:35 PM, Dmitry Gutov <dgutov@yandex.ru
> <mailto:dgutov@yandex.ru>> wrote:
> >
> > => [:<=>, :==, :===, :eql?, :hash, :casecmp, :+, :*, :%, :[], :[]=,
> :insert, :length, :size, :bytesize, :empty?, ...]
>
> It seems similar to C++'s operator keyword. Am I getting closer? Is
> the colon required to abut the subsequent characters or can one write :
> <=> (note intervening space)?
No space allowed. See this usage example:
irb(main):003:0> "abc" == "def"
=> false
irb(main):004:0> String.instance_method(:==)
=> #<UnboundMethod: String#==>
irb(main):005:0> "abc".size
=> 3
irb(main):006:0> String.instance_method(:size)
=> #<UnboundMethod: String#size>
irb(main):007:0> "abc".method(:size)
=> #<Method: String#size>
irb(main):008:0> "abc".method(:size).call
=> 3
You can have Symbols with any name, though. So they are not tied to
methods, variables or anything.
Here's a good explanation:
http://www.randomhacks.net/2007/01/20/13-ways-of-looking-at-a-ruby-symbol/,
in particular, comparison #6 rings true: Ruby Symbols are similar to
Lisp symbols. In Lisp, one can reference a dynamically bound variable,
or call a function, using a symbol:
(let ((sym 'car)) (funcall sym nil))
or even
(let ((sym 'car)) (funcall (symbol-function sym) nil))
which is similar to "abc".method(:size).call I've tried above.
* Re: Guidelines for the "symbol" syntax class
2016-01-04 1:14 ` Dmitry Gutov
@ 2016-01-04 2:56 ` Stefan Monnier
2016-01-04 3:47 ` Dmitry Gutov
0 siblings, 1 reply; 13+ messages in thread
From: Stefan Monnier @ 2016-01-04 2:56 UTC (permalink / raw)
To: Dmitry Gutov; +Cc: emacs-devel
> In c++-mode, `std::cout' is two separate symbols, so I'm going to follow
> that model.
>> A::B is the qualified name of a symbol
> Yes, and methods have qualified names like A::B#foo or A::B.bar, but we
> don't make `#' or `.' symbol constituents.
Right, I think in general you'll be better off to err on the side of
having Emacs symbols be "too short" (and hence having to grow them by
combining Emacs symbols with surrounding chars or surrounding Emacs
symbols) rather than having Emacs symbols be "too long" (and hence
having to parse the inside of symbols rather than treat them as "atomic"
identifiers).
Stefan
* Re: Guidelines for the "symbol" syntax class
2016-01-04 2:56 ` Stefan Monnier
@ 2016-01-04 3:47 ` Dmitry Gutov
0 siblings, 0 replies; 13+ messages in thread
From: Dmitry Gutov @ 2016-01-04 3:47 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
On 01/04/2016 04:56 AM, Stefan Monnier wrote:
> Right, I think in general you'll be better off to err on the side of
> having Emacs symbols be "too short" (and hence having to grow them by
> combining Emacs symbols with surrounding chars or surrounding Emacs
> symbols) rather than having Emacs symbols be "too long" (and hence
> having to parse the inside of symbols rather than treat them as "atomic"
> identifiers).
Thanks, Stefan. I guess I'll try doing that for both constants and
Symbols, and will handle ':' in Symbol literals like perl-mode does with
'$' or '@' (not sure of the reason for the difference between the two, yet).
And easy-kill works fine with it, because the default "thing" it
interacts with is actually sexps, not symbols.
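A rough sketch of the "too short, then grow" approach, in Ruby for
illustration only. The helper grow_to_literal is hypothetical and handles
just a leading `@' or a single leading `:'; the trailing colon in
`key: value' is not covered, and real code in Emacs would work on buffer
positions and syntax classes instead of string indices.

```ruby
# Hypothetical helper: given the [start, stop) span of a bare identifier,
# widen it to cover a leading sigil when a caller wants the whole literal.
def grow_to_literal(text, start, stop)
  if start > 0
    prev = text[start - 1]
    before_prev = start > 1 ? text[start - 2] : nil
    # Take `@', or a single `:' that is not part of the `::' operator.
    start -= 1 if prev == "@" || (prev == ":" && before_prev != ":")
  end
  text[start...stop]
end

src = "C.new.send(:foo)"
i = src.index("foo")
p grow_to_literal(src, i, i + 3)
```

Searching and completion can keep using the short symbol, while a command
that wants the whole literal asks for the grown span.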
end of thread, other threads:[~2016-01-04 3:47 UTC | newest]
Thread overview: 13+ messages
2016-01-03 5:09 Guidelines for the "symbol" syntax class Dmitry Gutov
2016-01-03 22:56 ` John Wiegley
2016-01-04 0:46 ` Dmitry Gutov
2016-01-04 0:51 ` Stefan Monnier
2016-01-04 0:58 ` Dmitry Gutov
2016-01-04 1:13 ` John Yates
2016-01-04 1:18 ` Dmitry Gutov
[not found] ` <CAJnXXog5fO_h5UNnVR67EJtT+u7+G-BVMFV3FnJgK=weGj0m_w@mail.gmail.com>
2016-01-04 2:01 ` Dmitry Gutov
2016-01-04 0:55 ` John Wiegley
2016-01-04 1:14 ` Dmitry Gutov
2016-01-04 2:56 ` Stefan Monnier
2016-01-04 3:47 ` Dmitry Gutov
[not found] ` <CAJnXXogonsWpqadNpX0BijzoiztorYP1d=b31seBfvGVBwwT_Q@mail.gmail.com>
[not found] ` <5689CC5C.4000408@yandex.ru>
[not found] ` <CAJnXXojy1b6LUdXcC+cDVPYT-OJMXCE8m8yqObE9oUYwU_PGbg@mail.gmail.com>
2016-01-04 2:34 ` Dmitry Gutov