unofficial mirror of emacs-devel@gnu.org 
* forward-sexp when on a floating point number
@ 2016-01-12 10:42 Oleh Krehel
  2016-01-12 13:58 ` Herring, Davis
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Oleh Krehel @ 2016-01-12 10:42 UTC (permalink / raw)
  To: emacs-devel


Hi all,

When working with C++ or Python code and having e.g. this point position:

    1|23.456

I'd like "C-M-f" (`forward-sexp') to move the point here:

    123.456|

Instead, I get this:

    123|.456

This of course extends to all other sexp interactions (`mark-sexp',
`kill-sexp', `bounds-of-thing-at-point' etc).  The problem here is that
I can't do something like:

    (modify-syntax-entry ?\. "w" c++-mode-syntax-table)

since the current behavior is actually correct for things like:

    f|oo.bar ()

I finally ended up with this solution:

    (setq forward-sexp-function 'my-forward-sexp-function)
    
    (defun my-forward-sexp-function (arg)
      (let ((forward-sexp-function nil))
        (forward-sexp arg))
      (when (and (eq (char-after) ?.)
                 (looking-back "[0-9]+" (line-beginning-position)))
        (forward-char)
        (skip-chars-forward "[0-9]")))

Is there any interest in making this behavior, i.e. treating each
floating point number as a single sexp, the default (or at least easily
customizable) in the core?

regards,
Oleh




* RE: forward-sexp when on a floating point number
  2016-01-12 10:42 forward-sexp when on a floating point number Oleh Krehel
@ 2016-01-12 13:58 ` Herring, Davis
  2016-01-12 14:20 ` Andreas Schwab
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Herring, Davis @ 2016-01-12 13:58 UTC (permalink / raw)
  To: Oleh Krehel, emacs-devel@gnu.org

>     (modify-syntax-entry ?\. "w" c++-mode-syntax-table)

This could be done for only those periods following numbers with a font-lock syntactic highlight:

("[0-9]\\(\\.\\)" 1 "_")    ; you want "symbol", not "word" anyway

Better would be to check for numbers after it too, to catch numbers less than 1 with no leading zero.

>                  (looking-back "[0-9]+" (line-beginning-position)))

Just use "[0-9]"; how many doesn't matter.

>         (skip-chars-forward "[0-9]")))

You want [-+0-9e] to handle scientific notation too.  In C you also need to tolerate a trailing "d" or "f" for full generality.  Of course, from after the decimal your command won't skip + or -, because you can't tell (looking only forward) that "1e+2" isn't part of "0x1e+2".  So more font-lock trickery would be better (other than in requiring font-lock!), since it could mark the "-/+" ahead of time.
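
One way to sidestep the character-set problem is to match the rest of
the literal with a regexp instead of `skip-chars-forward'.  The sketch
below is only an illustration: the names `my-float-tail-regexp' and
`my-skip-float-tail' are made up, the suffix set (f, F, l, L) is an
assumption about C-like languages, and it does nothing about the
"0x1e+2" ambiguity mentioned above.

```elisp
;; Hypothetical helper, not part of Emacs.
(defconst my-float-tail-regexp
  ;; A period, optional digits, an optional exponent with an optional
  ;; sign, and an optional one-letter suffix (assumed: f, F, l or L).
  "\\.[0-9]*\\(?:[eE][-+]?[0-9]+\\)?[fFlL]?"
  "Regexp matching the rest of a decimal float, starting at its period.")

(defun my-skip-float-tail ()
  "If point is before the fractional part of a number, move past it.
Return the new position, or nil if point did not move."
  (when (looking-at my-float-tail-regexp)
    (goto-char (match-end 0))))
```

Because the sign is only accepted directly after "e" or "E", a stray
"+" or "-" elsewhere ends the match.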

Davis




* Re: forward-sexp when on a floating point number
  2016-01-12 10:42 forward-sexp when on a floating point number Oleh Krehel
  2016-01-12 13:58 ` Herring, Davis
@ 2016-01-12 14:20 ` Andreas Schwab
  2016-01-12 14:41   ` Oleh Krehel
  2016-01-12 17:35 ` John Wiegley
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Andreas Schwab @ 2016-01-12 14:20 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: emacs-devel

Oleh Krehel <ohwoeowho@gmail.com> writes:

>         (skip-chars-forward "[0-9]")))

You don't want to skip over [ and ].

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."




* Re: forward-sexp when on a floating point number
  2016-01-12 14:20 ` Andreas Schwab
@ 2016-01-12 14:41   ` Oleh Krehel
  0 siblings, 0 replies; 15+ messages in thread
From: Oleh Krehel @ 2016-01-12 14:41 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: emacs-devel

Andreas Schwab <schwab@suse.de> writes:

>>         (skip-chars-forward "[0-9]")))
>
> You don't want to skip over [ and ].

You're right, of course. That was just the initial sketch I wrote today.
For exactly this reason I'd like to see this functionality in the core -
it's tricky to have many people implement their own version of this.
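
For reference, a minimal sketch of the function with the review
comments from this thread folded in: `skip-chars-forward' takes a bare
character set, so the brackets are dropped, and the backward check only
needs to establish that the preceding token is made of digits.  The
name `my-forward-sexp-number-aware' is hypothetical, and the sketch
still ignores exponents, suffixes, and backward movement.

```elisp
;; Hypothetical revision of the function from the first message.
(defun my-forward-sexp-number-aware (&optional arg)
  "Move over ARG sexps, treating a decimal such as 123.456 as one sexp."
  ;; Do the ordinary syntax-table based motion first.
  (let ((forward-sexp-function nil))
    (forward-sexp arg))
  ;; Only extend forward motion, and only when the token just skipped
  ;; consists entirely of digits (so "foo1.bar" is left alone).
  (when (and (> (or arg 1) 0)
             (eq (char-after) ?.)
             (looking-back "\\_<[0-9]+" (line-beginning-position)))
    (forward-char)
    ;; Note: "0-9", not "[0-9]", which would also skip "[" and "]".
    (skip-chars-forward "0-9")))
```

It would probably be safer to install it buffer-locally from a mode
hook, e.g. with (setq-local forward-sexp-function
#'my-forward-sexp-number-aware), than to set it globally.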




* Re: forward-sexp when on a floating point number
  2016-01-12 10:42 forward-sexp when on a floating point number Oleh Krehel
  2016-01-12 13:58 ` Herring, Davis
  2016-01-12 14:20 ` Andreas Schwab
@ 2016-01-12 17:35 ` John Wiegley
  2016-01-12 17:45 ` Dmitry Gutov
  2016-01-17 23:07 ` Stefan Monnier
  4 siblings, 0 replies; 15+ messages in thread
From: John Wiegley @ 2016-01-12 17:35 UTC (permalink / raw)
  To: Oleh Krehel; +Cc: emacs-devel

>>>>> Oleh Krehel <ohwoeowho@gmail.com> writes:

> Is there any interest in making this behavior, i.e. treating each floating
> point number as a single sexp, the default (or at least easily customizable)
> in the core?

Changing core to move from syntactic-based to semantics-based movement was
brought up in another thread, concerning the meaning of ":" in ruby-mode:

  https://lists.gnu.org/archive/html/emacs-devel/2016-01/msg00125.html

These are deep changes, with many implications, and I don't think we're ready
for that just yet. If we want to support a more semantic notion of what
symbols and punctuations mean in various modes, we should think through all
the ramifications, and come up with a design that either replaces or extends
the current syntactic notions we use now.

So I'm not in favor of making any code changes today; but I am interested in
hearing proposals and ideas.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2


* Re: forward-sexp when on a floating point number
  2016-01-12 10:42 forward-sexp when on a floating point number Oleh Krehel
                   ` (2 preceding siblings ...)
  2016-01-12 17:35 ` John Wiegley
@ 2016-01-12 17:45 ` Dmitry Gutov
  2016-01-17 23:07 ` Stefan Monnier
  4 siblings, 0 replies; 15+ messages in thread
From: Dmitry Gutov @ 2016-01-12 17:45 UTC (permalink / raw)
  To: Oleh Krehel, emacs-devel

On 01/12/2016 01:42 PM, Oleh Krehel wrote:

> Is there any interest in making this behavior, i.e. treating each
> floating point number as a single sexp, the default (or at least easily
> customizable) in the core?

I think that would be better left to syntax-propertize-function in 
c++-mode (it would propertize those . as symbol--or word--constituents 
when inside numbers).

But cc-mode does not set s-p-f currently.
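
A rough sketch of what such a setup could look like in a mode that does
not already manage `syntax-table' text properties itself (CC Mode does
its own property handling, so this may well conflict there).  The
function name `my-enable-number-syntax' is made up for illustration.

```elisp
(require 'syntax)

;; Hypothetical setup function, not part of any mode.
(defun my-enable-number-syntax ()
  "Give a period between two digits symbol syntax in this buffer."
  (setq-local syntax-propertize-function
              (syntax-propertize-rules
               ;; Group 1 is the period in e.g. 123.456; "_" is the
               ;; syntax descriptor for a symbol constituent.
               ("[0-9]\\(\\.\\)[0-9]" (1 "_"))))
  ;; Make the sexp-scanning primitives honor the text property.
  (setq-local parse-sexp-lookup-properties t))
```

The period in "foo.bar" keeps its punctuation syntax, since the rule
requires a digit on both sides.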




* Re: forward-sexp when on a floating point number
  2016-01-12 10:42 forward-sexp when on a floating point number Oleh Krehel
                   ` (3 preceding siblings ...)
  2016-01-12 17:45 ` Dmitry Gutov
@ 2016-01-17 23:07 ` Stefan Monnier
  2016-01-17 23:42   ` John Wiegley
  4 siblings, 1 reply; 15+ messages in thread
From: Stefan Monnier @ 2016-01-17 23:07 UTC (permalink / raw)
  To: emacs-devel

> I'd like "C-M-f" (`forward-sexp') to move the point here:
>     123.456|

Hopefully at some point syntax-tables will be extended to FSM
(currently, they're trivial FSMs where you always get to a final state
after exactly one transition) so we can make them understand such
lexical details.


        Stefan





* Re: forward-sexp when on a floating point number
  2016-01-17 23:07 ` Stefan Monnier
@ 2016-01-17 23:42   ` John Wiegley
  2016-01-18  1:36     ` Stefan Monnier
  0 siblings, 1 reply; 15+ messages in thread
From: John Wiegley @ 2016-01-17 23:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

>>>>> Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Hopefully at some point syntax-tables will be extended to FSM
> (currently, they're trivial FSMs where you always get to a final state
> after exactly one transition) so we can make them understand such
> lexical details.

Except that not everyone will want this. I still think that fancier movement
like this should be done (at least initially) in an ELPA package for those who
want it, rather than request a change to the behavior of Emacs core.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2




* Re: forward-sexp when on a floating point number
  2016-01-17 23:42   ` John Wiegley
@ 2016-01-18  1:36     ` Stefan Monnier
  2016-01-18  5:08       ` John Wiegley
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Monnier @ 2016-01-18  1:36 UTC (permalink / raw)
  To: emacs-devel

>> Hopefully at some point syntax-tables will be extended to FSM
>> (currently, they're trivial FSMs where you always get to a final state
>> after exactly one transition) so we can make them understand such
>> lexical details.
> Except that not everyone will want this.  I still think that fancier movement
> like this should be done (at least initially) in an ELPA package for those who
> want it, rather than request a change to the behavior of Emacs core.

Lexing via FSM is "standard" in the world of computer languages, so I'm
pretty sure it'd be good/useful to add such functionality to Emacs's core.

E.g. it would improve performance and robustness of many SMIE tokenizers.

Whether the functionality is added by extending syntax-tables or as
a new thingy is of course up for debate.

And whether the user-facing commands should change semantics is also up
for debate.  But that's an orthogonal debate.


        Stefan





* Re: forward-sexp when on a floating point number
  2016-01-18  1:36     ` Stefan Monnier
@ 2016-01-18  5:08       ` John Wiegley
  2016-01-18 13:30         ` Stefan Monnier
  0 siblings, 1 reply; 15+ messages in thread
From: John Wiegley @ 2016-01-18  5:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

>>>>> Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Lexing via FSM is "standard" in the world of computer languages, so I'm
> pretty sure it'd be good/useful to add such functionality to Emacs's core.

If it can provide identical behavior, simpler and more efficiently, I'm happy
to swap out what we have in core. If it provides new functionality at the same
time, great. If it's just trading one difficulty for another, I'd ask for it
to develop first in ELPA.

In fact, I'd like to push back a bit on the rate of changes made to core in
recent years, to solve problems that didn't necessarily need to be solved
there first.

Yes, we control core, but that doesn't mean it should be the first place we
look to make changes when brewing new ideas.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2


* Re: forward-sexp when on a floating point number
  2016-01-18  5:08       ` John Wiegley
@ 2016-01-18 13:30         ` Stefan Monnier
  2016-01-18 19:02           ` John Wiegley
  2016-01-18 21:03           ` Marcin Borkowski
  0 siblings, 2 replies; 15+ messages in thread
From: Stefan Monnier @ 2016-01-18 13:30 UTC (permalink / raw)
  To: emacs-devel

>> Lexing via FSM is "standard" in the world of computer languages, so I'm
>> pretty sure it'd be good/useful to add such functionality to Emacs's core.
> If it can provide identical behavior, simpler and more efficiently, I'm happy
> to swap out what we have in core.

My suggestion is to *extend* syntax-table, such that (aref
<syntax-table> <char>) can return another syntax-table (IOW another
state in the FSM, because the char we just considered is part of a token
but that token isn't complete yet).

Major modes could use it or not.
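
To make the idea concrete, here is a toy model in plain Lisp.  It is
not how syntax tables work today and not a proposed API: the transition
table is an alist rather than a char-table, and every name in it
(`my-number-fsm', `my-char-class', `my-fsm-forward-token') is
hypothetical.  Looking up a character class in the current state either
yields the next state (the token is not complete yet) or nothing (the
token ends here).

```elisp
;; Hypothetical model of "a lookup may return another table".
(defconst my-number-fsm
  '((start . ((digit . int)))
    (int   . ((digit . int) (dot . frac)))
    (frac  . ((digit . frac))))
  "Toy transition table for tokens such as 123 and 123.456.")

(defun my-char-class (char)
  "Classify CHAR for `my-number-fsm'; return nil for anything else."
  (cond ((null char) nil)
        ((<= ?0 char ?9) 'digit)
        ((eq char ?.) 'dot)))

(defun my-fsm-forward-token ()
  "Move point over one number token.
Return the state reached, or nil if no token starts at point."
  (let ((state 'start) next)
    ;; Follow transitions until the current state has no entry for
    ;; the next character (or the buffer ends).
    (while (setq next (alist-get (my-char-class (char-after))
                                 (alist-get state my-number-fsm)))
      (setq state next)
      (forward-char))
    (unless (eq state 'start) state)))
```

With today's syntax tables every lookup behaves like a state with no
outgoing transitions, which is the "exactly one transition" case
described earlier in the thread.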

> Yes, we control core, but that doesn't mean it should be the first place we
> look to make changes when brewing new ideas.

I largely agree.  The main reason why it didn't turn out that way for
many of the features I added is that I wanted to make use of them, and
since most of the packages I work on (and use) are in core, I couldn't
make use of those new features in them until that new feature is
in core.

It's also part of the motivation to try and bring GNU ELPA and core
closer together (either by exporting core packages to GNU ELPA like we
have now, or by including GNU ELPA packages into core like we want to
do but still haven't done).


        Stefan


PS: Lexing via FSM is harder than I make it out to be, of course, since
multi-char tokens introduce the question of how to figure out with which
state to start lexing (e.g. if we start a command from the middle of
a token), as well as how to "tokenize backward" ("single-char tokens"
(like we have now) can trivially be parsed with the same FSM going
forward and backward, but that's not true for the more general case).


        Stefan




* Re: forward-sexp when on a floating point number
  2016-01-18 13:30         ` Stefan Monnier
@ 2016-01-18 19:02           ` John Wiegley
  2016-01-18 21:03           ` Marcin Borkowski
  1 sibling, 0 replies; 15+ messages in thread
From: John Wiegley @ 2016-01-18 19:02 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

>>>>> Stefan Monnier <monnier@iro.umontreal.ca> writes:

> My suggestion is to *extend* syntax-table, such that (aref
> <syntax-table> <char>) can return another syntax-table (IOW another
> state in the FSM, because the char we just considered is part of a token
> but that token isn't complete yet).

That sounds like a pretty elegant extension, actually.

> I largely agree. The main reason why it didn't turn out that way for many of
> the features I added is that I wanted to make use of them, and since most of
> the packages I work on (and use) are in core, I couldn't make use of those
> new features in them until that new feature is in core.

Makes sense. I love you for adding pcase. :)

> It's also part of the motivation to try and bring GNU ELPA and core closer
> together (either by exporting core packages to GNU ELPA like we have now, or
> by including GNU ELPA packages into core like we want to do but still
> haven't done).

I think the tight GNU ELPA integration will be for 26.x, which means we can
focus on making that happen once we feel 25.1 is getting ready.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2




* Re: forward-sexp when on a floating point number
  2016-01-18 13:30         ` Stefan Monnier
  2016-01-18 19:02           ` John Wiegley
@ 2016-01-18 21:03           ` Marcin Borkowski
  2016-01-18 21:34             ` Stefan Monnier
  1 sibling, 1 reply; 15+ messages in thread
From: Marcin Borkowski @ 2016-01-18 21:03 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel


On 2016-01-18, at 14:30, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> PS: Lexing via FSM is harder than I make it out to be, of course, since
> multi-char tokens introduce the question of how to figure out with which
> state to start lexing (e.g. if we start a command from the middle of
> a token), as well as how to "tokenize backward" ("single-char tokens"
> (like we have now) can trivially be parsed with the same FSM going
> forward and backward, but that's not true for the more general case).

FWIW, I did something like this for (La)TeX here:
https://github.com/mbork/tex-plus.el (note: this is still WiP, and
contains more than just tokenizing TeX).  Either TeX token syntax is
weird and/or difficult, or I really suck at writing lexers/parsers,
or this is indeed a (potentially) difficult problem.  (Probably all
three.)

>         Stefan

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University




* Re: forward-sexp when on a floating point number
  2016-01-18 21:03           ` Marcin Borkowski
@ 2016-01-18 21:34             ` Stefan Monnier
  2016-01-20 22:15               ` Marcin Borkowski
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Monnier @ 2016-01-18 21:34 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: emacs-devel

> FWIW, I did something like this for (La)TeX here:
> https://github.com/mbork/tex-plus.el (note: this is still WiP, and
> contains more than just tokenizing TeX).  Either TeX token syntax is
> weird and/or difficult, or I really suck at writing lexers/parsers,
> or this is indeed a (potentially) difficult problem.  (Probably all
> three.)

Don't know about the middle one, but the other two indeed apply.  In the
general case, the easiest way to solve this problem is probably to go
back to a safe earlier state and then lex forward from there.

A "safe earlier state" could be "right after a character which can only
appear at the end of a token" [ of course, there's no guarantee that such
a character exists ].

Also "going back and then lex forward" implies a potential serious
performance problem, so it would require some form of caching.
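
As an illustration of "go back, then lex forward", the sketch below
assumes that the beginning of the line is a safe restart point (true
only for languages whose tokens never span lines) and uses a
deliberately tiny tokenizer.  The function name `my-token-bounds-at' is
hypothetical, and there is no caching, so every call pays for
re-lexing the line up to the requested position.

```elisp
;; Hypothetical helper; the tokenizer only knows numbers and symbols.
(defun my-token-bounds-at (pos)
  "Return (START . END) of the token containing POS, or nil.
Lexing restarts from the beginning of the line containing POS."
  (save-excursion
    (goto-char pos)
    (goto-char (line-beginning-position)) ; assumed safe earlier state
    (let (bounds)
      (while (and (not bounds) (<= (point) pos) (not (eobp)))
        (let ((start (point)))
          (cond
           ;; A number with an optional fractional part.
           ((looking-at "[0-9]+\\(?:\\.[0-9]*\\)?")
            (goto-char (match-end 0)))
           ;; A run of word or symbol constituents.
           ((looking-at "\\(?:\\sw\\|\\s_\\)+")
            (goto-char (match-end 0)))
           ;; Anything else is a one-character token.
           (t (forward-char)))
          ;; Stop as soon as the token just read extends past POS.
          (when (< pos (point))
            (setq bounds (cons start (point))))))
      bounds)))
```

Asking for the token at the period of "123.456" then yields the bounds
of the whole number, regardless of which side of the period POS is on.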


        Stefan




* Re: forward-sexp when on a floating point number
  2016-01-18 21:34             ` Stefan Monnier
@ 2016-01-20 22:15               ` Marcin Borkowski
  0 siblings, 0 replies; 15+ messages in thread
From: Marcin Borkowski @ 2016-01-20 22:15 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel


On 2016-01-18, at 22:34, Stefan Monnier <monnier@IRO.UMontreal.CA> wrote:

>> FWIW, I did something like this for (La)TeX here:
>> https://github.com/mbork/tex-plus.el (note: this is still WiP, and
>> contains more than just tokenizing TeX).  Either TeX token syntax is
>> weird and/or difficult, or I really suck at writing lexers/parsers,
>> or this is indeed a (potentially) difficult problem.  (Probably all
>> three.)
>
> Don't know about the middle one, but the other two indeed apply.  In the
> general case, the easiest way to solve this problem is probably to go
> back to a safe earlier state and then lex forward from there.

Yes, in the case of TeX, it is relatively easy (assuming the user did not
play around with catcodes - in such a case, /anything/ can happen; see
e.g. https://tug.org/TUGboat/tb19-4/tb61carl.pdf - try it!).

> A "safe earlier state" could be "right after a character which can only
> appear at the end of a token" [ of course, there's no guarantee that such
> a character exists ].

In TeX (under normal catcode regime), it's just like that: newline.

> Also "going back and then lex forward" implies a potential serious
> performance problem, so it would require some form of caching.

Exactly, and I wanted to avoid that.  Frankly, I had no idea how to do
this anyway; now I'm a bit wiser, and I guess I could run an idle timer
to update the cache, and have a global variable for keeping the state
(cache synchronized or not), set to "not in sync" after any
text-changing command.  Of course, any command that actually needs
a cache would have to recreate it, too.  Would that be a good plan?
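
A possible variation on that plan, sketched under assumptions: instead
of a single global "in sync" flag, keep a buffer-local position up to
which the cache is known to be valid and pull it back on every change,
similar in spirit to how `syntax-ppss' flushes its cache from the
change position onward.  All names below are hypothetical, and the
cache itself (whatever structure holds the tokens) is left out.

```elisp
;; Hypothetical bookkeeping for a token cache; the cache is not shown.
(defvar-local my-lex-cache-valid-up-to 1
  "Cached lexer data is assumed valid only before this position.")

(defun my-lex-cache-invalidate (beg _end _len)
  "Mark cached lexer data from BEG onward as stale.
Meant to run from `after-change-functions'."
  (setq my-lex-cache-valid-up-to (min my-lex-cache-valid-up-to beg)))

(defun my-lex-cache-enable ()
  "Start tracking changes in the current buffer."
  (add-hook 'after-change-functions #'my-lex-cache-invalidate nil t))
```

A command that needs tokens would then re-lex only from
`my-lex-cache-valid-up-to' (or the nearest safe state before it), and
an idle timer could do the same work ahead of time.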

>         Stefan

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University


